from:"Jeff King"

Re: [PATCH v8 03/12] Move lower case functions into wrapper.c

2014-03-28 Thread Jeff King

On Fri, Mar 28, 2014 at 10:12:15AM -0700, Junio C Hamano wrote:

 By the way, that is rfc2822---do we want rfc822 as its synonym
 as well as rfc, I wonder ;-)

Oops, I wrote that as I was literally looking at the code that said
rfc2822 and didn't notice. On the other hand, I have never made the
mistake when actually running git, so it is probably not a big deal.

Besides which, isn't it 5322 these days? I do not think we need to keep
up with the treadmill.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Problems with git 1.8.5.3 on HP-UX 11.11

2014-03-28 Thread Jeff King

On Fri, Mar 28, 2014 at 11:09:14AM -, Gerhard Grimm wrote:

 So I examined the git source package and found that the author of the
 HP-UX port forgot to set
 
 PTHREAD_CFLAGS=-mt
 
 in config.mak.autogen to enable threading.

You probably want to place such manual settings in config.mak. If you
use the ./configure script, it will overwrite config.mak.autogen.

 git submodule init
 
 fails with the output
 
     Assertion failed: err == REG_ESPACE, file compat/regex/regexec.c, line 
 1096
     No submodule mapping found in .gitmodules for path 'module'
 
 and the stacktrace of the resulting core dump is
 
 #0  0xc020ced0 in kill+0x10 () from /usr/lib/libc.2
 #1  0xc01a7f84 in raise+0x24 () from /usr/lib/libc.2
 #2  0xc01e9308 in abort_C+0x160 () from /usr/lib/libc.2
 #3  0xc01e9364 in abort+0x1c () from /usr/lib/libc.2
 #4  0xc0176998 in _assert+0x178 () from /usr/lib/libc.2
 #5  0x205fa0 in check_matching+0x290 ()
 #6  0x2053b8 in re_search_internal+0x128 ()
 #7  0x204ac0 in regexec+0xc8 ()
 #8  0x4da40 in collect_config+0x60 ()
 #9  0x108b30 in get_value+0xd8 ()
 [...]

The regexes we use here are not particularly complicated. So either
there is a bug (but nobody else has reported anything on any other
platform) or your system regex library has some problem with what we are
feeding it. The simplest solution may be to compile with:

  NO_REGEX=YesPlease

which will build and use the glibc implementation in compat/regex
instead.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] t4212: handle systems with post-apocalyptic gmtime

2014-03-28 Thread Jeff King

On Fri, Mar 28, 2014 at 12:02:46PM -0700, Junio C Hamano wrote:

   - teach the is the result sane, even though we may have got a
 non-NULL from gmtime?  otherwise let's signal a failure by
 replacing it with a known sentinel value codepath the new
 failure mode Charles's report suggests---if we feed a positive
 timestamp and gmtime gave us back a tm_year+1900  0, that is
 certainly an overflow; and
 
  I don't think we can analyze the output from gmtime. If it wraps the
  year at N, then won't N+2014 look like a valid value?
 
 Yes, but I was hoping that there are small number of possible N's
 ;-)

I'm not sure I understand. Even if we know N, we've lost information
during the truncation done by time_t (we cannot distingiuish true M from
N+M).

  diff --git a/date.c b/date.c
  index e1a2cee..e0c43c4 100644
  --- a/date.c
  +++ b/date.c
  @@ -57,6 +57,8 @@ static time_t gm_time_t(unsigned long time, int tz)
   static struct tm *time_to_tm(unsigned long time, int tz)
   {
  time_t t = gm_time_t(time, tz);
  +   if (t  )
  +   return NULL;
  return gmtime(t);
   }
 
  I suspect that would handle the FreeBSD case, as well.
 
  By the way, I have a suspicion that the gm_time_t above can overflow if
  you specially craft a value at the edge of what time_t can handle (we
  check that our value will not overflow time_t earlier, but now we might
  be adding up to 86400 seconds to it). sigh
 
 Yuck.  Let's not go there.

Do you mean let's not worry about the absurdly specific overflow case,
or let's not do this gross time_to_tm hack?

This (non-)issue has consumed a lot more brain power than it is probably
worth. I'd like to figure out which patch to go with and be done. :)

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Problems with git 1.8.5.3 on HP-UX 11.11

2014-03-28 Thread Jeff King

On Fri, Mar 28, 2014 at 03:43:29PM -0400, Eric Sunshine wrote:

 On Fri, Mar 28, 2014 at 3:01 PM, Jeff King p...@peff.net wrote:
  On Fri, Mar 28, 2014 at 11:09:14AM -, Gerhard Grimm wrote:
  git submodule init
 
  fails with the output
 
  Assertion failed: err == REG_ESPACE, file compat/regex/regexec.c, line 
  1096
  No submodule mapping found in .gitmodules for path 'module'
 
  The regexes we use here are not particularly complicated. So either
  there is a bug (but nobody else has reported anything on any other
  platform) or your system regex library has some problem with what we are
  feeding it. The simplest solution may be to compile with:
 
NO_REGEX=YesPlease
 
  which will build and use the glibc implementation in compat/regex
  instead.
 
 Based upon the assertion-failure message, it looks like he's already
 using compat/regex.

Heh, I didn't even notice that. I just looked at all of the libc calls
at the top of the backtrace, but of course that is just from assert() on
up.

So now it seems doubly odd to me, since it is running the same regex
library that is used elsewhere.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bug report: Git 1.8 on Ubuntu 13.10 refs not valid

2014-03-31 Thread Jeff King

On Mon, Mar 31, 2014 at 07:19:09PM +0200, Siggi wrote:

 here are the two outputs you wanted to see.

The interesting bit is at the end...

  HTTP/1.1 200 OK
  Date: Mon, 31 Mar 2014 17:04:57 GMT
 * Server Apache/2.2.22 (Ubuntu) is not blacklisted
  Server: Apache/2.2.22 (Ubuntu)
  X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 3.0.19
  ETag: 2ab570112563de50022189f0a2ffcdd4
  Pragma: no-cache
  Expires: Fri, 01 Jan 1980 00:00:00 GMT
  X-Runtime: 72
  Cache-Control: no-cache, max-age=0, must-revalidate
  Content-Length: 1047
  Status: 200
  Content-Type: application/x-git-upload-pack-advertisement; charset=utf-8

This content-type is the problem. There should not be a charset
parameter (it is meaningless, and it throws off git's content-type
check). So your web server configuration (or the redmine git plugin)
should be fixed, but you'll have to talk to redmine folks to figure that
part out.

That being said, git _could_ be more liberal in accepting a content-type
with parameters (even though it does not know about any parameters, and
charset here is completely meaningless). I have mixed feelings on that.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bug report: Git 1.8 on Ubuntu 13.10 refs not valid

2014-03-31 Thread Jeff King

On Mon, Mar 31, 2014 at 11:27:53AM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  That being said, git _could_ be more liberal in accepting a content-type
  with parameters (even though it does not know about any parameters, and
  charset here is completely meaningless). I have mixed feelings on that.
 
 It may be just a matter of replacing strbuf_cmp() with the initial
 part must be this string followed by it could have an optional
 whitespaces and semicolon after that, but I share the mixed
 feelings.

Yeah, I think something like this is probably enough:

diff --git a/remote-curl.c b/remote-curl.c
index 52c2d96..be21fb6 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -221,6 +221,13 @@ static int show_http_message(struct strbuf *type, struct 
strbuf *msg)
return 0;
 }
 
+static int match_content_type(struct strbuf *have, struct strbuf *want)
+{
+   if (!starts_with(have-buf, want-buf))
+   return 0;
+   return have-len == want-len || have-buf[want-len] == ';';
+}
+
 static struct discovery* discover_refs(const char *service, int for_push)
 {
struct strbuf exp = STRBUF_INIT;
@@ -277,7 +284,7 @@ static struct discovery* discover_refs(const char *service, 
int for_push)
strbuf_addf(exp, application/x-%s-advertisement, service);
if (maybe_smart 
(5 = last-len  last-buf[4] == '#') 
-   !strbuf_cmp(exp, type)) {
+   match_content_type(type.buf, exp.buf)) {
char *line;
 
/*

though perhaps there are exotic whitespace rules that we would also need
to allow. Either way, I do not think it is the implementation that is
the problem, but rather the semantics.

 I am not sure if it is a right thing to follow be liberal to
 accept dictum in this case.  It may be liberal in accepting
 blindly, but if the other end is asking a behaviour that is not
 standard (but perhaps in future versions of Git such an enhanced
 service may be implemented by the client), by ignoring the parameter
 we do not even know about how to handle, we would be giving surprises
 to the overall protocol exchange.

I actually think ignoring them could provide room for future expansion
in git (i.e., we could add new c-t parameters with the knowledge that
existing git would ignore them). So there is a potential upside.  But
current git does _not_ ignore them. Git v1.10 (or 2.0, or whatever)
could, but we would have to wait for old versions to die out before
making that assumption anyway.

It's also possible that we would want to intentionally break
compatibility (to say if you do not understand foo=bar, then do not
even process this). But I don't think that is a big deal, as we could
already switch the content-type itself (x-upload-pack-advertisement2
or something). And if we really wanted to dump extra optional data into
the http conversation, we can already do so with http headers.

So really, I do not see us ever realistically expanding into
content-type parameters. This would _just_ be about handling odd
implementations. And there I can see it as us being nice (I do not think
charset is doing anything here), or us being careful about broken
implementations (why would one add a charset parameter at all? If the
implementation is blindly re-encoding to utf8 or something, it could
very well be corrupting the data in rare cases).

Given that there is only one implementation known to this in the first
place, I'd be inclined to say fix that implementation. If it becomes
a widespread problem, then we can think about loosening the check after
reviewing exactly what each implementation is doing.

I think all of that is to say I'm violently agreeing with you. But
having tried to think it through, I felt it was worth posting the more
detailed thought process.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 0/3] Take four on fixing OPT_SET_PTR issues

2014-03-31 Thread Jeff King

On Mon, Mar 31, 2014 at 10:23:44AM -0700, Junio C Hamano wrote:

 SET_PTR() may not be used, but are there places where SET_INT() is
 abused with a cast-to-pointer for the same effect?  I didn't check,
 but if there are such places, converting them to use SET_PTR() with
 their existing cast removed may be a better way to go.

Anyone doing that should be beaten with a clue stick.

Fortunately, I grepped through and I did not see any cases. My clue
stick remains untouched.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3] MSVC: fix t0040-parse-options crash

2014-03-31 Thread Jeff King

On Sun, Mar 30, 2014 at 10:29:04AM +0200, Andreas Schwab wrote:

 Junio C Hamano gits...@pobox.com writes:
 
  As OPT_SET_PTR() is about setting the pointer value to intptr_t defval,
  a follow-up patch on top of this fix (see attached) may not be a bad
  thing to have, but that patch alone will not fix this issue without
  dropping the unneeded and unwanted cast to unsigned long.
 
 Wouldn't it make sense to change defval into a union to avoid the cast?
 (The intptr_t type may be too narrow for other values to be put there.)

The primary function of these structs is to capture the information
found in brace initializers.  Is it possible in C89 to initialize the
second member of a union (I think in C99, you can use named
initializers).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/2] date: recognize bogus FreeBSD gmtime output

2014-04-01 Thread Jeff King

Most gmtime implementations return a NULL value when they
encounter an error (and this behavior is specified by ANSI C
and POSIX).  FreeBSD's implementation, however, will
indicate an error by returning a pointer to a struct tm
with all fields set to zero. Let's also recognize this and
convert it to a NULL (with this patch, t4212 should pass on
FreeBSD).

Reported-by: René Scharfe l@web.de
Signed-off-by: Jeff King p...@peff.net
---
There are actually a few callers to gmtime and gmtime_r, so I pushed
this fix up into a compat wrapper rather than in time_to_tm to get them
all. It's possible that localtime() would want to receive the same
treatment, too.  It's not strictly necessary to make the wrapper
conditional, but it was easy to do so. We could also just run this code
all the time.

I don't have a FreeBSD VM handy to test this, so confirmation that it
passes the test would be nice.

 Makefile  |  8 
 compat/gmtime.c   | 26 ++
 config.mak.uname  |  1 +
 git-compat-util.h |  7 +++
 4 files changed, 42 insertions(+)
 create mode 100644 compat/gmtime.c

diff --git a/Makefile b/Makefile
index 3646391..2f3758c 100644
--- a/Makefile
+++ b/Makefile
@@ -338,6 +338,9 @@ all::
 # Define TEST_GIT_INDEX_VERSION to 2, 3 or 4 to run the test suite
 # with a different indexfile format version.  If it isn't set the index
 # file format used is index-v[23].
+#
+# Define GMTIME_UNRELIABLE_ERRORS if your gmtime() function does not
+# return NULL when it receives a bogus time_t.
 
 GIT-VERSION-FILE: FORCE
@$(SHELL_PATH) ./GIT-VERSION-GEN
@@ -1489,6 +1492,11 @@ ifneq (,$(XDL_FAST_HASH))
BASIC_CFLAGS += -DXDL_FAST_HASH
 endif
 
+ifdef GMTIME_UNRELIABLE_ERRORS
+   COMPAT_OBJS += compat/gmtime.o
+   BASIC_CFLAGS += -DGMTIME_UNRELIABLE_ERRORS
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
diff --git a/compat/gmtime.c b/compat/gmtime.c
new file mode 100644
index 000..ffcabf4
--- /dev/null
+++ b/compat/gmtime.c
@@ -0,0 +1,26 @@
+#include ../git-compat-util.h
+#undef gmtime
+#undef gmtime_r
+
+struct tm *git_gmtime(const time_t *timep)
+{
+   static struct tm result;
+   return git_gmtime_r(timep, result);
+}
+
+struct tm *git_gmtime_r(const time_t *timep, struct tm *result)
+{
+   struct tm *ret;
+
+   ret = gmtime_r(timep, result);
+
+   /*
+* Rather than NULL, FreeBSD gmtime will return a struct tm with all
+* fields zeroed. Since mday cannot otherwise be zero, we can test
+* this very quickly.
+*/
+   if (ret  !ret-tm_mday)
+   ret = NULL;
+
+   return ret;
+}
diff --git a/config.mak.uname b/config.mak.uname
index 6069a44..0e22ac0 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -187,6 +187,7 @@ ifeq ($(uname_S),FreeBSD)
endif
PYTHON_PATH = /usr/local/bin/python
HAVE_PATHS_H = YesPlease
+   GMTIME_UNRELIABLE_ERRORS = UnfortunatelyYes
 endif
 ifeq ($(uname_S),OpenBSD)
NO_STRCASESTR = YesPlease
diff --git a/git-compat-util.h b/git-compat-util.h
index 892032b..5191866 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -716,4 +716,11 @@ void warn_on_inaccessible(const char *path);
 /* Get the passwd entry for the UID of the current process. */
 struct passwd *xgetpwuid_self(void);
 
+#ifdef GMTIME_UNRELIABLE_ERRORS
+struct tm *git_gmtime(const time_t *);
+struct tm *git_gmtime_r(const time_t *, struct tm *);
+#define gmtime git_gmtime
+#define gmtime_r git_gmtime_r
+#endif
+
 #endif
-- 
1.9.1.656.ge8a0637

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/2] t4212: loosen far-in-future test for AIX

2014-04-01 Thread Jeff King

One of the tests in t4212 checks our behavior when we feed
gmtime a date so far in the future that it gives up and
returns NULL. Some implementations, like AIX, may actually
just provide us a bogus result instead.

It's not worth it for us to come up with heuristics that
guess whether the return value is sensible or not. On good
platforms where gmtime reports the problem to us with NULL,
we will print the epoch value. On bad platforms, we will
print garbage.  But our test should be written for the
lowest common denominator so that it passes everywhere.

Reported-by: Charles Bailey cbaile...@bloomberg.net
Signed-off-by: Jeff King p...@peff.net
---
 t/t4212-log-corrupt.sh | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/t/t4212-log-corrupt.sh b/t/t4212-log-corrupt.sh
index 3fa1715..58b792b 100755
--- a/t/t4212-log-corrupt.sh
+++ b/t/t4212-log-corrupt.sh
@@ -82,11 +82,9 @@ test_expect_success 'date parser recognizes time_t overflow' 
'
 '
 
 # date is within 2^63-1, but enough to choke glibc's gmtime
-test_expect_success 'absurdly far-in-future dates produce sentinel' '
+test_expect_success 'absurdly far-in-future date' '
commit=$(munge_author_date HEAD 99) 
-   echo Thu Jan 1 00:00:00 1970 + expect 
-   git log -1 --format=%ad $commit actual 
-   test_cmp expect actual
+   git log -1 --format=%ad $commit
 '
 
 test_done
-- 
1.9.1.656.ge8a0637
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2alt/2] work around unreliable gmtime errors on AIX

2014-04-01 Thread Jeff King

AIX's gmtime will happily overflow the tm_year field. Let's
catch this error before handing the value to gmtime.

Signed-off-by: Jeff King p...@peff.net
---
This is an alternative to loosening the test in t4212.

It's really not _that_ ugly.  The LL here may not be portable, though.
32-bit systems can't represent this timestamp at all (so they're safe),
but I don't know what would be the best way to conditionally compile
here.

 compat/gmtime.c  | 10 ++
 config.mak.uname |  1 +
 2 files changed, 11 insertions(+)

diff --git a/compat/gmtime.c b/compat/gmtime.c
index 75a5835..f95ba50 100644
--- a/compat/gmtime.c
+++ b/compat/gmtime.c
@@ -12,6 +12,16 @@ struct tm *git_gmtime_r(const time_t *timep, struct tm 
*result)
 {
struct tm *ret;
 
+   /*
+* Some systems, like AIX, will happily overflow the tm_year field.
+* So let's recognize obviously out-of-bound data before it hits gmtime
+* and just mark it as an error. This date is ~316 million years in the
+* future, which is far enough that nobody should care, but close
+* enough for the year to fit into a 32-bit tm_year.
+*/
+   if (*timep  LL)
+   return NULL;
+
ret = gmtime_r(timep, result);
 
/*
diff --git a/config.mak.uname b/config.mak.uname
index 0e22ac0..c1110ad 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -236,6 +236,7 @@ ifeq ($(uname_S),AIX)
INLINE = ''
endif
GIT_TEST_CMP = cmp
+   GMTIME_UNRELIABLE_ERRORS = UnfortunatelyYes
 endif
 ifeq ($(uname_S),GNU)
# GNU/Hurd
-- 
1.9.1.656.ge8a0637

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] add `ignore_missing_links` mode to revwalk

2014-04-01 Thread Jeff King

On Mon, Mar 31, 2014 at 02:48:45PM -0700, Siddharth Agarwal wrote:

 On 03/28/2014 03:00 AM, Jeff King wrote:
 From: Vicent Marti tan...@gmail.com
 
 When pack-objects is computing the reachability bitmap to serve a
 fetch request, it can erroneously die() if some of the UNINTERESTING
 objects are not present. Upload-pack throws away HAVE lines from the
 client for objects we do not have, but we may have a tip object
 without all of its ancestors (e.g., if the tip is no longer reachable
 and was new enough to survive a `git prune`, but some of its
 reachable objects did get pruned).
 
 Thanks for this patch. It looks pretty sensible.
 
 Unfortunately, I can't provide feedback on running it in production
 because we've decided to set aside experimenting with bitmaps for a
 bit. I hope to get back to it in a couple of months.

Bummer. Thanks for taking a look at it.

I do think this patch is definitely fixing a bug, and needs to be
pursued.  We've been running with bitmaps in production on GitHub since
last summer, but have never run into this situation. However, I think it
is largely caused by our pruning parameters:

  1. We tend not to prune very often, and instead keep unreachable
 objects around as a safety mechanism.

  2. When we do prune, we use a very tight cutoff, rather than the
 default 2-week period. So the window of opportunity is much smaller
 for a repo to prune an object but not its descendant (typically
 either we keep both, or they both get pruned).

So if you do come back to it later, the fix should have filtered through
to master by then. :)

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: fast-import deltas

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 07:25:54PM +0900, Mike Hommey wrote:

 I am currently prototyping a native mercurial remote handler for git,

For my own curiosity, how does this differ from what is in
contrib/remote-helpers/git-remote-hg?

 Would adding a fast-import command to handle deltas be considered useful
 for git? If so, what kind of format would be suitable?

It breaks fast-import's lowest common denominator data model that is
just passing commits and their contents over the stream. But we already
do that in other cases for the sake of performance. I think the
important thing is that the alternate formats are optional and enabled
by the caller with command-line options.

That being said, I foresee a few complications:

  1. Git needs to know the sha1 of the full object. So unless the
 generating script knows that ahead of time, git has to expand the
 delta immediately anyway (this could still be a win if we end up
 using a good delta from elsewhere rather than doing our own delta
 search, but I suspect it's not so big a win as if we can just blit
 the delta straight to disk).

  2. Git does not store on-disk deltas between objects that are not in
 the same packfile. So you'd only be able to delta against an object
 that came in the same stream (or you'd have to fix the packfile
 on disk by adding an extra copy of the delta base, but that
 probably eliminates any savings).

As for format, I believe that git is basically xdelta under the hood, so
you'd want either that or something that can be trivially converted to
it.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bug in git-diff output

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 12:49:00PM +0200, rocketscienc01100101 . wrote:

 I tried to get a diff between HEAD and the current version of my
 project, so I did git diff.

That actually diffs between the index and the working tree, but if you
haven't used git add to add any changes, the index content is the same
as HEAD.

 It's a web project with a CSS file that contains the following CSS rule:
 
 a[href^=tel] {
 color:inherit;
 text-decoration:none;
 }
 
 Now, whenever I do git diff, it will always show the a[href^=tel]
 part and mess up the output, even when I didn't change anything near
 that line. The problem is easily reproducable in a newly created
 repository.

I don't understand what you mean here by mess up the output. Can you
show us an example (and tell us what you expected to see)?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: fast-import deltas

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 10:07:03PM +0900, Mike Hommey wrote:

  For my own curiosity, how does this differ from what is in
  contrib/remote-helpers/git-remote-hg?
 
 contrib/remote-helpers/git-remote-hg does a local mercurial clone before
 doing the git conversion. While this is not really a problem for most
 mercurial projects, it tends to be slow with big ones, like the firefox
 source code. What I'm aiming at is something that can talk directly to a
 remote mercurial server.

Ah, that makes sense. Thanks for explaining.

2. Git does not store on-disk deltas between objects that are not in
   the same packfile. So you'd only be able to delta against an object
   that came in the same stream (or you'd have to fix the packfile
   on disk by adding an extra copy of the delta base, but that
   probably eliminates any savings).
 
 Arguably, this would make the most difference on initial clone of big
 projects, or large incremental updates (like, after a few weeks), which
 would use a single pack anyways.

Yeah. The nice thing is that this can be an opportunistic optimization.
If the delta base is part of the same output stream, then send the
delta. Otherwise, you can always fall back to reconstructing and sending
the full object yourself.

 It seems to me fast-import keeps a kind of human readable format for its
 protocol, i wonder if xdelta format would fit the bill. That being said,
 I also wonder if i shouldn't just try to write a pack on my own...

The fast-import commands are human readable, but the blob contents are
included inline. I don't see how sending a binary delta is any worse
than sending a literal binary blob over the stream.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] t4212: handle systems with post-apocalyptic gmtime

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 12:07:22PM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  On Fri, Mar 28, 2014 at 12:30:02PM -0700, Junio C Hamano wrote:
 
  Let's just deal with a simple known cases (like FreeBSD) in the real
  code that everybody exercises at runtime, and have the new test only
  check we do not segfault on a value we used to segfault upon seeing.
 
  OK. Here it is, with the other option as an alt patch for reference.
 
[1/2]: date: recognize bogus FreeBSD gmtime output
[2/2]: t4212: loosen far-in-future test for AIX
[2alt/2]: work around unreliable gmtime errors on AIX
 
  -Peff
 
 Thanks.  2alt does not look too bad, but on the other hand, we are
 replacing a value that can produce the right result on correctly
 implemented gmtime with a completely bogus value only because we
 know there exists one broken implementation---which does not sound a
 very good trade-off, given that we would get a result that does not
 correspond to the input anyway with or without the change on the
 broken implementation.

One other option is to push _all_ callers through a git_gmtime()
wrapper, and then use Makefile knobs to turn on specific quirks, like:

struct tm *git_gmtime_r(const time_t *timep, struct tm *result)
{
#ifdef GMTIME_OVERFLOWS_32BIT
  if (*timep  999)
return NULL;
#endif

  ret = gmtime_r(timep, result);

#ifdef GMTIME_ERROR_BLANK
  if (ret  !ret-tm_mday)
return NULL;
#endif

  return ret;
}

and then each platform can turn the knobs as appropriate.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/22] t3204: test deleting references when lock files already exist

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 05:58:09PM +0200, Michael Haggerty wrote:

 When deleting a reference, it might be that another process already
 holds the lock on the loose reference file and/or the packed-refs
 file.  In those cases, there is no alternative but for the delete to
 fail.  Verify that in such cases the reference values are left
 unchanged.
 
 But in fact, when the packed-refs file lock cannot be acquired, the
 loose reference file is deleted anyway, potentially leaving the
 reference with a changed value (its packed value, which might even
 point at an object that has been garbage collected).  Thus one of the
 new tests is marked test_expect_failure.

Nice find. If I understand correctly, the problem is at the plumbing
level, and this could also be demonstrated with update-ref?

I don't think it is a big deal, but I was thrown for a minute by the use
of git branch (as in, is this something special with branches, or is
this about all refs?).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 04/22] rollback_lock_file(): set fd to -1

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 05:58:12PM +0200, Michael Haggerty wrote:

 When rolling back the lockfile, call close_lock_file() so that the
 lock_file's fd field gets set back to -1.  This could help prevent
 confusion in the face of hypothetical future programming errors.

This also solves a race. We could be in the middle of rollback_lock_file
when we get a signal, and double-close. It's probably not a big deal,
though, since nobody could have opened a new descriptor in the interim
that got the same number (so the second close will just fail silently).

Still, this seems like a definite improvement.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 05/22] lockfile: unlock file if lockfile permissions cannot be adjusted

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 05:58:13PM +0200, Michael Haggerty wrote:

 If the call to adjust_shared_perm() fails, lock_file returns -1, which
 to the caller looks like any other failure to lock the file.  So in
 this case, roll back the lockfile before returning so that the lock
 file is deleted immediately and the lockfile object is left in a
 predictable state that (namely, unlocked).  Previously, the lockfile
 was retained until process cleanup in this situation.

Another nice find. I wondered if we could test this, but I think it
would be hard to trigger. The obvious reason for adjust_shared_perm to
fail is that you do not have permissions on the file, but by definition
you just created it. So I doubt this ever comes up in practice short of
weird races (somebody dropping the x bit from an intermediate
directory between the open() and chmod() or something).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 02/22] try_merge_strategy(): remove redundant lock_file allocation

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 05:58:10PM +0200, Michael Haggerty wrote:

 By the time the if block is entered, the lock_file instance from the
 main function block is no longer in use, so re-use that one instead of
 allocating a second one.
 
 Note that the lock variable in the if block used to shadow the
 lock variable at function scope, so the only change needed is to
 remove the inner definition.

I wonder if this would also be simpler if lock were simply declared as
a static variable, and we drop the allocation entirely. I suppose that
does create more cognitive load, though, in that it is only correct if
the function is not recursive. On the other hand, the current code makes
a reader unfamiliar with struct lock wonder if there is a free(lock)
missing.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 05/22] lockfile: unlock file if lockfile permissions cannot be adjusted

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 04:02:42PM -0400, Jeff King wrote:

 On Tue, Apr 01, 2014 at 05:58:13PM +0200, Michael Haggerty wrote:
 
  If the call to adjust_shared_perm() fails, lock_file returns -1, which
  to the caller looks like any other failure to lock the file.  So in
  this case, roll back the lockfile before returning so that the lock
  file is deleted immediately and the lockfile object is left in a
  predictable state that (namely, unlocked).  Previously, the lockfile
  was retained until process cleanup in this situation.
 
 Another nice find. I wondered if we could test this, but I think it
 would be hard to trigger. The obvious reason for adjust_shared_perm to
 fail is that you do not have permissions on the file, but by definition
 you just created it. So I doubt this ever comes up in practice short of
 weird races (somebody dropping the x bit from an intermediate
 directory between the open() and chmod() or something).

...and I should have read the final sentence in your message more
carefully. Even if we did trigger it, the problem would only last until
the program exits anyway.

I agree that this is a nice cleanup, though; a caller that wants to
retry the lock before exiting would be much less surprised. And the same
logic applies to 06/22.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 07/22] lock_file(): always add lock_file object to lock_file_list

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 05:58:15PM +0200, Michael Haggerty wrote:

 diff --git a/lockfile.c b/lockfile.c
 index e679e4c..c989f6c 100644
 --- a/lockfile.c
 +++ b/lockfile.c
 @@ -130,6 +130,22 @@ static int lock_file(struct lock_file *lk, const char 
 *path, int flags)
*/
   static const size_t max_path_len = sizeof(lk-filename) - 5;
  
 + if (!lock_file_list) {
 + /* One-time initialization */
 + sigchain_push_common(remove_lock_file_on_signal);
 + atexit(remove_lock_file);
 + }
 +
 + lk-owner = getpid();
 + if (!lk-on_list) {
 + /* Initialize *lk and add it to lock_file_list: */
 + lk-fd = -1;
 + lk-on_list = 1;
 + lk-filename[0] = 0;
 + lk-next = lock_file_list;
 + lock_file_list = lk;
 + }

Initializing here is good, since we might be interrupted by a signal at
any time. But what about during the locking procedure? We do:

strcpy(lk-filename, path);
if (!(flags  LOCK_NODEREF))
resolve_symlink(lk-filename, max_path_len);
strcat(lk-filename, .lock);

So for a moment, lk-filename contains the name of the valuable file we
are locking.  If we get a signal at that moment, do we accidentally
delete it in remove_lock_file?

I think the answer is no, because we check lk-owner before deleting,
which will not match our pid (it should generally be zero due to xcalloc
or static initialization, though perhaps we should clear it here).

But that makes me wonder about the case of a reused lock. It will have
lk-owner set from a previous invocation, and would potentially suffer
from this problem. In other words, I think the change you are
introducing does not have the problem, but the existing code does. :-/

I didn't reproduce it experimentally, though.  We should be able to just

lk-owner = 0;

before the initial strcpy to fix it, I would think.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 09/22] api-lockfile: expand the documentation

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 05:58:17PM +0200, Michael Haggerty wrote:

 +unable_to_lock_error::
 +
 + Emit an error describing that there was an error locking the
 + specified path.  The err parameter should be the errno of the
 + problem that caused the failure.
 +
 +unable_to_lock_index_die::
 +
 + Like `unable_to_lock_error()`, but also `die()`.

Should this last one lost the index in its name? I think it is
vestigial at this point.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 12/22] delete_ref_loose(): don't muck around in the lock_file's filename

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 05:58:20PM +0200, Michael Haggerty wrote:

 It's bad manners.  Especially since, if unlink_or_warn() failed, the
 memory wasn't restored to its original contents.
 
 So make our own copy to work with.

Sounds good...

   if (!(flag  REF_ISPACKED) || flag  REF_ISSYMREF) {
 - /* loose */
 - int err, i = strlen(lock-lk-filename) - 5; /* .lock */
 -
 - lock-lk-filename[i] = 0;
 - err = unlink_or_warn(lock-lk-filename);
 - lock-lk-filename[i] = '.';
 + /*
 +  * loose.  The loose file name is the same as the
 +  * lockfile name, minus .lock:
 +  */
 + char *loose_filename = xmemdupz(lock-lk-filename,
 + strlen(lock-lk-filename) - 5);
 + int err = unlink_or_warn(loose_filename);
 + free(loose_filename);

Should we be using LOCK_SUFFIX_LEN from the previous commit here?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 14/22] lockfile: use strbufs when handling (most) paths

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 05:58:22PM +0200, Michael Haggerty wrote:

  /*
 - * p = path that may be a symlink
 - * s = full size of p
 - *
 - * If p is a symlink, attempt to overwrite p with a path to the real
 - * file or directory (which may or may not exist), following a chain of
 - * symlinks if necessary.  Otherwise, leave p unmodified.
 + * path contains a path that may be a symlink
   *
 - * This is a best-effort routine.  If an error occurs, p will either be
 - * left unmodified or will name a different symlink in a symlink chain
 - * that started with p's initial contents.
 + * If path is a symlink, attempt to overwrite it with a path to the
 + * real file or directory (which may or may not exist), following a
 + * chain of symlinks if necessary.  Otherwise, leave path unmodified.
   *
 - * Always returns p.
 + * This is a best-effort routine.  If an error occurs, path will
 + * either be left unmodified or will name a different symlink in a
 + * symlink chain that started with path's initial contents.
   */
 -
 -static char *resolve_symlink(char *p, size_t s)
 +static void resolve_symlink(struct strbuf *path)
 [...]

This is not a problem you are introducing, but do we really want
resolve_symlink to be best-effort here?

What happens if a is a symlink to b, and two processes try to lock
a simultaneously? If one succeeds, it will take b.lock. But the
other will take a.lock, and both will think they have the target
locked.

I suspect we need to recognize ENOENT for cases where we are creating
the file for the first time, but it seems like we should bail on locking
from any other transient errors.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 22/22] lockfile: allow new file contents to be written while retaining lock

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 05:58:30PM +0200, Michael Haggerty wrote:

 Add a new option flag, LOCK_SEPARATE_STAGING_FILE, that can be passed
 to hold_lock_file_for_update() or hold_lock_file_for_append() to use a
 staging file that is independent of the lock file.
 
 Add a new function activate_staging_file() that activates the contents
 that have been written to the staging file without releasing the lock.
 
 This functionality can be used to ensure that changes to two files are
 seen by other processes in one order even if correctness requires the
 locks to be released in another order.

Can you give an example of when this is useful? I'm guessing the
application is for writing out packed-refs before pruning loose refs in
git-pack-refs?

It seems like this makes the API much more confusing.  If I understand
correctly, this is basically allowing us to take a lock, write to
_another_ tmpfile that is not the lock, then rename the tmpfile into
place without releasing the lock (and then we can drop the lock at our
convenience).

I wonder if it would be simpler to build an API for that around the
lock_file API, rather than as part of it. Or am I misunderstanding
what's going on?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 00/22] Lockfile refactoring and pre-activation

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 05:58:08PM +0200, Michael Haggerty wrote:

 I've had this patch series kicking around for a long time, along with
 some followup patches to fix a race in reference deletion.  I haven't
 had the time to get everything done and tested, but let me at least
 push this first series out there.  I especially want to submit it in
 case we accept a GSoC student for the project Refactor tempfile
 handling, because (1) I don't want me and the student to be stepping
 on each others' toes, and (2) there are some cleanups and
 documentation improvements here that will hopefully be useful to the
 student.

Thanks, this sort of preparation for GSoC students is very much welcome.

 The first patch actually demonstrates the race condition that I hope
 to fix someday.  The last patch introduces the lockfile feature that I
 think is needed to fix it: the ability to activate a packed-refs file
 while still holding the lock to prevent another process from
 overwriting it before the accompanying loose reference updates are all
 finished.  But the fix itself is not included here, so you might want
 to keep the last patch on hold until there is a concrete user of the
 feature.

I should have read this more carefully when I responded to the final
patch. I surmised your intent based on our previous work on packed-refs,
but here you spell out my guesses explicitly. :)

I think all of the patches look good. I'd prefer to hold back the final
one (and probably 19/22, which has no purpose except to prepare for
22/22) until seeing its application in practice.

Thanks for a very pleasant read.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] date: recognize bogus FreeBSD gmtime output

2014-04-01 Thread Jeff King

On Tue, Apr 01, 2014 at 11:17:14PM +0200, René Scharfe wrote:

 So are you saying we should set EOVERFLOW ourselves, or does FreeBSD
 set EOVERFLOW for us in this case and we do not have to worry?
 
 If we correct the return value then we should correct errno as well.
 gmtime() on FreeBSD 10 leaves errno untouched when it encounters an
 overflow.
 
 While testing this again I just noticed that FreeBSD doesn't strictly return
 a pointer to a cleared struct tm.  It simply leaves its static buffer
 untouched.  We should probably clear it (memset in git_gmtime_r).

Thanks, that all makes sense (we do not ever care about gmtime's errno
in our code, but it does not hurt to be thorough to avoid surprising any
new callers). Here's a replacement for patch 1.

-- 8 --
Subject: date: recognize bogus FreeBSD gmtime output

Most gmtime implementations return a NULL value when they
encounter an error (and this behavior is specified by ANSI C
and POSIX).  FreeBSD's implementation, however, will simply
leave the struct tm untouched.  Let's also recognize this
and convert it to a NULL (with this patch, t4212 should pass
on FreeBSD).

Reported-by: René Scharfe l@web.de
Signed-off-by: Jeff King p...@peff.net
---
 Makefile  |  8 
 compat/gmtime.c   | 29 +
 config.mak.uname  |  1 +
 git-compat-util.h |  7 +++
 4 files changed, 45 insertions(+)
 create mode 100644 compat/gmtime.c

diff --git a/Makefile b/Makefile
index 3646391..2f3758c 100644
--- a/Makefile
+++ b/Makefile
@@ -338,6 +338,9 @@ all::
 # Define TEST_GIT_INDEX_VERSION to 2, 3 or 4 to run the test suite
 # with a different indexfile format version.  If it isn't set the index
 # file format used is index-v[23].
+#
+# Define GMTIME_UNRELIABLE_ERRORS if your gmtime() function does not
+# return NULL when it receives a bogus time_t.
 
 GIT-VERSION-FILE: FORCE
@$(SHELL_PATH) ./GIT-VERSION-GEN
@@ -1489,6 +1492,11 @@ ifneq (,$(XDL_FAST_HASH))
BASIC_CFLAGS += -DXDL_FAST_HASH
 endif
 
+ifdef GMTIME_UNRELIABLE_ERRORS
+   COMPAT_OBJS += compat/gmtime.o
+   BASIC_CFLAGS += -DGMTIME_UNRELIABLE_ERRORS
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
diff --git a/compat/gmtime.c b/compat/gmtime.c
new file mode 100644
index 000..e8362dd
--- /dev/null
+++ b/compat/gmtime.c
@@ -0,0 +1,29 @@
+#include ../git-compat-util.h
+#undef gmtime
+#undef gmtime_r
+
+struct tm *git_gmtime(const time_t *timep)
+{
+   static struct tm result;
+   return git_gmtime_r(timep, result);
+}
+
+struct tm *git_gmtime_r(const time_t *timep, struct tm *result)
+{
+   struct tm *ret;
+
+   memset(result, 0, sizeof(*result));
+   ret = gmtime_r(timep, result);
+
+   /*
+* Rather than NULL, FreeBSD gmtime simply leaves the struct tm
+* untouched when it encounters overflow. Since mday cannot otherwise
+* be zero, we can test this very quickly.
+*/
+   if (ret  !ret-tm_mday) {
+   ret = NULL;
+   errno = EOVERFLOW;
+   }
+
+   return ret;
+}
diff --git a/config.mak.uname b/config.mak.uname
index 6069a44..0e22ac0 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -187,6 +187,7 @@ ifeq ($(uname_S),FreeBSD)
endif
PYTHON_PATH = /usr/local/bin/python
HAVE_PATHS_H = YesPlease
+   GMTIME_UNRELIABLE_ERRORS = UnfortunatelyYes
 endif
 ifeq ($(uname_S),OpenBSD)
NO_STRCASESTR = YesPlease
diff --git a/git-compat-util.h b/git-compat-util.h
index 892032b..5191866 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -716,4 +716,11 @@ void warn_on_inaccessible(const char *path);
 /* Get the passwd entry for the UID of the current process. */
 struct passwd *xgetpwuid_self(void);
 
+#ifdef GMTIME_UNRELIABLE_ERRORS
+struct tm *git_gmtime(const time_t *);
+struct tm *git_gmtime_r(const time_t *, struct tm *);
+#define gmtime git_gmtime
+#define gmtime_r git_gmtime_r
+#endif
+
 #endif
-- 
1.9.1.656.ge8a0637

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] pack-objects: do not reuse packfiles without --delta-base-offset

2014-04-02 Thread Jeff King

When we are sending a packfile to a remote, we currently try
to reuse a whole chunk of packfile without bothering to look
at the individual objects. This can make things like initial
clones much lighter on the server, as we can just dump the
packfile bytes.

However, it's possible that the other side cannot read our
packfile verbatim. For example, we may have objects stored
as OFS_DELTA, but the client is an antique version of git
that only understands REF_DELTA. We negotiate this
capability over the fetch protocol. A normal pack-objects
run will convert OFS_DELTA into REF_DELTA on the fly, but
the reuse pack code path never even looks at the objects.

This patch disables packfile reuse if the other side is
missing any capabilities that we might have used in the
on-disk pack. Right now the only one is OFS_DELTA, but we
may need to expand in the future (e.g., if packv4 introduces
new object types).

We could be more thorough and only disable reuse in this
case when we actually have an OFS_DELTA to send, but:

  1. We almost always will have one, since we prefer
 OFS_DELTA to REF_DELTA when possible. So this case
 would almost never come up.

  2. Looking through the objects defeats the purpose of the
 optimization, which is to do as little work as possible
 to get the bytes to the remote.

Signed-off-by: Jeff King p...@peff.net
---
I happened to be fooling around with git v1.4.0 today, and noticed a
problem fetching from GitHub. Pre-OFS_DELTA git versions are ancient by
today's standard, but it's quite easy to remain compatible here, so I
don't see why not. And in theory, alternate implementations might not
understand OFS_DELTA, though in practice I would consider such an
implementation to be pretty crappy.

 builtin/pack-objects.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7950c43..1503632 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2439,12 +2439,23 @@ static void loosen_unused_packed_objects(struct 
rev_info *revs)
}
 }
 
+/*
+ * This tracks any options which a reader of the pack might
+ * not understand, and which would therefore prevent blind reuse
+ * of what we have on disk.
+ */
+static int pack_options_allow_reuse(void)
+{
+   return allow_ofs_delta;
+}
+
 static int get_object_list_from_bitmap(struct rev_info *revs)
 {
if (prepare_bitmap_walk(revs)  0)
return -1;
 
-   if (!reuse_partial_packfile_from_bitmap(
+   if (pack_options_allow_reuse() 
+   !reuse_partial_packfile_from_bitmap(
reuse_packfile,
reuse_packfile_objects,
reuse_packfile_offset)) {
-- 
1.9.1.656.ge8a0637
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 12/22] delete_ref_loose(): don't muck around in the lock_file's filename

2014-04-02 Thread Jeff King

On Wed, Apr 02, 2014 at 08:52:17AM +0200, Torsten Bögershausen wrote:

 + /*
 +  * loose.  The loose file name is the same as the
 +  * lockfile name, minus .lock:
 +  */
 + char *loose_filename = xmemdupz(lock-lk-filename,
 + strlen(lock-lk-filename) - 5);
 
 At other places (lockfile.c) we use this
 
 +#define LOCK_SUFFIX_LEN 5
 
 I think it makes sense to move this definition to an include file (lockfile.h 
 ??)
 and use it here.

I had the same comment, but if you read through to patch 18, this manual
munging goes away entirely, and the end result is much cleaner.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] pack-objects: do not reuse packfiles without --delta-base-offset

2014-04-04 Thread Jeff King

On Wed, Apr 02, 2014 at 10:39:13AM -0700, Junio C Hamano wrote:

  However, it's possible that the other side cannot read our
  packfile verbatim. For example, we may have objects stored
  as OFS_DELTA, but the client is an antique version of git
  that only understands REF_DELTA. We negotiate this
  capability over the fetch protocol. A normal pack-objects
  run will convert OFS_DELTA into REF_DELTA on the fly, but
  the reuse pack code path never even looks at the objects.
 
 The above makes it sound like reuse pack codepath is broken.

It is broken (without this patch), though in practice only for ancient
(pre-1.4.x) clients.

 Is it too much hassle to peek at the initial bytes of each object to
 see how they are encoded? Would it be possible to convert OFS_DELTA to
 REF_DELTA on the fly on that codepath as well, instead of disabling
 the reuse altogether?

It's a mistake to peek ahead of time. Part of the point of the
pack-reuse optimization is to start sending out bytes as soon as
possible, since the network is quite often the bottleneck. So we would
not want to look through all of the to-be-sent data before sending out
the first byte.

We could convert OFS_DELTA to REF_DELTA on the fly. That _may_ have a
performance impact. Right now, we are basically doing the equivalent of
sendfile(), and conversion would involve iterating through each object
and examining the header.  I think that's probably not too bad, though.
The most expensive part of that, stepping to the next object, requires
scanning through the zlib packets, but we should be able to use the
revidx to avoid that.

I'm not sure it's even worth the code complexity, though. The non-reuse
codepath is not that much slower, and it should be extremely rare for a
client not to support OFS_DELTA these days.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] pack-objects: do not reuse packfiles without --delta-base-offset

2014-04-04 Thread Jeff King

On Fri, Apr 04, 2014 at 03:28:48PM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  We could convert OFS_DELTA to REF_DELTA on the fly. That _may_ have a
  performance impact. Right now, we are basically doing the equivalent of
  sendfile(), and conversion would involve iterating through each object
  and examining the header.  I think that's probably not too bad, though.
  The most expensive part of that, stepping to the next object, requires
  scanning through the zlib packets, but we should be able to use the
  revidx to avoid that.
 
  I'm not sure it's even worth the code complexity, though. The non-reuse
  codepath is not that much slower, and it should be extremely rare for a
  client not to support OFS_DELTA these days.
 
 OK, together with the fact that only ancient versions of fetcher
 would trigger this do not reuse codepath, I agree that we should
 go the simplest route this patch takes.

By the way, we may want to revisit this if we grow more features that do
not allow straight byte-for-byte reuse. I am thinking specifically if we
grow a packv4-like representation for an object, and we plan to convert
on-the-fly to existing packv2 clients. But I think the sensible steps
for that are:

  1. If we have v4 on disk and are outputting v2, add this case to the
 can_reuse function I just added. I.e., start out correct, and
 turn off the optimization.

  2. Experiment with on-the-fly conversion. It may be that the
 conversion is so expensive that the reuse optimization gets lost in
 the noise. Or maybe we can reclaim most of the advantage of the
 reuse code path, and it is worth going object-by-object and
 converting. But we won't know until we can measure.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 07/22] lock_file(): always add lock_file object to lock_file_list

2014-04-07 Thread Jeff King

On Sun, Apr 06, 2014 at 11:54:59PM +0200, Michael Haggerty wrote:

  I didn't reproduce it experimentally, though.  We should be able to just
  
  lk-owner = 0;
  
  before the initial strcpy to fix it, I would think.
 
 I think that using the owner field to avoid this problem is a bit
 indirect, so I will soon submit a fix that involves adding a flag to
 lock_file objects indicating whether the filename field currently
 contains the name of a file that needs to be deleted.

Yeah, I agree that the current code is a bit subtle. I was planning to
address this during the tempfile cleanup project (either in GSoC, if it
gets a slot, or just doing it myself). But I don't mind if you want to
do something in the interim.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 01/25] api-lockfile: expand the documentation

2014-04-07 Thread Jeff King

On Mon, Apr 07, 2014 at 01:33:43AM +0200, Michael Haggerty wrote:

 +unable_to_lock_error::
 +
 + Emit an error describing that there was an error locking the
 + specified path.  The err parameter should be the errno of the
 + problem that caused the failure.
 +
 +unable_to_lock_die::
 +
 + Like `unable_to_lock_error()`, but also `die()`.

The die() function is still called unable_to_lock_index_die() at this
point in the series.  Presumably you change it later. I don't know if it
is worth caring about the order or not; it's a doc change, so it's not
like it breaks bisectability.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 02/25] unable_to_lock_die(): rename function from unable_to_lock_index_die()

2014-04-07 Thread Jeff King

On Mon, Apr 07, 2014 at 01:33:44AM +0200, Michael Haggerty wrote:

 This function is used for other things besides the index, so rename it
 accordingly.

Oh, and here it is. I should really have just read ahead before
responding to patch 1.

We can either re-order these first two, or just not worry about it.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 17/25] commit_lock_file(): make committing an unlocked lockfile a NOP

2014-04-07 Thread Jeff King

On Mon, Apr 07, 2014 at 01:33:59AM +0200, Michael Haggerty wrote:

 It was previously a bug to call commit_lock_file() with a lock_file
 object that was not active (an illegal access would happen within the
 function).  It was presumably never done, but this would be an easy
 programming error to overlook.  So guard the file-renaming code with
 an if statement to change committing an unlocked file into a NOP.
 
 Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
 ---
 Alternatively, we could make it a fatal error (e.g., an assert() or
 if...die()) to call commit_lock_file() on an unlocked file, or omit a
 warning in this case.  But since it is so hard to test code related to
 locking failures, I have the feeling that such an error is most likely
 to occur in some error-handling path, maybe when some other lockfile
 acquisition has failed, and it would be better to let the code
 continue its attempted cleanup instead of dying.  But it would be easy
 to persuade me to change my opinion.

Yeah, I would have expected a die(BUG) here.

I think it is worth making it a fatal mistake and catching it. Rolling
back an uninitialized lockfile is probably OK; we are canceling an
operation that never started. But committing a lockfile that we didn't
actually fill out could be a sign of a serious error, and we may be
propagating a bogus success code. E.g., imagine that receive-pack claims
to have written your ref, but actually commit_lock_file was a silent
NOP. I'd much rather have it die loudly so we can track down the case.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 18/25] lockfile: avoid transitory invalid states

2014-04-07 Thread Jeff King

On Mon, Apr 07, 2014 at 03:12:49PM +0200, Michael Haggerty wrote:

  How far *do* you want to go? I'm certainly not opposed to field-test your
  current changeset (plus and adjustment to use sig_atomic_t) -- overall it
  is an improvement. And then we will see how it works.
 
 For now I think I'd just like to get the biggest problems fixed without
 making anything worse.  Given that there might be a GSoC student working
 in this neighborhood, he/she might be able to take up the baton.
 
 I changed the patch series to use a new volatile sig_atomic_t active
 field rather than a bit in a flags field.

That seems like a good place to stop for now.

If any code touches the fields, it can unset the active flag (even
temporarily), and restore it when the structure is in a known state.

I'm not sure if we can ever reach full safety. If you have an event
loop, the sane thing to do is set an atomic flag in your signal handler
and then handle it on the next iteration of the loop. But all of our
signal handlers are jumped to from arbitrary code, and are just going to
die(). There's nothing to return to, so any useful work we do has to be
done from the handler.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 00/25] Lockfile correctness and refactoring

2014-04-07 Thread Jeff King

On Mon, Apr 07, 2014 at 01:33:42AM +0200, Michael Haggerty wrote:

 This is a second attempt at renovating the lock file code.  Thanks to
 Peff, Junio, Torsten, and Eric for their helpful reviews of v1.
 
 v1 of this patch series [1] did some refactoring and then added a new
 feature to the lock_file API: the ability to activate a new version of
 a locked file while retaining the lock.
 
 But the review of v1 turned up even more correctness issues in the
 existing implementation of lock files.  So this v2 dials back the
 scope of the changes (it omits the new feature) but does more work to
 fix problems with the current lock file implementation.
 
 The main theme of this patch series is to better define the state
 diagram for lock_file objects and to fix code that left them in
 incorrect, indeterminate, or unexpected states.  There are also a few
 patches that convert several functions to use strbufs instead of
 limiting pathnames to a maximum length.

Looks OK to me, modulo the few comments I sent.

I still think resolve_symref should probably not be best-effort, but
that can come later on top.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] describe: rewrite name_rev() iteratively

2014-04-08 Thread Jeff King

On Mon, Apr 07, 2014 at 01:47:14AM +0300, Dragos Foianu wrote:

 The git describe --contains command uses the name_rev() function which
 is currently a recursive function. This causes a Stack Overflow when the
 history is large enough.
 
 Rewrite name_rev iteratively using a stack on the heap. This slightly
 reduces performance due to the extra operations on the heap, but the
 function no longer overflows the stack.

You can avoid the heap overhead by using an array for your stack, and
only resizing it when necessary. Like this:

struct rev_stack {
int nr, alloc;
struct rev_data *data;
};

static struct rev_data *rev_stack_push(struct rev_stack *stack)
{
ALLOC_GROW(stack-data, stack-nr + 1, stack-alloc);
return stack-data[stack-nr++];
}

static void rev_stack_pop(struct rev_stack *stack)
{
stack-nr--;
}

static void rev_stack_init(struct rev_stack *stack)
{
stack-nr = stack-alloc = 0;
stack-data = NULL;
}

static void rev_stack_release(struct rev_stack *stack)
{
free(stack-data);
rev_stack_init(stack);
}

Usage would be something like:

struct rev_data *data = rev_stack_push(stack);
data-commit = commit;
data-tip_name = tip_name;
...

IOW, you push first to allocate the space, and then do your
make_rev_data, rather than the other way around.

The downside is that your allocation is always as big as the deepest
recursion so far, so you hold on to the memory a little longer than
necessary. I think that's a good tradeoff versus an extra malloc() for
every commit.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Changes based on a branch not yet merged back to master

2014-04-08 Thread Jeff King

On Tue, Apr 08, 2014 at 11:22:44AM -0700, K. Kwan wrote:

 My tree would look like this:
 
 - x - x - x (master) 
 
    \ x - x (A) x - x (B)
 
 
 But after merge of branch (A), I would like it to look like this:
 
 - x - x - x  x (master)
    \/ \
 x - x (A)  x - x (B)
 
 
 Perhaps this?
 
 $ git checkout master
 $ git pull origin master
 $ git merge A
 $ git rebase --onto master A B

Yes, that should work. I think you could even just do:

  git rebase master B

since after the merge, all of A is contained in master (and so will be
omitted from the rebased commits).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 6/9] branch: display publish branch

2014-04-11 Thread Jeff King

On Thu, Apr 10, 2014 at 05:36:59PM -0500, Felipe Contreras wrote:

  I noticed that this only picks up a publish-branch if
  branch.*.pushremote is configured. What happened to the case when
  remote.pushdefault is configured?
 
 What happens when branch.*.remote is not configured for @{upstream}? The same
 thing.

I don't know if that is a good comparison.

In other threads, the discussed meaning of @{publish} was something like
the tracking branch of the ref you would push to if you ran 'git push'
without arguments.

That is consistent with @{upstream} being the tracking branch of the
ref you would pull from with 'git pull'. But git pull without a
branch.*.remote will do nothing, so what pull would do is the same as
what you have configured in your branch.*.remote.

Whereas git push does not depend on having branch.*.pushremote
configured. Its behavior is based on push.default and push refspecs, so
what push would do must take that into account.

 It might be useful to visualize what would be the name of the branch when
 pushing it (without a refspec) even if the publish branch hasn't been
 configured, but I think the code would be much more coplicated, and it would
 break symetry with @{upstream}, besides, the user can just do 'git push -p
 branch', and from that moment on it will be visible.

It is more complicated (see the patches that Junio had at
jk/branch-at-publish), but I think it is more likely to do what the user
expects.

For instance, it looks like your @{publish} requires config like:

  [branch master]
  pushremote = foo
  push = refs/heads/bar

to operate. Setting pushremote affects what git push does; it will
go to the foo remote. But the branch.master.push setting does not do
anything to git push. Only a push refspec (or push.default setting)
will change that. So the branch.*.push must be kept in sync manually
(perhaps by running git push -p).

Whereas if @{publish} means where you would push to, then
branch.*.push does not need to exist at all. The values can be taken
automatically from the other push settings.

-Peff

PS I first tried just setting branch.master.pushremote without setting
   branch.master.push. This results in a segfault, as branch_get()
   assumes that push_name is always set and tries to xstrdup() it.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Our official home page and logo for the Git project

2014-04-11 Thread Jeff King

On Thu, Apr 10, 2014 at 10:24:24AM +1000, Andrew Ardill wrote:

 It's normal for an organisation to have a collection of logos to
 choose from, with one 'official' version. For example, a black and
 white version is useful for print. Similarly, it's useful to have a
 couple of different contrast level/colours that can be used in the
 appropriate situations.

There are a few options at

  http://git-scm.com/downloads/logos

for matching the logo to the background.

 There is nothing wrong with having alternates that have been approved
 for various situations.

I'm not sure if this is how you meant it, but I want to emphasize that
there is no approval necessary for using alternate logos. Saying
let's recognize this one as an official logo is not meant to shut down
the use of others. It is only meant to say when people ask for an
official logo (e.g., GSoC does so), this one is a good answer.

That is not to say that proliferation of logos is a good idea either.
The point of a logo is recognizability, and if there are dozens of git
logos, chances are that most of them are not recognizable.

 I recommend creating a git repository called git-resources,
 git-marketing, or git-assets, to contain the various approved logos.
 If there is not another location, or a more appropriate one,
 https://github.com/git would be a good place to put this.

I think the logo page above is a good start for variations of that
particular logo. I'd prefer not to put other random logos there unless
they also get wide enough use that they are recognized by the project.
But I have no objection to a repository of random logos.

The git-scm.com page is mostly targeted at end users: what is it, how do
I get it, where is the documentation. Things like a logo repository, or
developer information is spread across various wikis and other sites.
If there's interest, we can make dev.git-scm.com for such things, or
host repositories under http://github.com/git. But we would first need
content to put there, and somebody would need to step forward to
organize and maintain that content.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test: fix t7001 cp to use POSIX options

2014-04-11 Thread Jeff King

On Fri, Apr 11, 2014 at 01:24:02AM -0700, Kyle J. McKay wrote:

 Since 11502468 and 04c1ee57 (both first appearing in v1.8.5), the
 t7001-mv test has used cp -a to perform a copy in several of the
 tests.
 
 However, the -a option is not required for a POSIX cp utility and
 some platforms' cp utilities do not support it.
 
 The POSIX equivalent of -a is -R -P -p.
 
 Change cp -a to cp -R -P -p so that the t7001-mv test works
 on systems with a cp utility that only implements the POSIX
 required set of options and not the -a option.

I wonder if the -R is the part that we actually care about here.
Including the others does not hurt in that case, but using only -R
would perhaps make it more obvious to a later reader of the code exactly
what we are trying to do.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 6/9] branch: display publish branch

2014-04-12 Thread Jeff King

On Fri, Apr 11, 2014 at 08:48:01AM -0500, Felipe Contreras wrote:

 I think of @{publish} as the branch the user has configured to push
 to; it overrides all other configurations (push.default and push
 refspecs). I wouldn't mind having a @{push} *in addition* to @{publish}
 that would have the behavior you mention, but for @{publish} I'm pretty
 certain the behavior I want is that it maps *directly* to what the user
 has configured.

I guess I don't understand why we need branch.*.push when we already
have push refspecs that do the same thing (and are more powerful, as
they can glob).

The behavior you describe is not useful to me, as it would mean having
to manage branch.*.push as a duplicate of the information that is
already found in remote.$(git config remote.pushdefault).push.  I do
not mind if the two co-exist, but please do not take the short @{p} as
your series does (not only because of @{push}, but also because @{pull}
has been mentioned elsewhere as a possible other name for @{upstream}).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 6/9] branch: display publish branch

2014-04-12 Thread Jeff King

On Fri, Apr 11, 2014 at 12:24:35PM -0700, Junio C Hamano wrote:

  But the branch.master.push setting does not do
  anything to git push.
 
 I am not sure I understand this.  I thought that the desire behind
 the branch.*.push is to allow something like:
 
   ... other things in the config ...
   [remote]
   pushdefault = foo
   [remote foo]
   url = ...
   push = +refs/heads/*:refs/remotes/satellite/*
 fetch = +refs/heads/*:refs/remotes/foo/*
   [branch master]
   ; pushremote = foo
   push = refs/heads/bar
 
 so that git push on 'master' will override the more generic all
 local branches here will go to their remote-tracking hierarchy for
 this satellite refspec.  And in that sense branch.master.push would
 do something to git push.

Ah, I see. If I set push.default to upstream, then the config I
showed before _does_ affect git push. But I do not usually do that. I
have push.default set to current, and sometimes override it using push
refspecs on certain repositories.

And that is why I find branch.*.push and Felipe's @{publish} useless for
my workflow. Pushes already go where I want them to, and I just want a
way to ask git to perform that config resolution for me. Whereas
Felipe's workflow is (I think) something like:

  # make a new branch...
  git checkout -b topic origin/master

  # now publish our branch, and remember our publishing point
  git push -p my-repo topic

  # and now further pushes automatically go to my-repo/topic
  git push

I can see there is some value in that override if you do things like:

  git push -p my-repo topic:some-other-name

because the -p means remember this other name I gave.

I would think in such a workflow that most of your branches would end up
with publish config, though. And therefore @{publish} would become
equivalent to where you would push. But Felipe indicated that he would
not want branch -vv to match where all branches would be pushed, but
rather only those that were specifically configured. So maybe I do not
understand his workflow after all.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Our official home page and logo for the Git project

2014-04-12 Thread Jeff King

On Fri, Apr 11, 2014 at 12:25:17PM -0700, Junio C Hamano wrote:

 The mention of dev.git-scm.com gives me a mixed feeling.  The
 chasm between the developer community and casual end-users who know
 about Git primarily via their perusal of git-scm.com is one of the
 root causes of this confusion.

I do not think you can get rid of that split, though. Different people
want different content from a site. Somebody who wants to download and
run git does not care about our Summer of Code ideas page. Somebody who
wants to get a logo does not care about seeing an in-progress logo
contest, or discussion on which logos people are working on.

Historically most of the dev information has been on the mailing list.
But sometimes it is more helpful to have a web page showing the current
state of some content (e.g., the list of SoC ideas) and just
periodically update it, rather than having each reader assemble the
current state from whatever has been posted to the list.  We have used
the kernel.org wiki for this in the past. What I was suggesting is that
those things could fall under the name dev.git-scm.com (which could
even just point to the k.org wiki, or some other wiki, or a site to
which many devs had push access).

The wiki has _also_ been used for user-facing content. E.g., the list of
tools that build on git. That kind of content would make sense to me on
git-scm.com, and perhaps it could be ported there to give it better
exposure.

 The one on the left-top corner was one of the alternatives that
 received favorable reactions from multiple people (I am not sure if
 there was a clear majority though) submitted when we briefly had a
 poll to come up with an updated logo.

Do you have a link to the poll or its results? I could not find one in
the list archive. Not that it necessarily matters to the current
discussion, but I was interested for historical curiosity.

I have also seen that logo receive unfavorable reactions from people,
but my recollection is probably biased because I was one of those
people. :)

 In any case, this motion is not about let's declare the logo we see
 on git-scm.com today as _the_ official one.  It is not about that
 logo on git-scm.com sucks; let's come up with a better one.  People
 are welcome to do that discussion elsewhere, and I do not mind a
 repository of contestants created somewhere, but personally I think
 the project is too mature for that and it is too late, even though
 the bleeding-red fork logo may not be my favorite.

Thanks, this is what I was trying to say in my earlier message.

 The motion is about this:
 
 Outside people, like the party who approached us about putting
 our logo on their trinket, seem to associate that logo we see on
 git-scm.com today with our project, but we never officially said
 it was our logo (we did not endorse that git-scm.com is our
 official home page, either, for that matter).
 
 It is silly for us to have to say Ehh, that is a logo that was
 randomly done and slapped on git-scm.com which is not even our
 official home page, and the logo is licensed CC-BY by somebody
 else.  Go talk to them., every time such a request comes.
 
 Please help us by letting us answer Yup, that is a logo (among
 others) that represents our project, and we are OK with you
 using it to help promote our project instead.
 
 That is what I meant by our official logo in the first message.
 
 So,... seconds?

I do not know if I count, as I am listed as one of the proposers in your
original message. But yes, I agree with this.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Our official home page and logo for the Git project

2014-04-12 Thread Jeff King

On Fri, Apr 11, 2014 at 08:24:48AM -0500, Felipe Contreras wrote:

 I would actually like you (everyone) to be honest and answer this
 question;
 
 Have you actually analized the logo? Or are you just arguing against
 change, because the logo is already used by git-scm.com, and related
 stuff?

Is this rhetorical? If not...

Yes, I really thought about the logo and like it.

Many of your complaints are about how git concepts map onto the logo
(for instance, the direction of the graph nodes).  That is _one_ way of
evaluating the logo.

But there are other criteria, as well. For example, is the logo pleasing
to the eye? Is it memorable and recognizable? Things like pleasing are
subjective, but there are patterns across humanity. Graphic artists have
studied this for some time and have guidelines for layouts, contrast,
balance, proportionality, etc.

For example, in the git-fc logo you mentioned, you rotated the logo from
git-scm.com. I find it less visually pleasing than the original. It
seems somehow more wobbly to me with the two branches sticking up.
Now, that is my completely subjective opinion. I do not know very much
about graphic design, and whether guidelines could help there, nor did I
conduct any empirical research. So maybe it is just me, or maybe one
design is universally more pleasing than the other.

But I think that visual art considerations should be at least as
important in a logo as whether the logo pedantically matches the tool's
output.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Clear an invalid password out of the credential-cache?

2014-04-14 Thread Jeff King

On Sat, Apr 12, 2014 at 09:12:57AM -0400, Jason Pyeron wrote:

 Is it me or is the only way to clear a single invalid password out of the
 credential-cache is by git credential-cache exit?

Try the reject action:

  $ : remember a credential
  $ url() { echo url=https://example.com; }
  $ (url; echo username=me; echo password=foo) | git credential-cache store

  $ : get it again
  $ url | git credential-cache get
  username=me
  password=foo

  $ : forget it
  $ url | git credential-cache erase

  $ : now produces nothing
  $ url | git credential-cache get

Git should do this for you automatically if it tries to use the password
and gets rejected (but only if the rejection is a password rejection,
like an HTTP 401).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 6/9] branch: display publish branch

2014-04-14 Thread Jeff King

On Sat, Apr 12, 2014 at 10:05:15AM -0500, Felipe Contreras wrote:

 As you can see; some branches are published, others are not. The ones that are
 not published don't have a @{publish}, and `git branch -v` doesn't show them.
 Why is that hard to understand?

Do you ever push the unpublished branches anywhere at all? If not, then
you would not have a tracking branch. E.g., git _would_ push to remote
gh, branch refs/heads/topic, but there is no remote tracking branch
refs/remotes/gh/topic, because you have never actually pushed there.
So there is no @{publish} branch.

Or do you have some branches in a state where they are pushed, but not
published? It wasn't clear to me from your example if your pu or
dev/remote/hg-extra ever get pushed.

I do not use git branch -v myself, so I don't personally care that
much how it behaves. But I do use a separate script that does the same
thing, and I would want it to show the ahead/behind relationship between
each branch and where it would be pushed to (and as I said, I define
mine with refspecs). Right now it uses nasty hackery to guess at where
things will be pushed, but ideally it would ask git via @{push} or some
similar mechanism.

If the former (you do not actually push them), then I think the
semantics I am looking for and the ones you want would coincide. If not,
then I think we are really talking about two different things.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] config.c: fix a compiler warning

2014-04-16 Thread Jeff King

On Wed, Apr 16, 2014 at 04:13:53PM +0200, Stepan Kasal wrote:

 Date: Thu, 10 Apr 2014 16:37:15 +0200
 
 This change fixes a gcc warning when building msysGit.

What warning? I'm assuming -Wuninitialized?

 diff --git a/config.c b/config.c
 index 314d8ee..0b7e4f8 100644
 --- a/config.c
 +++ b/config.c
 @@ -572,7 +572,7 @@ static void die_bad_number(const char *name, const char 
 *value)
  
  int git_config_int(const char *name, const char *value)
  {
 - int ret;
 + int ret = 0;
   if (!git_parse_int(value, ret))
   die_bad_number(name, value);
   return ret;

Hmph. Generally gcc should assume that a variable is initialized after a
pointer to it is passed into a function. Unless it inlines that function
and can see that it isn't. But if we do inline git_parse_int, we see
that it ret is always initialized if it returns non-zero.

If it also inlines die_bad_number, it would see that we end in die(),
which is marked NORETURN. But if it does not, it will not realize that
we do not get to return ret in that case.

So perhaps a better solution is:

diff --git a/config.c b/config.c
index 6821cef..a30cb5c 100644
--- a/config.c
+++ b/config.c
@@ -557,6 +557,7 @@ int git_parse_ulong(const char *value, unsigned long *ret)
return 1;
 }
 
+NORETURN
 static void die_bad_number(const char *name, const char *value)
 {
const char *reason = errno == ERANGE ?

Does that also silence the warning?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] git tag --contains : avoid stack overflow

2014-04-16 Thread Jeff King

On Wed, Apr 16, 2014 at 04:15:19PM +0200, Stepan Kasal wrote:

 From: Jean-Jacques Lafay jeanjacques.la...@gmail.com
 Date: Sat, 10 Nov 2012 18:36:10 +0100

 In large repos, the recursion implementation of contains(commit,
 commit_list) may result in a stack overflow. Replace the recursion with
 a loop to fix it.

 This problem is more apparent on Windows than on Linux, where the stack
 is more limited by default.

I think this is a good thing to be doing, and it looks mostly good to
me. A few comments:

 -static int contains_recurse(struct commit *candidate,
 +/*
 + * Test whether the candidate or one of its parents is contained in the list.
 + * Do not recurse to find out, though, but return -1 if inconclusive.
 + */
 +static int contains_test(struct commit *candidate,
   const struct commit_list *want)

Can we turn this return value into

  enum {
CONTAINS_UNKNOWN = -1,
CONTAINS_NO = 0,
CONTAINS_YES = 1,
  } contains_result;

to make the code a little more self-documenting?

  static int contains(struct commit *candidate, const struct commit_list *want)
  {
 - return contains_recurse(candidate, want);
 + struct stack stack = { 0, 0, NULL };
 + int result = contains_test(candidate, want);
 +
 + if (result = 0)
 + return result;

Then this can become:

  if (result != CONTAINS_UNKNOWN)
return result;

 + if (!parents) {
 + commit-object.flags = UNINTERESTING;
 + stack.nr--;
 + }

Shouldn't this be |= when setting the flag?

 + /*
 +  * If we just popped the stack, parents-item has been marked,
 +  * therefore contains_test will return a meaningful 0 or 1.
 +  */
 + else switch (contains_test(parents-item, want)) {
 + case 1:
 + commit-object.flags |= TMP_MARK;
 + stack.nr--;
 + break;
 + case 0:
 + entry-parents = parents-next;
 + break;
 + default:
 + push_to_stack(parents-item, stack);
 + break;
 + }

And if we have an enum, this switch() becomes more readable (the
default here threw me off initially, because it is actually just
looking for -1).

 +expect
 +# ulimit is a bash builtin; we can rely on that in MinGW, but nowhere else
 +test_expect_success MINGW '--contains works in a deep repo' '
 + ulimit -s 64

It would be nice to test this on Linux.

Can we do something like:

  test_lazy_prereq BASH 'bash --version'

  test_expect_success BASH '--contains works in a deep repo' '
... setup repo ...
bash -c ulimit -s 64  git tag --contains HEAD actual 
test_cmp expect actual
  '

As a bonus, then our ulimit call does not pollute the environment of
subsequent tests.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] git tag --contains : avoid stack overflow

2014-04-17 Thread Jeff King

On Thu, Apr 17, 2014 at 07:31:54PM +0200, Johannes Schindelin wrote:

  bash -c ulimit -s 64  git tag --contains HEAD actual 
 [...]
 Please see https://github.com/msysgit/git/c63d196 for the fixup, and
 https://github.com/msysgit/git/compare/tag-contains%5E...tag-contains for
 the updated patch.

I tried running the test on my Linux box, but it doesn't fail with the
existing recursive code. So I tried a few different stack sizes, like:

  for i in `seq 1 64`; do
bash -c 
  ulimit -s $i 
  ../../git tag --contains HEAD ||
  echo fail $i
  done

The results are strangely non-deterministic, but with -O0, we generally
die reliably below about 60. With -O2, though, it's more like 43. We
can't go _too_ low here, though, as lots of things start breaking around
32.

If we instead bump the size of the history to 2000 commits, then I
reliably fail with a 64k stack (even with -O2, it needs around 80k).

Of course those numbers are all black magic, and are going to vary based
on the system, the compiler, settings, etc. My system is 64-bit, and the
current code needs at least 3 pointers per recursive invocation. So
we're spending ~46K on those variables, not counting any calling
convention overhead (and presumably we at least need a function return
pointer there). So a 32-bit system might actually get by, as it would
need half as much.

So we can bump the depth further; probably 4000 is enough for any system
to fail with a 64k stack. The deeper we make it, the longer it takes to
run the test, though. At 4000, my machine seems to take about 300ms to
run it. That's may be OK.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] git tag --contains : avoid stack overflow

2014-04-17 Thread Jeff King

On Thu, Apr 17, 2014 at 11:52:56PM +0200, Johannes Schindelin wrote:

  I tried running the test on my Linux box, but it doesn't fail with the
  existing recursive code.
 
 I cannot recall how I came to choose 64, but I *think* I only tested on
 Windows, and I *think* I reduced the number of tags in order to make
 things faster (Windows is *unbearably* slow with spawn-happy programs such
 as Git's tests -- literally every single line in a shell script tests the
 patience of this developer, running the complete test suite with 15
 parallel threads takes several hours, no kidding).

Yeah, I figured speed had something to do with it. However, since you
are using a bash loop to generate the input (and it should all be done
as builtins in bash, I think), and fast-import to create the objects, I
don't think bumping it will actually increase your process count.

  The results are strangely non-deterministic, but with -O0, we generally
  die reliably below about 60. With -O2, though, it's more like 43. We
  can't go _too_ low here, though, as lots of things start breaking around
  32.
 
 How about using 40, then? I am more interested in reducing the runtime
 than reducing the number of false negatives. The problem will be exercised
 enough on Windows, but not if the test suite becomes even slower than it
 already is.

I'm OK with doing that. My biggest concern is that it will cause false
positives on systems that are hungrier for stack space, but we can
address that if it happens.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] tag: add -i and --introduced modifier for --contains

2014-04-17 Thread Jeff King

On Thu, Apr 17, 2014 at 10:04:52AM -0700, Junio C Hamano wrote:

 Commit A can be described in terms of both v3.4 and v9.0, and it may
 be closer to v9.0 than v3.4, and under that definition we pick the
 closest tag, the current describe --contains behaviour may be
 correct, but from the human point of view, it is *WRONG*.
 
 It is wrong because v9.0 can reach v3.4.  So perhaps the rule should
 be updated to do something like:
 
 - find candidate tags that can be used to describe --contains
   the commit A, yielding v3.4, v3.5 (not shown), and v9.0;
 
 - among the candidate tags, cull the ones that contain another
   candidate tag, rejecting v3.5 (not shown) and v9.0;
 
 - among the surviving tags, pick the closest.
 
 Hmm?

Interesting.  I think that would cover some cases, but there are others
in which the tags are not direct descendants. For example, imagine you
have both a master and a maint branch. You fork a topic from an old
commit that both branches contain, and then independently merge the
topic to each branch. You then cut a release for each. So your graph
might look like:

 ---A---B---C-D---E---F (maint, v3.4)
 \   \   /
  \   ---G-H---I (master, v4.0)
   \   /  /
--J---

The fix is J, and it got merged up to maint at D, and to master at H.
v4.0 does not contain v3.4. What's the best description of J?

By the rules above, we hit the third rule pick the closest. Which
means we choose v3.4 or v4.0 based solely on how many commits are
between the topic's merge and the tag release. Which has nothing at all
to do with the topic itself.

In this case we'd show v4.0 (because J-H-I is shorter than J-D-E-F).
But I suspect most users would want to know v3.4, because they want to
know the oldest release they can move up to that contains the commit.
But that notion of oldness is not conveyed by the graph above; it's only
an artifact of the tag names.

So you can solve this by actually representing the relationship with a
merge. IOW, by merging v3.4 into v4.0 to say yes, v4.0 is a superset.
And that's generally what we do in git.git, merging maint into master
periodically. But I imagine there are other possible workflows where
people do not do that merge up, and the maint and master branches
diverge (and maybe they even cherry-pick from each other, but sometimes
merge if the fix can be based on a common ancestor, as in this case).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/6] run_external_diff cleanups

2014-04-19 Thread Jeff King

It's possible to overflow an array in run_external_diff and write a
single NULL onto the stack. The first patch below fixes that. The rest
are cleanups and modernizations I noticed while in the area. It's
possible that patch 3 is also a bug fix, depending on your
interpretation.

  [1/6]: run_external_diff: use an argv_array for the command line
  [2/6]: run_external_diff: use an argv_array for the environment
  [3/6]: run_external_diff: clean up error handling
  [4/6]: run_external_diff: drop fflush(NULL)
  [5/6]: run_external_diff: hoist common bits out of conditional
  [6/6]: run_external_diff: refactor cmdline setup logic

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/6] run_external_diff: use an argv_array for the command line

2014-04-19 Thread Jeff King

We currently generate the command-line for the external
command using a fixed-length array of size 10. But if there
is a rename, we actually need 11 elements (10 items, plus a
NULL), and end up writing a random NULL onto the stack.

Rather than bump the limit, let's just an argv_array, which
makes this sort of error impossible.

Noticed-by: Max L infthi.in...@gmail.com
Signed-off-by: Jeff King p...@peff.net
---
This was actually noticed by a GitHub user, who proposed bumping
the array size to 11:

  https://github.com/git/git/pull/92

Even though this fix is a bigger change, I'd rather do it this way, as
it is more obviously correct to a reader (and it solves the problem
forever). I pulled the name/email from that commit, but please let me
know if you'd prefer to be credited differently.

 diff.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/diff.c b/diff.c
index 539997f..b154284 100644
--- a/diff.c
+++ b/diff.c
@@ -16,6 +16,7 @@
 #include submodule.h
 #include ll-merge.h
 #include string-list.h
+#include argv-array.h
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -2902,9 +2903,8 @@ static void run_external_diff(const char *pgm,
  int complete_rewrite,
  struct diff_options *o)
 {
-   const char *spawn_arg[10];
+   struct argv_array argv = ARGV_ARRAY_INIT;
int retval;
-   const char **arg = spawn_arg[0];
struct diff_queue_struct *q = diff_queued_diff;
const char *env[3] = { NULL };
char env_counter[50];
@@ -2915,23 +2915,22 @@ static void run_external_diff(const char *pgm,
const char *othername = (other ? other : name);
temp_one = prepare_temp_file(name, one);
temp_two = prepare_temp_file(othername, two);
-   *arg++ = pgm;
-   *arg++ = name;
-   *arg++ = temp_one-name;
-   *arg++ = temp_one-hex;
-   *arg++ = temp_one-mode;
-   *arg++ = temp_two-name;
-   *arg++ = temp_two-hex;
-   *arg++ = temp_two-mode;
+   argv_array_push(argv, pgm);
+   argv_array_push(argv, name);
+   argv_array_push(argv, temp_one-name);
+   argv_array_push(argv, temp_one-hex);
+   argv_array_push(argv, temp_one-mode);
+   argv_array_push(argv, temp_two-name);
+   argv_array_push(argv, temp_two-hex);
+   argv_array_push(argv, temp_two-mode);
if (other) {
-   *arg++ = other;
-   *arg++ = xfrm_msg;
+   argv_array_push(argv, other);
+   argv_array_push(argv, xfrm_msg);
}
} else {
-   *arg++ = pgm;
-   *arg++ = name;
+   argv_array_push(argv, pgm);
+   argv_array_push(argv, name);
}
-   *arg = NULL;
fflush(NULL);
 
env[0] = env_counter;
@@ -2940,8 +2939,9 @@ static void run_external_diff(const char *pgm,
env[1] = env_total;
snprintf(env_total, sizeof(env_total), GIT_DIFF_PATH_TOTAL=%d, q-nr);
 
-   retval = run_command_v_opt_cd_env(spawn_arg, RUN_USING_SHELL, NULL, 
env);
+   retval = run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, 
env);
remove_tempfile();
+   argv_array_clear(argv);
if (retval) {
fprintf(stderr, external diff died, stopping at %s.\n, name);
exit(1);
-- 
1.9.1.656.ge8a0637

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/6] run_external_diff: use an argv_array for the environment

2014-04-19 Thread Jeff King

We currently use static buffers and a static array for
formatting the environment passed to the external diff.
There's nothing wrong in the code, but it is much easier to
verify that it is correct if we use a dynamic argv_array.

Signed-off-by: Jeff King p...@peff.net
---
 diff.c | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/diff.c b/diff.c
index b154284..760fc96 100644
--- a/diff.c
+++ b/diff.c
@@ -2904,11 +2904,9 @@ static void run_external_diff(const char *pgm,
  struct diff_options *o)
 {
struct argv_array argv = ARGV_ARRAY_INIT;
+   struct argv_array env = ARGV_ARRAY_INIT;
int retval;
struct diff_queue_struct *q = diff_queued_diff;
-   const char *env[3] = { NULL };
-   char env_counter[50];
-   char env_total[50];
 
if (one  two) {
struct diff_tempfile *temp_one, *temp_two;
@@ -2933,15 +2931,13 @@ static void run_external_diff(const char *pgm,
}
fflush(NULL);
 
-   env[0] = env_counter;
-   snprintf(env_counter, sizeof(env_counter), GIT_DIFF_PATH_COUNTER=%d,
-++o-diff_path_counter);
-   env[1] = env_total;
-   snprintf(env_total, sizeof(env_total), GIT_DIFF_PATH_TOTAL=%d, q-nr);
+   argv_array_pushf(env, GIT_DIFF_PATH_COUNTER=%d, 
++o-diff_path_counter);
+   argv_array_pushf(env, GIT_DIFF_PATH_TOTAL=%d, q-nr);
 
-   retval = run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, 
env);
+   retval = run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, 
env.argv);
remove_tempfile();
argv_array_clear(argv);
+   argv_array_clear(env);
if (retval) {
fprintf(stderr, external diff died, stopping at %s.\n, name);
exit(1);
-- 
1.9.1.656.ge8a0637

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 3/6] run_external_diff: clean up error handling

2014-04-19 Thread Jeff King

When the external diff reports an error, we try to clean up
and die. However, we can make this process a bit simpler:

  1. We do not need to bother freeing memory, since we are
 about to exit.  Nor do we need to clean up our
 tempfiles, since the atexit() handler will do it for
 us. So we can die as soon as we see the error.

  3. We can just call die() rather than fprintf/exit. This
 does technically change our exit code, but the exit
 code of 1 is not meaningful here. In fact, it is
 probably wrong, since 1 from diff usually means
 completed successfully, but there were differences.

And while we're there, we can mark the error message for
translation, and drop the full stop at the end to make it
more like our other messages.

Signed-off-by: Jeff King p...@peff.net
---
Note that we do have to update one test, which was expecting difftool to
exit with 1 (and difftool just propagates diff's exit status in this
case). I couldn't find any reasoning in the history for this exit(1),
and it dates all the way back to May 2005. So I do not think it had any
particular purpose, and for the reasons above, I do not think anyone
would be sane to be relying on it.

 diff.c  | 9 +++--
 t/t7800-difftool.sh | 2 +-
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/diff.c b/diff.c
index 760fc96..b517d01 100644
--- a/diff.c
+++ b/diff.c
@@ -2905,7 +2905,6 @@ static void run_external_diff(const char *pgm,
 {
struct argv_array argv = ARGV_ARRAY_INIT;
struct argv_array env = ARGV_ARRAY_INIT;
-   int retval;
struct diff_queue_struct *q = diff_queued_diff;
 
if (one  two) {
@@ -2934,14 +2933,12 @@ static void run_external_diff(const char *pgm,
argv_array_pushf(env, GIT_DIFF_PATH_COUNTER=%d, 
++o-diff_path_counter);
argv_array_pushf(env, GIT_DIFF_PATH_TOTAL=%d, q-nr);
 
-   retval = run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, 
env.argv);
+   if (run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, 
env.argv))
+   die(_(external diff died, stopping at %s), name);
+
remove_tempfile();
argv_array_clear(argv);
argv_array_clear(env);
-   if (retval) {
-   fprintf(stderr, external diff died, stopping at %s.\n, name);
-   exit(1);
-   }
 }
 
 static int similarity_index(struct diff_filepair *p)
diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh
index 5a193c5..dc30a51 100755
--- a/t/t7800-difftool.sh
+++ b/t/t7800-difftool.sh
@@ -58,7 +58,7 @@ test_expect_success PERL 'custom tool commands override 
built-ins' '
 
 test_expect_success PERL 'difftool ignores bad --tool values' '
: expect 
-   test_expect_code 1 \
+   test_must_fail \
git difftool --no-prompt --tool=bad-tool branch actual 
test_cmp expect actual
 '
-- 
1.9.1.656.ge8a0637

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 4/6] run_external_diff: drop fflush(NULL)

2014-04-19 Thread Jeff King

This fflush was added in d5535ec (Use run_command() to spawn
external diff programs instead of fork/exec., 2007-10-19),
because flushing buffers before forking is a good habit.

But later, 7d0b18a (Add output flushing before fork(),
2008-08-04) added it to the generic run-command interface,
meaning that our flush here is redundant.

Signed-off-by: Jeff King p...@peff.net
---
 diff.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/diff.c b/diff.c
index b517d01..fdb7f6c 100644
--- a/diff.c
+++ b/diff.c
@@ -2928,7 +2928,6 @@ static void run_external_diff(const char *pgm,
argv_array_push(argv, pgm);
argv_array_push(argv, name);
}
-   fflush(NULL);
 
argv_array_pushf(env, GIT_DIFF_PATH_COUNTER=%d, 
++o-diff_path_counter);
argv_array_pushf(env, GIT_DIFF_PATH_TOTAL=%d, q-nr);
-- 
1.9.1.656.ge8a0637

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 5/6] run_external_diff: hoist common bits out of conditional

2014-04-19 Thread Jeff King

Whether we have diff_filespecs to give to the diff command
or not, we always are going to run the program and pass it
the pathname. Let's pull that duplicated part out of the
conditional to make it more obvious.

Signed-off-by: Jeff King p...@peff.net
---
 diff.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/diff.c b/diff.c
index fdb7f6c..173b657 100644
--- a/diff.c
+++ b/diff.c
@@ -2907,26 +2907,24 @@ static void run_external_diff(const char *pgm,
struct argv_array env = ARGV_ARRAY_INIT;
struct diff_queue_struct *q = diff_queued_diff;
 
+   argv_array_push(argv, pgm);
+   argv_array_push(argv, name);
+
if (one  two) {
struct diff_tempfile *temp_one, *temp_two;
const char *othername = (other ? other : name);
temp_one = prepare_temp_file(name, one);
temp_two = prepare_temp_file(othername, two);
-   argv_array_push(argv, pgm);
-   argv_array_push(argv, name);
argv_array_push(argv, temp_one-name);
argv_array_push(argv, temp_one-hex);
argv_array_push(argv, temp_one-mode);
argv_array_push(argv, temp_two-name);
argv_array_push(argv, temp_two-hex);
argv_array_push(argv, temp_two-mode);
if (other) {
argv_array_push(argv, other);
argv_array_push(argv, xfrm_msg);
}
-   } else {
-   argv_array_push(argv, pgm);
-   argv_array_push(argv, name);
}
 
argv_array_pushf(env, GIT_DIFF_PATH_COUNTER=%d, 
++o-diff_path_counter);
-- 
1.9.1.656.ge8a0637

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 6/6] run_external_diff: refactor cmdline setup logic

2014-04-19 Thread Jeff King

The current logic makes it hard to see what gets put onto
the command line in which cases. Pulling out a helper
function lets us see that we have two sets of file data, and
the second set either uses the original name, or the other
renamed/copy name.

Signed-off-by: Jeff King p...@peff.net
---
The last patch and this one are getting a little bit into code churn,
perhaps. I think the prior one is hands-down more readable. This one, I
am on the fence on whether it is a true improvement or simply this is
how _I_ would have written it. I won't be offended if we drop it.

 diff.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/diff.c b/diff.c
index 173b657..4b8bfc6 100644
--- a/diff.c
+++ b/diff.c
@@ -2888,6 +2888,16 @@ static struct diff_tempfile *prepare_temp_file(const 
char *name,
return temp;
 }
 
+static void add_external_diff_name(struct argv_array *argv,
+  const char *name,
+  struct diff_filespec *df)
+{
+   struct diff_tempfile *temp = prepare_temp_file(name, df);
+   argv_array_push(argv, temp-name);
+   argv_array_push(argv, temp-hex);
+   argv_array_push(argv, temp-mode);
+}
+
 /* An external diff command takes:
  *
  * diff-cmd name infile1 infile1-sha1 infile1-mode \
@@ -2911,17 +2921,11 @@ static void run_external_diff(const char *pgm,
argv_array_push(argv, name);
 
if (one  two) {
-   struct diff_tempfile *temp_one, *temp_two;
-   const char *othername = (other ? other : name);
-   temp_one = prepare_temp_file(name, one);
-   temp_two = prepare_temp_file(othername, two);
-   argv_array_push(argv, temp_one-name);
-   argv_array_push(argv, temp_one-hex);
-   argv_array_push(argv, temp_one-mode);
-   argv_array_push(argv, temp_two-name);
-   argv_array_push(argv, temp_two-hex);
-   argv_array_push(argv, temp_two-mode);
-   if (other) {
+   add_external_diff_name(argv, name, one);
+   if (!other)
+   add_external_diff_name(argv, name, two);
+   else {
+   add_external_diff_name(argv, other, two);
argv_array_push(argv, other);
argv_array_push(argv, xfrm_msg);
}
-- 
1.9.1.656.ge8a0637
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/6] run_external_diff: use an argv_array for the command line

2014-04-19 Thread Jeff King

On Sun, Apr 20, 2014 at 02:09:49AM +0400, Max L wrote:

 One more note: at this moment the problem is slightly deeper. This
 array is next passed to the execvp function, which now falls with
 EFAULT on two my machines (both faced this problem after upgrading to
 ubuntu 14.04, everything 'worked' fine before, looks like now execvp
 checks input more strictly). This leads to non-working 'git difftool'.

Interesting. We're overwriting whatever is after spawn_arg on the stack,
so I'd expect the fork/exec to work, but the function to complain while
popping the stack frame (though I couldn't get it to do so). I wonder if
some kind of stack protection is kicking in, and the NULL doesn't get
written or something. Either way, we should definitely address it.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Project idea: github-like diff view

2014-04-20 Thread Jeff King

On Sun, Apr 20, 2014 at 04:58:28PM +0700, Duy Nguyen wrote:

 - --color-words within unified diff format, using background color to
 show what part of the line has changed. This is only enabled for
 1-line changes.

See contrib/diff-highlight.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [SECURITY PATCH] git-prompt.sh: don't put unsanitized branch names in $PS1

2014-04-21 Thread Jeff King

On Mon, Apr 21, 2014 at 03:07:28PM -0400, Richard Hansen wrote:

 Both bash and zsh subject the value of PS1 to parameter expansion,
 command substitution, and arithmetic expansion.  Rather than include
 the raw, unescaped branch name in PS1 when running in two- or
 three-argument mode, construct PS1 to reference a variable that holds
 the branch name.  Because the shells do not recursively expand, this
 avoids arbitrary code execution by specially-crafted branch names such
 as '$(IFS=_;cmd=sudo_rm_-rf_/;$cmd)'.

Cute. We already disallow quite a few characters in refnames (including
space, as you probably discovered), and generally enforce that during
ref transfer. I wonder if we should tighten that more as a precuation.
It would be backwards-incompatible, but I wonder if things like $ and
; in refnames are actually useful to people.

Did you look into similar exploits with completion? That's probably
slightly less dire (this one hits you as soon as you cd into a
malicious clone, whereas completion problems require you to actually hit
tab). I'm fairly sure that we miss some quoting on pathnames, for
example. That can lead to bogus completion, but I'm not sure offhand if
it can lead to execution.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Project idea: github-like diff view

2014-04-22 Thread Jeff King

On Tue, Apr 22, 2014 at 04:59:17PM +0700, Duy Nguyen wrote:

 On Sun, Apr 20, 2014 at 9:46 PM, Jeff King p...@peff.net wrote:
  On Sun, Apr 20, 2014 at 04:58:28PM +0700, Duy Nguyen wrote:
 
  - --color-words within unified diff format, using background color to
  show what part of the line has changed. This is only enabled for
  1-line changes.
 
  See contrib/diff-highlight.
 
 Thanks. I'd rather have it built in core git still. I'll try to see if
 I can rewrite it in C. Else, any objection to promote it to a core
 helper and setup pager automatically? We can have a config key to turn
 it off, but if git diff is colored, then it could be on by default.

If you are going to write it as part of git, it would be interesting to
try using a real word-diff to find the inter-line changes, instead of
the front and back match heuristic that the script uses. I know there
are some cases that would look better, like:

  -foo(buf, len);
  +foo(obj-buf, obj-len);

but I suspect some cases would also look worse. It would be interesting
to experiment with, though.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] pack-bitmap: do not core dump

2014-04-22 Thread Jeff King

On Tue, Apr 22, 2014 at 03:53:02PM -0700, Kyle J. McKay wrote:

 So I was trying to use pack.writebitmaps=true and all I got was core dumps.

Eek.

 The fix with a real subject line ;) is below.  I think perhaps this should be
 picked up for the 2.0.0 release.  (Patch is against master.)

Yes, this is definitely the sort of bugfix we want to see during the -rc
period (well, we would prefer not to see bugs at all, but if we must
have them, fixes are helpful).

  8 
 Subject: [PATCH] ewah_bitmap.c: do not assume size_t and eword_t are the same 
 size

Thanks for a very well-written commit message. I think your fix makes
sense:

 - self-rlw = self-buffer + (rlw_offset / sizeof(size_t));
 + self-rlw = self-buffer + (rlw_offset / sizeof(eword_t));

We could also write it as:

  self-rlw = (uint8_t *)self-buffer + rlw_offset;

but I do not think that is necessarily any more readable, especially
because we probably need to cast it like:

  self-rlw = (eword_t *)((uint8_t *)self-buffer + rlw_offset);

Given that self-rlw is a pointer to eword_t, though, we can assume
rlw_offset is always going to be a multiple of sizeof(eword_t) anyway
(and if it is not, the division in the original is a big problem, but I
do not think that is the case).  So why do any uint8_t math in the first
place? I think we could write it as:

eword_t *old = self-buffer;
... realloc ...
self-rlw = self-buffer + (self-rlw - old);

I'm fine with your patch, though.

I also poked through the rest of the bitmap code looking for similar
problems, but didn't find any. I do not think this was a systemic issue
with bad use of types; it was just a think-o that happened to work on
64-bit machines.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Pack bitmaps on Git for Windows

2014-04-23 Thread Jeff King

On Tue, Apr 22, 2014 at 05:58:10PM +1000, Bryan Turner wrote:

 It looks like the documentation for bitmaps is being included in the
 1.9.2 release of Git for Windows but the functionality itself is not
 present. For example, doc\git\html\git-config.html includes settings
 like these:
 
 pack.useBitmaps

I think this is a documentation problem with Git for Windows. The bitmap
code is going into v2.0, but there should be nothing (not even
documentation) in v1.9.2. I don't know how the msysgit folks build their
documentation, but I wonder if it accidentally built from current
'master' rather than v1.9.2.

You might get more information at the msysgit mailing list:

  https://groups.google.com/forum/#!forum/msysgit

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: error: git-remote-https died of signal 13

2014-04-23 Thread Jeff King

On Sun, Apr 20, 2014 at 08:42:15PM -0400, Greg M wrote:

 Using git version 1.9.2 I am getting this error:
 
 [normal@laptop tmp]$ git clone https://github.com/mozilla/rust.git
 Cloning into 'rust'...
 remote: Reusing existing pack: 296648, done.
 remote: Counting objects: 80, done.
 remote: Compressing objects: 100% (77/77), done.
 remote: Total 296728 (delta 22), reused 9 (delta 3)
 Receiving objects: 100% (296728/296728), 110.68 MiB | 190.00 KiB/s, done.
 Resolving deltas: 100% (238828/238828), done.
 Checking connectivity... done.
 error: git-remote-https died of signal 13

Thanks for a thorough bug report. I looked through your strace output,
and this really does look like another case of OpenSSL getting SIGPIPE
while closing the connection.

It's odd, though, as your curl version has my patches, along with
similar extra fixes in 854aca5 (multi: ignore sigpipe internally,
2014-02-17). But I guess there may be a code path that needs similar
treatment.

The easiest way to find it is probably to attach a debugger to the
running git-remote-https, and get a backtrace when it dies from SIGPIPE.
You'll probably want to install your system's debug packages for curl,
too.

 I have curl version 7.36 though, in case some of the other output matters:
 
 [normal@laptop tmp]$ curl --version
 curl 7.36.0 (x86_64-unknown-linux-gnu) libcurl/7.36.0 OpenSSL/1.0.1g
 zlib/1.2.8 libssh2/1.4.3

Another possibility is that your curl binary is up-to-date, but you are
linking against an older version of libcurl that does not have the
SIGPIPE workarounds.

I'm not sure of the best way to check that, but a hacky way under Linux
is:

  $ ldd $(git --exec-path)/git-remote-https | grep libcurl
  libcurl.so.4 = /usr/lib/x86_64-linux-gnu/libcurl.so.4
  $ strings  /usr/lib/x86_64-linux-gnu/libcurl.so.4 | grep '7\.'
  CLIENT libcurl 7.36.0

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Jeff King

On Wed, Apr 23, 2014 at 12:12:14PM -0700, Junio C Hamano wrote:

  +ulimit_stack=ulimit -s 64
  +test_lazy_prereq ULIMIT 'bash -c '$ulimit_stack''
 
 With this implementaion, ULIMIT implies bash, and we use bash that
 appears on user's PATH that may not be the one the user chose to run
 git with.  Can't we fix both of them by using $SHELL_PATH?

I don't think so. The point is that we _must_ use bash here, not any
POSIX shell. So my $SHELL_PATH is /bin/sh, which is dash, and would not
run the test.

We want to run some bash if we can. We may pick a bash on the user's
PATH that is not what they put into $SHELL_PATH, but that should be
relatively rare. And the consequence is that either that bash works fine
and we run the test, or it does not, and we skip the test.

 How about doing it along this line instead?
 
   run_with_limited_stack () {
   $SHELL_PATH -c ulimit -s 64  $*
   }
 
   test_lazy_prereq ULIMIT run_with_limited_stack true

That's a much more direct test. I like it (aside from the $SHELL_PATH
thing as described above).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Jeff King

On Wed, Apr 23, 2014 at 09:53:25AM +0200, Stepan Kasal wrote:

 I have found out that ulimit -s does not work on Windows.
 Adding this as a prerequisite, we will skip the test there.

I found this bit weird, as the test originated on Windows. Did it never
actually cause a failure there (i.e., the ulimit -s doesn't do
anything)? Or does ulimit fail?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC/PATCH v2 1/3] sh-setup: export GIT_DIR

2014-04-23 Thread Jeff King

On Wed, Apr 23, 2014 at 02:42:38PM -0500, Felipe Contreras wrote:

 It is what the clients of this library expect.

Is it? Passing GIT_DIR to sub-invocations of git will change how they
determine the repo and working tree. Your patch seems to cause failures
all over the test suite.

Without looking too hard, I'd guess the problems are one of:

  1. Setting GIT_DIR fixes the repo directory for all sub-invocations. A
 script that does cd some-other-repo.git  git  You'd need to
 teach calling scripts to unset GIT_DIR when trying to move to
 another repo.

  2. If GIT_DIR is set but GIT_WORK_TREE is not, then GIT_WORK_TREE will
 default to .. It might be sufficient to set GIT_WORK_TREE when
 you are setting GIT_DIR here. But as I said, I didn't look too
 hard, so there might be other complications.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: How to trim the fat on my log graphs

2014-04-23 Thread Jeff King

On Wed, Apr 23, 2014 at 02:59:26PM -0500, Robert Dailey wrote:

 I referred back to the documentation for --decorate:
 
 --decorate[=short|full|no]
 Print out the ref names of any commits that are shown. If short is
 specified, the ref name prefixesrefs/heads/, refs/tags/ and
 refs/remotes/ will not be printed. If full is specified, the full ref
 name (including prefix) will be printed. The default option is short.
 
 since default is short, and its documented to exclude tag names, I
 would expect them to not be there. I also tried removing it to test
 but i saw no difference; tag names are still visible.

Are you reading that as if short is specified, then refs whose names
are prefixed with refs/heads, refs/tags, etc will be omitted entirely
from decoration?

The intent is if short is specified, then a ref whose names has the
prefix of refs/heads, refs/tags, etc, will have that prefix removed when
showing it.

IOW, the options are:

  $ git log -1 --no-decorate | head -1
  commit 0bc85abb7aa9b24b093253018801a0fb43d01122

  $ git log -1 --decorate=short | head -1
  commit 0bc85abb7aa9b24b093253018801a0fb43d01122 (HEAD, tag: v1.9.2, 
origin/maint)

  $ git log -1 --decorate=full | head -1
  commit 0bc85abb7aa9b24b093253018801a0fb43d01122 (HEAD, tag: refs/tags/v1.9.2, 
refs/remotes/origin/maint)

If that's the confusion, feel free to suggest better wording for the
documentation.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Jeff King

On Wed, Apr 23, 2014 at 01:48:05PM -0700, Junio C Hamano wrote:

  I don't think so. The point is that we _must_ use bash here, not any
  POSIX shell.
 
 Sorry, but I do not understand.  Isn't what you want any POSIX
 shell with 'ulimit -s 64' supported?

Sure, that would be fine, but the original patch which started this
thread claimed that bash was required. I had assumed that to be true,
but it seems like it's not:

 $ dash -c 'ulimit -s  ulimit -s 64  ulimit -s'
 8192
 64

If we are just using the same shell we are already running, then why
invoke it by name in the first place? IOW, why not:

  run_with_limited_stack () {
(ulimit -s 64  $@)
  }

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: error: git-remote-https died of signal 13

2014-04-23 Thread Jeff King

On Wed, Apr 23, 2014 at 07:49:09AM -0400, Greg M wrote:

  The easiest way to find it is probably to attach a debugger to the
  running git-remote-https, and get a backtrace when it dies from SIGPIPE.
  You'll probably want to install your system's debug packages for curl,
  too.
 
 
 The backtrace:
 
 Program received signal SIGPIPE, Broken pipe.
 0x7fdcfd511a2d in write () from /usr/lib/libpthread.so.0
 (gdb) bt
 #0  0x7fdcfd511a2d in write () from /usr/lib/libpthread.so.0
 #1  0x7fdcfd81a0f6 in sock_write () from /usr/lib/libcrypto.so.1.0.0
 #2  0x7fdcfd817edb in BIO_write () from /usr/lib/libcrypto.so.1.0.0
 #3  0x7fdcfc662902 in ssl3_write_pending () from /usr/lib/libssl.so.1.0.0
 #4  0x7fdcfc664b77 in ssl3_dispatch_alert () from /usr/lib/libssl.so.1.0.0
 #5  0x7fdcfc660822 in ssl3_shutdown () from /usr/lib/libssl.so.1.0.0
 #6  0x7fdcfd2e944e in Curl_ossl_close () from /usr/lib/libcurl.so.4
 #7  0x7fdcfd2b6459 in Curl_disconnect () from /usr/lib/libcurl.so.4
 #8  0x7fdcfd2c8131 in curl_multi_cleanup () from /usr/lib/libcurl.so.4
 #9  0x0040764b in ?? ()
 #10 0x00404e4d in ?? ()
 #11 0x7fdcfcf0fb05 in __libc_start_main () from /usr/lib/libc.so.6
 #12 0x0040552e in ?? ()

Thanks, that's very helpful. I wasn't able to reproduce your problem
locally, but I suspect the curl patch below may fix it:

diff --git a/lib/multi.c b/lib/multi.c
index bc93264..72e4825 100644
--- a/lib/multi.c
+++ b/lib/multi.c
@@ -1804,10 +1804,13 @@ static void close_all_connections(struct Curl_multi 
*multi)
 
   conn = Curl_conncache_find_first_connection(multi-conn_cache);
   while(conn) {
+SIGPIPE_VARIABLE(pipe_st);
 conn-data = multi-closure_handle;
 
+sigpipe_ignore(conn-data, pipe_st);
 /* This will remove the connection from the cache */
 (void)Curl_disconnect(conn, FALSE);
+sigpipe_restore(pipe_st);
 
 conn = Curl_conncache_find_first_connection(multi-conn_cache);
   }

Let me know if you need any pointers on getting curl built or tested
with git.

Daniel, I think the similar fix to curl_multi_cleanup in commit a900d45
missed this code path, and we need something like the above patch. I
know you were trying to keep the SIGPIPE mess at the entrance-points to
the library, and this works against that. But we need a SessionHandle to
pass to sigpipe_ignore to look at its no_signal flag, and of course in
the case of multi we may have several such handles. If there's a similar
flag we can check on the multi handle, we could just cover all of
curl_multi_cleanup with a single ignore/reset pair.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: What is missing from Git v2.0

2014-04-24 Thread Jeff King

On Thu, Apr 24, 2014 at 01:26:33PM -0500, Felipe Contreras wrote:

 Jonathan Nieder wrote:
  Stefan Beller wrote:
  
   I may have missunderstood.
  
   So today you cannot commit if you don't provide an email address
   (usually the first time you try to commit, git asks to git config
   --global author.email=y...@mail.here), if I remember correctly, so
   there is definitely a valid (i.e. user approved) email address.
  
  Not true.  But you do get a big wall of text when you make your
  first commit without an EMAIL envvar or configured [user] section,
  including
 
 Only if you don't have a fully qualified hostname.

No, we alway warn if the values weren't explicitly provided:

  $ git config --global --unset user.email
  $ git commit --allow-empty -m foo
  [master 1e987ba] foo
   Committer: Jeff King p...@sigill.intra.peff.net
  Your name and email address were configured automatically based
  on your username and hostname. Please check that they are accurate.
  You can suppress this message by setting them explicitly:
  
  git config --global user.name Your Name
  git config --global user.email y...@example.com
  
  After doing this, you may fix the identity used for this commit with:
  
  git commit --amend --reset-author

but we will consider several sources explicit, like
$GIT_COMMITTER_EMAIL, $EMAIL, and of course user.email:

  $ EMAIL=whate...@example.com git commit --allow-empty -m foo
  [master e75f17a] foo

We die when the values are implicitly derived from the system _and_ they
look bogus:

  $ sudo rm /etc/mailname
  $ sudo hostname bogus
  $ git commit --allow-empty -m foo

  *** Please tell me who you are.
  
  Run
  
git config --global user.email y...@example.com
git config --global user.name Your Name
  
  to set your account's default identity.
  Omit --global to set the identity only in this repository.
  
  fatal: unable to auto-detect email address (got 'peff@bogus.(none)')

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Harmful LESS flags

2014-04-24 Thread Jeff King

On Thu, Apr 24, 2014 at 12:29:21PM -0700, Junio C Hamano wrote:

 David Kastrup d...@gnu.org writes:
 
  Junio C Hamano gits...@pobox.com writes:
 
  Traditionally, because the tool grew in a context of being used in a
  project whose participants are at least not malicious, always having
  to be on the lookout for fear of middle-of-line tabs hiding bad
  contents near the right edges of lines has never been an issue.
 
  My beef is not with hiding bad contents but with hiding contents.
  It makes the output useless for seeing what is actually happening as
  soon as the option starts having an effect.
 
 My suspicion is that one of the reasons why S was chosen to be in
 the default was to mildly discourage people from busting the usual
 line-length limit, but I am not Linus ;-)

I would think it's the opposite. Long lines look _horrible_ without
-S, as they get wrapped at awkward points. Using -S means that long
lines don't bug you, unless you really want to scroll over and see the
content.

I really think the right solution here is to teach less to make it more
obvious that there is something worth scrolling over to. Here's a very
rough patch for less, if you want to see what I'm thinking of.

diff --git a/input.c b/input.c
index b211323..01aa411 100755
--- a/input.c
+++ b/input.c
@@ -178,6 +178,7 @@ get_forw_line:
 */
if (chopline || hshift  0)
{
+   set_chopped_marker(ch_tell()-1);
do
{
if (ABORT_SIGS())
diff --git a/line.c b/line.c
index 1eb3914..b3358a0 100755
--- a/line.c
+++ b/line.c
@@ -1080,6 +1080,20 @@ set_status_col(c)
attr[0] = AT_NORMAL|AT_HILITE;
 }
 
+   public void
+set_chopped_marker(pos)
+   POSITION pos;
+{
+   /*
+* Roll back output by one character; probably
+* we need to actually walk curr back further
+* for multibyte characters?
+*/
+   column--;
+   curr--;
+   store_char('', AT_NORMAL|AT_HILITE, NULL, pos);
+}
+
 /*
  * Get a character from the current line.
  * Return the character as the function return value,

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Harmful LESS flags

2014-04-24 Thread Jeff King

On Thu, Apr 24, 2014 at 02:47:01PM -0700, Junio C Hamano wrote:

 And I do agree that the chopped marker would be a very sensible
 thing to show in the -S output; I would have chosen $ myself for
 that to match an existing practice in (setq truncate-lines t) in
 Emacs, though.

Hmm. I do not use Emacs, but I explicitly avoided $ because of its
end-of-line connotations. E.g., in cat -A it means the opposite: this
is the real \n end-of-line. But if there's existing precedent for $,
that would be fine with me.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Harmful LESS flags

2014-04-24 Thread Jeff King

On Thu, Apr 24, 2014 at 11:48:30PM +0200, David Kastrup wrote:

  I really think the right solution here is to teach less to make it more
  obvious that there is something worth scrolling over to. Here's a very
  rough patch for less, if you want to see what I'm thinking of.
 
 Still useless.  I'm not actually interested in a more prominent I could
 be useful indicator.

So don't set -S, then.

There are two questions here:

  1. Can less do a better job of indicating what's in the input when -S
 is in effect?

  2. What should get put into $LESS by default?

I was specifically addressing (1). Your comment does not help at all
there.

It could have an impact on (2), but you didn't say anything besides I
don't like it. That doesn't add anything to the conversation.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3] git tag --contains: avoid stack overflow

2014-04-24 Thread Jeff King

On Thu, Apr 24, 2014 at 02:24:39PM +0200, Stepan Kasal wrote:

 From: Jean-Jacques Lafay jeanjacques.la...@gmail.com
 
 In large repos, the recursion implementation of contains(commit,
 commit_list) may result in a stack overflow. Replace the recursion with
 a loop to fix it.
 
 This problem is more apparent on Windows than on Linux, where the stack
 is more limited by default.
 
 See also this thread on the msysGit list:
 
   https://groups.google.com/d/topic/msysgit/FqT6boJrb2g/discussion
 
 [jes: re-written to imitate the original recursion more closely]
 
 Thomas Braun pointed out several documentation shortcomings.
 
 Tests are run only if ulimit -s is available.  This means they cannot
 be run on Windows.
 
 Signed-off-by: Jean-Jacques Lafay jeanjacques.la...@gmail.com
 Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de
 Tested-by: Stepan Kasal ka...@ucw.cz

Thanks, this version looks good to me.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: What is missing from Git v2.0

2014-04-25 Thread Jeff King

On Fri, Apr 25, 2014 at 12:45:27PM -0500, Felipe Contreras wrote:

 When I say literally everbody agreed to move away from the name index 
 (except
 Junio and another guy) I mean it. I even composed a list:
 
 http://article.gmane.org/gmane.comp.version-control.git/233469
 
 Jeff King, Jonathan Nieder, Matthieu Moy, they all agreed.

With reference to my name, your email says:

  Jeff King:
  staging area makes perfect sense

But here's that statement in context[1]:

  So the term staging area makes perfect sense to me; it is where we
  collect changes to make a commit. I am willing to accept that does not
  to others (native English speakers or no), and that we may need to
  come up with a better term. But I think just calling it the stage is
  even worse; it loses the concept that it is a place for collecting and
  organizing.

In other words, I was saying that the term makes sense to me, and
primarily comparing favorably to a worse proposal. I am not commenting
at all on a plan to change names, which is what you are claiming above.

I do think the term staging area is fine, but picking a term is only
step one of a plan. The rest is deciding how to make the change, and
whether it is worth it. I remain undecided on the latter bits. Please
don't quote me out of context in a way that implies that I am decided.

-Peff

[1] http://article.gmane.org/gmane.comp.version-control.git/168012
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: What is missing from Git v2.0

2014-04-25 Thread Jeff King

On Fri, Apr 25, 2014 at 01:27:17PM -0500, Felipe Contreras wrote:

 I specifically said everybody agreed to move away from the name 'index', I
 didn't say everybody agreed on the staged area although the vast majority
 did, and I didn't say anybody agreed on my patches, although some did.
 
 I think I was clear.

Maybe I was not clear in my response, so let me try again. I do _not_
necessarily agree that we need to move away from the name index. I am
specifically saying that your everyone includes my name explicitly,
and that I do not agree with the statement you are attributing to it.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: What is missing from Git v2.0

2014-04-25 Thread Jeff King

On Fri, Apr 25, 2014 at 01:57:59PM -0500, Felipe Contreras wrote:

  Maybe I was not clear in my response, so let me try again. I do _not_
  necessarily agree that we need to move away from the name index.
 
 So you agree that the index is a bad name, and you agree staging area is a
 better name, yet you don't agree we should move away from the term index?

I don't agree that the index is a bad name, because that implies
some objective level of bad.

I do think the name staging area is fine, and I think it may even be
better than index, if we were picking a name out of the blue.

I am undecided on whether we should move away from the term index. The
way you have phrased it seems like you are trying to create a logical
contradiction: A is bad, B is good, therefore we should move from A to
B. But that neglects the cost of moving.

Frankly, I am not that interested in discussing it with you. I _am_
interested in you not using my name to claim that I believe things I do
not.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] commit: do not complain of empty messages from -C

2014-04-25 Thread Jeff King

When we pick another commit's message, we die() immediately
if we find that it's empty and we are not going to run an
editor (i.e., when running -C instead of -c).  However,
this check is redundant and harmful.

It's redundant because we will already notice the empty
message later, after we would have run the editor, and die
there (just as we would for a regular, not -C case, where
the user provided an empty message in the editor).

It's harmful for a few reasons:

  1. It does not respect --allow-empty-message. As a result,
 a git rebase -i cannot pick such a commit. So you
 cannot even go back in time to fix it with a reword
 or edit instruction.

  2. It does not take into account other ways besides the
 editor to modify the message. For example, git commit
 -C empty-commit -m foo could take the author
 information from empty-commit, but add a message to it.
 There's more to do to make that work correctly (and
 right now we explicitly forbid -C with -m), but this
 removes one roadblock.

  3. The existing check is not enough to prevent segfaults.
 We try to find the \n\n header/body boundary in the
 commit. If it is at the end of the string (i.e., no
 body), _or_ if we cannot find it at all (i.e., a
 truncated commit object), we consider the message
 empty. With -C, that's OK; we die in either case. But
 with -c, we continue on, and in the case of a
 truncated commit may end up dereferencing NULL+2.

Signed-off-by: Jeff King p...@peff.net
---
I care most about the rebase -i thing, especially because it is the
primary method for fixing old mistakes. The segfault fix is a nice
bonus.

The git commit -C empty -m foo thing might be nice, but I don't plan
to work on it further. The semantics would need to be figured out (does
it append or replace?), and you can always just use -c to fire up an
actual editor and write the new content there.

 builtin/commit.c  |  5 ++---
 t/t7500-commit.sh | 11 ++-
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 9cfef6c..65c069d 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -650,9 +650,8 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
} else if (use_message) {
char *buffer;
buffer = strstr(use_message_buffer, \n\n);
-   if (!use_editor  (!buffer || buffer[2] == '\0'))
-   die(_(commit has empty message));
-   strbuf_add(sb, buffer + 2, strlen(buffer + 2));
+   if (buffer)
+   strbuf_add(sb, buffer + 2, strlen(buffer + 2));
hook_arg1 = commit;
hook_arg2 = use_message;
} else if (fixup_message) {
diff --git a/t/t7500-commit.sh b/t/t7500-commit.sh
index bdc1f29..116885a 100755
--- a/t/t7500-commit.sh
+++ b/t/t7500-commit.sh
@@ -223,7 +223,8 @@ test_expect_success 'Commit without message is allowed with 
--allow-empty-messag
git add foo 
empty 
git commit --allow-empty-message empty 
-   commit_msg_is 
+   commit_msg_is  
+   git tag empty-message-commit
 '
 
 test_expect_success 'Commit without message is no-no without 
--allow-empty-message' '
@@ -240,6 +241,14 @@ test_expect_success 'Commit a message with 
--allow-empty-message' '
commit_msg_is hello there
 '
 
+test_expect_success 'commit -C empty respects --allow-empty-message' '
+   echo more foo 
+   git add foo 
+   test_must_fail git commit -C empty-message-commit 
+   git commit -C empty-message-commit --allow-empty-message 
+   commit_msg_is 
+'
+
 commit_for_rebase_autosquash_setup () {
echo first content line foo 
git add foo 
-- 
1.9.1.656.ge8a0637
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: What's cooking in git.git (Apr 2014, #08; Fri, 25)

2014-04-25 Thread Jeff King

On Fri, Apr 25, 2014 at 03:50:26PM -0700, Junio C Hamano wrote:

 * jk/external-diff-use-argv-array (2014-04-21) 6 commits
   (merged to 'next' on 2014-04-22 at e6d92d7)
  + run_external_diff: refactor cmdline setup logic
  + run_external_diff: hoist common bits out of conditional
  + run_external_diff: drop fflush(NULL)
  + run_external_diff: clean up error handling
  + run_external_diff: use an argv_array for the environment
  + run_external_diff: use an argv_array for the command line
 
  Code clean-up.
 
  Will keep in 'next' for the remainder of the cycle.

The first one does fix a possible stack overflow (albeit of one NULL,
not arbitrary content, so I don't think it's exploitable). We may want
to do:

diff --git a/diff.c b/diff.c
index 54d5308..a03744b 100644
--- a/diff.c
+++ b/diff.c
@@ -2894,7 +2894,7 @@ static void run_external_diff(const char *pgm,
  int complete_rewrite,
  struct diff_options *o)
 {
-   const char *spawn_arg[10];
+   const char *spawn_arg[11];
int retval;
const char **arg = spawn_arg[0];
struct diff_queue_struct *q = diff_queued_diff;

as a fix for maint/2.0.0 in the interim. I can write a commit message
for that if you're interested.

 * fc/publish-vs-upstream (2014-04-21) 8 commits
  - sha1_name: add support for @{publish} marks
  - sha1_name: simplify track finding
  - sha1_name: cleanup interpret_branch_name()
  - branch: display publish branch
  - push: add --set-publish option
  - branch: add --set-publish-to option
  - Add concept of 'publish' branch
  - t5516 (fetch-push): fix test restoration
 
  Add branch@{publish}; it seems that this is somewhat different from
  Ram and Peff started working on.  There were many discussion
  messages going back and forth but it does not appear that the
  design issues have been worked out among participants yet.

If you are waiting on me, I do not have much else to say on this topic.
@{publish} as specified by Felipe is not useful to me, and I would
continue to pursue @{push} separately as the remote-tracking branch of
where you would push to. I think there is room for both concepts.

As for the patches themselves, I have not reviewed them carefully, and
would prefer not to. As I mentioned before, though, I would prefer the
short @{p} not be taken for @{publish} until it has proven itself.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: What's cooking in git.git (Apr 2014, #08; Fri, 25)

2014-04-25 Thread Jeff King

On Fri, Apr 25, 2014 at 08:36:55PM -0500, Felipe Contreras wrote:

  As for the patches themselves, I have not reviewed them carefully, and
  would prefer not to. As I mentioned before, though, I would prefer the
  short @{p} not be taken for @{publish} until it has proven itself.
 
 Presumably you want to save it for @{push}. While I'm not against to having
 just @{publish} for now, I'm farily certain most people would be using
 @{publish} and not @{push}, as that's what `git branch -v` would show, and it
 would be closely similar to @{upstream}. Therefore it would make sense to use
 @{p} for @{publish}

No, I do not think it would be a good idea for @{push}, either. If we
have two concepts so similarly named (and especially if we add @{pull},
which has also been mentioned), then I think having @{p} just adds to
confusion. So I would much rather wait and see.  It is very easy to add
@{p} later, but it is very hard to take it back once used.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Revert Stop starting pager recursively

2014-04-26 Thread Jeff King

[+cc Duy, whose patch this is]

On Fri, Apr 25, 2014 at 04:10:49PM -0400, Jörn Engel wrote:

 A second option is to add a --pager (or rather --no-pager) option to
 the command line and allow the user to specify
 GIT_PAGER=git --no-pager -p column --mode='dense color' git -p branch

I think we have --no-pager already. But the -p is turning _on_ the
pager, so you could also just omit it. IOW, I really don't understand
why the original command was not simply:

  GIT_PAGER=git column --mode='dense color' git -p branch

The whole infinite loop that the original commit solved is caused by
specifying the -p. So it sounds like the right solution is don't do
that. Am I missing something useful that the -p does?

I wonder if perhaps the intent was that the user might have set
pager.column, in which case the use of the pager is implied. I still
think that the right solution is to use --no-pager explicitly then. If
the user is invoking git inside GIT_PAGER, it is up to them to save
themselves from infinite recursion.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] subtree/Makefile: Standardize (esp. for packagers)

2014-04-26 Thread Jeff King

On Sat, Apr 26, 2014 at 02:56:15PM +1000, nod.h...@gmail.com wrote:

  contrib/subtree/Makefile is a shambles in regards to it's consistency
  with other makefiles, which makes subtree overly painful to include in
  build scripts.
 
  Two major issues are present:
 
  Firstly, calls to git itself (for $(gitdir) and $(gitver)), making
  building difficult on systems that don't have git.
 
  Secondly, the Makefile uses the variable $(libexecdir) for defining the
  exec path.
 
  (...)
 
 I hate to be that guy, but could I get an opinion on the proposed patch?

It's OK to be that guy; prompting or reposting when a patch has been
overlooked is normal here.

 Is git interested in purely makefile patches, or should I find further
 improvements to make in subtree and purpose this again with those?

Makefile improvements are fine on their own. I think the problem is that
contrib/subtree does not really have an active dedicated area
maintainer.

Your changes look fine to me from a cursory examination. It would
probably be more readable as four patches (the 3 fix points from your
list, plus the minor fixes mentioned at the end). Then each patch
stands on its own, can say what problem it's fixing, and how.

 I've left `rm -f -r subproj mainline` in the clean rule for now,
 however I'd suggest those actually belong in
 contrib/subtree/t/Makefile:clean, given that they are only ever
 generated by `make test`. But given that there aren't any other
 comparable setups in contrib/, I'm somewhat apprehensive to move them
 without opinion.

Do we even make those directories anymore? It looks like they are part
of the tests, but the whole test script runs inside its own trash
directory. I wonder if they are vestiges from the time when subtree was
its own repository outside of contrib/. If so, they can be dropped here
(and from .gitignore).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Revert Stop starting pager recursively

2014-04-26 Thread Jeff King

[+cc Duy for real this time]

On Sat, Apr 26, 2014 at 10:27:07AM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  [+cc Duy, whose patch this is]
 
  On Fri, Apr 25, 2014 at 04:10:49PM -0400, Jörn Engel wrote:
 
  A second option is to add a --pager (or rather --no-pager) option to
  the command line and allow the user to specify
  GIT_PAGER=git --no-pager -p column --mode='dense color' git -p branch
 
  I think we have --no-pager already. But the -p is turning _on_ the
  pager, so you could also just omit it. IOW, I really don't understand
  why the original command was not simply:
 
GIT_PAGER=git column --mode='dense color' git -p branch
 
  The whole infinite loop that the original commit solved is caused by
  specifying the -p. So it sounds like the right solution is don't do
  that. Am I missing something useful that the -p does?
 
 My reading of this is that Duy wanted to let -p kick in to pass
 the columnized output, which could be more than a page long, thru
 the usual less or whatever pager.

I thought that at first, too. But it doesn't work, because his patch
actually _suppresses_ the pager. For example, something like:

  git config pager.column less
  git config pager.branch git column --mode=dense
  git branch

actively works without the original patch (and after Jörn's revert), but
not with current git. However, you would not use -p there in any case,
because it kicks in early and only respects $GIT_PAGER.

And I would also say that you are much better off there actually
specifying your whole pipeline as a unit:

  git config pager.branch git column --mode=dense | less

I do a similar thing where I have PAGER=less, but set pager.log to
diff-highlight | less. I tried at one point to have it chain
automatically, but the setup is just too complicated (as we're seeing
here), and it's not _that_ big a deal to explicitly pipe to the next
pager in the sequence.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Adding git hooks

2014-04-26 Thread Jeff King

On Sat, Apr 26, 2014 at 10:24:50AM -0700, Junio C Hamano wrote:

 Suvorov Ivan sv...@inbox.ru writes:
 
  I want to extend the functionality of git due to the possibility of
  separation of the user repository into 2 parts - one part will be
  stored as usual, under version control git, and the second part will
  be stored in another location such as an FTP-server.
 
 Sounds like you are looking for git-annex, perhaps?

I agree that is the right approach, but git-annex and git-media work
_above_ the object layer, and taint the history by storing symlinks in
git instead of the real sha1s. I'd love to see a solution that does the
same thing, but lives at the pack/loose object layer. Basically:

  1. Teach sha1-file.c to look for missing objects by hitting an
 external script, like:

git config odb.command curl https://example.com/%s;

 and place them in an alternates-like separate object database.

  2. Teach the git protocol a new extension to say don't bother sending
 blobs over size X. You'd have to coordinate that X with the source
 from your odb.command.

You'd probably want to wrap up the odb.command in a more fancy helper.
For example, for performance, we'd probably want to be able to query it
for which objects do you have, as well as fetch this object. And it
would be nice if it could auto-query the X for step 2, and manage
pruning local objects (e.g., when they become deep in history).

We'd probably also want to teach a few places in git to treat external
objects specially. For example, they should probably be auto-treated as
binary, so that a log -p does not try to fetch all of them. And
likewise, things like log -S should probably ignore them by default.

I have a messy sketch of step 1 that I did quite a while ago, but
haven't proceeded further on it.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Uses git-credential for git-imap-send

2014-04-26 Thread Jeff King

On Sat, Apr 26, 2014 at 10:50:26AM -0700, Dan Albert wrote:

 git-imap-send was directly prompting for a password rather than using
 git-credential. git-send-email, on the other hand, supports
 git-credential.

Yay. These sorts of conversions were definitely on my mind when I did
the original credential work, and have mostly been waiting on somebody
who cares enough about specific tools to switch them over.

  static struct imap_store *imap_open_store(struct imap_server_conf *srvc)
  {
 + struct credential cred = CREDENTIAL_INIT;

Your patch seems to have been whitespace-damaged in transit (there are
hints for gmail in the format-patch and send-email manpages).

 + cred.username = xstrdup(srvc-user);
 + cred.protocol = xstrdup(imap);
 + cred.host = xstrdup(srvc-host);
 + credential_fill(cred);
 + srvc-pass = xstrdup(cred.password);

The imap protocol will get passed along to any helpers. I think the
only one which actually cares about the specific content is the
osxkeychain helper, which differentiates between imap and imaps. I
think we can know which is which at this point and do:

  cred.protocol = xstrdup(srvc-use_ssl ? imaps : imap);

Other than that, the patch looks good. Tests would be wonderful, but we
have no infrastructure at all, so it would involve writing a mock IMAP
server. Having implemented both IMAP servers and clients in a previous
life, I could not ask anyone to go through that pain. :)

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] subtree/Makefile: Standardize (esp. for packagers)

2014-04-26 Thread Jeff King

On Sun, Apr 27, 2014 at 12:35:13PM +1000, James Denholm wrote:

 Do we even make [subproject and mainline] anymore? It looks like they are 
 part
 of the tests, but the whole test script runs inside its own trash
 directory.
 
 subproject and mainline are actually made in  contrib/subtree,
 but I'll look at perhaps fixing that when I split the proposal
 into a series as you suggest.

Are they? I couldn't find any reference to them as directories except in
the test script, and doing a make from contrib/subtree didn't create
them. I'll leave it to you to investigate further whether the clean
rules are cruft or not, but certainly if they are, cleaning up cruft is
a good thing.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Uses git-credential for git-imap-send

2014-04-27 Thread Jeff King

On Sat, Apr 26, 2014 at 11:30:11AM -0700, Dan Albert wrote:

 I had resent a less broken patch after I noticed the tabs, but it seems to
 have gotten lost. Better formatted patch at the bottom of this message.

Your emails (including this one) are multipart/alternatives with an html
part, which will cause the mailing list software to reject them.

This email also still seems whitespace-damaged to me (the leading tabs
are collapsed to a single space).

It looks like you're using gmail to send; you might try using git
send-email (the example at the end of git help send-email can walk
you through it).

 About imap vs. imaps: I actually had your exact line in before, but decided
 that as long as its for the same host the user probably wants to use the
 same credentials for both imap and imaps (if they for some reason had both
 configured). Hard coding imap allows them to use either protocol with
 only one keychain entry. The use case is a stretch, but it doesn't do any
 harm to implement it this way.

My concerns with conflating the two are:

  1. The system helper might care about the distinction and prefer imaps
 (e.g., it might already have the credential stored for your regular
 mail client, which uses imaps). But osxkeychain is the only helper
 that makes the distinction, and I don't really know how OS X's
 keychain code handles the distinction.

  2. With http and https, we are careful to make the distinction,
 because we would not want to accidentally share a credential over http
 that was stored via https. But it's pretty easy to use an http URL
 rather than an https one. It's probably pretty rare to accidentally
 turn off imap SSL.

So I'd be OK with leaving it as imap for now, and waiting for somebody
to actually come up with a real case where the distinction matters.

 ---
 
 Uses git-credential for git-imap-send
 
 git-imap-send was directly prompting for a password rather than using
 git-credential. git-send-email, on the other hand, supports git-credential.
 
 This is a necessary improvement for users that use two factor
 authentication, as
 they should not be expected to remember all of their app specific passwords.
 
 Signed-off-by: Dan Albert danalb...@google.com
 ---
  imap-send.c | 29 +++--
  1 file changed, 15 insertions(+), 14 deletions(-)

A side note on formatting your commit message: The maintainer picks up
patches from the list with git am, which will take everything up to
the first --- as the commit message, and discard everything after up
to the start of the diff. So in this case it would take your
cover-letter material as the commit message, and drop your actual commit
message.

The usual formats are either to put the cover letter material after the
---, like:

  $COMMIT_MESSAGE

  Signed-off-by: You
  ---
  $COVER_LETTER
  $DIFFSTAT
  $DIFF

or to use a scissors-line -- 8 -- instead of three-dash:

  $COVER_LETTER

  -- 8 --
  $COMMIT_MESSAGE

  ---
  $DIFFSTAT
  $DIFF

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Revert Stop starting pager recursively

2014-04-27 Thread Jeff King

On Sun, Apr 27, 2014 at 09:12:39AM +0700, Duy Nguyen wrote:

 The intent of the commit was that is a stupid thing to do, but it's
 not so obvious from the first glance, do not freeze my system for my
 mistake. But if it stops an actual use case, then I agree it should
 be reverted.

Thanks for the explanation. I think we should just go with Jörn's patch
as-is, then.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Jeff King

If you have existing decomposed filenames in your git
repository (e.g., that were created with older versions of
git that did not precompose unicode), a modern git with
core.precomposeunicode set does not handle them well.

The problem is that we normalize the paths coming from the
disk into their precomposed form, and then compare them
against the literal bytes in the index. This makes things
better if you have the precomposed form in the index. It
makes things worse if you actually have the decomposed form
in the index.

As a result, paths with decomposed filenames may have their
precomposed variants listed as untracked files (even though
the precomposed variants do not exist on-disk at all).

This patch just adds a test to demonstrate the breakage.
Some possible fixes are:

  1. Tell everyone that NFD in the git repo is wrong, and
 they should make a new commit to normalize all their
 in-repo files to be precomposed.

 This is probably not the right thing to do, because it
 still doesn't fix checkouts of old history. And it
 spreads the problem to people on byte-preserving
 filesystems (like ext4), because now they have to start
 precomposing their filenames as they are adde to git.

  2. Do all index filename comparisons using a UTF-8 aware
 comparison function when core.precomposeunicode is set.
 This would probably have bad performance, and somewhat
 defeats the point of converting the filenames at the
 readdir level in the first place.

  3. Convert index filenames to their precomposed form when
 we read the index from disk. This would be efficient,
 but we would have to be careful not to write the
 precomposed forms back out to disk.

  4. Introduce some infrastructure to efficiently match up
 the precomposed/decomposed forms. We already do
 something similar for case-insensitive files using
 name-hash.c. We might be able to adapt that strategy
 here.

Signed-off-by: Jeff King p...@peff.net
---
 t/t3910-mac-os-precompose.sh | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/t/t3910-mac-os-precompose.sh b/t/t3910-mac-os-precompose.sh
index e4ba601..23aa61e 100755
--- a/t/t3910-mac-os-precompose.sh
+++ b/t/t3910-mac-os-precompose.sh
@@ -140,6 +140,16 @@ test_expect_success Add long precomposed filename '
git add * 
git commit -m Long filename
 '
+
+test_expect_failure 'handle existing decomposed filenames' '
+   echo content verbatim.$Adiarnfd 
+   git -c core.precomposeunicode=false add verbatim.$Adiarnfd 
+   git commit -m existing decomposed file 
+   expect 
+   git ls-files --exclude-standard -o verbatim* untracked 
+   test_cmp expect untracked
+'
+
 # Test if the global core.precomposeunicode stops autosensing
 # Must be the last test case
 test_expect_success respect git config --global core.precomposeunicode '
-- 
1.9.1.656.ge8a0637
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] PAGER_ENV: remove 'S' from $LESS by default

2014-04-28 Thread Jeff King

On Mon, Apr 28, 2014 at 02:22:21PM +0200, Matthieu Moy wrote:

  Not since f82c3ffd862c7 (Wed Feb 5 2014, move LESS/LV pager environment
  to Makefile).
 
  The only upstream branch containing this commit is pu.  So this patch
  should likely not go anywhere else for now.
 
 Oops, indeed, I made my patch on top of pu by mistake. Anyway, my patch
 can wait for the other series to be merged.
 
 Jeff, you're the author of f82c3ffd862c7, topic jk/makefile in git.git,
 marked expecting a reroll by Junio. Any news from the series?

I am planning to revisit it eventually, but it's fairly low priority.
There is some pretty heavy refactoring in the series, and the PAGER_ENV
bits do not have to be held hostage to that refactoring (they are really
just demonstrating the refactoring).

I'd be OK with doing the moral equivalent for now (perhaps just taking
Junio's proposal[1]), and I can deal with the refactoring later when
re-rolling the Makefile series.

-Peff

[1] http://article.gmane.org/gmane.comp.version-control.git/240637
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Adding git hooks

2014-04-28 Thread Jeff King

On Mon, Apr 28, 2014 at 09:43:10AM -0700, Junio C Hamano wrote:

 Yes, I'd love to see something along that line in the longer term,
 showing all the objects as just regular objects under the hood, with
 implementation details hidden in the object layer (just like there
 is no distinction between packed and loose objects from the point of
 view of read_sha1_file() users), as a real solution to address
 issues in larger trees.
 
 Also see http://thread.gmane.org/gmane.comp.version-control.git/241940
 where Shawn had an interesting experiment.

Yeah, I think it's pretty clear that a naive high-latency object store
is unusably slow. You mentioned in that thread trying to do pre-fetching
based on commits/trees, and I recall that Shawn's Cassandra experiments
did that (and maybe the BigTable-backed Google Code does, too?).

There's also a question of deltas. You don't want to get trees or text
blobs individually without deltas, because your total size ends up way
bigger.

But I think for large object support, we can side-step the issue. The
objects will all be blobs (so they cannot refer to anything else), they
will typically not delta well, and the connection setup and latency will
be dwarfed by actual transfer time. My plan was to have all clones fetch
all commits and trees (and small blobs, too), and then download and
cache the large blobs as-needed.

That doesn't help with repositories where the actual commit history or
tree size is a problem. But we already have shallow clones to help with
the former. And for the latter, I think we would want a narrow clone
that behaves differently than what I described above. You'd probably
want a specific widen operation that would fetch all of the objects
for the newly-widened part of the tree in one go (including deltas), and
you wouldn't want it to happen on an as-needed basis.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Uses git-credential for git-imap-send

2014-04-28 Thread Jeff King

On Sun, Apr 27, 2014 at 10:58:51AM -0700, Dan Albert wrote:

 git-imap-send was directly prompting for a password rather than using
 git-credential. git-send-email, on the other hand, supports git-credential.
 
 This is a necessary improvement for users that use two factor authentication, 
 as
 they should not be expected to remember all of their app specific passwords.
 
 Signed-off-by: Dan Albert danalb...@google.com

Thanks, this version looks good to me.

I noticed that we are just filling in the password here, since we'll
always fill cred.username from srvc-user. The lines directly above are:

if (!srvc-user) {
fprintf(stderr, Skipping server %s, no user\n, srvc-host);
goto bail;
}

That comes from the imap.user config variable. I wonder if we should
just pass it off to credential_fill() in this case, too, which will fill
in the username if necessary.

It probably doesn't matter much, though, as nobody is complaining. And
if we were designing from scratch, I would say that imap.user and
imap.pass would not need to exist, as you can configure
credential.imaps://host/.* for the same purpose. But since we would
have to keep supporting them anyway for compatibility, it's not worth
trying to transition.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Jeff King

On Mon, Apr 28, 2014 at 12:17:28PM -0700, Junio C Hamano wrote:

3. Convert index filenames to their precomposed form when
   we read the index from disk. This would be efficient,
   but we would have to be careful not to write the
   precomposed forms back out to disk.
 
 I think this may be the right approach, especially if you are going
 to do this only when core.precomposeunicode is set.
 
 the reasoning behind we would have to be careful not to write
 part, is unclear to me, though.  Don't decomposing filesystems
 perform the manglig from the precomposed form without even being
 asked to do so, just like a case insensitive filesystem will
 overwrite an existing makefile on a request to write to
 Makefile?

Sorry, I meant do not write the precomposed forms back out to the
on-disk index. And by extension, do not update cache-tree and write
them out to git trees.

IOW, it is not enough to just set cache_entry-name to the normalized
form. You'd need to store both.

Since such entries are in the minority, and because cache_entry is
already a variable-length struct, I think you could get away with
sticking it after the name field, and then comparing like:

  const char *ce_normalized_name(struct cache_entry *ce, size_t *len)
  {
const char *ret;

/* Normal, fast path */
if (!(ce-ce_flags  CE_NORMALIZED_NAME)) {
len = ce_namelen(ce);
return ce-name;
}

/* Slow path for normalized names */
ret = ce-name + ce-namelen + 1;
*len = strlen(name);
return ret;
  }

The strlen is probably OK since such paths are presumably in the
minority (even for UTF-8 paths, we can avoid storing the extra copy if
they do not need any normalization). Or we could get fancy and encode
the length in front, but I am not sure it is worth the complexity.

Anyway, the tricky part is then making sure that all cache_entry name
comparisons use ce_normalized_name instead of ce-name.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: What's cooking in git.git (Apr 2014, #08; Fri, 25)

2014-04-28 Thread Jeff King

On Mon, Apr 28, 2014 at 10:48:21AM -0700, Junio C Hamano wrote:

  diff --git a/diff.c b/diff.c
  index 54d5308..a03744b 100644
  --- a/diff.c
  +++ b/diff.c
  @@ -2894,7 +2894,7 @@ static void run_external_diff(const char *pgm,
int complete_rewrite,
struct diff_options *o)
   {
  -   const char *spawn_arg[10];
  +   const char *spawn_arg[11];
  int retval;
  const char **arg = spawn_arg[0];
  struct diff_queue_struct *q = diff_queued_diff;
 
  as a fix for maint/2.0.0 in the interim. I can write a commit message
  for that if you're interested.
 
 I think we should merge the first one (and possibly the second one,
 too) as-is for 2.0 instead.  No change can possibly be more
 trivially correct than these two ;-)

I'm fine with that. The second patch is pure clean-up and doesn't fix
anything (because the 3 in the env array is actually correct), so it
can happily wait for the next cycle.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames

2014-04-28 Thread Jeff King

On Mon, Apr 28, 2014 at 09:52:07PM +0200, Torsten Bögershausen wrote:

 To my knowledge repos with decomposed unicode should be rare in
 practice.  I only can speak for european (or latin based) or cyrillic
 languages myself:

I've run across several cases in the past few months, but only just
figured out what was going on. Most were tickets to GitHub support, but
we actually have such a case in our github/github repository. In most
cases, I think they were created on older versions of git on OS X,
either before core.precomposeunicode existed, or before it was turned on
by default. The decomposed form got baked into the tree (whatever the
user originally typed, git probably found out about it via git add .).

I think reports are just coming in now because we didn't start turning
on core.precomposeunicode by default until v1.8.5, shipped in November.
And then, a person working on the repository would not notice anything,
since we only set the flag during clone. So it took time for people to
upgrade _and_ to make fresh clones.

 So for me the test case could sense, even if I think that nobody (TM)
 uses an old Git version under Mac OS X which is not able to handle
 precomposed unicode.

Even when they do not, the decomposed values are baked into history from
those old versions. So it is a matter of history created with older
versions not interacting well with newer versions.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

< 1 2 3 4 5 6 7 8 9 10 >

501 - 600 of 13736 matches

Mail list logo