Re: [PATCH v8 03/12] Move lower case functions into wrapper.c
On Fri, Mar 28, 2014 at 10:12:15AM -0700, Junio C Hamano wrote: By the way, that is rfc2822---do we want rfc822 as its synonym as well as rfc, I wonder ;-) Oops, I wrote that as I was literally looking at the code that said rfc2822 and didn't notice. On the other hand, I have never made the mistake when actually running git, so it is probably not a big deal. Besides which, isn't it 5322 these days? I do not think we need to keep up with the treadmill. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Problems with git 1.8.5.3 on HP-UX 11.11
On Fri, Mar 28, 2014 at 11:09:14AM -, Gerhard Grimm wrote: So I examined the git source package and found that the author of the HP-UX port forgot to set PTHREAD_CFLAGS=-mt in config.mak.autogen to enable threading. You probably want to place such manual settings in config.mak. If you use the ./configure script, it will overwrite config.mak.autogen. git submodule init fails with the output Assertion failed: err == REG_ESPACE, file compat/regex/regexec.c, line 1096 No submodule mapping found in .gitmodules for path 'module' and the stacktrace of the resulting core dump is #0 0xc020ced0 in kill+0x10 () from /usr/lib/libc.2 #1 0xc01a7f84 in raise+0x24 () from /usr/lib/libc.2 #2 0xc01e9308 in abort_C+0x160 () from /usr/lib/libc.2 #3 0xc01e9364 in abort+0x1c () from /usr/lib/libc.2 #4 0xc0176998 in _assert+0x178 () from /usr/lib/libc.2 #5 0x205fa0 in check_matching+0x290 () #6 0x2053b8 in re_search_internal+0x128 () #7 0x204ac0 in regexec+0xc8 () #8 0x4da40 in collect_config+0x60 () #9 0x108b30 in get_value+0xd8 () [...] The regexes we use here are not particularly complicated. So either there is a bug (but nobody else has reported anything on any other platform) or your system regex library has some problem with what we are feeding it. The simplest solution may be to compile with: NO_REGEX=YesPlease which will build and use the glibc implementation in compat/regex instead. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] t4212: handle systems with post-apocalyptic gmtime
On Fri, Mar 28, 2014 at 12:02:46PM -0700, Junio C Hamano wrote: - teach the is the result sane, even though we may have got a non-NULL from gmtime? otherwise let's signal a failure by replacing it with a known sentinel value codepath the new failure mode Charles's report suggests---if we feed a positive timestamp and gmtime gave us back a tm_year+1900 0, that is certainly an overflow; and I don't think we can analyze the output from gmtime. If it wraps the year at N, then won't N+2014 look like a valid value? Yes, but I was hoping that there are small number of possible N's ;-) I'm not sure I understand. Even if we know N, we've lost information during the truncation done by time_t (we cannot distingiuish true M from N+M). diff --git a/date.c b/date.c index e1a2cee..e0c43c4 100644 --- a/date.c +++ b/date.c @@ -57,6 +57,8 @@ static time_t gm_time_t(unsigned long time, int tz) static struct tm *time_to_tm(unsigned long time, int tz) { time_t t = gm_time_t(time, tz); + if (t ) + return NULL; return gmtime(t); } I suspect that would handle the FreeBSD case, as well. By the way, I have a suspicion that the gm_time_t above can overflow if you specially craft a value at the edge of what time_t can handle (we check that our value will not overflow time_t earlier, but now we might be adding up to 86400 seconds to it). sigh Yuck. Let's not go there. Do you mean let's not worry about the absurdly specific overflow case, or let's not do this gross time_to_tm hack? This (non-)issue has consumed a lot more brain power than it is probably worth. I'd like to figure out which patch to go with and be done. :) -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Problems with git 1.8.5.3 on HP-UX 11.11
On Fri, Mar 28, 2014 at 03:43:29PM -0400, Eric Sunshine wrote: On Fri, Mar 28, 2014 at 3:01 PM, Jeff King p...@peff.net wrote: On Fri, Mar 28, 2014 at 11:09:14AM -, Gerhard Grimm wrote: git submodule init fails with the output Assertion failed: err == REG_ESPACE, file compat/regex/regexec.c, line 1096 No submodule mapping found in .gitmodules for path 'module' The regexes we use here are not particularly complicated. So either there is a bug (but nobody else has reported anything on any other platform) or your system regex library has some problem with what we are feeding it. The simplest solution may be to compile with: NO_REGEX=YesPlease which will build and use the glibc implementation in compat/regex instead. Based upon the assertion-failure message, it looks like he's already using compat/regex. Heh, I didn't even notice that. I just looked at all of the libc calls at the top of the backtrace, but of course that is just from assert() on up. So now it seems doubly odd to me, since it is running the same regex library that is used elsewhere. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Bug report: Git 1.8 on Ubuntu 13.10 refs not valid
On Mon, Mar 31, 2014 at 07:19:09PM +0200, Siggi wrote: here are the two outputs you wanted to see. The interesting bit is at the end... HTTP/1.1 200 OK Date: Mon, 31 Mar 2014 17:04:57 GMT * Server Apache/2.2.22 (Ubuntu) is not blacklisted Server: Apache/2.2.22 (Ubuntu) X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 3.0.19 ETag: 2ab570112563de50022189f0a2ffcdd4 Pragma: no-cache Expires: Fri, 01 Jan 1980 00:00:00 GMT X-Runtime: 72 Cache-Control: no-cache, max-age=0, must-revalidate Content-Length: 1047 Status: 200 Content-Type: application/x-git-upload-pack-advertisement; charset=utf-8 This content-type is the problem. There should not be a charset parameter (it is meaningless, and it throws off git's content-type check). So your web server configuration (or the redmine git plugin) should be fixed, but you'll have to talk to redmine folks to figure that part out. That being said, git _could_ be more liberal in accepting a content-type with parameters (even though it does not know about any parameters, and charset here is completely meaningless). I have mixed feelings on that. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Bug report: Git 1.8 on Ubuntu 13.10 refs not valid
On Mon, Mar 31, 2014 at 11:27:53AM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: That being said, git _could_ be more liberal in accepting a content-type with parameters (even though it does not know about any parameters, and charset here is completely meaningless). I have mixed feelings on that. It may be just a matter of replacing strbuf_cmp() with the initial part must be this string followed by it could have an optional whitespaces and semicolon after that, but I share the mixed feelings. Yeah, I think something like this is probably enough: diff --git a/remote-curl.c b/remote-curl.c index 52c2d96..be21fb6 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -221,6 +221,13 @@ static int show_http_message(struct strbuf *type, struct strbuf *msg) return 0; } +static int match_content_type(struct strbuf *have, struct strbuf *want) +{ + if (!starts_with(have-buf, want-buf)) + return 0; + return have-len == want-len || have-buf[want-len] == ';'; +} + static struct discovery* discover_refs(const char *service, int for_push) { struct strbuf exp = STRBUF_INIT; @@ -277,7 +284,7 @@ static struct discovery* discover_refs(const char *service, int for_push) strbuf_addf(exp, application/x-%s-advertisement, service); if (maybe_smart (5 = last-len last-buf[4] == '#') - !strbuf_cmp(exp, type)) { + match_content_type(type.buf, exp.buf)) { char *line; /* though perhaps there are exotic whitespace rules that we would also need to allow. Either way, I do not think it is the implementation that is the problem, but rather the semantics. I am not sure if it is a right thing to follow be liberal to accept dictum in this case. It may be liberal in accepting blindly, but if the other end is asking a behaviour that is not standard (but perhaps in future versions of Git such an enhanced service may be implemented by the client), by ignoring the parameter we do not even know about how to handle, we would be giving surprises to the overall protocol exchange. I actually think ignoring them could provide room for future expansion in git (i.e., we could add new c-t parameters with the knowledge that existing git would ignore them). So there is a potential upside. But current git does _not_ ignore them. Git v1.10 (or 2.0, or whatever) could, but we would have to wait for old versions to die out before making that assumption anyway. It's also possible that we would want to intentionally break compatibility (to say if you do not understand foo=bar, then do not even process this). But I don't think that is a big deal, as we could already switch the content-type itself (x-upload-pack-advertisement2 or something). And if we really wanted to dump extra optional data into the http conversation, we can already do so with http headers. So really, I do not see us ever realistically expanding into content-type parameters. This would _just_ be about handling odd implementations. And there I can see it as us being nice (I do not think charset is doing anything here), or us being careful about broken implementations (why would one add a charset parameter at all? If the implementation is blindly re-encoding to utf8 or something, it could very well be corrupting the data in rare cases). Given that there is only one implementation known to this in the first place, I'd be inclined to say fix that implementation. If it becomes a widespread problem, then we can think about loosening the check after reviewing exactly what each implementation is doing. I think all of that is to say I'm violently agreeing with you. But having tried to think it through, I felt it was worth posting the more detailed thought process. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 0/3] Take four on fixing OPT_SET_PTR issues
On Mon, Mar 31, 2014 at 10:23:44AM -0700, Junio C Hamano wrote: SET_PTR() may not be used, but are there places where SET_INT() is abused with a cast-to-pointer for the same effect? I didn't check, but if there are such places, converting them to use SET_PTR() with their existing cast removed may be a better way to go. Anyone doing that should be beaten with a clue stick. Fortunately, I grepped through and I did not see any cases. My clue stick remains untouched. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] MSVC: fix t0040-parse-options crash
On Sun, Mar 30, 2014 at 10:29:04AM +0200, Andreas Schwab wrote: Junio C Hamano gits...@pobox.com writes: As OPT_SET_PTR() is about setting the pointer value to intptr_t defval, a follow-up patch on top of this fix (see attached) may not be a bad thing to have, but that patch alone will not fix this issue without dropping the unneeded and unwanted cast to unsigned long. Wouldn't it make sense to change defval into a union to avoid the cast? (The intptr_t type may be too narrow for other values to be put there.) The primary function of these structs is to capture the information found in brace initializers. Is it possible in C89 to initialize the second member of a union (I think in C99, you can use named initializers). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] date: recognize bogus FreeBSD gmtime output
Most gmtime implementations return a NULL value when they encounter an error (and this behavior is specified by ANSI C and POSIX). FreeBSD's implementation, however, will indicate an error by returning a pointer to a struct tm with all fields set to zero. Let's also recognize this and convert it to a NULL (with this patch, t4212 should pass on FreeBSD). Reported-by: René Scharfe l@web.de Signed-off-by: Jeff King p...@peff.net --- There are actually a few callers to gmtime and gmtime_r, so I pushed this fix up into a compat wrapper rather than in time_to_tm to get them all. It's possible that localtime() would want to receive the same treatment, too. It's not strictly necessary to make the wrapper conditional, but it was easy to do so. We could also just run this code all the time. I don't have a FreeBSD VM handy to test this, so confirmation that it passes the test would be nice. Makefile | 8 compat/gmtime.c | 26 ++ config.mak.uname | 1 + git-compat-util.h | 7 +++ 4 files changed, 42 insertions(+) create mode 100644 compat/gmtime.c diff --git a/Makefile b/Makefile index 3646391..2f3758c 100644 --- a/Makefile +++ b/Makefile @@ -338,6 +338,9 @@ all:: # Define TEST_GIT_INDEX_VERSION to 2, 3 or 4 to run the test suite # with a different indexfile format version. If it isn't set the index # file format used is index-v[23]. +# +# Define GMTIME_UNRELIABLE_ERRORS if your gmtime() function does not +# return NULL when it receives a bogus time_t. GIT-VERSION-FILE: FORCE @$(SHELL_PATH) ./GIT-VERSION-GEN @@ -1489,6 +1492,11 @@ ifneq (,$(XDL_FAST_HASH)) BASIC_CFLAGS += -DXDL_FAST_HASH endif +ifdef GMTIME_UNRELIABLE_ERRORS + COMPAT_OBJS += compat/gmtime.o + BASIC_CFLAGS += -DGMTIME_UNRELIABLE_ERRORS +endif + ifeq ($(TCLTK_PATH),) NO_TCLTK = NoThanks endif diff --git a/compat/gmtime.c b/compat/gmtime.c new file mode 100644 index 000..ffcabf4 --- /dev/null +++ b/compat/gmtime.c @@ -0,0 +1,26 @@ +#include ../git-compat-util.h +#undef gmtime +#undef gmtime_r + +struct tm *git_gmtime(const time_t *timep) +{ + static struct tm result; + return git_gmtime_r(timep, result); +} + +struct tm *git_gmtime_r(const time_t *timep, struct tm *result) +{ + struct tm *ret; + + ret = gmtime_r(timep, result); + + /* +* Rather than NULL, FreeBSD gmtime will return a struct tm with all +* fields zeroed. Since mday cannot otherwise be zero, we can test +* this very quickly. +*/ + if (ret !ret-tm_mday) + ret = NULL; + + return ret; +} diff --git a/config.mak.uname b/config.mak.uname index 6069a44..0e22ac0 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -187,6 +187,7 @@ ifeq ($(uname_S),FreeBSD) endif PYTHON_PATH = /usr/local/bin/python HAVE_PATHS_H = YesPlease + GMTIME_UNRELIABLE_ERRORS = UnfortunatelyYes endif ifeq ($(uname_S),OpenBSD) NO_STRCASESTR = YesPlease diff --git a/git-compat-util.h b/git-compat-util.h index 892032b..5191866 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -716,4 +716,11 @@ void warn_on_inaccessible(const char *path); /* Get the passwd entry for the UID of the current process. */ struct passwd *xgetpwuid_self(void); +#ifdef GMTIME_UNRELIABLE_ERRORS +struct tm *git_gmtime(const time_t *); +struct tm *git_gmtime_r(const time_t *, struct tm *); +#define gmtime git_gmtime +#define gmtime_r git_gmtime_r +#endif + #endif -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] t4212: loosen far-in-future test for AIX
One of the tests in t4212 checks our behavior when we feed gmtime a date so far in the future that it gives up and returns NULL. Some implementations, like AIX, may actually just provide us a bogus result instead. It's not worth it for us to come up with heuristics that guess whether the return value is sensible or not. On good platforms where gmtime reports the problem to us with NULL, we will print the epoch value. On bad platforms, we will print garbage. But our test should be written for the lowest common denominator so that it passes everywhere. Reported-by: Charles Bailey cbaile...@bloomberg.net Signed-off-by: Jeff King p...@peff.net --- t/t4212-log-corrupt.sh | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/t/t4212-log-corrupt.sh b/t/t4212-log-corrupt.sh index 3fa1715..58b792b 100755 --- a/t/t4212-log-corrupt.sh +++ b/t/t4212-log-corrupt.sh @@ -82,11 +82,9 @@ test_expect_success 'date parser recognizes time_t overflow' ' ' # date is within 2^63-1, but enough to choke glibc's gmtime -test_expect_success 'absurdly far-in-future dates produce sentinel' ' +test_expect_success 'absurdly far-in-future date' ' commit=$(munge_author_date HEAD 99) - echo Thu Jan 1 00:00:00 1970 + expect - git log -1 --format=%ad $commit actual - test_cmp expect actual + git log -1 --format=%ad $commit ' test_done -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2alt/2] work around unreliable gmtime errors on AIX
AIX's gmtime will happily overflow the tm_year field. Let's catch this error before handing the value to gmtime. Signed-off-by: Jeff King p...@peff.net --- This is an alternative to loosening the test in t4212. It's really not _that_ ugly. The LL here may not be portable, though. 32-bit systems can't represent this timestamp at all (so they're safe), but I don't know what would be the best way to conditionally compile here. compat/gmtime.c | 10 ++ config.mak.uname | 1 + 2 files changed, 11 insertions(+) diff --git a/compat/gmtime.c b/compat/gmtime.c index 75a5835..f95ba50 100644 --- a/compat/gmtime.c +++ b/compat/gmtime.c @@ -12,6 +12,16 @@ struct tm *git_gmtime_r(const time_t *timep, struct tm *result) { struct tm *ret; + /* +* Some systems, like AIX, will happily overflow the tm_year field. +* So let's recognize obviously out-of-bound data before it hits gmtime +* and just mark it as an error. This date is ~316 million years in the +* future, which is far enough that nobody should care, but close +* enough for the year to fit into a 32-bit tm_year. +*/ + if (*timep LL) + return NULL; + ret = gmtime_r(timep, result); /* diff --git a/config.mak.uname b/config.mak.uname index 0e22ac0..c1110ad 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -236,6 +236,7 @@ ifeq ($(uname_S),AIX) INLINE = '' endif GIT_TEST_CMP = cmp + GMTIME_UNRELIABLE_ERRORS = UnfortunatelyYes endif ifeq ($(uname_S),GNU) # GNU/Hurd -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] add `ignore_missing_links` mode to revwalk
On Mon, Mar 31, 2014 at 02:48:45PM -0700, Siddharth Agarwal wrote: On 03/28/2014 03:00 AM, Jeff King wrote: From: Vicent Marti tan...@gmail.com When pack-objects is computing the reachability bitmap to serve a fetch request, it can erroneously die() if some of the UNINTERESTING objects are not present. Upload-pack throws away HAVE lines from the client for objects we do not have, but we may have a tip object without all of its ancestors (e.g., if the tip is no longer reachable and was new enough to survive a `git prune`, but some of its reachable objects did get pruned). Thanks for this patch. It looks pretty sensible. Unfortunately, I can't provide feedback on running it in production because we've decided to set aside experimenting with bitmaps for a bit. I hope to get back to it in a couple of months. Bummer. Thanks for taking a look at it. I do think this patch is definitely fixing a bug, and needs to be pursued. We've been running with bitmaps in production on GitHub since last summer, but have never run into this situation. However, I think it is largely caused by our pruning parameters: 1. We tend not to prune very often, and instead keep unreachable objects around as a safety mechanism. 2. When we do prune, we use a very tight cutoff, rather than the default 2-week period. So the window of opportunity is much smaller for a repo to prune an object but not its descendant (typically either we keep both, or they both get pruned). So if you do come back to it later, the fix should have filtered through to master by then. :) -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: fast-import deltas
On Tue, Apr 01, 2014 at 07:25:54PM +0900, Mike Hommey wrote: I am currently prototyping a native mercurial remote handler for git, For my own curiosity, how does this differ from what is in contrib/remote-helpers/git-remote-hg? Would adding a fast-import command to handle deltas be considered useful for git? If so, what kind of format would be suitable? It breaks fast-import's lowest common denominator data model that is just passing commits and their contents over the stream. But we already do that in other cases for the sake of performance. I think the important thing is that the alternate formats are optional and enabled by the caller with command-line options. That being said, I foresee a few complications: 1. Git needs to know the sha1 of the full object. So unless the generating script knows that ahead of time, git has to expand the delta immediately anyway (this could still be a win if we end up using a good delta from elsewhere rather than doing our own delta search, but I suspect it's not so big a win as if we can just blit the delta straight to disk). 2. Git does not store on-disk deltas between objects that are not in the same packfile. So you'd only be able to delta against an object that came in the same stream (or you'd have to fix the packfile on disk by adding an extra copy of the delta base, but that probably eliminates any savings). As for format, I believe that git is basically xdelta under the hood, so you'd want either that or something that can be trivially converted to it. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Bug in git-diff output
On Tue, Apr 01, 2014 at 12:49:00PM +0200, rocketscienc01100101 . wrote: I tried to get a diff between HEAD and the current version of my project, so I did git diff. That actually diffs between the index and the working tree, but if you haven't used git add to add any changes, the index content is the same as HEAD. It's a web project with a CSS file that contains the following CSS rule: a[href^=tel] { color:inherit; text-decoration:none; } Now, whenever I do git diff, it will always show the a[href^=tel] part and mess up the output, even when I didn't change anything near that line. The problem is easily reproducable in a newly created repository. I don't understand what you mean here by mess up the output. Can you show us an example (and tell us what you expected to see)? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: fast-import deltas
On Tue, Apr 01, 2014 at 10:07:03PM +0900, Mike Hommey wrote: For my own curiosity, how does this differ from what is in contrib/remote-helpers/git-remote-hg? contrib/remote-helpers/git-remote-hg does a local mercurial clone before doing the git conversion. While this is not really a problem for most mercurial projects, it tends to be slow with big ones, like the firefox source code. What I'm aiming at is something that can talk directly to a remote mercurial server. Ah, that makes sense. Thanks for explaining. 2. Git does not store on-disk deltas between objects that are not in the same packfile. So you'd only be able to delta against an object that came in the same stream (or you'd have to fix the packfile on disk by adding an extra copy of the delta base, but that probably eliminates any savings). Arguably, this would make the most difference on initial clone of big projects, or large incremental updates (like, after a few weeks), which would use a single pack anyways. Yeah. The nice thing is that this can be an opportunistic optimization. If the delta base is part of the same output stream, then send the delta. Otherwise, you can always fall back to reconstructing and sending the full object yourself. It seems to me fast-import keeps a kind of human readable format for its protocol, i wonder if xdelta format would fit the bill. That being said, I also wonder if i shouldn't just try to write a pack on my own... The fast-import commands are human readable, but the blob contents are included inline. I don't see how sending a binary delta is any worse than sending a literal binary blob over the stream. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] t4212: handle systems with post-apocalyptic gmtime
On Tue, Apr 01, 2014 at 12:07:22PM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: On Fri, Mar 28, 2014 at 12:30:02PM -0700, Junio C Hamano wrote: Let's just deal with a simple known cases (like FreeBSD) in the real code that everybody exercises at runtime, and have the new test only check we do not segfault on a value we used to segfault upon seeing. OK. Here it is, with the other option as an alt patch for reference. [1/2]: date: recognize bogus FreeBSD gmtime output [2/2]: t4212: loosen far-in-future test for AIX [2alt/2]: work around unreliable gmtime errors on AIX -Peff Thanks. 2alt does not look too bad, but on the other hand, we are replacing a value that can produce the right result on correctly implemented gmtime with a completely bogus value only because we know there exists one broken implementation---which does not sound a very good trade-off, given that we would get a result that does not correspond to the input anyway with or without the change on the broken implementation. One other option is to push _all_ callers through a git_gmtime() wrapper, and then use Makefile knobs to turn on specific quirks, like: struct tm *git_gmtime_r(const time_t *timep, struct tm *result) { #ifdef GMTIME_OVERFLOWS_32BIT if (*timep 999) return NULL; #endif ret = gmtime_r(timep, result); #ifdef GMTIME_ERROR_BLANK if (ret !ret-tm_mday) return NULL; #endif return ret; } and then each platform can turn the knobs as appropriate. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/22] t3204: test deleting references when lock files already exist
On Tue, Apr 01, 2014 at 05:58:09PM +0200, Michael Haggerty wrote: When deleting a reference, it might be that another process already holds the lock on the loose reference file and/or the packed-refs file. In those cases, there is no alternative but for the delete to fail. Verify that in such cases the reference values are left unchanged. But in fact, when the packed-refs file lock cannot be acquired, the loose reference file is deleted anyway, potentially leaving the reference with a changed value (its packed value, which might even point at an object that has been garbage collected). Thus one of the new tests is marked test_expect_failure. Nice find. If I understand correctly, the problem is at the plumbing level, and this could also be demonstrated with update-ref? I don't think it is a big deal, but I was thrown for a minute by the use of git branch (as in, is this something special with branches, or is this about all refs?). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 04/22] rollback_lock_file(): set fd to -1
On Tue, Apr 01, 2014 at 05:58:12PM +0200, Michael Haggerty wrote: When rolling back the lockfile, call close_lock_file() so that the lock_file's fd field gets set back to -1. This could help prevent confusion in the face of hypothetical future programming errors. This also solves a race. We could be in the middle of rollback_lock_file when we get a signal, and double-close. It's probably not a big deal, though, since nobody could have opened a new descriptor in the interim that got the same number (so the second close will just fail silently). Still, this seems like a definite improvement. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/22] lockfile: unlock file if lockfile permissions cannot be adjusted
On Tue, Apr 01, 2014 at 05:58:13PM +0200, Michael Haggerty wrote: If the call to adjust_shared_perm() fails, lock_file returns -1, which to the caller looks like any other failure to lock the file. So in this case, roll back the lockfile before returning so that the lock file is deleted immediately and the lockfile object is left in a predictable state that (namely, unlocked). Previously, the lockfile was retained until process cleanup in this situation. Another nice find. I wondered if we could test this, but I think it would be hard to trigger. The obvious reason for adjust_shared_perm to fail is that you do not have permissions on the file, but by definition you just created it. So I doubt this ever comes up in practice short of weird races (somebody dropping the x bit from an intermediate directory between the open() and chmod() or something). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/22] try_merge_strategy(): remove redundant lock_file allocation
On Tue, Apr 01, 2014 at 05:58:10PM +0200, Michael Haggerty wrote: By the time the if block is entered, the lock_file instance from the main function block is no longer in use, so re-use that one instead of allocating a second one. Note that the lock variable in the if block used to shadow the lock variable at function scope, so the only change needed is to remove the inner definition. I wonder if this would also be simpler if lock were simply declared as a static variable, and we drop the allocation entirely. I suppose that does create more cognitive load, though, in that it is only correct if the function is not recursive. On the other hand, the current code makes a reader unfamiliar with struct lock wonder if there is a free(lock) missing. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/22] lockfile: unlock file if lockfile permissions cannot be adjusted
On Tue, Apr 01, 2014 at 04:02:42PM -0400, Jeff King wrote: On Tue, Apr 01, 2014 at 05:58:13PM +0200, Michael Haggerty wrote: If the call to adjust_shared_perm() fails, lock_file returns -1, which to the caller looks like any other failure to lock the file. So in this case, roll back the lockfile before returning so that the lock file is deleted immediately and the lockfile object is left in a predictable state that (namely, unlocked). Previously, the lockfile was retained until process cleanup in this situation. Another nice find. I wondered if we could test this, but I think it would be hard to trigger. The obvious reason for adjust_shared_perm to fail is that you do not have permissions on the file, but by definition you just created it. So I doubt this ever comes up in practice short of weird races (somebody dropping the x bit from an intermediate directory between the open() and chmod() or something). ...and I should have read the final sentence in your message more carefully. Even if we did trigger it, the problem would only last until the program exits anyway. I agree that this is a nice cleanup, though; a caller that wants to retry the lock before exiting would be much less surprised. And the same logic applies to 06/22. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 07/22] lock_file(): always add lock_file object to lock_file_list
On Tue, Apr 01, 2014 at 05:58:15PM +0200, Michael Haggerty wrote: diff --git a/lockfile.c b/lockfile.c index e679e4c..c989f6c 100644 --- a/lockfile.c +++ b/lockfile.c @@ -130,6 +130,22 @@ static int lock_file(struct lock_file *lk, const char *path, int flags) */ static const size_t max_path_len = sizeof(lk-filename) - 5; + if (!lock_file_list) { + /* One-time initialization */ + sigchain_push_common(remove_lock_file_on_signal); + atexit(remove_lock_file); + } + + lk-owner = getpid(); + if (!lk-on_list) { + /* Initialize *lk and add it to lock_file_list: */ + lk-fd = -1; + lk-on_list = 1; + lk-filename[0] = 0; + lk-next = lock_file_list; + lock_file_list = lk; + } Initializing here is good, since we might be interrupted by a signal at any time. But what about during the locking procedure? We do: strcpy(lk-filename, path); if (!(flags LOCK_NODEREF)) resolve_symlink(lk-filename, max_path_len); strcat(lk-filename, .lock); So for a moment, lk-filename contains the name of the valuable file we are locking. If we get a signal at that moment, do we accidentally delete it in remove_lock_file? I think the answer is no, because we check lk-owner before deleting, which will not match our pid (it should generally be zero due to xcalloc or static initialization, though perhaps we should clear it here). But that makes me wonder about the case of a reused lock. It will have lk-owner set from a previous invocation, and would potentially suffer from this problem. In other words, I think the change you are introducing does not have the problem, but the existing code does. :-/ I didn't reproduce it experimentally, though. We should be able to just lk-owner = 0; before the initial strcpy to fix it, I would think. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 09/22] api-lockfile: expand the documentation
On Tue, Apr 01, 2014 at 05:58:17PM +0200, Michael Haggerty wrote: +unable_to_lock_error:: + + Emit an error describing that there was an error locking the + specified path. The err parameter should be the errno of the + problem that caused the failure. + +unable_to_lock_index_die:: + + Like `unable_to_lock_error()`, but also `die()`. Should this last one lost the index in its name? I think it is vestigial at this point. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/22] delete_ref_loose(): don't muck around in the lock_file's filename
On Tue, Apr 01, 2014 at 05:58:20PM +0200, Michael Haggerty wrote: It's bad manners. Especially since, if unlink_or_warn() failed, the memory wasn't restored to its original contents. So make our own copy to work with. Sounds good... if (!(flag REF_ISPACKED) || flag REF_ISSYMREF) { - /* loose */ - int err, i = strlen(lock-lk-filename) - 5; /* .lock */ - - lock-lk-filename[i] = 0; - err = unlink_or_warn(lock-lk-filename); - lock-lk-filename[i] = '.'; + /* + * loose. The loose file name is the same as the + * lockfile name, minus .lock: + */ + char *loose_filename = xmemdupz(lock-lk-filename, + strlen(lock-lk-filename) - 5); + int err = unlink_or_warn(loose_filename); + free(loose_filename); Should we be using LOCK_SUFFIX_LEN from the previous commit here? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 14/22] lockfile: use strbufs when handling (most) paths
On Tue, Apr 01, 2014 at 05:58:22PM +0200, Michael Haggerty wrote: /* - * p = path that may be a symlink - * s = full size of p - * - * If p is a symlink, attempt to overwrite p with a path to the real - * file or directory (which may or may not exist), following a chain of - * symlinks if necessary. Otherwise, leave p unmodified. + * path contains a path that may be a symlink * - * This is a best-effort routine. If an error occurs, p will either be - * left unmodified or will name a different symlink in a symlink chain - * that started with p's initial contents. + * If path is a symlink, attempt to overwrite it with a path to the + * real file or directory (which may or may not exist), following a + * chain of symlinks if necessary. Otherwise, leave path unmodified. * - * Always returns p. + * This is a best-effort routine. If an error occurs, path will + * either be left unmodified or will name a different symlink in a + * symlink chain that started with path's initial contents. */ - -static char *resolve_symlink(char *p, size_t s) +static void resolve_symlink(struct strbuf *path) [...] This is not a problem you are introducing, but do we really want resolve_symlink to be best-effort here? What happens if a is a symlink to b, and two processes try to lock a simultaneously? If one succeeds, it will take b.lock. But the other will take a.lock, and both will think they have the target locked. I suspect we need to recognize ENOENT for cases where we are creating the file for the first time, but it seems like we should bail on locking from any other transient errors. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 22/22] lockfile: allow new file contents to be written while retaining lock
On Tue, Apr 01, 2014 at 05:58:30PM +0200, Michael Haggerty wrote: Add a new option flag, LOCK_SEPARATE_STAGING_FILE, that can be passed to hold_lock_file_for_update() or hold_lock_file_for_append() to use a staging file that is independent of the lock file. Add a new function activate_staging_file() that activates the contents that have been written to the staging file without releasing the lock. This functionality can be used to ensure that changes to two files are seen by other processes in one order even if correctness requires the locks to be released in another order. Can you give an example of when this is useful? I'm guessing the application is for writing out packed-refs before pruning loose refs in git-pack-refs? It seems like this makes the API much more confusing. If I understand correctly, this is basically allowing us to take a lock, write to _another_ tmpfile that is not the lock, then rename the tmpfile into place without releasing the lock (and then we can drop the lock at our convenience). I wonder if it would be simpler to build an API for that around the lock_file API, rather than as part of it. Or am I misunderstanding what's going on? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/22] Lockfile refactoring and pre-activation
On Tue, Apr 01, 2014 at 05:58:08PM +0200, Michael Haggerty wrote: I've had this patch series kicking around for a long time, along with some followup patches to fix a race in reference deletion. I haven't had the time to get everything done and tested, but let me at least push this first series out there. I especially want to submit it in case we accept a GSoC student for the project Refactor tempfile handling, because (1) I don't want me and the student to be stepping on each others' toes, and (2) there are some cleanups and documentation improvements here that will hopefully be useful to the student. Thanks, this sort of preparation for GSoC students is very much welcome. The first patch actually demonstrates the race condition that I hope to fix someday. The last patch introduces the lockfile feature that I think is needed to fix it: the ability to activate a packed-refs file while still holding the lock to prevent another process from overwriting it before the accompanying loose reference updates are all finished. But the fix itself is not included here, so you might want to keep the last patch on hold until there is a concrete user of the feature. I should have read this more carefully when I responded to the final patch. I surmised your intent based on our previous work on packed-refs, but here you spell out my guesses explicitly. :) I think all of the patches look good. I'd prefer to hold back the final one (and probably 19/22, which has no purpose except to prepare for 22/22) until seeing its application in practice. Thanks for a very pleasant read. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] date: recognize bogus FreeBSD gmtime output
On Tue, Apr 01, 2014 at 11:17:14PM +0200, René Scharfe wrote: So are you saying we should set EOVERFLOW ourselves, or does FreeBSD set EOVERFLOW for us in this case and we do not have to worry? If we correct the return value then we should correct errno as well. gmtime() on FreeBSD 10 leaves errno untouched when it encounters an overflow. While testing this again I just noticed that FreeBSD doesn't strictly return a pointer to a cleared struct tm. It simply leaves its static buffer untouched. We should probably clear it (memset in git_gmtime_r). Thanks, that all makes sense (we do not ever care about gmtime's errno in our code, but it does not hurt to be thorough to avoid surprising any new callers). Here's a replacement for patch 1. -- 8 -- Subject: date: recognize bogus FreeBSD gmtime output Most gmtime implementations return a NULL value when they encounter an error (and this behavior is specified by ANSI C and POSIX). FreeBSD's implementation, however, will simply leave the struct tm untouched. Let's also recognize this and convert it to a NULL (with this patch, t4212 should pass on FreeBSD). Reported-by: René Scharfe l@web.de Signed-off-by: Jeff King p...@peff.net --- Makefile | 8 compat/gmtime.c | 29 + config.mak.uname | 1 + git-compat-util.h | 7 +++ 4 files changed, 45 insertions(+) create mode 100644 compat/gmtime.c diff --git a/Makefile b/Makefile index 3646391..2f3758c 100644 --- a/Makefile +++ b/Makefile @@ -338,6 +338,9 @@ all:: # Define TEST_GIT_INDEX_VERSION to 2, 3 or 4 to run the test suite # with a different indexfile format version. If it isn't set the index # file format used is index-v[23]. +# +# Define GMTIME_UNRELIABLE_ERRORS if your gmtime() function does not +# return NULL when it receives a bogus time_t. GIT-VERSION-FILE: FORCE @$(SHELL_PATH) ./GIT-VERSION-GEN @@ -1489,6 +1492,11 @@ ifneq (,$(XDL_FAST_HASH)) BASIC_CFLAGS += -DXDL_FAST_HASH endif +ifdef GMTIME_UNRELIABLE_ERRORS + COMPAT_OBJS += compat/gmtime.o + BASIC_CFLAGS += -DGMTIME_UNRELIABLE_ERRORS +endif + ifeq ($(TCLTK_PATH),) NO_TCLTK = NoThanks endif diff --git a/compat/gmtime.c b/compat/gmtime.c new file mode 100644 index 000..e8362dd --- /dev/null +++ b/compat/gmtime.c @@ -0,0 +1,29 @@ +#include ../git-compat-util.h +#undef gmtime +#undef gmtime_r + +struct tm *git_gmtime(const time_t *timep) +{ + static struct tm result; + return git_gmtime_r(timep, result); +} + +struct tm *git_gmtime_r(const time_t *timep, struct tm *result) +{ + struct tm *ret; + + memset(result, 0, sizeof(*result)); + ret = gmtime_r(timep, result); + + /* +* Rather than NULL, FreeBSD gmtime simply leaves the struct tm +* untouched when it encounters overflow. Since mday cannot otherwise +* be zero, we can test this very quickly. +*/ + if (ret !ret-tm_mday) { + ret = NULL; + errno = EOVERFLOW; + } + + return ret; +} diff --git a/config.mak.uname b/config.mak.uname index 6069a44..0e22ac0 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -187,6 +187,7 @@ ifeq ($(uname_S),FreeBSD) endif PYTHON_PATH = /usr/local/bin/python HAVE_PATHS_H = YesPlease + GMTIME_UNRELIABLE_ERRORS = UnfortunatelyYes endif ifeq ($(uname_S),OpenBSD) NO_STRCASESTR = YesPlease diff --git a/git-compat-util.h b/git-compat-util.h index 892032b..5191866 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -716,4 +716,11 @@ void warn_on_inaccessible(const char *path); /* Get the passwd entry for the UID of the current process. */ struct passwd *xgetpwuid_self(void); +#ifdef GMTIME_UNRELIABLE_ERRORS +struct tm *git_gmtime(const time_t *); +struct tm *git_gmtime_r(const time_t *, struct tm *); +#define gmtime git_gmtime +#define gmtime_r git_gmtime_r +#endif + #endif -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] pack-objects: do not reuse packfiles without --delta-base-offset
When we are sending a packfile to a remote, we currently try to reuse a whole chunk of packfile without bothering to look at the individual objects. This can make things like initial clones much lighter on the server, as we can just dump the packfile bytes. However, it's possible that the other side cannot read our packfile verbatim. For example, we may have objects stored as OFS_DELTA, but the client is an antique version of git that only understands REF_DELTA. We negotiate this capability over the fetch protocol. A normal pack-objects run will convert OFS_DELTA into REF_DELTA on the fly, but the reuse pack code path never even looks at the objects. This patch disables packfile reuse if the other side is missing any capabilities that we might have used in the on-disk pack. Right now the only one is OFS_DELTA, but we may need to expand in the future (e.g., if packv4 introduces new object types). We could be more thorough and only disable reuse in this case when we actually have an OFS_DELTA to send, but: 1. We almost always will have one, since we prefer OFS_DELTA to REF_DELTA when possible. So this case would almost never come up. 2. Looking through the objects defeats the purpose of the optimization, which is to do as little work as possible to get the bytes to the remote. Signed-off-by: Jeff King p...@peff.net --- I happened to be fooling around with git v1.4.0 today, and noticed a problem fetching from GitHub. Pre-OFS_DELTA git versions are ancient by today's standard, but it's quite easy to remain compatible here, so I don't see why not. And in theory, alternate implementations might not understand OFS_DELTA, though in practice I would consider such an implementation to be pretty crappy. builtin/pack-objects.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 7950c43..1503632 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2439,12 +2439,23 @@ static void loosen_unused_packed_objects(struct rev_info *revs) } } +/* + * This tracks any options which a reader of the pack might + * not understand, and which would therefore prevent blind reuse + * of what we have on disk. + */ +static int pack_options_allow_reuse(void) +{ + return allow_ofs_delta; +} + static int get_object_list_from_bitmap(struct rev_info *revs) { if (prepare_bitmap_walk(revs) 0) return -1; - if (!reuse_partial_packfile_from_bitmap( + if (pack_options_allow_reuse() + !reuse_partial_packfile_from_bitmap( reuse_packfile, reuse_packfile_objects, reuse_packfile_offset)) { -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 12/22] delete_ref_loose(): don't muck around in the lock_file's filename
On Wed, Apr 02, 2014 at 08:52:17AM +0200, Torsten Bögershausen wrote: + /* + * loose. The loose file name is the same as the + * lockfile name, minus .lock: + */ + char *loose_filename = xmemdupz(lock-lk-filename, + strlen(lock-lk-filename) - 5); At other places (lockfile.c) we use this +#define LOCK_SUFFIX_LEN 5 I think it makes sense to move this definition to an include file (lockfile.h ??) and use it here. I had the same comment, but if you read through to patch 18, this manual munging goes away entirely, and the end result is much cleaner. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] pack-objects: do not reuse packfiles without --delta-base-offset
On Wed, Apr 02, 2014 at 10:39:13AM -0700, Junio C Hamano wrote: However, it's possible that the other side cannot read our packfile verbatim. For example, we may have objects stored as OFS_DELTA, but the client is an antique version of git that only understands REF_DELTA. We negotiate this capability over the fetch protocol. A normal pack-objects run will convert OFS_DELTA into REF_DELTA on the fly, but the reuse pack code path never even looks at the objects. The above makes it sound like reuse pack codepath is broken. It is broken (without this patch), though in practice only for ancient (pre-1.4.x) clients. Is it too much hassle to peek at the initial bytes of each object to see how they are encoded? Would it be possible to convert OFS_DELTA to REF_DELTA on the fly on that codepath as well, instead of disabling the reuse altogether? It's a mistake to peek ahead of time. Part of the point of the pack-reuse optimization is to start sending out bytes as soon as possible, since the network is quite often the bottleneck. So we would not want to look through all of the to-be-sent data before sending out the first byte. We could convert OFS_DELTA to REF_DELTA on the fly. That _may_ have a performance impact. Right now, we are basically doing the equivalent of sendfile(), and conversion would involve iterating through each object and examining the header. I think that's probably not too bad, though. The most expensive part of that, stepping to the next object, requires scanning through the zlib packets, but we should be able to use the revidx to avoid that. I'm not sure it's even worth the code complexity, though. The non-reuse codepath is not that much slower, and it should be extremely rare for a client not to support OFS_DELTA these days. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] pack-objects: do not reuse packfiles without --delta-base-offset
On Fri, Apr 04, 2014 at 03:28:48PM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: We could convert OFS_DELTA to REF_DELTA on the fly. That _may_ have a performance impact. Right now, we are basically doing the equivalent of sendfile(), and conversion would involve iterating through each object and examining the header. I think that's probably not too bad, though. The most expensive part of that, stepping to the next object, requires scanning through the zlib packets, but we should be able to use the revidx to avoid that. I'm not sure it's even worth the code complexity, though. The non-reuse codepath is not that much slower, and it should be extremely rare for a client not to support OFS_DELTA these days. OK, together with the fact that only ancient versions of fetcher would trigger this do not reuse codepath, I agree that we should go the simplest route this patch takes. By the way, we may want to revisit this if we grow more features that do not allow straight byte-for-byte reuse. I am thinking specifically if we grow a packv4-like representation for an object, and we plan to convert on-the-fly to existing packv2 clients. But I think the sensible steps for that are: 1. If we have v4 on disk and are outputting v2, add this case to the can_reuse function I just added. I.e., start out correct, and turn off the optimization. 2. Experiment with on-the-fly conversion. It may be that the conversion is so expensive that the reuse optimization gets lost in the noise. Or maybe we can reclaim most of the advantage of the reuse code path, and it is worth going object-by-object and converting. But we won't know until we can measure. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 07/22] lock_file(): always add lock_file object to lock_file_list
On Sun, Apr 06, 2014 at 11:54:59PM +0200, Michael Haggerty wrote: I didn't reproduce it experimentally, though. We should be able to just lk-owner = 0; before the initial strcpy to fix it, I would think. I think that using the owner field to avoid this problem is a bit indirect, so I will soon submit a fix that involves adding a flag to lock_file objects indicating whether the filename field currently contains the name of a file that needs to be deleted. Yeah, I agree that the current code is a bit subtle. I was planning to address this during the tempfile cleanup project (either in GSoC, if it gets a slot, or just doing it myself). But I don't mind if you want to do something in the interim. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 01/25] api-lockfile: expand the documentation
On Mon, Apr 07, 2014 at 01:33:43AM +0200, Michael Haggerty wrote: +unable_to_lock_error:: + + Emit an error describing that there was an error locking the + specified path. The err parameter should be the errno of the + problem that caused the failure. + +unable_to_lock_die:: + + Like `unable_to_lock_error()`, but also `die()`. The die() function is still called unable_to_lock_index_die() at this point in the series. Presumably you change it later. I don't know if it is worth caring about the order or not; it's a doc change, so it's not like it breaks bisectability. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 02/25] unable_to_lock_die(): rename function from unable_to_lock_index_die()
On Mon, Apr 07, 2014 at 01:33:44AM +0200, Michael Haggerty wrote: This function is used for other things besides the index, so rename it accordingly. Oh, and here it is. I should really have just read ahead before responding to patch 1. We can either re-order these first two, or just not worry about it. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 17/25] commit_lock_file(): make committing an unlocked lockfile a NOP
On Mon, Apr 07, 2014 at 01:33:59AM +0200, Michael Haggerty wrote: It was previously a bug to call commit_lock_file() with a lock_file object that was not active (an illegal access would happen within the function). It was presumably never done, but this would be an easy programming error to overlook. So guard the file-renaming code with an if statement to change committing an unlocked file into a NOP. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- Alternatively, we could make it a fatal error (e.g., an assert() or if...die()) to call commit_lock_file() on an unlocked file, or omit a warning in this case. But since it is so hard to test code related to locking failures, I have the feeling that such an error is most likely to occur in some error-handling path, maybe when some other lockfile acquisition has failed, and it would be better to let the code continue its attempted cleanup instead of dying. But it would be easy to persuade me to change my opinion. Yeah, I would have expected a die(BUG) here. I think it is worth making it a fatal mistake and catching it. Rolling back an uninitialized lockfile is probably OK; we are canceling an operation that never started. But committing a lockfile that we didn't actually fill out could be a sign of a serious error, and we may be propagating a bogus success code. E.g., imagine that receive-pack claims to have written your ref, but actually commit_lock_file was a silent NOP. I'd much rather have it die loudly so we can track down the case. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 18/25] lockfile: avoid transitory invalid states
On Mon, Apr 07, 2014 at 03:12:49PM +0200, Michael Haggerty wrote: How far *do* you want to go? I'm certainly not opposed to field-test your current changeset (plus and adjustment to use sig_atomic_t) -- overall it is an improvement. And then we will see how it works. For now I think I'd just like to get the biggest problems fixed without making anything worse. Given that there might be a GSoC student working in this neighborhood, he/she might be able to take up the baton. I changed the patch series to use a new volatile sig_atomic_t active field rather than a bit in a flags field. That seems like a good place to stop for now. If any code touches the fields, it can unset the active flag (even temporarily), and restore it when the structure is in a known state. I'm not sure if we can ever reach full safety. If you have an event loop, the sane thing to do is set an atomic flag in your signal handler and then handle it on the next iteration of the loop. But all of our signal handlers are jumped to from arbitrary code, and are just going to die(). There's nothing to return to, so any useful work we do has to be done from the handler. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/25] Lockfile correctness and refactoring
On Mon, Apr 07, 2014 at 01:33:42AM +0200, Michael Haggerty wrote: This is a second attempt at renovating the lock file code. Thanks to Peff, Junio, Torsten, and Eric for their helpful reviews of v1. v1 of this patch series [1] did some refactoring and then added a new feature to the lock_file API: the ability to activate a new version of a locked file while retaining the lock. But the review of v1 turned up even more correctness issues in the existing implementation of lock files. So this v2 dials back the scope of the changes (it omits the new feature) but does more work to fix problems with the current lock file implementation. The main theme of this patch series is to better define the state diagram for lock_file objects and to fix code that left them in incorrect, indeterminate, or unexpected states. There are also a few patches that convert several functions to use strbufs instead of limiting pathnames to a maximum length. Looks OK to me, modulo the few comments I sent. I still think resolve_symref should probably not be best-effort, but that can come later on top. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] describe: rewrite name_rev() iteratively
On Mon, Apr 07, 2014 at 01:47:14AM +0300, Dragos Foianu wrote: The git describe --contains command uses the name_rev() function which is currently a recursive function. This causes a Stack Overflow when the history is large enough. Rewrite name_rev iteratively using a stack on the heap. This slightly reduces performance due to the extra operations on the heap, but the function no longer overflows the stack. You can avoid the heap overhead by using an array for your stack, and only resizing it when necessary. Like this: struct rev_stack { int nr, alloc; struct rev_data *data; }; static struct rev_data *rev_stack_push(struct rev_stack *stack) { ALLOC_GROW(stack-data, stack-nr + 1, stack-alloc); return stack-data[stack-nr++]; } static void rev_stack_pop(struct rev_stack *stack) { stack-nr--; } static void rev_stack_init(struct rev_stack *stack) { stack-nr = stack-alloc = 0; stack-data = NULL; } static void rev_stack_release(struct rev_stack *stack) { free(stack-data); rev_stack_init(stack); } Usage would be something like: struct rev_data *data = rev_stack_push(stack); data-commit = commit; data-tip_name = tip_name; ... IOW, you push first to allocate the space, and then do your make_rev_data, rather than the other way around. The downside is that your allocation is always as big as the deepest recursion so far, so you hold on to the memory a little longer than necessary. I think that's a good tradeoff versus an extra malloc() for every commit. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Changes based on a branch not yet merged back to master
On Tue, Apr 08, 2014 at 11:22:44AM -0700, K. Kwan wrote: My tree would look like this: - x - x - x (master) \ x - x (A) x - x (B) But after merge of branch (A), I would like it to look like this: - x - x - x x (master) \/ \ x - x (A) x - x (B) Perhaps this? $ git checkout master $ git pull origin master $ git merge A $ git rebase --onto master A B Yes, that should work. I think you could even just do: git rebase master B since after the merge, all of A is contained in master (and so will be omitted from the rebased commits). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/9] branch: display publish branch
On Thu, Apr 10, 2014 at 05:36:59PM -0500, Felipe Contreras wrote: I noticed that this only picks up a publish-branch if branch.*.pushremote is configured. What happened to the case when remote.pushdefault is configured? What happens when branch.*.remote is not configured for @{upstream}? The same thing. I don't know if that is a good comparison. In other threads, the discussed meaning of @{publish} was something like the tracking branch of the ref you would push to if you ran 'git push' without arguments. That is consistent with @{upstream} being the tracking branch of the ref you would pull from with 'git pull'. But git pull without a branch.*.remote will do nothing, so what pull would do is the same as what you have configured in your branch.*.remote. Whereas git push does not depend on having branch.*.pushremote configured. Its behavior is based on push.default and push refspecs, so what push would do must take that into account. It might be useful to visualize what would be the name of the branch when pushing it (without a refspec) even if the publish branch hasn't been configured, but I think the code would be much more coplicated, and it would break symetry with @{upstream}, besides, the user can just do 'git push -p branch', and from that moment on it will be visible. It is more complicated (see the patches that Junio had at jk/branch-at-publish), but I think it is more likely to do what the user expects. For instance, it looks like your @{publish} requires config like: [branch master] pushremote = foo push = refs/heads/bar to operate. Setting pushremote affects what git push does; it will go to the foo remote. But the branch.master.push setting does not do anything to git push. Only a push refspec (or push.default setting) will change that. So the branch.*.push must be kept in sync manually (perhaps by running git push -p). Whereas if @{publish} means where you would push to, then branch.*.push does not need to exist at all. The values can be taken automatically from the other push settings. -Peff PS I first tried just setting branch.master.pushremote without setting branch.master.push. This results in a segfault, as branch_get() assumes that push_name is always set and tries to xstrdup() it. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Our official home page and logo for the Git project
On Thu, Apr 10, 2014 at 10:24:24AM +1000, Andrew Ardill wrote: It's normal for an organisation to have a collection of logos to choose from, with one 'official' version. For example, a black and white version is useful for print. Similarly, it's useful to have a couple of different contrast level/colours that can be used in the appropriate situations. There are a few options at http://git-scm.com/downloads/logos for matching the logo to the background. There is nothing wrong with having alternates that have been approved for various situations. I'm not sure if this is how you meant it, but I want to emphasize that there is no approval necessary for using alternate logos. Saying let's recognize this one as an official logo is not meant to shut down the use of others. It is only meant to say when people ask for an official logo (e.g., GSoC does so), this one is a good answer. That is not to say that proliferation of logos is a good idea either. The point of a logo is recognizability, and if there are dozens of git logos, chances are that most of them are not recognizable. I recommend creating a git repository called git-resources, git-marketing, or git-assets, to contain the various approved logos. If there is not another location, or a more appropriate one, https://github.com/git would be a good place to put this. I think the logo page above is a good start for variations of that particular logo. I'd prefer not to put other random logos there unless they also get wide enough use that they are recognized by the project. But I have no objection to a repository of random logos. The git-scm.com page is mostly targeted at end users: what is it, how do I get it, where is the documentation. Things like a logo repository, or developer information is spread across various wikis and other sites. If there's interest, we can make dev.git-scm.com for such things, or host repositories under http://github.com/git. But we would first need content to put there, and somebody would need to step forward to organize and maintain that content. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] test: fix t7001 cp to use POSIX options
On Fri, Apr 11, 2014 at 01:24:02AM -0700, Kyle J. McKay wrote: Since 11502468 and 04c1ee57 (both first appearing in v1.8.5), the t7001-mv test has used cp -a to perform a copy in several of the tests. However, the -a option is not required for a POSIX cp utility and some platforms' cp utilities do not support it. The POSIX equivalent of -a is -R -P -p. Change cp -a to cp -R -P -p so that the t7001-mv test works on systems with a cp utility that only implements the POSIX required set of options and not the -a option. I wonder if the -R is the part that we actually care about here. Including the others does not hurt in that case, but using only -R would perhaps make it more obvious to a later reader of the code exactly what we are trying to do. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/9] branch: display publish branch
On Fri, Apr 11, 2014 at 08:48:01AM -0500, Felipe Contreras wrote: I think of @{publish} as the branch the user has configured to push to; it overrides all other configurations (push.default and push refspecs). I wouldn't mind having a @{push} *in addition* to @{publish} that would have the behavior you mention, but for @{publish} I'm pretty certain the behavior I want is that it maps *directly* to what the user has configured. I guess I don't understand why we need branch.*.push when we already have push refspecs that do the same thing (and are more powerful, as they can glob). The behavior you describe is not useful to me, as it would mean having to manage branch.*.push as a duplicate of the information that is already found in remote.$(git config remote.pushdefault).push. I do not mind if the two co-exist, but please do not take the short @{p} as your series does (not only because of @{push}, but also because @{pull} has been mentioned elsewhere as a possible other name for @{upstream}). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/9] branch: display publish branch
On Fri, Apr 11, 2014 at 12:24:35PM -0700, Junio C Hamano wrote: But the branch.master.push setting does not do anything to git push. I am not sure I understand this. I thought that the desire behind the branch.*.push is to allow something like: ... other things in the config ... [remote] pushdefault = foo [remote foo] url = ... push = +refs/heads/*:refs/remotes/satellite/* fetch = +refs/heads/*:refs/remotes/foo/* [branch master] ; pushremote = foo push = refs/heads/bar so that git push on 'master' will override the more generic all local branches here will go to their remote-tracking hierarchy for this satellite refspec. And in that sense branch.master.push would do something to git push. Ah, I see. If I set push.default to upstream, then the config I showed before _does_ affect git push. But I do not usually do that. I have push.default set to current, and sometimes override it using push refspecs on certain repositories. And that is why I find branch.*.push and Felipe's @{publish} useless for my workflow. Pushes already go where I want them to, and I just want a way to ask git to perform that config resolution for me. Whereas Felipe's workflow is (I think) something like: # make a new branch... git checkout -b topic origin/master # now publish our branch, and remember our publishing point git push -p my-repo topic # and now further pushes automatically go to my-repo/topic git push I can see there is some value in that override if you do things like: git push -p my-repo topic:some-other-name because the -p means remember this other name I gave. I would think in such a workflow that most of your branches would end up with publish config, though. And therefore @{publish} would become equivalent to where you would push. But Felipe indicated that he would not want branch -vv to match where all branches would be pushed, but rather only those that were specifically configured. So maybe I do not understand his workflow after all. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Our official home page and logo for the Git project
On Fri, Apr 11, 2014 at 12:25:17PM -0700, Junio C Hamano wrote: The mention of dev.git-scm.com gives me a mixed feeling. The chasm between the developer community and casual end-users who know about Git primarily via their perusal of git-scm.com is one of the root causes of this confusion. I do not think you can get rid of that split, though. Different people want different content from a site. Somebody who wants to download and run git does not care about our Summer of Code ideas page. Somebody who wants to get a logo does not care about seeing an in-progress logo contest, or discussion on which logos people are working on. Historically most of the dev information has been on the mailing list. But sometimes it is more helpful to have a web page showing the current state of some content (e.g., the list of SoC ideas) and just periodically update it, rather than having each reader assemble the current state from whatever has been posted to the list. We have used the kernel.org wiki for this in the past. What I was suggesting is that those things could fall under the name dev.git-scm.com (which could even just point to the k.org wiki, or some other wiki, or a site to which many devs had push access). The wiki has _also_ been used for user-facing content. E.g., the list of tools that build on git. That kind of content would make sense to me on git-scm.com, and perhaps it could be ported there to give it better exposure. The one on the left-top corner was one of the alternatives that received favorable reactions from multiple people (I am not sure if there was a clear majority though) submitted when we briefly had a poll to come up with an updated logo. Do you have a link to the poll or its results? I could not find one in the list archive. Not that it necessarily matters to the current discussion, but I was interested for historical curiosity. I have also seen that logo receive unfavorable reactions from people, but my recollection is probably biased because I was one of those people. :) In any case, this motion is not about let's declare the logo we see on git-scm.com today as _the_ official one. It is not about that logo on git-scm.com sucks; let's come up with a better one. People are welcome to do that discussion elsewhere, and I do not mind a repository of contestants created somewhere, but personally I think the project is too mature for that and it is too late, even though the bleeding-red fork logo may not be my favorite. Thanks, this is what I was trying to say in my earlier message. The motion is about this: Outside people, like the party who approached us about putting our logo on their trinket, seem to associate that logo we see on git-scm.com today with our project, but we never officially said it was our logo (we did not endorse that git-scm.com is our official home page, either, for that matter). It is silly for us to have to say Ehh, that is a logo that was randomly done and slapped on git-scm.com which is not even our official home page, and the logo is licensed CC-BY by somebody else. Go talk to them., every time such a request comes. Please help us by letting us answer Yup, that is a logo (among others) that represents our project, and we are OK with you using it to help promote our project instead. That is what I meant by our official logo in the first message. So,... seconds? I do not know if I count, as I am listed as one of the proposers in your original message. But yes, I agree with this. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Our official home page and logo for the Git project
On Fri, Apr 11, 2014 at 08:24:48AM -0500, Felipe Contreras wrote: I would actually like you (everyone) to be honest and answer this question; Have you actually analized the logo? Or are you just arguing against change, because the logo is already used by git-scm.com, and related stuff? Is this rhetorical? If not... Yes, I really thought about the logo and like it. Many of your complaints are about how git concepts map onto the logo (for instance, the direction of the graph nodes). That is _one_ way of evaluating the logo. But there are other criteria, as well. For example, is the logo pleasing to the eye? Is it memorable and recognizable? Things like pleasing are subjective, but there are patterns across humanity. Graphic artists have studied this for some time and have guidelines for layouts, contrast, balance, proportionality, etc. For example, in the git-fc logo you mentioned, you rotated the logo from git-scm.com. I find it less visually pleasing than the original. It seems somehow more wobbly to me with the two branches sticking up. Now, that is my completely subjective opinion. I do not know very much about graphic design, and whether guidelines could help there, nor did I conduct any empirical research. So maybe it is just me, or maybe one design is universally more pleasing than the other. But I think that visual art considerations should be at least as important in a logo as whether the logo pedantically matches the tool's output. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Clear an invalid password out of the credential-cache?
On Sat, Apr 12, 2014 at 09:12:57AM -0400, Jason Pyeron wrote: Is it me or is the only way to clear a single invalid password out of the credential-cache is by git credential-cache exit? Try the reject action: $ : remember a credential $ url() { echo url=https://example.com; } $ (url; echo username=me; echo password=foo) | git credential-cache store $ : get it again $ url | git credential-cache get username=me password=foo $ : forget it $ url | git credential-cache erase $ : now produces nothing $ url | git credential-cache get Git should do this for you automatically if it tries to use the password and gets rejected (but only if the rejection is a password rejection, like an HTTP 401). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/9] branch: display publish branch
On Sat, Apr 12, 2014 at 10:05:15AM -0500, Felipe Contreras wrote: As you can see; some branches are published, others are not. The ones that are not published don't have a @{publish}, and `git branch -v` doesn't show them. Why is that hard to understand? Do you ever push the unpublished branches anywhere at all? If not, then you would not have a tracking branch. E.g., git _would_ push to remote gh, branch refs/heads/topic, but there is no remote tracking branch refs/remotes/gh/topic, because you have never actually pushed there. So there is no @{publish} branch. Or do you have some branches in a state where they are pushed, but not published? It wasn't clear to me from your example if your pu or dev/remote/hg-extra ever get pushed. I do not use git branch -v myself, so I don't personally care that much how it behaves. But I do use a separate script that does the same thing, and I would want it to show the ahead/behind relationship between each branch and where it would be pushed to (and as I said, I define mine with refspecs). Right now it uses nasty hackery to guess at where things will be pushed, but ideally it would ask git via @{push} or some similar mechanism. If the former (you do not actually push them), then I think the semantics I am looking for and the ones you want would coincide. If not, then I think we are really talking about two different things. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] config.c: fix a compiler warning
On Wed, Apr 16, 2014 at 04:13:53PM +0200, Stepan Kasal wrote: Date: Thu, 10 Apr 2014 16:37:15 +0200 This change fixes a gcc warning when building msysGit. What warning? I'm assuming -Wuninitialized? diff --git a/config.c b/config.c index 314d8ee..0b7e4f8 100644 --- a/config.c +++ b/config.c @@ -572,7 +572,7 @@ static void die_bad_number(const char *name, const char *value) int git_config_int(const char *name, const char *value) { - int ret; + int ret = 0; if (!git_parse_int(value, ret)) die_bad_number(name, value); return ret; Hmph. Generally gcc should assume that a variable is initialized after a pointer to it is passed into a function. Unless it inlines that function and can see that it isn't. But if we do inline git_parse_int, we see that it ret is always initialized if it returns non-zero. If it also inlines die_bad_number, it would see that we end in die(), which is marked NORETURN. But if it does not, it will not realize that we do not get to return ret in that case. So perhaps a better solution is: diff --git a/config.c b/config.c index 6821cef..a30cb5c 100644 --- a/config.c +++ b/config.c @@ -557,6 +557,7 @@ int git_parse_ulong(const char *value, unsigned long *ret) return 1; } +NORETURN static void die_bad_number(const char *name, const char *value) { const char *reason = errno == ERANGE ? Does that also silence the warning? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git tag --contains : avoid stack overflow
On Wed, Apr 16, 2014 at 04:15:19PM +0200, Stepan Kasal wrote: From: Jean-Jacques Lafay jeanjacques.la...@gmail.com Date: Sat, 10 Nov 2012 18:36:10 +0100 In large repos, the recursion implementation of contains(commit, commit_list) may result in a stack overflow. Replace the recursion with a loop to fix it. This problem is more apparent on Windows than on Linux, where the stack is more limited by default. I think this is a good thing to be doing, and it looks mostly good to me. A few comments: -static int contains_recurse(struct commit *candidate, +/* + * Test whether the candidate or one of its parents is contained in the list. + * Do not recurse to find out, though, but return -1 if inconclusive. + */ +static int contains_test(struct commit *candidate, const struct commit_list *want) Can we turn this return value into enum { CONTAINS_UNKNOWN = -1, CONTAINS_NO = 0, CONTAINS_YES = 1, } contains_result; to make the code a little more self-documenting? static int contains(struct commit *candidate, const struct commit_list *want) { - return contains_recurse(candidate, want); + struct stack stack = { 0, 0, NULL }; + int result = contains_test(candidate, want); + + if (result = 0) + return result; Then this can become: if (result != CONTAINS_UNKNOWN) return result; + if (!parents) { + commit-object.flags = UNINTERESTING; + stack.nr--; + } Shouldn't this be |= when setting the flag? + /* + * If we just popped the stack, parents-item has been marked, + * therefore contains_test will return a meaningful 0 or 1. + */ + else switch (contains_test(parents-item, want)) { + case 1: + commit-object.flags |= TMP_MARK; + stack.nr--; + break; + case 0: + entry-parents = parents-next; + break; + default: + push_to_stack(parents-item, stack); + break; + } And if we have an enum, this switch() becomes more readable (the default here threw me off initially, because it is actually just looking for -1). +expect +# ulimit is a bash builtin; we can rely on that in MinGW, but nowhere else +test_expect_success MINGW '--contains works in a deep repo' ' + ulimit -s 64 It would be nice to test this on Linux. Can we do something like: test_lazy_prereq BASH 'bash --version' test_expect_success BASH '--contains works in a deep repo' ' ... setup repo ... bash -c ulimit -s 64 git tag --contains HEAD actual test_cmp expect actual ' As a bonus, then our ulimit call does not pollute the environment of subsequent tests. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git tag --contains : avoid stack overflow
On Thu, Apr 17, 2014 at 07:31:54PM +0200, Johannes Schindelin wrote: bash -c ulimit -s 64 git tag --contains HEAD actual [...] Please see https://github.com/msysgit/git/c63d196 for the fixup, and https://github.com/msysgit/git/compare/tag-contains%5E...tag-contains for the updated patch. I tried running the test on my Linux box, but it doesn't fail with the existing recursive code. So I tried a few different stack sizes, like: for i in `seq 1 64`; do bash -c ulimit -s $i ../../git tag --contains HEAD || echo fail $i done The results are strangely non-deterministic, but with -O0, we generally die reliably below about 60. With -O2, though, it's more like 43. We can't go _too_ low here, though, as lots of things start breaking around 32. If we instead bump the size of the history to 2000 commits, then I reliably fail with a 64k stack (even with -O2, it needs around 80k). Of course those numbers are all black magic, and are going to vary based on the system, the compiler, settings, etc. My system is 64-bit, and the current code needs at least 3 pointers per recursive invocation. So we're spending ~46K on those variables, not counting any calling convention overhead (and presumably we at least need a function return pointer there). So a 32-bit system might actually get by, as it would need half as much. So we can bump the depth further; probably 4000 is enough for any system to fail with a 64k stack. The deeper we make it, the longer it takes to run the test, though. At 4000, my machine seems to take about 300ms to run it. That's may be OK. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git tag --contains : avoid stack overflow
On Thu, Apr 17, 2014 at 11:52:56PM +0200, Johannes Schindelin wrote: I tried running the test on my Linux box, but it doesn't fail with the existing recursive code. I cannot recall how I came to choose 64, but I *think* I only tested on Windows, and I *think* I reduced the number of tags in order to make things faster (Windows is *unbearably* slow with spawn-happy programs such as Git's tests -- literally every single line in a shell script tests the patience of this developer, running the complete test suite with 15 parallel threads takes several hours, no kidding). Yeah, I figured speed had something to do with it. However, since you are using a bash loop to generate the input (and it should all be done as builtins in bash, I think), and fast-import to create the objects, I don't think bumping it will actually increase your process count. The results are strangely non-deterministic, but with -O0, we generally die reliably below about 60. With -O2, though, it's more like 43. We can't go _too_ low here, though, as lots of things start breaking around 32. How about using 40, then? I am more interested in reducing the runtime than reducing the number of false negatives. The problem will be exercised enough on Windows, but not if the test suite becomes even slower than it already is. I'm OK with doing that. My biggest concern is that it will cause false positives on systems that are hungrier for stack space, but we can address that if it happens. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] tag: add -i and --introduced modifier for --contains
On Thu, Apr 17, 2014 at 10:04:52AM -0700, Junio C Hamano wrote: Commit A can be described in terms of both v3.4 and v9.0, and it may be closer to v9.0 than v3.4, and under that definition we pick the closest tag, the current describe --contains behaviour may be correct, but from the human point of view, it is *WRONG*. It is wrong because v9.0 can reach v3.4. So perhaps the rule should be updated to do something like: - find candidate tags that can be used to describe --contains the commit A, yielding v3.4, v3.5 (not shown), and v9.0; - among the candidate tags, cull the ones that contain another candidate tag, rejecting v3.5 (not shown) and v9.0; - among the surviving tags, pick the closest. Hmm? Interesting. I think that would cover some cases, but there are others in which the tags are not direct descendants. For example, imagine you have both a master and a maint branch. You fork a topic from an old commit that both branches contain, and then independently merge the topic to each branch. You then cut a release for each. So your graph might look like: ---A---B---C-D---E---F (maint, v3.4) \ \ / \ ---G-H---I (master, v4.0) \ / / --J--- The fix is J, and it got merged up to maint at D, and to master at H. v4.0 does not contain v3.4. What's the best description of J? By the rules above, we hit the third rule pick the closest. Which means we choose v3.4 or v4.0 based solely on how many commits are between the topic's merge and the tag release. Which has nothing at all to do with the topic itself. In this case we'd show v4.0 (because J-H-I is shorter than J-D-E-F). But I suspect most users would want to know v3.4, because they want to know the oldest release they can move up to that contains the commit. But that notion of oldness is not conveyed by the graph above; it's only an artifact of the tag names. So you can solve this by actually representing the relationship with a merge. IOW, by merging v3.4 into v4.0 to say yes, v4.0 is a superset. And that's generally what we do in git.git, merging maint into master periodically. But I imagine there are other possible workflows where people do not do that merge up, and the maint and master branches diverge (and maybe they even cherry-pick from each other, but sometimes merge if the fix can be based on a common ancestor, as in this case). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/6] run_external_diff cleanups
It's possible to overflow an array in run_external_diff and write a single NULL onto the stack. The first patch below fixes that. The rest are cleanups and modernizations I noticed while in the area. It's possible that patch 3 is also a bug fix, depending on your interpretation. [1/6]: run_external_diff: use an argv_array for the command line [2/6]: run_external_diff: use an argv_array for the environment [3/6]: run_external_diff: clean up error handling [4/6]: run_external_diff: drop fflush(NULL) [5/6]: run_external_diff: hoist common bits out of conditional [6/6]: run_external_diff: refactor cmdline setup logic -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/6] run_external_diff: use an argv_array for the command line
We currently generate the command-line for the external command using a fixed-length array of size 10. But if there is a rename, we actually need 11 elements (10 items, plus a NULL), and end up writing a random NULL onto the stack. Rather than bump the limit, let's just an argv_array, which makes this sort of error impossible. Noticed-by: Max L infthi.in...@gmail.com Signed-off-by: Jeff King p...@peff.net --- This was actually noticed by a GitHub user, who proposed bumping the array size to 11: https://github.com/git/git/pull/92 Even though this fix is a bigger change, I'd rather do it this way, as it is more obviously correct to a reader (and it solves the problem forever). I pulled the name/email from that commit, but please let me know if you'd prefer to be credited differently. diff.c | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/diff.c b/diff.c index 539997f..b154284 100644 --- a/diff.c +++ b/diff.c @@ -16,6 +16,7 @@ #include submodule.h #include ll-merge.h #include string-list.h +#include argv-array.h #ifdef NO_FAST_WORKING_DIRECTORY #define FAST_WORKING_DIRECTORY 0 @@ -2902,9 +2903,8 @@ static void run_external_diff(const char *pgm, int complete_rewrite, struct diff_options *o) { - const char *spawn_arg[10]; + struct argv_array argv = ARGV_ARRAY_INIT; int retval; - const char **arg = spawn_arg[0]; struct diff_queue_struct *q = diff_queued_diff; const char *env[3] = { NULL }; char env_counter[50]; @@ -2915,23 +2915,22 @@ static void run_external_diff(const char *pgm, const char *othername = (other ? other : name); temp_one = prepare_temp_file(name, one); temp_two = prepare_temp_file(othername, two); - *arg++ = pgm; - *arg++ = name; - *arg++ = temp_one-name; - *arg++ = temp_one-hex; - *arg++ = temp_one-mode; - *arg++ = temp_two-name; - *arg++ = temp_two-hex; - *arg++ = temp_two-mode; + argv_array_push(argv, pgm); + argv_array_push(argv, name); + argv_array_push(argv, temp_one-name); + argv_array_push(argv, temp_one-hex); + argv_array_push(argv, temp_one-mode); + argv_array_push(argv, temp_two-name); + argv_array_push(argv, temp_two-hex); + argv_array_push(argv, temp_two-mode); if (other) { - *arg++ = other; - *arg++ = xfrm_msg; + argv_array_push(argv, other); + argv_array_push(argv, xfrm_msg); } } else { - *arg++ = pgm; - *arg++ = name; + argv_array_push(argv, pgm); + argv_array_push(argv, name); } - *arg = NULL; fflush(NULL); env[0] = env_counter; @@ -2940,8 +2939,9 @@ static void run_external_diff(const char *pgm, env[1] = env_total; snprintf(env_total, sizeof(env_total), GIT_DIFF_PATH_TOTAL=%d, q-nr); - retval = run_command_v_opt_cd_env(spawn_arg, RUN_USING_SHELL, NULL, env); + retval = run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env); remove_tempfile(); + argv_array_clear(argv); if (retval) { fprintf(stderr, external diff died, stopping at %s.\n, name); exit(1); -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/6] run_external_diff: use an argv_array for the environment
We currently use static buffers and a static array for formatting the environment passed to the external diff. There's nothing wrong in the code, but it is much easier to verify that it is correct if we use a dynamic argv_array. Signed-off-by: Jeff King p...@peff.net --- diff.c | 14 +- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/diff.c b/diff.c index b154284..760fc96 100644 --- a/diff.c +++ b/diff.c @@ -2904,11 +2904,9 @@ static void run_external_diff(const char *pgm, struct diff_options *o) { struct argv_array argv = ARGV_ARRAY_INIT; + struct argv_array env = ARGV_ARRAY_INIT; int retval; struct diff_queue_struct *q = diff_queued_diff; - const char *env[3] = { NULL }; - char env_counter[50]; - char env_total[50]; if (one two) { struct diff_tempfile *temp_one, *temp_two; @@ -2933,15 +2931,13 @@ static void run_external_diff(const char *pgm, } fflush(NULL); - env[0] = env_counter; - snprintf(env_counter, sizeof(env_counter), GIT_DIFF_PATH_COUNTER=%d, -++o-diff_path_counter); - env[1] = env_total; - snprintf(env_total, sizeof(env_total), GIT_DIFF_PATH_TOTAL=%d, q-nr); + argv_array_pushf(env, GIT_DIFF_PATH_COUNTER=%d, ++o-diff_path_counter); + argv_array_pushf(env, GIT_DIFF_PATH_TOTAL=%d, q-nr); - retval = run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env); + retval = run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env.argv); remove_tempfile(); argv_array_clear(argv); + argv_array_clear(env); if (retval) { fprintf(stderr, external diff died, stopping at %s.\n, name); exit(1); -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/6] run_external_diff: clean up error handling
When the external diff reports an error, we try to clean up and die. However, we can make this process a bit simpler: 1. We do not need to bother freeing memory, since we are about to exit. Nor do we need to clean up our tempfiles, since the atexit() handler will do it for us. So we can die as soon as we see the error. 3. We can just call die() rather than fprintf/exit. This does technically change our exit code, but the exit code of 1 is not meaningful here. In fact, it is probably wrong, since 1 from diff usually means completed successfully, but there were differences. And while we're there, we can mark the error message for translation, and drop the full stop at the end to make it more like our other messages. Signed-off-by: Jeff King p...@peff.net --- Note that we do have to update one test, which was expecting difftool to exit with 1 (and difftool just propagates diff's exit status in this case). I couldn't find any reasoning in the history for this exit(1), and it dates all the way back to May 2005. So I do not think it had any particular purpose, and for the reasons above, I do not think anyone would be sane to be relying on it. diff.c | 9 +++-- t/t7800-difftool.sh | 2 +- 2 files changed, 4 insertions(+), 7 deletions(-) diff --git a/diff.c b/diff.c index 760fc96..b517d01 100644 --- a/diff.c +++ b/diff.c @@ -2905,7 +2905,6 @@ static void run_external_diff(const char *pgm, { struct argv_array argv = ARGV_ARRAY_INIT; struct argv_array env = ARGV_ARRAY_INIT; - int retval; struct diff_queue_struct *q = diff_queued_diff; if (one two) { @@ -2934,14 +2933,12 @@ static void run_external_diff(const char *pgm, argv_array_pushf(env, GIT_DIFF_PATH_COUNTER=%d, ++o-diff_path_counter); argv_array_pushf(env, GIT_DIFF_PATH_TOTAL=%d, q-nr); - retval = run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env.argv); + if (run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env.argv)) + die(_(external diff died, stopping at %s), name); + remove_tempfile(); argv_array_clear(argv); argv_array_clear(env); - if (retval) { - fprintf(stderr, external diff died, stopping at %s.\n, name); - exit(1); - } } static int similarity_index(struct diff_filepair *p) diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh index 5a193c5..dc30a51 100755 --- a/t/t7800-difftool.sh +++ b/t/t7800-difftool.sh @@ -58,7 +58,7 @@ test_expect_success PERL 'custom tool commands override built-ins' ' test_expect_success PERL 'difftool ignores bad --tool values' ' : expect - test_expect_code 1 \ + test_must_fail \ git difftool --no-prompt --tool=bad-tool branch actual test_cmp expect actual ' -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/6] run_external_diff: drop fflush(NULL)
This fflush was added in d5535ec (Use run_command() to spawn external diff programs instead of fork/exec., 2007-10-19), because flushing buffers before forking is a good habit. But later, 7d0b18a (Add output flushing before fork(), 2008-08-04) added it to the generic run-command interface, meaning that our flush here is redundant. Signed-off-by: Jeff King p...@peff.net --- diff.c | 1 - 1 file changed, 1 deletion(-) diff --git a/diff.c b/diff.c index b517d01..fdb7f6c 100644 --- a/diff.c +++ b/diff.c @@ -2928,7 +2928,6 @@ static void run_external_diff(const char *pgm, argv_array_push(argv, pgm); argv_array_push(argv, name); } - fflush(NULL); argv_array_pushf(env, GIT_DIFF_PATH_COUNTER=%d, ++o-diff_path_counter); argv_array_pushf(env, GIT_DIFF_PATH_TOTAL=%d, q-nr); -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/6] run_external_diff: hoist common bits out of conditional
Whether we have diff_filespecs to give to the diff command or not, we always are going to run the program and pass it the pathname. Let's pull that duplicated part out of the conditional to make it more obvious. Signed-off-by: Jeff King p...@peff.net --- diff.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/diff.c b/diff.c index fdb7f6c..173b657 100644 --- a/diff.c +++ b/diff.c @@ -2907,26 +2907,24 @@ static void run_external_diff(const char *pgm, struct argv_array env = ARGV_ARRAY_INIT; struct diff_queue_struct *q = diff_queued_diff; + argv_array_push(argv, pgm); + argv_array_push(argv, name); + if (one two) { struct diff_tempfile *temp_one, *temp_two; const char *othername = (other ? other : name); temp_one = prepare_temp_file(name, one); temp_two = prepare_temp_file(othername, two); - argv_array_push(argv, pgm); - argv_array_push(argv, name); argv_array_push(argv, temp_one-name); argv_array_push(argv, temp_one-hex); argv_array_push(argv, temp_one-mode); argv_array_push(argv, temp_two-name); argv_array_push(argv, temp_two-hex); argv_array_push(argv, temp_two-mode); if (other) { argv_array_push(argv, other); argv_array_push(argv, xfrm_msg); } - } else { - argv_array_push(argv, pgm); - argv_array_push(argv, name); } argv_array_pushf(env, GIT_DIFF_PATH_COUNTER=%d, ++o-diff_path_counter); -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 6/6] run_external_diff: refactor cmdline setup logic
The current logic makes it hard to see what gets put onto the command line in which cases. Pulling out a helper function lets us see that we have two sets of file data, and the second set either uses the original name, or the other renamed/copy name. Signed-off-by: Jeff King p...@peff.net --- The last patch and this one are getting a little bit into code churn, perhaps. I think the prior one is hands-down more readable. This one, I am on the fence on whether it is a true improvement or simply this is how _I_ would have written it. I won't be offended if we drop it. diff.c | 26 +++--- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/diff.c b/diff.c index 173b657..4b8bfc6 100644 --- a/diff.c +++ b/diff.c @@ -2888,6 +2888,16 @@ static struct diff_tempfile *prepare_temp_file(const char *name, return temp; } +static void add_external_diff_name(struct argv_array *argv, + const char *name, + struct diff_filespec *df) +{ + struct diff_tempfile *temp = prepare_temp_file(name, df); + argv_array_push(argv, temp-name); + argv_array_push(argv, temp-hex); + argv_array_push(argv, temp-mode); +} + /* An external diff command takes: * * diff-cmd name infile1 infile1-sha1 infile1-mode \ @@ -2911,17 +2921,11 @@ static void run_external_diff(const char *pgm, argv_array_push(argv, name); if (one two) { - struct diff_tempfile *temp_one, *temp_two; - const char *othername = (other ? other : name); - temp_one = prepare_temp_file(name, one); - temp_two = prepare_temp_file(othername, two); - argv_array_push(argv, temp_one-name); - argv_array_push(argv, temp_one-hex); - argv_array_push(argv, temp_one-mode); - argv_array_push(argv, temp_two-name); - argv_array_push(argv, temp_two-hex); - argv_array_push(argv, temp_two-mode); - if (other) { + add_external_diff_name(argv, name, one); + if (!other) + add_external_diff_name(argv, name, two); + else { + add_external_diff_name(argv, other, two); argv_array_push(argv, other); argv_array_push(argv, xfrm_msg); } -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] run_external_diff: use an argv_array for the command line
On Sun, Apr 20, 2014 at 02:09:49AM +0400, Max L wrote: One more note: at this moment the problem is slightly deeper. This array is next passed to the execvp function, which now falls with EFAULT on two my machines (both faced this problem after upgrading to ubuntu 14.04, everything 'worked' fine before, looks like now execvp checks input more strictly). This leads to non-working 'git difftool'. Interesting. We're overwriting whatever is after spawn_arg on the stack, so I'd expect the fork/exec to work, but the function to complain while popping the stack frame (though I couldn't get it to do so). I wonder if some kind of stack protection is kicking in, and the NULL doesn't get written or something. Either way, we should definitely address it. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Project idea: github-like diff view
On Sun, Apr 20, 2014 at 04:58:28PM +0700, Duy Nguyen wrote: - --color-words within unified diff format, using background color to show what part of the line has changed. This is only enabled for 1-line changes. See contrib/diff-highlight. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [SECURITY PATCH] git-prompt.sh: don't put unsanitized branch names in $PS1
On Mon, Apr 21, 2014 at 03:07:28PM -0400, Richard Hansen wrote: Both bash and zsh subject the value of PS1 to parameter expansion, command substitution, and arithmetic expansion. Rather than include the raw, unescaped branch name in PS1 when running in two- or three-argument mode, construct PS1 to reference a variable that holds the branch name. Because the shells do not recursively expand, this avoids arbitrary code execution by specially-crafted branch names such as '$(IFS=_;cmd=sudo_rm_-rf_/;$cmd)'. Cute. We already disallow quite a few characters in refnames (including space, as you probably discovered), and generally enforce that during ref transfer. I wonder if we should tighten that more as a precuation. It would be backwards-incompatible, but I wonder if things like $ and ; in refnames are actually useful to people. Did you look into similar exploits with completion? That's probably slightly less dire (this one hits you as soon as you cd into a malicious clone, whereas completion problems require you to actually hit tab). I'm fairly sure that we miss some quoting on pathnames, for example. That can lead to bogus completion, but I'm not sure offhand if it can lead to execution. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Project idea: github-like diff view
On Tue, Apr 22, 2014 at 04:59:17PM +0700, Duy Nguyen wrote: On Sun, Apr 20, 2014 at 9:46 PM, Jeff King p...@peff.net wrote: On Sun, Apr 20, 2014 at 04:58:28PM +0700, Duy Nguyen wrote: - --color-words within unified diff format, using background color to show what part of the line has changed. This is only enabled for 1-line changes. See contrib/diff-highlight. Thanks. I'd rather have it built in core git still. I'll try to see if I can rewrite it in C. Else, any objection to promote it to a core helper and setup pager automatically? We can have a config key to turn it off, but if git diff is colored, then it could be on by default. If you are going to write it as part of git, it would be interesting to try using a real word-diff to find the inter-line changes, instead of the front and back match heuristic that the script uses. I know there are some cases that would look better, like: -foo(buf, len); +foo(obj-buf, obj-len); but I suspect some cases would also look worse. It would be interesting to experiment with, though. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] pack-bitmap: do not core dump
On Tue, Apr 22, 2014 at 03:53:02PM -0700, Kyle J. McKay wrote: So I was trying to use pack.writebitmaps=true and all I got was core dumps. Eek. The fix with a real subject line ;) is below. I think perhaps this should be picked up for the 2.0.0 release. (Patch is against master.) Yes, this is definitely the sort of bugfix we want to see during the -rc period (well, we would prefer not to see bugs at all, but if we must have them, fixes are helpful). 8 Subject: [PATCH] ewah_bitmap.c: do not assume size_t and eword_t are the same size Thanks for a very well-written commit message. I think your fix makes sense: - self-rlw = self-buffer + (rlw_offset / sizeof(size_t)); + self-rlw = self-buffer + (rlw_offset / sizeof(eword_t)); We could also write it as: self-rlw = (uint8_t *)self-buffer + rlw_offset; but I do not think that is necessarily any more readable, especially because we probably need to cast it like: self-rlw = (eword_t *)((uint8_t *)self-buffer + rlw_offset); Given that self-rlw is a pointer to eword_t, though, we can assume rlw_offset is always going to be a multiple of sizeof(eword_t) anyway (and if it is not, the division in the original is a big problem, but I do not think that is the case). So why do any uint8_t math in the first place? I think we could write it as: eword_t *old = self-buffer; ... realloc ... self-rlw = self-buffer + (self-rlw - old); I'm fine with your patch, though. I also poked through the rest of the bitmap code looking for similar problems, but didn't find any. I do not think this was a systemic issue with bad use of types; it was just a think-o that happened to work on 64-bit machines. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Pack bitmaps on Git for Windows
On Tue, Apr 22, 2014 at 05:58:10PM +1000, Bryan Turner wrote: It looks like the documentation for bitmaps is being included in the 1.9.2 release of Git for Windows but the functionality itself is not present. For example, doc\git\html\git-config.html includes settings like these: pack.useBitmaps I think this is a documentation problem with Git for Windows. The bitmap code is going into v2.0, but there should be nothing (not even documentation) in v1.9.2. I don't know how the msysgit folks build their documentation, but I wonder if it accidentally built from current 'master' rather than v1.9.2. You might get more information at the msysgit mailing list: https://groups.google.com/forum/#!forum/msysgit -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: error: git-remote-https died of signal 13
On Sun, Apr 20, 2014 at 08:42:15PM -0400, Greg M wrote: Using git version 1.9.2 I am getting this error: [normal@laptop tmp]$ git clone https://github.com/mozilla/rust.git Cloning into 'rust'... remote: Reusing existing pack: 296648, done. remote: Counting objects: 80, done. remote: Compressing objects: 100% (77/77), done. remote: Total 296728 (delta 22), reused 9 (delta 3) Receiving objects: 100% (296728/296728), 110.68 MiB | 190.00 KiB/s, done. Resolving deltas: 100% (238828/238828), done. Checking connectivity... done. error: git-remote-https died of signal 13 Thanks for a thorough bug report. I looked through your strace output, and this really does look like another case of OpenSSL getting SIGPIPE while closing the connection. It's odd, though, as your curl version has my patches, along with similar extra fixes in 854aca5 (multi: ignore sigpipe internally, 2014-02-17). But I guess there may be a code path that needs similar treatment. The easiest way to find it is probably to attach a debugger to the running git-remote-https, and get a backtrace when it dies from SIGPIPE. You'll probably want to install your system's debug packages for curl, too. I have curl version 7.36 though, in case some of the other output matters: [normal@laptop tmp]$ curl --version curl 7.36.0 (x86_64-unknown-linux-gnu) libcurl/7.36.0 OpenSSL/1.0.1g zlib/1.2.8 libssh2/1.4.3 Another possibility is that your curl binary is up-to-date, but you are linking against an older version of libcurl that does not have the SIGPIPE workarounds. I'm not sure of the best way to check that, but a hacky way under Linux is: $ ldd $(git --exec-path)/git-remote-https | grep libcurl libcurl.so.4 = /usr/lib/x86_64-linux-gnu/libcurl.so.4 $ strings /usr/lib/x86_64-linux-gnu/libcurl.so.4 | grep '7\.' CLIENT libcurl 7.36.0 -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
On Wed, Apr 23, 2014 at 12:12:14PM -0700, Junio C Hamano wrote: +ulimit_stack=ulimit -s 64 +test_lazy_prereq ULIMIT 'bash -c '$ulimit_stack'' With this implementaion, ULIMIT implies bash, and we use bash that appears on user's PATH that may not be the one the user chose to run git with. Can't we fix both of them by using $SHELL_PATH? I don't think so. The point is that we _must_ use bash here, not any POSIX shell. So my $SHELL_PATH is /bin/sh, which is dash, and would not run the test. We want to run some bash if we can. We may pick a bash on the user's PATH that is not what they put into $SHELL_PATH, but that should be relatively rare. And the consequence is that either that bash works fine and we run the test, or it does not, and we skip the test. How about doing it along this line instead? run_with_limited_stack () { $SHELL_PATH -c ulimit -s 64 $* } test_lazy_prereq ULIMIT run_with_limited_stack true That's a much more direct test. I like it (aside from the $SHELL_PATH thing as described above). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
On Wed, Apr 23, 2014 at 09:53:25AM +0200, Stepan Kasal wrote: I have found out that ulimit -s does not work on Windows. Adding this as a prerequisite, we will skip the test there. I found this bit weird, as the test originated on Windows. Did it never actually cause a failure there (i.e., the ulimit -s doesn't do anything)? Or does ulimit fail? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH v2 1/3] sh-setup: export GIT_DIR
On Wed, Apr 23, 2014 at 02:42:38PM -0500, Felipe Contreras wrote: It is what the clients of this library expect. Is it? Passing GIT_DIR to sub-invocations of git will change how they determine the repo and working tree. Your patch seems to cause failures all over the test suite. Without looking too hard, I'd guess the problems are one of: 1. Setting GIT_DIR fixes the repo directory for all sub-invocations. A script that does cd some-other-repo.git git You'd need to teach calling scripts to unset GIT_DIR when trying to move to another repo. 2. If GIT_DIR is set but GIT_WORK_TREE is not, then GIT_WORK_TREE will default to .. It might be sufficient to set GIT_WORK_TREE when you are setting GIT_DIR here. But as I said, I didn't look too hard, so there might be other complications. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: How to trim the fat on my log graphs
On Wed, Apr 23, 2014 at 02:59:26PM -0500, Robert Dailey wrote: I referred back to the documentation for --decorate: --decorate[=short|full|no] Print out the ref names of any commits that are shown. If short is specified, the ref name prefixesrefs/heads/, refs/tags/ and refs/remotes/ will not be printed. If full is specified, the full ref name (including prefix) will be printed. The default option is short. since default is short, and its documented to exclude tag names, I would expect them to not be there. I also tried removing it to test but i saw no difference; tag names are still visible. Are you reading that as if short is specified, then refs whose names are prefixed with refs/heads, refs/tags, etc will be omitted entirely from decoration? The intent is if short is specified, then a ref whose names has the prefix of refs/heads, refs/tags, etc, will have that prefix removed when showing it. IOW, the options are: $ git log -1 --no-decorate | head -1 commit 0bc85abb7aa9b24b093253018801a0fb43d01122 $ git log -1 --decorate=short | head -1 commit 0bc85abb7aa9b24b093253018801a0fb43d01122 (HEAD, tag: v1.9.2, origin/maint) $ git log -1 --decorate=full | head -1 commit 0bc85abb7aa9b24b093253018801a0fb43d01122 (HEAD, tag: refs/tags/v1.9.2, refs/remotes/origin/maint) If that's the confusion, feel free to suggest better wording for the documentation. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
On Wed, Apr 23, 2014 at 01:48:05PM -0700, Junio C Hamano wrote: I don't think so. The point is that we _must_ use bash here, not any POSIX shell. Sorry, but I do not understand. Isn't what you want any POSIX shell with 'ulimit -s 64' supported? Sure, that would be fine, but the original patch which started this thread claimed that bash was required. I had assumed that to be true, but it seems like it's not: $ dash -c 'ulimit -s ulimit -s 64 ulimit -s' 8192 64 If we are just using the same shell we are already running, then why invoke it by name in the first place? IOW, why not: run_with_limited_stack () { (ulimit -s 64 $@) } -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: error: git-remote-https died of signal 13
On Wed, Apr 23, 2014 at 07:49:09AM -0400, Greg M wrote: The easiest way to find it is probably to attach a debugger to the running git-remote-https, and get a backtrace when it dies from SIGPIPE. You'll probably want to install your system's debug packages for curl, too. The backtrace: Program received signal SIGPIPE, Broken pipe. 0x7fdcfd511a2d in write () from /usr/lib/libpthread.so.0 (gdb) bt #0 0x7fdcfd511a2d in write () from /usr/lib/libpthread.so.0 #1 0x7fdcfd81a0f6 in sock_write () from /usr/lib/libcrypto.so.1.0.0 #2 0x7fdcfd817edb in BIO_write () from /usr/lib/libcrypto.so.1.0.0 #3 0x7fdcfc662902 in ssl3_write_pending () from /usr/lib/libssl.so.1.0.0 #4 0x7fdcfc664b77 in ssl3_dispatch_alert () from /usr/lib/libssl.so.1.0.0 #5 0x7fdcfc660822 in ssl3_shutdown () from /usr/lib/libssl.so.1.0.0 #6 0x7fdcfd2e944e in Curl_ossl_close () from /usr/lib/libcurl.so.4 #7 0x7fdcfd2b6459 in Curl_disconnect () from /usr/lib/libcurl.so.4 #8 0x7fdcfd2c8131 in curl_multi_cleanup () from /usr/lib/libcurl.so.4 #9 0x0040764b in ?? () #10 0x00404e4d in ?? () #11 0x7fdcfcf0fb05 in __libc_start_main () from /usr/lib/libc.so.6 #12 0x0040552e in ?? () Thanks, that's very helpful. I wasn't able to reproduce your problem locally, but I suspect the curl patch below may fix it: diff --git a/lib/multi.c b/lib/multi.c index bc93264..72e4825 100644 --- a/lib/multi.c +++ b/lib/multi.c @@ -1804,10 +1804,13 @@ static void close_all_connections(struct Curl_multi *multi) conn = Curl_conncache_find_first_connection(multi-conn_cache); while(conn) { +SIGPIPE_VARIABLE(pipe_st); conn-data = multi-closure_handle; +sigpipe_ignore(conn-data, pipe_st); /* This will remove the connection from the cache */ (void)Curl_disconnect(conn, FALSE); +sigpipe_restore(pipe_st); conn = Curl_conncache_find_first_connection(multi-conn_cache); } Let me know if you need any pointers on getting curl built or tested with git. Daniel, I think the similar fix to curl_multi_cleanup in commit a900d45 missed this code path, and we need something like the above patch. I know you were trying to keep the SIGPIPE mess at the entrance-points to the library, and this works against that. But we need a SessionHandle to pass to sigpipe_ignore to look at its no_signal flag, and of course in the case of multi we may have several such handles. If there's a similar flag we can check on the multi handle, we could just cover all of curl_multi_cleanup with a single ignore/reset pair. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What is missing from Git v2.0
On Thu, Apr 24, 2014 at 01:26:33PM -0500, Felipe Contreras wrote: Jonathan Nieder wrote: Stefan Beller wrote: I may have missunderstood. So today you cannot commit if you don't provide an email address (usually the first time you try to commit, git asks to git config --global author.email=y...@mail.here), if I remember correctly, so there is definitely a valid (i.e. user approved) email address. Not true. But you do get a big wall of text when you make your first commit without an EMAIL envvar or configured [user] section, including Only if you don't have a fully qualified hostname. No, we alway warn if the values weren't explicitly provided: $ git config --global --unset user.email $ git commit --allow-empty -m foo [master 1e987ba] foo Committer: Jeff King p...@sigill.intra.peff.net Your name and email address were configured automatically based on your username and hostname. Please check that they are accurate. You can suppress this message by setting them explicitly: git config --global user.name Your Name git config --global user.email y...@example.com After doing this, you may fix the identity used for this commit with: git commit --amend --reset-author but we will consider several sources explicit, like $GIT_COMMITTER_EMAIL, $EMAIL, and of course user.email: $ EMAIL=whate...@example.com git commit --allow-empty -m foo [master e75f17a] foo We die when the values are implicitly derived from the system _and_ they look bogus: $ sudo rm /etc/mailname $ sudo hostname bogus $ git commit --allow-empty -m foo *** Please tell me who you are. Run git config --global user.email y...@example.com git config --global user.name Your Name to set your account's default identity. Omit --global to set the identity only in this repository. fatal: unable to auto-detect email address (got 'peff@bogus.(none)') -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Harmful LESS flags
On Thu, Apr 24, 2014 at 12:29:21PM -0700, Junio C Hamano wrote: David Kastrup d...@gnu.org writes: Junio C Hamano gits...@pobox.com writes: Traditionally, because the tool grew in a context of being used in a project whose participants are at least not malicious, always having to be on the lookout for fear of middle-of-line tabs hiding bad contents near the right edges of lines has never been an issue. My beef is not with hiding bad contents but with hiding contents. It makes the output useless for seeing what is actually happening as soon as the option starts having an effect. My suspicion is that one of the reasons why S was chosen to be in the default was to mildly discourage people from busting the usual line-length limit, but I am not Linus ;-) I would think it's the opposite. Long lines look _horrible_ without -S, as they get wrapped at awkward points. Using -S means that long lines don't bug you, unless you really want to scroll over and see the content. I really think the right solution here is to teach less to make it more obvious that there is something worth scrolling over to. Here's a very rough patch for less, if you want to see what I'm thinking of. diff --git a/input.c b/input.c index b211323..01aa411 100755 --- a/input.c +++ b/input.c @@ -178,6 +178,7 @@ get_forw_line: */ if (chopline || hshift 0) { + set_chopped_marker(ch_tell()-1); do { if (ABORT_SIGS()) diff --git a/line.c b/line.c index 1eb3914..b3358a0 100755 --- a/line.c +++ b/line.c @@ -1080,6 +1080,20 @@ set_status_col(c) attr[0] = AT_NORMAL|AT_HILITE; } + public void +set_chopped_marker(pos) + POSITION pos; +{ + /* +* Roll back output by one character; probably +* we need to actually walk curr back further +* for multibyte characters? +*/ + column--; + curr--; + store_char('', AT_NORMAL|AT_HILITE, NULL, pos); +} + /* * Get a character from the current line. * Return the character as the function return value, -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Harmful LESS flags
On Thu, Apr 24, 2014 at 02:47:01PM -0700, Junio C Hamano wrote: And I do agree that the chopped marker would be a very sensible thing to show in the -S output; I would have chosen $ myself for that to match an existing practice in (setq truncate-lines t) in Emacs, though. Hmm. I do not use Emacs, but I explicitly avoided $ because of its end-of-line connotations. E.g., in cat -A it means the opposite: this is the real \n end-of-line. But if there's existing precedent for $, that would be fine with me. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Harmful LESS flags
On Thu, Apr 24, 2014 at 11:48:30PM +0200, David Kastrup wrote: I really think the right solution here is to teach less to make it more obvious that there is something worth scrolling over to. Here's a very rough patch for less, if you want to see what I'm thinking of. Still useless. I'm not actually interested in a more prominent I could be useful indicator. So don't set -S, then. There are two questions here: 1. Can less do a better job of indicating what's in the input when -S is in effect? 2. What should get put into $LESS by default? I was specifically addressing (1). Your comment does not help at all there. It could have an impact on (2), but you didn't say anything besides I don't like it. That doesn't add anything to the conversation. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] git tag --contains: avoid stack overflow
On Thu, Apr 24, 2014 at 02:24:39PM +0200, Stepan Kasal wrote: From: Jean-Jacques Lafay jeanjacques.la...@gmail.com In large repos, the recursion implementation of contains(commit, commit_list) may result in a stack overflow. Replace the recursion with a loop to fix it. This problem is more apparent on Windows than on Linux, where the stack is more limited by default. See also this thread on the msysGit list: https://groups.google.com/d/topic/msysgit/FqT6boJrb2g/discussion [jes: re-written to imitate the original recursion more closely] Thomas Braun pointed out several documentation shortcomings. Tests are run only if ulimit -s is available. This means they cannot be run on Windows. Signed-off-by: Jean-Jacques Lafay jeanjacques.la...@gmail.com Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de Tested-by: Stepan Kasal ka...@ucw.cz Thanks, this version looks good to me. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What is missing from Git v2.0
On Fri, Apr 25, 2014 at 12:45:27PM -0500, Felipe Contreras wrote: When I say literally everbody agreed to move away from the name index (except Junio and another guy) I mean it. I even composed a list: http://article.gmane.org/gmane.comp.version-control.git/233469 Jeff King, Jonathan Nieder, Matthieu Moy, they all agreed. With reference to my name, your email says: Jeff King: staging area makes perfect sense But here's that statement in context[1]: So the term staging area makes perfect sense to me; it is where we collect changes to make a commit. I am willing to accept that does not to others (native English speakers or no), and that we may need to come up with a better term. But I think just calling it the stage is even worse; it loses the concept that it is a place for collecting and organizing. In other words, I was saying that the term makes sense to me, and primarily comparing favorably to a worse proposal. I am not commenting at all on a plan to change names, which is what you are claiming above. I do think the term staging area is fine, but picking a term is only step one of a plan. The rest is deciding how to make the change, and whether it is worth it. I remain undecided on the latter bits. Please don't quote me out of context in a way that implies that I am decided. -Peff [1] http://article.gmane.org/gmane.comp.version-control.git/168012 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What is missing from Git v2.0
On Fri, Apr 25, 2014 at 01:27:17PM -0500, Felipe Contreras wrote: I specifically said everybody agreed to move away from the name 'index', I didn't say everybody agreed on the staged area although the vast majority did, and I didn't say anybody agreed on my patches, although some did. I think I was clear. Maybe I was not clear in my response, so let me try again. I do _not_ necessarily agree that we need to move away from the name index. I am specifically saying that your everyone includes my name explicitly, and that I do not agree with the statement you are attributing to it. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What is missing from Git v2.0
On Fri, Apr 25, 2014 at 01:57:59PM -0500, Felipe Contreras wrote: Maybe I was not clear in my response, so let me try again. I do _not_ necessarily agree that we need to move away from the name index. So you agree that the index is a bad name, and you agree staging area is a better name, yet you don't agree we should move away from the term index? I don't agree that the index is a bad name, because that implies some objective level of bad. I do think the name staging area is fine, and I think it may even be better than index, if we were picking a name out of the blue. I am undecided on whether we should move away from the term index. The way you have phrased it seems like you are trying to create a logical contradiction: A is bad, B is good, therefore we should move from A to B. But that neglects the cost of moving. Frankly, I am not that interested in discussing it with you. I _am_ interested in you not using my name to claim that I believe things I do not. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] commit: do not complain of empty messages from -C
When we pick another commit's message, we die() immediately if we find that it's empty and we are not going to run an editor (i.e., when running -C instead of -c). However, this check is redundant and harmful. It's redundant because we will already notice the empty message later, after we would have run the editor, and die there (just as we would for a regular, not -C case, where the user provided an empty message in the editor). It's harmful for a few reasons: 1. It does not respect --allow-empty-message. As a result, a git rebase -i cannot pick such a commit. So you cannot even go back in time to fix it with a reword or edit instruction. 2. It does not take into account other ways besides the editor to modify the message. For example, git commit -C empty-commit -m foo could take the author information from empty-commit, but add a message to it. There's more to do to make that work correctly (and right now we explicitly forbid -C with -m), but this removes one roadblock. 3. The existing check is not enough to prevent segfaults. We try to find the \n\n header/body boundary in the commit. If it is at the end of the string (i.e., no body), _or_ if we cannot find it at all (i.e., a truncated commit object), we consider the message empty. With -C, that's OK; we die in either case. But with -c, we continue on, and in the case of a truncated commit may end up dereferencing NULL+2. Signed-off-by: Jeff King p...@peff.net --- I care most about the rebase -i thing, especially because it is the primary method for fixing old mistakes. The segfault fix is a nice bonus. The git commit -C empty -m foo thing might be nice, but I don't plan to work on it further. The semantics would need to be figured out (does it append or replace?), and you can always just use -c to fire up an actual editor and write the new content there. builtin/commit.c | 5 ++--- t/t7500-commit.sh | 11 ++- 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 9cfef6c..65c069d 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -650,9 +650,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix, } else if (use_message) { char *buffer; buffer = strstr(use_message_buffer, \n\n); - if (!use_editor (!buffer || buffer[2] == '\0')) - die(_(commit has empty message)); - strbuf_add(sb, buffer + 2, strlen(buffer + 2)); + if (buffer) + strbuf_add(sb, buffer + 2, strlen(buffer + 2)); hook_arg1 = commit; hook_arg2 = use_message; } else if (fixup_message) { diff --git a/t/t7500-commit.sh b/t/t7500-commit.sh index bdc1f29..116885a 100755 --- a/t/t7500-commit.sh +++ b/t/t7500-commit.sh @@ -223,7 +223,8 @@ test_expect_success 'Commit without message is allowed with --allow-empty-messag git add foo empty git commit --allow-empty-message empty - commit_msg_is + commit_msg_is + git tag empty-message-commit ' test_expect_success 'Commit without message is no-no without --allow-empty-message' ' @@ -240,6 +241,14 @@ test_expect_success 'Commit a message with --allow-empty-message' ' commit_msg_is hello there ' +test_expect_success 'commit -C empty respects --allow-empty-message' ' + echo more foo + git add foo + test_must_fail git commit -C empty-message-commit + git commit -C empty-message-commit --allow-empty-message + commit_msg_is +' + commit_for_rebase_autosquash_setup () { echo first content line foo git add foo -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Apr 2014, #08; Fri, 25)
On Fri, Apr 25, 2014 at 03:50:26PM -0700, Junio C Hamano wrote: * jk/external-diff-use-argv-array (2014-04-21) 6 commits (merged to 'next' on 2014-04-22 at e6d92d7) + run_external_diff: refactor cmdline setup logic + run_external_diff: hoist common bits out of conditional + run_external_diff: drop fflush(NULL) + run_external_diff: clean up error handling + run_external_diff: use an argv_array for the environment + run_external_diff: use an argv_array for the command line Code clean-up. Will keep in 'next' for the remainder of the cycle. The first one does fix a possible stack overflow (albeit of one NULL, not arbitrary content, so I don't think it's exploitable). We may want to do: diff --git a/diff.c b/diff.c index 54d5308..a03744b 100644 --- a/diff.c +++ b/diff.c @@ -2894,7 +2894,7 @@ static void run_external_diff(const char *pgm, int complete_rewrite, struct diff_options *o) { - const char *spawn_arg[10]; + const char *spawn_arg[11]; int retval; const char **arg = spawn_arg[0]; struct diff_queue_struct *q = diff_queued_diff; as a fix for maint/2.0.0 in the interim. I can write a commit message for that if you're interested. * fc/publish-vs-upstream (2014-04-21) 8 commits - sha1_name: add support for @{publish} marks - sha1_name: simplify track finding - sha1_name: cleanup interpret_branch_name() - branch: display publish branch - push: add --set-publish option - branch: add --set-publish-to option - Add concept of 'publish' branch - t5516 (fetch-push): fix test restoration Add branch@{publish}; it seems that this is somewhat different from Ram and Peff started working on. There were many discussion messages going back and forth but it does not appear that the design issues have been worked out among participants yet. If you are waiting on me, I do not have much else to say on this topic. @{publish} as specified by Felipe is not useful to me, and I would continue to pursue @{push} separately as the remote-tracking branch of where you would push to. I think there is room for both concepts. As for the patches themselves, I have not reviewed them carefully, and would prefer not to. As I mentioned before, though, I would prefer the short @{p} not be taken for @{publish} until it has proven itself. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Apr 2014, #08; Fri, 25)
On Fri, Apr 25, 2014 at 08:36:55PM -0500, Felipe Contreras wrote: As for the patches themselves, I have not reviewed them carefully, and would prefer not to. As I mentioned before, though, I would prefer the short @{p} not be taken for @{publish} until it has proven itself. Presumably you want to save it for @{push}. While I'm not against to having just @{publish} for now, I'm farily certain most people would be using @{publish} and not @{push}, as that's what `git branch -v` would show, and it would be closely similar to @{upstream}. Therefore it would make sense to use @{p} for @{publish} No, I do not think it would be a good idea for @{push}, either. If we have two concepts so similarly named (and especially if we add @{pull}, which has also been mentioned), then I think having @{p} just adds to confusion. So I would much rather wait and see. It is very easy to add @{p} later, but it is very hard to take it back once used. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Revert Stop starting pager recursively
[+cc Duy, whose patch this is] On Fri, Apr 25, 2014 at 04:10:49PM -0400, Jörn Engel wrote: A second option is to add a --pager (or rather --no-pager) option to the command line and allow the user to specify GIT_PAGER=git --no-pager -p column --mode='dense color' git -p branch I think we have --no-pager already. But the -p is turning _on_ the pager, so you could also just omit it. IOW, I really don't understand why the original command was not simply: GIT_PAGER=git column --mode='dense color' git -p branch The whole infinite loop that the original commit solved is caused by specifying the -p. So it sounds like the right solution is don't do that. Am I missing something useful that the -p does? I wonder if perhaps the intent was that the user might have set pager.column, in which case the use of the pager is implied. I still think that the right solution is to use --no-pager explicitly then. If the user is invoking git inside GIT_PAGER, it is up to them to save themselves from infinite recursion. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] subtree/Makefile: Standardize (esp. for packagers)
On Sat, Apr 26, 2014 at 02:56:15PM +1000, nod.h...@gmail.com wrote: contrib/subtree/Makefile is a shambles in regards to it's consistency with other makefiles, which makes subtree overly painful to include in build scripts. Two major issues are present: Firstly, calls to git itself (for $(gitdir) and $(gitver)), making building difficult on systems that don't have git. Secondly, the Makefile uses the variable $(libexecdir) for defining the exec path. (...) I hate to be that guy, but could I get an opinion on the proposed patch? It's OK to be that guy; prompting or reposting when a patch has been overlooked is normal here. Is git interested in purely makefile patches, or should I find further improvements to make in subtree and purpose this again with those? Makefile improvements are fine on their own. I think the problem is that contrib/subtree does not really have an active dedicated area maintainer. Your changes look fine to me from a cursory examination. It would probably be more readable as four patches (the 3 fix points from your list, plus the minor fixes mentioned at the end). Then each patch stands on its own, can say what problem it's fixing, and how. I've left `rm -f -r subproj mainline` in the clean rule for now, however I'd suggest those actually belong in contrib/subtree/t/Makefile:clean, given that they are only ever generated by `make test`. But given that there aren't any other comparable setups in contrib/, I'm somewhat apprehensive to move them without opinion. Do we even make those directories anymore? It looks like they are part of the tests, but the whole test script runs inside its own trash directory. I wonder if they are vestiges from the time when subtree was its own repository outside of contrib/. If so, they can be dropped here (and from .gitignore). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Revert Stop starting pager recursively
[+cc Duy for real this time] On Sat, Apr 26, 2014 at 10:27:07AM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: [+cc Duy, whose patch this is] On Fri, Apr 25, 2014 at 04:10:49PM -0400, Jörn Engel wrote: A second option is to add a --pager (or rather --no-pager) option to the command line and allow the user to specify GIT_PAGER=git --no-pager -p column --mode='dense color' git -p branch I think we have --no-pager already. But the -p is turning _on_ the pager, so you could also just omit it. IOW, I really don't understand why the original command was not simply: GIT_PAGER=git column --mode='dense color' git -p branch The whole infinite loop that the original commit solved is caused by specifying the -p. So it sounds like the right solution is don't do that. Am I missing something useful that the -p does? My reading of this is that Duy wanted to let -p kick in to pass the columnized output, which could be more than a page long, thru the usual less or whatever pager. I thought that at first, too. But it doesn't work, because his patch actually _suppresses_ the pager. For example, something like: git config pager.column less git config pager.branch git column --mode=dense git branch actively works without the original patch (and after Jörn's revert), but not with current git. However, you would not use -p there in any case, because it kicks in early and only respects $GIT_PAGER. And I would also say that you are much better off there actually specifying your whole pipeline as a unit: git config pager.branch git column --mode=dense | less I do a similar thing where I have PAGER=less, but set pager.log to diff-highlight | less. I tried at one point to have it chain automatically, but the setup is just too complicated (as we're seeing here), and it's not _that_ big a deal to explicitly pipe to the next pager in the sequence. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Adding git hooks
On Sat, Apr 26, 2014 at 10:24:50AM -0700, Junio C Hamano wrote: Suvorov Ivan sv...@inbox.ru writes: I want to extend the functionality of git due to the possibility of separation of the user repository into 2 parts - one part will be stored as usual, under version control git, and the second part will be stored in another location such as an FTP-server. Sounds like you are looking for git-annex, perhaps? I agree that is the right approach, but git-annex and git-media work _above_ the object layer, and taint the history by storing symlinks in git instead of the real sha1s. I'd love to see a solution that does the same thing, but lives at the pack/loose object layer. Basically: 1. Teach sha1-file.c to look for missing objects by hitting an external script, like: git config odb.command curl https://example.com/%s; and place them in an alternates-like separate object database. 2. Teach the git protocol a new extension to say don't bother sending blobs over size X. You'd have to coordinate that X with the source from your odb.command. You'd probably want to wrap up the odb.command in a more fancy helper. For example, for performance, we'd probably want to be able to query it for which objects do you have, as well as fetch this object. And it would be nice if it could auto-query the X for step 2, and manage pruning local objects (e.g., when they become deep in history). We'd probably also want to teach a few places in git to treat external objects specially. For example, they should probably be auto-treated as binary, so that a log -p does not try to fetch all of them. And likewise, things like log -S should probably ignore them by default. I have a messy sketch of step 1 that I did quite a while ago, but haven't proceeded further on it. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Uses git-credential for git-imap-send
On Sat, Apr 26, 2014 at 10:50:26AM -0700, Dan Albert wrote: git-imap-send was directly prompting for a password rather than using git-credential. git-send-email, on the other hand, supports git-credential. Yay. These sorts of conversions were definitely on my mind when I did the original credential work, and have mostly been waiting on somebody who cares enough about specific tools to switch them over. static struct imap_store *imap_open_store(struct imap_server_conf *srvc) { + struct credential cred = CREDENTIAL_INIT; Your patch seems to have been whitespace-damaged in transit (there are hints for gmail in the format-patch and send-email manpages). + cred.username = xstrdup(srvc-user); + cred.protocol = xstrdup(imap); + cred.host = xstrdup(srvc-host); + credential_fill(cred); + srvc-pass = xstrdup(cred.password); The imap protocol will get passed along to any helpers. I think the only one which actually cares about the specific content is the osxkeychain helper, which differentiates between imap and imaps. I think we can know which is which at this point and do: cred.protocol = xstrdup(srvc-use_ssl ? imaps : imap); Other than that, the patch looks good. Tests would be wonderful, but we have no infrastructure at all, so it would involve writing a mock IMAP server. Having implemented both IMAP servers and clients in a previous life, I could not ask anyone to go through that pain. :) -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] subtree/Makefile: Standardize (esp. for packagers)
On Sun, Apr 27, 2014 at 12:35:13PM +1000, James Denholm wrote: Do we even make [subproject and mainline] anymore? It looks like they are part of the tests, but the whole test script runs inside its own trash directory. subproject and mainline are actually made in contrib/subtree, but I'll look at perhaps fixing that when I split the proposal into a series as you suggest. Are they? I couldn't find any reference to them as directories except in the test script, and doing a make from contrib/subtree didn't create them. I'll leave it to you to investigate further whether the clean rules are cruft or not, but certainly if they are, cleaning up cruft is a good thing. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Uses git-credential for git-imap-send
On Sat, Apr 26, 2014 at 11:30:11AM -0700, Dan Albert wrote: I had resent a less broken patch after I noticed the tabs, but it seems to have gotten lost. Better formatted patch at the bottom of this message. Your emails (including this one) are multipart/alternatives with an html part, which will cause the mailing list software to reject them. This email also still seems whitespace-damaged to me (the leading tabs are collapsed to a single space). It looks like you're using gmail to send; you might try using git send-email (the example at the end of git help send-email can walk you through it). About imap vs. imaps: I actually had your exact line in before, but decided that as long as its for the same host the user probably wants to use the same credentials for both imap and imaps (if they for some reason had both configured). Hard coding imap allows them to use either protocol with only one keychain entry. The use case is a stretch, but it doesn't do any harm to implement it this way. My concerns with conflating the two are: 1. The system helper might care about the distinction and prefer imaps (e.g., it might already have the credential stored for your regular mail client, which uses imaps). But osxkeychain is the only helper that makes the distinction, and I don't really know how OS X's keychain code handles the distinction. 2. With http and https, we are careful to make the distinction, because we would not want to accidentally share a credential over http that was stored via https. But it's pretty easy to use an http URL rather than an https one. It's probably pretty rare to accidentally turn off imap SSL. So I'd be OK with leaving it as imap for now, and waiting for somebody to actually come up with a real case where the distinction matters. --- Uses git-credential for git-imap-send git-imap-send was directly prompting for a password rather than using git-credential. git-send-email, on the other hand, supports git-credential. This is a necessary improvement for users that use two factor authentication, as they should not be expected to remember all of their app specific passwords. Signed-off-by: Dan Albert danalb...@google.com --- imap-send.c | 29 +++-- 1 file changed, 15 insertions(+), 14 deletions(-) A side note on formatting your commit message: The maintainer picks up patches from the list with git am, which will take everything up to the first --- as the commit message, and discard everything after up to the start of the diff. So in this case it would take your cover-letter material as the commit message, and drop your actual commit message. The usual formats are either to put the cover letter material after the ---, like: $COMMIT_MESSAGE Signed-off-by: You --- $COVER_LETTER $DIFFSTAT $DIFF or to use a scissors-line -- 8 -- instead of three-dash: $COVER_LETTER -- 8 -- $COMMIT_MESSAGE --- $DIFFSTAT $DIFF -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Revert Stop starting pager recursively
On Sun, Apr 27, 2014 at 09:12:39AM +0700, Duy Nguyen wrote: The intent of the commit was that is a stupid thing to do, but it's not so obvious from the first glance, do not freeze my system for my mistake. But if it stops an actual use case, then I agree it should be reverted. Thanks for the explanation. I think we should just go with Jörn's patch as-is, then. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames
If you have existing decomposed filenames in your git repository (e.g., that were created with older versions of git that did not precompose unicode), a modern git with core.precomposeunicode set does not handle them well. The problem is that we normalize the paths coming from the disk into their precomposed form, and then compare them against the literal bytes in the index. This makes things better if you have the precomposed form in the index. It makes things worse if you actually have the decomposed form in the index. As a result, paths with decomposed filenames may have their precomposed variants listed as untracked files (even though the precomposed variants do not exist on-disk at all). This patch just adds a test to demonstrate the breakage. Some possible fixes are: 1. Tell everyone that NFD in the git repo is wrong, and they should make a new commit to normalize all their in-repo files to be precomposed. This is probably not the right thing to do, because it still doesn't fix checkouts of old history. And it spreads the problem to people on byte-preserving filesystems (like ext4), because now they have to start precomposing their filenames as they are adde to git. 2. Do all index filename comparisons using a UTF-8 aware comparison function when core.precomposeunicode is set. This would probably have bad performance, and somewhat defeats the point of converting the filenames at the readdir level in the first place. 3. Convert index filenames to their precomposed form when we read the index from disk. This would be efficient, but we would have to be careful not to write the precomposed forms back out to disk. 4. Introduce some infrastructure to efficiently match up the precomposed/decomposed forms. We already do something similar for case-insensitive files using name-hash.c. We might be able to adapt that strategy here. Signed-off-by: Jeff King p...@peff.net --- t/t3910-mac-os-precompose.sh | 10 ++ 1 file changed, 10 insertions(+) diff --git a/t/t3910-mac-os-precompose.sh b/t/t3910-mac-os-precompose.sh index e4ba601..23aa61e 100755 --- a/t/t3910-mac-os-precompose.sh +++ b/t/t3910-mac-os-precompose.sh @@ -140,6 +140,16 @@ test_expect_success Add long precomposed filename ' git add * git commit -m Long filename ' + +test_expect_failure 'handle existing decomposed filenames' ' + echo content verbatim.$Adiarnfd + git -c core.precomposeunicode=false add verbatim.$Adiarnfd + git commit -m existing decomposed file + expect + git ls-files --exclude-standard -o verbatim* untracked + test_cmp expect untracked +' + # Test if the global core.precomposeunicode stops autosensing # Must be the last test case test_expect_success respect git config --global core.precomposeunicode ' -- 1.9.1.656.ge8a0637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] PAGER_ENV: remove 'S' from $LESS by default
On Mon, Apr 28, 2014 at 02:22:21PM +0200, Matthieu Moy wrote: Not since f82c3ffd862c7 (Wed Feb 5 2014, move LESS/LV pager environment to Makefile). The only upstream branch containing this commit is pu. So this patch should likely not go anywhere else for now. Oops, indeed, I made my patch on top of pu by mistake. Anyway, my patch can wait for the other series to be merged. Jeff, you're the author of f82c3ffd862c7, topic jk/makefile in git.git, marked expecting a reroll by Junio. Any news from the series? I am planning to revisit it eventually, but it's fairly low priority. There is some pretty heavy refactoring in the series, and the PAGER_ENV bits do not have to be held hostage to that refactoring (they are really just demonstrating the refactoring). I'd be OK with doing the moral equivalent for now (perhaps just taking Junio's proposal[1]), and I can deal with the refactoring later when re-rolling the Makefile series. -Peff [1] http://article.gmane.org/gmane.comp.version-control.git/240637 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Adding git hooks
On Mon, Apr 28, 2014 at 09:43:10AM -0700, Junio C Hamano wrote: Yes, I'd love to see something along that line in the longer term, showing all the objects as just regular objects under the hood, with implementation details hidden in the object layer (just like there is no distinction between packed and loose objects from the point of view of read_sha1_file() users), as a real solution to address issues in larger trees. Also see http://thread.gmane.org/gmane.comp.version-control.git/241940 where Shawn had an interesting experiment. Yeah, I think it's pretty clear that a naive high-latency object store is unusably slow. You mentioned in that thread trying to do pre-fetching based on commits/trees, and I recall that Shawn's Cassandra experiments did that (and maybe the BigTable-backed Google Code does, too?). There's also a question of deltas. You don't want to get trees or text blobs individually without deltas, because your total size ends up way bigger. But I think for large object support, we can side-step the issue. The objects will all be blobs (so they cannot refer to anything else), they will typically not delta well, and the connection setup and latency will be dwarfed by actual transfer time. My plan was to have all clones fetch all commits and trees (and small blobs, too), and then download and cache the large blobs as-needed. That doesn't help with repositories where the actual commit history or tree size is a problem. But we already have shallow clones to help with the former. And for the latter, I think we would want a narrow clone that behaves differently than what I described above. You'd probably want a specific widen operation that would fetch all of the objects for the newly-widened part of the tree in one go (including deltas), and you wouldn't want it to happen on an as-needed basis. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Uses git-credential for git-imap-send
On Sun, Apr 27, 2014 at 10:58:51AM -0700, Dan Albert wrote: git-imap-send was directly prompting for a password rather than using git-credential. git-send-email, on the other hand, supports git-credential. This is a necessary improvement for users that use two factor authentication, as they should not be expected to remember all of their app specific passwords. Signed-off-by: Dan Albert danalb...@google.com Thanks, this version looks good to me. I noticed that we are just filling in the password here, since we'll always fill cred.username from srvc-user. The lines directly above are: if (!srvc-user) { fprintf(stderr, Skipping server %s, no user\n, srvc-host); goto bail; } That comes from the imap.user config variable. I wonder if we should just pass it off to credential_fill() in this case, too, which will fill in the username if necessary. It probably doesn't matter much, though, as nobody is complaining. And if we were designing from scratch, I would say that imap.user and imap.pass would not need to exist, as you can configure credential.imaps://host/.* for the same purpose. But since we would have to keep supporting them anyway for compatibility, it's not worth trying to transition. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames
On Mon, Apr 28, 2014 at 12:17:28PM -0700, Junio C Hamano wrote: 3. Convert index filenames to their precomposed form when we read the index from disk. This would be efficient, but we would have to be careful not to write the precomposed forms back out to disk. I think this may be the right approach, especially if you are going to do this only when core.precomposeunicode is set. the reasoning behind we would have to be careful not to write part, is unclear to me, though. Don't decomposing filesystems perform the manglig from the precomposed form without even being asked to do so, just like a case insensitive filesystem will overwrite an existing makefile on a request to write to Makefile? Sorry, I meant do not write the precomposed forms back out to the on-disk index. And by extension, do not update cache-tree and write them out to git trees. IOW, it is not enough to just set cache_entry-name to the normalized form. You'd need to store both. Since such entries are in the minority, and because cache_entry is already a variable-length struct, I think you could get away with sticking it after the name field, and then comparing like: const char *ce_normalized_name(struct cache_entry *ce, size_t *len) { const char *ret; /* Normal, fast path */ if (!(ce-ce_flags CE_NORMALIZED_NAME)) { len = ce_namelen(ce); return ce-name; } /* Slow path for normalized names */ ret = ce-name + ce-namelen + 1; *len = strlen(name); return ret; } The strlen is probably OK since such paths are presumably in the minority (even for UTF-8 paths, we can avoid storing the extra copy if they do not need any normalization). Or we could get fancy and encode the length in front, but I am not sure it is worth the complexity. Anyway, the tricky part is then making sure that all cache_entry name comparisons use ce_normalized_name instead of ce-name. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Apr 2014, #08; Fri, 25)
On Mon, Apr 28, 2014 at 10:48:21AM -0700, Junio C Hamano wrote: diff --git a/diff.c b/diff.c index 54d5308..a03744b 100644 --- a/diff.c +++ b/diff.c @@ -2894,7 +2894,7 @@ static void run_external_diff(const char *pgm, int complete_rewrite, struct diff_options *o) { - const char *spawn_arg[10]; + const char *spawn_arg[11]; int retval; const char **arg = spawn_arg[0]; struct diff_queue_struct *q = diff_queued_diff; as a fix for maint/2.0.0 in the interim. I can write a commit message for that if you're interested. I think we should merge the first one (and possibly the second one, too) as-is for 2.0 instead. No change can possibly be more trivially correct than these two ;-) I'm fine with that. The second patch is pure clean-up and doesn't fix anything (because the 3 in the env array is actually correct), so it can happily wait for the next cycle. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] t3910: show failure of core.precomposeunicode with decomposed filenames
On Mon, Apr 28, 2014 at 09:52:07PM +0200, Torsten Bögershausen wrote: To my knowledge repos with decomposed unicode should be rare in practice. I only can speak for european (or latin based) or cyrillic languages myself: I've run across several cases in the past few months, but only just figured out what was going on. Most were tickets to GitHub support, but we actually have such a case in our github/github repository. In most cases, I think they were created on older versions of git on OS X, either before core.precomposeunicode existed, or before it was turned on by default. The decomposed form got baked into the tree (whatever the user originally typed, git probably found out about it via git add .). I think reports are just coming in now because we didn't start turning on core.precomposeunicode by default until v1.8.5, shipped in November. And then, a person working on the repository would not notice anything, since we only set the flag during clone. So it took time for people to upgrade _and_ to make fresh clones. So for me the test case could sense, even if I think that nobody (TM) uses an old Git version under Mac OS X which is not able to handle precomposed unicode. Even when they do not, the decomposed values are baked into history from those old versions. So it is a matter of history created with older versions not interacting well with newer versions. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html