Re: [BUG] add_again() off-by-one error in custom format
On Thu, Jun 22, 2017 at 08:19:40PM +0200, René Scharfe wrote:

> > I'd be OK with keeping it if we could reduce the number of magic
> > numbers. E.g., rather than 32 elsewhere use:
> >
> >   (sizeof(*loose_objects_subdir_bitmap) * CHAR_BIT)
>
> We have a bitsizeof macro for that.
>
> > and similarly rather than 8 here use
> >
> >   256 / sizeof(*loose_objects_subdir_bitmap) / CHAR_BIT
>
> If we're pretending not to know the number of bits in a byte then we
> need to round up, and we have DIV_ROUND_UP for that. :)

Thanks, I meant to mention the rounding thing but forgot. I didn't know
about either of those macros, though.

> There is another example of a bitmap in shallow.c (just search for
> "32"). It would benefit from the macros mentioned above. That might
> make it easier to switch to the native word size (unsigned int)
> instead of using uint32_t everywhere.
>
> But perhaps this one is actually a candidate for using EWAH, depending
> on the number of refs the code is supposed to handle.

I thought at first you meant real EWAH bitmaps. But those aren't random
access (so you have to read them into an uncompressed bitmap in memory,
which is why ewah/bitmap.c exists). But if you just mean reusing those
bitmaps, then yeah, possibly.

It looks like it's using a commit slab, though. If you know the number
of bits you need up front, then the commit slab can do it without any
waste. I didn't dig enough to see if that's what it's doing or not.

-Peff
Re: [BUG] add_again() off-by-one error in custom format
Am 18.06.2017 um 15:56 schrieb Jeff King:
> On Sun, Jun 18, 2017 at 02:59:04PM +0200, René Scharfe wrote:
>>>> @@ -1586,6 +1587,9 @@ extern struct alternate_object_database {
>>>>  	struct strbuf scratch;
>>>>  	size_t base_len;
>>>> +	uint32_t loose_objects_subdir_bitmap[8];
>>>
>>> Is it worth the complexity of having an actual bitmap and not just an
>>> array of char? I guess it's not _that_ complex to access the bits, but
>>> there are a lot of magic numbers involved.
>>
>> That would be 224 bytes more per alternate_object_database, and we'd
>> gain simpler code. Hmm. We could add some bitmap helper macros, but
>> you're probably right that the first version should use the simplest
>> form for representing small bitmaps that we currently have.
>
> I'd be OK with keeping it if we could reduce the number of magic
> numbers. E.g., rather than 32 elsewhere use:
>
>   (sizeof(*loose_objects_subdir_bitmap) * CHAR_BIT)

We have a bitsizeof macro for that.

> and similarly rather than 8 here use
>
>   256 / sizeof(*loose_objects_subdir_bitmap) / CHAR_BIT

If we're pretending not to know the number of bits in a byte then we
need to round up, and we have DIV_ROUND_UP for that. :)

> There's also already a bitmap data structure in ewah/bitmap.c. It's a
> little bit overkill, though, because it mallocs and will grow the
> bitmap as needed.

Yes, I feel that's too big a hammer for this purpose.

There is another example of a bitmap in shallow.c (just search for
"32"). It would benefit from the macros mentioned above. That might
make it easier to switch to the native word size (unsigned int)
instead of using uint32_t everywhere.

But perhaps this one is actually a candidate for using EWAH, depending
on the number of refs the code is supposed to handle.

>>>> +static void read_loose_object_subdir(struct alternate_object_database *alt,
>>>> +				     int subdir_nr)
>>>
>>> I think it's nice to pull this out into a helper function. I do wonder
>>> if it should just be reusing for_each_loose_file_in_objdir(). You'd just
>>> need to expose the in_obj_subdir() variant publicly.
>>>
>>> It does do slightly more than we need (for the callbacks it actually
>>> builds the filename), but I doubt memcpy()ing a few bytes would be
>>> measurable.
>>
>> Good point. The function also copies the common first two hex digits
>> for each entry. But all that extra work is certainly dwarfed by the
>> readdir calls.
>
> Yes. You're welcome to micro-optimize that implementation if you want. ;)

My knee-jerk reaction was that this would lead to ugliness, but on
second look it might actually be doable. Will check, but I don't expect
much of a speedup.

René
Re: [BUG] add_again() off-by-one error in custom format
Jeff King writes:

> On Sun, Jun 18, 2017 at 12:58:49PM +0200, René Scharfe wrote:
>
>> Anyway, here's a patch for stat-based invalidation, on top of the other
>> one. Array removal is really slow (hope I didn't sneak a bug in there,
>> but my confidence in this code isn't very high). No locking is done;
>> parallel threads removing and adding entries could make a mess, but
>> that's not an issue for log.
>>
>> Timings for "time git log --pretty=%h >/dev/null" in my git repository
>> with 5094 loose objects on Debian:
>>
>>        master     first patch   this patch
>> real   0m1.065s   0m0.581s      0m0.633s
>> user   0m0.648s   0m0.564s      0m0.580s
>> sys    0m0.412s   0m0.016s      0m0.052s
>>
>> And on mingw with 227 loose objects:
>>
>>        master     first patch   this patch
>> real   0m1.756s   0m0.546s      0m1.659s
>> user   0m0.000s   0m0.000s      0m0.000s
>> sys    0m0.000s   0m0.000s      0m0.000s
>>
>> So at least for Windows it would be really nice if we could avoid
>> calling stat..
>
> Thanks for doing the timings. Given those numbers and the earlier
> discussion, I'd be inclined to skip the mtime check.

Yeah, thanks for these experiments. With or without invalidation, we
already accept that racing with other processes will make the result
inaccurate, so I am also inclined to say that it would be best to take
the first one alone.
Re: [BUG] add_again() off-by-one error in custom format
On Sun, Jun 18, 2017 at 02:59:04PM +0200, René Scharfe wrote:

> > > @@ -1586,6 +1587,9 @@ extern struct alternate_object_database {
> > >  	struct strbuf scratch;
> > >  	size_t base_len;
> > > +	uint32_t loose_objects_subdir_bitmap[8];
> >
> > Is it worth the complexity of having an actual bitmap and not just an
> > array of char? I guess it's not _that_ complex to access the bits, but
> > there are a lot of magic numbers involved.
>
> That would be 224 bytes more per alternate_object_database, and we'd
> gain simpler code. Hmm. We could add some bitmap helper macros, but
> you're probably right that the first version should use the simplest
> form for representing small bitmaps that we currently have.

I'd be OK with keeping it if we could reduce the number of magic
numbers. E.g., rather than 32 elsewhere use:

  (sizeof(*loose_objects_subdir_bitmap) * CHAR_BIT)

and similarly rather than 8 here use

  256 / sizeof(*loose_objects_subdir_bitmap) / CHAR_BIT

There's also already a bitmap data structure in ewah/bitmap.c. It's a
little bit overkill, though, because it mallocs and will grow the
bitmap as needed.

> > We should probably insert a comment here warning that the cache may go
> > out of date during the process run, and should only be used when that's
> > an acceptable tradeoff.
>
> Then we need to offer a way for opting out. Through a global variable?
> (I'd rather reduce their number, but don't see how to allow programs to
> nicely toggle this cache otherwise.)

Sorry, I didn't mean disabling it for a particular run of a program. I
just meant that we know this is an OK tradeoff for short-sha1 lookups,
but has_sha1_file(), for example, would never want to use it. So we are
warning programmers away from using it in more code, not giving users a
way to turn it off run-to-run.

> > > +static void read_loose_object_subdir(struct alternate_object_database *alt,
> > > +				     int subdir_nr)
> >
> > I think it's nice to pull this out into a helper function. I do wonder
> > if it should just be reusing for_each_loose_file_in_objdir(). You'd just
> > need to expose the in_obj_subdir() variant publicly.
> >
> > It does do slightly more than we need (for the callbacks it actually
> > builds the filename), but I doubt memcpy()ing a few bytes would be
> > measurable.
>
> Good point. The function also copies the common first two hex digits
> for each entry. But all that extra work is certainly dwarfed by the
> readdir calls.

Yes. You're welcome to micro-optimize that implementation if you want. ;)

-Peff
Re: [BUG] add_again() off-by-one error in custom format
Am 18.06.2017 um 13:49 schrieb Jeff King:
> On Sun, Jun 18, 2017 at 12:58:37PM +0200, René Scharfe wrote:
>> An oid_array spends ca. 30 bytes per entry (20 bytes for a hash times
>> a factor of 1.5 from alloc_nr). How many loose objects do we have to
>> expect? 30 MB for a million of them sounds not too bad, but can there
>> realistically be orders of magnitude more?
>
> Good point. We gc by default at 6000 loose objects, and lots of things
> will suffer poor performance by the time you hit a million. So a little
> extra space is probably not worth worrying about.
>
>> So here's a patch for caching loose object hashes in an oid_array
>> without a way to invalidate or release the cache. It uses a single
>> oid_array for simplicity, but reads each subdirectory individually and
>> remembers that fact using a bitmap.
>
> I like the direction of this very much.
>
>> @@ -1586,6 +1587,9 @@ extern struct alternate_object_database {
>>  	struct strbuf scratch;
>>  	size_t base_len;
>> +	uint32_t loose_objects_subdir_bitmap[8];
>
> Is it worth the complexity of having an actual bitmap and not just an
> array of char? I guess it's not _that_ complex to access the bits, but
> there are a lot of magic numbers involved.

That would be 224 bytes more per alternate_object_database, and we'd
gain simpler code. Hmm. We could add some bitmap helper macros, but
you're probably right that the first version should use the simplest
form for representing small bitmaps that we currently have.

>> +	struct oid_array loose_objects_cache;
>
> We should probably insert a comment here warning that the cache may go
> out of date during the process run, and should only be used when that's
> an acceptable tradeoff.

Then we need to offer a way for opting out. Through a global variable?
(I'd rather reduce their number, but don't see how to allow programs to
nicely toggle this cache otherwise.)

>> +static void read_loose_object_subdir(struct alternate_object_database *alt,
>> +				     int subdir_nr)
>
> I think it's nice to pull this out into a helper function. I do wonder
> if it should just be reusing for_each_loose_file_in_objdir(). You'd just
> need to expose the in_obj_subdir() variant publicly.
>
> It does do slightly more than we need (for the callbacks it actually
> builds the filename), but I doubt memcpy()ing a few bytes would be
> measurable.

Good point. The function also copies the common first two hex digits
for each entry. But all that extra work is certainly dwarfed by the
readdir calls.

René
Re: [BUG] add_again() off-by-one error in custom format
On Sun, Jun 18, 2017 at 12:58:49PM +0200, René Scharfe wrote:

> Anyway, here's a patch for stat-based invalidation, on top of the other
> one. Array removal is really slow (hope I didn't sneak a bug in there,
> but my confidence in this code isn't very high). No locking is done;
> parallel threads removing and adding entries could make a mess, but
> that's not an issue for log.
>
> Timings for "time git log --pretty=%h >/dev/null" in my git repository
> with 5094 loose objects on Debian:
>
>        master     first patch   this patch
> real   0m1.065s   0m0.581s      0m0.633s
> user   0m0.648s   0m0.564s      0m0.580s
> sys    0m0.412s   0m0.016s      0m0.052s
>
> And on mingw with 227 loose objects:
>
>        master     first patch   this patch
> real   0m1.756s   0m0.546s      0m1.659s
> user   0m0.000s   0m0.000s      0m0.000s
> sys    0m0.000s   0m0.000s      0m0.000s
>
> So at least for Windows it would be really nice if we could avoid
> calling stat..

Thanks for doing the timings. Given those numbers and the earlier
discussion, I'd be inclined to skip the mtime check.

-Peff
Re: [BUG] add_again() off-by-one error in custom format
On Sun, Jun 18, 2017 at 12:58:37PM +0200, René Scharfe wrote:

> An oid_array spends ca. 30 bytes per entry (20 bytes for a hash times
> a factor of 1.5 from alloc_nr). How many loose objects do we have to
> expect? 30 MB for a million of them sounds not too bad, but can there
> realistically be orders of magnitude more?

Good point. We gc by default at 6000 loose objects, and lots of things
will suffer poor performance by the time you hit a million. So a little
extra space is probably not worth worrying about.

> So here's a patch for caching loose object hashes in an oid_array
> without a way to invalidate or release the cache. It uses a single
> oid_array for simplicity, but reads each subdirectory individually and
> remembers that fact using a bitmap.

I like the direction of this very much.

> @@ -1586,6 +1587,9 @@ extern struct alternate_object_database {
>  	struct strbuf scratch;
>  	size_t base_len;
> +	uint32_t loose_objects_subdir_bitmap[8];

Is it worth the complexity of having an actual bitmap and not just an
array of char? I guess it's not _that_ complex to access the bits, but
there are a lot of magic numbers involved.

> +	struct oid_array loose_objects_cache;

We should probably insert a comment here warning that the cache may go
out of date during the process run, and should only be used when that's
an acceptable tradeoff.

> +static void read_loose_object_subdir(struct alternate_object_database *alt,
> +				     int subdir_nr)

I think it's nice to pull this out into a helper function. I do wonder
if it should just be reusing for_each_loose_file_in_objdir(). You'd just
need to expose the in_obj_subdir() variant publicly.

It does do slightly more than we need (for the callbacks it actually
builds the filename), but I doubt memcpy()ing a few bytes would be
measurable.

> +	pos = oid_array_lookup(&alt->loose_objects_cache, &ds->bin_pfx);
> +	if (pos < 0)
> +		pos = -1 - pos;
> +	while (!ds->ambiguous && pos < alt->loose_objects_cache.nr) {
> +		const struct object_id *oid;
> +		oid = alt->loose_objects_cache.oid + pos;
> +		if (!match_sha(ds->len, ds->bin_pfx.hash, oid->hash))
> +			break;
> +		update_candidates(ds, oid);
> +		pos++;
> 	}

This part looks quite nice and straightforward.

-Peff
Re: [BUG] add_again() off-by-one error in custom format
Am 15.06.2017 um 15:25 schrieb Jeff King:
> On Thu, Jun 15, 2017 at 01:33:34PM +0200, René Scharfe wrote:
>> Can we trust object directory time stamps for cache invalidation?
>
> I think it would work on POSIX-ish systems, since loose object changes
> always involve new files, not modifications of existing ones. I don't
> know if there are systems that don't update directory mtimes even for
> newly added files.
>
> That would still be a stat() per request. I'd be curious how the timing
> compares to the opendir/readdir that happens now.

Modification times wouldn't be exact, as you mentioned above: An object
could be added just after checking the time stamp. So perhaps we don't
need that, or a time-based invalidation suffices, e.g. we discard the
cache after five minutes or so?

Anyway, here's a patch for stat-based invalidation, on top of the other
one. Array removal is really slow (hope I didn't sneak a bug in there,
but my confidence in this code isn't very high). No locking is done;
parallel threads removing and adding entries could make a mess, but
that's not an issue for log.

Timings for "time git log --pretty=%h >/dev/null" in my git repository
with 5094 loose objects on Debian:

       master     first patch   this patch
real   0m1.065s   0m0.581s      0m0.633s
user   0m0.648s   0m0.564s      0m0.580s
sys    0m0.412s   0m0.016s      0m0.052s

And on mingw with 227 loose objects:

       master     first patch   this patch
real   0m1.756s   0m0.546s      0m1.659s
user   0m0.000s   0m0.000s      0m0.000s
sys    0m0.000s   0m0.000s      0m0.000s

So at least for Windows it would be really nice if we could avoid
calling stat..
---
 cache.h     |  1 +
 sha1_name.c | 32 ++++++++++++++++++++++++++++----
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index 8c6748907b..386b09710d 100644
--- a/cache.h
+++ b/cache.h
@@ -1589,6 +1589,7 @@ extern struct alternate_object_database {
 
 	uint32_t loose_objects_subdir_bitmap[8];
 	struct oid_array loose_objects_cache;
+	time_t loose_objects_subdir_mtime[256];
 
 	char path[FLEX_ARRAY];
 } *alt_odb_list;
diff --git a/sha1_name.c b/sha1_name.c
index ae6a5c3abe..baecb10454 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -77,6 +77,24 @@ static void update_candidates(struct disambiguate_state *ds, const struct object
 	/* otherwise, current can be discarded and candidate is still good */
 }
 
+static void remove_subdir_entries(struct oid_array *array, int subdir_nr)
+{
+	struct object_id oid;
+	int start, end;
+
+	memset(&oid, 0, sizeof(oid));
+	oid.hash[0] = subdir_nr;
+	start = oid_array_lookup(array, &oid);
+	if (start < 0)
+		start = -1 - start;
+	end = start;
+	while (end < array->nr && array->oid[end].hash[0] == subdir_nr)
+		end++;
+	memmove(array->oid + start, array->oid + end,
+		(array->nr - end) * sizeof(array->oid[0]));
+	array->nr -= end - start;
+}
+
 static void read_loose_object_subdir(struct alternate_object_database *alt,
 				     int subdir_nr)
 {
@@ -86,15 +104,21 @@ static void read_loose_object_subdir(struct alternate_object_database *alt,
 	struct dirent *de;
 	size_t bitmap_index = subdir_nr / 32;
 	uint32_t bitmap_mask = 1 << (subdir_nr % 32);
-
-	if (alt->loose_objects_subdir_bitmap[bitmap_index] & bitmap_mask)
-		return;
-	alt->loose_objects_subdir_bitmap[bitmap_index] |= bitmap_mask;
+	struct stat st;
 
 	buf = alt_scratch_buf(alt);
 	strbuf_addf(buf, "%02x/", subdir_nr);
 	xsnprintf(hex, sizeof(hex), "%02x", subdir_nr);
 
+	stat(buf->buf, &st);
+	if (alt->loose_objects_subdir_bitmap[bitmap_index] & bitmap_mask) {
+		if (alt->loose_objects_subdir_mtime[subdir_nr] == st.st_mtime)
+			return;
+		remove_subdir_entries(&alt->loose_objects_cache, subdir_nr);
+	}
+
+	alt->loose_objects_subdir_mtime[subdir_nr] = st.st_mtime;
+	alt->loose_objects_subdir_bitmap[bitmap_index] |= bitmap_mask;
+
 	dir = opendir(buf->buf);
 	if (!dir)
 		return;
-- 
2.13.0
Re: [BUG] add_again() off-by-one error in custom format
Am 15.06.2017 um 15:25 schrieb Jeff King:
> On Thu, Jun 15, 2017 at 01:33:34PM +0200, René Scharfe wrote:
>>> I'm not really sure how, though, short of caching the directory
>>> contents. That opens up questions of whether and when to invalidate the
>>> cache. If the cache were _just_ about short hashes, it might be OK to
>>> just assume that it remains valid through the length of the program (so
>>> worst case, a simultaneous write might mean that we generate a sha1
>>> which just became ambiguous, but that's generally going to be racy
>>> anyway).
>>>
>>> The other downside of course is that we'd spend RAM on it. We could
>>> bound the size of the cache, I suppose.
>>
>> You mean like an in-memory pack index for loose objects? In order to
>> avoid the readdir call in sha1_name.c::find_short_object_filename()?
>> We'd only need to keep the hashes of found objects. An oid_array
>> would be quite compact.
>
> Yes, that's what I was thinking of.

An oid_array spends ca. 30 bytes per entry (20 bytes for a hash times
a factor of 1.5 from alloc_nr). How many loose objects do we have to
expect? 30 MB for a million of them sounds not too bad, but can there
realistically be orders of magnitude more?

>> Non-racy writes inside a process should not be ignored (write, read
>> later) -- e.g. checkout does something like that.
>
> Ideally, yes. Though thinking on this more, it's racy even today,
> because we rely on the in-memory packed_git list. We usually re-check the
> directory only when we look for an object and it's missing. So any new
> packs which have been added while the process runs will be missed when
> doing short-sha1 lookups (and actually, find_short_packed_object does
> not even seem to do the usual retry-on-miss).
>
> A normal process like "checkout" isn't writing new packs, but a
> simultaneous repack could be migrating its new objects to a pack behind
> its back. (It also _can_ write packs, but only for large blobs).
>
> Given that we are talking about finding "unique" abbreviations here, and
> that those abbreviations can become invalidated immediately anyway as
> more objects are added (even by the same process), I'm not sure we need
> to strive for absolute accuracy.

Agreed. And processes that add objects and use them later probably use
full hashes (at least checkout does).

So here's a patch for caching loose object hashes in an oid_array
without a way to invalidate or release the cache. It uses a single
oid_array for simplicity, but reads each subdirectory individually and
remembers that fact using a bitmap.

---
 cache.h     |  4 ++++
 sha1_name.c | 69 +++++++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 53 insertions(+), 20 deletions(-)

diff --git a/cache.h b/cache.h
index 4d92aae0e8..8c6748907b 100644
--- a/cache.h
+++ b/cache.h
@@ -11,6 +11,7 @@
 #include "string-list.h"
 #include "pack-revindex.h"
 #include "hash.h"
+#include "sha1-array.h"
 
 #ifndef platform_SHA_CTX
 /*
@@ -1586,6 +1587,9 @@ extern struct alternate_object_database {
 	struct strbuf scratch;
 	size_t base_len;
 
+	uint32_t loose_objects_subdir_bitmap[8];
+	struct oid_array loose_objects_cache;
+
 	char path[FLEX_ARRAY];
 } *alt_odb_list;
 extern void prepare_alt_odb(void);
diff --git a/sha1_name.c b/sha1_name.c
index 5126853bb5..ae6a5c3abe 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -77,10 +77,46 @@ static void update_candidates(struct disambiguate_state *ds, const struct object
 	/* otherwise, current can be discarded and candidate is still good */
 }
 
+static void read_loose_object_subdir(struct alternate_object_database *alt,
+				     int subdir_nr)
+{
+	struct strbuf *buf;
+	char hex[GIT_MAX_HEXSZ];
+	DIR *dir;
+	struct dirent *de;
+	size_t bitmap_index = subdir_nr / 32;
+	uint32_t bitmap_mask = 1 << (subdir_nr % 32);
+
+	if (alt->loose_objects_subdir_bitmap[bitmap_index] & bitmap_mask)
+		return;
+	alt->loose_objects_subdir_bitmap[bitmap_index] |= bitmap_mask;
+
+	buf = alt_scratch_buf(alt);
+	strbuf_addf(buf, "%02x/", subdir_nr);
+	xsnprintf(hex, sizeof(hex), "%02x", subdir_nr);
+
+	dir = opendir(buf->buf);
+	if (!dir)
+		return;
+
+	while ((de = readdir(dir)) != NULL) {
+		struct object_id oid;
+
+		if (strlen(de->d_name) != GIT_SHA1_HEXSZ - 2)
+			continue;
+		memcpy(hex + 2, de->d_name, GIT_SHA1_HEXSZ - 2);
+		if (!get_oid_hex(hex, &oid))
+			oid_array_append(&alt->loose_objects_cache, &oid);
+	}
+	closedir(dir);
+}
+
+static int match_sha(unsigned, const unsigned char *, const unsigned char *);
+
 static void find_short_object_filename(struct disambiguate_state *ds)
 {
+	int subdir_nr = ds->bin_pfx.hash[0];
 	struct alternate_object_database *alt;
-	char hex[GIT_MAX_HEXSZ];
Re: [BUG] add_again() off-by-one error in custom format
René Scharfe writes:

> Am 13.06.2017 um 23:20 schrieb Junio C Hamano:
>
>> I think the real question is how likely people use more than one
>> occurrence of the same thing in their custom format, and how deeply
>> they care that --format='%h %h' costs more than --format='%h'. The
>> cost won't of course be double (because the main traversal costs
>> without any output), but it would be rather unreasonable to expect
>> that --format='%h %h %h %h %h' to cost the same as --format='%h';
>> after all, Git is doing more for them ;-)
>
> The answer to the first half is obviously "very likely" -- otherwise
> this bug wouldn't have been found, right? :)

Not really. There was only one (this one) after all these years. The
question we are asking is not "very rarely this is used and we can
afford to leave it broken?" It is "very rarely this is used and we can
afford not to optimize for that rare use case?".

> Regarding the question of how bad a 50% slowdown for a second %h
> would be: No idea. If run interactively it may not even be noticeable
> because the user can read the first few lines in less while the rest
> is prepared in the background. We don't have a perf test for formats
> with duplicate short hashes, so we don't promise anything, right? :)

OK.

> -- >8 --
> Subject: [PATCH] pretty: recalculate duplicate short hashes
>
> b9c6232138 (--format=pretty: avoid calculating expensive expansions
> twice) optimized adding short hashes multiple times by using the
> fact that the output strbuf was only ever simply appended to and
> copying the added string from the previous run. That prerequisite
> is no longer given; we now have modifiers like %< and %+ that can
> cause the cache to lose track of the correct offsets. Remove it.
>
> Reported-by: Michael Giuffrida
> Signed-off-by: Rene Scharfe
> ---
> I'm sending this out in the hope that there might be a simple way
> to fix it after all, like Gábor's patch does for %+. %< and %>
> seem to be the only other problematic modifiers for now -- I'm
> actually surprised that %w seems to be OK.

Thanks, this looks like a sensible first step. Will queue.
Re: [BUG] add_again() off-by-one error in custom format
On Thu, Jun 15, 2017 at 01:33:34PM +0200, René Scharfe wrote:

> > The difference is that in the "before" case, we actually opened each
> > directory and ran getdents(). But after gc, the directories are gone
> > totally and open() fails. We also have to do a linear walk through the
> > objects in each directory, since the contents are sorted.
>
> Do you mean "unsorted"?

Er yeah, sorry, I meant to say "aren't".

> > I'm not really sure how, though, short of caching the directory
> > contents. That opens up questions of whether and when to invalidate the
> > cache. If the cache were _just_ about short hashes, it might be OK to
> > just assume that it remains valid through the length of the program (so
> > worst case, a simultaneous write might mean that we generate a sha1
> > which just became ambiguous, but that's generally going to be racy
> > anyway).
> >
> > The other downside of course is that we'd spend RAM on it. We could
> > bound the size of the cache, I suppose.
>
> You mean like an in-memory pack index for loose objects? In order to
> avoid the readdir call in sha1_name.c::find_short_object_filename()?
> We'd only need to keep the hashes of found objects. An oid_array
> would be quite compact.

Yes, that's what I was thinking of.

> Non-racy writes inside a process should not be ignored (write, read
> later) -- e.g. checkout does something like that.

Ideally, yes. Though thinking on this more, it's racy even today,
because we rely on the in-memory packed_git list. We usually re-check the
directory only when we look for an object and it's missing. So any new
packs which have been added while the process runs will be missed when
doing short-sha1 lookups (and actually, find_short_packed_object does
not even seem to do the usual retry-on-miss).

A normal process like "checkout" isn't writing new packs, but a
simultaneous repack could be migrating its new objects to a pack behind
its back. (It also _can_ write packs, but only for large blobs).

Given that we are talking about finding "unique" abbreviations here, and
that those abbreviations can become invalidated immediately anyway as
more objects are added (even by the same process), I'm not sure we need
to strive for absolute accuracy.

> Can we trust object directory time stamps for cache invalidation?

I think it would work on POSIX-ish systems, since loose object changes
always involve new files, not modifications of existing ones. I don't
know if there are systems that don't update directory mtimes even for
newly added files.

That would still be a stat() per request. I'd be curious how the timing
compares to the opendir/readdir that happens now.

-Peff
Re: [BUG] add_again() off-by-one error in custom format
Am 15.06.2017 um 07:56 schrieb Jeff King:
> One interesting thing is that the cost of finding short hashes very
> much depends on your loose object setup. I timed:
>
>   git log --format=%H >/dev/null
>
> versus
>
>   git log --format=%h >/dev/null
>
> on git.git. It went from about 400ms to about 800ms. But then I noticed
> I had a lot of loose object directories, and ran "git gc --prune=now".
> Afterwards, my timings were more like 380ms and 460ms.
>
> The difference is that in the "before" case, we actually opened each
> directory and ran getdents(). But after gc, the directories are gone
> totally and open() fails. We also have to do a linear walk through the
> objects in each directory, since the contents are sorted.

Do you mean "unsorted"?

> So I wonder if it is worth trying to optimize the short-sha1
> computation in the first place. Double-%h aside, that would make
> _everything_ faster, including --oneline.

Right.

> I'm not really sure how, though, short of caching the directory
> contents. That opens up questions of whether and when to invalidate the
> cache. If the cache were _just_ about short hashes, it might be OK to
> just assume that it remains valid through the length of the program (so
> worst case, a simultaneous write might mean that we generate a sha1
> which just became ambiguous, but that's generally going to be racy
> anyway).
>
> The other downside of course is that we'd spend RAM on it. We could
> bound the size of the cache, I suppose.

You mean like an in-memory pack index for loose objects? In order to
avoid the readdir call in sha1_name.c::find_short_object_filename()?
We'd only need to keep the hashes of found objects. An oid_array
would be quite compact.

Non-racy writes inside a process should not be ignored (write, read
later) -- e.g. checkout does something like that.

Can we trust object directory time stamps for cache invalidation?

René
Re: [BUG] add_again() off-by-one error in custom format
On Wed, Jun 14, 2017 at 08:24:25PM +0200, René Scharfe wrote:

> > I think the real question is how likely people use more than one
> > occurrence of the same thing in their custom format, and how deeply
> > they care that --format='%h %h' costs more than --format='%h'. The
> > cost won't of course be double (because the main traversal costs
> > without any output), but it would be rather unreasonable to expect
> > that --format='%h %h %h %h %h' to cost the same as --format='%h';
> > after all, Git is doing more for them ;-)
>
> The answer to the first half is obviously "very likely" -- otherwise
> this bug wouldn't have been found, right? :)
>
> Regarding the question of how bad a 50% slowdown for a second %h
> would be: No idea. If run interactively it may not even be noticeable
> because the user can read the first few lines in less while the rest
> is prepared in the background. We don't have a perf test for formats
> with duplicate short hashes, so we don't promise anything, right? :)

One interesting thing is that the cost of finding short hashes very
much depends on your loose object setup. I timed:

  git log --format=%H >/dev/null

versus

  git log --format=%h >/dev/null

on git.git. It went from about 400ms to about 800ms. But then I noticed
I had a lot of loose object directories, and ran "git gc --prune=now".
Afterwards, my timings were more like 380ms and 460ms.

The difference is that in the "before" case, we actually opened each
directory and ran getdents(). But after gc, the directories are gone
totally and open() fails. We also have to do a linear walk through the
objects in each directory, since the contents are sorted.

So I wonder if it is worth trying to optimize the short-sha1
computation in the first place. Double-%h aside, that would make
_everything_ faster, including --oneline.

I'm not really sure how, though, short of caching the directory
contents. That opens up questions of whether and when to invalidate the
cache. If the cache were _just_ about short hashes, it might be OK to
just assume that it remains valid through the length of the program (so
worst case, a simultaneous write might mean that we generate a sha1
which just became ambiguous, but that's generally going to be racy
anyway).

The other downside of course is that we'd spend RAM on it. We could
bound the size of the cache, I suppose.

-Peff
Re: [BUG] add_again() off-by-one error in custom format
Am 13.06.2017 um 23:20 schrieb Junio C Hamano: > René Scharfe writes: > >> The difference is about the same as the one between: >> >> $ time git log --format="" >/dev/null >> >> real0m0.463s >> user0m0.448s >> sys 0m0.012s >> >> and: >> >> $ time git log --format="%h" >/dev/null >> >> real0m1.062s >> user0m0.636s >> sys 0m0.416s >> >> With caching duplicates are basically free and without it short >> hashes have to be looked up again. Other placeholders may reduce >> the relative slowdown, depending on how expensive they are. > > I think the real question is how likely people use more than one > occurrence of the same thing in their custom format, and how deeply > they care that --format='%h %h' costs more than --format='%h'. The > cost won't of course be double (because the main traversal costs > without any output), but it would be rather unreasonable to expect > that --format='%h %h %h %h %h' to cost the same as --format='%h'; > after all, Git is doing more for them ;-) The answer to the first half is obviously "very likely" -- otherwise this bug wouldn't have been found, right? :) Regarding the question of how bad a 50% slowdown for a second %h would be: No idea. If ran interactively it may not even be noticeable because the user can read the first few lines in less while the rest is prepared in the background. We don't have a perf test for formats with duplicate short hashes, so we don't promise anything, right? :) René -- >8 -- Subject: [PATCH] pretty: recalculate duplicate short hashes b9c6232138 (--format=pretty: avoid calculating expensive expansions twice) optimized adding short hashes multiple times by using the fact that the output strbuf was only ever simply appended to and copying the added string from the previous run. That prerequisite is no longer given; we now have modfiers like %< and %+ that can cause the cache to lose track of the correct offsets. Remove it. 
Reported-by: Michael Giuffrida
Signed-off-by: Rene Scharfe
---
I'm sending this out in the hope that there might be a simple way to
fix it after all, like Gábor's patch does for %+.  %< and %> seem to
be the only other problematic modifiers for now -- I'm actually
surprised that %w seems to be OK.

 pretty.c | 32 --------------------------------
 strbuf.c |  7 -------
 strbuf.h |  6 ------
 3 files changed, 45 deletions(-)

diff --git a/pretty.c b/pretty.c
index 09701bd2ff..cc099dfdd1 100644
--- a/pretty.c
+++ b/pretty.c
@@ -783,29 +783,9 @@ struct format_commit_context {
 	size_t body_off;
 
 	/* The following ones are relative to the result struct strbuf. */
-	struct chunk abbrev_commit_hash;
-	struct chunk abbrev_tree_hash;
-	struct chunk abbrev_parent_hashes;
 	size_t wrap_start;
 };
 
-static int add_again(struct strbuf *sb, struct chunk *chunk)
-{
-	if (chunk->len) {
-		strbuf_adddup(sb, chunk->off, chunk->len);
-		return 1;
-	}
-
-	/*
-	 * We haven't seen this chunk before.  Our caller is surely
-	 * going to add it the hard way now.  Remember the most likely
-	 * start of the to-be-added chunk: the current end of the
-	 * struct strbuf.
-	 */
-	chunk->off = sb->len;
-	return 0;
-}
-
 static void parse_commit_header(struct format_commit_context *context)
 {
 	const char *msg = context->message;
@@ -1147,24 +1127,16 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 		return 1;
 	case 'h':		/* abbreviated commit hash */
 		strbuf_addstr(sb, diff_get_color(c->auto_color, DIFF_COMMIT));
-		if (add_again(sb, &c->abbrev_commit_hash)) {
-			strbuf_addstr(sb, diff_get_color(c->auto_color, DIFF_RESET));
-			return 1;
-		}
 		strbuf_add_unique_abbrev(sb, commit->object.oid.hash,
 					 c->pretty_ctx->abbrev);
 		strbuf_addstr(sb, diff_get_color(c->auto_color, DIFF_RESET));
-		c->abbrev_commit_hash.len = sb->len - c->abbrev_commit_hash.off;
 		return 1;
 	case 'T':		/* tree hash */
 		strbuf_addstr(sb, oid_to_hex(&commit->tree->object.oid));
 		return 1;
 	case 't':		/* abbreviated tree hash */
-		if (add_again(sb, &c->abbrev_tree_hash))
-			return 1;
 		strbuf_add_unique_abbrev(sb, commit->tree->object.oid.hash,
 					 c->pretty_ctx->abbrev);
-		c->abbrev_tree_hash.len = sb->len - c->abbrev_tree_hash.off;
 		return 1;
 	case 'P':		/* parent hashes */
 		for (p = commit->parents; p; p = p->next) {
@@ -1174,16 +1146,12 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 		}
Re: [BUG] add_again() off-by-one error in custom format
Am 14.06.2017 um 00:24 schrieb SZEDER Gábor:
> [sorry for double post, forgot the mailing list...]
>
> To throw in a fourth option, this one adjusts the expansions' cached
> offsets when the magic makes it necessary.
>
> It's not necessary for '%-', because it only makes a difference when
> the expansion is empty, and in that case
>
>   - add_again() doesn't consider it cached,
>   - and even if it did, the offset of a zero-length string wouldn't
>     matter.
>
>  pretty.c | 32 +++++++++++++++++++++++-----------
>  1 file changed, 21 insertions(+), 11 deletions(-)

There are other modifiers, e.g. try the format '%h%>(20)%h' -- its
output is obviously wrong (contains control characters), even with
your patch.  We'd have to update the offsets for each placeholder that
changes the output of others.  I don't think that approach scales,
alas.

René
Re: [BUG] add_again() off-by-one error in custom format
[sorry for double post, forgot the mailing list...]

To throw in a fourth option, this one adjusts the expansions' cached
offsets when the magic makes it necessary.

It's not necessary for '%-', because it only makes a difference when
the expansion is empty, and in that case

  - add_again() doesn't consider it cached,
  - and even if it did, the offset of a zero-length string wouldn't
    matter.

 pretty.c | 32 +++++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/pretty.c b/pretty.c
index d0f86f5d8..69c4f2a21 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1066,9 +1066,17 @@ static size_t parse_padding_placeholder(struct strbuf *sb,
 	return 0;
 }
 
+enum format_commit_item_magic {
+	NO_MAGIC,
+	ADD_LF_BEFORE_NON_EMPTY,
+	DEL_LF_BEFORE_EMPTY,
+	ADD_SP_BEFORE_NON_EMPTY
+};
+
 static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 				const char *placeholder,
-				void *context)
+				void *context,
+				enum format_commit_item_magic magic)
 {
 	struct format_commit_context *c = context;
 	const struct commit *commit = c->commit;
@@ -1155,6 +1163,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 					 c->pretty_ctx->abbrev);
 		strbuf_addstr(sb, diff_get_color(c->auto_color, DIFF_RESET));
 		c->abbrev_commit_hash.len = sb->len - c->abbrev_commit_hash.off;
+		if (magic == ADD_LF_BEFORE_NON_EMPTY || magic == ADD_SP_BEFORE_NON_EMPTY)
+			c->abbrev_commit_hash.off++;
 		return 1;
 	case 'T':		/* tree hash */
 		strbuf_addstr(sb, oid_to_hex(&commit->tree->object.oid));
@@ -1165,6 +1175,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 		strbuf_add_unique_abbrev(sb, commit->tree->object.oid.hash,
 					 c->pretty_ctx->abbrev);
 		c->abbrev_tree_hash.len = sb->len - c->abbrev_tree_hash.off;
+		if (magic == ADD_LF_BEFORE_NON_EMPTY || magic == ADD_SP_BEFORE_NON_EMPTY)
+			c->abbrev_tree_hash.off++;
 		return 1;
 	case 'P':		/* parent hashes */
 		for (p = commit->parents; p; p = p->next) {
@@ -1184,6 +1196,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 		}
 		c->abbrev_parent_hashes.len = sb->len - c->abbrev_parent_hashes.off;
+		if (magic == ADD_LF_BEFORE_NON_EMPTY || magic == ADD_SP_BEFORE_NON_EMPTY)
+			c->abbrev_parent_hashes.off++;
 		return 1;
 	case 'm':		/* left/right/bottom */
 		strbuf_addstr(sb, get_revision_mark(NULL, commit));
@@ -1314,7 +1328,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 
 static size_t format_and_pad_commit(struct strbuf *sb, /* in UTF-8 */
 				    const char *placeholder,
-				    struct format_commit_context *c)
+				    struct format_commit_context *c,
+				    enum format_commit_item_magic magic)
 {
 	struct strbuf local_sb = STRBUF_INIT;
 	int total_consumed = 0, len, padding = c->padding;
@@ -1329,7 +1344,7 @@ static size_t format_and_pad_commit(struct strbuf *sb, /* in UTF-8 */
 	}
 	while (1) {
 		int modifier = *placeholder == 'C';
-		int consumed = format_commit_one(&local_sb, placeholder, c);
+		int consumed = format_commit_one(&local_sb, placeholder, c, magic);
 		total_consumed += consumed;
 
 		if (!modifier)
@@ -1420,12 +1435,7 @@ static size_t format_commit_item(struct strbuf *sb, /* in UTF-8 */
 {
 	int consumed;
 	size_t orig_len;
-	enum {
-		NO_MAGIC,
-		ADD_LF_BEFORE_NON_EMPTY,
-		DEL_LF_BEFORE_EMPTY,
-		ADD_SP_BEFORE_NON_EMPTY
-	} magic = NO_MAGIC;
+	enum format_commit_item_magic magic = NO_MAGIC;
 
 	switch (placeholder[0]) {
 	case '-':
@@ -1445,9 +1455,9 @@ static size_t format_commit_item(struct strbuf *sb, /* in UTF-8 */
 	orig_len = sb->len;
 	if (((struct format_commit_context *)context)->flush_type != no_flush)
-		consumed = format_and_pad_commit(sb, placeholder, context);
+		consumed = format_and_pad_commit(sb, placeholder, context, magic);
 	else
-		consumed = format_commit_one(sb, placeholder, context);
+		consumed = format_commit_one(sb, placeholder, context, magic);
-- 
2.13.1.438.gc638325b1
Re: [BUG] add_again() off-by-one error in custom format
René Scharfe writes:

> The difference is about the same as the one between:
>
>   $ time git log --format="" >/dev/null
>
>   real    0m0.463s
>   user    0m0.448s
>   sys     0m0.012s
>
> and:
>
>   $ time git log --format="%h" >/dev/null
>
>   real    0m1.062s
>   user    0m0.636s
>   sys     0m0.416s
>
> With caching duplicates are basically free and without it short
> hashes have to be looked up again.  Other placeholders may reduce
> the relative slowdown, depending on how expensive they are.

I think the real question is how likely people use more than one
occurrence of the same thing in their custom format, and how deeply
they care that --format='%h %h' costs more than --format='%h'.  The
cost won't of course be double (because the main traversal costs
without any output), but it would be rather unreasonable to expect
that --format='%h %h %h %h %h' costs the same as --format='%h';
after all, Git is doing more for them ;-)

So in that sense, I am actually OK if we decide to remove the caching.

> Forgot a third option, probably because it's not a particularly good
> idea: Replacing the caching in pretty.c with a short static cache in
> find_unique_abbrev_r().

Indeed.
Re: [BUG] add_again() off-by-one error in custom format
Am 13.06.2017 um 20:29 schrieb Junio C Hamano:
> René Scharfe writes:
>
>> Indeed, a very nice bug report!
>>
>>> I think the call to format_commit_one() needs to be taught to pass
>>> 'magic' through, and then add_again() mechanism needs to be told
>>> not to cache when magic is in effect, or something like that.
>>>
>>> Perhaps something along this line...?
>>>
>>>  pretty.c | 64 ++++++++++++++++++++++++++++++++++--------------------------
>>>  1 file changed, 38 insertions(+), 26 deletions(-)
>>
>> That looks quite big.  You even invent a way to classify magic.
>
> Hmph, by "a way to classify", do you mean the enum?  That thing was
> there from before, just moved.

Oh, yes, sorry.  I didn't even get that far into the patch.  (I'll
better go to bed after hitting send..)

> Also I think we do not have to change add_again() at all, because
> chunk->off tolerates being given a garbage value as long as
> chunk->len stays 0, and add_again() does not change chunk->len at
> all.  Which cuts the diffstat down to
>
>  pretty.c | 40 +++++++++++++++++++++++++---------------
>  1 file changed, 25 insertions(+), 15 deletions(-)
>
>> Does the caching feature justify the added complexity?
>
> That's a good question.  I thought about your second alternative
> while adding the "don't cache" thing, but if we can measure and find
> out that we are not gaining much from caching, certainly that sounds
> much simpler.

The difference is about the same as the one between:

  $ time git log --format="" >/dev/null

  real    0m0.463s
  user    0m0.448s
  sys     0m0.012s

and:

  $ time git log --format="%h" >/dev/null

  real    0m1.062s
  user    0m0.636s
  sys     0m0.416s

With caching duplicates are basically free and without it short hashes
have to be looked up again.  Other placeholders may reduce the
relative slowdown, depending on how expensive they are.

Forgot a third option, probably because it's not a particularly good
idea: Replacing the caching in pretty.c with a short static cache in
find_unique_abbrev_r().

René
Re: [BUG] add_again() off-by-one error in custom format
René Scharfe writes:

> Indeed, a very nice bug report!
>
>> I think the call to format_commit_one() needs to be taught to pass
>> 'magic' through, and then add_again() mechanism needs to be told not
>> to cache when magic is in effect, or something like that.
>>
>> Perhaps something along this line...?
>>
>>  pretty.c | 64 ++++++++++++++++++++++++++++++++++--------------------------
>>  1 file changed, 38 insertions(+), 26 deletions(-)
>
> That looks quite big.  You even invent a way to classify magic.

Hmph, by "a way to classify", do you mean the enum?  That thing was
there from before, just moved.

Also I think we do not have to change add_again() at all, because
chunk->off tolerates being given a garbage value as long as chunk->len
stays 0, and add_again() does not change chunk->len at all.  Which
cuts the diffstat down to

 pretty.c | 40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)

> Does the caching feature justify the added complexity?

That's a good question.  I thought about your second alternative while
adding the "don't cache" thing, but if we can measure and find out
that we are not gaining much from caching, certainly that sounds much
simpler.

> - Don't cache anymore, now that we have placeholders that change
>   output on top of the original append-only ones.  Yields a net code
>   reduction.  Duplicated %h, %t and %p will have to be resolved at
>   each occurrence.
>
> - Provide some space for the cache instead of reusing the output,
>   like a strbuf for each placeholder.  Adds a memory allocation to
>   each first occurrence of %h, %t and %p.  Such a cache could be
>   reused for multiple commits by truncating it instead of releasing;
>   not sure how to pass it in nicely for it to survive a sequence of
>   calls, though.
>
> René
Re: [BUG] add_again() off-by-one error in custom format
Am 13.06.2017 um 00:49 schrieb Junio C Hamano:
> Michael Giuffrida writes:
>
>> For the abbreviated commit hash placeholder ('h'), pretty.c uses
>> add_again() to cache the result of the expansion, and then uses that
>> result in future expansions.  This causes problems when the
>> expansion includes whitespace:
>>
>>   $ git log -n 1 --pretty='format:newline:%+h%nno_newline:%h'
>>   newline:
>>   a0b1c2d
>>   no_newline:
>>   a0b1c2
>>
>> The second expansion of the hash added an unwanted newline and
>> removed the final character.  It seemingly used the cached expansion
>> *starting from the inserted newline*.
>>
>> The expected output is:
>>
>>   $ git log -n 1 --pretty='format:newline:%+h%nno_newline:%h'
>>   newline:
>>   a0b1c2d
>>   no_newline:a0b1c2d
>
> Nicely explained.
>
> The add_again() mechanism caches an earlier result by remembering the
> offset and the length in the strbuf the formatted string is being
> accumulated to, but because %+x (and probably %-x) magic placeholders
> futz with the result of format_commit_one() in place, the cache goes
> out of sync, of course.

Indeed, a very nice bug report!

> I think the call to format_commit_one() needs to be taught to pass
> 'magic' through, and then add_again() mechanism needs to be told not
> to cache when magic is in effect, or something like that.
>
> Perhaps something along this line...?
>
>  pretty.c | 64 ++++++++++++++++++++++++++++++++++--------------------------
>  1 file changed, 38 insertions(+), 26 deletions(-)

That looks quite big.  You even invent a way to classify magic.
Does the caching feature justify the added complexity?

Alternatives:

- Don't cache anymore, now that we have placeholders that change
  output on top of the original append-only ones.  Yields a net code
  reduction.  Duplicated %h, %t and %p will have to be resolved at
  each occurrence.

- Provide some space for the cache instead of reusing the output, like
  a strbuf for each placeholder.  Adds a memory allocation to each
  first occurrence of %h, %t and %p.  Such a cache could be reused for
  multiple commits by truncating it instead of releasing; not sure how
  to pass it in nicely for it to survive a sequence of calls, though.

René
Re: [BUG] add_again() off-by-one error in custom format
Michael Giuffrida writes:

> For the abbreviated commit hash placeholder ('h'), pretty.c uses
> add_again() to cache the result of the expansion, and then uses that
> result in future expansions.  This causes problems when the expansion
> includes whitespace:
>
>   $ git log -n 1 --pretty='format:newline:%+h%nno_newline:%h'
>   newline:
>   a0b1c2d
>   no_newline:
>   a0b1c2
>
> The second expansion of the hash added an unwanted newline and
> removed the final character.  It seemingly used the cached expansion
> *starting from the inserted newline*.
>
> The expected output is:
>
>   $ git log -n 1 --pretty='format:newline:%+h%nno_newline:%h'
>   newline:
>   a0b1c2d
>   no_newline:a0b1c2d

Nicely explained.

The add_again() mechanism caches an earlier result by remembering the
offset and the length in the strbuf the formatted string is being
accumulated to, but because %+x (and probably %-x) magic placeholders
futz with the result of format_commit_one() in place, the cache goes
out of sync, of course.

I think the call to format_commit_one() needs to be taught to pass
'magic' through, and then add_again() mechanism needs to be told not
to cache when magic is in effect, or something like that.

Perhaps something along this line...?

 pretty.c | 64 ++++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 38 insertions(+), 26 deletions(-)

diff --git a/pretty.c b/pretty.c
index 09701bd2ff..be22b12986 100644
--- a/pretty.c
+++ b/pretty.c
@@ -789,20 +789,22 @@ struct format_commit_context {
 	size_t wrap_start;
 };
 
-static int add_again(struct strbuf *sb, struct chunk *chunk)
+static int add_again(struct strbuf *sb, struct chunk *chunk, int no_cache)
 {
 	if (chunk->len) {
 		strbuf_adddup(sb, chunk->off, chunk->len);
 		return 1;
 	}
 
-	/*
-	 * We haven't seen this chunk before.  Our caller is surely
-	 * going to add it the hard way now.  Remember the most likely
-	 * start of the to-be-added chunk: the current end of the
-	 * struct strbuf.
-	 */
-	chunk->off = sb->len;
+	if (!no_cache) {
+		/*
+		 * We haven't seen this chunk before.  Our caller is surely
+		 * going to add it the hard way now.  Remember the most likely
+		 * start of the to-be-added chunk: the current end of the
+		 * struct strbuf.
+		 */
+		chunk->off = sb->len;
+	}
 	return 0;
 }
 
@@ -1066,15 +1068,24 @@ static size_t parse_padding_placeholder(struct strbuf *sb,
 	return 0;
 }
 
+enum format_commit_item_magic {
+	NO_MAGIC,
+	ADD_LF_BEFORE_NON_EMPTY,
+	DEL_LF_BEFORE_EMPTY,
+	ADD_SP_BEFORE_NON_EMPTY
+};
+
 static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 				const char *placeholder,
-				void *context)
+				void *context,
+				enum format_commit_item_magic magic)
 {
 	struct format_commit_context *c = context;
 	const struct commit *commit = c->commit;
 	const char *msg = c->message;
 	struct commit_list *p;
 	int ch;
+	int no_cache;
 
 	/* these are independent of the commit */
 	switch (placeholder[0]) {
@@ -1139,6 +1150,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 	if (!commit->object.parsed)
 		parse_object(&commit->object.oid);
 
+	no_cache = magic != NO_MAGIC;
+
 	switch (placeholder[0]) {
 	case 'H':		/* commit hash */
 		strbuf_addstr(sb, diff_get_color(c->auto_color, DIFF_COMMIT));
@@ -1147,24 +1160,26 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 		return 1;
 	case 'h':		/* abbreviated commit hash */
 		strbuf_addstr(sb, diff_get_color(c->auto_color, DIFF_COMMIT));
-		if (add_again(sb, &c->abbrev_commit_hash)) {
+		if (add_again(sb, &c->abbrev_commit_hash, no_cache)) {
 			strbuf_addstr(sb, diff_get_color(c->auto_color, DIFF_RESET));
 			return 1;
 		}
 		strbuf_add_unique_abbrev(sb, commit->object.oid.hash,
 					 c->pretty_ctx->abbrev);
 		strbuf_addstr(sb, diff_get_color(c->auto_color, DIFF_RESET));
-		c->abbrev_commit_hash.len = sb->len - c->abbrev_commit_hash.off;
+		if (!no_cache)
+			c->abbrev_commit_hash.len = sb->len - c->abbrev_commit_hash.off;
 		return 1;
 	case 'T':		/* tree hash */
 		strbuf_addstr(sb, oid_to_hex(&commit->tree->object.oid));
 		return 1;
 	case 't':		/* abbreviated tree hash */
-		if (add_again(sb, &c->abbrev_tree_hash))
+
[BUG] add_again() off-by-one error in custom format
In a custom pretty format, using the '+' or ' ' combinators to prefix
a non-empty expansion with whitespace will erroneously truncate
subsequent expansions of the same type.

Normally '%+X' inserts a newline before the expansion of X, iff that
expansion is non-empty:

  $ git log -n 1 --pretty='format:newline:%+an'
  newline:
  MyName

For the abbreviated commit hash placeholder ('h'), pretty.c uses
add_again() to cache the result of the expansion, and then uses that
result in future expansions.  This causes problems when the expansion
includes whitespace:

  $ git log -n 1 --pretty='format:newline:%+h%nno_newline:%h'
  newline:
  a0b1c2d
  no_newline:
  a0b1c2

The second expansion of the hash added an unwanted newline and removed
the final character.  It seemingly used the cached expansion *starting
from the inserted newline*.

The expected output is:

  $ git log -n 1 --pretty='format:newline:%+h%nno_newline:%h'
  newline:
  a0b1c2d
  no_newline:a0b1c2d

/CC René from commit b9c62321 and Junio from 9fa708d