Re: [PATCH 06/18] t5613: clarify "too deep" recursion tests
On Mon, Oct 3, 2016 at 1:34 PM, Jeff Kingwrote: > These tests are just trying to show that we allow recursion > up to a certain depth, but not past it. But the counting is > a bit non-intuitive, and rather than test at the edge of the > breakage, we test "OK" cases in the middle of the chain. > Let's explain what's going on, and explicitly test the > switch between "OK" and "too deep". > Makes sense to actually test the edge case here instead of just in the middle. > Signed-off-by: Jeff King > --- > t/t5613-info-alternate.sh | 24 > 1 file changed, 16 insertions(+), 8 deletions(-) > > diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh > index 7bc1c3c..b393613 100755 > --- a/t/t5613-info-alternate.sh > +++ b/t/t5613-info-alternate.sh > @@ -39,6 +39,18 @@ test_expect_success 'preparing third repository' ' > ) > ' > > +# Note: These tests depend on the hard-coded value of 5 as "too deep". We > start > +# the depth at 0 and count links, not repositories, so in a chain like: > +# > +# A -> B -> C -> D -> E -> F -> G -> H > +# 0123456 > +# Ok so we count links, but wouldn't we have 5 links when we hit F, and not G? Or am I missing something here? > +# we are OK at "G", but break at "H". > +# Seems like from the wording of this comment that we'd break at G and not H..? Obviously the test below shows G is ok. Aren't the numbers here off by 1? Regards, Jake > +# Note also that we must use "--bare -l" to make the link to H. The "-l" > +# ensures we do not do a connectivity check, and the "--bare" makes sure > +# we do not try to checkout the result (which needs objects), either of > +# which would cause the clone to fail. > test_expect_success 'creating too deep nesting' ' > git clone -l -s C D && > git clone -l -s D E && > @@ -47,16 +59,12 @@ test_expect_success 'creating too deep nesting' ' > git clone --bare -l -s G H > ' > > -test_expect_success 'invalidity of deepest repository' ' > - test_must_fail git -C H fsck > -' > - > -test_expect_success 'validity of third repository' ' > - git -C C fsck > +test_expect_success 'validity of fifth-deep repository' ' > + git -C G fsck > ' > > -test_expect_success 'validity of fourth repository' ' > - git -C D fsck > +test_expect_success 'invalidity of sixth-deep repository' ' > + test_must_fail git -C H fsck > ' > > test_expect_success 'breaking of loops' ' > -- > 2.10.0.618.g82cc264 >
Re: [PATCH 05/18] t5613: do not chdir in main process
On Mon, Oct 3, 2016 at 1:34 PM, Jeff Kingwrote: > Our usual style when working with subdirectories is to chdir > inside a subshell or to use "git -C", which means we do not > have to constantly return to the main test directory. Let's > convert this old test, which does not follow that style. > More obvious cleanup for this test file. Seems like a lot of intermediate steps to get this test file into a clean state. Thanks, Jake
Re: [PATCH 04/18] t5613: whitespace/style cleanups
On Mon, Oct 3, 2016 at 1:34 PM, Jeff Kingwrote: > Our normal test style these days puts the opening quote of > the body on the description line, and indents the body with > a single tab. This ancient test did not follow this. > I was surprised you didn't do this first, but it doesn't really make a difference either way. This is also a pretty straight forward improvement, and I can see why you'd want to split this out to review separately. Regards, Jake
Re: [PATCH 03/18] t5613: use test_must_fail
On Mon, Oct 3, 2016 at 1:34 PM, Jeff Kingwrote: > Besides being our normal style, this correctly checks for an > error exit() versus signal death. > Another very simple but obvious improvement. Regards, Jake > Signed-off-by: Jeff King > --- > t/t5613-info-alternate.sh | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh > index 4548fb0..65074dd 100755 > --- a/t/t5613-info-alternate.sh > +++ b/t/t5613-info-alternate.sh > @@ -46,10 +46,9 @@ git clone -l -s F G && > git clone --bare -l -s G H' > > test_expect_success 'invalidity of deepest repository' \ > -'cd H && { > - git fsck > - test $? -ne 0 > -}' > +'cd H && > +test_must_fail git fsck > +' > > cd "$base_dir" > > @@ -75,7 +74,8 @@ cd "$base_dir" > test_expect_success 'that info/alternates is necessary' \ > 'cd C && > rm -f .git/objects/info/alternates && > -! (git fsck)' > +test_must_fail git fsck > +' > > cd "$base_dir" > > @@ -89,7 +89,7 @@ cd "$base_dir" > test_expect_success \ > 'that relative alternate is only possible for current dir' ' > cd D && > -! (git fsck) > +test_must_fail git fsck > ' > > cd "$base_dir" > -- > 2.10.0.618.g82cc264 >
Re: [PATCH 02/18] t5613: drop test_valid_repo function
On Mon, Oct 3, 2016 at 1:33 PM, Jeff Kingwrote: > This function makes sure that "git fsck" does not report any > errors. But "--full" has been the default since f29cd39 > (fsck: default to "git fsck --full", 2009-10-20), and we can > use the exit code (instead of counting the lines) since > e2b4f63 (fsck: exit with non-zero status upon errors, > 2007-03-05). > > So we can just use "git fsck", which is shorter and more > flexible (e.g., we can use "git -C"). This seems obviously correct. I didn't understand your comment about the use of "git -C" at first, because I was confused about why "git -C" doesn't work with "git --full", but then I realized you can't use "git -C" with the shell test_valid_repo function. Thanks, Jake > > Signed-off-by: Jeff King > --- > t/t5613-info-alternate.sh | 19 +++ > 1 file changed, 7 insertions(+), 12 deletions(-) > > diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh > index e13f57d..4548fb0 100755 > --- a/t/t5613-info-alternate.sh > +++ b/t/t5613-info-alternate.sh > @@ -6,11 +6,6 @@ > test_description='test transitive info/alternate entries' > . ./test-lib.sh > > -test_valid_repo() { > - git fsck --full > fsck.log && > - test_line_count = 0 fsck.log > -} > - > base_dir=$(pwd)
Re: [PATCH 01/18] t5613: drop reachable_via function
On Mon, Oct 3, 2016 at 1:33 PM, Jeff Kingwrote: > This function was never used since its inception in dd05ea1 > (test case for transitive info/alternates, 2006-05-07). > Which is just as well, since it mutates the repo state in a > way that would invalidate further tests, without cleaning up > after itself. Let's get rid of it so that nobody is tempted > to use it. > Makes sense. It wouldn't be a good idea to leave this around since it didn't clean up after itself. Curious why no test actually used it though.. Thanks, Jake
Re: [PATCH 0/18] alternate object database cleanups
On Mon, Oct 3, 2016 at 1:33 PM, Jeff Kingwrote: > This series is the result of René nerd-sniping me with the claim that we > could "easily" teach count-objects to print out the list of alternates > in: > > http://public-inbox.org/git/c27dc1a4-3c7a-2866-d9d8-f5d3eb161...@web.de/ > Hah. Nerd snipes are fun. > My real goal is just patch 17, which is needed for the quarantine series > in that thread. But along the way there were quite a few opportunities > for cleanups along with a few minor bugfixes (in patches 7 and 18), and > I think the count-objects change in patch 16 is a nice general debugging > tool. Yea there are a *lot* of cleanups here. > > The rest of it is "just" cleanup, but I'll note that it clears up some > hairy allocation code. These were bits that I noticed in my big > allocation-cleanup series last year, but were too nasty to fit any of > the more general fixes. I think the end result is much better. > Definitely agreed. I read through all the patches, and each one seemed reasonable. > The number of patches is a little intimidating, but I tried hard to > break the refactoring down into a sequence of obviously-correct steps. > You can be the judge of my success. > I read through them once. I'm going to re-read through them again and leave any comments I had. Regards, Jake > [01/18]: t5613: drop reachable_via function > [02/18]: t5613: drop test_valid_repo function > [03/18]: t5613: use test_must_fail > [04/18]: t5613: whitespace/style cleanups > [05/18]: t5613: do not chdir in main process > [06/18]: t5613: clarify "too deep" recursion tests > [07/18]: link_alt_odb_entry: handle normalize_path errors > [08/18]: link_alt_odb_entry: refactor string handling > [09/18]: alternates: provide helper for adding to alternates list > [10/18]: alternates: provide helper for allocating alternate > [11/18]: alternates: encapsulate alt->base munging > [12/18]: alternates: use a separate scratch space > [13/18]: fill_sha1_file: write "boring" characters > [14/18]: alternates: store scratch buffer as strbuf > [15/18]: fill_sha1_file: write into a strbuf > [16/18]: count-objects: report alternates via verbose mode > [17/18]: sha1_file: always allow relative paths to alternates > [18/18]: alternates: use fspathcmp to detect duplicates > > Documentation/git-count-objects.txt | 5 + > builtin/count-objects.c | 12 +++ > builtin/fsck.c | 10 +- > builtin/submodule--helper.c | 11 +- > cache.h | 36 ++- > sha1_file.c | 179 ++-- > sha1_name.c | 17 +-- > strbuf.c| 20 > strbuf.h| 8 ++ > submodule.c | 23 +--- > t/t5613-info-alternate.sh | 202 > > transport.c | 4 +- > 12 files changed, 305 insertions(+), 222 deletions(-) >
[no subject]
[no subject]
Re: Feature Request: user defined suffix for temp files created by git-mergetool
Hi Anatoly, | Sent: Monday, October 3, 2016 5:18:44 PM | | Hi Josef, | | | On Mon, Oct 3, 2016 at 8:36 AM, Josef Ridkywrote: | > In several projects, we are using git mergetool for comparing files from | > different folders. | > Unfortunately, when we have opened three files for comparing using meld | > tool (e.q. Old_version -- Result -- New_version), | > we can see only name of temporary files created by mergetool in the labels | > (e.g. foo_REMOTE -- foo_BASE -- foo_LOCAL) | > and users (and sometime even we) are confused, which of the files should | > they edit and save. | | `git mergetool` just creates temporary files (with some temporary | names) and calls `meld` (or `vimdiff`, etc) with the file names as | parameters. So why wouldn't you call `meld` with the file names you | want? Because files, that we want, are temporary files created by git mergetool and we are not able to change their name. Regards Josef
Re: What's cooking in git.git (Oct 2016, #01; Mon, 3)
Jeff Kingwrites: >> * lt/abbrev-auto (2016-10-03) 3 commits > > I kind of expected this one to cook in next for a bit while people > decided whether the larger hashes were irritating or not. Despite > working on the implementation, I'm on the fence myself. > > I'd kind of hoped people would play with core.disambiguate and the hints > and see if they still actually wanted to bump the default abbrev (and > how aggressively to do so; if core.disambiguate means most of the > ambiguity is just between commits, that cuts the number of > collision-interesting objects by an order of magnitude). Sure. Let's keep them cooking. >> * jk/pack-objects-optim-mru (2016-08-11) 4 commits >> (merged to 'next' on 2016-09-21 at 97b919bdbd) >> + pack-objects: use mru list when iterating over packs >> + pack-objects: break delta cycles before delta-search phase >> + sha1_file: make packed_object_info public >> + provide an initializer for "struct object_info" >> >> Originally merged to 'next' on 2016-08-11 >> >> "git pack-objects" in a repository with many packfiles used to >> spend a lot of time looking for/at objects in them; the accesses to >> the packfiles are now optimized by checking the most-recently-used >> packfile first. >> >> Will hold to see if people scream. > > This has been in next for 6 weeks. Is it time to consider graduating it? Perhaps.
Re: [PATCH 3/3] abbrev: auto size the default abbreviation
Jeff Kingwrites: >> OK, as Linus's "count at the point of use" is already in 'next', >> could you make it incremental with a log message? > > Sure. I wasn't sure if you actually liked my direction or not, so I was > mostly just showing off what the completed one would look like. To be quite honest, I am not just unsure if I liked your direction; rather I am not sure if I actually understood what you perceived as a difference that matters between the two approaches. I wanted to hear you explain the difference in terms of "Linus's does this, but it is bad in X and Y way, so let's avoid it and do it like Z instead". One effective way to extract that out of you was to force you to justify the "incremental" update. And it seems that I succeeded ;-). I am still not sure if I 100% agree with your first paragraph, but at least now I think I see where you are coming from. You probably will hear from Ramsay about extern-ness of msb(). > -- >8 -- > Subject: [PATCH] find_unique_abbrev: move logic out of get_short_sha1() > > The get_short_sha1() is only about reading short sha1s; we > do call it in a loop to check "is this long enough" for each > object, but otherwise it should not need to know about > things like our default_abbrev setting. > > So instead of asking it to set default_automatic_abbrev as a > side-effect, let's just have find_unique_abbrev() pick the > right place to start its loop. This requires a separate > approximate_object_count() function, but that naturally > belongs with the rest of sha1_file.c. > > Signed-off-by: Jeff King > --- > cache.h | 7 ++- > sha1_file.c | 27 +++ > sha1_name.c | 60 +++- > 3 files changed, 68 insertions(+), 26 deletions(-) > > diff --git a/cache.h b/cache.h > index 0e2a059..f22ace5 100644 > --- a/cache.h > +++ b/cache.h > @@ -1204,7 +1204,6 @@ struct object_context { > #define GET_SHA1_TREEISH 020 > #define GET_SHA1_BLOB 040 > #define GET_SHA1_FOLLOW_SYMLINKS 0100 > -#define GET_SHA1_AUTOMATIC0200 > #define GET_SHA1_ONLY_TO_DIE04000 > > #define GET_SHA1_DISAMBIGUATORS \ > @@ -1456,6 +1455,12 @@ extern void prepare_packed_git(void); > extern void reprepare_packed_git(void); > extern void install_packed_git(struct packed_git *pack); > > +/* > + * Give a rough count of objects in the repository. This sacrifices accuracy > + * for speed. > + */ > +unsigned long approximate_object_count(void); > + > extern struct packed_git *find_sha1_pack(const unsigned char *sha1, >struct packed_git *packs); > > diff --git a/sha1_file.c b/sha1_file.c > index b9c1fa3..4882440 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -1381,6 +1381,32 @@ static void prepare_packed_git_one(char *objdir, int > local) > strbuf_release(); > } > > +static int approximate_object_count_valid; > + > +/* > + * Give a fast, rough count of the number of objects in the repository. This > + * ignores loose objects completely. If you have a lot of them, then either > + * you should repack because your performance will be awful, or they are > + * all unreachable objects about to be pruned, in which case they're not > really > + * interesting as a measure of repo size in the first place. > + */ > +unsigned long approximate_object_count(void) > +{ > + static unsigned long count; > + if (!approximate_object_count_valid) { > + struct packed_git *p; > + > + prepare_packed_git(); > + count = 0; > + for (p = packed_git; p; p = p->next) { > + if (open_pack_index(p)) > + continue; > + count += p->num_objects; > + } > + } > + return count; > +} > + > static void *get_next_packed_git(const void *p) > { > return ((const struct packed_git *)p)->next; > @@ -1455,6 +1481,7 @@ void prepare_packed_git(void) > > void reprepare_packed_git(void) > { > + approximate_object_count_valid = 0; > prepare_packed_git_run_once = 0; > prepare_packed_git(); > } > diff --git a/sha1_name.c b/sha1_name.c > index beb7ab5..76e6885 100644 > --- a/sha1_name.c > +++ b/sha1_name.c > @@ -15,7 +15,6 @@ typedef int (*disambiguate_hint_fn)(const unsigned char *, > void *); > > struct disambiguate_state { > int len; /* length of prefix in hex chars */ > - unsigned int nrobjects; > char hex_pfx[GIT_SHA1_HEXSZ + 1]; > unsigned char bin_pfx[GIT_SHA1_RAWSZ]; > > @@ -119,14 +118,6 @@ static void find_short_object_filename(struct > disambiguate_state *ds) > > if (strlen(de->d_name) != 38) > continue; > - > - /* > - * We only look at the one subdirectory, and we assume > - * each subdirectory is roughly similar, so each > -
Re: [PATCH v2 6/6] git-gui: Update Japanese information
Pat Thoytswrites: > I've tried to merge in these branches as they appear in your version > although I already had one patch on top of 0.20.0 for some time. I've > tentatively pushed this up to http://github.com/patthoyts/git-gui as > branch 'pu' with additional stuff on top of the patches you already > have. If this looks ok to you I'll merge this to my master and send you > a merge request to get it all synchronized. Your 64c6b4c507 matches what I expected to see as the result of merging the above four topics on top of your 'master'. I have no opinion on the other topics that appear on top of it on the branch you pushed out, other than that I trust the maintainer of the subsystem and I'm fine to blindly pull them from you ;-) I am not sure if f64a1a9311 ("git-gui: maintain backwards compatibility for merge syntax", 2016-10-04) makes any practical difference in the real world, though. You'd need to find somebody who grabs the newer version of git-gui that includes b5f325cb4a ("git-gui: stop using deprecated merge syntax", 2016-09-24), without having updated their copy of git-core for more than a year, given that 2.5.0 is from July last year. It would not hurt, but I am not sure if an extra invocation of "git version" is really worth it. Thanks.
Re: What's cooking in git.git (Oct 2016, #01; Mon, 3)
On Mon, Oct 03, 2016 at 03:31:02PM -0700, Junio C Hamano wrote: > * lt/abbrev-auto (2016-10-03) 3 commits > (merged to 'next' on 2016-10-03 at bb188d00f7) > + abbrev: auto size the default abbreviation > + abbrev: prepare for new world order > + abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizing > (this branch uses jk/ambiguous-short-object-names.) > > Allow the default abbreviation length, which has historically been > 7, to scale as the repository grows. The logic suggests to use 12 > hexdigits for the Linux kernel, and 9 to 10 for Git itself. > > Will merge to 'master'. I kind of expected this one to cook in next for a bit while people decided whether the larger hashes were irritating or not. Despite working on the implementation, I'm on the fence myself. I'd kind of hoped people would play with core.disambiguate and the hints and see if they still actually wanted to bump the default abbrev (and how aggressively to do so; if core.disambiguate means most of the ambiguity is just between commits, that cuts the number of collision-interesting objects by an order of magnitude). > * jk/pack-objects-optim-mru (2016-08-11) 4 commits > (merged to 'next' on 2016-09-21 at 97b919bdbd) > + pack-objects: use mru list when iterating over packs > + pack-objects: break delta cycles before delta-search phase > + sha1_file: make packed_object_info public > + provide an initializer for "struct object_info" > > Originally merged to 'next' on 2016-08-11 > > "git pack-objects" in a repository with many packfiles used to > spend a lot of time looking for/at objects in them; the accesses to > the packfiles are now optimized by checking the most-recently-used > packfile first. > > Will hold to see if people scream. This has been in next for 6 weeks. Is it time to consider graduating it? -Peff
Re: [RFC/PATCH 0/2] place cherry pick line below commit title
On 10/03/2016 03:13 PM, Junio C Hamano wrote: Jonathan Tanwrites: There are other options like checking for indentation or checking for balanced parentheses/brackets, but I think that these would lead to surprising behavior for the user (this would mean that whitespace or certain characters could turn a valid trailer into an invalid one or vice versa, or change the behavior of trailer.ifexists, especially "replace"). Yes, that is exactly why I said that it may be necessary for the code to analize the lines in a block identified as "likely to be a trailing block" more carefully. We can afford to be loose as long as the only allowed operation is to append one at the end, but once we start removing/replacing an existing entry, etc., the definition of what an entry is becomes very much relevant. I agree, and I was trying to discuss the possible alternatives for the definition of what an entry is in my previous e-mail. If you think that the alternatives are still too loose, I'm not sure if we can make it any tighter. As far as I know, we're dealing with trailers like the following: Signed-off-by: A [This has nothing to do with the above line] Signed-off-by: B and: Link 1: a link a continuation of the above and: Signed-off-by: Some body (comment on two lines) As I stated in the quoted paragraph, one possibility is to use indentation and/or balanced parentheses/brackets to determine if a trailer line continues onto the next line, and this would handle all the above cases, but I still think that these would lead to surprising behavior. Hence my suggestion to just simply define it as a single physical line. But if you think that the pros (of the more complicated approach) outweigh the cons, I'm OK with that. One alternative is to postpone this decision by changing sequencer only (and not trailer) to tolerate other lines in the trailer. This would make them even more divergent (sequencer supports arbitrary lines while trailer doesn't), but they were divergent already (sequencer supports "(cherry picked by" but trailer doesn't).
Re: [PATCH] http: http.emptyauth should allow empty (not just NULL) usernames
On Mon, Oct 03, 2016 at 09:54:19PM +, David Turner wrote: > > > I dunno. The code path you are changing _only_ affects anything if the > > http.emptyauth config is set. But I guess I just don't understand why you > > would say "http://@gitserver; in the first place. Is that a common thing? > > > > -Peff > > I have no idea if it is common. I know that we do it. I've never seen this. RFC 3986 does seem to allow it: authority = [ userinfo "@" ] host [ ":" port ] userinfo= *( unreserved / pct-encoded / sub-delims / ":" ) I normally write it like one of these: https://b...@git.crustytoothpaste.net/ https://:@git.crustytoothpaste.net/ Of course, the username is ignored in the first one, but it serves a documentary purpose for me. > The reason we have a required-to-be-blank username/password is > apparently Kerberos (or something about our particular Kerberos > configuration), which I treat as inscrutable black magic. The issue with git is usually that it uses libcurl, which won't do authentication unless it has a username or password, even if those are empty or ignored. http.emptyAuth was designed for this case. With Kerberos (at least in my experience), the username doesn't actually get sent, since you send only ticket-related information over the channel, and that has your principal name embedded. -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH 3/3] abbrev: auto size the default abbreviation
On Mon, Oct 03, 2016 at 03:52:44PM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > On Mon, Oct 03, 2016 at 03:34:03PM -0700, Linus Torvalds wrote: > > > >> On Mon, Oct 3, 2016 at 3:27 PM, Jeff King wrote: > >> > > >> > + if (len < 0) { > >> > + unsigned long count = approximate_object_count(); > >> > + len = (msb(count) + 1) / 2; > >> > + if (len < 0) > >> > + len = FALLBACK_DEFAULT_ABBREV; > >> > + } > >> > >> that second "if (len < 0)" should probably be testing against > >> FALLBACK_DEFAULT_ABBREV, not zero. Or at the very least > >> MINIMUM_ABBREV. Because a two-character abbreviation won't even be > >> recognized, even if the git project is very small indeed. > > > > Oops, yes, clearly it should be FALLBACK_DEFAULT_ABBREV. What is there > > would not even pass the tests (it _does_ work on linux.git, of course, > > because it is much too large for that code to be triggered). > > OK, as Linus's "count at the point of use" is already in 'next', > could you make it incremental with a log message? Sure. I wasn't sure if you actually liked my direction or not, so I was mostly just showing off what the completed one would look like. Here it is as an incremental on top of lt/abbrev-auto. I also tweaked the math a bit to round-up more aggressively, and commented it more (I could also just make the math look exactly like Linus's, counting up until hitting an expected collision. I dunno if that is more clear). -- >8 -- Subject: [PATCH] find_unique_abbrev: move logic out of get_short_sha1() The get_short_sha1() is only about reading short sha1s; we do call it in a loop to check "is this long enough" for each object, but otherwise it should not need to know about things like our default_abbrev setting. So instead of asking it to set default_automatic_abbrev as a side-effect, let's just have find_unique_abbrev() pick the right place to start its loop. This requires a separate approximate_object_count() function, but that naturally belongs with the rest of sha1_file.c. Signed-off-by: Jeff King --- cache.h | 7 ++- sha1_file.c | 27 +++ sha1_name.c | 60 +++- 3 files changed, 68 insertions(+), 26 deletions(-) diff --git a/cache.h b/cache.h index 0e2a059..f22ace5 100644 --- a/cache.h +++ b/cache.h @@ -1204,7 +1204,6 @@ struct object_context { #define GET_SHA1_TREEISH 020 #define GET_SHA1_BLOB 040 #define GET_SHA1_FOLLOW_SYMLINKS 0100 -#define GET_SHA1_AUTOMATIC 0200 #define GET_SHA1_ONLY_TO_DIE04000 #define GET_SHA1_DISAMBIGUATORS \ @@ -1456,6 +1455,12 @@ extern void prepare_packed_git(void); extern void reprepare_packed_git(void); extern void install_packed_git(struct packed_git *pack); +/* + * Give a rough count of objects in the repository. This sacrifices accuracy + * for speed. + */ +unsigned long approximate_object_count(void); + extern struct packed_git *find_sha1_pack(const unsigned char *sha1, struct packed_git *packs); diff --git a/sha1_file.c b/sha1_file.c index b9c1fa3..4882440 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1381,6 +1381,32 @@ static void prepare_packed_git_one(char *objdir, int local) strbuf_release(); } +static int approximate_object_count_valid; + +/* + * Give a fast, rough count of the number of objects in the repository. This + * ignores loose objects completely. If you have a lot of them, then either + * you should repack because your performance will be awful, or they are + * all unreachable objects about to be pruned, in which case they're not really + * interesting as a measure of repo size in the first place. + */ +unsigned long approximate_object_count(void) +{ + static unsigned long count; + if (!approximate_object_count_valid) { + struct packed_git *p; + + prepare_packed_git(); + count = 0; + for (p = packed_git; p; p = p->next) { + if (open_pack_index(p)) + continue; + count += p->num_objects; + } + } + return count; +} + static void *get_next_packed_git(const void *p) { return ((const struct packed_git *)p)->next; @@ -1455,6 +1481,7 @@ void prepare_packed_git(void) void reprepare_packed_git(void) { + approximate_object_count_valid = 0; prepare_packed_git_run_once = 0; prepare_packed_git(); } diff --git a/sha1_name.c b/sha1_name.c index beb7ab5..76e6885 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -15,7 +15,6 @@ typedef int (*disambiguate_hint_fn)(const unsigned char *, void *); struct disambiguate_state { int len; /* length of prefix in hex chars */ - unsigned int nrobjects; char hex_pfx[GIT_SHA1_HEXSZ + 1];
Re: [PATCH] git-gui: stop using deprecated merge syntax
René Scharfewrites: >Am 03.10.2016 um 10:30 schrieb Pat Thoyts: >> The only problem I see here is that generally git-gui tries to continue >> to work with older versions of git as well. So adding a guard using the >> git-version procedure should maintain that backwards compatibility. > >Makes sense for a stand-alone tool. > >> I suggest: >> >> From c2716458f05893ca88c05ce211a295a330e74590 Mon Sep 17 00:00:00 2001 >> From: René Scharfe >> Date: Sat, 24 Sep 2016 13:30:22 +0200 >> Subject: [PATCH] git-gui: stop using deprecated merge syntax >> >> Starting with v2.5.0 git merge can handle FETCH_HEAD internally and >> warns when it's called like 'git merge HEAD ' because >> that syntax is deprecated. Use this feature in git-gui and get rid of >> that warning. >> >> Tested-by: Johannes Sixt >> Reviewed-by: Stefan Beller >> Signed-off-by: Rene Scharfe >> Signed-off-by: Pat Thoyts > >OK, but perhaps move me from From: to Original-patch-by: as the >version check is a big enough change in itself. Or add a separate >commit for it. Or at least mention that you added the check in the >commit message. > >Thanks, >René As this is one of the ones already staged to git's 'next' I'll make this as a separate commit on top. -- Pat Thoytshttp://www.patthoyts.tk/ PGP fingerprint 2C 6E 98 07 2C 59 C8 97 10 CE 11 E6 04 E0 B9 DD
Re: [PATCH v2 6/6] git-gui: Update Japanese information
Junio C Hamanowrites: >Junio C Hamano writes: > >> Pat Thoyts writes: >> >>> I'm just starting to catch up once again. hopefully I can be >>> a bit more reactive than recently. Merging 52285c83 looks fine. I'll >>> stick that onto the 0.20.0 head and see what else I can pick up on top. >>> There are a few from the git for windows set among others. >> >> Nice to hear from you again. I think I have a few topics I merged >> to my tree bypassing you in the meantime. Let me get back to you >> with a list of topic tips to bring your tree in sync with what I >> have later. > >I think the following lists everything that has been done bypassing >your tree: > >66fe3e061a ("git-gui: l10n: add Portuguese translation", 2016-05-06) >52285c8312 ("git-gui: update Japanese information", 2016-09-07) >2afe6b733e ("git-gui: respect commit.gpgsign again", 2016-09-09) >b5f325cb4a ("git-gui: stop using deprecated merge syntax", 2016-09-24) > >52285c8312 and 2afe6b733e are already in my 'master'; the other two >are already cooking in 'next'. > >So if you fetch from me and merge the above, you'd be in sync with >me (I won't be in sync with you, as you would have more than I have >from other places like Git for Windows set). > >Thanks. > I've tried to merge in these branches as they appear in your version although I already had one patch on top of 0.20.0 for some time. I've tentatively pushed this up to http://github.com/patthoyts/git-gui as branch 'pu' with additional stuff on top of the patches you already have. If this looks ok to you I'll merge this to my master and send you a merge request to get it all synchronized. -- Pat Thoytshttp://www.patthoyts.tk/
Re: [PATCH] http: http.emptyauth should allow empty (not just NULL) usernames
David Turnerwrites: >> > > I dunno. The code path you are changing _only_ affects anything if >> > > the http.emptyauth config is set. But I guess I just don't >> > > understand why you would say "http://@gitserver; in the first place. >> Is that a common thing? >> > >> > I have no idea if it is common. I know that we do it. >> >> I guess my question is: _why_ do you do it? Or more specifically, does >> http://gitserver.example.com; with http.emptyauth not work, and why? >> >> From your response, I _think_ the answer is "no, it doesn't, and I have no >> clue why". > > That was true historically. > > I just tried our old version of git 2.8 (that is, before this patch, and > before the libcurl upgrade), and http://gitserver.example.com *does* seem to > work with http.emptyauth (and does not work without). However, > http://@gitserver.example.com does *not* work with http.emptyauth, and *does* > work without. > > After the libcurl upgrade, but before this patch, > http://@gitserver.example.com does *not* work with http.emptyauth, while > http://gitserver.example.com does. > > And finally, after the upgrade and with this patch, both urls work. > >> So I dunno. It is annoying not to know what is actually going on, but I'm >> OK with it if we don't think there's a high chance of regressing any other >> workflows (which I guess not, because http.emptyauth seems to be a >> Kerberos-specific hack in the first place). > > Yes, I think this is all Kerberos-only. Now, perhaps with these back-and-forth, hopefully you have enough material to update the proposed log message to clarify so that next Peff won't have to ask "would it be common? why would you do so?" Thanks.
Re: [PATCH 3/3] abbrev: auto size the default abbreviation
Jeff Kingwrites: > On Mon, Oct 03, 2016 at 03:34:03PM -0700, Linus Torvalds wrote: > >> On Mon, Oct 3, 2016 at 3:27 PM, Jeff King wrote: >> > >> > + if (len < 0) { >> > + unsigned long count = approximate_object_count(); >> > + len = (msb(count) + 1) / 2; >> > + if (len < 0) >> > + len = FALLBACK_DEFAULT_ABBREV; >> > + } >> >> that second "if (len < 0)" should probably be testing against >> FALLBACK_DEFAULT_ABBREV, not zero. Or at the very least >> MINIMUM_ABBREV. Because a two-character abbreviation won't even be >> recognized, even if the git project is very small indeed. > > Oops, yes, clearly it should be FALLBACK_DEFAULT_ABBREV. What is there > would not even pass the tests (it _does_ work on linux.git, of course, > because it is much too large for that code to be triggered). OK, as Linus's "count at the point of use" is already in 'next', could you make it incremental with a log message? Thanks.
Re: [PATCH 3/3] abbrev: auto size the default abbreviation
On Mon, Oct 03, 2016 at 03:34:03PM -0700, Linus Torvalds wrote: > On Mon, Oct 3, 2016 at 3:27 PM, Jeff Kingwrote: > > > > + if (len < 0) { > > + unsigned long count = approximate_object_count(); > > + len = (msb(count) + 1) / 2; > > + if (len < 0) > > + len = FALLBACK_DEFAULT_ABBREV; > > + } > > that second "if (len < 0)" should probably be testing against > FALLBACK_DEFAULT_ABBREV, not zero. Or at the very least > MINIMUM_ABBREV. Because a two-character abbreviation won't even be > recognized, even if the git project is very small indeed. Oops, yes, clearly it should be FALLBACK_DEFAULT_ABBREV. What is there would not even pass the tests (it _does_ work on linux.git, of course, because it is much too large for that code to be triggered). -Peff
Re: [PATCH 3/3] abbrev: auto size the default abbreviation
On Mon, Oct 3, 2016 at 3:27 PM, Jeff Kingwrote: > > + if (len < 0) { > + unsigned long count = approximate_object_count(); > + len = (msb(count) + 1) / 2; > + if (len < 0) > + len = FALLBACK_DEFAULT_ABBREV; > + } that second "if (len < 0)" should probably be testing against FALLBACK_DEFAULT_ABBREV, not zero. Or at the very least MINIMUM_ABBREV. Because a two-character abbreviation won't even be recognized, even if the git project is very small indeed. Linus
A note from the maintainer
Welcome to the Git development community. This message is written by the maintainer and talks about how Git project is managed, and how you can work with it. * Mailing list and the community The development is primarily done on the Git mailing list. Help requests, feature proposals, bug reports and patches should be sent to the list address. You don't have to be subscribed to send messages. The convention on the list is to keep everybody involved on Cc:, so it is unnecessary to say "Please Cc: me, I am not subscribed". Before sending patches, please read Documentation/SubmittingPatches and Documentation/CodingGuidelines to familiarize yourself with the project convention. If you sent a patch and you did not hear any response from anybody for several days, it could be that your patch was totally uninteresting, but it also is possible that it was simply lost in the noise. Please do not hesitate to send a reminder message in such a case. Messages getting lost in the noise may be a sign that those who can evaluate your patch don't have enough mental/time bandwidth to process them right at the moment, and it often helps to wait until the list traffic becomes calmer before sending such a reminder. The list archive is available at a few public sites: http://public-inbox.org/git/ http://marc.info/?l=git http://www.spinics.net/lists/git/ For those who prefer to read it over NNTP: nntp://news.public-inbox.org/inbox.comp.version-control.git nntp://news.gmane.org/gmane.comp.version-control.git are available. When you point at a message in a mailing list archive, using its message ID is often the most robust (if not very friendly) way to do so, like this: http://public-inbox.org/git/pine.lnx.4.58.0504150753440.7...@ppc970.osdl.org Often these web interfaces accept the message ID with enclosing <> stripped (like the above example to point at one of the most important message in the Git list). Some members of the development community can sometimes be found on the #git and #git-devel IRC channels on Freenode. Their logs are available at: http://colabti.org/irclogger/irclogger_log/git http://colabti.org/irclogger/irclogger_log/git-devel There is a volunteer-run newsletter to serve our community ("Git Rev News" http://git.github.io/rev_news/rev_news.html). Git is a member project of software freedom conservancy, a non-profit organization (https://sfconservancy.org/). To reach a committee of liaisons to the conservancy, contact them at . * Reporting bugs When you think git does not behave as you expect, please do not stop your bug report with just "git does not work". "I used git in this way, but it did not work" is not much better, neither is "I used git in this way, and X happend, which is broken". It often is that git is correct to cause X happen in such a case, and it is your expectation that is broken. People would not know what other result Y you expected to see instead of X, if you left it unsaid. Please remember to always state - what you wanted to achieve; - what you did (the version of git and the command sequence to reproduce the behavior); - what you saw happen (X above); - what you expected to see (Y above); and - how the last two are different. See http://www.chiark.greenend.org.uk/~sgtatham/bugs.html for further hints. If you think you found a security-sensitive issue and want to disclose it to us without announcing it to wider public, please contact us at our security mailing list . This is a closed list that is limited to people who need to know early about vulnerabilities, including: - people triaging and fixing reported vulnerabilities - people operating major git hosting sites with many users - people packaging and distributing git to large numbers of people where these issues are discussed without risk of the information leaking out before we're ready to make public announcements. * Repositories and documentation. My public git.git repositories are at: git://git.kernel.org/pub/scm/git/git.git/ https://kernel.googlesource.com/pub/scm/git/git git://repo.or.cz/alt-git.git/ https://github.com/git/git/ git://git.sourceforge.jp/gitroot/git-core/git.git/ git://git-core.git.sourceforge.net/gitroot/git-core/git-core/ A few web interfaces are found at: http://git.kernel.org/cgit/git/git.git https://kernel.googlesource.com/pub/scm/git/git http://repo.or.cz/w/alt-git.git Preformatted documentation from the tip of the "master" branch can be found in: git://git.kernel.org/pub/scm/git/git-{htmldocs,manpages}.git/ git://repo.or.cz/git-{htmldocs,manpages}.git/ https://github.com/gitster/git-{htmldocs,manpages}.git/ The manual pages formatted in HTML for the tip of 'master' can be viewed online at: https://git.github.io/htmldocs/git.html * How various branches are used. There are four
What's cooking in git.git (Oct 2016, #01; Mon, 3)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. With fixes since 2.10 accumulated on the 'master' front, the first maintenance release 2.10.1 has been tagged. The auto-abbreviation by Linus and Peff is now in 'next'. The tip of 'master' has quite a many topics merged since the last report. Pat is back and hopefully be more active as the git-gui maintainer. Life is good ;-) You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to "master"] * dt/mailinfo (2016-09-26) 1 commit (merged to 'next' on 2016-09-27 at 59e95dbc0e) + add David Turner's Two Sigma address * dt/tree-fsck (2016-09-27) 2 commits (merged to 'next' on 2016-09-28 at afdfdbbf37) + fsck: handle bad trees like other errors + tree-walk: be more specific about corrupt tree errors The codepath in "git fsck" to detect malformed tree objects has been updated not to die but keep going after detecting them. * ik/gitweb-force-highlight (2016-09-25) 2 commits (merged to 'next' on 2016-09-27 at cbb8391a76) + gitweb: use highlight's shebang detection + gitweb: remove unused guess_file_syntax() parameter "gitweb" can spawn "highlight" to show blob contents with (programming) language-specific syntax highlighting, but only when the language is known. "highlight" can however be told to make the guess itself by giving it "--force" option, which has been enabled. * jc/verify-loose-object-header (2016-09-26) 2 commits (merged to 'next' on 2016-09-27 at 2947f95f14) + unpack_sha1_header(): detect malformed object header + streaming: make sure to notice corrupt object Codepaths that read from an on-disk loose object were too loose in validating what they are reading is a proper object file and sometimes read past the data they read from the disk, which has been corrected. H/t to Gustavo Grieco for reporting. * jc/worktree-config (2016-09-27) 1 commit (merged to 'next' on 2016-09-28 at 0c262f6) + worktree: honor configuration variables "git worktree", even though it used the default_abbrev setting that ought to be affected by core.abbrev configuration variable, ignored the variable setting. The command has been taught to read the default set of configuration variables to correct this. * jk/ident-ai-canonname-could-be-null (2016-09-23) 1 commit (merged to 'next' on 2016-09-26 at 0eefb29) + ident: handle NULL ai_canonname In the codepath that comes up with the hostname to be used in an e-mail when the user didn't tell us, we looked at ai_canonname field in struct addrinfo without making sure it is not NULL first. * jk/verify-packfile-gently (2016-09-22) 1 commit (merged to 'next' on 2016-09-26 at f5abba5) + verify_packfile: check pack validity before accessing data A low-level function verify_packfile() was meant to show errors that were detected without dying itself, but under some conditions it didn't and died instead, which has been fixed. * jt/fetch-pack-in-vain-count-with-stateless (2016-09-23) 1 commit (merged to 'next' on 2016-09-26 at 9645629) + fetch-pack: do not reset in_vain on non-novel acks When "git fetch" tries to find where the history of the repository it runs in has diverged from what the other side has, it has a mechanism to avoid digging too deep into irrelevant side branches. This however did not work well over the "smart-http" transport due to a design bug, which has been fixed. * jt/mailinfo-fold-in-body-headers (2016-09-21) 3 commits (merged to 'next' on 2016-09-26 at 4235eb6) + mailinfo: handle in-body header continuations + mailinfo: make is_scissors_line take plain char * + mailinfo: separate in-body header processing When "git format-patch --stdout" output is placed as an in-body header and it uses the RFC2822 header folding, "git am" failed to put the header line back into a single logical line. The underlying "git mailinfo" was taught to handle this properly. * kd/mailinfo-quoted-string (2016-09-28) 2 commits (merged to 'next' on 2016-09-28 at 2aaeb57804) + mailinfo: unescape quoted-pair in header fields + t5100-mailinfo: replace common path prefix with variable An author name, that spelled a backslash-quoted double quote in the human readable part "My \"double quoted\" name", was not unquoted correctly while applying a patch from a piece of e-mail. * mh/diff-indent-heuristic (2016-09-27) 1 commit (merged to 'next' on 2016-09-27 at 3d6fb6605a) + xdiff: rename "struct group" to "struct xdlgroup" Clean-up for a recently graduated topic. * nd/init-core-worktree-in-multi-worktree-world (2016-09-25) 5 commits (merged to 'next' on 2016-09-27 at
[ANNOUNCE] Git v2.10.1
The latest maintenance release Git v2.10.1 is now available at the usual places. The tarballs are found at: https://www.kernel.org/pub/software/scm/git/ The following public repositories all have a copy of the 'v2.10.1' tag and the 'maint' branch that the tag points at: url = https://kernel.googlesource.com/pub/scm/git/git url = git://repo.or.cz/alt-git.git url = git://git.sourceforge.jp/gitroot/git-core/git.git url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core url = https://github.com/gitster/git Git v2.10.1 Release Notes = Fixes since v2.10 - * Clarify various ways to specify the "revision ranges" in the documentation. * "diff-highlight" script (in contrib/) learned to work better with "git log -p --graph" output. * The test framework left the number of tests and success/failure count in the t/test-results directory, keyed by the name of the test script plus the process ID. The latter however turned out not to serve any useful purpose. The process ID part of the filename has been removed. * Having a submodule whose ".git" repository is somehow corrupt caused a few commands that recurse into submodules loop forever. * "git symbolic-ref -d HEAD" happily removes the symbolic ref, but the resulting repository becomes an invalid one. Teach the command to forbid removal of HEAD. * A test spawned a short-lived background process, which sometimes prevented the test directory from getting removed at the end of the script on some platforms. * Update a few tests that used to use GIT_CURL_VERBOSE to use the newer GIT_TRACE_CURL. * Update Japanese translation for "git-gui". * "git fetch http::/site/path" did not die correctly and segfaulted instead. * "git commit-tree" stopped reading commit.gpgsign configuration variable that was meant for Porcelain "git commit" in Git 2.9; we forgot to update "git gui" to look at the configuration to match this change. * "git log --cherry-pick" used to include merge commits as candidates to be matched up with other commits, resulting a lot of wasted time. The patch-id generation logic has been updated to ignore merges to avoid the wastage. * The http transport (with curl-multi option, which is the default these days) failed to remove curl-easy handle from a curlm session, which led to unnecessary API failures. * "git diff -W" output needs to extend the context backward to include the header line of the current function and also forward to include the body of the entire current function up to the header line of the next one. This process may have to merge to adjacent hunks, but the code forgot to do so in some cases. * Performance tests done via "t/perf" did not use the same set of build configuration if the user relied on autoconf generated configuration. * "git format-patch --base=..." feature that was recently added showed the base commit information after "-- " e-mail signature line, which turned out to be inconvenient. The base information has been moved above the signature line. * Even when "git pull --rebase=preserve" (and the underlying "git rebase --preserve") can complete without creating any new commit (i.e. fast-forwards), it still insisted on having a usable ident information (read: user.email is set correctly), which was less than nice. As the underlying commands used inside "git rebase" would fail with a more meaningful error message and advice text when the bogus ident matters, this extra check was removed. * "git gc --aggressive" used to limit the delta-chain length to 250, which is way too deep for gaining additional space savings and is detrimental for runtime performance. The limit has been reduced to 50. * Documentation for individual configuration variables to control use of color (like `color.grep`) said that their default value is 'false', instead of saying their default is taken from `color.ui`. When we updated the default value for color.ui from 'false' to 'auto' quite a while ago, all of them broke. This has been corrected. * A shell script example in check-ref-format documentation has been fixed. * "git checkout " does not follow the usual disambiguation rules when the can be both a rev and a path, to allow checking out a branch 'foo' in a project that happens to have a file 'foo' in the working tree without having to disambiguate. This was poorly documented and the check was incorrect when the command was run from a subdirectory. * Some codepaths in "git diff" used regexec(3) on a buffer that was mmap(2)ed, which may not have a terminating NUL, leading to a read beyond the end of the mapped region. This was fixed by introducing a regexec_buf() helper that takes apair with REG_STARTEND
Re: [PATCH 3/3] abbrev: auto size the default abbreviation
On Fri, Sep 30, 2016 at 05:19:37PM -0700, Junio C Hamano wrote: > Introduce a mechanism, where we estimate the number of objects in > the repository upon the first request to abbreviate an object name > with the default setting and come up with a sane default for the > repository. Based on the expectation that we would see collision in > a repository with 2^(2N) objects when using object names shortened > to first N bits, use sufficient number of hexdigits to cover the > number of objects in the repository. Each hexdigit (4-bits) we add > to the shortened name allows us to have four times (2-bits) as many > objects in the repository. > > --- > cache.h | 1 + > environment.c | 2 +- > sha1_name.c | 28 +++- > 3 files changed, 29 insertions(+), 2 deletions(-) For reference, here's a working version that just uses a separate counting function (no commit message, because I would just steal the one from Linus ;) ). --- cache.h | 6 ++ environment.c | 2 +- sha1_file.c | 27 +++ sha1_name.c | 20 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/cache.h b/cache.h index 5a651b8..f22ace5 100644 --- a/cache.h +++ b/cache.h @@ -1455,6 +1455,12 @@ extern void prepare_packed_git(void); extern void reprepare_packed_git(void); extern void install_packed_git(struct packed_git *pack); +/* + * Give a rough count of objects in the repository. This sacrifices accuracy + * for speed. + */ +unsigned long approximate_object_count(void); + extern struct packed_git *find_sha1_pack(const unsigned char *sha1, struct packed_git *packs); diff --git a/environment.c b/environment.c index 44fb107..6f9d290 100644 --- a/environment.c +++ b/environment.c @@ -16,7 +16,7 @@ int trust_executable_bit = 1; int trust_ctime = 1; int check_stat = 1; int has_symlinks = 1; -int minimum_abbrev = 4, default_abbrev = FALLBACK_DEFAULT_ABBREV; +int minimum_abbrev = 4, default_abbrev = -1; int ignore_case; int assume_unchanged; int prefer_symlink_refs; diff --git a/sha1_file.c b/sha1_file.c index b9c1fa3..4882440 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1381,6 +1381,32 @@ static void prepare_packed_git_one(char *objdir, int local) strbuf_release(); } +static int approximate_object_count_valid; + +/* + * Give a fast, rough count of the number of objects in the repository. This + * ignores loose objects completely. If you have a lot of them, then either + * you should repack because your performance will be awful, or they are + * all unreachable objects about to be pruned, in which case they're not really + * interesting as a measure of repo size in the first place. + */ +unsigned long approximate_object_count(void) +{ + static unsigned long count; + if (!approximate_object_count_valid) { + struct packed_git *p; + + prepare_packed_git(); + count = 0; + for (p = packed_git; p; p = p->next) { + if (open_pack_index(p)) + continue; + count += p->num_objects; + } + } + return count; +} + static void *get_next_packed_git(const void *p) { return ((const struct packed_git *)p)->next; @@ -1455,6 +1481,7 @@ void prepare_packed_git(void) void reprepare_packed_git(void) { + approximate_object_count_valid = 0; prepare_packed_git_run_once = 0; prepare_packed_git(); } diff --git a/sha1_name.c b/sha1_name.c index 3b647fd..ecc4b54 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -455,10 +455,30 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data) return ret; } +/* + * Return the slot of the most-significant bit set in "val". There are various + * ways to do this quickly with fls() or __builtin_clzl(), but speed is + * probably not a big deal here. + */ +unsigned msb(unsigned long val) +{ + unsigned r = 0; + while (val >>= 1) + r++; + return r; +} + int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len) { int status, exists; + if (len < 0) { + unsigned long count = approximate_object_count(); + len = (msb(count) + 1) / 2; + if (len < 0) + len = FALLBACK_DEFAULT_ABBREV; + } + sha1_to_hex_r(hex, sha1); if (len == 40 || !len) return 40; -- 2.10.0.618.g82cc264
RE: [PATCH] http: http.emptyauth should allow empty (not just NULL) usernames
> -Original Message- > From: Jeff King [mailto:p...@peff.net] > Sent: Monday, October 03, 2016 5:59 PM > To: David Turner > Cc: git@vger.kernel.org; sand...@crustytoothpaste.net > Subject: Re: [PATCH] http: http.emptyauth should allow empty (not just > NULL) usernames > > On Mon, Oct 03, 2016 at 09:54:19PM +, David Turner wrote: > > > > I dunno. The code path you are changing _only_ affects anything if > > > the http.emptyauth config is set. But I guess I just don't > > > understand why you would say "http://@gitserver; in the first place. > Is that a common thing? > > > > I have no idea if it is common. I know that we do it. > > I guess my question is: _why_ do you do it? Or more specifically, does > http://gitserver.example.com; with http.emptyauth not work, and why? > > From your response, I _think_ the answer is "no, it doesn't, and I have no > clue why". That was true historically. I just tried our old version of git 2.8 (that is, before this patch, and before the libcurl upgrade), and http://gitserver.example.com *does* seem to work with http.emptyauth (and does not work without). However, http://@gitserver.example.com does *not* work with http.emptyauth, and *does* work without. After the libcurl upgrade, but before this patch, http://@gitserver.example.com does *not* work with http.emptyauth, while http://gitserver.example.com does. And finally, after the upgrade and with this patch, both urls work. > So I dunno. It is annoying not to know what is actually going on, but I'm > OK with it if we don't think there's a high chance of regressing any other > workflows (which I guess not, because http.emptyauth seems to be a > Kerberos-specific hack in the first place). Yes, I think this is all Kerberos-only.
Re: What's cooking in git.git (Sep 2016, #08; Tue, 27)
On Mon, Oct 3, 2016 at 2:56 PM, Junio C Hamanowrote: > Stefan Beller writes: > >> // Note: git_attr_check_elem seems to be useless now, as the >> // results are not stored in there, we only make use of the `attr` key. > > I do not think git_attr_check_elem would be visible to the callers, > once we split the "inquiry" and "result" like the code illustrated > above. I actually doubt that the type would even internally need to > survive such a rewrite. So how would we go about git_all_attrs then? I think those (only builtin/check-attr.c as well as Documentation) would need to read these names off of git_attr_check_elem. So instead we could do a int git_all_attrs(const char *path, char *result_keys[], char *result_values[], int nr, int alloc) Internally git_all_attrs would use nr/alloc to resize result_{key,value} to an appropriate size and then fill it with keys/values. Although I do not think check-attr needs to be fast as it is a debugging tool rather than a daily used tool, but it would fit into the current line of thinking? > > The point of "future-proofing" the callers is to hide such > implementation details from them. We know that the current API will > need to be updated at least once to prepare the implementation of > the API so that it has some chance of becoming thread-safe, and I > think we know enough how the updated API should look like to the > callers. I don't think we have the same idea how it should look like, as e.g. it is unclear what we do with the `const struct git_attr` in the git_attr_check to me. > I was hoping the minimum future-proofing would allow us to > update the current "attr" API users only once, without having to > update them again when we make it ready to be used in threaded > environment. Ok, so let's first define how the future proofed API should look like and then we can go towards it.
Re: [RFC/PATCH 0/2] place cherry pick line below commit title
Jonathan Tanwrites: > There are other options like checking for indentation or checking for > balanced parentheses/brackets, but I think that these would lead to > surprising behavior for the user (this would mean that whitespace or > certain characters could turn a valid trailer into an invalid one or > vice versa, or change the behavior of trailer.ifexists, especially > "replace"). Yes, that is exactly why I said that it may be necessary for the code to analize the lines in a block identified as "likely to be a trailing block" more carefully. We can afford to be loose as long as the only allowed operation is to append one at the end, but once we start removing/replacing an existing entry, etc., the definition of what an entry is becomes very much relevant.
Re: Slow pushes on 'pu' - even when up-to-date..
Stefan Bellerwrites: > On Mon, Oct 3, 2016 at 2:17 PM, Stefan Beller wrote: >> >> * sb/push-make-submodule-check-the-default (2016-08-24) 1 commit >> - push: change submodule default to check >> >> Turn the default of "push.recurseSubmodules" to "check". >> >> Will hold to wait for hv/submodule-not-yet-pushed-fix >> >> This reveals that the "check" mode is too inefficient to use in >> real projects, even in ones as small as git itself. >> cf. > > So maybe we should eject this series from pu as long as > hv/submodule-not-yet-pushed-fix is ejected to enable you > running pu happily. I am planning to merge lt/abbrev-auto to 'next' together with Peff's ambiguous-short-object-names series in today's pushout. Just FYI, there is another integration branch called 'jch' that typically has several topics more than 'next' but does not merge things I haven't looked at (or things I have looked at and decided not ready). That is what I use for my daily work. You can grab it out of "git log --oneline --first-parent master..pu", or from my broken-out repository (git://github.com/gitster/git/).
Re: [PATCH 1/3] add QSORT
Am 03.10.2016 um 19:09 schrieb Kevin Bracey: As such, NULL checks can still be elided even with your change. If you effectively change your example to: if (nmemb > 1) qsort(array, nmemb, size, cmp); if (!array) printf("array is NULL\n"); array may only be checked for NULL if nmemb <= 1. You can see GCC doing that in the compiler explorer - it effectively turns that into "else if". We don't support array == NULL together with nmemb > 1, so a segfault is to be expected in such cases, and thus NULL checks can be removed safely. To make that check really work, you have to do: if (array) qsort(array, nmemb, size, cmp); else printf("array is NULL\n"); So maybe your "sane_qsort" should be checking array, not nmemb. It would be safe, but arguably too much so, because non-empty arrays with NULL wouldn't segfault anymore, and thus become harder to identify as the programming errors they are. The intention is to support NULL pointers only for empty arrays (in addition to valid pointers). That we also support NULL pointers for arrays with a single member might be considered to be the result of a premature optimization, but it should be safe -- the compiler won't remove checks unexpectedly. Does that make sense (it's getting late here, so my logic might already be resting..)? René
Re: [PATCH] http: http.emptyauth should allow empty (not just NULL) usernames
On Mon, Oct 03, 2016 at 09:54:19PM +, David Turner wrote: > > I dunno. The code path you are changing _only_ affects anything if the > > http.emptyauth config is set. But I guess I just don't understand why you > > would say "http://@gitserver; in the first place. Is that a common thing? > > I have no idea if it is common. I know that we do it. I guess my question is: _why_ do you do it? Or more specifically, does http://gitserver.example.com; with http.emptyauth not work, and why? >From your response, I _think_ the answer is "no, it doesn't, and I have no clue why". So I dunno. It is annoying not to know what is actually going on, but I'm OK with it if we don't think there's a high chance of regressing any other workflows (which I guess not, because http.emptyauth seems to be a Kerberos-specific hack in the first place). -Peff
Re: What's cooking in git.git (Sep 2016, #08; Tue, 27)
Stefan Bellerwrites: > // Note: git_attr_check_elem seems to be useless now, as the > // results are not stored in there, we only make use of the `attr` key. I do not think git_attr_check_elem would be visible to the callers, once we split the "inquiry" and "result" like the code illustrated above. I actually doubt that the type would even internally need to survive such a rewrite. The point of "future-proofing" the callers is to hide such implementation details from them. We know that the current API will need to be updated at least once to prepare the implementation of the API so that it has some chance of becoming thread-safe, and I think we know enough how the updated API should look like to the callers. I was hoping the minimum future-proofing would allow us to update the current "attr" API users only once, without having to update them again when we make it ready to be used in threaded environment.
RE: [PATCH] http: http.emptyauth should allow empty (not just NULL) usernames
> -Original Message- > From: Jeff King [mailto:p...@peff.net] > Sent: Monday, October 03, 2016 5:01 PM > To: David Turner > Cc: git@vger.kernel.org; sand...@crustytoothpaste.net > Subject: Re: [PATCH] http: http.emptyauth should allow empty (not just > NULL) usernames > > On Mon, Oct 03, 2016 at 01:19:28PM -0400, David Turner wrote: > > > When using kerberos authentication, one URL pattern which is allowed > > is http://@gitserver.example.com. This leads to a username of > > zero-length, rather than a NULL username. But the two cases should be > > treated the same by http.emptyauth. > > > > Signed-off-by: David Turner> > --- > > http.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/http.c b/http.c > > index 82ed542..bd0dba2 100644 > > --- a/http.c > > +++ b/http.c > > @@ -351,7 +351,7 @@ static int http_options(const char *var, const > > char *value, void *cb) > > > > static void init_curl_http_auth(CURL *result) { > > - if (!http_auth.username) { > > + if (!http_auth.username || !*http_auth.username) { > > Hmm. This fixes this caller, but what about other users of the credential > struct? I wonder if the correct fix is in credential_from_url(), which > should avoid writing an empty field. > > OTOH, I can imagine that "http://user:@example.com; would be a way to say > "I have a username and the password is blank" without getting prompted. > Which makes me wonder if it is useful to say "my username is blank" in the > same way. Yes, that was my thought process. > I dunno. The code path you are changing _only_ affects anything if the > http.emptyauth config is set. But I guess I just don't understand why you > would say "http://@gitserver; in the first place. Is that a common thing? > > -Peff I have no idea if it is common. I know that we do it. It used to be that git 2.8/libcurl would handle @gitserver as if the username were blank, but then we upgraded our company's libcurl and it broke (git started prompting for a password). I do not know what the previous version of libcurl was. The reason we have a required-to-be-blank username/password is apparently Kerberos (or something about our particular Kerberos configuration), which I treat as inscrutable black magic.
Re: What's cooking in git.git (Sep 2016, #08; Tue, 27)
On Mon, Oct 3, 2016 at 1:49 PM, Junio C Hamanowrote: > Stefan Beller writes: > >> I am looking at the tip of jc/attr-more and for a minimum >> thread safety we'd want to change the call sites to be aware of the >> threads, i.e. instead of doing >> > static struct git_attr_check *check; >> if (!check) >> check = git_attr_check_initl("crlf", "ident", >> "filter", "eol", "text", >> NULL); >> >> We'd rather call >> >> struct git_attr_check *check; >> check = git_attr_check_lookup_or_initl_threadsafe( >> "crlf", "ident", "filter", "eol", "text", NULL); >> if (!git_check_attr(path, check)) { >> ... > > I actually am hoping that the "static" can be kept so that we can > minimize the cost of interning these symbols into struct git_attr. > > The initialization would thus become something like: > > static struct git_attr_check *check; > git_attr_check_initl(, "crlf", "ident", ..., NULL); > > The structure will have an array of git_attr[], once interned will > be shared and used by everybody. _initl() will need to make sure > that the "check" pointer is updated atomically so that multiple > people racing to initialize this variable will not step on each > others' toes. I see and that mutex to protect the first argument of git_attr_check_initl is unrelated to the actual pointer; we can use a global mutex for that, or rather a static scoped mutex in attr.c, such that all parallel racing git_attr_check_initl try to acquire that init_lock and only one succeeds and that one makes sure to first initialize a git_attr_check completely and then atomically storing the pointer to the location. > > Then the use site would do something like > > const char *result[... some size ...]; ... some size ... depends on the git_attr_check->check_nr and not on the path as I first assumed. So when we still want to go fast here: static struct git_attr_check *check; /* hard code 2 as it has to hold results for crlf and ident only */ const char *results[2]; if (!check) git_attr_check_initl(, "crlf", "ident", NULL); if (nr != check->check_nr) ALLOC_GROW(results, check->check_nr, alloc); git_check_attr(path, check, ); // result[i] contains the result for the i-th element of check // Note: git_attr_check_elem seems to be useless now, as the // results are not stored in there, we only make use of the `attr` key.
Re: [PATCH v2 0/5] receive-pack: quarantine pushed objects
On Mon, Oct 03, 2016 at 02:25:23PM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > Here's a v2; the original was at: > > > > > > http://public-inbox.org/git/20160930193533.ynbepaago6oyc...@sigill.intra.peff.net/ > > > > which contains the rationale. The required alternate-objects patches > > (both the "allow recursive relative" one, and the helper to add an > > internal alt-odb) have been pushed into their own series, that I posted > > here: > > > > > > http://public-inbox.org/git/20161003203321.rj5jepviwo57u...@sigill.intra.peff.net/ > > > > This needs to be applied on top. > > Sorry, you lost me. So this 5-patch series comes on top of a brand > new 18-patch clean-up series? Yes, sorry to be unclear. The final sentence should have been: This v2 quarantine series need to be applied on top of that cleanup series. -Peff
Re: [RFC/PATCH 0/2] place cherry pick line below commit title
On 10/03/2016 12:17 PM, Junio C Hamano wrote: It may be necessary for the code to analize the lines in a block identified as "likely to be a trailing block" more carefully, though. The example 59f0aa94 in the message you are responding to has "Link 1:" that consists of 3 physical lines. An instruction to interpret-trailers to add a new one _after_ "Link-$n:" may have to treat these as a single logical line and do the addition after them, i.e. before "Link 2:" line, for example. I also saw Signed-off-by: Some body(some comment that extends to the next line without being indented) Sined-off-by: Some body Else where the only clue that the second line is logically a part of the first one was the balancing of parentheses (or [brakets]). To accomodate real-world use cases, you may have to take into account a lot more than the strict rfc-822 style "line folding". If we define a logical trailer line as the set of physical lines until the next trailer line...it is true that this has some precedence in that such lines can be written (although this might not be intentional): $ git interpret-trailers --trailer "a=foo bar" 1. Checking for existence (trailer.ifexists) is now conceptually more difficult. For example, handling of inner whitespace in between lines might be confusing (currently, there is only one line, and whitespace handling is clearly described). 2. Replacement (trailer.ifexists=replace) might replace more than expected. 3. There is a corner case - the first line might not be a trailer line. Even if interpret-trailers knew about "(cherry picked from", a user might enter something else there. (We could just declare such blocks as non-trailers, but if we are already loosening the definition of a trailer, this might be something that we should allow.) My initial thought was to think of a trailer as a block of trailer lines possibly interspersed with other lines. This leads to interpret-trailers placing the trailer in the wrong place if trailer.where=after, as you describe, but this might not be a big problem if trailer.where=after is not widely used (and it is not the default). (The other options behave as expected.) There are other options like checking for indentation or checking for balanced parentheses/brackets, but I think that these would lead to surprising behavior for the user (this would mean that whitespace or certain characters could turn a valid trailer into an invalid one or vice versa, or change the behavior of trailer.ifexists, especially "replace").
Re: [PATCH v2 0/5] receive-pack: quarantine pushed objects
Jeff Kingwrites: > Here's a v2; the original was at: > > > http://public-inbox.org/git/20160930193533.ynbepaago6oyc...@sigill.intra.peff.net/ > > which contains the rationale. The required alternate-objects patches > (both the "allow recursive relative" one, and the helper to add an > internal alt-odb) have been pushed into their own series, that I posted > here: > > > http://public-inbox.org/git/20161003203321.rj5jepviwo57u...@sigill.intra.peff.net/ > > This needs to be applied on top. Sorry, you lost me. So this 5-patch series comes on top of a brand new 18-patch clean-up series?
Re: Slow pushes on 'pu' - even when up-to-date..
+cc Heiko On Mon, Oct 3, 2016 at 2:17 PM, Stefan Bellerwrote: > > * sb/push-make-submodule-check-the-default (2016-08-24) 1 commit > - push: change submodule default to check > > Turn the default of "push.recurseSubmodules" to "check". > > Will hold to wait for hv/submodule-not-yet-pushed-fix > > This reveals that the "check" mode is too inefficient to use in > real projects, even in ones as small as git itself. > cf. So maybe we should eject this series from pu as long as hv/submodule-not-yet-pushed-fix is ejected to enable you running pu happily. Thanks, Stefan
Re: Slow pushes on 'pu' - even when up-to-date..
On Mon, Oct 3, 2016 at 2:11 PM, Linus Torvaldswrote: > This seems to be because I'm now on 'pu' as of a day or two ago in > order to test the abbrev logic, but lookie here: > > time git ls-remote ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux > .. shows all the branches and tags .. > real 0m0.655s > user 0m0.011s > sys 0m0.004s > > so the remote is fast to connect to, and with network connection > overhead and everything, it's just over half a second. But then: > > time git push ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux > > and it just sits there, and it's at 100% CPU the whole time, until it says: > > Everything up-to-date > > real 1m7.307s > user 1m2.761s > sys 0m0.475s > > Whaa? It took a *minute* of CPU time to decide that everything was up-to-date? > > That's just not right. The branch is entirely up-to-date: > > git rev-parse HEAD > af79ad2b1f337a00aa150b993635b10bc68dc842 > > git ls-remote ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux > master > af79ad2b1f337a00aa150b993635b10bc68dc842 refs/heads/master > > so there should be no need for any history walking. But it sure is > doing *something*. A minute of CPU time on my machine is actually a > pretty damn big deal. > > Looking at the trace, there's no IO - there's no back-and-forth about > "I have this, do you have it?" or anything like that. The system call > trace is just a lot of allocations, which I think means that "git > push" is walking a lot of objects but not doing anything useful. > > I bisected it to commit 60cd66f "push: change submodule default to > check", which makes little sense since I have no submodules, but there > you go.. Apparently RECURSE_SUBMODULES_CHECK is just terminally > broken. Sorry for breaking you, too. Junio complained about that when I was proposing the topic; but now the strategy seems to just wait until Heiko fixed the RECURSE_SUBMODULES_CHECK to be less broken. * sb/push-make-submodule-check-the-default (2016-08-24) 1 commit - push: change submodule default to check Turn the default of "push.recurseSubmodules" to "check". Will hold to wait for hv/submodule-not-yet-pushed-fix This reveals that the "check" mode is too inefficient to use in real projects, even in ones as small as git itself. cf. Which itself is * hv/submodule-not-yet-pushed-fix (2016-09-14) 2 commits - serialize collection of refs that contain submodule changes - serialize collection of changed submodules The code in "git push" to compute if any commit being pushed in the superproject binds a commit in a submodule that hasn't been pushed out was overly inefficient, making it unusable even for a small project that does not have any submodule but have a reasonable number of refs. The last two in the original series seem to break a few tests when queued to 'pu', and dropped for now. Waiting for a reroll.
Slow pushes on 'pu' - even when up-to-date..
This seems to be because I'm now on 'pu' as of a day or two ago in order to test the abbrev logic, but lookie here: time git ls-remote ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux .. shows all the branches and tags .. real 0m0.655s user 0m0.011s sys 0m0.004s so the remote is fast to connect to, and with network connection overhead and everything, it's just over half a second. But then: time git push ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux and it just sits there, and it's at 100% CPU the whole time, until it says: Everything up-to-date real 1m7.307s user 1m2.761s sys 0m0.475s Whaa? It took a *minute* of CPU time to decide that everything was up-to-date? That's just not right. The branch is entirely up-to-date: git rev-parse HEAD af79ad2b1f337a00aa150b993635b10bc68dc842 git ls-remote ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux master af79ad2b1f337a00aa150b993635b10bc68dc842 refs/heads/master so there should be no need for any history walking. But it sure is doing *something*. A minute of CPU time on my machine is actually a pretty damn big deal. Looking at the trace, there's no IO - there's no back-and-forth about "I have this, do you have it?" or anything like that. The system call trace is just a lot of allocations, which I think means that "git push" is walking a lot of objects but not doing anything useful. I bisected it to commit 60cd66f "push: change submodule default to check", which makes little sense since I have no submodules, but there you go.. Apparently RECURSE_SUBMODULES_CHECK is just terminally broken. Linus
Re: [PATCH] http: http.emptyauth should allow empty (not just NULL) usernames
On Mon, Oct 03, 2016 at 01:19:28PM -0400, David Turner wrote: > When using kerberos authentication, one URL pattern which is > allowed is http://@gitserver.example.com. This leads to a username > of zero-length, rather than a NULL username. But the two cases > should be treated the same by http.emptyauth. > > Signed-off-by: David Turner> --- > http.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/http.c b/http.c > index 82ed542..bd0dba2 100644 > --- a/http.c > +++ b/http.c > @@ -351,7 +351,7 @@ static int http_options(const char *var, const char > *value, void *cb) > > static void init_curl_http_auth(CURL *result) > { > - if (!http_auth.username) { > + if (!http_auth.username || !*http_auth.username) { Hmm. This fixes this caller, but what about other users of the credential struct? I wonder if the correct fix is in credential_from_url(), which should avoid writing an empty field. OTOH, I can imagine that "http://user:@example.com; would be a way to say "I have a username and the password is blank" without getting prompted. Which makes me wonder if it is useful to say "my username is blank" in the same way. I dunno. The code path you are changing _only_ affects anything if the http.emptyauth config is set. But I guess I just don't understand why you would say "http://@gitserver; in the first place. Is that a common thing? -Peff
Re: [PATCH v3 0/5] Add --format to tag verification
Santiago Torreswrites: > Hi, Junio. >> I however notice that there is no new tests to protect these two new >> features from future breakages. Perhaps you want to add some in >> [6/5]? > > I'll be working on this. I spent some time looking around for example > tests for format. Are there any that I should pay special attention to? > (I'm looking at t7004 mostly right now). By the way, running t7030 seems to reveal a segfault introduced by this 5-patch series.
Re: What's cooking in git.git (Sep 2016, #08; Tue, 27)
Stefan Bellerwrites: > I am looking at the tip of jc/attr-more and for a minimum > thread safety we'd want to change the call sites to be aware of the > threads, i.e. instead of doing > static struct git_attr_check *check; > if (!check) > check = git_attr_check_initl("crlf", "ident", > "filter", "eol", "text", > NULL); > > We'd rather call > > struct git_attr_check *check; > check = git_attr_check_lookup_or_initl_threadsafe( > "crlf", "ident", "filter", "eol", "text", NULL); > if (!git_check_attr(path, check)) { > ... I actually am hoping that the "static" can be kept so that we can minimize the cost of interning these symbols into struct git_attr. The initialization would thus become something like: static struct git_attr_check *check; git_attr_check_initl(, "crlf", "ident", ..., NULL); The structure will have an array of git_attr[], once interned will be shared and used by everybody. _initl() will need to make sure that the "check" pointer is updated atomically so that multiple people racing to initialize this variable will not step on each others' toes. Then the use site would do something like const char *result[... some size ...]; git_check_attr(path, check, result[]); to receive result regardless of anybody else who is using the same "check" structure to ask the same question (for different paths).
[PATCH v2 3/5] receive-pack: quarantine objects until pre-receive accepts
When a client pushes objects to us, index-pack checks the objects themselves and then installs them into place. If we then reject the push due to a pre-receive hook, we cannot just delete the packfile; other processes may be depending on it. We have to do a normal reachability check at this point via `git gc`. But such objects may hang around for weeks due to the gc.pruneExpire grace period. And worse, during that time they may be exploded from the pack into inefficient loose objects. Instead, this patch teaches receive-pack to put the new objects into a "quarantine" temporary directory. We make these objects available to the connectivity check and to the pre-receive hook, and then install them into place only if it is successful (and otherwise remove them as tempfiles). Signed-off-by: Jeff King--- builtin/receive-pack.c | 41 - t/t5547-push-quarantine.sh | 36 2 files changed, 76 insertions(+), 1 deletion(-) create mode 100755 t/t5547-push-quarantine.sh diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 896b16f..267d320 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -20,6 +20,7 @@ #include "gpg-interface.h" #include "sigchain.h" #include "fsck.h" +#include "tmp-objdir.h" static const char * const receive_pack_usage[] = { N_("git receive-pack "), @@ -86,6 +87,8 @@ static enum { } use_keepalive; static int keepalive_in_sec = 5; +static struct tmp_objdir *tmp_objdir; + static enum deny_action parse_deny_action(const char *var, const char *value) { if (value) { @@ -663,6 +666,9 @@ static int run_and_feed_hook(const char *hook_name, feed_fn feed, } else argv_array_pushf(_array, "GIT_PUSH_OPTION_COUNT"); + if (tmp_objdir) + argv_array_pushv(_array, tmp_objdir_env(tmp_objdir)); + if (use_sideband) { memset(, 0, sizeof(muxer)); muxer.proc = copy_to_sideband; @@ -762,6 +768,7 @@ static int run_update_hook(struct command *cmd) proc.stdout_to_stderr = 1; proc.err = use_sideband ? -1 : 0; proc.argv = argv; + proc.env = tmp_objdir_env(tmp_objdir); code = start_command(); if (code) @@ -833,6 +840,7 @@ static int update_shallow_ref(struct command *cmd, struct shallow_info *si) !delayed_reachability_test(si, i)) sha1_array_append(, si->shallow->sha1[i]); + opt.env = tmp_objdir_env(tmp_objdir); setup_alternate_shallow(_lock, _file, ); if (check_connected(command_singleton_iterator, cmd, )) { rollback_lock_file(_lock); @@ -1240,12 +1248,17 @@ static void set_connectivity_errors(struct command *commands, for (cmd = commands; cmd; cmd = cmd->next) { struct command *singleton = cmd; + struct check_connected_options opt = CHECK_CONNECTED_INIT; + if (shallow_update && si->shallow_ref[cmd->index]) /* to be checked in update_shallow_ref() */ continue; + + opt.env = tmp_objdir_env(tmp_objdir); if (!check_connected(command_singleton_iterator, , -NULL)) +)) continue; + cmd->error_string = "missing necessary objects"; } } @@ -1428,6 +1441,7 @@ static void execute_commands(struct command *commands, data.si = si; opt.err_fd = err_fd; opt.progress = err_fd && !quiet; + opt.env = tmp_objdir_env(tmp_objdir); if (check_connected(iterate_receive_command_list, , )) set_connectivity_errors(commands, si); @@ -1444,6 +1458,19 @@ static void execute_commands(struct command *commands, return; } + /* +* Now we'll start writing out refs, which means the objects need +* to be in their final positions so that other processes can see them. +*/ + if (tmp_objdir_migrate(tmp_objdir) < 0) { + for (cmd = commands; cmd; cmd = cmd->next) { + if (!cmd->error_string) + cmd->error_string = "unable to migrate objects to permanent storage"; + } + return; + } + tmp_objdir = NULL; + check_aliased_updates(commands); free(head_name_to_free); @@ -1639,6 +1666,18 @@ static const char *unpack(int err_fd, struct shallow_info *si) argv_array_push(, alt_shallow_file); } + tmp_objdir = tmp_objdir_create(); + if (!tmp_objdir) + return "unable to create temporary object directory"; + child.env = tmp_objdir_env(tmp_objdir); + + /* +* Normally we just pass the tmp_objdir environment to the child +* processes that do the heavy
[PATCH v2 2/5] tmp-objdir: introduce API for temporary object directories
Once objects are added to the object database by a process, they cannot easily be deleted, as we don't know what other processes may have started referencing them. We have to clean them up with git-gc, which will apply the usual reachability and grace-period checks. This patch provides an alternative: it helps callers create a temporary directory inside the object directory, and a temporary environment which can be passed to sub-programs to ask them to write there (the original object directory remains accessible as an alternate of the temporary one). See tmp-objdir.h for details on the API. Signed-off-by: Jeff King--- Makefile | 1 + tmp-objdir.c | 273 +++ tmp-objdir.h | 54 3 files changed, 328 insertions(+) create mode 100644 tmp-objdir.c create mode 100644 tmp-objdir.h diff --git a/Makefile b/Makefile index 1aad150..4e3becb 100644 --- a/Makefile +++ b/Makefile @@ -831,6 +831,7 @@ LIB_OBJS += submodule-config.o LIB_OBJS += symlinks.o LIB_OBJS += tag.o LIB_OBJS += tempfile.o +LIB_OBJS += tmp-objdir.o LIB_OBJS += trace.o LIB_OBJS += trailer.o LIB_OBJS += transport.o diff --git a/tmp-objdir.c b/tmp-objdir.c new file mode 100644 index 000..9443868 --- /dev/null +++ b/tmp-objdir.c @@ -0,0 +1,273 @@ +#include "cache.h" +#include "tmp-objdir.h" +#include "dir.h" +#include "sigchain.h" +#include "string-list.h" +#include "strbuf.h" +#include "argv-array.h" + +struct tmp_objdir { + struct strbuf path; + struct argv_array env; +}; + +/* + * Allow only one tmp_objdir at a time in a running process, which simplifies + * our signal/atexit cleanup routines. It's doubtful callers will ever need + * more than one, and we can expand later if so. You can have many such + * tmp_objdirs simultaneously in many processes, of course. + */ +static struct tmp_objdir *the_tmp_objdir; + +static void tmp_objdir_free(struct tmp_objdir *t) +{ + strbuf_release(>path); + argv_array_clear(>env); + free(t); +} + +static int tmp_objdir_destroy_1(struct tmp_objdir *t, int on_signal) +{ + int err; + + if (!t) + return 0; + + if (t == the_tmp_objdir) + the_tmp_objdir = NULL; + + /* +* This may use malloc via strbuf_grow(), but we should +* have pre-grown t->path sufficiently so that this +* doesn't happen in practice. +*/ + err = remove_dir_recursively(>path, 0); + + /* +* When we are cleaning up due to a signal, we won't bother +* freeing memory; it may cause a deadlock if the signal +* arrived while libc's allocator lock is held. +*/ + if (!on_signal) + tmp_objdir_free(t); + return err; +} + +int tmp_objdir_destroy(struct tmp_objdir *t) +{ + return tmp_objdir_destroy_1(t, 0); +} + +static void remove_tmp_objdir(void) +{ + tmp_objdir_destroy(the_tmp_objdir); +} + +static void remove_tmp_objdir_on_signal(int signo) +{ + tmp_objdir_destroy_1(the_tmp_objdir, 1); + sigchain_pop(signo); + raise(signo); +} + +/* + * These env_* functions are for setting up the child environment; the + * "replace" variant overrides the value of any existing variable with that + * "key". The "append" variant puts our new value at the end of a list, + * separated by PATH_SEP (which is what separate values in + * GIT_ALTERNATE_OBJECT_DIRECTORIES). + */ +static void env_append(struct argv_array *env, const char *key, const char *val) +{ + const char *old = getenv(key); + + if (!old) + argv_array_pushf(env, "%s=%s", key, val); + else + argv_array_pushf(env, "%s=%s%c%s", key, old, PATH_SEP, val); +} + +static void env_replace(struct argv_array *env, const char *key, const char *val) +{ + argv_array_pushf(env, "%s=%s", key, val); +} + +static int setup_tmp_objdir(const char *root) +{ + char *path; + int ret = 0; + + path = xstrfmt("%s/pack", root); + ret = mkdir(path, 0777); + free(path); + + return ret; +} + +struct tmp_objdir *tmp_objdir_create(void) +{ + static int installed_handlers; + struct tmp_objdir *t; + + if (the_tmp_objdir) + die("BUG: only one tmp_objdir can be used at a time"); + + t = xmalloc(sizeof(*t)); + strbuf_init(>path, 0); + argv_array_init(>env); + + strbuf_addf(>path, "%s/incoming-XX", get_object_directory()); + + /* +* Grow the strbuf beyond any filename we expect to be placed in it. +* If tmp_objdir_destroy() is called by a signal handler, then +* we should be able to use the strbuf to remove files without +* having to call malloc. +*/ + strbuf_grow(>path, 1024); + + if (!mkdtemp(t->path.buf)) { + /* free, not destroy, as we never touched the filesystem */ + tmp_objdir_free(t); +
[PATCH v2 1/5] check_connected: accept an env argument
This lets callers influence the environment seen by rev-list, which will be useful when we start providing quarantined objects. Signed-off-by: Jeff King--- connected.c | 1 + connected.h | 5 + 2 files changed, 6 insertions(+) diff --git a/connected.c b/connected.c index 8e3e4b1..136c2ac 100644 --- a/connected.c +++ b/connected.c @@ -63,6 +63,7 @@ int check_connected(sha1_iterate_fn fn, void *cb_data, _("Checking connectivity")); rev_list.git_cmd = 1; + rev_list.env = opt->env; rev_list.in = -1; rev_list.no_stdout = 1; if (opt->err_fd) diff --git a/connected.h b/connected.h index afa48cc..4ca325f 100644 --- a/connected.h +++ b/connected.h @@ -33,6 +33,11 @@ struct check_connected_options { /* If non-zero, show progress as we traverse the objects. */ int progress; + + /* +* Insert these variables into the environment of the child process. +*/ + const char **env; }; #define CHECK_CONNECTED_INIT { 0 } -- 2.10.0.618.g82cc264
[PATCH v2 5/5] tmp-objdir: do not migrate files starting with '.'
This avoids "." and "..", as we already do, but also leaves room for index-pack to store extra data in the quarantine area (e.g., for passing back any analysis to be read by the pre-receive hook). Signed-off-by: Jeff King--- tmp-objdir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tmp-objdir.c b/tmp-objdir.c index 780af8e..64435f2 100644 --- a/tmp-objdir.c +++ b/tmp-objdir.c @@ -188,7 +188,7 @@ static int read_dir_paths(struct string_list *out, const char *path) return -1; while ((de = readdir(dh))) - if (!is_dot_or_dotdot(de->d_name)) + if (de->d_name[0] != '.') string_list_append(out, de->d_name); closedir(dh); -- 2.10.0.618.g82cc264
[PATCH v2 4/5] tmp-objdir: put quarantine information in the environment
The presence of the GIT_QUARANTINE_PATH variable lets any called programs know that they're operating in a temporary object directory (and where that directory is). Signed-off-by: Jeff King--- cache.h | 1 + tmp-objdir.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/cache.h b/cache.h index 9866e46..fcab658 100644 --- a/cache.h +++ b/cache.h @@ -433,6 +433,7 @@ static inline enum object_type object_type(unsigned int mode) #define GIT_GLOB_PATHSPECS_ENVIRONMENT "GIT_GLOB_PATHSPECS" #define GIT_NOGLOB_PATHSPECS_ENVIRONMENT "GIT_NOGLOB_PATHSPECS" #define GIT_ICASE_PATHSPECS_ENVIRONMENT "GIT_ICASE_PATHSPECS" +#define GIT_QUARANTINE_ENVIRONMENT "GIT_QUARANTINE_PATH" /* * This environment variable is expected to contain a boolean indicating diff --git a/tmp-objdir.c b/tmp-objdir.c index 9443868..780af8e 100644 --- a/tmp-objdir.c +++ b/tmp-objdir.c @@ -147,6 +147,8 @@ struct tmp_objdir *tmp_objdir_create(void) env_append(>env, ALTERNATE_DB_ENVIRONMENT, absolute_path(get_object_directory())); env_replace(>env, DB_ENVIRONMENT, absolute_path(t->path.buf)); + env_replace(>env, GIT_QUARANTINE_ENVIRONMENT, + absolute_path(t->path.buf)); return t; } -- 2.10.0.618.g82cc264
[PATCH v2 0/5] receive-pack: quarantine pushed objects
Here's a v2; the original was at: http://public-inbox.org/git/20160930193533.ynbepaago6oyc...@sigill.intra.peff.net/ which contains the rationale. The required alternate-objects patches (both the "allow recursive relative" one, and the helper to add an internal alt-odb) have been pushed into their own series, that I posted here: http://public-inbox.org/git/20161003203321.rj5jepviwo57u...@sigill.intra.peff.net/ This needs to be applied on top. Other than that, v2 just fixes the minor issues raised on the list (missing "static", and some clarifying comments). [1/5]: check_connected: accept an env argument [2/5]: tmp-objdir: introduce API for temporary object directories [3/5]: receive-pack: quarantine objects until pre-receive accepts [4/5]: tmp-objdir: put quarantine information in the environment [5/5]: tmp-objdir: do not migrate files starting with '.' Makefile | 1 + builtin/receive-pack.c | 41 ++- cache.h| 1 + connected.c| 1 + connected.h| 5 + t/t5547-push-quarantine.sh | 36 ++ tmp-objdir.c | 275 + tmp-objdir.h | 54 + 8 files changed, 413 insertions(+), 1 deletion(-) create mode 100755 t/t5547-push-quarantine.sh create mode 100644 tmp-objdir.c create mode 100644 tmp-objdir.h
[PATCH 18/18] alternates: use fspathcmp to detect duplicates
On a case-insensitive filesystem, we should realize that "a/objects" and "A/objects" are the same path. We already use fspathcmp() to check against the main object directory, but until recently we couldn't use it for comparing against other alternates (because their paths were not NUL-terminated strings). But now we can, so let's do so. Note that we also need to adjust count-objects to load the config, so that it can see the setting of core.ignorecase (this is required by the test, but is also a general bugfix for users of count-objects). Signed-off-by: Jeff King--- builtin/count-objects.c | 2 ++ sha1_file.c | 2 +- t/t5613-info-alternate.sh | 17 + 3 files changed, 20 insertions(+), 1 deletion(-) diff --git a/builtin/count-objects.c b/builtin/count-objects.c index a700409..a04b4f2 100644 --- a/builtin/count-objects.c +++ b/builtin/count-objects.c @@ -97,6 +97,8 @@ int cmd_count_objects(int argc, const char **argv, const char *prefix) OPT_END(), }; + git_config(git_default_config, NULL); + argc = parse_options(argc, argv, prefix, opts, count_objects_usage, 0); /* we do not take arguments other than flags for now */ if (argc) diff --git a/sha1_file.c b/sha1_file.c index b514167..b05ec9c 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -260,7 +260,7 @@ static int alt_odb_usable(struct strbuf *path, const char *normalized_objdir) * thing twice, or object directory itself. */ for (alt = alt_odb_list; alt; alt = alt->next) { - if (!strcmp(path->buf, alt->path)) + if (!fspathcmp(path->buf, alt->path)) return 0; } if (!fspathcmp(path->buf, normalized_objdir)) diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh index 76525a0..926fe14 100755 --- a/t/t5613-info-alternate.sh +++ b/t/t5613-info-alternate.sh @@ -116,4 +116,21 @@ test_expect_success 'relative duplicates are eliminated' ' test_cmp expect actual.alternates ' +test_expect_success CASE_INSENSITIVE_FS 'dup finding can be case-insensitive' ' + git init --bare insensitive.git && + # the previous entry for "A" will have used uppercase + cat >insensitive.git/objects/info/alternates <<-\EOF && + ../../C/.git/objects + ../../a/.git/objects + EOF + cat >expect <<-EOF && + alternate: $(pwd)/C/.git/objects + alternate: $(pwd)/B/.git/objects + alternate: $(pwd)/A/.git/objects + EOF + git -C insensitive.git count-objects -v >actual && + grep ^alternate: actual >actual.alternates && + test_cmp expect actual.alternates +' + test_done -- 2.10.0.618.g82cc264
[PATCH 17/18] sha1_file: always allow relative paths to alternates
We recursively expand alternates repositories, so that if A borrows from B which borrows from C, A can see all objects. For the root object database, we allow relative paths, so A can point to B as "../B/objects". However, we currently do not allow relative paths when recursing, so B must use an absolute path to reach C. That is an ancient protection from c2f493a (Transitively read alternatives, 2006-05-07) that tries to avoid adding the same alternate through two different paths. Since 5bdf0a8 (sha1_file: normalize alt_odb path before comparing and storing, 2011-09-07), we use a normalized absolute path for each alt_odb entry. This means that in most cases the protection is no longer necessary; we will detect the duplicate no matter how we got there (but see below). And it's a good idea to get rid of it, as it creates an unnecessary complication when setting up recursive alternates (B has to know that A is going to borrow from it and make sure to use an absolute path). Note that our normalization doesn't actually look at the filesystem, so it can still be fooled by crossing symbolic links. But that's also true of absolute paths, so it's not a good reason to disallow only relative paths (it's potentially a reason to switch to real_path(), but that's a separate and non-trivial change). We adjust the test script here to demonstrate that this now works, and add new tests to show that the normalization does indeed suppress duplicates. Signed-off-by: Jeff King--- sha1_file.c | 7 +-- t/t5613-info-alternate.sh | 24 ++-- 2 files changed, 23 insertions(+), 8 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 80a..b514167 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -354,12 +354,7 @@ static void link_alt_odb_entries(const char *alt, int len, int sep, const char *entry = entries.items[i].string; if (entry[0] == '\0' || entry[0] == '#') continue; - if (!is_absolute_path(entry) && depth) { - error("%s: ignoring relative alternate object store %s", - relative_base, entry); - } else { - link_alt_odb_entry(entry, relative_base, depth, objdirbuf.buf); - } + link_alt_odb_entry(entry, relative_base, depth, objdirbuf.buf); } string_list_clear(, 0); free(alt_copy); diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh index 74f6770..76525a0 100755 --- a/t/t5613-info-alternate.sh +++ b/t/t5613-info-alternate.sh @@ -92,8 +92,28 @@ test_expect_success 'that relative alternate is possible for current dir' ' git fsck ' -test_expect_success 'that relative alternate is only possible for current dir' ' - test_must_fail git -C D fsck +test_expect_success 'that relative alternate is recursive' ' + git -C D fsck +' + +# we can reach "A" from our new repo both directly, and via "C". +# The deep/subdir is there to make sure we are not doing a stupid +# pure-text comparison of the alternate names. +test_expect_success 'relative duplicates are eliminated' ' + mkdir -p deep/subdir && + git init --bare deep/subdir/duplicate.git && + cat >deep/subdir/duplicate.git/objects/info/alternates <<-\EOF && + ../../../../C/.git/objects + ../../../../A/.git/objects + EOF + cat >expect <<-EOF && + alternate: $(pwd)/C/.git/objects + alternate: $(pwd)/B/.git/objects + alternate: $(pwd)/A/.git/objects + EOF + git -C deep/subdir/duplicate.git count-objects -v >actual && + grep ^alternate: actual >actual.alternates && + test_cmp expect actual.alternates ' test_done -- 2.10.0.618.g82cc264
[PATCH 15/18] fill_sha1_file: write into a strbuf
It's currently the responsibility of the caller to give fill_sha1_file() enough bytes to write into, leading them to manually compute the required lengths. Instead, let's just write into a strbuf so that it's impossible to get this wrong. The alt_odb caller already has a strbuf, so this makes things strictly simpler. The other caller, sha1_file_name(), uses a static PATH_MAX buffer and dies when it would overflow. We can convert this to a static strbuf, which means our allocation cost is amortized (and as a bonus, we no longer have to worry about PATH_MAX being too short for normal use). This does introduce some small overhead in fill_sha1_file(), as each strbuf_addchar() will check whether it needs to grow. However, between the optimization in fec501d (strbuf_addch: avoid calling strbuf_grow, 2015-04-16) and the fact that this is not generally called in a tight loop (after all, the next step is typically to access the file!) this probably doesn't matter. And even if it did, the right place to micro-optimize is inside fill_sha1_file(), by calling a single strbuf_grow() there. Signed-off-by: Jeff King--- sha1_file.c | 34 ++ 1 file changed, 10 insertions(+), 24 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index efc8cee..80a 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -172,36 +172,28 @@ enum scld_error safe_create_leading_directories_const(const char *path) return result; } -static void fill_sha1_path(char *pathbuf, const unsigned char *sha1) +static void fill_sha1_path(struct strbuf *buf, const unsigned char *sha1) { int i; for (i = 0; i < 20; i++) { static char hex[] = "0123456789abcdef"; unsigned int val = sha1[i]; - *pathbuf++ = hex[val >> 4]; - *pathbuf++ = hex[val & 0xf]; + strbuf_addch(buf, hex[val >> 4]); + strbuf_addch(buf, hex[val & 0xf]); if (!i) - *pathbuf++ = '/'; + strbuf_addch(buf, '/'); } - *pathbuf = '\0'; } const char *sha1_file_name(const unsigned char *sha1) { - static char buf[PATH_MAX]; - const char *objdir; - int len; + static struct strbuf buf = STRBUF_INIT; - objdir = get_object_directory(); - len = strlen(objdir); + strbuf_reset(); + strbuf_addf(, "%s/", get_object_directory()); - /* '/' + sha1(2) + '/' + sha1(38) + '\0' */ - if (len + 43 > PATH_MAX) - die("insanely long object directory %s", objdir); - memcpy(buf, objdir, len); - buf[len] = '/'; - fill_sha1_path(buf + len + 1, sha1); - return buf; + fill_sha1_path(, sha1); + return buf.buf; } struct strbuf *alt_scratch_buf(struct alternate_object_database *alt) @@ -213,14 +205,8 @@ struct strbuf *alt_scratch_buf(struct alternate_object_database *alt) static const char *alt_sha1_path(struct alternate_object_database *alt, const unsigned char *sha1) { - /* hex sha1 plus internal "/" */ - size_t len = GIT_SHA1_HEXSZ + 1; struct strbuf *buf = alt_scratch_buf(alt); - - strbuf_grow(buf, len); - fill_sha1_path(buf->buf + buf->len, sha1); - strbuf_setlen(buf, buf->len + len); - + fill_sha1_path(buf, sha1); return buf->buf; } -- 2.10.0.618.g82cc264
[PATCH 16/18] count-objects: report alternates via verbose mode
There's no way to get the list of alternates that git computes internally; our tests only infer it based on which objects are available. In addition to testing, knowing this list may be helpful for somebody debugging their alternates setup. Let's add it to the "count-objects -v" output. We could give it a separate flag, but there's not really any need. "count-objects -v" is already a debugging catch-all for the object database, its output is easily extensible to new data items, and printing the alternates is not expensive (we already had to find them to count the objects). Signed-off-by: Jeff King--- Documentation/git-count-objects.txt | 5 + builtin/count-objects.c | 10 ++ t/t5613-info-alternate.sh | 10 ++ 3 files changed, 25 insertions(+) diff --git a/Documentation/git-count-objects.txt b/Documentation/git-count-objects.txt index 2ff3568..cb9b4d2 100644 --- a/Documentation/git-count-objects.txt +++ b/Documentation/git-count-objects.txt @@ -38,6 +38,11 @@ objects nor valid packs + size-garbage: disk space consumed by garbage files, in KiB (unless -H is specified) ++ +alternate: absolute path of alternate object databases; may appear +multiple times, one line per path. Note that if the path contains +non-printable characters, it may be surrounded by double-quotes and +contain C-style backslashed escape sequences. -H:: --human-readable:: diff --git a/builtin/count-objects.c b/builtin/count-objects.c index ba92919..a700409 100644 --- a/builtin/count-objects.c +++ b/builtin/count-objects.c @@ -8,6 +8,7 @@ #include "dir.h" #include "builtin.h" #include "parse-options.h" +#include "quote.h" static unsigned long garbage; static off_t size_garbage; @@ -73,6 +74,14 @@ static int count_cruft(const char *basename, const char *path, void *data) return 0; } +static int print_alternate(struct alternate_object_database *alt, void *data) +{ + printf("alternate: "); + quote_c_style(alt->path, NULL, stdout, 0); + putchar('\n'); + return 0; +} + static char const * const count_objects_usage[] = { N_("git count-objects [-v] [-H | --human-readable]"), NULL @@ -140,6 +149,7 @@ int cmd_count_objects(int argc, const char **argv, const char *prefix) printf("prune-packable: %lu\n", packed_loose); printf("garbage: %lu\n", garbage); printf("size-garbage: %s\n", garbage_buf.buf); + foreach_alt_odb(print_alternate, NULL); strbuf_release(_buf); strbuf_release(_buf); strbuf_release(_buf); diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh index b393613..74f6770 100755 --- a/t/t5613-info-alternate.sh +++ b/t/t5613-info-alternate.sh @@ -39,6 +39,16 @@ test_expect_success 'preparing third repository' ' ) ' +test_expect_success 'count-objects shows the alternates' ' + cat >expect <<-EOF && + alternate: $(pwd)/B/.git/objects + alternate: $(pwd)/A/.git/objects + EOF + git -C C count-objects -v >actual && + grep ^alternate: actual >actual.alternates && + test_cmp expect actual.alternates +' + # Note: These tests depend on the hard-coded value of 5 as "too deep". We start # the depth at 0 and count links, not repositories, so in a chain like: # -- 2.10.0.618.g82cc264
[PATCH 13/18] fill_sha1_file: write "boring" characters
This function forms a sha1 as "xx/...", but skips over the slot for the slash rather than writing it, leaving it to the caller to do so. It also does not bother to put in a trailing NUL, even though every caller would want it (we're forming a path which by definition is not a directory, so the only thing to do with it is feed it to a system call). Let's make the lives of our callers easier by just writing out the internal "/" and the NUL. Signed-off-by: Jeff King--- sha1_file.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 70c3e2f..c6308c1 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -178,10 +178,12 @@ static void fill_sha1_path(char *pathbuf, const unsigned char *sha1) for (i = 0; i < 20; i++) { static char hex[] = "0123456789abcdef"; unsigned int val = sha1[i]; - char *pos = pathbuf + i*2 + (i > 0); - *pos++ = hex[val >> 4]; - *pos = hex[val & 0xf]; + *pathbuf++ = hex[val >> 4]; + *pathbuf++ = hex[val & 0xf]; + if (!i) + *pathbuf++ = '/'; } + *pathbuf = '\0'; } const char *sha1_file_name(const unsigned char *sha1) @@ -198,8 +200,6 @@ const char *sha1_file_name(const unsigned char *sha1) die("insanely long object directory %s", objdir); memcpy(buf, objdir, len); buf[len] = '/'; - buf[len+3] = '/'; - buf[len+42] = '\0'; fill_sha1_path(buf + len + 1, sha1); return buf; } @@ -406,8 +406,6 @@ struct alternate_object_database *alloc_alt_odb(const char *dir) ent->name = ent->scratch + dirlen + 1; ent->scratch[dirlen] = '/'; - ent->scratch[dirlen + 3] = '/'; - ent->scratch[entlen-1] = 0; return ent; } -- 2.10.0.618.g82cc264
[PATCH 14/18] alternates: store scratch buffer as strbuf
We pre-size the scratch buffer to hold a loose object filename of the form "xx/...", which leads to allocation code that is hard to verify. We have to use some magic numbers during the initial allocation, and then writers must blindly assume that the buffer is big enough. Using a strbuf makes it more clear that we cannot overflow. Unfortunately, we do still need some magic numbers to grow our strbuf before calling fill_sha1_path(), but the strbuf growth is much closer to the point of use. This makes it easier to see that it's correct, and opens the possibility of pushing it even further down if fill_sha1_path() learns to work on strbufs. Signed-off-by: Jeff King--- cache.h | 13 +++-- sha1_file.c | 28 ++-- sha1_name.c | 9 +++-- 3 files changed, 32 insertions(+), 18 deletions(-) diff --git a/cache.h b/cache.h index e1996c5..9866e46 100644 --- a/cache.h +++ b/cache.h @@ -1383,8 +1383,9 @@ extern void remove_scheduled_dirs(void); extern struct alternate_object_database { struct alternate_object_database *next; - char *name; - char *scratch; + /* see alt_scratch_buf() */ + struct strbuf scratch; + size_t base_len; char path[FLEX_ARRAY]; } *alt_odb_list; @@ -1413,6 +1414,14 @@ extern void add_to_alternates_file(const char *dir); */ extern void add_to_alternates_memory(const char *dir); +/* + * Returns a scratch strbuf pre-filled with the alternate object directory, + * including a trailing slash, which can be used to access paths in the + * alternate. Always use this over direct access to alt->scratch, as it + * cleans up any previous use of the scratch buffer. + */ +extern struct strbuf *alt_scratch_buf(struct alternate_object_database *alt); + struct pack_window { struct pack_window *next; unsigned char *base; diff --git a/sha1_file.c b/sha1_file.c index c6308c1..efc8cee 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -204,11 +204,24 @@ const char *sha1_file_name(const unsigned char *sha1) return buf; } +struct strbuf *alt_scratch_buf(struct alternate_object_database *alt) +{ + strbuf_setlen(>scratch, alt->base_len); + return >scratch; +} + static const char *alt_sha1_path(struct alternate_object_database *alt, const unsigned char *sha1) { - fill_sha1_path(alt->name, sha1); - return alt->scratch; + /* hex sha1 plus internal "/" */ + size_t len = GIT_SHA1_HEXSZ + 1; + struct strbuf *buf = alt_scratch_buf(alt); + + strbuf_grow(buf, len); + fill_sha1_path(buf->buf + buf->len, sha1); + strbuf_setlen(buf, buf->len + len); + + return buf->buf; } /* @@ -396,16 +409,11 @@ void read_info_alternates(const char * relative_base, int depth) struct alternate_object_database *alloc_alt_odb(const char *dir) { struct alternate_object_database *ent; - size_t dirlen = strlen(dir); - size_t entlen; - entlen = st_add(dirlen, 43); /* '/' + 2 hex + '/' + 38 hex + NUL */ FLEX_ALLOC_STR(ent, path, dir); - ent->scratch = xmalloc(entlen); - xsnprintf(ent->scratch, entlen, "%s/", dir); - - ent->name = ent->scratch + dirlen + 1; - ent->scratch[dirlen] = '/'; + strbuf_init(>scratch, 0); + strbuf_addf(>scratch, "%s/", dir); + ent->base_len = ent->scratch.len; return ent; } diff --git a/sha1_name.c b/sha1_name.c index 770ea4f..defbb3e 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -92,15 +92,12 @@ static void find_short_object_filename(int len, const char *hex_pfx, struct disa xsnprintf(hex, sizeof(hex), "%.2s", hex_pfx); for (alt = fakeent; alt && !ds->ambiguous; alt = alt->next) { + struct strbuf *buf = alt_scratch_buf(alt); struct dirent *de; DIR *dir; - /* -* every alt_odb struct has 42 extra bytes after the base -* for exactly this purpose -*/ - xsnprintf(alt->name, 42, "%.2s/", hex_pfx); - dir = opendir(alt->scratch); + strbuf_addf(buf, "%.2s/", hex_pfx); + dir = opendir(buf->buf); if (!dir) continue; -- 2.10.0.618.g82cc264
[PATCH 12/18] alternates: use a separate scratch space
The alternate_object_database struct uses a single buffer both for storing the path to the alternate, and as a scratch buffer for forming object names. This is efficient (since otherwise we'd end up storing the path twice), but it makes life hard for callers who just want to know the path to the alternate. They have to remember to stop reading after "alt->name - alt->base" bytes, and to subtract one for the trailing '/'. It would be much simpler if they could simply access a NUL-terminated path string. We could encapsulate this in a function which puts a NUL in the scratch buffer and returns the string, but that opens up questions about the lifetime of the result. The first time another caller uses the alternate, the scratch buffer may get other data tacked onto it. Let's instead just store the root path separately from the scratch buffer. There aren't enough alternates being stored for the duplicated data to matter for performance, and this keeps things simple and safe for the callers. Signed-off-by: Jeff King--- builtin/fsck.c | 10 ++ builtin/submodule--helper.c | 11 +++ cache.h | 5 - sha1_file.c | 28 sha1_name.c | 3 ++- transport.c | 4 +--- 6 files changed, 24 insertions(+), 37 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 055dfdc..f01b81e 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -644,14 +644,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) fsck_object_dir(get_object_directory()); prepare_alt_odb(); - for (alt = alt_odb_list; alt; alt = alt->next) { - /* directory name, minus trailing slash */ - size_t namelen = alt->name - alt->base - 1; - struct strbuf name = STRBUF_INIT; - strbuf_add(, alt->base, namelen); - fsck_object_dir(name.buf); - strbuf_release(); - } + for (alt = alt_odb_list; alt; alt = alt->next) + fsck_object_dir(alt->path); } if (check_full) { diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index e3fdc0a..fd72c90 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -492,20 +492,16 @@ static int add_possible_reference_from_superproject( { struct submodule_alternate_setup *sas = sas_cb; - /* directory name, minus trailing slash */ - size_t namelen = alt->name - alt->base - 1; - struct strbuf name = STRBUF_INIT; - strbuf_add(, alt->base, namelen); - /* * If the alternate object store is another repository, try the * standard layout with .git/modules//objects */ - if (ends_with(name.buf, ".git/objects")) { + if (ends_with(alt->path, ".git/objects")) { char *sm_alternate; struct strbuf sb = STRBUF_INIT; struct strbuf err = STRBUF_INIT; - strbuf_add(, name.buf, name.len - strlen("objects")); + strbuf_add(, alt->path, strlen(alt->path) - strlen("objects")); + /* * We need to end the new path with '/' to mark it as a dir, * otherwise a submodule name containing '/' will be broken @@ -533,7 +529,6 @@ static int add_possible_reference_from_superproject( strbuf_release(); } - strbuf_release(); return 0; } diff --git a/cache.h b/cache.h index d36b2ad..e1996c5 100644 --- a/cache.h +++ b/cache.h @@ -1382,8 +1382,11 @@ extern void remove_scheduled_dirs(void); extern struct alternate_object_database { struct alternate_object_database *next; + char *name; - char base[FLEX_ARRAY]; /* more */ + char *scratch; + + char path[FLEX_ARRAY]; } *alt_odb_list; extern void prepare_alt_odb(void); extern void read_info_alternates(const char * relative_base, int depth); diff --git a/sha1_file.c b/sha1_file.c index ccf59ba..70c3e2f 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -208,7 +208,7 @@ static const char *alt_sha1_path(struct alternate_object_database *alt, const unsigned char *sha1) { fill_sha1_path(alt->name, sha1); - return alt->base; + return alt->scratch; } /* @@ -261,8 +261,7 @@ static int alt_odb_usable(struct strbuf *path, const char *normalized_objdir) * thing twice, or object directory itself. */ for (alt = alt_odb_list; alt; alt = alt->next) { - if (path->len == alt->name - alt->base - 1 && - !memcmp(path->buf, alt->base, path->len)) + if (!strcmp(path->buf, alt->path)) return 0; } if (!fspathcmp(path->buf, normalized_objdir)) @@ -401,13 +400,14 @@
[PATCH 11/18] alternates: encapsulate alt->base munging
The alternate_object_database struct holds a path to the alternate objects, but we also use that buffer as scratch space for forming loose object filenames. Let's pull that logic into a helper function so that we can more easily modify it. Signed-off-by: Jeff King--- sha1_file.c | 19 +-- 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 549cf1e..ccf59ba 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -204,6 +204,13 @@ const char *sha1_file_name(const unsigned char *sha1) return buf; } +static const char *alt_sha1_path(struct alternate_object_database *alt, +const unsigned char *sha1) +{ + fill_sha1_path(alt->name, sha1); + return alt->base; +} + /* * Return the name of the pack or index file with the specified sha1 * in its filename. *base and *name are scratch space that must be @@ -601,8 +608,8 @@ static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen) struct alternate_object_database *alt; prepare_alt_odb(); for (alt = alt_odb_list; alt; alt = alt->next) { - fill_sha1_path(alt->name, sha1); - if (check_and_freshen_file(alt->base, freshen)) + const char *path = alt_sha1_path(alt, sha1); + if (check_and_freshen_file(path, freshen)) return 1; } return 0; @@ -1600,8 +1607,8 @@ static int stat_sha1_file(const unsigned char *sha1, struct stat *st) prepare_alt_odb(); errno = ENOENT; for (alt = alt_odb_list; alt; alt = alt->next) { - fill_sha1_path(alt->name, sha1); - if (!lstat(alt->base, st)) + const char *path = alt_sha1_path(alt, sha1); + if (!lstat(path, st)) return 0; } @@ -1621,8 +1628,8 @@ static int open_sha1_file(const unsigned char *sha1) prepare_alt_odb(); for (alt = alt_odb_list; alt; alt = alt->next) { - fill_sha1_path(alt->name, sha1); - fd = git_open_noatime(alt->base); + const char *path = alt_sha1_path(alt, sha1); + fd = git_open_noatime(path); if (fd >= 0) return fd; if (most_interesting_errno == ENOENT) -- 2.10.0.618.g82cc264
[PATCH 10/18] alternates: provide helper for allocating alternate
Allocating a struct alternate_object_database is tricky, as we must over-allocate the buffer to provide scratch space, and then put in particular '/' and NUL markers. Let's encapsulate this in a function so that the complexity doesn't leak into callers (and so that we can modify it later). Signed-off-by: Jeff King--- cache.h | 6 ++ sha1_file.c | 28 +++- sha1_name.c | 7 +-- 3 files changed, 26 insertions(+), 15 deletions(-) diff --git a/cache.h b/cache.h index 9a91378..d36b2ad 100644 --- a/cache.h +++ b/cache.h @@ -1391,6 +1391,12 @@ extern char *compute_alternate_path(const char *path, struct strbuf *err); typedef int alt_odb_fn(struct alternate_object_database *, void *); extern int foreach_alt_odb(alt_odb_fn, void*); +/* + * Allocate a "struct alternate_object_database" but do _not_ actually + * add it to the list of alternates. + */ +struct alternate_object_database *alloc_alt_odb(const char *dir); + /* * Add the directory to the on-disk alternates file; the new entry will also * take effect in the current process. diff --git a/sha1_file.c b/sha1_file.c index 2e41b26..549cf1e 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -283,7 +283,6 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base, int depth, const char *normalized_objdir) { struct alternate_object_database *ent; - size_t entlen; struct strbuf pathbuf = STRBUF_INIT; if (!is_absolute_path(entry) && relative_base) { @@ -311,14 +310,7 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base, return -1; } - entlen = st_add(pathbuf.len, 43); /* '/' + 2 hex + '/' + 38 hex + NUL */ - ent = xmalloc(st_add(sizeof(*ent), entlen)); - memcpy(ent->base, pathbuf.buf, pathbuf.len); - - ent->name = ent->base + pathbuf.len + 1; - ent->base[pathbuf.len] = '/'; - ent->base[pathbuf.len + 3] = '/'; - ent->base[entlen-1] = 0; + ent = alloc_alt_odb(pathbuf.buf); /* add the alternate entry */ *alt_odb_tail = ent; @@ -395,6 +387,24 @@ void read_info_alternates(const char * relative_base, int depth) munmap(map, mapsz); } +struct alternate_object_database *alloc_alt_odb(const char *dir) +{ + struct alternate_object_database *ent; + size_t dirlen = strlen(dir); + size_t entlen; + + entlen = st_add(dirlen, 43); /* '/' + 2 hex + '/' + 38 hex + NUL */ + ent = xmalloc(st_add(sizeof(*ent), entlen)); + memcpy(ent->base, dir, dirlen); + + ent->name = ent->base + dirlen + 1; + ent->base[dirlen] = '/'; + ent->base[dirlen + 3] = '/'; + ent->base[entlen-1] = 0; + + return ent; +} + void add_to_alternates_file(const char *reference) { struct lock_file *lock = xcalloc(1, sizeof(struct lock_file)); diff --git a/sha1_name.c b/sha1_name.c index faf873c..98152a6 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -86,12 +86,7 @@ static void find_short_object_filename(int len, const char *hex_pfx, struct disa * alt->name/alt->base while iterating over the * object databases including our own. */ - const char *objdir = get_object_directory(); - size_t objdir_len = strlen(objdir); - fakeent = xmalloc(st_add3(sizeof(*fakeent), objdir_len, 43)); - memcpy(fakeent->base, objdir, objdir_len); - fakeent->name = fakeent->base + objdir_len + 1; - fakeent->name[-1] = '/'; + fakeent = alloc_alt_odb(get_object_directory()); } fakeent->next = alt_odb_list; -- 2.10.0.618.g82cc264
[PATCH 08/18] link_alt_odb_entry: refactor string handling
The string handling in link_alt_odb_entry() is mostly an artifact of the original version, which took the path as a ptr/len combo, and did not have a NUL-terminated string until we created one in the alternate_object_database struct. But since 5bdf0a8 (sha1_file: normalize alt_odb path before comparing and storing, 2011-09-07), the first thing we do is put the path into a strbuf, which gives us some easy opportunities for cleanup. In particular: - we call strlen(pathbuf.buf), which is silly; we can look at pathbuf.len. - even though we have a strbuf, we don't maintain its "len" field when chomping extra slashes from the end, and instead keep a separate "pfxlen" variable. We can fix this and then drop "pfxlen" entirely. - we don't check whether the path is usable until after we allocate the new struct, making extra cleanup work for ourselves. Since we have a NUL-terminated string, we can bump the "is it usable" checks higher in the function. While we're at it, we can move that logic to its own helper, which makes the flow of link_alt_odb_entry() easier to follow. Signed-off-by: Jeff King--- And you can probably guess now how I found the issue in the last patch where pathbuf.len is totally bogus after calling normalize_path_copy. :) sha1_file.c | 83 + 1 file changed, 45 insertions(+), 38 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 68571bd..f396823 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -234,6 +234,36 @@ char *sha1_pack_index_name(const unsigned char *sha1) struct alternate_object_database *alt_odb_list; static struct alternate_object_database **alt_odb_tail; +/* + * Return non-zero iff the path is usable as an alternate object database. + */ +static int alt_odb_usable(struct strbuf *path, const char *normalized_objdir) +{ + struct alternate_object_database *alt; + + /* Detect cases where alternate disappeared */ + if (!is_directory(path->buf)) { + error("object directory %s does not exist; " + "check .git/objects/info/alternates.", + path->buf); + return 0; + } + + /* +* Prevent the common mistake of listing the same +* thing twice, or object directory itself. +*/ + for (alt = alt_odb_list; alt; alt = alt->next) { + if (path->len == alt->name - alt->base - 1 && + !memcmp(path->buf, alt->base, path->len)) + return 0; + } + if (!fspathcmp(path->buf, normalized_objdir)) + return 0; + + return 1; +} + /* * Prepare alternate object database registry. * @@ -253,8 +283,7 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base, int depth, const char *normalized_objdir) { struct alternate_object_database *ent; - struct alternate_object_database *alt; - size_t pfxlen, entlen; + size_t entlen; struct strbuf pathbuf = STRBUF_INIT; if (!is_absolute_path(entry) && relative_base) { @@ -270,47 +299,26 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base, return -1; } - pfxlen = strlen(pathbuf.buf); - /* * The trailing slash after the directory name is given by * this function at the end. Remove duplicates. */ - while (pfxlen && pathbuf.buf[pfxlen-1] == '/') - pfxlen -= 1; - - entlen = st_add(pfxlen, 43); /* '/' + 2 hex + '/' + 38 hex + NUL */ - ent = xmalloc(st_add(sizeof(*ent), entlen)); - memcpy(ent->base, pathbuf.buf, pfxlen); - strbuf_release(); - - ent->name = ent->base + pfxlen + 1; - ent->base[pfxlen + 3] = '/'; - ent->base[pfxlen] = ent->base[entlen-1] = 0; + while (pathbuf.len && pathbuf.buf[pathbuf.len - 1] == '/') + strbuf_setlen(, pathbuf.len - 1); - /* Detect cases where alternate disappeared */ - if (!is_directory(ent->base)) { - error("object directory %s does not exist; " - "check .git/objects/info/alternates.", - ent->base); - free(ent); + if (!alt_odb_usable(, normalized_objdir)) { + strbuf_release(); return -1; } - /* Prevent the common mistake of listing the same -* thing twice, or object directory itself. -*/ - for (alt = alt_odb_list; alt; alt = alt->next) { - if (pfxlen == alt->name - alt->base - 1 && - !memcmp(ent->base, alt->base, pfxlen)) { - free(ent); - return -1; - } - } - if (!fspathcmp(ent->base, normalized_objdir)) { - free(ent); - return -1; - } + entlen = st_add(pathbuf.len, 43);
[PATCH 09/18] alternates: provide helper for adding to alternates list
The submodule code wants to temporarily add an alternate object store to our in-memory alt_odb list, but does it manually. Let's provide a helper so it can reuse the code in link_alt_odb_entry(). While we're adding our new add_to_alternates_memory(), let's document add_to_alternates_file(), as the two are related. Signed-off-by: Jeff King--- cache.h | 14 +- sha1_file.c | 11 +++ submodule.c | 23 +-- 3 files changed, 25 insertions(+), 23 deletions(-) diff --git a/cache.h b/cache.h index ed3d5df..9a91378 100644 --- a/cache.h +++ b/cache.h @@ -1388,10 +1388,22 @@ extern struct alternate_object_database { extern void prepare_alt_odb(void); extern void read_info_alternates(const char * relative_base, int depth); extern char *compute_alternate_path(const char *path, struct strbuf *err); -extern void add_to_alternates_file(const char *reference); typedef int alt_odb_fn(struct alternate_object_database *, void *); extern int foreach_alt_odb(alt_odb_fn, void*); +/* + * Add the directory to the on-disk alternates file; the new entry will also + * take effect in the current process. + */ +extern void add_to_alternates_file(const char *dir); + +/* + * Add the directory to the in-memory list of alternates (along with any + * recursive alternates it points to), but do not modify the on-disk alternates + * file. + */ +extern void add_to_alternates_memory(const char *dir); + struct pack_window { struct pack_window *next; unsigned char *base; diff --git a/sha1_file.c b/sha1_file.c index f396823..2e41b26 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -440,6 +440,17 @@ void add_to_alternates_file(const char *reference) free(alts); } +void add_to_alternates_memory(const char *reference) +{ + /* +* Make sure alternates are initialized, or else our entry may be +* overwritten when they are. +*/ + prepare_alt_odb(); + + link_alt_odb_entries(reference, strlen(reference), '\n', NULL, 0); +} + /* * Compute the exact path an alternate is at and returns it. In case of * error NULL is returned and the human readable error is added to `err` diff --git a/submodule.c b/submodule.c index 0ef2ff4..8b3274a 100644 --- a/submodule.c +++ b/submodule.c @@ -123,9 +123,7 @@ void stage_updated_gitmodules(void) static int add_submodule_odb(const char *path) { struct strbuf objects_directory = STRBUF_INIT; - struct alternate_object_database *alt_odb; int ret = 0; - size_t alloc; ret = strbuf_git_path_submodule(_directory, path, "objects/"); if (ret) @@ -134,26 +132,7 @@ static int add_submodule_odb(const char *path) ret = -1; goto done; } - /* avoid adding it twice */ - prepare_alt_odb(); - for (alt_odb = alt_odb_list; alt_odb; alt_odb = alt_odb->next) - if (alt_odb->name - alt_odb->base == objects_directory.len && - !strncmp(alt_odb->base, objects_directory.buf, - objects_directory.len)) - goto done; - - alloc = st_add(objects_directory.len, 42); /* for "12/345..." sha1 */ - alt_odb = xmalloc(st_add(sizeof(*alt_odb), alloc)); - alt_odb->next = alt_odb_list; - xsnprintf(alt_odb->base, alloc, "%s", objects_directory.buf); - alt_odb->name = alt_odb->base + objects_directory.len; - alt_odb->name[2] = '/'; - alt_odb->name[40] = '\0'; - alt_odb->name[41] = '\0'; - alt_odb_list = alt_odb; - - /* add possible alternates from the submodule */ - read_info_alternates(objects_directory.buf, 0); + add_to_alternates_memory(objects_directory.buf); done: strbuf_release(_directory); return ret; -- 2.10.0.618.g82cc264
[PATCH 07/18] link_alt_odb_entry: handle normalize_path errors
When we add a new alternate to the list, we try to normalize out any redundant "..", etc. However, we do not look at the return value of normalize_path_copy(), and will happily continue with a path that could not be normalized. Worse, the normalizing process is done in-place, so we are left with whatever half-finished working state the normalizing function was in. Fortunately, this cannot cause us to read past the end of our buffer, as that working state will always leave the NUL from the original path in place. And we do tend to notice problems when we check is_directory() on the path. But you can see the nonsense that we feed to is_directory with an entry like: this/../../is/../../way/../../too/../../deep/../../to/../../resolve in your objects/info/alternates, which yields: error: object directory /to/e/deep/too/way//ects/this/../../is/../../way/../../too/../../deep/../../to/../../resolve does not exist; check .git/objects/info/alternates. We can easily fix this just by checking the return value. But that makes it hard to generate a good error message, since we're normalizing in-place and our input value has been overwritten by cruft. Instead, let's provide a strbuf helper that does an in-place normalize, but restores the original contents on error. This uses a second buffer under the hood, which is slightly less efficient, but this is not a performance-critical code path. The strbuf helper can also properly set the "len" parameter of the strbuf before returning. Just doing: normalize_path_copy(buf.buf, buf.buf); will shorten the string, but leave buf.len at the original length. That may be confusing to later code which uses the strbuf. Signed-off-by: Jeff King--- sha1_file.c | 11 +-- strbuf.c| 20 strbuf.h| 8 3 files changed, 37 insertions(+), 2 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index b9c1fa3..68571bd 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -263,7 +263,12 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base, } strbuf_addstr(, entry); - normalize_path_copy(pathbuf.buf, pathbuf.buf); + if (strbuf_normalize_path() < 0) { + error("unable to normalize alternate object path: %s", + pathbuf.buf); + strbuf_release(); + return -1; + } pfxlen = strlen(pathbuf.buf); @@ -335,7 +340,9 @@ static void link_alt_odb_entries(const char *alt, int len, int sep, } strbuf_add_absolute_path(, get_object_directory()); - normalize_path_copy(objdirbuf.buf, objdirbuf.buf); + if (strbuf_normalize_path() < 0) + die("unable to normalize object directory: %s", + objdirbuf.buf); alt_copy = xmemdupz(alt, len); string_list_split_in_place(, alt_copy, sep, -1); diff --git a/strbuf.c b/strbuf.c index b839be4..8fec657 100644 --- a/strbuf.c +++ b/strbuf.c @@ -870,3 +870,23 @@ void strbuf_stripspace(struct strbuf *sb, int skip_comments) strbuf_setlen(sb, j); } + +int strbuf_normalize_path(struct strbuf *src) +{ + struct strbuf dst = STRBUF_INIT; + + strbuf_grow(, src->len); + if (normalize_path_copy(dst.buf, src->buf) < 0) { + strbuf_release(); + return -1; + } + + /* +* normalize_path does not tell us the new length, so we have to +* compute it by looking for the new NUL it placed +*/ + strbuf_setlen(, strlen(dst.buf)); + strbuf_swap(src, ); + strbuf_release(); + return 0; +} diff --git a/strbuf.h b/strbuf.h index ba8d5f1..2262b12 100644 --- a/strbuf.h +++ b/strbuf.h @@ -443,6 +443,14 @@ extern int strbuf_getcwd(struct strbuf *sb); */ extern void strbuf_add_absolute_path(struct strbuf *sb, const char *path); + +/** + * Normalize in-place the path contained in the strbuf. See + * normalize_path_copy() for details. If an error occurs, the contents of "sb" + * are left untouched, and -1 is returned. + */ +extern int strbuf_normalize_path(struct strbuf *sb); + /** * Strip whitespace from a buffer. The second parameter controls if * comments are considered contents to be removed or not. -- 2.10.0.618.g82cc264
[PATCH 02/18] t5613: drop test_valid_repo function
This function makes sure that "git fsck" does not report any errors. But "--full" has been the default since f29cd39 (fsck: default to "git fsck --full", 2009-10-20), and we can use the exit code (instead of counting the lines) since e2b4f63 (fsck: exit with non-zero status upon errors, 2007-03-05). So we can just use "git fsck", which is shorter and more flexible (e.g., we can use "git -C"). Signed-off-by: Jeff King--- t/t5613-info-alternate.sh | 19 +++ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh index e13f57d..4548fb0 100755 --- a/t/t5613-info-alternate.sh +++ b/t/t5613-info-alternate.sh @@ -6,11 +6,6 @@ test_description='test transitive info/alternate entries' . ./test-lib.sh -test_valid_repo() { - git fsck --full > fsck.log && - test_line_count = 0 fsck.log -} - base_dir=$(pwd) test_expect_success 'preparing first repository' \ @@ -52,7 +47,7 @@ git clone --bare -l -s G H' test_expect_success 'invalidity of deepest repository' \ 'cd H && { - test_valid_repo + git fsck test $? -ne 0 }' @@ -60,41 +55,41 @@ cd "$base_dir" test_expect_success 'validity of third repository' \ 'cd C && -test_valid_repo' +git fsck' cd "$base_dir" test_expect_success 'validity of fourth repository' \ 'cd D && -test_valid_repo' +git fsck' cd "$base_dir" test_expect_success 'breaking of loops' \ 'echo "$base_dir"/B/.git/objects >> "$base_dir"/A/.git/objects/info/alternates&& cd C && -test_valid_repo' +git fsck' cd "$base_dir" test_expect_success 'that info/alternates is necessary' \ 'cd C && rm -f .git/objects/info/alternates && -! (test_valid_repo)' +! (git fsck)' cd "$base_dir" test_expect_success 'that relative alternate is possible for current dir' \ 'cd C && echo "../../../B/.git/objects" > .git/objects/info/alternates && -test_valid_repo' +git fsck' cd "$base_dir" test_expect_success \ 'that relative alternate is only possible for current dir' ' cd D && -! (test_valid_repo) +! (git fsck) ' cd "$base_dir" -- 2.10.0.618.g82cc264
[PATCH 01/18] t5613: drop reachable_via function
This function was never used since its inception in dd05ea1 (test case for transitive info/alternates, 2006-05-07). Which is just as well, since it mutates the repo state in a way that would invalidate further tests, without cleaning up after itself. Let's get rid of it so that nobody is tempted to use it. Signed-off-by: Jeff King--- t/t5613-info-alternate.sh | 10 -- 1 file changed, 10 deletions(-) diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh index 9cd2626..e13f57d 100755 --- a/t/t5613-info-alternate.sh +++ b/t/t5613-info-alternate.sh @@ -6,16 +6,6 @@ test_description='test transitive info/alternate entries' . ./test-lib.sh -# test that a file is not reachable in the current repository -# but that it is after creating a info/alternate entry -reachable_via() { - alternate="$1" - file="$2" - if git cat-file -e "HEAD:$file"; then return 1; fi - echo "$alternate" >> .git/objects/info/alternate - git cat-file -e "HEAD:$file" -} - test_valid_repo() { git fsck --full > fsck.log && test_line_count = 0 fsck.log -- 2.10.0.618.g82cc264
[PATCH 05/18] t5613: do not chdir in main process
Our usual style when working with subdirectories is to chdir inside a subshell or to use "git -C", which means we do not have to constantly return to the main test directory. Let's convert this old test, which does not follow that style. Signed-off-by: Jeff King--- t/t5613-info-alternate.sh | 92 +-- 1 file changed, 33 insertions(+), 59 deletions(-) diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh index 1f283a5..7bc1c3c 100755 --- a/t/t5613-info-alternate.sh +++ b/t/t5613-info-alternate.sh @@ -6,44 +6,39 @@ test_description='test transitive info/alternate entries' . ./test-lib.sh -base_dir=$(pwd) - test_expect_success 'preparing first repository' ' - test_create_repo A && - cd A && - echo "Hello World" > file1 && - git add file1 && - git commit -m "Initial commit" file1 && - git repack -a -d && - git prune + test_create_repo A && ( + cd A && + echo "Hello World" > file1 && + git add file1 && + git commit -m "Initial commit" file1 && + git repack -a -d && + git prune + ) ' -cd "$base_dir" - test_expect_success 'preparing second repository' ' - git clone -l -s A B && - cd B && - echo "foo bar" > file2 && - git add file2 && - git commit -m "next commit" file2 && - git repack -a -d -l && - git prune + git clone -l -s A B && ( + cd B && + echo "foo bar" > file2 && + git add file2 && + git commit -m "next commit" file2 && + git repack -a -d -l && + git prune + ) ' -cd "$base_dir" - test_expect_success 'preparing third repository' ' - git clone -l -s B C && - cd C && - echo "Goodbye, cruel world" > file3 && - git add file3 && - git commit -m "one more" file3 && - git repack -a -d -l && - git prune + git clone -l -s B C && ( + cd C && + echo "Goodbye, cruel world" > file3 && + git add file3 && + git commit -m "one more" file3 && + git repack -a -d -l && + git prune + ) ' -cd "$base_dir" - test_expect_success 'creating too deep nesting' ' git clone -l -s C D && git clone -l -s D E && @@ -53,55 +48,34 @@ test_expect_success 'creating too deep nesting' ' ' test_expect_success 'invalidity of deepest repository' ' - cd H && - test_must_fail git fsck + test_must_fail git -C H fsck ' -cd "$base_dir" - test_expect_success 'validity of third repository' ' - cd C && - git fsck + git -C C fsck ' -cd "$base_dir" - test_expect_success 'validity of fourth repository' ' - cd D && - git fsck + git -C D fsck ' -cd "$base_dir" - test_expect_success 'breaking of loops' ' - echo "$base_dir"/B/.git/objects >>"$base_dir"/A/.git/objects/info/alternatesi && - cd C && - git fsck + echo "$(pwd)"/B/.git/objects >>A/.git/objects/info/alternates && + git -C C fsck ' -cd "$base_dir" - test_expect_success 'that info/alternates is necessary' ' - cd C && - rm -f .git/objects/info/alternates && - test_must_fail git fsck + rm -f C/.git/objects/info/alternates && + test_must_fail git -C C fsck ' -cd "$base_dir" - test_expect_success 'that relative alternate is possible for current dir' ' - cd C && - echo "../../../B/.git/objects" > .git/objects/info/alternates && + echo "../../../B/.git/objects" >C/.git/objects/info/alternates && git fsck ' -cd "$base_dir" - test_expect_success 'that relative alternate is only possible for current dir' ' - cd D && - test_must_fail git fsck + test_must_fail git -C D fsck ' -cd "$base_dir" - test_done -- 2.10.0.618.g82cc264
[PATCH 06/18] t5613: clarify "too deep" recursion tests
These tests are just trying to show that we allow recursion up to a certain depth, but not past it. But the counting is a bit non-intuitive, and rather than test at the edge of the breakage, we test "OK" cases in the middle of the chain. Let's explain what's going on, and explicitly test the switch between "OK" and "too deep". Signed-off-by: Jeff King--- t/t5613-info-alternate.sh | 24 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh index 7bc1c3c..b393613 100755 --- a/t/t5613-info-alternate.sh +++ b/t/t5613-info-alternate.sh @@ -39,6 +39,18 @@ test_expect_success 'preparing third repository' ' ) ' +# Note: These tests depend on the hard-coded value of 5 as "too deep". We start +# the depth at 0 and count links, not repositories, so in a chain like: +# +# A -> B -> C -> D -> E -> F -> G -> H +# 0123456 +# +# we are OK at "G", but break at "H". +# +# Note also that we must use "--bare -l" to make the link to H. The "-l" +# ensures we do not do a connectivity check, and the "--bare" makes sure +# we do not try to checkout the result (which needs objects), either of +# which would cause the clone to fail. test_expect_success 'creating too deep nesting' ' git clone -l -s C D && git clone -l -s D E && @@ -47,16 +59,12 @@ test_expect_success 'creating too deep nesting' ' git clone --bare -l -s G H ' -test_expect_success 'invalidity of deepest repository' ' - test_must_fail git -C H fsck -' - -test_expect_success 'validity of third repository' ' - git -C C fsck +test_expect_success 'validity of fifth-deep repository' ' + git -C G fsck ' -test_expect_success 'validity of fourth repository' ' - git -C D fsck +test_expect_success 'invalidity of sixth-deep repository' ' + test_must_fail git -C H fsck ' test_expect_success 'breaking of loops' ' -- 2.10.0.618.g82cc264
[PATCH 04/18] t5613: whitespace/style cleanups
Our normal test style these days puts the opening quote of the body on the description line, and indents the body with a single tab. This ancient test did not follow this. Signed-off-by: Jeff King--- t/t5613-info-alternate.sh | 114 +- 1 file changed, 62 insertions(+), 52 deletions(-) diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh index 65074dd..1f283a5 100755 --- a/t/t5613-info-alternate.sh +++ b/t/t5613-info-alternate.sh @@ -8,88 +8,98 @@ test_description='test transitive info/alternate entries' base_dir=$(pwd) -test_expect_success 'preparing first repository' \ -'test_create_repo A && cd A && -echo "Hello World" > file1 && -git add file1 && -git commit -m "Initial commit" file1 && -git repack -a -d && -git prune' +test_expect_success 'preparing first repository' ' + test_create_repo A && + cd A && + echo "Hello World" > file1 && + git add file1 && + git commit -m "Initial commit" file1 && + git repack -a -d && + git prune +' cd "$base_dir" -test_expect_success 'preparing second repository' \ -'git clone -l -s A B && cd B && -echo "foo bar" > file2 && -git add file2 && -git commit -m "next commit" file2 && -git repack -a -d -l && -git prune' +test_expect_success 'preparing second repository' ' + git clone -l -s A B && + cd B && + echo "foo bar" > file2 && + git add file2 && + git commit -m "next commit" file2 && + git repack -a -d -l && + git prune +' cd "$base_dir" -test_expect_success 'preparing third repository' \ -'git clone -l -s B C && cd C && -echo "Goodbye, cruel world" > file3 && -git add file3 && -git commit -m "one more" file3 && -git repack -a -d -l && -git prune' +test_expect_success 'preparing third repository' ' + git clone -l -s B C && + cd C && + echo "Goodbye, cruel world" > file3 && + git add file3 && + git commit -m "one more" file3 && + git repack -a -d -l && + git prune +' cd "$base_dir" -test_expect_success 'creating too deep nesting' \ -'git clone -l -s C D && -git clone -l -s D E && -git clone -l -s E F && -git clone -l -s F G && -git clone --bare -l -s G H' +test_expect_success 'creating too deep nesting' ' + git clone -l -s C D && + git clone -l -s D E && + git clone -l -s E F && + git clone -l -s F G && + git clone --bare -l -s G H +' -test_expect_success 'invalidity of deepest repository' \ -'cd H && -test_must_fail git fsck +test_expect_success 'invalidity of deepest repository' ' + cd H && + test_must_fail git fsck ' cd "$base_dir" -test_expect_success 'validity of third repository' \ -'cd C && -git fsck' +test_expect_success 'validity of third repository' ' + cd C && + git fsck +' cd "$base_dir" -test_expect_success 'validity of fourth repository' \ -'cd D && -git fsck' +test_expect_success 'validity of fourth repository' ' + cd D && + git fsck +' cd "$base_dir" -test_expect_success 'breaking of loops' \ -'echo "$base_dir"/B/.git/objects >> "$base_dir"/A/.git/objects/info/alternates&& -cd C && -git fsck' +test_expect_success 'breaking of loops' ' + echo "$base_dir"/B/.git/objects >>"$base_dir"/A/.git/objects/info/alternatesi && + cd C && + git fsck +' cd "$base_dir" -test_expect_success 'that info/alternates is necessary' \ -'cd C && -rm -f .git/objects/info/alternates && -test_must_fail git fsck +test_expect_success 'that info/alternates is necessary' ' + cd C && + rm -f .git/objects/info/alternates && + test_must_fail git fsck ' cd "$base_dir" -test_expect_success 'that relative alternate is possible for current dir' \ -'cd C && -echo "../../../B/.git/objects" > .git/objects/info/alternates && -git fsck' +test_expect_success 'that relative alternate is possible for current dir' ' + cd C && + echo "../../../B/.git/objects" > .git/objects/info/alternates && + git fsck +' cd "$base_dir" -test_expect_success \ -'that relative alternate is only possible for current dir' ' -cd D && -test_must_fail git fsck +test_expect_success 'that relative alternate is only possible for current dir' ' + cd D && + test_must_fail git fsck ' cd "$base_dir" -- 2.10.0.618.g82cc264
[PATCH 03/18] t5613: use test_must_fail
Besides being our normal style, this correctly checks for an error exit() versus signal death. Signed-off-by: Jeff King--- t/t5613-info-alternate.sh | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/t/t5613-info-alternate.sh b/t/t5613-info-alternate.sh index 4548fb0..65074dd 100755 --- a/t/t5613-info-alternate.sh +++ b/t/t5613-info-alternate.sh @@ -46,10 +46,9 @@ git clone -l -s F G && git clone --bare -l -s G H' test_expect_success 'invalidity of deepest repository' \ -'cd H && { - git fsck - test $? -ne 0 -}' +'cd H && +test_must_fail git fsck +' cd "$base_dir" @@ -75,7 +74,8 @@ cd "$base_dir" test_expect_success 'that info/alternates is necessary' \ 'cd C && rm -f .git/objects/info/alternates && -! (git fsck)' +test_must_fail git fsck +' cd "$base_dir" @@ -89,7 +89,7 @@ cd "$base_dir" test_expect_success \ 'that relative alternate is only possible for current dir' ' cd D && -! (git fsck) +test_must_fail git fsck ' cd "$base_dir" -- 2.10.0.618.g82cc264
[PATCH 0/18] alternate object database cleanups
This series is the result of René nerd-sniping me with the claim that we could "easily" teach count-objects to print out the list of alternates in: http://public-inbox.org/git/c27dc1a4-3c7a-2866-d9d8-f5d3eb161...@web.de/ My real goal is just patch 17, which is needed for the quarantine series in that thread. But along the way there were quite a few opportunities for cleanups along with a few minor bugfixes (in patches 7 and 18), and I think the count-objects change in patch 16 is a nice general debugging tool. The rest of it is "just" cleanup, but I'll note that it clears up some hairy allocation code. These were bits that I noticed in my big allocation-cleanup series last year, but were too nasty to fit any of the more general fixes. I think the end result is much better. The number of patches is a little intimidating, but I tried hard to break the refactoring down into a sequence of obviously-correct steps. You can be the judge of my success. [01/18]: t5613: drop reachable_via function [02/18]: t5613: drop test_valid_repo function [03/18]: t5613: use test_must_fail [04/18]: t5613: whitespace/style cleanups [05/18]: t5613: do not chdir in main process [06/18]: t5613: clarify "too deep" recursion tests [07/18]: link_alt_odb_entry: handle normalize_path errors [08/18]: link_alt_odb_entry: refactor string handling [09/18]: alternates: provide helper for adding to alternates list [10/18]: alternates: provide helper for allocating alternate [11/18]: alternates: encapsulate alt->base munging [12/18]: alternates: use a separate scratch space [13/18]: fill_sha1_file: write "boring" characters [14/18]: alternates: store scratch buffer as strbuf [15/18]: fill_sha1_file: write into a strbuf [16/18]: count-objects: report alternates via verbose mode [17/18]: sha1_file: always allow relative paths to alternates [18/18]: alternates: use fspathcmp to detect duplicates Documentation/git-count-objects.txt | 5 + builtin/count-objects.c | 12 +++ builtin/fsck.c | 10 +- builtin/submodule--helper.c | 11 +- cache.h | 36 ++- sha1_file.c | 179 ++-- sha1_name.c | 17 +-- strbuf.c| 20 strbuf.h| 8 ++ submodule.c | 23 +--- t/t5613-info-alternate.sh | 202 transport.c | 4 +- 12 files changed, 305 insertions(+), 222 deletions(-)
Re: What's cooking in git.git (Sep 2016, #08; Tue, 27)
Stefan Bellerwrites: >> The minimum that would future-proof us, that is still missing from >> the series, would probably be to separate the query parameter >> "struct git_attr_check" and the return values from git_check_attr(). > > Not sure what you mean here with separating as a preparation for > the thread safety. As I understand it we can still keep the thread local > states in git_attr_check, we'd just have to route each thread to its > own part of the memory in there? For example, look at what you did in your pathspec-label topic. static int match_attrs(const char *name, int namelen, const struct pathspec_item *item) { int i; git_check_attr_counted(name, namelen, item->attr_check); for (i = 0; i < item->attr_match_nr; i++) { const char *value; int matched; enum attr_match_mode match_mode; value = item->attr_check->check[i].value; match_mode = item->attr_match[i].match_mode; Each pathspec item has an attr_check member that wants to see a specific set of attributes for a path being matched. Each element of the item->attr_check->check[] array is pair, where is a constant for the purpose of the codepath (i.e. no matter which thread is asking, and no matter for which path the question is being asked, it asks for a fixed attribute that was computed when the pathspec was parsed). But is a slot to return the finding back to the caller. So you can never keep this code structure and have this function called more than once, specifically, you cannot make git_check_attr_counted() call from multiple threads, at one time. Instead the calling convention needs to be updated to allow this caller of git_check_attr_counted() to pass a separate array that is on its stack, e.g. const char *v[... some size ...]; git_check_attr_counted(name, namelen, item->attr_check, v); for (i = 0; i < item->attr_match_nr; i++) { const char *value; value = v[i]; match_mode = item->attr_match[i].match_mode; We could do that API update way before we make the attribute subsystem's implementation thread-safe, and if we did so now, then the caller will not have to change. That is what I meant as "future-proofing", i.e. future-proofing the callers. And from that point of view, I think 0a5aadcce4 is not an ideal place to stop. We'd want at least up to 079186123a but probably even more, e.g. to 48d93f7f42, I would think.
Re: What's cooking in git.git (Sep 2016, #08; Tue, 27)
On Mon, Oct 3, 2016 at 11:07 AM, Junio C Hamanowrote: > Stefan Beller writes: > >>> * jc/attr (2016-05-25) 18 commits >>> ... >>> The attributes API has been updated so that it can later be >>> optimized using the knowledge of which attributes are queried. >>> >>> I wanted to polish this topic further to make the attribute >>> subsystem thread-ready, but because other topics depend on this >>> topic and they do not (yet) need it to be thread-ready. >>> >>> As the authors of topics that depend on this seem not in a hurry, >>> let's discard this and dependent topics and restart them some other >>> day. >>> >>> Will discard. >> >> So I just realized this is a big hint for me to pick up that topic; I assumed >> you'd want to tackle the attr subsystem eventually, so all I was doing, was >> waiting for your motivation to look at attr stuff to come back. >> >> So what is the actual lacking stuff here? > > Quite a bit. Do you want a grand vision, or just the minimum that > will hopefully futureproof us? That is a good question. I do want the bare minimum at least; time permitting I can do more. I looked through jc/attr and I think all commits up to 0a5aad (2016-05-25, attr.c: plug small leak in parse_attr_line()) are better off if we'd include them today, no matter if we go with the grand vision or the bare minimum viable, because all commits up to that one are fixing real issues (memory or readability issues) Starting with its child b649c7a50c (attr: rename function and struct related to checking attributes) we start off going on an adventure which may lead to the grand vision implemented, so these should be held off until we have implemented a viable way. > > The current attribute API is optimized for a very narrow use case > where a single thread looks up a single set of attributes for > adjacent paths, without mixing lookups for other attributes of > distant paths. > > Take "conversion" codepath for an example. "git checkout" will > iterate over the paths in the index and is interested in the eol, > text and filter attributes (perhaps more, but the details do not > change the overall picture that much). When it checks dir/fileA, it > is expected that it would next want to check dir/fileB, before it > would want to check dir2/fileC and it would move much later to > otherdir/fileD. It also is expected that it would want to learn the > same set of attributes, simply because the codepath is doing the > same operation over these paths (i.e. learn how the Git "clean" > representation needs to be converted to the external "smudged" > representation). > > Based on the "adjacent paths" assumption, the attribute subsystem > has a single cache that holds the contents of the .gitattributes > files that matter to the current query. This has to be split up if > we ever want to do a parallel checkout with multiple threads. One > thread may be walking the first half of the index, while another one > may be walking the second half of the index. They would benefit > from the same optimization that keeps track of the contents of > .gitattributes files that matters to each of their traversal. If > the first thread is responsible for working on dir/fileA, it will > work on dir/fileB next, before going to dir2/fileC, but it does not > want to share the cache with the other thread that would be scanning > entries far-away from what it's scanning, like otherdir/fileD, if it > evicts the cached information for dir/* that is still useful for the > first thread. > > Another thing we would want to see is to take advantage of the > "lookup is for single set of attributes from a single codepath" for > further optimization. The way each of the callers is structured is > to first declare a set of attributes it is interested in by > preparing git_attr_check_elem[] array and then make repeated calls > to git_attr_check() passing it and a path. > > The current implementation however does not tie the cache to this > git_attr_check_elem[] array but has only one single global cache. > The cache _could_ be used to query any attribute because of it, and > that leads to inefficiency. It has everything it read from relevant > .gitattributes files, even the entries that do not affect any > attributes that a single codepath showed its interest in. I am > hoping that we can do better by having a per > cache of .gitattributes files, so that a caller in one thread (say, > "git checkout" that scans the first half of the index) that asks for > "eol" and "filter" would use a cache that does not have entries > irrelevant for the attributes the caller is intereseted in, and that > is tied to the directory hierarchy the caller is asking about > (i.e. what prepare_attr_stack() does). > > Up to 079186123a ("attr: retire git_check_attrs() API", 2016-05-16) > of the series gives us "struct git_attr_check" to replace the > git_attr_check_elem[] array. I originally hoped that this
Re: [PATCH v2 6/6] git-gui: Update Japanese information
Junio C Hamanowrites: > Pat Thoyts writes: > >> I'm just starting to catch up once again. hopefully I can be >> a bit more reactive than recently. Merging 52285c83 looks fine. I'll >> stick that onto the 0.20.0 head and see what else I can pick up on top. >> There are a few from the git for windows set among others. > > Nice to hear from you again. I think I have a few topics I merged > to my tree bypassing you in the meantime. Let me get back to you > with a list of topic tips to bring your tree in sync with what I > have later. I think the following lists everything that has been done bypassing your tree: 66fe3e061a ("git-gui: l10n: add Portuguese translation", 2016-05-06) 52285c8312 ("git-gui: update Japanese information", 2016-09-07) 2afe6b733e ("git-gui: respect commit.gpgsign again", 2016-09-09) b5f325cb4a ("git-gui: stop using deprecated merge syntax", 2016-09-24) 52285c8312 and 2afe6b733e are already in my 'master'; the other two are already cooking in 'next'. So if you fetch from me and merge the above, you'd be in sync with me (I won't be in sync with you, as you would have more than I have from other places like Git for Windows set). Thanks.
Re: [RFC/PATCH 0/2] place cherry pick line below commit title
Jonathan Tanwrites: > That sounds reasonable to me. Would a patch set to implement this new > trailer block heuristic (in both sequencer and trailer) be reasonable? > And if yes, should trailer know about the "(cherry picked from" > prefix? (I can see it both ways - knowing about the "(cherry picked > from" prefix would make it consistent with sequencer, but it seems > like a detail that it shouldn't know about. Writing > "Cherry-picked-from:" instead probably wouldn't solve our problem > because, for backwards compatibility, we would still need to support > reading the old format.) If we were to go that route, I'd suggest keeping the historical practice supported, exactly because you would need to be prepared to cherry-pick an old commit. It may be necessary for the code to analize the lines in a block identified as "likely to be a trailing block" more carefully, though. The example 59f0aa94 in the message you are responding to has "Link 1:" that consists of 3 physical lines. An instruction to interpret-trailers to add a new one _after_ "Link-$n:" may have to treat these as a single logical line and do the addition after them, i.e. before "Link 2:" line, for example. I also saw Signed-off-by: Some body (some comment that extends to the next line without being indented) Sined-off-by: Some body Else where the only clue that the second line is logically a part of the first one was the balancing of parentheses (or [brakets]). To accomodate real-world use cases, you may have to take into account a lot more than the strict rfc-822 style "line folding".
Re: Reference a submodule branch instead of a commit
Jeremy Mortonwrites: > At the moment, supermodules must reference a given commit in each of > its submodules. If one is in control of a submodule and it changes on > a regular basis, this can cause a lot of overhead with "submodule > updated" commits in the supermodule. It would be useful of git allows > the option of referencing a submodule's branch instead of a given > submodule commit. How about adding this functionality? When somebody downstream fetches from your superproject and grabs the set of submodules, how would s/he know what _exact_ state you meant to record? When s/he says "I have your superproject commit X, which binds submodule's branch Y at path sub/, and it simply does not work. Your project is broken", how do you go about reproducing the exact state s/he had trouble with to help her/him? The only thing s/he knows is that the commit used from the submodule must be one of the commits that was on branch Y at some point in time, hopefully close to the timestamp recorded in the commit in the superproject. And your record in the history of the superproject does not tell you more than that, so you wouldn't have any idea better than what s/he already has to help. Hence, such a "functionality" will never happen, at least in the exact form you are describing. It is conceivable to add some feature that allows you to squelch the report that the submodule recorded in your superproject is not up to date from "git status" etc. to help those who thinks it is OK to not bind the latest submodule commit to the superproject all the time, though.
Reference a submodule branch instead of a commit
At the moment, supermodules must reference a given commit in each of its submodules. If one is in control of a submodule and it changes on a regular basis, this can cause a lot of overhead with "submodule updated" commits in the supermodule. It would be useful of git allows the option of referencing a submodule's branch instead of a given submodule commit. How about adding this functionality? -- Best regards, Jeremy Morton (Jez)
Re: What's cooking in git.git (Sep 2016, #08; Tue, 27)
Stefan Bellerwrites: >> * jc/attr (2016-05-25) 18 commits >> ... >> The attributes API has been updated so that it can later be >> optimized using the knowledge of which attributes are queried. >> >> I wanted to polish this topic further to make the attribute >> subsystem thread-ready, but because other topics depend on this >> topic and they do not (yet) need it to be thread-ready. >> >> As the authors of topics that depend on this seem not in a hurry, >> let's discard this and dependent topics and restart them some other >> day. >> >> Will discard. > > So I just realized this is a big hint for me to pick up that topic; I assumed > you'd want to tackle the attr subsystem eventually, so all I was doing, was > waiting for your motivation to look at attr stuff to come back. > > So what is the actual lacking stuff here? Quite a bit. Do you want a grand vision, or just the minimum that will hopefully futureproof us? The current attribute API is optimized for a very narrow use case where a single thread looks up a single set of attributes for adjacent paths, without mixing lookups for other attributes of distant paths. Take "conversion" codepath for an example. "git checkout" will iterate over the paths in the index and is interested in the eol, text and filter attributes (perhaps more, but the details do not change the overall picture that much). When it checks dir/fileA, it is expected that it would next want to check dir/fileB, before it would want to check dir2/fileC and it would move much later to otherdir/fileD. It also is expected that it would want to learn the same set of attributes, simply because the codepath is doing the same operation over these paths (i.e. learn how the Git "clean" representation needs to be converted to the external "smudged" representation). Based on the "adjacent paths" assumption, the attribute subsystem has a single cache that holds the contents of the .gitattributes files that matter to the current query. This has to be split up if we ever want to do a parallel checkout with multiple threads. One thread may be walking the first half of the index, while another one may be walking the second half of the index. They would benefit from the same optimization that keeps track of the contents of .gitattributes files that matters to each of their traversal. If the first thread is responsible for working on dir/fileA, it will work on dir/fileB next, before going to dir2/fileC, but it does not want to share the cache with the other thread that would be scanning entries far-away from what it's scanning, like otherdir/fileD, if it evicts the cached information for dir/* that is still useful for the first thread. Another thing we would want to see is to take advantage of the "lookup is for single set of attributes from a single codepath" for further optimization. The way each of the callers is structured is to first declare a set of attributes it is interested in by preparing git_attr_check_elem[] array and then make repeated calls to git_attr_check() passing it and a path. The current implementation however does not tie the cache to this git_attr_check_elem[] array but has only one single global cache. The cache _could_ be used to query any attribute because of it, and that leads to inefficiency. It has everything it read from relevant .gitattributes files, even the entries that do not affect any attributes that a single codepath showed its interest in. I am hoping that we can do better by having a per cache of .gitattributes files, so that a caller in one thread (say, "git checkout" that scans the first half of the index) that asks for "eol" and "filter" would use a cache that does not have entries irrelevant for the attributes the caller is intereseted in, and that is tied to the directory hierarchy the caller is asking about (i.e. what prepare_attr_stack() does). Up to 079186123a ("attr: retire git_check_attrs() API", 2016-05-16) of the series gives us "struct git_attr_check" to replace the git_attr_check_elem[] array. I originally hoped that this struct can hold the per-callsite cache itself, before we hit the threading issue too early (IIRC, that was preload-index code) and realize that the cache needs to be not just per callsite but needs to be per . This new structure cannot be used to store the cache itself, but this change is probably a necessary first step for allowing the API in multi-threaded context. git_attr_check_elem[] array was static and had slots to receive returned values, which would not have worked in threaded environment. We'd further need to change it so that the inquiry keys (which are "interned" git_attr objects) of "git_attr_check" are initialized just once before starting to make repeated calls to git_attr_check(), but the mechanism to return values would be thread-safe. git_check_attr() call may have to gain an extra/separate variable for the caller to specify an
Re: [PATCH 1/3] add QSORT
On 01/10/2016 19:19, René Scharfe wrote: It's hard to imagine an implementation of qsort(3) that can't handle zero elements. QSORT's safety feature is that it prevents the compiler from removing NULL checks for the array pointer. E.g. the last two lines in the following example could be optimized away: qsort(ptr, n, sizeof(*ptr), fn); if (!ptr) do_stuff(); You can see that on https://godbolt.org/g/JwS99b -- an awesome website for exploring compilation results for small snippets, by the way. This optimization is dangerous when combined with the convention of using a NULL pointer for empty arrays. Diagnosing an affected NULL check is probably quite hard -- it's right there in the code after all and not all compilers remove it. Hang on, hang on. This is either a compiler bug, or you're wrong on your assumption about the specification of qsort. Either way, the extra layer of indirection is not proper protection. The unwanted compiler optimisation you're inadvertently triggering could still be triggered through the inline. Now, looking at the C standard, this isn't actually clear to me. The standard says that if you call qsort with nmemb zero, the pointer still has to be "valid". Not totally clear to me if NULL is valid, by their definition in C99 7.1.4. Googling hasn't given me a concrete answer. The compiler seems to think that NULL wouldn't be valid, so because you've called qsort on it, you've invoked undefined behaviour if it's NULL, so it's free to elide the NULL check. Kevin
Re: [RFC/PATCH 0/2] place cherry pick line below commit title
On 09/30/2016 01:49 PM, Junio C Hamano wrote: Junio C Hamanowrites: Jonathan Tan writes: I vaguely recall that there were some discussion on the definition of "what's a trailer line" with folks from the kernel land, perhaps while discussing the interpret-trailers topic. IIRC, when somebody passes an improved version along, the resulting message's trailer block may look like this: Signed-off-by: Original Author [fixed typo in the variable names] Signed-off-by: Somebhody Else and an obvious "wish" of theirs was to treat not just RFC2822-like "a line that begins with token followed by a colon" but also these short comments as part of the trailer block. Your original wish in [*1*] is to also treat "a line that begin with a whitespace that follows a line that begins with token followed by a colon" as part of the trailer block and I personally think that is a reasonable thing to wish for, too. If we allowed arbitrary lines in the trailer block, this would solve my original problem, yes. Here is an experiment I ran during my lunch break. The script (attached) is meant to run in the kernel repository and for each log messages of each non-merge commit: * find its last paragraph, where the definition of paragraph is simply "a blank/empty line"; * inspect if there is at least one RFC2822-header-looking line, or a line that begins with "(cherry picked from"; * dump the ones that do not pass the above criteria. My cursory look of the output did not spot a legitimate trailer block that we should have identified. The output lines shown were ones that are not signed off at all (e.g. af8c34ce6ae32add that says "Linux 4.7-rc2"), ones that has three-dash line "---" in them (e.g. 133d558216d9), ones that has diffstat that should have been after "---" (e.g. 259307074bfcf1f). The story is the same if you run it in git.git; the "do we have at least one rfc2822-header-looking line or '(cherry picked from' line in the last paragraph? if so, then that is an existing trailer block" seems to be a good heuristics to cover many cases like these: d0196c8d5d3057c5c21a82f3d0113ca8e501033b Signed-off-by: Arnd Bergmann [tomi.valkei...@ti.com: resolved conflicts] Signed-off-by: Tomi Valkeinen 59f0aa9480cfef9173a648cec4537addc5f3ad94 Link 1: https://bugzilla.kernel.org/show_bug.cgi?id=9916 http://bugzilla.kernel.org/show_bug.cgi?id=10100 https://lkml.org/lkml/2008/2/25/282 Link 2: https://bugzilla.kernel.org/show_bug.cgi?id=9399 https://bugzilla.kernel.org/show_bug.cgi?id=12461 https://bugzilla.kernel.org/show_bug.cgi?id=11880 Link 3: https://bugzilla.kernel.org/show_bug.cgi?id=11884 https://bugzilla.kernel.org/show_bug.cgi?id=14081 https://bugzilla.kernel.org/show_bug.cgi?id=14086 https://bugzilla.kernel.org/show_bug.cgi?id=14446 Link 4: https://bugzilla.kernel.org/show_bug.cgi?id=112911 Signed-off-by: Lv Zheng Tested-by: Chris Bainbridge Signed-off-by: Rafael J. Wysocki That sounds reasonable to me. Would a patch set to implement this new trailer block heuristic (in both sequencer and trailer) be reasonable? And if yes, should trailer know about the "(cherry picked from" prefix? (I can see it both ways - knowing about the "(cherry picked from" prefix would make it consistent with sequencer, but it seems like a detail that it shouldn't know about. Writing "Cherry-picked-from:" instead probably wouldn't solve our problem because, for backwards compatibility, we would still need to support reading the old format.)
Re: [PATCH v8 00/11] Git filter protocol
> On 03 Oct 2016, at 19:02, Junio C Hamanowrote: > > Lars Schneider writes: > >>> If the filter process refuses to die forever when Git told it to >>> shutdown (by closing the pipe to it, for example), that filter >>> process is simply buggy. I think we want users to become aware of >>> that, instead of Git leaving it behind, which essentially is to >>> sweep the problem under the rug. >>> >>> I agree with what Peff said elsewhere in the thread; if a filter >>> process wants to take time to clean things up while letting Git >>> proceed, it can do its own process management, but I think it is >>> sensible for Git to wait the filter process it directly spawned. >> >> To realize the approach above I prototyped the run-command patch below: >> >> I added an "exit_timeout" variable to the "child_process" struct. >> On exit, Git will close the pipe to the process and wait "exit_timeout" >> seconds until it kills the child process. If "exit_timeout" is negative >> then Git will wait until the process is done. > >> If we use that in the long running filter process, then we could make >> the timeout even configurable. E.g. with "filter..process-timeout". >> >> What do you think about this solution? > > Is such a configuration (or timeout in general) necessary? I > suspect that a need for timeout, especially needing timeout and > needing to get killed that happens so often to require a > configuration variable, is a sign of something else seriously wrong. > > What's the justification for a filter to _require_ getting killed > all the time when it is spawned? Otherwise you wouldn't configure > "this driver does not die when told, so we need a timeout" variable. > Is it a sign of the flaw in the protocol to talk to it? e.g. Git > has a way to tell it to die, but it somehow is very hard to hear > from filter's end and honor that request? > > I think that we would need some timeout in the mechanism, but not to > be used for "killing". > > You would decide to "kill" an filter process in two cases: the > filter is buggy and refuses to die when Git tells it to exit, or the > code in Git waiting for its death is somehow miscounting its > children, and thought it told to die one process but in fact it > didn't (perhaps it told somebody else to die), or it thought it > hasn't seen the child die when in fact it already did. Agreed. > Calling kill(2) and exiting would hide these two kind of bugs from > end users. Not doing so would give the end users a hung Git, which > is a VERY GOOD thing. Otherwise you would not notice bugs and lose > the opportunity to diagnose and fix it. Aha. I assumed that a hung Git because of a buggy filter would be a no-no. Thanks for this clarification. > The timeout would be good for you to give a message "filter process > running the script '%s' is not exiting; I am waiting for it". The > user is still left with a hung Git, and can then see if that process > is hanging around. If it is, then we found a buggy filter. Or we > found a buggy Git. Either needs to be fixed. I do not think it > would help anybody by doing a kill(2) to sweep possible bugs under > the rug. I could achieve that with this run-command patch: http://public-inbox.org/git/e9946e9f-6ee5-492b-b122-9078ceb88...@gmail.com/ (I'll remove the "timeout after x seconds" parts and keep the "wait until done" part with stderr output) Thanks, Lars
Re: [PATCH 1/3] add QSORT
On 01/10/2016 19:19, René Scharfe wrote: It's hard to imagine an implementation of qsort(3) that can't handle zero elements. QSORT's safety feature is that it prevents the compiler from removing NULL checks for the array pointer. E.g. the last two lines in the following example could be optimized away: qsort(ptr, n, sizeof(*ptr), fn); if (!ptr) do_stuff(); You can see that on https://godbolt.org/g/JwS99b -- an awesome website for exploring compilation results for small snippets, by the way. Ah, second attempt. Originally misread the original code, and didn't understand what it was doing. I get it now. A nasty trap I hadn't been aware of - I was under the impression NULL + zero length was generally legal, but the C standard does indeed not give you a specific out for NULL to library functions in that case. As such, NULL checks can still be elided even with your change. If you effectively change your example to: if (nmemb > 1) qsort(array, nmemb, size, cmp); if (!array) printf("array is NULL\n"); array may only be checked for NULL if nmemb <= 1. You can see GCC doing that in the compiler explorer - it effectively turns that into "else if". To make that check really work, you have to do: if (array) qsort(array, nmemb, size, cmp); else printf("array is NULL\n"); So maybe your "sane_qsort" should be checking array, not nmemb. Kevin
[PATCH] http: http.emptyauth should allow empty (not just NULL) usernames
When using kerberos authentication, one URL pattern which is allowed is http://@gitserver.example.com. This leads to a username of zero-length, rather than a NULL username. But the two cases should be treated the same by http.emptyauth. Signed-off-by: David Turner--- http.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/http.c b/http.c index 82ed542..bd0dba2 100644 --- a/http.c +++ b/http.c @@ -351,7 +351,7 @@ static int http_options(const char *var, const char *value, void *cb) static void init_curl_http_auth(CURL *result) { - if (!http_auth.username) { + if (!http_auth.username || !*http_auth.username) { if (curl_empty_auth) curl_easy_setopt(result, CURLOPT_USERPWD, ":"); return; -- 2.8.0.rc4.22.g8ae061a
Re: [RFC PATCH] clone: add clone.recursesubmodules config option
On Mon, Oct 3, 2016 at 8:36 AM, Jeremy Mortonwrote: > Did this ever get anywhere? Can we recursively update submodules with "git > pull" in the supermodule now? I think the idea is sound. >> diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh >> index 7ca10b8..fc2c189 100755 >> --- a/t/t7407-submodule-foreach.sh >> +++ b/t/t7407-submodule-foreach.sh Not sure if t7407-submodule-foreach.sh is the best place to put these tests, as it is not `submodule foreach`, maybe put it into 7400 (though that is larger already) >> +test_expect_success 'use "git clone" with clone.recursesubmodules to >> checkout all submodules' ' >> + git config --local clone.recursesubmodules true&& Nit of the day: I think we prefer a single white space between the line and the ending &&. No need for --local as that is the default. However I'd propose to use test_config here, as then the option is cleaned up after the test automatically. >> + git clone super clone7&& >> + ( >> + cd clone7&& >> + git rev-parse --resolve-git-dir .git&& >> + git rev-parse --resolve-git-dir sub1/.git&& >> + git rev-parse --resolve-git-dir sub2/.git&& >> + git rev-parse --resolve-git-dir sub3/.git&& >> + git rev-parse --resolve-git-dir nested1/.git&& >> + git rev-parse --resolve-git-dir nested1/nested2/.git&& >> + git rev-parse --resolve-git-dir >> nested1/nested2/nested3/.git&& >> + git rev-parse --resolve-git-dir >> nested1/nested2/nested3/submodule/.git >> + )&& >> + git config --local --unset clone.recursesubmodules No need to unset it here when test_config is used. We'd maybe would want to also test that git -c clone.recursesubmodules clone --no-recursive ... works as expected (the --no-recursive taking precedence over the config option)
Re: [PATCH v8 00/11] Git filter protocol
> On 01 Oct 2016, at 22:48, Jakub Narębskiwrote: > > W dniu 01.10.2016 o 20:59, Lars Schneider pisze: >> On 29 Sep 2016, at 23:27, Junio C Hamano wrote: >>> Lars Schneider writes: >>> >>> If the filter process refuses to die forever when Git told it to >>> shutdown (by closing the pipe to it, for example), that filter >>> process is simply buggy. I think we want users to become aware of >>> that, instead of Git leaving it behind, which essentially is to >>> sweep the problem under the rug. > > Well, it would be good to tell users _why_ Git is hanging, see below. Agreed. Do you think it is OK to write the message to stderr? >>> I agree with what Peff said elsewhere in the thread; if a filter >>> process wants to take time to clean things up while letting Git >>> proceed, it can do its own process management, but I think it is >>> sensible for Git to wait the filter process it directly spawned. >> >> To realize the approach above I prototyped the run-command patch below: >> >> I added an "exit_timeout" variable to the "child_process" struct. >> On exit, Git will close the pipe to the process and wait "exit_timeout" >> seconds until it kills the child process. If "exit_timeout" is negative >> then Git will wait until the process is done. > > That might be good approach. Probably the default would be to wait. I think I would prefer a 2sec timeout or something as default. This way we can ensure Git would not wait indefinitely for a buggy filter by default. >> If we use that in the long running filter process, then we could make >> the timeout even configurable. E.g. with "filter..process-timeout". > > Sidenote: we prefer camelCase rather than kebab-case for config > variables, that is, "filter..processTimeout". Sure! > Also, how would one set default value of timeout for all process > based filters? I think we don't need that because a timeout is always specific to a filter (if the 2sec default is not sufficient). >> >> +while ((waitpid(p->pid, , 0)) < 0 && errno == >> EINTR) >> +; /* nothing */ > > Ah, this loop is here because waiting on waitpid() can be interrupted > by the delivery of a signal to the calling process; though the result > is -1, not just any < 0. "< 0" is also used in wait_or_whine() >> +while (getpgid(p->pid) >= 0 && tv.tv_sec - secs < >> p->timeout) { >> +gettimeofday(, NULL); >> +} > > WTF? Busy wait loop??? This was just a quick prototype to show "my thinking direction". I wasn't expecting a review. Sorry :-) > Also, if we want to wait for child without blocking, then instead > of cryptic getpgid(p->pid) maybe use waitpid(p->pid, , WNOHANG); > it is more explicit. Sure! > There is also another complication: there can be more than one > long-running filter driver used. With this implementation we > wait for each of one in sequence, e.g. 10s + 10s + 10s. Good idea, I fixed that in the version below! >> >> -static void mark_child_for_cleanup(pid_t pid) >> +static void mark_child_for_cleanup(pid_t pid, int timeout, int stdin) > > H... three parameters is not too much, but we might want to > pass "struct child_process *" directly if this number grows. I used parameters because this function is also used with the async struct... I've attached a more serious patch for review below. What do you think? Thanks, Lars diff --git a/run-command.c b/run-command.c index 3269362..ca0feef 100644 --- a/run-command.c +++ b/run-command.c @@ -21,6 +21,9 @@ void child_process_clear(struct child_process *child) struct child_to_clean { pid_t pid; + char *name; + int stdin; + int timeout; struct child_to_clean *next; }; static struct child_to_clean *children_to_clean; @@ -28,12 +31,53 @@ static int installed_child_cleanup_handler; static void cleanup_children(int sig, int in_signal) { + int status; + struct timeval tv; + time_t secs; + struct child_to_clean *p = children_to_clean; + + // Send EOF to children as indicator that Git will exit soon + while (p) { + if (p->timeout != 0) { + if (p->stdin > 0) + close(p->stdin); + } + p = p->next; + } + while (children_to_clean) { - struct child_to_clean *p = children_to_clean; + p = children_to_clean; children_to_clean = p->next; + + if (p->timeout != 0) { + fprintf(stderr, _("Waiting for '%s' to finish..."), p->name); + if (p->timeout < 0) { + // No timeout given - wait indefinitely + while ((waitpid(p->pid, , 0)) < 0 && errno == EINTR) + ;
Re: [PATCH] diff_unique_abbrev(): document its assumtion and limitation
Jeff Kingwrites: > I guess my point was that the poor name may have contributed to the need > to explain it. The comment was not about "it may not be obvious but this tries to pad and align", but to say "the way this tries to pad and align is based on an unsaid assumption that leads to this limitation". I do agree it is a good idea to rename it to a name that has 'pad' or 'align' in addition to 'unique', but I doubt renaming alone would reduce the need for the new comment.
Re: [PATCH v8 00/11] Git filter protocol
Lars Schneiderwrites: >> If the filter process refuses to die forever when Git told it to >> shutdown (by closing the pipe to it, for example), that filter >> process is simply buggy. I think we want users to become aware of >> that, instead of Git leaving it behind, which essentially is to >> sweep the problem under the rug. >> >> I agree with what Peff said elsewhere in the thread; if a filter >> process wants to take time to clean things up while letting Git >> proceed, it can do its own process management, but I think it is >> sensible for Git to wait the filter process it directly spawned. > > To realize the approach above I prototyped the run-command patch below: > > I added an "exit_timeout" variable to the "child_process" struct. > On exit, Git will close the pipe to the process and wait "exit_timeout" > seconds until it kills the child process. If "exit_timeout" is negative > then Git will wait until the process is done. > If we use that in the long running filter process, then we could make > the timeout even configurable. E.g. with "filter..process-timeout". > > What do you think about this solution? Is such a configuration (or timeout in general) necessary? I suspect that a need for timeout, especially needing timeout and needing to get killed that happens so often to require a configuration variable, is a sign of something else seriously wrong. What's the justification for a filter to _require_ getting killed all the time when it is spawned? Otherwise you wouldn't configure "this driver does not die when told, so we need a timeout" variable. Is it a sign of the flaw in the protocol to talk to it? e.g. Git has a way to tell it to die, but it somehow is very hard to hear from filter's end and honor that request? I think that we would need some timeout in the mechanism, but not to be used for "killing". You would decide to "kill" an filter process in two cases: the filter is buggy and refuses to die when Git tells it to exit, or the code in Git waiting for its death is somehow miscounting its children, and thought it told to die one process but in fact it didn't (perhaps it told somebody else to die), or it thought it hasn't seen the child die when in fact it already did. Calling kill(2) and exiting would hide these two kind of bugs from end users. Not doing so would give the end users a hung Git, which is a VERY GOOD thing. Otherwise you would not notice bugs and lose the opportunity to diagnose and fix it. The timeout would be good for you to give a message "filter process running the script '%s' is not exiting; I am waiting for it". The user is still left with a hung Git, and can then see if that process is hanging around. If it is, then we found a buggy filter. Or we found a buggy Git. Either needs to be fixed. I do not think it would help anybody by doing a kill(2) to sweep possible bugs under the rug. Thanks.
Re: [Q] would it be bad to make /etc/gitconfig runtime configurable?
Jeff Kingwrites: > I admit both of those are uses for git _developers_, though, not git > _users_. Yes, this is meant for developers and not users. The initial question probably should have stated more explicitly, e.g. "I am wondering if it would be helpful to developers if we add this thing; does anybody think of a reason why exposing it to end users is a bad idea?"
Re: [PATCH v2 6/6] git-gui: Update Japanese information
Pat Thoytswrites: > I'm just starting to catch up once again. hopefully I can be > a bit more reactive than recently. Merging 52285c83 looks fine. I'll > stick that onto the 0.20.0 head and see what else I can pick up on top. > There are a few from the git for windows set among others. Nice to hear from you again. I think I have a few topics I merged to my tree bypassing you in the meantime. Let me get back to you with a list of topic tips to bring your tree in sync with what I have later.
Re: [RFC PATCH] clone: add clone.recursesubmodules config option
Did this ever get anywhere? Can we recursively update submodules with "git pull" in the supermodule now? -- Best regards, Jeremy Morton (Jez) On 04/06/2014 10:30, Chris Packham wrote: Add a config option that will cause clone to recurse into submodules as if the --recurse-submodules option had been specified on the command line. This can be overridden with the --no-recurse-submodules option. Signed-off-by: Chris Packham--- On 04/06/14 09:05, Junio C Hamano wrote: Mara Kim writes: Apologies if this question has been asked already, but what is the reasoning behind making git clone not recursive (--recursive) by default? The primary reason why submodules are separate repositories is not to require people to have everything. Some people want recursive, some others don't, and the world is not always "majority wins" (not that I am saying that majority will want recursive). Inertia, aka backward compatibility and not surprising existing users, plays some role when deciding the default. Also, going --recursive when the user did not want is a lot more expensive mistake to fix than not being --recursive when the user wanted to. Having said all that, I do not mean to say that I am opposed to introduce some mechanism to let the users express their preference between recursive and non-recursive better, so that "git clone" without an explicit --recursive (or --no-recursive) can work to their taste. A configuration in $HOME/.gitconfig might be a place to start, even though that has the downside of assuming that the given user would want to use the same settings for all his projects, which may not be the case in practice. And here's a quick proof of concept. Not sure about the config variable name and it could probably do with a negative test as well. builtin/clone.c | 9 + t/t7407-submodule-foreach.sh | 17 + 2 files changed, 26 insertions(+) diff --git a/builtin/clone.c b/builtin/clone.c index b12989d..92aea81 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -734,6 +734,14 @@ static void write_refspec_config(const char* src_ref_prefix, strbuf_release(); } +static int git_clone_config(const char *key, const char *value, void *data) +{ + if (!strcmp(key, "clone.recursesubmodules")) + option_recursive = git_config_bool(key, value); + + return 0; +} + int cmd_clone(int argc, const char **argv, const char *prefix) { int is_bundle = 0, is_local; @@ -759,6 +767,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) junk_pid = getpid(); packet_trace_identity("clone"); + git_config(git_clone_config, NULL); argc = parse_options(argc, argv, prefix, builtin_clone_options, builtin_clone_usage, 0); diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh index 7ca10b8..fc2c189 100755 --- a/t/t7407-submodule-foreach.sh +++ b/t/t7407-submodule-foreach.sh @@ -307,6 +307,23 @@ test_expect_success 'use "update --recursive nested1" to checkout all submodules ) ' +test_expect_success 'use "git clone" with clone.recursesubmodules to checkout all submodules' ' + git config --local clone.recursesubmodules true&& + git clone super clone7&& + ( + cd clone7&& + git rev-parse --resolve-git-dir .git&& + git rev-parse --resolve-git-dir sub1/.git&& + git rev-parse --resolve-git-dir sub2/.git&& + git rev-parse --resolve-git-dir sub3/.git&& + git rev-parse --resolve-git-dir nested1/.git&& + git rev-parse --resolve-git-dir nested1/nested2/.git&& + git rev-parse --resolve-git-dir nested1/nested2/nested3/.git&& + git rev-parse --resolve-git-dir nested1/nested2/nested3/submodule/.git + )&& + git config --local --unset clone.recursesubmodules +' + test_expect_success 'command passed to foreach retains notion of stdin' ' ( cd super&&
Re: [RFC/PATCH 0/2] place cherry pick line below commit title
Jonathan Tanwrites: > On 09/30/2016 12:34 PM, Junio C Hamano wrote: >>> 2) The Linux kernel's repository has some "commit ... upstream." lines >>> in this position (below the commit title) - for example, in commit >>> dacc0987fd2e. >> >> "A group of people seem to prefer it there" does not lead to >> "therefore let's move it there for everybody". It does open a >> possibility that we may want to add a new option to put it there, >> but does not justify changing what existing "-x" option does. > > To clarify, my patch adds the new option you described (to place it > below the title instead of at the bottom of the commit message). The > default is still the current behavior. Ah, sorry, I missed that. No objection from me on this point then. Thanks.
Re: Feature Request: user defined suffix for temp files created by git-mergetool
Hi Josef, On Mon, Oct 3, 2016 at 8:36 AM, Josef Ridkywrote: > In several projects, we are using git mergetool for comparing files from > different folders. > Unfortunately, when we have opened three files for comparing using meld tool > (e.q. Old_version -- Result -- New_version), > we can see only name of temporary files created by mergetool in the labels > (e.g. foo_REMOTE -- foo_BASE -- foo_LOCAL) > and users (and sometime even we) are confused, which of the files should they > edit and save. `git mergetool` just creates temporary files (with some temporary names) and calls `meld` (or `vimdiff`, etc) with the file names as parameters. So why wouldn't you call `meld` with the file names you want? -- Mit freundlichen Grüßen, Anatoly Borodin
Re: [PATCH v3 0/5] Add --format to tag verification
Hi, Junio. > I however notice that there is no new tests to protect these two new > features from future breakages. Perhaps you want to add some in > [6/5]? I'll be working on this. I spent some time looking around for example tests for format. Are there any that I should pay special attention to? (I'm looking at t7004 mostly right now). Thanks! -Santiago. signature.asc Description: PGP signature
Re: [PATCH v2 05/11] i18n: add--interactive: mark message for translation
A Sáb, 01-10-2016 às 19:09 +0200, Jakub Narębski escreveu: > W dniu 26.09.2016 o 01:09, Junio C Hamano pisze: > > Vasco Almeidawrites: > > > >> -print colored $prompt_color, > $patch_mode_flavour{VERB}, > >> - ($hunk[$ix]{TYPE} eq 'mode' ? ' mode change' : > >> - $hunk[$ix]{TYPE} eq 'deletion' ? ' deletion' : > >> - ' this hunk'), > >> - $patch_mode_flavour{TARGET}, > >> - " [y,n,q,a,d,/$other,?]? "; > > > > I hate to say this but expanding this single-liner into if/elsif/ > > cascade of uncountable number of arms is simply a disaster. > > Even if we turn this "single"-liner composition of sentence into > interpolation (allowing for reordering of parts in translation), > like > > print colored $prompt_color, __x("{verb} {noun}{maybe_target} > [y,n,q,a,d,/{other},?]? ", > verb => $patch_mode_flavour{VERB}, noun => > $patch_mode_noun{$hunk[$ix]{TYPE}}, > maybe_target => $patch_mode_flavour{TARGET} || "", other => > $other); > > This would of course require N__() on values of hash, somewhere. > > the problem is that the ordering may need to change depending on > verb: "Stage", "Stash", "Unstage", "Apply", "Discard", and/or noun: > "mode change", "deletion", "this hunk", and/or presence and value > of maybe_target: " to index", " from worktree", " from index and > worktree", > " to index and worktree". So it does not work, unfortunately. The plus side is it would be very concise compared to laying every combination as entire sentences. However, if it worked, I think it would be a bit difficult to translate and translators would be prone to commit some mistake somewhere. It may be harder to translate a sentence by its bits than translate it as a whole.
Re: [PATCH v2 04/11] i18n: add--interactive: mark plural strings
A Sáb, 01-10-2016 às 18:49 +0200, Jakub Narębski escreveu: > W dniu 26.09.2016 o 20:15, Vasco Almeida pisze: > > > > A Qua, 31-08-2016 às 12:31 +, Vasco Almeida escreveu: > > > > > > > > > Mark plural strings for translation. Unfold each action case in > > > one > > > entire sentence. > > > > > > Pass new keyword for xgettext to extract. > > > > > > Update test to include new subrotine Q__() for plural strings > > > handling. > > Why use Q__() as the name of the subroutine? [looks further]. Oh, I > see > that you are following the example of C shortcut functions (_, Q_ and > N_). > > But this is Perl, not C. The standard shortcut functions are those > defined in Locale::TextDomain, even if we can't use this module > directly. > Those that deal with plural strings handling are __n and __nx / __xn. > > The Perl equivalent of Q_ shorthand function in C, C++, etc. is __n. > There is also a function __nx for combining handling plural strings > together with variable interpolation. I was trying to follow the same style of C. I will change the name Q__ to __x for the sake of conformance. > > > diff --git a/git-add--interactive.perl b/git-add > > > --interactive.perl > > > index 4e1e857..08badfa 100755 > > > --- a/git-add--interactive.perl > > > +++ b/git-add--interactive.perl > > > @@ -666,12 +666,18 @@ sub status_cmd { > > > sub say_n_paths { > > > my $did = shift @_; > > > my $cnt = scalar @_; > > > - print "$did "; > > > - if (1 < $cnt) { > > > - print "$cnt paths\n"; > > > - } > > > - else { > > > - print "one path\n"; > > > + if ($did eq 'added') { > > > + printf(Q__("added one path\n", "added %d > > > paths\n", > > > + $cnt), $cnt); > > > + } elsif ($did eq 'updated') { > > > + printf(Q__("updated one path\n", "updated %d > > > paths\n", > > > + $cnt), $cnt); > > > + } elsif ($did eq 'reverted') { > > > + printf(Q__("reverted one path\n", "reverted %d > > > paths\n", > > > + $cnt), $cnt); > > > + } else { > > > + printf(Q__("touched one path\n", "touched %d > > > paths\n", > > > + $cnt), $cnt); > > > } > > > } > > One one hand side, it is recommended to avoid lego-like construction > of sentences. > > Translatable strings should be entire sentences. It is often not > possible to translate single verbs or adjectives in a substitutable > way. > > I think however that the action part ($did in original non-i18n code) > is whole part in any language, so something like the following would > be enough: > > # this hash is as much for validation, as for translation > my %actions = map { $_ => 1 } (N__"added", N__"updated", > N__"reverted"); > if (exists $actions{$did}) { > print __nx("{did} one path\n", "{did} {count} > paths\n", $cnt, > did => __($did), count => $cnt); > } else { > print __nx("touched one path\n", "touched {count} > paths\n", $cnt, > count => $cnt); > } > > Please correct me if I am wrong, and you know language where > "added %d paths", "updated %d paths", "reverted %d paths" etc. must > have > different word order. We may never know. :-) I prefer not to make assumptions about other languages and I think there is not much to gain from using this approach instead of marking entire sentence as the patch does. I mean the code verbosity is almost the same but maybe it gets harder to translate. Other thing, we want to avoid marking for translation single words (when that is avoidable) because those could appear on other sites that need a different translation according to the context. For example: 'commit' could be a verb or a noun from the context. > > When $cnt is 1 I get the following warning: > > Redundant argument in printf at .../libexec/git-core/git-add > > --interactive line 680. > > I wonder what is the case of C code - is similar warning here, or is > gettext smarter in that case... I do not know but I know there is a few cases like this in C code. > > The singular form does not have a %d to consume $cnt argument to > > printf(). Either we find a way to suppress that warning or we > > change > > the singular form to contain %d. > > Anyway, with __nx there should be no such problem. > > > > > > > > > > > @@ -1508,8 +1514,10 @@ sub patch_update_file { > > > ... > > > - print colored > > > $header_color, "Split into ", > > > - scalar(@split), " > > > hunks.\n"; > > > + print colored > > > $header_color, sprintf( > > > + Q__("Split into > > > %d hunk.\n", > > > + "Split into > > > %d hunks.\n", > > > + scalar(@spli > > > t)), scalar(@split)); > > > > Like we do with this. > > Note that it is a bit of
Re: [PATCH v2 02/11] i18n: add--interactive: mark simple here documents for translation
A Qui, 29-09-2016 às 23:27 +0200, Jakub Narębski escreveu: > W dniu 29.09.2016 o 19:05, Junio C Hamano pisze: > > > > Vasco Almeidawrites: > > > > > > > > On the other hand, would it make sense to translate these > > > commands? If > > > so, we would mark for translation the commands name of @cmd in > > > main_loop(). > > > > > > sub main_loop { > > > - my @cmd = ([ 'status', \_cmd, ], > > > - [ 'update', \_cmd, ], > > > - [ 'revert', \_cmd, ], > > > - [ 'add untracked', \_untracked_cmd, ], > > > - [ 'patch', \_update_cmd, ], > > > - [ 'diff', \_cmd, ], > > > - [ 'quit', \_cmd, ], > > > - [ 'help', \_cmd, ], > > > + my @cmd = ([ __('status'), \_cmd, ], > > > + [ __('update'), \_cmd, ], > > > + [ __('revert'), \_cmd, ], > > > + [ __('add untracked'), \_untracked_cmd, ], > > > + [ __('patch'), \_update_cmd, ], > > > + [ __('diff'), \_cmd, ], > > > + [ __('quit'), \_cmd, ], > > > + [ __('help'), \_cmd, ], > > > > I don't know offhand. If the code to prompt and accept the command > > given by the user can take the translated word (or a prefix of it), > > theoretically I would say it could be made to work, but to me it is > > dubious the benefit outweighs its downsides. It would make > > teaching > > Git and troubleshooting over the phone harder, I would guess. > > > > A: "Hi, I am in a 'git add -i' session." > > B: "Give 's' at the prompt." > > A: "My Git does not seem to take 's' as a valid command." > > B: "What? I've never seen that problem." > > ... back and forth wastes 10 minutes ... > > A: "By the way, I am running Git in Portuguese." > > Also, for one-letter commands to work (there is setting where you > don't even need to press enter, IIRC) all those translations would > have to be chosen to begin with different letter, isn't it? I choose not do mark those command names for translation in the next re-roll since there is no obvious gain from it.
Re: [PATCH v2 03/11] i18n: add--interactive: mark strings with interpolation for translation
> W dniu 31.08.2016 o 14:31, Vasco Almeida pisze: > > Use of sprintf following die or error_msg is necessary for > > placeholder > > substitution take place. > > No, it is not. Though I don't think that we have in out Git::I18N > the support for Perl i18n placeholder substitution. I will try to change the commit message to better reflect the reality. > From gettext manual: > https://www.gnu.org/software/gettext/manual/gettext.html#perl_002dfor > mat > > 15.3.16 Perl Format Strings > > There are two kinds format strings in Perl: those acceptable to the > Perl > built-in function printf, labelled as ‘perl-format’, and those > acceptable > to the libintl-perl function __x, labelled as ‘perl-brace-format’. > > Perl printf format strings are described in the sprintf section of > ‘man perlfunc’. > > Perl brace format strings are described in the > Locale::TextDomain(3pm) > manual page of the CPAN package libintl-perl. In brief, Perl format > uses > placeholders put between braces (‘{’ and ‘}’). The placeholder must > have > the syntax of simple identifiers. > > Git doesn't use Locale::TextDomain, from what I understand, to > provide > fallback in no-gettext case. Also, Locale::TextDomain is not in > core. Yes that can be a reason not to use Locale::TextDomain. When Ævar Arnfjörð Bjarmason added gettext support and i18n stuff, he chose no to use TextDomain because it did more than he wanted it to do, and that could introduce bugs and unnecessary work. 5e9637c ("i18n: add infrastructure for translating Git with gettext", 2011-11-18) https://public-inbox.org/git/AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LW n...@mail.gmail.com/ > > diff --git a/git-add--interactive.perl b/git-add--interactive.perl > > index e11a33d..4e1e857 100755 > > --- a/git-add--interactive.perl > > +++ b/git-add--interactive.perl > > @@ -612,12 +612,12 @@ sub list_and_choose { > > else { > > $bottom = $top = find_unique($choice, @stuff); > > if (!defined $bottom) { > > - error_msg "Huh ($choice)?\n"; > > + error_msg sprintf(__("Huh (%s)?\n"), > > $choice); > > So this would be, self explained without need of comment > for translators: > > + error_msg __x ("Huh ({choice})?\n"), > choice => $choice); > > > > next TOPLOOP; > > } > > Though this is probably more work that you wanted to do. > The __x might be defined like this (borrowing from Locale::TextDomain), > which needs to be put into perl/Git/I18N.pm > > sub __ ($); > sub __expand ($%); > > # With interpolation. > sub __x ($@) > { > my ($msgid, %vars) = @_; > > return __expand (__($msgid), %vars); > } > > sub __expand ($%) > { > my ($translation, %args) = @_; > > my $re = join '|', map { quotemeta $_ } keys %args; > $translation =~ s/\{($re)\}/defined $args{$1} ? $args{$1} : "{$1}"/ge; > > return $translation; > } I wonder if it is worth the trouble to add and use these functions, when there is already a way that works and for me looks simpler. One reason, if valid, would be that translators already translate strings with %d and %s from C source which is where the majority of the English text comes from. Thus it would make little difference for them. If we use in perl string like in C there is a chance that there will be a match of some string and would lead to just one msgid instead of two in the git.pot template for translation. Actually this happens for the string with "Huh (%s)?" in clean.c. Unfortunately, I do not know if I would add these changes because I know little about perl and hence I am not comfortable to do so. Maybe if you see it is indeed worth adding these to Git I18N.pm, you could send a follow-up patch or a replacement for this one.
Re: [Q] would it be bad to make /etc/gitconfig runtime configurable?
On Mon, Oct 03, 2016 at 01:06:10PM +0200, Johannes Schindelin wrote: > Hi, > > On Tue, 27 Sep 2016, Jeff King wrote: > > > On Tue, Sep 27, 2016 at 10:05:37AM -0700, Junio C Hamano wrote: > > > > > The subject says it all. Would it be bad if we introduce an > > > environment variable, GIT_SYSTEM_CONFIG=/etc/gitconfig, that names > > > an alternative location of the system-wide configuration file? > > > > > > That would supersede/deprecate GIT_CONFIG_NOSYSTEM that we > > > introduced primarily so that we can run our tests without getting > > > affected by the configuration that happens to be effective on the > > > host that the test is being run. > > > > I can't think of a reason it would be bad. > > I cannot think of any reason right now, either, but my gut tells me that > this needs to simmer a while in the backs of our minds, to give potential > reasons a chance to come forward. > > What would be the use case, BTW? IOW what would it solve that cannot > already be solved by using XDG_CONFIG_HOME? The patches Junio posted later use it for the test suite (and I also have had to skip some tests in the past related to system config because of its lack). I would also use it when doing git experiments on GitHub servers. We keep several relevant config settings in /etc/gitconfig, so if I were to say, build a new version of git and test how it repacked torvalds/linux, I need to make sure it picks up the same config. Usually I do it by baking in the right /etc/gitconfig at build time, but it would be less annoying to be able to override it at run-time. I admit both of those are uses for git _developers_, though, not git _users_. -Peff
Re: [Q] would it be bad to make /etc/gitconfig runtime configurable?
Hi, On Tue, 27 Sep 2016, Jeff King wrote: > On Tue, Sep 27, 2016 at 10:05:37AM -0700, Junio C Hamano wrote: > > > The subject says it all. Would it be bad if we introduce an > > environment variable, GIT_SYSTEM_CONFIG=/etc/gitconfig, that names > > an alternative location of the system-wide configuration file? > > > > That would supersede/deprecate GIT_CONFIG_NOSYSTEM that we > > introduced primarily so that we can run our tests without getting > > affected by the configuration that happens to be effective on the > > host that the test is being run. > > I can't think of a reason it would be bad. I cannot think of any reason right now, either, but my gut tells me that this needs to simmer a while in the backs of our minds, to give potential reasons a chance to come forward. What would be the use case, BTW? IOW what would it solve that cannot already be solved by using XDG_CONFIG_HOME? Ciao, Dscho
Re: [PATCH 1/3] Resurrect "diff-lib.c: adjust position of i-t-a entries in diff"
On Thu, Sep 29, 2016 at 2:28 AM, Junio C Hamanowrote: > After reading the three patches through, however, I do not think we > use the command line option anywhere. I'm inclined to say that we > shouldn't add it at all. Or at least do so in a separate follow-up > patch "now we have an internal mechanism, let's expose it anyway" at > the end. Which means that the last sentence in my attempted rewrite > should go. We don't use it internally _yet_. I need to go through all the external diff code and see --shift-ita should be there. The end goal is still changing the default behavior and getting rid of --shift-ita, after making sure we don't break stuff. I do use it though because "git diff" is more often run in my workflow than "git status". > As I already said, --shift-ita is not quite descriptive and I think > it should be renamed to something else, but I kept that in the > following attempt to rewrite: It's meant to be a temporary thing (which could last a year or three, depending on how fast I scan through the code base) so I didn't give much thought on naming. Umm... after a couple of minutes, I still couldn't think of any better. The one-line summary of this change is "correct the position of intent-to-add entries in diff", or as you put it more precisely (with a bit paraphrasing), "make ita entries not exist in index". I don't see any good way to shorten that to one or two words. --ita-not-in-index good enough? Or maybe --[no-]ita-visible-in-index. -- Duy
Re: [PATCH 1/5] git-gui i18n: mark strings for translation
Vasco Almeidawrites: >Mark strings for translation in lib/index.tcl that were seemingly >left behind by 700e560 ("git-gui: Mark forgotten strings for >translation.", 2008-09-04) which marks string in do_revert_selection >procedure. >These strings are passed to unstage_help and add_helper procedures. > >Signed-off-by: Vasco Almeida >--- > lib/index.tcl | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > >diff --git a/lib/index.tcl b/lib/index.tcl >index 74a81a7..3a3e534 100644 >--- a/lib/index.tcl >+++ b/lib/index.tcl >@@ -291,7 +291,7 @@ proc do_unstage_selection {} { > > if {[array size selected_paths] > 0} { > unstage_helper \ >- {Unstaging selected files from commit} \ >+ [mc "Unstaging selected files from commit"] \ > [array names selected_paths] > } elseif {$current_diff_path ne {}} { > unstage_helper \ >@@ -343,7 +343,7 @@ proc do_add_selection {} { > > if {[array size selected_paths] > 0} { > add_helper \ >- {Adding selected files} \ >+ [mc "Adding selected files"] \ > [array names selected_paths] > } elseif {$current_diff_path ne {}} { > add_helper \ >@@ -385,7 +385,7 @@ proc do_add_all {} { > set paths [concat $paths $untracked_paths] > } > } >- add_helper {Adding all changed files} $paths >+ add_helper [mc "Adding all changed files"] $paths > } > > proc revert_helper {txt paths} { This series looks good. Especially noticing the misuse of the append command. Applied to my pu for now. Thank you, -- Pat Thoytshttp://www.patthoyts.tk/ PGP fingerprint 2C 6E 98 07 2C 59 C8 97 10 CE 11 E6 04 E0 B9 DD