Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error
On Thu, Sep 29, 2016 at 10:19 AM, Junio C Hamanowrote: > Jeff King writes: >> - "cat-file --batch-check" can show you the sha1 and type, but it >> won't abbreviate sha1s, and it won't show you commit/tag information >> >> - "log --stdin --no-walk" will format the commit however you like, but >> skips the trees and blobs entirely, and the tag can only be seen via >> "%d" >> >> - "for-each-ref" has flexible formatting, too, but wants to format >> refs, not objects (and doesn't read from stdin). > > - "name-rev" is used to give "describe --contains", and can read > from its standard input, but has no format customization. > Another downside of it is that it only wants to see > committishes. > Some tool which reads standard input and can be formatted would be nice. Extending name-rev with the same format options as for-each-ref would be nice. Thanks, Jake
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
Junio C Hamanowrites: > Junio C Hamano writes: > >> There still are breakages seen in t5510 and t5526 that are about the >> verbose output of "git fetch". I'll stop digging at this point >> tonight, and welcome others who look into it ;-) > > OK, just before I leave the keyboard for the night... > > -- >8 -- > From: Junio C Hamano > Date: Thu, 29 Sep 2016 21:19:20 -0700 > Subject: [PATCH] abbrev: adjust to the new world order To those who are following from sidelines, this builds on Linus's third iteration patch (which is based on his first patch), applied on Peff's "give disambiguation help when giving an ambiguity error" series. I didn't merge the work-in-progress going back and forth between Linus and I tonight to any of the integration branches, but it is available as lt/abbrev-auto-2 branch of the "broken down" repository, i.e. git://github.com/gitster/git.git lt/abbrev-auto-2
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 9:18 PM, Linus Torvaldswrote: > > There are probably other things like that. t5510-fetch.sh fails oddly, looks like the output is off by one character. not ok 77 - fetch aligned output It has a magic "cut -c 22-" that expects the output at a specific place, and now it's at column 21 instead of column 22. Strange test, but it still seems to be aligned, just in a different column. But clearly something changed. Linus
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
Junio C Hamanowrites: > There still are breakages seen in t5510 and t5526 that are about the > verbose output of "git fetch". I'll stop digging at this point > tonight, and welcome others who look into it ;-) OK, just before I leave the keyboard for the night... -- >8 -- From: Junio C Hamano Date: Thu, 29 Sep 2016 21:19:20 -0700 Subject: [PATCH] abbrev: adjust to the new world order The default_abbrev used to be a concrete value usable as the default abbreviation length. The code that sets custom abbreviation length, in response to command line argument, often did something like: if (skip_prefix(arg, "--abbrev=", )) abbrev = atoi(arg); else if (!strcmp("--abbrev", )) abbrev = DEFAULT_ABBREV; /* make the value sane */ if (abbrev < 0 || 40 < abbrev) abbrev = ... some sane value ... The new world order however is that the default_abbrev is a negative value that signals find_unique_abbrev() that it needs to dynamically find out a good default value. We shouldn't coerce a negative value into a random positive value like the above sample code. Signed-off-by: Junio C Hamano --- builtin/rev-parse.c | 5 +++-- diff.c | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c index 76cf05e2ad..17cbfabdde 100644 --- a/builtin/rev-parse.c +++ b/builtin/rev-parse.c @@ -643,8 +643,9 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix) filter &= ~(DO_FLAGS|DO_NOREV); verify = 1; abbrev = DEFAULT_ABBREV; - if (arg[7] == '=') - abbrev = strtoul(arg + 8, NULL, 10); + if (!arg[7]) + continue; + abbrev = strtoul(arg + 8, NULL, 10); if (abbrev < MINIMUM_ABBREV) abbrev = MINIMUM_ABBREV; else if (40 <= abbrev) diff --git a/diff.c b/diff.c index c6da383c56..cefc13eb8e 100644 --- a/diff.c +++ b/diff.c @@ -3399,7 +3399,7 @@ void diff_setup_done(struct diff_options *options) */ read_cache(); } - if (options->abbrev <= 0 || 40 < options->abbrev) + if (40 < options->abbrev) options->abbrev = 40; /* full */ /* -- 2.10.0-612-g22341905f2
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 9:10 PM, Junio C Hamanowrote: > > A quick and dirty fix for it may look like this. Crossed emails. Indeed, I just solved the builtin/rev-parse.c thing slightly differently. And you found another failure in the diff code similarly not liking the negative DEFAULT_ABBREV. There are probably other things like that. Linus
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 8:54 PM, Junio C Hamanowrote: > > * The script uses "git rev-parse --short HEAD"; I suspect that it >says "ah, default_abbrev is -1 and minimum_abbrev is 4, so let's >try abbreviating to 4 hexdigits". Ahh, right you are. The logic there is abbrev = DEFAULT_ABBREV; if (arg[7] == '=') abbrev = strtoul(arg + 8, NULL, 10); if (abbrev < MINIMUM_ABBREV) abbrev = MINIMUM_ABBREV; which now does something different than what it used to do because DEFAULT_ABBREV is -1. Putting the "sanity-check the abbrev range" tests inside the "if()" statement that does strtoul() should fix it. Let me test... [ short time passes ] Yup. Incremental patch for that single issue attached. I made it do an early "continue" instead of adding another level on indentation. Linus builtin/rev-parse.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c index 4da1f1da2..cfb0f1510 100644 --- a/builtin/rev-parse.c +++ b/builtin/rev-parse.c @@ -671,8 +671,9 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix) filter &= ~(DO_FLAGS|DO_NOREV); verify = 1; abbrev = DEFAULT_ABBREV; - if (arg[7] == '=') - abbrev = strtoul(arg + 8, NULL, 10); + if (!arg[7]) + continue; + abbrev = strtoul(arg + 8, NULL, 10); if (abbrev < MINIMUM_ABBREV) abbrev = MINIMUM_ABBREV; else if (40 <= abbrev)
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
Junio C Hamanowrites: > Linus Torvalds writes: > >> So this patch may actually be "production ready" apart from the fact >> that some tests still fail (at least t2027-worktree-list.sh) because >> of different short SHA1 cases. > > t2027 has at least two problems. > > * "git worktree" does not read the core.abbrev configuration, >without a recent fix in jc/worktree-config, i.e. d49028e6 >("worktree: honor configuration variables", 2016-09-26). > > * The script uses "git rev-parse --short HEAD"; I suspect that it >says "ah, default_abbrev is -1 and minimum_abbrev is 4, so let's >try abbreviating to 4 hexdigits". > > The first failure in t3203 seems to come from the same issue in > "rev-parse --short". A quick and dirty fix for it may look like this. We leave the variable abbrev to DEFAULT_ABBREV and let find_unique_abbrev() react to "eh, -1? I need to do the auto-scaling". "git diff-tree --abbrev" seems to have a similar problem, and the fix is the same. There still are breakages seen in t5510 and t5526 that are about the verbose output of "git fetch". I'll stop digging at this point tonight, and welcome others who look into it ;-) builtin/rev-parse.c | 14 -- diff.c | 2 +- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c index 76cf05e2ad..f8c8c6c22e 100644 --- a/builtin/rev-parse.c +++ b/builtin/rev-parse.c @@ -642,13 +642,15 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix) starts_with(arg, "--short=")) { filter &= ~(DO_FLAGS|DO_NOREV); verify = 1; - abbrev = DEFAULT_ABBREV; - if (arg[7] == '=') + if (arg[7] != '=') { + abbrev = DEFAULT_ABBREV; + } else { abbrev = strtoul(arg + 8, NULL, 10); - if (abbrev < MINIMUM_ABBREV) - abbrev = MINIMUM_ABBREV; - else if (40 <= abbrev) - abbrev = 40; + if (abbrev < MINIMUM_ABBREV) + abbrev = MINIMUM_ABBREV; + else if (40 <= abbrev) + abbrev = 40; + } continue; } if (!strcmp(arg, "--sq")) { diff --git a/diff.c b/diff.c index c6da383c56..cefc13eb8e 100644 --- a/diff.c +++ b/diff.c @@ -3399,7 +3399,7 @@ void diff_setup_done(struct diff_options *options) */ read_cache(); } - if (options->abbrev <= 0 || 40 < options->abbrev) + if (40 < options->abbrev) options->abbrev = 40; /* full */ /*
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
Linus Torvaldswrites: > So this patch may actually be "production ready" apart from the fact > that some tests still fail (at least t2027-worktree-list.sh) because > of different short SHA1 cases. t2027 has at least two problems. * "git worktree" does not read the core.abbrev configuration, without a recent fix in jc/worktree-config, i.e. d49028e6 ("worktree: honor configuration variables", 2016-09-26). * The script uses "git rev-parse --short HEAD"; I suspect that it says "ah, default_abbrev is -1 and minimum_abbrev is 4, so let's try abbreviating to 4 hexdigits". The first failure in t3203 seems to come from the same issue in "rev-parse --short".
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 12:06:23PM -0700, Linus Torvalds wrote: > On Thu, Sep 29, 2016 at 11:55 AM, Linus Torvalds >wrote: > > > > For the kernel, just the *math* right now actually gives 12 > > characters. For current git it actually seems to say that 8 is the > > correct number. For small projects, you'll still see 7. > > Sorry, the git number is 9, not 8. The reason is that git has roughly > 212k objects, and 9 hex digits gets expected collisions at about 256k > objects. > > So the logic means that we'll see 7 hex digits for projects with less > than 16k objects, 8 hex digits if there are less than 64k objects, and > 9 hex digits for projects like git that currently have fewer than 256k > objects. > > But git itself might not be *that* far from going to 10 hex digits > with my patch. > > The kernel uses 12 he digits because the collision math says that's > the right thing for a project with between 4M and 16M objects (with > the kernel being at 5M). OTOH, how often does one refer to trees or blobs with abbreviated sha1s? Most of the time, you'd use abbreviated sha1s for commits. And the number of commits in git and the kernel repositories are much lower than the number of overall objects. rev-list --all --count on the git repo gives me 46790. On the kernel, it gives 618078. Now, the interesting thing is looking at the *actual* collisions in those spaces. At 9 digits, there's only one commit collision in the kernel repo: 45f014c5264f5e68ef0e51b36f4ef5ede3d18397 45f014c52eef022873b19d6a20eb0ec9668f2b09 And two commit collisions at 8 digits in the git repo: 1536dd9c1df0b7167b139f080cc4774ef63f 1536dd9c61b5582cf07057cb715dd6dc6620 2e6e3e82ee36b3e1bec1db8db24817270080424e 2e6e3e829f3759823d70e7af511bc04cd05ad0af At 7 digits, there are 5 actual commit collisions in the git repo and 718 in the kernel repo only one of those collisions involve more than 2 commits. Mike
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 5:57 PM, Linus Torvaldswrote: > > Actually, all the other cases seem to be "parse a SHA1 with a known > length", so they really don't have a negative length. So this seems > ok, and is easier to verify than the "what all contexts might use > DEFAULT_ABBREV" thing. There's only a few callers, and it's a static > function so it's easy to check it locally in sha1_name.c. Here's my original patch with just a tiny change that instead of starting the automatic guessing at 7 each time, it starts at "default_automatic_abbrev", which is initialized to 7. The difference is that if we decide that "oh, that was too small, need to repeat", we also update that "default_automatic_abbrev" value, so that we won't start at the number that we now know was too small. So it still loops over the abbrev values, but now it only loops a couple of times. I actually verified the performance impact by doing time git rev-list --abbrev-commit HEAD > /dev/null on the kernel git tree, and it does actually matter. With my original patch, we wasted a noticeable amount of time on just the extra looping, with this it's down to the same performance as just doing it once at init time (it's about 12s vs 9s on my laptop). So this patch may actually be "production ready" apart from the fact that some tests still fail (at least t2027-worktree-list.sh) because of different short SHA1 cases. Linus cache.h | 1 + environment.c | 2 +- sha1_name.c | 26 +- 3 files changed, 27 insertions(+), 2 deletions(-) diff --git a/cache.h b/cache.h index 6e33f2f28..d2da6d186 100644 --- a/cache.h +++ b/cache.h @@ -1207,6 +1207,7 @@ struct object_context { #define GET_SHA1_TREEISH 020 #define GET_SHA1_BLOB 040 #define GET_SHA1_FOLLOW_SYMLINKS 0100 +#define GET_SHA1_AUTOMATIC 0200 #define GET_SHA1_ONLY_TO_DIE04000 #define GET_SHA1_DISAMBIGUATORS \ diff --git a/environment.c b/environment.c index c1442df9a..fd6681e46 100644 --- a/environment.c +++ b/environment.c @@ -16,7 +16,7 @@ int trust_executable_bit = 1; int trust_ctime = 1; int check_stat = 1; int has_symlinks = 1; -int minimum_abbrev = 4, default_abbrev = 7; +int minimum_abbrev = 4, default_abbrev = -1; int ignore_case; int assume_unchanged; int prefer_symlink_refs; diff --git a/sha1_name.c b/sha1_name.c index 3b647fd7c..1003c96ea 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -15,6 +15,7 @@ typedef int (*disambiguate_hint_fn)(const unsigned char *, void *); struct disambiguate_state { int len; /* length of prefix in hex chars */ + unsigned int nrobjects; char hex_pfx[GIT_SHA1_HEXSZ + 1]; unsigned char bin_pfx[GIT_SHA1_RAWSZ]; @@ -118,6 +119,12 @@ static void find_short_object_filename(struct disambiguate_state *ds) if (strlen(de->d_name) != 38) continue; + + // We only look at the one subdirectory, and we assume + // each subdirectory is roughly similar, so each object + // we find probably has 255 other objects in the other + // fan-out directories + ds->nrobjects += 256; if (memcmp(de->d_name, ds->hex_pfx + 2, ds->len - 2)) continue; memcpy(hex + 2, de->d_name, 38); @@ -151,6 +158,7 @@ static void unique_in_pack(struct packed_git *p, open_pack_index(p); num = p->num_objects; + ds->nrobjects += num; last = num; while (first < last) { uint32_t mid = (first + last) / 2; @@ -380,6 +388,9 @@ static int show_ambiguous_object(const unsigned char *sha1, void *data) return 0; } +// Why seven? That's our historical default before the automatic abbreviation +static int default_automatic_abbrev = 7; + static int get_short_sha1(const char *name, int len, unsigned char *sha1, unsigned flags) { @@ -426,6 +437,14 @@ static int get_short_sha1(const char *name, int len, unsigned char *sha1, for_each_abbrev(ds.hex_pfx, show_ambiguous_object, ); } + if (len < 16 && !status && (flags & GET_SHA1_AUTOMATIC)) { + unsigned int expect_collision = 1 << (len * 2); + if (ds.nrobjects > expect_collision) { + default_automatic_abbrev = len+1; + return SHORT_NAME_AMBIGUOUS; + } + } + return status; } @@ -458,14 +477,19 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data) int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len) { int status, exists; + int flags = GET_SHA1_QUIETLY; + if (len < 0) { + flags |= GET_SHA1_AUTOMATIC; + len = default_automatic_abbrev; + }
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 5:56 PM, Mike Hommeywrote: > > OTOH, how often does one refer to trees or blobs with abbreviated sha1s? > Most of the time, you'd use abbreviated sha1s for commits. And the number > of commits in git and the kernel repositories are much lower than the > number of overall objects. See that whole other discussion about this. I agree. If we only ever worried about just commits, the abbreviation length wouldn't need to be grown nearly as aggressively. The current default would still be wrong for the kernel, but it wouldn't be as noticeably wrong, and updating it to 8 or 9 would be fine. That said, people argued against that too. We *do* end up having abbreviated SHA1's for blobs in the diff index. When I said that _I_ neer use it, somebody piped up to say that they do. So I'd rather just keep the existing semantics (a hash is a hash is a hash), and just abbreviate at a sufficient point that we don't have to worry too much about disambiguating further by object type. Linus
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 5:28 PM, Linus Torvaldswrote: > > To be fair, my original patch had a different worry that I didn't > bother with: what if one of the _other_ callers of "get_short_sha1()" > passed in -1 to it. I only handled the -1 case in th eone path care > about in that first RFC for testing. So I'm *not* suggesting you > should apply my first version,, It has issues too. Actually, all the other cases seem to be "parse a SHA1 with a known length", so they really don't have a negative length. So this seems ok, and is easier to verify than the "what all contexts might use DEFAULT_ABBREV" thing. There's only a few callers, and it's a static function so it's easy to check it locally in sha1_name.c. Linus
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 5:20 PM, Linus Torvaldswrote: > > As you say, my original patch had neither of those issues. To be fair, my original patch had a different worry that I didn't bother with: what if one of the _other_ callers of "get_short_sha1()" passed in -1 to it. I only handled the -1 case in th eone path care about in that first RFC for testing. So I'm *not* suggesting you should apply my first version,, It has issues too. Let me see if I can massage my first hacky RFC test-patch into something more reliable. Linus
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 4:13 PM, Junio C Hamanowrote: > > One thing that worries me is if we are ready to start accessing the > object store in all codepaths when we ask for DEFAULT_ABBREV. Yes. That was my main worry too. I also looked at just doing an explicit if (abbrev_commit && default_abbrev < 0) default_abbrev = get_default_abbrev(); and in many ways that would be nicer exactly because the point where this happens is then explicit, instead of being hidden behind that macro that may end up being done in random places. But it wasn't entirely obvious which all paths would need that initialization either, so on the whole it was very much a "six of one, half a dozen of the other" thing. As you say, my original patch had neither of those issues. It just stupidly re-did the loop over and over, and maybe the right thing to do is to have that original code, but just short-circuit the "over and over" behavior by just resetting default_abbrev to the value we do find. Linus
Re: [PATCH v6 3/4] ls-files: pass through safe options for --recurse-submodules
Brandon Williamswrites: > +static void compile_submodule_options(const struct dir_struct *dir, int > show_tag) > +{ > + if (line_terminator == '\0') > + argv_array_push(_options, "-z"); > + if (show_tag) > + argv_array_push(_options, "-t"); > + if (show_valid_bit) > + argv_array_push(_options, "-v"); > + if (show_cached) > + argv_array_push(_options, "--cached"); > + if (show_deleted) > + argv_array_push(_options, "--deleted"); > + if (show_modified) > + argv_array_push(_options, "--modified"); > + if (show_others) > + argv_array_push(_options, "--others"); > + if (dir->flags & DIR_SHOW_IGNORED) > + argv_array_push(_options, "--ignored"); > + if (show_stage) > + argv_array_push(_options, "--stage"); > + if (show_killed) > + argv_array_push(_options, "--killed"); > + if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES) > + argv_array_push(_options, "--directory"); > + if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES)) > + argv_array_push(_options, "--empty-directory"); > + if (show_unmerged) > + argv_array_push(_options, "--unmerged"); > + if (show_resolve_undo) > + argv_array_push(_options, "--resolve-undo"); > + if (show_eol) > + argv_array_push(_options, "--eol"); > + if (debug_mode) > + argv_array_push(_options, "--debug"); > +} With this and 4/4 applied, the documentation still says "--cached" is the only supported option. Does it really make sense to pass all of these? I understand "-z" and I suspect things like "-t" and "-v" that affect "how" things are shown may also happen to work, but I am not sure how much it makes sense for options that affect "what" things are shown. What does it even mean to ask for say "--unmerged" to be shown, for example, from the superproject? Recurse into submodules whose cache entries in the index of the superproject are unmerged, or something else? I am inclined to say that it is probably better to keep the "--cached only" as documented, at least on the "what are shown" side. Thanks.
Re: [PATCH v2] http: Control GSSAPI credential delegation.
On Wed, Sep 28, 2016 at 08:01:34PM +0200, Petr Stodulka wrote: > Delegation of credentials is disabled by default in libcurl since > version 7.21.7 due to security vulnerability CVE-2011-2192. Which > makes troubles with GSS/kerberos authentication when delegation > of credentials is required. This can be changed with option > CURLOPT_GSSAPI_DELEGATION in libcurl with set expected parameter > since libcurl version 7.22.0. > > This patch provides new configuration variable http.delegation > which corresponds to curl parameter "--delegation" (see man 1 curl). > > The following values are supported: > > * none (default). > * policy > * always I don't personally use Kerberos delegation with Git, but I don't see any problems with this patch. It preserves the security properties of the current behavior, and I think adding "policy" as an option to allow per-realm configuration is a good idea. -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH 1/3] add QSORT
Am 30.09.2016 um 01:21 schrieb René Scharfe: > Am 30.09.2016 um 00:36 schrieb Junio C Hamano: >> René Scharfewrites: >> >>> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the >>> size of the array elements and supports the convention of initializing >>> empty arrays with a NULL pointer, which we use in some places. >>> >>> Calling qsort(3) directly with a NULL pointer is undefined -- even with >>> an element count of zero -- and allows the compiler to optimize away any >>> following NULL checks. Using the macro avoids such surprises. >>> >>> Add a semantic patch as well to demonstrate the macro's usage and to >>> automate the transformation of trivial cases. >>> >>> Signed-off-by: Rene Scharfe >>> --- >>> contrib/coccinelle/qsort.cocci | 19 +++ >>> git-compat-util.h | 8 >>> 2 files changed, 27 insertions(+) >>> create mode 100644 contrib/coccinelle/qsort.cocci >> >> The direct calls to qsort(3) that this series leaves behind are >> interesting. >> >> 1. builtin/index-pack.c has this: >> >> if (1 < opts->anomaly_nr) >> qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), >> cmp_uint32); >> >> where opts->anomaly is coming from pack.h: >> >> struct pack_idx_option { >> unsigned flags; >> ... >> int anomaly_alloc, anomaly_nr; >> uint32_t *anomaly; >> }; >> >> I cannot quite see how the automated conversion misses it? It's not >> like base and nmemb are type-restricted in the rule (they are both >> just "expression"s). >> >> 2. builtin/shortlog.c has this: >> >> qsort(log->list.items, log->list.nr, sizeof(struct string_list_item), >>log->summary ? compare_by_counter : compare_by_list); >> >> where log->list is coming from shortlog.h: >> >> struct shortlog { >> struct string_list list; >> }; >> >> and string-list.h says: >> >> struct string_list { >> struct string_list_item *items; >> unsigned int nr, alloc; >> ... >> }; >> >> which seems to be a good candidate for this rule: >> >> type T; >> T *base; >> expression nmemb, compar; >> @@ >> - qsort(base, nmemb, sizeof(T), compar); >> + QSORT(base, nmemb, compar); >> >> if we take "T == struct string_list_item". > > Transformations for these two are generated if we pass --all-includes > to spatch. So let's do that. And here's the result: -- >8 -- Subject: [PATCH] use QSORT, part 2 Convert two more qsort(3) calls to QSORT to reduce code size and for better safety and consistency. Signed-off-by: Rene Scharfe --- Squashable. builtin/index-pack.c | 3 +-- builtin/shortlog.c | 2 +- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 7657d0a..0a27bab 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1531,8 +1531,7 @@ static void read_v2_anomalous_offsets(struct packed_git *p, opts->anomaly[opts->anomaly_nr++] = ntohl(idx2[off * 2 + 1]); } - if (1 < opts->anomaly_nr) - qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), cmp_uint32); + QSORT(opts->anomaly, opts->anomaly_nr, cmp_uint32); } static void read_idx_option(struct pack_idx_option *opts, const char *pack_name) diff --git a/builtin/shortlog.c b/builtin/shortlog.c index 25fa8a6..ba0e115 100644 --- a/builtin/shortlog.c +++ b/builtin/shortlog.c @@ -308,7 +308,7 @@ void shortlog_output(struct shortlog *log) struct strbuf sb = STRBUF_INIT; if (log->sort_by_number) - qsort(log->list.items, log->list.nr, sizeof(struct string_list_item), + QSORT(log->list.items, log->list.nr, log->summary ? compare_by_counter : compare_by_list); for (i = 0; i < log->list.nr; i++) { const struct string_list_item *item = >list.items[i]; -- 2.10.0
Re: Impossible to change working directory
Sebastian Feldmannwrites: > the script fails because changing the current working directory fails. > If I echo the current working directory it always echoes the root repository > path > > Is this expected behavior? Yes, we always go to the top before doing anything. If you echo environment variables Git may set up for you, what do you see? Do you see GIT_PREFIX that you can use to tell where you came from, or something like that?
Re: [PATCH 1/3] add QSORT
Am 30.09.2016 um 00:36 schrieb Junio C Hamano: > René Scharfewrites: > >> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the >> size of the array elements and supports the convention of initializing >> empty arrays with a NULL pointer, which we use in some places. >> >> Calling qsort(3) directly with a NULL pointer is undefined -- even with >> an element count of zero -- and allows the compiler to optimize away any >> following NULL checks. Using the macro avoids such surprises. >> >> Add a semantic patch as well to demonstrate the macro's usage and to >> automate the transformation of trivial cases. >> >> Signed-off-by: Rene Scharfe >> --- >> contrib/coccinelle/qsort.cocci | 19 +++ >> git-compat-util.h | 8 >> 2 files changed, 27 insertions(+) >> create mode 100644 contrib/coccinelle/qsort.cocci > > The direct calls to qsort(3) that this series leaves behind are > interesting. > > 1. builtin/index-pack.c has this: > > if (1 < opts->anomaly_nr) > qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), > cmp_uint32); > > where opts->anomaly is coming from pack.h: > > struct pack_idx_option { > unsigned flags; > ... > int anomaly_alloc, anomaly_nr; > uint32_t *anomaly; > }; > > I cannot quite see how the automated conversion misses it? It's not > like base and nmemb are type-restricted in the rule (they are both > just "expression"s). > > 2. builtin/shortlog.c has this: > > qsort(log->list.items, log->list.nr, sizeof(struct string_list_item), > log->summary ? compare_by_counter : compare_by_list); > > where log->list is coming from shortlog.h: > > struct shortlog { > struct string_list list; > }; > > and string-list.h says: > > struct string_list { > struct string_list_item *items; > unsigned int nr, alloc; > ... > }; > > which seems to be a good candidate for this rule: > > type T; > T *base; > expression nmemb, compar; > @@ > - qsort(base, nmemb, sizeof(T), compar); > + QSORT(base, nmemb, compar); > > if we take "T == struct string_list_item". Transformations for these two are generated if we pass --all-includes to spatch. So let's do that. -- >8 -- Subject: [PATCH] coccicheck: use --all-includes by default Add a make variable, SPATCH_FLAGS, for specifying flags for spatch, and set it to --all-includes by default. This option lets it consider header files which would otherwise be ignored. That's important for some rules that rely on type information. It doubles the duration of coccicheck, however. Signed-off-by: Rene Scharfe --- Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 1aad150..d15bf8d 100644 --- a/Makefile +++ b/Makefile @@ -467,6 +467,7 @@ SPATCH = spatch export TCL_PATH TCLTK_PATH SPARSE_FLAGS = +SPATCH_FLAGS = --all-includes @@ -2314,7 +2315,7 @@ C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ)) %.cocci.patch: %.cocci $(C_SOURCES) @echo '' SPATCH $<; \ for f in $(C_SOURCES); do \ - $(SPATCH) --sp-file $< $$f; \ + $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS); \ done >$@ 2>$@.log; \ if test -s $@; \ then \ -- 2.10.0
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
Junio C Hamanowrites: > Linus Torvalds writes: > >> The advantage of the >> previous patch was that it got the object counting right almost >> automatically, this actually has its own new object counting code and >> maybe I screwed it up. I guess another advantage of your original approach was that it delayed the counting to the very last minute, so the things that worried me in my previous response were automatically made non-issues.
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
Linus Torvaldswrites: > Somebody should really double-check my heuristics, to see that I did > the pack counting etc right. It doesn't do alternate loose file > counting at all, and maybe it could matter. The advantage of the > previous patch was that it got the object counting right almost > automatically, this actually has its own new object counting code and > maybe I screwed it up. One thing that worries me is if we are ready to start accessing the object store in all codepaths when we ask for DEFAULT_ABBREV. The worries are twofold: (1) Do we do the right thing if object store is not available to us? Some commands can be run outside repository, and if our call to prepare_packed_git() or loose object iteration barfed in some way, that would introduce a regression. (2) Is calling prepare_packed_git() too early interfere with how the commands expect its own prepare_packed_git() work? That is, if a command has this sequence, "ask DEFAULT_ABBREV, arrange things, and then call prepare_packed_git()", and the existing "arrange things" step had something that causes a new pack to become eligible to be read by prepare_packed_git(), like adding to the list of alternate object stores, its own prepare_packed_git() will now become a no-op. I browsed through "tig grep DEFAULT_ABBREV \*.c" and it seems that in majority of the hits, we not just are ready to start accessing, but already have an object or two, which must have come from an already open object store, so they are OK. Especially the ones that use it as the last argument to find_unique_abbrev() are OK as we are about to open the object store to do the computation. There are very early ones in the program startup sequence in the following functions, but I do not think of a reason why our new and early call to prepare_packed_git() might be problematic, given that all of them require us to have an access to the repository (i.e. this change cannot introduce a regression where a command used to work outside a repository but barf when prepare_packed_git() is called early): - builtin/describe.c - builtin/rev-list.c - builtin/rev-parse.c I thought that the one in diff.c might be problematic when the "git diff" command is run outside a repository with the "--no-index" option, but it appears that init_default_abbrev() seems to be OK when run outside a repository. There is one in parse-options-cb.c that is used to parse the --abbrev command line option. This might cause a cosmetic problem but when the user is asking for an abbreviation, it is expected that we will have an access to the object store anyway, so it may be OK. I am sorry that none of the above is about your math ;-) I suck at math so I won't comment.
Re: [PATCH 1/3] add QSORT
René Scharfewrites: > Add the macro QSORT, a convenient wrapper for qsort(3) that infers the > size of the array elements and supports the convention of initializing > empty arrays with a NULL pointer, which we use in some places. > > Calling qsort(3) directly with a NULL pointer is undefined -- even with > an element count of zero -- and allows the compiler to optimize away any > following NULL checks. Using the macro avoids such surprises. > > Add a semantic patch as well to demonstrate the macro's usage and to > automate the transformation of trivial cases. > > Signed-off-by: Rene Scharfe > --- > contrib/coccinelle/qsort.cocci | 19 +++ > git-compat-util.h | 8 > 2 files changed, 27 insertions(+) > create mode 100644 contrib/coccinelle/qsort.cocci The direct calls to qsort(3) that this series leaves behind are interesting. 1. builtin/index-pack.c has this: if (1 < opts->anomaly_nr) qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), cmp_uint32); where opts->anomaly is coming from pack.h: struct pack_idx_option { unsigned flags; ... int anomaly_alloc, anomaly_nr; uint32_t *anomaly; }; I cannot quite see how the automated conversion misses it? It's not like base and nmemb are type-restricted in the rule (they are both just "expression"s). 2. builtin/shortlog.c has this: qsort(log->list.items, log->list.nr, sizeof(struct string_list_item), log->summary ? compare_by_counter : compare_by_list); where log->list is coming from shortlog.h: struct shortlog { struct string_list list; }; and string-list.h says: struct string_list { struct string_list_item *items; unsigned int nr, alloc; ... }; which seems to be a good candidate for this rule: type T; T *base; expression nmemb, compar; @@ - qsort(base, nmemb, sizeof(T), compar); + QSORT(base, nmemb, compar); if we take "T == struct string_list_item". 3. builtin/show-branch.c does this: qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]), compare_ref_name); where ref_name[] is a file-scope global: static char *ref_name[MAX_REVS + 1]; and top and bottom are plain integers. The sizeof() does not take the size of *base, so it is understandable that this does not get automatically converted. It seems that some calls to this function _could_ send the same top and bottom, asking for 0 element array to be sorted, by the way. Thanks for an amusing read.
Re: [RFC/PATCH 0/2] place cherry pick line below commit title
Jonathan Tanwrites: > This is somewhat of a follow-up to my previous e-mail with subject > "[PATCH] sequencer: support folding in rfc2822 footer" [1], in which I > proposed relaxing the definition of a commit message footer to allow > multiple-line field bodies (as described in RFC2822), but its strictness > was deemed deliberate. It does not necessarily mean we can never change it when we did something deliberately, though. With a good enough justification, and with a transitition plan if the backward incompatibility is severe enough to warrant one, we can change things. I vaguely recall that there were some discussion on the definition of "what's a trailer line" with folks from the kernel land, perhaps while discussing the interpret-trailers topic. IIRC, when somebody passes an improved version along, the resulting message's trailer block may look like this: Signed-off-by: Original Author [fixed typo in the variable names] Signed-off-by: Somebhody Else and an obvious "wish" of theirs was to treat not just RFC2822-like "a line that begins with token followed by a colon" but also these short comments as part of the trailer block. Your original wish in [*1*] is to also treat "a line that begin with a whitespace that follows a line that begins with token followed by a colon" as part of the trailer block and I personally think that is a reasonable thing to wish for, too. I recall that I was somewhat surprised and dissapointed to see no change to interpret-trailers when you tried [*1*], which was really about improving the definition of what the trailer block is, by the way. In any case, if we want to improve what the trailer block is, we would certainly need to make sure what is inserted by "cherry-pick -x" is also considered as part of the trailer block, so it may be necessary to change it to "Cherry-picked-from: ..." while doing so. I dunno. > Below is a patch set that allows placing the "cherry picked from" line > without taking into account the definition of a commit message footer. > For example, "git cherry-pick -x" (with the appropriate configuration > variable or argument) would, to this commit message: > > commit title > > This is an explanatory paragraph. > > Footer: foo > > place the "(cherry picked from ...)" line below "commit title". > > Would this be better? It is not immediately obvious what such a change buys us. Wouldn't the current code place that line below "Footer: foo"? I cannot think of any reason why anybody would want to place "cherry-picked from" immediately below the title and before the first line of the body. [Footnotes] *1* http://public-inbox.org/git/1472846322-5592-1-git-send-email-jonathanta...@google.com/
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 12:45 PM, Junio C Hamanowrote: > > I think that is a reasonable way to go. > > #define DEFAULT_ABBREV get_default_abbrev() > > would help. So something like this that replaces the previous patch? Somebody should really double-check my heuristics, to see that I did the pack counting etc right. It doesn't do alternate loose file counting at all, and maybe it could matter. The advantage of the previous patch was that it got the object counting right almost automatically, this actually has its own new object counting code and maybe I screwed it up. Linus cache.h | 3 ++- environment.c | 2 +- sha1_file.c | 43 +++ 3 files changed, 46 insertions(+), 2 deletions(-) diff --git a/cache.h b/cache.h index 6e33f2f28..a022e1bd2 100644 --- a/cache.h +++ b/cache.h @@ -1186,8 +1186,9 @@ static inline int hex2chr(const char *s) } /* Convert to/from hex/sha1 representation */ +extern int get_default_abbrev(void); #define MINIMUM_ABBREV minimum_abbrev -#define DEFAULT_ABBREV default_abbrev +#define DEFAULT_ABBREV get_default_abbrev() struct object_context { unsigned char tree[20]; diff --git a/environment.c b/environment.c index c1442df9a..fd6681e46 100644 --- a/environment.c +++ b/environment.c @@ -16,7 +16,7 @@ int trust_executable_bit = 1; int trust_ctime = 1; int check_stat = 1; int has_symlinks = 1; -int minimum_abbrev = 4, default_abbrev = 7; +int minimum_abbrev = 4, default_abbrev = -1; int ignore_case; int assume_unchanged; int prefer_symlink_refs; diff --git a/sha1_file.c b/sha1_file.c index ca149a607..28ba04b65 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -3720,3 +3720,46 @@ int for_each_packed_object(each_packed_object_fn cb, void *data, unsigned flags) } return r ? r : pack_errors; } + +static int init_default_abbrev(void) +{ + unsigned long count = 0; + struct packed_git *p; + struct strbuf buf = STRBUF_INIT; + DIR *dir; + char *name; + int ret; + + prepare_packed_git(); + for (p = packed_git; p; p = p->next) { + if (open_pack_index(p)) + continue; + count += p->num_objects; + } + + strbuf_addstr(, get_object_directory()); + strbuf_addstr(, "/42/"); + name = strbuf_detach(, NULL); + dir = opendir(name); + free(name); + if (dir) { + struct dirent *de; + while ((de = readdir(dir)) != NULL) { + count += 256; + } + closedir(dir); + } + for (ret = 7; ret < 15; ret++) { + unsigned long expect_collision = 1ul << (ret * 2); + if (count < expect_collision) + break; + } + return ret; +} + +int get_default_abbrev(void) +{ + if (default_abbrev < 0) + default_abbrev = init_default_abbrev(); + return default_abbrev; +}
[PATCH v6 2/4] ls-files: optionally recurse into submodules
Allow ls-files to recognize submodules in order to retrieve a list of files from a repository's submodules. This is done by forking off a process to recursively call ls-files on all submodules. Use top-level --super-prefix option to pass a path to the submodule which it can use to prepend to output or pathspec matching logic. Signed-off-by: Brandon Williams--- Documentation/git-ls-files.txt | 8 +- builtin/ls-files.c | 138 - git.c | 2 +- t/t3007-ls-files-recurse-submodules.sh | 100 4 files changed, 208 insertions(+), 40 deletions(-) create mode 100755 t/t3007-ls-files-recurse-submodules.sh diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt index 0d933ac..ea01d45 100644 --- a/Documentation/git-ls-files.txt +++ b/Documentation/git-ls-files.txt @@ -18,7 +18,8 @@ SYNOPSIS [--exclude-per-directory=] [--exclude-standard] [--error-unmatch] [--with-tree=] - [--full-name] [--abbrev] [--] [...] + [--full-name] [--recurse-submodules] + [--abbrev] [--] [...] DESCRIPTION --- @@ -137,6 +138,11 @@ a space) at the start of each line: option forces paths to be output relative to the project top directory. +--recurse-submodules:: + Recursively calls ls-files on each submodule in the repository. + Currently there is only support for the --cached mode without a + pathspec. + --abbrev[=]:: Instead of showing the full 40-byte hexadecimal object lines, show only a partial prefix. diff --git a/builtin/ls-files.c b/builtin/ls-files.c index 00ea91a..63befed 100644 --- a/builtin/ls-files.c +++ b/builtin/ls-files.c @@ -14,6 +14,7 @@ #include "resolve-undo.h" #include "string-list.h" #include "pathspec.h" +#include "run-command.h" static int abbrev; static int show_deleted; @@ -28,8 +29,10 @@ static int show_valid_bit; static int line_terminator = '\n'; static int debug_mode; static int show_eol; +static int recurse_submodules; static const char *prefix; +static const char *super_prefix; static int max_prefix_len; static int prefix_len; static struct pathspec pathspec; @@ -68,11 +71,24 @@ static void write_eolinfo(const struct cache_entry *ce, const char *path) static void write_name(const char *name) { /* +* Prepend the super_prefix to name to construct the full_name to be +* written. +*/ + struct strbuf full_name = STRBUF_INIT; + if (super_prefix) { + strbuf_addstr(_name, super_prefix); + strbuf_addstr(_name, name); + name = full_name.buf; + } + + /* * With "--full-name", prefix_len=0; this caller needs to pass * an empty string in that case (a NULL is good for ""). */ write_name_quoted_relative(name, prefix_len ? prefix : NULL, stdout, line_terminator); + + strbuf_release(_name); } static void show_dir_entry(const char *tag, struct dir_entry *ent) @@ -152,55 +168,84 @@ static void show_killed_files(struct dir_struct *dir) } } +/** + * Recursively call ls-files on a submodule + */ +static void show_gitlink(const struct cache_entry *ce) +{ + struct child_process cp = CHILD_PROCESS_INIT; + int status; + + argv_array_pushf(, "--super-prefix=%s%s/", +super_prefix ? super_prefix : "", +ce->name); + argv_array_push(, "ls-files"); + argv_array_push(, "--recurse-submodules"); + + cp.git_cmd = 1; + cp.dir = ce->name; + status = run_command(); + if (status) + exit(status); +} + static void show_ce_entry(const char *tag, const struct cache_entry *ce) { + struct strbuf name = STRBUF_INIT; int len = max_prefix_len; + if (super_prefix) + strbuf_addstr(, super_prefix); + strbuf_addstr(, ce->name); if (len >= ce_namelen(ce)) die("git ls-files: internal error - cache entry not superset of prefix"); - if (!match_pathspec(, ce->name, ce_namelen(ce), - len, ps_matched, - S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode))) - return; + if (recurse_submodules && S_ISGITLINK(ce->ce_mode)) { + show_gitlink(ce); + } else if (match_pathspec(, name.buf, name.len, + len, ps_matched, + S_ISDIR(ce->ce_mode) || + S_ISGITLINK(ce->ce_mode))) { + if (tag && *tag && show_valid_bit && + (ce->ce_flags & CE_VALID)) { + static char alttag[4]; + memcpy(alttag, tag, 3); + if
[PATCH v6 4/4] ls-files: add pathspec matching for submodules
Pathspecs can be a bit tricky when trying to apply them to submodules. The main challenge is that the pathspecs will be with respect to the superproject and not with respect to paths in the submodule. The approach this patch takes is to pass in the identical pathspec from the superproject to the submodule in addition to the submodule-prefix, which is the path from the root of the superproject to the submodule, and then we can compare an entry in the submodule prepended with the submodule-prefix to the pathspec in order to determine if there is a match. This patch also permits the pathspec logic to perform a prefix match against submodules since a pathspec could refer to a file inside of a submodule. Due to limitations in the wildmatch logic, a prefix match is only done literally. If any wildcard character is encountered we'll simply punt and produce a false positive match. More accurate matching will be done once inside the submodule. This is due to the superproject not knowing what files could exist in the submodule. Signed-off-by: Brandon Williams--- Documentation/git-ls-files.txt | 3 +- builtin/ls-files.c | 27 +++-- dir.c | 46 +- dir.h | 4 ++ t/t3007-ls-files-recurse-submodules.sh | 108 - 5 files changed, 175 insertions(+), 13 deletions(-) diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt index ea01d45..51ec9a1 100644 --- a/Documentation/git-ls-files.txt +++ b/Documentation/git-ls-files.txt @@ -140,8 +140,7 @@ a space) at the start of each line: --recurse-submodules:: Recursively calls ls-files on each submodule in the repository. - Currently there is only support for the --cached mode without a - pathspec. + Currently there is only support for the --cached. --abbrev[=]:: Instead of showing the full 40-byte hexadecimal object diff --git a/builtin/ls-files.c b/builtin/ls-files.c index 6f744ef..82ec811 100644 --- a/builtin/ls-files.c +++ b/builtin/ls-files.c @@ -215,6 +215,7 @@ static void show_gitlink(const struct cache_entry *ce) { struct child_process cp = CHILD_PROCESS_INIT; int status; + int i; argv_array_pushf(, "--super-prefix=%s%s/", super_prefix ? super_prefix : "", @@ -225,6 +226,15 @@ static void show_gitlink(const struct cache_entry *ce) /* add supported options */ argv_array_pushv(, submodules_options.argv); + /* +* Pass in the original pathspec args. The submodule will be +* responsible for prepending the 'submodule_prefix' prior to comparing +* against the pathspec for matches. +*/ + argv_array_push(, "--"); + for (i = 0; i < pathspec.nr; i++) + argv_array_push(, pathspec.items[i].original); + cp.git_cmd = 1; cp.dir = ce->name; status = run_command(); @@ -243,7 +253,8 @@ static void show_ce_entry(const char *tag, const struct cache_entry *ce) if (len >= ce_namelen(ce)) die("git ls-files: internal error - cache entry not superset of prefix"); - if (recurse_submodules && S_ISGITLINK(ce->ce_mode)) { + if (recurse_submodules && S_ISGITLINK(ce->ce_mode) && + submodule_path_match(, name.buf, ps_matched)) { show_gitlink(ce); } else if (match_pathspec(, name.buf, name.len, len, ps_matched, @@ -623,16 +634,20 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix) die("ls-files --recurse-submodules does not support " "--error-unmatch"); - if (recurse_submodules && argc) - die("ls-files --recurse-submodules does not support pathspec"); - parse_pathspec(, 0, PATHSPEC_PREFER_CWD | PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP, prefix, argv); - /* Find common prefix for all pathspec's */ - max_prefix = common_prefix(); + /* +* Find common prefix for all pathspec's +* This is used as a performance optimization which unfortunately cannot +* be done when recursing into submodules +*/ + if (recurse_submodules) + max_prefix = NULL; + else + max_prefix = common_prefix(); max_prefix_len = max_prefix ? strlen(max_prefix) : 0; /* Treat unmatching pathspec elements as errors */ diff --git a/dir.c b/dir.c index 0ea235f..28e9736 100644 --- a/dir.c +++ b/dir.c @@ -207,8 +207,9 @@ int within_depth(const char *name, int namelen, return 1; } -#define DO_MATCH_EXCLUDE 1 -#define DO_MATCH_DIRECTORY 2 +#define DO_MATCH_EXCLUDE (1<<0) +#define DO_MATCH_DIRECTORY (1<<1) +#define DO_MATCH_SUBMODULE (1<<2) /* * Does 'match' match the given
[PATCH v6 1/4] git: make super-prefix option
Add a super-prefix environment variable 'GIT_INTERNAL_SUPER_PREFIX' which can be used to specify a path from above a repository down to its root. When such a super-prefix is specified, the paths reported by Git are prefixed with it to make them relative to that directory "above". The paths given by the user on the command line (e.g. "git subcmd --output-file=path/to/a/file" and pathspecs) are taken relative to the directory "above" to match. The immediate use of this option is by commands which have a --recurse-submodule option in order to give context to submodules about how they were invoked. This option is currently only allowed for builtins which support a super-prefix. Signed-off-by: Brandon Williams--- Documentation/git.txt | 6 ++ cache.h | 2 ++ environment.c | 10 ++ git.c | 26 ++ 4 files changed, 44 insertions(+) diff --git a/Documentation/git.txt b/Documentation/git.txt index 7913fc2..2188ae6 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -13,6 +13,7 @@ SYNOPSIS [--exec-path[=]] [--html-path] [--man-path] [--info-path] [-p|--paginate|--no-pager] [--no-replace-objects] [--bare] [--git-dir=] [--work-tree=] [--namespace=] +[--super-prefix=] [] DESCRIPTION @@ -601,6 +602,11 @@ foo.bar= ...`) sets `foo.bar` to the empty string. details. Equivalent to setting the `GIT_NAMESPACE` environment variable. +--super-prefix=:: + Currently for internal use only. Set a prefix which gives a path from + above a repository down to its root. One use is to give submodules + context about the superproject that invoked it. + --bare:: Treat the repository as a bare repository. If GIT_DIR environment is not set, it is set to the current working diff --git a/cache.h b/cache.h index 3556326..8cf495d 100644 --- a/cache.h +++ b/cache.h @@ -408,6 +408,7 @@ static inline enum object_type object_type(unsigned int mode) #define GIT_NAMESPACE_ENVIRONMENT "GIT_NAMESPACE" #define GIT_WORK_TREE_ENVIRONMENT "GIT_WORK_TREE" #define GIT_PREFIX_ENVIRONMENT "GIT_PREFIX" +#define GIT_SUPER_PREFIX_ENVIRONMENT "GIT_INTERNAL_SUPER_PREFIX" #define DEFAULT_GIT_DIR_ENVIRONMENT ".git" #define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY" #define INDEX_ENVIRONMENT "GIT_INDEX_FILE" @@ -468,6 +469,7 @@ extern int get_common_dir_noenv(struct strbuf *sb, const char *gitdir); extern int get_common_dir(struct strbuf *sb, const char *gitdir); extern const char *get_git_namespace(void); extern const char *strip_namespace(const char *namespaced_ref); +extern const char *get_super_prefix(void); extern const char *get_git_work_tree(void); /* diff --git a/environment.c b/environment.c index ca72464..13f3d70 100644 --- a/environment.c +++ b/environment.c @@ -100,6 +100,8 @@ static char *work_tree; static const char *namespace; static size_t namespace_len; +static const char *super_prefix; + static const char *git_dir, *git_common_dir; static char *git_object_dir, *git_index_file, *git_graft_file; int git_db_env, git_index_env, git_graft_env, git_common_dir_env; @@ -120,6 +122,7 @@ const char * const local_repo_env[] = { NO_REPLACE_OBJECTS_ENVIRONMENT, GIT_REPLACE_REF_BASE_ENVIRONMENT, GIT_PREFIX_ENVIRONMENT, + GIT_SUPER_PREFIX_ENVIRONMENT, GIT_SHALLOW_FILE_ENVIRONMENT, GIT_COMMON_DIR_ENVIRONMENT, NULL @@ -222,6 +225,13 @@ const char *strip_namespace(const char *namespaced_ref) return namespaced_ref + namespace_len; } +const char *get_super_prefix(void) +{ + if (!super_prefix) + super_prefix = getenv(GIT_SUPER_PREFIX_ENVIRONMENT); + return super_prefix; +} + static int git_work_tree_initialized; /* diff --git a/git.c b/git.c index 1c61151..f756b62 100644 --- a/git.c +++ b/git.c @@ -164,6 +164,20 @@ static int handle_options(const char ***argv, int *argc, int *envchanged) setenv(GIT_WORK_TREE_ENVIRONMENT, cmd, 1); if (envchanged) *envchanged = 1; + } else if (!strcmp(cmd, "--super-prefix")) { + if (*argc < 2) { + fprintf(stderr, "No prefix given for --super-prefix.\n" ); + usage(git_usage_string); + } + setenv(GIT_SUPER_PREFIX_ENVIRONMENT, (*argv)[1], 1); + if (envchanged) + *envchanged = 1; + (*argv)++; + (*argc)--; + } else if (skip_prefix(cmd, "--super-prefix=", )) { + setenv(GIT_SUPER_PREFIX_ENVIRONMENT, cmd, 1); + if (envchanged) + *envchanged = 1; } else if (!strcmp(cmd, "--bare")) { char *cwd =
[PATCH v6 3/4] ls-files: pass through safe options for --recurse-submodules
Pass through some known-safe options when recursing into submodules. (--cached, --stage, -v, -t, -z, --debug, --eol) Other options are compiled into an argv_array but if an unsafe option is given the caller will be errored out. Signed-off-by: Brandon Williams--- builtin/ls-files.c | 51 -- t/t3007-ls-files-recurse-submodules.sh | 17 2 files changed, 60 insertions(+), 8 deletions(-) diff --git a/builtin/ls-files.c b/builtin/ls-files.c index 63befed..6f744ef 100644 --- a/builtin/ls-files.c +++ b/builtin/ls-files.c @@ -30,6 +30,7 @@ static int line_terminator = '\n'; static int debug_mode; static int show_eol; static int recurse_submodules; +static struct argv_array submodules_options = ARGV_ARRAY_INIT; static const char *prefix; static const char *super_prefix; @@ -168,6 +169,45 @@ static void show_killed_files(struct dir_struct *dir) } } +/* + * Compile an argv_array with all of the options supported by --recurse_submodules + */ +static void compile_submodule_options(const struct dir_struct *dir, int show_tag) +{ + if (line_terminator == '\0') + argv_array_push(_options, "-z"); + if (show_tag) + argv_array_push(_options, "-t"); + if (show_valid_bit) + argv_array_push(_options, "-v"); + if (show_cached) + argv_array_push(_options, "--cached"); + if (show_deleted) + argv_array_push(_options, "--deleted"); + if (show_modified) + argv_array_push(_options, "--modified"); + if (show_others) + argv_array_push(_options, "--others"); + if (dir->flags & DIR_SHOW_IGNORED) + argv_array_push(_options, "--ignored"); + if (show_stage) + argv_array_push(_options, "--stage"); + if (show_killed) + argv_array_push(_options, "--killed"); + if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES) + argv_array_push(_options, "--directory"); + if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES)) + argv_array_push(_options, "--empty-directory"); + if (show_unmerged) + argv_array_push(_options, "--unmerged"); + if (show_resolve_undo) + argv_array_push(_options, "--resolve-undo"); + if (show_eol) + argv_array_push(_options, "--eol"); + if (debug_mode) + argv_array_push(_options, "--debug"); +} + /** * Recursively call ls-files on a submodule */ @@ -182,6 +222,9 @@ static void show_gitlink(const struct cache_entry *ce) argv_array_push(, "ls-files"); argv_array_push(, "--recurse-submodules"); + /* add supported options */ + argv_array_pushv(, submodules_options.argv); + cp.git_cmd = 1; cp.dir = ce->name; status = run_command(); @@ -567,11 +610,13 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix) if (require_work_tree && !is_inside_work_tree()) setup_work_tree(); + if (recurse_submodules) + compile_submodule_options(, show_tag); + if (recurse_submodules && - (show_stage || show_deleted || show_others || show_unmerged || + (show_deleted || show_others || show_unmerged || show_killed || show_modified || show_resolve_undo || -show_valid_bit || show_tag || show_eol || with_tree || -(line_terminator == '\0'))) +with_tree)) die("ls-files --recurse-submodules unsupported mode"); if (recurse_submodules && error_unmatch) diff --git a/t/t3007-ls-files-recurse-submodules.sh b/t/t3007-ls-files-recurse-submodules.sh index b5a53c3..e76fa30 100755 --- a/t/t3007-ls-files-recurse-submodules.sh +++ b/t/t3007-ls-files-recurse-submodules.sh @@ -34,6 +34,18 @@ test_expect_success 'ls-files correctly outputs files in submodule' ' test_cmp expect actual ' +test_expect_success 'ls-files correctly outputs files in submodule with -z' ' + lf_to_nul >expect <<-\EOF && + .gitmodules + a + b/b + submodule/c + EOF + + git ls-files --recurse-submodules -z >actual && + test_cmp expect actual +' + test_expect_success 'ls-files does not output files not added to a repo' ' cat >expect <<-\EOF && .gitmodules @@ -86,15 +98,10 @@ test_incompatible_with_recurse_submodules () { " } -test_incompatible_with_recurse_submodules -z -test_incompatible_with_recurse_submodules -v -test_incompatible_with_recurse_submodules -t test_incompatible_with_recurse_submodules --deleted test_incompatible_with_recurse_submodules --modified test_incompatible_with_recurse_submodules --others -test_incompatible_with_recurse_submodules --stage test_incompatible_with_recurse_submodules --killed test_incompatible_with_recurse_submodules --unmerged
Re: [PATCH v2 02/11] i18n: add--interactive: mark simple here documents for translation
Jakub Narębskiwrites: > W dniu 29.09.2016 o 19:05, Junio C Hamano pisze: >> Vasco Almeida writes: >> >>> On the other hand, would it make sense to translate these commands? If >>> so, we would mark for translation the commands name of @cmd in >>> main_loop(). >>> >>> sub main_loop { >>> - my @cmd = ([ 'status', \_cmd, ], >>> - [ 'update', \_cmd, ], >>> - [ 'revert', \_cmd, ], >>> - [ 'add untracked', \_untracked_cmd, ], >>> - [ 'patch', \_update_cmd, ], >>> - [ 'diff', \_cmd, ], >>> - [ 'quit', \_cmd, ], >>> - [ 'help', \_cmd, ], >>> + my @cmd = ([ __('status'), \_cmd, ], >>> + [ __('update'), \_cmd, ], >>> + [ __('revert'), \_cmd, ], >>> + [ __('add untracked'), \_untracked_cmd, ], >>> + [ __('patch'), \_update_cmd, ], >>> + [ __('diff'), \_cmd, ], >>> + [ __('quit'), \_cmd, ], >>> + [ __('help'), \_cmd, ], >> >> I don't know offhand. If the code to prompt and accept the command >> given by the user can take the translated word (or a prefix of it), >> theoretically I would say it could be made to work, but to me it is >> dubious the benefit outweighs its downsides. It would make teaching >> Git and troubleshooting over the phone harder, I would guess. >> >> A: "Hi, I am in a 'git add -i' session." >> B: "Give 's' at the prompt." >> A: "My Git does not seem to take 's' as a valid command." >> B: "What? I've never seen that problem." >> ... back and forth wastes 10 minutes ... >> A: "By the way, I am running Git in Portuguese." > > Also, for one-letter commands to work (there is setting where you > don't even need to press enter, IIRC) all those translations would > have to be chosen to begin with different letter, isn't it? The original was written with an explicit expectation that these command words will not be translated adn chose words that do not share the first letter exactly for that reason. Having said that, if somebody is willing to i18n the command words, I'd expect that the command line prompt interaction would be updated to take the unique prefix instead of the "first byte", and if that happens, I think the resulting system would at least be internally consistent. It is still dubious to me if the benefit of i18n outweighs its downsides, though.
Re: [PATCH v2 02/11] i18n: add--interactive: mark simple here documents for translation
W dniu 29.09.2016 o 19:05, Junio C Hamano pisze: > Vasco Almeidawrites: > >> On the other hand, would it make sense to translate these commands? If >> so, we would mark for translation the commands name of @cmd in >> main_loop(). >> >> sub main_loop { >> - my @cmd = ([ 'status', \_cmd, ], >> - [ 'update', \_cmd, ], >> - [ 'revert', \_cmd, ], >> - [ 'add untracked', \_untracked_cmd, ], >> - [ 'patch', \_update_cmd, ], >> - [ 'diff', \_cmd, ], >> - [ 'quit', \_cmd, ], >> - [ 'help', \_cmd, ], >> + my @cmd = ([ __('status'), \_cmd, ], >> + [ __('update'), \_cmd, ], >> + [ __('revert'), \_cmd, ], >> + [ __('add untracked'), \_untracked_cmd, ], >> + [ __('patch'), \_update_cmd, ], >> + [ __('diff'), \_cmd, ], >> + [ __('quit'), \_cmd, ], >> + [ __('help'), \_cmd, ], > > I don't know offhand. If the code to prompt and accept the command > given by the user can take the translated word (or a prefix of it), > theoretically I would say it could be made to work, but to me it is > dubious the benefit outweighs its downsides. It would make teaching > Git and troubleshooting over the phone harder, I would guess. > > A: "Hi, I am in a 'git add -i' session." > B: "Give 's' at the prompt." > A: "My Git does not seem to take 's' as a valid command." > B: "What? I've never seen that problem." > ... back and forth wastes 10 minutes ... > A: "By the way, I am running Git in Portuguese." Also, for one-letter commands to work (there is setting where you don't even need to press enter, IIRC) all those translations would have to be chosen to begin with different letter, isn't it? Best, -- Jakub Narębski
Re: [PATCH v8 00/11] Git filter protocol
Lars Schneiderwrites: > We discussed that issue in v4 and v6: > http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/ > http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/ > > My impression was that you don't want Git to wait for the filter process. > If Git waits for the filter process - how long should Git wait? I am not sure where you got that impression. I did say that I do not want Git to _KILL_ my filter process. That does not mean I want Git to go away without waiting for me. If the filter process refuses to die forever when Git told it to shutdown (by closing the pipe to it, for example), that filter process is simply buggy. I think we want users to become aware of that, instead of Git leaving it behind, which essentially is to sweep the problem under the rug. I agree with what Peff said elsewhere in the thread; if a filter process wants to take time to clean things up while letting Git proceed, it can do its own process management, but I think it is sensible for Git to wait the filter process it directly spawned.
Re: [PATCH v2 01/11] i18n: add--interactive: mark strings for translation
W dniu 26.09.2016 o 00:52, Junio C Hamano pisze: > Vasco Almeidawrites: >> my $status_fmt = '%12s %12s %s'; >> -my $status_head = sprintf($status_fmt, 'staged', 'unstaged', 'path'); >> +my $status_head = sprintf($status_fmt, __('staged'), __('unstaged'), >> __('path')); > > Wouldn't it make sense to allow translators to tweak $status_fmt if > you are allowing the earlier elements that are formatted with %12s, > as their translation may not fit within that width, in which case > they may want to make these columns wider? Perl's printf, sprintf, and format think all codepoints take up 1 print column; also, without "use utf8;" they all think that one byte is one codepoint (as it is in latin1 encoding). Many codepoints can take 0 print columns (zero-width joiners), or 2 columns (so called wide characters). The proper way to justify Unicode output is described e.g. in http://www.perl.com/pub/2012/05/perlunicook-unicode-column-width-for-printing.html use Unicode::GCString; my $gcs = Unicode::GCString->new($str); # grapheme cluster string my $cols = $gcs->columns; my $pad = " " x (12 - $cols); $status_head .= $str . $pad . " "; Though we would need to provide fallback if there is no perl-i18n, no extended Unicode support in Perl (also, if we are not using gettext). So it is even more complicated. >> prompt_yesno( >> -'Your edited hunk does not apply. Edit again ' >> -. '(saying "no" discards!) [y/n]? ' >> +# TRANSLATORS: do not translate [y/n] >> +# The program will only accept that input >> +# at this point. >> +__('Your edited hunk does not apply. Edit again >> ' >> + . '(saying "no" discards!) [y/n]? ') > > Not just [y/n], but "no" in "saying no discards!" also needs to > stay, no? I wonder if it is a good idea to lose the TRANSLATORS > comment by ejecting "[y/n]" outside the "__()" construct here. Actually the message to translators should also mention that if the translation of "no" doesn't begin with 'n', then one needs to say something like '(saying "n" for "no" discards!)'. Best, -- Jakub Narębski
Re: [PATCH v8 00/11] Git filter protocol
Jeff Kingwrites: > I don't necessarily agree, though, that the timing of filter-process > cleanup needs to be part of the public interface. So in your list: > >> 3) Git waits until the filter process finishes. > > That seems simple and elegant, but I can think of reasons we might not > want to wait (e.g., if the filter has to do some maintenance task and > does not the user to have to wait). > > OTOH, we already face this in git, and we solve it by explicitly > backgrounding the maintenance task (i.e., auto-gc). So one could argue > that it is the responsibility of the filter process to manage its own > processes. It certainly makes the interaction with git simpler. Yup, that summarizes my thinking a lot better than I managed to do in the previous message.
Re: [PATCH v8 00/11] Git filter protocol
Jakub Narębskiwrites: > Or even better: make filter driver write its pid to pidfile, and then > "wait $(cat rot13-filter.pid)". That's what we do in lib-git-daemon.sh > (I think). I am not sure if "wait"ing on a random process that is not a direct child is a reasonable thing to do, but I like the direction. Communicate with a pidfile and wait until "kill -0 $that_pid" fails, or something like that, would be clean enough.
Re: [PATCH v8 00/11] Git filter protocol
Lars Schneiderwrites: > A pragmatic approach: > > I could drop the "STOP" message that the filter writes to the log > on exit and everything would work as is. We could argue that this > is OK because Git doesn't care anyways if the filter process has > stopped or not. That would mean you can leave the process running while the test framework tries to remove the trash directory when we are done, creating the same bug J6t mentioned in the thread, no?
Re: [PATCH 2/4] t13xx: do not assume system config is empty
Jeff Kingwrites: > Good description. > > Signed-off-by: Jeff King > > of course. > >> @@ -1304,6 +1315,7 @@ test_expect_success '--show-origin with --get-regexp' ' >> file:$HOME/.gitconfig user.global true >> file:.git/configuser.local true >> EOF >> +GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ >> git config --show-origin --get-regexp "user\.[g|l].*" >output && >> test_cmp expect output >> ' > > This is one is trying to do a multi-file lookup, but we couldn't look in > the system config before. But to naturally extend it, it ought to look > like this on top: > > diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh > index d2476a8..4dd5ce3 100755 > --- a/t/t1300-repo-config.sh > +++ b/t/t1300-repo-config.sh > @@ -1310,11 +1310,12 @@ test_expect_success '--show-origin with single file' ' > > test_expect_success '--show-origin with --get-regexp' ' > cat >expect <<-EOF && > + file:$HOME/etc-gitconfiguser.system true > file:$HOME/.gitconfig user.global true > file:.git/configuser.local true > EOF > GIT_ETC_GITCONFIG=$HOME/etc-gitconfig \ > - git config --show-origin --get-regexp "user\.[g|l].*" >output && > + git config --show-origin --get-regexp "user\.[g|l|s].*" >output && > test_cmp expect output > ' Makes sense modulo you inherited useless vertical bars from the original. I'll squash something like that in but without || ;-) Thanks.
Re: [PATCH 2/4] t13xx: do not assume system config is empty
On Thu, Sep 29, 2016 at 02:03:39PM -0700, Junio C Hamano wrote: > > - git config --show-origin --get-regexp "user\.[g|l].*" >output && > > + git config --show-origin --get-regexp "user\.[g|l|s].*" >output && > > test_cmp expect output > > ' > > Makes sense modulo you inherited useless vertical bars from the > original. I'll squash something like that in but without || ;-) Heh, I glossed over that completely. Thanks. -Peff
[PATCH v2 7/9] t1300: be explicit in local configuration tests
Many tests in this script prepare variable settings in the repository local configuration and expects "--list" to report only the ones from the repository local configuration. This happened to work while we were running out tests under GIT_CONFIG_NOSYSTEM and/or with an empty system-wide configuration file, but as we will soon make our fake system-wide configuration non-empty, prepare for that change by explicitly telling the command to look only at "--local" configuration. Signed-off-by: Junio C Hamano--- t/t1300-repo-config.sh | 80 +- 1 file changed, 40 insertions(+), 40 deletions(-) diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh index 2a15cd4d150d..8979212946c0 100755 --- a/t/t1300-repo-config.sh +++ b/t/t1300-repo-config.sh @@ -245,18 +245,18 @@ test_expect_success 'multivar' ' ' test_expect_success 'non-match' ' - git config --get nextsection.nonewline !for + git config --local --get nextsection.nonewline !for ' test_expect_success 'non-match value' ' echo wow >expect && - git config --get nextsection.nonewline !for >actual && + git config --local --get nextsection.nonewline !for >actual && test_cmp expect actual ' test_expect_success 'multi-valued get returns final one' ' echo "wow2 for me" >expect && - git config --get nextsection.nonewline >actual && + git config --local --get nextsection.nonewline >actual && test_cmp expect actual ' @@ -265,7 +265,7 @@ test_expect_success 'multi-valued get-all returns all' ' wow wow2 for me EOF - git config --get-all nextsection.nonewline >actual && + git config --local --get-all nextsection.nonewline >actual && test_cmp expect actual ' @@ -341,7 +341,7 @@ version.1.2.3eX.alpha=beta EOF test_expect_success 'working --list' ' - git config --list > output && + git config --local --list > output && test_cmp expect output ' @@ -361,7 +361,7 @@ version.1.2.3eX.alpha EOF test_expect_success '--name-only --list' ' - git config --name-only --list >output && + git config --local --name-only --list >output && test_cmp expect output ' @@ -371,7 +371,7 @@ nextsection.nonewline wow2 for me EOF test_expect_success '--get-regexp' ' - git config --get-regexp in >output && + git config --local --get-regexp in >output && test_cmp expect output ' @@ -381,7 +381,7 @@ nextsection.nonewline EOF test_expect_success '--name-only --get-regexp' ' - git config --name-only --get-regexp in >output && + git config --local --name-only --get-regexp in >output && test_cmp expect output ' @@ -392,7 +392,7 @@ EOF test_expect_success '--add' ' git config --add nextsection.nonewline "wow4 for you" && - git config --get-all nextsection.nonewline > output && + git config --local --get-all nextsection.nonewline > output && test_cmp expect output ' @@ -404,45 +404,45 @@ cat > .git/config << EOF EOF test_expect_success 'get variable with no value' ' - git config --get novalue.variable ^$ + git config --local --get novalue.variable ^$ ' test_expect_success 'get variable with empty value' ' - git config --get emptyvalue.variable ^$ + git config --local --get emptyvalue.variable ^$ ' echo novalue.variable > expect test_expect_success 'get-regexp variable with no value' ' - git config --get-regexp novalue > output && + git config --local --get-regexp novalue > output && test_cmp expect output ' echo 'novalue.variable true' > expect test_expect_success 'get-regexp --bool variable with no value' ' - git config --bool --get-regexp novalue > output && + git config --local --bool --get-regexp novalue > output && test_cmp expect output ' echo 'emptyvalue.variable ' > expect test_expect_success 'get-regexp variable with empty value' ' - git config --get-regexp emptyvalue > output && + git config --local --get-regexp emptyvalue > output && test_cmp expect output ' echo true > expect test_expect_success 'get bool variable with no value' ' - git config --bool novalue.variable > output && + git config --local --bool novalue.variable > output && test_cmp expect output ' echo false > expect test_expect_success 'get bool variable with empty value' ' - git config --bool emptyvalue.variable > output && + git config --local --bool emptyvalue.variable > output && test_cmp expect output ' @@ -683,15 +683,15 @@ test_expect_success numbers ' git config mega.ton 1m && echo 1024 >expect && echo 1048576 >>expect && - git config --int --get kilo.gram >actual && - git config --int --get mega.ton >>actual && + git config --local --int --get kilo.gram >actual && + git
[PATCH v2 5/9] t1300: disable system-wide config for tests that wants to read from -c
This test wants to do git -c x.two=2 config --get-regexp ^x\.* and see x.two that came from the one-shot configuration in its output. This form cannot be limited with "--local", as it limits the input to the local configuration file and makes these one-shot settings ignored. At this point, the test knows that there is no variable that match x.* in its local configuration, and it also was OK to assume that there is nothing in the system-wide config or global one. Make sure that assumption holds by using the GIT_CONFIG_NOSYSTEM environment, as we may add anything to t/gitconfig-for-test later. Signed-off-by: Junio C Hamano--- t/t1300-repo-config.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh index 940469339bd2..95734034e0d5 100755 --- a/t/t1300-repo-config.sh +++ b/t/t1300-repo-config.sh @@ -1093,6 +1093,7 @@ test_expect_success 'multiple git -c appends config' ' x.one 1 x.two 2 EOF + GIT_CONFIG_NOSYSTEM=1 \ git -c x.one=1 x >actual && test_cmp expect actual ' -- 2.10.0-589-g5adf4e1
[PATCH v2 9/9] core.abbrev: raise the default abbreviation to 12 hexdigits
As Peff said, responding in a thread started by Linus's suggestion to raise the default abbreviation to 12 hexdigits: I actually think "12" might be sane for a long time. That's 48 bits of sha1, so we'd expect a 50% chance of a single collision at 2^24, or 16 million. The biggest repository I know about (in number of objects) is the one holding all of the objects for all of the forks of torvalds/linux on GitHub. It's at about 15 million objects. Which seems close, but remember that's the size where we expect to see a single collision. They don't become common until much later (I didn't compute an exact number, but Linus's 16x sounds about right). I know that the growth of the kernel isn't really linear, but I think the need to bump to "13" might not just be decades, but possibly a century or more. So 12 seems reasonable, and the only downside for it (or for "13", for that matter) is a few extra bytes. I dunno, maybe people will really hate that, but I have a feeling these are mostly cut-and-pasted anyway. And this does exactly that. Keep the tests working by explicitly asking for the old 7 hexdigits setting in the fake system-wide configuration file used for tests. Signed-off-by: Junio C Hamano--- environment.c| 2 +- t/gitconfig-for-test | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/environment.c b/environment.c index ca72464a9850..25daddbc13d6 100644 --- a/environment.c +++ b/environment.c @@ -16,7 +16,7 @@ int trust_executable_bit = 1; int trust_ctime = 1; int check_stat = 1; int has_symlinks = 1; -int minimum_abbrev = 4, default_abbrev = 7; +int minimum_abbrev = 4, default_abbrev = 12; int ignore_case; int assume_unchanged; int prefer_symlink_refs; diff --git a/t/gitconfig-for-test b/t/gitconfig-for-test index 4598885ed5c3..8c284425d725 100644 --- a/t/gitconfig-for-test +++ b/t/gitconfig-for-test @@ -4,3 +4,6 @@ ;; [user] ;; name = A U Thor ;; email = aut...@example.com + +[core] + abbrev = 7 -- 2.10.0-589-g5adf4e1
[PATCH v2 8/9] worktree: honor configuration variables
The command accesses default_abbrev (defined in environment.c and is updated via core.abbrev configuration), but never makes any call to git_config(). The output from "worktree list" ignores the abbrev setting for this reason. Make a call to git_config() to read the default set of configuration variables at the beginning of the command. Signed-off-by: Junio C Hamano--- builtin/worktree.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/builtin/worktree.c b/builtin/worktree.c index 6dcf7bd9d270..5c4854d3e4a6 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -528,6 +528,8 @@ int cmd_worktree(int ac, const char **av, const char *prefix) OPT_END() }; + git_config(git_default_config, NULL); + if (ac < 2) usage_with_options(worktree_usage, options); if (!prefix) -- 2.10.0-589-g5adf4e1
[PATCH v2 4/9] t1300: check also system-wide configuration file in --show-origin tests
From: Jeff KingBecause we used to run our tests with GIT_CONFIG_NOSYSTEM, these did not test that the system-wide configuration file is also read and shown as one of the origins. Create a custom/fake system-wide configuration file and make sure it appears in the output, using the newly introduced GIT_CONFIG_SYSTEM_PATH mechanism. Signed-off-by: Junio C Hamano --- t/t1300-repo-config.sh | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh index 1b3f6f4854f9..940469339bd2 100755 --- a/t/t1300-repo-config.sh +++ b/t/t1300-repo-config.sh @@ -1236,6 +1236,11 @@ test_expect_success 'set up --show-origin tests' ' [user] relative = include EOF + cat >"$HOME"/etc-gitconfig <<-\EOF && + [user] + system = true + override = system + EOF cat >"$HOME"/.gitconfig <<-EOF && [user] global = true @@ -1254,6 +1259,8 @@ test_expect_success 'set up --show-origin tests' ' test_expect_success '--show-origin with --list' ' cat >expect <<-EOF && + file:$HOME/etc-gitconfiguser.system=true + file:$HOME/etc-gitconfiguser.override=system file:$HOME/.gitconfig user.global=true file:$HOME/.gitconfig user.override=global file:$HOME/.gitconfig include.path=$INCLUDE_DIR/absolute.include @@ -1264,13 +1271,16 @@ test_expect_success '--show-origin with --list' ' file:.git/../include/relative.include user.relative=include command line: user.cmdline=true EOF + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ git -c user.cmdline=true config --list --show-origin >output && test_cmp expect output ' test_expect_success '--show-origin with --list --null' ' cat >expect <<-EOF && - file:$HOME/.gitconfigQuser.global + file:$HOME/etc-gitconfigQuser.system + trueQfile:$HOME/etc-gitconfigQuser.override + systemQfile:$HOME/.gitconfigQuser.global trueQfile:$HOME/.gitconfigQuser.override globalQfile:$HOME/.gitconfigQinclude.path $INCLUDE_DIR/absolute.includeQfile:$INCLUDE_DIR/absolute.includeQuser.absolute @@ -1281,6 +1291,7 @@ test_expect_success '--show-origin with --list --null' ' includeQcommand line:Quser.cmdline trueQ EOF + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ git -c user.cmdline=true config --null --list --show-origin >output.raw && nul_to_q output && # The here-doc above adds a newline that the --null output would not @@ -1304,6 +1315,7 @@ test_expect_success '--show-origin with --get-regexp' ' file:$HOME/.gitconfig user.global true file:.git/configuser.local true EOF + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ git config --show-origin --get-regexp "user\.[g|l].*" >output && test_cmp expect output ' @@ -1312,6 +1324,7 @@ test_expect_success '--show-origin getting a single key' ' cat >expect <<-\EOF && file:.git/configlocal EOF + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ git config --show-origin user.override >output && test_cmp expect output ' -- 2.10.0-589-g5adf4e1
[PATCH v2 6/9] t1300: take contents of system-wide configuration into account in "--list" test
One of the "git config" test tries to see that the command run without a valid repository still shows non-repository specific configuration. As we are planning to later make the system-wide file non-empty, prepare for the change by expecting to see the contents from it. Signed-off-by: Junio C Hamano--- t/t1300-repo-config.sh | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh index 95734034e0d5..2a15cd4d150d 100755 --- a/t/t1300-repo-config.sh +++ b/t/t1300-repo-config.sh @@ -344,10 +344,11 @@ test_expect_success 'working --list' ' git config --list > output && test_cmp expect output ' -cat > expect << EOF -EOF -test_expect_success '--list without repo produces empty output' ' +test_expect_success '--list without repo shows only system-wide and global' ' + # The global one aka $HOME/.gitconfig is missing, + # so we do not have to worry about it. + git config --system --list >expect && git --git-dir=nonexistent config --list >output && test_cmp expect output ' -- 2.10.0-589-g5adf4e1
[PATCH v2 2/9] t1300: always compare expect to actual
The two arguments to the test_cmp helper should always have the expected output first and then the actual one, so that an unmet expectation would appear as -what we wanted to see +what we actually saw in its output. Signed-off-by: Junio C Hamano--- t/t1300-repo-config.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh index 0543b62227bf..1b3f6f4854f9 100755 --- a/t/t1300-repo-config.sh +++ b/t/t1300-repo-config.sh @@ -936,7 +936,7 @@ EOF test_expect_success 'value continued on next line' ' git config --list > result && - test_cmp result expect + test_cmp expect result ' cat > .git/config <<\EOF -- 2.10.0-589-g5adf4e1
[PATCH v2 1/9] config: allow customizing /etc/gitconfig location with an environment
We introduced GIT_CONFIG_NOSYSTEM environment variable at ab88c363 ("allow suppressing of global and system config", 2008-02-06), primarily to protect our tests from random set of configuration variables the system administrators would put in /etc/gitconfig file. Introduce a new environment variable GIT_CONFIG_SYSTEM_PATH, and allow the users to specify a file that is used instead of /etc/gitconfig to read (and write) the system-wide configuration. By doing so, we can force our tests to honor certain configuration settings by default by pointing GIT_CONFIG_SYSTEM_PATH at our own, in addition to the existing GIT_CONFIG_NOSYSTEM mechanism. Signed-off-by: Junio C Hamano--- cache.h| 1 + config.c | 2 ++ t/gitconfig-for-test | 6 ++ t/t1300-repo-config.sh | 15 +++ t/test-lib.sh | 4 ++-- 5 files changed, 26 insertions(+), 2 deletions(-) create mode 100644 t/gitconfig-for-test diff --git a/cache.h b/cache.h index b0dae4bac1a1..d4b689f386d6 100644 --- a/cache.h +++ b/cache.h @@ -408,6 +408,7 @@ static inline enum object_type object_type(unsigned int mode) #define GIT_NAMESPACE_ENVIRONMENT "GIT_NAMESPACE" #define GIT_WORK_TREE_ENVIRONMENT "GIT_WORK_TREE" #define GIT_PREFIX_ENVIRONMENT "GIT_PREFIX" +#define GIT_CONFIG_SYSTEM_PATH_ENVIRONMENT "GIT_CONFIG_SYSTEM_PATH" #define DEFAULT_GIT_DIR_ENVIRONMENT ".git" #define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY" #define INDEX_ENVIRONMENT "GIT_INDEX_FILE" diff --git a/config.c b/config.c index 0dfed682b868..096bb754aad7 100644 --- a/config.c +++ b/config.c @@ -1253,6 +1253,8 @@ const char *git_etc_gitconfig(void) { static const char *system_wide; if (!system_wide) + system_wide = getenv(GIT_CONFIG_SYSTEM_PATH_ENVIRONMENT); + if (!system_wide) system_wide = system_path(ETC_GITCONFIG); return system_wide; } diff --git a/t/gitconfig-for-test b/t/gitconfig-for-test new file mode 100644 index ..4598885ed5c3 --- /dev/null +++ b/t/gitconfig-for-test @@ -0,0 +1,6 @@ +;; This file is used as if it were /etc/gitconfig while running the +;; test scripts in this directory. +;; +;; [user] +;; name = A U Thor +;; email = aut...@example.com diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh index 923bfc5a2606..0543b62227bf 100755 --- a/t/t1300-repo-config.sh +++ b/t/t1300-repo-config.sh @@ -1372,4 +1372,19 @@ test_expect_success !MINGW '--show-origin blob ref' ' test_cmp expect output ' +test_expect_success 'system-wide configuration' ' + system="$TRASH_DIRECTORY/system-wide" && + >"$system" && + git config -f "$system" --add frotz.nitfol xyzzy && + + git config -f "$system" frotz.nitfol >expect && + GIT_CONFIG_SYSTEM_PATH="$system" \ + git config --system frotz.nitfol >actual && + + GIT_CONFIG_SYSTEM_PATH="$system" \ + git config --system --replace-all frotz.nitfol blorb && + echo blorb >expect && + GIT_CONFIG_SYSTEM_PATH="$system" git config --system frotz.nitfol >actual +' + test_done diff --git a/t/test-lib.sh b/t/test-lib.sh index ac56512a1c5e..b811e4c70273 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -851,9 +851,9 @@ else # normal case, use ../bin-wrappers only unless $with_dashes: fi fi GIT_TEMPLATE_DIR="$GIT_BUILD_DIR"/templates/blt -GIT_CONFIG_NOSYSTEM=1 +GIT_CONFIG_SYSTEM_PATH="$GIT_BUILD_DIR/t/gitconfig-for-test" GIT_ATTR_NOSYSTEM=1 -export PATH GIT_EXEC_PATH GIT_TEMPLATE_DIR GIT_CONFIG_NOSYSTEM GIT_ATTR_NOSYSTEM +export PATH GIT_EXEC_PATH GIT_TEMPLATE_DIR GIT_CONFIG_SYSTEM_PATH GIT_ATTR_NOSYSTEM if test -z "$GIT_TEST_CMP" then -- 2.10.0-589-g5adf4e1
[PATCH v2 3/9] t1308: ignore system-wide config in the iteration test
We do not want to keep track of the exact contents of the fake system-wide t/gitconfig-for-test configuration file. Keep ignoring it as we used to. Signed-off-by: Junio C Hamano--- t/t1308-config-set.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/t/t1308-config-set.sh b/t/t1308-config-set.sh index 7655c94c2801..5d5adb1efd8e 100755 --- a/t/t1308-config-set.sh +++ b/t/t1308-config-set.sh @@ -260,6 +260,7 @@ test_expect_success 'iteration shows correct origins' ' name= scope=cmdline EOF + GIT_CONFIG_NOSYSTEM=1 \ GIT_CONFIG_PARAMETERS=$cmdline_config test-config iterate >actual && test_cmp expect actual ' -- 2.10.0-589-g5adf4e1
[PATCH v2 0/9] allow customizing /etc/gitconfig location with an environment
This ended up growing quite a bit, and I mostly hate it. - Patch 1 introduces GIT_CONFIG_SYSTEM_PATH environment variable that lets you point at a file other than /etc/gitconfig to pretend that your file is the system-wide configuration. - Patch 2 is a small bugfix. - Patches 3-7 are updates to 1300 and 1308, i.e. tests for "git config", to make them more robust, in preparation for using GIT_CONFIG_SYSTEM_PATH mechanism to point at a file during the test. It protects them a bit more than necessary in that the variables some of the tests they use when they try to see the output from "git config --get" are unlikely to appear in the fake system-wide configuration during the test (hence disabling the fake system-wide configuration has no practical effect), but nevertheless the calls are protected by explicitly telling them to read only from --local configuration file to future-proof them. - Patch 8 is queued elsewhere already. - Patch 9 raises the default core.abbrev to 12 and countermands it by setting it to 7 in a fake system-wide configuration file during our test. The unconditional widening of the default abbreviation size in this patch will have to be discarded, preferring the approach Linus is taking to auto-size it based on the number of objects in the repository, but the part that updates the test script may still be necessary. Jeff King (1): t1300: check also system-wide configuration file in --show-origin tests Junio C Hamano (8): config: allow customizing /etc/gitconfig location with an environment t1300: always compare expect to actual t1308: ignore system-wide config in the iteration test t1300: disable system-wide config for tests that wants to read from -c t1300: take contents of system-wide configuration into account in "--list" test t1300: be explicit in local configuration tests worktree: honor configuration variables core.abbrev: raise the default abbreviation to 12 hexdigits builtin/worktree.c | 2 + cache.h| 1 + config.c | 2 + environment.c | 2 +- t/gitconfig-for-test | 9 t/t1300-repo-config.sh | 120 ++--- t/t1308-config-set.sh | 1 + t/test-lib.sh | 4 +- 8 files changed, 93 insertions(+), 48 deletions(-) create mode 100644 t/gitconfig-for-test -- 2.10.0-589-g5adf4e1
Re: [PATCH v8 00/11] Git filter protocol
W dniu 29.09.2016 o 13:57, Torsten Bögershausen pisze: > On 29/09/16 12:28, Lars Schneider wrote: >> This is what happens: >> >> 1) Git exits >> 2) The filter process receives EOF and prints "STOP" to the log >> 3) t0021 checks the content of the log >> >> Sometimes 3 happened before 2 which makes the test fail. >> (Example: https://travis-ci.org/git/git/jobs/162660563 ) >> >> I added a this to wait until the filter process terminates: >> >> +wait_for_filter_termination () { >> +while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null >> 2>&1 >> +do >> +echo "Waiting for /t0021/rot13-filter.pl to finish..." >> +sleep 1 >> +done >> +} >> >> Does this look OK to you? > Do we need the ps at all ? > How about this: > > +wait_for_filter_termination () { > +while ! grep "STOP" LOGFILENAME >/dev/null > +do > +echo "Waiting for /t0021/rot13-filter.pl to finish..." > +sleep 1 > +done > +} Or even better: make filter driver write its pid to pidfile, and then "wait $(cat rot13-filter.pid)". That's what we do in lib-git-daemon.sh (I think). If the problem is exit status of "wait" builtin, then filter driver can remove its pidfile after writing "STOP", just before ending. -- Jakub Narębski
Re: [PATCH v8 00/11] Git filter protocol
> On 29 Sep 2016, at 18:57, Junio C Hamanowrote: > > Torsten Bögershausen writes: > >>> 1) Git exits >>> 2) The filter process receives EOF and prints "STOP" to the log >>> 3) t0021 checks the content of the log >>> >>> Sometimes 3 happened before 2 which makes the test fail. >>> (Example: https://travis-ci.org/git/git/jobs/162660563 ) >>> >>> I added a this to wait until the filter process terminates: >>> >>> +wait_for_filter_termination () { >>> + while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null >>> 2>&1 >>> + do >>> + echo "Waiting for /t0021/rot13-filter.pl to finish..." >>> + sleep 1 >>> + done >>> +} >>> >>> Does this look OK to you? >> Do we need the ps at all ? >> How about this: >> >> +wait_for_filter_termination () { >> +while ! grep "STOP" LOGFILENAME >/dev/null >> +do >> +echo "Waiting for /t0021/rot13-filter.pl to finish..." >> +sleep 1 >> +done >> +} > > Running "ps" and grepping for a command is not suitable for script > to reliably tell things, so it is out of question. Compared to > that, your version looks slightly better, but what if the machinery > that being tested, i.e. the part that drives the filter process, is > buggy or becomes buggy and causes the filter process that writes > "STOP" to die before it actually writes that string? > > I have a feeling that the machinery being tested needs to be fixed > so that the sequence is always be: > >0) Git spawns the filter process, as it needs some contents to > be filtered. > >1) Git did everything it needed to do and decides that is time > to go. > >2) Filter process receives EOF and prints "STOP" to the log. > >3) Git waits until the filter process finishes. > >4) t0021, after Git finishes, checks the log. A pragmatic approach: I could drop the "STOP" message that the filter writes to the log on exit and everything would work as is. We could argue that this is OK because Git doesn't care anyways if the filter process has stopped or not. Would that be OK for everyone? - Lars
Re: [PATCH 2/4] t13xx: do not assume system config is empty
Jeff Kingwrites: > I just don't see it being a problem. Adding core.abbrev for the whole > test suite is just about not having a big flag day where we change all > the tests. Changing one or two tests (and again, I'd be surprised if we > even have to do that) doesn't seem like a big deal. I've already wasted several hours whipping t1300 into shape, because it was done in not so forward-looking future-proofed way. I am not worried about core.abbrev but I am worried more about the next thing that requires us to add an entry to t/gitconfig-for-test. Adding a corresponding entry to retain the old default for that new config to two places may not be a big deal, but it still makes me feel a bit uneasy. In any case, I suspect that Linus's "auto" thing may still need the custom system config with t1300 clean-up to pass the test, even though I suspect it would compute that 7 is enough for most of the tiny repositories our tests use, so I'll polish this a bit more while waiting for that discussion to settle. Thanks.
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
Linus Torvaldswrites: > But you could easily also just instead have it do something like > > if (default_abbrev < 0) > default_abbrev = initialize_abbrev(); > > at startup time if "abbrev_commit" is set, and just do it once and for > all rather rthan the odd loping behavior. I think that is a reasonable way to go. #define DEFAULT_ABBREV get_default_abbrev() would help.
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 12:16 PM, Jeff Kingwrote: > > Hmm. So at length 7, we expect collisions at 2^14, which is 16384. That > seems really low. I mean, by the birthday paradox that's where expect > a 50% chance of a collision. But that's a single collision. We > definitely don't expect them to be common at that size. > > So I suspect this could be a bit looser. So I have to admit that I was surprised by how quickly it actually decided that 7 isn't enough. In fact, the reason I initially said that git used 8 digits was that I didn't count very closely, and just verified that it was more than the default 7. But quite frankly, I think the math is correct, and part of that is that the logic is all about not just the current state, but the "reasonably near future". So it is indeed fairly aggressive, and the moment you have more objects than the "we'd expect to probably see _one_ collision" it grows the size. But looking at the kernel situation, that really is what we'd want, because the whole problem with the existing code is that it only takes the *current* situation into account. That's what we want to get away from. We want git to pick a number that is sane from a standpoint of "this project is still growing". And git _already_ has commits that are ambiguous in 8 hex digits and need 9. Yes, it's rare today, but the reason I'm telling kernel developers to use 12 is because while a size-11 collision is very rare today, it does actually happen, and we want o pick a value where it is rare enough that even in the near future it's not going to be a big deal. Don't get me wrong: collisions aren't fatal. So it's not like we have to absolutely avoid them, and I really like your patch series exactly because it makes collisions even less of a deal (particularly since I expect people will not upgrade immediately, so we'll continue to see even new 7-hex-digit short forms even in the kernel). So it's a balance of making the hex string long enough that it's simply not a big worry. So I'm sure it *could* be looser, but I actually also really suspect that git truly *should* use a 9-digit abbreviation rather than 8 (and 7 is definitely starting to be borderline, I think). > As far as the implementation, I was surprised to see it touch > get_short_sha1() at all. That's, after all, for lookups, and we would > never want to require more characters on the reading side. Heh. The implementation is crap. It was literally a "how can I make the smallest possible patch" implementation. I was finishing it off while at a talk by Nicolas Pitre at Linaro Connect where I am right now. So I agree - it does extra work just because that's where it all slotted in with minimal effort. At a minimum, once it finds a good new default, it should just memoize that. So a minimal fix to the "it's stupldly recalculating things over rand over again" would be to just set "default_abbrev" to the value it finds acceptable after the first time it finds something, so that it doesn't end up looping _again_ in the future. But you could easily also just instead have it do something like if (default_abbrev < 0) default_abbrev = initialize_abbrev(); at startup time if "abbrev_commit" is set, and just do it once and for all rather rthan the odd loping behavior. I really just wanted to see how well the concept worked, and I was happy to see that it gave what I thought were the "correct" numbers. And the loop was salready there ... Linus
Impossible to change working directory
Hi there, I have a problem executing a pre-commit hook. The hook script has to change the working directory to work and if I use plain git commit it works as expected, the script executes without errors, but if I use git commit —only file.x file.y the script fails because changing the current working directory fails. If I echo the current working directory it always echoes the root repository path Is this expected behavior? Thanks for your feedback. Cheers Sebastian
Re: [PATCH 2/4] t13xx: do not assume system config is empty
On Thu, Sep 29, 2016 at 12:06:15PM -0700, Junio C Hamano wrote: > I think it deserves a separate patch and the result is more > understandable. I've queued this for now (on top of a revised 1/4 > that uses GIT_CONFIG_SYSTEM_PATH instead). Thanks, makes sense (and I like the new variable name better, by the way). > -- >8 -- > From: Jeff King> Date: Thu, 29 Sep 2016 11:29:10 -0700 > Subject: [PATCH] t1300: check also system-wide configuration file in > --show-origin tests > > Because we used to run our tests with GIT_CONFIG_NOSYSTEM, these did > not test that the system-wide configuration file is also read and > shown as one of the origins. Create a custom/fake system-wide > configuration file and make sure it appears in the output, using the > newly introduced GIT_CONFIG_SYSTEM_PATH mechanism. > > Signed-off-by: Junio C Hamano Good description. Signed-off-by: Jeff King of course. > @@ -1304,6 +1315,7 @@ test_expect_success '--show-origin with --get-regexp' ' > file:$HOME/.gitconfig user.global true > file:.git/configuser.local true > EOF > + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ > git config --show-origin --get-regexp "user\.[g|l].*" >output && > test_cmp expect output > ' This is one is trying to do a multi-file lookup, but we couldn't look in the system config before. But to naturally extend it, it ought to look like this on top: diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh index d2476a8..4dd5ce3 100755 --- a/t/t1300-repo-config.sh +++ b/t/t1300-repo-config.sh @@ -1310,11 +1310,12 @@ test_expect_success '--show-origin with single file' ' test_expect_success '--show-origin with --get-regexp' ' cat >expect <<-EOF && + file:$HOME/etc-gitconfiguser.system true file:$HOME/.gitconfig user.global true file:.git/configuser.local true EOF GIT_ETC_GITCONFIG=$HOME/etc-gitconfig \ - git config --show-origin --get-regexp "user\.[g|l].*" >output && + git config --show-origin --get-regexp "user\.[g|l|s].*" >output && test_cmp expect output ' > @@ -1312,6 +1324,7 @@ test_expect_success '--show-origin getting a single > key' ' > cat >expect <<-\EOF && > file:.git/configlocal > EOF > + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ > git config --show-origin user.override >output && > test_cmp expect output > ' And I was tempted to say this one should not need to care, but I guess it is testing that we correctly read the override from the local config over the global one. So likewise, it is good to check that we also override the system config (it does not effect the "expect" output, but that does not mean it is not enhancing the test). -Peff
[RFC/PATCH 2/2] sequencer: allow origin line below commit title
When git cherry-pick -x is invoked, a "(cherry picked from commit ...)" line is appended to the footer of a commit message that Git interprets to contain a footer; otherwise it is appended at the end as a new paragraph, preceded by a blank line. This behavior may appear inconsistent, especially to users who differ from Git in their interpretation of what constitutes a footer. Provide the user, through a configuration option and command-line flag, the option of placing the "cherry picked" line below the commit title instead of the current behavior. This allows the "cherry picked" line to be placed in a consistent manner, independent of the nature of the footer of the existing commit message. Signed-off-by: Jonathan Tan--- Documentation/config.txt | 4 +++ Documentation/git-cherry-pick.txt | 15 +- builtin/revert.c | 38 - sequencer.c | 39 -- sequencer.h | 7 + t/t3511-cherry-pick-x.sh | 59 +++ 6 files changed, 152 insertions(+), 10 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 0bcb679..fb1990f 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -945,6 +945,10 @@ browser..path:: browse HTML help (see `-w` option in linkgit:git-help[1]) or a working repository in gitweb (see linkgit:git-instaweb[1]). +cherrypick.originLineLocation:: + Default for the `--origin-line-location` option in git-cherry-pick. + Defaults to `bottom`. + clean.requireForce:: A boolean to make git-clean do nothing unless given -f, -i or -n. Defaults to true. diff --git a/Documentation/git-cherry-pick.txt b/Documentation/git-cherry-pick.txt index d35d771..5a8359f 100644 --- a/Documentation/git-cherry-pick.txt +++ b/Documentation/git-cherry-pick.txt @@ -58,7 +58,7 @@ OPTIONS message prior to committing. -x:: - When recording the commit, append a line that says + When recording the commit, add a line that says "(cherry picked from commit ...)" to the original commit message in order to indicate which commit this change was cherry-picked from. This is done only for cherry @@ -71,6 +71,14 @@ OPTIONS development branch), adding this information can be useful. +--origin-line-location:: + Where to put the "(cherry picked from commit ...)" line when requested + with the `-x` option. May be `top`, meaning at the top of the commit + message body, immediately below the commit title (see the DISCUSSION + section of linkgit:git-commit[1]), or `bottom`, meaning at the end of + the commit message. The default is controlled by the + `cherrypick.originLineLocation` configuration variable. + -r:: It used to be that the command defaulted to do `-x` described above, and `-r` was to disable it. Now the @@ -224,6 +232,11 @@ the working tree. spending extra time to avoid mistakes based on incorrectly matching context lines. +CONFIGURATION +- +cherrypick.originLineLocation:: + Default for the `--origin-line-location` option. Defaults to `bottom`. + SEE ALSO linkgit:git-revert[1] diff --git a/builtin/revert.c b/builtin/revert.c index 4e69380..a5459a0 100644 --- a/builtin/revert.c +++ b/builtin/revert.c @@ -71,11 +71,25 @@ static void verify_opt_compatible(const char *me, const char *base_opt, ...) die(_("%s: %s cannot be used with %s"), me, this_opt, base_opt); } +static int set_origin_line(enum origin_line *line, const char *str) +{ + if (!strcmp(str, "bottom")) { + *line = ORIGIN_LINE_BOTTOM; + return 1; + } + if (!strcmp(str, "top")) { + *line = ORIGIN_LINE_TOP; + return 1; + } + return 0; +} + static void parse_args(int argc, const char **argv, struct replay_opts *opts) { const char * const * usage_str = revert_or_cherry_pick_usage(opts); const char *me = action_name(opts); int cmd = 0; + const char *origin_str = NULL; struct option base_options[] = { OPT_CMDMODE(0, "quit", , N_("end revert or cherry-pick sequence"), 'q'), OPT_CMDMODE(0, "continue", , N_("resume revert or cherry-pick sequence"), 'c'), @@ -98,6 +112,7 @@ static void parse_args(int argc, const char **argv, struct replay_opts *opts) if (opts->action == REPLAY_PICK) { struct option cp_extra[] = { OPT_BOOL('x', NULL, >record_origin, N_("append commit name")), + OPT_STRING(0, "origin-line-location", _str, N_("origin-line-location"), N_("location of appended commit name")), OPT_BOOL(0, "ff", >allow_ff, N_("allow fast-forward")),
[RFC/PATCH 0/2] place cherry pick line below commit title
This is somewhat of a follow-up to my previous e-mail with subject "[PATCH] sequencer: support folding in rfc2822 footer" [1], in which I proposed relaxing the definition of a commit message footer to allow multiple-line field bodies (as described in RFC2822), but its strictness was deemed deliberate. Below is a patch set that allows placing the "cherry picked from" line without taking into account the definition of a commit message footer. For example, "git cherry-pick -x" (with the appropriate configuration variable or argument) would, to this commit message: commit title This is an explanatory paragraph. Footer: foo place the "(cherry picked from ...)" line below "commit title". Would this be better? [1] <1472846322-5592-1-git-send-email-jonathanta...@google.com> Jonathan Tan (2): sequencer: refactor message and origin appending sequencer: allow origin line below commit title Documentation/config.txt | 4 +++ Documentation/git-cherry-pick.txt | 15 - builtin/revert.c | 38 - sequencer.c | 69 +-- sequencer.h | 7 t/t3511-cherry-pick-x.sh | 59 + 6 files changed, 172 insertions(+), 20 deletions(-) -- 2.8.0.rc3.226.g39d4020
[RFC/PATCH 1/2] sequencer: refactor message and origin appending
Move the appending of the commit message and origin information into its own function, in preparation for a subsequent patch. Signed-off-by: Jonathan Tan--- sequencer.c | 46 -- 1 file changed, 28 insertions(+), 18 deletions(-) diff --git a/sequencer.c b/sequencer.c index 3804fa9..b29c9ca 100644 --- a/sequencer.c +++ b/sequencer.c @@ -443,6 +443,33 @@ static int allow_empty(struct replay_opts *opts, struct commit *commit) return 1; } +/* + * Appends the commit log message, including the cherry picked notification if + * record_origin is nonzero. + */ +static void append_message(struct strbuf *msgbuf, + const struct commit_message *msg, + int record_origin, + const struct commit *commit) +{ + /* +* Append the commit log message to msgbuf; it starts +* after the tree, parent, author, committer +* information followed by "\n\n". +*/ + const char *p = strstr(msg->message, "\n\n"); + if (p) + strbuf_addstr(msgbuf, skip_blank_lines(p + 2)); + + if (record_origin) { + if (!has_conforming_footer(msgbuf, NULL, 0)) + strbuf_addch(msgbuf, '\n'); + strbuf_addstr(msgbuf, cherry_picked_prefix); + strbuf_addstr(msgbuf, oid_to_hex(>object.oid)); + strbuf_addstr(msgbuf, ")\n"); + } +} + static int do_pick_commit(struct commit *commit, struct replay_opts *opts) { unsigned char head[20]; @@ -534,29 +561,12 @@ static int do_pick_commit(struct commit *commit, struct replay_opts *opts) } strbuf_addstr(, ".\n"); } else { - const char *p; - base = parent; base_label = msg.parent_label; next = commit; next_label = msg.label; - /* -* Append the commit log message to msgbuf; it starts -* after the tree, parent, author, committer -* information followed by "\n\n". -*/ - p = strstr(msg.message, "\n\n"); - if (p) - strbuf_addstr(, skip_blank_lines(p + 2)); - - if (opts->record_origin) { - if (!has_conforming_footer(, NULL, 0)) - strbuf_addch(, '\n'); - strbuf_addstr(, cherry_picked_prefix); - strbuf_addstr(, oid_to_hex(>object.oid)); - strbuf_addstr(, ")\n"); - } + append_message(, , opts->record_origin, commit); } if (!opts->strategy || !strcmp(opts->strategy, "recursive") || opts->action == REPLAY_REVERT) { -- 2.8.0.rc3.226.g39d4020
Re: [PATCH 2/4] t13xx: do not assume system config is empty
On Thu, Sep 29, 2016 at 11:57:02AM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > >> "either" meaning "we do not need to add --local and we do not need > >> GIT_CONFIG_NOSYSTEM"? > > > > Yes. I didn't test it with your core.abbrev patch 4/4, but I _didn't_ > > have to touch their expected output after pointing them at a non-empty > > etc-gitconfig file in the trash directory. Which implies to me they > > don't care either way (which makes sense; they are asking for a specific > > key which is supposed to be found in one of the other files). > > There is a bit of problem here, though. > > * If we make t1300 point at its own system-wide config, it will be >in control of its contents, so "find this key" will find only it >wants to find (or we found a regression). > > * But then if it ever does something that depends on the default >value of core.abbrev (or whatever we'd tweak in response to the >next suggestion by Linus ;-), we cannot really allow it to do >so. We'd want t/gitconfig-for-test to be the single place that >we can tweak these things, but we'll have to know t1300 uses its >own and need to make the same change there, too. Right, but I think that's fine. Tests that care deeply about the contents of etc-gitconfig are unlikely to care about core.abbrev. And in the off chance that they do, then the worst case is...they get updated to handle core.abbrev (either passing a command line option, or just putting core.abbrev in their test file). I just don't see it being a problem. Adding core.abbrev for the whole test suite is just about not having a big flag day where we change all the tests. Changing one or two tests (and again, I'd be surprised if we even have to do that) doesn't seem like a big deal. -Peff
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 11:55:46AM -0700, Linus Torvalds wrote: > I think the patch can speak for itself, but the basic core is this > section in get_short_sha1(): > > + if (len < 16 && !status && (flags & GET_SHA1_AUTOMATIC)) { > + unsigned int expect_collision = 1 << (len * 2); > + if (ds.nrobjects > expect_collision) > + return SHORT_NAME_AMBIGUOUS; > + } Hmm. So at length 7, we expect collisions at 2^14, which is 16384. That seems really low. I mean, by the birthday paradox that's where expect a 50% chance of a collision. But that's a single collision. We definitely don't expect them to be common at that size. So I suspect this could be a bit looser. The real number we care about is probably something like "there is probability 'p' of a collision when we add a new object", but I'm not sure what that 'p' would be. Or perhaps "we accept collisions in 'n' percent of objects". But again, I don't know that 'n'. I dunno. I suppose being overly conservative with this number leaves room for growth. Repositories generally get bigger, not smaller. :) > What do you think? It's actually a fairly simple patch and I really do > think it makes sense and it seems to just DTRT automatically. I like the general idea. As far as the implementation, I was surprised to see it touch get_short_sha1() at all. That's, after all, for lookups, and we would never want to require more characters on the reading side. I see you worked around it with a flag so that this behavior only kicks in when called via find_unique_abbrev(). But if you look at the caller: > @@ -458,14 +472,19 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn > fn, void *cb_data) > int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len) > { > int status, exists; > + int flags = GET_SHA1_QUIETLY; > > + if (len < 0) { > + flags |= GET_SHA1_AUTOMATIC; > + len = 7; > + } > sha1_to_hex_r(hex, sha1); > if (len == 40 || !len) > return 40; > exists = has_sha1_file(sha1); > while (len < 40) { > unsigned char sha1_ret[20]; > - status = get_short_sha1(hex, len, sha1_ret, GET_SHA1_QUIETLY); > + status = get_short_sha1(hex, len, sha1_ret, flags); > if (exists > ? !status > : status == SHORT_NAME_NOT_FOUND) { You can see that we're going to do more work than we would otherwise need to. Because we start at 7, and ask get_short_sha1() "is this unique enough?", and looping. But if we _know_ we won't accept any answer shorter than some N based on the number of objects in the repository, then we should start at that N. IOW, something like: if (len < 0) len = ceil(log_base_2(repository_object_count())); here, and then you don't have to touch get_short_sha1() at all. I suspect you pushed it down into get_short_sha1() because it kind-of does the repository_object_count() step for "free" as it's looking at the object anyway. But that step is really not very expensive. And I'd even say you could just ignore loose objects entirely, and treat them like a rounding error (the way that duplicate objects in packs are treated). That leaves you with just an O(# of packs) loop over a linked list. You could even just keep a global object count up to date in add_packed_git(), and then it's O(1). -Peff
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 11:55 AM, Linus Torvaldswrote: > > For the kernel, just the *math* right now actually gives 12 > characters. For current git it actually seems to say that 8 is the > correct number. For small projects, you'll still see 7. Sorry, the git number is 9, not 8. The reason is that git has roughly 212k objects, and 9 hex digits gets expected collisions at about 256k objects. So the logic means that we'll see 7 hex digits for projects with less than 16k objects, 8 hex digits if there are less than 64k objects, and 9 hex digits for projects like git that currently have fewer than 256k objects. But git itself might not be *that* far from going to 10 hex digits with my patch. The kernel uses 12 he digits because the collision math says that's the right thing for a project with between 4M and 16M objects (with the kernel being at 5M). So on the whole the patch really does seem to just do the right thing automatically. Linus
Re: [PATCH 2/4] t13xx: do not assume system config is empty
Jeff Kingwrites: > On Thu, Sep 29, 2016 at 11:13:45AM -0700, Junio C Hamano wrote: > >> Jeff King writes: >> >> > I think anytime you would use GIT_CONFIG_NOSYSTEM over --local, it is an >> > indication that the test is trying to check how multiple sources >> > interact. And the right thing to do for them is to set GIT_ETC_GITCONFIG >> > to some known quantity. We just couldn't do that before, so we skipped >> > it. IOW, something like the patch below (on top of yours). >> >> OK, that way we can make sure that "multiple sources" operations do >> look at the system-wide stuff. > > Exactly. I think it deserves a separate patch and the result is more understandable. I've queued this for now (on top of a revised 1/4 that uses GIT_CONFIG_SYSTEM_PATH instead). -- >8 -- From: Jeff King Date: Thu, 29 Sep 2016 11:29:10 -0700 Subject: [PATCH] t1300: check also system-wide configuration file in --show-origin tests Because we used to run our tests with GIT_CONFIG_NOSYSTEM, these did not test that the system-wide configuration file is also read and shown as one of the origins. Create a custom/fake system-wide configuration file and make sure it appears in the output, using the newly introduced GIT_CONFIG_SYSTEM_PATH mechanism. Signed-off-by: Junio C Hamano --- t/t1300-repo-config.sh | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh index 0543b62227bf..aa25577709c5 100755 --- a/t/t1300-repo-config.sh +++ b/t/t1300-repo-config.sh @@ -1236,6 +1236,11 @@ test_expect_success 'set up --show-origin tests' ' [user] relative = include EOF + cat >"$HOME"/etc-gitconfig <<-\EOF && + [user] + system = true + override = system + EOF cat >"$HOME"/.gitconfig <<-EOF && [user] global = true @@ -1254,6 +1259,8 @@ test_expect_success 'set up --show-origin tests' ' test_expect_success '--show-origin with --list' ' cat >expect <<-EOF && + file:$HOME/etc-gitconfiguser.system=true + file:$HOME/etc-gitconfiguser.override=system file:$HOME/.gitconfig user.global=true file:$HOME/.gitconfig user.override=global file:$HOME/.gitconfig include.path=$INCLUDE_DIR/absolute.include @@ -1264,13 +1271,16 @@ test_expect_success '--show-origin with --list' ' file:.git/../include/relative.include user.relative=include command line: user.cmdline=true EOF + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ git -c user.cmdline=true config --list --show-origin >output && test_cmp expect output ' test_expect_success '--show-origin with --list --null' ' cat >expect <<-EOF && - file:$HOME/.gitconfigQuser.global + file:$HOME/etc-gitconfigQuser.system + trueQfile:$HOME/etc-gitconfigQuser.override + systemQfile:$HOME/.gitconfigQuser.global trueQfile:$HOME/.gitconfigQuser.override globalQfile:$HOME/.gitconfigQinclude.path $INCLUDE_DIR/absolute.includeQfile:$INCLUDE_DIR/absolute.includeQuser.absolute @@ -1281,6 +1291,7 @@ test_expect_success '--show-origin with --list --null' ' includeQcommand line:Quser.cmdline trueQ EOF + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ git -c user.cmdline=true config --null --list --show-origin >output.raw && nul_to_q output && # The here-doc above adds a newline that the --null output would not @@ -1304,6 +1315,7 @@ test_expect_success '--show-origin with --get-regexp' ' file:$HOME/.gitconfig user.global true file:.git/configuser.local true EOF + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ git config --show-origin --get-regexp "user\.[g|l].*" >output && test_cmp expect output ' @@ -1312,6 +1324,7 @@ test_expect_success '--show-origin getting a single key' ' cat >expect <<-\EOF && file:.git/configlocal EOF + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \ git config --show-origin user.override >output && test_cmp expect output ' -- 2.10.0-589-g5adf4e1
Re: [PATCH 2/4] t13xx: do not assume system config is empty
Jeff Kingwrites: >> "either" meaning "we do not need to add --local and we do not need >> GIT_CONFIG_NOSYSTEM"? > > Yes. I didn't test it with your core.abbrev patch 4/4, but I _didn't_ > have to touch their expected output after pointing them at a non-empty > etc-gitconfig file in the trash directory. Which implies to me they > don't care either way (which makes sense; they are asking for a specific > key which is supposed to be found in one of the other files). There is a bit of problem here, though. * If we make t1300 point at its own system-wide config, it will be in control of its contents, so "find this key" will find only it wants to find (or we found a regression). * But then if it ever does something that depends on the default value of core.abbrev (or whatever we'd tweak in response to the next suggestion by Linus ;-), we cannot really allow it to do so. We'd want t/gitconfig-for-test to be the single place that we can tweak these things, but we'll have to know t1300 uses its own and need to make the same change there, too. So, I dunno.
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 11:37 AM, Linus Torvaldswrote: > > I'm playing with an early patch to make the default more dynamic. > Let's see how well it works in practice, but it looks fairly > promising. Let me test a bit more and send out an RFC patch.. Ok, this is *very* rough, and it doesn't actuall pass all the tests, and I didn't even try to look at why. But it passes the trivial smell-test, and in particular it actually makes mathematical sense... I think the patch can speak for itself, but the basic core is this section in get_short_sha1(): + if (len < 16 && !status && (flags & GET_SHA1_AUTOMATIC)) { + unsigned int expect_collision = 1 << (len * 2); + if (ds.nrobjects > expect_collision) + return SHORT_NAME_AMBIGUOUS; + } basically, what it says is that we will consider a sha1 ambiguous even if it was *technically* unique (that's the '!status' part of the test) if: - the length was 15 or less *and* - the number of objects we have is larger than the expected point where statistically we should start to expect to get one collision. That "expect_collision" math is actually very simple: each hex character adds four bits of range, but since we expect collisions at the square root of the maximum number of objects, we shift by just two bits per hex digits instead. The rest of the patch is a trivial change to just initialize the default short size to -1, and consider that to mean "enable the automatic size checking with a minimum of 7". And the trivial code to estimate the number of objects (which ignores duplicates between packs etc _entirely_). For the kernel, just the *math* right now actually gives 12 characters. For current git it actually seems to say that 8 is the correct number. For small projects, you'll still see 7. ANYWAY. This patch is on top of Jeff's patches in 'pu' (I think those are great regardless of this patch!), and as mentioned, it fails some tests. I suspect that the failures might be due to the abbrev_default being -1, and some other code finds that surprising now. But as mentioned, I didn't really even look at it. What do you think? It's actually a fairly simple patch and I really do think it makes sense and it seems to just DTRT automatically. Linus cache.h | 1 + environment.c | 2 +- sha1_name.c | 21 - 3 files changed, 22 insertions(+), 2 deletions(-) diff --git a/cache.h b/cache.h index 6e33f2f..d2da6d1 100644 --- a/cache.h +++ b/cache.h @@ -1207,6 +1207,7 @@ struct object_context { #define GET_SHA1_TREEISH 020 #define GET_SHA1_BLOB 040 #define GET_SHA1_FOLLOW_SYMLINKS 0100 +#define GET_SHA1_AUTOMATIC 0200 #define GET_SHA1_ONLY_TO_DIE04000 #define GET_SHA1_DISAMBIGUATORS \ diff --git a/environment.c b/environment.c index c1442df..fd6681e 100644 --- a/environment.c +++ b/environment.c @@ -16,7 +16,7 @@ int trust_executable_bit = 1; int trust_ctime = 1; int check_stat = 1; int has_symlinks = 1; -int minimum_abbrev = 4, default_abbrev = 7; +int minimum_abbrev = 4, default_abbrev = -1; int ignore_case; int assume_unchanged; int prefer_symlink_refs; diff --git a/sha1_name.c b/sha1_name.c index 3b647fd..8791ff3 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -15,6 +15,7 @@ typedef int (*disambiguate_hint_fn)(const unsigned char *, void *); struct disambiguate_state { int len; /* length of prefix in hex chars */ + unsigned int nrobjects; char hex_pfx[GIT_SHA1_HEXSZ + 1]; unsigned char bin_pfx[GIT_SHA1_RAWSZ]; @@ -118,6 +119,12 @@ static void find_short_object_filename(struct disambiguate_state *ds) if (strlen(de->d_name) != 38) continue; + + // We only look at the one subdirectory, and we assume + // each subdirectory is roughly similar, so each object + // we find probably has 255 other objects in the other + // fan-out directories + ds->nrobjects += 256; if (memcmp(de->d_name, ds->hex_pfx + 2, ds->len - 2)) continue; memcpy(hex + 2, de->d_name, 38); @@ -151,6 +158,7 @@ static void unique_in_pack(struct packed_git *p, open_pack_index(p); num = p->num_objects; + ds->nrobjects += num; last = num; while (first < last) { uint32_t mid = (first + last) / 2; @@ -426,6 +434,12 @@ static int get_short_sha1(const char *name, int len, unsigned char *sha1, for_each_abbrev(ds.hex_pfx, show_ambiguous_object, ); } + if (len < 16 && !status && (flags & GET_SHA1_AUTOMATIC)) { + unsigned int expect_collision = 1 << (len * 2); + if (ds.nrobjects > expect_collision) + return
Re: [PATCH/RFC] git log --oneline alternative with dates, times and initials
Jeff Kingwrites: > But I also buy the argument that contrib/ is simply a hassle. This > script can live in its own repository somewhere, and handle > announcements and patches on the list. I think the output of this script is largely personal preference, which can be made to a project preference for a project enough of whose participant so desires. For example, I would not be surprised if this appeared next to checkpatch.pl script in the kernel archive. When a project that uses Git to store its sources finds a need to summarize its log in a standardized way that is not produced natively by Git, such a project may add this script to its scripts/ area, just like a project that wants to have a standard way to help its contributors to avoid common style errors a lot more than our "diff" (which only highlights whitespace errors) does may ship checkpatch.pl in it. So in that sense, while I do not mean to say that the script itself must become a standalone project that has only one script in it, I do not think it belongs "our" contrib/, as we do not see a need to standardize its output as the log summary standard we the Git project uses on its own history. On the other hand, your illustration of the needed bits to express this particular output format used by Kyle's script, when polished, does fit in our codebase. We are interested in making it possible for projects and users to do more by using Git with its standard customization features.
Re: [PATCH v5 1/4] git: make super-prefix option
On 09/29, Jeff King wrote: > On Wed, Sep 28, 2016 at 02:50:40PM -0700, Brandon Williams wrote: > > > Add a super-prefix environment variable 'GIT_INTERNAL_SUPER_PREFIX' > > which can be used to specify a path from above a repository down to its > > root. The immediate use of this option is by commands which have a > > --recurse-submodule option in order to give context to submodules about > > how they were invoked. This option is currently only allowed for > > builtins which support a super-prefix. > > What about non-builtins? > > E.g., what should > > git --super-prefix=foo bar > > do? Should the externals and scripts check the presence of > GIT_INTERNAL_SUPER_PREFIX and barf if it is set? Most scripts would > probably notice eventually when calling some other builtin that doesn't > support SUPER_PREFIX, but it seems hacky to count on that. > > There's also the question of 3rd-party programs. If we want to be > conservative, I think you'd want to just always bail in > execv_dashed_external() if --super-prefix is in use. That doesn't give > an option for scripts to say "hey, I support this", but we can perhaps > worry about loosening later. > > -Peff That makes sense. -- Brandon Williams
Re: [PATCH v5 1/4] git: make super-prefix option
On Wed, Sep 28, 2016 at 02:50:40PM -0700, Brandon Williams wrote: > Add a super-prefix environment variable 'GIT_INTERNAL_SUPER_PREFIX' > which can be used to specify a path from above a repository down to its > root. The immediate use of this option is by commands which have a > --recurse-submodule option in order to give context to submodules about > how they were invoked. This option is currently only allowed for > builtins which support a super-prefix. What about non-builtins? E.g., what should git --super-prefix=foo bar do? Should the externals and scripts check the presence of GIT_INTERNAL_SUPER_PREFIX and barf if it is set? Most scripts would probably notice eventually when calling some other builtin that doesn't support SUPER_PREFIX, but it seems hacky to count on that. There's also the question of 3rd-party programs. If we want to be conservative, I think you'd want to just always bail in execv_dashed_external() if --super-prefix is in use. That doesn't give an option for scripts to say "hey, I support this", but we can perhaps worry about loosening later. -Peff
Re: [PATCH v8 00/11] Git filter protocol
Am 29.09.2016 um 20:18 schrieb Torsten Bögershausen: I would agree that Git should not wait for the filter. But does the test suite need to wait for the filter ? We have fixed a test case on Windows recently where a process hung around too long (5babb5bd). So, yes, the test suite has to wait for the filter. -- Hannes
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
On Thu, Sep 29, 2016 at 11:05 AM, Junio C Hamanowrote: > > Yes, "git log --oneline" looks somewhat different and strange for > me, too ;-) I'm playing with an early patch to make the default more dynamic. Let's see how well it works in practice, but it looks fairly promising. Let me test a bit more and send out an RFC patch.. Linus
Re: [PATCH 5/5] log: add --commit-header option
On Thu, Sep 29, 2016 at 10:49:04AM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > This lets you stick a header right before a commit, but > > suppresses headers that are duplicates. This means you can > > do something like: > > > > git log --graph --author-date-order --commit-header='== %as ==' > > > > to get a marker in the graph whenever the day changes. > > That's interesting. So it is not really "commit" header, but a > header for groups of commits. Credits for realizing the usefulness > of such grouping may go to Kyle, but the implementation is also > brilliant ;-). Yeah, I really don't like the name "--commit-header" that much. I initially thought to call it "--graph-header", but it is potentially useful without a graph, too. Maybe "--group-header" or something. I dunno. I'd leave that to somebody who actually wanted to polish the patches up enough for submission. That might even be me someday, but not today. :) -Peff
Re: [PATCH/RFC] git log --oneline alternative with dates, times and initials
On Thu, Sep 29, 2016 at 10:38:14AM -0700, Junio C Hamano wrote: > > I have no problem taking this in contrib or whatever, until a point when > > Git is capable of doing the same thing itself. I just hoped to trick you > > into working on Git. :) > > I thought we stopped adding random things to contrib/, though. > > Unlike the earlier days of Git, if a custom command that uses Git is > very userful, it can live its own life and flourish within the much > larger Git userbase we have these days. I dunno. I said "contrib or whatever" to duck that question. :) I do not have a strong opinion either way. In some ways this script is similar to diff-highlight, which is in contrib. Perhaps that is only because diff-highlight is grandfathered. But I also think it somewhat makes sense, because in an ideal world diff-highlight gets thrown away in favor of git's internal diff routines learning to do the same thing. And in theory this script is in the same position. But I also buy the argument that contrib/ is simply a hassle. This script can live in its own repository somewhere, and handle announcements and patches on the list. For that matter, so could diff-highlight, and I don't mind ripping it out of contrib if that's the consensus. My only real objection is that doing so is more work than leaving it as-is, and I'm lazy. -Peff
Re: [PATCH 2/4] t13xx: do not assume system config is empty
On Thu, Sep 29, 2016 at 11:13:45AM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > I think anytime you would use GIT_CONFIG_NOSYSTEM over --local, it is an > > indication that the test is trying to check how multiple sources > > interact. And the right thing to do for them is to set GIT_ETC_GITCONFIG > > to some known quantity. We just couldn't do that before, so we skipped > > it. IOW, something like the patch below (on top of yours). > > OK, that way we can make sure that "multiple sources" operations do > look at the system-wide stuff. Exactly. > > Note that the > > commands that are doing a "--get" and not a "--list" don't actually seem > > to need either (because they are getting the values out of the local > > file anyway), so we could drop the setting of GIT_ETC_GITCONFIG from > > them entirely. > > "either" meaning "we do not need to add --local and we do not need > GIT_CONFIG_NOSYSTEM"? Yes. I didn't test it with your core.abbrev patch 4/4, but I _didn't_ have to touch their expected output after pointing them at a non-empty etc-gitconfig file in the trash directory. Which implies to me they don't care either way (which makes sense; they are asking for a specific key which is supposed to be found in one of the other files). -Peff
Re: [PATCH v8 00/11] Git filter protocol
On 29/09/16 19:57, Lars Schneider wrote: On 29 Sep 2016, at 18:57, Junio C Hamanowrote: Torsten Bögershausen writes: 1) Git exits 2) The filter process receives EOF and prints "STOP" to the log 3) t0021 checks the content of the log Sometimes 3 happened before 2 which makes the test fail. (Example: https://travis-ci.org/git/git/jobs/162660563 ) I added a this to wait until the filter process terminates: +wait_for_filter_termination () { + while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 2>&1 + do + echo "Waiting for /t0021/rot13-filter.pl to finish..." + sleep 1 + done +} Does this look OK to you? Do we need the ps at all ? How about this: +wait_for_filter_termination () { + while ! grep "STOP" LOGFILENAME >/dev/null + do + echo "Waiting for /t0021/rot13-filter.pl to finish..." + sleep 1 + done +} Running "ps" and grepping for a command is not suitable for script to reliably tell things, so it is out of question. Compared to that, your version looks slightly better, but what if the machinery that being tested, i.e. the part that drives the filter process, is buggy or becomes buggy and causes the filter process that writes "STOP" to die before it actually writes that string? I have a feeling that the machinery being tested needs to be fixed so that the sequence is always be: 0) Git spawns the filter process, as it needs some contents to be filtered. 1) Git did everything it needed to do and decides that is time to go. 2) Filter process receives EOF and prints "STOP" to the log. 3) Git waits until the filter process finishes. 4) t0021, after Git finishes, checks the log. Repeated sleep combined with grep is probably just sweeping the real problem under the rug. Do we have enough information to do the above? An inspiration may be in the way we centrally clean all tempfiles and lockfiles before exiting. We have a central registry of these files that need cleaning up and have a single atexit(3) handler to clean them up. Perhaps we need a registry that filter processes spawned by the mechanism Lars introduces in this series, and have an atexit(3) handler that closes the pipe to them (which signals the filters that it is time for them to go) and wait(2) on them, or something? I do not think we want any kill(2) to be involved in this clean-up procedure, but I do think we should wait(2) on what we spawn, as long as these processes are meant to be shut down when the main process of Git exits (this is different from things like credential-cache daemon where they are expected to persist and meant to serve multiple Git processes). We discussed that issue in v4 and v6: http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/ http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/ My impression was that you don't want Git to wait for the filter process. If Git waits for the filter process - how long should Git wait? Thanks, Lars Hm, I would agree that Git should not wait for the filter. But does the test suite need to wait for the filter ? May be, in this case we test the filter and Git, which is good. Adding a 1 second delay, if, and only if, there is a racy condition, is not that bad (or do we have better ways to check for a process to be terminated ?)
Re: [PATCH 2/4] t13xx: do not assume system config is empty
Jeff Kingwrites: > I think anytime you would use GIT_CONFIG_NOSYSTEM over --local, it is an > indication that the test is trying to check how multiple sources > interact. And the right thing to do for them is to set GIT_ETC_GITCONFIG > to some known quantity. We just couldn't do that before, so we skipped > it. IOW, something like the patch below (on top of yours). OK, that way we can make sure that "multiple sources" operations do look at the system-wide stuff. > Note that the > commands that are doing a "--get" and not a "--list" don't actually seem > to need either (because they are getting the values out of the local > file anyway), so we could drop the setting of GIT_ETC_GITCONFIG from > them entirely. "either" meaning "we do not need to add --local and we do not need GIT_CONFIG_NOSYSTEM"?
Re: Two bugs in --pretty with %C(auto)
Am 17.09.2016 um 20:25 schrieb René Scharfe: > diff --git a/pretty.c b/pretty.c > index 9788bd8..493edb0 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -1072,6 +1072,8 @@ static size_t format_commit_one(struct strbuf *sb, /* > in UTF-8 */ > case 'C': > if (starts_with(placeholder + 1, "(auto)")) { > c->auto_color = want_color(c->pretty_ctx->color); > + if (c->auto_color) > + strbuf_addstr(sb, GIT_COLOR_RESET); > return 7; /* consumed 7 bytes, "C(auto)" */ > } else { > int ret = parse_color(sb, placeholder, c); We could optimize this a bit (see below). I can't think of a downside; someone adding a prefix would be responsible for adding a reset as well if needed, right? -- >8 -- Subject: [PATCH] pretty: avoid adding reset for %C(auto) if output is empty We emit an escape sequence for resetting color and attribute for %C(auto) to make sure automatic coloring is displayed as intended. Stop doing that if the output strbuf is empty, i.e. when %C(auto) appears at the start of the format string, because then there is no need for a reset and we save a few bytes in the output. Signed-off-by: Rene Scharfe--- Reverts the change to t6006, so we'd need another test for this. Anatoly? :) pretty.c | 2 +- t/t6006-rev-list-format.sh | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/pretty.c b/pretty.c index 493edb0..25efbca 100644 --- a/pretty.c +++ b/pretty.c @@ -1072,7 +1072,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */ case 'C': if (starts_with(placeholder + 1, "(auto)")) { c->auto_color = want_color(c->pretty_ctx->color); - if (c->auto_color) + if (c->auto_color && sb->len) strbuf_addstr(sb, GIT_COLOR_RESET); return 7; /* consumed 7 bytes, "C(auto)" */ } else { diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh index f6020cd..a1dcdb8 100755 --- a/t/t6006-rev-list-format.sh +++ b/t/t6006-rev-list-format.sh @@ -225,7 +225,7 @@ test_expect_success '%C(auto,...) respects --color=auto (stdout not tty)' ' test_expect_success '%C(auto) respects --color' ' git log --color --format="%C(auto)%H" -1 >actual && - printf "\\033[m\\033[33m%s\\033[m\\n" $(git rev-parse HEAD) >expect && + printf "\\033[33m%s\\033[m\\n" $(git rev-parse HEAD) >expect && test_cmp expect actual ' -- 2.10.0
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
Johannes Sixtwrites: > Am 29.09.2016 um 01:30 schrieb Junio C Hamano: >> As Peff said, responding in a thread started by Linus's suggestion >> to raise the default abbreviation to 12 hexdigits: > > This is waayy too large for a new default. The vast majority of > repositories is smallish. For those, the long sequences of hex digits > are an uglification that is almost unbearable. > > I know that kernel developers are important, but their importance has > long been outnumbered by the anonymous and silent masses of users. > > Personally, I use 8 digits just because it is a "rounder" number than > 7, but in all of my repositories 7 would still work just as well. Yes, "git log --oneline" looks somewhat different and strange for me, too ;-) I am sure I'll get used to it if I keep using it, but I suspect that I'd be irritated as I find myself typing 'q' more and more often to "less -S" that is automatically invoked when I do "git log --oneline master.." to see what commits are on my current topic branch.
Re: [PATCH v8 00/11] Git filter protocol
On Thu, Sep 29, 2016 at 09:57:57AM -0700, Junio C Hamano wrote: > > +wait_for_filter_termination () { > > + while ! grep "STOP" LOGFILENAME >/dev/null > > + do > > + echo "Waiting for /t0021/rot13-filter.pl to finish..." > > + sleep 1 > > + done > > +} > > Running "ps" and grepping for a command is not suitable for script > to reliably tell things, so it is out of question. Compared to > that, your version looks slightly better, but what if the machinery > that being tested, i.e. the part that drives the filter process, is > buggy or becomes buggy and causes the filter process that writes > "STOP" to die before it actually writes that string? I'm of the opinion that any busy-waiting is a good sign that something is suboptimal. The right solution here seems like it should be signaling the test script via a descriptor. I don't necessarily agree, though, that the timing of filter-process cleanup needs to be part of the public interface. So in your list: > 3) Git waits until the filter process finishes. That seems simple and elegant, but I can think of reasons we might not want to wait (e.g., if the filter has to do some maintenance task and does not the user to have to wait). OTOH, we already face this in git, and we solve it by explicitly backgrounding the maintenance task (i.e., auto-gc). So one could argue that it is the responsibility of the filter process to manage its own processes. It certainly makes the interaction with git simpler. -Peff
Re: [PATCH v8 00/11] Git filter protocol
> On 29 Sep 2016, at 18:57, Junio C Hamanowrote: > > Torsten Bögershausen writes: > >>> 1) Git exits >>> 2) The filter process receives EOF and prints "STOP" to the log >>> 3) t0021 checks the content of the log >>> >>> Sometimes 3 happened before 2 which makes the test fail. >>> (Example: https://travis-ci.org/git/git/jobs/162660563 ) >>> >>> I added a this to wait until the filter process terminates: >>> >>> +wait_for_filter_termination () { >>> + while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null >>> 2>&1 >>> + do >>> + echo "Waiting for /t0021/rot13-filter.pl to finish..." >>> + sleep 1 >>> + done >>> +} >>> >>> Does this look OK to you? >> Do we need the ps at all ? >> How about this: >> >> +wait_for_filter_termination () { >> +while ! grep "STOP" LOGFILENAME >/dev/null >> +do >> +echo "Waiting for /t0021/rot13-filter.pl to finish..." >> +sleep 1 >> +done >> +} > > Running "ps" and grepping for a command is not suitable for script > to reliably tell things, so it is out of question. Compared to > that, your version looks slightly better, but what if the machinery > that being tested, i.e. the part that drives the filter process, is > buggy or becomes buggy and causes the filter process that writes > "STOP" to die before it actually writes that string? > > I have a feeling that the machinery being tested needs to be fixed > so that the sequence is always be: > >0) Git spawns the filter process, as it needs some contents to > be filtered. > >1) Git did everything it needed to do and decides that is time > to go. > >2) Filter process receives EOF and prints "STOP" to the log. > >3) Git waits until the filter process finishes. > >4) t0021, after Git finishes, checks the log. > > Repeated sleep combined with grep is probably just sweeping the real > problem under the rug. Do we have enough information to do the > above? > > An inspiration may be in the way we centrally clean all tempfiles > and lockfiles before exiting. We have a central registry of these > files that need cleaning up and have a single atexit(3) handler to > clean them up. Perhaps we need a registry that filter processes > spawned by the mechanism Lars introduces in this series, and have an > atexit(3) handler that closes the pipe to them (which signals the > filters that it is time for them to go) and wait(2) on them, or > something? I do not think we want any kill(2) to be involved in > this clean-up procedure, but I do think we should wait(2) on what we > spawn, as long as these processes are meant to be shut down when the > main process of Git exits (this is different from things like > credential-cache daemon where they are expected to persist and meant > to serve multiple Git processes). We discussed that issue in v4 and v6: http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/ http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/ My impression was that you don't want Git to wait for the filter process. If Git waits for the filter process - how long should Git wait? Thanks, Lars
Re: [PATCH 5/5] log: add --commit-header option
Jeff Kingwrites: > This lets you stick a header right before a commit, but > suppresses headers that are duplicates. This means you can > do something like: > > git log --graph --author-date-order --commit-header='== %as ==' > > to get a marker in the graph whenever the day changes. That's interesting. So it is not really "commit" header, but a header for groups of commits. Credits for realizing the usefulness of such grouping may go to Kyle, but the implementation is also brilliant ;-). > This probably needs some refactoring around the setup of the > pretty-print context.
Re: [PATCH 1/4] config: allow customizing /etc/gitconfig location
Junio C Hamanowrites: > Jakub Narębski writes: > >> W dniu 29.09.2016 o 01:30, Junio C Hamano pisze: >>> With a new environment variable GIT_ETC_GITCONFIG, the users can >>> specify a file that is used instead of /etc/gitconfig to read (and >>> write) the system-wide configuration. >> >> Why it is named GIT_ETC_GITCONFIG (which is Unix-ism), and not >> GIT_CONFIG_SYSTEM / GIT_CONFIG_SYSTEM_PATH, that is something >> OS-neutral? > > Isn't "environment variable" something that came from POSIX world? I don't know who invented the concept, but environment variables have been there in the windows world since it exists I think (it existed in MS-DOS). -- Matthieu Moy http://www-verimag.imag.fr/~moy/
Re: [PATCH/RFC] git log --oneline alternative with dates, times and initials
Jeff Kingwrites: >> Those patches are missing some of the features like showing root commits, >> handling two letter initials, showing the weekday, inserting a break where >> needed to avoid parent-child confusion in graph output and properly handling >> Duy's initials. :) > > I'm not too surprised. I literally looked at the first screenshot from > your output and thought "surely git can do that with some minor tweaks". > Nor am I surprised that there are cases where the output is funny (99% > of the time I spent on it was tracking down that graph-padding bug). > > I have no problem taking this in contrib or whatever, until a point when > Git is capable of doing the same thing itself. I just hoped to trick you > into working on Git. :) I thought we stopped adding random things to contrib/, though. Unlike the earlier days of Git, if a custom command that uses Git is very userful, it can live its own life and flourish within the much larger Git userbase we have these days.
Re: [PATCH 2/5] pretty: allow formatting names as initials
On Thu, Sep 29, 2016 at 10:31:30AM -0700, Junio C Hamano wrote: > > When I first tested it with "git log --format=%aS" I had to wonder "who > > the heck is ntnd?". So using only the first-and-last would match the git > > project's practice better, at least. > > And there is also "isalpha() good enough?" question. > > I think we have a few Chinese and Hangul as well as Cyrillic names > in our history, some of them having outside-ascii first letters. > One of the more prolific contributor's initial is ÆAB ;-) Heh, true. In case it was not clear, these were mostly quick-and-dirty patches. I think the right test is probably '!isspace()". -Peff
Re: [PATCH 2/5] pretty: allow formatting names as initials
Jeff Kingwrites: > Initials are shorter and often unique enough in a > per-project setting, so they can be used to give a more > informative version of --oneline. > > The 'S' in the placeholder is for "short" (and 's' is > already taken by DATE_SHORT), but obviously that's pretty > arcane. > > Possibly there should be more customization of initials, > asking for only 2-letter initials, etc. > > Signed-off-by: Jeff King > --- > When I first tested it with "git log --format=%aS" I had to wonder "who > the heck is ntnd?". So using only the first-and-last would match the git > project's practice better, at least. And there is also "isalpha() good enough?" question. I think we have a few Chinese and Hangul as well as Cyrillic names in our history, some of them having outside-ascii first letters. One of the more prolific contributor's initial is ÆAB ;-) > pretty.c | 21 + > 1 file changed, 21 insertions(+) > > diff --git a/pretty.c b/pretty.c > index c532c17..de62405 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -674,6 +674,23 @@ static int mailmap_name(const char **email, size_t > *email_len, > return mail_map->nr && map_user(mail_map, email, email_len, name, > name_len); > } > > +static void format_initials(struct strbuf *out, const char *name, size_t len) > +{ > + int initial = 1; > + size_t i; > + > + for (i = 0; i < len; i++) { > + char c = name[i]; > + if (isspace(c)) { > + initial = 1; > + continue; > + } > + if (initial && isalpha(c)) > + strbuf_addch(out, tolower(c)); > + initial = 0; > + } > +} > + > static size_t format_person_part(struct strbuf *sb, char part, >const char *msg, int len, >const struct date_mode *dmode) > @@ -702,6 +719,10 @@ static size_t format_person_part(struct strbuf *sb, char > part, > strbuf_add(sb, mail, maillen); > return placeholder_len; > } > + if (part == 'S') { > + format_initials(sb, name, namelen); > + return placeholder_len; > + } > > if (!s.date_begin) > goto skip;
Re: [PATCH 1/4] config: allow customizing /etc/gitconfig location
Jakub Narębskiwrites: > W dniu 29.09.2016 o 01:30, Junio C Hamano pisze: >> With a new environment variable GIT_ETC_GITCONFIG, the users can >> specify a file that is used instead of /etc/gitconfig to read (and >> write) the system-wide configuration. > > Why it is named GIT_ETC_GITCONFIG (which is Unix-ism), and not > GIT_CONFIG_SYSTEM / GIT_CONFIG_SYSTEM_PATH, that is something > OS-neutral? Isn't "environment variable" something that came from POSIX world?
Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error
Jeff Kingwrites: >> $ git rev-parse --disambiguate-list=b2e1 >> b2e1196 tag v2.8.0-rc1 >> b2e11d1 tree >> b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options' >> b2e1759 blob >> b2e18954 blob >> b2e1895c blob > > I think the "right" way to do this is pipe the list of sha1s into > another git commit which can format them however you want. > Unfortunately, there isn't a single command that does a great job: > > - "cat-file --batch-check" can show you the sha1 and type, but it > won't abbreviate sha1s, and it won't show you commit/tag information > > - "log --stdin --no-walk" will format the commit however you like, but > skips the trees and blobs entirely, and the tag can only be seen via > "%d" > > - "for-each-ref" has flexible formatting, too, but wants to format > refs, not objects (and doesn't read from stdin). - "name-rev" is used to give "describe --contains", and can read from its standard input, but has no format customization. Another downside of it is that it only wants to see committishes. > IMHO that is a sign that our formatting tools aren't as good as they > could be (I think the right tool is cat-file, but it should be able to > do all of the formatting that the other commands can do). > > Of course if you really just want human-readable output, then: > > $ git cat-file -e b2e1 > error: short SHA1 b2e1 is ambiguous > hint: The candidates are: > hint: b2e1196 tag v2.8.0-rc1 > hint: b2e11d1 tree > hint: b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options' > hint: b2e1759 blob > hint: b2e18954 blob > hint: b2e1895c blob > fatal: Not a valid object name b2e1 > > is pretty easy. Yes. I think adding this to rev-parse that is meant for machines is probably a mistake, as this "hint" machinery's output will become even more human friendly over time as we gain experience. - If the hypothetical "--disambiguate-list" option wants to produce machine parseable output for scripts, it would mean its output (and whatgver the reading script can do based on its output for humans) will become less useful for humans over time. - If the hypothetical "--disambiguate-list" option only wants to replicate the human readable output that is designed to be improved over time and expects its output _not_ to be interpreted by scripts but merely be relayed, then why aren't these scripts just invoking the commands that already gives the "hint:" output and showing that directly to humans in the first place? > That being said, I don't mind if somebody wanted to do a rev-parse > option on top of my series. The formatting code is already split into > its own function. So let's not go there.
Re: [PATCH v2 02/11] i18n: add--interactive: mark simple here documents for translation
Vasco Almeidawrites: > On the other hand, would it make sense to translate these commands? If > so, we would mark for translation the commands name of @cmd in > main_loop(). > > sub main_loop { > - my @cmd = ([ 'status', \_cmd, ], > - [ 'update', \_cmd, ], > - [ 'revert', \_cmd, ], > - [ 'add untracked', \_untracked_cmd, ], > - [ 'patch', \_update_cmd, ], > - [ 'diff', \_cmd, ], > - [ 'quit', \_cmd, ], > - [ 'help', \_cmd, ], > + my @cmd = ([ __('status'), \_cmd, ], > + [ __('update'), \_cmd, ], > + [ __('revert'), \_cmd, ], > + [ __('add untracked'), \_untracked_cmd, ], > + [ __('patch'), \_update_cmd, ], > + [ __('diff'), \_cmd, ], > + [ __('quit'), \_cmd, ], > + [ __('help'), \_cmd, ], I don't know offhand. If the code to prompt and accept the command given by the user can take the translated word (or a prefix of it), theoretically I would say it could be made to work, but to me it is dubious the benefit outweighs its downsides. It would make teaching Git and troubleshooting over the phone harder, I would guess. A: "Hi, I am in a 'git add -i' session." B: "Give 's' at the prompt." A: "My Git does not seem to take 's' as a valid command." B: "What? I've never seen that problem." ... back and forth wastes 10 minutes ... A: "By the way, I am running Git in Portuguese." ;-)
Re: [PATCH v8 00/11] Git filter protocol
Torsten Bögershausenwrites: >> 1) Git exits >> 2) The filter process receives EOF and prints "STOP" to the log >> 3) t0021 checks the content of the log >> >> Sometimes 3 happened before 2 which makes the test fail. >> (Example: https://travis-ci.org/git/git/jobs/162660563 ) >> >> I added a this to wait until the filter process terminates: >> >> +wait_for_filter_termination () { >> +while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null >> 2>&1 >> +do >> +echo "Waiting for /t0021/rot13-filter.pl to finish..." >> +sleep 1 >> +done >> +} >> >> Does this look OK to you? > Do we need the ps at all ? > How about this: > > +wait_for_filter_termination () { > + while ! grep "STOP" LOGFILENAME >/dev/null > + do > + echo "Waiting for /t0021/rot13-filter.pl to finish..." > + sleep 1 > + done > +} Running "ps" and grepping for a command is not suitable for script to reliably tell things, so it is out of question. Compared to that, your version looks slightly better, but what if the machinery that being tested, i.e. the part that drives the filter process, is buggy or becomes buggy and causes the filter process that writes "STOP" to die before it actually writes that string? I have a feeling that the machinery being tested needs to be fixed so that the sequence is always be: 0) Git spawns the filter process, as it needs some contents to be filtered. 1) Git did everything it needed to do and decides that is time to go. 2) Filter process receives EOF and prints "STOP" to the log. 3) Git waits until the filter process finishes. 4) t0021, after Git finishes, checks the log. Repeated sleep combined with grep is probably just sweeping the real problem under the rug. Do we have enough information to do the above? An inspiration may be in the way we centrally clean all tempfiles and lockfiles before exiting. We have a central registry of these files that need cleaning up and have a single atexit(3) handler to clean them up. Perhaps we need a registry that filter processes spawned by the mechanism Lars introduces in this series, and have an atexit(3) handler that closes the pipe to them (which signals the filters that it is time for them to go) and wait(2) on them, or something? I do not think we want any kill(2) to be involved in this clean-up procedure, but I do think we should wait(2) on what we spawn, as long as these processes are meant to be shut down when the main process of Git exits (this is different from things like credential-cache daemon where they are expected to persist and meant to serve multiple Git processes).
[PATCH 1/3] add QSORT
Add the macro QSORT, a convenient wrapper for qsort(3) that infers the size of the array elements and supports the convention of initializing empty arrays with a NULL pointer, which we use in some places. Calling qsort(3) directly with a NULL pointer is undefined -- even with an element count of zero -- and allows the compiler to optimize away any following NULL checks. Using the macro avoids such surprises. Add a semantic patch as well to demonstrate the macro's usage and to automate the transformation of trivial cases. Signed-off-by: Rene Scharfe--- contrib/coccinelle/qsort.cocci | 19 +++ git-compat-util.h | 8 2 files changed, 27 insertions(+) create mode 100644 contrib/coccinelle/qsort.cocci diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci new file mode 100644 index 000..a094e7c --- /dev/null +++ b/contrib/coccinelle/qsort.cocci @@ -0,0 +1,19 @@ +@@ +expression base, nmemb, compar; +@@ +- qsort(base, nmemb, sizeof(*base), compar); ++ QSORT(base, nmemb, compar); + +@@ +expression base, nmemb, compar; +@@ +- qsort(base, nmemb, sizeof(base[0]), compar); ++ QSORT(base, nmemb, compar); + +@@ +type T; +T *base; +expression nmemb, compar; +@@ +- qsort(base, nmemb, sizeof(T), compar); ++ QSORT(base, nmemb, compar); diff --git a/git-compat-util.h b/git-compat-util.h index 8aab0c3..d7ed137 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -977,6 +977,14 @@ void git_qsort(void *base, size_t nmemb, size_t size, #define qsort git_qsort #endif +#define QSORT(base, n, compar) sane_qsort((base), (n), sizeof(*(base)), compar) +static void inline sane_qsort(void *base, size_t nmemb, size_t size, + int(*compar)(const void *, const void *)) +{ + if (nmemb > 1) + qsort(base, nmemb, size, compar); +} + #ifndef REG_STARTEND #error "Git requires REG_STARTEND support. Compile with NO_REGEX=NeedsStartEnd" #endif -- 2.10.0
[PATCH 3/3] remove unnecessary check before QSORT
Add a semantic patch for removing checks similar to the one that QSORT already does internally and apply it to the code base. Signed-off-by: Rene Scharfe--- builtin/fmt-merge-msg.c| 10 -- contrib/coccinelle/qsort.cocci | 18 ++ sh-i18n--envsubst.c| 3 +-- 3 files changed, 23 insertions(+), 8 deletions(-) diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c index 4976967..efab62f 100644 --- a/builtin/fmt-merge-msg.c +++ b/builtin/fmt-merge-msg.c @@ -314,12 +314,10 @@ static void add_people_info(struct strbuf *out, struct string_list *authors, struct string_list *committers) { - if (authors->nr) - QSORT(authors->items, authors->nr, - cmp_string_list_util_as_integral); - if (committers->nr) - QSORT(committers->items, committers->nr, - cmp_string_list_util_as_integral); + QSORT(authors->items, authors->nr, + cmp_string_list_util_as_integral); + QSORT(committers->items, committers->nr, + cmp_string_list_util_as_integral); credit_people(out, authors, 'a'); credit_people(out, committers, 'c'); diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci index a094e7c..22b93a9 100644 --- a/contrib/coccinelle/qsort.cocci +++ b/contrib/coccinelle/qsort.cocci @@ -17,3 +17,21 @@ expression nmemb, compar; @@ - qsort(base, nmemb, sizeof(T), compar); + QSORT(base, nmemb, compar); + +@@ +expression base, nmemb, compar; +@@ +- if (nmemb) +QSORT(base, nmemb, compar); + +@@ +expression base, nmemb, compar; +@@ +- if (nmemb > 0) +QSORT(base, nmemb, compar); + +@@ +expression base, nmemb, compar; +@@ +- if (nmemb > 1) +QSORT(base, nmemb, compar); diff --git a/sh-i18n--envsubst.c b/sh-i18n--envsubst.c index 3637a2a..c3a2b5a 100644 --- a/sh-i18n--envsubst.c +++ b/sh-i18n--envsubst.c @@ -230,8 +230,7 @@ cmp_string (const void *pstr1, const void *pstr2) static inline void string_list_sort (string_list_ty *slp) { - if (slp->nitems > 0) -QSORT(slp->item, slp->nitems, cmp_string); + QSORT(slp->item, slp->nitems, cmp_string); } /* Test whether a sorted string list contains a given string. */ -- 2.10.0
[PATCH 2/3] use QSORT
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code base, replacing calls of qsort(3) with QSORT. The resulting code is shorter and supports empty arrays with NULL pointers. Signed-off-by: Rene Scharfe--- Freshly generated using coccicheck, compiles, survives make test. bisect.c | 2 +- builtin/describe.c | 2 +- builtin/fast-export.c| 2 +- builtin/fmt-merge-msg.c | 6 ++ builtin/index-pack.c | 8 +++- builtin/mktree.c | 2 +- builtin/name-rev.c | 3 +-- builtin/pack-objects.c | 7 +++ builtin/remote.c | 3 +-- diff.c | 6 +++--- diffcore-delta.c | 5 + diffcore-order.c | 2 +- diffcore-rename.c| 2 +- dir.c| 4 ++-- fast-import.c| 4 ++-- fetch-pack.c | 2 +- help.c | 15 +-- line-log.c | 2 +- pack-bitmap-write.c | 3 +-- pack-check.c | 2 +- pack-write.c | 3 +-- pathspec.c | 3 +-- ref-filter.c | 2 +- refs/files-backend.c | 2 +- server-info.c| 2 +- sh-i18n--envsubst.c | 2 +- sha1-array.c | 2 +- string-list.c| 2 +- t/helper/test-dump-untracked-cache.c | 6 ++ tree.c | 3 +-- 30 files changed, 44 insertions(+), 65 deletions(-) diff --git a/bisect.c b/bisect.c index 6f512c2..21bc6da 100644 --- a/bisect.c +++ b/bisect.c @@ -215,7 +215,7 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n array[cnt].distance = distance; cnt++; } - qsort(array, cnt, sizeof(*array), compare_commit_dist); + QSORT(array, cnt, compare_commit_dist); for (p = list, i = 0; i < cnt; i++) { char buf[100]; /* enough for dist=%d */ struct object *obj = &(array[i].commit->object); diff --git a/builtin/describe.c b/builtin/describe.c index 8a25abe..01490a1 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -352,7 +352,7 @@ static void describe(const char *arg, int last_one) oid_to_hex(oid)); } - qsort(all_matches, match_cnt, sizeof(all_matches[0]), compare_pt); + QSORT(all_matches, match_cnt, compare_pt); if (gave_up_on) { commit_list_insert_by_date(gave_up_on, ); diff --git a/builtin/fast-export.c b/builtin/fast-export.c index c0652a7..1e815b5 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -347,7 +347,7 @@ static void show_filemodify(struct diff_queue_struct *q, * Handle files below a directory first, in case they are all deleted * and the directory changes to a file or symlink. */ - qsort(q->queue, q->nr, sizeof(q->queue[0]), depth_first); + QSORT(q->queue, q->nr, depth_first); for (i = 0; i < q->nr; i++) { struct diff_filespec *ospec = q->queue[i]->one; diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c index dc2e9e4..4976967 100644 --- a/builtin/fmt-merge-msg.c +++ b/builtin/fmt-merge-msg.c @@ -315,12 +315,10 @@ static void add_people_info(struct strbuf *out, struct string_list *committers) { if (authors->nr) - qsort(authors->items, - authors->nr, sizeof(authors->items[0]), + QSORT(authors->items, authors->nr, cmp_string_list_util_as_integral); if (committers->nr) - qsort(committers->items, - committers->nr, sizeof(committers->items[0]), + QSORT(committers->items, committers->nr, cmp_string_list_util_as_integral); credit_people(out, authors, 'a'); diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 4a8b4ae..7657d0a 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1190,10 +1190,8 @@ static void resolve_deltas(void) return; /* Sort deltas by base SHA1/offset for fast searching */ - qsort(ofs_deltas, nr_ofs_deltas, sizeof(struct ofs_delta_entry), - compare_ofs_delta_entry); - qsort(ref_deltas, nr_ref_deltas, sizeof(struct ref_delta_entry), - compare_ref_delta_entry); + QSORT(ofs_deltas, nr_ofs_deltas, compare_ofs_delta_entry); + QSORT(ref_deltas, nr_ref_deltas, compare_ref_delta_entry); if (verbose || show_resolving_progress) progress = start_progress(_("Resolving deltas"), @@ -1356,7
Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error
On Thu, Sep 29, 2016 at 07:36:27AM -0700, Kyle J. McKay wrote: > On Sep 29, 2016, at 06:24, Jeff King wrote: > > > > If you are doing "git show 235234" it should pick the tag (if it > > > peels to a > > > committish) because Git has already set a precedent of preferring > > > tags over > > > commits when it disambiguates ref names and otherwise pick the > > > commit. > > > > I'm not convinced that picking the tag is actually helpful in this case; > > I agree with Linus that feeding something to "git show" almost always > > wants to choose the commit. > > Since "git show" peels tags you end up seeing the commit it refers to > (assuming it's a committish tag). Yes, but it's almost certainly _not_ the commit you meant. From your example: >c512b03: > c512b035556eff4d commit Merge branch 'rc/maint-reflog-msg-for-forced > c512b0344196931a tag(v0.99.9a) GIT 0.99.9a If I'm looking for the commit c512b03, then it almost certainly isn't v0.99.9a. That tag's commit is e634aec. Or another way of thinking about it: you want to guess what the _writer_ of the note meant. Why would somebody write "c512b03" when they could have written "v0.99.9a"? And they certainly would not have written it if they meant "e634aec". :) > > I also don't think tag ambiguity in short sha1s is all that interesting. > > The Linux repository has this: > >901069c: > 901069c71415a76d commit iwlagn: change Copyright to 2011 > 901069c5c5b15532 tag(v2.6.38-rc4) Linux 2.6.38-rc4 Sure, I'm not surprised there's a collision. But I'd expect those to be a tiny fraction of collisions. Here's the breakdown of object types in my clone of linux.git: $ git cat-file --batch-all-objects --batch-check='%(objecttype)' | sort | uniq -c 1421198 blob 618073 commit 479 tag 2877913 tree That's a hundredth of a percent tag objects. The chance that you have _a_ 7-hex collision with a tag is relatively high. But the chance that any given collision involves a tag is rather small. -Peff
Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error
On Sep 29, 2016, at 06:24, Jeff King wrote: If you are doing "git show 235234" it should pick the tag (if it peels to a committish) because Git has already set a precedent of preferring tags over commits when it disambiguates ref names and otherwise pick the commit. I'm not convinced that picking the tag is actually helpful in this case; I agree with Linus that feeding something to "git show" almost always wants to choose the commit. Since "git show" peels tags you end up seeing the commit it refers to (assuming it's a committish tag). I also don't think tag ambiguity in short sha1s is all that interesting. The Linux repository has this: 901069c: 901069c71415a76d commit iwlagn: change Copyright to 2011 901069c5c5b15532 tag(v2.6.38-rc4) Linux 2.6.38-rc4 Since that tag peels to a commit, it seems like it would be incorrect to pick the commit over the tag when you're looking for a committish. Either 901069c should resolve to the tag (which gets peeled to the commit) or it should error out with the hint messages. The Git repository has this: c512b03: c512b035556eff4d commit Merge branch 'rc/maint-reflog-msg-for- forced c512b0344196931a tag(v0.99.9a) GIT 0.99.9a So perhaps it's a little bit more interesting than it first appears. :) --Kyle
Re: [PATCH v2 02/11] i18n: add--interactive: mark simple here documents for translation
A Dom, 25-09-2016 às 15:54 -0700, Junio C Hamano escreveu: > > sub status_cmd { > > @@ -1573,14 +1573,14 @@ sub quit_cmd { > > } > > > > sub help_cmd { > > - print colored $help_color, <<\EOF ; > > -status - show paths with changes > > + print colored $help_color, __( > > +"status - show paths with changes > > update - add working tree state to the staged set of > changes > > revert - revert staged set of changes back to the HEAD > version > > patch - pick hunks and update selectively > > diff - view diff between HEAD and index > > -add untracked - add contents of untracked files to the staged set > of changes > > -EOF > > +add untracked - add contents of untracked files to the staged set > of changes"), > > +"\n"; > > } > > Do we need TRANSLATORS: comment to all of the above not to touch the > command words that are explained and translate only the explanation? Yes, it is better to have that comment. On the other hand, would it make sense to translate these commands? If so, we would mark for translation the commands name of @cmd in main_loop(). sub main_loop { - my @cmd = ([ 'status', \_cmd, ], - [ 'update', \_cmd, ], - [ 'revert', \_cmd, ], - [ 'add untracked', \_untracked_cmd, ], - [ 'patch', \_update_cmd, ], - [ 'diff', \_cmd, ], - [ 'quit', \_cmd, ], - [ 'help', \_cmd, ], + my @cmd = ([ __('status'), \_cmd, ], + [ __('update'), \_cmd, ], + [ __('revert'), \_cmd, ], + [ __('add untracked'), \_untracked_cmd, ], + [ __('patch'), \_update_cmd, ], + [ __('diff'), \_cmd, ], + [ __('quit'), \_cmd, ], + [ __('help'), \_cmd, ],
Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error
On Thu, Sep 29, 2016 at 06:01:51AM -0700, Kyle J. McKay wrote: > But perhaps it makes sense to actually pick one if there's only one > disambiguation of the type you're looking for. > > For example given: > > 235234a blob > 2352347 tag > 235234f tree > 2352340 commit > > If you are doing "git cat-file blob 235234" it should pick the blob and spit > out a warning (and similarly for other cat-file types). But "git cat-file > -p 235234" would give the fatal error with the disambiguation hints because > it wants type "any". That code is already there; it's just a matter of whether git has enough information to know the context. E.g. (in git.git): $ git show b2e11 error: short SHA1 b2e11 is ambiguous hint: The candidates are: hint: b2e1196 tag v2.8.0-rc1 hint: b2e11d1 tree ... $ git log b2e11 commit ab5d01a29eb7380ceab070f0807c2939849c44bc (tag: v2.8.0-rc1) ... The "show" command can show anything, but "log" really wants committishes, so it's able to disambiguate. It looks like cat-file never learned to feed its context, but it's probably something like this: diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 94e67eb..ecbb959 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -56,12 +56,22 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name, struct object_info oi = {NULL}; struct strbuf sb = STRBUF_INIT; unsigned flags = LOOKUP_REPLACE_OBJECT; + unsigned sha1_flags = 0; const char *path = force_path; if (unknown_type) flags |= LOOKUP_UNKNOWN_OBJECT; - if (get_sha1_with_context(obj_name, 0, oid.hash, _context)) + if (exp_type) { + if (!strcmp(exp_type, "commit")) + sha1_flags |= GET_SHA1_COMMITTISH; + else if(!strcmp(exp_type, "tree")) + sha1_flags |= GET_SHA1_TREEISH; + else if(!strcmp(exp_type, "blob")) + sha1_flags |= GET_SHA1_BLOB; + } + + if (get_sha1_with_context(obj_name, sha1_flags, oid.hash, _context)) die("Not a valid object name %s", obj_name); if (!path) > If you are doing "git show 235234" it should pick the tag (if it peels to a > committish) because Git has already set a precedent of preferring tags over > commits when it disambiguates ref names and otherwise pick the commit. I'm not convinced that picking the tag is actually helpful in this case; I agree with Linus that feeding something to "git show" almost always wants to choose the commit. I also don't think tag ambiguity in short sha1s is all that interesting. There are a tiny number of tag objects. Most of your collisions are going to be with trees or blobs, which should generally outnumber commits by a factor of 5-10, though it depends on your workflow (git.git does not have a deep tree, so it's only a factor of 4). And if you just want to choose a committish over trees and blobs, well, then; I invite you to check out the core.disambiguate patch I sent elsewhere in the thread. :) -Peff
Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error
On Thu, Sep 29, 2016 at 04:46:19AM -0700, Kyle J. McKay wrote: > This hint: information is excellent. There needs to be a way to show it on > demand. > > $ git rev-parse --disambiguate=b2e1 > b2e11962c5e6a9c81aa712c751c83a743fd4f384 > b2e11d1bb40c5f81a2f4e37b9f9a60ec7474eeab > b2e163272c01aca4aee4684f5c683ba341c1953d > b2e18954c03ff502053cb74d142faab7d2a8dacb > b2e1895ca92ec2037349d88b945ba64ebf16d62d > > Not nearly so helpful, but the operation of --disambiguate cannot be changed > without breaking current scripts. > > Can your excellent "hint:" output above be attached to the --disambiguate > option somehow, please. Something like this perhaps: > > $ git rev-parse --disambiguate-list=b2e1 > b2e1196 tag v2.8.0-rc1 > b2e11d1 tree > b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options' > b2e1759 blob > b2e18954 blob > b2e1895c blob I think the "right" way to do this is pipe the list of sha1s into another git commit which can format them however you want. Unfortunately, there isn't a single command that does a great job: - "cat-file --batch-check" can show you the sha1 and type, but it won't abbreviate sha1s, and it won't show you commit/tag information - "log --stdin --no-walk" will format the commit however you like, but skips the trees and blobs entirely, and the tag can only be seen via "%d" - "for-each-ref" has flexible formatting, too, but wants to format refs, not objects (and doesn't read from stdin). IMHO that is a sign that our formatting tools aren't as good as they could be (I think the right tool is cat-file, but it should be able to do all of the formatting that the other commands can do). Of course if you really just want human-readable output, then: $ git cat-file -e b2e1 error: short SHA1 b2e1 is ambiguous hint: The candidates are: hint: b2e1196 tag v2.8.0-rc1 hint: b2e11d1 tree hint: b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options' hint: b2e1759 blob hint: b2e18954 blob hint: b2e1895c blob fatal: Not a valid object name b2e1 is pretty easy. That being said, I don't mind if somebody wanted to do a rev-parse option on top of my series. The formatting code is already split into its own function. -Peff
Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error
On Sep 26, 2016, at 09:36, Linus Torvalds wrote: On Mon, Sep 26, 2016 at 5:00 AM, Jeff Kingwrote: This patch teaches get_short_sha1() to list the sha1s of the objects it found, along with a few bits of information that may help the user decide which one they meant. This looks very good to me, but I wonder if it couldn't be even more aggressive. In particular, the only hashes that most people ever use in short form are commit hashes. Those are the ones you'd use in normal human interactions to point to something happening. So when the disambiguation notices that there is ambiguity, but there is only _one_ commit, maybe it should just have an aggressive mode that says "use that as if it wasn't ambiguous". If you have this: faa23ec9b437812ce2fc9a5b3d59418d672debc1 refs/heads/ambig 7f40afe646fa3f8a0f361b6f567d8f7d7a184c10 refs/tags/ambig and you do this: $ git rev-parse ambig warning: refname 'ambig' is ambiguous. 7f40afe646fa3f8a0f361b6f567d8f7d7a184c10 Git automatically prefers the tag over the branch, but it does spit out a warning. And then have an explicit command (or flag) to do disambiguation for when you explicitly want it. I think you don't even need that. Git already does disambiguation for ref names, picks one and spits out a warning. Why not do the same for short hash names when it makes sense? Rationale: you'd never care about short forms for tags. You'd just use the tag name. And while blob ID's certainly show up in short form in diff output (in the "index" line), very few people will use them. And tree hashes are basically never seen outside of any plumbing commands and then seldom in shortened form. So I think it would make sense to default to a mode that just picks the commit hash if there is only one such hash. Sure, some command might want a "treeish", but a commit is still more likely than a tree or a tag. But regardless, this series looks like a good thing. I like it too. But perhaps it makes sense to actually pick one if there's only one disambiguation of the type you're looking for. For example given: 235234a blob 2352347 tag 235234f tree 2352340 commit If you are doing "git cat-file blob 235234" it should pick the blob and spit out a warning (and similarly for other cat-file types). But "git cat-file -p 235234" would give the fatal error with the disambiguation hints because it wants type "any". If you are doing "git show 235234" it should pick the tag (if it peels to a committish) because Git has already set a precedent of preferring tags over commits when it disambiguates ref names and otherwise pick the commit. Lets consider this approach using the stats for the Linux kernel: Ambiguous prefix length 7 counts: prefixes: 44733 objects: 89766 Ambiguous length 11 (but not at length 12) info: prefixes: 2 0 (with 1 or more commit disambiguations) Ambiguous length 10 (but not at length 11) info: prefixes: 12 3 (with 1 or more commit disambiguations) 0 (with 2 or more commit disambiguations) Ambiguous length 9 (but not at length 10) info: prefixes: 186 43 (with 1 or more commit disambiguations) 1 (with 2 or more commit disambiguations) Ambiguous length 8 (but not at length 9) info: prefixes:2723 651 (with 1 or more commit disambiguations) 40 (with 2 or more commit disambiguations) Ambiguous length 7 (but not at length 8) info: prefixes: 41864 9842 (with 1 or more commit disambiguations) 680 (with 2 or more commit disambiguations) Of the 44733 ambiguous length 7 prefixes, only about 10539 of them disambiguate into one or more commit objects. But if we apply the "spit a warning and prefer a commit object if there's only one and you're looking for a committish" rule, that drops the number from 10539 to about 721. In other words, only about 7% of the previously ambiguous short commit SHA1 prefixes would continue to be ambiguous at length 7. In fact it almost makes a prefix length of 9 good enough, there's just the one at length 9 that disambiguates into more than one commit (45f014c52). --Kyle
Re: Changing the default for "core.abbrev"?
On Sep 25, 2016, at 18:39, Linus Torvalds wrote: The kernel, these days, is at roughly 5 million objects, and while the seven hex digits are still often enough for uniqueness (and git will always add digits *until* it is unique), it's long been at the point where I tell people to do git config --global core.abbrev 12 because even though git will extend the seven hex digits until the object name is unique, that only reflects the *current* situation in the repository. With 5 million objects and a very healthy growth rate, a 7-8 hex digit number that is unique today is not necessarily unique a month or two from now, and then it gets annoying when a commit message has a short git ID that is no longer unique when you go back and try to figure out what went wrong in that commit. On Sep 25, 2016, at 20:46, Junio C Hamano wrote: Linus Torvaldswrites: I can just keep reminding kernel maintainers and developers to update their git config, but maybe it would be a good idea to just admit that the defaults picked in 2005 weren't necessarily the best ones possible, and those could be bumped up a bit? I am not quite sure how good any new default would be, though. Just like any timeout is not long enough for somebody, growing projects will eventually hit whatever abbreviation length they start with. This made me curious what the situation is really like. So I crunched some data. Using a recent clone of $korg/torvalds/linux: $ git rev-parse --verify d597639e203 error: short SHA1 d597639e203 is ambiguous. fatal: Needed a single revision So the kernel already has 11-character "short" SHA1s that are ambiguous. Is a core.abbrev setting of 12 really good enough? Here are the stats on the kernel's repository: Ambiguous length 11 (but not at length 12) info: prefixes: 2 0 (with 1 or more commit disambiguations) Ambiguous length 10 (but not at length 11) info: prefixes: 12 3 (with 1 or more commit disambiguations) 0 (with 2 or more commit disambiguations) Ambiguous length 9 (but not at length 10) info: prefixes: 186 43 (with 1 or more commit disambiguations) 1 (with 2 or more commit disambiguations) 0 (with 3 or more disambiguations) Ambiguous length 8 (but not at length 9) info: prefixes:2723 651 (with 1 or more commit disambiguations) 40 (with 2 or more commit disambiguations) 1 (with 3 or more disambiguations) maxambig: 3 (there is 1 of them) Ambiguous length 7 (but not at length 8) info: prefixes: 41864 9842 (with 1 or more commit disambiguations) 680 (with 2 or more commit disambiguations) 299 (with 3 or more disambiguations) maxambig: 3 (there are 299 of them) The "maxambig" value is the maximum number of disambiguations for any single prefix at that prefix length. So for prefixes of length 7 there are 299 that disambiguate into 3 objects. Just out of curiosity, generating stats on the Git repository gives: Ambiguous length 8 (but not at length 9) info: prefixes: 7 3 (with 1 or more commit disambiguations) 2 (with 2 or more commit disambiguations) 0 (with 3 or more disambiguations) Ambiguous length 7 (but not at length 8) info: prefixes: 87 36 (with 1 or more commit disambiguations) 3 (with 2 or more commit disambiguations) 0 (with 3 or more disambiguations) Running the stats on $github/gitster/git produces some ambiguous length 9 prefixes (one of which contains a commit disambiguation). --Kyle
Re: [PATCH/RFC] git log --oneline alternative with dates, times and initials
On Thu, Sep 29, 2016 at 04:00:06AM -0700, Kyle J. McKay wrote: > > Each of those commits[1] needs some minor polish, and as I'm not really > > that interested in fancy log output myself, I don't plan on working on > > them further. I was mostly curious just how close we were. But if you'd > > like to pursue it, feel free to use them as a starting point. > > Those patches are missing some of the features like showing root commits, > handling two letter initials, showing the weekday, inserting a break where > needed to avoid parent-child confusion in graph output and properly handling > Duy's initials. :) I'm not too surprised. I literally looked at the first screenshot from your output and thought "surely git can do that with some minor tweaks". Nor am I surprised that there are cases where the output is funny (99% of the time I spent on it was tracking down that graph-padding bug). I have no problem taking this in contrib or whatever, until a point when Git is capable of doing the same thing itself. I just hoped to trick you into working on Git. :) -Peff
Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits
Quoting Jeff King: On Thu, Sep 29, 2016 at 04:44:00AM +0200, SZEDER Gábor wrote: > So 12 seems reasonable, and the only downside for it (or for "13", for > that matter) is a few extra bytes. I dunno, maybe people will really > hate that, but I have a feeling these are mostly cut-and-pasted anyway. I for one raise my hand in protest... "few extra bytes" is not the only downside, and it's not at all about how many characters are copy-and-pasted. In my opinion it's much more important that this change wastes 5 columns worth of valuable screen real estate e.g. for 'git blame' or 'git log --oneline' in projects that don't need it and certainly won't ever need it. True. The core of the issue is that we really only care about this minimum length when _storing_ an abbreviation, but we don't know when the user is just looking at it in the moment, and when they are going to stick it in a commit message, email, or bug tracker. In an ideal world, anybody who was about to store it would run "git describe" or something to come up with some canonical reference format. And we could just bump the default minimum there. Personally, I almost exclusively cite commits as the output of: git log -1 --pretty='tformat:%h (%s, %ad)' --date=short Interesting, I have a pretty format alias that looks almost like this, except that I carry a patch locally allowing me to say %as for short date format :) What I sometimes wished for is a pretty format specifier for 'git describe --contains', which would make it convenient to cite commits like this: v0.99~954 (Initial revision of "git", the information manager from hell, 2005-04-07). It's better than the abbreviated object name, because it will stay unique, assuming that the chosen tag is never deleted, and it carries extra information for humans (the first release containing the referenced commit), while the abbreviated object name is completely meaningless. The obvious drawback that makes it a non-solution for the problem at hand is that this format can only refer to commits that are reachable from a tag and can't be used for commits that are descendants of the most recent tag, e.g. when fixing a bug introduced after the last release. Oh, and the user has to fetch the tag first to be able to make sense of such a reference. and I'd be fine to stick "--abbrev=12" in there for future-proofing. But I don't know what the kernel or other projects do. I'd also be curious to know if the patch I sent in [1] to more aggressively prefer commits would make this less of an issue, and people wouldn't care as much about using longer hashes in the first place. So one option is to merge that (and possibly even make it the default) and see if people still care in 6 months. -Peff [1] http://public-inbox.org/git/20160927123801.3bpdg3hap3kzz...@sigill.intra.peff.net/
Re: [PATCH v8 00/11] Git filter protocol
On 29/09/16 12:28, Lars Schneider wrote: On 28 Sep 2016, at 23:49, Junio C Hamanowrote: I suspect that you are preparing a reroll already, but the one that is sitting in 'pu' seems to be flaky in t/t0021 and I seem to see occasional failures from it. I didn't trace where the test goes wrong, but one easy mistake you could make (I am not saying that is the reason of the failure) is to assume your filter will not be called under certain condition (like immediately after you checked out from the index to the working tree), when the automated test goes fast enough and get you into a "racy git" situation---the filter may be asked to filter the contents from the working tree again to re-validate what's there is still what is in the index. Thanks for the heads-up! This is what happens: 1) Git exits 2) The filter process receives EOF and prints "STOP" to the log 3) t0021 checks the content of the log Sometimes 3 happened before 2 which makes the test fail. (Example: https://travis-ci.org/git/git/jobs/162660563 ) I added a this to wait until the filter process terminates: +wait_for_filter_termination () { + while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 2>&1 + do + echo "Waiting for /t0021/rot13-filter.pl to finish..." + sleep 1 + done +} Does this look OK to you? Do we need the ps at all ? How about this: +wait_for_filter_termination () { + while ! grep "STOP" LOGFILENAME >/dev/null + do + echo "Waiting for /t0021/rot13-filter.pl to finish..." + sleep 1 + done +}
Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error
On Sep 26, 2016, at 05:00, Jeff King wrote: $ git rev-parse b2e1 error: short SHA1 b2e1 is ambiguous hint: The candidates are: hint: b2e1196 tag v2.8.0-rc1 hint: b2e11d1 tree hint: b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit- options' hint: b2e1759 blob hint: b2e18954 blob hint: b2e1895c blob fatal: ambiguous argument 'b2e1': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git [...] -- [...]' This hint: information is excellent. There needs to be a way to show it on demand. $ git rev-parse --disambiguate=b2e1 b2e11962c5e6a9c81aa712c751c83a743fd4f384 b2e11d1bb40c5f81a2f4e37b9f9a60ec7474eeab b2e163272c01aca4aee4684f5c683ba341c1953d b2e18954c03ff502053cb74d142faab7d2a8dacb b2e1895ca92ec2037349d88b945ba64ebf16d62d Not nearly so helpful, but the operation of --disambiguate cannot be changed without breaking current scripts. Can your excellent "hint:" output above be attached to the -- disambiguate option somehow, please. Something like this perhaps: $ git rev-parse --disambiguate-list=b2e1 b2e1196 tag v2.8.0-rc1 b2e11d1 tree b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options' b2e1759 blob b2e18954 blob b2e1895c blob Any option name will do, --disambiguate-verbose, --disambiguate- extended, --disambiguate-long, --disambiguate-log, --disambiguate- help, --disambiguate-show-me-something-useful-to-humans-not-scripts ... --Kyle