Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Jacob Keller
On Thu, Sep 29, 2016 at 10:19 AM, Junio C Hamano  wrote:
> Jeff King  writes:
>>   - "cat-file --batch-check" can show you the sha1 and type, but it
>> won't abbreviate sha1s, and it won't show you commit/tag information
>>
>>   - "log --stdin --no-walk" will format the commit however you like, but
>> skips the trees and blobs entirely, and the tag can only be seen via
>> "%d"
>>
>>   - "for-each-ref" has flexible formatting, too, but wants to format
>> refs, not objects (and doesn't read from stdin).
>
> - "name-rev" is used to give "describe --contains", and can read
>   from its standard input, but has no format customization.
>   Another downside of it is that it only wants to see
>   committishes.
>

Some tool which reads standard input and can be formatted would be
nice. Extending name-rev with the same format options as for-each-ref
would be nice.

Thanks,
Jake


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Junio C Hamano
Junio C Hamano  writes:

> Junio C Hamano  writes:
>
>> There still are breakages seen in t5510 and t5526 that are about the
>> verbose output of "git fetch".  I'll stop digging at this point
>> tonight, and welcome others who look into it ;-)
>
> OK, just before I leave the keyboard for the night...
>
> -- >8 --
> From: Junio C Hamano 
> Date: Thu, 29 Sep 2016 21:19:20 -0700
> Subject: [PATCH] abbrev: adjust to the new world order

To those who are following from sidelines, this builds on Linus's
third iteration patch (which is based on his first patch), applied
on Peff's "give disambiguation help when giving an ambiguity error"
series.  I didn't merge the work-in-progress going back and forth
between Linus and I tonight to any of the integration branches, but
it is available as lt/abbrev-auto-2 branch of the "broken down"
repository, i.e.

git://github.com/gitster/git.git lt/abbrev-auto-2



Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 9:18 PM, Linus Torvalds
 wrote:
>
> There are probably other things like that.

t5510-fetch.sh fails oddly, looks like the output is off by one character.

   not ok 77 - fetch aligned output

It has a magic "cut -c 22-" that expects the output at a specific
place, and now it's at column 21 instead of column 22. Strange test,
but it still seems to be aligned, just in a different column.

But clearly something changed.

 Linus


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Junio C Hamano
Junio C Hamano  writes:

> There still are breakages seen in t5510 and t5526 that are about the
> verbose output of "git fetch".  I'll stop digging at this point
> tonight, and welcome others who look into it ;-)

OK, just before I leave the keyboard for the night...

-- >8 --
From: Junio C Hamano 
Date: Thu, 29 Sep 2016 21:19:20 -0700
Subject: [PATCH] abbrev: adjust to the new world order

The default_abbrev used to be a concrete value usable as the default
abbreviation length.  The code that sets custom abbreviation length,
in response to command line argument, often did something like:

if (skip_prefix(arg, "--abbrev=", ))
abbrev = atoi(arg);
else if (!strcmp("--abbrev", ))
abbrev = DEFAULT_ABBREV;
/* make the value sane */
if (abbrev < 0 || 40 < abbrev)
abbrev = ... some sane value ...

The new world order however is that the default_abbrev is a negative
value that signals find_unique_abbrev() that it needs to dynamically
find out a good default value.  We shouldn't coerce a negative value
into a random positive value like the above sample code.

Signed-off-by: Junio C Hamano 
---
 builtin/rev-parse.c | 5 +++--
 diff.c  | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 76cf05e2ad..17cbfabdde 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -643,8 +643,9 @@ int cmd_rev_parse(int argc, const char **argv, const char 
*prefix)
filter &= ~(DO_FLAGS|DO_NOREV);
verify = 1;
abbrev = DEFAULT_ABBREV;
-   if (arg[7] == '=')
-   abbrev = strtoul(arg + 8, NULL, 10);
+   if (!arg[7])
+   continue;
+   abbrev = strtoul(arg + 8, NULL, 10);
if (abbrev < MINIMUM_ABBREV)
abbrev = MINIMUM_ABBREV;
else if (40 <= abbrev)
diff --git a/diff.c b/diff.c
index c6da383c56..cefc13eb8e 100644
--- a/diff.c
+++ b/diff.c
@@ -3399,7 +3399,7 @@ void diff_setup_done(struct diff_options *options)
 */
read_cache();
}
-   if (options->abbrev <= 0 || 40 < options->abbrev)
+   if (40 < options->abbrev)
options->abbrev = 40; /* full */
 
/*
-- 
2.10.0-612-g22341905f2



Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 9:10 PM, Junio C Hamano  wrote:
>
> A quick and dirty fix for it may look like this.

Crossed emails.

Indeed, I just solved the builtin/rev-parse.c thing slightly differently.

And you found another failure in the diff code similarly not liking
the negative DEFAULT_ABBREV.  There are probably other things like
that.

  Linus


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 8:54 PM, Junio C Hamano  wrote:
>
>  * The script uses "git rev-parse --short HEAD"; I suspect that it
>says "ah, default_abbrev is -1 and minimum_abbrev is 4, so let's
>try abbreviating to 4 hexdigits".

Ahh, right you are. The logic there is

abbrev = DEFAULT_ABBREV;
if (arg[7] == '=')
abbrev = strtoul(arg + 8, NULL, 10);
if (abbrev < MINIMUM_ABBREV)
abbrev = MINIMUM_ABBREV;


which now does something different than what it used to do because
DEFAULT_ABBREV is -1.

Putting the "sanity-check the abbrev range" tests inside the "if()"
statement that does strtoul() should fix it. Let me test...

[ short time passes ]

Yup. Incremental patch for that single issue attached.  I made it do
an early "continue" instead of adding another level on indentation.

 Linus
 builtin/rev-parse.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 4da1f1da2..cfb0f1510 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -671,8 +671,9 @@ int cmd_rev_parse(int argc, const char **argv, const char 
*prefix)
filter &= ~(DO_FLAGS|DO_NOREV);
verify = 1;
abbrev = DEFAULT_ABBREV;
-   if (arg[7] == '=')
-   abbrev = strtoul(arg + 8, NULL, 10);
+   if (!arg[7])
+   continue;
+   abbrev = strtoul(arg + 8, NULL, 10);
if (abbrev < MINIMUM_ABBREV)
abbrev = MINIMUM_ABBREV;
else if (40 <= abbrev)


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Junio C Hamano
Junio C Hamano  writes:

> Linus Torvalds  writes:
>
>> So this patch may actually be "production ready" apart from the fact
>> that some tests still fail (at least t2027-worktree-list.sh) because
>> of different short SHA1 cases.
>
> t2027 has at least two problems.
>
>  * "git worktree" does not read the core.abbrev configuration,
>without a recent fix in jc/worktree-config, i.e. d49028e6
>("worktree: honor configuration variables", 2016-09-26).
>
>  * The script uses "git rev-parse --short HEAD"; I suspect that it
>says "ah, default_abbrev is -1 and minimum_abbrev is 4, so let's
>try abbreviating to 4 hexdigits".
>
> The first failure in t3203 seems to come from the same issue in
> "rev-parse --short".

A quick and dirty fix for it may look like this.

We leave the variable abbrev to DEFAULT_ABBREV and let
find_unique_abbrev() react to "eh, -1? I need to do the
auto-scaling".  "git diff-tree --abbrev" seems to have a similar
problem, and the fix is the same.

There still are breakages seen in t5510 and t5526 that are about the
verbose output of "git fetch".  I'll stop digging at this point
tonight, and welcome others who look into it ;-)

 builtin/rev-parse.c | 14 --
 diff.c  |  2 +-
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 76cf05e2ad..f8c8c6c22e 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -642,13 +642,15 @@ int cmd_rev_parse(int argc, const char **argv, const char 
*prefix)
starts_with(arg, "--short=")) {
filter &= ~(DO_FLAGS|DO_NOREV);
verify = 1;
-   abbrev = DEFAULT_ABBREV;
-   if (arg[7] == '=')
+   if (arg[7] != '=') {
+   abbrev = DEFAULT_ABBREV;
+   } else {
abbrev = strtoul(arg + 8, NULL, 10);
-   if (abbrev < MINIMUM_ABBREV)
-   abbrev = MINIMUM_ABBREV;
-   else if (40 <= abbrev)
-   abbrev = 40;
+   if (abbrev < MINIMUM_ABBREV)
+   abbrev = MINIMUM_ABBREV;
+   else if (40 <= abbrev)
+   abbrev = 40;
+   }
continue;
}
if (!strcmp(arg, "--sq")) {
diff --git a/diff.c b/diff.c
index c6da383c56..cefc13eb8e 100644
--- a/diff.c
+++ b/diff.c
@@ -3399,7 +3399,7 @@ void diff_setup_done(struct diff_options *options)
 */
read_cache();
}
-   if (options->abbrev <= 0 || 40 < options->abbrev)
+   if (40 < options->abbrev)
options->abbrev = 40; /* full */
 
/*


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Junio C Hamano
Linus Torvalds  writes:

> So this patch may actually be "production ready" apart from the fact
> that some tests still fail (at least t2027-worktree-list.sh) because
> of different short SHA1 cases.

t2027 has at least two problems.

 * "git worktree" does not read the core.abbrev configuration,
   without a recent fix in jc/worktree-config, i.e. d49028e6
   ("worktree: honor configuration variables", 2016-09-26).

 * The script uses "git rev-parse --short HEAD"; I suspect that it
   says "ah, default_abbrev is -1 and minimum_abbrev is 4, so let's
   try abbreviating to 4 hexdigits".

The first failure in t3203 seems to come from the same issue in
"rev-parse --short".


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Mike Hommey
On Thu, Sep 29, 2016 at 12:06:23PM -0700, Linus Torvalds wrote:
> On Thu, Sep 29, 2016 at 11:55 AM, Linus Torvalds
>  wrote:
> >
> > For the kernel, just the *math* right now actually gives 12
> > characters. For current git it actually seems to say that 8 is the
> > correct number. For small projects, you'll still see 7.
> 
> Sorry, the git number is 9, not 8. The reason is that git has roughly
> 212k objects, and 9 hex digits gets expected collisions at about 256k
> objects.
> 
> So the logic means that we'll see 7 hex digits for projects with less
> than 16k objects, 8 hex digits if there are less than 64k objects, and
> 9 hex digits for projects like git that currently have fewer than 256k
> objects.
> 
> But git itself might not be *that* far from going to 10 hex digits
> with my patch.
> 
> The kernel uses 12 he digits because the collision math says that's
> the right thing for a project with between 4M and 16M objects (with
> the kernel being at 5M).

OTOH, how often does one refer to trees or blobs with abbreviated sha1s?
Most of the time, you'd use abbreviated sha1s for commits. And the number
of commits in git and the kernel repositories are much lower than the
number of overall objects.

rev-list --all --count on the git repo gives me 46790. On the kernel, it
gives 618078.

Now, the interesting thing is looking at the *actual* collisions in
those spaces.

At 9 digits, there's only one commit collision in the kernel repo:
  45f014c5264f5e68ef0e51b36f4ef5ede3d18397
  45f014c52eef022873b19d6a20eb0ec9668f2b09

And two commit collisions at 8 digits in the git repo:
  1536dd9c1df0b7167b139f080cc4774ef63f
  1536dd9c61b5582cf07057cb715dd6dc6620

  2e6e3e82ee36b3e1bec1db8db24817270080424e
  2e6e3e829f3759823d70e7af511bc04cd05ad0af

At 7 digits, there are 5 actual commit collisions in the git repo and
718 in the kernel repo only one of those collisions involve more than 2
commits.

Mike


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 5:57 PM, Linus Torvalds
 wrote:
>
> Actually, all the other cases seem to be "parse a SHA1 with a known
> length", so they really don't have a negative length.  So this seems
> ok, and is easier to verify than the "what all contexts might use
> DEFAULT_ABBREV" thing. There's only a few callers, and it's a static
> function so it's easy to check it locally in sha1_name.c.

Here's my original patch with just a tiny change that instead of
starting the automatic guessing at 7 each time, it starts at
"default_automatic_abbrev", which is initialized to 7.

The difference is that if we decide that "oh, that was too small, need
to repeat", we also update that "default_automatic_abbrev" value, so
that we won't start at the number that we now know was too small.

So it still loops over the abbrev values, but now it only loops a
couple of times.

I actually verified the performance impact by doing

  time git rev-list --abbrev-commit HEAD > /dev/null

on the kernel git tree, and it does actually matter. With my original
patch, we wasted a noticeable amount of time on just the extra
looping, with this it's down to the same performance as just doing it
once at init time (it's about 12s vs 9s on my laptop).

So this patch may actually be "production ready" apart from the fact
that some tests still fail (at least t2027-worktree-list.sh) because
of different short SHA1 cases.

 Linus
 cache.h   |  1 +
 environment.c |  2 +-
 sha1_name.c   | 26 +-
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index 6e33f2f28..d2da6d186 100644
--- a/cache.h
+++ b/cache.h
@@ -1207,6 +1207,7 @@ struct object_context {
 #define GET_SHA1_TREEISH  020
 #define GET_SHA1_BLOB 040
 #define GET_SHA1_FOLLOW_SYMLINKS 0100
+#define GET_SHA1_AUTOMATIC  0200
 #define GET_SHA1_ONLY_TO_DIE04000
 
 #define GET_SHA1_DISAMBIGUATORS \
diff --git a/environment.c b/environment.c
index c1442df9a..fd6681e46 100644
--- a/environment.c
+++ b/environment.c
@@ -16,7 +16,7 @@ int trust_executable_bit = 1;
 int trust_ctime = 1;
 int check_stat = 1;
 int has_symlinks = 1;
-int minimum_abbrev = 4, default_abbrev = 7;
+int minimum_abbrev = 4, default_abbrev = -1;
 int ignore_case;
 int assume_unchanged;
 int prefer_symlink_refs;
diff --git a/sha1_name.c b/sha1_name.c
index 3b647fd7c..1003c96ea 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -15,6 +15,7 @@ typedef int (*disambiguate_hint_fn)(const unsigned char *, 
void *);
 
 struct disambiguate_state {
int len; /* length of prefix in hex chars */
+   unsigned int nrobjects;
char hex_pfx[GIT_SHA1_HEXSZ + 1];
unsigned char bin_pfx[GIT_SHA1_RAWSZ];
 
@@ -118,6 +119,12 @@ static void find_short_object_filename(struct 
disambiguate_state *ds)
 
if (strlen(de->d_name) != 38)
continue;
+
+   // We only look at the one subdirectory, and we assume
+   // each subdirectory is roughly similar, so each object
+   // we find probably has 255 other objects in the other
+   // fan-out directories
+   ds->nrobjects += 256;
if (memcmp(de->d_name, ds->hex_pfx + 2, ds->len - 2))
continue;
memcpy(hex + 2, de->d_name, 38);
@@ -151,6 +158,7 @@ static void unique_in_pack(struct packed_git *p,
 
open_pack_index(p);
num = p->num_objects;
+   ds->nrobjects += num;
last = num;
while (first < last) {
uint32_t mid = (first + last) / 2;
@@ -380,6 +388,9 @@ static int show_ambiguous_object(const unsigned char *sha1, 
void *data)
return 0;
 }
 
+// Why seven? That's our historical default before the automatic abbreviation
+static int default_automatic_abbrev = 7;
+
 static int get_short_sha1(const char *name, int len, unsigned char *sha1,
  unsigned flags)
 {
@@ -426,6 +437,14 @@ static int get_short_sha1(const char *name, int len, 
unsigned char *sha1,
for_each_abbrev(ds.hex_pfx, show_ambiguous_object, );
}
 
+   if (len < 16 && !status && (flags & GET_SHA1_AUTOMATIC)) {
+   unsigned int expect_collision = 1 << (len * 2);
+   if (ds.nrobjects > expect_collision) {
+   default_automatic_abbrev = len+1;
+   return SHORT_NAME_AMBIGUOUS;
+   }
+   }
+
return status;
 }
 
@@ -458,14 +477,19 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn 
fn, void *cb_data)
 int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
 {
int status, exists;
+   int flags = GET_SHA1_QUIETLY;
 
+   if (len < 0) {
+   flags |= GET_SHA1_AUTOMATIC;
+   len = default_automatic_abbrev;
+   }
   

Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 5:56 PM, Mike Hommey  wrote:
>
> OTOH, how often does one refer to trees or blobs with abbreviated sha1s?
> Most of the time, you'd use abbreviated sha1s for commits. And the number
> of commits in git and the kernel repositories are much lower than the
> number of overall objects.

See that whole other discussion about this. I agree. If we only ever
worried about just commits, the abbreviation length wouldn't need to
be grown nearly as aggressively. The current default would still be
wrong for the kernel, but it wouldn't be as noticeably wrong, and
updating it to 8 or 9 would be fine.

That said, people argued against that too. We *do* end up having
abbreviated SHA1's for blobs in the diff index. When I said that _I_
neer use it, somebody piped up to say that they do.

So I'd rather just keep the existing semantics (a hash is a hash is a
hash), and just abbreviate at a sufficient point that we don't have to
worry too much about disambiguating further by object type.

  Linus


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 5:28 PM, Linus Torvalds
 wrote:
>
> To be fair, my original patch had a different worry that I didn't
> bother with: what if one of the _other_ callers of "get_short_sha1()"
> passed in -1 to it.  I only handled the -1 case in th eone path care
> about in that first RFC for testing. So I'm *not* suggesting you
> should apply my first version,, It has issues too.

Actually, all the other cases seem to be "parse a SHA1 with a known
length", so they really don't have a negative length.  So this seems
ok, and is easier to verify than the "what all contexts might use
DEFAULT_ABBREV" thing. There's only a few callers, and it's a static
function so it's easy to check it locally in sha1_name.c.

   Linus


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 5:20 PM, Linus Torvalds
 wrote:
>
> As you say, my original patch had neither of those issues.

To be fair, my original patch had a different worry that I didn't
bother with: what if one of the _other_ callers of "get_short_sha1()"
passed in -1 to it.  I only handled the -1 case in th eone path care
about in that first RFC for testing. So I'm *not* suggesting you
should apply my first version,, It has issues too.

Let me see if I can massage my first hacky RFC test-patch into
something more reliable.

  Linus


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 4:13 PM, Junio C Hamano  wrote:
>
> One thing that worries me is if we are ready to start accessing the
> object store in all codepaths when we ask for DEFAULT_ABBREV.

Yes. That was my main worry too. I also looked at just doing an explicit

 if (abbrev_commit && default_abbrev < 0)
  default_abbrev = get_default_abbrev();

and in many ways that would be nicer exactly because the point where
this happens is then explicit, instead of being hidden behind that
macro that may end up being done in random places.

But it wasn't entirely obvious which all paths would need that
initialization either, so on the whole it was very much a "six of one,
half a dozen of the other" thing.

As you say, my original patch had neither of those issues. It just
stupidly re-did the loop over and over, and maybe the right thing to
do is to have that original code, but just short-circuit the "over and
over" behavior by just resetting default_abbrev to the value we do
find.

  Linus


Re: [PATCH v6 3/4] ls-files: pass through safe options for --recurse-submodules

2016-09-29 Thread Junio C Hamano
Brandon Williams  writes:

> +static void compile_submodule_options(const struct dir_struct *dir, int 
> show_tag)
> +{
> + if (line_terminator == '\0')
> + argv_array_push(_options, "-z");
> + if (show_tag)
> + argv_array_push(_options, "-t");
> + if (show_valid_bit)
> + argv_array_push(_options, "-v");
> + if (show_cached)
> + argv_array_push(_options, "--cached");
> + if (show_deleted)
> + argv_array_push(_options, "--deleted");
> + if (show_modified)
> + argv_array_push(_options, "--modified");
> + if (show_others)
> + argv_array_push(_options, "--others");
> + if (dir->flags & DIR_SHOW_IGNORED)
> + argv_array_push(_options, "--ignored");
> + if (show_stage)
> + argv_array_push(_options, "--stage");
> + if (show_killed)
> + argv_array_push(_options, "--killed");
> + if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES)
> + argv_array_push(_options, "--directory");
> + if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES))
> + argv_array_push(_options, "--empty-directory");
> + if (show_unmerged)
> + argv_array_push(_options, "--unmerged");
> + if (show_resolve_undo)
> + argv_array_push(_options, "--resolve-undo");
> + if (show_eol)
> + argv_array_push(_options, "--eol");
> + if (debug_mode)
> + argv_array_push(_options, "--debug");
> +}

With this and 4/4 applied, the documentation still says "--cached"
is the only supported option.

Does it really make sense to pass all of these?  I understand "-z"
and I suspect things like "-t" and "-v" that affect "how" things are
shown may also happen to work, but I am not sure how much it makes
sense for options that affect "what" things are shown.

What does it even mean to ask for say "--unmerged" to be shown, for
example, from the superproject?  Recurse into submodules whose cache
entries in the index of the superproject are unmerged, or something
else?

I am inclined to say that it is probably better to keep the
"--cached only" as documented, at least on the "what are shown"
side.

Thanks.


Re: [PATCH v2] http: Control GSSAPI credential delegation.

2016-09-29 Thread brian m. carlson
On Wed, Sep 28, 2016 at 08:01:34PM +0200, Petr Stodulka wrote:
> Delegation of credentials is disabled by default in libcurl since
> version 7.21.7 due to security vulnerability CVE-2011-2192. Which
> makes troubles with GSS/kerberos authentication when delegation
> of credentials is required. This can be changed with option
> CURLOPT_GSSAPI_DELEGATION in libcurl with set expected parameter
> since libcurl version 7.22.0.
> 
> This patch provides new configuration variable http.delegation
> which corresponds to curl parameter "--delegation" (see man 1 curl).
> 
> The following values are supported:
> 
> * none (default).
> * policy
> * always

I don't personally use Kerberos delegation with Git, but I don't see any
problems with this patch.  It preserves the security properties of the
current behavior, and I think adding "policy" as an option to allow
per-realm configuration is a good idea.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 1/3] add QSORT

2016-09-29 Thread René Scharfe
Am 30.09.2016 um 01:21 schrieb René Scharfe:
> Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
>> René Scharfe  writes:
>>
>>> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
>>> size of the array elements and supports the convention of initializing
>>> empty arrays with a NULL pointer, which we use in some places.
>>>
>>> Calling qsort(3) directly with a NULL pointer is undefined -- even with
>>> an element count of zero -- and allows the compiler to optimize away any
>>> following NULL checks.  Using the macro avoids such surprises.
>>>
>>> Add a semantic patch as well to demonstrate the macro's usage and to
>>> automate the transformation of trivial cases.
>>>
>>> Signed-off-by: Rene Scharfe 
>>> ---
>>>  contrib/coccinelle/qsort.cocci | 19 +++
>>>  git-compat-util.h  |  8 
>>>  2 files changed, 27 insertions(+)
>>>  create mode 100644 contrib/coccinelle/qsort.cocci
>>
>> The direct calls to qsort(3) that this series leaves behind are
>> interesting.
>>
>> 1. builtin/index-pack.c has this:
>>
>>  if (1 < opts->anomaly_nr)
>>  qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
>> cmp_uint32);
>>
>> where opts->anomaly is coming from pack.h:
>>
>> struct pack_idx_option {
>> unsigned flags;
>> ...
>> int anomaly_alloc, anomaly_nr;
>> uint32_t *anomaly;
>> };
>>
>> I cannot quite see how the automated conversion misses it?  It's not
>> like base and nmemb are type-restricted in the rule (they are both
>> just "expression"s).
>>
>> 2. builtin/shortlog.c has this:
>>
>>  qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
>>log->summary ? compare_by_counter : compare_by_list);
>>
>> where log->list is coming from shortlog.h:
>>
>> struct shortlog {
>> struct string_list list;
>> };
>>
>> and string-list.h says:
>>
>> struct string_list {
>> struct string_list_item *items;
>> unsigned int nr, alloc;
>> ...
>> };
>>
>> which seems to be a good candidate for this rule:
>>
>> type T;
>> T *base;
>> expression nmemb, compar;
>> @@
>> - qsort(base, nmemb, sizeof(T), compar);
>> + QSORT(base, nmemb, compar);
>>
>> if we take "T == struct string_list_item".
> 
> Transformations for these two are generated if we pass --all-includes
> to spatch.  So let's do that.

And here's the result:

-- >8 --
Subject: [PATCH] use QSORT, part 2

Convert two more qsort(3) calls to QSORT to reduce code size and for
better safety and consistency.

Signed-off-by: Rene Scharfe 
---
Squashable.

 builtin/index-pack.c | 3 +--
 builtin/shortlog.c   | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7657d0a..0a27bab 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1531,8 +1531,7 @@ static void read_v2_anomalous_offsets(struct packed_git 
*p,
opts->anomaly[opts->anomaly_nr++] = ntohl(idx2[off * 2 + 1]);
}
 
-   if (1 < opts->anomaly_nr)
-   qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
cmp_uint32);
+   QSORT(opts->anomaly, opts->anomaly_nr, cmp_uint32);
 }
 
 static void read_idx_option(struct pack_idx_option *opts, const char 
*pack_name)
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 25fa8a6..ba0e115 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -308,7 +308,7 @@ void shortlog_output(struct shortlog *log)
struct strbuf sb = STRBUF_INIT;
 
if (log->sort_by_number)
-   qsort(log->list.items, log->list.nr, sizeof(struct 
string_list_item),
+   QSORT(log->list.items, log->list.nr,
  log->summary ? compare_by_counter : compare_by_list);
for (i = 0; i < log->list.nr; i++) {
const struct string_list_item *item = >list.items[i];
-- 
2.10.0




Re: Impossible to change working directory

2016-09-29 Thread Junio C Hamano
Sebastian Feldmann  writes:

> the script fails because changing the current working directory fails.
> If I echo the current working directory it always echoes the root repository 
> path
>
> Is this expected behavior?

Yes, we always go to the top before doing anything.

If you echo environment variables Git may set up for you, what do
you see?  Do you see GIT_PREFIX that you can use to tell where you
came from, or something like that?


Re: [PATCH 1/3] add QSORT

2016-09-29 Thread René Scharfe
Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
>> size of the array elements and supports the convention of initializing
>> empty arrays with a NULL pointer, which we use in some places.
>>
>> Calling qsort(3) directly with a NULL pointer is undefined -- even with
>> an element count of zero -- and allows the compiler to optimize away any
>> following NULL checks.  Using the macro avoids such surprises.
>>
>> Add a semantic patch as well to demonstrate the macro's usage and to
>> automate the transformation of trivial cases.
>>
>> Signed-off-by: Rene Scharfe 
>> ---
>>  contrib/coccinelle/qsort.cocci | 19 +++
>>  git-compat-util.h  |  8 
>>  2 files changed, 27 insertions(+)
>>  create mode 100644 contrib/coccinelle/qsort.cocci
> 
> The direct calls to qsort(3) that this series leaves behind are
> interesting.
> 
> 1. builtin/index-pack.c has this:
> 
>   if (1 < opts->anomaly_nr)
>   qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
> cmp_uint32);
> 
> where opts->anomaly is coming from pack.h:
> 
> struct pack_idx_option {
> unsigned flags;
> ...
> int anomaly_alloc, anomaly_nr;
> uint32_t *anomaly;
> };
> 
> I cannot quite see how the automated conversion misses it?  It's not
> like base and nmemb are type-restricted in the rule (they are both
> just "expression"s).
> 
> 2. builtin/shortlog.c has this:
> 
>   qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
> log->summary ? compare_by_counter : compare_by_list);
> 
> where log->list is coming from shortlog.h:
> 
> struct shortlog {
> struct string_list list;
> };
> 
> and string-list.h says:
> 
> struct string_list {
> struct string_list_item *items;
> unsigned int nr, alloc;
> ...
> };
> 
> which seems to be a good candidate for this rule:
> 
> type T;
> T *base;
> expression nmemb, compar;
> @@
> - qsort(base, nmemb, sizeof(T), compar);
> + QSORT(base, nmemb, compar);
> 
> if we take "T == struct string_list_item".

Transformations for these two are generated if we pass --all-includes
to spatch.  So let's do that.

-- >8 --
Subject: [PATCH] coccicheck: use --all-includes by default

Add a make variable, SPATCH_FLAGS, for specifying flags for spatch, and
set it to --all-includes by default.  This option lets it consider
header files which would otherwise be ignored.  That's important for
some rules that rely on type information.  It doubles the duration of
coccicheck, however.

Signed-off-by: Rene Scharfe 
---
 Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 1aad150..d15bf8d 100644
--- a/Makefile
+++ b/Makefile
@@ -467,6 +467,7 @@ SPATCH = spatch
 export TCL_PATH TCLTK_PATH
 
 SPARSE_FLAGS =
+SPATCH_FLAGS = --all-includes
 
 
 
@@ -2314,7 +2315,7 @@ C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ))
 %.cocci.patch: %.cocci $(C_SOURCES)
@echo '' SPATCH $<; \
for f in $(C_SOURCES); do \
-   $(SPATCH) --sp-file $< $$f; \
+   $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS); \
done >$@ 2>$@.log; \
if test -s $@; \
then \
-- 
2.10.0



Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Junio C Hamano
Junio C Hamano  writes:

> Linus Torvalds  writes:
>
>> The advantage of the
>> previous patch was that it got the object counting right almost
>> automatically, this actually has its own new object counting code and
>> maybe I screwed it up.

I guess another advantage of your original approach was that it
delayed the counting to the very last minute, so the things that
worried me in my previous response were automatically made
non-issues.


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Junio C Hamano
Linus Torvalds  writes:

> Somebody should really double-check my heuristics, to see that I did
> the pack counting etc right.  It doesn't do alternate loose file
> counting at all, and maybe it could matter.  The advantage of the
> previous patch was that it got the object counting right almost
> automatically, this actually has its own new object counting code and
> maybe I screwed it up.

One thing that worries me is if we are ready to start accessing the
object store in all codepaths when we ask for DEFAULT_ABBREV.  The
worries are twofold:

 (1) Do we do the right thing if object store is not available to
 us?  Some commands can be run outside repository, and if our
 call to prepare_packed_git() or loose object iteration barfed
 in some way, that would introduce a regression.

 (2) Is calling prepare_packed_git() too early interfere with how
 the commands expect its own prepare_packed_git() work?  That
 is, if a command has this sequence, "ask DEFAULT_ABBREV,
 arrange things, and then call prepare_packed_git()", and the
 existing "arrange things" step had something that causes a new
 pack to become eligible to be read by prepare_packed_git(),
 like adding to the list of alternate object stores, its own
 prepare_packed_git() will now become a no-op.

I browsed through "tig grep DEFAULT_ABBREV \*.c" and it seems that
in majority of the hits, we not just are ready to start accessing,
but already have an object or two, which must have come from an
already open object store, so they are OK.  Especially the ones that
use it as the last argument to find_unique_abbrev() are OK as we are
about to open the object store to do the computation.

There are very early ones in the program startup sequence in the
following functions, but I do not think of a reason why our new and
early call to prepare_packed_git() might be problematic, given that
all of them require us to have an access to the repository (i.e.
this change cannot introduce a regression where a command used to
work outside a repository but barf when prepare_packed_git() is
called early):

 - builtin/describe.c
 - builtin/rev-list.c
 - builtin/rev-parse.c

I thought that the one in diff.c might be problematic when the "git
diff" command is run outside a repository with the "--no-index"
option, but it appears that init_default_abbrev() seems to be OK
when run outside a repository.

There is one in parse-options-cb.c that is used to parse the --abbrev
command line option.  This might cause a cosmetic problem but when
the user is asking for an abbreviation, it is expected that we will
have an access to the object store anyway, so it may be OK.

I am sorry that none of the above is about your math ;-)  I suck at
math so I won't comment.



Re: [PATCH 1/3] add QSORT

2016-09-29 Thread Junio C Hamano
René Scharfe  writes:

> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
> size of the array elements and supports the convention of initializing
> empty arrays with a NULL pointer, which we use in some places.
>
> Calling qsort(3) directly with a NULL pointer is undefined -- even with
> an element count of zero -- and allows the compiler to optimize away any
> following NULL checks.  Using the macro avoids such surprises.
>
> Add a semantic patch as well to demonstrate the macro's usage and to
> automate the transformation of trivial cases.
>
> Signed-off-by: Rene Scharfe 
> ---
>  contrib/coccinelle/qsort.cocci | 19 +++
>  git-compat-util.h  |  8 
>  2 files changed, 27 insertions(+)
>  create mode 100644 contrib/coccinelle/qsort.cocci

The direct calls to qsort(3) that this series leaves behind are
interesting.

1. builtin/index-pack.c has this:

if (1 < opts->anomaly_nr)
qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
cmp_uint32);

where opts->anomaly is coming from pack.h:

struct pack_idx_option {
unsigned flags;
...
int anomaly_alloc, anomaly_nr;
uint32_t *anomaly;
};

I cannot quite see how the automated conversion misses it?  It's not
like base and nmemb are type-restricted in the rule (they are both
just "expression"s).

2. builtin/shortlog.c has this:

qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
  log->summary ? compare_by_counter : compare_by_list);

where log->list is coming from shortlog.h:

struct shortlog {
struct string_list list;
};

and string-list.h says:

struct string_list {
struct string_list_item *items;
unsigned int nr, alloc;
...
};

which seems to be a good candidate for this rule:

type T;
T *base;
expression nmemb, compar;
@@
- qsort(base, nmemb, sizeof(T), compar);
+ QSORT(base, nmemb, compar);

if we take "T == struct string_list_item".

3. builtin/show-branch.c does this:

qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]),
  compare_ref_name);

where ref_name[] is a file-scope global:

static char *ref_name[MAX_REVS + 1];

and top and bottom are plain integers.  The sizeof() does not take
the size of *base, so it is understandable that this does not get
automatically converted.

It seems that some calls to this function _could_ send the same top
and bottom, asking for 0 element array to be sorted, by the way.

Thanks for an amusing read.



Re: [RFC/PATCH 0/2] place cherry pick line below commit title

2016-09-29 Thread Junio C Hamano
Jonathan Tan  writes:

> This is somewhat of a follow-up to my previous e-mail with subject
> "[PATCH] sequencer: support folding in rfc2822 footer" [1], in which I
> proposed relaxing the definition of a commit message footer to allow
> multiple-line field bodies (as described in RFC2822), but its strictness
> was deemed deliberate.

It does not necessarily mean we can never change it when we did
something deliberately, though.  With a good enough justification,
and with a transitition plan if the backward incompatibility is
severe enough to warrant one, we can change things.

I vaguely recall that there were some discussion on the definition
of "what's a trailer line" with folks from the kernel land, perhaps
while discussing the interpret-trailers topic.  IIRC, when somebody
passes an improved version along, the resulting message's trailer
block may look like this:

Signed-off-by: Original Author 
[fixed typo in the variable names]
Signed-off-by: Somebhody Else 

and an obvious "wish" of theirs was to treat not just RFC2822-like
"a line that begins with token followed by a colon" but also these
short comments as part of the trailer block.  Your original wish in
[*1*] is to also treat "a line that begin with a whitespace that
follows a line that begins with token followed by a colon" as part
of the trailer block and I personally think that is a reasonable
thing to wish for, too.

I recall that I was somewhat surprised and dissapointed to see no
change to interpret-trailers when you tried [*1*], which was really
about improving the definition of what the trailer block is, by the
way.

In any case, if we want to improve what the trailer block is, we
would certainly need to make sure what is inserted by "cherry-pick -x"
is also considered as part of the trailer block, so it may be necessary
to change it to "Cherry-picked-from: ..." while doing so.  I dunno.

> Below is a patch set that allows placing the "cherry picked from" line
> without taking into account the definition of a commit message footer.
> For example, "git cherry-pick -x" (with the appropriate configuration
> variable or argument) would, to this commit message:
>
>   commit title
>
>   This is an explanatory paragraph.
>
>   Footer: foo
>
> place the "(cherry picked from ...)" line below "commit title".
>
> Would this be better?

It is not immediately obvious what such a change buys us.  Wouldn't
the current code place that line below "Footer: foo"?  I cannot
think of any reason why anybody would want to place "cherry-picked
from" immediately below the title and before the first line of the
body.


[Footnotes]

*1* 
http://public-inbox.org/git/1472846322-5592-1-git-send-email-jonathanta...@google.com/


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 12:45 PM, Junio C Hamano  wrote:
>
> I think that is a reasonable way to go.
>
> #define DEFAULT_ABBREV get_default_abbrev()
>
> would help.

So something like this that replaces the previous patch?

Somebody should really double-check my heuristics, to see that I did
the pack counting etc right.  It doesn't do alternate loose file
counting at all, and maybe it could matter.  The advantage of the
previous patch was that it got the object counting right almost
automatically, this actually has its own new object counting code and
maybe I screwed it up.

Linus
 cache.h   |  3 ++-
 environment.c |  2 +-
 sha1_file.c   | 43 +++
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index 6e33f2f28..a022e1bd2 100644
--- a/cache.h
+++ b/cache.h
@@ -1186,8 +1186,9 @@ static inline int hex2chr(const char *s)
 }
 
 /* Convert to/from hex/sha1 representation */
+extern int get_default_abbrev(void);
 #define MINIMUM_ABBREV minimum_abbrev
-#define DEFAULT_ABBREV default_abbrev
+#define DEFAULT_ABBREV get_default_abbrev()
 
 struct object_context {
unsigned char tree[20];
diff --git a/environment.c b/environment.c
index c1442df9a..fd6681e46 100644
--- a/environment.c
+++ b/environment.c
@@ -16,7 +16,7 @@ int trust_executable_bit = 1;
 int trust_ctime = 1;
 int check_stat = 1;
 int has_symlinks = 1;
-int minimum_abbrev = 4, default_abbrev = 7;
+int minimum_abbrev = 4, default_abbrev = -1;
 int ignore_case;
 int assume_unchanged;
 int prefer_symlink_refs;
diff --git a/sha1_file.c b/sha1_file.c
index ca149a607..28ba04b65 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3720,3 +3720,46 @@ int for_each_packed_object(each_packed_object_fn cb, 
void *data, unsigned flags)
}
return r ? r : pack_errors;
 }
+
+static int init_default_abbrev(void)
+{
+   unsigned long count = 0;
+   struct packed_git *p;
+   struct strbuf buf = STRBUF_INIT;
+   DIR *dir;
+   char *name;
+   int ret;
+
+   prepare_packed_git();
+   for (p = packed_git; p; p = p->next) {
+   if (open_pack_index(p))
+   continue;
+   count += p->num_objects;
+   }
+
+   strbuf_addstr(, get_object_directory());
+   strbuf_addstr(, "/42/");
+   name = strbuf_detach(, NULL);
+   dir = opendir(name);
+   free(name);
+   if (dir) {
+   struct dirent *de;
+   while ((de = readdir(dir)) != NULL) {
+   count += 256;
+   }
+   closedir(dir);
+   }
+   for (ret = 7; ret < 15; ret++) {
+   unsigned long expect_collision = 1ul << (ret * 2);
+   if (count < expect_collision)
+   break;
+   }
+   return ret;
+}
+
+int get_default_abbrev(void)
+{
+   if (default_abbrev < 0)
+   default_abbrev = init_default_abbrev();
+   return default_abbrev;
+}


[PATCH v6 2/4] ls-files: optionally recurse into submodules

2016-09-29 Thread Brandon Williams
Allow ls-files to recognize submodules in order to retrieve a list of
files from a repository's submodules.  This is done by forking off a
process to recursively call ls-files on all submodules. Use top-level
--super-prefix option to pass a path to the submodule which it can
use to prepend to output or pathspec matching logic.

Signed-off-by: Brandon Williams 
---
 Documentation/git-ls-files.txt |   8 +-
 builtin/ls-files.c | 138 -
 git.c  |   2 +-
 t/t3007-ls-files-recurse-submodules.sh | 100 
 4 files changed, 208 insertions(+), 40 deletions(-)
 create mode 100755 t/t3007-ls-files-recurse-submodules.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0d933ac..ea01d45 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -18,7 +18,8 @@ SYNOPSIS
[--exclude-per-directory=]
[--exclude-standard]
[--error-unmatch] [--with-tree=]
-   [--full-name] [--abbrev] [--] [...]
+   [--full-name] [--recurse-submodules]
+   [--abbrev] [--] [...]
 
 DESCRIPTION
 ---
@@ -137,6 +138,11 @@ a space) at the start of each line:
option forces paths to be output relative to the project
top directory.
 
+--recurse-submodules::
+   Recursively calls ls-files on each submodule in the repository.
+   Currently there is only support for the --cached mode without a
+   pathspec.
+
 --abbrev[=]::
Instead of showing the full 40-byte hexadecimal object
lines, show only a partial prefix.
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 00ea91a..63befed 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -14,6 +14,7 @@
 #include "resolve-undo.h"
 #include "string-list.h"
 #include "pathspec.h"
+#include "run-command.h"
 
 static int abbrev;
 static int show_deleted;
@@ -28,8 +29,10 @@ static int show_valid_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
+static int recurse_submodules;
 
 static const char *prefix;
+static const char *super_prefix;
 static int max_prefix_len;
 static int prefix_len;
 static struct pathspec pathspec;
@@ -68,11 +71,24 @@ static void write_eolinfo(const struct cache_entry *ce, 
const char *path)
 static void write_name(const char *name)
 {
/*
+* Prepend the super_prefix to name to construct the full_name to be
+* written.
+*/
+   struct strbuf full_name = STRBUF_INIT;
+   if (super_prefix) {
+   strbuf_addstr(_name, super_prefix);
+   strbuf_addstr(_name, name);
+   name = full_name.buf;
+   }
+
+   /*
 * With "--full-name", prefix_len=0; this caller needs to pass
 * an empty string in that case (a NULL is good for "").
 */
write_name_quoted_relative(name, prefix_len ? prefix : NULL,
   stdout, line_terminator);
+
+   strbuf_release(_name);
 }
 
 static void show_dir_entry(const char *tag, struct dir_entry *ent)
@@ -152,55 +168,84 @@ static void show_killed_files(struct dir_struct *dir)
}
 }
 
+/**
+ * Recursively call ls-files on a submodule
+ */
+static void show_gitlink(const struct cache_entry *ce)
+{
+   struct child_process cp = CHILD_PROCESS_INIT;
+   int status;
+
+   argv_array_pushf(, "--super-prefix=%s%s/",
+super_prefix ? super_prefix : "",
+ce->name);
+   argv_array_push(, "ls-files");
+   argv_array_push(, "--recurse-submodules");
+
+   cp.git_cmd = 1;
+   cp.dir = ce->name;
+   status = run_command();
+   if (status)
+   exit(status);
+}
+
 static void show_ce_entry(const char *tag, const struct cache_entry *ce)
 {
+   struct strbuf name = STRBUF_INIT;
int len = max_prefix_len;
+   if (super_prefix)
+   strbuf_addstr(, super_prefix);
+   strbuf_addstr(, ce->name);
 
if (len >= ce_namelen(ce))
die("git ls-files: internal error - cache entry not superset of 
prefix");
 
-   if (!match_pathspec(, ce->name, ce_namelen(ce),
-   len, ps_matched,
-   S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode)))
-   return;
+   if (recurse_submodules && S_ISGITLINK(ce->ce_mode)) {
+   show_gitlink(ce);
+   } else if (match_pathspec(, name.buf, name.len,
+ len, ps_matched,
+ S_ISDIR(ce->ce_mode) ||
+ S_ISGITLINK(ce->ce_mode))) {
+   if (tag && *tag && show_valid_bit &&
+   (ce->ce_flags & CE_VALID)) {
+   static char alttag[4];
+   memcpy(alttag, tag, 3);
+   if 

[PATCH v6 4/4] ls-files: add pathspec matching for submodules

2016-09-29 Thread Brandon Williams
Pathspecs can be a bit tricky when trying to apply them to submodules.
The main challenge is that the pathspecs will be with respect to the
superproject and not with respect to paths in the submodule.  The
approach this patch takes is to pass in the identical pathspec from the
superproject to the submodule in addition to the submodule-prefix, which
is the path from the root of the superproject to the submodule, and then
we can compare an entry in the submodule prepended with the
submodule-prefix to the pathspec in order to determine if there is a
match.

This patch also permits the pathspec logic to perform a prefix match against
submodules since a pathspec could refer to a file inside of a submodule.
Due to limitations in the wildmatch logic, a prefix match is only done
literally.  If any wildcard character is encountered we'll simply punt
and produce a false positive match.  More accurate matching will be done
once inside the submodule.  This is due to the superproject not knowing
what files could exist in the submodule.

Signed-off-by: Brandon Williams 
---
 Documentation/git-ls-files.txt |   3 +-
 builtin/ls-files.c |  27 +++--
 dir.c  |  46 +-
 dir.h  |   4 ++
 t/t3007-ls-files-recurse-submodules.sh | 108 -
 5 files changed, 175 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index ea01d45..51ec9a1 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -140,8 +140,7 @@ a space) at the start of each line:
 
 --recurse-submodules::
Recursively calls ls-files on each submodule in the repository.
-   Currently there is only support for the --cached mode without a
-   pathspec.
+   Currently there is only support for the --cached.
 
 --abbrev[=]::
Instead of showing the full 40-byte hexadecimal object
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 6f744ef..82ec811 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -215,6 +215,7 @@ static void show_gitlink(const struct cache_entry *ce)
 {
struct child_process cp = CHILD_PROCESS_INIT;
int status;
+   int i;
 
argv_array_pushf(, "--super-prefix=%s%s/",
 super_prefix ? super_prefix : "",
@@ -225,6 +226,15 @@ static void show_gitlink(const struct cache_entry *ce)
/* add supported options */
argv_array_pushv(, submodules_options.argv);
 
+   /*
+* Pass in the original pathspec args.  The submodule will be
+* responsible for prepending the 'submodule_prefix' prior to comparing
+* against the pathspec for matches.
+*/
+   argv_array_push(, "--");
+   for (i = 0; i < pathspec.nr; i++)
+   argv_array_push(, pathspec.items[i].original);
+
cp.git_cmd = 1;
cp.dir = ce->name;
status = run_command();
@@ -243,7 +253,8 @@ static void show_ce_entry(const char *tag, const struct 
cache_entry *ce)
if (len >= ce_namelen(ce))
die("git ls-files: internal error - cache entry not superset of 
prefix");
 
-   if (recurse_submodules && S_ISGITLINK(ce->ce_mode)) {
+   if (recurse_submodules && S_ISGITLINK(ce->ce_mode) &&
+   submodule_path_match(, name.buf, ps_matched)) {
show_gitlink(ce);
} else if (match_pathspec(, name.buf, name.len,
  len, ps_matched,
@@ -623,16 +634,20 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
die("ls-files --recurse-submodules does not support "
"--error-unmatch");
 
-   if (recurse_submodules && argc)
-   die("ls-files --recurse-submodules does not support pathspec");
-
parse_pathspec(, 0,
   PATHSPEC_PREFER_CWD |
   PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP,
   prefix, argv);
 
-   /* Find common prefix for all pathspec's */
-   max_prefix = common_prefix();
+   /*
+* Find common prefix for all pathspec's
+* This is used as a performance optimization which unfortunately cannot
+* be done when recursing into submodules
+*/
+   if (recurse_submodules)
+   max_prefix = NULL;
+   else
+   max_prefix = common_prefix();
max_prefix_len = max_prefix ? strlen(max_prefix) : 0;
 
/* Treat unmatching pathspec elements as errors */
diff --git a/dir.c b/dir.c
index 0ea235f..28e9736 100644
--- a/dir.c
+++ b/dir.c
@@ -207,8 +207,9 @@ int within_depth(const char *name, int namelen,
return 1;
 }
 
-#define DO_MATCH_EXCLUDE   1
-#define DO_MATCH_DIRECTORY 2
+#define DO_MATCH_EXCLUDE   (1<<0)
+#define DO_MATCH_DIRECTORY (1<<1)
+#define DO_MATCH_SUBMODULE (1<<2)
 
 /*
  * Does 'match' match the given 

[PATCH v6 1/4] git: make super-prefix option

2016-09-29 Thread Brandon Williams
Add a super-prefix environment variable 'GIT_INTERNAL_SUPER_PREFIX'
which can be used to specify a path from above a repository down to its
root.  When such a super-prefix is specified, the paths reported by Git
are prefixed with it to make them relative to that directory "above".
The paths given by the user on the command line
(e.g. "git subcmd --output-file=path/to/a/file" and pathspecs) are taken
relative to the directory "above" to match.

The immediate use of this option is by commands which have a
--recurse-submodule option in order to give context to submodules about
how they were invoked.  This option is currently only allowed for
builtins which support a super-prefix.

Signed-off-by: Brandon Williams 
---
 Documentation/git.txt |  6 ++
 cache.h   |  2 ++
 environment.c | 10 ++
 git.c | 26 ++
 4 files changed, 44 insertions(+)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 7913fc2..2188ae6 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -13,6 +13,7 @@ SYNOPSIS
 [--exec-path[=]] [--html-path] [--man-path] [--info-path]
 [-p|--paginate|--no-pager] [--no-replace-objects] [--bare]
 [--git-dir=] [--work-tree=] [--namespace=]
+[--super-prefix=]
  []
 
 DESCRIPTION
@@ -601,6 +602,11 @@ foo.bar= ...`) sets `foo.bar` to the empty string.
details.  Equivalent to setting the `GIT_NAMESPACE` environment
variable.
 
+--super-prefix=::
+   Currently for internal use only.  Set a prefix which gives a path from
+   above a repository down to its root.  One use is to give submodules
+   context about the superproject that invoked it.
+
 --bare::
Treat the repository as a bare repository.  If GIT_DIR
environment is not set, it is set to the current working
diff --git a/cache.h b/cache.h
index 3556326..8cf495d 100644
--- a/cache.h
+++ b/cache.h
@@ -408,6 +408,7 @@ static inline enum object_type object_type(unsigned int 
mode)
 #define GIT_NAMESPACE_ENVIRONMENT "GIT_NAMESPACE"
 #define GIT_WORK_TREE_ENVIRONMENT "GIT_WORK_TREE"
 #define GIT_PREFIX_ENVIRONMENT "GIT_PREFIX"
+#define GIT_SUPER_PREFIX_ENVIRONMENT "GIT_INTERNAL_SUPER_PREFIX"
 #define DEFAULT_GIT_DIR_ENVIRONMENT ".git"
 #define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY"
 #define INDEX_ENVIRONMENT "GIT_INDEX_FILE"
@@ -468,6 +469,7 @@ extern int get_common_dir_noenv(struct strbuf *sb, const 
char *gitdir);
 extern int get_common_dir(struct strbuf *sb, const char *gitdir);
 extern const char *get_git_namespace(void);
 extern const char *strip_namespace(const char *namespaced_ref);
+extern const char *get_super_prefix(void);
 extern const char *get_git_work_tree(void);
 
 /*
diff --git a/environment.c b/environment.c
index ca72464..13f3d70 100644
--- a/environment.c
+++ b/environment.c
@@ -100,6 +100,8 @@ static char *work_tree;
 static const char *namespace;
 static size_t namespace_len;
 
+static const char *super_prefix;
+
 static const char *git_dir, *git_common_dir;
 static char *git_object_dir, *git_index_file, *git_graft_file;
 int git_db_env, git_index_env, git_graft_env, git_common_dir_env;
@@ -120,6 +122,7 @@ const char * const local_repo_env[] = {
NO_REPLACE_OBJECTS_ENVIRONMENT,
GIT_REPLACE_REF_BASE_ENVIRONMENT,
GIT_PREFIX_ENVIRONMENT,
+   GIT_SUPER_PREFIX_ENVIRONMENT,
GIT_SHALLOW_FILE_ENVIRONMENT,
GIT_COMMON_DIR_ENVIRONMENT,
NULL
@@ -222,6 +225,13 @@ const char *strip_namespace(const char *namespaced_ref)
return namespaced_ref + namespace_len;
 }
 
+const char *get_super_prefix(void)
+{
+   if (!super_prefix)
+   super_prefix = getenv(GIT_SUPER_PREFIX_ENVIRONMENT);
+   return super_prefix;
+}
+
 static int git_work_tree_initialized;
 
 /*
diff --git a/git.c b/git.c
index 1c61151..f756b62 100644
--- a/git.c
+++ b/git.c
@@ -164,6 +164,20 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
setenv(GIT_WORK_TREE_ENVIRONMENT, cmd, 1);
if (envchanged)
*envchanged = 1;
+   } else if (!strcmp(cmd, "--super-prefix")) {
+   if (*argc < 2) {
+   fprintf(stderr, "No prefix given for 
--super-prefix.\n" );
+   usage(git_usage_string);
+   }
+   setenv(GIT_SUPER_PREFIX_ENVIRONMENT, (*argv)[1], 1);
+   if (envchanged)
+   *envchanged = 1;
+   (*argv)++;
+   (*argc)--;
+   } else if (skip_prefix(cmd, "--super-prefix=", )) {
+   setenv(GIT_SUPER_PREFIX_ENVIRONMENT, cmd, 1);
+   if (envchanged)
+   *envchanged = 1;
} else if (!strcmp(cmd, "--bare")) {
char *cwd = 

[PATCH v6 3/4] ls-files: pass through safe options for --recurse-submodules

2016-09-29 Thread Brandon Williams
Pass through some known-safe options when recursing into submodules.
(--cached, --stage, -v, -t, -z, --debug, --eol)

Other options are compiled into an argv_array but if an unsafe option is
given the caller will be errored out.

Signed-off-by: Brandon Williams 
---
 builtin/ls-files.c | 51 --
 t/t3007-ls-files-recurse-submodules.sh | 17 
 2 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 63befed..6f744ef 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -30,6 +30,7 @@ static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
 static int recurse_submodules;
+static struct argv_array submodules_options = ARGV_ARRAY_INIT;
 
 static const char *prefix;
 static const char *super_prefix;
@@ -168,6 +169,45 @@ static void show_killed_files(struct dir_struct *dir)
}
 }
 
+/*
+ * Compile an argv_array with all of the options supported by 
--recurse_submodules
+ */
+static void compile_submodule_options(const struct dir_struct *dir, int 
show_tag)
+{
+   if (line_terminator == '\0')
+   argv_array_push(_options, "-z");
+   if (show_tag)
+   argv_array_push(_options, "-t");
+   if (show_valid_bit)
+   argv_array_push(_options, "-v");
+   if (show_cached)
+   argv_array_push(_options, "--cached");
+   if (show_deleted)
+   argv_array_push(_options, "--deleted");
+   if (show_modified)
+   argv_array_push(_options, "--modified");
+   if (show_others)
+   argv_array_push(_options, "--others");
+   if (dir->flags & DIR_SHOW_IGNORED)
+   argv_array_push(_options, "--ignored");
+   if (show_stage)
+   argv_array_push(_options, "--stage");
+   if (show_killed)
+   argv_array_push(_options, "--killed");
+   if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES)
+   argv_array_push(_options, "--directory");
+   if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES))
+   argv_array_push(_options, "--empty-directory");
+   if (show_unmerged)
+   argv_array_push(_options, "--unmerged");
+   if (show_resolve_undo)
+   argv_array_push(_options, "--resolve-undo");
+   if (show_eol)
+   argv_array_push(_options, "--eol");
+   if (debug_mode)
+   argv_array_push(_options, "--debug");
+}
+
 /**
  * Recursively call ls-files on a submodule
  */
@@ -182,6 +222,9 @@ static void show_gitlink(const struct cache_entry *ce)
argv_array_push(, "ls-files");
argv_array_push(, "--recurse-submodules");
 
+   /* add supported options */
+   argv_array_pushv(, submodules_options.argv);
+
cp.git_cmd = 1;
cp.dir = ce->name;
status = run_command();
@@ -567,11 +610,13 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
if (require_work_tree && !is_inside_work_tree())
setup_work_tree();
 
+   if (recurse_submodules)
+   compile_submodule_options(, show_tag);
+
if (recurse_submodules &&
-   (show_stage || show_deleted || show_others || show_unmerged ||
+   (show_deleted || show_others || show_unmerged ||
 show_killed || show_modified || show_resolve_undo ||
-show_valid_bit || show_tag || show_eol || with_tree ||
-(line_terminator == '\0')))
+with_tree))
die("ls-files --recurse-submodules unsupported mode");
 
if (recurse_submodules && error_unmatch)
diff --git a/t/t3007-ls-files-recurse-submodules.sh 
b/t/t3007-ls-files-recurse-submodules.sh
index b5a53c3..e76fa30 100755
--- a/t/t3007-ls-files-recurse-submodules.sh
+++ b/t/t3007-ls-files-recurse-submodules.sh
@@ -34,6 +34,18 @@ test_expect_success 'ls-files correctly outputs files in 
submodule' '
test_cmp expect actual
 '
 
+test_expect_success 'ls-files correctly outputs files in submodule with -z' '
+   lf_to_nul >expect <<-\EOF &&
+   .gitmodules
+   a
+   b/b
+   submodule/c
+   EOF
+
+   git ls-files --recurse-submodules -z >actual &&
+   test_cmp expect actual
+'
+
 test_expect_success 'ls-files does not output files not added to a repo' '
cat >expect <<-\EOF &&
.gitmodules
@@ -86,15 +98,10 @@ test_incompatible_with_recurse_submodules () {
"
 }
 
-test_incompatible_with_recurse_submodules -z
-test_incompatible_with_recurse_submodules -v
-test_incompatible_with_recurse_submodules -t
 test_incompatible_with_recurse_submodules --deleted
 test_incompatible_with_recurse_submodules --modified
 test_incompatible_with_recurse_submodules --others
-test_incompatible_with_recurse_submodules --stage
 test_incompatible_with_recurse_submodules --killed
 test_incompatible_with_recurse_submodules --unmerged

Re: [PATCH v2 02/11] i18n: add--interactive: mark simple here documents for translation

2016-09-29 Thread Junio C Hamano
Jakub Narębski  writes:

> W dniu 29.09.2016 o 19:05, Junio C Hamano pisze:
>> Vasco Almeida  writes:
>> 
>>> On the other hand, would it make sense to translate these commands? If
>>> so, we would mark for translation the commands name of @cmd in
>>> main_loop().
>>>
>>>  sub main_loop {
>>> -   my @cmd = ([ 'status', \_cmd, ],
>>> -  [ 'update', \_cmd, ],
>>> -  [ 'revert', \_cmd, ],
>>> -  [ 'add untracked', \_untracked_cmd, ],
>>> -  [ 'patch', \_update_cmd, ],
>>> -  [ 'diff', \_cmd, ],
>>> -  [ 'quit', \_cmd, ],
>>> -  [ 'help', \_cmd, ],
>>> +   my @cmd = ([ __('status'), \_cmd, ],
>>> +  [ __('update'), \_cmd, ],
>>> +  [ __('revert'), \_cmd, ],
>>> +  [ __('add untracked'), \_untracked_cmd, ],
>>> +  [ __('patch'), \_update_cmd, ],
>>> +  [ __('diff'), \_cmd, ],
>>> +  [ __('quit'), \_cmd, ],
>>> +  [ __('help'), \_cmd, ],
>> 
>> I don't know offhand.  If the code to prompt and accept the command
>> given by the user can take the translated word (or a prefix of it),
>> theoretically I would say it could be made to work, but to me it is
>> dubious the benefit outweighs its downsides.  It would make teaching
>> Git and troubleshooting over the phone harder, I would guess.
>> 
>>  A: "Hi, I am in a 'git add -i' session."
>>  B: "Give 's' at the prompt."
>>  A: "My Git does not seem to take 's' as a valid command."
>>  B: "What? I've never seen that problem."
>>  ... back and forth wastes 10 minutes ...
>>  A: "By the way, I am running Git in Portuguese."
>
> Also, for one-letter commands to work (there is setting where you
> don't even need to press enter, IIRC) all those translations would
> have to be chosen to begin with different letter, isn't it?

The original was written with an explicit expectation that these
command words will not be translated adn chose words that do not
share the first letter exactly for that reason.

Having said that, if somebody is willing to i18n the command words,
I'd expect that the command line prompt interaction would be updated
to take the unique prefix instead of the "first byte", and if that
happens, I think the resulting system would at least be internally
consistent.

It is still dubious to me if the benefit of i18n outweighs its
downsides, though.





Re: [PATCH v2 02/11] i18n: add--interactive: mark simple here documents for translation

2016-09-29 Thread Jakub Narębski
W dniu 29.09.2016 o 19:05, Junio C Hamano pisze:
> Vasco Almeida  writes:
> 
>> On the other hand, would it make sense to translate these commands? If
>> so, we would mark for translation the commands name of @cmd in
>> main_loop().
>>
>>  sub main_loop {
>> -   my @cmd = ([ 'status', \_cmd, ],
>> -  [ 'update', \_cmd, ],
>> -  [ 'revert', \_cmd, ],
>> -  [ 'add untracked', \_untracked_cmd, ],
>> -  [ 'patch', \_update_cmd, ],
>> -  [ 'diff', \_cmd, ],
>> -  [ 'quit', \_cmd, ],
>> -  [ 'help', \_cmd, ],
>> +   my @cmd = ([ __('status'), \_cmd, ],
>> +  [ __('update'), \_cmd, ],
>> +  [ __('revert'), \_cmd, ],
>> +  [ __('add untracked'), \_untracked_cmd, ],
>> +  [ __('patch'), \_update_cmd, ],
>> +  [ __('diff'), \_cmd, ],
>> +  [ __('quit'), \_cmd, ],
>> +  [ __('help'), \_cmd, ],
> 
> I don't know offhand.  If the code to prompt and accept the command
> given by the user can take the translated word (or a prefix of it),
> theoretically I would say it could be made to work, but to me it is
> dubious the benefit outweighs its downsides.  It would make teaching
> Git and troubleshooting over the phone harder, I would guess.
> 
>  A: "Hi, I am in a 'git add -i' session."
>  B: "Give 's' at the prompt."
>  A: "My Git does not seem to take 's' as a valid command."
>  B: "What? I've never seen that problem."
>  ... back and forth wastes 10 minutes ...
>  A: "By the way, I am running Git in Portuguese."

Also, for one-letter commands to work (there is setting where you
don't even need to press enter, IIRC) all those translations would
have to be chosen to begin with different letter, isn't it?

Best,
-- 
Jakub Narębski



Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Junio C Hamano
Lars Schneider  writes:

> We discussed that issue in v4 and v6:
> http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/
> http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/
>
> My impression was that you don't want Git to wait for the filter process.
> If Git waits for the filter process - how long should Git wait?

I am not sure where you got that impression.  I did say that I do
not want Git to _KILL_ my filter process.  That does not mean I want
Git to go away without waiting for me.

If the filter process refuses to die forever when Git told it to
shutdown (by closing the pipe to it, for example), that filter
process is simply buggy.  I think we want users to become aware of
that, instead of Git leaving it behind, which essentially is to
sweep the problem under the rug.

I agree with what Peff said elsewhere in the thread; if a filter
process wants to take time to clean things up while letting Git
proceed, it can do its own process management, but I think it is
sensible for Git to wait the filter process it directly spawned.



Re: [PATCH v2 01/11] i18n: add--interactive: mark strings for translation

2016-09-29 Thread Jakub Narębski
W dniu 26.09.2016 o 00:52, Junio C Hamano pisze:
> Vasco Almeida  writes: 

>>  my $status_fmt = '%12s %12s %s';
>> -my $status_head = sprintf($status_fmt, 'staged', 'unstaged', 'path');
>> +my $status_head = sprintf($status_fmt, __('staged'), __('unstaged'), 
>> __('path'));
> 
> Wouldn't it make sense to allow translators to tweak $status_fmt if
> you are allowing the earlier elements that are formatted with %12s,
> as their translation may not fit within that width, in which case
> they may want to make these columns wider?

Perl's printf, sprintf, and format think all codepoints take up 1 print
column; also, without "use utf8;" they all think that one byte is one
codepoint (as it is in latin1 encoding).

Many codepoints can take 0 print columns (zero-width joiners), or 2
columns (so called wide characters).

The proper way to justify Unicode output is described e.g. in
http://www.perl.com/pub/2012/05/perlunicook-unicode-column-width-for-printing.html

  use Unicode::GCString;

  my $gcs  = Unicode::GCString->new($str);  # grapheme cluster string
  my $cols = $gcs->columns;
  my $pad  = " " x (12 - $cols);

  $status_head .= $str . $pad . " ";

Though we would need to provide fallback if there is no perl-i18n,
no extended Unicode support in Perl (also, if we are not using
gettext).


So it is even more complicated.

>>  prompt_yesno(
>> -'Your edited hunk does not apply. Edit again '
>> -. '(saying "no" discards!) [y/n]? '
>> +# TRANSLATORS: do not translate [y/n]
>> +# The program will only accept that input
>> +# at this point.
>> +__('Your edited hunk does not apply. Edit again 
>> '
>> +   . '(saying "no" discards!) [y/n]? ')
> 
> Not just [y/n], but "no" in "saying no discards!" also needs to
> stay, no?  I wonder if it is a good idea to lose the TRANSLATORS
> comment by ejecting "[y/n]" outside the "__()" construct here.

Actually the message to translators should also mention that if
the translation of "no" doesn't begin with 'n', then one needs
to say something like '(saying "n" for "no" discards!)'.

Best,
-- 
Jakub Narębski



Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

> I don't necessarily agree, though, that the timing of filter-process
> cleanup needs to be part of the public interface. So in your list:
>
>> 3) Git waits until the filter process finishes.
>
> That seems simple and elegant, but I can think of reasons we might not
> want to wait (e.g., if the filter has to do some maintenance task and
> does not the user to have to wait).
>
> OTOH, we already face this in git, and we solve it by explicitly
> backgrounding the maintenance task (i.e., auto-gc). So one could argue
> that it is the responsibility of the filter process to manage its own
> processes. It certainly makes the interaction with git simpler.

Yup, that summarizes my thinking a lot better than I managed to do
in the previous message.



Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Junio C Hamano
Jakub Narębski  writes:

> Or even better: make filter driver write its pid to pidfile, and then
> "wait $(cat rot13-filter.pid)".  That's what we do in lib-git-daemon.sh
> (I think).

I am not sure if "wait"ing on a random process that is not a direct
child is a reasonable thing to do, but I like the direction.

Communicate with a pidfile and wait until "kill -0 $that_pid" fails,
or something like that, would be clean enough.





Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Junio C Hamano
Lars Schneider  writes:

> A pragmatic approach:
>
> I could drop the "STOP" message that the filter writes to the log
> on exit and everything would work as is. We could argue that this 
> is OK because Git doesn't care anyways if the filter process has 
> stopped or not.

That would mean you can leave the process running while the test
framework tries to remove the trash directory when we are done,
creating the same bug J6t mentioned in the thread, no?



Re: [PATCH 2/4] t13xx: do not assume system config is empty

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

> Good description.
>
> Signed-off-by: Jeff King 
>
> of course.
>
>> @@ -1304,6 +1315,7 @@ test_expect_success '--show-origin with --get-regexp' '
>>  file:$HOME/.gitconfig   user.global true
>>  file:.git/configuser.local true
>>  EOF
>> +GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
>>  git config --show-origin --get-regexp "user\.[g|l].*" >output &&
>>  test_cmp expect output
>>  '
>
> This is one is trying to do a multi-file lookup, but we couldn't look in
> the system config before. But to naturally extend it, it ought to look
> like this on top:
>
> diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
> index d2476a8..4dd5ce3 100755
> --- a/t/t1300-repo-config.sh
> +++ b/t/t1300-repo-config.sh
> @@ -1310,11 +1310,12 @@ test_expect_success '--show-origin with single file' '
>  
>  test_expect_success '--show-origin with --get-regexp' '
>   cat >expect <<-EOF &&
> + file:$HOME/etc-gitconfiguser.system true
>   file:$HOME/.gitconfig   user.global true
>   file:.git/configuser.local true
>   EOF
>   GIT_ETC_GITCONFIG=$HOME/etc-gitconfig \
> - git config --show-origin --get-regexp "user\.[g|l].*" >output &&
> + git config --show-origin --get-regexp "user\.[g|l|s].*" >output &&
>   test_cmp expect output
>  '

Makes sense modulo you inherited useless vertical bars from the
original.  I'll squash something like that in but without || ;-)

Thanks.


Re: [PATCH 2/4] t13xx: do not assume system config is empty

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 02:03:39PM -0700, Junio C Hamano wrote:

> > -   git config --show-origin --get-regexp "user\.[g|l].*" >output &&
> > +   git config --show-origin --get-regexp "user\.[g|l|s].*" >output &&
> > test_cmp expect output
> >  '
> 
> Makes sense modulo you inherited useless vertical bars from the
> original.  I'll squash something like that in but without || ;-)

Heh, I glossed over that completely. Thanks.

-Peff


[PATCH v2 7/9] t1300: be explicit in local configuration tests

2016-09-29 Thread Junio C Hamano
Many tests in this script prepare variable settings in the
repository local configuration and expects "--list" to report only
the ones from the repository local configuration.

This happened to work while we were running out tests under
GIT_CONFIG_NOSYSTEM and/or with an empty system-wide configuration
file, but as we will soon make our fake system-wide configuration
non-empty, prepare for that change by explicitly telling the command
to look only at "--local" configuration.

Signed-off-by: Junio C Hamano 
---
 t/t1300-repo-config.sh | 80 +-
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 2a15cd4d150d..8979212946c0 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -245,18 +245,18 @@ test_expect_success 'multivar' '
 '
 
 test_expect_success 'non-match' '
-   git config --get nextsection.nonewline !for
+   git config --local --get nextsection.nonewline !for
 '
 
 test_expect_success 'non-match value' '
echo wow >expect &&
-   git config --get nextsection.nonewline !for >actual &&
+   git config --local --get nextsection.nonewline !for >actual &&
test_cmp expect actual
 '
 
 test_expect_success 'multi-valued get returns final one' '
echo "wow2 for me" >expect &&
-   git config --get nextsection.nonewline >actual &&
+   git config --local --get nextsection.nonewline >actual &&
test_cmp expect actual
 '
 
@@ -265,7 +265,7 @@ test_expect_success 'multi-valued get-all returns all' '
wow
wow2 for me
EOF
-   git config --get-all nextsection.nonewline >actual &&
+   git config --local --get-all nextsection.nonewline >actual &&
test_cmp expect actual
 '
 
@@ -341,7 +341,7 @@ version.1.2.3eX.alpha=beta
 EOF
 
 test_expect_success 'working --list' '
-   git config --list > output &&
+   git config --local --list > output &&
test_cmp expect output
 '
 
@@ -361,7 +361,7 @@ version.1.2.3eX.alpha
 EOF
 
 test_expect_success '--name-only --list' '
-   git config --name-only --list >output &&
+   git config --local --name-only --list >output &&
test_cmp expect output
 '
 
@@ -371,7 +371,7 @@ nextsection.nonewline wow2 for me
 EOF
 
 test_expect_success '--get-regexp' '
-   git config --get-regexp in >output &&
+   git config --local --get-regexp in >output &&
test_cmp expect output
 '
 
@@ -381,7 +381,7 @@ nextsection.nonewline
 EOF
 
 test_expect_success '--name-only --get-regexp' '
-   git config --name-only --get-regexp in >output &&
+   git config --local --name-only --get-regexp in >output &&
test_cmp expect output
 '
 
@@ -392,7 +392,7 @@ EOF
 
 test_expect_success '--add' '
git config --add nextsection.nonewline "wow4 for you" &&
-   git config --get-all nextsection.nonewline > output &&
+   git config --local --get-all nextsection.nonewline > output &&
test_cmp expect output
 '
 
@@ -404,45 +404,45 @@ cat > .git/config << EOF
 EOF
 
 test_expect_success 'get variable with no value' '
-   git config --get novalue.variable ^$
+   git config --local --get novalue.variable ^$
 '
 
 test_expect_success 'get variable with empty value' '
-   git config --get emptyvalue.variable ^$
+   git config --local --get emptyvalue.variable ^$
 '
 
 echo novalue.variable > expect
 
 test_expect_success 'get-regexp variable with no value' '
-   git config --get-regexp novalue > output &&
+   git config --local --get-regexp novalue > output &&
test_cmp expect output
 '
 
 echo 'novalue.variable true' > expect
 
 test_expect_success 'get-regexp --bool variable with no value' '
-   git config --bool --get-regexp novalue > output &&
+   git config --local --bool --get-regexp novalue > output &&
test_cmp expect output
 '
 
 echo 'emptyvalue.variable ' > expect
 
 test_expect_success 'get-regexp variable with empty value' '
-   git config --get-regexp emptyvalue > output &&
+   git config --local --get-regexp emptyvalue > output &&
test_cmp expect output
 '
 
 echo true > expect
 
 test_expect_success 'get bool variable with no value' '
-   git config --bool novalue.variable > output &&
+   git config --local --bool novalue.variable > output &&
test_cmp expect output
 '
 
 echo false > expect
 
 test_expect_success 'get bool variable with empty value' '
-   git config --bool emptyvalue.variable > output &&
+   git config --local --bool emptyvalue.variable > output &&
test_cmp expect output
 '
 
@@ -683,15 +683,15 @@ test_expect_success numbers '
git config mega.ton 1m &&
echo 1024 >expect &&
echo 1048576 >>expect &&
-   git config --int --get kilo.gram >actual &&
-   git config --int --get mega.ton >>actual &&
+   git config --local --int --get kilo.gram >actual &&
+   git 

[PATCH v2 5/9] t1300: disable system-wide config for tests that wants to read from -c

2016-09-29 Thread Junio C Hamano
This test wants to do

git -c x.two=2 config --get-regexp ^x\.*

and see x.two that came from the one-shot configuration in its
output.  This form cannot be limited with "--local", as it limits
the input to the local configuration file and makes these one-shot
settings ignored.  At this point, the test knows that there is no
variable that match x.* in its local configuration, and it also was
OK to assume that there is nothing in the system-wide config or
global one.

Make sure that assumption holds by using the GIT_CONFIG_NOSYSTEM
environment, as we may add anything to t/gitconfig-for-test later.

Signed-off-by: Junio C Hamano 
---
 t/t1300-repo-config.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 940469339bd2..95734034e0d5 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1093,6 +1093,7 @@ test_expect_success 'multiple git -c appends config' '
x.one 1
x.two 2
EOF
+   GIT_CONFIG_NOSYSTEM=1 \
git -c x.one=1 x >actual &&
test_cmp expect actual
 '
-- 
2.10.0-589-g5adf4e1



[PATCH v2 9/9] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Junio C Hamano
As Peff said, responding in a thread started by Linus's suggestion
to raise the default abbreviation to 12 hexdigits:

I actually think "12" might be sane for a long time. That's 48
bits of sha1, so we'd expect a 50% chance of a single collision
at 2^24, or 16 million.  The biggest repository I know about (in
number of objects) is the one holding all of the objects for all
of the forks of torvalds/linux on GitHub. It's at about 15
million objects.

Which seems close, but remember that's the size where we expect
to see a single collision. They don't become common until much
later (I didn't compute an exact number, but Linus's 16x sounds
about right). I know that the growth of the kernel isn't really
linear, but I think the need to bump to "13" might not just be
decades, but possibly a century or more.

So 12 seems reasonable, and the only downside for it (or for "13", for
that matter) is a few extra bytes. I dunno, maybe people will really
hate that, but I have a feeling these are mostly cut-and-pasted anyway.

And this does exactly that.

Keep the tests working by explicitly asking for the old 7 hexdigits
setting in the fake system-wide configuration file used for tests.

Signed-off-by: Junio C Hamano 
---
 environment.c| 2 +-
 t/gitconfig-for-test | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/environment.c b/environment.c
index ca72464a9850..25daddbc13d6 100644
--- a/environment.c
+++ b/environment.c
@@ -16,7 +16,7 @@ int trust_executable_bit = 1;
 int trust_ctime = 1;
 int check_stat = 1;
 int has_symlinks = 1;
-int minimum_abbrev = 4, default_abbrev = 7;
+int minimum_abbrev = 4, default_abbrev = 12;
 int ignore_case;
 int assume_unchanged;
 int prefer_symlink_refs;
diff --git a/t/gitconfig-for-test b/t/gitconfig-for-test
index 4598885ed5c3..8c284425d725 100644
--- a/t/gitconfig-for-test
+++ b/t/gitconfig-for-test
@@ -4,3 +4,6 @@
 ;; [user]
 ;; name = A U Thor
 ;; email = aut...@example.com
+
+[core]
+   abbrev = 7
-- 
2.10.0-589-g5adf4e1



[PATCH v2 8/9] worktree: honor configuration variables

2016-09-29 Thread Junio C Hamano
The command accesses default_abbrev (defined in environment.c and is
updated via core.abbrev configuration), but never makes any call to
git_config().  The output from "worktree list" ignores the abbrev
setting for this reason.

Make a call to git_config() to read the default set of configuration
variables at the beginning of the command.

Signed-off-by: Junio C Hamano 
---
 builtin/worktree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 6dcf7bd9d270..5c4854d3e4a6 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -528,6 +528,8 @@ int cmd_worktree(int ac, const char **av, const char 
*prefix)
OPT_END()
};
 
+   git_config(git_default_config, NULL);
+
if (ac < 2)
usage_with_options(worktree_usage, options);
if (!prefix)
-- 
2.10.0-589-g5adf4e1



[PATCH v2 4/9] t1300: check also system-wide configuration file in --show-origin tests

2016-09-29 Thread Junio C Hamano
From: Jeff King 

Because we used to run our tests with GIT_CONFIG_NOSYSTEM, these did
not test that the system-wide configuration file is also read and
shown as one of the origins.  Create a custom/fake system-wide
configuration file and make sure it appears in the output, using the
newly introduced GIT_CONFIG_SYSTEM_PATH mechanism.

Signed-off-by: Junio C Hamano 
---
 t/t1300-repo-config.sh | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 1b3f6f4854f9..940469339bd2 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1236,6 +1236,11 @@ test_expect_success 'set up --show-origin tests' '
[user]
relative = include
EOF
+   cat >"$HOME"/etc-gitconfig <<-\EOF &&
+   [user]
+   system = true
+   override = system
+   EOF
cat >"$HOME"/.gitconfig <<-EOF &&
[user]
global = true
@@ -1254,6 +1259,8 @@ test_expect_success 'set up --show-origin tests' '
 
 test_expect_success '--show-origin with --list' '
cat >expect <<-EOF &&
+   file:$HOME/etc-gitconfiguser.system=true
+   file:$HOME/etc-gitconfiguser.override=system
file:$HOME/.gitconfig   user.global=true
file:$HOME/.gitconfig   user.override=global
file:$HOME/.gitconfig   
include.path=$INCLUDE_DIR/absolute.include
@@ -1264,13 +1271,16 @@ test_expect_success '--show-origin with --list' '
file:.git/../include/relative.include   user.relative=include
command line:   user.cmdline=true
EOF
+   GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
git -c user.cmdline=true config --list --show-origin >output &&
test_cmp expect output
 '
 
 test_expect_success '--show-origin with --list --null' '
cat >expect <<-EOF &&
-   file:$HOME/.gitconfigQuser.global
+   file:$HOME/etc-gitconfigQuser.system
+   trueQfile:$HOME/etc-gitconfigQuser.override
+   systemQfile:$HOME/.gitconfigQuser.global
trueQfile:$HOME/.gitconfigQuser.override
globalQfile:$HOME/.gitconfigQinclude.path

$INCLUDE_DIR/absolute.includeQfile:$INCLUDE_DIR/absolute.includeQuser.absolute
@@ -1281,6 +1291,7 @@ test_expect_success '--show-origin with --list --null' '
includeQcommand line:Quser.cmdline
trueQ
EOF
+   GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
git -c user.cmdline=true config --null --list --show-origin >output.raw 
&&
nul_to_q output &&
# The here-doc above adds a newline that the --null output would not
@@ -1304,6 +1315,7 @@ test_expect_success '--show-origin with --get-regexp' '
file:$HOME/.gitconfig   user.global true
file:.git/configuser.local true
EOF
+   GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
git config --show-origin --get-regexp "user\.[g|l].*" >output &&
test_cmp expect output
 '
@@ -1312,6 +1324,7 @@ test_expect_success '--show-origin getting a single key' '
cat >expect <<-\EOF &&
file:.git/configlocal
EOF
+   GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
git config --show-origin user.override >output &&
test_cmp expect output
 '
-- 
2.10.0-589-g5adf4e1



[PATCH v2 6/9] t1300: take contents of system-wide configuration into account in "--list" test

2016-09-29 Thread Junio C Hamano
One of the "git config" test tries to see that the command run
without a valid repository still shows non-repository specific
configuration.  As we are planning to later make the system-wide
file non-empty, prepare for the change by expecting to see the
contents from it.

Signed-off-by: Junio C Hamano 
---
 t/t1300-repo-config.sh | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 95734034e0d5..2a15cd4d150d 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -344,10 +344,11 @@ test_expect_success 'working --list' '
git config --list > output &&
test_cmp expect output
 '
-cat > expect << EOF
-EOF
 
-test_expect_success '--list without repo produces empty output' '
+test_expect_success '--list without repo shows only system-wide and global' '
+   # The global one aka $HOME/.gitconfig is missing,
+   # so we do not have to worry about it.
+   git config --system --list >expect &&
git --git-dir=nonexistent config --list >output &&
test_cmp expect output
 '
-- 
2.10.0-589-g5adf4e1



[PATCH v2 2/9] t1300: always compare expect to actual

2016-09-29 Thread Junio C Hamano
The two arguments to the test_cmp helper should always have the
expected output first and then the actual one, so that an unmet
expectation would appear as

-what we wanted to see
+what we actually saw

in its output.

Signed-off-by: Junio C Hamano 
---
 t/t1300-repo-config.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 0543b62227bf..1b3f6f4854f9 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -936,7 +936,7 @@ EOF
 
 test_expect_success 'value continued on next line' '
git config --list > result &&
-   test_cmp result expect
+   test_cmp expect result
 '
 
 cat > .git/config <<\EOF
-- 
2.10.0-589-g5adf4e1



[PATCH v2 1/9] config: allow customizing /etc/gitconfig location with an environment

2016-09-29 Thread Junio C Hamano
We introduced GIT_CONFIG_NOSYSTEM environment variable at ab88c363
("allow suppressing of global and system config", 2008-02-06),
primarily to protect our tests from random set of configuration
variables the system administrators would put in /etc/gitconfig
file.

Introduce a new environment variable GIT_CONFIG_SYSTEM_PATH, and allow
the users to specify a file that is used instead of /etc/gitconfig
to read (and write) the system-wide configuration.  By doing so, we
can force our tests to honor certain configuration settings by
default by pointing GIT_CONFIG_SYSTEM_PATH at our own, in addition to the
existing GIT_CONFIG_NOSYSTEM mechanism.

Signed-off-by: Junio C Hamano 
---
 cache.h|  1 +
 config.c   |  2 ++
 t/gitconfig-for-test   |  6 ++
 t/t1300-repo-config.sh | 15 +++
 t/test-lib.sh  |  4 ++--
 5 files changed, 26 insertions(+), 2 deletions(-)
 create mode 100644 t/gitconfig-for-test

diff --git a/cache.h b/cache.h
index b0dae4bac1a1..d4b689f386d6 100644
--- a/cache.h
+++ b/cache.h
@@ -408,6 +408,7 @@ static inline enum object_type object_type(unsigned int 
mode)
 #define GIT_NAMESPACE_ENVIRONMENT "GIT_NAMESPACE"
 #define GIT_WORK_TREE_ENVIRONMENT "GIT_WORK_TREE"
 #define GIT_PREFIX_ENVIRONMENT "GIT_PREFIX"
+#define GIT_CONFIG_SYSTEM_PATH_ENVIRONMENT "GIT_CONFIG_SYSTEM_PATH"
 #define DEFAULT_GIT_DIR_ENVIRONMENT ".git"
 #define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY"
 #define INDEX_ENVIRONMENT "GIT_INDEX_FILE"
diff --git a/config.c b/config.c
index 0dfed682b868..096bb754aad7 100644
--- a/config.c
+++ b/config.c
@@ -1253,6 +1253,8 @@ const char *git_etc_gitconfig(void)
 {
static const char *system_wide;
if (!system_wide)
+   system_wide = getenv(GIT_CONFIG_SYSTEM_PATH_ENVIRONMENT);
+   if (!system_wide)
system_wide = system_path(ETC_GITCONFIG);
return system_wide;
 }
diff --git a/t/gitconfig-for-test b/t/gitconfig-for-test
new file mode 100644
index ..4598885ed5c3
--- /dev/null
+++ b/t/gitconfig-for-test
@@ -0,0 +1,6 @@
+;; This file is used as if it were /etc/gitconfig while running the
+;; test scripts in this directory.
+;;
+;; [user]
+;; name = A U Thor
+;; email = aut...@example.com
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 923bfc5a2606..0543b62227bf 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1372,4 +1372,19 @@ test_expect_success !MINGW '--show-origin blob ref' '
test_cmp expect output
 '
 
+test_expect_success 'system-wide configuration' '
+   system="$TRASH_DIRECTORY/system-wide" &&
+   >"$system" &&
+   git config -f "$system" --add frotz.nitfol xyzzy &&
+
+   git config -f "$system" frotz.nitfol >expect &&
+   GIT_CONFIG_SYSTEM_PATH="$system" \
+   git config --system frotz.nitfol >actual &&
+
+   GIT_CONFIG_SYSTEM_PATH="$system" \
+   git config --system --replace-all frotz.nitfol blorb &&
+   echo blorb >expect &&
+   GIT_CONFIG_SYSTEM_PATH="$system" git config --system frotz.nitfol 
>actual
+'
+
 test_done
diff --git a/t/test-lib.sh b/t/test-lib.sh
index ac56512a1c5e..b811e4c70273 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -851,9 +851,9 @@ else # normal case, use ../bin-wrappers only unless 
$with_dashes:
fi
 fi
 GIT_TEMPLATE_DIR="$GIT_BUILD_DIR"/templates/blt
-GIT_CONFIG_NOSYSTEM=1
+GIT_CONFIG_SYSTEM_PATH="$GIT_BUILD_DIR/t/gitconfig-for-test"
 GIT_ATTR_NOSYSTEM=1
-export PATH GIT_EXEC_PATH GIT_TEMPLATE_DIR GIT_CONFIG_NOSYSTEM 
GIT_ATTR_NOSYSTEM
+export PATH GIT_EXEC_PATH GIT_TEMPLATE_DIR GIT_CONFIG_SYSTEM_PATH 
GIT_ATTR_NOSYSTEM
 
 if test -z "$GIT_TEST_CMP"
 then
-- 
2.10.0-589-g5adf4e1



[PATCH v2 3/9] t1308: ignore system-wide config in the iteration test

2016-09-29 Thread Junio C Hamano
We do not want to keep track of the exact contents of the fake
system-wide t/gitconfig-for-test configuration file.  Keep ignoring
it as we used to.

Signed-off-by: Junio C Hamano 
---
 t/t1308-config-set.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t1308-config-set.sh b/t/t1308-config-set.sh
index 7655c94c2801..5d5adb1efd8e 100755
--- a/t/t1308-config-set.sh
+++ b/t/t1308-config-set.sh
@@ -260,6 +260,7 @@ test_expect_success 'iteration shows correct origins' '
name=
scope=cmdline
EOF
+   GIT_CONFIG_NOSYSTEM=1 \
GIT_CONFIG_PARAMETERS=$cmdline_config test-config iterate >actual &&
test_cmp expect actual
 '
-- 
2.10.0-589-g5adf4e1



[PATCH v2 0/9] allow customizing /etc/gitconfig location with an environment

2016-09-29 Thread Junio C Hamano
This ended up growing quite a bit, and I mostly hate it.

 - Patch 1 introduces GIT_CONFIG_SYSTEM_PATH environment variable
   that lets you point at a file other than /etc/gitconfig to
   pretend that your file is the system-wide configuration.

 - Patch 2 is a small bugfix.

 - Patches 3-7 are updates to 1300 and 1308, i.e. tests for "git
   config", to make them more robust, in preparation for using
   GIT_CONFIG_SYSTEM_PATH mechanism to point at a file during the
   test.  It protects them a bit more than necessary in that the
   variables some of the tests they use when they try to see the
   output from "git config --get" are unlikely to appear in the fake
   system-wide configuration during the test (hence disabling the
   fake system-wide configuration has no practical effect), but
   nevertheless the calls are protected by explicitly telling them
   to read only from --local configuration file to future-proof
   them.

 - Patch 8 is queued elsewhere already.

 - Patch 9 raises the default core.abbrev to 12 and countermands it
   by setting it to 7 in a fake system-wide configuration file
   during our test.  The unconditional widening of the default
   abbreviation size in this patch will have to be discarded,
   preferring the approach Linus is taking to auto-size it based on
   the number of objects in the repository, but the part that
   updates the test script may still be necessary.

Jeff King (1):
  t1300: check also system-wide configuration file in --show-origin
tests

Junio C Hamano (8):
  config: allow customizing /etc/gitconfig location with an environment
  t1300: always compare expect to actual
  t1308: ignore system-wide config in the iteration test
  t1300: disable system-wide config for tests that wants to read from -c
  t1300: take contents of system-wide configuration into account in
"--list" test
  t1300: be explicit in local configuration tests
  worktree: honor configuration variables
  core.abbrev: raise the default abbreviation to 12 hexdigits

 builtin/worktree.c |   2 +
 cache.h|   1 +
 config.c   |   2 +
 environment.c  |   2 +-
 t/gitconfig-for-test   |   9 
 t/t1300-repo-config.sh | 120 ++---
 t/t1308-config-set.sh  |   1 +
 t/test-lib.sh  |   4 +-
 8 files changed, 93 insertions(+), 48 deletions(-)
 create mode 100644 t/gitconfig-for-test

-- 
2.10.0-589-g5adf4e1



Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Jakub Narębski
W dniu 29.09.2016 o 13:57, Torsten Bögershausen pisze: 
> On 29/09/16 12:28, Lars Schneider wrote:

>> This is what happens:
>>
>> 1) Git exits
>> 2) The filter process receives EOF and prints "STOP" to the log
>> 3) t0021 checks the content of the log
>>
>> Sometimes 3 happened before 2 which makes the test fail.
>> (Example: https://travis-ci.org/git/git/jobs/162660563 )
>>
>> I added a this to wait until the filter process terminates:
>>
>> +wait_for_filter_termination () {
>> +while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
>> 2>&1
>> +do
>> +echo "Waiting for /t0021/rot13-filter.pl to finish..."
>> +sleep 1
>> +done
>> +}
>>
>> Does this look OK to you?
> Do we need the ps at all ?
> How about this:
> 
> +wait_for_filter_termination () {
> +while ! grep "STOP"  LOGFILENAME >/dev/null
> +do
> +echo "Waiting for /t0021/rot13-filter.pl to finish..."
> +sleep 1
> +done
> +}

Or even better: make filter driver write its pid to pidfile, and then
"wait $(cat rot13-filter.pid)".  That's what we do in lib-git-daemon.sh
(I think).

If the problem is exit status of "wait" builtin, then filter driver
can remove its pidfile after writing "STOP", just before ending.

-- 
Jakub Narębski



Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Lars Schneider

> On 29 Sep 2016, at 18:57, Junio C Hamano  wrote:
> 
> Torsten Bögershausen  writes:
> 
>>> 1) Git exits
>>> 2) The filter process receives EOF and prints "STOP" to the log
>>> 3) t0021 checks the content of the log
>>> 
>>> Sometimes 3 happened before 2 which makes the test fail.
>>> (Example: https://travis-ci.org/git/git/jobs/162660563 )
>>> 
>>> I added a this to wait until the filter process terminates:
>>> 
>>> +wait_for_filter_termination () {
>>> +   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
>>> 2>&1
>>> +   do
>>> +   echo "Waiting for /t0021/rot13-filter.pl to finish..."
>>> +   sleep 1
>>> +   done
>>> +}
>>> 
>>> Does this look OK to you?
>> Do we need the ps at all ?
>> How about this:
>> 
>> +wait_for_filter_termination () {
>> +while ! grep "STOP"  LOGFILENAME >/dev/null
>> +do
>> +echo "Waiting for /t0021/rot13-filter.pl to finish..."
>> +sleep 1
>> +done
>> +}
> 
> Running "ps" and grepping for a command is not suitable for script
> to reliably tell things, so it is out of question.  Compared to
> that, your version looks slightly better, but what if the machinery
> that being tested, i.e. the part that drives the filter process, is
> buggy or becomes buggy and causes the filter process that writes
> "STOP" to die before it actually writes that string?
> 
> I have a feeling that the machinery being tested needs to be fixed
> so that the sequence is always be:
> 
>0) Git spawns the filter process, as it needs some contents to
>   be filtered.
> 
>1) Git did everything it needed to do and decides that is time
>   to go.
> 
>2) Filter process receives EOF and prints "STOP" to the log.
> 
>3) Git waits until the filter process finishes.
> 
>4) t0021, after Git finishes, checks the log.


A pragmatic approach:

I could drop the "STOP" message that the filter writes to the log
on exit and everything would work as is. We could argue that this 
is OK because Git doesn't care anyways if the filter process has 
stopped or not.

Would that be OK for everyone?

- Lars

Re: [PATCH 2/4] t13xx: do not assume system config is empty

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

> I just don't see it being a problem. Adding core.abbrev for the whole
> test suite is just about not having a big flag day where we change all
> the tests. Changing one or two tests (and again, I'd be surprised if we
> even have to do that) doesn't seem like a big deal.

I've already wasted several hours whipping t1300 into shape, because
it was done in not so forward-looking future-proofed way.  I am not
worried about core.abbrev but I am worried more about the next thing
that requires us to add an entry to t/gitconfig-for-test.  Adding a
corresponding entry to retain the old default for that new config to
two places may not be a big deal, but it still makes me feel a bit
uneasy.

In any case, I suspect that Linus's "auto" thing may still need the
custom system config with t1300 clean-up to pass the test, even
though I suspect it would compute that 7 is enough for most of the
tiny repositories our tests use, so I'll polish this a bit more
while waiting for that discussion to settle.

Thanks.






Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Junio C Hamano
Linus Torvalds  writes:

> But you could easily also just instead have it do something like
>
>   if (default_abbrev < 0)
> default_abbrev = initialize_abbrev();
>
> at startup time if "abbrev_commit" is set, and just do it once and for
> all rather rthan the odd loping behavior.

I think that is a reasonable way to go.

#define DEFAULT_ABBREV get_default_abbrev()

would help.


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 12:16 PM, Jeff King  wrote:
>
> Hmm. So at length 7, we expect collisions at 2^14, which is 16384. That
> seems really low. I mean, by the birthday paradox that's where expect
> a 50% chance of a collision. But that's a single collision. We
> definitely don't expect them to be common at that size.
>
> So I suspect this could be a bit looser.

So I have to admit that I was surprised by how quickly it actually
decided that 7 isn't enough. In fact, the reason I initially said that
git used 8 digits was that I didn't count very closely, and just
verified that it was more than the default 7.

But quite frankly, I think the math is correct, and part of that is
that the logic is all about not just the current state, but the
"reasonably near future".

So it is indeed fairly aggressive, and the moment you have more
objects than the "we'd expect to probably see  _one_ collision" it
grows the size. But looking at the kernel situation, that really is
what we'd want, because the whole problem with the existing code is
that it only takes the *current* situation into account. That's what
we want to get away from. We want git to pick a number that is sane
from a standpoint of "this project is still growing".

And git _already_ has commits that are ambiguous in 8 hex digits and
need 9. Yes, it's rare today, but the reason I'm telling kernel
developers to use 12 is because while a size-11 collision is very rare
today, it does actually happen, and we want o pick a value where it is
rare enough that even in the near future it's not going to be a big
deal.

Don't get me wrong: collisions aren't fatal. So it's not like we have
to absolutely avoid them, and I really like your patch series exactly
because it makes collisions even less of a deal (particularly since I
expect people will not upgrade immediately, so we'll continue to see
even new 7-hex-digit short forms even in the kernel). So it's a
balance of making the hex string long enough that it's simply not a
big worry.

So I'm sure it *could* be looser, but I actually also really suspect
that git truly *should* use a 9-digit abbreviation rather than 8 (and
7 is definitely starting to be borderline, I think).

> As far as the implementation, I was surprised to see it touch
> get_short_sha1() at all. That's, after all, for lookups, and we would
> never want to require more characters on the reading side.

Heh. The implementation is crap. It was literally a "how can I make
the smallest possible patch" implementation. I was finishing it off
while at a talk by Nicolas Pitre at Linaro Connect where I am right
now.

So I agree - it does extra work just because that's where it all
slotted in with minimal effort.

At a minimum, once it finds a good new default, it should just memoize
that. So a minimal fix to the "it's stupldly recalculating things over
rand over again" would be to just set "default_abbrev" to the value it
finds acceptable after the first time it finds something, so that it
doesn't end up looping _again_ in the future.

But you could easily also just instead have it do something like

  if (default_abbrev < 0)
default_abbrev = initialize_abbrev();

at startup time if "abbrev_commit" is set, and just do it once and for
all rather rthan the odd loping behavior.

I really just wanted to see how well the concept worked, and I was
happy to see that it gave what I thought were the "correct" numbers.
And the loop was salready there ...

Linus


Impossible to change working directory

2016-09-29 Thread Sebastian Feldmann
Hi there,

I have a problem executing a pre-commit hook.
The hook script has to change the working directory to work and if I use plain

git commit

it works as expected, the script executes without errors, but if I use

git commit —only file.x file.y

the script fails because changing the current working directory fails.
If I echo the current working directory it always echoes the root repository 
path

Is this expected behavior?
Thanks for your feedback.

Cheers Sebastian






Re: [PATCH 2/4] t13xx: do not assume system config is empty

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 12:06:15PM -0700, Junio C Hamano wrote:

> I think it deserves a separate patch and the result is more
> understandable.  I've queued this for now (on top of a revised 1/4
> that uses GIT_CONFIG_SYSTEM_PATH instead).

Thanks, makes sense (and I like the new variable name better, by the
way).

> -- >8 --
> From: Jeff King 
> Date: Thu, 29 Sep 2016 11:29:10 -0700
> Subject: [PATCH] t1300: check also system-wide configuration file in
>  --show-origin tests
> 
> Because we used to run our tests with GIT_CONFIG_NOSYSTEM, these did
> not test that the system-wide configuration file is also read and
> shown as one of the origins.  Create a custom/fake system-wide
> configuration file and make sure it appears in the output, using the
> newly introduced GIT_CONFIG_SYSTEM_PATH mechanism.
> 
> Signed-off-by: Junio C Hamano 

Good description.

Signed-off-by: Jeff King 

of course.

> @@ -1304,6 +1315,7 @@ test_expect_success '--show-origin with --get-regexp' '
>   file:$HOME/.gitconfig   user.global true
>   file:.git/configuser.local true
>   EOF
> + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
>   git config --show-origin --get-regexp "user\.[g|l].*" >output &&
>   test_cmp expect output
>  '

This is one is trying to do a multi-file lookup, but we couldn't look in
the system config before. But to naturally extend it, it ought to look
like this on top:

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index d2476a8..4dd5ce3 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1310,11 +1310,12 @@ test_expect_success '--show-origin with single file' '
 
 test_expect_success '--show-origin with --get-regexp' '
cat >expect <<-EOF &&
+   file:$HOME/etc-gitconfiguser.system true
file:$HOME/.gitconfig   user.global true
file:.git/configuser.local true
EOF
GIT_ETC_GITCONFIG=$HOME/etc-gitconfig \
-   git config --show-origin --get-regexp "user\.[g|l].*" >output &&
+   git config --show-origin --get-regexp "user\.[g|l|s].*" >output &&
test_cmp expect output
 '
 
> @@ -1312,6 +1324,7 @@ test_expect_success '--show-origin getting a single 
> key' '
>   cat >expect <<-\EOF &&
>   file:.git/configlocal
>   EOF
> + GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
>   git config --show-origin user.override >output &&
>   test_cmp expect output
>  '

And I was tempted to say this one should not need to care, but I guess
it is testing that we correctly read the override from the local config
over the global one. So likewise, it is good to check that we also
override the system config (it does not effect the "expect" output, but
that does not mean it is not enhancing the test).

-Peff


[RFC/PATCH 2/2] sequencer: allow origin line below commit title

2016-09-29 Thread Jonathan Tan
When git cherry-pick -x is invoked, a "(cherry picked from commit ...)"
line is appended to the footer of a commit message that Git interprets
to contain a footer; otherwise it is appended at the end as a new
paragraph, preceded by a blank line. This behavior may appear
inconsistent, especially to users who differ from Git in their
interpretation of what constitutes a footer.

Provide the user, through a configuration option and command-line flag,
the option of placing the "cherry picked" line below the commit title
instead of the current behavior.  This allows the "cherry picked" line
to be placed in a consistent manner, independent of the nature of the
footer of the existing commit message.

Signed-off-by: Jonathan Tan 
---
 Documentation/config.txt  |  4 +++
 Documentation/git-cherry-pick.txt | 15 +-
 builtin/revert.c  | 38 -
 sequencer.c   | 39 --
 sequencer.h   |  7 +
 t/t3511-cherry-pick-x.sh  | 59 +++
 6 files changed, 152 insertions(+), 10 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0bcb679..fb1990f 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -945,6 +945,10 @@ browser..path::
browse HTML help (see `-w` option in linkgit:git-help[1]) or a
working repository in gitweb (see linkgit:git-instaweb[1]).
 
+cherrypick.originLineLocation::
+   Default for the `--origin-line-location` option in git-cherry-pick.
+   Defaults to `bottom`.
+
 clean.requireForce::
A boolean to make git-clean do nothing unless given -f,
-i or -n.   Defaults to true.
diff --git a/Documentation/git-cherry-pick.txt 
b/Documentation/git-cherry-pick.txt
index d35d771..5a8359f 100644
--- a/Documentation/git-cherry-pick.txt
+++ b/Documentation/git-cherry-pick.txt
@@ -58,7 +58,7 @@ OPTIONS
message prior to committing.
 
 -x::
-   When recording the commit, append a line that says
+   When recording the commit, add a line that says
"(cherry picked from commit ...)" to the original commit
message in order to indicate which commit this change was
cherry-picked from.  This is done only for cherry
@@ -71,6 +71,14 @@ OPTIONS
development branch), adding this information can be
useful.
 
+--origin-line-location::
+   Where to put the "(cherry picked from commit ...)" line when requested
+   with the `-x` option.  May be `top`, meaning at the top of the commit
+   message body, immediately below the commit title (see the DISCUSSION
+   section of linkgit:git-commit[1]), or `bottom`, meaning at the end of
+   the commit message.  The default is controlled by the
+   `cherrypick.originLineLocation` configuration variable.
+
 -r::
It used to be that the command defaulted to do `-x`
described above, and `-r` was to disable it.  Now the
@@ -224,6 +232,11 @@ the working tree.
 spending extra time to avoid mistakes based on incorrectly matching
 context lines.
 
+CONFIGURATION
+-
+cherrypick.originLineLocation::
+   Default for the `--origin-line-location` option.  Defaults to `bottom`.
+
 SEE ALSO
 
 linkgit:git-revert[1]
diff --git a/builtin/revert.c b/builtin/revert.c
index 4e69380..a5459a0 100644
--- a/builtin/revert.c
+++ b/builtin/revert.c
@@ -71,11 +71,25 @@ static void verify_opt_compatible(const char *me, const 
char *base_opt, ...)
die(_("%s: %s cannot be used with %s"), me, this_opt, base_opt);
 }
 
+static int set_origin_line(enum origin_line *line, const char *str)
+{
+   if (!strcmp(str, "bottom")) {
+   *line = ORIGIN_LINE_BOTTOM;
+   return 1;
+   }
+   if (!strcmp(str, "top")) {
+   *line = ORIGIN_LINE_TOP;
+   return 1;
+   }
+   return 0;
+}
+
 static void parse_args(int argc, const char **argv, struct replay_opts *opts)
 {
const char * const * usage_str = revert_or_cherry_pick_usage(opts);
const char *me = action_name(opts);
int cmd = 0;
+   const char *origin_str = NULL;
struct option base_options[] = {
OPT_CMDMODE(0, "quit", , N_("end revert or cherry-pick 
sequence"), 'q'),
OPT_CMDMODE(0, "continue", , N_("resume revert or 
cherry-pick sequence"), 'c'),
@@ -98,6 +112,7 @@ static void parse_args(int argc, const char **argv, struct 
replay_opts *opts)
if (opts->action == REPLAY_PICK) {
struct option cp_extra[] = {
OPT_BOOL('x', NULL, >record_origin, N_("append 
commit name")),
+   OPT_STRING(0, "origin-line-location", _str, 
N_("origin-line-location"), N_("location of appended commit name")),
OPT_BOOL(0, "ff", >allow_ff, N_("allow 
fast-forward")),

[RFC/PATCH 0/2] place cherry pick line below commit title

2016-09-29 Thread Jonathan Tan
This is somewhat of a follow-up to my previous e-mail with subject
"[PATCH] sequencer: support folding in rfc2822 footer" [1], in which I
proposed relaxing the definition of a commit message footer to allow
multiple-line field bodies (as described in RFC2822), but its strictness
was deemed deliberate.

Below is a patch set that allows placing the "cherry picked from" line
without taking into account the definition of a commit message footer.
For example, "git cherry-pick -x" (with the appropriate configuration
variable or argument) would, to this commit message:

  commit title

  This is an explanatory paragraph.

  Footer: foo

place the "(cherry picked from ...)" line below "commit title".

Would this be better?

[1] <1472846322-5592-1-git-send-email-jonathanta...@google.com>

Jonathan Tan (2):
  sequencer: refactor message and origin appending
  sequencer: allow origin line below commit title

 Documentation/config.txt  |  4 +++
 Documentation/git-cherry-pick.txt | 15 -
 builtin/revert.c  | 38 -
 sequencer.c   | 69 +--
 sequencer.h   |  7 
 t/t3511-cherry-pick-x.sh  | 59 +
 6 files changed, 172 insertions(+), 20 deletions(-)

-- 
2.8.0.rc3.226.g39d4020



[RFC/PATCH 1/2] sequencer: refactor message and origin appending

2016-09-29 Thread Jonathan Tan
Move the appending of the commit message and origin information into its
own function, in preparation for a subsequent patch.

Signed-off-by: Jonathan Tan 
---
 sequencer.c | 46 --
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 3804fa9..b29c9ca 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -443,6 +443,33 @@ static int allow_empty(struct replay_opts *opts, struct 
commit *commit)
return 1;
 }
 
+/*
+ * Appends the commit log message, including the cherry picked notification if
+ * record_origin is nonzero.
+ */
+static void append_message(struct strbuf *msgbuf,
+  const struct commit_message *msg,
+  int record_origin,
+  const struct commit *commit)
+{
+   /*
+* Append the commit log message to msgbuf; it starts
+* after the tree, parent, author, committer
+* information followed by "\n\n".
+*/
+   const char *p = strstr(msg->message, "\n\n");
+   if (p)
+   strbuf_addstr(msgbuf, skip_blank_lines(p + 2));
+
+   if (record_origin) {
+   if (!has_conforming_footer(msgbuf, NULL, 0))
+   strbuf_addch(msgbuf, '\n');
+   strbuf_addstr(msgbuf, cherry_picked_prefix);
+   strbuf_addstr(msgbuf, oid_to_hex(>object.oid));
+   strbuf_addstr(msgbuf, ")\n");
+   }
+}
+
 static int do_pick_commit(struct commit *commit, struct replay_opts *opts)
 {
unsigned char head[20];
@@ -534,29 +561,12 @@ static int do_pick_commit(struct commit *commit, struct 
replay_opts *opts)
}
strbuf_addstr(, ".\n");
} else {
-   const char *p;
-
base = parent;
base_label = msg.parent_label;
next = commit;
next_label = msg.label;
 
-   /*
-* Append the commit log message to msgbuf; it starts
-* after the tree, parent, author, committer
-* information followed by "\n\n".
-*/
-   p = strstr(msg.message, "\n\n");
-   if (p)
-   strbuf_addstr(, skip_blank_lines(p + 2));
-
-   if (opts->record_origin) {
-   if (!has_conforming_footer(, NULL, 0))
-   strbuf_addch(, '\n');
-   strbuf_addstr(, cherry_picked_prefix);
-   strbuf_addstr(, oid_to_hex(>object.oid));
-   strbuf_addstr(, ")\n");
-   }
+   append_message(, , opts->record_origin, commit);
}
 
if (!opts->strategy || !strcmp(opts->strategy, "recursive") || 
opts->action == REPLAY_REVERT) {
-- 
2.8.0.rc3.226.g39d4020



Re: [PATCH 2/4] t13xx: do not assume system config is empty

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 11:57:02AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >> "either" meaning "we do not need to add --local and we do not need
> >> GIT_CONFIG_NOSYSTEM"?
> >
> > Yes. I didn't test it with your core.abbrev patch 4/4, but I _didn't_
> > have to touch their expected output after pointing them at a non-empty
> > etc-gitconfig file in the trash directory. Which implies to me they
> > don't care either way (which makes sense; they are asking for a specific
> > key which is supposed to be found in one of the other files).
> 
> There is a bit of problem here, though.
> 
>  * If we make t1300 point at its own system-wide config, it will be
>in control of its contents, so "find this key" will find only it
>wants to find (or we found a regression).
> 
>  * But then if it ever does something that depends on the default
>value of core.abbrev (or whatever we'd tweak in response to the
>next suggestion by Linus ;-), we cannot really allow it to do
>so.  We'd want t/gitconfig-for-test to be the single place that
>we can tweak these things, but we'll have to know t1300 uses its
>own and need to make the same change there, too.

Right, but I think that's fine. Tests that care deeply about the
contents of etc-gitconfig are unlikely to care about core.abbrev. And in
the off chance that they do, then the worst case is...they get updated
to handle core.abbrev (either passing a command line option, or just
putting core.abbrev in their test file).

I just don't see it being a problem. Adding core.abbrev for the whole
test suite is just about not having a big flag day where we change all
the tests. Changing one or two tests (and again, I'd be surprised if we
even have to do that) doesn't seem like a big deal.

-Peff


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 11:55:46AM -0700, Linus Torvalds wrote:

> I think the patch can speak for itself, but the basic core is this
> section in get_short_sha1():
> 
>   +   if (len < 16 && !status && (flags & GET_SHA1_AUTOMATIC)) {
>   +   unsigned int expect_collision = 1 << (len * 2);
>   +   if (ds.nrobjects > expect_collision)
>   +   return SHORT_NAME_AMBIGUOUS;
>   +   }

Hmm. So at length 7, we expect collisions at 2^14, which is 16384. That
seems really low. I mean, by the birthday paradox that's where expect
a 50% chance of a collision. But that's a single collision. We
definitely don't expect them to be common at that size.

So I suspect this could be a bit looser. The real number we care about
is probably something like "there is probability 'p' of a collision when
we add a new object", but I'm not sure what that 'p' would be. Or
perhaps "we accept collisions in 'n' percent of objects". But again, I
don't know that 'n'.

I dunno. I suppose being overly conservative with this number leaves
room for growth. Repositories generally get bigger, not smaller. :)

> What do you think? It's actually a fairly simple patch and I really do
> think it makes sense and it seems to just DTRT automatically.

I like the general idea.

As far as the implementation, I was surprised to see it touch
get_short_sha1() at all. That's, after all, for lookups, and we would
never want to require more characters on the reading side.

I see you worked around it with a flag so that this behavior only kicks
in when called via find_unique_abbrev(). But if you look at the caller:

> @@ -458,14 +472,19 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn 
> fn, void *cb_data)
>  int find_unique_abbrev_r(char *hex, const unsigned char *sha1, int len)
>  {
>   int status, exists;
> + int flags = GET_SHA1_QUIETLY;
>  
> + if (len < 0) {
> + flags |= GET_SHA1_AUTOMATIC;
> + len = 7;
> + }
>   sha1_to_hex_r(hex, sha1);
>   if (len == 40 || !len)
>   return 40;
>   exists = has_sha1_file(sha1);
>   while (len < 40) {
>   unsigned char sha1_ret[20];
> - status = get_short_sha1(hex, len, sha1_ret, GET_SHA1_QUIETLY);
> + status = get_short_sha1(hex, len, sha1_ret, flags);
>   if (exists
>   ? !status
>   : status == SHORT_NAME_NOT_FOUND) {

You can see that we're going to do more work than we would otherwise
need to. Because we start at 7, and ask get_short_sha1() "is this unique
enough?", and looping. But if we _know_ we won't accept any answer
shorter than some N based on the number of objects in the repository,
then we should start at that N.

IOW, something like:

  if (len < 0)
len = ceil(log_base_2(repository_object_count()));

here, and then you don't have to touch get_short_sha1() at all.

I suspect you pushed it down into get_short_sha1() because it kind-of
does the repository_object_count() step for "free" as it's looking at
the object anyway. But that step is really not very expensive. And I'd
even say you could just ignore loose objects entirely, and treat them
like a rounding error (the way that duplicate objects in packs are
treated).

That leaves you with just an O(# of packs) loop over a linked list. You
could even just keep a global object count up to date in
add_packed_git(), and then it's O(1).

-Peff


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 11:55 AM, Linus Torvalds
 wrote:
>
> For the kernel, just the *math* right now actually gives 12
> characters. For current git it actually seems to say that 8 is the
> correct number. For small projects, you'll still see 7.

Sorry, the git number is 9, not 8. The reason is that git has roughly
212k objects, and 9 hex digits gets expected collisions at about 256k
objects.

So the logic means that we'll see 7 hex digits for projects with less
than 16k objects, 8 hex digits if there are less than 64k objects, and
9 hex digits for projects like git that currently have fewer than 256k
objects.

But git itself might not be *that* far from going to 10 hex digits
with my patch.

The kernel uses 12 he digits because the collision math says that's
the right thing for a project with between 4M and 16M objects (with
the kernel being at 5M).

So on the whole the patch really does seem to just do the right thing
automatically.

  Linus


Re: [PATCH 2/4] t13xx: do not assume system config is empty

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Sep 29, 2016 at 11:13:45AM -0700, Junio C Hamano wrote:
>
>> Jeff King  writes:
>> 
>> > I think anytime you would use GIT_CONFIG_NOSYSTEM over --local, it is an
>> > indication that the test is trying to check how multiple sources
>> > interact. And the right thing to do for them is to set GIT_ETC_GITCONFIG
>> > to some known quantity. We just couldn't do that before, so we skipped
>> > it.  IOW, something like the patch below (on top of yours).
>> 
>> OK, that way we can make sure that "multiple sources" operations do
>> look at the system-wide stuff.
>
> Exactly.

I think it deserves a separate patch and the result is more
understandable.  I've queued this for now (on top of a revised 1/4
that uses GIT_CONFIG_SYSTEM_PATH instead).

-- >8 --
From: Jeff King 
Date: Thu, 29 Sep 2016 11:29:10 -0700
Subject: [PATCH] t1300: check also system-wide configuration file in
 --show-origin tests

Because we used to run our tests with GIT_CONFIG_NOSYSTEM, these did
not test that the system-wide configuration file is also read and
shown as one of the origins.  Create a custom/fake system-wide
configuration file and make sure it appears in the output, using the
newly introduced GIT_CONFIG_SYSTEM_PATH mechanism.

Signed-off-by: Junio C Hamano 
---
 t/t1300-repo-config.sh | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 0543b62227bf..aa25577709c5 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1236,6 +1236,11 @@ test_expect_success 'set up --show-origin tests' '
[user]
relative = include
EOF
+   cat >"$HOME"/etc-gitconfig <<-\EOF &&
+   [user]
+   system = true
+   override = system
+   EOF
cat >"$HOME"/.gitconfig <<-EOF &&
[user]
global = true
@@ -1254,6 +1259,8 @@ test_expect_success 'set up --show-origin tests' '
 
 test_expect_success '--show-origin with --list' '
cat >expect <<-EOF &&
+   file:$HOME/etc-gitconfiguser.system=true
+   file:$HOME/etc-gitconfiguser.override=system
file:$HOME/.gitconfig   user.global=true
file:$HOME/.gitconfig   user.override=global
file:$HOME/.gitconfig   
include.path=$INCLUDE_DIR/absolute.include
@@ -1264,13 +1271,16 @@ test_expect_success '--show-origin with --list' '
file:.git/../include/relative.include   user.relative=include
command line:   user.cmdline=true
EOF
+   GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
git -c user.cmdline=true config --list --show-origin >output &&
test_cmp expect output
 '
 
 test_expect_success '--show-origin with --list --null' '
cat >expect <<-EOF &&
-   file:$HOME/.gitconfigQuser.global
+   file:$HOME/etc-gitconfigQuser.system
+   trueQfile:$HOME/etc-gitconfigQuser.override
+   systemQfile:$HOME/.gitconfigQuser.global
trueQfile:$HOME/.gitconfigQuser.override
globalQfile:$HOME/.gitconfigQinclude.path

$INCLUDE_DIR/absolute.includeQfile:$INCLUDE_DIR/absolute.includeQuser.absolute
@@ -1281,6 +1291,7 @@ test_expect_success '--show-origin with --list --null' '
includeQcommand line:Quser.cmdline
trueQ
EOF
+   GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
git -c user.cmdline=true config --null --list --show-origin >output.raw 
&&
nul_to_q output &&
# The here-doc above adds a newline that the --null output would not
@@ -1304,6 +1315,7 @@ test_expect_success '--show-origin with --get-regexp' '
file:$HOME/.gitconfig   user.global true
file:.git/configuser.local true
EOF
+   GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
git config --show-origin --get-regexp "user\.[g|l].*" >output &&
test_cmp expect output
 '
@@ -1312,6 +1324,7 @@ test_expect_success '--show-origin getting a single key' '
cat >expect <<-\EOF &&
file:.git/configlocal
EOF
+   GIT_CONFIG_SYSTEM_PATH=$HOME/etc-gitconfig \
git config --show-origin user.override >output &&
test_cmp expect output
 '
-- 
2.10.0-589-g5adf4e1



Re: [PATCH 2/4] t13xx: do not assume system config is empty

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

>> "either" meaning "we do not need to add --local and we do not need
>> GIT_CONFIG_NOSYSTEM"?
>
> Yes. I didn't test it with your core.abbrev patch 4/4, but I _didn't_
> have to touch their expected output after pointing them at a non-empty
> etc-gitconfig file in the trash directory. Which implies to me they
> don't care either way (which makes sense; they are asking for a specific
> key which is supposed to be found in one of the other files).

There is a bit of problem here, though.

 * If we make t1300 point at its own system-wide config, it will be
   in control of its contents, so "find this key" will find only it
   wants to find (or we found a regression).

 * But then if it ever does something that depends on the default
   value of core.abbrev (or whatever we'd tweak in response to the
   next suggestion by Linus ;-), we cannot really allow it to do
   so.  We'd want t/gitconfig-for-test to be the single place that
   we can tweak these things, but we'll have to know t1300 uses its
   own and need to make the same change there, too.

So, I dunno.


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 11:37 AM, Linus Torvalds
 wrote:
>
> I'm playing with an early patch to make the default more dynamic.
> Let's see how well it works in practice, but it looks fairly
> promising. Let me test a bit more and send out an RFC patch..

Ok, this is *very* rough, and it doesn't actuall pass all the tests,
and I didn't even try to look at why. But it passes the trivial
smell-test, and in particular it actually makes mathematical sense...

I think the patch can speak for itself, but the basic core is this
section in get_short_sha1():

  +   if (len < 16 && !status && (flags & GET_SHA1_AUTOMATIC)) {
  +   unsigned int expect_collision = 1 << (len * 2);
  +   if (ds.nrobjects > expect_collision)
  +   return SHORT_NAME_AMBIGUOUS;
  +   }

basically, what it says is that we will consider a sha1 ambiguous even
if it was *technically* unique (that's the '!status' part of the test)
if:

 - the length was 15 or less

*and*

 - the number of objects we have is larger than the expected point
where statistically we should start to expect to get one collision.

That "expect_collision" math is actually very simple: each hex
character adds four bits of range, but since we expect collisions at
the square root of the maximum number of objects, we shift by just two
bits per hex digits instead.

The rest of the patch is a trivial change to just initialize the
default short size to -1, and consider that to mean "enable the
automatic size checking with a minimum of 7". And the trivial code to
estimate the number of objects (which ignores duplicates between packs
etc _entirely_).

For the kernel, just the *math* right now actually gives 12
characters. For current git it actually seems to say that 8 is the
correct number. For small projects, you'll still see 7.

ANYWAY. This patch is on top of Jeff's patches in 'pu' (I think those
are great regardless of this patch!), and as mentioned, it fails some
tests. I suspect that the failures might be due to the abbrev_default
being -1, and some other code finds that surprising now. But as
mentioned, I didn't really even look at it.

What do you think? It's actually a fairly simple patch and I really do
think it makes sense and it seems to just DTRT automatically.

  Linus
 cache.h   |  1 +
 environment.c |  2 +-
 sha1_name.c   | 21 -
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index 6e33f2f..d2da6d1 100644
--- a/cache.h
+++ b/cache.h
@@ -1207,6 +1207,7 @@ struct object_context {
 #define GET_SHA1_TREEISH  020
 #define GET_SHA1_BLOB 040
 #define GET_SHA1_FOLLOW_SYMLINKS 0100
+#define GET_SHA1_AUTOMATIC  0200
 #define GET_SHA1_ONLY_TO_DIE04000
 
 #define GET_SHA1_DISAMBIGUATORS \
diff --git a/environment.c b/environment.c
index c1442df..fd6681e 100644
--- a/environment.c
+++ b/environment.c
@@ -16,7 +16,7 @@ int trust_executable_bit = 1;
 int trust_ctime = 1;
 int check_stat = 1;
 int has_symlinks = 1;
-int minimum_abbrev = 4, default_abbrev = 7;
+int minimum_abbrev = 4, default_abbrev = -1;
 int ignore_case;
 int assume_unchanged;
 int prefer_symlink_refs;
diff --git a/sha1_name.c b/sha1_name.c
index 3b647fd..8791ff3 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -15,6 +15,7 @@ typedef int (*disambiguate_hint_fn)(const unsigned char *, 
void *);
 
 struct disambiguate_state {
int len; /* length of prefix in hex chars */
+   unsigned int nrobjects;
char hex_pfx[GIT_SHA1_HEXSZ + 1];
unsigned char bin_pfx[GIT_SHA1_RAWSZ];
 
@@ -118,6 +119,12 @@ static void find_short_object_filename(struct 
disambiguate_state *ds)
 
if (strlen(de->d_name) != 38)
continue;
+
+   // We only look at the one subdirectory, and we assume
+   // each subdirectory is roughly similar, so each object
+   // we find probably has 255 other objects in the other
+   // fan-out directories
+   ds->nrobjects += 256;
if (memcmp(de->d_name, ds->hex_pfx + 2, ds->len - 2))
continue;
memcpy(hex + 2, de->d_name, 38);
@@ -151,6 +158,7 @@ static void unique_in_pack(struct packed_git *p,
 
open_pack_index(p);
num = p->num_objects;
+   ds->nrobjects += num;
last = num;
while (first < last) {
uint32_t mid = (first + last) / 2;
@@ -426,6 +434,12 @@ static int get_short_sha1(const char *name, int len, 
unsigned char *sha1,
for_each_abbrev(ds.hex_pfx, show_ambiguous_object, );
}
 
+   if (len < 16 && !status && (flags & GET_SHA1_AUTOMATIC)) {
+   unsigned int expect_collision = 1 << (len * 2);
+   if (ds.nrobjects > expect_collision)
+   return 

Re: [PATCH/RFC] git log --oneline alternative with dates, times and initials

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

> But I also buy the argument that contrib/ is simply a hassle. This
> script can live in its own repository somewhere, and handle
> announcements and patches on the list.

I think the output of this script is largely personal preference,
which can be made to a project preference for a project enough of
whose participant so desires.

For example, I would not be surprised if this appeared next to
checkpatch.pl script in the kernel archive.  When a project that
uses Git to store its sources finds a need to summarize its log in a
standardized way that is not produced natively by Git, such a
project may add this script to its scripts/ area, just like a
project that wants to have a standard way to help its contributors
to avoid common style errors a lot more than our "diff" (which only
highlights whitespace errors) does may ship checkpatch.pl in it.

So in that sense, while I do not mean to say that the script itself
must become a standalone project that has only one script in it, I
do not think it belongs "our" contrib/, as we do not see a need to
standardize its output as the log summary standard we the Git
project uses on its own history.

On the other hand, your illustration of the needed bits to express
this particular output format used by Kyle's script, when polished,
does fit in our codebase.  We are interested in making it possible
for projects and users to do more by using Git with its standard
customization features.


Re: [PATCH v5 1/4] git: make super-prefix option

2016-09-29 Thread Brandon Williams
On 09/29, Jeff King wrote:
> On Wed, Sep 28, 2016 at 02:50:40PM -0700, Brandon Williams wrote:
> 
> > Add a super-prefix environment variable 'GIT_INTERNAL_SUPER_PREFIX'
> > which can be used to specify a path from above a repository down to its
> > root.  The immediate use of this option is by commands which have a
> > --recurse-submodule option in order to give context to submodules about
> > how they were invoked.  This option is currently only allowed for
> > builtins which support a super-prefix.
> 
> What about non-builtins?
> 
> E.g., what should
> 
>   git --super-prefix=foo bar
> 
> do? Should the externals and scripts check the presence of
> GIT_INTERNAL_SUPER_PREFIX and barf if it is set? Most scripts would
> probably notice eventually when calling some other builtin that doesn't
> support SUPER_PREFIX, but it seems hacky to count on that.
> 
> There's also the question of 3rd-party programs. If we want to be
> conservative, I think you'd want to just always bail in
> execv_dashed_external() if --super-prefix is in use. That doesn't give
> an option for scripts to say "hey, I support this", but we can perhaps
> worry about loosening later.
> 
> -Peff

That makes sense.

-- 
Brandon Williams


Re: [PATCH v5 1/4] git: make super-prefix option

2016-09-29 Thread Jeff King
On Wed, Sep 28, 2016 at 02:50:40PM -0700, Brandon Williams wrote:

> Add a super-prefix environment variable 'GIT_INTERNAL_SUPER_PREFIX'
> which can be used to specify a path from above a repository down to its
> root.  The immediate use of this option is by commands which have a
> --recurse-submodule option in order to give context to submodules about
> how they were invoked.  This option is currently only allowed for
> builtins which support a super-prefix.

What about non-builtins?

E.g., what should

  git --super-prefix=foo bar

do? Should the externals and scripts check the presence of
GIT_INTERNAL_SUPER_PREFIX and barf if it is set? Most scripts would
probably notice eventually when calling some other builtin that doesn't
support SUPER_PREFIX, but it seems hacky to count on that.

There's also the question of 3rd-party programs. If we want to be
conservative, I think you'd want to just always bail in
execv_dashed_external() if --super-prefix is in use. That doesn't give
an option for scripts to say "hey, I support this", but we can perhaps
worry about loosening later.

-Peff


Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Johannes Sixt

Am 29.09.2016 um 20:18 schrieb Torsten Bögershausen:

I would agree that  Git should not wait for the filter.
But does the test suite need to wait for the filter ?


We have fixed a test case on Windows recently where a process hung 
around too long (5babb5bd). So, yes, the test suite has to wait for the 
filter.


-- Hannes



Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Linus Torvalds
On Thu, Sep 29, 2016 at 11:05 AM, Junio C Hamano  wrote:
>
> Yes, "git log --oneline" looks somewhat different and strange for
> me, too ;-)

I'm playing with an early patch to make the default more dynamic.
Let's see how well it works in practice, but it looks fairly
promising. Let me test a bit more and send out an RFC patch..

  Linus


Re: [PATCH 5/5] log: add --commit-header option

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 10:49:04AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > This lets you stick a header right before a commit, but
> > suppresses headers that are duplicates. This means you can
> > do something like:
> >
> >   git log --graph --author-date-order --commit-header='== %as =='
> >
> > to get a marker in the graph whenever the day changes.
> 
> That's interesting.  So it is not really "commit" header, but a
> header for groups of commits.  Credits for realizing the usefulness
> of such grouping may go to Kyle, but the implementation is also
> brilliant ;-).

Yeah, I really don't like the name "--commit-header" that much. I
initially thought to call it "--graph-header", but it is potentially
useful without a graph, too. Maybe "--group-header" or something.
I dunno. I'd leave that to somebody who actually wanted to polish the
patches up enough for submission. That might even be me someday, but not
today. :)

-Peff


Re: [PATCH/RFC] git log --oneline alternative with dates, times and initials

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 10:38:14AM -0700, Junio C Hamano wrote:

> > I have no problem taking this in contrib or whatever, until a point when
> > Git is capable of doing the same thing itself. I just hoped to trick you
> > into working on Git. :)
> 
> I thought we stopped adding random things to contrib/, though.
> 
> Unlike the earlier days of Git, if a custom command that uses Git is
> very userful, it can live its own life and flourish within the much
> larger Git userbase we have these days.

I dunno. I said "contrib or whatever" to duck that question. :)

I do not have a strong opinion either way. In some ways this script is
similar to diff-highlight, which is in contrib. Perhaps that is only
because diff-highlight is grandfathered. But I also think it somewhat
makes sense, because in an ideal world diff-highlight gets thrown away
in favor of git's internal diff routines learning to do the same thing.
And in theory this script is in the same position.

But I also buy the argument that contrib/ is simply a hassle. This
script can live in its own repository somewhere, and handle
announcements and patches on the list. For that matter, so could
diff-highlight, and I don't mind ripping it out of contrib if that's the
consensus.  My only real objection is that doing so is more work than
leaving it as-is, and I'm lazy.

-Peff


Re: [PATCH 2/4] t13xx: do not assume system config is empty

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 11:13:45AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I think anytime you would use GIT_CONFIG_NOSYSTEM over --local, it is an
> > indication that the test is trying to check how multiple sources
> > interact. And the right thing to do for them is to set GIT_ETC_GITCONFIG
> > to some known quantity. We just couldn't do that before, so we skipped
> > it.  IOW, something like the patch below (on top of yours).
> 
> OK, that way we can make sure that "multiple sources" operations do
> look at the system-wide stuff.

Exactly.

> > Note that the
> > commands that are doing a "--get" and not a "--list" don't actually seem
> > to need either (because they are getting the values out of the local
> > file anyway), so we could drop the setting of GIT_ETC_GITCONFIG from
> > them entirely.
> 
> "either" meaning "we do not need to add --local and we do not need
> GIT_CONFIG_NOSYSTEM"?

Yes. I didn't test it with your core.abbrev patch 4/4, but I _didn't_
have to touch their expected output after pointing them at a non-empty
etc-gitconfig file in the trash directory. Which implies to me they
don't care either way (which makes sense; they are asking for a specific
key which is supposed to be found in one of the other files).

-Peff


Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Torsten Bögershausen



On 29/09/16 19:57, Lars Schneider wrote:

On 29 Sep 2016, at 18:57, Junio C Hamano  wrote:

Torsten Bögershausen  writes:


1) Git exits
2) The filter process receives EOF and prints "STOP" to the log
3) t0021 checks the content of the log

Sometimes 3 happened before 2 which makes the test fail.
(Example: https://travis-ci.org/git/git/jobs/162660563 )

I added a this to wait until the filter process terminates:

+wait_for_filter_termination () {
+   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
2>&1
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Does this look OK to you?

Do we need the ps at all ?
How about this:

+wait_for_filter_termination () {
+   while ! grep "STOP"  LOGFILENAME >/dev/null
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Running "ps" and grepping for a command is not suitable for script
to reliably tell things, so it is out of question.  Compared to
that, your version looks slightly better, but what if the machinery
that being tested, i.e. the part that drives the filter process, is
buggy or becomes buggy and causes the filter process that writes
"STOP" to die before it actually writes that string?

I have a feeling that the machinery being tested needs to be fixed
so that the sequence is always be:

0) Git spawns the filter process, as it needs some contents to
   be filtered.

1) Git did everything it needed to do and decides that is time
   to go.

2) Filter process receives EOF and prints "STOP" to the log.

3) Git waits until the filter process finishes.

4) t0021, after Git finishes, checks the log.

Repeated sleep combined with grep is probably just sweeping the real
problem under the rug.  Do we have enough information to do the
above?

An inspiration may be in the way we centrally clean all tempfiles
and lockfiles before exiting.  We have a central registry of these
files that need cleaning up and have a single atexit(3) handler to
clean them up.  Perhaps we need a registry that filter processes
spawned by the mechanism Lars introduces in this series, and have an
atexit(3) handler that closes the pipe to them (which signals the
filters that it is time for them to go) and wait(2) on them, or
something?  I do not think we want any kill(2) to be involved in
this clean-up procedure, but I do think we should wait(2) on what we
spawn, as long as these processes are meant to be shut down when the
main process of Git exits (this is different from things like
credential-cache daemon where they are expected to persist and meant
to serve multiple Git processes).

We discussed that issue in v4 and v6:
http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/
http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/

My impression was that you don't want Git to wait for the filter process.
If Git waits for the filter process - how long should Git wait?

Thanks,
Lars


Hm,
I would agree that  Git should not wait for the filter.
But does the test suite need to wait for the filter ?
May be, in this case we test the filter and Git, which is good.
Adding a 1 second delay, if, and only if, there is a racy condition,
is not that bad (or do we have better ways to check for a process to
be terminated ?)




Re: [PATCH 2/4] t13xx: do not assume system config is empty

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

> I think anytime you would use GIT_CONFIG_NOSYSTEM over --local, it is an
> indication that the test is trying to check how multiple sources
> interact. And the right thing to do for them is to set GIT_ETC_GITCONFIG
> to some known quantity. We just couldn't do that before, so we skipped
> it.  IOW, something like the patch below (on top of yours).

OK, that way we can make sure that "multiple sources" operations do
look at the system-wide stuff.

> Note that the
> commands that are doing a "--get" and not a "--list" don't actually seem
> to need either (because they are getting the values out of the local
> file anyway), so we could drop the setting of GIT_ETC_GITCONFIG from
> them entirely.

"either" meaning "we do not need to add --local and we do not need
GIT_CONFIG_NOSYSTEM"?



Re: Two bugs in --pretty with %C(auto)

2016-09-29 Thread René Scharfe
Am 17.09.2016 um 20:25 schrieb René Scharfe:
> diff --git a/pretty.c b/pretty.c
> index 9788bd8..493edb0 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -1072,6 +1072,8 @@ static size_t format_commit_one(struct strbuf *sb, /* 
> in UTF-8 */
>   case 'C':
>   if (starts_with(placeholder + 1, "(auto)")) {
>   c->auto_color = want_color(c->pretty_ctx->color);
> + if (c->auto_color)
> + strbuf_addstr(sb, GIT_COLOR_RESET);
>   return 7; /* consumed 7 bytes, "C(auto)" */
>   } else {
>   int ret = parse_color(sb, placeholder, c);

We could optimize this a bit (see below).  I can't think of a downside;
someone adding a prefix would be responsible for adding a reset as well
if needed, right?

-- >8 --
Subject: [PATCH] pretty: avoid adding reset for %C(auto) if output is empty

We emit an escape sequence for resetting color and attribute for
%C(auto) to make sure automatic coloring is displayed as intended.
Stop doing that if the output strbuf is empty, i.e. when %C(auto)
appears at the start of the format string, because then there is no
need for a reset and we save a few bytes in the output.

Signed-off-by: Rene Scharfe 
---
Reverts the change to t6006, so we'd need another test for this.
Anatoly? :)

 pretty.c   | 2 +-
 t/t6006-rev-list-format.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/pretty.c b/pretty.c
index 493edb0..25efbca 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1072,7 +1072,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
case 'C':
if (starts_with(placeholder + 1, "(auto)")) {
c->auto_color = want_color(c->pretty_ctx->color);
-   if (c->auto_color)
+   if (c->auto_color && sb->len)
strbuf_addstr(sb, GIT_COLOR_RESET);
return 7; /* consumed 7 bytes, "C(auto)" */
} else {
diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
index f6020cd..a1dcdb8 100755
--- a/t/t6006-rev-list-format.sh
+++ b/t/t6006-rev-list-format.sh
@@ -225,7 +225,7 @@ test_expect_success '%C(auto,...) respects --color=auto 
(stdout not tty)' '
 
 test_expect_success '%C(auto) respects --color' '
git log --color --format="%C(auto)%H" -1 >actual &&
-   printf "\\033[m\\033[33m%s\\033[m\\n" $(git rev-parse HEAD) >expect &&
+   printf "\\033[33m%s\\033[m\\n" $(git rev-parse HEAD) >expect &&
test_cmp expect actual
 '
 
-- 
2.10.0



Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 29.09.2016 um 01:30 schrieb Junio C Hamano:
>> As Peff said, responding in a thread started by Linus's suggestion
>> to raise the default abbreviation to 12 hexdigits:
>
> This is waayy too large for a new default. The vast majority of
> repositories is smallish. For those, the long sequences of hex digits
> are an uglification that is almost unbearable.
>
> I know that kernel developers are important, but their importance has
> long been outnumbered by the anonymous and silent masses of users.
>
> Personally, I use 8 digits just because it is a "rounder" number than
> 7, but in all of my repositories 7 would still work just as well.

Yes, "git log --oneline" looks somewhat different and strange for
me, too ;-)

I am sure I'll get used to it if I keep using it, but I suspect that
I'd be irritated as I find myself typing 'q' more and more often to
"less -S" that is automatically invoked when I do "git log --oneline
master.." to see what commits are on my current topic branch.


Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 09:57:57AM -0700, Junio C Hamano wrote:

> > +wait_for_filter_termination () {
> > +   while ! grep "STOP"  LOGFILENAME >/dev/null
> > +   do
> > +   echo "Waiting for /t0021/rot13-filter.pl to finish..."
> > +   sleep 1
> > +   done
> > +}
> 
> Running "ps" and grepping for a command is not suitable for script
> to reliably tell things, so it is out of question.  Compared to
> that, your version looks slightly better, but what if the machinery
> that being tested, i.e. the part that drives the filter process, is
> buggy or becomes buggy and causes the filter process that writes
> "STOP" to die before it actually writes that string?

I'm of the opinion that any busy-waiting is a good sign that something
is suboptimal. The right solution here seems like it should be signaling
the test script via a descriptor.

I don't necessarily agree, though, that the timing of filter-process
cleanup needs to be part of the public interface. So in your list:

> 3) Git waits until the filter process finishes.

That seems simple and elegant, but I can think of reasons we might not
want to wait (e.g., if the filter has to do some maintenance task and
does not the user to have to wait).

OTOH, we already face this in git, and we solve it by explicitly
backgrounding the maintenance task (i.e., auto-gc). So one could argue
that it is the responsibility of the filter process to manage its own
processes. It certainly makes the interaction with git simpler.

-Peff


Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Lars Schneider

> On 29 Sep 2016, at 18:57, Junio C Hamano  wrote:
> 
> Torsten Bögershausen  writes:
> 
>>> 1) Git exits
>>> 2) The filter process receives EOF and prints "STOP" to the log
>>> 3) t0021 checks the content of the log
>>> 
>>> Sometimes 3 happened before 2 which makes the test fail.
>>> (Example: https://travis-ci.org/git/git/jobs/162660563 )
>>> 
>>> I added a this to wait until the filter process terminates:
>>> 
>>> +wait_for_filter_termination () {
>>> +   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
>>> 2>&1
>>> +   do
>>> +   echo "Waiting for /t0021/rot13-filter.pl to finish..."
>>> +   sleep 1
>>> +   done
>>> +}
>>> 
>>> Does this look OK to you?
>> Do we need the ps at all ?
>> How about this:
>> 
>> +wait_for_filter_termination () {
>> +while ! grep "STOP"  LOGFILENAME >/dev/null
>> +do
>> +echo "Waiting for /t0021/rot13-filter.pl to finish..."
>> +sleep 1
>> +done
>> +}
> 
> Running "ps" and grepping for a command is not suitable for script
> to reliably tell things, so it is out of question.  Compared to
> that, your version looks slightly better, but what if the machinery
> that being tested, i.e. the part that drives the filter process, is
> buggy or becomes buggy and causes the filter process that writes
> "STOP" to die before it actually writes that string?
> 
> I have a feeling that the machinery being tested needs to be fixed
> so that the sequence is always be:
> 
>0) Git spawns the filter process, as it needs some contents to
>   be filtered.
> 
>1) Git did everything it needed to do and decides that is time
>   to go.
> 
>2) Filter process receives EOF and prints "STOP" to the log.
> 
>3) Git waits until the filter process finishes.
> 
>4) t0021, after Git finishes, checks the log.
> 
> Repeated sleep combined with grep is probably just sweeping the real
> problem under the rug.  Do we have enough information to do the
> above?
> 
> An inspiration may be in the way we centrally clean all tempfiles
> and lockfiles before exiting.  We have a central registry of these
> files that need cleaning up and have a single atexit(3) handler to
> clean them up.  Perhaps we need a registry that filter processes
> spawned by the mechanism Lars introduces in this series, and have an
> atexit(3) handler that closes the pipe to them (which signals the
> filters that it is time for them to go) and wait(2) on them, or
> something?  I do not think we want any kill(2) to be involved in
> this clean-up procedure, but I do think we should wait(2) on what we
> spawn, as long as these processes are meant to be shut down when the
> main process of Git exits (this is different from things like
> credential-cache daemon where they are expected to persist and meant
> to serve multiple Git processes).

We discussed that issue in v4 and v6:
http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/
http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/

My impression was that you don't want Git to wait for the filter process.
If Git waits for the filter process - how long should Git wait?

Thanks,
Lars

Re: [PATCH 5/5] log: add --commit-header option

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

> This lets you stick a header right before a commit, but
> suppresses headers that are duplicates. This means you can
> do something like:
>
>   git log --graph --author-date-order --commit-header='== %as =='
>
> to get a marker in the graph whenever the day changes.

That's interesting.  So it is not really "commit" header, but a
header for groups of commits.  Credits for realizing the usefulness
of such grouping may go to Kyle, but the implementation is also
brilliant ;-).

> This probably needs some refactoring around the setup of the
> pretty-print context.



Re: [PATCH 1/4] config: allow customizing /etc/gitconfig location

2016-09-29 Thread Matthieu Moy
Junio C Hamano  writes:

> Jakub Narębski  writes:
>
>> W dniu 29.09.2016 o 01:30, Junio C Hamano pisze:
>>> With a new environment variable GIT_ETC_GITCONFIG, the users can
>>> specify a file that is used instead of /etc/gitconfig to read (and
>>> write) the system-wide configuration.
>>
>> Why it is named GIT_ETC_GITCONFIG (which is Unix-ism), and not
>> GIT_CONFIG_SYSTEM / GIT_CONFIG_SYSTEM_PATH, that is something
>> OS-neutral?
>
> Isn't "environment variable" something that came from POSIX world?

I don't know who invented the concept, but environment variables have
been there in the windows world since it exists I think (it existed in
MS-DOS).

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH/RFC] git log --oneline alternative with dates, times and initials

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

>> Those patches are missing some of the features like showing root commits,
>> handling two letter initials, showing the weekday, inserting a break where
>> needed to avoid parent-child confusion in graph output and properly handling
>> Duy's initials. :)
>
> I'm not too surprised. I literally looked at the first screenshot from
> your output and thought "surely git can do that with some minor tweaks".
> Nor am I surprised that there are cases where the output is funny (99%
> of the time I spent on it was tracking down that graph-padding bug).
>
> I have no problem taking this in contrib or whatever, until a point when
> Git is capable of doing the same thing itself. I just hoped to trick you
> into working on Git. :)

I thought we stopped adding random things to contrib/, though.

Unlike the earlier days of Git, if a custom command that uses Git is
very userful, it can live its own life and flourish within the much
larger Git userbase we have these days.


Re: [PATCH 2/5] pretty: allow formatting names as initials

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 10:31:30AM -0700, Junio C Hamano wrote:

> > When I first tested it with "git log --format=%aS" I had to wonder "who
> > the heck is ntnd?". So using only the first-and-last would match the git
> > project's practice better, at least.
> 
> And there is also "isalpha() good enough?" question.
> 
> I think we have a few Chinese and Hangul as well as Cyrillic names
> in our history, some of them having outside-ascii first letters.
> One of the more prolific contributor's initial is ÆAB ;-)

Heh, true. In case it was not clear, these were mostly quick-and-dirty
patches. I think the right test is probably '!isspace()".

-Peff


Re: [PATCH 2/5] pretty: allow formatting names as initials

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

> Initials are shorter and often unique enough in a
> per-project setting, so they can be used to give a more
> informative version of --oneline.
>
> The 'S' in the placeholder is for "short" (and 's' is
> already taken by DATE_SHORT), but obviously that's pretty
> arcane.
>
> Possibly there should be more customization of initials,
> asking for only 2-letter initials, etc.
>
> Signed-off-by: Jeff King 
> ---
> When I first tested it with "git log --format=%aS" I had to wonder "who
> the heck is ntnd?". So using only the first-and-last would match the git
> project's practice better, at least.

And there is also "isalpha() good enough?" question.

I think we have a few Chinese and Hangul as well as Cyrillic names
in our history, some of them having outside-ascii first letters.
One of the more prolific contributor's initial is ÆAB ;-)

>  pretty.c | 21 +
>  1 file changed, 21 insertions(+)
>
> diff --git a/pretty.c b/pretty.c
> index c532c17..de62405 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -674,6 +674,23 @@ static int mailmap_name(const char **email, size_t 
> *email_len,
>   return mail_map->nr && map_user(mail_map, email, email_len, name, 
> name_len);
>  }
>  
> +static void format_initials(struct strbuf *out, const char *name, size_t len)
> +{
> + int initial = 1;
> + size_t i;
> +
> + for (i = 0; i < len; i++) {
> + char c = name[i];
> + if (isspace(c)) {
> + initial = 1;
> + continue;
> + }
> + if (initial && isalpha(c))
> + strbuf_addch(out, tolower(c));
> + initial = 0;
> + }
> +}
> +
>  static size_t format_person_part(struct strbuf *sb, char part,
>const char *msg, int len,
>const struct date_mode *dmode)
> @@ -702,6 +719,10 @@ static size_t format_person_part(struct strbuf *sb, char 
> part,
>   strbuf_add(sb, mail, maillen);
>   return placeholder_len;
>   }
> + if (part == 'S') {
> + format_initials(sb, name, namelen);
> + return placeholder_len;
> + }
>  
>   if (!s.date_begin)
>   goto skip;


Re: [PATCH 1/4] config: allow customizing /etc/gitconfig location

2016-09-29 Thread Junio C Hamano
Jakub Narębski  writes:

> W dniu 29.09.2016 o 01:30, Junio C Hamano pisze:
>> With a new environment variable GIT_ETC_GITCONFIG, the users can
>> specify a file that is used instead of /etc/gitconfig to read (and
>> write) the system-wide configuration.
>
> Why it is named GIT_ETC_GITCONFIG (which is Unix-ism), and not
> GIT_CONFIG_SYSTEM / GIT_CONFIG_SYSTEM_PATH, that is something
> OS-neutral?

Isn't "environment variable" something that came from POSIX world?


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

>> $ git rev-parse --disambiguate-list=b2e1
>> b2e1196 tag v2.8.0-rc1
>> b2e11d1 tree
>> b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
>> b2e1759 blob
>> b2e18954 blob
>> b2e1895c blob
>
> I think the "right" way to do this is pipe the list of sha1s into
> another git commit which can format them however you want.
> Unfortunately, there isn't a single command that does a great job:
>
>   - "cat-file --batch-check" can show you the sha1 and type, but it
> won't abbreviate sha1s, and it won't show you commit/tag information
>
>   - "log --stdin --no-walk" will format the commit however you like, but
> skips the trees and blobs entirely, and the tag can only be seen via
> "%d"
>
>   - "for-each-ref" has flexible formatting, too, but wants to format
> refs, not objects (and doesn't read from stdin).

- "name-rev" is used to give "describe --contains", and can read
  from its standard input, but has no format customization.
  Another downside of it is that it only wants to see
  committishes.

> IMHO that is a sign that our formatting tools aren't as good as they
> could be (I think the right tool is cat-file, but it should be able to
> do all of the formatting that the other commands can do).
>
> Of course if you really just want human-readable output, then:
>
>   $ git cat-file -e b2e1
>   error: short SHA1 b2e1 is ambiguous
>   hint: The candidates are:
>   hint:   b2e1196 tag v2.8.0-rc1
>   hint:   b2e11d1 tree
>   hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
>   hint:   b2e1759 blob
>   hint:   b2e18954 blob
>   hint:   b2e1895c blob
>   fatal: Not a valid object name b2e1
>
> is pretty easy.

Yes.  I think adding this to rev-parse that is meant for machines is
probably a mistake, as this "hint" machinery's output will become
even more human friendly over time as we gain experience.

 - If the hypothetical "--disambiguate-list" option wants to produce
   machine parseable output for scripts, it would mean its output
   (and whatgver the reading script can do based on its output for
   humans) will become less useful for humans over time.

 - If the hypothetical "--disambiguate-list" option only wants to
   replicate the human readable output that is designed to be
   improved over time and expects its output _not_ to be interpreted
   by scripts but merely be relayed, then why aren't these scripts
   just invoking the commands that already gives the "hint:" output
   and showing that directly to humans in the first place?

> That being said, I don't mind if somebody wanted to do a rev-parse
> option on top of my series. The formatting code is already split into
> its own function.

So let's not go there.


Re: [PATCH v2 02/11] i18n: add--interactive: mark simple here documents for translation

2016-09-29 Thread Junio C Hamano
Vasco Almeida  writes:

> On the other hand, would it make sense to translate these commands? If
> so, we would mark for translation the commands name of @cmd in
> main_loop().
>
>  sub main_loop {
> -   my @cmd = ([ 'status', \_cmd, ],
> -  [ 'update', \_cmd, ],
> -  [ 'revert', \_cmd, ],
> -  [ 'add untracked', \_untracked_cmd, ],
> -  [ 'patch', \_update_cmd, ],
> -  [ 'diff', \_cmd, ],
> -  [ 'quit', \_cmd, ],
> -  [ 'help', \_cmd, ],
> +   my @cmd = ([ __('status'), \_cmd, ],
> +  [ __('update'), \_cmd, ],
> +  [ __('revert'), \_cmd, ],
> +  [ __('add untracked'), \_untracked_cmd, ],
> +  [ __('patch'), \_update_cmd, ],
> +  [ __('diff'), \_cmd, ],
> +  [ __('quit'), \_cmd, ],
> +  [ __('help'), \_cmd, ],

I don't know offhand.  If the code to prompt and accept the command
given by the user can take the translated word (or a prefix of it),
theoretically I would say it could be made to work, but to me it is
dubious the benefit outweighs its downsides.  It would make teaching
Git and troubleshooting over the phone harder, I would guess.

 A: "Hi, I am in a 'git add -i' session."
 B: "Give 's' at the prompt."
 A: "My Git does not seem to take 's' as a valid command."
 B: "What? I've never seen that problem."
 ... back and forth wastes 10 minutes ...
 A: "By the way, I am running Git in Portuguese."

;-)


Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Junio C Hamano
Torsten Bögershausen  writes:

>> 1) Git exits
>> 2) The filter process receives EOF and prints "STOP" to the log
>> 3) t0021 checks the content of the log
>>
>> Sometimes 3 happened before 2 which makes the test fail.
>> (Example: https://travis-ci.org/git/git/jobs/162660563 )
>>
>> I added a this to wait until the filter process terminates:
>>
>> +wait_for_filter_termination () {
>> +while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
>> 2>&1
>> +do
>> +echo "Waiting for /t0021/rot13-filter.pl to finish..."
>> +sleep 1
>> +done
>> +}
>>
>> Does this look OK to you?
> Do we need the ps at all ?
> How about this:
>
> +wait_for_filter_termination () {
> + while ! grep "STOP"  LOGFILENAME >/dev/null
> + do
> + echo "Waiting for /t0021/rot13-filter.pl to finish..."
> + sleep 1
> + done
> +}

Running "ps" and grepping for a command is not suitable for script
to reliably tell things, so it is out of question.  Compared to
that, your version looks slightly better, but what if the machinery
that being tested, i.e. the part that drives the filter process, is
buggy or becomes buggy and causes the filter process that writes
"STOP" to die before it actually writes that string?

I have a feeling that the machinery being tested needs to be fixed
so that the sequence is always be:

0) Git spawns the filter process, as it needs some contents to
   be filtered.

1) Git did everything it needed to do and decides that is time
   to go.

2) Filter process receives EOF and prints "STOP" to the log.

3) Git waits until the filter process finishes.

4) t0021, after Git finishes, checks the log.

Repeated sleep combined with grep is probably just sweeping the real
problem under the rug.  Do we have enough information to do the
above?

An inspiration may be in the way we centrally clean all tempfiles
and lockfiles before exiting.  We have a central registry of these
files that need cleaning up and have a single atexit(3) handler to
clean them up.  Perhaps we need a registry that filter processes
spawned by the mechanism Lars introduces in this series, and have an
atexit(3) handler that closes the pipe to them (which signals the
filters that it is time for them to go) and wait(2) on them, or
something?  I do not think we want any kill(2) to be involved in
this clean-up procedure, but I do think we should wait(2) on what we
spawn, as long as these processes are meant to be shut down when the
main process of Git exits (this is different from things like
credential-cache daemon where they are expected to persist and meant
to serve multiple Git processes).




[PATCH 1/3] add QSORT

2016-09-29 Thread René Scharfe
Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
size of the array elements and supports the convention of initializing
empty arrays with a NULL pointer, which we use in some places.

Calling qsort(3) directly with a NULL pointer is undefined -- even with
an element count of zero -- and allows the compiler to optimize away any
following NULL checks.  Using the macro avoids such surprises.

Add a semantic patch as well to demonstrate the macro's usage and to
automate the transformation of trivial cases.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/qsort.cocci | 19 +++
 git-compat-util.h  |  8 
 2 files changed, 27 insertions(+)
 create mode 100644 contrib/coccinelle/qsort.cocci

diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci
new file mode 100644
index 000..a094e7c
--- /dev/null
+++ b/contrib/coccinelle/qsort.cocci
@@ -0,0 +1,19 @@
+@@
+expression base, nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(*base), compar);
++ QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(base[0]), compar);
++ QSORT(base, nmemb, compar);
+
+@@
+type T;
+T *base;
+expression nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(T), compar);
++ QSORT(base, nmemb, compar);
diff --git a/git-compat-util.h b/git-compat-util.h
index 8aab0c3..d7ed137 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -977,6 +977,14 @@ void git_qsort(void *base, size_t nmemb, size_t size,
 #define qsort git_qsort
 #endif
 
+#define QSORT(base, n, compar) sane_qsort((base), (n), sizeof(*(base)), compar)
+static void inline sane_qsort(void *base, size_t nmemb, size_t size,
+ int(*compar)(const void *, const void *))
+{
+   if (nmemb > 1)
+   qsort(base, nmemb, size, compar);
+}
+
 #ifndef REG_STARTEND
 #error "Git requires REG_STARTEND support. Compile with NO_REGEX=NeedsStartEnd"
 #endif
-- 
2.10.0



[PATCH 3/3] remove unnecessary check before QSORT

2016-09-29 Thread René Scharfe
Add a semantic patch for removing checks similar to the one that QSORT
already does internally and apply it to the code base.

Signed-off-by: Rene Scharfe 
---
 builtin/fmt-merge-msg.c| 10 --
 contrib/coccinelle/qsort.cocci | 18 ++
 sh-i18n--envsubst.c|  3 +--
 3 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index 4976967..efab62f 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -314,12 +314,10 @@ static void add_people_info(struct strbuf *out,
struct string_list *authors,
struct string_list *committers)
 {
-   if (authors->nr)
-   QSORT(authors->items, authors->nr,
- cmp_string_list_util_as_integral);
-   if (committers->nr)
-   QSORT(committers->items, committers->nr,
- cmp_string_list_util_as_integral);
+   QSORT(authors->items, authors->nr,
+ cmp_string_list_util_as_integral);
+   QSORT(committers->items, committers->nr,
+ cmp_string_list_util_as_integral);
 
credit_people(out, authors, 'a');
credit_people(out, committers, 'c');
diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci
index a094e7c..22b93a9 100644
--- a/contrib/coccinelle/qsort.cocci
+++ b/contrib/coccinelle/qsort.cocci
@@ -17,3 +17,21 @@ expression nmemb, compar;
 @@
 - qsort(base, nmemb, sizeof(T), compar);
 + QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- if (nmemb)
+QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- if (nmemb > 0)
+QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- if (nmemb > 1)
+QSORT(base, nmemb, compar);
diff --git a/sh-i18n--envsubst.c b/sh-i18n--envsubst.c
index 3637a2a..c3a2b5a 100644
--- a/sh-i18n--envsubst.c
+++ b/sh-i18n--envsubst.c
@@ -230,8 +230,7 @@ cmp_string (const void *pstr1, const void *pstr2)
 static inline void
 string_list_sort (string_list_ty *slp)
 {
-  if (slp->nitems > 0)
-QSORT(slp->item, slp->nitems, cmp_string);
+  QSORT(slp->item, slp->nitems, cmp_string);
 }
 
 /* Test whether a sorted string list contains a given string.  */
-- 
2.10.0



[PATCH 2/3] use QSORT

2016-09-29 Thread René Scharfe
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code
base, replacing calls of qsort(3) with QSORT.  The resulting code is
shorter and supports empty arrays with NULL pointers.

Signed-off-by: Rene Scharfe 
---
Freshly generated using coccicheck, compiles, survives make test.

 bisect.c |  2 +-
 builtin/describe.c   |  2 +-
 builtin/fast-export.c|  2 +-
 builtin/fmt-merge-msg.c  |  6 ++
 builtin/index-pack.c |  8 +++-
 builtin/mktree.c |  2 +-
 builtin/name-rev.c   |  3 +--
 builtin/pack-objects.c   |  7 +++
 builtin/remote.c |  3 +--
 diff.c   |  6 +++---
 diffcore-delta.c |  5 +
 diffcore-order.c |  2 +-
 diffcore-rename.c|  2 +-
 dir.c|  4 ++--
 fast-import.c|  4 ++--
 fetch-pack.c |  2 +-
 help.c   | 15 +--
 line-log.c   |  2 +-
 pack-bitmap-write.c  |  3 +--
 pack-check.c |  2 +-
 pack-write.c |  3 +--
 pathspec.c   |  3 +--
 ref-filter.c |  2 +-
 refs/files-backend.c |  2 +-
 server-info.c|  2 +-
 sh-i18n--envsubst.c  |  2 +-
 sha1-array.c |  2 +-
 string-list.c|  2 +-
 t/helper/test-dump-untracked-cache.c |  6 ++
 tree.c   |  3 +--
 30 files changed, 44 insertions(+), 65 deletions(-)

diff --git a/bisect.c b/bisect.c
index 6f512c2..21bc6da 100644
--- a/bisect.c
+++ b/bisect.c
@@ -215,7 +215,7 @@ static struct commit_list *best_bisection_sorted(struct 
commit_list *list, int n
array[cnt].distance = distance;
cnt++;
}
-   qsort(array, cnt, sizeof(*array), compare_commit_dist);
+   QSORT(array, cnt, compare_commit_dist);
for (p = list, i = 0; i < cnt; i++) {
char buf[100]; /* enough for dist=%d */
struct object *obj = &(array[i].commit->object);
diff --git a/builtin/describe.c b/builtin/describe.c
index 8a25abe..01490a1 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -352,7 +352,7 @@ static void describe(const char *arg, int last_one)
oid_to_hex(oid));
}
 
-   qsort(all_matches, match_cnt, sizeof(all_matches[0]), compare_pt);
+   QSORT(all_matches, match_cnt, compare_pt);
 
if (gave_up_on) {
commit_list_insert_by_date(gave_up_on, );
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index c0652a7..1e815b5 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -347,7 +347,7 @@ static void show_filemodify(struct diff_queue_struct *q,
 * Handle files below a directory first, in case they are all deleted
 * and the directory changes to a file or symlink.
 */
-   qsort(q->queue, q->nr, sizeof(q->queue[0]), depth_first);
+   QSORT(q->queue, q->nr, depth_first);
 
for (i = 0; i < q->nr; i++) {
struct diff_filespec *ospec = q->queue[i]->one;
diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index dc2e9e4..4976967 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -315,12 +315,10 @@ static void add_people_info(struct strbuf *out,
struct string_list *committers)
 {
if (authors->nr)
-   qsort(authors->items,
- authors->nr, sizeof(authors->items[0]),
+   QSORT(authors->items, authors->nr,
  cmp_string_list_util_as_integral);
if (committers->nr)
-   qsort(committers->items,
- committers->nr, sizeof(committers->items[0]),
+   QSORT(committers->items, committers->nr,
  cmp_string_list_util_as_integral);
 
credit_people(out, authors, 'a');
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4a8b4ae..7657d0a 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1190,10 +1190,8 @@ static void resolve_deltas(void)
return;
 
/* Sort deltas by base SHA1/offset for fast searching */
-   qsort(ofs_deltas, nr_ofs_deltas, sizeof(struct ofs_delta_entry),
- compare_ofs_delta_entry);
-   qsort(ref_deltas, nr_ref_deltas, sizeof(struct ref_delta_entry),
- compare_ref_delta_entry);
+   QSORT(ofs_deltas, nr_ofs_deltas, compare_ofs_delta_entry);
+   QSORT(ref_deltas, nr_ref_deltas, compare_ref_delta_entry);
 
if (verbose || show_resolving_progress)
progress = start_progress(_("Resolving deltas"),
@@ -1356,7 

Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 07:36:27AM -0700, Kyle J. McKay wrote:

> On Sep 29, 2016, at 06:24, Jeff King wrote:
> 
> > > If you are doing "git show 235234" it should pick the tag (if it
> > > peels to a
> > > committish) because Git has already set a precedent of preferring
> > > tags over
> > > commits when it disambiguates ref names and otherwise pick the
> > > commit.
> > 
> > I'm not convinced that picking the tag is actually helpful in this case;
> > I agree with Linus that feeding something to "git show" almost always
> > wants to choose the commit.
> 
> Since "git show" peels tags you end up seeing the commit it refers to
> (assuming it's a committish tag).

Yes, but it's almost certainly _not_ the commit you meant. From your
example:

>c512b03:
>   c512b035556eff4d commit Merge branch 'rc/maint-reflog-msg-for-forced
>   c512b0344196931a tag(v0.99.9a) GIT 0.99.9a

If I'm looking for the commit c512b03, then it almost certainly isn't
v0.99.9a. That tag's commit is e634aec. Or another way of thinking about
it: you want to guess what the _writer_ of the note meant. Why would
somebody write "c512b03" when they could have written "v0.99.9a"? And
they certainly would not have written it if they meant "e634aec". :)

> > I also don't think tag ambiguity in short sha1s is all that interesting.
> 
> The Linux repository has this:
> 
>901069c:
>   901069c71415a76d commit iwlagn: change Copyright to 2011
>   901069c5c5b15532 tag(v2.6.38-rc4) Linux 2.6.38-rc4

Sure, I'm not surprised there's a collision. But I'd expect those to be
a tiny fraction of collisions. Here's the breakdown of object types in
my clone of linux.git:

  $ git cat-file --batch-all-objects --batch-check='%(objecttype)' |
sort | uniq -c
  1421198 blob
   618073 commit
  479 tag
  2877913 tree

That's a hundredth of a percent tag objects.  The chance that you have
_a_ 7-hex collision with a tag is relatively high. But the chance that
any given collision involves a tag is rather small.

-Peff


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Kyle J. McKay

On Sep 29, 2016, at 06:24, Jeff King wrote:

If you are doing "git show 235234" it should pick the tag (if it  
peels to a
committish) because Git has already set a precedent of preferring  
tags over
commits when it disambiguates ref names and otherwise pick the  
commit.


I'm not convinced that picking the tag is actually helpful in this  
case;

I agree with Linus that feeding something to "git show" almost always
wants to choose the commit.


Since "git show" peels tags you end up seeing the commit it refers to  
(assuming it's a committish tag).


I also don't think tag ambiguity in short sha1s is all that  
interesting.


The Linux repository has this:

   901069c:
  901069c71415a76d commit iwlagn: change Copyright to 2011
  901069c5c5b15532 tag(v2.6.38-rc4) Linux 2.6.38-rc4

Since that tag peels to a commit, it seems like it would be incorrect  
to pick the commit over the tag when you're looking for a committish.


Either 901069c should resolve to the tag (which gets peeled to the  
commit) or it should error out with the hint messages.


The Git repository has this:

   c512b03:
  c512b035556eff4d commit Merge branch 'rc/maint-reflog-msg-for- 
forced

  c512b0344196931a tag(v0.99.9a) GIT 0.99.9a

So perhaps it's a little bit more interesting than it first appears.  :)

--Kyle


Re: [PATCH v2 02/11] i18n: add--interactive: mark simple here documents for translation

2016-09-29 Thread Vasco Almeida
A Dom, 25-09-2016 às 15:54 -0700, Junio C Hamano escreveu:
> >  sub status_cmd {
> > @@ -1573,14 +1573,14 @@ sub quit_cmd {
> >  }
> >  
> >  sub help_cmd {
> > - print colored $help_color, <<\EOF ;
> > -status    - show paths with changes
> > + print colored $help_color, __(
> > +"status    - show paths with changes
> >  update    - add working tree state to the staged set of
> changes
> >  revert    - revert staged set of changes back to the HEAD
> version
> >  patch - pick hunks and update selectively
> >  diff   - view diff between HEAD and index
> > -add untracked - add contents of untracked files to the staged set
> of changes
> > -EOF
> > +add untracked - add contents of untracked files to the staged set
> of changes"),
> > +"\n";
> >  }
> 
> Do we need TRANSLATORS: comment to all of the above not to touch the
> command words that are explained and translate only the explanation?

Yes, it is better to have that comment.

On the other hand, would it make sense to translate these commands? If
so, we would mark for translation the commands name of @cmd in
main_loop().

 sub main_loop {
-   my @cmd = ([ 'status', \_cmd, ],
-  [ 'update', \_cmd, ],
-  [ 'revert', \_cmd, ],
-  [ 'add untracked', \_untracked_cmd, ],
-  [ 'patch', \_update_cmd, ],
-  [ 'diff', \_cmd, ],
-  [ 'quit', \_cmd, ],
-  [ 'help', \_cmd, ],
+   my @cmd = ([ __('status'), \_cmd, ],
+  [ __('update'), \_cmd, ],
+  [ __('revert'), \_cmd, ],
+  [ __('add untracked'), \_untracked_cmd, ],
+  [ __('patch'), \_update_cmd, ],
+  [ __('diff'), \_cmd, ],
+  [ __('quit'), \_cmd, ],
+  [ __('help'), \_cmd, ],


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 06:01:51AM -0700, Kyle J. McKay wrote:

> But perhaps it makes sense to actually pick one if there's only one
> disambiguation of the type you're looking for.
> 
> For example given:
> 
> 235234a blob
> 2352347 tag
> 235234f tree
> 2352340 commit
> 
> If you are doing "git cat-file blob 235234" it should pick the blob and spit
> out a warning (and similarly for other cat-file types).  But "git cat-file
> -p 235234" would give the fatal error with the disambiguation hints because
> it wants type "any".

That code is already there; it's just a matter of whether git has enough
information to know the context. E.g. (in git.git):

  $ git show b2e11
  error: short SHA1 b2e11 is ambiguous
  hint: The candidates are:
  hint:   b2e1196 tag v2.8.0-rc1
  hint:   b2e11d1 tree
  ...

  $ git log b2e11
  commit ab5d01a29eb7380ceab070f0807c2939849c44bc (tag: v2.8.0-rc1)
  ...

The "show" command can show anything, but "log" really wants
committishes, so it's able to disambiguate. It looks like cat-file never
learned to feed its context, but it's probably something like this:

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 94e67eb..ecbb959 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -56,12 +56,22 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
struct object_info oi = {NULL};
struct strbuf sb = STRBUF_INIT;
unsigned flags = LOOKUP_REPLACE_OBJECT;
+   unsigned sha1_flags = 0;
const char *path = force_path;
 
if (unknown_type)
flags |= LOOKUP_UNKNOWN_OBJECT;
 
-   if (get_sha1_with_context(obj_name, 0, oid.hash, _context))
+   if (exp_type) {
+   if (!strcmp(exp_type, "commit"))
+   sha1_flags |= GET_SHA1_COMMITTISH;
+   else if(!strcmp(exp_type, "tree"))
+   sha1_flags |= GET_SHA1_TREEISH;
+   else if(!strcmp(exp_type, "blob"))
+   sha1_flags |= GET_SHA1_BLOB;
+   }
+
+   if (get_sha1_with_context(obj_name, sha1_flags, oid.hash, _context))
die("Not a valid object name %s", obj_name);
 
if (!path)

> If you are doing "git show 235234" it should pick the tag (if it peels to a
> committish) because Git has already set a precedent of preferring tags over
> commits when it disambiguates ref names and otherwise pick the commit.

I'm not convinced that picking the tag is actually helpful in this case;
I agree with Linus that feeding something to "git show" almost always
wants to choose the commit.

I also don't think tag ambiguity in short sha1s is all that interesting.
There are a tiny number of tag objects. Most of your collisions are
going to be with trees or blobs, which should generally outnumber
commits by a factor of 5-10, though it depends on your workflow (git.git
does not have a deep tree, so it's only a factor of 4).

And if you just want to choose a committish over trees and blobs, well,
then; I invite you to check out the core.disambiguate patch I sent
elsewhere in the thread. :)

-Peff


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 04:46:19AM -0700, Kyle J. McKay wrote:

> This hint: information is excellent.  There needs to be a way to show it on
> demand.
> 
> $ git rev-parse --disambiguate=b2e1
> b2e11962c5e6a9c81aa712c751c83a743fd4f384
> b2e11d1bb40c5f81a2f4e37b9f9a60ec7474eeab
> b2e163272c01aca4aee4684f5c683ba341c1953d
> b2e18954c03ff502053cb74d142faab7d2a8dacb
> b2e1895ca92ec2037349d88b945ba64ebf16d62d
> 
> Not nearly so helpful, but the operation of --disambiguate cannot be changed
> without breaking current scripts.
> 
> Can your excellent "hint:" output above be attached to the --disambiguate
> option somehow, please.  Something like this perhaps:
> 
> $ git rev-parse --disambiguate-list=b2e1
> b2e1196 tag v2.8.0-rc1
> b2e11d1 tree
> b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
> b2e1759 blob
> b2e18954 blob
> b2e1895c blob

I think the "right" way to do this is pipe the list of sha1s into
another git commit which can format them however you want.
Unfortunately, there isn't a single command that does a great job:

  - "cat-file --batch-check" can show you the sha1 and type, but it
won't abbreviate sha1s, and it won't show you commit/tag information

  - "log --stdin --no-walk" will format the commit however you like, but
skips the trees and blobs entirely, and the tag can only be seen via
"%d"

  - "for-each-ref" has flexible formatting, too, but wants to format
refs, not objects (and doesn't read from stdin).

IMHO that is a sign that our formatting tools aren't as good as they
could be (I think the right tool is cat-file, but it should be able to
do all of the formatting that the other commands can do).

Of course if you really just want human-readable output, then:

  $ git cat-file -e b2e1
  error: short SHA1 b2e1 is ambiguous
  hint: The candidates are:
  hint:   b2e1196 tag v2.8.0-rc1
  hint:   b2e11d1 tree
  hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
  hint:   b2e1759 blob
  hint:   b2e18954 blob
  hint:   b2e1895c blob
  fatal: Not a valid object name b2e1

is pretty easy.

That being said, I don't mind if somebody wanted to do a rev-parse
option on top of my series. The formatting code is already split into
its own function.

-Peff


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Kyle J. McKay

On Sep 26, 2016, at 09:36, Linus Torvalds wrote:


On Mon, Sep 26, 2016 at 5:00 AM, Jeff King  wrote:


This patch teaches get_short_sha1() to list the sha1s of the
objects it found, along with a few bits of information that
may help the user decide which one they meant.


This looks very good to me, but I wonder if it couldn't be even more  
aggressive.


In particular, the only hashes that most people ever use in short form
are commit hashes. Those are the ones you'd use in normal human
interactions to point to something happening.

So when the disambiguation notices that there is ambiguity, but there
is only _one_ commit, maybe it should just have an aggressive mode
that says "use that as if it wasn't ambiguous".


If you have this:

faa23ec9b437812ce2fc9a5b3d59418d672debc1 refs/heads/ambig
7f40afe646fa3f8a0f361b6f567d8f7d7a184c10 refs/tags/ambig

and you do this:

$ git rev-parse ambig
warning: refname 'ambig' is ambiguous.
7f40afe646fa3f8a0f361b6f567d8f7d7a184c10

Git automatically prefers the tag over the branch, but it does spit  
out a warning.



And then have an explicit command (or flag) to do disambiguation for
when you explicitly want it.


I think you don't even need that.  Git already does disambiguation for  
ref names, picks one and spits out a warning.


Why not do the same for short hash names when it makes sense?


Rationale: you'd never care about short forms for tags. You'd just use
the tag name. And while blob ID's certainly show up in short form in
diff output (in the "index" line), very few people will use them. And
tree hashes are basically never seen outside of any plumbing commands
and then seldom in shortened form.

So I think it would make sense to default to a mode that just picks
the commit hash if there is only one such hash. Sure, some command
might want a "treeish", but a commit is still more likely than a tree
or a tag.

But regardless, this series looks like a good thing.


I like it too.

But perhaps it makes sense to actually pick one if there's only one  
disambiguation of the type you're looking for.


For example given:

235234a blob
2352347 tag
235234f tree
2352340 commit

If you are doing "git cat-file blob 235234" it should pick the blob  
and spit out a warning (and similarly for other cat-file types).  But  
"git cat-file -p 235234" would give the fatal error with the  
disambiguation hints because it wants type "any".


If you are doing "git show 235234" it should pick the tag (if it peels  
to a committish) because Git has already set a precedent of preferring  
tags over commits when it disambiguates ref names and otherwise pick  
the commit.


Lets consider this approach using the stats for the Linux kernel:


Ambiguous prefix length 7 counts:
  prefixes:   44733
   objects:   89766

Ambiguous length 11 (but not at length 12) info:
  prefixes:   2
  0 (with 1 or more commit disambiguations)

Ambiguous length 10 (but not at length 11) info:
  prefixes:  12
  3 (with 1 or more commit disambiguations)
  0 (with 2 or more commit disambiguations)

Ambiguous length 9 (but not at length 10) info:
  prefixes: 186
 43 (with 1 or more commit disambiguations)
  1 (with 2 or more commit disambiguations)

Ambiguous length 8 (but not at length 9) info:
  prefixes:2723
651 (with 1 or more commit disambiguations)
 40 (with 2 or more commit disambiguations)

Ambiguous length 7 (but not at length 8) info:
  prefixes:   41864
   9842 (with 1 or more commit disambiguations)
680 (with 2 or more commit disambiguations)


Of the 44733 ambiguous length 7 prefixes, only about 10539 of them  
disambiguate into one or more commit objects.


But if we apply the "spit a warning and prefer a commit object if  
there's only one and you're looking for a committish" rule, that drops  
the number from 10539 to about 721.  In other words, only about 7% of  
the previously ambiguous short commit SHA1 prefixes would continue to  
be ambiguous at length 7.  In fact it almost makes a prefix length of  
9 good enough, there's just the one at length 9 that disambiguates  
into more than one commit (45f014c52).


--Kyle


Re: Changing the default for "core.abbrev"?

2016-09-29 Thread Kyle J. McKay

On Sep 25, 2016, at 18:39, Linus Torvalds wrote:


The kernel, these days, is at roughly 5 million objects, and while the
seven hex digits are still often enough for uniqueness (and git will
always add digits *until* it is unique), it's long been at the point
where I tell people to do

   git config --global core.abbrev 12

because even though git will extend the seven hex digits until the
object name is unique, that only reflects the *current* situation in
the repository. With 5 million objects and a very healthy growth rate,
a 7-8 hex digit number that is unique today is not necessarily unique
a month or two from now, and then it gets annoying when a commit
message has a short git ID that is no longer unique when you go back
and try to figure out what went wrong in that commit.


On Sep 25, 2016, at 20:46, Junio C Hamano wrote:


Linus Torvalds  writes:


I can just keep reminding kernel maintainers and developers to update
their git config, but maybe it would be a good idea to just admit  
that

the defaults picked in 2005 weren't necessarily the best ones
possible, and those could be bumped up a bit?


I am not quite sure how good any new default would be, though.  Just
like any timeout is not long enough for somebody, growing projects
will eventually hit whatever abbreviation length they start with.


This made me curious what the situation is really like.  So I crunched  
some data.


Using a recent clone of $korg/torvalds/linux:

$ git rev-parse --verify d597639e203
error: short SHA1 d597639e203 is ambiguous.
fatal: Needed a single revision

So the kernel already has 11-character "short" SHA1s that are  
ambiguous.  Is a core.abbrev setting of 12 really good enough?


Here are the stats on the kernel's repository:

Ambiguous length 11 (but not at length 12) info:
  prefixes:   2
  0 (with 1 or more commit disambiguations)

Ambiguous length 10 (but not at length 11) info:
  prefixes:  12
  3 (with 1 or more commit disambiguations)
  0 (with 2 or more commit disambiguations)

Ambiguous length 9 (but not at length 10) info:
  prefixes: 186
 43 (with 1 or more commit disambiguations)
  1 (with 2 or more commit disambiguations)
  0 (with 3 or more disambiguations)

Ambiguous length 8 (but not at length 9) info:
  prefixes:2723
651 (with 1 or more commit disambiguations)
 40 (with 2 or more commit disambiguations)
  1 (with 3 or more disambiguations)
  maxambig:   3 (there is 1 of them)

Ambiguous length 7 (but not at length 8) info:
  prefixes:   41864
   9842 (with 1 or more commit disambiguations)
680 (with 2 or more commit disambiguations)
299 (with 3 or more disambiguations)
  maxambig:   3 (there are 299 of them)

The "maxambig" value is the maximum number of disambiguations for any  
single prefix at that prefix length.  So for prefixes of length 7  
there are 299 that disambiguate into 3 objects.


Just out of curiosity, generating stats on the Git repository gives:

Ambiguous length 8 (but not at length 9) info:
  prefixes:   7
  3 (with 1 or more commit disambiguations)
  2 (with 2 or more commit disambiguations)
  0 (with 3 or more disambiguations)

Ambiguous length 7 (but not at length 8) info:
  prefixes:  87
 36 (with 1 or more commit disambiguations)
  3 (with 2 or more commit disambiguations)
  0 (with 3 or more disambiguations)

Running the stats on $github/gitster/git produces some ambiguous  
length 9 prefixes (one of which contains a commit disambiguation).


--Kyle


Re: [PATCH/RFC] git log --oneline alternative with dates, times and initials

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 04:00:06AM -0700, Kyle J. McKay wrote:

> > Each of those commits[1] needs some minor polish, and as I'm not really
> > that interested in fancy log output myself, I don't plan on working on
> > them further. I was mostly curious just how close we were. But if you'd
> > like to pursue it, feel free to use them as a starting point.
> 
> Those patches are missing some of the features like showing root commits,
> handling two letter initials, showing the weekday, inserting a break where
> needed to avoid parent-child confusion in graph output and properly handling
> Duy's initials. :)

I'm not too surprised. I literally looked at the first screenshot from
your output and thought "surely git can do that with some minor tweaks".
Nor am I surprised that there are cases where the output is funny (99%
of the time I spent on it was tracking down that graph-padding bug).

I have no problem taking this in contrib or whatever, until a point when
Git is capable of doing the same thing itself. I just hoped to trick you
into working on Git. :)

-Peff


Re: [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits

2016-09-29 Thread SZEDER Gábor


Quoting Jeff King :


On Thu, Sep 29, 2016 at 04:44:00AM +0200, SZEDER Gábor wrote:


> So 12 seems reasonable, and the only downside for it (or for "13", for
> that matter) is a few extra bytes. I dunno, maybe people will really
> hate that, but I have a feeling these are mostly  
cut-and-pasted anyway.


I for one raise my hand in protest...

"few extra bytes" is not the only downside, and it's not at all about
how many characters are copy-and-pasted.  In my opinion it's much more
important that this change wastes 5 columns worth of valuable screen
real estate e.g. for 'git blame' or 'git log --oneline' in projects
that don't need it and certainly won't ever need it.


True. The core of the issue is that we really only care about this
minimum length when _storing_ an abbreviation, but we don't know when
the user is just looking at it in the moment, and when they are going to
stick it in a commit message, email, or bug tracker.

In an ideal world, anybody who was about to store it would run "git
describe" or something to come up with some canonical reference format.
And we could just bump the default minimum there. Personally, I almost
exclusively cite commits as the output of:

  git log -1 --pretty='tformat:%h (%s, %ad)' --date=short


Interesting, I have a pretty format alias that looks almost like this,
except that I carry a patch locally allowing me to say %as for short
date format :)

What I sometimes wished for is a pretty format specifier for 'git
describe --contains', which would make it convenient to cite commits
like this: v0.99~954 (Initial revision of "git", the information manager
from hell, 2005-04-07).  It's better than the abbreviated object name,
because it will stay unique, assuming that the chosen tag is never
deleted, and it carries extra information for humans (the first release
containing the referenced commit), while the abbreviated object name is
completely meaningless.

The obvious drawback that makes it a non-solution for the problem at
hand is that this format can only refer to commits that are reachable
from a tag and can't be used for commits that are descendants of the
most recent tag, e.g. when fixing a bug introduced after the last
release.  Oh, and the user has to fetch the tag first to be able to
make sense of such a reference.


and I'd be fine to stick "--abbrev=12" in there for future-proofing. But
I don't know what the kernel or other projects do.

I'd also be curious to know if the patch I sent in [1] to more
aggressively prefer commits would make this less of an issue, and people
wouldn't care as much about using longer hashes in the first place. So
one option is to merge that (and possibly even make it the default) and
see if people still care in 6 months.

-Peff

[1]  
http://public-inbox.org/git/20160927123801.3bpdg3hap3kzz...@sigill.intra.peff.net/





Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Torsten Bögershausen



On 29/09/16 12:28, Lars Schneider wrote:

On 28 Sep 2016, at 23:49, Junio C Hamano  wrote:

I suspect that you are preparing a reroll already, but the one that
is sitting in 'pu' seems to be flaky in t/t0021 and I seem to see
occasional failures from it.

I didn't trace where the test goes wrong, but one easy mistake you
could make (I am not saying that is the reason of the failure) is to
assume your filter will not be called under certain condition (like
immediately after you checked out from the index to the working
tree), when the automated test goes fast enough and get you into a
"racy git" situation---the filter may be asked to filter the
contents from the working tree again to re-validate what's there is
still what is in the index.

Thanks for the heads-up!

This is what happens:

1) Git exits
2) The filter process receives EOF and prints "STOP" to the log
3) t0021 checks the content of the log

Sometimes 3 happened before 2 which makes the test fail.
(Example: https://travis-ci.org/git/git/jobs/162660563 )

I added a this to wait until the filter process terminates:

+wait_for_filter_termination () {
+   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
2>&1
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Does this look OK to you?

Do we need the ps at all ?
How about this:

+wait_for_filter_termination () {
+   while ! grep "STOP"  LOGFILENAME >/dev/null
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}




Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Kyle J. McKay

On Sep 26, 2016, at 05:00, Jeff King wrote:


 $ git rev-parse b2e1
 error: short SHA1 b2e1 is ambiguous
 hint: The candidates are:
 hint:   b2e1196 tag v2.8.0-rc1
 hint:   b2e11d1 tree
 hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit- 
options'

 hint:   b2e1759 blob
 hint:   b2e18954 blob
 hint:   b2e1895c blob
 fatal: ambiguous argument 'b2e1': unknown revision or path not in  
the working tree.

 Use '--' to separate paths from revisions, like this:
 'git  [...] -- [...]'


This hint: information is excellent.  There needs to be a way to show  
it on demand.


$ git rev-parse --disambiguate=b2e1
b2e11962c5e6a9c81aa712c751c83a743fd4f384
b2e11d1bb40c5f81a2f4e37b9f9a60ec7474eeab
b2e163272c01aca4aee4684f5c683ba341c1953d
b2e18954c03ff502053cb74d142faab7d2a8dacb
b2e1895ca92ec2037349d88b945ba64ebf16d62d

Not nearly so helpful, but the operation of --disambiguate cannot be  
changed without breaking current scripts.


Can your excellent "hint:" output above be attached to the -- 
disambiguate option somehow, please.  Something like this perhaps:


$ git rev-parse --disambiguate-list=b2e1
b2e1196 tag v2.8.0-rc1
b2e11d1 tree
b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
b2e1759 blob
b2e18954 blob
b2e1895c blob

Any option name will do, --disambiguate-verbose, --disambiguate- 
extended, --disambiguate-long, --disambiguate-log, --disambiguate- 
help, --disambiguate-show-me-something-useful-to-humans-not-scripts ...


--Kyle


  1   2   >