Re: [PATCH 00/15] name-rev: eliminate recursion

2019-09-20 Thread SZEDER Gábor
On Fri, Sep 20, 2019 at 11:37:12AM -0400, Derrick Stolee wrote:
> On 9/19/2019 5:46 PM, SZEDER Gábor wrote:
> > 'git name-rev' is implemented using a recursive algorithm, and,
> > consequently, it can segfault in deep histories (e.g. WebKit), and
> > thanks to a test case demonstrating this limitation every test run
> > results in a dmesg entry logging the segfaulting git process.
> > 
> > This patch series eliminates the recursion.
> 
> A noble goal! Recursion into commit history is much easier to get
> stack overflows than when we recurse into the directory hierarchy.
> 
> > Patches 1-5 and 14-15 are while-at-it cleanups I noticed on the way,
> > and patch 6 improves test coverage.
> 
> These cleanups are nice, and I think I followed them pretty closely.
>  
> > Patches 7-11 are preparatory refactorings that are supposed to make
> > this series easier to follow, and make patch 12, the one finally
> > eliminating the recursion, somewhat shorter, and even much shorter
> > when viewed with '--ignore-all-space'.  Patch 13 cleans up after those
> > preparatory steps.
> 
> I responded to several of these, mostly with questions and not actual
> recommendations. I do want to apply your patches locally so I can try
> this --ignore-all-space trick to really be sure patch 12 is doing the
> right thing.

  git fetch https://github.com/szeder/git name-rev-no-recursion

(But this is sort of a v1.1, as it already includes René's suggestion
for patch 3.)



Re: [PATCH v2] merge-recursive: symlink's descendants not in way

2019-09-20 Thread Junio C Hamano
Jonathan Tan  writes:

> When the working tree has:
>  - bar (directory)
>  - bar/file (file)
>  - foo (symlink to .)
>
> (note that lstat() for "foo/bar" would tell us that it is a directory)
>
> and the user merges a commit that deletes the foo symlink and instead
> contains:
>  - bar (directory, as above)
>  - bar/file (file, as above)
>  - foo (directory)
>  - foo/bar (file)
>
> the merge should happen without requiring user intervention.

Thanks.  That clears my previous confusion.  It is clear that the
desired outcome is that bar/file will be merged with itself, foo
itself will resolve to "deleted", and foo/bar will be created.

> However, this does not happen.

OK.  We notice that we need to newly create foo/bar but we
incorrectly find that there is "foo/bar" already because of the
careless use of bare lstat(2) makes "bar" visible as if it were also
"foo/bar".  I wonder if the current code would be confused the same
way if the side branch added "foo/bar/file", or the confusion would
be even worse---it is not dir_in_way() and a different codepath
would be affected, no?

> diff --git a/merge-recursive.c b/merge-recursive.c
> index 6b812d67e3..22a12cfeba 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -764,7 +764,8 @@ static int dir_in_way(struct index_state *istate, const 
> char *path,
>  
>   strbuf_release(&dirpath);
>   return check_working_copy && !lstat(path, &st) && S_ISDIR(st.st_mode) &&
> - !(empty_ok && is_empty_dir(path));
> + !(empty_ok && is_empty_dir(path)) &&
> + !has_symlink_leading_path(path, strlen(path));

As the new call is fairly heavyweight compared to everything else we
are doing, I think it is very sensible to have this at the end, like
this patch does.

Nicely done.  Thanks, will queue.


Re: [PATCH 10/15] name-rev: restructure creating/updating 'struct rev_name' instances

2019-09-20 Thread SZEDER Gábor
On Fri, Sep 20, 2019 at 11:27:49AM -0400, Derrick Stolee wrote:
> On 9/19/2019 5:47 PM, SZEDER Gábor wrote:
> > At the beginning of the recursive name_rev() function it creates a new
> > 'struct rev_name' instance for each previously unvisited commit or, if
> > this visit results in better name for an already visited commit, then
> > updates the 'struct rev_name' instance attached to to the commit, or
> > returns early.
> > 
> > Restructure this so it's caller creates or updates the 'struct
> > rev_name' instance associated with the commit to be passed as
> > parameter, i.e. both name_ref() before calling name_rev() and
> > name_rev() itself as it iterates over the parent commits.
> > 
> > This makes eliminating the recursion a bit easier to follow, and it
> > will be moved back to name_rev() after the recursion is eliminated.
> > 
> > This change also plugs the memory leak that was temporarily unplugged
> > in the earlier "name-rev: pull out deref handling from the recursion"
> > patch in this series.
> [snip]
> >  
> > @@ -276,11 +277,17 @@ static int name_ref(const char *path, const struct 
> > object_id *oid, int flags, vo
> > path = name_ref_abbrev(path, can_abbreviate_output);
> > if (commit->date >= cutoff) {
> > const char *tip_name;
> > +   char *to_free = NULL;
> > if (deref)
> > -   tip_name = xstrfmt("%s^0", path);
> > +   tip_name = to_free = xstrfmt("%s^0", path);
> > else
> > tip_name = xstrdup(path);
> 
> So this xstrdup(path) is not a leak?

Well... yes, everything is leaked, eventually ;)

First of all, name_ref() is a callback function invoked by
for_each_ref(), and 'path' is one of its parameters.  This means
that we must duplicate this string, because we might need to access it
even after iterating over refs is over (when displaying the name of
the commits).

This copy then becomes the initial name_rev() invocation's 'tip_name'
parameter, and things get a bit harder to reason about because of
merges, the recursion, and being in the middle of the refactorings...

So for argument's sake look at current master's 'builtin/name-rev.c',
and assume that it has to deal with a linear history with a single
branch, and no tags.  When name_ref() is invoked with a branch, then
deref = 0, so in name_rev() the condition:

  if (deref) {
tip_name = to_free = xstrfmt("%s^0", tip_name);

won't be fulfilled, and 'tip_name' remains unchanged.  Then comes the
initialization of the 'struct rev_name' associated with the tip commit
in the commit slab, including the

  name->tip_name = tip_name;

assignment, IOW we now have a pointer to the original 'tip_name' in
the commit slab.

Then name_rev() looks at the tip commit's parents, or rather, because
of the linear history, at its only parent.  This means that neither of
the other two xstrfmt() will be invoked, and name_rev() will be
recursively invoked with the original 'tip_name' pointer.  This will
then initialize another 'struct rev_name' instance, including yet
another pointer to the original 'tip_name'.

This will go on until the root commit is reached, and in the end every
commit will have an associated 'struct rev_name' instance with a
pointer to the original 'tip_name' string.

At this point we're done with the recursion, and since the repo has
only a single branch, we're done with for_each_ref() as well, and
finally print names of the commits.

Then it's time to clean up and free() memory, but:

  - we could only release the commit slab and free() all 'struct
rev_name' instances within, but we can't free() the
'name->tip_name' pointers, because, as shown above, we might have
a bunch of pointers pointing to the same string.

  - we are at the end of cmd_name_rev() and are about to exit the
program, so the OS will release any and all resources anyway.
Yeah, it's far from ready to be turned into a library call...

Now, in a real repository we'll have multiple branches, tags, and
merges, so there will be cases when an existing 'struct rev_name'
instance is updated, and it's 'name->tip_name' is overwritten by a
better name.  At that point the old value is potentially leaked, but
we can't really do anything about it, because we don't know whether
any other 'struct rev_name' instances point to it.

Eliminating the recursion doesn't change anything in this respect.




Re: [PATCH 03/15] name-rev: use strip_suffix() in get_rev_name()

2019-09-20 Thread SZEDER Gábor
On Fri, Sep 20, 2019 at 06:36:30PM +0200, René Scharfe wrote:
> Am 19.09.19 um 23:46 schrieb SZEDER Gábor:
> > Use strip_suffix() instead of open-coding it, making the code more
> > idiomatic.
> >
> > Signed-off-by: SZEDER Gábor 
> > ---
> >  builtin/name-rev.c | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> > index c785fe16ba..d345456656 100644
> > --- a/builtin/name-rev.c
> > +++ b/builtin/name-rev.c
> > @@ -317,11 +317,11 @@ static const char *get_rev_name(const struct object 
> > *o, struct strbuf *buf)
> > if (!n->generation)
> > return n->tip_name;
> > else {
> > -   int len = strlen(n->tip_name);
> > -   if (len > 2 && !strcmp(n->tip_name + len - 2, "^0"))
> > -   len -= 2;
> > +   size_t len;
> > +   strip_suffix(n->tip_name, "^0", &len);
> > strbuf_reset(buf);
> > -   strbuf_addf(buf, "%.*s~%d", len, n->tip_name, n->generation);
> > +   strbuf_addf(buf, "%.*s~%d", (int) len, n->tip_name,
> > +   n->generation);
> > return buf->buf;
> > }
> >  }
> >
> 
> This gets rid of the repeated magic string length constant 2, which is
> nice.  But why not go all the way to full strbuf-ness?  It's shorter,
> looks less busy, and the extra two copied bytes shouldn't matter in a
> measurable way.
> 
>   else {
>   strbuf_reset(buf);
>   strbuf_addstr(buf, n->tip_name);
>   strbuf_strip_suffix(buf, "^0");
>   strbuf_addf(buf, "~%d", n->generation);
>   return buf->buf;
>   }

Oh, I like this, thanks!



Re: [PATCH 03/15] name-rev: use strip_suffix() in get_rev_name()

2019-09-20 Thread René Scharfe
Am 19.09.19 um 23:46 schrieb SZEDER Gábor:
> Use strip_suffix() instead of open-coding it, making the code more
> idiomatic.
>
> Signed-off-by: SZEDER Gábor 
> ---
>  builtin/name-rev.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index c785fe16ba..d345456656 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -317,11 +317,11 @@ static const char *get_rev_name(const struct object *o, 
> struct strbuf *buf)
>   if (!n->generation)
>   return n->tip_name;
>   else {
> - int len = strlen(n->tip_name);
> - if (len > 2 && !strcmp(n->tip_name + len - 2, "^0"))
> - len -= 2;
> + size_t len;
> + strip_suffix(n->tip_name, "^0", &len);
>   strbuf_reset(buf);
> - strbuf_addf(buf, "%.*s~%d", len, n->tip_name, n->generation);
> + strbuf_addf(buf, "%.*s~%d", (int) len, n->tip_name,
> + n->generation);
>   return buf->buf;
>   }
>  }
>

This gets rid of the repeated magic string length constant 2, which is
nice.  But why not go all the way to full strbuf-ness?  It's shorter,
looks less busy, and the extra two copied bytes shouldn't matter in a
measurable way.

else {
strbuf_reset(buf);
strbuf_addstr(buf, n->tip_name);
strbuf_strip_suffix(buf, "^0");
strbuf_addf(buf, "~%d", n->generation);
return buf->buf;
}



Re: [PATCH 05/15] name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation

2019-09-20 Thread René Scharfe
Am 19.09.19 um 23:47 schrieb SZEDER Gábor:
> Signed-off-by: SZEDER Gábor 
> ---
>  builtin/name-rev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index e406ff8e17..dec2228cc7 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -98,7 +98,7 @@ static void name_rev(struct commit *commit,
>   }
>
>   if (name == NULL) {
> - name = xmalloc(sizeof(rev_name));
> + name = xmalloc(sizeof(*name));

Here are the declarations of both (and my beloved --function-context
option would only have shown the second one):

typedef struct rev_name {
const char *tip_name;
timestamp_t taggerdate;
int generation;
int distance;
int from_tag;
} rev_name;

struct rev_name *name = get_commit_rev_name(commit);

So your patch is correct.  Had me scratching my head when I first saw
it, though.  That old code has been present since bd321bcc51 ("Add
git-name-rev", 2005-10-26).

>   set_commit_rev_name(commit, name);
>   goto copy_data;
>   } else if (is_better_name(name, taggerdate, distance, from_tag)) {
>



Re: [PATCH 04/15] name-rev: avoid unnecessary cast in name_ref()

2019-09-20 Thread René Scharfe
Am 19.09.19 um 23:46 schrieb SZEDER Gábor:
> Casting a 'struct object' to 'struct commit' is unnecessary there,
> because it's already available in the local 'commit' variable.

That's true, but you can't see that only by reading your email.

>
> Signed-off-by: SZEDER Gábor 
> ---
>  builtin/name-rev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index d345456656..e406ff8e17 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -268,7 +268,7 @@ static int name_ref(const char *path, const struct 
> object_id *oid, int flags, vo

Here's the pertinent context line; --function-context would have been too
much, I think, but -U4 would have shown it:

struct commit *commit = (struct commit *)o;
>   int from_tag = starts_with(path, "refs/tags/");
>
>   if (taggerdate == TIME_MAX)
> - taggerdate = ((struct commit *)o)->date;
> + taggerdate = commit->date;
>   path = name_ref_abbrev(path, can_abbreviate_output);
>   name_rev(commit, xstrdup(path), taggerdate, 0, 0,
>from_tag, deref);
>


Re: [PATCH 08/15] name-rev: pull out deref handling from the recursion

2019-09-20 Thread René Scharfe
Am 19.09.19 um 23:47 schrieb SZEDER Gábor:
> The 'if (deref) { ... }' condition near the beginning of the recursive
> name_rev() function can only ever be true in the first invocation,
> because the 'deref' parameter is always 0 in the subsequent recursive
> invocations.
>
> Extract this condition from the recursion into name_rev()'s caller and
> drop the function's 'deref' parameter.  This makes eliminating the
> recursion a bit easier to follow, and it will be moved back into
> name_rev() after the recursion is elminated.
>
> Furthermore, drop the condition that die()s when both 'deref' and
> 'generation' are non-null (which should have been a BUG() to begin
> with).
>
> Note that this change reintroduces the memory leak that was plugged in
> in commit 5308224633 (name-rev: avoid leaking memory in the `deref`
> case, 2017-05-04), but a later patch in this series will plug it in
> again.
>
> Signed-off-by: SZEDER Gábor 
> ---
>  builtin/name-rev.c | 27 ++-
>  1 file changed, 10 insertions(+), 17 deletions(-)
>
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index cb8ac2fa64..42cea5c881 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -102,30 +102,19 @@ static struct rev_name *create_or_update_name(struct 
> commit *commit,
>
>  static void name_rev(struct commit *commit,
>   const char *tip_name, timestamp_t taggerdate,
> - int generation, int distance, int from_tag,
> - int deref)
> + int generation, int distance, int from_tag)
>  {
>   struct commit_list *parents;
>   int parent_number = 1;
> - char *to_free = NULL;
>
>   parse_commit(commit);
>
>   if (commit->date < cutoff)
>   return;
>
> - if (deref) {
> - tip_name = to_free = xstrfmt("%s^0", tip_name);
> -
> - if (generation)
> - die("generation: %d, but deref?", generation);
> - }
> -
>   if (!create_or_update_name(commit, tip_name, taggerdate, generation,
> -distance, from_tag)) {
> - free(to_free);
> +distance, from_tag))
>   return;
> - }
>
>   for (parents = commit->parents;
>   parents;
> @@ -144,11 +133,11 @@ static void name_rev(struct commit *commit,
>
>   name_rev(parents->item, new_name, taggerdate, 0,
>distance + MERGE_TRAVERSAL_WEIGHT,
> -  from_tag, 0);
> +  from_tag);
>   } else {
>   name_rev(parents->item, tip_name, taggerdate,
>generation + 1, distance + 1,
> -  from_tag, 0);
> +  from_tag);
>   }
>   }
>  }
> @@ -280,12 +269,16 @@ static int name_ref(const char *path, const struct 
> object_id *oid, int flags, vo
>   if (o && o->type == OBJ_COMMIT) {
>   struct commit *commit = (struct commit *)o;
>   int from_tag = starts_with(path, "refs/tags/");
> + const char *tip_name;

This should not be const because you allocate the buffer it points to
right here in the function, in each execution path.

>
>   if (taggerdate == TIME_MAX)
>   taggerdate = commit->date;
>   path = name_ref_abbrev(path, can_abbreviate_output);
> - name_rev(commit, xstrdup(path), taggerdate, 0, 0,
> -  from_tag, deref);
> + if (deref)
> + tip_name = xstrfmt("%s^0", path);
> + else
> + tip_name = xstrdup(path);
> + name_rev(commit, tip_name, taggerdate, 0, 0, from_tag);

tip_name should be free(3)'d here.  Except we can't do that because
name_rev() sometimes stores that pointer in a commit slab.  Ugh.

If the (re)introduced leak doesn't impact performance and memory
usage too much then duplicating tip_name again in name_rev() or
rather your new create_or_update_name() would likely make the
lifetimes of those string buffers easier to manage.

>   }
>   return 0;
>  }
>



Re: [PATCH 06/15] t6120: add a test to cover inner conditions in 'git name-rev's name_rev()

2019-09-20 Thread SZEDER Gábor
On Fri, Sep 20, 2019 at 11:14:56AM -0400, Derrick Stolee wrote:
> On 9/19/2019 5:47 PM, SZEDER Gábor wrote:
> > These conditions are not covered properly in the test suite.  As far
> > as purely test coverage goes, they are all executed several times over
> > in 't6120-describe.sh'.  However, they don't directly influence the
> > command's output, because the repository used in that test script
> > contains several branches and tags pointing somewhere into the middle
> > of the commit DAG, and thus result in a better name for the
> > to-be-named commit.

> > diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> > index 07e6793e84..2a0f2204c4 100755
> > --- a/t/t6120-describe.sh
> > +++ b/t/t6120-describe.sh
> > @@ -421,4 +421,47 @@ test_expect_success 'describe complains about missing 
> > object' '
> > test_must_fail git describe $ZERO_OID
> >  '
> >  
> > +#   ---master
> > +#  /  /
> > +# A--M2
> > +#  \/
> > +#   \---M1-C
> > +#\ /
> > +# B
> > +test_expect_success 'test' '
> > +   git init repo &&
> > +   (
> > +   cd repo &&
> > +
> > +   echo A >file &&
> > +   git add file &&
> > +   git commit -m A &&
> > +   A=$(git rev-parse HEAD) &&
> 
> Is it not enough to do something like test_commit here?

No, because 'test_commit' adds branches and tags pointing to commits
somewhere in the middle of the history, and those will serve as better
starting point for the resulting name.

> > +
> > +   git checkout --detach &&
> > +   echo B >file &&
> > +   git commit -m B file &&
> > +   B=$(git rev-parse HEAD) &&
> > +
> > +   git checkout $A &&
> > +   git merge --no-ff $B &&  # M1
> > +
> > +   echo C >file &&
> > +   git commit -m C file &&
> > +
> > +   git checkout $A &&
> > +   git merge --no-ff HEAD@{1} && # M2
> > +
> > +   git checkout master &&
> > +   git merge --no-ff HEAD@{1} &&
> > +
> > +   git log --graph --oneline &&
> > +
> > +   echo "$B master^2^2~1^2" >expect &&
> > +   git name-rev $B >actual &&
> 
> This matches your description.
> 
> Thanks,
> -Stolee
>  


Re: [PATCH 05/15] name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation

2019-09-20 Thread SZEDER Gábor
On Fri, Sep 20, 2019 at 11:11:02AM -0400, Derrick Stolee wrote:
> On 9/19/2019 5:47 PM, SZEDER Gábor wrote:
> > Signed-off-by: SZEDER Gábor 
> > ---
> >  builtin/name-rev.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> > index e406ff8e17..dec2228cc7 100644
> > --- a/builtin/name-rev.c
> > +++ b/builtin/name-rev.c
> > @@ -98,7 +98,7 @@ static void name_rev(struct commit *commit,
> > }
> >  
> > if (name == NULL) {
> > -   name = xmalloc(sizeof(rev_name));
> > +   name = xmalloc(sizeof(*name));
> 
> Is this our preferred way to use xmalloc()? If so, then
> I've been doing it wrong and will correct myself in the
> future.

I seem to remember that Peff mentioned in a commit message that this
is the preferred way, but can't find it at the moment.  Anyway, when
using 'sizeof(*ptr)' the type is inferred by the compiler, but when
using 'sizeof(type)' then we have to make sure that 'type' is indeed
the right type.

Besides, that 'rev_name' should have been spelled as 'struct rev_name'
to begin with.



Re: [PATCH 00/15] name-rev: eliminate recursion

2019-09-20 Thread Derrick Stolee
On 9/19/2019 5:46 PM, SZEDER Gábor wrote:
> 'git name-rev' is implemented using a recursive algorithm, and,
> consequently, it can segfault in deep histories (e.g. WebKit), and
> thanks to a test case demonstrating this limitation every test run
> results in a dmesg entry logging the segfaulting git process.
> 
> This patch series eliminates the recursion.

A noble goal! Recursion into commit history is much easier to get
stack overflows than when we recurse into the directory hierarchy.

> Patches 1-5 and 14-15 are while-at-it cleanups I noticed on the way,
> and patch 6 improves test coverage.

These cleanups are nice, and I think I followed them pretty closely.
 
> Patches 7-11 are preparatory refactorings that are supposed to make
> this series easier to follow, and make patch 12, the one finally
> eliminating the recursion, somewhat shorter, and even much shorter
> when viewed with '--ignore-all-space'.  Patch 13 cleans up after those
> preparatory steps.

I responded to several of these, mostly with questions and not actual
recommendations. I do want to apply your patches locally so I can try
this --ignore-all-space trick to really be sure patch 12 is doing the
right thing.

Great organization of patches!

-Stolee


Re: [PATCH 15/15] name-rev: plug a memory leak in name_rev() in the deref case

2019-09-20 Thread Derrick Stolee
On 9/19/2019 5:47 PM, SZEDER Gábor wrote:
> The name_rev() function's 'tip_name' parameter is a freshly
> xstrdup()ed string, so when name_rev() invokes...
This patch 15/15 seems to be the same as your 14/15, and
we should use your _other_ 15/15, right?

Thanks,
-Stolee


Re: [PATCH 10/15] name-rev: restructure creating/updating 'struct rev_name' instances

2019-09-20 Thread Derrick Stolee
On 9/19/2019 5:47 PM, SZEDER Gábor wrote:
> At the beginning of the recursive name_rev() function it creates a new
> 'struct rev_name' instance for each previously unvisited commit or, if
> this visit results in better name for an already visited commit, then
> updates the 'struct rev_name' instance attached to to the commit, or
> returns early.
> 
> Restructure this so it's caller creates or updates the 'struct
> rev_name' instance associated with the commit to be passed as
> parameter, i.e. both name_ref() before calling name_rev() and
> name_rev() itself as it iterates over the parent commits.
> 
> This makes eliminating the recursion a bit easier to follow, and it
> will be moved back to name_rev() after the recursion is eliminated.
> 
> This change also plugs the memory leak that was temporarily unplugged
> in the earlier "name-rev: pull out deref handling from the recursion"
> patch in this series.
[snip]
>  
> @@ -276,11 +277,17 @@ static int name_ref(const char *path, const struct 
> object_id *oid, int flags, vo
>   path = name_ref_abbrev(path, can_abbreviate_output);
>   if (commit->date >= cutoff) {
>   const char *tip_name;
> + char *to_free = NULL;
>   if (deref)
> - tip_name = xstrfmt("%s^0", path);
> + tip_name = to_free = xstrfmt("%s^0", path);
>   else
>   tip_name = xstrdup(path);

So this xstrdup(path) is not a leak?

> - name_rev(commit, tip_name, taggerdate, 0, 0, from_tag);
> + if (create_or_update_name(commit, tip_name, taggerdate,
> +   0, 0, from_tag))
> + name_rev(commit, tip_name, taggerdate, 0, 0,
> +  from_tag);
> + else
> + free(to_free);
>   }
>   }
>   return 0;
> 


Re: [PATCH 08/15] name-rev: pull out deref handling from the recursion

2019-09-20 Thread Derrick Stolee
On 9/19/2019 5:47 PM, SZEDER Gábor wrote:
> The 'if (deref) { ... }' condition near the beginning of the recursive
> name_rev() function can only ever be true in the first invocation,
> because the 'deref' parameter is always 0 in the subsequent recursive
> invocations.
> 
> Extract this condition from the recursion into name_rev()'s caller and
> drop the function's 'deref' parameter.  This makes eliminating the
> recursion a bit easier to follow, and it will be moved back into
> name_rev() after the recursion is elminated.
> 
> Furthermore, drop the condition that die()s when both 'deref' and
> 'generation' are non-null (which should have been a BUG() to begin
> with).

These changes seem sensible. I look forward to seeing how deref is
reintroduced.

> Note that this change reintroduces the memory leak that was plugged in
> in commit 5308224633 (name-rev: avoid leaking memory in the `deref`
> case, 2017-05-04), but a later patch in this series will plug it in
> again.

The memory leak is now for "tip_name" correct? Just tracking to make
sure it gets plugged later.

> Signed-off-by: SZEDER Gábor 
> ---
>  builtin/name-rev.c | 27 ++-
>  1 file changed, 10 insertions(+), 17 deletions(-)
> 
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index cb8ac2fa64..42cea5c881 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -102,30 +102,19 @@ static struct rev_name *create_or_update_name(struct 
> commit *commit,
>  
>  static void name_rev(struct commit *commit,
>   const char *tip_name, timestamp_t taggerdate,
> - int generation, int distance, int from_tag,
> - int deref)
> + int generation, int distance, int from_tag)
>  {
>   struct commit_list *parents;
>   int parent_number = 1;
> - char *to_free = NULL;
>  
>   parse_commit(commit);
>  
>   if (commit->date < cutoff)
>   return;
>  
> - if (deref) {
> - tip_name = to_free = xstrfmt("%s^0", tip_name);
> -
> - if (generation)
> - die("generation: %d, but deref?", generation);
> - }
> -
>   if (!create_or_update_name(commit, tip_name, taggerdate, generation,
> -distance, from_tag)) {
> - free(to_free);
> +distance, from_tag))
>   return;
> - }
>  
>   for (parents = commit->parents;
>   parents;
> @@ -144,11 +133,11 @@ static void name_rev(struct commit *commit,
>  
>   name_rev(parents->item, new_name, taggerdate, 0,
>distance + MERGE_TRAVERSAL_WEIGHT,
> -  from_tag, 0);
> +  from_tag);
>   } else {
>   name_rev(parents->item, tip_name, taggerdate,
>generation + 1, distance + 1,
> -  from_tag, 0);
> +  from_tag);
>   }
>   }
>  }
> @@ -280,12 +269,16 @@ static int name_ref(const char *path, const struct 
> object_id *oid, int flags, vo
>   if (o && o->type == OBJ_COMMIT) {
>   struct commit *commit = (struct commit *)o;
>   int from_tag = starts_with(path, "refs/tags/");
> + const char *tip_name;
>  
>   if (taggerdate == TIME_MAX)
>   taggerdate = commit->date;
>   path = name_ref_abbrev(path, can_abbreviate_output);
> - name_rev(commit, xstrdup(path), taggerdate, 0, 0,
> -  from_tag, deref);
> + if (deref)
> + tip_name = xstrfmt("%s^0", path);
> + else
> + tip_name = xstrdup(path);

(leak above, as noted in message).

> + name_rev(commit, tip_name, taggerdate, 0, 0, from_tag);
>   }
>   return 0;
>  }
> 


Re: [PATCH 07/15] name-rev: extract creating/updating a 'struct name_rev' into a helper

2019-09-20 Thread Derrick Stolee
On 9/19/2019 5:47 PM, SZEDER Gábor wrote:
> In a later patch in this series we'll want to do this in two places.
> 
> Signed-off-by: SZEDER Gábor 
> ---
>  builtin/name-rev.c | 40 +++-
>  1 file changed, 27 insertions(+), 13 deletions(-)
> 
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index dec2228cc7..cb8ac2fa64 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -75,12 +75,36 @@ static int is_better_name(struct rev_name *name,
>   return 0;
>  }
>  
> +static struct rev_name *create_or_update_name(struct commit *commit,
> +   const char *tip_name,
> +   timestamp_t taggerdate,
> +   int generation, int distance,
> +   int from_tag)
> +{
> + struct rev_name *name = get_commit_rev_name(commit);
> +
> + if (name == NULL) {
> + name = xmalloc(sizeof(*name));
> + set_commit_rev_name(commit, name);
> + goto copy_data;
> + } else if (is_better_name(name, taggerdate, distance, from_tag)) {
> +copy_data:
> + name->tip_name = tip_name;
> + name->taggerdate = taggerdate;
> + name->generation = generation;
> + name->distance = distance;
> + name->from_tag = from_tag;
> +
> + return name;
> + } else
> + return NULL;
> +}
> +
>  static void name_rev(struct commit *commit,
>   const char *tip_name, timestamp_t taggerdate,
>   int generation, int distance, int from_tag,
>   int deref)
>  {
> - struct rev_name *name = get_commit_rev_name(commit);

A perhaps small benefit: we delay this call until after some
other checks happen. It's just looking up data in a cache, but
it may help a little.

>   struct commit_list *parents;
>   int parent_number = 1;
>   char *to_free = NULL;
> @@ -97,18 +121,8 @@ static void name_rev(struct commit *commit,
>   die("generation: %d, but deref?", generation);
>   }
>  
> - if (name == NULL) {
> - name = xmalloc(sizeof(*name));
> - set_commit_rev_name(commit, name);
> - goto copy_data;
> - } else if (is_better_name(name, taggerdate, distance, from_tag)) {
> -copy_data:
> - name->tip_name = tip_name;
> - name->taggerdate = taggerdate;
> - name->generation = generation;
> - name->distance = distance;
> - name->from_tag = from_tag;
> - } else {
> + if (!create_or_update_name(commit, tip_name, taggerdate, generation,
> +distance, from_tag)) {

Otherwise this method extraction looks correct.

Thanks,
-Stolee



Re: [PATCH 06/15] t6120: add a test to cover inner conditions in 'git name-rev's name_rev()

2019-09-20 Thread Derrick Stolee
On 9/19/2019 5:47 PM, SZEDER Gábor wrote:
> In 'builtin/name-rev.c' in the name_rev() function there is a loop
> iterating over all parents of the given commit, and the loop body
> looks like this:
> 
>   if (parent_number > 1) {
> if (generation > 0)
>   // do stuff #1
> else
>   // do stuff #2
>   } else {
>  // do stuff #3
>   }
> 
> These conditions are not covered properly in the test suite.  As far
> as purely test coverage goes, they are all executed several times over
> in 't6120-describe.sh'.  However, they don't directly influence the
> command's output, because the repository used in that test script
> contains several branches and tags pointing somewhere into the middle
> of the commit DAG, and thus result in a better name for the
> to-be-named commit.  In an early version of this patch series I
> managed to mess up those conditions (every single one of them at
> once!), but the whole test suite still passed successfully.
> 
> So add a new test case that operates on the following history:
> 
> ---master
>/  /
>   A--M2
>\/
> \---M1-C
>  \ /
>   B
> 
> and names the commit 'B', where:
> 
>   - The merge commit at master makes sure that the 'do stuff #3'
> affects the final name.
> 
>   - The merge commit M2 make sure that the 'do stuff #1' part
> affects the final name.
> 
>   - And M1 makes sure that the 'do stuff #2' part affects the final
> name.
> 
> Signed-off-by: SZEDER Gábor 
> ---
>  t/t6120-describe.sh | 43 +++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index 07e6793e84..2a0f2204c4 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -421,4 +421,47 @@ test_expect_success 'describe complains about missing 
> object' '
>   test_must_fail git describe $ZERO_OID
>  '
>  
> +#   ---master
> +#  /  /
> +# A--M2
> +#  \/
> +#   \---M1-C
> +#\ /
> +# B
> +test_expect_success 'test' '
> + git init repo &&
> + (
> + cd repo &&
> +
> + echo A >file &&
> + git add file &&
> + git commit -m A &&
> + A=$(git rev-parse HEAD) &&

Is it not enough to do something like test_commit here?

> +
> + git checkout --detach &&
> + echo B >file &&
> + git commit -m B file &&
> + B=$(git rev-parse HEAD) &&
> +
> + git checkout $A &&
> + git merge --no-ff $B &&  # M1
> +
> + echo C >file &&
> + git commit -m C file &&
> +
> + git checkout $A &&
> + git merge --no-ff HEAD@{1} && # M2
> +
> + git checkout master &&
> + git merge --no-ff HEAD@{1} &&
> +
> + git log --graph --oneline &&
> +
> + echo "$B master^2^2~1^2" >expect &&
> + git name-rev $B >actual &&

This matches your description.

Thanks,
-Stolee
 


Re: [PATCH 05/15] name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation

2019-09-20 Thread Derrick Stolee
On 9/19/2019 5:47 PM, SZEDER Gábor wrote:
> Signed-off-by: SZEDER Gábor 
> ---
>  builtin/name-rev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index e406ff8e17..dec2228cc7 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -98,7 +98,7 @@ static void name_rev(struct commit *commit,
>   }
>  
>   if (name == NULL) {
> - name = xmalloc(sizeof(rev_name));
> + name = xmalloc(sizeof(*name));

Is this our preferred way to use xmalloc()? If so, then
I've been doing it wrong and will correct myself in the
future.

-Stolee


Re: [PATCH v2 09/11] sparse-checkout: use hashmaps for cone patterns

2019-09-20 Thread Derrick Stolee
On 9/19/2019 4:59 PM, Derrick Stolee wrote:
> On 9/19/2019 10:43 AM, Derrick Stolee via GitGitGadget wrote:
>> @@ -848,6 +953,10 @@ static int add_patterns_from_buffer(char *buf, size_t 
>> size,
>>  int i, lineno = 1;
>>  char *entry;
>>  
>> +pl->use_cone_patterns = core_sparse_checkout_cone;
>> +hashmap_init(&pl->recursive_hashmap, pl_hashmap_cmp, NULL, 0);
>> +hashmap_init(&pl->parent_hashmap, pl_hashmap_cmp, NULL, 0);
>> +
> 
> Just a head's-up to anyone looking at this series: this is not the
> right place to set use_cone_patterns (without passing a flag or
> something). This same path is called from the .gitignore machinery,
> so if you have a non-cone pattern in your .gitignore you will start
> seeing warnings with core.sparseCheckoutCone=true.
> 
> I figured it out only via integration tests with our C# layer. In
> v2 I'll fix this and add a test to make sure it stays fixed.

Here is the code fix. I will have a test to check this in v3.

-->8--

>From 73b100d11d11bf8f045c2e116390120819dcb800 Mon Sep 17 00:00:00 2001
From: Derrick Stolee 
Date: Fri, 20 Sep 2019 08:55:06 -0400
Subject: [PATCH v2] fixup! sparse-checkout: use hashmaps for cone patterns

---
 dir.c  | 1 -
 unpack-trees.c | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 35fd60d487..248a418379 100644
--- a/dir.c
+++ b/dir.c
@@ -953,7 +953,6 @@ static int add_patterns_from_buffer(char *buf, size_t size,
int i, lineno = 1;
char *entry;
 
-   pl->use_cone_patterns = core_sparse_checkout_cone;
hashmap_init(&pl->recursive_hashmap, pl_hashmap_cmp, NULL, 0);
hashmap_init(&pl->parent_hashmap, pl_hashmap_cmp, NULL, 0);
 
diff --git a/unpack-trees.c b/unpack-trees.c
index 43acc0ffd6..b5cf591c38 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1487,6 +1487,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
o->skip_sparse_checkout = 1;
if (!o->skip_sparse_checkout) {
char *sparse = git_pathdup("info/sparse-checkout");
+   pl.use_cone_patterns = core_sparse_checkout_cone;
if (add_patterns_from_file_to_list(sparse, "", 0, &pl, NULL) < 
0)
o->skip_sparse_checkout = 1;
else
-- 
2.23.0.vfs.1.1.19.gce6e76d




Re: [PATCH] doc: MyFirstContribution: fix cmd placement instructions

2019-09-20 Thread Philip Oakley

On 19/09/2019 21:46, Emily Shaffer wrote:

Thanks for sending this. I guess I don't know the alphabet very well :)

A different color we could paint the bikeshed would be to write "add a
new line for `psuh` immediately after it" in these places instead. But I
have no preference whatsoever.

  - Emily

Reviewed-by: Emily Shaffer 

On Thu, Sep 19, 2019 at 03:46:34PM -0300, Pedro Sousa wrote:

Using the pull command instead of push is more accurate when giving
instructions on placing the psuh command in alphabetical order.

Signed-off-by: Pedro Sousa 
---
  Documentation/MyFirstContribution.txt | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/MyFirstContribution.txt 
b/Documentation/MyFirstContribution.txt
index f867037..fb15af8 100644
--- a/Documentation/MyFirstContribution.txt
+++ b/Documentation/MyFirstContribution.txt
@@ -97,7 +97,7 @@ int cmd_psuh(int argc, const char **argv, const char *prefix)
  
  
  We'll also need to add the declaration of psuh; open up `builtin.h`, find the

-declaration for `cmd_push`, and add a new line for `psuh` immediately before 
it,
+declaration for `cmd_pull`, and add a new line for `psuh` immediately before 
it,
Maybe feature the "alphabetical" sort order (i.e. include the word 
here), so that other see it (use of alphabetical sorting) that little 
more obviously in the tutorial material.

Philip

  in order to keep the declarations sorted:
  
  

@@ -123,7 +123,7 @@ int cmd_psuh(int argc, const char **argv, const char 
*prefix)
  }
  
  
-Let's try to build it.  Open `Makefile`, find where `builtin/push.o` is added

+Let's try to build it.  Open `Makefile`, find where `builtin/pull.o` is added
  to `BUILTIN_OBJS`, and add `builtin/psuh.o` in the same way next to it in
  alphabetical order. Once you've done so, move to the top-level directory and
  build simply with `make`. Also add the `DEVELOPER=1` variable to turn on
@@ -149,7 +149,7 @@ a `cmd_struct` to the `commands[]` array. `struct 
cmd_struct` takes a string
  with the command name, a function pointer to the command implementation, and a
  setup option flag. For now, let's keep mimicking `push`. Find the line where
  `cmd_push` is registered, copy it, and modify it for `cmd_psuh`, placing the 
new
-line in alphabetical order.
+line in alphabetical order (immediately before `cmd_pull`).
  
  The options are documented in `builtin.h` under "Adding a new built-in." Since

  we hope to print some data about the user's current workspace context later,
@@ -167,7 +167,7 @@ Check it out! You've got a command! Nice work! Let's commit 
this.
  
  `git status` reveals modified `Makefile`, `builtin.h`, and `git.c` as well as

  untracked `builtin/psuh.c` and `git-psuh`. First, let's take care of the 
binary,
-which should be ignored. Open `.gitignore` in your editor, find `/git-push`, 
and
+which should be ignored. Open `.gitignore` in your editor, find `/git-pull`, 
and
  add an entry for your new command in alphabetical order:
  
  

--
2.9.3





Re: [PATCH 15/15] name-rev: plug memory leak in name_rev()

2019-09-19 Thread SZEDER Gábor
Please ignore this patch as well.

On Thu, Sep 19, 2019 at 11:47:12PM +0200, SZEDER Gábor wrote:
> The loop iterating over the parent commits in the name_rev() function
> contains two xstrfmt() calls, and their result is leaked if the parent
> commit is not processed further (because that parent has already been
> visited before, and this further visit doesn't result in a better name
> for its ancestors).
> 
> Make sure that the result of those xstrfmt() calls is free()d if the
> parent commit is not processed further.
> 
> This results in slightly but measurably lower memory usage: the
> avarage maximum resident size of 5 'git name-rev --all' invocations in
> 'linux.git' shrinks from 3256124kB to 319990kB, just about 2% less.
> 
> Signed-off-by: SZEDER Gábor 
> ---
>  builtin/name-rev.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index f867d45f0b..d65de04918 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -139,6 +139,7 @@ static void name_rev(struct commit *start_commit,
>   struct commit *parent = parents->item;
>   const char *new_name;
>   int generation, distance;
> + const char *new_name_to_free = NULL;
>  
>   parse_commit(parent);
>   if (parent->date < cutoff)
> @@ -158,6 +159,7 @@ static void name_rev(struct commit *start_commit,
>   new_name = xstrfmt("%.*s^%d", (int)len,
>  name->tip_name,
>  parent_number);
> + new_name_to_free = new_name;
>   generation = 0;
>   distance = name->distance + 
> MERGE_TRAVERSAL_WEIGHT;
>   } else {
> @@ -171,6 +173,8 @@ static void name_rev(struct commit *start_commit,
> from_tag))
>   last_new_parent = commit_list_append(parent,
> last_new_parent);
> + else
> + free((char*) new_name_to_free);
>   }
>  
>   *last_new_parent = list;
> -- 
> 2.23.0.331.g4e51dcdf11
> 


Re: [PATCH 14/15] name-rev: plug memory leak in name_rev() in the deref case

2019-09-19 Thread SZEDER Gábor
Please ignore this mail.

On Thu, Sep 19, 2019 at 11:47:10PM +0200, SZEDER Gábor wrote:
> The name_rev() function's 'tip_name' parameter is a freshly
> xstrdup()ed string, so when name_rev() invokes:
> 
>   tip_name = xstrfmt("%s^0", tip_name);
> 
> then the original 'tip_name' string is leaked.
> 
> Make sure that this string is free()d after it has been used as input
> for that xstrfmt() call.
> 
> This only happens when name_rev() is invoked with a tag, i.e.
> relatively infrequently in a usual repository, so any reduction in
> memory usage is lost in the noise.
> 
> Signed-off-by: SZEDER Gábor 
> ---
>  builtin/name-rev.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index e202835129..f867d45f0b 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -101,18 +101,22 @@ static struct rev_name *create_or_update_name(struct 
> commit *commit,
>  }
>  
>  static void name_rev(struct commit *start_commit,
> - const char *tip_name, timestamp_t taggerdate,
> + const char *start_tip_name, timestamp_t taggerdate,
>   int from_tag, int deref)
>  {
>   struct commit_list *list = NULL;
> + const char *tip_name;
>   char *to_free = NULL;
>  
>   parse_commit(start_commit);
>   if (start_commit->date < cutoff)
>   return;
>  
> - if (deref)
> - tip_name = to_free = xstrfmt("%s^0", tip_name);
> + if (deref) {
> + tip_name = to_free = xstrfmt("%s^0", start_tip_name);
> + free((char*) start_tip_name);
> + } else
> + tip_name = start_tip_name;
>  
>   if (!create_or_update_name(start_commit, tip_name, taggerdate, 0, 0,
>  from_tag)) {
> -- 
> 2.23.0.331.g4e51dcdf11
> 


Re: [PATCH v2 09/11] sparse-checkout: use hashmaps for cone patterns

2019-09-19 Thread Derrick Stolee
On 9/19/2019 10:43 AM, Derrick Stolee via GitGitGadget wrote:
> @@ -848,6 +953,10 @@ static int add_patterns_from_buffer(char *buf, size_t 
> size,
>   int i, lineno = 1;
>   char *entry;
>  
> + pl->use_cone_patterns = core_sparse_checkout_cone;
> + hashmap_init(&pl->recursive_hashmap, pl_hashmap_cmp, NULL, 0);
> + hashmap_init(&pl->parent_hashmap, pl_hashmap_cmp, NULL, 0);
> +

Just a head's-up to anyone looking at this series: this is not the
right place to set use_cone_patterns (without passing a flag or
something). This same path is called from the .gitignore machinery,
so if you have a non-cone pattern in your .gitignore you will start
seeing warnings with core.sparseCheckoutCone=true.

I figured it out only via integration tests with our C# layer. In
v2 I'll fix this and add a test to make sure it stays fixed.

Otherwise, everything is working as expected.

-Stolee



Re: [PATCH] doc: MyFirstContribution: fix cmd placement instructions

2019-09-19 Thread Emily Shaffer
Thanks for sending this. I guess I don't know the alphabet very well :)

A different color we could paint the bikeshed would be to write "add a
new line for `psuh` immediately after it" in these places instead. But I
have no preference whatsoever.

 - Emily

Reviewed-by: Emily Shaffer 

On Thu, Sep 19, 2019 at 03:46:34PM -0300, Pedro Sousa wrote:
> Using the pull command instead of push is more accurate when giving
> instructions on placing the psuh command in alphabetical order.
> 
> Signed-off-by: Pedro Sousa 
> ---
>  Documentation/MyFirstContribution.txt | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/MyFirstContribution.txt 
> b/Documentation/MyFirstContribution.txt
> index f867037..fb15af8 100644
> --- a/Documentation/MyFirstContribution.txt
> +++ b/Documentation/MyFirstContribution.txt
> @@ -97,7 +97,7 @@ int cmd_psuh(int argc, const char **argv, const char 
> *prefix)
>  
>  
>  We'll also need to add the declaration of psuh; open up `builtin.h`, find the
> -declaration for `cmd_push`, and add a new line for `psuh` immediately before 
> it,
> +declaration for `cmd_pull`, and add a new line for `psuh` immediately before 
> it,
>  in order to keep the declarations sorted:
>  
>  
> @@ -123,7 +123,7 @@ int cmd_psuh(int argc, const char **argv, const char 
> *prefix)
>  }
>  
>  
> -Let's try to build it.  Open `Makefile`, find where `builtin/push.o` is added
> +Let's try to build it.  Open `Makefile`, find where `builtin/pull.o` is added
>  to `BUILTIN_OBJS`, and add `builtin/psuh.o` in the same way next to it in
>  alphabetical order. Once you've done so, move to the top-level directory and
>  build simply with `make`. Also add the `DEVELOPER=1` variable to turn on
> @@ -149,7 +149,7 @@ a `cmd_struct` to the `commands[]` array. `struct 
> cmd_struct` takes a string
>  with the command name, a function pointer to the command implementation, and 
> a
>  setup option flag. For now, let's keep mimicking `push`. Find the line where
>  `cmd_push` is registered, copy it, and modify it for `cmd_psuh`, placing the 
> new
> -line in alphabetical order.
> +line in alphabetical order (immediately before `cmd_pull`).
>  
>  The options are documented in `builtin.h` under "Adding a new built-in." 
> Since
>  we hope to print some data about the user's current workspace context later,
> @@ -167,7 +167,7 @@ Check it out! You've got a command! Nice work! Let's 
> commit this.
>  
>  `git status` reveals modified `Makefile`, `builtin.h`, and `git.c` as well as
>  untracked `builtin/psuh.c` and `git-psuh`. First, let's take care of the 
> binary,
> -which should be ignored. Open `.gitignore` in your editor, find `/git-push`, 
> and
> +which should be ignored. Open `.gitignore` in your editor, find `/git-pull`, 
> and
>  add an entry for your new command in alphabetical order:
>  
>  
> -- 
> 2.9.3
> 


Re: [PATCH] ls-remote: create '--count' option

2019-09-19 Thread Johannes Sixt
Am 18.09.19 um 23:44 schrieb Kamil Domański:
> On 9/18/19 8:28 AM, Johannes Sixt wrote:
>> Am 18.09.19 um 02:11 schrieb Kamil Domański:
>>>     DESCRIPTION
>>>   ---
>>> @@ -21,6 +21,11 @@ commit IDs.
>>>     OPTIONS
>>>   ---
>>> +--count=::
>>> +    By default the command shows all refs that match
>>> +    ``.  This option makes it stop after showing
>>> +    that many refs.
>> Is the meaning of this option perhaps:
>>
>>  Stops after the specified count of refs have been listed.
>>  If `--sort=` is specified as well, refs are counted
>>  after sorting; otherwise, it is unspecified which subset
>>  of is listed.
> 
> Similarly, I merely copied the description used by 'for-each-ref'. I
> like your version as well, since it clarifies the relation to sorting.
> Should the description for 'for-each-ref' be changed as well then?

I am neutral. If you do it, then it should not happen in the same patch
as this change.

> ... it appears that 'for-each-ref' defaults to sorting
> by 'refname' (through ref_default_sorting() ) if no alternative sorting
> is provided, while 'ls-remote' does no such defaulting. Do you figure it
> would be a good idea to add another patch which would introduce a
> similar default in 'ls-remote'?

I don't think so. I would think that in the case of ls-remote it is
preferable to stream refs to the output as soon as they are received
from the remote. If refs must be sorted, it would have to wait until the
remote has sent all refs before it can do anything.

That said, Junio's question remains open: why is piping though head -n
not good enough?

-- Hannes


Re: [PATCH v2] merge-recursive: symlink's descendants not in way

2019-09-18 Thread Elijah Newren
On Wed, Sep 18, 2019 at 1:27 PM Jonathan Tan  wrote:
>
> When the working tree has:
>  - bar (directory)
>  - bar/file (file)
>  - foo (symlink to .)
>
> (note that lstat() for "foo/bar" would tell us that it is a directory)
>
> and the user merges a commit that deletes the foo symlink and instead
> contains:
>  - bar (directory, as above)
>  - bar/file (file, as above)
>  - foo (directory)
>  - foo/bar (file)
>
> the merge should happen without requiring user intervention. However,
> this does not happen.
>
> This is because dir_in_way(), when checking the working tree, thinks
> that "foo/bar" is a directory. But a symlink should be treated much the
> same as a file: since dir_in_way() is only checking to see if there is a
> directory in the way, we don't want symlinks in leading paths to
> sometimes cause dir_in_way() to return true.
>
> Teach dir_in_way() to also check for symlinks in leading paths before
> reporting whether a directory is in the way.
>
> Helped-by: Elijah Newren 
> Signed-off-by: Jonathan Tan 
> ---
> Changes from v1:
>
> - Used has_symlink_leading_path(). This drastically shortens the diff.
> - Updated commit message following suggestions from Junio, Szeder Gábor,
>   and Elijah Newren.
> - Updated test to add prereq and verification that the working tree
>   contains what we want.
> ---
>  merge-recursive.c  |  3 ++-
>  t/t3030-merge-recursive.sh | 28 
>  2 files changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 6b812d67e3..22a12cfeba 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -764,7 +764,8 @@ static int dir_in_way(struct index_state *istate, const 
> char *path,
>
> strbuf_release(&dirpath);
> return check_working_copy && !lstat(path, &st) && S_ISDIR(st.st_mode) 
> &&
> -   !(empty_ok && is_empty_dir(path));
> +   !(empty_ok && is_empty_dir(path)) &&
> +   !has_symlink_leading_path(path, strlen(path));
>  }
>
>  /*
> diff --git a/t/t3030-merge-recursive.sh b/t/t3030-merge-recursive.sh
> index ff641b348a..faa8892741 100755
> --- a/t/t3030-merge-recursive.sh
> +++ b/t/t3030-merge-recursive.sh
> @@ -452,6 +452,34 @@ test_expect_success 'merge-recursive d/f conflict 
> result' '
>
>  '
>
> +test_expect_success SYMLINKS 'dir in working tree with symlink ancestor does 
> not produce d/f conflict' '
> +   git init sym &&
> +   (
> +   cd sym &&
> +   ln -s . foo &&
> +   mkdir bar &&
> +   >bar/file &&
> +   git add foo bar/file &&
> +   git commit -m "foo symlink" &&
> +
> +   git checkout -b branch1 &&
> +   git commit --allow-empty -m "empty commit" &&
> +
> +   git checkout master &&
> +   git rm foo &&
> +   mkdir foo &&
> +   >foo/bar &&
> +   git add foo/bar &&
> +   git commit -m "replace foo symlink with real foo dir and 
> foo/bar file" &&
> +
> +   git checkout branch1 &&
> +
> +   git cherry-pick master &&
> +   test_path_is_dir foo &&
> +   test_path_is_file foo/bar
> +   )
> +'
> +
>  test_expect_success 'reset and 3-way merge' '
>
> git reset --hard "$c2" &&
> --

Looks good to me; nice how much it has simplified.  Thanks for working on this.

> 2.23.0.237.gc6a4ce50a0-goog

A total tangent, but what do you use the "-goog" suffix for?


Re: [PATCH] ls-remote: create '--count' option

2019-09-18 Thread Kamil Domański

On 9/18/19 8:28 AM, Johannes Sixt wrote:

Am 18.09.19 um 02:11 schrieb Kamil Domański:

Create a '--count' option for ls-remote, based on the one from
for-each-ref. This allows e.g. to return only the first result
from a sorted list of refs.

Signed-off-by: Kamil Domański 
---
  Documentation/git-ls-remote.txt | 11 ---
  builtin/ls-remote.c | 16 
  t/t5512-ls-remote.sh|  9 +
  3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-ls-remote.txt b/Documentation/git-ls-remote.txt
index 0b057cbb10..5adc1d676e 100644
--- a/Documentation/git-ls-remote.txt
+++ b/Documentation/git-ls-remote.txt
@@ -9,9 +9,9 @@ git-ls-remote - List references in a remote repository
  SYNOPSIS
  
  [verse]
-'git ls-remote' [--heads] [--tags] [--refs] [--upload-pack=]
- [-q | --quiet] [--exit-code] [--get-url] [--sort=]
- [--symref] [ [...]]
+'git ls-remote' [--count=] [--heads] [--tags] [--refs]
+ [--upload-pack=] [-q | --quiet] [--exit-code] [--get-url]
+ [--sort=] [--symref] [ [...]]

It is understandable that the new option is important to _you_, but it
does not seem important enough that it must be the first in the list.
Please add it between --symref and 
The assumption is quite unnecessary. I merely tried to reflect the 
position of this parameter in the 'for-each-ref' command, on which I 
based the patch.
  
  DESCRIPTION

  ---
@@ -21,6 +21,11 @@ commit IDs.
  
  OPTIONS

  ---
+--count=::
+   By default the command shows all refs that match
+   ``.  This option makes it stop after showing
+   that many refs.

Is the meaning of this option perhaps:

 Stops after the specified count of refs have been listed.
 If `--sort=` is specified as well, refs are counted
 after sorting; otherwise, it is unspecified which subset
 of is listed.


Similarly, I merely copied the description used by 'for-each-ref'. I 
like your version as well, since it clarifies the relation to sorting. 
Should the description for 'for-each-ref' be changed as well then?



I do not know whether the "otherwise" part would be true (check it!),
but I am pretty certain that the "If" part must be true, otherwise the
option would be pointless.


Yes, both in 'for-each-ref' as well as this patch, cutting off the 
results takes place after the whole set has been already sorted. As for 
the "otherwise" part, it appears that 'for-each-ref' defaults to sorting 
by 'refname' (through ref_default_sorting() ) if no alternative sorting 
is provided, while 'ls-remote' does no such defaulting. Do you figure it 
would be a good idea to add another patch which would introduce a 
similar default in 'ls-remote'?



Regards,

Kamil





Re: [PATCH] sha1_name: simplify strbuf handling in interpret_nth_prior_checkout()

2019-09-18 Thread Derrick Stolee
On 9/18/2019 12:35 PM, René Scharfe wrote:
> Pass the target strbuf to the callback function grab_nth_branch_switch()
> by reference so that it can add the result string directly instead of
> having it put the string into a temporary strbuf first.  This gets rid
> of an extra allocation and a string copy.
> 
> Signed-off-by: René Scharfe 
> ---
> Patch formatted with --function-context for easier reviewing.

I appreciate this. I needed to look through the whole method to see that
previously "buf" was used to take the resulting value on a successful
parse.

>  sha1-name.c | 11 ---
>  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/sha1-name.c b/sha1-name.c
> index c665e3f96d..85196929c7 100644
> --- a/sha1-name.c
> +++ b/sha1-name.c
> @@ -1286,70 +1286,67 @@ static int get_oid_oneline(struct repository *r,
> 
>  struct grab_nth_branch_switch_cbdata {
>   int remaining;
> - struct strbuf buf;
> + struct strbuf *sb;
>  };
> 
>  static int grab_nth_branch_switch(struct object_id *ooid, struct object_id 
> *noid,
> const char *email, timestamp_t timestamp, int 
> tz,
> const char *message, void *cb_data)
>  {
>   struct grab_nth_branch_switch_cbdata *cb = cb_data;
>   const char *match = NULL, *target = NULL;
>   size_t len;
> 
>   if (skip_prefix(message, "checkout: moving from ", &match))
>   target = strstr(match, " to ");
> 
>   if (!match || !target)
>   return 0;
>   if (--(cb->remaining) == 0) {
>   len = target - match;
> - strbuf_reset(&cb->buf);
> - strbuf_add(&cb->buf, match, len);
> + strbuf_reset(cb->sb);
> + strbuf_add(cb->sb, match, len);
>   return 1; /* we are done */

It seems that now "buf" is still only being replaced on a successful parse, 
since
this strbuf_reset and strbuf_add is happening only on a return 1, which 
terminates
the refs_for_each... method.

>   }
>   return 0;
>  }
> 
>  /*
>   * Parse @{-N} syntax, return the number of characters parsed
>   * if successful; otherwise signal an error with negative value.
>   */
>  static int interpret_nth_prior_checkout(struct repository *r,
>   const char *name, int namelen,
>   struct strbuf *buf)
>  {
>   long nth;
>   int retval;
>   struct grab_nth_branch_switch_cbdata cb;
>   const char *brace;
>   char *num_end;
> 
>   if (namelen < 4)
>   return -1;
>   if (name[0] != '@' || name[1] != '{' || name[2] != '-')
>   return -1;
>   brace = memchr(name, '}', namelen);
>   if (!brace)
>   return -1;
>   nth = strtol(name + 3, &num_end, 10);
>   if (num_end != brace)
>   return -1;
>   if (nth <= 0)
>   return -1;
>   cb.remaining = nth;
> - strbuf_init(&cb.buf, 20);
> + cb.sb = buf;
> 
>   retval = refs_for_each_reflog_ent_reverse(get_main_ref_store(r),
>   "HEAD", grab_nth_branch_switch, &cb);
>   if (0 < retval) {
> - strbuf_reset(buf);
> - strbuf_addbuf(buf, &cb.buf);
>   retval = brace - name + 1;
>   } else
>   retval = 0;
> 
> - strbuf_release(&cb.buf);
>   return retval;
>  }
> 

The reaction in this method appears correct, too.

Thanks,
-Stolee



Re: [PATCH] doc: fix reference to --ignore-submodules

2019-09-18 Thread Junio C Hamano
Thanks.


Re: [PATCH] ls-remote: create '--count' option

2019-09-18 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 18.09.19 um 02:11 schrieb Kamil Domański:
>> Create a '--count' option for ls-remote, based on the one from
>> for-each-ref. This allows e.g. to return only the first result
>> from a sorted list of refs.
>> 
>> Signed-off-by: Kamil Domański 
>> ---
>>  Documentation/git-ls-remote.txt | 11 ---
>>  builtin/ls-remote.c | 16 
>>  t/t5512-ls-remote.sh|  9 +
>>  3 files changed, 29 insertions(+), 7 deletions(-)
>> 
>> diff --git a/Documentation/git-ls-remote.txt 
>> b/Documentation/git-ls-remote.txt
>> index 0b057cbb10..5adc1d676e 100644
>> --- a/Documentation/git-ls-remote.txt
>> +++ b/Documentation/git-ls-remote.txt
>> @@ -9,9 +9,9 @@ git-ls-remote - List references in a remote repository
>>  SYNOPSIS
>>  
>>  [verse]
>> -'git ls-remote' [--heads] [--tags] [--refs] [--upload-pack=]
>> -  [-q | --quiet] [--exit-code] [--get-url] [--sort=]
>> -  [--symref] [ [...]]
>> +'git ls-remote' [--count=] [--heads] [--tags] [--refs]
>> +  [--upload-pack=] [-q | --quiet] [--exit-code] [--get-url]
>> +  [--sort=] [--symref] [ [...]]
>
> It is understandable that the new option is important to _you_, but it
> does not seem important enough that it must be the first in the list.
> Please add it between --symref and 
>
>>  
>>  DESCRIPTION
>>  ---
>> @@ -21,6 +21,11 @@ commit IDs.
>>  
>>  OPTIONS
>>  ---
>> +--count=::
>> +By default the command shows all refs that match
>> +``.  This option makes it stop after showing
>> +that many refs.
>
> Is the meaning of this option perhaps:
>
> Stops after the specified count of refs have been listed.
> If `--sort=` is specified as well, refs are counted
> after sorting; otherwise, it is unspecified which subset
> of is listed.
>
> I do not know whether the "otherwise" part would be true (check it!),
> but I am pretty certain that the "If" part must be true, otherwise the
> option would be pointless.
>
> The comment about the ordering of this paragraph at the very beginning
> of the option list applies here, too, because the list is not sorted
> alphabetically.

All sensible comments and suggestions.  I am not sure what's so hard
to pipe the output to "head -n 20" or something like that, though.


Re: [PATCH 4/9] sparse-checkout: 'add' subcommand

2019-09-18 Thread Derrick Stolee
On 9/18/2019 10:56 AM, Elijah Newren wrote:
> On Wed, Sep 18, 2019 at 6:55 AM Derrick Stolee  wrote:
>>
>> On 8/23/2019 7:30 PM, Elijah Newren wrote:
>>> On Tue, Aug 20, 2019 at 8:12 AM Derrick Stolee via GitGitGadget
>>>  wrote:

> ...
 diff --git a/t/t1091-sparse-checkout-builtin.sh 
 b/t/t1091-sparse-checkout-builtin.sh
 index b7d5f15830..499bd8d6d0 100755
 --- a/t/t1091-sparse-checkout-builtin.sh
 +++ b/t/t1091-sparse-checkout-builtin.sh
 @@ -100,4 +100,24 @@ test_expect_success 'clone --sparse' '
 test_cmp expect dir
  '

 +test_expect_success 'add to existing sparse-checkout' '
 +   echo "/folder2/*" | git -C repo sparse-checkout add &&
>>>
>>> I've always been using '/folder2/' in sparse-checkout, without the
>>> trailing asterisk.  That seems more friendly for cone mode too.  Are
>>> there benefits to keeping the trailing asterisk?
>>
>> I think I've been seeing issues with pattern matching on Windows without
>> the trailing asterisk. I'm currently double-checking to make sure this
>> is important or not.
> 
> Can you try with the en/clean-nested-with-ignored topic in pu to see
> if that fixes those issues?

Merging with that branch was very difficult. There is a lot of unshared
history between our branches.

Instead, I tried once more to dig into the strange issue on Windows, and
it appears it is an issue with how the Git for Windows SDK modifies shell
arguments with a "/".

When I ran `git sparse-checkout set "/folder1/*"` it worked.

When I run `git sparse-checkout set "/folder1/"`, the SDK completes that
argument to "C:/git-sdk-64/folder1/" on my machine (something more
complicated on the build machine). It's not actually a bug in the Git
code, but something in the build and test environment.

I can get around it by testing the builtin without using these cone-like
patterns. When using `git sparse-checkout set folder1 folder2` in cone
mode, Git does the right thing.

Sorry for the noise here.

-Stolee



Re: [PATCH v2] hg-to-git: make it compatible with both python3 and python2

2019-09-18 Thread Junio C Hamano
Hervé Beraud  writes:

> Python 2 is EOL at the end of 2019, many distros
> and systems now come with python 3 is the default version.

Grammo.  s/is the/as their/ or something like that to fix.

> These changes introduce a syntaxe compatible with the both versions of
> python and so with the nearly future python standard.
>
> Introduced changes:
> ---

Let's drop the above 5 lines.  A canonical form of log message for
us is to begin with a brief background to make readers realize what
is lacking in the current system (if needed -- and you've done a
good job to make the readers realize that we need to make sure we
work with Py3).  With readers' minds prepared to accept the need for
a change, then you give orders to the codebase to "be like so" in
imperative mood.  E.g.

Rewrite features used in hg-to-git that are no longer
supported in Python 3, in such a way that an updated code
can still be usable with Python 2.

- print is not ...
- dict.has_key(key) is no more; ...

You seem to have dropped the change from map() to list comprehention
in this iteration.  I am not into Python deeply enough to care
either way myself, and the original form with map() seems to work
with Python3 (evaluating it seems to result in a map object, instead
of a list, to lazily yield the values, but we are not printing the
result of map() directly with print(), so it should be OK) so from
my point of view, the fewer things we have to defend/justify, the
better ;-)

Will queue.  Thanks.

> Rewriting features that are no longer supported (or recommended)
> in Python 3 in the hg-to-git script.py so that it can be used with both
> Python 2 and 3, namely:
>
> - print is not a statement; use print() function instead.
> - dict.has_key(key) is no more; use "key in dict" instead.
>
> Signed-off-by: Hervé Beraud 
> ---
>  contrib/hg-to-git/hg-to-git.py | 50 +-
>  1 file changed, 25 insertions(+), 25 deletions(-)
>
> diff --git a/contrib/hg-to-git/hg-to-git.py b/contrib/hg-to-git/hg-to-git.py
> index de3f81667ed97..bb2822d4a5e17 100755
> --- a/contrib/hg-to-git/hg-to-git.py
> +++ b/contrib/hg-to-git/hg-to-git.py
> @@ -42,7 +42,7 @@
>  
>  def usage():
>  
> -print """\
> +print("""\
>  %s: [OPTIONS] 
>  
>  options:
> @@ -54,7 +54,7 @@ def usage():
>  
>  required:
>  hgprj:  name of the HG project to import (directory)
> -""" % sys.argv[0]
> +""" % sys.argv[0])
>  
>  
> #--
>  
> @@ -104,22 +104,22 @@ def getgitenv(user, date):
>  if state:
>  if os.path.exists(state):
>  if verbose:
> -print 'State does exist, reading'
> +print('State does exist, reading')
>  f = open(state, 'r')
>  hgvers = pickle.load(f)
>  else:
> -print 'State does not exist, first run'
> +print('State does not exist, first run')
>  
>  sock = os.popen('hg tip --template "{rev}"')
>  tip = sock.read()
>  if sock.close():
>  sys.exit(1)
>  if verbose:
> -print 'tip is', tip
> +print('tip is', tip)
>  
>  # Calculate the branches
>  if verbose:
> -print 'analysing the branches...'
> +print('analysing the branches...')
>  hgchildren["0"] = ()
>  hgparents["0"] = (None, None)
>  hgbranch["0"] = "master"
> @@ -154,15 +154,15 @@ def getgitenv(user, date):
>  else:
>  hgbranch[str(cset)] = "branch-" + str(cset)
>  
> -if not hgvers.has_key("0"):
> -print 'creating repository'
> +if "0" not in hgvers:
> +print('creating repository')
>  os.system('git init')
>  
>  # loop through every hg changeset
>  for cset in range(int(tip) + 1):
>  
>  # incremental, already seen
> -if hgvers.has_key(str(cset)):
> +if str(cset) in hgvers:
>  continue
>  hgnewcsets += 1
>  
> @@ -180,27 +180,27 @@ def getgitenv(user, date):
>  os.write(fdcomment, csetcomment)
>  os.close(fdcomment)
>  
> -print '-'
> -print 'cset:', cset
> -print 'branch:', hgbranch[str(cset)]
> -print 'user:', user
> -print 'date:', date
> -print 'comment:', csetcomment
> +print('-')
> +print('cset:', cset)
> +print('branch:', hgbranch[str(cset)])
> +print('user:', user)
> +print('date:', date)
> +print('comment:', csetcomment)
>  if parent:
> - print 'parent:', parent
> + print('parent:', parent)
>  if mparent:
> -print 'mparent:', mparent
> +print('mparent:', mparent)
>  if tag:
> -print 'tag:', tag
> -print '-'
> +print('tag:', tag)
> +print('-')
>  
>  # checkout the parent if necessary
>  if cset != 0:
>  if hgbranch[str(cset)] == "branch-" + str(cset):
> -print 'creating new branch', hgbranch[

Re: [PATCH 4/9] sparse-checkout: 'add' subcommand

2019-09-18 Thread Elijah Newren
On Wed, Sep 18, 2019 at 6:55 AM Derrick Stolee  wrote:
>
> On 8/23/2019 7:30 PM, Elijah Newren wrote:
> > On Tue, Aug 20, 2019 at 8:12 AM Derrick Stolee via GitGitGadget
> >  wrote:
> >>
...
> >> diff --git a/t/t1091-sparse-checkout-builtin.sh 
> >> b/t/t1091-sparse-checkout-builtin.sh
> >> index b7d5f15830..499bd8d6d0 100755
> >> --- a/t/t1091-sparse-checkout-builtin.sh
> >> +++ b/t/t1091-sparse-checkout-builtin.sh
> >> @@ -100,4 +100,24 @@ test_expect_success 'clone --sparse' '
> >> test_cmp expect dir
> >>  '
> >>
> >> +test_expect_success 'add to existing sparse-checkout' '
> >> +   echo "/folder2/*" | git -C repo sparse-checkout add &&
> >
> > I've always been using '/folder2/' in sparse-checkout, without the
> > trailing asterisk.  That seems more friendly for cone mode too.  Are
> > there benefits to keeping the trailing asterisk?
>
> I think I've been seeing issues with pattern matching on Windows without
> the trailing asterisk. I'm currently double-checking to make sure this
> is important or not.

Can you try with the en/clean-nested-with-ignored topic in pu to see
if that fixes those issues?


Re: [PATCH 4/9] sparse-checkout: 'add' subcommand

2019-09-18 Thread Derrick Stolee
On 8/23/2019 7:30 PM, Elijah Newren wrote:
> On Tue, Aug 20, 2019 at 8:12 AM Derrick Stolee via GitGitGadget
>  wrote:
>>
>> From: Derrick Stolee 
>>
>> The 'git sparse-checkout add' subcommand takes a list of patterns
>> over stdin and writes them to the sparse-checkout file. Then, it
>> updates the working directory using 'git read-tree -mu HEAD'.
> 
> As mentioned in response to the cover letter, I'd rather see it take
> patterns as positional arguments (though requiring a '--' argument
> before any patterns that start with a hyphen).  It could also take
> --stdin to read from stdin.
> 
>> Note: if a user adds a negative pattern that would lead to the
>> removal of a non-empty directory, then Git may not delete that
>> directory (on Windows).
> 
> This sounds like you're re-iterating a bug mentioned earlier, but if
> someone in the future comes and reads this comment it might sound like
> you're saying git can avoid clearing a directory for optimization or
> other reasons.  (And, of course, it'd be nice to figure out why this
> bug exists.)
> 
> Another question this brings up, though, is that you worked around
> this bug in 'init' so why would you not also do so for 'add'?  Seems
> inconsistent to me.
> 
>> Signed-off-by: Derrick Stolee 
>> ---
>>  Documentation/git-sparse-checkout.txt |  4 
>>  builtin/sparse-checkout.c | 32 ++-
>>  t/t1091-sparse-checkout-builtin.sh| 20 +
>>  3 files changed, 55 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/git-sparse-checkout.txt 
>> b/Documentation/git-sparse-checkout.txt
>> index 50c53ee60a..6f540a3443 100644
>> --- a/Documentation/git-sparse-checkout.txt
>> +++ b/Documentation/git-sparse-checkout.txt
>> @@ -34,6 +34,10 @@ COMMANDS
>> by Git. Add patterns to the sparse-checkout file to
>> repopulate the working directory.
>>
>> +'add'::
>> +   Add a set of patterns to the sparse-checkout file, as given over
>> +   stdin. Updates the working directory to match the new patterns.
>> +
>>  SPARSE CHECKOUT
>>  
>>
>> diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
>> index 86d24e6295..ec6134fecc 100644
>> --- a/builtin/sparse-checkout.c
>> +++ b/builtin/sparse-checkout.c
>> @@ -8,7 +8,7 @@
>>  #include "strbuf.h"
>>
>>  static char const * const builtin_sparse_checkout_usage[] = {
>> -   N_("git sparse-checkout [init|list]"),
>> +   N_("git sparse-checkout [init|add|list]"),
>> NULL
>>  };
>>
>> @@ -166,6 +166,34 @@ static int sparse_checkout_init(int argc, const char 
>> **argv)
>> return sc_read_tree();
>>  }
>>
>> +static int sparse_checkout_add(int argc, const char **argv)
>> +{
>> +   struct exclude_list el;
>> +   char *sparse_filename;
>> +   FILE *fp;
>> +   struct strbuf line = STRBUF_INIT;
>> +
>> +   memset(&el, 0, sizeof(el));
>> +
>> +   sparse_filename = get_sparse_checkout_filename();
>> +   add_excludes_from_file_to_list(sparse_filename, "", 0, &el, NULL);
> 
> el is an exclude_list and we call add_excludes_..., but it's actually
> an *include* list.  This is going to cause errors at some point, and
> will cause lots of headaches.
> 
>> +
>> +   fp = fopen(sparse_filename, "w");
>> +   write_excludes_to_file(fp, &el);
>> +
>> +   while (!strbuf_getline(&line, stdin)) {
>> +   strbuf_trim(&line);
>> +   fprintf(fp, "%s\n", line.buf);
>> +   }
> 
> Should we first check whether these excludes are already in the
> sparse-checkout file?
> 
>> +   fclose(fp);
>> +   free(sparse_filename);
>> +
>> +   clear_exclude_list(&el);
>> +
>> +   return sc_read_tree();
> 
> What if someone calls 'git sparse-checkout add' without first calling
> 'git sparse-checkout init'?  As far as I can tell, core.sparseCheckout
> will be unset (i.e. treated as false), meaning that this operation
> will do some work, but result in no changes and a report of success.
> After users try to figure out why it won't work, they eventually run
> 'git sparse-checkout init', which will delete all the entries they
> previously added with the 'add' subcommand.
> 
> What should happen instead?

If someone runs 'git sparse-checkout init' after an 'add', the
sparse-checkout file has contents, so the init does not overwrite
those.

(In the update I'm working on, 'init' doesn't delete folders, so
it will only remove the files that do not match the patterns.)

>> +}
>> +
>>  int cmd_sparse_checkout(int argc, const char **argv, const char *prefix)
>>  {
>> static struct option builtin_sparse_checkout_options[] = {
>> @@ -187,6 +215,8 @@ int cmd_sparse_checkout(int argc, const char **argv, 
>> const char *prefix)
>> return sparse_checkout_list(argc, argv);
>> if (!strcmp(argv[0], "init"))
>> return sparse_checkout_init(argc, argv);
>> +   if (!strcmp(argv[0], "add"))
>> +  

Re: [PATCH 3/9] clone: add --sparse mode

2019-09-18 Thread Derrick Stolee
On 8/23/2019 7:17 PM, Elijah Newren wrote:
> On Tue, Aug 20, 2019 at 8:12 AM Derrick Stolee via GitGitGadget
>  wrote:
>>
>> From: Derrick Stolee 
>>
>> When someone wants to clone a large repository, but plans to work
>> using a sparse-checkout file, they either need to do a full
>> checkout first and then reduce the patterns they included, or
>> clone with --no-checkout, set up their patterns, and then run
>> a checkout manually. This requires knowing a lot about the repo
>> shape and how sparse-checkout works.
>>
>> Add a new '--sparse' option to 'git clone' that initializes the
>> sparse-checkout file to include the following patterns:
>>
>> /*
>> !/*/*
>>
>> These patterns include every file in the root directory, but
>> no directories. This allows a repo to include files like a
>> README or a bootstrapping script to grow enlistments from that
>> point.
> 
> Nice.
> 
>>
>> Signed-off-by: Derrick Stolee 
>> ---
>>  Documentation/git-clone.txt|  8 +++-
>>  builtin/clone.c| 27 +++
>>  t/t1091-sparse-checkout-builtin.sh | 13 +
>>  3 files changed, 47 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
>> index 34011c2940..0fe91d2f04 100644
>> --- a/Documentation/git-clone.txt
>> +++ b/Documentation/git-clone.txt
>> @@ -15,7 +15,7 @@ SYNOPSIS
>>   [--dissociate] [--separate-git-dir ]
>>   [--depth ] [--[no-]single-branch] [--no-tags]
>>   [--recurse-submodules[=]] [--[no-]shallow-submodules]
>> - [--[no-]remote-submodules] [--jobs ] [--] 
>> + [--[no-]remote-submodules] [--jobs ] [--sparse] [--] 
>> 
>>   []
>>
>>  DESCRIPTION
>> @@ -156,6 +156,12 @@ objects from the source repository into a pack in the 
>> cloned repository.
>> used, neither remote-tracking branches nor the related
>> configuration variables are created.
>>
>> +--sparse::
>> +   Initialize the sparse-checkout file so the working
>> +   directory starts with only the files in the root
>> +   of the repository. The sparse-checkout file can be
>> +   modified to grow the working directory as needed.
>> +
>>  --mirror::
>> Set up a mirror of the source repository.  This implies `--bare`.
>> Compared to `--bare`, `--mirror` not only maps local branches of the
>> diff --git a/builtin/clone.c b/builtin/clone.c
>> index f665b28ccc..d6d49a73ff 100644
>> --- a/builtin/clone.c
>> +++ b/builtin/clone.c
>> @@ -60,6 +60,7 @@ static const char *real_git_dir;
>>  static char *option_upload_pack = "git-upload-pack";
>>  static int option_verbosity;
>>  static int option_progress = -1;
>> +static int option_sparse_checkout;
>>  static enum transport_family family;
>>  static struct string_list option_config = STRING_LIST_INIT_NODUP;
>>  static struct string_list option_required_reference = 
>> STRING_LIST_INIT_NODUP;
>> @@ -147,6 +148,8 @@ static struct option builtin_clone_options[] = {
>> OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options),
>> OPT_BOOL(0, "remote-submodules", &option_remote_submodules,
>> N_("any cloned submodules will use their remote-tracking 
>> branch")),
>> +   OPT_BOOL(0, "sparse", &option_sparse_checkout,
>> +   N_("initialize sparse-checkout file to include only 
>> files at root")),
>> OPT_END()
>>  };
>>
>> @@ -734,6 +737,27 @@ static void update_head(const struct ref *our, const 
>> struct ref *remote,
>> }
>>  }
>>
>> +static int git_sparse_checkout_init(const char *repo)
>> +{
>> +   struct argv_array argv = ARGV_ARRAY_INIT;
>> +   int result = 0;
>> +   argv_array_pushl(&argv, "-C", repo, "sparse-checkout", "init", NULL);
>> +
>> +   /*
>> +* We must apply the setting in the current process
>> +* for the later checkout to use the sparse-checkout file.
>> +*/
>> +   core_apply_sparse_checkout = 1;
>> +
>> +   if (run_command_v_opt(argv.argv, RUN_GIT_CMD)) {
>> +   error(_("failed to initialize sparse-checkout"));
>> +   result = 1;
>> +   }
> 
> Sigh...so much forking of additional processes.  I'd really rather
> that we were reducing how much of this we are doing in the codebase
> instead of adding more.  Every fork makes following stuff in a
> debugger harder.

At the moment, this is the simplest way to do this interaction. The
init subcommand is doing multiple things, and we can consider moving
this to be a library method instead of builtin-specific code later.

This is not a huge performance hit, as "clone" is called only once
per repo.

>> +
>> +   argv_array_clear(&argv);
>> +   return result;
>> +}
>> +
>>  static int checkout(int submodule_progress)
>>  {
>> struct object_id oid;
>> @@ -1107,6 +1131,9 @@ int cmd_clone(int argc, const char **argv, const char 
>> *prefix)
>> if (option_required_reference.nr || option_optional

Re: [PATCH] ls-remote: create '--count' option

2019-09-17 Thread Johannes Sixt
Am 18.09.19 um 02:11 schrieb Kamil Domański:
> Create a '--count' option for ls-remote, based on the one from
> for-each-ref. This allows e.g. to return only the first result
> from a sorted list of refs.
> 
> Signed-off-by: Kamil Domański 
> ---
>  Documentation/git-ls-remote.txt | 11 ---
>  builtin/ls-remote.c | 16 
>  t/t5512-ls-remote.sh|  9 +
>  3 files changed, 29 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/git-ls-remote.txt b/Documentation/git-ls-remote.txt
> index 0b057cbb10..5adc1d676e 100644
> --- a/Documentation/git-ls-remote.txt
> +++ b/Documentation/git-ls-remote.txt
> @@ -9,9 +9,9 @@ git-ls-remote - List references in a remote repository
>  SYNOPSIS
>  
>  [verse]
> -'git ls-remote' [--heads] [--tags] [--refs] [--upload-pack=]
> -   [-q | --quiet] [--exit-code] [--get-url] [--sort=]
> -   [--symref] [ [...]]
> +'git ls-remote' [--count=] [--heads] [--tags] [--refs]
> +   [--upload-pack=] [-q | --quiet] [--exit-code] [--get-url]
> +   [--sort=] [--symref] [ [...]]

It is understandable that the new option is important to _you_, but it
does not seem important enough that it must be the first in the list.
Please add it between --symref and 

>  
>  DESCRIPTION
>  ---
> @@ -21,6 +21,11 @@ commit IDs.
>  
>  OPTIONS
>  ---
> +--count=::
> + By default the command shows all refs that match
> + ``.  This option makes it stop after showing
> + that many refs.

Is the meaning of this option perhaps:

Stops after the specified count of refs have been listed.
If `--sort=` is specified as well, refs are counted
after sorting; otherwise, it is unspecified which subset
of is listed.

I do not know whether the "otherwise" part would be true (check it!),
but I am pretty certain that the "If" part must be true, otherwise the
option would be pointless.

The comment about the ordering of this paragraph at the very beginning
of the option list applies here, too, because the list is not sorted
alphabetically.

-- Hannes


Re: [PATCH] Rewrite feature to render hg-to-git compatible with python2.x and 3.x

2019-09-17 Thread Junio C Hamano
Junio C Hamano  writes:

One more thing.

> Subject: Re: [PATCH] Rewrite feature to render hg-to-git compatible with 
> python2.x and 3.x

Our commit titles typically begin with : prefix, e.g.

Subject: hg-to-git: make it compatibile with both Python 3 and Python 2

or something like that.


Re: [PATCH] Rewrite feature to render hg-to-git compatible with python2.x and 3.x

2019-09-17 Thread Junio C Hamano
Hervé Beraud  writes:

> Rewrite features that are no longer supported (or recommended)
> in Python 3 in the script so that it can be used with both
> Python 2 and 3, namely:
>
> - print is not a statement; use print() function instead.
> - dict.has_key(key) is no more; use "key in dict" instead.
> - map(lambda ..., collection) is not liked; use list comprehension.

The first two are must haves (iow, without them the code would not
work with Py3), but the last one sounds like a mere preference.  Is
there some authoritative source we can quote (e.g. "PEPx says that
we should prefer X over Y")?

> These changes introduce a syntaxe compatible with the both versions of
> python.
>
> Python 2 is EOL at the end of 2019, the major part of distros
> and systems now come with python 3  is the default version so
> these address the situation to render this script compatible with
> the latest versions of python.

The explanation looks vaguely familiar.  I would have placed the
background first, and then description of what is done next, perhaps
like:

Python 2 is EOL at the end of 2019, many distros and systems now
come with python 3 as their default version.

Rewrite features that are no longer supported (or recommended)
in Python 3 in the script so that it can be used with both
Python 2 and 3, namely:

- print is not a statement; use print() function instead.
- dict.has_key(key) is no more; use "key in dict" instead.
- map(lambda ..., collection) is not liked; use list comprehension.

to make it even easier to read without unnecessary repetition, but
what you wrote is clear enough already.

The patch text looked clearly done.  Nice.


Re: [PATCH] gitk: rename zh_CN.po to zh_cn.po

2019-09-17 Thread Junio C Hamano
Junio C Hamano  writes:

> This would make this in line with pt_{pt,br}.po, existing locales
> with country code in them.
>
> Reviewed-by: Junio C Hamano 

Paul, I've applied this to a clone of gitk locally and merged it
(resutling in 2a4ac71ffb that should be reachable from 'master'), as
it was breaking my "install" tests.

Thanks.


Re: [PATCH 2/3] list-objects-filter: delay parsing of sparse oid

2019-09-17 Thread Junio C Hamano
Jeff King  writes:

> It's a good idea to parse as much as we can in step 1, in order to catch
> problems early (e.g., a blob size limit that isn't a number). But one
> thing we _shouldn't_ do is resolve any oids at that step (e.g., for
> sparse-file contents specified by oid). In the case of a fetch, the oid
> has to be resolved on the remote side.
> ...
> We can fix both by moving the oid resolution into the sparse-oid init
> function. At that point we know we have a repository (because we're
> about to traverse), and handling the error there fixes the segfault.

Makes sense.  Thanks for a clean solution to a messy problem.



Re: [PATCH v2 1/2] diff, log doc: say "patch text" instead of "patches"

2019-09-17 Thread Martin Ågren
On Mon, 16 Sep 2019 at 22:46, Johannes Sixt  wrote:
>
> Am 16.09.19 um 21:58 schrieb Junio C Hamano:
> > I wonder if the result becomes even clearer if we dropped "instead
> > of the usual output".  It is a given that presence of an option
> > would change the behaviour, so "instead of the usual" does not add
> > any value in the context of the explanation we are giving.
>
> Agreed.
>
> > Also I question the value of the "running git diff without --raw
> > option" sentence; "diff --stat" is also a way to suppress the patch
> > text and see only the overview; I know it is not a new problem this
> > patch introduces, but the objective of this patch is clarify about
> > the generation of output in patch format, so...
>
> You have a point here, too.
>
> Below is v2 of just patch 1/2. 2/2 remains unchanged. I've added
> git-show to the enumeration.

Yeah, this makes sense. Tested -- this renders fine.

Martin


Re: [PATCH 1/1] Git config allows value setting from stdin

2019-09-17 Thread Junio C Hamano
Zeger-Jan van de Weg  writes:

> Subject: Re: [PATCH 1/1] Git config allows value setting from stdin

Probably something like this, with the ":" prefix.

config: learn the --stdin option to take args from the standard input

> When setting values in the git config, the value is part of the
> arguments for execution. This potentially leaks the value through
> logging, or other programs like `ps`.

OK.

> Prior to this change, there was no option to do this. This change adds
> the `--stdin` to be combined with `--add`. When passed, the value cannot
> be passed and is read through stdin.

That's overly verbose.

Add the `--stdin` option that can be used with `--add` to
instead take the variables and values from the standard
input to hide them from prying eyes.

or something?  When you say "Add", we know there isn't any right now
(that is why you are adding, after all).

Also, shouldn't the variable also be considered sensitive?  IOW

$ git config --stdin --add <<\EOF
remote.hidden.url=https://user:pass@over.there/repo
EOF

instead of

$ git config --stdin --add remote.hidden.url <<\EOF
https://user:pass@over.there/repo
EOF

?  

Incidentally, allowing it to take variable=value pair would also
allow you to set many of them in batch, which is another benefit.

> -'git config' [] [--type=] --add name value
> +'git config' [] [--type=] --add [--stdin] name [value]

This does not convey "you pass name and value without --stdin, or
you pass only name with --stdin" and instead allow a nonsense like
"git config --add name".  Splitting it into two would be a way to
tell this unambiguously to the readers, e.g. 

git config [--type=] --add name value
git config [--type=] --add --stdin name

although I suspect we would also want to allow treating the varilabe
names as sensitive.

Thanks.


Re: [PATCH] gitk: rename zh_CN.po to zh_cn.po

2019-09-17 Thread Junio C Hamano
Denton Liu  writes:

> When running make from a clean environment, all of the *.po files should
> be converted into *.msg files. After that, when make is run without any
> changes, make should not do anything.
>
> After beffae768a (gitk: Add Chinese (zh_CN) translation, 2017-03-11),
> zh_CN.po was introduced. When make was run, a zh_cn.msg file was
> generated (notice the lowercase). However, since make is case-sensitive,
> it expects zh_CN.po to generate a zh_CN.msg file so make will keep
> reattempting to generate a zh_CN.msg so successive make invocations
> result in
>
> Generating catalog po/zh_cn.msg
> msgfmt --statistics --tcl po/zh_cn.po -l zh_cn -d po/
> 317 translated messages.
>
> happening continuously.
>
> Rename zh_CN.po to zh_cn.po so that when make generates the zh_cn.msg
> file, it will realize that it was successfully generated and only run
> once.

True, and it is not just that.  The install target would fail
because it cannot find zh_CN.msg, I think.

> Signed-off-by: Denton Liu 
> ---
>  po/{zh_CN.po => zh_cn.po} | 0
>  1 file changed, 0 insertions(+), 0 deletions(-)
>  rename po/{zh_CN.po => zh_cn.po} (100%)
>
> diff --git a/po/zh_CN.po b/po/zh_cn.po
> similarity index 100%
> rename from po/zh_CN.po
> rename to po/zh_cn.po

This would make this in line with pt_{pt,br}.po, existing locales
with country code in them.

Reviewed-by: Junio C Hamano 


Re: [PATCH v3 0/4] Makefile: run coccicheck on all non-upstream sources

2019-09-17 Thread Junio C Hamano
SZEDER Gábor  writes:

> On Mon, Sep 16, 2019 at 01:57:14PM -0700, Junio C Hamano wrote:
>> Queued; thanks.
>
> Nit: it would be nicer to queue this series on top of
> 'dl/compat-cleanup', because 'make coccicheck' errors out suggesting
> transformations in 'compat/mingw.c' that have only been applied in
> that series.

Yikes, yes I remember that you pointed it out already, yet I
forgot.  Will correct.

Thanks.


Re: [PATCH v2 1/1] commit-graph: add --[no-]progress to write and verify.

2019-09-17 Thread SZEDER Gábor
On Tue, Sep 17, 2019 at 06:47:38AM -0400, Derrick Stolee wrote:
> 
> On 9/16/2019 6:36 PM, SZEDER Gábor wrote:
> > On Mon, Aug 26, 2019 at 09:29:58AM -0700, Garima Singh via GitGitGadget 
> > wrote:
> >> From: Garima Singh 
> >>
> >> Add --[no-]progress to git commit-graph write and verify.
> >> The progress feature was introduced in 7b0f229
> >> ("commit-graph write: add progress output", 2018-09-17) but
> >> the ability to opt-out was overlooked.
> > 
> >> diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
> >> index 99f4ef4c19..4fc3fda9d6 100755
> >> --- a/t/t5324-split-commit-graph.sh
> >> +++ b/t/t5324-split-commit-graph.sh
> >> @@ -319,7 +319,7 @@ test_expect_success 'add octopus merge' '
> >>git merge commits/3 commits/4 &&
> >>git branch merge/octopus &&
> >>git commit-graph write --reachable --split &&
> >> -  git commit-graph verify 2>err &&
> >> +  git commit-graph verify --progress 2>err &&
> > 
> > Why is it necessary to use '--progress' here?  It should not be
> > necessary, because the commit message doesn't mention that it changed
> > the default behavior of 'git commit-graph verify'...
> 
> It does change the default when stderr is not a terminal window. If we
> were not redirecting to a file, this change would not be necessary.

OK, yesterday I overlooked that the patch added this line:

  +   opts.progress = isatty(2);

So, the first question is whether that behavior change is desired; I
don't really have an opinion.  But if it is desired, then it should be
changed in a separate patch, explaining why it is desired, I would
think.

> >>test_line_count = 3 err &&
> > 
> > Having said that, this test should not check the number of progress
> > lines in the first place; see the recent discussion:
> > 
> > https://public-inbox.org/git/ec14865f-98cb-5e1a-b580-8b6fddaa6...@gmail.com/
> 
> True, this is an old issue. I think it never got corrected because
> your reply sounded like the issue doesn't exist in the normal test
> suite,

Well, the way I see it the root issue is that the test checks things
that it shouldn't.

> only in a private branch where you changed the behavior of
> GIT_TEST_GETTEXT_POISON.
> 
> If we still think that should be fixed, it should not be a part of
> this series, but should be a separate one that focuses on just
> those changes.

Yeah, it should rather go on top of 'ds/commit-graph-octopus-fix'.



Re: [PATCH v2 1/1] commit-graph: add --[no-]progress to write and verify.

2019-09-17 Thread Derrick Stolee


On 9/16/2019 6:36 PM, SZEDER Gábor wrote:
> On Mon, Aug 26, 2019 at 09:29:58AM -0700, Garima Singh via GitGitGadget wrote:
>> From: Garima Singh 
>>
>> Add --[no-]progress to git commit-graph write and verify.
>> The progress feature was introduced in 7b0f229
>> ("commit-graph write: add progress output", 2018-09-17) but
>> the ability to opt-out was overlooked.
> 
>> diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
>> index 99f4ef4c19..4fc3fda9d6 100755
>> --- a/t/t5324-split-commit-graph.sh
>> +++ b/t/t5324-split-commit-graph.sh
>> @@ -319,7 +319,7 @@ test_expect_success 'add octopus merge' '
>>  git merge commits/3 commits/4 &&
>>  git branch merge/octopus &&
>>  git commit-graph write --reachable --split &&
>> -git commit-graph verify 2>err &&
>> +git commit-graph verify --progress 2>err &&
> 
> Why is it necessary to use '--progress' here?  It should not be
> necessary, because the commit message doesn't mention that it changed
> the default behavior of 'git commit-graph verify'...

It does change the default when stderr is not a terminal window. If we
were not redirecting to a file, this change would not be necessary.
 
>>  test_line_count = 3 err &&
> 
> Having said that, this test should not check the number of progress
> lines in the first place; see the recent discussion:
> 
> https://public-inbox.org/git/ec14865f-98cb-5e1a-b580-8b6fddaa6...@gmail.com/

True, this is an old issue. I think it never got corrected because
your reply sounded like the issue doesn't exist in the normal test
suite, only in a private branch where you changed the behavior of
GIT_TEST_GETTEXT_POISON.

If we still think that should be fixed, it should not be a part of
this series, but should be a separate one that focuses on just
those changes.

Thanks,
-Stolee


Re: [PATCH v3 0/4] Makefile: run coccicheck on all non-upstream sources

2019-09-17 Thread SZEDER Gábor
On Mon, Sep 16, 2019 at 01:57:14PM -0700, Junio C Hamano wrote:
> Queued; thanks.

Nit: it would be nicer to queue this series on top of
'dl/compat-cleanup', because 'make coccicheck' errors out suggesting
transformations in 'compat/mingw.c' that have only been applied in
that series.



Re: [PATCH v2 1/1] commit-graph: add --[no-]progress to write and verify.

2019-09-16 Thread SZEDER Gábor
On Mon, Aug 26, 2019 at 09:29:58AM -0700, Garima Singh via GitGitGadget wrote:
> From: Garima Singh 
> 
> Add --[no-]progress to git commit-graph write and verify.
> The progress feature was introduced in 7b0f229
> ("commit-graph write: add progress output", 2018-09-17) but
> the ability to opt-out was overlooked.

> diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
> index 99f4ef4c19..4fc3fda9d6 100755
> --- a/t/t5324-split-commit-graph.sh
> +++ b/t/t5324-split-commit-graph.sh
> @@ -319,7 +319,7 @@ test_expect_success 'add octopus merge' '
>   git merge commits/3 commits/4 &&
>   git branch merge/octopus &&
>   git commit-graph write --reachable --split &&
> - git commit-graph verify 2>err &&
> + git commit-graph verify --progress 2>err &&

Why is it necessary to use '--progress' here?  It should not be
necessary, because the commit message doesn't mention that it changed
the default behavior of 'git commit-graph verify'...

>   test_line_count = 3 err &&

Having said that, this test should not check the number of progress
lines in the first place; see the recent discussion:

https://public-inbox.org/git/ec14865f-98cb-5e1a-b580-8b6fddaa6...@gmail.com/

>   test_i18ngrep ! warning err &&
>   test_line_count = 3 $graphdir/commit-graph-chain
> -- 
> gitgitgadget


Re: [PATCH v3 0/4] Makefile: run coccicheck on all non-upstream sources

2019-09-16 Thread Junio C Hamano
Queued; thanks.


Re: [PATCH v3 2/4] Makefile: define THIRD_PARTY_SOURCES

2019-09-16 Thread Junio C Hamano
Denton Liu  writes:

> +# THIRD_PARTY_SOURCES is a list of patterns compatible with
> +# the $(filter) and $(filter-out) family of functions

That defines the format, but does it convey what they want to
achieve to the readers?  "... to catch the list of source files
we borrowed from elsewhere" or something, perhaps?

> +THIRD_PARTY_SOURCES += compat/inet_ntop.c
> +THIRD_PARTY_SOURCES += compat/inet_pton.c
> +THIRD_PARTY_SOURCES += compat/obstack.%
> +THIRD_PARTY_SOURCES += compat/nedmalloc/%
> +THIRD_PARTY_SOURCES += compat/poll/%
> +THIRD_PARTY_SOURCES += compat/regex/%
> +THIRD_PARTY_SOURCES += sha1collisiondetection/%
> +THIRD_PARTY_SOURCES += sha1dc/%
> +
>  GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB)
>  EXTLIBS =


Re: [PATCH 0/1] fetch: Cache the want OIDs for faster lookup

2019-09-16 Thread Junio C Hamano
Masaya Suzuki  writes:

> Debugging this, it seems most of the time is spent on iterating the want refs 
> to
> see OIDs are included there. This patch speeds up this process by using 
> oid_set.
> Now the client can send a fetch request almost immediately.

Wonderful.  Thanks.


Re: [PATCH 1/2] diff, log doc: say "patch text" instead of "patches"

2019-09-16 Thread Junio C Hamano
Martin Ågren  writes:

> On Sun, 15 Sep 2019 at 21:26, Johannes Sixt  wrote:
>>  I do not have the toolchain to check that a correct result is produced.
>
> But I do. I've tested this patch and 2/2 with AsciiDoc 8.6.10 and
> Asciidoctor 1.5.5, as well as with Asciidoctor 2.0.10 (on top of brian's
> recent patch so that it builds to begin with). They all render this
> nicely.
>
> Both of these patches seem like good changes to me.

Thanks, both.  I am neutral between "patch" and "patch text", but if
the latter is more easily understood by readers, that would be
great.  "patch *file*" certainly does sound misleading.

I wonder if the result becomes even clearer if we dropped "instead
of the usual output".  It is a given that presence of an option
would change the behaviour, so "instead of the usual" does not add
any value in the context of the explanation we are giving.

Also I question the value of the "running git diff without --raw
option" sentence; "diff --stat" is also a way to suppress the patch
text and see only the overview; I know it is not a new problem this
patch introduces, but the objective of this patch is clarify about
the generation of output in patch format, so...


Re: [PATCH 1/1] multi-pack-index: add --no-progress Add --no-progress option to git multi-pack-index. The progress feature was added in 144d703 ("multi-pack-index: report progress during 'verify'", 20

2019-09-16 Thread William Baker
On 9/13/19 1:26 PM, Junio C Hamano wrote:

> Compare the ways how dispatching and command line option parsing of
> subcommands in "multi-pack-index" and "commit-graph" are
> implemented.  When a command (e.g. "commit-graph") takes common
> options and also has subcommands (e.g. "read" and "write") that take
> different set of options, there is a common options parser in the
> primary entry point (e.g. "cmd_commit_graph()"), and after
> dispatching to a chosen subcommand, the implementation of each
> subcommand (e.g. "graph_read()" and "graph_write()") parses its own
> options.  That's bog-standard way.

Thanks for the pointer to "commit-graph", looking through that code
cleared up any questions I had.

After taking another look through the "multi-pack-index" code my plan
is to update all of the subcommands to understand [--[no-]progress].
I gave the public struct in midx.h approach a try, but after seeing
how that looks I think it would be cleaner to update "write" and "expire"
to display progress and have an explicit parameter.

> Started as an isolated experimental command whose
> existence as a standalone command is solely because it was easier to
> experiment with (as opposed to being a plumbing command to be used
> by scripters), it probably was an acceptable trade-off to leave the
> code in this shape.  In the longer run, I suspect we'd rather want
> to get rid of "git multi-pack-index" as a standalone command and
> instead make "git gc" and other commands make direct calls to the
> internal machinery of midx code from strategic places.  So in that
> sense, I am not sure if I should "recommend" fixing the way the
> subcommand dispatching works in this command, or just accept to keep
> the ugly technical debt and let it limp along...

Thanks for the background here, it helped with understanding why the
multi-pack-index parsing is different than the other commands.

My plan is to include 3 commits in the next (v2) patch series:

1. Adding the new parameters to midx.h/c to control progress
2. Update write/expire to display progress
3. Update the multi-pack-index.c builtin to parse the [--[no-]progress]
   option and update the tests.

I wasn't thinking that I would adjust the the subcommand dispatching in 
multi-pack-index.c in this patch series.  By updating all of the subcommands
to support [--[no-]progress] I should be able to keep the changes to 
multi-pack-index.c quite small.  If you see any potential issues with this
approach please let me know.

Thanks,
William


Re: [PATCH v2] git-submodule.txt: fix AsciiDoc formatting error

2019-09-16 Thread Junio C Hamano
Denton Liu  writes:

> Break `--default` and `--branch` into their own separate invocations to
> make it obvious that these options are mutually exclusive.

That sounds even clearer than what the original wanted to do.  Very
good idea.

> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index 0ed5c24dc1..1f46380af2 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -173,7 +173,8 @@ submodule with the `--init` option.
>  If `--recursive` is specified, this command will recurse into the
>  registered submodules, and update any nested submodules within.
>  --
> -set-branch ((-d|--default)|(-b|--branch )) [--] ::
> +set-branch (-b|--branch)  [--] ::
> +set-branch (-d|--default) [--] ::
>   Sets the default remote tracking branch for the submodule. The
>   `--branch` option allows the remote branch to be specified. The
>   `--default` option removes the submodule..branch configuration


Re: [PATCH v2] Documentation: fix build with Asciidoctor 2

2019-09-16 Thread Junio C Hamano
Martin Ågren  writes:

>>   -Git omitted   01/01/1970
>> GIT-ADD(1)
>>   +  omitted 1970-01-01
>> GIT-ADD(1)
>
> Yeah, I should be able to post v3 of my refmiscinfo-series this evening,
> which should fix this, so that the only difference that remains here is
> how the date is formatted.

Thanks.


Re: [PATCH 1/2] diff, log doc: say "patch text" instead of "patches"

2019-09-16 Thread Martin Ågren
On Sun, 15 Sep 2019 at 21:26, Johannes Sixt  wrote:
>  I do not have the toolchain to check that a correct result is produced.

But I do. I've tested this patch and 2/2 with AsciiDoc 8.6.10 and
Asciidoctor 1.5.5, as well as with Asciidoctor 2.0.10 (on top of brian's
recent patch so that it builds to begin with). They all render this
nicely.

Both of these patches seem like good changes to me.

Martin


Re: [PATCH 0/3] clone --filter=sparse:oid bugs

2019-09-16 Thread Jeff Hostetler




On 9/14/2019 9:09 PM, Jeff King wrote:

On Mon, Sep 09, 2019 at 01:08:24PM -0400, Jeff King wrote:


I'll work up what I sent earlier into a real patch, and include some of
this discussion.


Here it is. I pulled Jon's tests out into their own patch (mostly
because it makes it easier to give credit). Then patch 2 is my fix, and
patch 3 is the message fixups he had done.

This replaces what's queued in js/partial-clone-sparse-blob.

   [1/3]: t5616: test cloning/fetching with sparse:oid= filter
   [2/3]: list-objects-filter: delay parsing of sparse oid
   [3/3]: list-objects-filter: give a more specific error sparse parsing error

  builtin/rev-list.c|  4 
  list-objects-filter-options.c | 14 ++
  list-objects-filter-options.h |  2 +-
  list-objects-filter.c | 14 +++---
  t/t5616-partial-clone.sh  | 36 +++
  5 files changed, 50 insertions(+), 20 deletions(-)



This series looks good to me.

Thanks!
Jeff


Re: [PATCH v5] git-gui: add hotkey to toggle "Amend Last Commit"

2019-09-16 Thread Birger Skogeng Pedersen
Hi,

On Sat, Sep 14, 2019 at 7:48 PM Pratyush Yadav  wrote:
> You missed labelling the menu item of "Amend Last Commit" with the
> shortcut, like we do for other menu items bound to a hotkey like F5 for
> rescan, Ctrl-T for stage, etc. I added that locally.

Sorry about that. Thanks!


Birger


Re: [PATCH v3] Documentation: fix build with Asciidoctor 2

2019-09-16 Thread Martin Ågren
On Sun, 15 Sep 2019 at 23:26, brian m. carlson
 wrote:
>
> On 2019-09-15 at 09:59:52, SZEDER Gábor wrote:
> > On Sat, Sep 14, 2019 at 07:49:19PM +, brian m. carlson wrote:
> > > test -n "$ALREADY_HAVE_ASCIIDOCTOR" ||
> > > gem install --version 1.5.8 asciidoctor
> >
> > So, since the documentation can now be built with Asciidoctor v2, is
> > it already time to remove this '--version 1.5.8'?
>
> I think Martin was going to send in some more patches before we did
> that.

I've got two series floating around [1] [2]. I'll be rerolling [2]
hopefully tonight and then I feel pretty good about the manpages. I've
got a third series that I'll get to then, which fixes up the rendering
of user-manual.pdf/html with Asciidoctor. Assuming those three series
graduate, I'm not aware of any reason to hold off on the switch to
"Asciidoctor by default".

As for "v2.x vs 1.5.8", that seems like a separate issue to me, though
-- and one that your patch does a very good job at! My series [2] will
make the rendering with v2.x /prettier/, but since your series makes the
docs build /at all/, maybe it's worth switching the CI-job sooner rather
than later to make sure we keep it so.

[1] https://public-inbox.org/git/cover.1567707999.git.martin.ag...@gmail.com/

[2] https://public-inbox.org/git/cover.1567534373.git.martin.ag...@gmail.com/

Martin


Re: [PATCH v2] Documentation: fix build with Asciidoctor 2

2019-09-16 Thread Martin Ågren
On Fri, 13 Sep 2019 at 07:06, Jeff King  wrote:
>
> On Fri, Sep 13, 2019 at 01:52:40AM +, brian m. carlson wrote:
>
>
> >  Documentation/Makefile| 4 +++-
> >  Documentation/manpage.xsl | 3 +++
> >  ci/test-documentation.sh  | 2 ++
> >  3 files changed, 8 insertions(+), 1 deletion(-)
> >  create mode 100644 Documentation/manpage.xsl
>
> Running with this patch on asciidoctor 2.0.10, plus Martin's recent
> literal-block cleanups, plus his refmiscinfo fix, I get pretty decent
> output from:
>
>   ./doc-diff --from-asciidoc --to-asciidoctor origin HEAD
>
> The header/footer are still a little funny (but I think Martin said that
> he needs to update the refmiscinfo patches for later versions of
> asciidoctor, which is probably what's going on here):
>
>   --- 
> a/f1d4a28250629ae469fc5dd59ab843cb2fd68e12-asciidoc/home/peff/share/man/man1/git-add.1
>   +++ 
> b/6c08635fd1d38c83d3765ff05fabbfbd25ef4943-asciidoctor/home/peff/share/man/man1/git-add.1
>   @@ -1,4 +1,4 @@
>   -GIT-ADD(1)Git Manual
> GIT-ADD(1)
>   +GIT-ADD(1)  
> GIT-ADD(1)
>
>NAME
>   git-add - Add file contents to the index
>   @@ -356,4 +356,4 @@ SEE ALSO
>GIT
>   Part of the git(1) suite
>
>   -Git omitted   01/01/1970
> GIT-ADD(1)
>   +  omitted 1970-01-01
> GIT-ADD(1)

Yeah, I should be able to post v3 of my refmiscinfo-series this evening,
which should fix this, so that the only difference that remains here is
how the date is formatted.

Martin


Re: [PATCH 1/1] fetch: Cache the want OIDs for faster lookup

2019-09-15 Thread Jeff King
On Sun, Sep 15, 2019 at 02:18:02PM -0700, Masaya Suzuki wrote:

> During git-fetch, the client checks if the advertised tags' OIDs are
> already in the fetch request's want OID set. This check is done in a
> linear scan. For a repository that has a lot of refs, repeating this
> scan takes 15+ minutes. In order to speed this up, create a oid_set for
> other refs' OIDs.

Good find. I was curious why nobody noticed it sooner, but I think a key
element in your chromium example is the use of "--mirror", which brings
in a bunch of refs which would not normally be grabbed. There are only
about 17k heads and tags, but over 1.5M entries in refs/changes.
Quadratically speaking, that's almost 8000 times worse, so your
15-minute delay is only 1/10th of a second in the non-mirror case.

Mostly I was wondering if this might have been a recent regression. But
it looks rather old:

> -static int will_fetch(struct ref **head, const unsigned char *sha1)
> +static void create_fetch_oidset(struct ref **head, struct oidset *out)
>  {
>   struct ref *rm = *head;
>   while (rm) {
> - if (hasheq(rm->old_oid.hash, sha1))
> - return 1;
> + oidset_insert(out, &rm->old_oid);
>   rm = rm->next;
>   }
> - return 0;
>  }

This function comes from cf7f929a10 (Teach git-fetch to grab a tag at
the same time as a commit, 2008-03-02). As explained in that commit
message, this technique can't actually get all the tags we're looking
for. It was around the same time, in 348e390b17 (Teach
fetch-pack/upload-pack about --include-tag, 2008-03-03), that we added
an extension to do this reliably.

So it _might_ even be possible to get rid of this code entirely, if we
don't care about pre-2008 versions of Git (we'd still get the right
answer from them; it would just take an extra round-trip).

That said, it seems like a good idea to do this obvious and safe
conversion in the near term, and we can look at removing it entirely as
a separate topic. The cost of the oidset isn't too bad (~30MB, I guess
even for that pathological chromium case).

> @@ -324,6 +324,7 @@ static void find_non_local_tags(const struct ref *refs,
>  
>   refname_hash_init(&existing_refs);
>   refname_hash_init(&remote_refs);
> + create_fetch_oidset(head, &fetch_oids);
>  
>   for_each_ref(add_one_refname, &existing_refs);
>   for (ref = refs; ref; ref = ref->next) {
> @@ -340,9 +341,9 @@ static void find_non_local_tags(const struct ref *refs,
>   if (item &&
>   !has_object_file_with_flags(&ref->old_oid,
>   OBJECT_INFO_QUICK) &&
> - !will_fetch(head, ref->old_oid.hash) &&
> + !oidset_contains(&fetch_oids, &ref->old_oid) &&
>   !has_object_file_with_flags(&item->oid, 
> OBJECT_INFO_QUICK) &&
> - !will_fetch(head, item->oid.hash))
> + !oidset_contains(&fetch_oids, &item->oid))
>   clear_item(item);
>   item = NULL;
>   continue;

One thing to check is whether the list we feed to will_fetch ever gets
updated in the loop that checks it (i.e., do we need to be adding or
clearing elements in the oidset).

I _think_ the answer is no, because we modify only the "refs" list in
that loop (via clear_item()), but the will_fetch() check is on the
head/tail we receive (which isn't updated until later). Both of those
lists are passed into the function, but I didn't trace it all the way up
to make sure they are not aliases.

-Peff


Re: [PATCH 1/1] fetch: Cache the want OIDs for faster lookup

2019-09-15 Thread Masaya Suzuki
On Sun, Sep 15, 2019 at 7:35 PM Derrick Stolee  wrote:
>
> On 9/15/2019 5:18 PM, Masaya Suzuki wrote:
> > During git-fetch, the client checks if the advertised tags' OIDs are
> > already in the fetch request's want OID set. This check is done in a
> > linear scan. For a repository that has a lot of refs, repeating this
> > scan takes 15+ minutes. In order to speed this up, create a oid_set for
> > other refs' OIDs.
>
> Good catch! Quadratic performance is never good.
>
> The patch below looks like it works, but could you also share your
> performance timings for the 15+ minute case after your patch is
> applied?

With the following code change, I measured the time for
find_non_local_tags. It shows 215 msec with the example commands. (I
didn't measure entire fetch time as good portion of the time is spent
on the server side.)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 51a276dfaa..d3b06c733d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -25,6 +25,7 @@
 #include "list-objects-filter-options.h"
 #include "commit-reach.h"
 #include "branch.h"
+#include 

 #define FORCED_UPDATES_DELAY_WARNING_IN_MS (10 * 1000)

@@ -322,8 +323,11 @@ static void find_non_local_tags(const struct ref *refs,
const struct ref *ref;
struct refname_hash_entry *item = NULL;

+   struct timespec start, end;
+
refname_hash_init(&existing_refs);
refname_hash_init(&remote_refs);
+   clock_gettime(CLOCK_MONOTONIC, &start);
create_fetch_oidset(head, &fetch_oids);

for_each_ref(add_one_refname, &existing_refs);
@@ -405,6 +409,12 @@ static void find_non_local_tags(const struct ref *refs,
}
hashmap_free(&remote_refs, 1);
string_list_clear(&remote_refs_list, 0);
+   clock_gettime(CLOCK_MONOTONIC, &end);
+   {
+   uint64_t millisec = (end.tv_sec - start.tv_sec) * 1000
+ (end.tv_nsec - start.tv_nsec) / 100;
+   fprintf(stderr, "find_non_local_tags: %ld msec\n", millisec);
+   }
+
oidset_clear(&fetch_oids);
 }


Re: [PATCH 1/1] fetch: Cache the want OIDs for faster lookup

2019-09-15 Thread Derrick Stolee
On 9/15/2019 5:18 PM, Masaya Suzuki wrote:
> During git-fetch, the client checks if the advertised tags' OIDs are
> already in the fetch request's want OID set. This check is done in a
> linear scan. For a repository that has a lot of refs, repeating this
> scan takes 15+ minutes. In order to speed this up, create a oid_set for
> other refs' OIDs.

Good catch! Quadratic performance is never good.

The patch below looks like it works, but could you also share your
performance timings for the 15+ minute case after your patch is
applied?

Thanks,
-Stolee

> 
> Signed-off-by: Masaya Suzuki 
> ---
>  builtin/fetch.c | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 54d6b01892..51a276dfaa 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -7,6 +7,7 @@
>  #include "refs.h"
>  #include "refspec.h"
>  #include "object-store.h"
> +#include "oidset.h"
>  #include "commit.h"
>  #include "builtin.h"
>  #include "string-list.h"
> @@ -243,15 +244,13 @@ static void add_merge_config(struct ref **head,
>   }
>  }
>  
> -static int will_fetch(struct ref **head, const unsigned char *sha1)
> +static void create_fetch_oidset(struct ref **head, struct oidset *out)
>  {
>   struct ref *rm = *head;
>   while (rm) {
> - if (hasheq(rm->old_oid.hash, sha1))
> - return 1;
> + oidset_insert(out, &rm->old_oid);
>   rm = rm->next;
>   }
> - return 0;
>  }
>  
>  struct refname_hash_entry {
> @@ -317,6 +316,7 @@ static void find_non_local_tags(const struct ref *refs,
>  {
>   struct hashmap existing_refs;
>   struct hashmap remote_refs;
> + struct oidset fetch_oids = OIDSET_INIT;
>   struct string_list remote_refs_list = STRING_LIST_INIT_NODUP;
>   struct string_list_item *remote_ref_item;
>   const struct ref *ref;
> @@ -324,6 +324,7 @@ static void find_non_local_tags(const struct ref *refs,
>  
>   refname_hash_init(&existing_refs);
>   refname_hash_init(&remote_refs);
> + create_fetch_oidset(head, &fetch_oids);
>  
>   for_each_ref(add_one_refname, &existing_refs);
>   for (ref = refs; ref; ref = ref->next) {
> @@ -340,9 +341,9 @@ static void find_non_local_tags(const struct ref *refs,
>   if (item &&
>   !has_object_file_with_flags(&ref->old_oid,
>   OBJECT_INFO_QUICK) &&
> - !will_fetch(head, ref->old_oid.hash) &&
> + !oidset_contains(&fetch_oids, &ref->old_oid) &&
>   !has_object_file_with_flags(&item->oid, 
> OBJECT_INFO_QUICK) &&
> - !will_fetch(head, item->oid.hash))
> + !oidset_contains(&fetch_oids, &item->oid))
>   clear_item(item);
>   item = NULL;
>   continue;
> @@ -356,7 +357,7 @@ static void find_non_local_tags(const struct ref *refs,
>*/
>   if (item &&
>   !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) 
> &&
> - !will_fetch(head, item->oid.hash))
> + !oidset_contains(&fetch_oids, &item->oid))
>   clear_item(item);
>  
>   item = NULL;
> @@ -377,7 +378,7 @@ static void find_non_local_tags(const struct ref *refs,
>*/
>   if (item &&
>   !has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
> - !will_fetch(head, item->oid.hash))
> + !oidset_contains(&fetch_oids, &item->oid))
>   clear_item(item);
>  
>   /*
> @@ -404,6 +405,7 @@ static void find_non_local_tags(const struct ref *refs,
>   }
>   hashmap_free(&remote_refs, 1);
>   string_list_clear(&remote_refs_list, 0);
> + oidset_clear(&fetch_oids);
>  }
>  
>  static struct ref *get_ref_map(struct remote *remote,
> 



Re: [PATCH] commit-graph: use commit_list_count()

2019-09-15 Thread Derrick Stolee
On 9/15/2019 1:07 PM, René Scharfe wrote:
> Let commit_list_count() count the number of parents instead of
> duplicating it.  Also store the result in an unsigned int, as that's
> what the function returns, and the count is never negative.

I was unfamiliar with this method, but it obviously removes some
redundant code. We would have many more problems before the signed-ness
of the int was important, but good to use the type matching the
method.

Thanks,
-Stolee

> 
> Signed-off-by: René Scharfe 
> ---
>  commit-graph.c | 17 ++---
>  1 file changed, 6 insertions(+), 11 deletions(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index 9b02d2c426..5b0d6b5adc 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1279,7 +1279,6 @@ static uint32_t count_distinct_commits(struct 
> write_commit_graph_context *ctx)
>  static void copy_oids_to_commits(struct write_commit_graph_context *ctx)
>  {
>   uint32_t i;
> - struct commit_list *parent;
> 
>   ctx->num_extra_edges = 0;
>   if (ctx->report_progress)
> @@ -1287,7 +1286,8 @@ static void copy_oids_to_commits(struct 
> write_commit_graph_context *ctx)
>   _("Finding extra edges in commit graph"),
>   ctx->oids.nr);
>   for (i = 0; i < ctx->oids.nr; i++) {
> - int num_parents = 0;
> + unsigned int num_parents;
> +
>   display_progress(ctx->progress, i + 1);
>   if (i > 0 && oideq(&ctx->oids.list[i - 1], &ctx->oids.list[i]))
>   continue;
> @@ -1301,10 +1301,7 @@ static void copy_oids_to_commits(struct 
> write_commit_graph_context *ctx)
> 
>   parse_commit_no_graph(ctx->commits.list[ctx->commits.nr]);
> 
> - for (parent = ctx->commits.list[ctx->commits.nr]->parents;
> -  parent; parent = parent->next)
> - num_parents++;
> -
> + num_parents = 
> commit_list_count(ctx->commits.list[ctx->commits.nr]->parents);
>   if (num_parents > 2)
>   ctx->num_extra_edges += num_parents - 1;
> 
> @@ -1616,8 +1613,7 @@ static int commit_compare(const void *_a, const void 
> *_b)
> 
>  static void sort_and_scan_merged_commits(struct write_commit_graph_context 
> *ctx)
>  {
> - uint32_t i, num_parents;
> - struct commit_list *parent;
> + uint32_t i;
> 
>   if (ctx->report_progress)
>   ctx->progress = start_delayed_progress(
> @@ -1635,10 +1631,9 @@ static void sort_and_scan_merged_commits(struct 
> write_commit_graph_context *ctx)
>   die(_("unexpected duplicate commit id %s"),
>   oid_to_hex(&ctx->commits.list[i]->object.oid));
>   } else {
> - num_parents = 0;
> - for (parent = ctx->commits.list[i]->parents; parent; 
> parent = parent->next)
> - num_parents++;
> + unsigned int num_parents;
> 
> + num_parents = 
> commit_list_count(ctx->commits.list[i]->parents);
>   if (num_parents > 2)
>   ctx->num_extra_edges += num_parents - 1;
>   }
> --
> 2.23.0
> 



Re: [PATCH] gitk: Do not mistake unchanged lines with submodule changes

2019-09-15 Thread Paul Mackerras
On Sat, Mar 23, 2019 at 06:00:36PM +0100, Gabriele Mazzotta wrote:
> Unchanged lines are prefixed with a white-space, thus unchanged lines
> starting with either " <" or " >" are mistaken for submodule changes.
> Check if a line starts with either "  <" or "  >" only if we listing
> the changes of a submodule.
> 
> Signed-off-by: Gabriele Mazzotta 

Thanks, patch applied to my git://ozlabs.org/~paulus/gitk.git repository.

Paul.


Re: [PATCH] doc: provide guidance on user.name format

2019-09-15 Thread brian m. carlson
On 2019-09-15 at 22:18:07, Junio C Hamano wrote:
> "brian m. carlson"  writes:
> > diff --git a/Documentation/git-commit-tree.txt 
> > b/Documentation/git-commit-tree.txt
> > index 4b90b9c12a..34a8496b0e 100644
> > --- a/Documentation/git-commit-tree.txt
> > +++ b/Documentation/git-commit-tree.txt
> > @@ -92,6 +92,12 @@ if set:
> >  
> >  (nb "<", ">" and "\n"s are stripped)
> >  
> > +The author and committer names are by convention some form of a personal 
> > name,
> > +as opposed to a username, although Git does not enforce or require any
> > +particular form.
> 
> I have a lot of trouble with 'username' in the context of this
> paragraph.
> 
> After all, you are describing the name appropriate to be set as the
> value of the user.name configuration, and you are trying to stress
> that the name used there is different from and has nothing to do
> with the name machines use to identify the user.  In the paragraph
> that follows this new paragraph, there is a reference to "system
> user name", which is still not great but probably better than
> "username" above.  Perhaps there is a term that is distinct enough
> from "user name" that is commonly used I am forgetting?  I am almost
> tempted to say "user id", but there must be even better phrases.  I
> dunno.

I wonder if we should just omit that aside, then, since I'm not sure of
a less ambiguous term for "how I identify myself to a computer".  I think
describing the convention as "some form of a personal name" is probably
sufficient to tell people what we suggest they do.

My first draft of that sentence didn't include the part within the
commas at all.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH] doc: provide guidance on user.name format

2019-09-15 Thread Junio C Hamano
"brian m. carlson"  writes:

> Users may or may not read the documentation, but at least we've done our
> best at providing them helpful information should they choose to do so.

Good.

>  Documentation/git-commit-tree.txt | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/Documentation/git-commit-tree.txt 
> b/Documentation/git-commit-tree.txt
> index 4b90b9c12a..34a8496b0e 100644
> --- a/Documentation/git-commit-tree.txt
> +++ b/Documentation/git-commit-tree.txt
> @@ -92,6 +92,12 @@ if set:
>  
>  (nb "<", ">" and "\n"s are stripped)
>  
> +The author and committer names are by convention some form of a personal 
> name,
> +as opposed to a username, although Git does not enforce or require any
> +particular form.

I have a lot of trouble with 'username' in the context of this
paragraph.

After all, you are describing the name appropriate to be set as the
value of the user.name configuration, and you are trying to stress
that the name used there is different from and has nothing to do
with the name machines use to identify the user.  In the paragraph
that follows this new paragraph, there is a reference to "system
user name", which is still not great but probably better than
"username" above.  Perhaps there is a term that is distinct enough
from "user name" that is commonly used I am forgetting?  I am almost
tempted to say "user id", but there must be even better phrases.  I
dunno.

> Arbitrary Unicode may be used, subject to the constraints
> +listed above. This name has no effect on authentication; for that, see the
> +`credential.username` variable in linkgit::git-config[1].
> +
>  In case (some of) these environment variables are not set, the information
>  is taken from the configuration items user.name and user.email, or, if not
>  present, the environment variable EMAIL, or, if that is not set,

Thanks.



Re: [PATCH v3] Documentation: fix build with Asciidoctor 2

2019-09-15 Thread brian m. carlson
On 2019-09-15 at 22:05:55, SZEDER Gábor wrote:
> On Sun, Sep 15, 2019 at 09:26:21PM +, brian m. carlson wrote:
> > > > diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
> > > > index d49089832d..b3e76ef863 100755
> > > > --- a/ci/test-documentation.sh
> > > > +++ b/ci/test-documentation.sh
> > > > @@ -8,6 +8,8 @@
> > > >  filter_log () {
> > > > sed -e '/^GIT_VERSION = /d' \
> > > > -e '/^\* new asciidoc flags$/d' \
> > > > +   -e '/stripped namespace before processing/d' \
> > > > +   -e '/Attributed.*IDs for element/d' \
> > > 
> > > I haven't seen this latter message in the CI builds, neither with
> > > Asciidoctor v1.5.8 nor with v2.  Do we really need this filter, then?
> > > Where does this message come from?
> > 
> > I see it and it definitely fails on my system without it.  It comes from
> > libxslt, which has been patched in Debian to produce deterministic IDs.
> > I suspect we may not have seen it on Ubuntu systems because they are
> > running 16.04, which is likely older than the patch.  If Travis updates
> > to 18.04, we may be more likely to have a problem.
> 
> Thanks.  Indeed, I kicked off a Travis CI build using their Ubuntu
> 18.04 image, and that "Attributed..." message was there.
> 
> I think this future-proofing is a good idea, but I also think that
> this should be clarified in the commit message.

I can do that.  I just noticed it failed on my laptop and added it,
assuming it was the stylesheets.  I had to search Google for the output
to find out that it was libxslt.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v3] Documentation: fix build with Asciidoctor 2

2019-09-15 Thread SZEDER Gábor
On Sun, Sep 15, 2019 at 09:26:21PM +, brian m. carlson wrote:
> > > diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
> > > index d49089832d..b3e76ef863 100755
> > > --- a/ci/test-documentation.sh
> > > +++ b/ci/test-documentation.sh
> > > @@ -8,6 +8,8 @@
> > >  filter_log () {
> > >   sed -e '/^GIT_VERSION = /d' \
> > >   -e '/^\* new asciidoc flags$/d' \
> > > + -e '/stripped namespace before processing/d' \
> > > + -e '/Attributed.*IDs for element/d' \
> > 
> > I haven't seen this latter message in the CI builds, neither with
> > Asciidoctor v1.5.8 nor with v2.  Do we really need this filter, then?
> > Where does this message come from?
> 
> I see it and it definitely fails on my system without it.  It comes from
> libxslt, which has been patched in Debian to produce deterministic IDs.
> I suspect we may not have seen it on Ubuntu systems because they are
> running 16.04, which is likely older than the patch.  If Travis updates
> to 18.04, we may be more likely to have a problem.

Thanks.  Indeed, I kicked off a Travis CI build using their Ubuntu
18.04 image, and that "Attributed..." message was there.

I think this future-proofing is a good idea, but I also think that
this should be clarified in the commit message.



Re: [PATCH] credential: add nocache option to the credentials API

2019-09-15 Thread Masaya Suzuki
On Mon, Aug 26, 2019 at 9:28 AM Junio C Hamano  wrote:
>
> Jeff King  writes:
>
> > I was thinking that Git itself could treat "ttl=0" specially, the same
> > as your nocache, and avoid passing it along to any helpers during the
> > approve stage. That would make it exactly equivalent to your patch
> > (modulo the name change).
> > ...
> > And as you noted above, if we don't suppress the helper calls inside
> > Git, then every matching storage helper needs to learn about "nocache"
> > (or "ttl") before it will do any good.
>
> I was waiting for this discussion to settle and then the discussion
> seems to have petered out.  Any interest to following the "ttl with
> special casing value 0 as 'nocache'" idea thru from either two of
> you, or should I take the patch as is in the meantime?

Sorry for the late reply. I think about this again. I imagine that, if
I would like to have credentials with an expiration and I want to have
them managed by other helpers, it's probably better to use an absolute
timestamp instead of duration. The second call to the helpers is done
after the remote call. For the helpers that store TTL-ed credentials,
they cannot tell the start time of the TTL in the second call. This
makes it hard to cache the short-lived credentials safely because some
time has spent during the remote call and the actual TTL is shorter
than ttl=N option. From this, instead of adding ttl=DURATION, it might
be better to have expires_at=TIMESTAMP.

Maybe my observation is not an issue. I don't know. For now, adding
nocache seems a safer action for me, so I vote for taking nocache
patch as-is in the meantime. If there's somebody who wants to receive
TTL or expiration timestamp, they can decide what's actually needed
later.


Re: [PATCH v3] Documentation: fix build with Asciidoctor 2

2019-09-15 Thread brian m. carlson
On 2019-09-15 at 09:59:52, SZEDER Gábor wrote:
> On Sat, Sep 14, 2019 at 07:49:19PM +, brian m. carlson wrote:
> > test -n "$ALREADY_HAVE_ASCIIDOCTOR" ||
> > gem install --version 1.5.8 asciidoctor
> 
> So, since the documentation can now be built with Asciidoctor v2, is
> it already time to remove this '--version 1.5.8'?

I think Martin was going to send in some more patches before we did
that.

> > diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
> > index d49089832d..b3e76ef863 100755
> > --- a/ci/test-documentation.sh
> > +++ b/ci/test-documentation.sh
> > @@ -8,6 +8,8 @@
> >  filter_log () {
> > sed -e '/^GIT_VERSION = /d' \
> > -e '/^\* new asciidoc flags$/d' \
> > +   -e '/stripped namespace before processing/d' \
> > +   -e '/Attributed.*IDs for element/d' \
> 
> I haven't seen this latter message in the CI builds, neither with
> Asciidoctor v1.5.8 nor with v2.  Do we really need this filter, then?
> Where does this message come from?

I see it and it definitely fails on my system without it.  It comes from
libxslt, which has been patched in Debian to produce deterministic IDs.
I suspect we may not have seen it on Ubuntu systems because they are
running 16.04, which is likely older than the patch.  If Travis updates
to 18.04, we may be more likely to have a problem.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 2/3] list-objects-filter: delay parsing of sparse oid

2019-09-15 Thread Jeff King
On Sat, Sep 14, 2019 at 09:13:19PM -0400, Jeff King wrote:

> --- a/list-objects-filter-options.c
> +++ b/list-objects-filter-options.c
> @@ -63,17 +63,8 @@ static int gently_parse_list_objects_filter(
>   return 0;
>  
>   } else if (skip_prefix(arg, "sparse:oid=", &v0)) {
> - struct object_context oc;
> - struct object_id sparse_oid;
> -
> - /*
> -  * Try to parse  into an OID for the current
> -  * command, but DO NOT complain if we don't have the blob or
> -  * ref locally.
> -  */
> - if (!get_oid_with_context(the_repository, v0, GET_OID_BLOB,
> -   &sparse_oid, &oc))
> - filter_options->sparse_oid_value = oiddup(&sparse_oid);
> + /* v0 points into filter_options->filter_spec; no allocation 
> needed */
> + filter_options->sparse_oid_name = v0;

Looks like this comment is no longer true after merging with
md/list-objects-filter-combo, which is in 'next'.

We can obviously deal with it during the merge, but it probably makes
sense to just be more defensive from the start. Here's a revised version
of patch 2, with these changes:

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 43f41f446c..adbea552a0 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -63,8 +63,7 @@ static int gently_parse_list_objects_filter(
return 0;
 
} else if (skip_prefix(arg, "sparse:oid=", &v0)) {
-   /* v0 points into filter_options->filter_spec; no allocation 
needed */
-   filter_options->sparse_oid_name = v0;
+   filter_options->sparse_oid_name = xstrdup(v0);
filter_options->choice = LOFC_SPARSE_OID;
return 0;
 
@@ -129,6 +128,7 @@ void list_objects_filter_release(
struct list_objects_filter_options *filter_options)
 {
free(filter_options->filter_spec);
+   free(filter_options->sparse_oid_name);
memset(filter_options, 0, sizeof(*filter_options));
 }
 
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index a819e42ffb..20b9d1e587 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -42,7 +42,7 @@ struct list_objects_filter_options {
 * choice-specific; not all values will be defined for any given
 * choice.
 */
-   const char *sparse_oid_name;
+   char *sparse_oid_name;
unsigned long blob_limit_value;
unsigned long tree_exclude_depth;
 };

-- >8 --
Subject: [PATCH] list-objects-filter: delay parsing of sparse oid

The list-objects-filter code has two steps to its initialization:

  1. parse_list_objects_filter() makes sure the spec is a filter we know
 about and is syntactically correct. This step is done by "rev-list"
 or "upload-pack" that is going to apply a filter, but also by "git
 clone" or "git fetch" before they send the spec across the wire.

  2. list_objects_filter__init() runs the type-specific initialization
 (using function pointers established in step 1). This happens at
 the start of traverse_commit_list_filtered(), when we're about to
 actually use the filter.

It's a good idea to parse as much as we can in step 1, in order to catch
problems early (e.g., a blob size limit that isn't a number). But one
thing we _shouldn't_ do is resolve any oids at that step (e.g., for
sparse-file contents specified by oid). In the case of a fetch, the oid
has to be resolved on the remote side.

The current code does resolve the oid during the parse phase, but
ignores any error (which we must do, because we might just be sending
the spec across the wire). This leads to two bugs:

  - if we're not in a repository (e.g., because it's git-clone parsing
the spec), then we trigger a BUG() trying to resolve the name

  - if we did hit the error case, we still have to notice that later and
bail. The code path in rev-list handles this, but the one in
upload-pack does not, leading to a segfault.

We can fix both by moving the oid resolution into the sparse-oid init
function. At that point we know we have a repository (because we're
about to traverse), and handling the error there fixes the segfault.

As a bonus, we can drop the NULL sparse_oid_value check in rev-list,
since this is now handled in the sparse-oid-filter init function.

Signed-off-by: Jeff King 
---
 builtin/rev-list.c|  4 
 list-objects-filter-options.c | 14 ++
 list-objects-filter-options.h |  2 +-
 list-objects-filter.c | 11 +--
 t/t5616-partial-clone.sh  |  4 ++--
 5 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 301ccb970b..74dbfad73d 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c

Re: [PATCH 2/2] sha1-name: check for overflow of N in "foo^N" and "foo~N"

2019-09-15 Thread René Scharfe
Am 15.09.19 um 17:15 schrieb brian m. carlson:
> This approach seems reasonable.  I must admit some curiosity as to how
> you discovered this issue, though.  Did you have a cat assisting you in
> typing revisions?

Found it by reading the code, but I'm not sure anymore what I was
actually looking for.

Would a fuzzer (or a cat) be able to catch that?  The function is
happily eating extra digits -- it's not crashing for me.

René


Re: [PATCH 2/2] sha1-name: check for overflow of N in "foo^N" and "foo~N"

2019-09-15 Thread brian m. carlson
On 2019-09-15 at 12:10:28, René Scharfe wrote:
> Reject values that don't fit into an int, as get_parent() and
> get_nth_ancestor() cannot handle them.  That's better than potentially
> returning a random object.
> 
> If this restriction turns out to be too tight then we can switch to a
> wider data type, but we'd still have to check for overflow.

Certainly we want Git to perform as well as possible on large
repositories, but I doubt if it will scale to more than 2 billion
revisions, even with significant effort.  I think this restriction
should be fine.

> diff --git a/sha1-name.c b/sha1-name.c
> index c665e3f96d..7a047e9e2b 100644
> --- a/sha1-name.c
> +++ b/sha1-name.c
> @@ -1160,13 +1160,22 @@ static enum get_oid_result get_oid_1(struct 
> repository *r,
>   }
> 
>   if (has_suffix) {
> - int num = 0;
> + unsigned int num = 0;
>   int len1 = cp - name;
>   cp++;
> - while (cp < name + len)
> - num = num * 10 + *cp++ - '0';
> + while (cp < name + len) {
> + unsigned int digit = *cp++ - '0';
> + if (unsigned_mult_overflows(num, 10))
> + return MISSING_OBJECT;
> + num *= 10;
> + if (unsigned_add_overflows(num, digit))
> + return MISSING_OBJECT;

I was worried whether these functions only handled size_t or if they
also handle unsigned int, but I checked and they seem to be fine for any
unsigned type.

> + num += digit;
> + }
>   if (!num && len1 == len - 1)
>   num = 1;
> + else if (num > INT_MAX)
> + return MISSING_OBJECT;
>   if (has_suffix == '^')
>   return get_parent(r, name, len1, oid, num);
>   /* else if (has_suffix == '~') -- goes without saying */

This approach seems reasonable.  I must admit some curiosity as to how
you discovered this issue, though.  Did you have a cat assisting you in
typing revisions?
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v3] Documentation: fix build with Asciidoctor 2

2019-09-15 Thread SZEDER Gábor
On Sat, Sep 14, 2019 at 07:49:19PM +, brian m. carlson wrote:
> Our documentation toolchain has traditionally been built around DocBook
> 4.5.  This version of DocBook is the last DTD-based version of DocBook.
> In 2009, DocBook 5 was introduced using namespaces and its syntax is
> expressed in RELAX NG, which is more expressive and allows a wider
> variety of syntax forms.
> 
> Asciidoctor, one of the alternatives for building our documentation,
> moved support for DocBook 4.5 out of core in its recent 2.0 release and
> now only supports DocBook 5 in the main release.  The DocBoook 4.5
> converter is still available as a separate component, but this is not
> available in most distro packages.  This would not be a problem but for
> the fact that we use xmlto, which is still stuck in the DocBook 4.5 era.
> 
> xmlto performs DTD validation as part of the build process.  This is not
> problematic for DocBook 4.5, which has a valid DTD, but it clearly
> cannot work for DocBook 5, since no DTD can adequately express its full
> syntax.  In addition, even if xmlto did support RELAX NG validation,
> that wouldn't be sufficient because it uses the libxml2-based xmllint to
> do so, which has known problems with validating interleaves in RELAX NG.
> 
> Fortunately, there's an easy way forward: ask Asciidoctor to use its
> DocBook 5 backend and tell xmlto to skip validation.  Asciidoctor has
> supported DocBook 5 since v0.1.4 in 2013 and xmlto has supported
> skipping validation for probably longer than that.
> 
> We also need to teach xmlto how to use the namespaced DocBook XSLT
> stylesheets instead of the non-namespaced ones it usually uses.
> Normally these stylesheets are interchangeable, but the non-namespaced
> ones have a bug that causes them not to strip whitespace automatically
> from certain elements when namespaces are in use.  This results in
> additional whitespace at the beginning of list elements, which is
> jarring and unsightly.
> 
> We can do this by passing a custom stylesheet with the -x option that
> simply imports the namespaced stylesheets via a URL.  Any system with
> support for XML catalogs will automatically look this URL up and
> reference a local copy instead without us having to know where this
> local copy is located.  We know that anyone using xmlto will already
> have catalogs set up properly since the DocBook 4.5 DTD used during
> validation is also looked up via catalogs.  All major Linux
> distributions distribute the necessary stylesheets and have built-in
> catalog support, and Homebrew does as well, albeit with a requirement to
> set an environment variable to enable catalog support.
> 
> On the off chance that someone lacks support for catalogs, it is
> possible for xmlto (via xmllint) to download the stylesheets from the
> URLs in question, although this will likely perform poorly enough to
> attract attention.  People still have the option of using the prebuilt
> documentation that we ship, so happily this should not be an impediment.
> 
> Finally, we need to filter out some messages from other stylesheets that
> occur when invoking dblatex in the CI job.  This tool strips namespaces
> much like the unnamespaced DocBook stylesheets and prints similar
> messages.  If we permit these messages to be printed to standard error,
> our documentation CI job will fail because we check standard error for
> unexpected output.  Due to dblatex's reliance on Python 2, we may need
> to revisit its use in the future, in which case this problem may go
> away, but this can be delayed until a future patch.
> 
> Signed-off-by: brian m. carlson 

>  Documentation/Makefile | 4 +++-
>  Documentation/manpage.xsl  | 3 +++
>  azure-pipelines.yml| 2 +-
>  ci/install-dependencies.sh | 2 +-
>  ci/test-documentation.sh   | 2 ++
>  5 files changed, 10 insertions(+), 3 deletions(-)
>  create mode 100644 Documentation/manpage.xsl
> 
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index 76f2ecfc1b..d94f47c5c9 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -197,11 +197,13 @@ ifdef USE_ASCIIDOCTOR
>  ASCIIDOC = asciidoctor
>  ASCIIDOC_CONF =
>  ASCIIDOC_HTML = xhtml5
> -ASCIIDOC_DOCBOOK = docbook45
> +ASCIIDOC_DOCBOOK = docbook5
>  ASCIIDOC_EXTRA += -acompat-mode -atabsize=8
>  ASCIIDOC_EXTRA += -I. -rasciidoctor-extensions
>  ASCIIDOC_EXTRA += -alitdd='&\#x2d;&\#x2d;'
>  DBLATEX_COMMON =
> +XMLTO_EXTRA += --skip-validation
> +XMLTO_EXTRA += -x manpage.xsl
>  endif
>  
>  SHELL_PATH ?= $(SHELL)
> diff --git a/Documentation/manpage.xsl b/Documentation/manpage.xsl
> new file mode 100644
> index 00..ef64bab17a
> --- /dev/null
> +++ b/Documentation/manpage.xsl
> @@ -0,0 +1,3 @@
> +http://www.w3.org/1999/XSL/Transform"; 
> version="1.0">
> +  href="http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl";
>  />
> +
> diff --git a/azure-pipelines.yml b/azure-pipelines.yml
> index c329b7218b..34031b182a 100644
> --- a/azure-pipe

Re: [PATCH v2] Documentation: fix build with Asciidoctor 2

2019-09-14 Thread brian m. carlson
On 2019-09-14 at 07:53:01, SZEDER Gábor wrote:
> Unfortunately, five out of five CI builds failed with the following:
> 
>   XMLTO git-revert.1
>   I/O error : Attempt to load network entity 
> http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl
>   warning: failed to load external entity 
> "http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl";
>   compilation error: file 
> /home/travis/build/git/git/Documentation/manpage.xsl line 2 element import
>   xsl:import : unable to load 
> http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl
>   Makefile:375: recipe for target 'git-revert.1' failed
> 
> https://travis-ci.org/git/git/jobs/584794387#L1552

Ah, I forgot to install the packages in CI.  I'll send a v3 with that
fixed.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v5] git-gui: add hotkey to toggle "Amend Last Commit"

2019-09-14 Thread Pratyush Yadav
You missed labelling the menu item of "Amend Last Commit" with the 
shortcut, like we do for other menu items bound to a hotkey like F5 for 
rescan, Ctrl-T for stage, etc. I added that locally.

Thanks for the re-roll. Will queue.

On 14/09/19 11:18AM, Birger Skogeng Pedersen wrote:
> Selecting whether to "Amend Last Commit" or not does not have a hotkey.
> 
> With this patch, the user may toggle between the two options with
> CTRL/CMD+e.
> 
> Signed-off-by: Birger Skogeng Pedersen 
> Rebased-by: Bert Wesarg 
> ---
>  git-gui.sh | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/git-gui.sh b/git-gui.sh
> index c7d9103..790adf1 100755
> --- a/git-gui.sh
> +++ b/git-gui.sh
> @@ -2664,6 +2664,12 @@ proc focus_widget {widget} {
>   }
>  }
>  
> +proc toggle_commit_type {} {
> + global commit_type_is_amend
> + set commit_type_is_amend [expr !$commit_type_is_amend]
> + do_select_commit_type
> +}
> +
>  ##
>  ##
>  ## ui construction
> @@ -3892,6 +3898,8 @@ bind .   <$M1B-Key-j> do_revert_selection
>  bind .   <$M1B-Key-J> do_revert_selection
>  bind .   <$M1B-Key-i> do_add_all
>  bind .   <$M1B-Key-I> do_add_all
> +bind .   <$M1B-Key-e> toggle_commit_type
> +bind .   <$M1B-Key-E> toggle_commit_type
>  bind .   <$M1B-Key-minus> {show_less_context;break}
>  bind .   <$M1B-Key-KP_Subtract> {show_less_context;break}
>  bind .   <$M1B-Key-equal> {show_more_context;break}
> -- 
> 2.21.0.windows.1
> 

-- 
Regards,
Pratyush Yadav


Re: [PATCH v2] gitk: Make web links clickable

2019-09-14 Thread Pratyush Yadav
On 14/09/19 09:33AM, Paul Mackerras wrote:
> On Fri, Aug 30, 2019 at 12:02:07AM +0530, Pratyush Yadav wrote:
> > On 29/08/19 11:27AM, Paul Mackerras wrote:
> > 
> > I know I suggested searching till the first non-whitespace character, 
> > but thinking more about, there are some problematic cases. Say someone 
> > has a commit message like:
> >   
> >   Foo bar baz (more details at https://example.com/hello)
> > 
> > Or like:
> > 
> >   Check out https://foo.com, https://bar.com
> > 
> > In the first example, the closing parenthesis gets included in the link, 
> > but shouldn't be. In the second, the comma after foo.com would be 
> > included in the link, but shouldn't be. So maybe use a more 
> > sophisticated regex?
> 
> I did think about that, but it seems to be impossible to get it right
> in all cases, so I went for simple and obvious.  In particular I don't
> see how to handle the common case of a '.' immediately following the
> URL, since '.' is a legal character in a URL.
> 
> > A quick Google search gives out the following options [0][1].
> > 
> > [0] gives the following regex:
> > 
> >   
> > https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)
> > 
> > It is kind of ugly to look at, and I'm not even sure if there are any 
> > syntax differences with Tcl's regex library.
> > 
> > [1] lists a bunch of regexes and which URLs they work on and which ones 
> > they don't. The smallest among them I found is:
> > 
> >   @^(https?|ftp)://[^\s/$.?#].[^\s]*$@iS
> > 
> > Again, I'm not sure how well this would work with Tcl's regex library, 
> > or how commonly these URL patterns appear in actual commit messages.  
> > Just something to consider.
> > 
> > [0] 
> > https://stackoverflow.com/questions/3809401/what-is-a-good-regular-expression-to-match-a-url
> > [1] https://mathiasbynens.be/demo/url-regex
> 
> I think I would be inclined to make the regex customizable, since that
> would also allow the user to match ftp or other URLs if they want.
> The only difficulty with that is if there are subexpressions, that
> will change how we have to interpret the list returned by the
> regexp -indices -all -inline command.

That just puts the responsibility of parsing the URL on the user, it 
doesn't solve the problem.

I don't have any numbers, but I think most problematic cases are when 
there are some trailing characters. We aren't dealing with malicious 
actors that want to do something bad or make gitk crash. IMO it is 
reasonable to expect legal URLs in a commit message.

So instead of trying to encompass all possible legal URLs and removing 
all illegal URLs, how about using a simple regex for basic filtering to 
weed out some false positives, and then trimming illegal trailing 
characters. These trailing characters would most likely be comma, 
period, parenthesis, question marks, quotation marks, etc. This way the 
logic stays simple and we tackle more real world problems.

Sounds reasonable?

-- 
Regards,
Pratyush Yadav


Re: [PATCH v4] git-gui: add hotkey to toggle "Amend Last Commit"

2019-09-14 Thread Birger Skogeng Pedersen
Thanks, I really appreciate you taking time to explain it thoroughly.

On Sat, Sep 14, 2019 at 12:11 AM Pratyush Yadav  wrote:
> So while my advice above was to work on top of "master", that does not
> apply in this case since your patch is dependent on someone's patch
> which isn't in master yet. So in this specific case, you should base
> your patch on top of Bert's.

Okay, I might have to make another re-roll then. I based v4 on master.

Birger


Re: [PATCH v2] Documentation: fix build with Asciidoctor 2

2019-09-14 Thread SZEDER Gábor
On Fri, Sep 13, 2019 at 01:52:40AM +, brian m. carlson wrote:

> We also need to teach xmlto how to use the namespaced DocBook XSLT
> stylesheets instead of the non-namespaced ones it usually uses.
> Normally these stylesheets are interchangeable, but the non-namespaced
> ones have a bug that causes them not to strip whitespace automatically
> from certain elements when namespaces are in use.  This results in
> additional whitespace at the beginning of list elements, which is
> jarring and unsightly.
> 
> We can do this by passing a custom stylesheet with the -x option that
> simply imports the namespaced stylesheets via a URL.  Any system with
> support for XML catalogs will automatically look this URL up and
> reference a local copy instead without us having to know where this
> local copy is located.  We know that anyone using xmlto will already
> have catalogs set up properly since the DocBook 4.5 DTD used during
> validation is also looked up via catalogs.  All major Linux
> distributions distribute the necessary stylesheets and have built-in
> catalog support, and Homebrew does as well, albeit with a requirement to
> set an environment variable to enable catalog support.
> 
> On the off chance that someone lacks support for catalogs, it is
> possible for xmlto (via xmllint) to download the stylesheets from the
> URLs in question, although this will likely perform poorly enough to
> attract attention.  People still have the option of using the prebuilt
> documentation that we ship, so happily this should not be an impediment.


> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index 76f2ecfc1b..d94f47c5c9 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -197,11 +197,13 @@ ifdef USE_ASCIIDOCTOR
>  ASCIIDOC = asciidoctor
>  ASCIIDOC_CONF =
>  ASCIIDOC_HTML = xhtml5
> -ASCIIDOC_DOCBOOK = docbook45
> +ASCIIDOC_DOCBOOK = docbook5
>  ASCIIDOC_EXTRA += -acompat-mode -atabsize=8
>  ASCIIDOC_EXTRA += -I. -rasciidoctor-extensions
>  ASCIIDOC_EXTRA += -alitdd='&\#x2d;&\#x2d;'
>  DBLATEX_COMMON =
> +XMLTO_EXTRA += --skip-validation
> +XMLTO_EXTRA += -x manpage.xsl
>  endif
>  
>  SHELL_PATH ?= $(SHELL)
> diff --git a/Documentation/manpage.xsl b/Documentation/manpage.xsl
> new file mode 100644
> index 00..ef64bab17a
> --- /dev/null
> +++ b/Documentation/manpage.xsl
> @@ -0,0 +1,3 @@
> +http://www.w3.org/1999/XSL/Transform"; 
> version="1.0">
> +  href="http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl";
>  />
> +

Unfortunately, five out of five CI builds failed with the following:

  XMLTO git-revert.1
  I/O error : Attempt to load network entity 
http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl
  warning: failed to load external entity 
"http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl";
  compilation error: file /home/travis/build/git/git/Documentation/manpage.xsl 
line 2 element import
  xsl:import : unable to load 
http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl
  Makefile:375: recipe for target 'git-revert.1' failed

https://travis-ci.org/git/git/jobs/584794387#L1552



Re: [PATCH v2] gitk: Make web links clickable

2019-09-13 Thread Paul Mackerras
On Fri, Aug 30, 2019 at 12:02:07AM +0530, Pratyush Yadav wrote:
> On 29/08/19 11:27AM, Paul Mackerras wrote:
> > This makes gitk look for http or https URLs in the commit description
> > and make the URLs clickable.  Clicking on them will invoke an external
> > web browser with the URL.
> > 
> > The web browser command is by default "xdg-open" on Linux, "open" on
> > MacOS, and "cmd /c start" on Windows.  The command can be changed in
> > the preferences window, and it can include parameters as well as the
> > command name.  If it is set to the empty string then URLs will no
> > longer be made clickable.
> > 
> > Signed-off-by: Paul Mackerras 
> > ---
> > v2: Match URLs anywhere, not just after [Bug]Link:.
> > 
> >  gitk | 51 ++-
> >  1 file changed, 50 insertions(+), 1 deletion(-)
> > 
> > diff --git a/gitk b/gitk
> > index a14d7a1..2a0d00c 100755
> > --- a/gitk
> > +++ b/gitk
> > @@ -7016,6 +7016,7 @@ proc commit_descriptor {p} {
> >  
> >  # append some text to the ctext widget, and make any SHA1 ID
> >  # that we know about be a clickable link.
> > +# Also look for URLs of the form "http[s]://..." and make them web links.
> >  proc appendwithlinks {text tags} {
> >  global ctext linknum curview
> >  
> > @@ -7032,6 +7033,18 @@ proc appendwithlinks {text tags} {
> > setlink $linkid link$linknum
> > incr linknum
> >  }
> > +set wlinks [regexp -indices -all -inline -line \
> > +   {https?://[^[:space:]]+} $text]
> 
> I know I suggested searching till the first non-whitespace character, 
> but thinking more about, there are some problematic cases. Say someone 
> has a commit message like:
>   
>   Foo bar baz (more details at https://example.com/hello)
> 
> Or like:
> 
>   Check out https://foo.com, https://bar.com
> 
> In the first example, the closing parenthesis gets included in the link, 
> but shouldn't be. In the second, the comma after foo.com would be 
> included in the link, but shouldn't be. So maybe use a more 
> sophisticated regex?

I did think about that, but it seems to be impossible to get it right
in all cases, so I went for simple and obvious.  In particular I don't
see how to handle the common case of a '.' immediately following the
URL, since '.' is a legal character in a URL.

> A quick Google search gives out the following options [0][1].
> 
> [0] gives the following regex:
> 
>   
> https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)
> 
> It is kind of ugly to look at, and I'm not even sure if there are any 
> syntax differences with Tcl's regex library.
> 
> [1] lists a bunch of regexes and which URLs they work on and which ones 
> they don't. The smallest among them I found is:
> 
>   @^(https?|ftp)://[^\s/$.?#].[^\s]*$@iS
> 
> Again, I'm not sure how well this would work with Tcl's regex library, 
> or how commonly these URL patterns appear in actual commit messages.  
> Just something to consider.
> 
> [0] 
> https://stackoverflow.com/questions/3809401/what-is-a-good-regular-expression-to-match-a-url
> [1] https://mathiasbynens.be/demo/url-regex

I think I would be inclined to make the regex customizable, since that
would also allow the user to match ftp or other URLs if they want.
The only difficulty with that is if there are subexpressions, that
will change how we have to interpret the list returned by the
regexp -indices -all -inline command.

Paul.


Re: [PATCH] git-submodule.txt: fix AsciiDoc formatting error

2019-09-13 Thread Junio C Hamano
Denton Liu  writes:

>> > -set-branch ((-d|--default)|(-b|--branch )) [--] ::
>> 
>> I say "almost", as it gives a wrong impression that you can give
>> "-b" without "" X-<.
>> 
>> Now what does the updated text say to us?
>> 
>> > +set-branch (-d|--default)|(-b|--branch ) [--] ::
>> 
>> I think the attempt to cram the short-form is unnecessarily
>> cluttering and making the result incorrect.
>>  ...
>> 
> Hmm, I don't really like this since with every other subcommand, the
> short-forms are in the command summary so it's obvious to the reader
> in a quick glance which options are available.

I actually do not think it adds that much value.  Once a user learns
what --branch does and is told -b is its short form, it is much less
important that -b and --branch are both available than --default and
--branch are possibilities, and you cannot use both at the same time.

If anything, perhaps other subcommands' description may benefit if
we unclutter by reducing the prominence we give to their short form.

> In the context line above, we see `[(-n|--summary-limit) ]` as a
> possible way of notating a short and long option with argument. What do
> you think about the following potential output?
>
>   set-branch (-d|--default)|((-b|--branch) ) [--] ::
>
> Of course, we reintroduce the double paren problem but I can dig into
> asciidoc syntax and figure out how to escape it properly.

That's much less important than the fact that you are losing "-b and
-d cannot be used together", which is what the usage string gives us
(and which is what I tried to express with my rewrite).


Re: [PATCH v4] git-gui: add hotkey to toggle "Amend Last Commit"

2019-09-13 Thread Pratyush Yadav


+Cc Junio so you know what development model I'm using, and comment with 
your thoughts (if you want to).

On 13/09/19 11:32PM, Birger Skogeng Pedersen wrote:
> On Fri, Sep 13, 2019 at 4:37 PM Pratyush Yadav  wrote:
> > Hi Birger,
> >
> > I'm afraid you are working on an older version of this patch. You should
> > be re-rolling [0], which works well with Bert's "amend check button"
> > change.
> >
> > [0] 
> > https://public-inbox.org/git/b82a00441ff1a6a9cea3fd235c1c33729ec31b71.1567713659.git.bert.wes...@googlemail.com/
> 
> Forgive me, I get a little bit confused. Should my patch be based on
> "next" or "master" branch?

For git-gui, ignore "next" for now. I considered using a model similar 
to Git where patches first get queued to "next" (or "pu" depending on 
how finished they are). And then after some time letting people use 
them, they are merged to "master" which eventually goes in the release. 
But this seems to be too complicated to me without any clear benefit.

I think for now using just "master" for git-gui should be fine, since I 
won't directly release git-gui. Instead I'll periodically ask Junio to 
pull changes from my master. This will be our "release". So essentially 
my "master" for now acts as a place for people involved in development 
to test out all the changes, and then the rest of the world gets the new 
version along with Git's release.

(Junio, you have done this for much longer than I have, so is there a 
major problem with my workflow?)

If all this seems too complicated, just work on top of my "master". The 
rest of it is mostly my problem.

> Also, is it an issue that this patch won't work unless you merge
> Bert's 1/2 patch[0]?

Your patch is dependent on Bert's patch. This means I will have to merge 
his patch first, and then yours. And that makes complete sense. That's 
how dependent changes should work. So no, it is not an issue that this 
patch won't work unless I merge Bert's patch first.

So while my advice above was to work on top of "master", that does not 
apply in this case since your patch is dependent on someone's patch 
which isn't in master yet. So in this specific case, you should base 
your patch on top of Bert's. Otherwise there are two problems:

- Your patch in and of itself makes little sense, it probably would 
  crash or do some unexpected stuff. This hurts people doing a bisect, 
  where if they land on your patch stuff is broken, and they have to 
  manually move a commit up or down to continue.

- Your patch will textually conflict with Bert's. That means I'd first 
  merge Bert's patch since yours doesn't work without his. But then when 
  I merge yours, I'd get a nasty merge conflict that is not easy to 
  resolve and leaves chances for a subtle bug.

> Your feedback cannot be too specific, I want to learn how to do this
> properly :-)

I'm not sure how well I explained this, so feel free to ask more 
questions if I didn't explain something properly :).
 
> [0] 
> https://public-inbox.org/git/ab1f68cc8552e405c9d04622be1e728ab81bda17.1567713659.git.bert.wes...@googlemail.com/

-- 
Regards,
Pratyush Yadav


Re: [PATCH] git-submodule.txt: fix AsciiDoc formatting error

2019-09-13 Thread Denton Liu
On Fri, Sep 13, 2019 at 11:03:39AM -0700, Junio C Hamano wrote:
> Denton Liu  writes:
> 
> > Remove surrounding parentheses so that the "index term" syntax is not
> > triggered (and because it looks nicer without them anyway ;) ).
> 
> "Correct" always trumps "nicer", though ;-)
> 
> The $USAGE string in the script describes the available options to
> this subcommand like so:
> 
> or: $dashless [--quiet] set-branch (--default|--branch ) [--] 
> 
> 
> which is, "you can give either --default or --branch , but
> not both".  What the original, if there were no special meaning in
> double-paren, meant to say seems to (almost) match that.
> 
> > -set-branch ((-d|--default)|(-b|--branch )) [--] ::
> 
> I say "almost", as it gives a wrong impression that you can give
> "-b" without "" X-<.
> 
> Now what does the updated text say to us?
> 
> > +set-branch (-d|--default)|(-b|--branch ) [--] ::
> 
> I think the attempt to cram the short-form is unnecessarily
> cluttering and making the result incorrect.  How about doing
> something like this instead?
> 
>  Documentation/git-submodule.txt | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index 0ed5c24dc1..816baa7dd0 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -173,10 +173,12 @@ submodule with the `--init` option.
>  If `--recursive` is specified, this command will recurse into the
>  registered submodules, and update any nested submodules within.
>  --
> -set-branch ((-d|--default)|(-b|--branch )) [--] ::
> +set-branch (--default|--branch ) [--] ::
>   Sets the default remote tracking branch for the submodule. The
> - `--branch` option allows the remote branch to be specified. The
> - `--default` option removes the submodule..branch configuration
> + `--branch` option (or its short-form, `-b `)
> + allows the remote branch to be specified. The
> + `--default` option (or its short-form, `-d`)
> + removes the submodule..branch configuration
>   key, which causes the tracking branch to default to 'master'.
>  
>  summary [--cached|--files] [(-n|--summary-limit) ] [commit] [--] 
> [...]::

Hmm, I don't really like this since with every other subcommand, the
short-forms are in the command summary so it's obvious to the reader
in a quick glance which options are available. With this change, a
reader would have to not only read the summary line but also scan the
blurb below.

In the context line above, we see `[(-n|--summary-limit) ]` as a
possible way of notating a short and long option with argument. What do
you think about the following potential output?

set-branch (-d|--default)|((-b|--branch) ) [--] ::

Of course, we reintroduce the double paren problem but I can dig into
asciidoc syntax and figure out how to escape it properly.


Re: [PATCH v2 3/3] Makefile: run coccicheck on more source files

2019-09-13 Thread Denton Liu
On Fri, Sep 13, 2019 at 08:00:14PM +0200, SZEDER Gábor wrote:
> On Fri, Sep 13, 2019 at 10:14:01AM -0700, Denton Liu wrote:
> > On Fri, Sep 13, 2019 at 01:49:52PM +0200, SZEDER Gábor wrote:
> > > On Thu, Sep 12, 2019 at 11:40:36AM -0700, Junio C Hamano wrote:
> > > > Denton Liu  writes:
> > > > 
> > > > > +FIND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES)))
> > > > > +COCCI_SOURCES = $(filter-out 
> > > > > $(THIRD_PARTY_SOURCES),$(FIND_C_SOURCES))
> > > > 
> > > > The former is somewhat misnamed.  FIND_SOURCE_FILES is *not* a list
> > > > of source files---it is a procedure to list source files to its
> > > > standard output.  FIND_C_SOUCRES sounds as if it is a similar
> > > > procedure, which would be implemented much like
> > > > 
> > > > FIND_C_SOURCES = $(FIND_SOURCE_FILES) | sed -n -e '/\.c$/p'
> > > > 
> > > > but that is not what you did and that is not what you want to have.
> > > > Perhaps call it FOUND_C_SOURCES?
> > > > 
> > > > I wonder if we can get rid of FIND_SOURCE_FILES that is a mere
> > > > procedure and replace its use with a true list of source files.
> > > > Would it make the result more pleasant to work with?
> > > > 
> > > > Perhaps something like the attached patch, (which would come before
> > > > this entire thing as a clean-up, and removing the need for 2/3)?
> > > > 
> > > > I dunno.
> > > > 
> > > > Using a procedure whose output is fed to xargs has an advantage that
> > > > a platform with very short command line limit can still work with
> > > > many source files, but the way you create and use COCCI_SOURCES in
> > > > this patch would defeat that advantage anyway,
> > > 
> > > COCCI_SOURCES is only used as an input to 'xargs', so that advantage
> > > is not defeated.
> > 
> > I think it still does matter; the relevant snippet is as follows:
> > 
> > if ! echo $(COCCI_SOURCES) | xargs $$limit \
> > $(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
> > >$@+ 2>$@.log; \
> > 
> > which means that a really big COCCI_SOURCES could exceed the limit.
> 
> Oh, you're both right.
> 
> > That being said, COCCI_SOURCES should be smaller than the future
> > SOURCE_FILES variable since we're only taking %.c files (and filtering
> > out some of them too!).
> 
> We could also argue that Coccinelle only runs on platforms that have a
> reasonably large command line arg limit, and the number of our source
> files is way below that, so it won't matter in the foreseeable future.

Good point.

> 
> (Furthermore, 'echo' is often a shell builtin command, and I don't
> think that the platform's argument size limit applies to them.  At
> least the 'echo' of dash, Bash, ksh, ksh93, mksh, and BusyBox sh can
> deal with at least 10 million arguments; the platform limit is
> somewhere around 147k)
> 
> > > > diff --git a/Makefile b/Makefile
> > > > index f9255344ae..90e88c 100644
> > > > --- a/Makefile
> > > > +++ b/Makefile
> > > > @@ -2584,7 +2584,7 @@ perl/build/man/man3/Git.3pm: perl/Git.pm
> > > > $(QUIET_GEN)mkdir -p $(dir $@) && \
> > > > pod2man $< $@
> > > >  
> > > > -FIND_SOURCE_FILES = ( \
> > > > +SOURCE_FILES = $(patsubst ./%,%,$(shell \
> > > > git ls-files \
> > > > '*.[hcS]' \
> > > > '*.sh' \
> > > > @@ -2599,19 +2599,19 @@ FIND_SOURCE_FILES = ( \
> > > > -o \( -name 'trash*' -type d -prune \) \
> > > > -o \( -name '*.[hcS]' -type f -print \) \
> > > > -o \( -name '*.sh' -type f -print \) \
> > > > -   )
> > > > +   ))
> > > >  
> > > >  $(ETAGS_TARGET): FORCE
> > > > $(RM) $(ETAGS_TARGET)
> > > > -   $(FIND_SOURCE_FILES) | xargs etags -a -o $(ETAGS_TARGET)
> > > > +   etags -a -o $(ETAGS_TARGET) $(SOURCE_FILES)
> > > >  
> > > >  tags: FORCE
> > > > $(RM) tags
> > > > -   $(FIND_SOURCE_FILES) | xargs ctags -a
> > > > +   ctags -a $(SOURCE_FILES)
> > > >  
> > > >  cscope:
> > > > $(RM) cscope*
> > > > -   $(FIND_SOURCE_FILES) | xargs cscope -b
> > > > +   cscope -b $(SOURCE_FILES)
> 
> Here, however, the list of source files is passed as argument to
> non-builtin commands, that also might be used on
> cmdline-arg-limit-challenged platforms.
> 

After doing a bit of research, I think that I agree with you. It seems
like the max command-line length for CMD on Windows is 8191 characters.

However, after running the following,

$ git ls-files '*.[hcS]' '*.sh' ':!*[tp][0-9][0-9][0-9][0-9]*' 
':!contrib' | wc -c
   12779

we can see that the command-line length would definitely exceed the max
length so xargs would be required. As a result, we should probably just
keep the existing xargs invocations.


Re: [PATCH v4] git-gui: add hotkey to toggle "Amend Last Commit"

2019-09-13 Thread Birger Skogeng Pedersen
On Fri, Sep 13, 2019 at 4:37 PM Pratyush Yadav  wrote:
> Hi Birger,
>
> I'm afraid you are working on an older version of this patch. You should
> be re-rolling [0], which works well with Bert's "amend check button"
> change.
>
> [0] 
> https://public-inbox.org/git/b82a00441ff1a6a9cea3fd235c1c33729ec31b71.1567713659.git.bert.wes...@googlemail.com/

Forgive me, I get a little bit confused. Should my patch be based on
"next" or "master" branch?
Also, is it an issue that this patch won't work unless you merge
Bert's 1/2 patch[0]?
Your feedback cannot be too specific, I want to learn how to do this
properly :-)

[0] 
https://public-inbox.org/git/ab1f68cc8552e405c9d04622be1e728ab81bda17.1567713659.git.bert.wes...@googlemail.com/


Birger


Re: [PATCH] completion: add missing completions for log, diff, show

2019-09-13 Thread Johannes Schindelin
Hi Max,

[your reply did not make it to the Git mailing list because it drops
anything with an HTML part]

On Thu, 12 Sep 2019, Max Rothman wrote:

> Great! What’s the path for getting this merged?

Usually it will be picked up into the `pu` branch first, then advance to
`next`, and then to `master`. Once it is in `master`, it will be part of
the then-next official version ending in `.0`.

You can also read about the state of your patch (once it has been picked
up) in the bi-weekly "What's cooking" mails. If you're not subscribed,
you can look at the current version on the web:
https://github.com/git/git/blob/todo/whats-cooking.txt

Ciao,
Johannes

>
> On Thu, Sep 12, 2019 at 4:54 AM Johannes Schindelin <
> johannes.schinde...@gmx.de> wrote:
>
> > Hi Max,
> >
> > The patch looks good to me!
> >
> > Thanks,
> > Johannes
> >
> > On Wed, 11 Sep 2019, Max Rothman wrote:
> >
> > > On Thu, Aug 1, 2019 at 8:54 PM Max Rothman 
> > wrote:
> > > >
> > > > On Thu, Aug 1, 2019 at 8:50 PM Max Rothman 
> > wrote:
> > > > >
> > > > > The bash completion script knows some options to "git log" and
> > > > > "git show" only in the positive form, (e.g. "--abbrev-commit"), but
> > not
> > > > > in their negative form (e.g. "--no-abbrev-commit"). Add them.
> > > > >
> > > > > Also, the bash completion script is missing some other options to
> > > > > "git diff", and "git show" (and thus, all other commands that take
> > > > > "git diff"'s options). Add them. Of note, since "--indent-heuristic"
> > is
> > > > > no longer experimental, add that too.
> > > > >
> > > > > Signed-off-by: Max Rothman 
> > > > > ---
> > > > >  contrib/completion/git-completion.bash | 18 ++
> > > > >  1 file changed, 14 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/contrib/completion/git-completion.bash
> > b/contrib/completion/git-completion.bash
> > > > > index 9f71bcde967bc..b6d18710135ec 100644
> > > > > --- a/contrib/completion/git-completion.bash
> > > > > +++ b/contrib/completion/git-completion.bash
> > > > > @@ -1474,6 +1474,8 @@ __git_diff_common_options="--stat --numstat
> > --shortstat --summary
> > > > > --dirstat-by-file= --cumulative
> > > > > --diff-algorithm=
> > > > > --submodule --submodule= --ignore-submodules
> > > > > +   --indent-heuristic --no-indent-heuristic
> > > > > +   --textconv --no-textconv
> > > > >  "
> > > > >
> > > > >  _git_diff ()
> > > > > @@ -1782,6 +1784,10 @@ _git_log ()
> > > > > __gitcomp "$__git_diff_submodule_formats" ""
> > "${cur##--submodule=}"
> > > > > return
> > > > > ;;
> > > > > +   --no-walk=*)
> > > > > +   __gitcomp "sorted unsorted" "" "${cur##--no-walk=}"
> > > > > +   return
> > > > > +   ;;
> > > > > --*)
> > > > > __gitcomp "
> > > > > $__git_log_common_options
> > > > > @@ -1789,16 +1795,19 @@ _git_log ()
> > > > > $__git_log_gitk_options
> > > > > --root --topo-order --date-order --reverse
> > > > > --follow --full-diff
> > > > > -   --abbrev-commit --abbrev=
> > > > > +   --abbrev-commit --no-abbrev-commit --abbrev=
> > > > > --relative-date --date=
> > > > > --pretty= --format= --oneline
> > > > > --show-signature
> > > > > --cherry-mark
> > > > > --cherry-pick
> > > > > --graph
> > > > > -   --decorate --decorate=
> > > > > +   --decorate --decorate= --no-decorate
> > > > > --walk-reflogs
> > > > > +   --no-walk --no-walk= --do-walk
> > > > > --parents --children
> > > > > +   --expand-tabs --expand-tabs= --no-expand-tabs
> > > > > +   --patch
> > > > > $merge
> > > > > $__git_diff_common_options
> > > > > --pickaxe-all --pickaxe-regex
> > > > > @@ -2525,8 +2534,9 @@ _git_show ()
> > > > > return
> > > > > ;;
> > > > > --*)
> > > > > -   __gitcomp "--pretty= --format= --abbrev-commit
> > --oneline
> > > > > -   --show-signature
> > > > > +   __gitcomp "--pretty= --format= --abbrev-commit
> > --no-abbrev-commit
> > > > > +   --oneline --show-signature --patch
> > > > > +   --expand-tabs --expand-tabs= --no-expand-tabs
> > > > > $__git_diff_common_options
> > > > > "
> > > > > return
> > > > >
> > > > > --
> > > > > https://github.com/git/git/pull/426
> > > > >
>

Re: [PATCH v4] git-gui: add horizontal scrollbar to commit buffer

2019-09-13 Thread Pratyush Yadav
Thanks for the re-roll. Will queue.

On 13/09/19 10:16PM, Bert Wesarg wrote:
> While the commit message widget has a configurable fixed width, it
> nevertheless allowed to write commit messages which exceeded this limit.
> Though there is no visual clue, that there is scrolling going on. Now
> there is a horizontal scrollbar.
> 
> There seems to be a bug in at least Tcl/Tk up to version 8.6.8, which
> does not update the horizontal scrollbar if one removes the whole
> content at once.
> 
> Suggested-by: Birger Skogeng Pedersen 
> Signed-off-by: Bert Wesarg 
> ---
>  git-gui.sh | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/git-gui.sh b/git-gui.sh
> index 5bc21b8..ad962d4 100755
> --- a/git-gui.sh
> +++ b/git-gui.sh
> @@ -3363,10 +3363,16 @@ ttext $ui_comm -background white -foreground black \
>   -relief sunken \
>   -width $repo_config(gui.commitmsgwidth) -height 9 -wrap none \
>   -font font_diff \
> + -xscrollcommand {.vpane.lower.commarea.buffer.frame.sbx set} \
>   -yscrollcommand {.vpane.lower.commarea.buffer.frame.sby set}
> +${NS}::scrollbar .vpane.lower.commarea.buffer.frame.sbx \
> + -orient horizontal \
> + -command [list $ui_comm xview]
>  ${NS}::scrollbar .vpane.lower.commarea.buffer.frame.sby \
> + -orient vertical \
>   -command [list $ui_comm yview]
>  
> +pack .vpane.lower.commarea.buffer.frame.sbx -side bottom -fill x
>  pack .vpane.lower.commarea.buffer.frame.sby -side right -fill y
>  pack $ui_comm -side left -fill y
>  pack .vpane.lower.commarea.buffer.header -side top -fill x
> -- 
> 2.21.0.789.ga095d9d866
> 

-- 
Regards,
Pratyush Yadav


Re: [PATCH v3 01/12] t7300: add testcases showing failure to clean specified pathspecs

2019-09-13 Thread Junio C Hamano
Elijah Newren  writes:

>> Also, you talked about tracked paths in the proposed log message; do
>> they not participate in reproducing the issue(s)?
>
> If there is only one directory which has no tracked files, then the
> user can clean up the files -- but confusingly, they have to issue the
> same git-clean command multiple times.  If multiple directories have
> no untracked files, git-clean will never clean them out.  I probably
> didn't do a very good job explaining that although I started with the
> case with one tracked, that I view the case without any as the more
> general case -- and that solving it solves both problems.  I could
> probably make that clearer in the commit message.  (Or maybe just add
> more testcases even if slightly duplicative, I guess.)

My comment/puzzlement indeed was the lack of any tracked file in
your tests, even though the log message did talk about one's
presence making a difference in the outcome.

Thanks.


Re: [PATCH 1/1] multi-pack-index: add --no-progress Add --no-progress option to git multi-pack-index. The progress feature was added in 144d703 ("multi-pack-index: report progress during 'verify'", 20

2019-09-13 Thread Junio C Hamano
William Baker  writes:

>> I also see in the code that
>> handles opts.batch_size that there is a workaround for this inverted
>> code structure to make sure subcommands other than repack does not
>> allow --batch-size option specified.
>
>> we probably would want to correct the use
>> of parse_options() API in the implementation of this command before
>> adding any new option or subcommand.
>
> To confirm I understand, is the recommendation that
> cmd_multi_pack_index be updated to only parse "batch-size" for the repack
> subcommand (i.e. use PARSE_OPT_STOP_AT_NON_OPTION to parse all of the common
> options, and then only parse "batch-size" when the repack subcommand is 
> running)?

Compare the ways how dispatching and command line option parsing of
subcommands in "multi-pack-index" and "commit-graph" are
implemented.  When a command (e.g. "commit-graph") takes common
options and also has subcommands (e.g. "read" and "write") that take
different set of options, there is a common options parser in the
primary entry point (e.g. "cmd_commit_graph()"), and after
dispatching to a chosen subcommand, the implementation of each
subcommand (e.g. "graph_read()" and "graph_write()") parses its own
options.  That's bog-standard way.

cmd_multi_pack_index() "cheats" and does not implement proper
subcommand dispatch (it directly makes underlying API calls
instead).  Started as an isolated experimental command whose
existence as a standalone command is solely because it was easier to
experiment with (as opposed to being a plumbing command to be used
by scripters), it probably was an acceptable trade-off to leave the
code in this shape.  In the longer run, I suspect we'd rather want
to get rid of "git multi-pack-index" as a standalone command and
instead make "git gc" and other commands make direct calls to the
internal machinery of midx code from strategic places.  So in that
sense, I am not sure if I should "recommend" fixing the way the
subcommand dispatching works in this command, or just accept to keep
the ugly technical debt and let it limp along...

>> subcommands to also understand the progress output or verbosity
>> option (and if the excuse given as a response to the above analysis
>> is "this is just a first step, more will come later")
>
> Yep this was my thinking.  Today "repack" and "verify" are the only 
> subcommands
> that have any progress output but as the other subcommands learn how to 
> provide
> progress the [--[no-]progress] option can be used to control it. 
>
>> instead of adding a "unsigned flag" local variable to the function, it would
>> probably make much more sense to
>> 
>>  (1) make struct opts_multi_pack_index as a part of the public API
>>  between cmd_multi_pack_index() and midx.c and document it in
>>  midx.h;
>> 
>>  (2) instead of passing opts.object_dir to existing command
>>  implementations, pass &opts, the pointer to the whole
>>  structure;
>> 
>>  (3) add a new field "unsigned progress" to the structure, and teach
>>  the command line parser to flip it upon seeing "--[no-]progress".
>> 
>
> Thanks for this suggestion I'll use this approach in the next patch.

...but it appears to me that you are more enthused than I am in
improving this code, so I probably should actually recommend fixing
the main entry point function of this command to imitate the way
"commit-graph" implements subcommands and their own set of options
;-)



Re: [PATCH] git-gui: convert new/amend commit radiobutton to checketton

2019-09-13 Thread Bert Wesarg
On Fri, Sep 13, 2019 at 9:49 PM Pratyush Yadav  wrote:
>
> You missed fixing the typo in the subject. s/checketton/checkbutton/.
> But no need to send a re-roll for that. I can fix it locally.

thanks, I also fixed that locally, but again, it slipped thru in the
patch mail. Still don't know why this happened.

Best,
Bert

>
> Other than that, LGTM. Thanks, will queue.
>
> On 13/09/19 08:02AM, Bert Wesarg wrote:
> > Its a bi-state anyway and also saves one line in the menu.
> >
> > Signed-off-by: Bert Wesarg 
>
> --
> Regards,
> Pratyush Yadav


Re: [PATCH] git-gui: add horizontal scrollbar to commit buffer

2019-09-13 Thread Bert Wesarg
On Fri, Sep 13, 2019 at 8:53 PM Pratyush Yadav  wrote:
>
> Can you please add a version number when you send re-rolls the next time
> around. Something like "[PATCH v2] subject...". Makes it easier for me
> to keep track of things when there are multiple re-rolls of multiple
> patches.
>
> You can do this by passing "-v2" (or "-v3", "-v4", etc) to
> git-send-email or git-format-patch.
>
> You missed two quick questions I had in the last version. I'll ask them
> again below. Other than those two, LGTM. Thanks.

I removed both of then, but they still made it into the patch, sorry.
Will re-roll with v4 now.

Bert

>
> On 12/09/19 09:20PM, Bert Wesarg wrote:
> > While the commit message widget has a configurable fixed width, it
> > nevertheless allowed to write commit messages which exceeded this limit.
> > Though there is no visual clue, that there is scrolling going on. Now
> > there is a horizontal scrollbar.
>
> Looks much better!
>
> > There seems to be a bug in at least Tcl/Tk up to version 8.6.8, which
> > does not update the horizontal scrollbar if one removes the whole
> > content at once.
> >
> > Suggested-by: Birger Skogeng Pedersen 
> > Signed-off-by: Bert Wesarg 
> > ---
> >  git-gui.sh | 10 --
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/git-gui.sh b/git-gui.sh
> > index 5bc21b8..032ebd6 100755
> > --- a/git-gui.sh
> > +++ b/git-gui.sh
> > @@ -3363,14 +3363,20 @@ ttext $ui_comm -background white -foreground black \
> >   -relief sunken \
> >   -width $repo_config(gui.commitmsgwidth) -height 9 -wrap none \
> >   -font font_diff \
> > + -xscrollcommand {.vpane.lower.commarea.buffer.frame.sbx set} \
> >   -yscrollcommand {.vpane.lower.commarea.buffer.frame.sby set}
> > +${NS}::scrollbar .vpane.lower.commarea.buffer.frame.sbx \
> > + -orient horizontal \
> > + -command [list $ui_comm xview]
> >  ${NS}::scrollbar .vpane.lower.commarea.buffer.frame.sby \
> > + -orient vertical \
> >   -command [list $ui_comm yview]
> >
> > +pack .vpane.lower.commarea.buffer.frame.sbx -side bottom -fill x
> >  pack .vpane.lower.commarea.buffer.frame.sby -side right -fill y
> > -pack $ui_comm -side left -fill y
> > +pack $ui_comm -side left -fill both -expand 1
>
> If I remove this change, the behavior does not seem to change, and the
> commit message buffer still expands. So what exactly does this change
> do?
>
> >  pack .vpane.lower.commarea.buffer.header -side top -fill x
> > -pack .vpane.lower.commarea.buffer.frame -side left -fill y
> > +pack .vpane.lower.commarea.buffer.frame -side bottom -fill both -expand 1
>
> I'm not too familiar with pack, but why change the side from left to
> bottom? I tested by changing it back to left and it doesn't seem to make
> a difference.
>
> >  pack .vpane.lower.commarea.buffer -side left -fill y
> >
> >  # -- Commit Message Buffer Context Menu
> > --
> > 2.21.0.789.ga095d9d866
> >
>
> --
> Regards,
> Pratyush Yadav


Re: [PATCH v3 07/12] dir: add commentary explaining match_pathspec_item's return value

2019-09-13 Thread Junio C Hamano
Elijah Newren  writes:

> The way match_pathspec_item() handles names and pathspecs with trailing
> slash characters, in conjunction with special options like
> DO_MATCH_DIRECTORY and DO_MATCH_LEADING_PATHSPEC were non-obvious, and
> broken until this patch series.  Add a table in a comment explaining the
> intent of how these work.

Thanks.

>
> Signed-off-by: Elijah Newren 
> ---
>  dir.c | 27 +++
>  1 file changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index 47c0a99cb5..3b2fe1701c 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -276,16 +276,27 @@ static int do_read_blob(const struct object_id *oid, 
> struct oid_stat *oid_stat,
>  #define DO_MATCH_LEADING_PATHSPEC (1<<2)
>  
>  /*
> - * Does 'match' match the given name?
> - * A match is found if
> + * Does the given pathspec match the given name?  A match is found if
>   *
> - * (1) the 'match' string is leading directory of 'name', or
> - * (2) the 'match' string is a wildcard and matches 'name', or
> - * (3) the 'match' string is exactly the same as 'name'.
> + * (1) the pathspec string is leading directory of 'name' ("RECURSIVELY"), or
> + * (2) the pathspec string has a leading part matching 'name' ("LEADING"), or
> + * (3) the pathspec string is a wildcard and matches 'name' ("WILDCARD"), or
> + * (4) the pathspec string is exactly the same as 'name' ("EXACT").
>   *
> - * and the return value tells which case it was.
> + * Return value tells which case it was (1-4), or 0 when there is no match.
>   *
> - * It returns 0 when there is no match.
> + * It may be instructive to look at a small table of concrete examples
> + * to understand the differences between 1, 2, and 4:
> + *
> + *  Pathspecs
> + *|a/b|   a/b/|   a/b/c
> + *  --+---+---+
> + *  a/b   |  EXACT|  EXACT[1] | LEADING[2]
> + *  Names   a/b/  | RECURSIVE |   EXACT   | LEADING[2]
> + *  a/b/c | RECURSIVE | RECURSIVE |   EXACT
> + *
> + * [1] Only if DO_MATCH_DIRECTORY is passed; otherwise, this is NOT a match.
> + * [2] Only if DO_MATCH_LEADING_PATHSPEC is passed; otherwise, not a match.
>   */
>  static int match_pathspec_item(const struct index_state *istate,
>  const struct pathspec_item *item, int prefix,
> @@ -353,7 +364,7 @@ static int match_pathspec_item(const struct index_state 
> *istate,
>item->nowildcard_len - prefix))
>   return MATCHED_FNMATCH;
>  
> - /* Perform checks to see if "name" is a super set of the pathspec */
> + /* Perform checks to see if "name" is a leading string of the pathspec 
> */
>   if (flags & DO_MATCH_LEADING_PATHSPEC) {
>   /* name is a literal prefix of the pathspec */
>   int offset = name[namelen-1] == '/' ? 1 : 0;


Re: [PATCH] git-gui: convert new/amend commit radiobutton to checketton

2019-09-13 Thread Pratyush Yadav
You missed fixing the typo in the subject. s/checketton/checkbutton/. 
But no need to send a re-roll for that. I can fix it locally.

Other than that, LGTM. Thanks, will queue.

On 13/09/19 08:02AM, Bert Wesarg wrote:
> Its a bi-state anyway and also saves one line in the menu.
> 
> Signed-off-by: Bert Wesarg 

-- 
Regards,
Pratyush Yadav


Re: [PATCH v3 06/12] dir: if our pathspec might match files under a dir, recurse into it

2019-09-13 Thread Junio C Hamano
Elijah Newren  writes:

> For git clean, if a directory is entirely untracked and the user did not
> specify -d (corresponding to DIR_SHOW_IGNORED_TOO), then we usually do
> not want to remove that directory and thus do not recurse into it.

Makes sense.  To clean named paths in such a directory, we'd need an
option to recurse into it to find them, yet make sure the directory
itself does not get removed.

> However, if the user manually specified specific (or even globbed) paths
> somewhere under that directory to remove, then we need to recurse into
> the directory to make sure we remove the relevant paths under that
> directory as the user requested.

Surely.

> @@ -1939,8 +1939,10 @@ static enum path_treatment 
> read_directory_recursive(struct dir_struct *dir,
>   /* recurse into subdir if instructed by treat_path */
>   if ((state == path_recurse) ||
>   ((state == path_untracked) &&
> -  (dir->flags & DIR_SHOW_IGNORED_TOO) &&
> -  (get_dtype(cdir.de, istate, path.buf, path.len) == 
> DT_DIR))) {
> +  (get_dtype(cdir.de, istate, path.buf, path.len) == 
> DT_DIR) &&
> +  ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
> +   do_match_pathspec(istate, pathspec, path.buf, 
> path.len,
> + baselen, NULL, 
> DO_MATCH_LEADING_PATHSPEC) == MATCHED_RECURSIVELY_LEADING_PATHSPEC))) {
>   struct untracked_cache_dir *ud;
>   ud = lookup_untracked(dir->untracked, untracked,
> path.buf + baselen,

OK.

> diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
> index 12617158db..d83aeb7dc2 100755
> --- a/t/t7300-clean.sh
> +++ b/t/t7300-clean.sh
> @@ -691,7 +691,7 @@ test_expect_failure 'git clean -d skips nested repo 
> containing ignored files' '
>   test_path_is_file nested-repo-with-ignored-file/file
>  '
>  
> -test_expect_failure 'git clean handles being told what to clean' '
> +test_expect_success 'git clean handles being told what to clean' '
>   mkdir -p d1 d2 &&
>   touch d1/ut d2/ut &&
>   git clean -f */ut &&
> @@ -707,7 +707,7 @@ test_expect_success 'git clean handles being told what to 
> clean, with -d' '
>   test_path_is_missing d2/ut
>  '
>  
> -test_expect_failure 'git clean works if a glob is passed without -d' '
> +test_expect_success 'git clean works if a glob is passed without -d' '
>   mkdir -p d1 d2 &&
>   touch d1/ut d2/ut &&
>   git clean -f "*ut" &&

Nice.

Thanks.


Re: [PATCH v3 01/12] t7300: add testcases showing failure to clean specified pathspecs

2019-09-13 Thread Elijah Newren
On Fri, Sep 13, 2019 at 11:54 AM Junio C Hamano  wrote:
>
> Elijah Newren  writes:
>
> > +test_expect_failure 'git clean handles being told what to clean' '
> > + mkdir -p d1 d2 &&
> > + touch d1/ut d2/ut &&
> > + git clean -f */ut &&
> > + test_path_is_missing d1/ut &&
> > + test_path_is_missing d2/ut
> > +'
>
> Looks like d1 and d2 are new directories and the paths we see in the
> test are the only ones that are involved (i.e. we do not rely on any
> leftover cruft in d[12]/ from previous tests).  If so, perhaps it is
> easier to follow by starting the tests with "rm -fr d1 d2 &&" or
> something to assure the readers of the script (not this patch, but
> the resulting file down the road) about the isolation?  The same
> comment applies to the remainder.

Makes sense.

> Also, you talked about tracked paths in the proposed log message; do
> they not participate in reproducing the issue(s)?

If there is only one directory which has no tracked files, then the
user can clean up the files -- but confusingly, they have to issue the
same git-clean command multiple times.  If multiple directories have
no untracked files, git-clean will never clean them out.  I probably
didn't do a very good job explaining that although I started with the
case with one tracked, that I view the case without any as the more
general case -- and that solving it solves both problems.  I could
probably make that clearer in the commit message.  (Or maybe just add
more testcases even if slightly duplicative, I guess.)


Re: [PATCH v3 03/12] dir: fix off-by-one error in match_pathspec_item

2019-09-13 Thread Junio C Hamano
Elijah Newren  writes:

> For a pathspec like 'foo/bar' comparing against a path named "foo/",
> namelen will be 4, and match[namelen] will be 'b'.  The correct location
> of the directory separator is namelen-1.

And the reason why name[namelen-1] may not be slash, in which case
your new code makes offset 0, is because we need to handle what
case?  When path is "foo" (not "foo/")?  Just makes me wonder why
this callee allows the caller(s) to be inconsistent, sometimes
including the trailing slash in  tuple, sometimes
not.

> The reason the code worked anyway was that the following code immediately
> checked whether the first matchlen characters matched (which they do) and
> then bailed and return MATCHED_RECURSIVELY anyway since wildmatch doesn't
> have the ability to check if "name" can be matched as a directory (or
> prefix) against the pathspec.

Nicely spotted and explained.

>
> Signed-off-by: Elijah Newren 
> ---
>  dir.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/dir.c b/dir.c
> index a9168bed96..bf1a74799e 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -356,8 +356,9 @@ static int match_pathspec_item(const struct index_state 
> *istate,
>   /* Perform checks to see if "name" is a super set of the pathspec */
>   if (flags & DO_MATCH_SUBMODULE) {
>   /* name is a literal prefix of the pathspec */
> + int offset = name[namelen-1] == '/' ? 1 : 0;
>   if ((namelen < matchlen) &&
> - (match[namelen] == '/') &&
> + (match[namelen-offset] == '/') &&
>   !ps_strncmp(item, match, name, namelen))
>   return MATCHED_RECURSIVELY;


Re: [PATCH v3 01/12] t7300: add testcases showing failure to clean specified pathspecs

2019-09-13 Thread Junio C Hamano
Elijah Newren  writes:

> +test_expect_failure 'git clean handles being told what to clean' '
> + mkdir -p d1 d2 &&
> + touch d1/ut d2/ut &&
> + git clean -f */ut &&
> + test_path_is_missing d1/ut &&
> + test_path_is_missing d2/ut
> +'

Looks like d1 and d2 are new directories and the paths we see in the
test are the only ones that are involved (i.e. we do not rely on any
leftover cruft in d[12]/ from previous tests).  If so, perhaps it is
easier to follow by starting the tests with "rm -fr d1 d2 &&" or
something to assure the readers of the script (not this patch, but
the resulting file down the road) about the isolation?  The same
comment applies to the remainder.

Also, you talked about tracked paths in the proposed log message; do
they not participate in reproducing the issue(s)?

Thanks.


> +test_expect_failure 'git clean handles being told what to clean, with -d' '
> + mkdir -p d1 d2 &&
> + touch d1/ut d2/ut &&
> + git clean -ffd */ut &&
> + test_path_is_missing d1/ut &&
> + test_path_is_missing d2/ut
> +'
> +
> +test_expect_failure 'git clean works if a glob is passed without -d' '
> + mkdir -p d1 d2 &&
> + touch d1/ut d2/ut &&
> + git clean -f "*ut" &&
> + test_path_is_missing d1/ut &&
> + test_path_is_missing d2/ut
> +'
> +
> +test_expect_failure 'git clean works if a glob is passed with -d' '
> + mkdir -p d1 d2 &&
> + touch d1/ut d2/ut &&
> + git clean -ffd "*ut" &&
> + test_path_is_missing d1/ut &&
> + test_path_is_missing d2/ut
> +'
> +
>  test_expect_success MINGW 'handle clean & core.longpaths = false nicely' '
>   test_config core.longpaths false &&
>   a50=aa &&


Re: [PATCH] git-gui: add horizontal scrollbar to commit buffer

2019-09-13 Thread Pratyush Yadav
Can you please add a version number when you send re-rolls the next time 
around. Something like "[PATCH v2] subject...". Makes it easier for me 
to keep track of things when there are multiple re-rolls of multiple 
patches.

You can do this by passing "-v2" (or "-v3", "-v4", etc) to 
git-send-email or git-format-patch.

You missed two quick questions I had in the last version. I'll ask them 
again below. Other than those two, LGTM. Thanks.

On 12/09/19 09:20PM, Bert Wesarg wrote:
> While the commit message widget has a configurable fixed width, it
> nevertheless allowed to write commit messages which exceeded this limit.
> Though there is no visual clue, that there is scrolling going on. Now
> there is a horizontal scrollbar.

Looks much better!
 
> There seems to be a bug in at least Tcl/Tk up to version 8.6.8, which
> does not update the horizontal scrollbar if one removes the whole
> content at once.
> 
> Suggested-by: Birger Skogeng Pedersen 
> Signed-off-by: Bert Wesarg 
> ---
>  git-gui.sh | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/git-gui.sh b/git-gui.sh
> index 5bc21b8..032ebd6 100755
> --- a/git-gui.sh
> +++ b/git-gui.sh
> @@ -3363,14 +3363,20 @@ ttext $ui_comm -background white -foreground black \
>   -relief sunken \
>   -width $repo_config(gui.commitmsgwidth) -height 9 -wrap none \
>   -font font_diff \
> + -xscrollcommand {.vpane.lower.commarea.buffer.frame.sbx set} \
>   -yscrollcommand {.vpane.lower.commarea.buffer.frame.sby set}
> +${NS}::scrollbar .vpane.lower.commarea.buffer.frame.sbx \
> + -orient horizontal \
> + -command [list $ui_comm xview]
>  ${NS}::scrollbar .vpane.lower.commarea.buffer.frame.sby \
> + -orient vertical \
>   -command [list $ui_comm yview]
>  
> +pack .vpane.lower.commarea.buffer.frame.sbx -side bottom -fill x
>  pack .vpane.lower.commarea.buffer.frame.sby -side right -fill y
> -pack $ui_comm -side left -fill y
> +pack $ui_comm -side left -fill both -expand 1

If I remove this change, the behavior does not seem to change, and the 
commit message buffer still expands. So what exactly does this change 
do?

>  pack .vpane.lower.commarea.buffer.header -side top -fill x
> -pack .vpane.lower.commarea.buffer.frame -side left -fill y
> +pack .vpane.lower.commarea.buffer.frame -side bottom -fill both -expand 1

I'm not too familiar with pack, but why change the side from left to 
bottom? I tested by changing it back to left and it doesn't seem to make 
a difference.

>  pack .vpane.lower.commarea.buffer -side left -fill y
>  
>  # -- Commit Message Buffer Context Menu
> -- 
> 2.21.0.789.ga095d9d866
> 

-- 
Regards,
Pratyush Yadav


Re: [PATCH 1/1] multi-pack-index: add --no-progress Add --no-progress option to git multi-pack-index. The progress feature was added in 144d703 ("multi-pack-index: report progress during 'verify'", 20

2019-09-13 Thread William Baker
Hi Junio,

Thanks for the review!

On 9/12/19 1:17 PM, Junio C Hamano wrote:

>> +'git multi-pack-index' [--object-dir=]  [--[no-]progress]
> 
> I am wondering what the reasoning behind having this new one *after*
> the subcommand while the existing one *before* is.  Isn't the
> --[no-]progress option supported by all subcommands of the
> multi-pack-index command, just like the --object-dir= option
> is?
> 
> always show "global --options, subcommand, and then subcommand --options" 
> order).

Thanks for calling this out.  I didn't have a specific reason for making this
option appear after the subcommand.  I tried looking at other commands as
examples and I missed that there's a specific ordering based on the type
of the option.  I will clean this up in the next patch. 

> I also see in the code that
> handles opts.batch_size that there is a workaround for this inverted
> code structure to make sure subcommands other than repack does not
> allow --batch-size option specified.

> we probably would want to correct the use
> of parse_options() API in the implementation of this command before
> adding any new option or subcommand.

To confirm I understand, is the recommendation that
cmd_multi_pack_index be updated to only parse "batch-size" for the repack
subcommand (i.e. use PARSE_OPT_STOP_AT_NON_OPTION to parse all of the common
options, and then only parse "batch-size" when the repack subcommand is 
running)?


>> @@ -47,14 +54,15 @@ int cmd_multi_pack_index(int argc, const char **argv,
>>  trace2_cmd_mode(argv[0]);
>>  
>>  if (!strcmp(argv[0], "repack"))
>> -return midx_repack(the_repository, opts.object_dir, 
>> (size_t)opts.batch_size);
>> +return midx_repack(the_repository, opts.object_dir, 
>> +(size_t)opts.batch_size, flags);
>>  if (opts.batch_size)
>>  die(_("--batch-size option is only for 'repack' subcommand"));
>>  
>>  if (!strcmp(argv[0], "write"))
>>  return write_midx_file(opts.object_dir);
>>  if (!strcmp(argv[0], "verify"))
>> -return verify_midx_file(the_repository, opts.object_dir);
>> +return verify_midx_file(the_repository, opts.object_dir, flags);
>>  if (!strcmp(argv[0], "expire"))
>>  return expire_midx_packs(the_repository, opts.object_dir);
> 

> We can see that the new option only affects "verify", even though
> the SYNOPSIS and usage text pretends that everybody understands and
> reacts to it.  Shouldn't it be documented just like how --batch-size
> is documented that it is understood only by "repack"?
> 
> If the mid-term aspiration of this patch is to later enhance other
> subcommands to also understand the progress output or verbosity
> option (and if the excuse given as a response to the above analysis
> is "this is just a first step, more will come later")

Yep this was my thinking.  Today "repack" and "verify" are the only subcommands
that have any progress output but as the other subcommands learn how to provide
progress the [--[no-]progress] option can be used to control it. 

> instead of adding a "unsigned flag" local variable to the function, it would
> probably make much more sense to
> 
>  (1) make struct opts_multi_pack_index as a part of the public API
>  between cmd_multi_pack_index() and midx.c and document it in
>  midx.h;
> 
>  (2) instead of passing opts.object_dir to existing command
>  implementations, pass &opts, the pointer to the whole
>  structure;
> 
>  (3) add a new field "unsigned progress" to the structure, and teach
>  the command line parser to flip it upon seeing "--[no-]progress".
> 

Thanks for this suggestion I'll use this approach in the next patch.

One small point to clarify, in the current struct I'm using an int for 
"progress" because OPT_BOOL expects an int*.  Do you have any concerns with
keeping "progress" as an int in the public struct, or would you rather 
cmd_multi_pack_index parse an int internally and use it to populate the
public struct?

Thanks!
William


Re: [PATCH] list-objects: don't queue root trees unless revs->tree_objects is set

2019-09-13 Thread Junio C Hamano
Jeff King  writes:

>> Thanks for spotting it.  I wonder if it was broken like this forever
>> since the beginning X-<.
>
> Not quite since the beginning; it comes from 8d2dfc49b1
> (process_{tree,blob}: show objects without buffering, 2009-04-10). I
> suspect nobody noticed because the cost for the normal case is really
> just shuffling some pointers around. It's only because it actively works
> against the commit-graph optimizations that it's so expensive there.

Yeah, that is what I meant by "since the beginning" (of
commit-graph, that is).

> I was surprised (and pleased) by how much such a simple thing helped.

Yes, it is very pleasing.  Thanks.


<    4   5   6   7   8   9   10   11   12   13   >