Re: reftable [v4]: new ref storage format

2017-08-02 Thread Junio C Hamano
Shawn Pearce  writes:

> On Wed, Aug 2, 2017 at 6:50 PM, Junio C Hamano  wrote:
> ...
>>Would it benefit us if we define the sort order of bytes slightly
>>different from the ASCII order, so that a slash '/' sorts between
>>NUL '\000' and SOH '\001', which is the order we should have used
>>when storing the entries in the index?
>
> I'm not really with that. It complicates the compare routine, and
> makes the data in reftable sorted differently than we announce in the
> wire protocol.

Fair enough.  It was not like I had some operations in mind that
would benefit from such a sort order (i.e. walking two of these
things in parallel and merging them, which would have been the case
for the index when we walk it together with one or more trees); if
there is no such operation that benefit, there is no reason to try
to be clever here.

>>  * Even though readers can continue accessing, starting from the
>>$GIT_DIR/refs, without locking and get consistent views, any
>>transaction that groups one or more ref updates would need to
>>take a global lock on $GIT_DIR/refs file.
>>
>>Would it become a problem in a busy repository?
> ...
>
> I'm not really sure how one could safely do the same thing with
> git-core. Its certainly not going to be portable or safe on NFS if we
> tried to do anything fancy with flock(2), fcntl(F_SETLKW), or
> semop(2).

Yes.

And for public record, another thing we privately discussed was that
we currently do not know if we would want to make this design mesh
well with the use of multiple worktrees (IIUC, HEAD and things
outside refs/, and refs/bisect/, need to be per- worktree, while
others are to be common), and if so how.


Re: reftable [v4]: new ref storage format

2017-08-02 Thread Shawn Pearce
On Wed, Aug 2, 2017 at 6:50 PM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> I like the general idea, what the file format can represent and how
>> it does so, but I am a bit uneasy about how well this "stacked" part
>> would work for desktop clients.
>
> Two more random things before I forget.
>
>  * I understand that you would want to allow both a ref "ref/D" and
>"ref/D/F" to appear in the same reftable file.  A refname is an
>uninterpreted sequence of bytes and refnames are sorted in the
>table.
>
>Would it benefit us if we define the sort order of bytes slightly
>different from the ASCII order, so that a slash '/' sorts between
>NUL '\000' and SOH '\001', which is the order we should have used
>when storing the entries in the index?

I'm not really with that. It complicates the compare routine, and
makes the data in reftable sorted differently than we announce in the
wire protocol. That cuts off any sort of optimizations we were
considering at $DAY_JOB to plumb the wire protocol code in JGit closer
to reftable code so that a large advertisement is more or less just
dumped straight from reftable to the wire protocol with only minimal
formatting changes.

>  * Even though readers can continue accessing, starting from the
>$GIT_DIR/refs, without locking and get consistent views, any
>transaction that groups one or more ref updates would need to
>take a global lock on $GIT_DIR/refs file.
>
>Would it become a problem in a busy repository?

Potentially yes. Writers may need to use a randomized exponential
backoff while trying to acquire the lock. For big application servers,
I recommended wrapping the file IO locking code with an application
server managed lock + fair wait queue. This is fairly straightforward
for JGit to implement within Gerrit Code Review.

I'm not really sure how one could safely do the same thing with
git-core. Its certainly not going to be portable or safe on NFS if we
tried to do anything fancy with flock(2), fcntl(F_SETLKW), or
semop(2).


Re: reftable [v4]: new ref storage format

2017-08-02 Thread Junio C Hamano
Junio C Hamano  writes:

> I like the general idea, what the file format can represent and how
> it does so, but I am a bit uneasy about how well this "stacked" part
> would work for desktop clients.

Two more random things before I forget.

 * I understand that you would want to allow both a ref "ref/D" and
   "ref/D/F" to appear in the same reftable file.  A refname is an
   uninterpreted sequence of bytes and refnames are sorted in the
   table.

   Would it benefit us if we define the sort order of bytes slightly
   different from the ASCII order, so that a slash '/' sorts between
   NUL '\000' and SOH '\001', which is the order we should have used
   when storing the entries in the index?

 * Even though readers can continue accessing, starting from the
   $GIT_DIR/refs, without locking and get consistent views, any
   transaction that groups one or more ref updates would need to
   take a global lock on $GIT_DIR/refs file.  

   Would it become a problem in a busy repository?


Re: [PATCH 4/4] revision: do not fallback to default when rev_input_given is set

2017-08-02 Thread Stefan Beller
On Wed, Aug 2, 2017 at 3:44 PM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> If revs->def is set (as it is in "git log") and there are no
>> pending objects after parsing the user's input, then we show
>> whatever is in "def". But if the user _did_ ask for some
>> input that just happened to be empty (e.g., "--glob" that
>> does not match anything), showing the default revision is
>> confusing. We should just show nothing, as that is what the
>> user's request yielded.
>>
>> Signed-off-by: Jeff King 
>> ---
>> The "!got_rev_arg" that's already in the conditional is interesting. I
>> wondered if it could be subsumed by the rev_input_given flag. But
>> digging in the history, I think it's mostly about doing reflog walks.
>> Usually if we see a rev arg it will result either in an object added to
>> the pending queue, or a fatal error. But empty reflogs are the
>> exception. And since my other nearby series adds a separate check for
>> "are we doing an empty reflog walk", I don't think it makes sense to
>> tangle this up the new flag I'm adding here.
>
> OK, I'll have to stare at possible merge conflicts to see if I like
> this or some other design decision ;-)
>
> This shows one of the reasons why I want consumers of revision
> machinery not to be futzing these internal implementation detail
> bits in the revs structure, by the way.

cc'd Johannes to see this example and discussion.

>
>>  revision.c | 2 +-
>>  t/t4202-log.sh | 6 ++
>>  2 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/revision.c b/revision.c
>> index 08d5806b8..ba2b166cd 100644
>> --- a/revision.c
>> +++ b/revision.c
>> @@ -2316,7 +2316,7 @@ int setup_revisions(int argc, const char **argv, 
>> struct rev_info *revs, struct s
>>   opt->tweak(revs, opt);
>>   if (revs->show_merge)
>>   prepare_show_merge(revs);
>> - if (revs->def && !revs->pending.nr && !got_rev_arg) {
>> + if (revs->def && !revs->pending.nr && !revs->rev_input_given && 
>> !got_rev_arg) {
>>   struct object_id oid;
>>   struct object *object;
>>   struct object_context oc;
>> diff --git a/t/t4202-log.sh b/t/t4202-log.sh
>> index 3f3531f0a..36d120c96 100755
>> --- a/t/t4202-log.sh
>> +++ b/t/t4202-log.sh
>> @@ -1523,6 +1523,12 @@ test_expect_success 'log diagnoses bogus HEAD' '
>>   test_i18ngrep broken stderr
>>  '
>>
>> +test_expect_success 'log does not default to HEAD when rev input is given' '
>> + >expect &&
>> + git log --branches=does-not-exist >actual &&
>> + test_cmp expect actual
>> +'
>> +
>>  test_expect_success 'set up --source tests' '
>>   git checkout --orphan source-a &&
>>   test_commit one &&


Re: [PATCH 2/4] revision: add rev_input_given flag

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 03:41:52PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Normally a caller that invokes setup_revisions() has to
> > check rev.pending to see if anything was actually queued for
> > the traversal. But they can't tell the difference between
> > two cases:
> >
> >   1. The user gave us no tip from which to start a
> >  traversal.
> >
> >   2. The user tried to give us tips via --glob, --all, etc,
> >  but their patterns ended up being empty.
> >
> > Let's set a flag in the rev_info struct that callers can use
> > to tell the difference.  We can set this from the
> > init_all_refs_cb() function.  That's a little funny because
> > it's not exactly about initializing the "cb" struct itself.
> > But that function is the common setup place for doing
> > pattern traversals that is used by --glob, --all, etc.
> 
> ...and "--bisect", which is an oddball so we probably do not have to
> care.  I didn't check if there is a fallout on that codepath.

Yeah, I saw that one and figured it should probably as "input given".
There's also "--reflog" and "--indexed-objects", which aren't covered
here. I'm not sure if anybody really cares (you'd generally use them
with "--all" anyway), so I left them out for now.

-Peff


Re: [PATCH 4/4] revision: do not fallback to default when rev_input_given is set

2017-08-02 Thread Junio C Hamano
Jeff King  writes:

> If revs->def is set (as it is in "git log") and there are no
> pending objects after parsing the user's input, then we show
> whatever is in "def". But if the user _did_ ask for some
> input that just happened to be empty (e.g., "--glob" that
> does not match anything), showing the default revision is
> confusing. We should just show nothing, as that is what the
> user's request yielded.
>
> Signed-off-by: Jeff King 
> ---
> The "!got_rev_arg" that's already in the conditional is interesting. I
> wondered if it could be subsumed by the rev_input_given flag. But
> digging in the history, I think it's mostly about doing reflog walks.
> Usually if we see a rev arg it will result either in an object added to
> the pending queue, or a fatal error. But empty reflogs are the
> exception. And since my other nearby series adds a separate check for
> "are we doing an empty reflog walk", I don't think it makes sense to
> tangle this up the new flag I'm adding here.

OK, I'll have to stare at possible merge conflicts to see if I like
this or some other design decision ;-)

This shows one of the reasons why I want consumers of revision
machinery not to be futzing these internal implementation detail
bits in the revs structure, by the way.

>  revision.c | 2 +-
>  t/t4202-log.sh | 6 ++
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/revision.c b/revision.c
> index 08d5806b8..ba2b166cd 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -2316,7 +2316,7 @@ int setup_revisions(int argc, const char **argv, struct 
> rev_info *revs, struct s
>   opt->tweak(revs, opt);
>   if (revs->show_merge)
>   prepare_show_merge(revs);
> - if (revs->def && !revs->pending.nr && !got_rev_arg) {
> + if (revs->def && !revs->pending.nr && !revs->rev_input_given && 
> !got_rev_arg) {
>   struct object_id oid;
>   struct object *object;
>   struct object_context oc;
> diff --git a/t/t4202-log.sh b/t/t4202-log.sh
> index 3f3531f0a..36d120c96 100755
> --- a/t/t4202-log.sh
> +++ b/t/t4202-log.sh
> @@ -1523,6 +1523,12 @@ test_expect_success 'log diagnoses bogus HEAD' '
>   test_i18ngrep broken stderr
>  '
>  
> +test_expect_success 'log does not default to HEAD when rev input is given' '
> + >expect &&
> + git log --branches=does-not-exist >actual &&
> + test_cmp expect actual
> +'
> +
>  test_expect_success 'set up --source tests' '
>   git checkout --orphan source-a &&
>   test_commit one &&


Re: [PATCH 2/4] revision: add rev_input_given flag

2017-08-02 Thread Junio C Hamano
Jeff King  writes:

> Normally a caller that invokes setup_revisions() has to
> check rev.pending to see if anything was actually queued for
> the traversal. But they can't tell the difference between
> two cases:
>
>   1. The user gave us no tip from which to start a
>  traversal.
>
>   2. The user tried to give us tips via --glob, --all, etc,
>  but their patterns ended up being empty.
>
> Let's set a flag in the rev_info struct that callers can use
> to tell the difference.  We can set this from the
> init_all_refs_cb() function.  That's a little funny because
> it's not exactly about initializing the "cb" struct itself.
> But that function is the common setup place for doing
> pattern traversals that is used by --glob, --all, etc.

...and "--bisect", which is an oddball so we probably do not have to
care.  I didn't check if there is a fallout on that codepath.




> Signed-off-by: Jeff King 
> ---
>  revision.c | 1 +
>  revision.h | 7 +++
>  2 files changed, 8 insertions(+)
>
> diff --git a/revision.c b/revision.c
> index 6603af944..08d5806b8 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1168,6 +1168,7 @@ static void init_all_refs_cb(struct all_refs_cb *cb, 
> struct rev_info *revs,
>  {
>   cb->all_revs = revs;
>   cb->all_flags = flags;
> + revs->rev_input_given = 1;
>  }
>  
>  void clear_ref_exclusion(struct string_list **ref_excludes_p)
> diff --git a/revision.h b/revision.h
> index f96e7f7f4..c8f4e91f2 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -71,6 +71,13 @@ struct rev_info {
>   const char *def;
>   struct pathspec prune_data;
>  
> + /*
> +  * Whether the arguments parsed by setup_revisions() included any
> +  * "input" revisions that might still have yielded an empty pending
> +  * list (e.g., patterns like "--all" or "--glob").
> +  */
> + int rev_input_given;
> +
>   /* topo-sort */
>   enum rev_sort_order sort_order;


Re: [PATCH 0/4] handling empty inputs in the revision machinery

2017-08-02 Thread Junio C Hamano
Jeff King  writes:

> I noticed that:
>
>   git log --tags=does-not-exist
>
> will show all of HEAD, which is rather confusing. This fixes it, and
> also hits several other cases that were marked as expect_failure for
> rev-list. There is one case it doesn't handle: --stdin. It's not clear
> to me what the right behavior is there. I'll follow up with more
> discussion.

Naïvely, I would expect that an empty input from --stdin should
still prevent us from defaulting to "HEAD", just like --glob=* and
friends should do.  Perhaps there are conter-example scenarios I
haven't thought of..

Thanks.


>
>   [1/4]: t6018: flesh out empty input/output rev-list tests
>   [2/4]: revision: add rev_input_given flag
>   [3/4]: rev-list: don't show usage when we see empty ref patterns
>   [4/4]: revision: do not fallback to default when rev_input_given is set
>
>  builtin/rev-list.c   |  3 ++-
>  revision.c   |  3 ++-
>  revision.h   |  7 +++
>  t/t4202-log.sh   |  6 ++
>  t/t6018-rev-list-glob.sh | 20 +---
>  5 files changed, 26 insertions(+), 13 deletions(-)
>
> -Peff


[PATCH DONOTAPPLY 5/4] revision: let --stdin set rev_input_given

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 06:24:25PM -0400, Jeff King wrote:

> I noticed that:
> 
>   git log --tags=does-not-exist
> 
> will show all of HEAD, which is rather confusing. This fixes it, and
> also hits several other cases that were marked as expect_failure for
> rev-list. There is one case it doesn't handle: --stdin. It's not clear
> to me what the right behavior is there. I'll follow up with more
> discussion.

So here that is. The patch below is what I had intended to send, but I
found some interesting corner cases.

This patch makes "rev-list --stdin 8 --
Subject: [PATCH] revision: let --stdin set rev_input_given

Currently "git rev-list --stdin 
---
 revision.c   | 1 +
 t/t6018-rev-list-glob.sh | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index ba2b166cd..6a1ccd407 100644
--- a/revision.c
+++ b/revision.c
@@ -2253,6 +2253,7 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
}
if (read_from_stdin++)
die("--stdin given twice?");
+   revs->rev_input_given = 1;
read_revisions_from_stdin(revs, _data);
continue;
}
diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index d3453c583..bd300521b 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -255,7 +255,7 @@ test_expect_success 'rev-list accumulates multiple 
--exclude' '
compare rev-list "--exclude=refs/remotes/* --exclude=refs/tags/* --all" 
--branches
 '
 
-test_expect_failure 'rev-list should succeed with empty output on empty stdin' 
'
+test_expect_success 'rev-list should succeed with empty output on empty stdin' 
'
>expect &&
git rev-list --stdin actual &&
test_cmp expect actual
-- 
2.14.0.rc1.586.g00244b0b6



[PATCH 4/4] revision: do not fallback to default when rev_input_given is set

2017-08-02 Thread Jeff King
If revs->def is set (as it is in "git log") and there are no
pending objects after parsing the user's input, then we show
whatever is in "def". But if the user _did_ ask for some
input that just happened to be empty (e.g., "--glob" that
does not match anything), showing the default revision is
confusing. We should just show nothing, as that is what the
user's request yielded.

Signed-off-by: Jeff King 
---
The "!got_rev_arg" that's already in the conditional is interesting. I
wondered if it could be subsumed by the rev_input_given flag. But
digging in the history, I think it's mostly about doing reflog walks.
Usually if we see a rev arg it will result either in an object added to
the pending queue, or a fatal error. But empty reflogs are the
exception. And since my other nearby series adds a separate check for
"are we doing an empty reflog walk", I don't think it makes sense to
tangle this up the new flag I'm adding here.

 revision.c | 2 +-
 t/t4202-log.sh | 6 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index 08d5806b8..ba2b166cd 100644
--- a/revision.c
+++ b/revision.c
@@ -2316,7 +2316,7 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
opt->tweak(revs, opt);
if (revs->show_merge)
prepare_show_merge(revs);
-   if (revs->def && !revs->pending.nr && !got_rev_arg) {
+   if (revs->def && !revs->pending.nr && !revs->rev_input_given && 
!got_rev_arg) {
struct object_id oid;
struct object *object;
struct object_context oc;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 3f3531f0a..36d120c96 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -1523,6 +1523,12 @@ test_expect_success 'log diagnoses bogus HEAD' '
test_i18ngrep broken stderr
 '
 
+test_expect_success 'log does not default to HEAD when rev input is given' '
+   >expect &&
+   git log --branches=does-not-exist >actual &&
+   test_cmp expect actual
+'
+
 test_expect_success 'set up --source tests' '
git checkout --orphan source-a &&
test_commit one &&
-- 
2.14.0.rc1.586.g00244b0b6


Re: [PATCH 6/6] cherry-pick/revert: reject --rerere-autoupdate when continuing

2017-08-02 Thread Junio C Hamano
Junio C Hamano  writes:

> Phillip Wood  writes:
>
>> From: Phillip Wood 
>>
>> cherry-pick and revert should not accept --[no-]rerere-autoupdate once
>> they have started.
>
> Hmph, why shouldn't they?  In other words, shouldn't the usual "try
> to carry forward from the original invocation (saved in the state
> file), but allow overriding from the command line" rule apply?

Actually, I do not care _too_ deeply between

 * You can only give "--[no-]rerere-autoupdate" at the beginning and
   cannot change your mind later.

and

 * The "--[no-]rerere-autoupdate" you give at the beginning is used
   throughout your multi-commit cherry-pick session, but you can
   give an opposite one from the command line when you say
   "--continue", and in that case it takes effect only for a single
   commit.

If I understand correctly, the former is what 5-6/6 implements.  The
latter makes it more in line with how "am -3" followed by "am --no-3
--continue" behaves.

Thanks.


[PATCH 3/4] rev-list: don't show usage when we see empty ref patterns

2017-08-02 Thread Jeff King
If the user gives us no starting point for a traversal, we
want to complain with our normal usage message. But if they
tried to do so with "--all" or "--glob", but that happened
not to match any refs, the usage message isn't helpful. We
should just give them the empty output they asked for
instead.

Signed-off-by: Jeff King 
---
This will have a minor textual conflict with my reflog series, which
touches the same conditional.

 builtin/rev-list.c   | 3 ++-
 t/t6018-rev-list-glob.sh | 6 +++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 95d84d5cd..1e9cc5948 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -350,7 +350,8 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
 
if ((!revs.commits &&
 (!(revs.tag_objects || revs.tree_objects || revs.blob_objects) &&
- !revs.pending.nr)) ||
+ !revs.pending.nr) &&
+!revs.rev_input_given) ||
revs.diff)
usage(rev_list_usage);
 
diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index f8367b829..d3453c583 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -261,13 +261,13 @@ test_expect_failure 'rev-list should succeed with empty 
output on empty stdin' '
test_cmp expect actual
 '
 
-test_expect_failure 'rev-list should succeed with empty output with all refs 
excluded' '
+test_expect_success 'rev-list should succeed with empty output with all refs 
excluded' '
>expect &&
git rev-list --exclude=* --all >actual &&
test_cmp expect actual
 '
 
-test_expect_failure 'rev-list should succeed with empty output with empty 
--all' '
+test_expect_success 'rev-list should succeed with empty output with empty 
--all' '
(
test_create_repo empty &&
cd empty &&
@@ -277,7 +277,7 @@ test_expect_failure 'rev-list should succeed with empty 
output with empty --all'
)
 '
 
-test_expect_failure 'rev-list should succeed with empty output with empty 
glob' '
+test_expect_success 'rev-list should succeed with empty output with empty 
glob' '
>expect &&
git rev-list --glob=does-not-match-anything >actual &&
test_cmp expect actual
-- 
2.14.0.rc1.586.g00244b0b6



[PATCH 2/4] revision: add rev_input_given flag

2017-08-02 Thread Jeff King
Normally a caller that invokes setup_revisions() has to
check rev.pending to see if anything was actually queued for
the traversal. But they can't tell the difference between
two cases:

  1. The user gave us no tip from which to start a
 traversal.

  2. The user tried to give us tips via --glob, --all, etc,
 but their patterns ended up being empty.

Let's set a flag in the rev_info struct that callers can use
to tell the difference.  We can set this from the
init_all_refs_cb() function.  That's a little funny because
it's not exactly about initializing the "cb" struct itself.
But that function is the common setup place for doing
pattern traversals that is used by --glob, --all, etc.

Signed-off-by: Jeff King 
---
 revision.c | 1 +
 revision.h | 7 +++
 2 files changed, 8 insertions(+)

diff --git a/revision.c b/revision.c
index 6603af944..08d5806b8 100644
--- a/revision.c
+++ b/revision.c
@@ -1168,6 +1168,7 @@ static void init_all_refs_cb(struct all_refs_cb *cb, 
struct rev_info *revs,
 {
cb->all_revs = revs;
cb->all_flags = flags;
+   revs->rev_input_given = 1;
 }
 
 void clear_ref_exclusion(struct string_list **ref_excludes_p)
diff --git a/revision.h b/revision.h
index f96e7f7f4..c8f4e91f2 100644
--- a/revision.h
+++ b/revision.h
@@ -71,6 +71,13 @@ struct rev_info {
const char *def;
struct pathspec prune_data;
 
+   /*
+* Whether the arguments parsed by setup_revisions() included any
+* "input" revisions that might still have yielded an empty pending
+* list (e.g., patterns like "--all" or "--glob").
+*/
+   int rev_input_given;
+
/* topo-sort */
enum rev_sort_order sort_order;
 
-- 
2.14.0.rc1.586.g00244b0b6



[PATCH 1/4] t6018: flesh out empty input/output rev-list tests

2017-08-02 Thread Jeff King
In 751a2ac6e (rev-list --exclude: tests, 2013-11-01), we
added a few tests for handling "empty" inputs with rev-list
(i.e., where the user gave us some pattern but it turned out
not to queue any objects for traversal), all of which were
marked as failing.

In preparation for working on this area of the code, let's
give each test a more descriptive name. Let's also include
one more case which we should cover: feeding a --glob
pattern that doesn't match anything.

We can also drop the explanatory comment; we'll be
converting these to expect_success in the next few patches,
so the discussion isn't necessary.

Signed-off-by: Jeff King 
---
 t/t6018-rev-list-glob.sh | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index 381f35ed1..f8367b829 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -255,27 +255,19 @@ test_expect_success 'rev-list accumulates multiple 
--exclude' '
compare rev-list "--exclude=refs/remotes/* --exclude=refs/tags/* --all" 
--branches
 '
 
-
-# "git rev-list" is likely to be a bug in the calling script and may
-# deserve an error message, but do cases where set of refs programmatically
-# given using globbing and/or --stdin need to fail with the same error, or
-# are we better off reporting a success with no output?  The following few
-# tests document the current behaviour to remind us that we might want to
-# think about this issue.
-
-test_expect_failure 'rev-list may want to succeed with empty output on no 
input (1)' '
+test_expect_failure 'rev-list should succeed with empty output on empty stdin' 
'
>expect &&
git rev-list --stdin actual &&
test_cmp expect actual
 '
 
-test_expect_failure 'rev-list may want to succeed with empty output on no 
input (2)' '
+test_expect_failure 'rev-list should succeed with empty output with all refs 
excluded' '
>expect &&
git rev-list --exclude=* --all >actual &&
test_cmp expect actual
 '
 
-test_expect_failure 'rev-list may want to succeed with empty output on no 
input (3)' '
+test_expect_failure 'rev-list should succeed with empty output with empty 
--all' '
(
test_create_repo empty &&
cd empty &&
@@ -285,6 +277,12 @@ test_expect_failure 'rev-list may want to succeed with 
empty output on no input
)
 '
 
+test_expect_failure 'rev-list should succeed with empty output with empty 
glob' '
+   >expect &&
+   git rev-list --glob=does-not-match-anything >actual &&
+   test_cmp expect actual
+'
+
 test_expect_success 'shortlog accepts --glob/--tags/--remotes' '
 
compare shortlog "subspace/one subspace/two" --branches=subspace &&
-- 
2.14.0.rc1.586.g00244b0b6



[PATCH 0/4] handling empty inputs in the revision machinery

2017-08-02 Thread Jeff King
I noticed that:

  git log --tags=does-not-exist

will show all of HEAD, which is rather confusing. This fixes it, and
also hits several other cases that were marked as expect_failure for
rev-list. There is one case it doesn't handle: --stdin. It's not clear
to me what the right behavior is there. I'll follow up with more
discussion.

  [1/4]: t6018: flesh out empty input/output rev-list tests
  [2/4]: revision: add rev_input_given flag
  [3/4]: rev-list: don't show usage when we see empty ref patterns
  [4/4]: revision: do not fallback to default when rev_input_given is set

 builtin/rev-list.c   |  3 ++-
 revision.c   |  3 ++-
 revision.h   |  7 +++
 t/t4202-log.sh   |  6 ++
 t/t6018-rev-list-glob.sh | 20 +---
 5 files changed, 26 insertions(+), 13 deletions(-)

-Peff


Re: Git log --tags isn't working as expected

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 01:28:38PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > By the way, I noticed something funny that I forgot to mention:
> >
> >   git log --tags=does-not-exist
> >
> > shows HEAD, because the logic to kick in the default only asks "were we
> > given any refs to start the traversal?". I think it probably should
> > consider a wildcard with no matches to override the default, and show
> > nothing.
> >
> > I haven't decided if it should be empty-but-success, or give an error.
> 
> I agree.  The same for --branches, --glob, etc.  I'd say it should
> behave just like "git log HEAD..HEAD".

Thanks for confirming. That's the way I lean, too. And I found some
related tests for rev-list in t6018 that expect that, too (but are
currently marked to fail).

I'll send some patches in a minute, but I'll start a new thread, as
we've drifted off-topic for this one.

-Peff


Re: [PATCH v2 0/5] Fsck for lazy objects, and (now) actual invocation of loader

2017-08-02 Thread Jonathan Nieder
Junio C Hamano wrote:
> Jonathan Nieder  writes:

>> Can you spell this out more?  To be clear, are you speaking as a
>> reviewer or as the project maintainer?  In other words, if other
>> reviewers are able to settle on a design that involves a relaxed
>> guarantee for fsck in this mode that they can agree on, does this
>> represent a veto meaning the patch can still not go through?
>
> Consider it a veto over punting without making sure that we can
> later come up with a solution to give such a guarantee.  I am not
> getting a feeling that "other reviewers" are even seeking a "relaxed
> guarantee"---all I've seen in the thread is to give up any guarantee
> and to hope for the best.

Thank you.  That makes sense.

In my defense, one reason I had for being okay with dropping the
connectivity check in the "lazy object" setup (at a higher level than
this patch currently does it, to avoid wasted work) is that this patch
series does not include the required components to do it more properly
and previous discussions on list had pointed to some of those
components that will arrive later (the "object size cache", which
doubles as an incomplete list of promises).  But that doesn't put the
project in a good position because it isn't an explicitly spelled out
plan.

The set of other reviewers that I was hoping will weigh in at some
point is the GVFS team at Microsoft.

I'll write up a summary of the ideas discussed so far to try to get
this unblocked.

Sincerely,
Jonathan


Re: [PATCH 6/6] cherry-pick/revert: reject --rerere-autoupdate when continuing

2017-08-02 Thread Junio C Hamano
Phillip Wood  writes:

> From: Phillip Wood 
>
> cherry-pick and revert should not accept --[no-]rerere-autoupdate once
> they have started.

Hmph, why shouldn't they?  In other words, shouldn't the usual "try
to carry forward from the original invocation (saved in the state
file), but allow overriding from the command line" rule apply?

I took a brief look at earlier steps and I thought 1-4/6 were all
good.  I am not sure (yet) about these last two patches.

Thanks for working on this topic.


Re: git svn fetch --localtime produces wrong commit times

2017-08-02 Thread Urs Thuermann
Urs Thuermann  writes:

> I could find the bug grepping through /usr/lib/git-core/git-svn, maybe
> it's in GIT::SVN::Fetcher or some other GIT::SVN module.

Oops, that should be "could *not* find", of course.

urs


Re: [PATCH v1 1/1] correct apply for files commited with CRLF

2017-08-02 Thread Junio C Hamano
tbo...@web.de writes:

> From: Torsten Bögershausen 
>
> git apply does not find the source lines when files have CRLF in the index
> and core.autocrlf is true:
> These files should not get the CRLF converted to LF. Because cmd_apply()
> does not load the index, this does not work, CRLF are converted into LF
> and apply fails.
>
> Fix this in the spirit of commit a08feb8ef0b6,
> "correct blame for files commited with CRLF" by loading the index.
>
> As an optimization, skip read_cache() when no conversion is specified
> for this path.

What happens when the input to apply is a patch to create a new file
and the patch uses CRLF line endings?  In such a case, shouldn't
core.autocrlf=true cause the patch to be converted to LF and then
succeed in applying?  An extra read_cache() would not help as there
must be no existing index entry for the path in such a case.

If you did "rm .git/index" after you did the "git checkout -- ." in
your test patch, "git apply" ought to succeed, as it is working as
a substitute for "patch -p1" and should not be consulting the index
at all.

I cannot shake this feeling that it is convert_to_git() that needs
to be fixed so that it does not need to refer to the current
contents in the index.  Why does it even need to look at the index?
If it is for the "safe CRLF" thing (which I would think is misguided),
shouldn't it be checking the original file in the working tree, not
the index?  After all, that is what we are patching, not the contents
stored in the index.

>
> Reported-by: Anthony Sottile 
> Signed-off-by: Torsten Bögershausen 
> ---
>  apply.c |  2 ++
>  t/t0020-crlf.sh | 12 
>  2 files changed, 14 insertions(+)
>
> diff --git a/apply.c b/apply.c
> index f2d599141d..66b8387360 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -2278,6 +2278,8 @@ static int read_old_data(struct stat *st, const char 
> *path, struct strbuf *buf)
>   case S_IFREG:
>   if (strbuf_read_file(buf, path, st->st_size) != st->st_size)
>   return error(_("unable to open or read %s"), path);
> + if (would_convert_to_git(_index, path))
> + read_cache();
>   convert_to_git(_index, path, buf->buf, buf->len, buf, 0);
>   return 0;
>   default:
> diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh
> index 71350e0657..6611f8a6f6 100755
> --- a/t/t0020-crlf.sh
> +++ b/t/t0020-crlf.sh
> @@ -386,4 +386,16 @@ test_expect_success 'New CRLF file gets LF in repo' '
>   test_cmp alllf alllf2
>  '
>  
> +test_expect_success 'CRLF in repo, apply with autocrlf=true' '
> + git config core.autocrlf false &&
> + printf "1\r\n2\r\n" >crlf &&
> + git add crlf &&
> + git commit -m "commit crlf with crlf" &&
> + git config core.autocrlf true &&
> + printf "1\r\n2\r\n\r\n\r\n\r\n" >crlf &&
> + git diff >patch &&
> + git checkout -- . &&
> + git apply patch
> +'
> +
>  test_done


Re: core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch`

2017-08-02 Thread Junio C Hamano
Torsten Bögershausen  writes:

>   My very first investigation shows that a patch like this could fix
> the problem:
>
> diff --git a/apply.c b/apply.c
> index f2d599141d..66b8387360 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -2278,6 +2278,8 @@ static int read_old_data(struct stat *st, const
> char *path, struct strbuf *buf)
> case S_IFREG:
> if (strbuf_read_file(buf, path, st->st_size) !=
> st->st_size)
> return error(_("unable to open or read %s"), path);
> +   if (would_convert_to_git(_index, path))
> +   read_cache();
> convert_to_git(_index, path, buf->buf, buf->len,
> buf, 0);

Sorry, but it is unclear why this is a "fix" to me.  We may not even
be doing "git apply --index" or "git apply --cached" that care about
the state recorded in the index, but just the plain vanilla "git apply",
which could even be outside any repository.


Re: [PATCH v2 0/5] Fsck for lazy objects, and (now) actual invocation of loader

2017-08-02 Thread Junio C Hamano
Jonathan Nieder  writes:

> Junio C Hamano wrote:
>> Jonathan Tan  writes:
>
>>> One possibility to conceptually have the same thing without the overhead
>>> of the list is to put the obtained-from-elsewhere objects into its own
>>> alternate object store, so that we can distinguish the two.
>>
>> Now you are talking.  Either a separate object store, or a packfile
>> that is specially marked as such, would work.
>
> Jonathan's not in today, so let me say a few more words about this
> approach.
>
> This approach implies a relaxed connectivity guarantee, by creating
> two classes of objects:
>
>  1. Objects that I made should satisfy the connectivity check.  They
> can point to other objects I made, objects I fetched, or (*) objects
> pointed to directly by objects I fetched.  More on (*) below.

Or objects that are referred to by objects I fetched.  If you
narrowly clone while omitting a subdirectory, updated a file
that is outside the subdirectory, and created a new commit, while
recording the same tree object name for the directory you do not
know its contents (becaues you didn't fetch), then it is OK for the
top-level tree of the resulting commit you created to be pointing
at the tree that represents the subdirectory you never touched.

> The complication is in the "git gc" operation for the case (*).
> Today, "git gc" uses a reachability walk to decide which objects to
> remove --- an object referenced by no other object is fair game to
> remove.  With (*), there is another kind of object that must not be
> removed: if an object that I made, M, points to a missing/promised
> object, O, pointed to by a an object I fetched, F, then I cannot prune
> F unless there is another fetched object present to anchor O.

Absolutely.  Lazy-objects support comes with certain cost and this
is one of them.  

But I do not think it is realistic to expect that you can prune
anything you fetched from the "other place" (i.e. the source
'lazy-objects' hook reads from).  After all, once they give out
objects to their clients (like us in this case), they cannot prune
it, if we take the "implicit promise" approach to avoid the cost to
transmit and maintain a separate "object list".

> For example: suppose I have a sparse checkout and run
>
>   git fetch origin refs/pulls/x
>   git checkout -b topic FETCH_HEAD
>   echo "Some great modification" >> README
>   git add README
>   git commit --amend
>
> When I run "git gc", there is nothing pointing to the commit that was
> pointed to by the remote ref refs/pulls/x, so it can be pruned.  I
> would naively also expect that the tree pointed to by that commit
> could be pruned.  But pruning it means pruning the promise that made
> it permissible to lack various blobs that my topic branch refers to
> that are outside the sparse checkout area.  So "git gc" must notice
> that it is not safe to prune that tree.
>
> This feels hacky.  I prefer the promised object list over this
> approach.

I think they are moral equivalents implemented differently with
different assumptions.  The example we are discussing makes an extra
assumption: In order to reduce the cost of transferring and
maintaining the list, we assume that all objects that came during
that transfer are implicitly "promised", i.e. everything behind each
of these objects will later be available on demand.  How these
objects are marked is up to the exact mechanism (my preference is to
mark the resulting packfile as special; Jon Tan's message to which
my message was a resopnse alluded to using an alternate object
store).  If you choose to maintain a separate "object list" and have
the "other side" explicitly give it, perhaps you can lift that
assumption and replace it with some other assumption that assumes
less.

> Can you spell this out more?  To be clear, are you speaking as a
> reviewer or as the project maintainer?  In other words, if other
> reviewers are able to settle on a design that involves a relaxed
> guarantee for fsck in this mode that they can agree on, does this
> represent a veto meaning the patch can still not go through?

Consider it a veto over punting without making sure that we can
later come up with a solution to give such a guarantee.  I am not
getting a feeling that "other reviewers" are even seeking a "relaxed
guarantee"---all I've seen in the thread is to give up any guarantee
and to hope for the best.



[PATCH v1 1/1] correct apply for files commited with CRLF

2017-08-02 Thread tboegi
From: Torsten Bögershausen 

git apply does not find the source lines when files have CRLF in the index
and core.autocrlf is true:
These files should not get the CRLF converted to LF. Because cmd_apply()
does not load the index, this does not work, CRLF are converted into LF
and apply fails.

Fix this in the spirit of commit a08feb8ef0b6,
"correct blame for files commited with CRLF" by loading the index.

As an optimization, skip read_cache() when no conversion is specified
for this path.

Reported-by: Anthony Sottile 
Signed-off-by: Torsten Bögershausen 
---
 apply.c |  2 ++
 t/t0020-crlf.sh | 12 
 2 files changed, 14 insertions(+)

diff --git a/apply.c b/apply.c
index f2d599141d..66b8387360 100644
--- a/apply.c
+++ b/apply.c
@@ -2278,6 +2278,8 @@ static int read_old_data(struct stat *st, const char 
*path, struct strbuf *buf)
case S_IFREG:
if (strbuf_read_file(buf, path, st->st_size) != st->st_size)
return error(_("unable to open or read %s"), path);
+   if (would_convert_to_git(_index, path))
+   read_cache();
convert_to_git(_index, path, buf->buf, buf->len, buf, 0);
return 0;
default:
diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh
index 71350e0657..6611f8a6f6 100755
--- a/t/t0020-crlf.sh
+++ b/t/t0020-crlf.sh
@@ -386,4 +386,16 @@ test_expect_success 'New CRLF file gets LF in repo' '
test_cmp alllf alllf2
 '
 
+test_expect_success 'CRLF in repo, apply with autocrlf=true' '
+   git config core.autocrlf false &&
+   printf "1\r\n2\r\n" >crlf &&
+   git add crlf &&
+   git commit -m "commit crlf with crlf" &&
+   git config core.autocrlf true &&
+   printf "1\r\n2\r\n\r\n\r\n\r\n" >crlf &&
+   git diff >patch &&
+   git checkout -- . &&
+   git apply patch
+'
+
 test_done
-- 
2.13.2.533.ge0aaa1b



Re: [RFC PATCH] clone: add clone.recursesubmodules config option

2017-08-02 Thread Stefan Beller
On Wed, Aug 2, 2017 at 11:11 AM, Jeremy Morton  wrote:
> Did this ever get anywhere?  If not why not?  It would be very useful to me
> to be able to clone recursively by default, especially considering you can't
> use 'alias' to override the existing 'clone' command.
>

Note that there is 3c548de378 (Merge branch 'sb/submodule-blanket-recursive',
2017-06-13), which adds recursing into submodules to a couple of commands.

clone is not one of them, because at that time I thought you'd want to select
explicitly at clone time which submodules you want. Unlike most other commands
that can recurse into submodules, clone supports a pathspec for the recurse
parameter, such that you can express a fine grained selection of submodules
that you are interested in.

I wonder if submodule.recurse is set if we'd just want to recurse into
all submodules for clone? That may have negative consequences though
as people may have forgotten that they set that config a long time ago and then
are surprised to get so many submodules.


Re: Git log --tags isn't working as expected

2017-08-02 Thread Junio C Hamano
Jeff King  writes:

> By the way, I noticed something funny that I forgot to mention:
>
>   git log --tags=does-not-exist
>
> shows HEAD, because the logic to kick in the default only asks "were we
> given any refs to start the traversal?". I think it probably should
> consider a wildcard with no matches to override the default, and show
> nothing.
>
> I haven't decided if it should be empty-but-success, or give an error.

I agree.  The same for --branches, --glob, etc.  I'd say it should
behave just like "git log HEAD..HEAD".




Re: reftable [v4]: new ref storage format

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 12:50:39PM -0700, Junio C Hamano wrote:

> With the traditional "packed-refs plus loose" layout, no matter how
> many times a handful of selected busy refs are updated during the
> day, you'd need to open at most two files to find out the current
> value of a single ref (admittedly, the accessing of the second file,
> after we realize that there is no loose one, would be very costly).
> If you make a few commits on a topic branch A, then build a 100
> commit series on top of another topic branch B, finding the current
> value of A is still one open and read of refs/heads/A.
> 
> With the reftable format, we'd need to open and read all 100
> incremental transactions that touch branch B before realizing that
> none of them talk about A, and read the next transaction file to
> find the current value of A.  To keep this number low, we'd need
> quite a frequent compaction.

I think this is where compaction cleverness can come in.

One relatively easy optimization would be to note when the most recent
reftable contains a subset of the refs we are currently updating (and
the obvious string-of-updates to a single ref falls under that), and do
a "quick" compaction where we simply drop[1] that reftable in favor of
ours. That compaction is essentially free, because we know those entries
aren't valid anymore anyway.

I'm actually not sure if this is a strict "drop", though, because of
reflogs. If the reflog is written into the same file as the ref update,
then you'd need to roll its entry into your new update, too. But see
below anyway.

For more complicated cases, there's some cleverness you can do with
small on-the-fly compactions. Even if there are entries in the last few
reftables that we're not currently updating, it's pretty cheap to roll a
few of them up into our new reftable if it lets us drop some
intermediate reftables. E.g., if we're writing a new reftable with a 4K
block size but only have 100 bytes of new data, we're probably best to
roll up a previous 500-byte reftable.

That one's an obvious optimization because we know that the filesystem
is going to make us spend 4K either way, so rounding up to that is
generally free-ish.

What's less obvious is when we should roll up a bunch of 4K tables into
one (let's say) 32K table.  I think there's a formula for doing this
geometrically so that the amortized cost of writing stays under a
certain bound (linear?). But I haven't thought it through (or looked it
up); I was hoping Shawn had already done so and could dump that wisdom
on us.

> We can just declare that reftable format is not for desktop clients
> but for server implementations where frequent compaction would not
> be an annoyance to the users, but I'd wish we do not have to.

Yeah, I agree we should avoid that. I would love to eventually kill off
the loose/packed backend (or at least make it historical and read-only,
so that a reasonable response to people complaining about its races and
lack of atomicity is "please upgrade to reftables").

-Peff


Re: reftable [v4]: new ref storage format

2017-08-02 Thread Stefan Beller
> ### Ref block format
>
> A ref block is written as:
>
> 'r'
> uint24( block_len )
> ref_record+
> uint32( restart_offset )+
> uint16( restart_count )
> padding?
>

So I learned that your current writer is a two block pass,
i.e. the block is first written into memory and then once
the block looks complete it is written out to disk.

This would allow us to shuffle the data around during
the actual out-to-disk-phase, such as this:

  'r'
  uint24( restart_count )
  uint32( restart_offset )+
  ref_record+
  ref_record_endmarker
  padding?

(A) In nearby emails we discussed to have the restart offsets
to be 24 bit, but now they are 32-bit aligned to the start of a block
so we could keep them 32 bit for simplicity of reading.

(B) Note how there is no block_len encoding, which was originally
only needed to lookup the position of restart_count. (so even for that
we could rename it to padding_len, such that the position of
restart_count can be decoded easily)

We no longer need the block_len as the restart_count comes right
after the 'r'.

Instead we'll have a ref_record_endmarker that reads as a ref
with both prefix and suffix to '0', type deletion (such that there is
no further cost). The end marker would only need two '0's, which
makes it indistinguishable from padding.


Re: reftable [v4]: new ref storage format

2017-08-02 Thread Junio C Hamano
Shawn Pearce  writes:

> ### Layout
>
> The `$GIT_DIR/refs` path is a file when reftable is configured, not a
> directory.  This prevents loose references from being stored.
>
> A collection of reftable files are stored in the `$GIT_DIR/reftable/`
> directory:
>
> 0001_UF4paF
> 0002_bUVgy4
>
> where reftable files are named by a unique name such as produced by
> the function:
>
> mktemp "${update_index}_XX"
>
> The stack ordering file is `$GIT_DIR/refs` and lists the current
> files, one per line, in order, from oldest (base) to newest (most
> recent):
>
> $ cat .git/refs
> 0001_UF4paF
> 0002_bUVgy4
>
> Readers must read `$GIT_DIR/refs` to determine which files are
> relevant right now, and search through the stack in reverse order
> (last reftable is examined first).
>
> Reftable files not listed in `refs` may be new (and about to be added
> to the stack by the active writer), or ancient and ready to be pruned.

I like the general idea, what the file format can represent and how
it does so, but I am a bit uneasy about how well this "stacked" part
would work for desktop clients.  The structure presented here is for
optimizing the "we want to learn about many (or all) refs" access
pattern, which probably matters a lot on the server implementations,
but I do not feel comfortable without knowing how much it penalizes
"I want the current value of this single ref" access pattern.

With the traditional "packed-refs plus loose" layout, no matter how
many times a handful of selected busy refs are updated during the
day, you'd need to open at most two files to find out the current
value of a single ref (admittedly, the accessing of the second file,
after we realize that there is no loose one, would be very costly).
If you make a few commits on a topic branch A, then build a 100
commit series on top of another topic branch B, finding the current
value of A is still one open and read of refs/heads/A.

With the reftable format, we'd need to open and read all 100
incremental transactions that touch branch B before realizing that
none of them talk about A, and read the next transaction file to
find the current value of A.  To keep this number low, we'd need
quite a frequent compaction.

We can just declare that reftable format is not for desktop clients
but for server implementations where frequent compaction would not
be an annoyance to the users, but I'd wish we do not have to.





[PATCH v4 10/10] grep: recurse in-process using 'struct repository'

2017-08-02 Thread Brandon Williams
Convert grep to use 'struct repository' which enables recursing into
submodules to be handled in-process.

Signed-off-by: Brandon Williams 
---
 Documentation/git-grep.txt |   7 -
 builtin/grep.c | 396 ++---
 cache.h|   1 -
 git.c  |   2 +-
 grep.c |  13 --
 grep.h |   1 -
 setup.c|  12 +-
 7 files changed, 88 insertions(+), 344 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 5033483db..720c7850e 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -95,13 +95,6 @@ OPTIONS
 option the prefix of all submodule output will be the name of
the parent project's  object.
 
---parent-basename ::
-   For internal use only.  In order to produce uniform output with the
-   --recurse-submodules option, this option can be used to provide the
-   basename of a parent's  object to a submodule so the submodule
-   can prefix its output with the parent's name rather than the SHA1 of
-   the submodule.
-
 -a::
 --text::
Process binary files as if they were text.
diff --git a/builtin/grep.c b/builtin/grep.c
index 7e79eb1a7..cd0e51f3c 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -28,13 +28,7 @@ static char const * const grep_usage[] = {
NULL
 };
 
-static const char *super_prefix;
 static int recurse_submodules;
-static struct argv_array submodule_options = ARGV_ARRAY_INIT;
-static const char *parent_basename;
-
-static int grep_submodule_launch(struct grep_opt *opt,
-const struct grep_source *gs);
 
 #define GREP_NUM_THREADS_DEFAULT 8
 static int num_threads;
@@ -186,10 +180,7 @@ static void *run(void *arg)
break;
 
opt->output_priv = w;
-   if (w->source.type == GREP_SOURCE_SUBMODULE)
-   hit |= grep_submodule_launch(opt, >source);
-   else
-   hit |= grep_source(opt, >source);
+   hit |= grep_source(opt, >source);
grep_source_clear_data(>source);
work_done(w);
}
@@ -327,21 +318,13 @@ static int grep_oid(struct grep_opt *opt, const struct 
object_id *oid,
 {
struct strbuf pathbuf = STRBUF_INIT;
 
-   if (super_prefix) {
-   strbuf_add(, filename, tree_name_len);
-   strbuf_addstr(, super_prefix);
-   strbuf_addstr(, filename + tree_name_len);
+   if (opt->relative && opt->prefix_length) {
+   quote_path_relative(filename + tree_name_len, opt->prefix, 
);
+   strbuf_insert(, 0, filename, tree_name_len);
} else {
strbuf_addstr(, filename);
}
 
-   if (opt->relative && opt->prefix_length) {
-   char *name = strbuf_detach(, NULL);
-   quote_path_relative(name + tree_name_len, opt->prefix, 
);
-   strbuf_insert(, 0, name, tree_name_len);
-   free(name);
-   }
-
 #ifndef NO_PTHREADS
if (num_threads) {
add_work(opt, GREP_SOURCE_OID, pathbuf.buf, path, oid);
@@ -366,15 +349,10 @@ static int grep_file(struct grep_opt *opt, const char 
*filename)
 {
struct strbuf buf = STRBUF_INIT;
 
-   if (super_prefix)
-   strbuf_addstr(, super_prefix);
-   strbuf_addstr(, filename);
-
-   if (opt->relative && opt->prefix_length) {
-   char *name = strbuf_detach(, NULL);
-   quote_path_relative(name, opt->prefix, );
-   free(name);
-   }
+   if (opt->relative && opt->prefix_length)
+   quote_path_relative(filename, opt->prefix, );
+   else
+   strbuf_addstr(, filename);
 
 #ifndef NO_PTHREADS
if (num_threads) {
@@ -421,284 +399,89 @@ static void run_pager(struct grep_opt *opt, const char 
*prefix)
exit(status);
 }
 
-static void compile_submodule_options(const struct grep_opt *opt,
- const char **argv,
- int cached, int untracked,
- int opt_exclude, int use_index,
- int pattern_type_arg)
-{
-   struct grep_pat *pattern;
-
-   if (recurse_submodules)
-   argv_array_push(_options, "--recurse-submodules");
-
-   if (cached)
-   argv_array_push(_options, "--cached");
-   if (!use_index)
-   argv_array_push(_options, "--no-index");
-   if (untracked)
-   argv_array_push(_options, "--untracked");
-   if (opt_exclude > 0)
-   argv_array_push(_options, "--exclude-standard");
-
-   if (opt->invert)
-   argv_array_push(_options, "-v");
-   if (opt->ignore_case)
-   argv_array_push(_options, "-i");
-   if 

[PATCH v4 05/10] submodule: remove submodule.fetchjobs from submodule-config parsing

2017-08-02 Thread Brandon Williams
The '.gitmodules' file should only contain information pertinent to
configuring individual submodules (name to path mapping, URL where to
obtain the submodule, etc.) while other configuration like the number of
jobs to use when fetching submodules should be a part of the
repository's config.

Remove the 'submodule.fetchjobs' configuration option from the general
submodule-config parsing and instead rely on using the
'config_from_gitmodules()' in order to maintain backwards compatibility
with this config being placed in the '.gitmodules' file.

Signed-off-by: Brandon Williams 
---
 builtin/fetch.c | 18 +-
 builtin/submodule--helper.c | 17 +
 submodule-config.c  |  8 
 submodule-config.h  |  1 +
 submodule.c | 16 +---
 submodule.h |  1 -
 6 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index c87e59f3b..ade092bf8 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -39,7 +39,7 @@ static int prune = -1; /* unspecified */
 static int all, append, dry_run, force, keep, multiple, update_head_ok, 
verbosity, deepen_relative;
 static int progress = -1;
 static int tags = TAGS_DEFAULT, unshallow, update_shallow, deepen;
-static int max_children = -1;
+static int max_children = 1;
 static enum transport_family family;
 static const char *depth;
 static const char *deepen_since;
@@ -68,9 +68,24 @@ static int git_fetch_config(const char *k, const char *v, 
void *cb)
recurse_submodules = r;
}
 
+   if (!strcmp(k, "submodule.fetchjobs")) {
+   max_children = parse_submodule_fetchjobs(k, v);
+   return 0;
+   }
+
return git_default_config(k, v, cb);
 }
 
+static int gitmodules_fetch_config(const char *var, const char *value, void 
*cb)
+{
+   if (!strcmp(var, "submodule.fetchjobs")) {
+   max_children = parse_submodule_fetchjobs(var, value);
+   return 0;
+   }
+
+   return 0;
+}
+
 static int parse_refmap_arg(const struct option *opt, const char *arg, int 
unset)
 {
ALLOC_GROW(refmap_array, refmap_nr + 1, refmap_alloc);
@@ -1311,6 +1326,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
for (i = 1; i < argc; i++)
strbuf_addf(_rla, " %s", argv[i]);
 
+   config_from_gitmodules(gitmodules_fetch_config, NULL);
git_config(git_fetch_config, NULL);
 
argc = parse_options(argc, argv, prefix,
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6abdad329..6d9600d4f 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -960,10 +960,19 @@ static int update_clone_task_finished(int result,
return 0;
 }
 
+static int gitmodules_update_clone_config(const char *var, const char *value,
+ void *cb)
+{
+   int *max_jobs = cb;
+   if (!strcmp(var, "submodule.fetchjobs"))
+   *max_jobs = parse_submodule_fetchjobs(var, value);
+   return 0;
+}
+
 static int update_clone(int argc, const char **argv, const char *prefix)
 {
const char *update = NULL;
-   int max_jobs = -1;
+   int max_jobs = 1;
struct string_list_item *item;
struct pathspec pathspec;
struct submodule_update_clone suc = SUBMODULE_UPDATE_CLONE_INIT;
@@ -1000,6 +1009,9 @@ static int update_clone(int argc, const char **argv, 
const char *prefix)
};
suc.prefix = prefix;
 
+   config_from_gitmodules(gitmodules_update_clone_config, _jobs);
+   git_config(gitmodules_update_clone_config, _jobs);
+
argc = parse_options(argc, argv, prefix, module_update_clone_options,
 git_submodule_helper_usage, 0);
 
@@ -1017,9 +1029,6 @@ static int update_clone(int argc, const char **argv, 
const char *prefix)
gitmodules_config();
git_config(submodule_config, NULL);
 
-   if (max_jobs < 0)
-   max_jobs = parallel_submodules();
-
run_processes_parallel(max_jobs,
   update_clone_get_next_task,
   update_clone_start_failure,
diff --git a/submodule-config.c b/submodule-config.c
index 5fe2d0787..70400f553 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -248,6 +248,14 @@ static int parse_fetch_recurse(const char *opt, const char 
*arg,
}
 }
 
+int parse_submodule_fetchjobs(const char *var, const char *value)
+{
+   int fetchjobs = git_config_int(var, value);
+   if (fetchjobs < 0)
+   die(_("negative values not allowed for submodule.fetchjobs"));
+   return fetchjobs;
+}
+
 int parse_fetch_recurse_submodules_arg(const char *opt, const char *arg)
 {
return parse_fetch_recurse(opt, arg, 1);
diff --git a/submodule-config.h b/submodule-config.h
index 233bfcb7f..995d404f8 100644
--- a/submodule-config.h

[PATCH v4 09/10] submodule: merge repo_read_gitmodules and gitmodules_config

2017-08-02 Thread Brandon Williams
Since 69aba5329 (submodule: add repo_read_gitmodules) there have been
two ways to load a repository's .gitmodules file:
'repo_read_gitmodules()' is used if you have a repository object you are
working with or 'gitmodules_config()' if you are implicitly working with
'the_repository'.  Merge the logic of these two functions to remove
duplicate code.

In addition, 'repo_read_gitmodules()' can segfault by passing in a NULL
pointer to 'git_config_from_file()' if a repository doesn't have a
worktree.  Instead check for the existence of a worktree before
attempting to load the .gitmodules file.

Signed-off-by: Brandon Williams 
---
 submodule.c | 37 +
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/submodule.c b/submodule.c
index 3b0e70c51..9d5eacaf9 100644
--- a/submodule.c
+++ b/submodule.c
@@ -230,23 +230,6 @@ void load_submodule_cache(void)
git_config(submodule_config, NULL);
 }
 
-void gitmodules_config(void)
-{
-   const char *work_tree = get_git_work_tree();
-   if (work_tree) {
-   struct strbuf gitmodules_path = STRBUF_INIT;
-   strbuf_addstr(_path, work_tree);
-   strbuf_addstr(_path, "/" GITMODULES_FILE);
-   if (read_cache() < 0)
-   die("index file corrupt");
-
-   if (!is_gitmodules_unmerged(_index))
-   git_config_from_file(git_modules_config,
-   gitmodules_path.buf, NULL);
-   strbuf_release(_path);
-   }
-}
-
 static int gitmodules_cb(const char *var, const char *value, void *data)
 {
struct repository *repo = data;
@@ -255,10 +238,24 @@ static int gitmodules_cb(const char *var, const char 
*value, void *data)
 
 void repo_read_gitmodules(struct repository *repo)
 {
-   char *gitmodules_path = repo_worktree_path(repo, GITMODULES_FILE);
+   if (repo->worktree) {
+   char *gitmodules;
+
+   if (repo_read_index(repo) < 0)
+   return;
 
-   git_config_from_file(gitmodules_cb, gitmodules_path, repo);
-   free(gitmodules_path);
+   gitmodules = repo_worktree_path(repo, GITMODULES_FILE);
+
+   if (!is_gitmodules_unmerged(repo->index))
+   git_config_from_file(gitmodules_cb, gitmodules, repo);
+
+   free(gitmodules);
+   }
+}
+
+void gitmodules_config(void)
+{
+   repo_read_gitmodules(the_repository);
 }
 
 void gitmodules_config_sha1(const unsigned char *commit_sha1)
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH v4 06/10] submodule: remove fetch.recursesubmodules from submodule-config parsing

2017-08-02 Thread Brandon Williams
Remove the 'fetch.recursesubmodules' configuration option from the
general submodule-config parsing and instead rely on using
'config_from_gitmodules()' in order to maintain backwards compatibility
with this config being placed in the '.gitmodules' file.

Signed-off-by: Brandon Williams 
---
 builtin/fetch.c |  8 +++-
 submodule.c | 19 ++-
 submodule.h |  2 +-
 3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ade092bf8..d84c26391 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -71,6 +71,9 @@ static int git_fetch_config(const char *k, const char *v, 
void *cb)
if (!strcmp(k, "submodule.fetchjobs")) {
max_children = parse_submodule_fetchjobs(k, v);
return 0;
+   } else if (!strcmp(k, "fetch.recursesubmodules")) {
+   recurse_submodules = parse_fetch_recurse_submodules_arg(k, v);
+   return 0;
}
 
return git_default_config(k, v, cb);
@@ -81,6 +84,9 @@ static int gitmodules_fetch_config(const char *var, const 
char *value, void *cb)
if (!strcmp(var, "submodule.fetchjobs")) {
max_children = parse_submodule_fetchjobs(var, value);
return 0;
+   } else if (!strcmp(var, "fetch.recursesubmodules")) {
+   recurse_submodules = parse_fetch_recurse_submodules_arg(var, 
value);
+   return 0;
}
 
return 0;
@@ -1355,7 +1361,6 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
deepen = 1;
 
if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
-   set_config_fetch_recurse_submodules(recurse_submodules_default);
gitmodules_config();
git_config(submodule_config, NULL);
}
@@ -1399,6 +1404,7 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
result = fetch_populated_submodules(,
submodule_prefix,
recurse_submodules,
+   recurse_submodules_default,
verbosity < 0,
max_children);
argv_array_clear();
diff --git a/submodule.c b/submodule.c
index aa4fb1eaa..1d9d2ce09 100644
--- a/submodule.c
+++ b/submodule.c
@@ -20,7 +20,6 @@
 #include "worktree.h"
 #include "parse-options.h"
 
-static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
 static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
@@ -160,10 +159,6 @@ static int git_modules_config(const char *var, const char 
*value, void *cb)
 {
if (starts_with(var, "submodule."))
return parse_submodule_config_option(var, value);
-   else if (!strcmp(var, "fetch.recursesubmodules")) {
-   config_fetch_recurse_submodules = 
parse_fetch_recurse_submodules_arg(var, value);
-   return 0;
-   }
return 0;
 }
 
@@ -714,11 +709,6 @@ void show_submodule_inline_diff(FILE *f, const char *path,
clear_commit_marks(right, ~0);
 }
 
-void set_config_fetch_recurse_submodules(int value)
-{
-   config_fetch_recurse_submodules = value;
-}
-
 int should_update_submodules(void)
 {
return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
@@ -1164,10 +1154,11 @@ struct submodule_parallel_fetch {
const char *work_tree;
const char *prefix;
int command_line_option;
+   int default_option;
int quiet;
int result;
 };
-#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0}
+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0}
 
 static int get_next_submodule(struct child_process *cp,
  struct strbuf *err, void *data, void **task_cb)
@@ -1205,10 +1196,10 @@ static int get_next_submodule(struct child_process *cp,
default_argv = "on-demand";
}
} else {
-   if ((config_fetch_recurse_submodules == 
RECURSE_SUBMODULES_OFF) ||
+   if ((spf->default_option == 
RECURSE_SUBMODULES_OFF) ||
gitmodules_is_unmerged)
continue;
-   if (config_fetch_recurse_submodules == 
RECURSE_SUBMODULES_ON_DEMAND) {
+   if (spf->default_option == 
RECURSE_SUBMODULES_ON_DEMAND) {
if 
(!unsorted_string_list_lookup(_submodule_paths, ce->name))
continue;

[PATCH v4 03/10] cache.h: add GITMODULES_FILE macro

2017-08-02 Thread Brandon Williams
Add a macro to be used when specifying the '.gitmodules' file and
convert any existing hard coded '.gitmodules' file strings to use the
new macro.

Signed-off-by: Brandon Williams 
Signed-off-by: Stefan Beller 
---
 cache.h|  1 +
 submodule.c| 20 ++--
 unpack-trees.c |  2 +-
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index 71fe09264..d59f767e2 100644
--- a/cache.h
+++ b/cache.h
@@ -433,6 +433,7 @@ static inline enum object_type object_type(unsigned int 
mode)
 #define GITATTRIBUTES_FILE ".gitattributes"
 #define INFOATTRIBUTES_FILE "info/attributes"
 #define ATTRIBUTE_MACRO_PREFIX "[attr]"
+#define GITMODULES_FILE ".gitmodules"
 #define GIT_NOTES_REF_ENVIRONMENT "GIT_NOTES_REF"
 #define GIT_NOTES_DEFAULT_REF "refs/notes/commits"
 #define GIT_NOTES_DISPLAY_REF_ENVIRONMENT "GIT_NOTES_DISPLAY_REF"
diff --git a/submodule.c b/submodule.c
index 6531c5d60..64ad5c12d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -63,7 +63,7 @@ int update_path_in_gitmodules(const char *oldpath, const char 
*newpath)
struct strbuf entry = STRBUF_INIT;
const struct submodule *submodule;
 
-   if (!file_exists(".gitmodules")) /* Do nothing without .gitmodules */
+   if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
return -1;
 
if (gitmodules_is_unmerged)
@@ -77,7 +77,7 @@ int update_path_in_gitmodules(const char *oldpath, const char 
*newpath)
strbuf_addstr(, "submodule.");
strbuf_addstr(, submodule->name);
strbuf_addstr(, ".path");
-   if (git_config_set_in_file_gently(".gitmodules", entry.buf, newpath) < 
0) {
+   if (git_config_set_in_file_gently(GITMODULES_FILE, entry.buf, newpath) 
< 0) {
/* Maybe the user already did that, don't error out here */
warning(_("Could not update .gitmodules entry %s"), entry.buf);
strbuf_release();
@@ -97,7 +97,7 @@ int remove_path_from_gitmodules(const char *path)
struct strbuf sect = STRBUF_INIT;
const struct submodule *submodule;
 
-   if (!file_exists(".gitmodules")) /* Do nothing without .gitmodules */
+   if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
return -1;
 
if (gitmodules_is_unmerged)
@@ -110,7 +110,7 @@ int remove_path_from_gitmodules(const char *path)
}
strbuf_addstr(, "submodule.");
strbuf_addstr(, submodule->name);
-   if (git_config_rename_section_in_file(".gitmodules", sect.buf, NULL) < 
0) {
+   if (git_config_rename_section_in_file(GITMODULES_FILE, sect.buf, NULL) 
< 0) {
/* Maybe the user already did that, don't error out here */
warning(_("Could not remove .gitmodules entry for %s"), path);
strbuf_release();
@@ -122,7 +122,7 @@ int remove_path_from_gitmodules(const char *path)
 
 void stage_updated_gitmodules(void)
 {
-   if (add_file_to_cache(".gitmodules", 0))
+   if (add_file_to_cache(GITMODULES_FILE, 0))
die(_("staging updated .gitmodules failed"));
 }
 
@@ -230,21 +230,21 @@ void gitmodules_config(void)
struct strbuf gitmodules_path = STRBUF_INIT;
int pos;
strbuf_addstr(_path, work_tree);
-   strbuf_addstr(_path, "/.gitmodules");
+   strbuf_addstr(_path, "/" GITMODULES_FILE);
if (read_cache() < 0)
die("index file corrupt");
-   pos = cache_name_pos(".gitmodules", 11);
+   pos = cache_name_pos(GITMODULES_FILE, 11);
if (pos < 0) { /* .gitmodules not found or isn't merged */
pos = -1 - pos;
if (active_nr > pos) {  /* there is a .gitmodules */
const struct cache_entry *ce = 
active_cache[pos];
if (ce_namelen(ce) == 11 &&
-   !memcmp(ce->name, ".gitmodules", 11))
+   !memcmp(ce->name, GITMODULES_FILE, 11))
gitmodules_is_unmerged = 1;
}
} else if (pos < active_nr) {
struct stat st;
-   if (lstat(".gitmodules", ) == 0 &&
+   if (lstat(GITMODULES_FILE, ) == 0 &&
ce_match_stat(active_cache[pos], , 0) & 
DATA_CHANGED)
gitmodules_is_modified = 1;
}
@@ -264,7 +264,7 @@ static int gitmodules_cb(const char *var, const char 
*value, void *data)
 
 void repo_read_gitmodules(struct repository *repo)
 {
-   char *gitmodules_path = repo_worktree_path(repo, ".gitmodules");
+   char *gitmodules_path = repo_worktree_path(repo, GITMODULES_FILE);
 
git_config_from_file(gitmodules_cb, gitmodules_path, repo);

[PATCH v4 08/10] submodule: check for unmerged .gitmodules outside of config parsing

2017-08-02 Thread Brandon Williams
Add 'is_gitmodules_unmerged()' function which can be used to determine
in the '.gitmodules' file is unmerged based on the passed in index
instead of relying on a global variable which is set during the
submodule-config parsing.

Signed-off-by: Brandon Williams 
---
 submodule.c | 47 +++
 submodule.h |  1 +
 2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/submodule.c b/submodule.c
index 677b5c401..3b0e70c51 100644
--- a/submodule.c
+++ b/submodule.c
@@ -27,14 +27,25 @@ static struct oid_array ref_tips_before_fetch;
 static struct oid_array ref_tips_after_fetch;
 
 /*
- * The following flag is set if the .gitmodules file is unmerged. We then
- * disable recursion for all submodules where .git/config doesn't have a
- * matching config entry because we can't guess what might be configured in
- * .gitmodules unless the user resolves the conflict. When a command line
- * option is given (which always overrides configuration) this flag will be
- * ignored.
+ * Check if the .gitmodules file is unmerged. Parsing of the .gitmodules file
+ * will be disabled because we can't guess what might be configured in
+ * .gitmodules unless the user resolves the conflict.
  */
-static int gitmodules_is_unmerged;
+int is_gitmodules_unmerged(const struct index_state *istate)
+{
+   int pos = index_name_pos(istate, GITMODULES_FILE, 
strlen(GITMODULES_FILE));
+   if (pos < 0) { /* .gitmodules not found or isn't merged */
+   pos = -1 - pos;
+   if (istate->cache_nr > pos) {  /* there is a .gitmodules */
+   const struct cache_entry *ce = istate->cache[pos];
+   if (ce_namelen(ce) == strlen(GITMODULES_FILE) &&
+   !strcmp(ce->name, GITMODULES_FILE))
+   return 1;
+   }
+   }
+
+   return 0;
+}
 
 /*
  * Check if the .gitmodules file has unstaged modifications.  This must be
@@ -71,7 +82,7 @@ int update_path_in_gitmodules(const char *oldpath, const char 
*newpath)
if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
return -1;
 
-   if (gitmodules_is_unmerged)
+   if (is_gitmodules_unmerged(_index))
die(_("Cannot change unmerged .gitmodules, resolve merge 
conflicts first"));
 
submodule = submodule_from_path(null_sha1, oldpath);
@@ -105,7 +116,7 @@ int remove_path_from_gitmodules(const char *path)
if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
return -1;
 
-   if (gitmodules_is_unmerged)
+   if (is_gitmodules_unmerged(_index))
die(_("Cannot change unmerged .gitmodules, resolve merge 
conflicts first"));
 
submodule = submodule_from_path(null_sha1, path);
@@ -156,7 +167,7 @@ void set_diffopt_flags_from_submodule_config(struct 
diff_options *diffopt,
if (submodule) {
if (submodule->ignore)
handle_ignore_submodules_arg(diffopt, 
submodule->ignore);
-   else if (gitmodules_is_unmerged)
+   else if (is_gitmodules_unmerged(_index))
DIFF_OPT_SET(diffopt, IGNORE_SUBMODULES);
}
 }
@@ -224,23 +235,12 @@ void gitmodules_config(void)
const char *work_tree = get_git_work_tree();
if (work_tree) {
struct strbuf gitmodules_path = STRBUF_INIT;
-   int pos;
strbuf_addstr(_path, work_tree);
strbuf_addstr(_path, "/" GITMODULES_FILE);
if (read_cache() < 0)
die("index file corrupt");
-   pos = cache_name_pos(GITMODULES_FILE, 11);
-   if (pos < 0) { /* .gitmodules not found or isn't merged */
-   pos = -1 - pos;
-   if (active_nr > pos) {  /* there is a .gitmodules */
-   const struct cache_entry *ce = 
active_cache[pos];
-   if (ce_namelen(ce) == 11 &&
-   !memcmp(ce->name, GITMODULES_FILE, 11))
-   gitmodules_is_unmerged = 1;
-   }
-   }
 
-   if (!gitmodules_is_unmerged)
+   if (!is_gitmodules_unmerged(_index))
git_config_from_file(git_modules_config,
gitmodules_path.buf, NULL);
strbuf_release(_path);
@@ -1198,8 +1198,7 @@ static int get_next_submodule(struct child_process *cp,
default_argv = "on-demand";
}
} else {
-   if ((spf->default_option == 
RECURSE_SUBMODULES_OFF) ||
-   gitmodules_is_unmerged)
+   if (spf->default_option == 
RECURSE_SUBMODULES_OFF)
   

[PATCH v4 01/10] repo_read_index: don't discard the index

2017-08-02 Thread Brandon Williams
Have 'repo_read_index()' behave more like the other read_index family of
functions and don't discard the index if it has already been populated
and instead rely on the quick return of read_index_from which has:

  /* istate->initialized covers both .git/index and .git/sharedindex.xxx */
  if (istate->initialized)
return istate->cache_nr;

Signed-off-by: Brandon Williams 
---
 repository.c | 2 --
 repository.h | 8 
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/repository.c b/repository.c
index edca90740..8e60af1d5 100644
--- a/repository.c
+++ b/repository.c
@@ -235,8 +235,6 @@ int repo_read_index(struct repository *repo)
 {
if (!repo->index)
repo->index = xcalloc(1, sizeof(*repo->index));
-   else
-   discard_index(repo->index);
 
return read_index_from(repo->index, repo->index_file);
 }
diff --git a/repository.h b/repository.h
index 417787f3e..7f5e24a0a 100644
--- a/repository.h
+++ b/repository.h
@@ -92,6 +92,14 @@ extern int repo_submodule_init(struct repository *submodule,
   const char *path);
 extern void repo_clear(struct repository *repo);
 
+/*
+ * Populates the repository's index from its index_file, an index struct will
+ * be allocated if needed.
+ *
+ * Return the number of index entries in the populated index or a value less
+ * than zero if an error occured.  If the repository's index has already been
+ * populated then the number of entries will simply be returned.
+ */
 extern int repo_read_index(struct repository *repo);
 
 #endif /* REPOSITORY_H */
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH v4 07/10] submodule: check for unstaged .gitmodules outside of config parsing

2017-08-02 Thread Brandon Williams
Teach 'is_staging_gitmodules_ok()' to be able to determine in the
'.gitmodules' file has unstaged changes based on the passed in index
instead of relying on a global variable which is set during the
submodule-config parsing.

Signed-off-by: Brandon Williams 
---
 builtin/mv.c |  2 +-
 builtin/rm.c |  2 +-
 submodule.c  | 32 +---
 submodule.h  |  2 +-
 4 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index dcf6736b5..94fbaaa5d 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -81,7 +81,7 @@ static void prepare_move_submodule(const char *src, int first,
struct strbuf submodule_dotgit = STRBUF_INIT;
if (!S_ISGITLINK(active_cache[first]->ce_mode))
die(_("Directory %s is in index and no submodule?"), src);
-   if (!is_staging_gitmodules_ok())
+   if (!is_staging_gitmodules_ok(_index))
die(_("Please stage your changes to .gitmodules or stash them 
to proceed"));
strbuf_addf(_dotgit, "%s/.git", src);
*submodule_gitfile = read_gitfile(submodule_dotgit.buf);
diff --git a/builtin/rm.c b/builtin/rm.c
index 52826d137..4057e73fa 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -286,7 +286,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
list.entry[list.nr].name = xstrdup(ce->name);
list.entry[list.nr].is_submodule = S_ISGITLINK(ce->ce_mode);
if (list.entry[list.nr++].is_submodule &&
-   !is_staging_gitmodules_ok())
+   !is_staging_gitmodules_ok(_index))
die (_("Please stage your changes to .gitmodules or 
stash them to proceed"));
}
 
diff --git a/submodule.c b/submodule.c
index 1d9d2ce09..677b5c401 100644
--- a/submodule.c
+++ b/submodule.c
@@ -37,18 +37,25 @@ static struct oid_array ref_tips_after_fetch;
 static int gitmodules_is_unmerged;
 
 /*
- * This flag is set if the .gitmodules file had unstaged modifications on
- * startup. This must be checked before allowing modifications to the
- * .gitmodules file with the intention to stage them later, because when
- * continuing we would stage the modifications the user didn't stage herself
- * too. That might change in a future version when we learn to stage the
- * changes we do ourselves without staging any previous modifications.
+ * Check if the .gitmodules file has unstaged modifications.  This must be
+ * checked before allowing modifications to the .gitmodules file with the
+ * intention to stage them later, because when continuing we would stage the
+ * modifications the user didn't stage herself too. That might change in a
+ * future version when we learn to stage the changes we do ourselves without
+ * staging any previous modifications.
  */
-static int gitmodules_is_modified;
-
-int is_staging_gitmodules_ok(void)
+int is_staging_gitmodules_ok(const struct index_state *istate)
 {
-   return !gitmodules_is_modified;
+   int pos = index_name_pos(istate, GITMODULES_FILE, 
strlen(GITMODULES_FILE));
+
+   if ((pos >= 0) && (pos < istate->cache_nr)) {
+   struct stat st;
+   if (lstat(GITMODULES_FILE, ) == 0 &&
+   ce_match_stat(istate->cache[pos], , 0) & DATA_CHANGED)
+   return 0;
+   }
+
+   return 1;
 }
 
 /*
@@ -231,11 +238,6 @@ void gitmodules_config(void)
!memcmp(ce->name, GITMODULES_FILE, 11))
gitmodules_is_unmerged = 1;
}
-   } else if (pos < active_nr) {
-   struct stat st;
-   if (lstat(GITMODULES_FILE, ) == 0 &&
-   ce_match_stat(active_cache[pos], , 0) & 
DATA_CHANGED)
-   gitmodules_is_modified = 1;
}
 
if (!gitmodules_is_unmerged)
diff --git a/submodule.h b/submodule.h
index 29a1ecd19..b14660585 100644
--- a/submodule.h
+++ b/submodule.h
@@ -33,7 +33,7 @@ struct submodule_update_strategy {
 };
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
-extern int is_staging_gitmodules_ok(void);
+extern int is_staging_gitmodules_ok(const struct index_state *istate);
 extern int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 extern int remove_path_from_gitmodules(const char *path);
 extern void stage_updated_gitmodules(void);
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH v4 00/10] Convert grep to recurse in-process

2017-08-02 Thread Brandon Williams
Changes in v4:
 * small typo fix in commit message.
 * convert all occurrences of '.gitmodules' to use the new macro.

Brandon Williams (10):
  repo_read_index: don't discard the index
  repository: have the_repository use the_index
  cache.h: add GITMODULES_FILE macro
  config: add config_from_gitmodules
  submodule: remove submodule.fetchjobs from submodule-config parsing
  submodule: remove fetch.recursesubmodules from submodule-config
parsing
  submodule: check for unstaged .gitmodules outside of config parsing
  submodule: check for unmerged .gitmodules outside of config parsing
  submodule: merge repo_read_gitmodules and gitmodules_config
  grep: recurse in-process using 'struct repository'

 Documentation/git-grep.txt  |   7 -
 builtin/fetch.c |  26 ++-
 builtin/grep.c  | 396 ++--
 builtin/mv.c|   2 +-
 builtin/rm.c|   2 +-
 builtin/submodule--helper.c |  17 +-
 cache.h |   2 +-
 config.c|  17 ++
 config.h|  10 ++
 git.c   |   2 +-
 grep.c  |  13 --
 grep.h  |   1 -
 repository.c|   6 +-
 repository.h|   8 +
 setup.c |  12 +-
 submodule-config.c  |   8 +
 submodule-config.h  |   1 +
 submodule.c | 157 --
 submodule.h |   6 +-
 unpack-trees.c  |   2 +-
 20 files changed, 246 insertions(+), 449 deletions(-)

-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH v4 04/10] config: add config_from_gitmodules

2017-08-02 Thread Brandon Williams
Add 'config_from_gitmodules()' function which can be used by 'fetch' and
'update_clone' in order to maintain backwards compatibility with
configuration being stored in .gitmodules' since a future patch will
remove reading these values in the submodule-config.

This function should not be used anywhere other than in 'fetch' and
'update_clone'.

Signed-off-by: Brandon Williams 
---
 config.c | 17 +
 config.h | 10 ++
 2 files changed, 27 insertions(+)

diff --git a/config.c b/config.c
index 231f9a750..06645a325 100644
--- a/config.c
+++ b/config.c
@@ -2053,6 +2053,23 @@ int git_config_get_pathname(const char *key, const char 
**dest)
return repo_config_get_pathname(the_repository, key, dest);
 }
 
+/*
+ * Note: This function exists solely to maintain backward compatibility with
+ * 'fetch' and 'update_clone' storing configuration in '.gitmodules' and should
+ * NOT be used anywhere else.
+ *
+ * Runs the provided config function on the '.gitmodules' file found in the
+ * working directory.
+ */
+void config_from_gitmodules(config_fn_t fn, void *data)
+{
+   if (the_repository->worktree) {
+   char *file = repo_worktree_path(the_repository, 
GITMODULES_FILE);
+   git_config_from_file(fn, file, data);
+   free(file);
+   }
+}
+
 int git_config_get_expiry(const char *key, const char **output)
 {
int ret = git_config_get_string_const(key, output);
diff --git a/config.h b/config.h
index 0352da117..6998e6645 100644
--- a/config.h
+++ b/config.h
@@ -187,6 +187,16 @@ extern int repo_config_get_maybe_bool(struct repository 
*repo,
 extern int repo_config_get_pathname(struct repository *repo,
const char *key, const char **dest);
 
+/*
+ * Note: This function exists solely to maintain backward compatibility with
+ * 'fetch' and 'update_clone' storing configuration in '.gitmodules' and should
+ * NOT be used anywhere else.
+ *
+ * Runs the provided config function on the '.gitmodules' file found in the
+ * working directory.
+ */
+extern void config_from_gitmodules(config_fn_t fn, void *data);
+
 extern int git_config_get_value(const char *key, const char **value);
 extern const struct string_list *git_config_get_value_multi(const char *key);
 extern void git_config_clear(void);
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH v4 02/10] repository: have the_repository use the_index

2017-08-02 Thread Brandon Williams
Have the index state which is stored in 'the_repository' be a pointer to
the in-core index 'the_index'.  This makes it easier to begin
transitioning more parts of the code base to operate on a 'struct
repository'.

Signed-off-by: Brandon Williams 
---
 repository.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index 8e60af1d5..c0e0e0e7e 100644
--- a/repository.c
+++ b/repository.c
@@ -4,7 +4,9 @@
 #include "submodule-config.h"
 
 /* The main repository */
-static struct repository the_repo;
+static struct repository the_repo = {
+   NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, _index, 0, 0
+};
 struct repository *the_repository = _repo;
 
 static char *git_path_from_env(const char *envvar, const char *git_dir,
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH v3 7/7] git.c: ignore pager.* when launching builtin as dashed external

2017-08-02 Thread Martin Ågren
When running, e.g., `git -c alias.bar=foo bar`, we expand the alias and
execute `git-foo` as a dashed external. This is true even if git foo is
a builtin. That is on purpose, and is motivated in a comment which was
added in commit 441981bc ("git: simplify environment save/restore
logic", 2016-01-26).

Shortly before we launch a dashed external, and unless we have already
found out whether we should use a pager, we check `pager.foo`. This was
added in commit 92058e4d ("support pager.* for external commands",
2011-08-18). If the dashed external is a builtin, this does not match
that commit's intention and is arguably wrong, since it would be cleaner
if we let the "dashed external builtin" handle `pager.foo`.

This has not mattered in practice, but a recent patch taught `git-tag`
to ignore `pager.tag` under certain circumstances. But, when started
using an alias, it doesn't get the chance to do so, as outlined above.
That recent patch added a test to document this breakage.

Do not check `pager.foo` before launching a builtin as a dashed
external, i.e., if we recognize the name of the external as a builtin.
Change the test to use `test_expect_success`.

Signed-off-by: Martin Ågren 
---
 t/t7006-pager.sh | 2 +-
 git.c| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index afa03f3b6..9128ec5ac 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -201,7 +201,7 @@ test_expect_success TTY 'git tag -a respects --paginate' '
test -e paginated.out
 '
 
-test_expect_failure TTY 'git tag as alias ignores pager.tag with -a' '
+test_expect_success TTY 'git tag as alias ignores pager.tag with -a' '
test_when_finished "git tag -d newtag" &&
rm -f paginated.out &&
test_terminal git -c pager.tag -c alias.t=tag t -am message newtag &&
diff --git a/git.c b/git.c
index 82ac2a092..6b6d9f68e 100644
--- a/git.c
+++ b/git.c
@@ -559,7 +559,7 @@ static void execv_dashed_external(const char **argv)
if (get_super_prefix())
die("%s doesn't support --super-prefix", argv[0]);
 
-   if (use_pager == -1)
+   if (use_pager == -1 && !is_builtin(argv[0]))
use_pager = check_pager_config(argv[0]);
commit_pager_choice();
 
-- 
2.14.0.rc1.12.ge2d9c4613



[PATCH v3 6/7] tag: change default of `pager.tag` to "on"

2017-08-02 Thread Martin Ågren
The previous patch taught `git tag` to only respect `pager.tag` in
list-mode. That patch left the default value of `pager.tag` at "off".

After that patch, it makes sense to let the default value be "on"
instead, since it will help with listing many tags, but will not hurt
users of `git tag -a` as it would have before. Make that change. Update
documentation and tests.

Signed-off-by: Martin Ågren 
---
 Documentation/git-tag.txt |  2 +-
 t/t7006-pager.sh  | 28 ++--
 builtin/tag.c |  2 +-
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 875d135e0..d97aad343 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -206,7 +206,7 @@ it in the repository configuration as follows:
 -
 
 `pager.tag` is only respected when listing tags, i.e., when `-l` is
-used or implied.
+used or implied. The default is to use a pager.
 See linkgit:git-config[1].
 
 DISCUSSION
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index 570b2f252..afa03f3b6 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -134,16 +134,16 @@ test_expect_success TTY 'configuration can enable pager 
(from subdir)' '
}
 '
 
-test_expect_success TTY 'git tag -l defaults to not paging' '
+test_expect_success TTY 'git tag -l defaults to paging' '
rm -f paginated.out &&
test_terminal git tag -l &&
-   ! test -e paginated.out
+   test -e paginated.out
 '
 
 test_expect_success TTY 'git tag -l respects pager.tag' '
rm -f paginated.out &&
-   test_terminal git -c pager.tag tag -l &&
-   test -e paginated.out
+   test_terminal git -c pager.tag=false tag -l &&
+   ! test -e paginated.out
 '
 
 test_expect_success TTY 'git tag -l respects --no-pager' '
@@ -152,32 +152,32 @@ test_expect_success TTY 'git tag -l respects --no-pager' '
! test -e paginated.out
 '
 
-test_expect_success TTY 'git tag with no args defaults to not paging' '
+test_expect_success TTY 'git tag with no args defaults to paging' '
# no args implies -l so this should page like -l
rm -f paginated.out &&
test_terminal git tag &&
-   ! test -e paginated.out
+   test -e paginated.out
 '
 
 test_expect_success TTY 'git tag with no args respects pager.tag' '
# no args implies -l so this should page like -l
rm -f paginated.out &&
-   test_terminal git -c pager.tag tag &&
-   test -e paginated.out
+   test_terminal git -c pager.tag=false tag &&
+   ! test -e paginated.out
 '
 
-test_expect_success TTY 'git tag --contains defaults to not paging' '
+test_expect_success TTY 'git tag --contains defaults to paging' '
# --contains implies -l so this should page like -l
rm -f paginated.out &&
test_terminal git tag --contains &&
-   ! test -e paginated.out
+   test -e paginated.out
 '
 
 test_expect_success TTY 'git tag --contains respects pager.tag' '
# --contains implies -l so this should page like -l
rm -f paginated.out &&
-   test_terminal git -c pager.tag tag --contains &&
-   test -e paginated.out
+   test_terminal git -c pager.tag=false tag --contains &&
+   ! test -e paginated.out
 '
 
 test_expect_success TTY 'git tag -a defaults to not paging' '
@@ -210,8 +210,8 @@ test_expect_failure TTY 'git tag as alias ignores pager.tag 
with -a' '
 
 test_expect_success TTY 'git tag as alias respects pager.tag with -l' '
rm -f paginated.out &&
-   test_terminal git -c pager.tag -c alias.t=tag t -l &&
-   test -e paginated.out
+   test_terminal git -c pager.tag=false -c alias.t=tag t -l &&
+   ! test -e paginated.out
 '
 
 # A colored commit log will begin with an appropriate ANSI escape
diff --git a/builtin/tag.c b/builtin/tag.c
index 5ad1af252..ea83df5e1 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -462,7 +462,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
}
 
if (cmdmode == 'l')
-   setup_auto_pager("tag", 0);
+   setup_auto_pager("tag", 1);
 
if ((create_tag_object || force) && (cmdmode != 0))
usage_with_options(git_tag_usage, options);
-- 
2.14.0.rc1.12.ge2d9c4613



[PATCH v3 5/7] tag: respect `pager.tag` in list-mode only

2017-08-02 Thread Martin Ågren
Using, e.g., `git -c pager.tag tag -a new-tag` results in errors such as
"Vim: Warning: Output is not to a terminal" and a garbled terminal.
Someone who makes use of both `git tag -a` and `git tag -l` will
probably not set `pager.tag`, so that `git tag -a` will actually work,
at the cost of not paging output of `git tag -l`.

Use the mechanisms introduced in two earlier patches to ignore
`pager.tag` in git.c and let the `git tag` builtin handle it on its own.
Only respect `pager.tag` when running in list-mode.

There is a window between where the pager is started before and after
this patch. This means that early errors can behave slightly different
before and after this patch. Since operation-parsing has to happen
inside this window, this can be seen with `git -c pager.tag="echo pager
is used" tag -l --unknown-option`. This change in paging-behavior should
be acceptable since it only affects erroneous usages.

Update the documentation and update tests.

If an alias is used to run `git tag -a`, then `pager.tag` will still be
respected. Document this known breakage. It will be fixed in a later
commit. Add a similar test for `-l`, which works.

Noticed-by: Anatoly Borodin 
Suggested-by: Jeff King 
Signed-off-by: Martin Ågren 
---
 Documentation/git-tag.txt |  3 +++
 t/t7006-pager.sh  | 15 ++-
 builtin/tag.c |  3 +++
 git.c |  2 +-
 4 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 1eb15afa1..875d135e0 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -205,6 +205,9 @@ it in the repository configuration as follows:
 signingKey = 
 -
 
+`pager.tag` is only respected when listing tags, i.e., when `-l` is
+used or implied.
+See linkgit:git-config[1].
 
 DISCUSSION
 --
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index b56d4cdd4..570b2f252 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -187,7 +187,7 @@ test_expect_success TTY 'git tag -a defaults to not paging' 
'
! test -e paginated.out
 '
 
-test_expect_failure TTY 'git tag -a ignores pager.tag' '
+test_expect_success TTY 'git tag -a ignores pager.tag' '
test_when_finished "git tag -d newtag" &&
rm -f paginated.out &&
test_terminal git -c pager.tag tag -am message newtag &&
@@ -201,6 +201,19 @@ test_expect_success TTY 'git tag -a respects --paginate' '
test -e paginated.out
 '
 
+test_expect_failure TTY 'git tag as alias ignores pager.tag with -a' '
+   test_when_finished "git tag -d newtag" &&
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag -c alias.t=tag t -am message newtag &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag as alias respects pager.tag with -l' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag -c alias.t=tag t -l &&
+   test -e paginated.out
+'
+
 # A colored commit log will begin with an appropriate ANSI escape
 # for the first color; the text "commit" comes later.
 colorful() {
diff --git a/builtin/tag.c b/builtin/tag.c
index 01154ea8d..5ad1af252 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -461,6 +461,9 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
cmdmode = 'l';
}
 
+   if (cmdmode == 'l')
+   setup_auto_pager("tag", 0);
+
if ((create_tag_object || force) && (cmdmode != 0))
usage_with_options(git_tag_usage, options);
 
diff --git a/git.c b/git.c
index 66832f232..82ac2a092 100644
--- a/git.c
+++ b/git.c
@@ -466,7 +466,7 @@ static struct cmd_struct commands[] = {
{ "stripspace", cmd_stripspace },
{ "submodule--helper", cmd_submodule__helper, RUN_SETUP | 
SUPPORT_SUPER_PREFIX},
{ "symbolic-ref", cmd_symbolic_ref, RUN_SETUP },
-   { "tag", cmd_tag, RUN_SETUP },
+   { "tag", cmd_tag, RUN_SETUP | DELAY_PAGER_CONFIG },
{ "unpack-file", cmd_unpack_file, RUN_SETUP },
{ "unpack-objects", cmd_unpack_objects, RUN_SETUP },
{ "update-index", cmd_update_index, RUN_SETUP },
-- 
2.14.0.rc1.12.ge2d9c4613



[PATCH v3 4/7] t7006: add tests for how git tag paginates

2017-08-02 Thread Martin Ågren
Using, e.g., `git -c pager.tag tag -a new-tag` results in errors such as
"Vim: Warning: Output is not to a terminal" and a garbled terminal.
Someone who makes use of both `git tag -a` and `git tag -l` will
probably not set `pager.tag`, so that `git tag -a` will actually work,
at the cost of not paging output of `git tag -l`.

Since we're about to change how `git tag` respects `pager.tag`, add tests
around this, including how the configuration is ignored if --no-pager or
--paginate are used.

Construct tests with a few different subcommands. First, use -l. Second,
use "no arguments" and --contains, since those imply -l. (There are
more arguments which imply -l, but using these two should be enough.)

Third, use -a as a representative for "not -l". Actually, the tests use
`git tag -am` so no editor is launched, but that is irrelevant, since we
just want to see whether the pager is used or not. Make one of the tests
demonstrate the broken behavior mentioned above, where `git tag -a`
respects `pager.tag`.

Signed-off-by: Martin Ågren 
---
In one place, I now use test_expect_failure (thanks Peff).

 t/t7006-pager.sh | 67 
 1 file changed, 67 insertions(+)

diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index 20b4d83c2..b56d4cdd4 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -134,6 +134,73 @@ test_expect_success TTY 'configuration can enable pager 
(from subdir)' '
}
 '
 
+test_expect_success TTY 'git tag -l defaults to not paging' '
+   rm -f paginated.out &&
+   test_terminal git tag -l &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag -l respects pager.tag' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag tag -l &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git tag -l respects --no-pager' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag --no-pager tag -l &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag with no args defaults to not paging' '
+   # no args implies -l so this should page like -l
+   rm -f paginated.out &&
+   test_terminal git tag &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag with no args respects pager.tag' '
+   # no args implies -l so this should page like -l
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag tag &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git tag --contains defaults to not paging' '
+   # --contains implies -l so this should page like -l
+   rm -f paginated.out &&
+   test_terminal git tag --contains &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag --contains respects pager.tag' '
+   # --contains implies -l so this should page like -l
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag tag --contains &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git tag -a defaults to not paging' '
+   test_when_finished "git tag -d newtag" &&
+   rm -f paginated.out &&
+   test_terminal git tag -am message newtag &&
+   ! test -e paginated.out
+'
+
+test_expect_failure TTY 'git tag -a ignores pager.tag' '
+   test_when_finished "git tag -d newtag" &&
+   rm -f paginated.out &&
+   test_terminal git -c pager.tag tag -am message newtag &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git tag -a respects --paginate' '
+   test_when_finished "git tag -d newtag" &&
+   rm -f paginated.out &&
+   test_terminal git --paginate tag -am message newtag &&
+   test -e paginated.out
+'
+
 # A colored commit log will begin with an appropriate ANSI escape
 # for the first color; the text "commit" comes later.
 colorful() {
-- 
2.14.0.rc1.12.ge2d9c4613



[PATCH v3 3/7] git.c: provide setup_auto_pager()

2017-08-02 Thread Martin Ågren
The previous patch introduced a way for builtins to declare that they
will take responsibility for handling the `pager.foo`-config item. (See
the commit message of that patch for why that could be useful.)

Provide setup_auto_pager(), which builtins can call in order to handle
`pager.`, including possibly starting the pager. Make this function
don't do anything if a pager has already been started, as indicated by
use_pager or pager_in_use().

Whenever this function is called from a builtin, git.c will already have
called commit_pager_choice(). Since commit_pager_choice() treats the
special value -1 as "punt" or "not yet decided", it is not a problem
that we might end up calling commit_pager_choice() once in git.c and
once (or more) in the builtin. Make the new function use -1 in the same
way and document it as "punt".

Don't add any users of setup_auto_pager just yet, one will follow in
a later patch.

Suggested-by: Jeff King 
Signed-off-by: Martin Ågren 
---
 builtin.h | 12 
 git.c | 10 ++
 2 files changed, 22 insertions(+)

diff --git a/builtin.h b/builtin.h
index 0f3a7b770..42378f3aa 100644
--- a/builtin.h
+++ b/builtin.h
@@ -113,6 +113,18 @@ struct fmt_merge_msg_opts {
 extern int fmt_merge_msg(struct strbuf *in, struct strbuf *out,
 struct fmt_merge_msg_opts *);
 
+/**
+ * If a built-in has DELAY_PAGER_CONFIG set, the built-in should call this 
early
+ * when it wishes to respect the `pager.foo`-config. The `cmd` is the name of
+ * the built-in, e.g., "foo". If a paging-choice has already been setup, this
+ * does nothing. The default in `def` should be 0 for "pager off", 1 for "pager
+ * on" or -1 for "punt".
+ *
+ * You should most likely use a default of 0 or 1. "Punt" (-1) could be useful
+ * to be able to fall back to some historical compatibility name.
+ */
+extern void setup_auto_pager(const char *cmd, int def);
+
 extern int is_builtin(const char *s);
 
 extern int cmd_add(int argc, const char **argv, const char *prefix);
diff --git a/git.c b/git.c
index 79195ebbd..66832f232 100644
--- a/git.c
+++ b/git.c
@@ -33,6 +33,16 @@ static void commit_pager_choice(void) {
}
 }
 
+void setup_auto_pager(const char *cmd, int def)
+{
+   if (use_pager != -1 || pager_in_use())
+   return;
+   use_pager = check_pager_config(cmd);
+   if (use_pager == -1)
+   use_pager = def;
+   commit_pager_choice();
+}
+
 static int handle_options(const char ***argv, int *argc, int *envchanged)
 {
const char **orig_argv = *argv;
-- 
2.14.0.rc1.12.ge2d9c4613



[PATCH v3 2/7] git.c: let builtins opt for handling `pager.foo` themselves

2017-08-02 Thread Martin Ågren
Before launching a builtin git foo and unless mechanisms with precedence
are in use, we check for and handle the `pager.foo` config. This is done
without considering exactly how git foo is being used, and indeed, git.c
cannot (and should not) know what the arguments to git foo are supposed
to achieve.

In practice this means that, e.g., `git -c pager.tag tag -a new-tag`
results in errors such as "Vim: Warning: Output is not to a terminal"
and a garbled terminal. Someone who makes use of both `git tag -a` and
`git tag -l` will probably not set `pager.tag`, so that `git tag -a`
will actually work, at the cost of not paging output of `git tag -l`.

To allow individual builtins to make more informed decisions about when
to respect `pager.foo`, introduce a flag DELAY_PAGER_CONFIG. If the flag
is set, do not check `pager.foo`.

Do not check for DELAY_PAGER_CONFIG in `execv_dashed_external()`. That
call site is arguably wrong, although in a way that is not yet visible,
and will be changed in a slightly different direction in a later patch.

Don't add any users of DELAY_PAGER_CONFIG just yet, one will follow in a
later patch.

Suggested-by: Jeff King 
Signed-off-by: Martin Ågren 
---
 builtin.h | 8 
 git.c | 4 +++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin.h b/builtin.h
index 8d87d06da..0f3a7b770 100644
--- a/builtin.h
+++ b/builtin.h
@@ -55,6 +55,14 @@
  *
  * The built-in supports `--super-prefix`.
  *
+ * `DELAY_PAGER_CONFIG`:
+ *
+ * If RUN_SETUP or RUN_SETUP_GENTLY is set, git.c normally handles
+ * the `pager.`-configuration. If this flag is used, git.c
+ * will skip that step, instead allowing the built-in to make a
+ * more informed decision, e.g., by ignoring `pager.` for
+ * certain subcommands.
+ *
  * . Add `builtin/foo.o` to `BUILTIN_OBJS` in `Makefile`.
  *
  * Additionally, if `foo` is a new command, there are 4 more things to do:
diff --git a/git.c b/git.c
index 489aab4d8..79195ebbd 100644
--- a/git.c
+++ b/git.c
@@ -283,6 +283,7 @@ static int handle_alias(int *argcp, const char ***argv)
  */
 #define NEED_WORK_TREE (1<<3)
 #define SUPPORT_SUPER_PREFIX   (1<<4)
+#define DELAY_PAGER_CONFIG (1<<5)
 
 struct cmd_struct {
const char *cmd;
@@ -306,7 +307,8 @@ static int run_builtin(struct cmd_struct *p, int argc, 
const char **argv)
prefix = setup_git_directory_gently(_ok);
}
 
-   if (use_pager == -1 && p->option & (RUN_SETUP | 
RUN_SETUP_GENTLY))
+   if (use_pager == -1 && p->option & (RUN_SETUP | 
RUN_SETUP_GENTLY) &&
+   !(p->option & DELAY_PAGER_CONFIG))
use_pager = check_pager_config(p->cmd);
if (use_pager == -1 && p->option & USE_PAGER)
use_pager = 1;
-- 
2.14.0.rc1.12.ge2d9c4613



[PATCH v3 1/7] builtin.h: take over documentation from api-builtin.txt

2017-08-02 Thread Martin Ågren
Delete Documentation/technical/api-builtin.txt and move its content
into builtin.h. Format it as a comment. Remove a '+' which was needed
when the information was formatted for AsciiDoc. Similarly, change
"::" to ":".

Document SUPPORT_SUPER_PREFIX, thereby bringing the documentation up to
date with the available flags.

While at it, correct '3 more things to do' to '4 more things to do'.

Signed-off-by: Martin Ågren 
---
I'm still documenting SUPPORT_SUPER_PREFIX. As Junio pointed out, a
nearby patch series is working on losing it (or one user of it). I tried
removing that part, but felt like I was leaving things incomplete.

 Documentation/technical/api-builtin.txt | 73 --
 builtin.h   | 80 +
 2 files changed, 80 insertions(+), 73 deletions(-)
 delete mode 100644 Documentation/technical/api-builtin.txt

diff --git a/Documentation/technical/api-builtin.txt 
b/Documentation/technical/api-builtin.txt
deleted file mode 100644
index 22a39b929..0
--- a/Documentation/technical/api-builtin.txt
+++ /dev/null
@@ -1,73 +0,0 @@
-builtin API
-===
-
-Adding a new built-in
--
-
-There are 4 things to do to add a built-in command implementation to
-Git:
-
-. Define the implementation of the built-in command `foo` with
-  signature:
-
-   int cmd_foo(int argc, const char **argv, const char *prefix);
-
-. Add the external declaration for the function to `builtin.h`.
-
-. Add the command to the `commands[]` table defined in `git.c`.
-  The entry should look like:
-
-   { "foo", cmd_foo,  },
-+
-where options is the bitwise-or of:
-
-`RUN_SETUP`::
-   If there is not a Git directory to work on, abort.  If there
-   is a work tree, chdir to the top of it if the command was
-   invoked in a subdirectory.  If there is no work tree, no
-   chdir() is done.
-
-`RUN_SETUP_GENTLY`::
-   If there is a Git directory, chdir as per RUN_SETUP, otherwise,
-   don't chdir anywhere.
-
-`USE_PAGER`::
-
-   If the standard output is connected to a tty, spawn a pager and
-   feed our output to it.
-
-`NEED_WORK_TREE`::
-
-   Make sure there is a work tree, i.e. the command cannot act
-   on bare repositories.
-   This only makes sense when `RUN_SETUP` is also set.
-
-. Add `builtin/foo.o` to `BUILTIN_OBJS` in `Makefile`.
-
-Additionally, if `foo` is a new command, there are 3 more things to do:
-
-. Add tests to `t/` directory.
-
-. Write documentation in `Documentation/git-foo.txt`.
-
-. Add an entry for `git-foo` to `command-list.txt`.
-
-. Add an entry for `/git-foo` to `.gitignore`.
-
-
-How a built-in is called
-
-
-The implementation `cmd_foo()` takes three parameters, `argc`, `argv,
-and `prefix`.  The first two are similar to what `main()` of a
-standalone command would be called with.
-
-When `RUN_SETUP` is specified in the `commands[]` table, and when you
-were started from a subdirectory of the work tree, `cmd_foo()` is called
-after chdir(2) to the top of the work tree, and `prefix` gets the path
-to the subdirectory the command started from.  This allows you to
-convert a user-supplied pathname (typically relative to that directory)
-to a pathname relative to the top of the work tree.
-
-The return value from `cmd_foo()` becomes the exit status of the
-command.
diff --git a/builtin.h b/builtin.h
index 498ac80d0..8d87d06da 100644
--- a/builtin.h
+++ b/builtin.h
@@ -6,6 +6,86 @@
 #include "cache.h"
 #include "commit.h"
 
+/*
+ * builtin API
+ * ===
+ *
+ * Adding a new built-in
+ * -
+ *
+ * There are 4 things to do to add a built-in command implementation to
+ * Git:
+ *
+ * . Define the implementation of the built-in command `foo` with
+ *   signature:
+ *
+ * int cmd_foo(int argc, const char **argv, const char *prefix);
+ *
+ * . Add the external declaration for the function to `builtin.h`.
+ *
+ * . Add the command to the `commands[]` table defined in `git.c`.
+ *   The entry should look like:
+ *
+ * { "foo", cmd_foo,  },
+ *
+ * where options is the bitwise-or of:
+ *
+ * `RUN_SETUP`:
+ * If there is not a Git directory to work on, abort.  If there
+ * is a work tree, chdir to the top of it if the command was
+ * invoked in a subdirectory.  If there is no work tree, no
+ * chdir() is done.
+ *
+ * `RUN_SETUP_GENTLY`:
+ * If there is a Git directory, chdir as per RUN_SETUP, otherwise,
+ * don't chdir anywhere.
+ *
+ * `USE_PAGER`:
+ *
+ * If the standard output is connected to a tty, spawn a pager and
+ * feed our output to it.
+ *
+ * `NEED_WORK_TREE`:
+ *
+ * Make sure there is a work tree, i.e. the command cannot act
+ * on bare repositories.
+ * This only makes sense when `RUN_SETUP` is also set.
+ *
+ * `SUPPORT_SUPER_PREFIX`:
+ *
+ * The built-in supports `--super-prefix`.
+ *
+ * . Add `builtin/foo.o` to 

[PATCH v3 0/7] tag: only respect `pager.tag` in list-mode

2017-08-02 Thread Martin Ågren
This is the third version of my attempt to make `pager tag` useful (v1
at [1], v2 at [2]). Thanks to Junio and Peff for comments on v2.

I've squashed patches 01-03/10 and 07-08/10, respectively. The interdiff
is below. I managed to clean up some tests thanks to a drive-by comment
by Peff which cleared up a misunderstanding I had. Some minor changes,
e.g., I write "built-in" instead of "builtin", since that seemed
preferred, at least locally in builtin.h. I documented why a default
value of "punt" could be useful, but also that it's probably not wanted.

Martin

[1] https://public-inbox.org/git/cover.1499723297.git.martin.ag...@gmail.com/T/

[2] 
https://public-inbox.org/git/cover.1500321657.git.martin.ag...@gmail.com/T/#u

Martin Ågren (7):
  builtin.h: take over documentation from api-builtin.txt
  git.c: let builtins opt for handling `pager.foo` themselves
  git.c: provide setup_auto_pager()
  t7006: add tests for how git tag paginates
  tag: respect `pager.tag` in list-mode only
  tag: change default of `pager.tag` to "on"
  git.c: ignore pager.* when launching builtin as dashed external

 Documentation/git-tag.txt   |   3 +
 Documentation/technical/api-builtin.txt |  73 ---
 t/t7006-pager.sh|  80 +
 builtin.h   | 100 
 builtin/tag.c   |   3 +
 git.c   |  18 +-
 6 files changed, 201 insertions(+), 76 deletions(-)
 delete mode 100644 Documentation/technical/api-builtin.txt

Interdiff vs v2:
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index 8b2ffb1aa..9128ec5ac 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -162,7 +162,7 @@ test_expect_success TTY 'git tag with no args defaults to 
paging' '
 test_expect_success TTY 'git tag with no args respects pager.tag' '
# no args implies -l so this should page like -l
rm -f paginated.out &&
-   test_terminal git -c pager.tag=no tag &&
+   test_terminal git -c pager.tag=false tag &&
! test -e paginated.out
 '
 
@@ -202,20 +202,15 @@ test_expect_success TTY 'git tag -a respects --paginate' '
 '
 
 test_expect_success TTY 'git tag as alias ignores pager.tag with -a' '
-   # git-tag will be launched as a dashed external, which
-   # 1) is the source of a potential bug, and
-   # 2) is why we use test_config and not -c.
test_when_finished "git tag -d newtag" &&
rm -f paginated.out &&
-   test_config pager.tag true &&
-   test_terminal git -c alias.t=tag t -am message newtag &&
+   test_terminal git -c pager.tag -c alias.t=tag t -am message newtag &&
! test -e paginated.out
 '
 
 test_expect_success TTY 'git tag as alias respects pager.tag with -l' '
rm -f paginated.out &&
-   test_config pager.tag false &&
-   test_terminal git -c alias.t=tag t -l &&
+   test_terminal git -c pager.tag=false -c alias.t=tag t -l &&
! test -e paginated.out
 '
 
diff --git a/builtin.h b/builtin.h
index 3ca4a53a8..42378f3aa 100644
--- a/builtin.h
+++ b/builtin.h
@@ -51,15 +51,15 @@
  * on bare repositories.
  * This only makes sense when `RUN_SETUP` is also set.
  *
- * `SUPPORT_SUPER_PREFIX`::
+ * `SUPPORT_SUPER_PREFIX`:
  *
- * The builtin supports `--super-prefix`.
+ * The built-in supports `--super-prefix`.
  *
- * `DELAY_PAGER_CONFIG`::
+ * `DELAY_PAGER_CONFIG`:
  *
  * If RUN_SETUP or RUN_SETUP_GENTLY is set, git.c normally handles
  * the `pager.`-configuration. If this flag is used, git.c
- * will skip that step, instead allowing the builtin to make a
+ * will skip that step, instead allowing the built-in to make a
  * more informed decision, e.g., by ignoring `pager.` for
  * certain subcommands.
  *
@@ -114,11 +114,14 @@ extern int fmt_merge_msg(struct strbuf *in, struct strbuf 
*out,
 struct fmt_merge_msg_opts *);
 
 /**
- * If a builtin has DELAY_PAGER_CONFIG set, the builtin should call this early
+ * If a built-in has DELAY_PAGER_CONFIG set, the built-in should call this 
early
  * when it wishes to respect the `pager.foo`-config. The `cmd` is the name of
- * the builtin, e.g., "foo". If a paging-choice has already been setup, this
+ * the built-in, e.g., "foo". If a paging-choice has already been setup, this
  * does nothing. The default in `def` should be 0 for "pager off", 1 for "pager
  * on" or -1 for "punt".
+ *
+ * You should most likely use a default of 0 or 1. "Punt" (-1) could be useful
+ * to be able to fall back to some historical compatibility name.
  */
 extern void setup_auto_pager(const char *cmd, int def);
 
-- 
2.14.0.rc1.12.ge2d9c4613



[PATCH] Add 'raw' blob_plain link in history overview

2017-08-02 Thread Job Snijders
We often work with very large plain text files in our repositories and
found it friendlier to the users if we can click directly to the raw
version of such files.

This patch adds a 'raw' blob_plain link in history overview.

Signed-off-by: Job Snijders 
---
 gitweb/gitweb.perl | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 3d4a8ee27..ad79c518e 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -5961,8 +5961,11 @@ sub git_history_body {
  href(action=>"commit", 
hash=>$commit), $ref);
print "\n" .
  "" .
- $cgi->a({-href => href(action=>$ftype, 
hash_base=>$commit, file_name=>$file_name)}, $ftype) . " | " .
- $cgi->a({-href => href(action=>"commitdiff", 
hash=>$commit)}, "commitdiff");
+ $cgi->a({-href => href(action=>$ftype, 
hash_base=>$commit, file_name=>$file_name)}, $ftype) . " | ";
+   if ($ftype eq 'blob') {
+   print $cgi->a({-href => href(action=>"blob_plain", 
hash_base=>$commit, file_name=>$file_name)}, "raw") . " | ";
+   }
+   print $cgi->a({-href => href(action=>"commitdiff", 
hash=>$commit)}, "commitdiff");
 
if ($ftype eq 'blob') {
my $blob_current = $file_hash;


Re: [RFC PATCH] clone: add clone.recursesubmodules config option

2017-08-02 Thread Jeremy Morton
Did this ever get anywhere?  If not why not?  It would be very useful 
to me to be able to clone recursively by default, especially 
considering you can't use 'alias' to override the existing 'clone' 
command.


--
Best regards,
Jeremy Morton (Jez)

On 06/06/2014 06:26, Heiko Voigt wrote:

On Thu, Jun 05, 2014 at 11:18:28AM -0700, Junio C Hamano wrote:

Jens Lehmann  writes:

We had two settings in mind,...
So what if clone would just do an "git submodule init" for now when
"submodule.autoinit" is set but "submodule.autoupdate" isn't [?]
... and a single "submodule.auto" setting would be what users really want?


I do not offhand think of a sensible scenario where you want to init
a submodule once but do not want to update it when the superproject
changes.  Even if the user uses the mode to detach the submodule
HEAD, i.e. the branches in submodules do not matter and the whole
tree is described by the superproject's commit and gitlinks recorded
in it, the user would want the new objects necessary for the updated
superproject, which means a submodule that is init'ed (whether it is
via "git submodule init" or the submodule.autoinit variable) must be
updated.

So I am not sure why a user wants to disable autoupdate in the first
place.  For the same reason, setting submodule.*.update to none
would not make much sense, either.  Perhaps I am missing something.

Unless the user is very conservative and suspects that these
recursive behaviour we are going to bolt on to various commands
could be buggy and untrustworthy, in which case the user might want
to manually run "git submodule update", or even run "git fetch"
after going there while bypassing the whole "git submodule".  But I
do not think that is healthy in the longer run.


I think autoupdate is mainly there for the transition phase. Since
submodule can e.g. contain a lot of files a checkout would take much
longer. Similar to when Jens implemented the recursive diff, many people
were annoyed by the new files showing up and some with the impact on
performance (thats why we have the --ignore-submodules option).

In case of very big submodules and people already ignore their diff it
might even be necessary that the update is only done manually. E.g. for
a big media repository.

Cheers Heiko
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [PATCH v3 07/10] submodule: check for unstaged .gitmodules outside of config parsing

2017-08-02 Thread Brandon Williams
On 08/02, Brandon Williams wrote:
> On 07/31, Stefan Beller wrote:
> > 
> > So this is where the check "pos < active_nr" is coming from,
> > introduced in 5fee995244 (submodule.c: add .gitmodules staging
> > helper functions, 2013-07-30) as well as d4e98b581b (Submodules:
> > Don't parse .gitmodules when it contains, merge conflicts, 2011-05-14).
> > 
> > If I am reading the docs for cache_name_pos correctly, we would
> > not need to check for the index exceeding active_cache,
> > but checking for the index not being out of bounds seems
> > to be wide spread.
> 
> I can drop the pos < active_nr requirement.

>From our conversation offline i'll leave this just in case there is some
subtle reason why it exists.  Also makes it more of a 1:1 conversion.

-- 
Brandon Williams


Re: [PATCH 2/2] doc: add another way to identify if a patch has been merged

2017-08-02 Thread Stefan Beller
On Wed, Aug 2, 2017 at 9:28 AM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> I think the exchange Stefan cited was an example that we want to
>> have more of.  The contributor is indicating that, even though the
>> patch could be a drive-by patch by one-timer from whom we will never
>> hear again, it is not--the contributor is willing to learn the way
>> things are done here, and showing that it is worth _our_ time to
>> explain the things so that the future patches will take less effort
>> to accept on our side.

The example I cited contains two important parts that I considered.

> I tried to follow as best I could, here's my attempt (please advise).

ok, I can help out as that conversation is very likely
to deliver some impact.

> I'm a bit overwhelmed by the documentation for submitting a patch!

That may be either a contributors problem (lacking time or
motivation to go through a long document) or a problem with
the community.

Here are my thoughts on the "problem with the community":

We are using Git ourselves as a mere (content-)version-control-system
What we really need is a community oriented workflow tool:
Instead of writing a long-winded document on what you can
do wrong in each contribution, we should have technical solutions
that just present the single issue that needs addressing.

For example when a contributor forgets to sign-off a patch,
git-send-email could warn about the missing sign-off and
present the rationale why our community needs sign-offs.

As this is specific to our community, such that it cannot be
baked into git-send-email, but rather we'd need a distributed
configuration that is respected by various git commands.

We had the discussion on shipping a project config which is
respected by git commands lately when discussing the
.gitorder file that I proposed, and IIRC such a thing "doesn't
quite fit into the broad picture of a version control system",
so maybe we need another tooling in our community?

Another example would be to show a hint/advice when
commits with no or very short commit message are created.
(also this is project specific, other communities do not expect
commit messages as we do. So they would not want to utilize
such an advice given).

>>
>> Because we do not have a group of dedicated volunteers, it is done
>> by more experienced people around here but that can be done only
>> when they have time.  I view it as a more severe problem than any
>> documentation.  An abbreviated version of the documentation to
>> invite more new people means that we must be prepared to give more
>> high-touch onboarding help to them.
>
> Just to make sure there is no misunderstanding, I am not saying "do
> not update the doc to have an abbreviated version, because we will
> get more clueless newbies".  I am saying that it is not a good idea
> to add an abbreviated version _before_ we are prepared to handle
> more patches from new people that require high-touch help.
>
> If you are volunteering to coordinate and form the onboarding
> helpers group, that would be great.

I would not want to explain the same thing over and over again,
but rather have a technical solution that explains the problem and
solution once it is detected.

Coming up with a technical solution for each little quirk
is not the hard part (e.g. grep for the sign off string, count lines of
the commit message), but rather to put it in place. (How can I make
sure that contributors run these small checker scripts?
Currently I cannot.)

Thanks.


Re: [PATCH] convert any hard coded .gitmodules file string to the MACRO

2017-08-02 Thread Brandon Williams
On 08/01, Junio C Hamano wrote:
> Stefan Beller  writes:
> 
> >>>   @@ -233,18 +233,18 @@ void gitmodules_config(void)
> >>> strbuf_addstr(_path, "/.gitmodules");
> >>
> >>
> >> Did you mean to also change "/.gitmodules" ??
> >
> > Goog point. We should pick that up as well. However as we do not have
> > a macro for that, we'd have to have 2 calls to strbuf API
> >
> > strbuf_addch(, '/');
> > strbuf_addstr(, GITMODULES);
> 
> Ehh, doesn't string literal concatenation work here?  I.e. something
> like:
> 
> strbuf_addstr(_path, "/" GITMODULES_FILE);

Also this doesn't really matter much since this line is removed latter
on in the series, but I'll go with the string literal concatenation for
the intermediate state.

> 
> 
> >>> if (pos < 0) { /* .gitmodules not found or isn't merged */
> >>> pos = -1 - pos;
> >>> if (active_nr > pos) {  /* there is a .gitmodules
> >>> */
> >>
> >>
> >> It might also be nice to change the literals in the comments to
> >> use the macro.
> 
> The reason you want this patch is not like we want to make it easy
> to rename the file to ".gitprojects" later, right?  The patch is
> about avoiding misspelled string constant, like "/.gitmdoules",
> without getting caught by the compiler, no?
> 
> Assuming that I am correctly guessing the intention, I think it is a
> bad idea to rename these in the comments.
> 
> 

-- 
Brandon Williams


Re: [PATCH v3 07/10] submodule: check for unstaged .gitmodules outside of config parsing

2017-08-02 Thread Brandon Williams
On 07/31, Stefan Beller wrote:
> On Tue, Jul 18, 2017 at 12:05 PM, Brandon Williams  wrote:
> > Teach 'is_staging_gitmodules_ok()' to be able to determine in the
> > '.gitmodules' file has unstaged changes based on the passed in index
> > instead of relying on a global varible which is set during the
> 
> variable
> 

Will change.

> > submodule-config parsing.
> >
> > Signed-off-by: Brandon Williams 
> > ---
> >  builtin/mv.c |  2 +-
> >  builtin/rm.c |  2 +-
> >  submodule.c  | 32 +---
> >  submodule.h  |  2 +-
> >  4 files changed, 20 insertions(+), 18 deletions(-)
> >
> > diff --git a/builtin/mv.c b/builtin/mv.c
> > index dcf6736b5..94fbaaa5d 100644
> > --- a/builtin/mv.c
> > +++ b/builtin/mv.c
> > @@ -81,7 +81,7 @@ static void prepare_move_submodule(const char *src, int 
> > first,
> > struct strbuf submodule_dotgit = STRBUF_INIT;
> > if (!S_ISGITLINK(active_cache[first]->ce_mode))
> > die(_("Directory %s is in index and no submodule?"), src);
> > -   if (!is_staging_gitmodules_ok())
> > +   if (!is_staging_gitmodules_ok(_index))
> > die(_("Please stage your changes to .gitmodules or stash 
> > them to proceed"));
> > strbuf_addf(_dotgit, "%s/.git", src);
> > *submodule_gitfile = read_gitfile(submodule_dotgit.buf);
> > diff --git a/builtin/rm.c b/builtin/rm.c
> > index 52826d137..4057e73fa 100644
> > --- a/builtin/rm.c
> > +++ b/builtin/rm.c
> > @@ -286,7 +286,7 @@ int cmd_rm(int argc, const char **argv, const char 
> > *prefix)
> > list.entry[list.nr].name = xstrdup(ce->name);
> > list.entry[list.nr].is_submodule = S_ISGITLINK(ce->ce_mode);
> > if (list.entry[list.nr++].is_submodule &&
> > -   !is_staging_gitmodules_ok())
> > +   !is_staging_gitmodules_ok(_index))
> > die (_("Please stage your changes to .gitmodules or 
> > stash them to proceed"));
> > }
> >
> > diff --git a/submodule.c b/submodule.c
> > index b1965290f..46ec04d7c 100644
> > --- a/submodule.c
> > +++ b/submodule.c
> > @@ -37,18 +37,25 @@ static struct oid_array ref_tips_after_fetch;
> >  static int gitmodules_is_unmerged;
> >
> >  /*
> > - * This flag is set if the .gitmodules file had unstaged modifications on
> > - * startup. This must be checked before allowing modifications to the
> > - * .gitmodules file with the intention to stage them later, because when
> > - * continuing we would stage the modifications the user didn't stage 
> > herself
> > - * too. That might change in a future version when we learn to stage the
> > - * changes we do ourselves without staging any previous modifications.
> > + * Check if the .gitmodules file has unstaged modifications.  This must be
> > + * checked before allowing modifications to the .gitmodules file with the
> > + * intention to stage them later, because when continuing we would stage 
> > the
> > + * modifications the user didn't stage herself too. That might change in a
> > + * future version when we learn to stage the changes we do ourselves 
> > without
> > + * staging any previous modifications.
> >   */
> > -static int gitmodules_is_modified;
> > -
> > -int is_staging_gitmodules_ok(void)
> > +int is_staging_gitmodules_ok(const struct index_state *istate)
> >  {
> > -   return !gitmodules_is_modified;
> > +   int pos = index_name_pos(istate, GITMODULES_FILE, 
> > strlen(GITMODULES_FILE));
> > +
> > +   if ((pos >= 0) && (pos < istate->cache_nr)) {
> 
> Why do we need the second check (pos < istate->cache_nr) ?
> 
> I would have assumed the first one suffices,
> it might read better if turned around:
> 
> 
> if (pos < 0)
> return 1;
> 
> return (lstat(GITMODULES_FILE, ) == 0 &&
> ce_match_stat(istate->cache[pos], , 0) & DATA_CHANGED);
>   }
> 
> > @@ -231,11 +238,6 @@ void gitmodules_config(void)
> > !memcmp(ce->name, ".gitmodules", 11))
> > gitmodules_is_unmerged = 1;
> > }
> > -   } else if (pos < active_nr) {
> > -   struct stat st;
> > -   if (lstat(".gitmodules", ) == 0 &&
> > -   ce_match_stat(active_cache[pos], , 0) & 
> > DATA_CHANGED)
> > -   gitmodules_is_modified = 1;
> > }
> 
> So this is where the check "pos < active_nr" is coming from,
> introduced in 5fee995244 (submodule.c: add .gitmodules staging
> helper functions, 2013-07-30) as well as d4e98b581b (Submodules:
> Don't parse .gitmodules when it contains, merge conflicts, 2011-05-14).
> 
> If I am reading the docs for cache_name_pos correctly, we would
> not need to check for the index exceeding active_cache,
> but checking for the index not being out of bounds seems
> to be wide spread.

I can drop the pos < active_nr 

Re: [PATCH v2 0/5] Fsck for lazy objects, and (now) actual invocation of loader

2017-08-02 Thread Jonathan Nieder
Hi,

Junio C Hamano wrote:
> Jonathan Tan  writes:

>> One possibility to conceptually have the same thing without the overhead
>> of the list is to put the obtained-from-elsewhere objects into its own
>> alternate object store, so that we can distinguish the two.
>
> Now you are talking.  Either a separate object store, or a packfile
> that is specially marked as such, would work.

Jonathan's not in today, so let me say a few more words about this
approach.

This approach implies a relaxed connectivity guarantee, by creating
two classes of objects:

 1. Objects that I made should satisfy the connectivity check.  They
can point to other objects I made, objects I fetched, or (*) objects
pointed to directly by objects I fetched.  More on (*) below.

 2. Objects that I fetched do not need to satisfy a connectivity
check.  I can trust the server to provide objects that they point
to when I ask for them, except in extraordinary cases like a
credit card number that was accidentally pushed to the server and
prompted a rewriting of history to remove it (**).

The guarantee (1) looks like it should be easy to satisfy (just like
the current connectivity guarantee where all objects are in class (1)).
I have to know about an object to point to it --- that means the
pointed-to object has to be in the object store or pointed to by
something in the object store.

The complication is in the "git gc" operation for the case (*).
Today, "git gc" uses a reachability walk to decide which objects to
remove --- an object referenced by no other object is fair game to
remove.  With (*), there is another kind of object that must not be
removed: if an object that I made, M, points to a missing/promised
object, O, pointed to by a an object I fetched, F, then I cannot prune
F unless there is another fetched object present to anchor O.

For example: suppose I have a sparse checkout and run

git fetch origin refs/pulls/x
git checkout -b topic FETCH_HEAD
echo "Some great modification" >> README
git add README
git commit --amend

When I run "git gc", there is nothing pointing to the commit that was
pointed to by the remote ref refs/pulls/x, so it can be pruned.  I
would naively also expect that the tree pointed to by that commit
could be pruned.  But pruning it means pruning the promise that made
it permissible to lack various blobs that my topic branch refers to
that are outside the sparse checkout area.  So "git gc" must notice
that it is not safe to prune that tree.

This feels hacky.  I prefer the promised object list over this
approach.

>"Maintaining a list
> of object names in a flat file is too costly" is not a valid excuse
> to discard the integrity of locally created objects, without which
> Git will no longer be a version control system,

I am confused by this: I think that Git without a "git fsck" command
at all would still be a version control system, just not as good of
one.

Can you spell this out more?  To be clear, are you speaking as a
reviewer or as the project maintainer?  In other words, if other
reviewers are able to settle on a design that involves a relaxed
guarantee for fsck in this mode that they can agree on, does this
represent a veto meaning the patch can still not go through?

On one hand I'm grateful for the help exploring the design space, and
I think it has helped get a better understanding of the issues
involved.

On the other hand, if this is a veto then it feels very black and
white and a hard starting point to build a consensus from.  I am
worried.

[...]
>> I mentioned
>> this in my e-mail but rejected it, but after some more thought, this
>> might be sufficient - we might still need to iterate through every
>> object to know exactly what we can assume the remote to have, but the
>> "frontier" solution also needs this iteration, so we are no worse off.
>
> Most importantly, this is allowed to be costly---we are doing this
> not at runtime all the time, but when the user says "make sure that
> I haven't lost objects and it is safe for me to build further on
> what I created locally so far" by running "git fsck".

check_everything_connected is also used in some other circumstances:
e.g. when running a fetch, and when receiving a push in git
receive-pack.

Thanks,
Jonathan


Re: Git log --tags isn't working as expected

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 09:23:36AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >   3. Use --simplify-by-decoration to show a particular range of commits,
> >  but limit to ones that actually have a ref pointing at them. Like:
> >
> >git log ..HEAD --simplify-by-decoration
> 
> Nit: dashed options come first and then revs and then pathspecs.

Well, sort of. It does work just fine, and always has. I always thought
we were following the GNU-style liberal option ordering.

By the way, I noticed something funny that I forgot to mention:

  git log --tags=does-not-exist

shows HEAD, because the logic to kick in the default only asks "were we
given any refs to start the traversal?". I think it probably should
consider a wildcard with no matches to override the default, and show
nothing.

I haven't decided if it should be empty-but-success, or give an error.
And if an error, if it is "you tried to give me refs, but they showed
nothing" or "you gave me a wildcard that matched nothing". The
distinction matters for:

  git log --tags=does-not-exist --tags=does-exist

which currently allows the empty wildcard to be a noop.

-Peff


Re: [RFC] The correct and consistent alternative to quote a command ?

2017-08-02 Thread Stefan Beller
On Wed, Aug 2, 2017 at 5:22 AM, Kaartic Sivaraam
 wrote:
> On Tue, 2017-08-01 at 10:38 -0700, Stefan Beller wrote:
>> On Tue, Aug 1, 2017 at 8:59 AM, Kaartic Sivaraam wrote:
>> > I assume you mean the consistency in quoting i.e., you're expecting the
>> > patch to use (") instead of (') for quoting. Correct me, if I'm wrong.
>>
>> Actually I did not imply any expectation on the outcome, because I
>> do not know what the consistency end game looks like for this issue.
>>
>> So maybe we'd want to go with the currently most used way?
> On scanning through Documentation/SubmittingPatches, I saw that most of
> the commands were quoted using double quotes and chose to use it for
> this patch. I'm not sure if it's the right one to use.

Thanks for checking. I think it is a reasonable thing to do then.

>
> Changing the subject of this mail; hoping some one who has thoughts
> about this would dive-in.
>
>> (Are there only three? (a) with dash, (b) with single quotes and
>> (c) with double quotes?)
>
> --
> Kaartic


Re: reftable [v4]: new ref storage format

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 08:17:29AM -0700, Shawn Pearce wrote:

> > Just peeking at torvalds/linux, we have some objects with ~35K refs
> > pointing to them (e.g., the v2.6.11 tag).
> 
> Oy. I'll bet that every occurrence winds up in its own block due to
> the layout of the namespace, and so the obj block list needs 35k
> varint pointers. That requires a larger block size if it has any
> chance of fitting into the reftable format.
> 
> Another option is disable the obj table for these shared-object repos.
> Its an optional part of the format and can be omitted if the reader
> isn't likely to need to lookup by SHA-1, or is willing to pay the
> brute force cost of scanning every ref.

Yeah, sorry, I meant to write a few more paragraphs. I think refusing to
generate the object table for these repos would be OK. We don't serve
any user-facing operations out of them directly[1].

I'm also open to the argument that they're simply insane. Most of the
time we don't need them to be a real repository at all. They could exist
as a bare "objects/" directory. It's only at repack time that we
actually need to know which objects are reachable[2], so we could
do a special repack that generates the list on the fly from the child
repositories.

-Peff

[1] We actually disable the ".have" advertisements, because the cost of
showing all of the shared-storage ref tips is too high. One thing
I'd like to do is be able to advertise a subset of the alternate
refs (if you're a fork of torvalds/linux, then share _just_ the refs
from there). But with the current ref code, I can't even ask for a
subset of the refs without paying the cost to walk all of them.
That's one of the things I'd like to build on top of the mmap'd
packed-refs solution (and naturally would work with reftables, too).

[2] It's a bit more complicated than just knowing the list of reachable
objects. We also want to know which ones are reachable from which
fork, as we do some trickery to avoid creating deltas across forks.
So we really do want the whole ref-list, and not something like a
de-duped set of reachable tips. I don't think that makes a
difference for anything we're discussing here, but just a bit of
trivia in case somebody is thinking about the shared-storage problem
space.


Re: [PATCH] convert any hard coded .gitmodules file string to the MACRO

2017-08-02 Thread Brandon Williams
On 08/01, Junio C Hamano wrote:
> Stefan Beller  writes:
> 
> >>>   @@ -233,18 +233,18 @@ void gitmodules_config(void)
> >>> strbuf_addstr(_path, "/.gitmodules");
> >>
> >>
> >> Did you mean to also change "/.gitmodules" ??
> >
> > Goog point. We should pick that up as well. However as we do not have
> > a macro for that, we'd have to have 2 calls to strbuf API
> >
> > strbuf_addch(, '/');
> > strbuf_addstr(, GITMODULES);
> 
> Ehh, doesn't string literal concatenation work here?  I.e. something
> like:
> 
> strbuf_addstr(_path, "/" GITMODULES_FILE);
> 
> 
> >>> if (pos < 0) { /* .gitmodules not found or isn't merged */
> >>> pos = -1 - pos;
> >>> if (active_nr > pos) {  /* there is a .gitmodules
> >>> */
> >>
> >>
> >> It might also be nice to change the literals in the comments to
> >> use the macro.
> 
> The reason you want this patch is not like we want to make it easy
> to rename the file to ".gitprojects" later, right?  The patch is
> about avoiding misspelled string constant, like "/.gitmdoules",
> without getting caught by the compiler, no?

Yes, it was mostly about preventing mistakes and having the compiler
help you out a bit, so changing the comments isn't really needed.

> 
> Assuming that I am correctly guessing the intention, I think it is a
> bad idea to rename these in the comments.
> 
> 

-- 
Brandon Williams


Re: reftable [v4]: new ref storage format

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 08:20:44AM -0400, Dave Borowitz wrote:

> >> OTOH a mythical protocol v2 might reduce the need to scan the
> >> references for advertisement, so maybe this optimization will be more
> >> helpful in the future?
> 
> I haven't been following the status of the proposal, but I was
> assuming a client-speaks-first protocol would also imply the client
> asking for refnames, not SHA-1s, in which case lookup by SHA-1 is no
> longer relevant.

Good point. The hidden-refs thing Shawn described is a trick that would
be used because the current protocol is so lousy. It's not clear how a
stateless-rpc request would work, but in theory the follow-up requests
could also say "hey, I'm only interested in refs/{heads,tags}" and the
stateless server could limit is marking to that.

But that would still leave something like allowReachableSHA1InWant
having to look at all refs (and I don't see why a client requesting by
sha1 would give any limiting refspec at all). On the other hand, looking
at the ref tips would probably be dominated by actually traversing the
commits in most cases. Of course one could come up with a pathological
case pretty easily (tons of refs, and people asking for submodules at
tip commits).

So I do think there are cases where the optimization would help, but I
still not sure how much. If it's an optional bit of the design, we at
least have the option of just not generating if it turns out not to be
useful. My only concern would be if we make other protocol sacrifices or
complications to include it.

-Peff


Re: reftable [v4]: new ref storage format

2017-08-02 Thread Junio C Hamano
Shawn Pearce  writes:

> On Wed, Aug 2, 2017 at 2:28 AM, Jeff King  wrote:
>> On Tue, Aug 01, 2017 at 07:38:37PM -0700, Shawn Pearce wrote:
>>
>>> > OBJS blocks can also be
>>> > unbounded in size if very many references point at the same object,
>>> > thought that is perhaps only a theoretical problem.
>>>
>>> Gah, I missed that in reftable. The block id pointer list could cause
>>> a single object id to exceed what fits in a block, and that will cause
>>> the writer to fail unless its caller sets the block size larger. I
>>> basically assumed this overflow condition is very unlikely, as its not
>>> common to have a huge number of refs pointing to the same object.
>>
>> It's actually quite common for us, as we have big shared-object repos
>> that contain a copy of the refs of all of their child repos (for
>> reachability during packing, etc). So tags, where the value is the same
>> in each fork, you have one ref per fork pointing to it.
>>
>> Just peeking at torvalds/linux, we have some objects with ~35K refs
>> pointing to them (e.g., the v2.6.11 tag).
>
> Oy. I'll bet that every occurrence winds up in its own block due to
> the layout of the namespace, and so the obj block list needs 35k
> varint pointers. That requires a larger block size if it has any
> chance of fitting into the reftable format.
>
> Another option is disable the obj table for these shared-object repos.
> Its an optional part of the format and can be omitted if the reader
> isn't likely to need to lookup by SHA-1, or is willing to pay the
> brute force cost of scanning every ref.

I am wondering if we need the reverse look-up for a regular
repository that allows "fetch anything at the tip".  It only needs
"I got this request for an object name--does it sit at the tip of
any ref?  Yes/No".  It does not need to know exactly which ref
points at the asked object.

Yes, I know Gerrit has its own ACL that limits the visibility of
refs so the question becomes "does it sit at the tip of any ref that
is visible to the current user?".  An Yes/No answer for "any ref?"
may still give you a short-cut when rejecting, but you'd then need
to scan to give a positive answer without a full reverse mapping.

For the use of "consolidated object store for all forks", I do not
think it makes sense to even have "Does it sit at a tip, Yes/No"
information.  Even when Linus's repository and a fork share the same
object store, I do not think anybody expects to be able to fetch a
commit at the tip of a branch in the fork but not in Linus's
repository.



Re: git send-email -v n fails

2017-08-02 Thread Junio C Hamano
Olaf Hering  writes:

> I think send-email should understand all options of format-patch.

I actually think it was a mistake to allow send-email drive
format-patch ;-).

> At least for '-v n' this fails, one has to type '--reroll-count n' as a
> workaround with git version 2.13.3:

I wonder if "-v2" would work here, though.  It seems that the hack
send-email uses to "understand" format-patch options (i.e. the
subroutine is_format_patch_arg) is not prepared to take an option
with parameter.



Re: [PATCH 2/2] doc: add another way to identify if a patch has been merged

2017-08-02 Thread Junio C Hamano
Junio C Hamano  writes:

> I think the exchange Stefan cited was an example that we want to
> have more of.  The contributor is indicating that, even though the
> patch could be a drive-by patch by one-timer from whom we will never
> hear again, it is not--the contributor is willing to learn the way
> things are done here, and showing that it is worth _our_ time to
> explain the things so that the future patches will take less effort
> to accept on our side.
>
> Because we do not have a group of dedicated volunteers, it is done
> by more experienced people around here but that can be done only
> when they have time.  I view it as a more severe problem than any
> documentation.  An abbreviated version of the documentation to
> invite more new people means that we must be prepared to give more
> high-touch onboarding help to them.

Just to make sure there is no misunderstanding, I am not saying "do
not update the doc to have an abbreviated version, because we will
get more clueless newbies".  I am saying that it is not a good idea
to add an abbreviated version _before_ we are prepared to handle
more patches from new people that require high-touch help.

If you are volunteering to coordinate and form the onboarding
helpers group, that would be great.

Thanks.


Re: Git log --tags isn't working as expected

2017-08-02 Thread Junio C Hamano
Jeff King  writes:

>   3. Use --simplify-by-decoration to show a particular range of commits,
>  but limit to ones that actually have a ref pointing at them. Like:
>
>git log ..HEAD --simplify-by-decoration

Nit: dashed options come first and then revs and then pathspecs.

> I think (3) matches what you're trying to do the best. You can't say
> "just tags" for the decoration, though, so you'll see branch tips as
> well. There's been some discussion about allowing specific decorations,
> but nothing merged yet.


Re: [PATCH v2 0/5] Fsck for lazy objects, and (now) actual invocation of loader

2017-08-02 Thread Junio C Hamano
Jonathan Tan  writes:

> One possibility to conceptually have the same thing without the overhead
> of the list is to put the obtained-from-elsewhere objects into its own
> alternate object store, so that we can distinguish the two.

Now you are talking.  Either a separate object store, or a packfile
that is specially marked as such, would work.  "Maintaining a list
of object names in a flat file is too costly" is not a valid excuse
to discard the integrity of locally created objects, without which
Git will no longer be a version control system, and your "We have to
trust the sender of objects on the other side anyway when operating
in lazy-objects mode, so devise a way that we can use to tell which
objects we _know_ the other side has" that leads to the above idea
is a good line of thought.

> I mentioned
> this in my e-mail but rejected it, but after some more thought, this
> might be sufficient - we might still need to iterate through every
> object to know exactly what we can assume the remote to have, but the
> "frontier" solution also needs this iteration, so we are no worse off.

Most importantly, this is allowed to be costly---we are doing this
not at runtime all the time, but when the user says "make sure that
I haven't lost objects and it is safe for me to build further on
what I created locally so far" by running "git fsck".


Re: [PATCH 2/2] doc: add another way to identify if a patch has been merged

2017-08-02 Thread Junio C Hamano
Kaartic Sivaraam  writes:

> On Tue, 2017-08-01 at 10:46 -0700, Stefan Beller wrote:
>> Actually I am slightly negative on this one, because of
>> occurrences like [1].
>> 
>> Our SubmittingPatches is already considered *too long* for most people
>> who just want to drop a drive-by patch.
>> 
>> Adding more knowledge (which btw is about general git usage and not
>> specific to our development workflow; you'd find the same tip in the
>> kernel community).
>> 
>> I wonder if we need a document that describes workflows.
>> (Oh, look we have 'man gitworkflows'! I did not know)
>> 
>> So maybe we want to cut a lot of workflow related commendatory from
>> the SubmitingPatches and then encourage to read such man page?
>> 
> That's right. Maybe Documentation/SubmittingPatches needs a revamp to
> be one-time contributor friendly? Maybe introducing a "gist" for people
> who do not have the time to read the whole document, might be of help?

First of all, I do not think lack of one-time contributor is
something we should consider to be a problem.  Supporting new
contributors so that they will be involved more in the development
process is a lot more important issue.

A new contributor will get something wrong no matter what the
documentation says.  One-time contributor's intention is "I am
sending a patch this time, but I have no plan to get involved
further---I do not have time for this".  It ridiculous to ask for a
patch that adds an 's' to the end of a third-party-singular verb in
the present tense to be rerolled only because the title had an extra
full-stop at the end.  Practically, a patch like that by a "one time
contributor" will always have to be fixed before committing it.

I think the exchange Stefan cited was an example that we want to
have more of.  The contributor is indicating that, even though the
patch could be a drive-by patch by one-timer from whom we will never
hear again, it is not--the contributor is willing to learn the way
things are done here, and showing that it is worth _our_ time to
explain the things so that the future patches will take less effort
to accept on our side.

Because we do not have a group of dedicated volunteers, it is done
by more experienced people around here but that can be done only
when they have time.  I view it as a more severe problem than any
documentation.  An abbreviated version of the documentation to
invite more new people means that we must be prepared to give more
high-touch onboarding help to them.






git bisect "commits left" miscount

2017-08-02 Thread Ben Boeckel
Hi,

When bisecting given a set of paths, git counts the number of remaining
commits improperly. Here's example output (based in the git.git
repository):

% git bisect start -- sha1_file.c
% git bisect good v2.10.0
% git bisect bad v2.10.3
Bisecting: 1 revision left to test after this (roughly 1 step)
[f7f0a87e0a27a1baaf782af7cec18fd23fdf35de] Merge branch 
'jc/verify-loose-object-header' into maint
% git bisect good
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[5827a03545663f6d6b491a35edb313900608568b] fetch: use "quick" has_sha1_file 
for tag following
% git bisect good
Bisecting: -1 revisions left to test after this (roughly 0 steps)
[39000e849970a554a257577dcb2fb844a523a1d1] Merge branch 
'jk/fetch-quick-tag-following' into maint
% git bisect good
No testable commit found.
Maybe you started with bad path parameters?

Note that it ends up with -1 revisions left and it doesn't end up with a
"this commit is the first bad commit" message.

Should git find the newest ancestors(s) of the given bad commit which
modifies the given paths and start counting from there?

I haven't bisected to find out when this was introduced yet (first seen
with 2.9.4; confirmed with 2.14.0-rc1).

Thanks,

--Ben


Re: core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch`

2017-08-02 Thread Torsten Bögershausen



On 08/01/2017 10:58 PM, Anthony Sottile wrote:

Here's where I'm hitting the problem described:
https://github.com/pre-commit/pre-commit/issues/570

Note that `git -c core.autocrlf=false` apply patch fixes this
situation, but breaks others.


[]
I wasn't thinking of that - and thanks for the detailed report.
I seems as there are 3 things to be done:
- Make a workaround in your scripts/tools. It seems as if that is 
already done.

- Fix Git.
  My very first investigation shows that a patch like this could fix 
the problem:


diff --git a/apply.c b/apply.c
index f2d599141d..66b8387360 100644
--- a/apply.c
+++ b/apply.c
@@ -2278,6 +2278,8 @@ static int read_old_data(struct stat *st, const 
char *path, struct strbuf *buf)

case S_IFREG:
if (strbuf_read_file(buf, path, st->st_size) != 
st->st_size)

return error(_("unable to open or read %s"), path);
+   if (would_convert_to_git(_index, path))
+   read_cache();
convert_to_git(_index, path, buf->buf, buf->len, 
buf, 0);

return 0;
default:

  I will probably do some more investigations if this is the core 
problem, so it will take some days or weeks.


- Convince people to normalize their repos and convert all CRLF in the 
repo into LF.

   This may take even longer.




Re: [RFC] The correct and consistent alternative to quote a command ?

2017-08-02 Thread Junio C Hamano
Kaartic Sivaraam  writes:

> On Tue, 2017-08-01 at 10:38 -0700, Stefan Beller wrote:
>> On Tue, Aug 1, 2017 at 8:59 AM, Kaartic Sivaraam wrote:
>> > I assume you mean the consistency in quoting i.e., you're expecting the
>> > patch to use (") instead of (') for quoting. Correct me, if I'm wrong.
>> 
>> Actually I did not imply any expectation on the outcome, because I
>> do not know what the consistency end game looks like for this issue.
>> 
>> So maybe we'd want to go with the currently most used way?
> On scanning through Documentation/SubmittingPatches, I saw that most of
> the commands were quoted using double quotes and chose to use it for
> this patch. I'm not sure if it's the right one to use.

After reading the patch before seeing the above, I also came to the
conclusion that the above is exactly what you are aiming at, and I
thought that was a reasonable thing to do.



Re: reftable [v4]: new ref storage format

2017-08-02 Thread Shawn Pearce
On Wed, Aug 2, 2017 at 2:28 AM, Jeff King  wrote:
> On Tue, Aug 01, 2017 at 07:38:37PM -0700, Shawn Pearce wrote:
>
>> > OBJS blocks can also be
>> > unbounded in size if very many references point at the same object,
>> > thought that is perhaps only a theoretical problem.
>>
>> Gah, I missed that in reftable. The block id pointer list could cause
>> a single object id to exceed what fits in a block, and that will cause
>> the writer to fail unless its caller sets the block size larger. I
>> basically assumed this overflow condition is very unlikely, as its not
>> common to have a huge number of refs pointing to the same object.
>
> It's actually quite common for us, as we have big shared-object repos
> that contain a copy of the refs of all of their child repos (for
> reachability during packing, etc). So tags, where the value is the same
> in each fork, you have one ref per fork pointing to it.
>
> Just peeking at torvalds/linux, we have some objects with ~35K refs
> pointing to them (e.g., the v2.6.11 tag).

Oy. I'll bet that every occurrence winds up in its own block due to
the layout of the namespace, and so the obj block list needs 35k
varint pointers. That requires a larger block size if it has any
chance of fitting into the reftable format.

Another option is disable the obj table for these shared-object repos.
Its an optional part of the format and can be omitted if the reader
isn't likely to need to lookup by SHA-1, or is willing to pay the
brute force cost of scanning every ref.


git svn fetch --localtime produces wrong commit times

2017-08-02 Thread Urs Thuermann
In converting a SVN repository to git, the commit timestamp is
generated incorrectly.  I use "git svn fetch --localtime" and the
offset from UTC is always set to +0200 (probably because that is the
current local offset here, i.e. Europe/Berlin) even for times when it
should be +0100.

For example, in my SVN working copy I get

$ svn log -r152

r152 | urs | 2010-08-16 11:05:52 +0200 (Mon, 16 Aug 2010) | 2 lines

Add error checks on input file streams


$ svn log -r158

r158 | urs | 2010-11-24 00:24:13 +0100 (Wed, 24 Nov 2010) | 16 lines

Make code portable to Linux/amd64

After converting to git using

$ mkdir use
$ cd use
$ git svn init -s file://$HOME/SVN/use
Initialized empty Git repository in /home/urs/GIT/use/.git/
$ git svn fetch -q -A ../ADM/git.svn-authors --localtime
r1 = 12cb83315be96e594a98b42db7ae57d19e0c7973 (refs/remotes/origin/trunk)
...
r162 = 99ff393f1d64f330b14d6e06412b94fd3023d616 (refs/remotes/origin/trunk)
Checked out HEAD:
  file:///home/urs/SVN/use/trunk r162

I see wrong commit timestamps:

$ git log
...
commit c6b4f7aaa66650a16de67d32fb83d541b1973331
Author: Urs Thuermann 
Date:   Wed Nov 24 00:24:13 2010 +0200

Make code portable to Linux/amd64


git-svn-id: file:///home/urs/SVN/use/trunk@158 
b3714422-7cff-11dd-9a33-b89007e0d947
...
commit a9d95e506681ac5742d2d0927c93f22c541ff170
Author: Urs Thuermann 
Date:   Mon Aug 16 11:05:52 2010 +0200

Add error checks on input file streams


git-svn-id: file:///home/urs/SVN/use/trunk@152 
b3714422-7cff-11dd-9a33-b89007e0d947
...

In revision 152 the timestamp is correct, but for revision 158 the UTC
offset is +0200 instead of +0100.  Then, of course, also the POSIX
timestamp in the commit object is wrong:

$ git cat-file -p c6b4f7aaa66650a16de67d32fb83d541b1973331
tree ff4528220ddcf8beca9f4958fbb947d5ed85808e
parent edcaeb292153663664d850bafe1dad12daab4165
author Urs Thuermann  1290551053 +0200
committer Urs Thuermann  1290551053 +0200

Make code portable to Linux/amd64
...
$ date -d@1290551053 +%F\ %T\ %z
2010-11-23 23:24:13 +0100

The correct timestamp would be 2010-11-24 00:24:13 +0100, or, as a
POSIX time_t 1290554653.

I could find the bug grepping through /usr/lib/git-core/git-svn, maybe
it's in GIT::SVN::Fetcher or some other GIT::SVN module.

Is a fix available for this bug?

urs


git send-email -v n fails

2017-08-02 Thread Olaf Hering
I think send-email should understand all options of format-patch.
At least for '-v n' this fails, one has to type '--reroll-count n' as a
workaround with git version 2.13.3:

+ git send-email -v 2 --annotate --confirm=always 
--to-cmd=scripts/get_maintainer.pl 'HEAD^'
fatal: ambiguous argument '2': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git  [...] -- [...]'
format-patch -o /tmp/fVUtbB6OWI 1 HEAD^: command returned error: 128

Olaf


signature.asc
Description: PGP signature


Re: [PATCH 2/2] doc: add another way to identify if a patch has been merged

2017-08-02 Thread Kaartic Sivaraam
On Tue, 2017-08-01 at 10:46 -0700, Stefan Beller wrote:
> Actually I am slightly negative on this one, because of
> occurrences like [1].
> 
> Our SubmittingPatches is already considered *too long* for most people
> who just want to drop a drive-by patch.
> 
> Adding more knowledge (which btw is about general git usage and not
> specific to our development workflow; you'd find the same tip in the
> kernel community).
> 
> I wonder if we need a document that describes workflows.
> (Oh, look we have 'man gitworkflows'! I did not know)
> 
> So maybe we want to cut a lot of workflow related commendatory from
> the SubmitingPatches and then encourage to read such man page?
> 
That's right. Maybe Documentation/SubmittingPatches needs a revamp to
be one-time contributor friendly? Maybe introducing a "gist" for people
who do not have the time to read the whole document, might be of help?

> [1 
> ]https://public-inbox.org/git/CA+dzEB=cDvp7ZS8x+p+U-5NbK3SNd0FPyj_wP=gvi8mji6d...@mail.gmail.com/
> 
> 
> 
> > --
> > 2.14.0.rc1.434.g6eded367a
> > 

-- 
Kaartic


Re: [RFC] The correct and consistent alternative to quote a command ?

2017-08-02 Thread Kaartic Sivaraam
On Tue, 2017-08-01 at 10:38 -0700, Stefan Beller wrote:
> On Tue, Aug 1, 2017 at 8:59 AM, Kaartic Sivaraam wrote:
> > I assume you mean the consistency in quoting i.e., you're expecting the
> > patch to use (") instead of (') for quoting. Correct me, if I'm wrong.
> 
> Actually I did not imply any expectation on the outcome, because I
> do not know what the consistency end game looks like for this issue.
> 
> So maybe we'd want to go with the currently most used way?
On scanning through Documentation/SubmittingPatches, I saw that most of
the commands were quoted using double quotes and chose to use it for
this patch. I'm not sure if it's the right one to use.

Changing the subject of this mail; hoping some one who has thoughts
about this would dive-in.

> (Are there only three? (a) with dash, (b) with single quotes and
> (c) with double quotes?)

-- 
Kaartic


Re: reftable [v4]: new ref storage format

2017-08-02 Thread Dave Borowitz
On Tue, Aug 1, 2017 at 10:38 PM, Shawn Pearce  wrote:
>> Peff and I discussed off-list whether the lookup-by-SHA-1 feature is
>> so important in the first place. Currently, all references must be
>> scanned for the advertisement anyway,
>
> Not really. You can hide refs and allow-tip-sha1 so clients can fetch
> a ref even if it wasn't in the advertisement. We really want to use
> that wire protocol capability with Gerrit Code Review to hide the
> refs/changes/ namespace from the advertisement, but allow clients to
> fetch any of those refs if they send its current SHA-1 in a want line
> anyway.
>
> So a server could scan only the refs/{heads,tags}/ prefixes for the
> advertisement, and then leverage the lookup-by-SHA1 to verify other
> SHA-1s sent by the client.
>
>> so avoiding a second scan to vet
>> SHA-1s received from the client is at best going to reduce the effort
>> by a constant factor. Do you have numbers showing that this
>> optimization is worth it?
>
> No, but I don't think I need to do much to prove it. My 866k ref
> example advertisement right now is >62 MiB. If we do what I'm
> suggesting in the paragraphs above, the advertisement is ~51 KiB.

That being said, our bias towards minimizing the number of ref scans
is rooted in our experience where scanning 866k refs takes 5 seconds
to get the response from the storage backend into the git server.
Cutting ref scans from 2 to 1 (or 1 to 0) is a big deal in that case.
But that 5s number is based on our current, slow storage, not on
reftable. If migrating to reftable turns each 5s scan into a 400ms
scan, we might be able to live with that, even if we don't have fast
lookup by SHA-1.

>> OTOH a mythical protocol v2 might reduce the need to scan the
>> references for advertisement, so maybe this optimization will be more
>> helpful in the future?

I haven't been following the status of the proposal, but I was
assuming a client-speaks-first protocol would also imply the client
asking for refnames, not SHA-1s, in which case lookup by SHA-1 is no
longer relevant.


[PATCH 1/6] am: remember --rerere-autoupdate setting

2017-08-02 Thread Phillip Wood
From: Phillip Wood 

Save the rerere-autoupdate setting so that it is remembered after
stopping for the user to resolve conflicts.

Signed-off-by: Phillip Wood 
---

There are no new tests, but this code is exercised by the new rebase
tests in the next patch.

builtin/am.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/builtin/am.c b/builtin/am.c
index 
c973bd96dcb5d630d56e935733bfa4530ccd2872..6962d4db5ffceef3022b7f877a43c8833a839b31
 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -431,6 +431,14 @@ static void am_load(struct am_state *state)
read_state_file(, state, "utf8", 1);
state->utf8 = !strcmp(sb.buf, "t");
 
+   if (file_exists(am_path(state, "rerere-autoupdate"))) {
+   read_state_file(, state, "rerere-autoupdate", 1);
+   state->allow_rerere_autoupdate = strcmp(sb.buf, "t") ?
+   RERERE_NOAUTOUPDATE : RERERE_AUTOUPDATE;
+   } else {
+   state->allow_rerere_autoupdate = 0;
+   }
+
read_state_file(, state, "keep", 1);
if (!strcmp(sb.buf, "t"))
state->keep = KEEP_TRUE;
@@ -1003,6 +1011,10 @@ static void am_setup(struct am_state *state, enum 
patch_format patch_format,
write_state_bool(state, "sign", state->signoff);
write_state_bool(state, "utf8", state->utf8);
 
+   if (state->allow_rerere_autoupdate)
+   write_state_bool(state, "rerere-autoupdate",
+state->allow_rerere_autoupdate == RERERE_AUTOUPDATE);
+
switch (state->keep) {
case KEEP_FALSE:
str = "f";
-- 
2.13.3



[PATCH 3/6] rebase -i: honor --rerere-autoupdate

2017-08-02 Thread Phillip Wood
From: Phillip Wood 

Interactive rebase was ignoring '--rerere-autoupdate'. Fix this by
reading it appropriate file when restoring the sequencer state for an
interactive rebase and passing '--rerere-autoupdate' to merge and
cherry-pick when rebasing with '--preserve-merges'.

Reported-by: Junio C Hamano 
Signed-off-by: Phillip Wood 
---
 git-rebase--interactive.sh |  7 ---
 sequencer.c| 10 ++
 t/t3418-rebase-continue.sh |  3 +++
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 
90b1fbe9cf6e8dfb2f4331916809fa40bf9050d2..29b7e8824b53abeaa68780b95d5954f67f734098
 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -281,7 +281,7 @@ pick_one () {
 
test -d "$rewritten" &&
pick_one_preserving_merges "$@" && return
-   output eval git cherry-pick \
+   output eval git cherry-pick $allow_rerere_autoupdate \
${gpg_sign_opt:+$(git rev-parse --sq-quote 
"$gpg_sign_opt")} \
"$strategy_args" $empty_args $ff "$@"
 
@@ -393,7 +393,8 @@ pick_one_preserving_merges () {
merge_args="--no-log --no-ff"
if ! do_with_author output eval \
'git merge ${gpg_sign_opt:+"$gpg_sign_opt"} \
-   $merge_args $strategy_args -m "$msg_content" 
$new_parents'
+   $allow_rerere_autoupdate $merge_args \
+   $strategy_args -m "$msg_content" $new_parents'
then
printf "%s\n" "$msg_content" > 
"$GIT_DIR"/MERGE_MSG
die_with_patch $sha1 "$(eval_gettext "Error 
redoing merge \$sha1")"
@@ -401,7 +402,7 @@ pick_one_preserving_merges () {
echo "$sha1 $(git rev-parse HEAD^0)" >> 
"$rewritten_list"
;;
*)
-   output eval git cherry-pick \
+   output eval git cherry-pick $allow_rerere_autoupdate \
${gpg_sign_opt:+$(git rev-parse --sq-quote 
"$gpg_sign_opt")} \
"$strategy_args" "$@" ||
die_with_patch $sha1 "$(eval_gettext "Could not 
pick \$sha1")"
diff --git a/sequencer.c b/sequencer.c
index 
3010faf86398697469e903318a35421d911acb23..7dc0670d902291b8054072d32cc0c8979c13598c
 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -127,6 +127,7 @@ static GIT_PATH_FUNC(rebase_path_onto, "rebase-merge/onto")
 static GIT_PATH_FUNC(rebase_path_autostash, "rebase-merge/autostash")
 static GIT_PATH_FUNC(rebase_path_strategy, "rebase-merge/strategy")
 static GIT_PATH_FUNC(rebase_path_strategy_opts, "rebase-merge/strategy_opts")
+static GIT_PATH_FUNC(rebase_path_allow_rerere_autoupdate, 
"rebase-merge/allow_rerere_autoupdate")
 
 static inline int is_rebase_i(const struct replay_opts *opts)
 {
@@ -1479,6 +1480,15 @@ static int read_populate_opts(struct replay_opts *opts)
free(opts->gpg_sign);
opts->gpg_sign = xstrdup(buf.buf + 2);
}
+   strbuf_reset();
+   }
+
+   if (read_oneliner(, rebase_path_allow_rerere_autoupdate(), 
1)) {
+   if (!strcmp(buf.buf, "--rerere-autoupdate"))
+   opts->allow_rerere_auto = RERERE_AUTOUPDATE;
+   else if (!strcmp(buf.buf, "--no-rerere-autoupdate"))
+   opts->allow_rerere_auto = RERERE_NOAUTOUPDATE;
+   strbuf_reset();
}
 
if (file_exists(rebase_path_verbose()))
diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
index 
2b746f1559ad21a5ebf3bebd726c9a1b3d071c5a..fcfdd197bd352a9dca10233c2ba6d2aa4a66149e
 100755
--- a/t/t3418-rebase-continue.sh
+++ b/t/t3418-rebase-continue.sh
@@ -141,5 +141,8 @@ test_rerere_autoupdate () {
 
 test_rerere_autoupdate
 test_rerere_autoupdate -m
+GIT_SEQUENCE_EDITOR=: && export GIT_SEQUENCE_EDITOR
+test_rerere_autoupdate -i
+test_rerere_autoupdate --preserve-merges
 
 test_done
-- 
2.13.3



[PATCH 6/6] cherry-pick/revert: reject --rerere-autoupdate when continuing

2017-08-02 Thread Phillip Wood
From: Phillip Wood 

cherry-pick and revert should not accept --[no-]rerere-autoupdate once
they have started.

Signed-off-by: Phillip Wood 
---

This will break scripts that pass --[no-]rerere-autoupdate to 'git
cherry-pick --continue'. I don't think that this will be an issue for
the vast majority of users as I think most people will have assumed
that you cannot pass any other options with '--continue'.
'--rerere-autoupdate' is mentioned by 'git cherry-pick -h' but it is
not mentioned in the documentation. Hopefully a note in the release
notes should be enough to alert anyone who is affected by this.

builtin/revert.c  |  2 ++
 t/t3504-cherry-pick-rerere.sh | 13 +
 2 files changed, 15 insertions(+)

diff --git a/builtin/revert.c b/builtin/revert.c
index 
16028b9ea82edee9cf41044c69a47e8994d78fc6..b9d927eb09c9ed87c84681df1396f4e6d9b13c97
 100644
--- a/builtin/revert.c
+++ b/builtin/revert.c
@@ -155,6 +155,8 @@ static int run_sequencer(int argc, const char **argv, 
struct replay_opts *opts)
"--strategy-option", opts->xopts ? 1 : 0,
"-x", opts->record_origin,
"--ff", opts->allow_ff,
+   "--rerere-autoupdate", opts->allow_rerere_auto 
== RERERE_AUTOUPDATE,
+   "--no-rerere-autoupdate", 
opts->allow_rerere_auto == RERERE_NOAUTOUPDATE,
NULL);
}
 
diff --git a/t/t3504-cherry-pick-rerere.sh b/t/t3504-cherry-pick-rerere.sh
index 
af316cb40b7b16c95881eb8483eea4f6191c7cfa..a267b2d144df4a84f18ba4907b317e757ba98f16
 100755
--- a/t/t3504-cherry-pick-rerere.sh
+++ b/t/t3504-cherry-pick-rerere.sh
@@ -65,6 +65,19 @@ test_expect_success 'cherry-pick conflict with 
--no-rerere-autoupdate' '
git reset --hard bar-dev
 '
 
+test_expect_success 'cherry-pick --continue rejects --rerere-autoupdate' '
+   test_must_fail git cherry-pick --rerere-autoupdate foo..bar-master &&
+   test_cmp foo-expect foo &&
+   git diff-files --quiet &&
+   test_must_fail git cherry-pick --continue --rerere-autoupdate >actual 
2>&1 &&
+   echo "fatal: cherry-pick: --rerere-autoupdate cannot be used with 
--continue" >expect &&
+   test_i18ncmp expect actual &&
+   test_must_fail git cherry-pick --continue --no-rerere-autoupdate 
>actual 2>&1 &&
+   echo "fatal: cherry-pick: --no-rerere-autoupdate cannot be used with 
--continue" >expect &&
+   test_i18ncmp expect actual &&
+   git cherry-pick --abort
+'
+
 test_expect_success 'cherry-pick --rerere-autoupdate more than once' '
test_must_fail git cherry-pick --rerere-autoupdate --rerere-autoupdate 
foo..bar-master &&
test_cmp foo-expect foo &&
-- 
2.13.3



[PATCH 4/6] t3504: use test_commit

2017-08-02 Thread Phillip Wood
From: Phillip Wood 

Using test_commit is simpler than chaining echo && git add &&
test_tick && commit. Also having tags makes it clearer which commit
is being selecting by reset.

Signed-off-by: Phillip Wood 
---
 t/t3504-cherry-pick-rerere.sh | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/t/t3504-cherry-pick-rerere.sh b/t/t3504-cherry-pick-rerere.sh
index 
e6a64816efef0e53018c7a56784d1af62602e9d3..33f902b1b0d8eb651e0e6857f8f5b86ef633ef4a
 100755
--- a/t/t3504-cherry-pick-rerere.sh
+++ b/t/t3504-cherry-pick-rerere.sh
@@ -5,14 +5,11 @@ test_description='cherry-pick should rerere for conflicts'
 . ./test-lib.sh
 
 test_expect_success setup '
-   echo foo >foo &&
-   git add foo && test_tick && git commit -q -m 1 &&
-   echo foo-master >foo &&
-   git add foo && test_tick && git commit -q -m 2 &&
+   test_commit foo &&
+   test_commit foo-master foo &&
 
-   git checkout -b dev HEAD^ &&
-   echo foo-dev >foo &&
-   git add foo && test_tick && git commit -q -m 3 &&
+   git checkout -b dev foo &&
+   test_commit foo-dev foo &&
git config rerere.enabled true
 '
 
@@ -21,10 +18,10 @@ test_expect_success 'conflicting merge' '
 '
 
 test_expect_success 'fixup' '
-   echo foo-dev >foo &&
-   git add foo && test_tick && git commit -q -m 4 &&
-   git reset --hard HEAD^ &&
-   echo foo-dev >expect
+   echo foo-resolved >foo &&
+   git commit -am resolved &&
+   cp foo expect &&
+   git reset --hard HEAD^
 '
 
 test_expect_success 'cherry-pick conflict' '
-- 
2.13.3



[PATCH 5/6] cherry-pick/revert: remember --rerere-autoupdate

2017-08-02 Thread Phillip Wood
From: Phillip Wood 

When continuing after conflicts, cherry-pick forgot if the user had specified
'--rerere-autoupdate'.

Redo the cherry-pick rerere tests to check --rerere-autoupdate works
as expected.

Signed-off-by: Phillip Wood 
---
 sequencer.c   | 10 +++-
 t/t3504-cherry-pick-rerere.sh | 60 ++-
 2 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 
7dc0670d902291b8054072d32cc0c8979c13598c..e0e66b987b27072da4aea6304a565ab708be91e4
 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1439,7 +1439,11 @@ static int populate_opts_cb(const char *key, const char 
*value, void *data)
else if (!strcmp(key, "options.strategy-option")) {
ALLOC_GROW(opts->xopts, opts->xopts_nr + 1, opts->xopts_alloc);
opts->xopts[opts->xopts_nr++] = xstrdup(value);
-   } else
+   } else if (!strcmp(key, "options.allow-rerere-auto"))
+   opts->allow_rerere_auto =
+   git_config_bool_or_int(key, value, _flag) ?
+   RERERE_AUTOUPDATE : RERERE_NOAUTOUPDATE;
+   else
return error(_("invalid key: %s"), key);
 
if (!error_flag)
@@ -1752,6 +1756,10 @@ static int save_opts(struct replay_opts *opts)

"options.strategy-option",
opts->xopts[i], "^$", 
0);
}
+   if (opts->allow_rerere_auto)
+   res |= git_config_set_in_file_gently(opts_file, 
"options.allow-rerere-auto",
+opts->allow_rerere_auto == 
RERERE_AUTOUPDATE ?
+"true" : "false");
return res;
 }
 
diff --git a/t/t3504-cherry-pick-rerere.sh b/t/t3504-cherry-pick-rerere.sh
index 
33f902b1b0d8eb651e0e6857f8f5b86ef633ef4a..af316cb40b7b16c95881eb8483eea4f6191c7cfa
 100755
--- a/t/t3504-cherry-pick-rerere.sh
+++ b/t/t3504-cherry-pick-rerere.sh
@@ -7,9 +7,11 @@ test_description='cherry-pick should rerere for conflicts'
 test_expect_success setup '
test_commit foo &&
test_commit foo-master foo &&
+   test_commit bar-master bar &&
 
git checkout -b dev foo &&
test_commit foo-dev foo &&
+   test_commit bar-dev bar &&
git config rerere.enabled true
 '
 
@@ -19,22 +21,66 @@ test_expect_success 'conflicting merge' '
 
 test_expect_success 'fixup' '
echo foo-resolved >foo &&
+   echo bar-resolved >bar &&
git commit -am resolved &&
-   cp foo expect &&
+   cp foo foo-expect &&
+   cp bar bar-expect &&
git reset --hard HEAD^
 '
 
-test_expect_success 'cherry-pick conflict' '
-   test_must_fail git cherry-pick master &&
-   test_cmp expect foo
+test_expect_success 'cherry-pick conflict with --rerere-autoupdate' '
+   test_must_fail git cherry-pick --rerere-autoupdate foo..bar-master &&
+   test_cmp foo-expect foo &&
+   git diff-files --quiet &&
+   test_must_fail git cherry-pick --continue &&
+   test_cmp bar-expect bar &&
+   git diff-files --quiet &&
+   git cherry-pick --continue &&
+   git reset --hard bar-dev
 '
 
-test_expect_success 'reconfigure' '
-   git config rerere.enabled false &&
-   git reset --hard
+test_expect_success 'cherry-pick conflict repsects rerere.autoUpdate' '
+   test_config rerere.autoUpdate true &&
+   test_must_fail git cherry-pick foo..bar-master &&
+   test_cmp foo-expect foo &&
+   git diff-files --quiet &&
+   test_must_fail git cherry-pick --continue &&
+   test_cmp bar-expect bar &&
+   git diff-files --quiet &&
+   git cherry-pick --continue &&
+   git reset --hard bar-dev
+'
+
+test_expect_success 'cherry-pick conflict with --no-rerere-autoupdate' '
+   test_config rerere.autoUpdate true &&
+   test_must_fail git cherry-pick --no-rerere-autoupdate foo..bar-master &&
+   test_cmp foo-expect foo &&
+   test_must_fail git diff-files --quiet &&
+   git add foo &&
+   test_must_fail git cherry-pick --continue &&
+   test_cmp bar-expect bar &&
+   test_must_fail git diff-files --quiet &&
+   git add bar &&
+   git cherry-pick --continue &&
+   git reset --hard bar-dev
+'
+
+test_expect_success 'cherry-pick --rerere-autoupdate more than once' '
+   test_must_fail git cherry-pick --rerere-autoupdate --rerere-autoupdate 
foo..bar-master &&
+   test_cmp foo-expect foo &&
+   git diff-files --quiet &&
+   git cherry-pick --abort &&
+   test_must_fail git cherry-pick --rerere-autoupdate 
--no-rerere-autoupdate --rerere-autoupdate foo..bar-master &&
+   test_cmp foo-expect foo &&
+   git diff-files --quiet &&
+   git cherry-pick --abort &&
+   test_must_fail git cherry-pick --rerere-autoupdate 

[PATCH 0/6] am/cherry-pick/rebase/revert --rerere-autoupdate fixes

2017-08-02 Thread Phillip Wood
From: Phillip Wood 

With the exception of 'rebase -m' when continuing after stopping for
the user to resolve conflicts, they all forget the setting of
--rerere-autoupdate.

Phillip Wood (6):
  am: remember --rerere-autoupdate setting
  rebase: honor --rerere-autoupdate
  rebase -i: honor --rerere-autoupdate
  t3504: use test_commit
  cherry-pick/revert: remember --rerere-autoupdate
  cherry-pick/revert: reject --rerere-autoupdate when continuing

 builtin/am.c  | 12 ++
 builtin/revert.c  |  2 +
 git-rebase--am.sh |  3 +-
 git-rebase--interactive.sh|  7 ++--
 sequencer.c   | 20 +-
 t/t3418-rebase-continue.sh| 85 +++-
 t/t3504-cherry-pick-rerere.sh | 90 +++
 7 files changed, 170 insertions(+), 49 deletions(-)

-- 
2.13.3



[PATCH 2/6] rebase: honor --rerere-autoupdate

2017-08-02 Thread Phillip Wood
From: Phillip Wood 

Rebase accepts '--rerere-autoupdate' as an option but only honors it
if '-m' is also given. Fix it for a non-interactive rebase by passing
on the option to 'git am' and 'git cherry-pick'. Rework the tests so
that they can be used for each rebase flavor and extend them.

Reported-by: Junio C Hamano 
Signed-off-by: Phillip Wood 
---
 git-rebase--am.sh  |  3 +-
 t/t3418-rebase-continue.sh | 82 +++---
 2 files changed, 57 insertions(+), 28 deletions(-)

diff --git a/git-rebase--am.sh b/git-rebase--am.sh
index 
375239341fbfe885e51a25e9e0dc2d4fee791345..319933e70a34f9da4ec93d063eb102eff33b6787
 100644
--- a/git-rebase--am.sh
+++ b/git-rebase--am.sh
@@ -45,7 +45,7 @@ then
# itself well to recording empty patches.  fortunately, cherry-pick
# makes this easy
git cherry-pick ${gpg_sign_opt:+"$gpg_sign_opt"} --allow-empty \
-   --right-only "$revisions" \
+   $allow_rerere_autoupdate --right-only "$revisions" \
${restrict_revision+^$restrict_revision}
ret=$?
 else
@@ -82,6 +82,7 @@ else
fi
 
git am $git_am_opt --rebasing --resolvemsg="$resolvemsg" \
+   $allow_rerere_autoupdate \
${gpg_sign_opt:+"$gpg_sign_opt"} <"$GIT_DIR/rebased-patches"
ret=$?
 
diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
index 
4428b9086e8bcb383df801834d0de323f316f4fa..2b746f1559ad21a5ebf3bebd726c9a1b3d071c5a
 100755
--- a/t/t3418-rebase-continue.sh
+++ b/t/t3418-rebase-continue.sh
@@ -40,25 +40,6 @@ test_expect_success 'non-interactive rebase --continue works 
with touched file'
git rebase --continue
 '
 
-test_expect_success 'non-interactive rebase --continue with rerere enabled' '
-   test_config rerere.enabled true &&
-   test_when_finished "test_might_fail git rebase --abort" &&
-   git reset --hard commit-new-file-F2-on-topic-branch &&
-   git checkout master &&
-   rm -fr .git/rebase-* &&
-
-   test_must_fail git rebase --onto master master topic &&
-   echo "Resolved" >F2 &&
-   git add F2 &&
-   cp F2 F2.expected &&
-   git rebase --continue &&
-
-   git reset --hard commit-new-file-F2-on-topic-branch &&
-   git checkout master &&
-   test_must_fail git rebase --onto master master topic &&
-   test_cmp F2.expected F2
-'
-
 test_expect_success 'rebase --continue can not be used with other options' '
test_must_fail git rebase -v --continue &&
test_must_fail git rebase --continue -v
@@ -93,25 +74,72 @@ test_expect_success 'rebase --continue remembers merge 
strategy and options' '
test -f funny.was.run
 '
 
-test_expect_success 'rebase --continue remembers --rerere-autoupdate' '
+test_expect_success 'setup rerere database' '
rm -fr .git/rebase-* &&
git reset --hard commit-new-file-F3-on-topic-branch &&
git checkout master &&
test_commit "commit-new-file-F3" F3 3 &&
-   git config rerere.enabled true &&
+   test_config rerere.enabled true &&
test_must_fail git rebase -m master topic &&
echo "Resolved" >F2 &&
+   cp F2 expected-F2 &&
git add F2 &&
test_must_fail git rebase --continue &&
echo "Resolved" >F3 &&
+   cp F3 expected-F3 &&
git add F3 &&
git rebase --continue &&
-   git reset --hard topic@{1} &&
-   test_must_fail git rebase -m --rerere-autoupdate master &&
-   test "$(cat F2)" = "Resolved" &&
-   test_must_fail git rebase --continue &&
-   test "$(cat F3)" = "Resolved" &&
-   git rebase --continue
+   git reset --hard topic@{1}
 '
 
+prepare () {
+   rm -fr .git/rebase-* &&
+   git reset --hard commit-new-file-F3-on-topic-branch &&
+   git checkout master &&
+   test_config rerere.enabled true
+}
+
+test_rerere_autoupdate () {
+   action=$1 &&
+   test_expect_success "rebase $action --continue remembers 
--rerere-autoupdate" '
+   prepare &&
+   test_must_fail git rebase $action --rerere-autoupdate master 
topic &&
+   test_cmp expected-F2 F2 &&
+   git diff-files --quiet &&
+   test_must_fail git rebase --continue &&
+   test_cmp expected-F3 F3 &&
+   git diff-files --quiet &&
+   git rebase --continue
+   '
+
+   test_expect_success "rebase $action --continue honors 
rerere.autoUpdate" '
+   prepare &&
+   test_config rerere.autoupdate true &&
+   test_must_fail git rebase $action master topic &&
+   test_cmp expected-F2 F2 &&
+   git diff-files --quiet &&
+   test_must_fail git rebase --continue &&
+   test_cmp expected-F3 F3 &&
+   git diff-files --quiet &&
+   git rebase --continue
+   '

Re: reftable [v4]: new ref storage format

2017-08-02 Thread Jeff King
On Tue, Aug 01, 2017 at 07:38:37PM -0700, Shawn Pearce wrote:

> > OBJS blocks can also be
> > unbounded in size if very many references point at the same object,
> > thought that is perhaps only a theoretical problem.
> 
> Gah, I missed that in reftable. The block id pointer list could cause
> a single object id to exceed what fits in a block, and that will cause
> the writer to fail unless its caller sets the block size larger. I
> basically assumed this overflow condition is very unlikely, as its not
> common to have a huge number of refs pointing to the same object.

It's actually quite common for us, as we have big shared-object repos
that contain a copy of the refs of all of their child repos (for
reachability during packing, etc). So tags, where the value is the same
in each fork, you have one ref per fork pointing to it.

Just peeking at torvalds/linux, we have some objects with ~35K refs
pointing to them (e.g., the v2.6.11 tag).

> > Peff and I discussed off-list whether the lookup-by-SHA-1 feature is
> > so important in the first place. Currently, all references must be
> > scanned for the advertisement anyway,
> 
> Not really. You can hide refs and allow-tip-sha1 so clients can fetch
> a ref even if it wasn't in the advertisement. We really want to use
> that wire protocol capability with Gerrit Code Review to hide the
> refs/changes/ namespace from the advertisement, but allow clients to
> fetch any of those refs if they send its current SHA-1 in a want line
> anyway.
> 
> So a server could scan only the refs/{heads,tags}/ prefixes for the
> advertisement, and then leverage the lookup-by-SHA1 to verify other
> SHA-1s sent by the client.

Yeah, that makes sense (though I hope in the end that strategy will go
away in favor of a better protocol, as getting the sha1 out-of-band has
obvious UX complexities).

> > OTOH a mythical protocol v2 might reduce the need to scan the
> > references for advertisement, so maybe this optimization will be more
> > helpful in the future?
> 
> Yes, I'm hopeful we can get a v2 protocol built on the work Jonathan
> Tan is doing, and switch the advertisement around to "client speaks
> first", so that we can be smarter on the server about which refs are
> read and sent. That is a long way off, lets say 3-5 years before its
> really common in clients.

I was actually planning to spend some time on this in the next month or
two. I don't think it needs to be that complicated. We don't need a
whole protocol revamp. We just need a way to get a few bits from the
client before the advertisement, and from there we can bootstrap any
more radical protocol changes we want.

I know it will take a while before it's something we can expect in
clients, but it's definitely worth planning around. And sometimes a
feature like this can drive upgrades, if it's something that produces an
immediate and obvious benefit to the client.

> Servers today don't update HEAD reflog when a branch is pushed. I
> think trying to record that is overkill, you have the reflog data in
> the ref itself that the client sent the command to modify.

I think they do, at least for C git:

  $ git init --bare dst.git
  $ git -C dst.git config core.logallrefupdates
  $ git push dst.git
  ...
  To dst.git
   * [new branch]  master -> master
  $ find dst.git/logs -type f | xargs wc -l
1 dst.git/logs/refs/heads/master
1 dst.git/logs/HEAD

The special logic for "see if we're updating the ref that HEAD points
to" is deep in the ref machinery, so it gets triggered for all updates,
including pushes.

I agree it's not actually that interesting for a bare repo, where HEAD
isn't that meaningful (and doesn't tend to change a lot anyway).

> > That's what I was thinking. But I've yet to hear anybody complain
> > about missing reflogs for symrefs if the underlying reference is
> > updated directly, so maybe we should just leave that "problem"
> > unsolved. It is certainly simpler and less brittle not to have to keep
> > backreferences like these in sync with the forward references.
> 
> Yup, that is my take on it, and why I didn't try to put this into
> reftable drafts, even though it was discussed between us on the list
> in earlier messages.

Yeah, I'd agree. It might be worth doing a better job of showing the
before/after destinations in the reflog when updating a symbolic ref,
which would let you reconstruct the state from the pointed-to reflogs if
you cared to. But that's orthogonal to the storage format (you can do it
already if you bother to pass a good message to "symbolic-ref -m").

-Peff


Re: Git log --tags isn't working as expected

2017-08-02 Thread Jeff King
On Wed, Aug 02, 2017 at 08:41:41AM +0100, Richard Jones wrote:

> I’m trying to locate a commit which takes place after another one,
> matches a certain tag, and is by a specific user. I have the following
> command:
> 
> git log ..HEAD --decorate --author=“" --tags=“project-name”
> 
> The tag follows the format: project-name-version

I think you want --tags="project-name-*", as the pattern is a glob. If
you omit any wildcards, it does a prefix match, but only at "/"
boundaries. That's the implied "/*" from the docs:

  --tags[=]
Pretend as if all the refs in refs/tags are listed on the command
line as . If  is given, limit tags to ones matching
given shell glob. If pattern lacks ?, *, or [, /* at the end is implied.

> How ever this doesn’t seem to work, as it returns all the commits by
> that user since the commit ID supplied.

Your ref selector "--tags" didn't match anything. So we just showed
..HEAD.

However, I suspect "--tags" still doesn't quite do what you want. It
behaves as if each tag was given as a starting point for a traversal. So
we'd generally show ..HEAD in _addition_ to any tags we found
(and any commits reachable from those tags, down to ). But it
sounds like you want to just see tags. For that, I'd do one of:

  1. Ask git-tag to show the interesting bits of your tags, like:

   git tag --format='%(refname:short) %(*authorname) / %(*subject)'

 The "*" there is dereferencing the tag to show the underlying
 commit. See the FIELD NAMES section of git-for-each-ref for more
 details. You can further limit that to tags that contain a
 particular commit with "--contains ".

  2. Feed git-log some ref tips but ask it not to walk:

   git log --no-walk --tags='project-name-*'

 This is similar to (1), but uses the revision machinery, which can
 show more about each commit (e.g., the diff). But I don't think
 there's a good way to ask only for the tips that contain your
 particular commit (a negative endpoint like ".." doesn't
 play well with --no-walk, I think).

  3. Use --simplify-by-decoration to show a particular range of commits,
 but limit to ones that actually have a ref pointing at them. Like:

   git log ..HEAD --simplify-by-decoration

I think (3) matches what you're trying to do the best. You can't say
"just tags" for the decoration, though, so you'll see branch tips as
well. There's been some discussion about allowing specific decorations,
but nothing merged yet.

-Peff


Git log --tags isn't working as expected

2017-08-02 Thread Richard Jones
Hi,

I’m trying to locate a commit which takes place after another one,
matches a certain tag, and is by a specific user. I have the following
command:

git log ..HEAD --decorate --author=“" --tags=“project-name”

The tag follows the format: project-name-version

How ever this doesn’t seem to work, as it returns all the commits by
that user since the commit ID supplied.

There is also another tag, the release version which follows the
format: x.y.z. I have tried searching for this, using:

git log --tags=“x.y.z”

But it returns all the commits, not just a specific one. The tag I am
searching for is returned by: git tag

My understanding from the docs
(https://git-scm.com/docs/git-log#git-log---tagsltpatterngt) is that
git log should only return matches to the —tags argument pattern.

I have tried using git version 2.11.0 (Apple Git-81) and version 2.13.3.

Cheers,
Rich