Re: [PATCH 4/4] sha1dc: remove in favor of using sha1collisiondetection as a submodule

2017-12-04 Thread Jeff King
On Tue, Nov 28, 2017 at 09:32:14PM +, Ævar Arnfjörð Bjarmason wrote:

> As an aside: Now where was that .gitdiffsort or whatever it was called
> which would enable us to show sha1dc_git.h before nuking the contents
> of sha1dc? :)

diff.orderFile, I think.

> It also occurs to me that it would be useful for patch formatting in
> general to make all file removals go after file changes & additions.

I wonder if that might run afoul of programs which try to apply patches
in order. I think "git apply" is smart enough to apply all deletions
(regardless of order) before any creations (so it cannot be fooled by
adding "foo" followed by deleting "foo/bar"). So maybe it's fine.

-Peff


Re: [PATCH 3/4] Makefile: use the sha1collisiondetection submodule by default

2017-12-04 Thread Jeff King
On Tue, Nov 28, 2017 at 09:32:13PM +, Ævar Arnfjörð Bjarmason wrote:

> Change the build process so that instead of needing to supply
> DC_SHA1_SUBMODULE=YesPlease to use the sha1collisiondetection
> submodule instead of the copy of the same code shipped in the sha1dc
> directory, it uses the submodule by default unless
> NO_DC_SHA1_SUBMODULE=UnfortunatelyYes is supplied.
> 
> This reverses the logic added by me in 86cfd61e6b ("sha1dc: optionally
> use sha1collisiondetection as a submodule", 2017-07-01). Git has now
> shipped with the submodule in git.git for two major releases, if we're
> ever going to migrate to fully using it instead of perpetually
> maintaining both sha1collisiondetection and the sha1dc directory this
> is a logical first step.
> 
> This change removes the "auto" logic Junio added in
> cac87dc01d ("sha1collisiondetection: automatically enable when
> submodule is populated", 2017-07-01), I feel that automatically
> falling back to using sha1dc would defeat the point, which is to smoke
> out any remaining users of git.git who have issues cloning the
> submodule for whatever reason.

I'm not sure how I feel about this. I see your point that there's no
real value in maintaining two systems indefinitely.  At the same time, I
wonder how much value the submodule strategy is actually bringing us.

IOW, are we agreed that the path forward is to get everybody using the
submodule?

It seems like it's going to cause some minor pain for CI and packaging
systems that now need to care about submodules (so at least flipping the
switch, but maybe also dealing with having a network dependency for the
build that was not already there).

I'll admit I'm more sensitive to this than most people, since I happen
to maintain a fork of Git that I run through an internal CI system. And
that CI otherwise depends only on the locally-held fork, not any
external resources. But I'm probably in a fairly unique situation there.

-Peff


Re: [PATCH v2 1/2] git-prompt: make __git_eread intended use explicit

2017-12-04 Thread Robert Abel
Hi Junio,

On 05 Dec 2017 01:27, Junio C Hamano wrote:
> I know all of the above, but I think you misunderstood the point I
> wanted to raise, so let me try again.  The thing is, none of what
> you just wrote changes the fact that lack of callers that want to do
> "multi-line" is IRRELEVANT.

I disagree. The commit comment is meant to give context to the
introduced changes. One change is the  additional comment for
__git_eread, which now clearly states that only a single line is read.

I'm well aware that I'm not breaking reading multiple lines, because
that never worked in the first place. Thus, it was never the indented
use for __git_eread as I see it. I explicitly want to include that
information in my commit message to pay it forward to the next person
working on the prompt.

Regards,

Robert


Re: [PATCH 2/4] sha1dc_git.h: re-arrange an ifdef chain for a subsequent change

2017-12-04 Thread Jeff King
On Tue, Nov 28, 2017 at 09:32:12PM +, Ævar Arnfjörð Bjarmason wrote:

> A subsequent change will change the semantics of DC_SHA1_SUBMODULE in
> a way that would require moving these checks around, so start by
> moving them around without any functional changes.

OK. This flips the priority, but we're assuming that the Makefile
doesn't allow you to actually set both flags (at least not easily). And
I think that is the case, even after your loosening of the error for
"auto" in the previous patch, because the whole DC_SHA1_SUBMODULE
code-path is in the "else" block for DC_SHA1_EXTERNAL. Good.

-Peff


Re: [PATCH 1/4] Makefile: don't error out under DC_SHA1_EXTERNAL if DC_SHA1_SUBMODULE=auto

2017-12-04 Thread Jeff King
On Tue, Nov 28, 2017 at 09:32:11PM +, Ævar Arnfjörð Bjarmason wrote:

> Fix a logic error in the initial introduction of DC_SHA1_EXTERNAL. If
> git.git has a sha1collisiondetection submodule checked out the logic
> to set DC_SHA1_SUBMODULE=auto would interact badly with the check for
> whether DC_SHA1_SUBMODULE was set.
> 
> It would error out, meaning that there's no way to build git with
> DC_SHA1_EXTERNAL=YesPlease without deinit-ing the submodule.
> 
> Instead, adjust the logic to only fire if the variable is to something
> else than "auto" which would mean it's a mistake on the part of
> whoever's building git, not just the Makefile tripping over its own
> logic.

This all makes sense, and I agree your patch is an improvement.

One minor whitespace nit:

> diff --git a/Makefile b/Makefile
> index e53750ca01..8fe8278126 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1497,7 +1497,9 @@ else
>   LIB_OBJS += sha1dc_git.o
>  ifdef DC_SHA1_EXTERNAL
>   ifdef DC_SHA1_SUBMODULE
> +ifneq ($(DC_SHA1_SUBMODULE),auto)
>  $(error Only set DC_SHA1_EXTERNAL or DC_SHA1_SUBMODULE, not both)
> +endif
>   endif

The indentation here is funky. Unfortunately I think $(error) can't be
tab-indented, so it has to either stay at the left-most side, or we have
to use spaces.

But the ifneq/endif pair can be indented. Ordinarily I'd say it's fine
to keep it at the outermost (because of the weird error indent), but
note that we're _inside_ an already-indented ifdef/endif pair. We should
either stay inside there, or we should put the outer one to the left for
consistency.

-Peff


Re: [PATCH v2 2/2] Doc/check-ref-format: clarify information about @{-N} syntax

2017-12-04 Thread Kaartic Sivaraam

On Tuesday 05 December 2017 12:14 AM, Junio C Hamano wrote:

Kaartic Sivaraam  writes:


Stepping back a bit, the mild suspicion above says

 $ git checkout HEAD^0
 ... do things ...
 $ git checkout -b temp
 ... do more things ...
 $ git checkout -B @{-1}

that creates a new branch whose name is 40-hex of a commit that
happens to be where we started the whole dance *is* a bug.  No sane
user expects that to happen, and the last step "checkout -B @{-1}"
should result in an error instead [*1*].

I was wondering if "git check-ref-format --branch @{-1}", when used
in place of "checkout -B @{-1}" in the above sequence,


I guess you mean '... "git checkout -B $(git check-ref-format --branch
@{-1}", when used in place of "git checkout -B @{-1}" ...' ?


No you guessed wrong.  I was (and am) wondering if the last step in
the following sequence should fail.

 $ git checkout HEAD^0
 ... do things ...
 $ git checkout -b temp
 ... do more things ...
 $ git check-ref-format --branch @{-1}




Ok. Now I get what you say.



And I am leaning towards saying that it is a bug that it does not
fail; @{-1} is a detached HEAD and not a concrete branch name in
this case, 


It seems your thought is similar to the following thought that I 
expressed in [1],


-- 8< --


I thought this the other way round. Rather than letting the callers
error out when @{-N} didn't expand to a branch name, I thought we
should not be expanding @{-N} syntax for "check-ref-format --branch" at
all to make a "stronger guarantee" that the result is "always" a valid
branch name. Then I thought it might be too restrictive and didn't
mention it. So, I dunno.


-- >8 --




so "check-ref-format --branch" should at least notice
and say that it is a request that may lead to a nonsense next step
(which is to create a branch with that 40-hex name).



Makes sense, this should at least be noted in the Documentation. Is that 
what you had in mind too or do you expect 'check-ref-format' to do 
something else too?



[1]: https://public-inbox.org/git/1511880237.10193.5.ca...@gmail.com/

---
Kaartic


Re: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter

2017-12-04 Thread liam Beguin
Hi Johannes,

On 04/12/17 10:46 AM, Johannes Schindelin wrote:
> Hi Liam,
> 
> On Sun, 3 Dec 2017, Liam Beguin wrote:
> 
>> diff --git a/sequencer.h b/sequencer.h
>> index 4e444e3bf1c4..3bb6b0658192 100644
>> --- a/sequencer.h
>> +++ b/sequencer.h
>> @@ -45,10 +45,12 @@ int sequencer_continue(struct replay_opts *opts);
>>  int sequencer_rollback(struct replay_opts *opts);
>>  int sequencer_remove_state(struct replay_opts *opts);
>>  
>> -int sequencer_make_script(int keep_empty, FILE *out,
>> -int argc, const char **argv);
>> +#define TODO_LIST_KEEP_EMPTY (1U << 0)
>> +#define TODO_LIST_SHORTED_IDS (1U << 1)
> 
> Maybe SHORTEN_IDs? And either revert back to transform_todo_ids() or use
> SHORTEN_INSNS...
> 

I'll change it to TODO_LIST_SHORTED_IDs. TODO_LIST_SHORTED_INSNS would
suggest the flag changes both parts of the todo.

> Maybe also TRANSFORM_TODO_LIST_* and maybe move the #define's above the
> transform_todo_ids() function, i.e. one line further down?
> 
>> +int sequencer_make_script(FILE *out, int argc, const char **argv,
>> +  unsigned flags);
>>  
>> -int transform_todo_insn(int shorten_ids);
>> +int transform_todo_insn(unsigned flags);
>>  int check_todo_list(void);
>>  int skip_unnecessary_picks(void);
>>  int rearrange_squash(void);
> 
> Ciao,
> Johannes
> 

Thanks, 
Liam


Re: [PATCH v2 4/9] rebase -i: refactor transform_todo_ids

2017-12-04 Thread liam Beguin
Hi Johannes,

On 04/12/17 09:42 AM, Johannes Schindelin wrote:
> Hi Liam,
> 
> On Sun, 3 Dec 2017, Liam Beguin wrote:
> 
>> The transform_todo_ids function is a little hard to read. Lets try
>> to make it easier by using more of the strbuf API. Also, since we'll
>> soon be adding command abbreviations, let's rename the function so
>> it's name reflects that change.
> 
> I am not really a fan of the new name, and would prefer the old one, but
> that's only a nit, not a reason to reject the patch.
> 

You're right, it's probably not the best name. I'll change it to
transform_todos() as we want the function name to reflect that it changes
both parts of the todo.

> The rest of it makes the code reads a lot nicer than before. Thank you,
> Johannes
> 

Thanks,
Liam


Re: [PATCH v2 4/9] rebase -i: refactor transform_todo_ids

2017-12-04 Thread liam Beguin
Hi Junio,

On 04/12/17 11:09 AM, Junio C Hamano wrote:
> Johannes Schindelin  writes:
> 
>> On Sun, 3 Dec 2017, Liam Beguin wrote:
>>
>>> The transform_todo_ids function is a little hard to read. Lets try
>>> to make it easier by using more of the strbuf API. Also, since we'll
>>> soon be adding command abbreviations, let's rename the function so
>>> it's name reflects that change.
>>
>> I am not really a fan of the new name, and would prefer the old one, but
>> that's only a nit, not a reason to reject the patch.
> 
> FWIW, I do think the new name goes backwards.  The command uses to
> remember what operations are to be carried out in which order using
> a thing, and the name of that thing "todo list".  We also called it
> the "instruction sheet", and "insn" was a good term to call one item
> on that sheet among other items.
> 

Good point on saying this name change is going backwards.

> But recent commits in the area are pushing us to call it "todo list"
> consistently.  An element in that list should be called "todo".
> 
> A "todo" consists of two parts, "what operation is done" part and
> "using what commit object" part.  The original implementation of
> this function affected only the latter part, and in that light, the
> original name transform_todo_ids() is understandable.  Now you are
> planning to make it modify both parts, not just "ids", so it is
> understandable that you would want to rename it.  But I do not think
> "insn" matches the recent trend.  It even risks misunderstanding
> (i.e. xfrm_todo_ids() is about modifying "using what commit" part,
> so perhaps xfrm_todo_insns() is about modifying "what operation is
> done" part---but that is different from what you want to do, which
> is to update _both_ halves).
> 

You're right! We do want the name to reflect that we intend to change
both halves and not only the command.

> Calling it just transform_todo() would probably be more in line with
> the reason why you wanted to rename it in the first place.
> 

Good suggestion. Would transform_todos() work too? I'll send an update
tomorrow.
Thanks, 

Liam


[PATCH] Documentation/git-clone: improve description for submodule recursing

2017-12-04 Thread Stefan Beller
There have been a few complaints on the mailing list that git-clone doesn't
respect the `submodule.recurse` setting, which every other command (that
potentially knows how to deal with submodules) respects.  In case of clone
this is not beneficial to respect as the user may not want to obtain all
submodules (assuming a pathspec of '.').

Improve the documentation such that the pathspec is mentioned in the
synopsis to alleviate the confusion around the submodule recursion flag
in git-clone.

While at it clarify that the option can be given multiple times for complex\
pathspecs.

Signed-off-by: Stefan Beller 
---
 Documentation/git-clone.txt | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 83c8e9b394..42ca7b5095 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -14,7 +14,7 @@ SYNOPSIS
  [-o ] [-b ] [-u ] [--reference ]
  [--dissociate] [--separate-git-dir ]
  [--depth ] [--[no-]single-branch] [--no-tags]
- [--recurse-submodules] [--[no-]shallow-submodules]
+ [--recurse-submodules[=]] [--[no-]shallow-submodules]
  [--jobs ] [--]  []
 
 DESCRIPTION
@@ -231,14 +231,17 @@ branch of some repository for search indexing.
After the clone is created, initialize and clone submodules
within based on the provided pathspec.  If no pathspec is
provided, all submodules are initialized and cloned.
-   Submodules are initialized and cloned using their default
-   settings.  The resulting clone has `submodule.active` set to
+   This option can be given multiple times for pathspecs consisting
+   of multiple entries.  The resulting clone has `submodule.active` set to
the provided pathspec, or "." (meaning all submodules) if no
-   pathspec is provided.  This is equivalent to running
-   `git submodule update --init --recursive` immediately after
-   the clone is finished. This option is ignored if the cloned
-   repository does not have a worktree/checkout (i.e. if any of
-   `--no-checkout`/`-n`, `--bare`, or `--mirror` is given)
+   pathspec is provided.
++
+Submodules are initialized and cloned using their default settings. This is
+equivalent to running
+`git submodule update --init --recursive ` immediately after
+the clone is finished. This option is ignored if the cloned repository does
+not have a worktree/checkout (i.e. if any of `--no-checkout`/`-n`, `--bare`,
+or `--mirror` is given)
 
 --[no-]shallow-submodules::
All submodules which are cloned will be shallow with a depth of 1.
-- 
2.15.0.531.g2ccb3012c9-goog



Re: What's cooking in git.git (Dec 2017, #01; Mon, 4)

2017-12-04 Thread Junio C Hamano
Jonathan Tan  writes:

> On Mon, 04 Dec 2017 13:46:43 -0800
> Junio C Hamano  wrote:
>
>> * jt/diff-anchored-patience (2017-11-28) 1 commit
>>  - diff: support anchoring line(s)
>> 
>>  "git diff" learned a variant of the "--patience" algorithm, to
>>  which the user can specify which 'unique' line to be used as
>>  anchoring points.
>
> Is there anything I can do to progress this?

I had an impression that there weren't any major outstanding issues.
Such topics will move at their own pace and the only thing somebody
could do to expedite is to somehow block other topics in flight so
that we have to worry about smaller number of topics ;-)



Re: [PATCH v2 1/2] git-prompt: make __git_eread intended use explicit

2017-12-04 Thread Junio C Hamano
Robert Abel  writes:

> Hi Junio,
>
> On 04 Dec 2017 18:58, Junio C Hamano wrote:
>> Robert Abel  writes:
>>> __git_eread is used to read a single line of a given file (if it exists)
>>> into a variable without the EOL. All six current users of __git_eread
>>> use it that way and don't expect multi-line content.
>> 
>> Changing $@ to $2 does not change whether this is about "multi-line"
>> or not. 
>
> I'm aware of that. I was documenting current usage. The function is used
> to read file contents (which are expected to be a single line) into
> _a_ (i.e. single) variable.
>
> None of the current users of the function expect tokens to be split,
> which is why I removed it in preparation of patch 2/2, which would
> break tokenizing file contents.

I know all of the above, but I think you misunderstood the point I
wanted to raise, so let me try again.  The thing is, none of what
you just wrote changes the fact that lack of callers that want to do
"multi-line" is IRRELEVANT.  True, there is no caller that wants to
read multiple lines---it is a true statement, but it is irrelevant
statement.  On the other hand, it is true and relevant that no
caller expects to split a line into multiple variables.

By changing "$@" to "$2" there, you would have broken callers that
wanted the helper function to read into multiple variables (if there
were such callers).  Explaining the current usage that nobody does
so *IS* a valid justification for the change.  It is relevant.

With or without that change, a caller that wanted to read multiple
lines from the file would never have worked.  It was just doing a
single "read" built-in, so the only thing that would have been
worked on is the first line of the file.  Your change wouldn't have
changed that---if a caller wanted to peek into the second line, your
change wouldn't have helped such a caller.  And it is not like your
change would have broken such a caller that were happily reading the
second and subsequent line.  The original wouldn't allowed it to
read the second line anyway.

Contrasting this with the above obsesrvation about possible breakage
for multi-variable callers (if there were such callers---luckily
there wasn't any), I hope that you can see why the lack of
"multi-line" caller in the existing usage is totally irrelevant when
analyzing this change and explaining why this is a good change.

HTH.


Re: [PATCH] pathspec: only match across submodule boundaries when requested

2017-12-04 Thread Brandon Williams
On 11/29, Johannes Schindelin wrote:
> Hi Brandon,
> 
> On Tue, 28 Nov 2017, Brandon Williams wrote:
> 
> > Commit 74ed43711fd (grep: enable recurse-submodules to work on 
> > objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to
> > match across submodule boundaries in the presence of wildcards.  This is
> > done by performing literal matching up to the first wildcard and then
> > punting to the submodule itself to perform more accurate pattern
> > matching.  Instead of introducing a new flag to request this behavior,
> > commit 74ed43711fd overloaded the already existing 'recursive' flag in
> > 'struct pathspec' to request this behavior.
> > 
> > This leads to a bug where whenever any other caller has the 'recursive'
> > flag set as well as a pathspec with wildcards that all submodules will
> > be indicated as matches.  One simple example of this is:
> > 
> > git init repo
> > cd repo
> > 
> > git init submodule
> > git -C submodule commit -m initial --allow-empty
> > 
> > touch "[bracket]"
> > git add "[bracket]"
> > git commit -m bracket
> > git add submodule
> > git commit -m submodule
> > 
> > git rev-list HEAD -- "[bracket]"
> > 
> > Fix this by introducing the new flag 'recurse_submodules' in 'struct
> > pathspec' and using this flag to determine if matches should be allowed
> > to cross submodule boundaries.
> > 
> > Signed-off-by: Brandon Williams 
> 
> Could you also add something like
> 
>   This fixes https://github.com/git-for-windows/git/issues/1371
> 
> at the end of the commit message, to keep a reference to the original bug
> report?

Yep! I can do that.

> 
> >  4 files changed, 22 insertions(+), 2 deletions(-)
> 
> Phew. That was much smaller than I expected.
> 
> > +test_expect_success 'tree_entry_interesting does not match past submodule 
> > boundaries' '
> > +   test_when_finished "rm -rf repo submodule" &&
> > +   git init submodule &&
> > +   test_commit -C submodule initial &&
> > +   git init repo &&
> > +   >"repo/[bracket]" &&
> > +   git -C repo add "[bracket]" &&
> > +   git -C repo commit -m bracket &&
> > +   git -C repo rev-list HEAD -- "[bracket]" >expect &&
> > +
> > +   git -C repo submodule add ../submodule &&
> > +   git -C repo commit -m submodule &&
> > +
> > +   git -C repo rev-list HEAD -- "[bracket]" >actual &&
> > +   test_cmp expect actual
> > +'
> 
> Nicely prepared for a new hash function, too (no explicit SHA-1).
> 
> I wonder, however, why we can't `git checkout -b bracket` and
> `test_when_finished "git checkout master"` and void those many `-C repo`
> options. But then, it is actually one of the shorter test cases, and
> pretty easy to understand.
> 
> However, I would still like to see `test_tick`s before those `git commit`
> calls, to make the commit names reproducible.

In v2 I added the calls to test_tick.  I've never used the function
myself so hopefully I used it correctly! :)

> 
> Thanks,
> Dscho

-- 
Brandon Williams


[PATCH v2] pathspec: only match across submodule boundaries when requested

2017-12-04 Thread Brandon Williams
Commit 74ed43711fd (grep: enable recurse-submodules to work on 
objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to
match across submodule boundaries in the presence of wildcards.  This is
done by performing literal matching up to the first wildcard and then
punting to the submodule itself to perform more accurate pattern
matching.  Instead of introducing a new flag to request this behavior,
commit 74ed43711fd overloaded the already existing 'recursive' flag in
'struct pathspec' to request this behavior.

This leads to a bug where whenever any other caller has the 'recursive'
flag set as well as a pathspec with wildcards that all submodules will
be indicated as matches.  One simple example of this is:

git init repo
cd repo

git init submodule
git -C submodule commit -m initial --allow-empty

touch "[bracket]"
git add "[bracket]"
git commit -m bracket
git add submodule
git commit -m submodule

git rev-list HEAD -- "[bracket]"

Fix this by introducing the new flag 'recurse_submodules' in 'struct
pathspec' and using this flag to determine if matches should be allowed
to cross submodule boundaries.

This fixes https://github.com/git-for-windows/git/issues/1371.

Signed-off-by: Brandon Williams 
---
 builtin/grep.c|  1 +
 pathspec.h|  1 +
 t/t4208-log-magic-pathspec.sh | 19 +++
 tree-walk.c   |  5 +++--
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5a6cfe6b4..3ca4ac80d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1015,6 +1015,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
   prefix, argv + i);
pathspec.max_depth = opt.max_depth;
pathspec.recursive = 1;
+   pathspec.recurse_submodules = !!recurse_submodules;
 
 #ifndef NO_PTHREADS
if (list.nr || cached || show_in_pager)
diff --git a/pathspec.h b/pathspec.h
index 6420d1080..099a170c2 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -24,6 +24,7 @@ struct pathspec {
int nr;
unsigned int has_wildcard:1;
unsigned int recursive:1;
+   unsigned int recurse_submodules:1;
unsigned magic;
int max_depth;
struct pathspec_item {
diff --git a/t/t4208-log-magic-pathspec.sh b/t/t4208-log-magic-pathspec.sh
index 935df6a65..a1705f70c 100755
--- a/t/t4208-log-magic-pathspec.sh
+++ b/t/t4208-log-magic-pathspec.sh
@@ -93,4 +93,23 @@ test_expect_success 'command line pathspec parsing for "git 
log"' '
git log --merge -- a
 '
 
+test_expect_success 'tree_entry_interesting does not match past submodule 
boundaries' '
+   test_when_finished "rm -rf repo submodule" &&
+   git init submodule &&
+   test_commit -C submodule initial &&
+   git init repo &&
+   >"repo/[bracket]" &&
+   git -C repo add "[bracket]" &&
+   test_tick &&
+   git -C repo commit -m bracket &&
+   git -C repo rev-list HEAD -- "[bracket]" >expect &&
+
+   git -C repo submodule add ../submodule &&
+   test_tick &&
+   git -C repo commit -m submodule &&
+
+   git -C repo rev-list HEAD -- "[bracket]" >actual &&
+   test_cmp expect actual
+'
+
 test_done
diff --git a/tree-walk.c b/tree-walk.c
index 684f0e337..63a87ed66 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -1011,7 +1011,8 @@ static enum interesting do_match(const struct name_entry 
*entry,
 * character.  More accurate matching can then
 * be performed in the submodule itself.
 */
-   if (ps->recursive && S_ISGITLINK(entry->mode) &&
+   if (ps->recurse_submodules &&
+   S_ISGITLINK(entry->mode) &&
!ps_strncmp(item, match + baselen,
entry->path,
item->nowildcard_len - baselen))
@@ -1060,7 +1061,7 @@ static enum interesting do_match(const struct name_entry 
*entry,
 * character.  More accurate matching can then
 * be performed in the submodule itself.
 */
-   if (ps->recursive && S_ISGITLINK(entry->mode) &&
+   if (ps->recurse_submodules && S_ISGITLINK(entry->mode) &&
!ps_strncmp(item, match, base->buf + base_offset,
item->nowildcard_len)) {
strbuf_setlen(base, base_offset + baselen);
-- 
2.15.1.424.g9478a66081-goog



[WIP 05/15] upload-pack: factor out processing lines

2017-12-04 Thread Brandon Williams
Factor out the logic for processing shallow, deepen, deepen_since, and
deepen_not lines into their own functions to simplify the
'receive_needs()' function in addition to making it easier to reuse some
of this logic when implementing protocol_v2.

Signed-off-by: Brandon Williams 
---
 upload-pack.c | 113 ++
 1 file changed, 74 insertions(+), 39 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 2d16952a3..d2711e4ee 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -731,6 +731,75 @@ static void deepen_by_rev_list(int ac, const char **av,
packet_flush(1);
 }
 
+static int process_shallow(const char *line, struct object_array *shallows)
+{
+   const char *arg;
+   if (skip_prefix(line, "shallow ", )) {
+   struct object_id oid;
+   struct object *object;
+   if (get_oid_hex(arg, ))
+   die("invalid shallow line: %s", line);
+   object = parse_object();
+   if (!object)
+   return 1;
+   if (object->type != OBJ_COMMIT)
+   die("invalid shallow object %s", oid_to_hex());
+   if (!(object->flags & CLIENT_SHALLOW)) {
+   object->flags |= CLIENT_SHALLOW;
+   add_object_array(object, NULL, shallows);
+   }
+   return 1;
+   }
+
+   return 0;
+}
+
+static int process_deepen(const char *line, int *depth)
+{
+   const char *arg;
+   if (skip_prefix(line, "deepen ", )) {
+   char *end = NULL;
+   *depth = strtol(arg, , 0);
+   if (!end || *end || depth <= 0)
+   die("Invalid deepen: %s", line);
+   return 1;
+   }
+
+   return 0;
+}
+
+static int process_deepen_since(const char *line, timestamp_t *deepen_since, 
int *deepen_rev_list)
+{
+   const char *arg;
+   if (skip_prefix(line, "deepen-since ", )) {
+   char *end = NULL;
+   *deepen_since = parse_timestamp(arg, , 0);
+   if (!end || *end || !deepen_since ||
+   /* revisions.c's max_age -1 is special */
+   *deepen_since == -1)
+   die("Invalid deepen-since: %s", line);
+   *deepen_rev_list = 1;
+   return 1;
+   }
+   return 0;
+}
+
+static int process_deepen_not(const char *line, struct string_list 
*deepen_not, int *deepen_rev_list)
+{
+   const char *arg;
+   if (skip_prefix(line, "deepen-not ", )) {
+   char *ref = NULL;
+   struct object_id oid;
+   if (expand_ref(arg, strlen(arg), oid.hash, ) != 1)
+   die("git upload-pack: ambiguous deepen-not: %s", line);
+   string_list_append(deepen_not, ref);
+   free(ref);
+   *deepen_rev_list = 1;
+   return 1;
+   }
+   return 0;
+}
+
 static void receive_needs(void)
 {
struct object_array shallows = OBJECT_ARRAY_INIT;
@@ -752,49 +821,15 @@ static void receive_needs(void)
if (!line)
break;
 
-   if (skip_prefix(line, "shallow ", )) {
-   struct object_id oid;
-   struct object *object;
-   if (get_oid_hex(arg, ))
-   die("invalid shallow line: %s", line);
-   object = parse_object();
-   if (!object)
-   continue;
-   if (object->type != OBJ_COMMIT)
-   die("invalid shallow object %s", 
oid_to_hex());
-   if (!(object->flags & CLIENT_SHALLOW)) {
-   object->flags |= CLIENT_SHALLOW;
-   add_object_array(object, NULL, );
-   }
+   if (process_shallow(line, ))
continue;
-   }
-   if (skip_prefix(line, "deepen ", )) {
-   char *end = NULL;
-   depth = strtol(arg, , 0);
-   if (!end || *end || depth <= 0)
-   die("Invalid deepen: %s", line);
+   if (process_deepen(line, ))
continue;
-   }
-   if (skip_prefix(line, "deepen-since ", )) {
-   char *end = NULL;
-   deepen_since = parse_timestamp(arg, , 0);
-   if (!end || *end || !deepen_since ||
-   /* revisions.c's max_age -1 is special */
-   deepen_since == -1)
-   die("Invalid deepen-since: %s", line);
-   deepen_rev_list = 1;
+   if (process_deepen_since(line, _since, _rev_list))

[WIP 10/15] protocol: introduce enum protocol_version value protocol_v2

2017-12-04 Thread Brandon Williams
Introduce protocol_v2, a new value for 'enum protocol_version'.
Subsequent patches will fill in the implementation of protocol_v2.

Signed-off-by: Brandon Williams 
---
 builtin/fetch-pack.c   | 3 +++
 builtin/receive-pack.c | 3 +++
 builtin/send-pack.c| 3 +++
 connect.c  | 3 +++
 protocol.c | 2 ++
 protocol.h | 1 +
 remote-curl.c  | 3 +++
 transport.c| 9 +
 upload-pack.c  | 3 +++
 9 files changed, 30 insertions(+)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 4873e9572..061c278b4 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -199,6 +199,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
packet_reader_init(, fd[0], NULL, 0);
 
switch (discover_version()) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
case protocol_v0:
get_remote_heads(, , 0, NULL, );
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 839c1462d..4e141d521 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1965,6 +1965,9 @@ int cmd_receive_pack(int argc, const char **argv, const 
char *prefix)
unpack_limit = receive_unpack_limit;
 
switch (determine_protocol_version_server()) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
/*
 * v1 is just the original protocol with a version string,
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 9c2ca80c8..9441f1eed 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -261,6 +261,9 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
packet_reader_init(, fd[0], NULL, 0);
 
switch (discover_version()) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
case protocol_v0:
get_remote_heads(, _refs, REF_NORMAL,
diff --git a/connect.c b/connect.c
index 5f7cf05c7..433f08649 100644
--- a/connect.c
+++ b/connect.c
@@ -84,6 +84,9 @@ enum protocol_version discover_version(struct packet_reader 
*reader)
 
/* Maybe process capabilities here, at least for v2 */
switch (version) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
/* Read the peeked version line */
packet_reader_read(reader);
diff --git a/protocol.c b/protocol.c
index 43012b7eb..5e636785d 100644
--- a/protocol.c
+++ b/protocol.c
@@ -8,6 +8,8 @@ static enum protocol_version parse_protocol_version(const char 
*value)
return protocol_v0;
else if (!strcmp(value, "1"))
return protocol_v1;
+   else if (!strcmp(value, "2"))
+   return protocol_v2;
else
return protocol_unknown_version;
 }
diff --git a/protocol.h b/protocol.h
index 1b2bc94a8..2ad35e433 100644
--- a/protocol.h
+++ b/protocol.h
@@ -5,6 +5,7 @@ enum protocol_version {
protocol_unknown_version = -1,
protocol_v0 = 0,
protocol_v1 = 1,
+   protocol_v2 = 2,
 };
 
 /*
diff --git a/remote-curl.c b/remote-curl.c
index 74c6c3049..abb6e2ac1 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -183,6 +183,9 @@ static struct ref *parse_git_refs(struct discovery *heads, 
int for_push)
packet_reader_init(, -1, heads->buf, heads->len);
 
switch (discover_version()) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
case protocol_v0:
get_remote_heads(, , for_push ? REF_NORMAL : 0,
diff --git a/transport.c b/transport.c
index 4160c4167..8a3735cf5 100644
--- a/transport.c
+++ b/transport.c
@@ -201,6 +201,9 @@ static struct ref *get_refs_via_connect(struct transport 
*transport, int for_pus
 
data->version = discover_version();
switch (data->version) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
case protocol_v0:
get_remote_heads(, ,
@@ -248,6 +251,9 @@ static int fetch_refs_via_pack(struct transport *transport,
refs_tmp = get_refs_via_connect(transport, 0);
 
switch (data->version) {
+   case protocol_v2:
+   die("support for protocol v2 not implemented yet");
+   break;
case protocol_v1:
case protocol_v0:
refs = fetch_pack(, data->fd, data->conn,
@@ -584,6 +590,9 @@ static int git_transport_push(struct transport *transport, 
struct ref *remote_re
args.push_cert = 

[WIP 14/15] upload_pack: introduce fetch server command

2017-12-04 Thread Brandon Williams
Introduce the 'fetch' server command.

Signed-off-by: Brandon Williams 
---
 serve.c   |   2 +
 upload-pack.c | 223 ++
 upload-pack.h |   9 +++
 3 files changed, 234 insertions(+)
 create mode 100644 upload-pack.h

diff --git a/serve.c b/serve.c
index 36f77c365..7823ef0d9 100644
--- a/serve.c
+++ b/serve.c
@@ -6,6 +6,7 @@
 #include "argv-array.h"
 #include "ls-refs.h"
 #include "serve.h"
+#include "upload-pack.h"
 
 static int always_advertise(struct repository *r,
struct strbuf *value)
@@ -34,6 +35,7 @@ struct protocol_capability {
 static struct protocol_capability capabilities[] = {
{ "agent", 0, agent_advertise, NULL },
{ "ls-refs", 0, always_advertise, ls_refs },
+   { "fetch", 0, always_advertise, upload_pack_v2 },
 };
 
 static void advertise_capabilities(void)
diff --git a/upload-pack.c b/upload-pack.c
index 3296831f8..dc4421ab3 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -20,6 +20,7 @@
 #include "prio-queue.h"
 #include "protocol.h"
 #include "serve.h"
+#include "upload-pack.h"
 
 static const char * const upload_pack_usage[] = {
N_("git upload-pack [] "),
@@ -1041,6 +1042,228 @@ static void upload_pack(void)
}
 }
 
+static int process_want(const char *line)
+{
+   const char *arg;
+   if (skip_prefix(line, "want ", )) {
+   struct object_id oid;
+   struct object *o;
+
+   if (get_oid_hex(arg, ))
+   die("git upload-pack: protocol error, "
+   "expected to get oid, not '%s'", line);
+
+   o = parse_object();
+   if (!o) {
+   packet_write_fmt(1,
+"ERR upload-pack: not our ref %s",
+oid_to_hex());
+   die("git upload-pack: not our ref %s",
+   oid_to_hex());
+   }
+
+   if (!(o->flags & WANTED)) {
+   o->flags |= WANTED;
+   add_object_array(o, NULL, _obj);
+   }
+
+   return 1;
+   }
+
+   return 0;
+}
+
+static int parse_have(const char *line, struct oid_array *haves)
+{
+   const char *arg;
+   if (skip_prefix(line, "have ", )) {
+   struct object_id oid;
+
+   if (get_oid_hex(arg, ))
+   die("git upload-pack: expected SHA1 object, got '%s'", 
arg);
+   oid_array_append(haves, );
+   return 1;
+   }
+
+   return 0;
+}
+
+static void process_args(struct argv_array *args, struct oid_array *haves_oid)
+{
+   int i;
+
+   for (i = 0; i < args->argc; i++) {
+   const char *arg = args->argv[i];
+
+   /* process want */
+   if (process_want(arg))
+   continue;
+   /* process have line */
+   if (parse_have(arg, haves_oid))
+   continue;
+
+   /* process args like thin-pack */
+   if (!strcmp(arg, "thin-pack")) {
+   use_thin_pack = 1;
+   continue;
+   }
+   if (!strcmp(arg, "ofs-delta")) {
+   use_ofs_delta = 1;
+   continue;
+   }
+   if (!strcmp(arg, "no-progress")) {
+   no_progress = 1;
+   continue;
+   }
+   if (!strcmp(arg, "include-tag")) {
+   use_include_tag = 1;
+   continue;
+   }
+
+   /* ignore unknown lines maybe? */
+   die("unexpect line: '%s'", arg);
+   }
+}
+
+static int process_haves(struct oid_array *haves, struct oid_array *common)
+{
+   int i;
+
+   /* Process haves */
+   for (i = 0; i < haves->nr; i++) {
+   const struct object_id *oid = >oid[i];
+   struct object *o;
+   int we_knew_they_have = 0;
+
+   if (!has_object_file(oid))
+   continue;
+
+   oid_array_append(common, oid);
+
+   o = parse_object(oid);
+   if (!o)
+   die("oops (%s)", oid_to_hex(oid));
+   if (o->type == OBJ_COMMIT) {
+   struct commit_list *parents;
+   struct commit *commit = (struct commit *)o;
+   if (o->flags & THEY_HAVE)
+   we_knew_they_have = 1;
+   else
+   o->flags |= THEY_HAVE;
+   if (!oldest_have || (commit->date < oldest_have))
+   oldest_have = commit->date;
+   for (parents = commit->parents;
+parents;
+parents = 

[WIP 12/15] ls-refs: introduce ls-refs server command

2017-12-04 Thread Brandon Williams
Introduce the ls-refs server command.  In protocol v2, the ls-refs
command is used to request the ref advertisement from the server.  Since
it is a command which can be requested (as opposed to manditory in v1),
a clinet can sent a number of parameters in its request to limit the ref
advertisement based on provided ref-patterns.

Signed-off-by: Brandon Williams 
---
 Makefile  |  1 +
 ls-refs.c | 96 +++
 ls-refs.h |  9 ++
 serve.c   |  8 ++
 4 files changed, 114 insertions(+)
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h

diff --git a/Makefile b/Makefile
index 710672cf4..be3c2f98b 100644
--- a/Makefile
+++ b/Makefile
@@ -807,6 +807,7 @@ LIB_OBJS += list-objects.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
 LIB_OBJS += log-tree.o
+LIB_OBJS += ls-refs.o
 LIB_OBJS += mailinfo.o
 LIB_OBJS += mailmap.o
 LIB_OBJS += match-trees.o
diff --git a/ls-refs.c b/ls-refs.c
new file mode 100644
index 0..591dd105d
--- /dev/null
+++ b/ls-refs.c
@@ -0,0 +1,96 @@
+#include "cache.h"
+#include "repository.h"
+#include "refs.h"
+#include "remote.h"
+#include "argv-array.h"
+#include "ls-refs.h"
+#include "pkt-line.h"
+
+struct ls_refs_data {
+   unsigned peel;
+   unsigned symrefs;
+   struct argv_array patterns;
+};
+
+/*
+ * Is there one among the list of patterns that match the tail part
+ * of the path?
+ */
+static int tail_match(const char **pattern, const char *path)
+{
+   const char *p;
+   char *pathbuf;
+
+   if (!pattern)
+   return 1; /* no restriction */
+
+   pathbuf = xstrfmt("/%s", path);
+   while ((p = *(pattern++)) != NULL) {
+   if (!wildmatch(p, pathbuf, 0)) {
+   free(pathbuf);
+   return 1;
+   }
+   }
+   free(pathbuf);
+   return 0;
+}
+
+static int send_ref(const char *refname, const struct object_id *oid,
+   int flag, void *cb_data)
+{
+   struct ls_refs_data *data = cb_data;
+   const char *refname_nons = strip_namespace(refname);
+   struct strbuf refline = STRBUF_INIT;
+
+   if (data->patterns.argc && !tail_match(data->patterns.argv, refname))
+   return 0;
+
+   strbuf_addf(, "%s %s", oid_to_hex(oid), refname_nons);
+   if (data->symrefs && flag & REF_ISSYMREF) {
+   struct object_id unused;
+   const char *symref_target = resolve_ref_unsafe(refname, 0,
+  unused.hash,
+  );
+
+   if (!symref_target)
+   die("'%s' is a symref but it is not?", refname);
+
+   strbuf_addf(, " %s", symref_target);
+   }
+
+   strbuf_addch(, '\n');
+
+   packet_write(1, refline.buf, refline.len);
+   if (data->peel) {
+   struct object_id peeled;
+   if (!peel_ref(refname, peeled.hash))
+   packet_write_fmt(1, "%s %s^{}\n", oid_to_hex(),
+refname_nons);
+   }
+
+   strbuf_release();
+   return 0;
+}
+
+int ls_refs(struct repository *r, struct argv_array *keys, struct argv_array 
*args)
+{
+   int i;
+   struct ls_refs_data data = { 0, 0, ARGV_ARRAY_INIT };
+
+   for (i = 0; i < args->argc; i++) {
+   if (!strcmp("--peeled", args->argv[i]))
+   data.peel = 1;
+   else if (!strcmp("--symrefs", args->argv[i]))
+   data.symrefs = 1;
+   else
+   /* Pattern */
+   argv_array_pushf(, "*/%s", args->argv[i]);
+
+   }
+
+   head_ref_namespaced(send_ref, );
+   for_each_namespaced_ref(send_ref, );
+   packet_flush(1);
+   argv_array_clear();
+   return 0;
+}
diff --git a/ls-refs.h b/ls-refs.h
new file mode 100644
index 0..9e4c57bfe
--- /dev/null
+++ b/ls-refs.h
@@ -0,0 +1,9 @@
+#ifndef LS_REFS_H
+#define LS_REFS_H
+
+struct repository;
+struct argv_array;
+extern int ls_refs(struct repository *r, struct argv_array *keys,
+  struct argv_array *args);
+
+#endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index 476e73b54..36f77c365 100644
--- a/serve.c
+++ b/serve.c
@@ -4,8 +4,15 @@
 #include "pkt-line.h"
 #include "version.h"
 #include "argv-array.h"
+#include "ls-refs.h"
 #include "serve.h"
 
+static int always_advertise(struct repository *r,
+   struct strbuf *value)
+{
+   return 1;
+}
+
 static int agent_advertise(struct repository *r,
   struct strbuf *value)
 {
@@ -26,6 +33,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
{ "agent", 0, agent_advertise, NULL },
+   { "ls-refs", 0, always_advertise, ls_refs },
 };
 
 static void advertise_capabilities(void)
-- 

[WIP 15/15] fetch-pack: perform a fetch using v2

2017-12-04 Thread Brandon Williams
When communicating with a v2 server, perform a fetch by requesting the
'fetch' command.

Signed-off-by: Brandon Williams 
---
 builtin/fetch-pack.c   |   2 +-
 fetch-pack.c   | 237 -
 fetch-pack.h   |   4 +-
 t/t5701-protocol-v2.sh |  26 ++
 transport.c|   8 +-
 5 files changed, 270 insertions(+), 7 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 061c278b4..e02c7761b 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -211,7 +211,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
}
 
ref = fetch_pack(, fd, conn, ref, dest, sought, nr_sought,
-, pack_lockfile_ptr);
+, pack_lockfile_ptr, protocol_v0);
if (pack_lockfile) {
printf("lock %s\n", pack_lockfile);
fflush(stdout);
diff --git a/fetch-pack.c b/fetch-pack.c
index 105506e9a..a8ef8f3e5 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1007,6 +1007,232 @@ static struct ref *do_fetch_pack(struct fetch_pack_args 
*args,
return ref;
 }
 
+static void add_wants(const struct ref *wants, struct strbuf *req_buf)
+{
+   for ( ; wants ; wants = wants->next) {
+   const struct object_id *remote = >old_oid;
+   const char *remote_hex;
+   struct object *o;
+
+   /*
+* If that object is complete (i.e. it is an ancestor of a
+* local ref), we tell them we have it but do not have to
+* tell them about its ancestors, which they already know
+* about.
+*
+* We use lookup_object here because we are only
+* interested in the case we *know* the object is
+* reachable and we have already scanned it.
+*/
+   if (((o = lookup_object(remote->hash)) != NULL) &&
+   (o->flags & COMPLETE)) {
+   continue;
+   }
+
+   remote_hex = oid_to_hex(remote);
+   packet_buf_write(req_buf, "want %s\n", remote_hex);
+   }
+}
+
+/* */
+static int add_haves(struct strbuf *req_buf)
+{
+   int count = 0;
+   const struct object_id *oid;
+   while ((oid = get_rev())) {
+   packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
+   if (++count >= INITIAL_FLUSH)
+   break;
+   };
+
+   return count;
+}
+
+static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
+ const struct ref *wants)
+{
+   int ret = 0;
+   struct strbuf req_buf = STRBUF_INIT;
+
+   use_sideband = 2;
+   packet_buf_write(_buf, "command=fetch");
+   packet_buf_write(_buf, "agent=%s", git_user_agent_sanitized());
+   packet_buf_delim(_buf);
+   if (args->use_thin_pack)
+   packet_buf_write(_buf, "thin-pack");
+   if (args->no_progress)
+   packet_buf_write(_buf, "no-progress");
+   if (args->include_tag)
+   packet_buf_write(_buf, "include-tag");
+   if (prefer_ofs_delta)
+   packet_buf_write(_buf, "ofs-delta");
+
+   /* add wants */
+   add_wants(wants, _buf);
+
+   /* Add initial haves */
+   ret = add_haves(_buf);
+
+   /* Send request */
+   packet_buf_flush(_buf);
+   write_or_die(fd_out, req_buf.buf, req_buf.len);
+
+   strbuf_release(_buf);
+   return ret;
+}
+
+static enum ack_type process_ack(const char *line, struct object_id *oid)
+{
+   const char *arg;
+
+   if (!strcmp(line, "NAK"))
+   return NAK;
+   if (skip_prefix(line, "ACK ", )) {
+   if (!parse_oid_hex(arg, oid, )) {
+   if (strstr(arg, "continue"))
+   return ACK_continue;
+   if (strstr(arg, "common"))
+   return ACK_common;
+   if (strstr(arg, "ready"))
+   return ACK_ready;
+   return ACK;
+   }
+   }
+   if (skip_prefix(line, "ERR ", ))
+   die(_("remote error: %s"), arg);
+   die(_("git fetch-pack: expected ACK/NAK, got '%s'"), line);
+}
+
+static int process_acks(struct packet_reader *reader)
+{
+   int got_ready = 0;
+   int got_common = 0;
+   while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+   struct object_id oid;
+   struct commit *commit;
+   enum ack_type ack = process_ack(reader->line, );
+
+   switch (ack) {
+   case ACK_ready:
+   clear_prio_queue(_list);
+   got_ready = 1;
+   /* fallthrough */
+   case ACK_common:
+   commit = lookup_commit();
+

[WIP 07/15] connect: convert get_remote_heads to use struct packet_reader

2017-12-04 Thread Brandon Williams
In order to allow for better control flow when protocol_v2 is introduced
convert 'get_remote_heads()' to use 'struct packet_reader' to read
packet lines.  This enables a client to be able to peek the first line
of a server's response (without consuming it) in order to determine the
protocol version its speaking and then passing control to the
appropriate handler.

This is needed because the initial response from a server speaking
protocol_v0 includes the first ref, while subsequent protocol versions
respond with a version line.  We want to be able to read this first line
without consuming the first ref sent in the protocol_v0 case so that the
protocol version the server is speaking can be determined outside of
'get_remote_heads()' in a future patch.

Signed-off-by: Brandon Williams 
---
 connect.c | 125 +++---
 1 file changed, 70 insertions(+), 55 deletions(-)

diff --git a/connect.c b/connect.c
index 7fbd396b3..f79ea9179 100644
--- a/connect.c
+++ b/connect.c
@@ -48,6 +48,12 @@ int check_ref_type(const struct ref *ref, int flags)
 
 static void die_initial_contact(int unexpected)
 {
+   /*
+* A hang-up after seeing some response from the other end
+* means that it is unexpected, as we know the other end is
+* willing to talk to us.  A hang-up before seeing any
+* response does not necessarily mean an ACL problem, though.
+*/
if (unexpected)
die(_("The remote end hung up upon initial contact"));
else
@@ -56,6 +62,41 @@ static void die_initial_contact(int unexpected)
  "and the repository exists."));
 }
 
+static enum protocol_version discover_version(struct packet_reader *reader)
+{
+   enum protocol_version version = protocol_unknown_version;
+
+   /*
+* Peek the first line of the server's response to
+* determine the protocol version the server is speaking.
+*/
+   switch (packet_reader_peek(reader)) {
+   case PACKET_READ_ERROR:
+   die_initial_contact(0);
+   case PACKET_READ_FLUSH:
+   case PACKET_READ_DELIM:
+   version = protocol_v0;
+   break;
+   case PACKET_READ_NORMAL:
+   version = determine_protocol_version_client(reader->line);
+   break;
+   }
+
+   /* Maybe process capabilities here, at least for v2 */
+   switch (version) {
+   case protocol_v1:
+   /* Read the peeked version line */
+   packet_reader_read(reader);
+   break;
+   case protocol_v0:
+   break;
+   case protocol_unknown_version:
+   BUG("ERROR");
+   }
+
+   return version;
+}
+
 static void parse_one_symref_info(struct string_list *symref, const char *val, 
int len)
 {
char *sym, *target;
@@ -109,44 +150,10 @@ static void annotate_refs_with_symref_info(struct ref 
*ref)
string_list_clear(, 0);
 }
 
-/*
- * Read one line of a server's ref advertisement into packet_buffer.
- */
-static int read_remote_ref(int in, char **src_buf, size_t *src_len,
-  int *responded)
-{
-   int len = packet_read(in, src_buf, src_len,
- packet_buffer, sizeof(packet_buffer),
- PACKET_READ_GENTLE_ON_EOF |
- PACKET_READ_CHOMP_NEWLINE);
-   const char *arg;
-   if (len < 0)
-   die_initial_contact(*responded);
-   if (len > 4 && skip_prefix(packet_buffer, "ERR ", ))
-   die("remote error: %s", arg);
-
-   *responded = 1;
-
-   return len;
-}
-
-#define EXPECTING_PROTOCOL_VERSION 0
-#define EXPECTING_FIRST_REF 1
-#define EXPECTING_REF 2
-#define EXPECTING_SHALLOW 3
-
-/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
-static int process_protocol_version(void)
-{
-   switch (determine_protocol_version_client(packet_buffer)) {
-   case protocol_v1:
-   return 1;
-   case protocol_v0:
-   return 0;
-   default:
-   die("server is speaking an unknown protocol");
-   }
-}
+#define EXPECTING_FIRST_REF 0
+#define EXPECTING_REF 1
+#define EXPECTING_SHALLOW 2
+#define EXPECTING_DONE 3
 
 static void process_capabilities(int *len)
 {
@@ -230,28 +237,34 @@ struct ref **get_remote_heads(int in, char *src_buf, 
size_t src_len,
  struct oid_array *shallow_points)
 {
struct ref **orig_list = list;
+   int len = 0;
+   int state = EXPECTING_FIRST_REF;
+   struct packet_reader reader;
+   const char *arg;
 
-   /*
-* A hang-up after seeing some response from the other end
-* means that it is unexpected, as we know the other end is
-* willing to talk to us.  A hang-up before seeing any
-* response does not necessarily mean an ACL problem, though.
-*/
- 

[WIP 08/15] connect: discover protocol version outside of get_remote_heads

2017-12-04 Thread Brandon Williams
In order to prepare for the addition of protocol_v2 push the protocol
version discovery outside of 'get_remote_heads()'.  This will allow for
keeping the logic for processing the reference advertisement for
protocol_v1 and protocol_v0 separate from the logic for protocol_v2.

Signed-off-by: Brandon Williams 
---
 builtin/fetch-pack.c | 14 +-
 builtin/send-pack.c  | 15 +--
 connect.c| 13 -
 connect.h|  3 +++
 remote-curl.c| 18 --
 remote.h |  5 +++--
 transport.c  | 22 +-
 7 files changed, 69 insertions(+), 21 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d13f..4873e9572 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -4,6 +4,7 @@
 #include "remote.h"
 #include "connect.h"
 #include "sha1-array.h"
+#include "protocol.h"
 
 static const char fetch_pack_usage[] =
 "git fetch-pack [--all] [--stdin] [--quiet | -q] [--keep | -k] [--thin] "
@@ -52,6 +53,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
struct fetch_pack_args args;
struct oid_array shallow = OID_ARRAY_INIT;
struct string_list deepen_not = STRING_LIST_INIT_DUP;
+   struct packet_reader reader;
 
packet_trace_identity("fetch-pack");
 
@@ -193,7 +195,17 @@ int cmd_fetch_pack(int argc, const char **argv, const char 
*prefix)
if (!conn)
return args.diag_url ? 0 : 1;
}
-   get_remote_heads(fd[0], NULL, 0, , 0, NULL, );
+
+   packet_reader_init(, fd[0], NULL, 0);
+
+   switch (discover_version()) {
+   case protocol_v1:
+   case protocol_v0:
+   get_remote_heads(, , 0, NULL, );
+   break;
+   case protocol_unknown_version:
+   BUG("unknown protocol version");
+   }
 
ref = fetch_pack(, fd, conn, ref, dest, sought, nr_sought,
 , pack_lockfile_ptr);
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index fc4f0bb5f..9c2ca80c8 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -14,6 +14,7 @@
 #include "sha1-array.h"
 #include "gpg-interface.h"
 #include "gettext.h"
+#include "protocol.h"
 
 static const char * const send_pack_usage[] = {
N_("git send-pack [--all | --mirror] [--dry-run] [--force] "
@@ -154,6 +155,7 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
int progress = -1;
int from_stdin = 0;
struct push_cas_option cas = {0};
+   struct packet_reader reader;
 
struct option options[] = {
OPT__VERBOSITY(),
@@ -256,8 +258,17 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
args.verbose ? CONNECT_VERBOSE : 0);
}
 
-   get_remote_heads(fd[0], NULL, 0, _refs, REF_NORMAL,
-_have, );
+   packet_reader_init(, fd[0], NULL, 0);
+
+   switch (discover_version()) {
+   case protocol_v1:
+   case protocol_v0:
+   get_remote_heads(, _refs, REF_NORMAL,
+_have, );
+   break;
+   case protocol_unknown_version:
+   BUG("unknown protocol version");
+   }
 
transport_verify_remote_names(nr_refspecs, refspecs);
 
diff --git a/connect.c b/connect.c
index f79ea9179..5f7cf05c7 100644
--- a/connect.c
+++ b/connect.c
@@ -62,7 +62,7 @@ static void die_initial_contact(int unexpected)
  "and the repository exists."));
 }
 
-static enum protocol_version discover_version(struct packet_reader *reader)
+enum protocol_version discover_version(struct packet_reader *reader)
 {
enum protocol_version version = protocol_unknown_version;
 
@@ -231,7 +231,7 @@ static int process_shallow(int len, struct oid_array 
*shallow_points)
 /*
  * Read all the refs from the other end
  */
-struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+struct ref **get_remote_heads(struct packet_reader *reader,
  struct ref **list, unsigned int flags,
  struct oid_array *extra_have,
  struct oid_array *shallow_points)
@@ -239,21 +239,16 @@ struct ref **get_remote_heads(int in, char *src_buf, 
size_t src_len,
struct ref **orig_list = list;
int len = 0;
int state = EXPECTING_FIRST_REF;
-   struct packet_reader reader;
const char *arg;
 
-   packet_reader_init(, in, src_buf, src_len);
-
-   discover_version();
-
*list = NULL;
 
while (state != EXPECTING_DONE) {
-   switch (packet_reader_read()) {
+   switch (packet_reader_read(reader)) {
case PACKET_READ_ERROR:
die_initial_contact(1);
case PACKET_READ_NORMAL:
-   len = reader.pktlen;
+   len = 

[WIP 03/15] pkt-line: add delim packet support

2017-12-04 Thread Brandon Williams
One of the design goals of protocol-v2 is to improve the semantics of
flush packets.  Currently in protocol-v1, flush packets are used both to
indicate a break in a list of packet lines as well as an indication that
one side has finished speaking.  This makes it particularly difficult
to implement proxies as a proxy would need to completely understand git
protocol instead of simply looking for a flush packet.

To do this, introduce the special deliminator packet '0001'.  A delim
packet can then be used as a deliminator between lists of packet lines
while flush packets can be reserved to indicate the end of a response.

Signed-off-by: Brandon Williams 
---
 pkt-line.c | 19 ++-
 pkt-line.h |  3 +++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/pkt-line.c b/pkt-line.c
index 518109bbe..222e1e310 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -91,6 +91,12 @@ void packet_flush(int fd)
write_or_die(fd, "", 4);
 }
 
+void packet_delim(int fd)
+{
+   packet_trace("0001", 4, 1);
+   write_or_die(fd, "0001", 4);
+}
+
 int packet_flush_gently(int fd)
 {
packet_trace("", 4, 1);
@@ -105,6 +111,12 @@ void packet_buf_flush(struct strbuf *buf)
strbuf_add(buf, "", 4);
 }
 
+void packet_buf_delim(struct strbuf *buf)
+{
+   packet_trace("0001", 4, 1);
+   strbuf_add(buf, "0001", 4);
+}
+
 static void set_packet_header(char *buf, const int size)
 {
static char hexchar[] = "0123456789abcdef";
@@ -297,7 +309,10 @@ enum packet_read_status packet_read_with_status(int fd, 
char **src_buffer, size_
if (len == 0) {
packet_trace("", 4, 0);
return PACKET_READ_FLUSH;
-   } else if (len >= 1 && len <= 3) {
+   } else if (len == 1) {
+   packet_trace("0001", 4, 0);
+   return PACKET_READ_DELIM;
+   } else if (len >= 2 && len <= 3) {
die("protocol error: bad line length character: %.4s", linelen);
}
 
@@ -333,6 +348,7 @@ int packet_read(int fd, char **src_buffer, size_t *src_len,
break;
case PACKET_READ_NORMAL:
break;
+   case PACKET_READ_DELIM:
case PACKET_READ_FLUSH:
pktlen = 0;
break;
@@ -447,6 +463,7 @@ enum packet_read_status packet_reader_read(struct 
packet_reader *reader)
case PACKET_READ_NORMAL:
reader->line = reader->buffer;
break;
+   case PACKET_READ_DELIM:
case PACKET_READ_FLUSH:
reader->pktlen = 0;
reader->line = NULL;
diff --git a/pkt-line.h b/pkt-line.h
index 2b5c7cf11..49ec80c80 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -20,8 +20,10 @@
  * side can't, we stay with pure read/write interfaces.
  */
 void packet_flush(int fd);
+void packet_delim(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format 
(printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) 
__attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
@@ -64,6 +66,7 @@ enum packet_read_status {
PACKET_READ_ERROR = -1,
PACKET_READ_NORMAL,
PACKET_READ_FLUSH,
+   PACKET_READ_DELIM,
 };
 #define PACKET_READ_GENTLE_ON_EOF (1u<<0)
 #define PACKET_READ_CHOMP_NEWLINE (1u<<1)
-- 
2.15.1.424.g9478a66081-goog



[WIP 01/15] pkt-line: introduce packet_read_with_status

2017-12-04 Thread Brandon Williams
The current pkt-line API encodes the status of a pkt-line read in the
length of the read content.  An error is indicated with '-1', a flush
with '0' (which can be confusing since a return value of '0' can also
indicate an empty pkt-line), and a positive integer for the length of
the read content otherwise.  This doesn't leave much room for allowing
the addition of additional special packets in the future.

To solve this introduce 'packet_read_with_status()' which reads a packet
and returns the status of the read encoded as an 'enum packet_status'
type.  This allows for easily identifying between special and normal
packets as well as errors.  It also enables easily adding a new special
packet in the future.

Signed-off-by: Brandon Williams 
---
 pkt-line.c | 55 ++-
 pkt-line.h |  8 
 2 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 7006b3587..ac619f05b 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -280,28 +280,33 @@ static int packet_length(const char *linelen)
return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
 }
 
-int packet_read(int fd, char **src_buf, size_t *src_len,
-   char *buffer, unsigned size, int options)
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer, 
size_t *src_len,
+   char *buffer, unsigned size, 
int *pktlen,
+   int options)
 {
-   int len, ret;
+   int len;
char linelen[4];
 
-   ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
-   if (ret < 0)
-   return ret;
+   if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
+   return PACKET_READ_ERROR;
+
len = packet_length(linelen);
if (len < 0)
die("protocol error: bad line length character: %.4s", linelen);
-   if (!len) {
+
+   if (len == 0) {
packet_trace("", 4, 0);
-   return 0;
+   return PACKET_READ_FLUSH;
+   } else if (len >= 1 && len <= 3) {
+   die("protocol error: bad line length character: %.4s", linelen);
}
+
len -= 4;
-   if (len >= size)
+   if ((len < 0) || ((unsigned)len >= size))
die("protocol error: bad line length %d", len);
-   ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
-   if (ret < 0)
-   return ret;
+
+   if (get_packet_data(fd, src_buffer, src_len, buffer, len, options) < 0)
+   return PACKET_READ_ERROR;
 
if ((options & PACKET_READ_CHOMP_NEWLINE) &&
len && buffer[len-1] == '\n')
@@ -309,7 +314,31 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
 
buffer[len] = 0;
packet_trace(buffer, len, 0);
-   return len;
+   *pktlen = len;
+   return PACKET_READ_NORMAL;
+}
+
+int packet_read(int fd, char **src_buffer, size_t *src_len,
+   char *buffer, unsigned size, int options)
+{
+   enum packet_read_status status;
+   int pktlen;
+
+   status = packet_read_with_status(fd, src_buffer, src_len,
+buffer, size, ,
+options);
+   switch (status) {
+   case PACKET_READ_ERROR:
+   pktlen = -1;
+   break;
+   case PACKET_READ_NORMAL:
+   break;
+   case PACKET_READ_FLUSH:
+   pktlen = 0;
+   break;
+   }
+
+   return pktlen;
 }
 
 static char *packet_read_line_generic(int fd,
diff --git a/pkt-line.h b/pkt-line.h
index 3dad583e2..f1545929b 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -60,8 +60,16 @@ int write_packetized_from_buf(const char *src_in, size_t 
len, int fd_out);
  * If options contains PACKET_READ_CHOMP_NEWLINE, a trailing newline (if
  * present) is removed from the buffer before returning.
  */
+enum packet_read_status {
+   PACKET_READ_ERROR = -1,
+   PACKET_READ_NORMAL,
+   PACKET_READ_FLUSH,
+};
 #define PACKET_READ_GENTLE_ON_EOF (1u<<0)
 #define PACKET_READ_CHOMP_NEWLINE (1u<<1)
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer, 
size_t *src_len,
+   char *buffer, unsigned size, 
int *pktlen,
+   int options);
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
*buffer, unsigned size, int options);
 
-- 
2.15.1.424.g9478a66081-goog



[WIP 09/15] transport: store protocol version

2017-12-04 Thread Brandon Williams
Once protocol_v2 is introduced requesting a fetch or a push will need to
be handled differently depending on the protocol version.  Store the
protocol version the serving is speaking in 'struct git_transport_data'
and use it to determine what to do in the case of a fetch or a push.

Signed-off-by: Brandon Williams 
---
 transport.c | 35 ++-
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/transport.c b/transport.c
index 28b744ec2..4160c4167 100644
--- a/transport.c
+++ b/transport.c
@@ -118,6 +118,7 @@ struct git_transport_data {
struct child_process *conn;
int fd[2];
unsigned got_remote_heads : 1;
+   enum protocol_version version;
struct oid_array extra_have;
struct oid_array shallow;
 };
@@ -198,7 +199,8 @@ static struct ref *get_refs_via_connect(struct transport 
*transport, int for_pus
 
packet_reader_init(, data->fd[0], NULL, 0);
 
-   switch (discover_version()) {
+   data->version = discover_version();
+   switch (data->version) {
case protocol_v1:
case protocol_v0:
get_remote_heads(, ,
@@ -219,7 +221,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 {
int ret = 0;
struct git_transport_data *data = transport->data;
-   struct ref *refs;
+   struct ref *refs = NULL;
char *dest = xstrdup(transport->url);
struct fetch_pack_args args;
struct ref *refs_tmp = NULL;
@@ -245,10 +247,18 @@ static int fetch_refs_via_pack(struct transport 
*transport,
if (!data->got_remote_heads)
refs_tmp = get_refs_via_connect(transport, 0);
 
-   refs = fetch_pack(, data->fd, data->conn,
- refs_tmp ? refs_tmp : transport->remote_refs,
- dest, to_fetch, nr_heads, >shallow,
- >pack_lockfile);
+   switch (data->version) {
+   case protocol_v1:
+   case protocol_v0:
+   refs = fetch_pack(, data->fd, data->conn,
+ refs_tmp ? refs_tmp : transport->remote_refs,
+ dest, to_fetch, nr_heads, >shallow,
+ >pack_lockfile);
+   break;
+   case protocol_unknown_version:
+   BUG("unknown protocol version");
+   }
+
close(data->fd[0]);
close(data->fd[1]);
if (finish_connect(data->conn))
@@ -548,7 +558,7 @@ static int git_transport_push(struct transport *transport, 
struct ref *remote_re
 {
struct git_transport_data *data = transport->data;
struct send_pack_args args;
-   int ret;
+   int ret = 0;
 
if (!data->got_remote_heads)
get_refs_via_connect(transport, 1);
@@ -573,8 +583,15 @@ static int git_transport_push(struct transport *transport, 
struct ref *remote_re
else
args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
 
-   ret = send_pack(, data->fd, data->conn, remote_refs,
-   >extra_have);
+   switch (data->version) {
+   case protocol_v1:
+   case protocol_v0:
+   ret = send_pack(, data->fd, data->conn, remote_refs,
+   >extra_have);
+   break;
+   case protocol_unknown_version:
+   BUG("unknown protocol version");
+   }
 
close(data->fd[1]);
close(data->fd[0]);
-- 
2.15.1.424.g9478a66081-goog



[WIP 06/15] transport: use get_refs_via_connect to get refs

2017-12-04 Thread Brandon Williams
Remove code duplication and use the existing 'get_refs_via_connect()'
function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
'git_transport_push()'.

Signed-off-by: Brandon Williams 
---
 transport.c | 18 --
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/transport.c b/transport.c
index d75ff0514..7c969f285 100644
--- a/transport.c
+++ b/transport.c
@@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
args.cloning = transport->cloning;
args.update_shallow = data->options.update_shallow;
 
-   if (!data->got_remote_heads) {
-   connect_setup(transport, 0);
-   get_remote_heads(data->fd[0], NULL, 0, _tmp, 0,
-NULL, >shallow);
-   data->got_remote_heads = 1;
-   }
+   if (!data->got_remote_heads)
+   refs_tmp = get_refs_via_connect(transport, 0);
 
refs = fetch_pack(, data->fd, data->conn,
  refs_tmp ? refs_tmp : transport->remote_refs,
@@ -542,14 +538,8 @@ static int git_transport_push(struct transport *transport, 
struct ref *remote_re
struct send_pack_args args;
int ret;
 
-   if (!data->got_remote_heads) {
-   struct ref *tmp_refs;
-   connect_setup(transport, 1);
-
-   get_remote_heads(data->fd[0], NULL, 0, _refs, REF_NORMAL,
-NULL, >shallow);
-   data->got_remote_heads = 1;
-   }
+   if (!data->got_remote_heads)
+   get_refs_via_connect(transport, 1);
 
memset(, 0, sizeof(args));
args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
-- 
2.15.1.424.g9478a66081-goog



[WIP 04/15] upload-pack: convert to a builtin

2017-12-04 Thread Brandon Williams
In order to allow for code sharing with the server-side of fetch in
protocol-v2 convert upload-pack to be a builtin.

Signed-off-by: Brandon Williams 
---
 Makefile  | 3 ++-
 builtin.h | 1 +
 git.c | 1 +
 upload-pack.c | 2 +-
 4 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 9ce68cded..86394b69d 100644
--- a/Makefile
+++ b/Makefile
@@ -630,7 +630,6 @@ PROGRAM_OBJS += imap-send.o
 PROGRAM_OBJS += sh-i18n--envsubst.o
 PROGRAM_OBJS += shell.o
 PROGRAM_OBJS += show-index.o
-PROGRAM_OBJS += upload-pack.o
 PROGRAM_OBJS += remote-testsvn.o
 
 # Binary suffix, set to .exe for Windows builds
@@ -692,6 +691,7 @@ BUILT_INS += git-merge-subtree$X
 BUILT_INS += git-show$X
 BUILT_INS += git-stage$X
 BUILT_INS += git-status$X
+BUILT_INS += git-upload-pack$X
 BUILT_INS += git-whatchanged$X
 
 # what 'all' will build and 'install' will install in gitexecdir,
@@ -890,6 +890,7 @@ LIB_OBJS += tree-diff.o
 LIB_OBJS += tree.o
 LIB_OBJS += tree-walk.o
 LIB_OBJS += unpack-trees.o
+LIB_OBJS += upload-pack.o
 LIB_OBJS += url.o
 LIB_OBJS += urlmatch.o
 LIB_OBJS += usage.o
diff --git a/builtin.h b/builtin.h
index 42378f3aa..f332a1257 100644
--- a/builtin.h
+++ b/builtin.h
@@ -231,6 +231,7 @@ extern int cmd_update_ref(int argc, const char **argv, 
const char *prefix);
 extern int cmd_update_server_info(int argc, const char **argv, const char 
*prefix);
 extern int cmd_upload_archive(int argc, const char **argv, const char *prefix);
 extern int cmd_upload_archive_writer(int argc, const char **argv, const char 
*prefix);
+extern int cmd_upload_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_var(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_commit(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_tag(int argc, const char **argv, const char *prefix);
diff --git a/git.c b/git.c
index f31dca696..e32e16f2d 100644
--- a/git.c
+++ b/git.c
@@ -474,6 +474,7 @@ static struct cmd_struct commands[] = {
{ "update-server-info", cmd_update_server_info, RUN_SETUP },
{ "upload-archive", cmd_upload_archive },
{ "upload-archive--writer", cmd_upload_archive_writer },
+   { "upload-pack", cmd_upload_pack },
{ "var", cmd_var, RUN_SETUP_GENTLY },
{ "verify-commit", cmd_verify_commit, RUN_SETUP },
{ "verify-pack", cmd_verify_pack },
diff --git a/upload-pack.c b/upload-pack.c
index ef99a029c..2d16952a3 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1033,7 +1033,7 @@ static int upload_pack_config(const char *var, const char 
*value, void *unused)
return parse_hide_refs_config(var, value, "uploadpack");
 }
 
-int cmd_main(int argc, const char **argv)
+int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 {
const char *dir;
int strict = 0;
-- 
2.15.1.424.g9478a66081-goog



[WIP 02/15] pkt-line: introduce struct packet_reader

2017-12-04 Thread Brandon Williams
Sometimes it is advantageous to be able to peek the next packet line
without consuming it (e.g. to be able to determine the protocol version
a server is speaking).  In order to do that introduce 'struct
packet_reader' which is an abstraction around the normal packet reading
logic.  This enables a caller to be able to peek a single line at a time
using 'packet_reader_peek()' and having a caller consume a line by
calling 'packet_reader_read()'.

Signed-off-by: Brandon Williams 
---
 pkt-line.c | 61 +
 pkt-line.h | 20 
 2 files changed, 81 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index ac619f05b..518109bbe 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -406,3 +406,64 @@ ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf 
*sb_out)
}
return sb_out->len - orig_len;
 }
+
+/* Packet Reader Functions */
+void packet_reader_init(struct packet_reader *reader, int fd,
+   char *src_buffer, size_t src_len)
+{
+   reader->fd = fd;
+   reader->src_buffer = src_buffer;
+   reader->src_len = src_len;
+   reader->buffer = packet_buffer;
+   reader->buffer_size = sizeof(packet_buffer);
+   reader->options = PACKET_READ_CHOMP_NEWLINE | PACKET_READ_GENTLE_ON_EOF;
+
+   reader->line = NULL;
+   reader->line_peeked = 0;
+   reader->pktlen = 0;
+   reader->status = PACKET_READ_ERROR;
+}
+
+enum packet_read_status packet_reader_read(struct packet_reader *reader)
+{
+   if (reader->line_peeked) {
+   reader->line_peeked = 0;
+   return reader->status;
+   }
+
+   reader->status = packet_read_with_status(reader->fd,
+>src_buffer,
+>src_len,
+reader->buffer,
+reader->buffer_size,
+>pktlen,
+reader->options);
+
+   switch (reader->status) {
+   case PACKET_READ_ERROR:
+   reader->pktlen = -1;
+   reader->line = NULL;
+   break;
+   case PACKET_READ_NORMAL:
+   reader->line = reader->buffer;
+   break;
+   case PACKET_READ_FLUSH:
+   reader->pktlen = 0;
+   reader->line = NULL;
+   break;
+   }
+
+   return reader->status;
+}
+
+enum packet_read_status packet_reader_peek(struct packet_reader *reader)
+{
+   /* Only allow peeking a single line */
+   if (reader->line_peeked)
+   return reader->status;
+
+   /* Peek a line by reading it and setting peeked flag */
+   packet_reader_read(reader);
+   reader->line_peeked = 1;
+   return reader->status;
+}
diff --git a/pkt-line.h b/pkt-line.h
index f1545929b..2b5c7cf11 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -104,6 +104,26 @@ char *packet_read_line_buf(char **src_buf, size_t 
*src_len, int *size);
  */
 ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
 
+struct packet_reader {
+   int fd;
+   char *src_buffer;
+   size_t src_len;
+
+   char *buffer;
+   unsigned buffer_size;
+   int options;
+
+   enum packet_read_status status;
+   int pktlen;
+   const char *line;
+   int line_peeked;
+};
+
+extern void packet_reader_init(struct packet_reader *reader, int fd,
+  char *src_buffer, size_t src_len);
+extern enum packet_read_status packet_reader_read(struct packet_reader 
*reader);
+extern enum packet_read_status packet_reader_peek(struct packet_reader 
*reader);
+
 #define DEFAULT_PACKET_MAX 1000
 #define LARGE_PACKET_MAX 65520
 #define LARGE_PACKET_DATA_MAX (LARGE_PACKET_MAX - 4)
-- 
2.15.1.424.g9478a66081-goog



[WIP 11/15] serve: introduce git-serve

2017-12-04 Thread Brandon Williams
Introduce git-serve, the base server for protocol version 2.

When connecting to a server supporting protocol version 2, the server
will send a list all of its capabilities and then wait for the client to
send a command request.  Some capabilities advertised are 'commands'
which the client can request (push and fetch are examples of such
commands).  A command request is comprised of a list of capabilities,
including a command request "command=", a delimiter packet,
followed by a list of parameters for the requested command.

At the end of each command a client can request that another command be
executed or can terminate the connection by sending a flush packet.

Signed-off-by: Brandon Williams 
---
 .gitignore  |   1 +
 Makefile|   2 +
 builtin.h   |   1 +
 builtin/serve.c |  25 
 git.c   |   1 +
 serve.c | 185 
 serve.h |   6 ++
 7 files changed, 221 insertions(+)
 create mode 100644 builtin/serve.c
 create mode 100644 serve.c
 create mode 100644 serve.h

diff --git a/.gitignore b/.gitignore
index 833ef3b0b..2d0450c26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,6 +140,7 @@
 /git-rm
 /git-send-email
 /git-send-pack
+/git-serve
 /git-sh-i18n
 /git-sh-i18n--envsubst
 /git-sh-setup
diff --git a/Makefile b/Makefile
index 86394b69d..710672cf4 100644
--- a/Makefile
+++ b/Makefile
@@ -862,6 +862,7 @@ LIB_OBJS += revision.o
 LIB_OBJS += run-command.o
 LIB_OBJS += send-pack.o
 LIB_OBJS += sequencer.o
+LIB_OBJS += serve.o
 LIB_OBJS += server-info.o
 LIB_OBJS += setup.o
 LIB_OBJS += sha1-array.o
@@ -995,6 +996,7 @@ BUILTIN_OBJS += builtin/rev-parse.o
 BUILTIN_OBJS += builtin/revert.o
 BUILTIN_OBJS += builtin/rm.o
 BUILTIN_OBJS += builtin/send-pack.o
+BUILTIN_OBJS += builtin/serve.o
 BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-ref.o
diff --git a/builtin.h b/builtin.h
index f332a1257..3f3fdfc28 100644
--- a/builtin.h
+++ b/builtin.h
@@ -215,6 +215,7 @@ extern int cmd_rev_parse(int argc, const char **argv, const 
char *prefix);
 extern int cmd_revert(int argc, const char **argv, const char *prefix);
 extern int cmd_rm(int argc, const char **argv, const char *prefix);
 extern int cmd_send_pack(int argc, const char **argv, const char *prefix);
+extern int cmd_serve(int argc, const char **argv, const char *prefix);
 extern int cmd_shortlog(int argc, const char **argv, const char *prefix);
 extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
diff --git a/builtin/serve.c b/builtin/serve.c
new file mode 100644
index 0..2ecaad3b6
--- /dev/null
+++ b/builtin/serve.c
@@ -0,0 +1,25 @@
+#include "cache.h"
+#include "builtin.h"
+#include "parse-options.h"
+#include "serve.h"
+
+static char const * const grep_usage[] = {
+   N_("git serve []"),
+   NULL
+};
+
+int cmd_serve(int argc, const char **argv, const char *prefix)
+{
+
+   struct option options[] = {
+   OPT_END()
+   };
+
+   /* ignore all unknown cmdline switches for now */
+   argc = parse_options(argc, argv, prefix, options, grep_usage,
+PARSE_OPT_KEEP_DASHDASH |
+PARSE_OPT_KEEP_UNKNOWN);
+   serve();
+
+   return 0;
+}
diff --git a/git.c b/git.c
index e32e16f2d..527086eaf 100644
--- a/git.c
+++ b/git.c
@@ -457,6 +457,7 @@ static struct cmd_struct commands[] = {
{ "revert", cmd_revert, RUN_SETUP | NEED_WORK_TREE },
{ "rm", cmd_rm, RUN_SETUP },
{ "send-pack", cmd_send_pack, RUN_SETUP },
+   { "serve", cmd_serve, RUN_SETUP },
{ "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER },
{ "show", cmd_show, RUN_SETUP },
{ "show-branch", cmd_show_branch, RUN_SETUP },
diff --git a/serve.c b/serve.c
new file mode 100644
index 0..476e73b54
--- /dev/null
+++ b/serve.c
@@ -0,0 +1,185 @@
+#include "cache.h"
+#include "repository.h"
+#include "config.h"
+#include "pkt-line.h"
+#include "version.h"
+#include "argv-array.h"
+#include "serve.h"
+
+static int agent_advertise(struct repository *r,
+  struct strbuf *value)
+{
+   strbuf_addstr(value, git_user_agent_sanitized());
+   return 1;
+}
+
+struct protocol_capability {
+   const char *name;
+   int advertised; /* capability was advertised */
+   /* int advertise(struct strbuf *value, struct repository *r) */
+   int (*advertise)(struct repository *r, struct strbuf *value);
+   /* int command(struct repository *r, struct argv_array *keys, struct 
argv_array *args)*/
+   int (*command)(struct repository *r,
+  struct argv_array *keys,
+  struct argv_array *args);
+};
+
+static struct protocol_capability capabilities[] = {
+   { "agent", 0, agent_advertise, NULL },
+};
+
+static void 

[WIP 13/15] connect: request remote refs using v2

2017-12-04 Thread Brandon Williams
Teach the client to be able to request a remote's refs using protocol
v2.  This is done by having a client issue a 'ls-refs' request to a v2
server.

Signed-off-by: Brandon Williams 
---
 connect.c  | 85 +-
 remote.h   |  2 ++
 t/t5701-protocol-v2.sh | 28 +
 transport.c|  2 +-
 upload-pack.c  |  3 +-
 5 files changed, 117 insertions(+), 3 deletions(-)
 create mode 100755 t/t5701-protocol-v2.sh

diff --git a/connect.c b/connect.c
index 433f08649..b3c933fc1 100644
--- a/connect.c
+++ b/connect.c
@@ -15,6 +15,7 @@
 #include "protocol.h"
 
 static char *server_capabilities;
+static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
 static const char *parse_feature_value(const char *, const char *, int *);
 
 static int check_ref(const char *name, unsigned int flags)
@@ -62,6 +63,30 @@ static void die_initial_contact(int unexpected)
  "and the repository exists."));
 }
 
+static int server_supports_v2(const char *c)
+{
+   int i;
+
+   for (i = 0; i < server_capabilities_v2.argc; i++) {
+   const char *out;
+   if (skip_prefix(server_capabilities_v2.argv[i], c, ) &&
+   (!*out || *out == '='))
+   return 1;
+   }
+
+   return 0;
+}
+
+static void process_capabilities_v2(struct packet_reader *reader)
+{
+   while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+   argv_array_push(_capabilities_v2, reader->line);
+   }
+
+   if (reader->status != PACKET_READ_FLUSH)
+   die("protocol error");
+}
+
 enum protocol_version discover_version(struct packet_reader *reader)
 {
enum protocol_version version = protocol_unknown_version;
@@ -85,7 +110,7 @@ enum protocol_version discover_version(struct packet_reader 
*reader)
/* Maybe process capabilities here, at least for v2 */
switch (version) {
case protocol_v2:
-   die("support for protocol v2 not implemented yet");
+   process_capabilities_v2(reader);
break;
case protocol_v1:
/* Read the peeked version line */
@@ -292,6 +317,64 @@ struct ref **get_remote_heads(struct packet_reader *reader,
return list;
 }
 
+static int process_ref_v2(const char *line, struct ref ***list)
+{
+   struct object_id old_oid;
+   const char *end_of_name;
+   struct ref *ref;
+
+   if (parse_oid_hex(line, _oid, ))
+   return 0;
+   if (*line != ' ')
+   return 0;
+   line++;
+
+   end_of_name = strchr(line, ' ');
+
+   if (!end_of_name)
+   ref = alloc_ref(line);
+   else {
+   struct strbuf name = STRBUF_INIT;
+   /* symref info */
+   strbuf_add(, line, end_of_name - line);
+   ref = alloc_ref(name.buf);
+   ref->symref = xstrdup(end_of_name + 1);
+
+   strbuf_release();
+   }
+
+   oidcpy(>old_oid, _oid);
+   **list = ref;
+   *list = >next;
+
+   return 1;
+}
+struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+struct ref **list, unsigned int flags)
+{
+   *list = NULL;
+   /* Check that the server supports the ls-refs command */
+   if (!server_supports_v2("ls-refs"))
+   die("server doesn't support 'ls-refs' command");
+
+   /* Issue request for ls-refs */
+   packet_write_fmt(fd_out, "command=ls-refs");
+   packet_delim(fd_out);
+   packet_write_fmt(fd_out, "--symrefs");
+   packet_flush(fd_out);
+
+   /* Process response from server */
+   while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+   if (!process_ref_v2(reader->line, ))
+   die("invalid ls-refs response: %s", reader->line);
+   }
+
+   if (reader->status != PACKET_READ_FLUSH)
+   die("protocol error");
+
+   return list;
+}
+
 static const char *parse_feature_value(const char *feature_list, const char 
*feature, int *lenp)
 {
int len;
diff --git a/remote.h b/remote.h
index af34e44a4..64d87bf5b 100644
--- a/remote.h
+++ b/remote.h
@@ -155,6 +155,8 @@ extern struct ref **get_remote_heads(struct packet_reader 
*reader,
 struct ref **list, unsigned int flags,
 struct oid_array *extra_have,
 struct oid_array *shallow_points);
+extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+   struct ref **list, unsigned int flags);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 int ref_newer(const struct object_id *new_oid, const struct object_id 
*old_oid);
diff --git a/t/t5701-protocol-v2.sh b/t/t5701-protocol-v2.sh
new file mode 100755
index 

[WIP 00/15] protocol version 2

2017-12-04 Thread Brandon Williams
A while back I sent out a rough outline for what a protocol version 2 would
look like at
https://public-inbox.org/git/20171020171839.4188-1-bmw...@google.com/.  After
hacking at both the server and client code I've managed to get some patches for
protocol version 2 which implement listing refs and cloning/fetching working.
I still need to get the push side of things working before this would be viable
to accept and apply but I figured this was in a good enough state to start
getting some early feedback on the concept.  I'm hoping that some comments and
other pairs of eyes can help me identify deficiencies earlier rather than
later.

As a whole, fetch works fairly similar to the way it works in v1 except I
removed the support for shallow clients for the time being in order to make it
easier to implement at first.  I haven't decided if it makes more sense to have
the deepening as a separate server command or to keep it in fetch like it is in
v1, just something to think about.

Any comments or criticism is welcome. Thanks!

Brandon Williams (15):
  pkt-line: introduce packet_read_with_status
  pkt-line: introduce struct packet_reader
  pkt-line: add delim packet support
  upload-pack: convert to a builtin
  upload-pack: factor out processing lines
  transport: use get_refs_via_connect to get refs
  connect: convert get_remote_heads to use struct packet_reader
  connect: discover protocol version outside of get_remote_heads
  transport: store protocol version
  protocol: introduce enum protocol_version value protocol_v2
  serve: introduce git-serve
  ls-refs: introduce ls-refs server command
  connect: request remote refs using v2
  upload_pack: introduce fetch server command
  fetch-pack: perform a fetch using v2

 .gitignore |   1 +
 Makefile   |   6 +-
 builtin.h  |   2 +
 builtin/fetch-pack.c   |  19 ++-
 builtin/receive-pack.c |   3 +
 builtin/send-pack.c|  18 ++-
 builtin/serve.c|  25 
 connect.c  | 210 +-
 connect.h  |   3 +
 fetch-pack.c   | 237 +-
 fetch-pack.h   |   4 +-
 git.c  |   2 +
 ls-refs.c  |  96 ++
 ls-refs.h  |   9 ++
 pkt-line.c | 133 +--
 pkt-line.h |  31 +
 protocol.c |   2 +
 protocol.h |   1 +
 remote-curl.c  |  21 ++-
 remote.h   |   7 +-
 serve.c| 195 
 serve.h|   6 +
 t/t5701-protocol-v2.sh |  54 
 transport.c|  84 
 upload-pack.c  | 342 +++--
 upload-pack.h  |   9 ++
 26 files changed, 1371 insertions(+), 149 deletions(-)
 create mode 100644 builtin/serve.c
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100755 t/t5701-protocol-v2.sh
 create mode 100644 upload-pack.h

-- 
2.15.1.424.g9478a66081-goog



[PATCH v3 1/2] git-prompt: make __git_eread intended use explicit

2017-12-04 Thread Robert Abel
__git_eread is used to read a single line of a given file (if it exists)
into a single variable without the EOL. All six current users of __git_eread
use it that way and don't expect multi-line content.

Therefore, this patch removes the unused capability to split file conents into
tokens by passing multiple variable names. Add a comment and explicitly use $2
instead of $@ to read the file into one variable.

Signed-off-by: Robert Abel 
---
 contrib/completion/git-prompt.sh | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index c6cbef38c2..41a471957a 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -278,11 +278,12 @@ __git_ps1_colorize_gitstring ()
r="$c_clear$r"
 }
 
+# Helper function to read the first line of a file into a variable.
+# __git_eread requires 2 arguments, the file path and the name of the
+# variable, in that order.
 __git_eread ()
 {
-   local f="$1"
-   shift
-   test -r "$f" && read "$@" <"$f"
+   test -r "$1" && read "$2" <"$1"
 }
 
 # __git_ps1 accepts 0 or 1 arguments (i.e., format string)
-- 
2.13.0.windows.1



[PATCH v3 2/2] git-prompt: fix reading files with windows line endings

2017-12-04 Thread Robert Abel
If any of the files read by __git_eread have \r\n line endings, read
will only strip \n, leaving \r. This results in an ugly prompt, where
instead of

user@pc MINGW64 /path/to/repo (BARE:master)

the last parenthesis is printed over the beginning of the prompt like

)ser@pc MINGW64 /path/to/repo (BARE:master

This patch fixes the issue by setting the IFS to $'\r\n' for the read
operation. Note that ANSI-C Quoting ($'...') is supported by bash as
well as zsh, which are the current targets of git-prompt.sh, cf.
<20171130010811.17369-1-szeder@gmail.com>.

Signed-off-by: Robert Abel 
---
 contrib/completion/git-prompt.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index 41a471957a..983e419d2b 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -283,7 +283,7 @@ __git_ps1_colorize_gitstring ()
 # variable, in that order.
 __git_eread ()
 {
-   test -r "$1" && read "$2" <"$1"
+   test -r "$1" && IFS=$'\r\n' read "$2" <"$1"
 }
 
 # __git_ps1 accepts 0 or 1 arguments (i.e., format string)
-- 
2.13.0.windows.1



Re: What's cooking in git.git (Dec 2017, #01; Mon, 4)

2017-12-04 Thread Jonathan Tan
On Mon, 04 Dec 2017 13:46:43 -0800
Junio C Hamano  wrote:

> * jt/diff-anchored-patience (2017-11-28) 1 commit
>  - diff: support anchoring line(s)
> 
>  "git diff" learned a variant of the "--patience" algorithm, to
>  which the user can specify which 'unique' line to be used as
>  anchoring points.

Is there anything I can do to progress this? Johannes Schindelin wrote
that he is in favor of this feature [1]. He also remarked on the fact
that the anchors are not freed [2], but I have replied [3] that it seems
that diff arguments are not freed in general (so there would be no good
place to put the freeing statements).

[1] 
https://public-inbox.org/git/alpine.DEB.2.21.1.1712011447550.98586@virtualbox/
[2] 
https://public-inbox.org/git/alpine.DEB.2.21.1.1711300134560.6482@virtualbox/
[3] 
https://public-inbox.org/git/20171130152605.1b775e9cc2ddd7f917424...@google.com/


Re: [PATCH v6 4/7] checkout: describe_detached_head: remove ellipsis after committish

2017-12-04 Thread Ann T Ropea
We do not want an ellipsis displayed following an (abbreviated) SHA-1
value.

The days when this was necessary to indicate the truncation to
lower-level Git commands and/or the user are bygone.

However, to ease the transition, the ellipsis will still be printed if
the user sets the environment variable GIT_PRINT_SHA1_ELLIPSIS to "yes".

Correct documentation with respect to what describe_detached_head prints
when GIT_PRINT_SHA1_ELLIPSIS is not set as indicated above.

Add tests for the old and new behaviour.

Signed-off-by: Ann T Ropea 
---
v2: rename patch series & focus on removal of ellipses
v3: env var instead of config option, use one-line comments where appropriate, 
preserve indent level
v4: improve env var handling (rename, helper func to query, docu)
v5: rewrite series to take Junio's comments in 
 aboard
v6: polish to take Junio's comments from 
 into account
 Documentation/user-manual.txt |   2 +-
 builtin/checkout.c|  10 +++-
 t/t2020-checkout-detach.sh| 114 ++
 3 files changed, 123 insertions(+), 3 deletions(-)

diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index 497e82e88dd0..eff78902742a 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -319,7 +319,7 @@ do so (now or later) by using -b with the checkout command 
again. Example:
 
   git checkout -b new_branch_name
 
-HEAD is now at 427abfa... Linux v2.6.17
+HEAD is now at 427abfa Linux v2.6.17
 
 
 The HEAD then refers to the SHA-1 of the commit instead of to a branch,
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 3faae382de4f..b0499542158f 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -400,10 +400,16 @@ static void show_local_changes(struct object *head,
 static void describe_detached_head(const char *msg, struct commit *commit)
 {
struct strbuf sb = STRBUF_INIT;
+
if (!parse_commit(commit))
pp_commit_easy(CMIT_FMT_ONELINE, commit, );
-   fprintf(stderr, "%s %s... %s\n", msg,
-   find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV), 
sb.buf);
+   if (print_sha1_ellipsis()) {
+   fprintf(stderr, "%s %s... %s\n", msg,
+   find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV), sb.buf);
+   } else {
+   fprintf(stderr, "%s %s %s\n", msg,
+   find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV), sb.buf);
+   }
strbuf_release();
 }
 
diff --git a/t/t2020-checkout-detach.sh b/t/t2020-checkout-detach.sh
index fbb4ee9bb42d..3a37e07531ce 100755
--- a/t/t2020-checkout-detach.sh
+++ b/t/t2020-checkout-detach.sh
@@ -186,4 +186,118 @@ test_expect_success 'no advice given for explicit 
detached head state' '
test_cmp expect.no-advice actual
 '
 
+# Detached HEAD tests for GIT_PRINT_SHA1_ELLIPSIS (new format)
+test_expect_success 'describe_detached_head prints no SHA-1 ellipsis when not 
asked to' "
+
+   # The first detach operation is more chatty than the following ones.
+   cat 1>1st_detach <<'EOF' &&
+Note: checking out 'HEAD^'.
+
+You are in 'detached HEAD' state. You can look around, make experimental
+changes and commit them, and you can discard any commits you make in this
+state without impacting any branches by performing another checkout.
+
+If you want to create a new branch to retain commits you create, you may
+do so (now or later) by using -b with the checkout command again. Example:
+
+  git checkout -b 
+
+HEAD is now at 7c7cd714e262 three
+EOF
+
+   # The remaining ones just show info about previous and current HEADs.
+   cat 1>2nd_detach <<'EOF' &&
+Previous HEAD position was 7c7cd714e262 three
+HEAD is now at 139b20d8e6c5 two
+EOF
+
+   cat 1>3rd_detach <<'EOF' &&
+Previous HEAD position was 139b20d8e6c5 two
+HEAD is now at d79ce1670bdc one
+EOF
+
+   reset && check_not_detached && sane_unset GIT_PRINT_SHA1_ELLIPSIS &&
+
+   # Various ways of *not* asking for ellipses
+
+   sane_unset GIT_PRINT_SHA1_ELLIPSIS && git -c 'core.abbrev=12' checkout 
HEAD^ 1>actual 2>&1 &&
+   check_detached &&
+   test_i18ncmp 1st_detach actual && sane_unset GIT_PRINT_SHA1_ELLIPSIS &&
+
+   GIT_PRINT_SHA1_ELLIPSIS="no" git -c 'core.abbrev=12' checkout HEAD^ 
1>actual 2>&1 &&
+   check_detached &&
+   test_i18ncmp 2nd_detach actual && sane_unset GIT_PRINT_SHA1_ELLIPSIS &&
+
+   GIT_PRINT_SHA1_ELLIPSIS= git -c 'core.abbrev=12' checkout HEAD^ 
1>actual 2>&1 &&
+   check_detached &&
+   test_i18ncmp 3rd_detach actual && sane_unset GIT_PRINT_SHA1_ELLIPSIS &&
+
+   # We only have four commits, but we can re-use them
+   reset && check_not_detached && sane_unset GIT_PRINT_SHA1_ELLIPSIS &&
+
+   # Make no mention of the env var at all

Re: gitattributes not read for diff-tree anymore in 2.15?

2017-12-04 Thread Brandon Williams
On 12/04, Ben Boeckel wrote:
> Hi,
> 
> I've bisected a failure in our test suite to this commit:
> 
> commit 557a5998df19faf8641acfc5b6b1c3c2ba64dca9 (HEAD, refs/bisect/bad)
> Author: Brandon Williams 
> Date:   Thu Aug 3 11:20:00 2017 -0700
> 
> submodule: remove gitmodules_config
> 
> Now that the submodule-config subsystem can lazily read the gitmodules
> file we no longer need to explicitly pre-read the gitmodules by 
> calling
> 'gitmodules_config()' so let's remove it.
> 
> Signed-off-by: Brandon Williams 
> Signed-off-by: Junio C Hamano 
> 
> Which is tags/v2.15.0-rc0~120^2.
> 
> Our test suite is in a Rust project here:
> 
> https://gitlab.kitware.com/utils/rust-git-checks
> 
> and the failing test(s) can be run using:
> 
> cargo test whitespace_all_ignored
> 
> The test checks that when `.gitattributes` says that whitespace errors
> should be ignored, they aren't reported as errors. My guess is that not
> reading the gitmodules configuration also skips reading attributes
> files. Is this reasoning correct?
> 
> Thanks,
> 
> --Ben

Reading the attributes files should be done regardless if the gitmodules
file is read.  The gitmodules file should only come into play if you are
dealing with submodules.

Do you mind providing a reproduction recipe with expected outcome vs
actual outcome and I can take a closer look.

-- 
Brandon Williams


Re: [PATCH v2 1/2] git-prompt: make __git_eread intended use explicit

2017-12-04 Thread Robert Abel
Hi Junio,

On 04 Dec 2017 18:58, Junio C Hamano wrote:
> Robert Abel  writes:
>> __git_eread is used to read a single line of a given file (if it exists)
>> into a variable without the EOL. All six current users of __git_eread
>> use it that way and don't expect multi-line content.
> 
> Changing $@ to $2 does not change whether this is about "multi-line"
> or not. 

I'm aware of that. I was documenting current usage. The function is used
to read file contents (which are expected to be a single line) into
_a_ (i.e. single) variable.

None of the current users of the function expect tokens to be split,
which is why I removed it in preparation of patch 2/2, which would
break tokenizing file contents.

Regards,

Robert


Re: What's cooking in git.git (Dec 2017, #01; Mon, 4)

2017-12-04 Thread Elijah Newren
On Mon, Dec 4, 2017 at 1:46 PM, Junio C Hamano  wrote:

> * en/rename-directory-detection (2017-11-21) 33 commits

>  Rename detection logic in "diff" family that is used in "merge" has
>  learned to guess when all of x/a, x/b and x/c have moved to z/a,
>  z/b and z/c, it is likely that x/d added in the meantime would also
>  want to move to z/d by taking the hint that the entire directory
>  'x' moved to 'z'.

It appears that you're still using V3 of my directory rename detection
series; I got some feedback from Stefan and Johannes on that series
(including compilation failures on Windows) and believe I have
addressed all of it in V4, which can be found here:
   https://public-inbox.org/git/20171129014237.32570-1-new...@gmail.com/


Re: [PATCH 0/2] fix v2.15 progress regression

2017-12-04 Thread Junio C Hamano
Jeff King  writes:

> Here's what I think we should do: fix the bug in the minimal way, and
> then drop the useless code. It's worth doing in two steps, because we
> may decide to resurrect the feature later, and it would then just be a
> straight revert of the second commit.

Yup.  Thanks; will queue.


Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Jeff King
On Mon, Dec 04, 2017 at 10:54:40PM +0100, Lars Schneider wrote:

> > Better, IMHO, though I still think literally saying:
> > 
> >  hint: Waiting for your editor to exit...
> > 
> > is the most accurate, which I think makes it clear that you must _exit_
> > your editor, not just save and close the file.
> 
> I think "exit" would be confusing because most graphical editors (Sublime,
> Textmate, Notepad++, ...) can open multiple files and do not need to exit. 
> The requirement is indeed save and close the file.
> 
> How about:
> 
> hint: Waiting for your editor to close the file...
> 
> I generally like that as this is technical correct from all angles.
> My only nit would be that "the file" is a bit imprecise... but
> that's probably no problem.

OK, that makes sense. There are two definitions of "exit" here. We care
about the process exiting, but of course the editor may present a
concept of "exiting" to the user that is different.

I know emacsclient behaves that way (the client process exits when the
buffer is closed, even though the rest of emacs may still be running),
but I didn't realize other editors did so. I know atom doesn't, but then
it also exits immediately, making it inappropriate for $GIT_EDITOR in
the first place.

-Peff


[PATCH 2/2] progress: drop delay-threshold code

2017-12-04 Thread Jeff King
From: Lars Schneider 

Since 180a9f2268 (provide a facility for "delayed" progress
reporting, 2007-04-20), the progress code has allowed
callers to skip showing progress if they have reached a
percentage-threshold of the total work before the delay
period passes.

But since 8aade107dd (progress: simplify "delayed" progress
API, 2017-08-19), that parameter is not available to outside
callers (we always passed zero after that commit, though
that was corrected in the previous commit to "100%").

Let's drop the threshold code, which never triggers in
any meaningful way.

Signed-off-by: Jeff King 
---
I tweaked your patch slightly to clean up the now-simplified
conditional.

 progress.c | 24 +---
 1 file changed, 5 insertions(+), 19 deletions(-)

diff --git a/progress.c b/progress.c
index b774cb1cd1..5f87f4568f 100644
--- a/progress.c
+++ b/progress.c
@@ -34,7 +34,6 @@ struct progress {
unsigned total;
unsigned last_percent;
unsigned delay;
-   unsigned delayed_percent_threshold;
struct throughput *throughput;
uint64_t start_ns;
 };
@@ -83,20 +82,8 @@ static int display(struct progress *progress, unsigned n, 
const char *done)
 {
const char *eol, *tp;
 
-   if (progress->delay) {
-   if (!progress_update || --progress->delay)
-   return 0;
-   if (progress->total) {
-   unsigned percent = n * 100 / progress->total;
-   if (percent > progress->delayed_percent_threshold) {
-   /* inhibit this progress report entirely */
-   clear_progress_signal();
-   progress->delay = -1;
-   progress->total = 0;
-   return 0;
-   }
-   }
-   }
+   if (progress->delay && (!progress_update || --progress->delay))
+   return 0;
 
progress->last_value = n;
tp = (progress->throughput) ? progress->throughput->display.buf : "";
@@ -206,7 +193,7 @@ int display_progress(struct progress *progress, unsigned n)
 }
 
 static struct progress *start_progress_delay(const char *title, unsigned total,
-unsigned percent_threshold, 
unsigned delay)
+unsigned delay)
 {
struct progress *progress = malloc(sizeof(*progress));
if (!progress) {
@@ -219,7 +206,6 @@ static struct progress *start_progress_delay(const char 
*title, unsigned total,
progress->total = total;
progress->last_value = -1;
progress->last_percent = -1;
-   progress->delayed_percent_threshold = percent_threshold;
progress->delay = delay;
progress->throughput = NULL;
progress->start_ns = getnanotime();
@@ -229,12 +215,12 @@ static struct progress *start_progress_delay(const char 
*title, unsigned total,
 
 struct progress *start_delayed_progress(const char *title, unsigned total)
 {
-   return start_progress_delay(title, total, 100, 2);
+   return start_progress_delay(title, total, 2);
 }
 
 struct progress *start_progress(const char *title, unsigned total)
 {
-   return start_progress_delay(title, total, 0, 0);
+   return start_progress_delay(title, total, 0);
 }
 
 void stop_progress(struct progress **p_progress)
-- 
2.15.0.691.g622df76569


[PATCH 1/2] progress: set default delay threshold to 100%, not 0%

2017-12-04 Thread Jeff King
Commit 8aade107dd (progress: simplify "delayed" progress
API, 2017-08-19) dropped the parameter by which callers
could say "show my progress only if I haven't passed M%
progress after N seconds". The intent was to just show
nothing for 2 seconds, and then always progress after that.

But we flipped the logic in the wrapper: it sets M=0,
meaning that we'd almost _never_ show progress after 2
seconds, since we'd generally have made some progress. This
should have been 100%, not 0%.

We were fooled by existing calls like:

  start_progress_delay("foo", 0, 0, 2);

which behaved this way. The trick is that the first "0"
there is "how many items total", and there zero means "we
don't know". And without knowing that, we cannot compute a
completed percent at all, and we ignored the threshold
parameter entirely! Modeling our wrapper after that broke
callers which pass a non-zero value for "total".

We can switch to the intended behavior by using "100" in the
wrapper call.

Reported-by: Lars Schneider 
Signed-off-by: Jeff King 
---
 progress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index 289678d43d..b774cb1cd1 100644
--- a/progress.c
+++ b/progress.c
@@ -229,7 +229,7 @@ static struct progress *start_progress_delay(const char 
*title, unsigned total,
 
 struct progress *start_delayed_progress(const char *title, unsigned total)
 {
-   return start_progress_delay(title, total, 0, 2);
+   return start_progress_delay(title, total, 100, 2);
 }
 
 struct progress *start_progress(const char *title, unsigned total)
-- 
2.15.0.691.g622df76569



Re: What's cooking in git.git (Dec 2017, #01; Mon, 4)

2017-12-04 Thread Jeff Hostetler



On 12/4/2017 4:46 PM, Junio C Hamano wrote:

* cc/object-filtering-typofix (2017-12-04) 1 commit
  - list-objects-filter-options: fix 'keword' typo in comment
  (this branch uses jh/object-filtering; is tangled with jh/fsck-promisors and 
jh/partial-clone.)

  Typofix for a topic already in 'next'.

  JeffH said that jh/object-filtering needs further polishing a bit
  before graduating to 'master', so it would be appreciated if this
  can also be rolled into such an incremental update.


I've pulled this into my branch and will include it in my V6 update.
Thanks
jeff



[PATCH 0/2] fix v2.15 progress regression

2017-12-04 Thread Jeff King
On Mon, Dec 04, 2017 at 01:38:41PM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > So the minimal fix is actually:
> >
> > diff --git a/progress.c b/progress.c
> > index 289678d43d..b774cb1cd1 100644
> > --- a/progress.c
> > +++ b/progress.c
> > @@ -229,7 +229,7 @@ static struct progress *start_progress_delay(const char 
> > *title, unsigned total,
> >  
> >  struct progress *start_delayed_progress(const char *title, unsigned total)
> >  {
> > -   return start_progress_delay(title, total, 0, 2);
> > +   return start_progress_delay(title, total, 100, 2);
> >  }
> 
> That makes a lot more sense to me (at least from a cursory
> comparison between the two approaches).

Here's what I think we should do: fix the bug in the minimal way, and
then drop the useless code. It's worth doing in two steps, because we
may decide to resurrect the feature later, and it would then just be a
straight revert of the second commit.

  [1/2]: progress: set default delay threshold to 100%, not 0%
  [2/2]: progress: drop delay-threshold code

 progress.c | 24 +---
 1 file changed, 5 insertions(+), 19 deletions(-)

This regression is in v2.15, so this probably ought to go to maint (at
least the first part, though I think the second should have no
user-visible effects).

-Peff


Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Lars Schneider

> On 04 Dec 2017, at 22:42, Jeff King  wrote:
> 
> On Mon, Dec 04, 2017 at 10:31:15PM +0100, Lars Schneider wrote:
> 
 I would like to add "for your input" or "for you" to convey 
 that Git is not waiting for the machine but for the user.
 
   "hint: Launched editor. Waiting for your input..."
 
 Would that work for you?
>>> 
>>> I guess "input" was the part that I found funny/confusing. The only
>>> thing we know is that we're waiting on the editor process to finish, and
>>> everything else is making assumptions about what's happening in the
>>> editor.
>> 
>> I see. How about:
>> 
>> "hint: Launched editor. Waiting for your action..."
>> (my preference)
>> 
>> or
>> 
>> "hint: Launched editor. Waiting for you..."
> 
> Better, IMHO, though I still think literally saying:
> 
>  hint: Waiting for your editor to exit...
> 
> is the most accurate, which I think makes it clear that you must _exit_
> your editor, not just save and close the file.

I think "exit" would be confusing because most graphical editors (Sublime,
Textmate, Notepad++, ...) can open multiple files and do not need to exit. 
The requirement is indeed save and close the file.

How about:

hint: Waiting for your editor to close the file...

I generally like that as this is technical correct from all angles.
My only nit would be that "the file" is a bit imprecise... but
that's probably no problem.

- Lars


What's cooking in git.git (Dec 2017, #01; Mon, 4)

2017-12-04 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[New Topics]

* ab/simplify-perl-makefile (2017-12-04) 1 commit
 - Makefile: replace perl/Makefile.PL with simple make rules

 The build procedure for perl/ part has been greatly simplified by
 weaning ourselves off of MakeMaker.


* cc/object-filtering-typofix (2017-12-04) 1 commit
 - list-objects-filter-options: fix 'keword' typo in comment
 (this branch uses jh/object-filtering; is tangled with jh/fsck-promisors and 
jh/partial-clone.)

 Typofix for a topic already in 'next'.

 JeffH said that jh/object-filtering needs further polishing a bit
 before graduating to 'master', so it would be appreciated if this
 can also be rolled into such an incremental update.


* cc/skip-to-optional-val (2017-12-04) 4 commits
 - diff: use skip_to_optional_val_default()
 - diff: use skip_to_optional_val()
 - index-pack: use skip_to_optional_val()
 - git-compat-util: introduce skip_to_optional_val()

 Introduce a helper to simplify code to parse a common pattern that
 expects either "--key" or "--key=".

 This may want a final reroll to make it harder to misuse by
 allowing NULL at the valp part of the argument.
 cf. 
 cf. 


* lb/rebase-i-short-command-names (2017-12-04) 9 commits
 - t3404: add test case for abbreviated commands
 - rebase -i: learn to abbreviate command names
 - rebase -i -x: add exec commands via the rebase--helper
 - rebase -i: update functions to use a flags parameter
 - rebase -i: replace reference to sha1 with oid
 - rebase -i: refactor transform_todo_ids
 - rebase -i: set commit to null in exec commands
 - Documentation: use preferred name for the 'todo list' script
 - Documentation: move rebase.* configs to new file

 Allow a single-letter command name in the "rebase -i" todo list.

 This may want a final reroll to avoid adding new reference to insn
 where todo is the more modern term, among other minor things.
 cf. 


* ra/prompt-eread-fix (2017-12-04) 2 commits
 - git-prompt: fix reading files with windows line endings
 - git-prompt: make __git_eread intended use explicit

 Update the shell prompt script (in contrib/) to strip trailing CR
 from strings read from various "state" files.

 Proposed log messages of both commits may need small fixes.
 cf. 
 cf. 


* ds/for-each-file-in-obj-micro-optim (2017-12-04) 1 commit
 - sha1_file: use strbuf_add() instead of strbuf_addf()

 The code to iterate over loose object files got optimized.

 Will merge to 'next'.

--
[Stalled]

* jc/merge-symlink-ours-theirs (2017-09-26) 1 commit
 - merge: teach -Xours/-Xtheirs to symbolic link merge

 "git merge -Xours/-Xtheirs" learned to use our/their version when
 resolving a conflicting updates to a symbolic link.

 Needs review.


* mg/merge-base-fork-point (2017-09-17) 3 commits
 - merge-base: find fork-point outside partial reflog
 - merge-base: return fork-point outside reflog
 - t6010: test actual test output

 "merge-base --fork-point $branch $commit" is used to guess on which
 commit among the commits that were once at the tip of the $branch the
 $commit was built on top of, and it learns these historical tips from
 the reflog of the $branch.  When the true fork-point is lost due to
 pruning of old reflog entries, the command does not give any output,
 because it has no way to guess correctly and does not want to mislead
 the user with a wrong guess.

 The command has been updated to give the best but not known to be
 correct guess, based on a hope that a merge-base between $commit and a
 virtual merge across all the reflog entries that still are available
 for $branch may still be a closer to the true fork-point than the
 merge-base between $commit and the current tip of the $branch.

 This may have to be offered by an additional option, to allow the
 users that are prepared to see a potentially incorrect guess to opt
 into the feature, without affecting the current callers that may not
 be prepared to accept a guess that is not known to be correct.

 What's the doneness of this one?


* jk/drop-ancient-curl (2017-08-09) 5 commits
 - http: #error on too-old curl
 - curl: remove ifdef'd code never used with curl >=7.19.4
 - http: drop support for curl < 7.19.4
 - http: drop support for curl < 7.16.0
 - http: drop support for curl < 7.11.1

 Some code in 

Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Jeff King
On Mon, Dec 04, 2017 at 10:31:15PM +0100, Lars Schneider wrote:

> >> I would like to add "for your input" or "for you" to convey 
> >> that Git is not waiting for the machine but for the user.
> >> 
> >>"hint: Launched editor. Waiting for your input..."
> >> 
> >> Would that work for you?
> > 
> > I guess "input" was the part that I found funny/confusing. The only
> > thing we know is that we're waiting on the editor process to finish, and
> > everything else is making assumptions about what's happening in the
> > editor.
> 
> I see. How about:
> 
> "hint: Launched editor. Waiting for your action..."
> (my preference)
> 
> or
> 
> "hint: Launched editor. Waiting for you..."

Better, IMHO, though I still think literally saying:

  hint: Waiting for your editor to exit...

is the most accurate, which I think makes it clear that you must _exit_
your editor, not just save and close the file.

I dunno, maybe that is being overly paranoid. Certainly I have seen
graphical programs that have a mismatch with the one-process-per-action
way that most terminal editors view the world, and would hang around
even after the user thinks they are done editing. But at the same time,
those programs are unlikely to work well as $GIT_EDITOR in the first
place, because running them from the terminal may just open a new window
in an existing session and exit immediately (which is the opposite
problem -- the editor exited before the user actually did their thing).

So I'm not sure if that would be a problem in practice or not. I'm too
mired in the vim world to have any real data. Somebody like you who is
supporting a large number of less-Unixy users probably has more
perspective there.

-Peff


Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Junio C Hamano
Jeff King  writes:

> On Sun, Dec 03, 2017 at 01:47:04PM +0100, Lars Schneider wrote:
>
>> I am on the fence here. I like consistency but I don't want to
>> bother Git users.
>> 
>> @Peff: Do you lean into either direction? Could you imagine that
>> novice/regular users are bothered? (I don't expect experts to be
>> bothered too much as they would likely be able to find and set 
>> the advice config).
>
> I also am on the fence, and am OK with any of the options discussed.
>
> But now I've said my reservations on the list, so I can say "I told you
> so" if people complain (and naturally refuse to admit my objections if
> people love it). :)

Heh.  I am even OK with not doing anything for that matter ;-)



submodule modify/delete wrong message

2017-12-04 Thread David Turner
When merging with a submodule modify/delete conflict (i.e. I've deleted
the submodule, and I'm merging in a branch that modified it), git lies
about what it is doing:

"CONFLICT (modify/delete): submodule deleted in HEAD and modified in
submodules. Version submodules of submodule left in tree at
submodule~submodules.
Automatic merge failed; fix conflicts and then commit the result."

In fact, the working tree does not contain anything named
'submodule~submodules'.  

In addition, I would ordinarily resolve a conflict like this by using
'git rm'. Here, this gives a warning:

$ git rm submodule
submodule: needs merge
rm 'submodule'
warning: Could not find section in .gitmodules where path=submodule

Git's behavior here is significantly better than liggit2's (which tries
to check out 'submodule' as if it were a blob, and fails to do so), but
it's still confusing.

It's not clear to me what the correct behavior is here.  Maybe it's
sufficient to just fix the message?


Re: [PATCH v1] progress: print progress output for all operations taking longer than 2s

2017-12-04 Thread Junio C Hamano
Jeff King  writes:

> So the minimal fix is actually:
>
> diff --git a/progress.c b/progress.c
> index 289678d43d..b774cb1cd1 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -229,7 +229,7 @@ static struct progress *start_progress_delay(const char 
> *title, unsigned total,
>  
>  struct progress *start_delayed_progress(const char *title, unsigned total)
>  {
> - return start_progress_delay(title, total, 0, 2);
> + return start_progress_delay(title, total, 100, 2);
>  }

That makes a lot more sense to me (at least from a cursory
comparison between the two approaches).



Re: [PATCH v1] progress: print progress output for all operations taking longer than 2s

2017-12-04 Thread Junio C Hamano
lars.schnei...@autodesk.com writes:

> In 180a9f2 we implemented a progress API which suppresses the progress
> output if the progress has reached a specified percentage threshold
> within a given time frame. In 8aade10 we simplified the API and set the
> threshold to 0% and the time frame to 2 seconds for all delayed progress
> operations. That means we would only see a progress output if we still
> have 0% progress after 2 seconds. Consequently, only operations that
> have a very slow start would show the progress output at all.
>
> Remove the threshold entirely and print the progress output for all
> operations that take longer than 2 seconds.

Isn't this likely to result in much chattier progress output (read:
regression) forplaces whose the (P, N) was (0%, 2s) before 8aade107
("progress: simplify "delayed" progress API", 2017-08-19)?

Before or after that change, the places that passed (0%, 2s)
refrained from showing any progress, if at least 1 per-cent of the
work has finished at 2-second mark.  With this change, they will
suddenly start showing progress after 2-second mark, even if they
completed that much work already.

The places that did change the behaviour with the cited change are
the ones that used parameters different from (0%, 2s).  "git blame",
"diffcore-rename" and "unpack-trees" seem to be among them; they
used (50%, 1s), and we'd have seen the progress meter after 1s,
unless half of the work is already done by that time.  By replacing
that with (0%, 2s), the change made it a lot less likely to trigger.

The analysis in the cited commit log claims that (50%, 1s) is
equivalent to (0%, 2s) when the workload is smooth, but I think that
math is bogus X-<.

And the one in prune-packed, which used to be (95%, 2s), is quite
different from the value after the simplification.  We deliberately
made it 95 times more unlikely to trigger with that commit---it used
to be that unless 95% of work is already done, we saw progress
starting at 2-second mark, but now we see progress only when less
than 1% of work is done at 2-second mark.



Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Lars Schneider

> On 04 Dec 2017, at 18:32, Jeff King  wrote:
> 
> On Sun, Dec 03, 2017 at 01:47:04PM +0100, Lars Schneider wrote:
> 
>> I am on the fence here. I like consistency but I don't want to
>> bother Git users.
>> 
>> @Peff: Do you lean into either direction? Could you imagine that
>> novice/regular users are bothered? (I don't expect experts to be
>> bothered too much as they would likely be able to find and set 
>> the advice config).
> 
> I also am on the fence, and am OK with any of the options discussed.
> 
> But now I've said my reservations on the list, so I can say "I told you
> so" if people complain (and naturally refuse to admit my objections if
> people love it). :)

Well, since you and I are on the fence, I will follow Junio. Then we
can at least argue that the feature is consistent on all terminals.

- Lars


Re: [PATCH v1] progress: print progress output for all operations taking longer than 2s

2017-12-04 Thread Jeff King
On Mon, Dec 04, 2017 at 09:36:47PM +0100, lars.schnei...@autodesk.com wrote:

> From: Lars Schneider 
> 
> In 180a9f2 we implemented a progress API which suppresses the progress
> output if the progress has reached a specified percentage threshold
> within a given time frame. In 8aade10 we simplified the API and set the
> threshold to 0% and the time frame to 2 seconds for all delayed progress
> operations. That means we would only see a progress output if we still
> have 0% progress after 2 seconds. Consequently, only operations that
> have a very slow start would show the progress output at all.
> 
> Remove the threshold entirely and print the progress output for all
> operations that take longer than 2 seconds.

Good catch.

I was surprised at first to see 8aade10 as the culprit here, because it
was supposed to be a mechanical conversion. I.e.:

  start_progress_delay("foo", 0, 0, 2);
  start_progress_delay("bar", nr, 50, 1);

would both call the wrapper with a 0-percent delay threshold.

And now call sites like the first one still works, but the second one is
broken.

The problem is that commit got the meaning of the threshold parameter
totally backwards. It is "only show progress if we have not gotten past
this percent". Which means that passing "0" is nonsense, as you
discovered.

But wait, why did we have all those calls using "0" in the first place,
and why did they work? The answer is that we can only look at the
threshold at all when we know the total, since otherwise we can't
compute a percentage. So that first call should logically have been:

  start_progress_delay("foo", 0, 100, 2);

I.e., 100% to indicate that we always show the progress after 2 seconds
regardless. But since our total was "0", we never looked at the
threshold at all. But they misled us into thinking that "0" was the way
to say "always show this progress after 2 seconds".

So the minimal fix is actually:

diff --git a/progress.c b/progress.c
index 289678d43d..b774cb1cd1 100644
--- a/progress.c
+++ b/progress.c
@@ -229,7 +229,7 @@ static struct progress *start_progress_delay(const char 
*title, unsigned total,
 
 struct progress *start_delayed_progress(const char *title, unsigned total)
 {
-   return start_progress_delay(title, total, 0, 2);
+   return start_progress_delay(title, total, 100, 2);
 }
 
 struct progress *start_progress(const char *title, unsigned total)

That keeps "start_progress_delay" functioning properly, but makes
callers of the wrapper use a sane threshold.

That said, after 8aade10 there are literally no callers that use a
threshold other than what is set here. So we could just rip out the
"threshold" feature entirely, which is what your patch is doing.

I'm OK with that route, but I think we would want to explain that
decision in the commit message a bit better.

> a few weeks ago I was puzzled why the progress output is not shown in
> certain situations [1]. I debugged the issue a bit today and came up
> with this patch as solution. It is entirely possible that I misunderstood
> the intentions of the progress API and therefore my patch is bogus.
> In this case, please treat this email as RFC.

No, I think 8aade10 misunderstood the progress API. ;)

-Peff


Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Lars Schneider

> On 04 Dec 2017, at 18:26, Jeff King  wrote:
> 
> On Sun, Dec 03, 2017 at 05:39:10PM +0100, Lars Schneider wrote:
> 
> +   fprintf(stderr, _("hint: Waiting for your editor 
> input..."));
 I found "waiting for editor input" to be a funny way of saying this. I
 input to the editor, the editor does not input to Git. :)
 Maybe "waiting for your editor finish" or something would make more
 sense?
>>> 
>>> May be the good "Launched editor. Waiting ..." message, that was used in a 
>>> previous version, itself makes sense?
>> 
>> Perfect bikeshed topic :-)
>> 
>> I would like to add "for your input" or "for you" to convey 
>> that Git is not waiting for the machine but for the user.
>> 
>>"hint: Launched editor. Waiting for your input..."
>> 
>> Would that work for you?
> 
> I guess "input" was the part that I found funny/confusing. The only
> thing we know is that we're waiting on the editor process to finish, and
> everything else is making assumptions about what's happening in the
> editor.

I see. How about:

"hint: Launched editor. Waiting for your action..."
(my preference)

or

"hint: Launched editor. Waiting for you..."

- Lars

gitattributes not read for diff-tree anymore in 2.15?

2017-12-04 Thread Ben Boeckel
Hi,

I've bisected a failure in our test suite to this commit:

commit 557a5998df19faf8641acfc5b6b1c3c2ba64dca9 (HEAD, refs/bisect/bad)
Author: Brandon Williams 
Date:   Thu Aug 3 11:20:00 2017 -0700

submodule: remove gitmodules_config

Now that the submodule-config subsystem can lazily read the gitmodules
file we no longer need to explicitly pre-read the gitmodules by calling
'gitmodules_config()' so let's remove it.

Signed-off-by: Brandon Williams 
Signed-off-by: Junio C Hamano 

Which is tags/v2.15.0-rc0~120^2.

Our test suite is in a Rust project here:

https://gitlab.kitware.com/utils/rust-git-checks

and the failing test(s) can be run using:

cargo test whitespace_all_ignored

The test checks that when `.gitattributes` says that whitespace errors
should be ignored, they aren't reported as errors. My guess is that not
reading the gitmodules configuration also skips reading attributes
files. Is this reasoning correct?

Thanks,

--Ben


[PATCH v1] progress: print progress output for all operations taking longer than 2s

2017-12-04 Thread lars . schneider
From: Lars Schneider 

In 180a9f2 we implemented a progress API which suppresses the progress
output if the progress has reached a specified percentage threshold
within a given time frame. In 8aade10 we simplified the API and set the
threshold to 0% and the time frame to 2 seconds for all delayed progress
operations. That means we would only see a progress output if we still
have 0% progress after 2 seconds. Consequently, only operations that
have a very slow start would show the progress output at all.

Remove the threshold entirely and print the progress output for all
operations that take longer than 2 seconds.

Signed-off-by: Lars Schneider 
---

Hi,

a few weeks ago I was puzzled why the progress output is not shown in
certain situations [1]. I debugged the issue a bit today and came up
with this patch as solution. It is entirely possible that I misunderstood
the intentions of the progress API and therefore my patch is bogus.
In this case, please treat this email as RFC.

Thanks,
Lars


[1] https://public-inbox.org/git/dc84fb2e-a26e-4957-b5fa-be6ddec34...@gmail.com/


Notes:
Base Commit: 1a4e40aa5d (1a4e40aa5dc16564af879142ba9dfbbb88d1e5ff)
Diff on Web: https://github.com/larsxschneider/git/commit/3e5fdc512a
Checkout:git fetch https://github.com/larsxschneider/git 
progress-fix-v1 && git checkout 3e5fdc512a

 progress.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/progress.c b/progress.c
index 289678d43d..7fa1b0f235 100644
--- a/progress.c
+++ b/progress.c
@@ -34,7 +34,6 @@ struct progress {
unsigned total;
unsigned last_percent;
unsigned delay;
-   unsigned delayed_percent_threshold;
struct throughput *throughput;
uint64_t start_ns;
 };
@@ -86,16 +85,6 @@ static int display(struct progress *progress, unsigned n, 
const char *done)
if (progress->delay) {
if (!progress_update || --progress->delay)
return 0;
-   if (progress->total) {
-   unsigned percent = n * 100 / progress->total;
-   if (percent > progress->delayed_percent_threshold) {
-   /* inhibit this progress report entirely */
-   clear_progress_signal();
-   progress->delay = -1;
-   progress->total = 0;
-   return 0;
-   }
-   }
}

progress->last_value = n;
@@ -206,7 +195,7 @@ int display_progress(struct progress *progress, unsigned n)
 }

 static struct progress *start_progress_delay(const char *title, unsigned total,
-unsigned percent_threshold, 
unsigned delay)
+unsigned delay)
 {
struct progress *progress = malloc(sizeof(*progress));
if (!progress) {
@@ -219,7 +208,6 @@ static struct progress *start_progress_delay(const char 
*title, unsigned total,
progress->total = total;
progress->last_value = -1;
progress->last_percent = -1;
-   progress->delayed_percent_threshold = percent_threshold;
progress->delay = delay;
progress->throughput = NULL;
progress->start_ns = getnanotime();
@@ -229,12 +217,12 @@ static struct progress *start_progress_delay(const char 
*title, unsigned total,

 struct progress *start_delayed_progress(const char *title, unsigned total)
 {
-   return start_progress_delay(title, total, 0, 2);
+   return start_progress_delay(title, total, 2);
 }

 struct progress *start_progress(const char *title, unsigned total)
 {
-   return start_progress_delay(title, total, 0, 0);
+   return start_progress_delay(title, total, 0);
 }

 void stop_progress(struct progress **p_progress)
--
2.15.1



Re: submodules and merging (Was: Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue)

2017-12-04 Thread Jacob Keller
On Mon, Dec 4, 2017 at 11:33 AM, Stefan Beller  wrote:
> On Sat, Nov 25, 2017 at 9:59 PM, Jacob Keller  wrote:
>> On Sat, Nov 25, 2017 at 2:37 PM, Elijah Newren  wrote:
>>> On Wed, Nov 15, 2017 at 9:13 AM, Jacob Keller  
>>> wrote:
 On Tue, Nov 14, 2017 at 10:13 AM, Stefan Beller  wrote:
>>>
> But this line of though might be distracting from your original point,
> which was that we have so much to keep in mind when doing tree
> operations (flags, D/F conflicts, now submodules too). I wonder how
> a sensible refactoring would look like to detangle all these aspects,
> but still keeping Git fast and not overengineered.

 I think given how complex a lot of these code paths are, that an
 attempt to refactor it a bit to detangle some of the mess would be
 well worth the time. I'd suspect it might make handling the more
 complex task of actually resolving conflicts to be easier, so the
 effort to clean up the code here should be worth it.
>>>
>>> I think changing from a 4-way merge to a 3-way merge would make things
>>> much better, as Junio outlined here:
>>>
>>> https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com/
>>>
>>> I don't know of any way to detangle the other aspects, yet.
>
> Jonathan Nieder and me tried some pair programming some time ago[1]
> plumbing the repository object through most of the low level internals, which
> would help in detangling submodule merges as then these merges could
> be done in-core, just as Junio laid out.
>
> [1] https://github.com/stefanbeller/git/tree/object-store-jrn-rebased
>

This sounds promising to me.

Thanks,
Jake

>> I agree, that is absolutely a (big) step in the right direction.
>
>
> I agree as well; A better (abstracted) merge backend would be huge for
> the future of Git.
>
> Thanks,
> Stefan


Re: git-clone ignores submodule.recurse configuration

2017-12-04 Thread Stefan Beller
On Fri, Dec 1, 2017 at 1:47 PM, Brandon Williams  wrote:
> On 12/01, Ralf Thielow wrote:
>> Today I played around a bit with git submodules and noticed
>> that the very handy configuration "submodule.recurse" is not
>> working for the git-clone command.

The rationale here is that submodule.recursive ought to make your
conglomerate of repositories (superproject + some submodules)
look and feel like one giant repository.  Thinking further this sounds
like a user wants to set it globally, hence it may be expected to
work for fresh clones, too.

However at clone time, we don't know which submodules the
user wants. The first submodule recursive switches were an
all or nothing, but I think we need a better selection there.
For fetch there is an "only-those-needed" selection as
"on-demand". Similar for push.

For clone we expected the user wanting to specify which
submodules are interesting:

  git clone --recurse-submodule=

(Note to self: the `[=> "git help config" tells me that submodule.recurse affects
>> all commands that have a --recurse-submodules option.
>> git-clone seems to be an exception which is also mentioned in
>> a workflow example in Documentation/gitsubmodules.txt that
>> says:
>>
>>   # Unlike the other commands below clone still needs
>>   # its own recurse flag:
>>   git clone --recurse  
>>   cd 
>>
>> So this seems to be a known issue. Is someone already
>> working on this or has a plan to do it? Or is there a reason
>> not doing this for git-clone but for git-pull?
>
> When we introduced the "submodule.recurse" config we explicitly had it
> not work with clone because a recursive clone ends up pulling data from
> other sources aside from the URL the user explicitly provides.
> Historically there have been a number of security related bugs with
> respect to cloning submodules so we felt it was best to require a
> recursive clone to be requested explicitly, at least for the time being.
>
> --
> Brandon Williams


Re: [PATCH v2] Makefile: replace perl/Makefile.PL with simple make rules

2017-12-04 Thread Dan Jacques
Junio C Hamano writes:

> Well the thing is that I cannot queue this and Dan's at the same
> time, while both of these topics are expected to be in flux.  For
> today's pushout, I tentatively kicked out Dan's relative path series
> and queued this one to see how well it works with the rest of the
> system, after giving this patch another round of reading.
>
> It seemed that Dan was happy with (an earlier draft of?) this
> build procedure simplification patch, so hopefully we can solidify
> this one reasonably quickly and ask the relative path series to be
> rebuilt on top of it?

Sounds good to me! I'm in no rush, and if this is moving forward I'm more
than happy to wait for it to land and rebase on top.

I didn't want to weigh in too heavily on the actual details of this patch
since I think the majority stakeholders are the folks who maintain the Perl
subsystem and those who build distribution packages. My work on Perl only
touches installation as a means to an end, and avarab@'s work does simplify
this for me.

Cheers, and thanks!
-Dan


Re: [PATCH v3] diff: support anchoring line(s)

2017-12-04 Thread Stefan Beller
On Thu, Nov 30, 2017 at 3:26 PM, Jonathan Tan  wrote:
> On Thu, 30 Nov 2017 01:36:37 +0100 (CET)
> Johannes Schindelin  wrote:
>
>> Hi Jonathan,
>>
>> On Tue, 28 Nov 2017, Jonathan Tan wrote:
>>
>> > @@ -4607,7 +4627,14 @@ int diff_opt_parse(struct diff_options *options,
>> > DIFF_XDL_CLR(options, NEED_MINIMAL);
>> > options->xdl_opts &= ~XDF_DIFF_ALGORITHM_MASK;
>> > options->xdl_opts |= value;
>> > +   if (value == XDF_PATIENCE_DIFF)
>> > +   clear_patience_anchors(options);
>> > return argcount;
>> > +   } else if (skip_prefix(arg, "--anchored=", )) {
>> > +   options->xdl_opts = DIFF_WITH_ALG(options, PATIENCE_DIFF);
>> > +   ALLOC_GROW(options->anchors, options->anchors_nr + 1,
>> > +  options->anchors_alloc);
>> > +   options->anchors[options->anchors_nr++] = xstrdup(arg);
>>
>> I looked and failed to find the code that releases this array after the
>> diff is done... did I miss anything?
>
> You didn't miss anything. As far as I can tell, occurrences of struct
> diff_options live throughout the lifetime of an invocation of Git and
> are not freed. (Even if the struct itself is allocated on the stack, its
> pathspec has some elements allocated on the heap.)

I thought at least for the intra-line word diff, which allocates its own
diff options struct, we'd clear them, but looking around we seem to leak
the diff options consistently. (builtin/blame.c is nice enough to
`clear_pathspec(_opts.pathspec)`, but that's about it, no other
command takes care of the cruft.

I wonder if diff_flush might be a good place to clean up the diff options
and after the flush the diff options are declared invalid?

> test_expect_success '--anchored with non-unique line has no effect'

okay, if it turns out we need this case in the future we can still come up
with ideas.

Thanks,
Stefan


Re: [PATCH v2] Makefile: replace perl/Makefile.PL with simple make rules

2017-12-04 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> On Mon, Dec 4, 2017 at 5:22 PM, Junio C Hamano  wrote:
>> I did this immediately after applying; please double check.
>>
>> Thanks.
>
> Thanks. Looks good to me. I'll incorporate that info future
> submissions if there's more stuff to fix, but for now if you could
> queue it like that that would be great.

Well the thing is that I cannot queue this and Dan's at the same
time, while both of these topics are expected to be in flux.  For
today's pushout, I tentatively kicked out Dan's relative path series
and queued this one to see how well it works with the rest of the
system, after giving this patch another round of reading.

It seemed that Dan was happy with (an earlier draft of?) this
build procedure simplification patch, so hopefully we can solidify
this one reasonably quickly and ask the relative path series to be
rebuilt on top of it?

Thanks.


Re: submodules and merging (Was: Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue)

2017-12-04 Thread Stefan Beller
On Sat, Nov 25, 2017 at 9:59 PM, Jacob Keller  wrote:
> On Sat, Nov 25, 2017 at 2:37 PM, Elijah Newren  wrote:
>> On Wed, Nov 15, 2017 at 9:13 AM, Jacob Keller  wrote:
>>> On Tue, Nov 14, 2017 at 10:13 AM, Stefan Beller  wrote:
>>
 But this line of though might be distracting from your original point,
 which was that we have so much to keep in mind when doing tree
 operations (flags, D/F conflicts, now submodules too). I wonder how
 a sensible refactoring would look like to detangle all these aspects,
 but still keeping Git fast and not overengineered.
>>>
>>> I think given how complex a lot of these code paths are, that an
>>> attempt to refactor it a bit to detangle some of the mess would be
>>> well worth the time. I'd suspect it might make handling the more
>>> complex task of actually resolving conflicts to be easier, so the
>>> effort to clean up the code here should be worth it.
>>
>> I think changing from a 4-way merge to a 3-way merge would make things
>> much better, as Junio outlined here:
>>
>> https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com/
>>
>> I don't know of any way to detangle the other aspects, yet.

Jonathan Nieder and me tried some pair programming some time ago[1]
plumbing the repository object through most of the low level internals, which
would help in detangling submodule merges as then these merges could
be done in-core, just as Junio laid out.

[1] https://github.com/stefanbeller/git/tree/object-store-jrn-rebased

> I agree, that is absolutely a (big) step in the right direction.


I agree as well; A better (abstracted) merge backend would be huge for
the future of Git.

Thanks,
Stefan


Re: [PATCH v4 9/9] t3512/t3513: remove KNOWN_FAILURE_CHERRY_PICK_SEES_EMPTY_COMMIT=1

2017-12-04 Thread Stefan Beller
On Fri, Nov 24, 2017 at 3:07 AM, Phillip Wood  wrote:
> From: Phillip Wood 
>
> Now that the sequencer creates commits without forking 'git commit' it
> does not see an empty commit in these tests which fixes the known
> breakage. Note that logic for handling
> KNOWN_FAILURE_CHERRY_PICK_SEES_EMPTY_COMMIT=1 is not removed from
> lib-submodule-update.sh as it is still used by other tests.
>
> Signed-off-by: Phillip Wood 
> ---
>
> Notes:
> The output of the tests with after the previous patch looks like the
> output of the merge tests (see below), so I'm hopeful that this is a
> genuine fix, but someone who knows about submodules should check that
> things are in fact working properly now.

Looking at the patch only (in combination with the history of the
submodule tests,
283f56a40b (cherry-pick: add t3512 for submodule updates, 2014-06-15))
this patch
looks good to me. I wonder though if this needs to be squashed in another commit
to keep the test suite working for each patch and have no intermittent
failure in the
series.

Thanks,
Stefan


Re: [PATCH] hashmap: adjust documentation to reflect reality

2017-12-04 Thread Stefan Beller
On Sat, Dec 2, 2017 at 9:35 PM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> My second suggestion (which I'm on the fence about) is: would it better
>> to just say "see t/helper/test-hashmap.c for a representative example?"

I think that may be better in the long run, indeed.

>
> I also had the same thought.  It is rather unwieldy to ask people to
> lift code from comment text, and it is also hard to maintain such a
> commented out code to make sure it is up to date.
>
>> I'm all for code examples in documentation, but this one is quite
>> complex. The code in test-hashmap.c is not much more complex, and is at
>> least guaranteed to compile and run.
>
> Yup.  Exactly.
>
>> It doesn't show off how to combine a flex-array with a hashmap as
>> well, but I'm not sure how important that is. So I could go either
>> way.

We could add that example to the test helper as then we have a good (tested)
example for that case, too.

> In any case, keeping a bad example as-is is less good than replacing
> it with a corrected one, so I do not mind taking this patch as an
> immediate first step, whether we later decide to remove it and refer
> to an external file that has a real example that will be easier to
> maintain and use.
>
> Thanks.

Thanks for taking this and building on top,
Stefan


Re: [PATCH v2 2/2] Doc/check-ref-format: clarify information about @{-N} syntax

2017-12-04 Thread Junio C Hamano
Kaartic Sivaraam  writes:

>> Stepping back a bit, the mild suspicion above says
>> 
>> $ git checkout HEAD^0
>> ... do things ...
>> $ git checkout -b temp
>> ... do more things ...
>> $ git checkout -B @{-1}
>> 
>> that creates a new branch whose name is 40-hex of a commit that
>> happens to be where we started the whole dance *is* a bug.  No sane
>> user expects that to happen, and the last step "checkout -B @{-1}"
>> should result in an error instead [*1*].
>> 
>> I was wondering if "git check-ref-format --branch @{-1}", when used
>> in place of "checkout -B @{-1}" in the above sequence,
>
> I guess you mean '... "git checkout -B $(git check-ref-format --branch
> @{-1}", when used in place of "git checkout -B @{-1}" ...' ?

No you guessed wrong.  I was (and am) wondering if the last step in
the following sequence should fail.

>> $ git checkout HEAD^0
>> ... do things ...
>> $ git checkout -b temp
>> ... do more things ...
>> $ git check-ref-format --branch @{-1}

And I am leaning towards saying that it is a bug that it does not
fail; @{-1} is a detached HEAD and not a concrete branch name in
this case, so "check-ref-format --branch" should at least notice
and say that it is a request that may lead to a nonsense next step
(which is to create a branch with that 40-hex name).


Re: [PATCH v4 7/9] sequencer: load commit related config

2017-12-04 Thread Junio C Hamano
Phillip Wood  writes:

> --- a/builtin/rebase--helper.c
> +++ b/builtin/rebase--helper.c
> @@ -9,6 +9,17 @@ static const char * const builtin_rebase_helper_usage[] = {
>   NULL
>  };
>  
> +static int git_rebase_helper_config(const char *k, const char *v, void *cb)
> +{
> + int status;
> +
> + status = git_sequencer_config(k, v, NULL);
> + if (status)
> + return status;
> +
> + return git_default_config(k, v, NULL);
> +}
> +

Sorry for spotting the problem this late, but this code is
unfortunate and we will need to revisit it later; we may want to do
so sooner rather than later.

When k,v is a valid configuration that is handled by
sequencer_config() successfully, this code still needs to call into
default_config() with the same k,v, only to get it ignored.

The problem lies in the (mis)design of git_sequencer_config().  The
function should either allow the caller to notice that (k,v) has
already been handled, or be the last one in the callback by making a
call to default_config() itself.

For the former, because this helper does not have to be called
directly as a git_config() callback, but instead it is always meant
to be called indirectly from another git_config() callback (like
git_rebase_helper_config() here, and common_config() in
builtin/revert.c like we see below), it does *not* have to be
constrained by the function signature required for it to be a config
callback.  It could take a pointer to an int that stores if 'k' was
handled inside the function,

int git_sequencer_config_helper(char *k, char *v, int *handled);

for example.


Re: [PATCH v2] Makefile: replace perl/Makefile.PL with simple make rules

2017-12-04 Thread Ævar Arnfjörð Bjarmason
On Mon, Dec 4, 2017 at 5:22 PM, Junio C Hamano  wrote:
> I did this immediately after applying; please double check.
>
> Thanks.

Thanks. Looks good to me. I'll incorporate that info future
submissions if there's more stuff to fix, but for now if you could
queue it like that that would be great.

> 1: da337670f5 ! 1: aeae85bdd0 Makefile: replace perl/Makefile.PL with simple 
> make rules
> @@ -27,7 +27,7 @@
>  So replace the whole thing with something that's pretty much a copy 
> of
>  how we generate po/build/**.mo from po/*.po, just with a small sed(1)
>  command instead of msgfmt. As that's being done rename the files
> -from *.pm to *.pmc just to indicate that they're genreated (see
> +from *.pm to *.pmc just to indicate that they're generated (see
>  "perldoc -f require").
>
>  While I'm at it, change the fallback for Error.pm from being 
> something
> @@ -50,9 +50,9 @@
> it could be set in addition to INSTLIBDIR, but my reading of [7] 
> is
> that this is the desired behavior.
>
> - * We don't man pages for all of the perl modules as we used t, only
> -   Git(3pm). As discussed on-list[8] that we were building installed
> -   manpages for purely internal APIs like Git::I18N or
> + * We don't build man pages for all of the perl modules as we used 
> to,
> +   only Git(3pm). As discussed on-list[8] that we were building
> +   installed manpages for purely internal APIs like Git::I18N or
> private-Error.pm was always a bug anyway, and all the Git::SVN::*
> ones say they're internal APIs.
>
> :


Re: [PATCH v2 1/2] git-prompt: make __git_eread intended use explicit

2017-12-04 Thread Junio C Hamano
Robert Abel  writes:

> __git_eread is used to read a single line of a given file (if it exists)
> into a variable without the EOL. All six current users of __git_eread
> use it that way and don't expect multi-line content.

Changing $@ to $2 does not change whether this is about "multi-line"
or not.  What you are changing is that the original was prepared to
be given two or more variable names, and split an input line at IFS
into multiple tokens to be assigned to these variables, but with
this change, the caller can only use one variable and this function
will not split the line and store it into that single variable.

The above can easily be fixed with a bit of rewording, perhaps like:

... that way.  We do not need to split the line into tokens and
assign them to multiple variables---reading only into a single
variable needs to be supported.

While reviewing this patch, I also wondered if the "read" wants to
become "read -r" or something that is even safer than simply
avoiding tokenization, but after scanning to see exactly which files
__git_eread is used to read from, I do not think it matters (the
input will not have a backslash that would want to be protected from
'read'), so this should be OK.

> Signed-off-by: Robert Abel 
> ---
>  contrib/completion/git-prompt.sh | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/contrib/completion/git-prompt.sh 
> b/contrib/completion/git-prompt.sh
> index c6cbef38c..41a471957 100644
> --- a/contrib/completion/git-prompt.sh
> +++ b/contrib/completion/git-prompt.sh
> @@ -278,11 +278,12 @@ __git_ps1_colorize_gitstring ()
>   r="$c_clear$r"
>  }
>  
> +# Helper function to read the first line of a file into a variable.
> +# __git_eread requires 2 arguments, the file path and the name of the
> +# variable, in that order.
>  __git_eread ()
>  {
> - local f="$1"
> - shift
> - test -r "$f" && read "$@" <"$f"
> + test -r "$1" && read "$2" <"$1"
>  }
>  
>  # __git_ps1 accepts 0 or 1 arguments (i.e., format string)


Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Jeff King
On Sun, Dec 03, 2017 at 01:47:04PM +0100, Lars Schneider wrote:

> I am on the fence here. I like consistency but I don't want to
> bother Git users.
> 
> @Peff: Do you lean into either direction? Could you imagine that
> novice/regular users are bothered? (I don't expect experts to be
> bothered too much as they would likely be able to find and set 
> the advice config).

I also am on the fence, and am OK with any of the options discussed.

But now I've said my reservations on the list, so I can say "I told you
so" if people complain (and naturally refuse to admit my objections if
people love it). :)

-Peff


Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Jeff King
On Sat, Dec 02, 2017 at 09:15:29PM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I tried to think of ways this "show a message and then delete it" could
> > go wrong. It should work OK with editors that just do curses-like
> > things, taking over the terminal and then restoring it at the end.
> > ...
> 
> I think that it is not worth to special-case "dumb" terminals like
> this round of the patches do.  If it so much disturbs reviewers that
> "\e[K" may not work everywhere, we can do without the "then delete
> it" part.  It was merely trying to be extra nice, and the more
> important part of the "feature" is to be noticeable, and I do think
> that not showing anything on "dumb", only because the message cannot
> be retracted, is putting the cart before the horse.  
> 
> Since especially now people are hiding this behind an advise.*
> thing, I think it is OK to show a message and waste a line, even.

Yeah, I was tempted to suggest just dropping this terminal magic
completely. But it probably _does_ work and is helpful in the majority
of cases (i.e., where people have in-terminal editors). I dunno.

I am a little wary of hiding behind "but you can disable it with a
config option", because that's still a thing that users have to actually
do to get the previous behavior. And I expect to get some "ugh, git is
too chatty and annoying" backlash once this is in a released version.

But maybe that is just being paranoid. It's not like we don't have a lot
of other advice flags. I really could go either way on this whole thing
(but I'll be setting the advice flag myself ;) ).

> > An even worse case (and yes, this is really reaching) is:
> >
> >   $ GIT_EDITOR='echo one; printf "two\\r"; vim' git commit
> >   hint: Waiting for your editor input...one
> >   Aborting commit due to empty commit message.
> >
> > There we ate the "two" line.
> 
> Yes, I would have to agree that this one is reaching, as there isn't
> any valid reason other than "the editor then wanted to do \e[K
> later" for it to end its last line with CR.  So our eating that line
> is not a problem.

Yeah, this was just me trying to come up with all possible implications.
I agree it's probably not worth worrying about.

-Peff


Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Jeff King
On Sun, Dec 03, 2017 at 05:39:10PM +0100, Lars Schneider wrote:

> >>> +   fprintf(stderr, _("hint: Waiting for your editor 
> >>> input..."));
> >> I found "waiting for editor input" to be a funny way of saying this. I
> >> input to the editor, the editor does not input to Git. :)
> >> Maybe "waiting for your editor finish" or something would make more
> >> sense?
> > 
> > May be the good "Launched editor. Waiting ..." message, that was used in a 
> > previous version, itself makes sense?
> 
> Perfect bikeshed topic :-)
> 
> I would like to add "for your input" or "for you" to convey 
> that Git is not waiting for the machine but for the user.
> 
> "hint: Launched editor. Waiting for your input..."
> 
> Would that work for you?

I guess "input" was the part that I found funny/confusing. The only
thing we know is that we're waiting on the editor process to finish, and
everything else is making assumptions about what's happening in the
editor.

-Peff


Re: [PATCH v2 2/2] Doc/check-ref-format: clarify information about @{-N} syntax

2017-12-04 Thread Kaartic Sivaraam
On Sat, 2017-12-02 at 17:52 -0800, Junio C Hamano wrote:
> Kaartic Sivaraam  writes:
> 
> > > I have a mild suspicion that "git checkout -B @{-1}" would want to
> > > error out instead of creating a valid new branch whose name is
> > > 40-hex that happen to be the name of the commit object you were
> > > detached at previously.
> > 
> > I thought this the other way round. Rather than letting the callers
> > error out when @{-N} didn't expand to a branch name, I thought we
> > should not be expanding @{-N} syntax for "check-ref-format --branch" at
> > all to make a "stronger guarantee" that the result is "always" a valid
> > branch name. Then I thought it might be too restrictive and didn't
> > mention it. So, I dunno.
> > 

OK. Seems I was out of mind in the above reply. I have answered the
question about whether "branch=$(git check-ref-format @{-1}) && git
checkout -B $branch" should fail or not. Sorry, for the confusion in
case you mind it.


> > 
> > > I am not sure if "check-ref-format --branch" should the same; it is
> > > more about the syntax and the 40-hex _is_ valid there, so...
> > 
> > I'm not sure what you were trying to say here, sorry.
> 
> The "am not sure if" comes from this question: should these two be
> equivalent?
> 
> $ git check-ref-format --branch @{-1}
> $ git check-ref-format --branch $(git rev-parse --verify @{-1})
> 

I could see how,

$ git rev-parse --verify @{-1}
$ git rev-parse --verify $(git check-ref-format --branch @{-1})

could be equivalent. I'm not sure how the previous two might be
equivalent except in the case when the previous thing checked out was
not a branch.

1. "git check-ref-format --branch @{-1}" returns the previous thing
checked out which might be a branch name or a commit hash

2. "git check-ref-format --branch $(git rev-parse --verify @{-1})"
returns the commit hash of the previous thing (the hash of the tip if
was a branch). IIUC, this thing never returns a branch name.


> If they should be equivalent (and I think the current implementation
> says they are), then the answer to "if ... should do the same?"
> becomes "no, we should not error out".
> 
> Stepping back a bit, the mild suspicion above says
> 
> $ git checkout HEAD^0
> ... do things ...
> $ git checkout -b temp
> ... do more things ...
> $ git checkout -B @{-1}
> 
> that creates a new branch whose name is 40-hex of a commit that
> happens to be where we started the whole dance *is* a bug.  No sane
> user expects that to happen, and the last step "checkout -B @{-1}"
> should result in an error instead [*1*].
> 
> I was wondering if "git check-ref-format --branch @{-1}", when used
> in place of "checkout -B @{-1}" in the above sequence,

I guess you mean '... "git checkout -B $(git check-ref-format --branch
@{-1}", when used in place of "git checkout -B @{-1}" ...' ?


>  should or
> should not fail.  It really boils down to this question: what @{-1}
> expands to and what the user wants to do with it.
> 
> In the context of getting a revision (i.e. "rev-parse @{-1}") where
> we are asking what the object name is, the value of the detached
> HEAD we were on previously is a valid answer we are "expanding @{-1}
> to".  If we were on a concrete branch and @{-1} yields a concrete
> branch name, then rev-parse would turn that into an object name, and
> in the end, in either case, the object name is what we wanted to
> get.  So we do not want to error this out.
> 

Right.


> But a user of "git check-ref-format --branch" is not asking about
> the object name ("git rev-parse" would have been used otherwise).
> Which argues for erroring out "check-ref-format --branch @{-1}" if
> we were not on a branch in the previous state.
> 

Exactly what I thought.


> And that argues for erroring out "check-ref-format --branch @{-1}"
> in such a case, i.e. declaring that the first two commands in this
> message are not equivalent.
> 

As said before, IIUC, they give equivalent output to stdout only when
the previous thing checked out was not a branch.


-- 
Kaartic


Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Jeff King
On Sat, Dec 02, 2017 at 09:15:49AM +0530, Kaartic Sivaraam wrote:

> > Or given that the goal is really just making it clear that we've spawned
> > an editor, something like "starting editor %s..." would work.
> 
> There was already discussion related to the "continuous tense" used in the
> phrase.
> 
> Extract from [1]:

Thanks, I missed that one. I agree with the argument there.

-Peff


Re: RFC: Native clean/smudge filter for UTF-16 files

2017-12-04 Thread Jeff King
On Sun, Dec 03, 2017 at 07:48:01PM +0100, Lars Schneider wrote:

> > - if core.convertEncoding is true, then for any file with an
> >   encoding=foo attribute, internally run iconv(foo, utf8) in
> >   convert_to_git(), and likewise iconv(utf8, foo) in
> >   convert_to_working_tree.
> > 
> > - I'm not sure if core.convertEncoding should be enabled by default. If
> >   it's a noop as long as there's no encoding attribute, then it's
> >   probably fine. But I would not want accidental conversion or any
> >   slowdown for the common case that the user wants no conversion.
> 
> I think we should mimic the behavior of "eol=crlf/lf" attribute.
> 
> AFAIK whenever I set "*.ext text eol=crlf", then I can be sure the 
> file is checked out with CRLF independent of any of my local config
> settings. Isn't that correct? I would expect a similar behavior if
> "*.ext text encoding=utf16" is set. Wouldn't that mean that we do
> not need a "core.convertEncoding" config?

Yeah, on further thought, that's probably the right thing. Both "eol"
and "encoding" attributes are definite indications of what should happen
(unlike "text", which is just saying you _could_ convert line endings if
you wished to, and therefore has to be used in conjunction with a config
setting).

I like the name "encoding" for the attribute, but I do wonder if this
would bite anybody using it already for other purposes (like gitk).

> > There is one other approach, which is to really store utf-16 in the
> > repository and better teach the diff tools to handle it (which are
> > really the main thing in git that cares about looking into the blob
> > contents). You can do this already with a textconv filter, but:
> > 
> >  1. It's slow (though cacheable).
> > 
> >  2. It doesn't work unless each repo configures the filter (so not on
> > sites like GitHub, unless we define a micro-format that diff=utf16
> > should be textconv'd on display, and get all implementations to
> > respect that).
> 
> Actually, rendering diffs on Git hosting sites such as GitHub is one
> of my goals. Therefore, storing content as UTF-16 wouldn't be a solution
> for me.

If there were a convention for specifying the attribute, I think sites
like GitHub would start respecting it in the server-side diffs (though
like I said, we could also just auto-detect via BOM without even
requiring any attributes to be set up).

> >  3. Textconv patches look good, but can't be applied. This occasionally
> > makes things awkward, depending on your workflow.
> 
> TBH I dont't understand what you mean here. What do you mean with
> "textconv patches"?

I mean the patch produced by "git diff" when textconv is in effect. That
patch cannot be applied to the original content. E.g.:

  git init
  echo "* diff=foo" >.git/info/attributes
  git config diff.foo.textconv "sed s/^/foo:/"

  echo base >file
  git add file
  git commit -m base

  echo change >file
  git diff >patch

  git reset --hard
  git apply patch

That works in the absence of the textconv, but not with it. (For a real
binary file, you'd probably need "diff --binary" to produce a usable
patch, but the principle is the same).

-Peff


Re: [PATCH v6 1/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875

2017-12-04 Thread Jeff King
On Mon, Dec 04, 2017 at 07:18:55AM +, Florian Manschwetus wrote:

> I could provide a bash script I used in between to make this working
> with IIS, without fixing http-backend binary, maybe this helps to
> understand how this cases might be handled.

I think it would be pretty easy to handle all cases by inserting another
process in front of http-backend that just reads $CONTENT_LENGTH bytes
and then gives an EOF to http-backend. I assume that's what your bash
script does. It's just that this passes the data through an extra layer
of pipes.

If that's an acceptable tradeoff, it seems like an ideal solution from a
maintainability standpoint, since it skips all the tricky situations
inside the http-backend code.

-Peff


Re: [PATCH v2] sha1_file: use strbuf_add() instead of strbuf_addf()

2017-12-04 Thread Derrick Stolee

On 12/4/2017 11:56 AM, Jeff King wrote:

When you put your cover letter first, you need to use "scissors" like:
-- >8 --

to separate it from the commit message. Using three-dashes means "git
am" will include your cover letter as the commit message, and omit your
real commit message entirely.


Thanks. I updated our internal best-practices accordingly.

-Stolee


Re: [PATCH v2] sha1_file: use strbuf_add() instead of strbuf_addf()

2017-12-04 Thread Jeff King
On Mon, Dec 04, 2017 at 09:06:03AM -0500, Derrick Stolee wrote:

> Thanks for the feedback on v1. This version fixes the cruft_cb
> bug and streamlines the strlen(). I would include an inter-diff
> but it was the same size as the patch.

Thanks, this version looks good to me. One procedural nit:

> Thanks,
> -Stolee
> 
> ---

When you put your cover letter first, you need to use "scissors" like:

-- >8 --

to separate it from the commit message. Using three-dashes means "git
am" will include your cover letter as the commit message, and omit your
real commit message entirely.

-Peff


Re: [PATCH v5 1/7] Documentation: revisions: fix typo: "three dot" ---> "three-dot" (in line with "two-dot").

2017-12-04 Thread Junio C Hamano
I found a few questionable pieces in 4/7 but other than that the
whole series is very nicely done.

Will queue; thanks.


Re: [PATCH v5 4/7] checkout: describe_detached_head: remove ellipsis after committish

2017-12-04 Thread Junio C Hamano
Ann T Ropea  writes:

> +# Detached HEAD tests for GIT_PRINT_SHA1_ELLIPSIS
> +
> +# The first detach operation is more chatty than the following ones.
> +cat > 1st_detach <<'EOF'
> +Note: checking out 'HEAD^'.
> +
> +You are in 'detached HEAD' state. You can look around, make experimental
> +changes and commit them, and you can discard any commits you make in this
> +state without impacting any branches by performing another checkout.
> +
> +If you want to create a new branch to retain commits you create, you may
> +do so (now or later) by using -b with the checkout command again. Example:
> +
> +  git checkout -b 
> +
> +HEAD is now at 7c7cd714e262 three
> +EOF
> +# The remaining ones just show info about previous and current HEADs.
> +cat > 2nd_detach <<'EOF'
> +Previous HEAD position was 7c7cd714e262 three
> +HEAD is now at 139b20d8e6c5 two
> +EOF
> +cat > 3rd_detach <<'EOF'
> +Previous HEAD position was 139b20d8e6c5 two
> +HEAD is now at d79ce1670bdc one
> +EOF

It is preferrable to have all of the above inside the
test_expect_success block that uses them.  

Also lose the SP between redirection operator and its target
filename, i.e.

command >file

not

command > file

> +test_expect_success 'describe_detached_head prints no SHA-1 ellipsis when 
> not asked to' '
> + reset && check_not_detached && unset GIT_PRINT_SHA1_ELLIPSIS &&
> +
> + # Various ways of *not* asking for ellipses
> +
> + unset GIT_PRINT_SHA1_ELLIPSIS && git -c 'core.abbrev=12' checkout HEAD^ 
> 1> actual 2>&1 &&

Use sane_unset from t/test-lib-functions.sh instead, unless you are
absolutely sure that the variable you are unsetting _is_ set at this
point.

> + check_detached &&
> + test_cmp 1st_detach actual && unset GIT_PRINT_SHA1_ELLIPSIS &&

Is the output we are grabbing with check_detached from the command
internationalized?  If so, the comparison should be done with
test_i18ncmp (otherwise, the test will break under the "poisoned
gettext" build).

Thanks.


Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-04 Thread Kaartic Sivaraam
On Sun, 2017-12-03 at 17:39 +0100, Lars Schneider wrote:
> > On 02 Dec 2017, at 04:45, Kaartic Sivaraam  
> > wrote:
> > 
> > On Friday 01 December 2017 11:59 PM, Jeff King wrote:
> > > On Fri, Dec 01, 2017 at 01:52:14PM +0100, Lars Schneider wrote:
> > > > 
> > > > Thanks for the review :-)
> > > 
> > > Actually, I meant to bikeshed one part but forgot. ;)
> > > > +   fprintf(stderr, _("hint: Waiting for your 
> > > > editor input..."));
> > > 
> > > I found "waiting for editor input" to be a funny way of saying this. I
> > > input to the editor, the editor does not input to Git. :)
> > > Maybe "waiting for your editor finish" or something would make more
> > > sense?
> > 
> > May be the good "Launched editor. Waiting ..." message, that was used in a 
> > previous version, itself makes sense?
> 
> Perfect bikeshed topic :-)
> 

Yep :-)


> I would like to add "for your input" or "for you" to convey 
> that Git is not waiting for the machine but for the user.
> 
> "hint: Launched editor. Waiting for your input..."
> 
> Would that work for you?
> 

Yeah, this one does look fine.

That said, FWIW I don't have strong opinions about the phrase/sentence
except that they should be readable and shouldn't get too verbose.


Thanks, 
Kaartic


Re: [PATCH v2] Makefile: replace perl/Makefile.PL with simple make rules

2017-12-04 Thread Junio C Hamano
I did this immediately after applying; please double check.

Thanks.

1: da337670f5 ! 1: aeae85bdd0 Makefile: replace perl/Makefile.PL with simple 
make rules
@@ -27,7 +27,7 @@
 So replace the whole thing with something that's pretty much a copy of
 how we generate po/build/**.mo from po/*.po, just with a small sed(1)
 command instead of msgfmt. As that's being done rename the files
-from *.pm to *.pmc just to indicate that they're genreated (see
+from *.pm to *.pmc just to indicate that they're generated (see
 "perldoc -f require").

 While I'm at it, change the fallback for Error.pm from being something
@@ -50,9 +50,9 @@
it could be set in addition to INSTLIBDIR, but my reading of [7] is
that this is the desired behavior.

- * We don't man pages for all of the perl modules as we used t, only
-   Git(3pm). As discussed on-list[8] that we were building installed
-   manpages for purely internal APIs like Git::I18N or
+ * We don't build man pages for all of the perl modules as we used to,
+   only Git(3pm). As discussed on-list[8] that we were building
+   installed manpages for purely internal APIs like Git::I18N or
private-Error.pm was always a bug anyway, and all the Git::SVN::*
ones say they're internal APIs.

:


Re: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter

2017-12-04 Thread Johannes Schindelin
Hi Liam,

On Mon, 4 Dec 2017, Johannes Schindelin wrote:

> On Sun, 3 Dec 2017, Liam Beguin wrote:
> 
> > diff --git a/sequencer.h b/sequencer.h
> > index 4e444e3bf1c4..3bb6b0658192 100644
> > --- a/sequencer.h
> > +++ b/sequencer.h
> > @@ -45,10 +45,12 @@ int sequencer_continue(struct replay_opts *opts);
> >  int sequencer_rollback(struct replay_opts *opts);
> >  int sequencer_remove_state(struct replay_opts *opts);
> >  
> > -int sequencer_make_script(int keep_empty, FILE *out,
> > -   int argc, const char **argv);
> > +#define TODO_LIST_KEEP_EMPTY (1U << 0)
> > +#define TODO_LIST_SHORTED_IDS (1U << 1)
> 
> Maybe SHORTEN_IDs? And either revert back to transform_todo_ids() or use
> SHORTEN_INSNS...
> 
> Maybe also TRANSFORM_TODO_LIST_* and maybe move the #define's above the
> transform_todo_ids() function, i.e. one line further down?
> 
> > +int sequencer_make_script(FILE *out, int argc, const char **argv,
> > + unsigned flags);

Ah, I just realized that make_script() already takes `flags`. Sorry for
the noise,
Johannes


Re: [PATCH v2 4/9] rebase -i: refactor transform_todo_ids

2017-12-04 Thread Junio C Hamano
Johannes Schindelin  writes:

> On Sun, 3 Dec 2017, Liam Beguin wrote:
>
>> The transform_todo_ids function is a little hard to read. Lets try
>> to make it easier by using more of the strbuf API. Also, since we'll
>> soon be adding command abbreviations, let's rename the function so
>> it's name reflects that change.
>
> I am not really a fan of the new name, and would prefer the old one, but
> that's only a nit, not a reason to reject the patch.

FWIW, I do think the new name goes backwards.  The command uses to
remember what operations are to be carried out in which order using
a thing, and the name of that thing "todo list".  We also called it
the "instruction sheet", and "insn" was a good term to call one item
on that sheet among other items.

But recent commits in the area are pushing us to call it "todo list"
consistently.  An element in that list should be called "todo".

A "todo" consists of two parts, "what operation is done" part and
"using what commit object" part.  The original implementation of
this function affected only the latter part, and in that light, the
original name transform_todo_ids() is understandable.  Now you are
planning to make it modify both parts, not just "ids", so it is
understandable that you would want to rename it.  But I do not think
"insn" matches the recent trend.  It even risks misunderstanding
(i.e. xfrm_todo_ids() is about modifying "using what commit" part,
so perhaps xfrm_todo_insns() is about modifying "what operation is
done" part---but that is different from what you want to do, which
is to update _both_ halves).

Calling it just transform_todo() would probably be more in line with
the reason why you wanted to rename it in the first place.



Re: [PATCH v2 0/9] rebase -i: add config to abbreviate command names

2017-12-04 Thread Johannes Schindelin
Hi Liam,

On Sun, 3 Dec 2017, Liam Beguin wrote:

> This series will add the 'rebase.abbreviateCommands' configuration
> option to allow `git rebase -i` to default to the single-letter command
> names when generating the todo list.
> 
> Using single-letter command names can present two benefits. First, it
> makes it easier to change the action since you only need to replace a
> single character (i.e.: in vim "r" instead of
> "ciw").  Second, using this with a large enough value of
> 'core.abbrev' enables the lines of the todo list to remain aligned
> making the files easier to read.
> 
> Changes in V2:
> - Refactor and rename 'transform_todo_ids'
> - Replace SHA-1 by OID in rebase--helper.c
> - Update todo list related functions to take a generic 'flags' parameter
> - Rename 'add_exec_commands' function to 'sequencer_add_exec_commands'
> - Rename 'add-exec' option to 'add-exec-commands'
> - Use 'strbur_read_file' instead of rewriting it
> - Make 'command_to_char' return 'comment_char_line' if no single-letter
>   command name is defined
> - Combine both tests into a single test case
> - Update commit messages

Looks very nice already! I offered a couple of comments/suggestions, but
nothing major.

Thank you,
Johannes


Antw: Re: bug deleting "unmerged" branch (2.12.3)

2017-12-04 Thread Ulrich Windl

Hi Philip!

I'm unsure what you are asking for...

Ulrich
>>> "Philip Oakley"  04.12.17 0.30 Uhr >>>
From: "Junio C Hamano" 
> "Philip Oakley"  writes:
>
>> I think it was that currently you are on M, and neither A nor B are
>> ancestors (i.e. merged) of M.
>>
>> As Junio said:- "branch -d" protects branches that are yet to be
>> merged to the **current branch**.
>
> Actually, I think people loosened this over time and removal of
> branch X is not rejected even if the range HEAD..X is not empty, as
> long as X is marked to integrate with/build on something else with
> branch.X.{remote,merge} and the range X@{upstream}..X is empty.
>
> So the stress of "current branch" above you added is a bit of a
> white lie.

Ah, thanks. [I haven't had chance to check the code]

The man page does say:
.-d
.Delete a branch. The branch must be fully merged in its upstream
.branch, or in HEAD if no upstream was set with --track 
.or --set-upstream.

It's whether or not Ulrich had joined the two aspects together, and if the
doc was sufficient to help recognise the 'unmerged' issue. Ulrich?
--
Philip




Re: How hard would it be to implement sparse fetching/pulling?

2017-12-04 Thread Jeff Hostetler



On 12/1/2017 1:24 PM, Jonathan Nieder wrote:

Jeff Hostetler wrote:

On 11/30/2017 6:43 PM, Philip Oakley wrote:



The 'companies' problem is that it tends to force a client-server, always-on
on-line mentality. I'm also wanting the original DVCS off-line capability to
still be available, with _user_ control, in a generic sense, of what they
have locally available (including files/directories they have not yet looked
at, but expect to have. IIUC Jeff's work is that on-line view, without the
off-line capability.

I'd commented early in the series at [1,2,3].


Yes, this does tend to lead towards an always-online mentality.
However, there are 2 parts:
[a] dynamic object fetching for missing objects, such as during a
 random command like diff or blame or merge.  We need this
 regardless of usage -- because we can't always predict (or
 dry-run) every command the user might run in advance.
[b] batch fetch mode, such as using partial-fetch to match your
 sparse-checkout so that you always have the blobs of interest
 to you.  And assuming you don't wander outside of this subset
 of the tree, you should be able to work offline as usual.
If you can work within the confines of [b], you wouldn't need to
always be online.


Just to amplify this: for our internal use we care a lot about
disconnected usage working.  So it is not like we have forgotten about
this use case.


We might also add a part [c] with explicit commands to back-fill or
alter your incomplete view of the ODB


Agreed, this will be a nice thing to add.

[...]

At its core, my idea was to use the object store to hold markers for the
'not yet fetched' objects (mainly trees and blobs). These would be in a
known fixed format, and have the same effect (conceptually) as the
sub-module markers - they _confirm_ the oid, yet say 'not here, try
elsewhere'.


We do have something like this.  Jonathan can explain better than I, but
basically, we denote possibly incomplete packfiles from partial clones
and fetches as "promisor" and have special rules in the code to assert
that a missing blob referenced from a "promisor" packfile is OK and can
be fetched later if necessary from the "promising" remote.

The main problem with markers or other lists of missing objects is
that it has scale problems for large repos.


Any chance that we can get a design doc in Documentation/technical/
giving an overview of the design, with a brief "alternatives
considered" section describing this kind of thing?


Yeah, I'll start one.  I have notes within the individual protocol
docs and man-pages, but no summary doc.  Thanks!



E.g. some of the earlier descriptions like
  
https://public-inbox.org/git/20170915134343.3814d...@twelve2.svl.corp.google.com/
  https://public-inbox.org/git/cover.1506714999.git.jonathanta...@google.com/
  https://public-inbox.org/git/20170113155253.1644-1-benpe...@microsoft.com/
may help as a starting point.

Thanks,
Jonathan



Re: [PATCH v2 6/9] rebase -i: update functions to use a flags parameter

2017-12-04 Thread Johannes Schindelin
Hi Liam,

On Sun, 3 Dec 2017, Liam Beguin wrote:

> diff --git a/sequencer.h b/sequencer.h
> index 4e444e3bf1c4..3bb6b0658192 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -45,10 +45,12 @@ int sequencer_continue(struct replay_opts *opts);
>  int sequencer_rollback(struct replay_opts *opts);
>  int sequencer_remove_state(struct replay_opts *opts);
>  
> -int sequencer_make_script(int keep_empty, FILE *out,
> - int argc, const char **argv);
> +#define TODO_LIST_KEEP_EMPTY (1U << 0)
> +#define TODO_LIST_SHORTED_IDS (1U << 1)

Maybe SHORTEN_IDs? And either revert back to transform_todo_ids() or use
SHORTEN_INSNS...

Maybe also TRANSFORM_TODO_LIST_* and maybe move the #define's above the
transform_todo_ids() function, i.e. one line further down?

> +int sequencer_make_script(FILE *out, int argc, const char **argv,
> +   unsigned flags);
>  
> -int transform_todo_insn(int shorten_ids);
> +int transform_todo_insn(unsigned flags);
>  int check_todo_list(void);
>  int skip_unnecessary_picks(void);
>  int rearrange_squash(void);

Ciao,
Johannes


Re: How hard would it be to implement sparse fetching/pulling?

2017-12-04 Thread Jeff Hostetler



On 12/2/2017 11:30 AM, Philip Oakley wrote:

From: "Jeff Hostetler" 
Sent: Friday, December 01, 2017 2:30 PM

On 11/30/2017 8:51 PM, Vitaly Arbuzov wrote:

I think it would be great if we high level agree on desired user
experience, so let me put a few possible use cases here.

1. Init and fetch into a new repo with a sparse list.
Preconditions: origin blah exists and has a lot of folders inside of
src including "bar".
Actions:
git init foo && cd foo
git config core.sparseAll true # New flag to activate all sparse
operations by default so you don't need to pass options to each
command.
echo "src/bar" > .git/info/sparse-checkout
git remote add origin blah
git pull origin master
Expected results: foo contains src/bar folder and nothing else,
objects that are unrelated to this tree are not fetched.
Notes: This should work same when fetch/merge/checkout operations are
used in the right order.


With the current patches (parts 1,2,3) we can pass a blob-ish
to the server during a clone that refers to a sparse-checkout
specification.


I hadn't appreciated this capability. I see it as important, and should be 
available both ways, so that a .gitNarrow spec can be imposed from the server 
side, as well as by the requester.

It could also be used to assist in the 'precious/secret' blob problem, so that 
AWS keys are never pushed, nor available for fetching!


To be honest, I've always considered partial clone/fetch as
a client-side request as a performance feature to minimize
download times and disk space requirements on the client.
I've not thought of it from the "server has secrets" point
of view.

We can talk about it, but I'd like to keep it outside the
scope of the current effort.  My concerns are that that is
not the appropriate mechanism to enforce MAC/DAC like security
mechanisms.  For example:
[a] The client will still receive the containing trees that
refer to the sensitive blobs, so the user can tell when
the secret blobs change -- they wouldn't have either blob,
but can tell when they are changed.  This event by itself
may or may not leak sensitive information depending on the
terms of the security policy in place.
[b] The existence of such missing blobs would tell the client
which blobs are significant and secret and allow them to
focus their attack.  It would be better if those assets
were completely hidden and not in the tree at all.
[c] The client could push a fake secret blob to replace the
valid one on the server.  You would have to audit the
server to ensure that it never accepts a push containing
a change to any secret blob.  And the server would need
an infrastructure to know about all secrets in the tree.
[d] When a secret blob does change, any local merges by the
user lack information to complete the merge -- they can't
merge the secrets and they can't be trusted to correctly
pick-ours or pick-theirs -- so their workflows are broken.
I'm not trying to blindly spread FUD here, but it is arguments
like these that make me suggest that the partial clone mechanism
is not the right vehicle for such "secret" blobs.





   There's a bit of a chicken-n-egg problem getting
things set up.  So if we assume your team would create a series
of "known enlistments" under version control, then you could


s/enlistments/entitlements/ I presume?


Within my org we speak of "enlistments" as subset of the tree
that you plan to work on.  For example, you might enlist in the
"file system" portion of the tree or in the "device drivers"
portion.  If the Makefiles have good partitioning, you should
only need one of the above portions to do productive work within
a feature area.

I'm not sure what you mean by "entitlements".




just reference one by : during your clone.  The
server can lookup that blob and just use it.

    git clone --filter=sparse:oid=master:templates/bar URL

And then the server will filter-out the unwanted blobs during
the clone.  (The current version only filters blobs; you still
get full commits and trees.  That will be revisited later.)


I'm for the idea that only the in-heirachy trees should be sent.
It should also be possible that the server replies that it is 
only sending a narrow clone, with the given (accessible?) spec.


I do want to extend this to have unneeded tree filtering too.
It is just not in this version.





On the client side, the partial clone installs local config
settings into the repo so that subsequent fetches default to
the same filter criteria as used in the clone.


I don't currently have provision to send a full sparse-checkout
specification to the server during a clone or fetch.  That
seemed like too much to try to squeeze into the protocols.
We can revisit this later if there is interest, but it wasn't
critical for the initial phase.


Agreed. I think it should be somewhere 'visible' to the user, but could be setup by the 
server admin / repo maintainer if they don't have 

Re: [PATCH v2 4/9] rebase -i: refactor transform_todo_ids

2017-12-04 Thread Johannes Schindelin
Hi Liam,

On Sun, 3 Dec 2017, Liam Beguin wrote:

> The transform_todo_ids function is a little hard to read. Lets try
> to make it easier by using more of the strbuf API. Also, since we'll
> soon be adding command abbreviations, let's rename the function so
> it's name reflects that change.

I am not really a fan of the new name, and would prefer the old one, but
that's only a nit, not a reason to reject the patch.

The rest of it makes the code reads a lot nicer than before. Thank you,
Johannes


Re: [PATCH v2] git-gui: Prevent double UTF-8 conversion

2017-12-04 Thread Johannes Schindelin
Hi Łukasz,

On Sat, 2 Dec 2017, Łukasz Stelmach wrote:

> Convert author's name from the UTF-8 (or any other) encoding in
> load_last_commit function the same way commit message is converted.
> 
> Amending commits in git-gui without such conversion breaks UTF-8
> strings. For example, "\305\201ukasz" (as written by git cat-file) becomes
> "\303\205\302\201ukasz" in an amended commit.

Okay, that makes the issue a lot clearer to me (the explicit mention of
"author's name", that is).

> diff --git a/git-gui/lib/commit.tcl b/git-gui/lib/commit.tcl
> index 83620b7cb..f820c24bf 100644
> --- a/git-gui/lib/commit.tcl
> +++ b/git-gui/lib/commit.tcl
> @@ -34,9 +34,7 @@ You are currently in the middle of a merge that has not 
> been fully completed.  Y
>   lappend parents [string range $line 7 
> end]
>   } elseif {[string match {encoding *} $line]} {
>   set enc [string tolower [string range 
> $line 9 end]]
> - } elseif {[regexp "author 
> (.*)\\s<(.*)>\\s(\\d.*$)" $line all name email time]} {
> - set commit_author [list name $name 
> email $email date $time]
> - }
> + } elseif {[regexp "author 
> (.*)\\s<(.*)>\\s(\\d.*$)" $line all name email time]} { }
>   }

This looks wrong, as the commit_author would now also be set if the header
was not found (mind you, this would make for an incorrect Git commit, but
the code explicitly tries to set commit_author only in the case that the
author line was found.

But we cannot set commit_author here because the encoding is read as
another header line (and in a valid commit object, the encoding line (if
any) has to be *below* the author line).

So it *has* to be this way. Maybe mention this in the commit message, to
avoid head-scratching?

However, I would still recommend to `set name ""` before the loop parsing
the header, and...

>   set msg [read $fd]
>   close $fd
> @@ -44,7 +42,9 @@ You are currently in the middle of a merge that has not 
> been fully completed.  Y
>   set enc [tcl_encoding $enc]
>   if {$enc ne {}} {
>   set msg [encoding convertfrom $enc $msg]
> + set name [encoding convertfrom $enc $name]
>   }
> + set commit_author [list name $name email $email date 
> $time]

Guarding this assignment in an `if {$name ne ""} { ... }`, just in case.

>   set msg [string trim $msg]
>   } err]} {
>   error_popup [strcat [mc "Error loading commit data for amend:"] 
> "\n\n$err"]

Thanks,
Johannes

Re: [PATCH v2 1/4] git-compat-util: introduce skip_to_optional_val()

2017-12-04 Thread Junio C Hamano
Christian Couder  writes:

> From: Christian Couder 
>
> We often accept both a "--key" option and a "--key=" option.
>
> These options currently are parsed using something like:
>
> if (!strcmp(arg, "--key")) {
>   /* do something */
> } else if (skip_prefix(arg, "--key=", )) {
>   /* do something with arg */
> }
>
> which is a bit cumbersome compared to just:
>
> if (skip_to_optional_val(arg, "--key", )) {
>   /* do something with arg */
> }
>
> This also introduces skip_to_optional_val_default() for the few
> cases where something different should be done when the first
> argument is exactly "--key" than when it is exactly "--key=".
>
> In general it is better for UI consistency and simplicity if
> "--key" and "--key=" do the same thing though, so that using
> skip_to_optional_val() should be encouraged compared to
> skip_to_optional_val_default().
>
> Note that these functions can be used to parse any "key=value"
> string where "key" is also considered as valid, not just
> command line options.
>
> Signed-off-by: Christian Couder 
> ---
>  git-compat-util.h | 23 +++
>  strbuf.c  | 20 
>  2 files changed, 43 insertions(+)
>
> After thinking about it a bit more and taking a look at the
> current code, I thought that it was probably best to have
> both skip_to_optional_val() and skip_to_optional_val_default().

I guess we came to the same conclusion independently while our mails
crossed ;-)

>   - 2 new functions instead of 1
>   - "optional" instead of "opt" in the function names

I thought that the more important part you agreed was s/val/arg/,
though.

I had a few small comments on later steps, but I think these are
moving in the right direction.

Thanks.


Re: [PATCH v2 4/4] diff: use skip_to_optional_val_default()

2017-12-04 Thread Junio C Hamano
Christian Couder  writes:

> - } else if (!strcmp(arg, "--submodule"))
> - options->submodule_format = DIFF_SUBMODULE_LOG;
> - else if (skip_prefix(arg, "--submodule=", ))
> + } else if (skip_to_optional_val_default(arg, "--submodule", , 
> "log"))
>   return parse_submodule_opt(options, arg);

It may be annoying if we later diagnose that DIFF_SUBMODULE_LOG may
not be used in the context (e.g. together with some other optoins
that are set in "options" variable) and have to throw an error
message at the user.  parse_submodule_opt() would not know if that
string "log" came from the user or substituted by the helper, and in
the latter case, "don't use --submodule=log here" is a message that
is not quite appropriate.

This may be minor enough not matter in practice, but I dunno.  I
didn't look at other hunks in this step very carefully but I would
expect that whenever you use "default" to substitute, you'll have
the same information lossage.



Re: [PATCH v2 3/4] diff: use skip_to_optional_val()

2017-12-04 Thread Junio C Hamano
Christian Couder  writes:

> @@ -4540,13 +4535,13 @@ int diff_opt_parse(struct diff_options *options,
>   return stat_opt(options, av);
>  
>   /* renames options */
> - else if (starts_with(arg, "-B") || starts_with(arg, 
> "--break-rewrites=") ||
> -  !strcmp(arg, "--break-rewrites")) {
> + else if (starts_with(arg, "-B") ||
> +  skip_to_optional_val(arg, "--break-rewrites", )) {
>   if ((options->break_opt = diff_scoreopt_parse(arg)) == -1)
>   return error("invalid argument to -B: %s", arg+2);
>   }

This is curious; "optarg" gets something, but it is not used (what
is passed to scoreopt_parse() is still "arg".

It merely is curious and not wrong; the actual parsing of the whole
thing is done in scoreopt_parse() and skip_to_optional_val() is used
merely as a substitute for "the thing must either be --break-rewrites
or must begin with --break-rewrites=" check that is done with
starts_with() and !strcmp().

It probably makes sense to allow skip_to_optional() to take a NULL
for the third parameter to clarify a callsite like this.  Otherwise
the readers will wonder who consumes optarg, and why it is OK for it
to be sometimes set and sometimes left undefined.

> - else if (starts_with(arg, "-M") || starts_with(arg, "--find-renames=") 
> ||
> -  !strcmp(arg, "--find-renames")) {
> + else if (starts_with(arg, "-M") ||
> +  skip_to_optional_val(arg, "--find-renames", )) {

Likewise.

> @@ -4554,8 +4549,8 @@ int diff_opt_parse(struct diff_options *options,
>   else if (!strcmp(arg, "-D") || !strcmp(arg, "--irreversible-delete")) {
>   options->irreversible_delete = 1;
>   }
> - else if (starts_with(arg, "-C") || starts_with(arg, "--find-copies=") ||
> -  !strcmp(arg, "--find-copies")) {
> + else if (starts_with(arg, "-C") ||
> +  skip_to_optional_val(arg, "--find-copies", )) {
>   if (options->detect_rename == DIFF_DETECT_COPY)
>   options->flags.find_copies_harder = 1;
>   if ((options->rename_score = diff_scoreopt_parse(arg)) == -1)

Likewise.


Re: [PATCH v2 2/2] git-prompt: fix reading files with windows line endings

2017-12-04 Thread Johannes Schindelin
Hi Robert,

1/2 looks very good.

On Sat, 2 Dec 2017, Robert Abel wrote:

> If any of the files read by __git_eread have \r\n line endings, read
> will only strip \n, leaving \r. This results in an ugly prompt, where
> instead of
> 
> user@pc MINGW64 /path/to/repo (BARE:master)
> 
> the last parenthesis is printed over the beginning of the prompt like
> 
> )ser@pc MINGW64 /path/to/repo (BARE:master

Maybe mention explicitly what Gabór said about $'...' being supported by
Bash and zsh, the only two intended users of git-prompt.sh (and there
being precedent of that construct being used already e.g. in __git_ps1)?

Other than that, this looks very good to me.

Thanks,
Johannes

[PATCH v2] sha1_file: use strbuf_add() instead of strbuf_addf()

2017-12-04 Thread Derrick Stolee
Thanks for the feedback on v1. This version fixes the cruft_cb
bug and streamlines the strlen(). I would include an inter-diff
but it was the same size as the patch.

Thanks,
-Stolee

---

Replace use of strbuf_addf() with strbuf_add() when enumerating
loose objects in for_each_file_in_obj_subdir(). Since we already
check the length and hex-values of the string before consuming
the path, we can prevent extra computation by using the lower-
level method.

One consumer of for_each_file_in_obj_subdir() is the abbreviation
code. OID abbreviations use a cached list of loose objects (per
object subdirectory) to make repeated queries fast, but there is
significant cache load time when there are many loose objects.

Most repositories do not have many loose objects before repacking,
but in the GVFS case the repos can grow to have millions of loose
objects. Profiling 'git log' performance in GitForWindows on a
GVFS-enabled repo with ~2.5 million loose objects revealed 12% of
the CPU time was spent in strbuf_addf().

Add a new performance test to p4211-line-log.sh that is more
sensitive to this cache-loading. By limiting to 1000 commits, we
more closely resemble user wait time when reading history into a
pager.

For a copy of the Linux repo with two ~512 MB packfiles and ~572K
loose objects, running 'git log --oneline --parents --raw -1000'
had the following performance:

 HEAD~1HEAD

 7.70(7.15+0.54)   7.44(7.09+0.29) -3.4%

Signed-off-by: Derrick Stolee 
---
 sha1_file.c  | 12 +++-
 t/perf/p4211-line-log.sh |  4 
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 8ae6cb6285..2fc8fa93b4 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1903,7 +1903,6 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
origlen = path->len;
strbuf_complete(path, '/');
strbuf_addf(path, "%02x", subdir_nr);
-   baselen = path->len;
 
dir = opendir(path->buf);
if (!dir) {
@@ -1914,15 +1913,18 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
}
 
oid.hash[0] = subdir_nr;
+   strbuf_addch(path, '/');
+   baselen = path->len;
 
while ((de = readdir(dir))) {
+   size_t namelen;
if (is_dot_or_dotdot(de->d_name))
continue;
 
+   namelen = strlen(de->d_name);
strbuf_setlen(path, baselen);
-   strbuf_addf(path, "/%s", de->d_name);
-
-   if (strlen(de->d_name) == GIT_SHA1_HEXSZ - 2 &&
+   strbuf_add(path, de->d_name, namelen);
+   if (namelen == GIT_SHA1_HEXSZ - 2 &&
!hex_to_bytes(oid.hash + 1, de->d_name,
  GIT_SHA1_RAWSZ - 1)) {
if (obj_cb) {
@@ -1941,7 +1943,7 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
}
closedir(dir);
 
-   strbuf_setlen(path, baselen);
+   strbuf_setlen(path, baselen - 1);
if (!r && subdir_cb)
r = subdir_cb(subdir_nr, path->buf, data);
 
diff --git a/t/perf/p4211-line-log.sh b/t/perf/p4211-line-log.sh
index e0ed05907c..392bcc0e51 100755
--- a/t/perf/p4211-line-log.sh
+++ b/t/perf/p4211-line-log.sh
@@ -35,4 +35,8 @@ test_perf 'git log --oneline --raw --parents' '
git log --oneline --raw --parents >/dev/null
 '
 
+test_perf 'git log --oneline --raw --parents -1000' '
+   git log --oneline --raw --parents -1000 >/dev/null
+'
+
 test_done
-- 
2.15.0



Re: [PATCH] send-email: extract email-parsing code into a subroutine

2017-12-04 Thread Junio C Hamano
Nathan PAYRE  writes:

> I've tested your code, and after few changes it's works perfectly!
> The code looks better now.
> Thanks a lot for your review.

Thanks, both of you.  

Could you send in the final version later so that I can pick it up?
I agree with Matthieu's suggestion on what address to use on your
From: (authorship identity) and S-o-b:; as you are showing your work
done as a uni student, the authorship and sign-off should be done as
such.


Re: [PATCH 1/3] git-compat-util: introduce skip_to_opt_val()

2017-12-04 Thread Junio C Hamano
Christian Couder  writes:

> In the few cases where do_something() accepts NULL and does something
> different with it, the former can be changed to:
>
> if (skip_to_optional_val(arg, "--key", , NULL) /* the last
> argument is the default value */
> do_something(arg);

That's the thing.  If do_something() that takes "" and NULL and
behaves differently is rare, that indicates that the existing code
may not be committed to treat "--key" and "--key=''" the same in the
first place, and I am not 100% convinced that I want to see us
committed to force that design throughout the system by introducing
a helper that hardcodes the equivalence and encourages to use it.

Imagine a command that takes "--do-something" option and does that
"something" unconditionally.  We may later extend it to take an
optional argument, i.e. "--do-something=c1,c2,...", to tell the
command to do that "something" under some but not all conditions.
The values c1,c2 would tell that we want that something done only
under either conditions c1 or c2 holds true.

It would be natural to expect that "--do-something=" to do that
"something" under no condition (i.e. as if no such option was
given); that would help scripts that accumulate the set of
conditions in a variable and say "--do-something=$when", by making
it a no-op when the variable $when turns out to be an empty string.
"--do-something" without "=" would not want to mean the same thing.

The above observation makes me suspect that it depends on the "key"
what "--key=$value" we want to be equivalent to "--key".  In the
"--do-something" case, we do not want to pretend as if we got an
empty string; instead we'd pretend as if we got "always" or
something like that.

And your "default" would work well for this "default is tied to what
the key is" paradigm, i.e.

skip_to_optional_arg(arg, "--do-something", , "always")

would make us treat "--do-something" and "--do-something=always" the
same way.

If it turns out that the default arg almost always is an empty
string, I do not mind 

#define skip_to_opt_arg(s,k,v) skip_to_optional_arg(s,k,v,"")

of course.


Urgent Message

2017-12-04 Thread James Udo
Dear Western Union Customer,
 
You have been awarded with the sum of $250,000 USD by our office, as one of our 
customers who use Western Union in their daily business transaction. This award 
was been selected through the internet, where your e-mail address was indicated 
and notified. Please provide Mr. James Udo with the following details listed 
below so that your fund will be remited to you through Western Union.
 
1. Name:
2. Address:
3. Country:
4. Phone Number:
5. Occupation:
6. Sex:
7. Age:
 
Mr. James Udo E-mail: westerrnunion2...@outlook.com
 
As soon as these details are received and verified, your fund will be 
transferred to you.Thank you, for using western union.




[PATCH v2 2/4] index-pack: use skip_to_optional_val()

2017-12-04 Thread Christian Couder
From: Christian Couder 

Let's simplify index-pack option parsing using
skip_to_optional_val().

Signed-off-by: Christian Couder 
---
 builtin/index-pack.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 8ec459f522..20b4b85a6a 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1660,10 +1660,7 @@ int cmd_index_pack(int argc, const char **argv, const 
char *prefix)
from_stdin = 1;
} else if (!strcmp(arg, "--fix-thin")) {
fix_thin_pack = 1;
-   } else if (!strcmp(arg, "--strict")) {
-   strict = 1;
-   do_fsck_object = 1;
-   } else if (skip_prefix(arg, "--strict=", )) {
+   } else if (skip_to_optional_val(arg, "--strict", )) 
{
strict = 1;
do_fsck_object = 1;
fsck_set_msg_types(_options, arg);
@@ -1679,10 +1676,8 @@ int cmd_index_pack(int argc, const char **argv, const 
char *prefix)
verify = 1;
show_stat = 1;
stat_only = 1;
-   } else if (!strcmp(arg, "--keep")) {
-   keep_msg = "";
-   } else if (starts_with(arg, "--keep=")) {
-   keep_msg = arg + 7;
+   } else if (skip_to_optional_val(arg, "--keep", 
_msg)) {
+   ; /* nothing to do */
} else if (starts_with(arg, "--threads=")) {
char *end;
nr_threads = strtoul(arg+10, , 0);
-- 
2.15.1.274.g3f22e311ce.dirty



[PATCH v2 4/4] diff: use skip_to_optional_val_default()

2017-12-04 Thread Christian Couder
Let's simplify diff option parsing using
skip_to_optional_val_default().

Signed-off-by: Christian Couder 
---
 diff.c | 20 
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/diff.c b/diff.c
index 83509f0658..d234b4f0cb 100644
--- a/diff.c
+++ b/diff.c
@@ -4619,9 +4619,7 @@ int diff_opt_parse(struct diff_options *options,
else if (!strcmp(arg, "--no-follow")) {
options->flags.follow_renames = 0;
options->flags.default_follow_renames = 0;
-   } else if (!strcmp(arg, "--color"))
-   options->use_color = 1;
-   else if (skip_prefix(arg, "--color=", )) {
+   } else if (skip_to_optional_val_default(arg, "--color", , 
"always")) {
int value = git_config_colorbool(NULL, arg);
if (value < 0)
return error("option `color' expects \"always\", 
\"auto\", or \"never\"");
@@ -4641,14 +4639,9 @@ int diff_opt_parse(struct diff_options *options,
if (cm < 0)
die("bad --color-moved argument: %s", arg);
options->color_moved = cm;
-   } else if (!strcmp(arg, "--color-words")) {
-   options->use_color = 1;
-   options->word_diff = DIFF_WORDS_COLOR;
-   }
-   else if (skip_prefix(arg, "--color-words=", )) {
+   } else if (skip_to_optional_val_default(arg, "--color-words", 
>word_regex, NULL)) {
options->use_color = 1;
options->word_diff = DIFF_WORDS_COLOR;
-   options->word_regex = arg;
}
else if (!strcmp(arg, "--word-diff")) {
if (options->word_diff == DIFF_WORDS_NONE)
@@ -4687,15 +4680,10 @@ int diff_opt_parse(struct diff_options *options,
options->flags.textconv_set_via_cmdline = 1;
} else if (!strcmp(arg, "--no-textconv"))
options->flags.allow_textconv = 0;
-   else if (!strcmp(arg, "--ignore-submodules")) {
-   options->flags.override_submodule_config = 1;
-   handle_ignore_submodules_arg(options, "all");
-   } else if (skip_prefix(arg, "--ignore-submodules=", )) {
+   else if (skip_to_optional_val_default(arg, "--ignore-submodules", , 
"all")) {
options->flags.override_submodule_config = 1;
handle_ignore_submodules_arg(options, arg);
-   } else if (!strcmp(arg, "--submodule"))
-   options->submodule_format = DIFF_SUBMODULE_LOG;
-   else if (skip_prefix(arg, "--submodule=", ))
+   } else if (skip_to_optional_val_default(arg, "--submodule", , 
"log"))
return parse_submodule_opt(options, arg);
else if (skip_prefix(arg, "--ws-error-highlight=", ))
return parse_ws_error_highlight_opt(options, arg);
-- 
2.15.1.274.g3f22e311ce.dirty



[PATCH v2 1/4] git-compat-util: introduce skip_to_optional_val()

2017-12-04 Thread Christian Couder
From: Christian Couder 

We often accept both a "--key" option and a "--key=" option.

These options currently are parsed using something like:

if (!strcmp(arg, "--key")) {
/* do something */
} else if (skip_prefix(arg, "--key=", )) {
/* do something with arg */
}

which is a bit cumbersome compared to just:

if (skip_to_optional_val(arg, "--key", )) {
/* do something with arg */
}

This also introduces skip_to_optional_val_default() for the few
cases where something different should be done when the first
argument is exactly "--key" than when it is exactly "--key=".

In general it is better for UI consistency and simplicity if
"--key" and "--key=" do the same thing though, so that using
skip_to_optional_val() should be encouraged compared to
skip_to_optional_val_default().

Note that these functions can be used to parse any "key=value"
string where "key" is also considered as valid, not just
command line options.

Signed-off-by: Christian Couder 
---
 git-compat-util.h | 23 +++
 strbuf.c  | 20 
 2 files changed, 43 insertions(+)

After thinking about it a bit more and taking a look at the
current code, I thought that it was probably best to have
both skip_to_optional_val() and skip_to_optional_val_default().

The changes compared to previous version are:

  - 2 new functions instead of 1
  - "optional" instead of "opt" in the function names
  - the big function is not inlined
  - more code in diff.c is simplified using the functions 

diff --git a/git-compat-util.h b/git-compat-util.h
index cedad4d581..2858d66f85 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -484,6 +484,29 @@ static inline int skip_prefix(const char *str, const char 
*prefix,
return 0;
 }
 
+/*
+ * If the string "str" is the same as the string in "prefix", then the "val"
+ * parameter is set to the "def" parameter and 1 is returned.
+ * If the string "str" begins with the string found in "prefix" and then a
+ * "=" sign, then the "val" parameter is set to "str + strlen(prefix) + 1"
+ * (i.e., to the point in the string right after the prefix and the "=" sign),
+ * and 1 is returned.
+ *
+ * Otherwise, return 0 and leave "val" untouched.
+ *
+ * When we accept both a "--key" and a "--key=" option, this function
+ * can be used instead of !strcmp(arg, "--key") and then
+ * skip_prefix(arg, "--key=", ) to parse such an option.
+ */
+int skip_to_optional_val_default(const char *str, const char *prefix,
+const char **val, const char *def);
+
+static inline int skip_to_optional_val(const char *str, const char *prefix,
+  const char **val)
+{
+   return skip_to_optional_val_default(str, prefix, val, "");
+}
+
 /*
  * Like skip_prefix, but promises never to read past "len" bytes of the input
  * buffer, and returns the remaining number of bytes in "out" via "outlen".
diff --git a/strbuf.c b/strbuf.c
index 323c49ceb3..3430499f9e 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -11,6 +11,26 @@ int starts_with(const char *str, const char *prefix)
return 0;
 }
 
+int skip_to_optional_val_default(const char *str, const char *prefix,
+const char **val, const char *def)
+{
+   const char *p;
+
+   if (!skip_prefix(str, prefix, ))
+   return 0;
+
+   if (!*p) {
+   *val = def;
+   return 1;
+   }
+
+   if (*p != '=')
+   return 0;
+
+   *val = p + 1;
+   return 1;
+}
+
 /*
  * Used as the default ->buf value, so that people can always assume
  * buf is non NULL and ->buf is NUL terminated even for a freshly
-- 
2.15.1.274.g3f22e311ce.dirty



  1   2   >