Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)

2018-03-06 Thread Johannes Schindelin
Hi Buga,

On Tue, 6 Mar 2018, Igor Djordjevic wrote:

> On 06/03/2018 19:12, Johannes Schindelin wrote:
> > 
> > > > And I guess being consistent is pretty important, too - if you add new
> > > > content during merge rebase, it should always show up in the merge,
> > > > period. 
> > >
> > > Yes, that should make it easy for the user to know what to expect from
> > > rebase.
> > 
> > [...]
> > 
> > It will be slightly inconsistent. But in a defendable way, I think.
> 
> I like where this discussion is heading, and here`s what I thought 
> about it :)
> 
> [...]
> 
> Here`s a twist - not letting `merge` trying to be too smart by 
> figuring out whether passed arguments correspond to rewritten 
> versions of the original merge parents (which would be too 
> restrictive, too, I`m afraid), but just be explicit about it, instead!

That's the missing piece, I think.

> So, it could be something like:
> 
>   merge -C deadbee 123abc:cafecafe 234bcd:bedbedbed

I like where this is heading, too, but I do not think that we can do this
on a per-MERGE_HEAD basis. The vast majority of merge commits, in
practice, have two parents. So the `merge` command would actually only
have one revision to merge (because HEAD is the implicit first parent). So
that is easy.

But as soon as you go octopus, you can either perform an octopus merge, or
rebase the original merge commit. You cannot really mix and match here.

Unless we reimplement the octopus merge (which works quite a bit
differently from the "rebase merge commit" strategy, even if it is
incremental, too), which has its own challenges: if there are merge
conflicts before merging the last MERGE_HEAD, the octopus merge will exit
with status 2, telling you "Should not be doing an octopus.". While we
will want to keep merge conflict markers and continue with the "rebase the
original merge commit" strategy.

And it would slam the door shut for adding support for *other* merge
strategies to perform a more-than-two-parents merge.

Also, I do not think that it makes a whole lot of sense in practice to let
users edit what will be used for "original parent". If the user wants to
do complicated stuff, they can already do that, via `exec`. The `merge`
command really should be about facilitating common workflows, guiding the
user to what is sane.

Currently my favorite idea is to introduce a new flag: -R (for "rebase the
original merge commit"). It would look like this:

merge -R -C   # 

This flag would of course trigger the consistency check (does the number
of parents of the original merge commit agree with the parameter list? Was
an original merge commit specified to begin with?), and it would not fall
back to the recursive merge, but error out if that check failed.

Side note: I wonder whether we really need to perform the additional check
that ensures that the  refers to the rewritten version of the
original merge commit's parent.

Second side note: if we can fast-forward, currently we prefer that, and I
think we should keep that behavior with -R, too.

If the user wants to force a new merge, they simply remove that -R flag.

What do you think?

Ciao,
Dscho


Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)

2018-03-06 Thread Johannes Schindelin
Hi Junio,

On Tue, 6 Mar 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> >> I don't think its possible to guess the semantics of the original merge
> >> as users can use custom merge strategies and amend the result. It would
> >> be possible to detect and unamended '-s ours' merge but special casing
> >> that may end up causing users more confusion rather than helping them.
> >
> > FWIW I agree.
> 
> I think it is a mistake to sacrifice predictability only to add
> cleverness that sometimes work.  Elsewhere in the thread, I think I
> saw an argument to treat interactive and non-interactive something
> very different, but there is no fundamental difference between them
> (it is far easier with interactive to force the command to "port"
> each change to a vastly different context) so having consistent
> behaviour between the two cases is important, too.

I could be swayed both ways, but Buga already pointed out that we do not
have to compromise any consistency, simply by adding some syntactic sugar
to the `merge` command so that the different behavior is *explicit*.

> > My original plan was to always merge recursively and suggest to use `exec`
> > commands if anything else is needed.
> >
> > But now with that excellent new idea to perform successive three-way
> > merges of the original merge commit with the new tips, using the old tips
> > as merge base, I am considering to change that.
> 
> OK, does this mean we want to wait before merging the "recreate
> merge" topic down to 'next'?  For more than a few weeks, it has been
> slated for 'next'.

Maybe a few more days.

My current thinking is to rework the handling of -c vs -C and *not* have
two different todo_command enum values, but rather introduce an unsigned
integer that has flags such as TODO_MERGE_EDIT.

And for this new behavior, we could introduce a new flag
(TODO_MERGE_REBASE_MERGE_COMMIT or something less unwieldy) and set that
flag via

merge -R -C  ...

(i.e. via a new flag `-R`).

I want to discuss this in the other subthread, though.

Ciao,
Dscho


Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Johannes Schindelin
Hi Buga,

On Wed, 7 Mar 2018, Igor Djordjevic wrote:

> On 05/03/2018 18:29, Johannes Schindelin wrote:
> > 
> > > By the way, is there documentation for `git merge-recursive`
> > > anywhere, besides the code itself...? :$
> > 
> > I am not aware of any. The commit message adding the command is not very
> > illuminating (https://github.com/git-for-windows/git/commit/720d150c4):
> > 
> > Add a new merge strategy by Fredrik Kuivinen.
> > 
> > I really wanted to try this out, instead of asking for an adjustment
> > to the 'git merge' driver and waiting.  For now the new strategy is
> > called 'fredrik' and not in the list of default strategies to be tried.
> > 
> > The script wants Python 2.4 so this commit also adjusts Debian and RPM
> > build procecure files.
> > 
> > Digging through https://public-inbox.org/git/ during that time frame comes
> > up with this hit, though:
> > 
> > https://public-inbox.org/git/20050907164734.ga20...@c165.ib.student.liu.se/
> > 
> > which is still not a good documentation of the algorithm. You can probably
> > dig further yourself, but I think I can describe it very quickly here:
> > 
> > To merge two commits recursively, you first have to find their "merge
> > bases". If there was an obvious branch point, then that is the merge base.
> > But when you start a branch off of master, then work a bit, then merge
> > master, you already have two merge bases.
> > 
> > The trick about the recursive merge is to reduce the number of merge bases
> > iteratively to one. It does that by taking two merge bases, and performing
> > a recursive merge on them, which generates a "virtual" commit, the
> > condensed merge base. That one is then merged recursively with the next
> > merge base, until there is only one left.
> > 
> > A recursive merge of two commits with exactly one merge base is simply a
> > three-way merge.
> > 
> > I vaguely remember that there was something funny about the order in which
> > order you want to process the merge bases: if you did it in one
> > (chronological) direction, it worked beautifully, in the other direction
> > it would generate tons of merge conflicts or something like that.
> 
> Thanks, this is very informative (together with the linked discussion).
> 
> Not remembering seeing this before, I wasn`t really sure if this was 
> some undocumented core Git (plumbing?) utility (used withing Git 
> itself, too), or just a leftover (yet still useful) sample tool.

You raise a good point. The "recursive merge" is not really documented
properly. Maybe you can enhance my vague description and provide a patch
for Documentation/technical/?

Thanks,
Dscho


Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Johannes Schindelin
Hi Sergey,

On Wed, 7 Mar 2018, Sergey Organov wrote:

> Johannes Schindelin  writes:
> 
> > On Tue, 6 Mar 2018, Phillip Wood wrote:
> >
> >> On 03/03/18 00:29, Igor Djordjevic wrote:
> >> > 
> >> > On 02/03/2018 12:31, Phillip Wood wrote:
> >> >>
> >> >>> Thinking about it overnight, I now suspect that original proposal
> >> >>> had a mistake in the final merge step. I think that what you did is
> >> >>> a way to fix it, and I want to try to figure what exactly was wrong
> >> >>> in the original proposal and to find simpler way of doing it right.
> >> >>>
> >> >>> The likely solution is to use original UM as a merge-base for final
> >> >>> 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty
> >> >>> natural though, as that's exactly UM from which both U1' and U2'
> >> >>> have diverged due to rebasing and other history editing.
> >> >>
> >> >> Hi Sergey, I've been following this discussion from the sidelines,
> >> >> though I haven't had time to study all the posts in this thread in
> >> >> detail. I wonder if it would be helpful to think of rebasing a merge
> >> >> as merging the changes in the parents due to the rebase back into the
> >> >> original merge. So for a merge M with parents A B C that are rebased
> >> >> to A' B' C' the rebased merge M' would be constructed by (ignoring
> >> >> shell quoting issues)
> >> >>
> >> >> git checkout --detach M
> >> >> git merge-recursive A -- M A'
> >> >> tree=$(git write-tree)
> >> >> git merge-recursive B -- $tree B'
> >> >> tree=$(git write-tree)
> >> >> git merge-recursive C -- $tree C'
> >> >> tree=$(git write-tree)
> >> >> M'=$(git log --pretty=%B -1 M | git commit-tree -pA' -pB' -pC')
> >> >>
> >> >> This should pull in all the changes from the parents while preserving
> >> >> any evil conflict resolution in the original merge. It superficially
> >> >> reminds me of incremental merging [1] but it's so long since I looked at
> >> >> that I'm not sure if there are any significant similarities.
> >> >>
> >> >> [1] https://github.com/mhagger/git-imerge
> >> > 
> >> > Interesting, from quick test[3], this seems to produce the same 
> >> > result as that other test I previously provided[2], where temporary 
> >> > commits U1' and U2' are finally merged with original M as a base :)
> >> > 
> >> > Just that this looks like even more straight-forward approach...?
> >> > 
> >> > The only thing I wonder of here is how would we check if the 
> >> > "rebased" merge M' was "clean", or should we stop for user amendment? 
> >> > With that other approach Sergey described, we have U1'==U2' to test with.
> >> 
> >> I think (though I haven't rigorously proved to myself) that in the
> >> absence of conflicts this scheme has well defined semantics (the merges
> >> can be commuted), so the result should be predicable from the users
> >> point of view so maybe it could just offer an option to stop.
> >
> > I am not so sure that the result is independent of the order of the
> > merges. In other words, I am not necessarily certain that it is impossible
> > to concoct A,A',B,B' commits where merging B'/B before A'/A has a
> > different result than merging A'/A before B'/B.
> >
> > Remember, when constructing counter-examples to hypotheses, those
> > counter-examples do not really *have* to make sense on their own. For
> > example, A' could introduce *completely different* changes from A, and the
> > same is true for B' and B.
> >
> > I could imagine, for example, that using a ton of consecutive empty lines,
> > and using patches that insert something into these empty lines (and are
> > thusly inherently ambiguous when said set of empty lines has changed),
> > could even introduce a merge conflict in one order, but no conflict in the
> > other.
> >
> > Even so, I think that merging in the order of the parents makes the most
> > sense, and that using that strategy makes sense, too, because you really
> > have to try hard to make it fail.
> 
> Alternatively, consider to adopt the original approach that has none of
> these issues as it uses exactly the same method for rebasing merge
> commits that you are already using for rebasing simple commits, not to
> mention the advantage of the built-in consistency check.

Surely I misunderstand?

How can your approach -- which relies *very much* on having the original
parent commits -- not *require* that consistency check?

What would your approach (that still has no satisfyingly trivial
explanation, in my mind) do if somebody edited a `merge` command and let
it merge a completely unrelated commit?

Ciao,
Johannes


Re: [RFC v2] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Johannes Schindelin
Hi Sergey,

On Tue, 6 Mar 2018, Sergey Organov wrote:

> This is v2 of my "Rebasing merges" proposal.

Didn't we settle on Phillip's "perform successive three-way merges between
the original merge commit and the new tips with the old tips as base"
strategy? It has the following advantages:

- it is *very simple* to describe

- it is *very easy* to reason about, once it is pointed out that rebases
  and merges result in the same trees.

... and BTW...

> 3. I now use "True Merge" name instead of former "Trivial Merge", to
>avoid confusion with what Git documentation calls "trivial merge",
>thanks to Junio C Hamano  for pointing this out.

"True Merge" is probably also a candidate for improvement. If what you
refer to is a "true" merge, that means all others are "untrue" merges???

Ciao,
Johannes


Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Sergey Organov
Hi Johannes,

Johannes Schindelin  writes:

> Hi Phillip,
>
> On Tue, 6 Mar 2018, Phillip Wood wrote:
>
>> On 03/03/18 00:29, Igor Djordjevic wrote:
>> > 
>> > On 02/03/2018 12:31, Phillip Wood wrote:
>> >>
>> >>> Thinking about it overnight, I now suspect that original proposal
>> >>> had a mistake in the final merge step. I think that what you did is
>> >>> a way to fix it, and I want to try to figure what exactly was wrong
>> >>> in the original proposal and to find simpler way of doing it right.
>> >>>
>> >>> The likely solution is to use original UM as a merge-base for final
>> >>> 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty
>> >>> natural though, as that's exactly UM from which both U1' and U2'
>> >>> have diverged due to rebasing and other history editing.
>> >>
>> >> Hi Sergey, I've been following this discussion from the sidelines,
>> >> though I haven't had time to study all the posts in this thread in
>> >> detail. I wonder if it would be helpful to think of rebasing a merge
>> >> as merging the changes in the parents due to the rebase back into the
>> >> original merge. So for a merge M with parents A B C that are rebased
>> >> to A' B' C' the rebased merge M' would be constructed by (ignoring
>> >> shell quoting issues)
>> >>
>> >> git checkout --detach M
>> >> git merge-recursive A -- M A'
>> >> tree=$(git write-tree)
>> >> git merge-recursive B -- $tree B'
>> >> tree=$(git write-tree)
>> >> git merge-recursive C -- $tree C'
>> >> tree=$(git write-tree)
>> >> M'=$(git log --pretty=%B -1 M | git commit-tree -pA' -pB' -pC')
>> >>
>> >> This should pull in all the changes from the parents while preserving
>> >> any evil conflict resolution in the original merge. It superficially
>> >> reminds me of incremental merging [1] but it's so long since I looked at
>> >> that I'm not sure if there are any significant similarities.
>> >>
>> >> [1] https://github.com/mhagger/git-imerge
>> > 
>> > Interesting, from quick test[3], this seems to produce the same 
>> > result as that other test I previously provided[2], where temporary 
>> > commits U1' and U2' are finally merged with original M as a base :)
>> > 
>> > Just that this looks like even more straight-forward approach...?
>> > 
>> > The only thing I wonder of here is how would we check if the 
>> > "rebased" merge M' was "clean", or should we stop for user amendment? 
>> > With that other approach Sergey described, we have U1'==U2' to test with.
>> 
>> I think (though I haven't rigorously proved to myself) that in the
>> absence of conflicts this scheme has well defined semantics (the merges
>> can be commuted), so the result should be predicable from the users
>> point of view so maybe it could just offer an option to stop.
>
> I am not so sure that the result is independent of the order of the
> merges. In other words, I am not necessarily certain that it is impossible
> to concoct A,A',B,B' commits where merging B'/B before A'/A has a
> different result than merging A'/A before B'/B.
>
> Remember, when constructing counter-examples to hypotheses, those
> counter-examples do not really *have* to make sense on their own. For
> example, A' could introduce *completely different* changes from A, and the
> same is true for B' and B.
>
> I could imagine, for example, that using a ton of consecutive empty lines,
> and using patches that insert something into these empty lines (and are
> thusly inherently ambiguous when said set of empty lines has changed),
> could even introduce a merge conflict in one order, but no conflict in the
> other.
>
> Even so, I think that merging in the order of the parents makes the most
> sense, and that using that strategy makes sense, too, because you really
> have to try hard to make it fail.

Alternatively, consider to adopt the original approach that has none of
these issues as it uses exactly the same method for rebasing merge
commits that you are already using for rebasing simple commits, not to
mention the advantage of the built-in consistency check.

-- Sergey


Re: [PATCH 2/2] completion: simplify _git_notes

2018-03-06 Thread SZEDER Gábor

> SZEDER Gábor  writes:
> 
> > There is a minor behaviour change here, though.  This
> >
> >   prune,*)
> > ;;
> >
> > case arm ensured that we don't list refs for 'git notes prune ',
> > because it doesn't accept them (and then we take our usual fallback and
> > let Bash complete filenames;  yeah, 'git notes prune' doesn't accept
> > filenames either, but, as I said, that's our usual fallback when we
> > can't offer anything for completion).
> >
> > This patch removes that case arm, and refs will be offered for 'git
> > notes prune '.
> >
> >> +   *,--*)
> >> +   __gitcomp_builtin notes_$subcommand
> >> ;;
> >> *)
> >> case "$prev" in
> 
> I have this tentatively queued on the topic.  Can we wrap the topic
> up and move it forward, instead of leaving it (and other topics)
> hanging around and causing conflicts with other topics in flight,
> please?
> 
> Thanks.
> 
> 
> Subject: [PATCH] SQUASH???
> 
> By Szeder 

[PATCH v2 0/4] nd/parseopt-completion fixups

2018-03-06 Thread Nguyễn Thái Ngọc Duy
v2 fixes the comments from v1 and adds to new patches to improve
_git_notes() (sorry I couldn't resist)

Interdiff with what's on 'pu'

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 5f7495cda3..2e30950299 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1815,7 +1815,7 @@ _git_name_rev ()
 
 _git_notes ()
 {
-   local subcommands='add append copy edit list prune remove show'
+   local subcommands='add append copy edit get-ref list merge prune remove 
show'
local subcommand="$(__git_find_on_cmdline "$subcommands")"
 
case "$subcommand,$cur" in
@@ -1832,18 +1832,15 @@ _git_notes ()
;;
esac
;;
-   add,--reuse-message=*|append,--reuse-message=*|\
-   add,--reedit-message=*|append,--reedit-message=*)
+   *,--reuse-message=*|*,--reedit-message=*)
__git_complete_refs --cur="${cur#*=}"
;;
-   prune,--*)
-   __gitcomp_builtin notes_prune
-   ;;
-   prune,*)
-   ;;
*,--*)
__gitcomp_builtin notes_$subcommand
;;
+   prune,*|get-ref,*)
+   # this command does not take a ref, do not complete it
+   ;;
*)
case "$prev" in
-m|-F)
@@ -1956,6 +1953,7 @@ _git_rebase ()
--autostash --no-autostash
--verify --no-verify
--keep-empty --root --force-rebase --no-ff
+   --rerere-autoupdate
--exec
"
 

Nguyễn Thái Ngọc Duy (4):
  completion: don't set PARSE_OPT_NOCOMPLETE on --rerere-autoupdate
  completion: simplify _git_notes
  completion: complete --{reuse,reedit}-message= for all notes subcmds
  completion: more subcommands in _git_notes()

 contrib/completion/git-completion.bash | 25 -
 parse-options.h|  4 ++--
 rerere.h   |  3 +--
 3 files changed, 11 insertions(+), 21 deletions(-)

-- 
2.16.2.785.g429c04a1b9



[PATCH v2 3/4] completion: complete --{reuse,reedit}-message= for all notes subcmds

2018-03-06 Thread Nguyễn Thái Ngọc Duy
The new subcommand that takes these options is 'git notes edit'. Just
accept the options from subcommands since we handle them the same way
in builtin/notes.c anyway. If a user does

git prune --reuse-message=...

just let the command catches that error when it's executed.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 37bf4a64d3..dc3ec43b65 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1832,8 +1832,7 @@ _git_notes ()
;;
esac
;;
-   add,--reuse-message=*|append,--reuse-message=*|\
-   add,--reedit-message=*|append,--reedit-message=*)
+   *,--reuse-message=*|*,--reedit-message=*)
__git_complete_refs --cur="${cur#*=}"
;;
*,--*)
-- 
2.16.2.785.g429c04a1b9



[PATCH v2 1/4] completion: don't set PARSE_OPT_NOCOMPLETE on --rerere-autoupdate

2018-03-06 Thread Nguyễn Thái Ngọc Duy
There is not a strong reason to hide this option, and git-merge already
completes this one. Let's allow to complete this for all commands (and
let git-completion.bash do the suppressing if needed).

This makes --rerere-autoupdate completable for am, cherry-pick and
revert. rebase completion is fixed manually because it's a shell
script and does not benefit from --git-completion-helper.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash | 4 ++--
 parse-options.h| 4 ++--
 rerere.h   | 3 +--
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 0ddf40063b..0d858cacce 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1754,8 +1754,7 @@ _git_merge ()
 
case "$cur" in
--*)
-   __gitcomp_builtin merge "--rerere-autoupdate
-   --no-rerere-autoupdate
+   __gitcomp_builtin merge "--no-rerere-autoupdate
--no-commit --no-edit --no-ff
--no-log --no-progress
--no-squash --no-stat
@@ -1963,6 +1962,7 @@ _git_rebase ()
--autostash --no-autostash
--verify --no-verify
--keep-empty --root --force-rebase --no-ff
+   --rerere-autoupdate
--exec
"
 
diff --git a/parse-options.h b/parse-options.h
index 0ba08691e6..ab1cc362bf 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -148,8 +148,8 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
{ OPTION_CALLBACK, (s), (l), (v), (a), \
  (h), 0, _opt_string_list }
-#define OPT_UYN(s, l, v, h, f)  { OPTION_CALLBACK, (s), (l), (v), NULL, \
- (h), PARSE_OPT_NOARG|(f), 
_opt_tertiary }
+#define OPT_UYN(s, l, v, h) { OPTION_CALLBACK, (s), (l), (v), NULL, \
+ (h), PARSE_OPT_NOARG, _opt_tertiary 
}
 #define OPT_DATE(s, l, v, h) \
{ OPTION_CALLBACK, (s), (l), (v), N_("time"),(h), 0,\
  parse_opt_approxidate_cb }
diff --git a/rerere.h b/rerere.h
index 5e5a312e4c..c2961feaaa 100644
--- a/rerere.h
+++ b/rerere.h
@@ -37,7 +37,6 @@ extern void rerere_clear(struct string_list *);
 extern void rerere_gc(struct string_list *);
 
 #define OPT_RERERE_AUTOUPDATE(v) OPT_UYN(0, "rerere-autoupdate", (v), \
-   N_("update the index with reused conflict resolution if possible"), \
-   PARSE_OPT_NOCOMPLETE)
+   N_("update the index with reused conflict resolution if possible"))
 
 #endif
-- 
2.16.2.785.g429c04a1b9



[PATCH v2 2/4] completion: simplify _git_notes

2018-03-06 Thread Nguyễn Thái Ngọc Duy
This also adds completion for 'git notes remove' and 'git notes edit'.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash | 14 +++---
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 0d858cacce..37bf4a64d3 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1836,19 +1836,11 @@ _git_notes ()
add,--reedit-message=*|append,--reedit-message=*)
__git_complete_refs --cur="${cur#*=}"
;;
-   add,--*)
-   __gitcomp_builtin notes_add
-   ;;
-   append,--*)
-   __gitcomp_builtin notes_append
-   ;;
-   copy,--*)
-   __gitcomp_builtin notes_copy
-   ;;
-   prune,--*)
-   __gitcomp_builtin notes_prune
+   *,--*)
+   __gitcomp_builtin notes_$subcommand
;;
prune,*)
+   # this command does not take a ref, do not complete it
;;
*)
case "$prev" in
-- 
2.16.2.785.g429c04a1b9



[PATCH v2 4/4] completion: more subcommands in _git_notes()

2018-03-06 Thread Nguyễn Thái Ngọc Duy
Two subcommands are added for completion: merge and get-ref. get-ref
is more like plumbing. But since it does not share the prefix with any
other subcommands, it won't slow anybody down.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index dc3ec43b65..2e30950299 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1815,7 +1815,7 @@ _git_name_rev ()
 
 _git_notes ()
 {
-   local subcommands='add append copy edit list prune remove show'
+   local subcommands='add append copy edit get-ref list merge prune remove 
show'
local subcommand="$(__git_find_on_cmdline "$subcommands")"
 
case "$subcommand,$cur" in
@@ -1838,7 +1838,7 @@ _git_notes ()
*,--*)
__gitcomp_builtin notes_$subcommand
;;
-   prune,*)
+   prune,*|get-ref,*)
# this command does not take a ref, do not complete it
;;
*)
-- 
2.16.2.785.g429c04a1b9



Re: [PATCH] xdiff: improve trimming preprocessing

2018-03-06 Thread Jun Wu
Excerpts from Junio C Hamano's message of 2018-03-06 11:29:44 -0800:
> Eric Sunshine  writes:
> 
> > On Tue, Mar 6, 2018 at 6:53 AM, Jun Wu  wrote:
> >> xdiff-interface trims common suffix if ctxlen is 0. Teach it to also
> >> trim common prefix, and trim less lines if ctxlen > 0. So it can benefit
> >> the default diff command, as seen by profiling: [...]
> 
> I vaguely recall that we have tried this in the distant past, found
> that it produced incorrect result, and that is why we limit the
> optimization for no-context case.
> 
> Does anybody have an archive reference?

I think it's d2f8295 ("Re(-re)*fix trim_common_tail()", 2007-12-20).

Yeah, this is more complex than I thought. In Mercurial's use-case, ctxlen
is 0 and context are added in a higher layer so it's fine.

It's clearly hunk shifting causing problems. I'll send a V2 to forbid hunk
shifting if it can possibly lose context lines. That implies counting suffix
lines, which adds some overhead but the overall perf win still seems
worthwhile.


Re: [PATCH v9 4/8] utf8: add function to detect a missing UTF-16/32 BOM

2018-03-06 Thread Junio C Hamano
Lars Schneider  writes:

> After thinking about it I wonder if we should barf on "utf16" without
> dash. Your Linux iconv would handle this correctly. My macOS iconv would not.
> That means the repo would checkout correctly on your machine but not on mine.
>
> What do you think?

To be bluntly honest, I prefer not to have excess "sanity checks";
there is no need to barf on utf16 in a project run by those who are
without macOS friends, for example.  For that matter, while I do not
hate it so much to reject it, I am not all that supportive of this
"The consortium says without -LE or -BE suffix there must be BOM,
and this lacks one, so barf loudly" step in this topic myself.


What's cooking in git.git (Mar 2018, #02; Tue, 6)

2018-03-06 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

I haven't seen enough reviews (nor I had chance to sufficiently
review them myself) to comfortably decide what to do with them on a
handful of new topics that appeared in the past week or two, so the
ones in the new topics section may not have any description yet in
this issue.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ab/fetch-prune (2018-02-09) 17 commits
  (merged to 'next' on 2018-02-27 at eafb648dd9)
 + fetch: make the --prune-tags work with 
 + fetch: add a --prune-tags option and fetch.pruneTags config
 + fetch tests: add scaffolding for the new fetch.pruneTags
 + git-fetch & config doc: link to the new PRUNING section
 + git remote doc: correct dangerous lies about what prune does
 + git fetch doc: add a new section to explain the ins & outs of pruning
 + fetch tests: fetch   as well as fetch []
 + fetch tests: expand case/esac for later change
 + fetch tests: double quote a variable for interpolation
 + fetch tests: test --prune and refspec interaction
 + fetch tests: add a tag to be deleted to the pruning tests
 + fetch tests: re-arrange arguments for future readability
 + fetch tests: refactor in preparation for testing tag pruning
 + remote: add a macro for "refs/tags/*:refs/tags/*"
 + fetch: stop accessing "remote" variable indirectly
 + fetch: trivially refactor assignment to ref_nr
 + fetch: don't redundantly NULL something calloc() gave us

 "git fetch --prune-tags" may be used as a handy short-hand for
 getting rid of stale tags that are locally held.


* ab/simplify-perl-makefile (2018-02-15) 1 commit
  (merged to 'next' on 2018-02-27 at b0d68a2013)
 + Makefile: generate Git(3pm) as dependency of the 'doc' and 'man' targets

 Hotfix for a topic already in 'master'.


* bw/c-plus-plus (2018-02-22) 37 commits
  (merged to 'next' on 2018-02-27 at daf85c03de)
 + replace: rename 'new' variables
 + trailer: rename 'template' variables
 + tempfile: rename 'template' variables
 + wrapper: rename 'template' variables
 + environment: rename 'namespace' variables
 + diff: rename 'template' variables
 + environment: rename 'template' variables
 + init-db: rename 'template' variables
 + unpack-trees: rename 'new' variables
 + trailer: rename 'new' variables
 + submodule: rename 'new' variables
 + split-index: rename 'new' variables
 + remote: rename 'new' variables
 + ref-filter: rename 'new' variables
 + read-cache: rename 'new' variables
 + line-log: rename 'new' variables
 + imap-send: rename 'new' variables
 + http: rename 'new' variables
 + entry: rename 'new' variables
 + diffcore-delta: rename 'new' variables
 + diff: rename 'new' variables
 + diff-lib: rename 'new' variable
 + commit: rename 'new' variables
 + combine-diff: rename 'new' variables
 + remote: rename 'new' variables
 + reflog: rename 'new' variables
 + pack-redundant: rename 'new' variables
 + help: rename 'new' variables
 + checkout: rename 'new' variables
 + apply: rename 'new' variables
 + apply: rename 'try' variables
 + diff: rename 'this' variables
 + rev-parse: rename 'this' variable
 + pack-objects: rename 'this' variables
 + blame: rename 'this' variables
 + object: rename function 'typename' to 'type_name'
 + object_info: change member name from 'typename' to 'type_name'

 We now avoid using identifiers that clash with C++ keywords.  Even though
 it is not a goal to compile Git with C++ compilers, changes like
 this help use of code analysis tools that targets C++ on our
 codebase.


* bw/doc-submodule-recurse-config-with-clone (2018-02-21) 1 commit
  (merged to 'next' on 2018-02-27 at 5b12841508)
 + submodule: indicate that 'submodule.recurse' doesn't apply to clone

 Doc update.


* bw/perl-timegm-timelocal-fix (2018-02-23) 1 commit
  (merged to 'next' on 2018-02-27 at 565a3141ce)
 + perl: call timegm and timelocal with 4-digit year

 Y2k20 fix ;-) for our perl scripts.


* jc/allow-ff-merging-kept-tags (2018-02-16) 1 commit
  (merged to 'next' on 2018-02-27 at 8b03610d2b)
 + merge: allow fast-forward when merging a tracked tag

 Since Git 1.7.9, "git merge" defaulted to --no-ff (i.e. even when
 the side branch being merged is a descendant of the current commit,
 create a merge commit instead of fast-forwarding) when merging a
 tag object.  This was appropriate default for integrators who pull
 signed tags from their downstream contributors, but caused an
 unnecessary merges when used by downstream contributors who
 habitually "catch up" their topic branches with tagged releases
 from the upstream.  Update "git merge" to default to --no-ff only
 when merging 

Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)

2018-03-06 Thread Junio C Hamano
Johannes Schindelin  writes:

>> I don't think its possible to guess the semantics of the original merge
>> as users can use custom merge strategies and amend the result. It would
>> be possible to detect and unamended '-s ours' merge but special casing
>> that may end up causing users more confusion rather than helping them.
>
> FWIW I agree.

I think it is a mistake to sacrifice predictability only to add
cleverness that sometimes work.  Elsewhere in the thread, I think I
saw an argument to treat interactive and non-interactive something
very different, but there is no fundamental difference between them
(it is far easier with interactive to force the command to "port"
each change to a vastly different context) so having consistent
behaviour between the two cases is important, too.

>
> My original plan was to always merge recursively and suggest to use `exec`
> commands if anything else is needed.
>
> But now with that excellent new idea to perform successive three-way
> merges of the original merge commit with the new tips, using the old tips
> as merge base, I am considering to change that.

OK, does this mean we want to wait before merging the "recreate
merge" topic down to 'next'?  For more than a few weeks, it has been
slated for 'next'.



Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Igor Djordjevic
Hi Johannes,

On 05/03/2018 18:29, Johannes Schindelin wrote:
> 
> > By the way, is there documentation for `git merge-recursive` 
> > anywhere, besides the code itself...? :$
> 
> I am not aware of any. The commit message adding the command is not very
> illuminating (https://github.com/git-for-windows/git/commit/720d150c4):
> 
> Add a new merge strategy by Fredrik Kuivinen.
> 
> I really wanted to try this out, instead of asking for an adjustment
> to the 'git merge' driver and waiting.  For now the new strategy is
> called 'fredrik' and not in the list of default strategies to be tried.
> 
> The script wants Python 2.4 so this commit also adjusts Debian and RPM
> build procecure files.
> 
> Digging through https://public-inbox.org/git/ during that time frame comes
> up with this hit, though:
> 
> https://public-inbox.org/git/20050907164734.ga20...@c165.ib.student.liu.se/
> 
> which is still not a good documentation of the algorithm. You can probably
> dig further yourself, but I think I can describe it very quickly here:
> 
> To merge two commits recursively, you first have to find their "merge
> bases". If there was an obvious branch point, then that is the merge base.
> But when you start a branch off of master, then work a bit, then merge
> master, you already have two merge bases.
> 
> The trick about the recursive merge is to reduce the number of merge bases
> iteratively to one. It does that by taking two merge bases, and performing
> a recursive merge on them, which generates a "virtual" commit, the
> condensed merge base. That one is then merged recursively with the next
> merge base, until there is only one left.
> 
> A recursive merge of two commits with exactly one merge base is simply a
> three-way merge.
> 
> I vaguely remember that there was something funny about the order in which
> order you want to process the merge bases: if you did it in one
> (chronological) direction, it worked beautifully, in the other direction
> it would generate tons of merge conflicts or something like that.

Thanks, this is very informative (together with the linked discussion).

Not remembering seeing this before, I wasn`t really sure if this was 
some undocumented core Git (plumbing?) utility (used withing Git 
itself, too), or just a leftover (yet still useful) sample tool.

Regards, Buga


Git / Software Freedom Conservancy status report (2018)

2018-03-06 Thread Jeff King
It's been about a year since I sent a big update regarding our
activities with Conservancy. So I'm going to pretend that it was all a
plan to have an annual report, and this is the 2018 one. The previous
report can be found at:

  
https://public-inbox.org/git/20170202024501.57hrw4657tsqe...@sigill.intra.peff.net/


# Background

Most people who work on the project know this already, but here's a
little background.

Git is a member project of the Software Freedom Conservancy. We joined
in 2010 so that Conservancy could help us manage our money and other
assets, and represent us for legal stuff like trademarks. Conservancy
doesn't hold any copyright on Git code, and our status as a member
project is totally disconnected from the development of the code. The
technical direction of the project is delegated back to Git.

A "Project Leadership Committee" (PLC) represents Git in Conservancy.
The PLC currently consists of Junio Hamano and me.


# Financials

We have ~26k USD in our account. That's up ~5k from last year. Here are
the ledger numbers from the last year (it's double-entry, so negative is
good):

$-6,632.97  Income:Git
$-5,368.20Donations
$-1,174.86Royalties
   $-89.91Currency Conversion
 $1,117.79  Expenses:Git
$58.69Currency Conversion
   $176.98Banking Fees
   $882.12Conferences:Travel
   $225.00  Per Diem
  
$-5,515.18

Most of our money is taken in from donations. About 20% of it comes from
"royalties". Most of that is from the Amazon affiliate links on
git-scm.com (we link to the Pro Git book, but if people click through
and buy other stuff, that counts too). A little bit of the money comes
from Packt Publishing, who sends the project royalties based on sales of
their Git-related books. We give 10% of incoming money to Conservancy's
general fund (these numbers are after the 10%).

Our main expense was travel money for last year's Contributor Summit. It
looks like we spend a lot on banking fees and currency conversion, but
really those are just necessary parts of taking in the donations (e.g.,
somebody donates through paypal who charges a small cut to process the
transaction).

We'll probably spend about the same amount this year for travel to Git
Merge (more on that below).


# Trademark

As I reported last year, we hold a trademark on the term "Git" in the
space of software and version control. I won't repeat all the details;
see last year's report.

There were a few requests this year from various entities; we mostly
followed the rules of thumb I laid out last year.

Subjectively, it felt like there were fewer requests this year than in
the past. I suspect last year's discussion may have gotten our trademark
policy a bit more attention (most of the previous requests are of the
form "I've been using the name "Git Foo" for my project and just found
out you don't want me to do so).


# Travel Budget Allocation

As with past years, we offered money for community members to come to
Git Merge (and the contributor summit). We've always sort of made up the
procedure each year, but this year tried to formalize it a little more.
Our procedure was basically:

  0. We picked an amount that seemed OK to spend. For this year we
 picked $2,000 (which admittedly is kind of made up).

  1. I put out a call for requests to the mailing list.

  2. For each request, I asked the requester to ballpark the cost of the
 flight plus lodging to cover the conference.

  3. We ordered the requests roughly by contribution history. We also
 treated people in programs like GSoC and Outreachy as frequent
 contributors. Even though they haven't contributed as much code
 yet, the point of the program is getting them involved.

  4. We walked down the list until we ran out of budget. At any given
 level, the amount of money offered was tweaked manually based on
 need/fairness (i.e., to offer more to people whose flights are more
 expensive).

 This meant that we extended an offer of partial funds in some
 cases (at the end of the line).

That whole procedure is totally open to comment and suggestions. There's
a huge amount of hand-waving and subjectivity there. It also doesn't
really work as a general procedure if somebody were to request one-off
funding for a conference besides Git Merge (which has never actually
happened so far).


# Project Governance

As many of you know, the third member of the PLC, Shawn Pearce, passed
away early this year. Our charter calls for a minimum of three PLC
members. New members are decided by a simple majority vote of the
existing committee (of which there are now 2 members, but Conservancy
has a tie-breaking vote if for some reason we deadlock).

Beyond that, there's no specified procedure. Junio and I have some
thoughts on candidates, but we want to open it up to the list for

Re: [PATCH v9 4/8] utf8: add function to detect a missing UTF-16/32 BOM

2018-03-06 Thread Lars Schneider

> On 07 Mar 2018, at 00:07, Junio C Hamano  wrote:
> 
> Junio C Hamano  writes:
> 
>> Lars Schneider  writes:
>> 
 Also "UTF16" or other spelling
 the platform may support but this code fails to recognise will go
 unchecked.
>>> 
>>> That is true. However, I would assume all iconv implementations use the
>>> same encoding names for UTF encodings, no? That means UTF16 would never be
>>> valid. Would you agree?
>> 
>> After seeing "UTF16" and others in "iconv -l | grep -i utf" output,
>> I am not sure what you mean by "Would you agree?"  People can say in
>> their .gitattributes file that this path is in "UTF16" without dash
>> and that is what will be fed to this code, no?
> 
> FWIW, "iconv -f utf8 -t utf16" seems not to complain, so I am
> reasonably sure that people expect downcased ones to work as well.

Sure! That's why I normalized it to upper case:
https://public-inbox.org/git/CAPig+cQE0pKs-AMvh4GndyCXBMnx=70jppdm6k4jjte-74f...@mail.gmail.com/

After thinking about it I wonder if we should barf on "utf16" without
dash. Your Linux iconv would handle this correctly. My macOS iconv would not.
That means the repo would checkout correctly on your machine but not on mine.

What do you think?

- Lars



Re: [PATCH v9 4/8] utf8: add function to detect a missing UTF-16/32 BOM

2018-03-06 Thread Junio C Hamano
Junio C Hamano  writes:

> Lars Schneider  writes:
>
>>>  Also "UTF16" or other spelling
>>> the platform may support but this code fails to recognise will go
>>> unchecked.
>>
>> That is true. However, I would assume all iconv implementations use the
>> same encoding names for UTF encodings, no? That means UTF16 would never be
>> valid. Would you agree?
>
> After seeing "UTF16" and others in "iconv -l | grep -i utf" output,
> I am not sure what you mean by "Would you agree?"  People can say in
> their .gitattributes file that this path is in "UTF16" without dash
> and that is what will be fed to this code, no?

FWIW, "iconv -f utf8 -t utf16" seems not to complain, so I am
reasonably sure that people expect downcased ones to work as well.


Re: [PATCH] xdiff: improve trimming preprocessing

2018-03-06 Thread Jun Wu
Excerpts from Eric Sunshine's message of 2018-03-06 14:23:46 -0500:
> On Tue, Mar 6, 2018 at 6:53 AM, Jun Wu  wrote:
> > +  printf "x%.0s" {1..934} >>d # pad common suffix to 1024 bytes
> 
> The expression {x..y} is not portable to non-POSIX shells.

Is there a recommended way to generate a repetitive string?
Maybe `seq 1000 | sed 's/.*/x/'` ?

> > +fgrep "@@ -1001 +1000,0 @@"
> > +'
> 
> Style: Mix of tabs and spaces for indentation. Please indent only with
> tabs throughout the patch.

Sorry. Will fix in V2.
 
> > +   /* prefix - need line count for xecfg->ptrimmed */
> > +   for (i = 0; ++i < smaller && *ap == *bp;) {
> > +   lines += (*ap == '\n');
> > +   ap++, bp++;
> 
> Is there a good reason for not placing 'ap++, bp++' directly in the 'for'?

"lines += (*ap == '\n');" needs the "ap" before adding. Alternatives are

for (i = 0; ++i < smaller && *ap == *bp; ) /* 1 */
lines += (*ap++, *bp++) == '\n';

for (i = 0; ++i < smaller && *ap == *bp; ap++, bp++) /* 2 */
lines += (*(ap - 1) == '\n');

Maybe will pick /* 1 */ to make the code shorter.


git-scm.com update

2018-03-06 Thread Jeff King
Last year I reported on the state of the git-scm.com website:

  
https://public-inbox.org/git/20170202023349.7fopb3a6pc6dk...@sigill.intra.peff.net/

There was a little bit of public discussion, and I privately got
approximately one zillion offers to host the site or otherwise help with
it. Thank you to everybody who responded.

Here's an update on what happened since then:

 - we now have a small group of maintainers able to triage incoming bug
   reports and patches, and generate some fixes. Thanks especially to
   Jean-Noël Avila, Pedro Rijo, and Samuel Lijin for their work over the
   past year.

 - we resolved most of the performance issues. These were partly due to
   some inefficient database queries, but mostly it was resolved with
   lots and lots of caching.

   The main problem, it turned out, is just that we get a _lot_ of hits.
   Like 5-6 million requests per day (that's individual HTTP requests;
   analytics report that we get close to 200,000 unique visitors on any
   given weekday).

   Details on the caching are in the architecture document below.

 - the site remains a rails app. This is probably overkill, but it was
   the path of least resistance to keep it one. Converting to a static
   site would require at least some grunt-work, but also figuring out
   some solution for the site-wide search. There's no immediate plan to
   move to anything else.

 - previously the site was being paid for by GitHub (and was tied to
   GitHub's Heroku account). We could have continued that indefinitely,
   but I wanted to move it to a community-owned account. That happened
   (details in the architecture document).

   Rather than ask for money from GitHub or elsewhere to cover hosting
   costs (which would have been easy -- we had quite a few offers!), I
   instead approached the companies whose services we're using to host
   the site and asked for direct sponsorship. Every one of them was
   happy to oblige, and Heroku in particular was helpful with migrating
   the site to the new account. So we're currently receiving free
   service from Heroku, Cloudflare, and Bonsai.

 - I've written up a few documents about how the site is organized. One
   higher-level page on the site for bug reporters, etc:

 https://github.com/git/git-scm.com/pull/1172/files

   and one more detailed architecture document:

 https://github.com/git/git-scm.com/blob/master/ARCHITECTURE.md

So at this point I think the site is in reasonably good shape going
forward. One of the long-standing issues is that the design is not very
responsive, and looks bad on phones and tablets. Jason Long, who did
much of the original site design, is working on a visual refresh that
should fix this and update some of the messier and outdated parts of the
CSS and related tooling. The plan is to have the refresh running
alongside the production site soon, so that we can gather feedback and
iterate from there.

-Peff


Re: [PATCH v9 4/8] utf8: add function to detect a missing UTF-16/32 BOM

2018-03-06 Thread Lars Schneider

> On 06 Mar 2018, at 23:53, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>>> Also "UTF16" or other spelling
>>> the platform may support but this code fails to recognise will go
>>> unchecked.
>> 
>> That is true. However, I would assume all iconv implementations use the
>> same encoding names for UTF encodings, no? That means UTF16 would never be
>> valid. Would you agree?
> 
> After seeing "UTF16" and others in "iconv -l | grep -i utf" output,
> I am not sure what you mean by "Would you agree?"  People can say in
> their .gitattributes file that this path is in "UTF16" without dash
> and that is what will be fed to this coe, no?

On macOS I don't see UTF16... but I just checked on my Linux box and
there it is. Consequently, we need to check both versions: with and
without dash.

I'll reroll ASAP (I try to do it first thing in the morning).

Thanks,
Lars


Re: [PATCH v9 4/8] utf8: add function to detect a missing UTF-16/32 BOM

2018-03-06 Thread Junio C Hamano
Lars Schneider  writes:

>>  Also "UTF16" or other spelling
>> the platform may support but this code fails to recognise will go
>> unchecked.
>
> That is true. However, I would assume all iconv implementations use the
> same encoding names for UTF encodings, no? That means UTF16 would never be
> valid. Would you agree?

After seeing "UTF16" and others in "iconv -l | grep -i utf" output,
I am not sure what you mean by "Would you agree?"  People can say in
their .gitattributes file that this path is in "UTF16" without dash
and that is what will be fed to this coe, no?


Re: [PATCH 2/2] completion: simplify _git_notes

2018-03-06 Thread Junio C Hamano
SZEDER Gábor  writes:

> There is a minor behaviour change here, though.  This
>
>   prune,*)
> ;;
>
> case arm ensured that we don't list refs for 'git notes prune ',
> because it doesn't accept them (and then we take our usual fallback and
> let Bash complete filenames;  yeah, 'git notes prune' doesn't accept
> filenames either, but, as I said, that's our usual fallback when we
> can't offer anything for completion).
>
> This patch removes that case arm, and refs will be offered for 'git
> notes prune '.
>
>> +   *,--*)
>> +   __gitcomp_builtin notes_$subcommand
>> ;;
>> *)
>> case "$prev" in

I have this tentatively queued on the topic.  Can we wrap the topic
up and move it forward, instead of leaving it (and other topics)
hanging around and causing conflicts with other topics in flight,
please?

Thanks.


Subject: [PATCH] SQUASH???

By Szeder 

Re: [PATCH v9 4/8] utf8: add function to detect a missing UTF-16/32 BOM

2018-03-06 Thread Lars Schneider

> On 06 Mar 2018, at 21:50, Junio C Hamano  wrote:
> 
> lars.schnei...@autodesk.com writes:
> 
>> +int is_missing_required_utf_bom(const char *enc, const char *data, size_t 
>> len)
>> +{
>> +return (
>> +   !strcmp(enc, "UTF-16") &&
>> +   !(has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
>> + has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
>> +) || (
>> +   !strcmp(enc, "UTF-32") &&
>> +   !(has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
>> + has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
>> +);
>> +}
> 
> These strcmp() calls seem inconsistent with the principle embodied
> by utf8.c::fallback_encoding(), i.e. "be lenient to what we accept",
> and make the interface uneven. I am wondering if we also want to
> complain when the user gave us "utf16" and there is no byte order
> mark in the contents, for example?

Well, if I use stricmp() then I don't need to call and cleanup
xstrdup_toupper() as discussed with Eric [1]. Is there a case
insensitive starts_with() method?

[1] 
https://public-inbox.org/git/CAPig+cQE0pKs-AMvh4GndyCXBMnx=70jppdm6k4jjte-74f...@mail.gmail.com/


>  Also "UTF16" or other spelling
> the platform may support but this code fails to recognise will go
> unchecked.

That is true. However, I would assume all iconv implementations use the
same encoding names for UTF encodings, no? That means UTF16 would never be
valid. Would you agree?

- Lars

Re: [PATCH 2/2] git-svn: allow empty email-address in authors-prog and authors-file

2018-03-06 Thread Andreas Heiduk
Am 05.03.2018 um 10:37 schrieb Andreas Heiduk:
> 2018-03-05 2:42 GMT+01:00 Eric Sunshine :
>> On Sun, Mar 4, 2018 at 6:22 AM, Andreas Heiduk  wrote:
>>> ---
>>> diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
>>> @@ -1482,7 +1482,6 @@ sub call_authors_prog {
>>> }
>>> if ($author =~ /^\s*(.+?)\s*<(.*)>\s*$/) {
>>> my ($name, $email) = ($1, $2);
>>> -   $email = undef if length $2 == 0;
>>> return [$name, $email];
>>
>> Mental note: existing behavior intentionally makes $email undefined if
>> not present in $author; revised behavior leaves it defined.
> 
> But only if the data comes from authors-prog. authors-file is unaffected.
> 
>>
>>> } else {
>>> die "Author: $orig_author: $::_authors_prog returned "
>>> @@ -2020,8 +2019,8 @@ sub make_log_entry {
>>> remove_username($full_url);
>>> $log_entry{metadata} = "$full_url\@$r $uuid";
>>> $log_entry{svm_revision} = $r;
>>> -   $email ||= "$author\@$uuid";
>>> -   $commit_email ||= "$author\@$uuid";
>>> +   $email //= "$author\@$uuid";
>>> +   $commit_email //= "$author\@$uuid";
>>
>> With the revised behavior (above), $email is unconditionally defined,
>> so these //= expressions will _never_ assign "$author\@$uuid" to
>> $email. Am I reading that correctly? If so, then isn't this now just
>> dead code? Wouldn't it be clearer to remove these lines altogether?
> 
> The olf behaviour still kicks in if
>  - neither authors-file nor authors-prog is used
>  - only authors-file is used
> 
>> I see from reading the code that there is a "if (!defined $email)"
>> earlier in the function which becomes misleading with this change. I'd
>> have expected the patch to modify that, as well.
> 
> I will look into that one later.

I don't want to let that slip through the cracks: The `if` statement
still makes a difference if:

 - neither ` --authors-prog` nor `--authors-file` is active,
 - but `--use-log-author` is active and
 - the commit at hand does not contain a `From:` or `Signed-off-by:`
   trailer.

In that case the result was and still is `$user <$user>` for the author
and `$user <$user@$uuid>` for the comitter. That doesn't make sense to
me but doesn't concern me right now.


Re: [PATCH v9 5/8] convert: add 'working-tree-encoding' attribute

2018-03-06 Thread Eric Sunshine
On Tue, Mar 6, 2018 at 5:13 PM, Lars Schneider  wrote:
>> On 06 Mar 2018, at 21:42, Eric Sunshine  wrote:
>> On Sun, Mar 4, 2018 at 3:14 PM,   wrote:
>>> +   return xstrdup_toupper(value);
>>
>> xstrdup_toupper() allocates memory...
>>
>>> +   const char *working_tree_encoding; /* Supported encoding or default 
>>> encoding if NULL */
>>
>> ...which is assigned to 'const char *'...
>>
>>> +   ca->working_tree_encoding = git_path_check_encoding(ccheck 
>>> + 5);
>>
>> ...by this code, and eventually leaked.
>>
>> It's too bad it isn't cleaned up (freed), but looking at the callers,
>> fixing this leak would be mildly noisy (though not particularly
>> invasive). How much do we care about this leak?
>
> Hmm. You are right. That was previously handled by the encoding struct
> linked list that I removed in this iteration. I forgot about that aspect :/
> I don't like it leaking. I think I would like to reintroduce the linked
> list. This way every encoding is only once in memory. What do you think?

It's subjective, but I find the use of a linked-list just for the
purpose of not leaking these strings unnecessarily confusing.

If I was doing it, I'd probably add a conv_attrs_free() function and
call it from each function allocates a 'struct conv_attrs' (including
calling it before early returns -- which prompted my earlier comment
about it being a "mildly noisy" fix).


Re: [PATCH v9 5/8] convert: add 'working-tree-encoding' attribute

2018-03-06 Thread Lars Schneider

> On 06 Mar 2018, at 21:42, Eric Sunshine  wrote:
> 
> On Sun, Mar 4, 2018 at 3:14 PM,   wrote:
>> Git recognizes files encoded with ASCII or one of its supersets (e.g.
>> UTF-8 or ISO-8859-1) as text files. All other encodings are usually
>> interpreted as binary and consequently built-in Git text processing
>> tools (e.g. 'git diff') as well as most Git web front ends do not
>> visualize the content.
>> [...]
>> Signed-off-by: Lars Schneider 
>> ---
>> diff --git a/convert.c b/convert.c
>> @@ -978,6 +1051,25 @@ static int ident_to_worktree(const char *path, const 
>> char *src, size_t len,
>> +static const char *git_path_check_encoding(struct attr_check_item *check)
>> +{
>> +   [...]
>> +   /*
>> +* Ensure encoding names are always upper case (e.g. UTF-8) to
>> +* simplify subsequent string comparisons.
>> +*/
>> +   return xstrdup_toupper(value);
> 
> xstrdup_toupper() allocates memory...
> 
>> +}
>> @@ -1033,6 +1125,7 @@ struct conv_attrs {
>>enum crlf_action attr_action; /* What attr says */
>>enum crlf_action crlf_action; /* When no attr is set, use 
>> core.autocrlf */
>>int ident;
>> +   const char *working_tree_encoding; /* Supported encoding or default 
>> encoding if NULL */
> 
> ...which is assigned to 'const char *'...
> 
>> };
>> @@ -1064,6 +1158,7 @@ static void convert_attrs(struct conv_attrs *ca, const 
>> char *path)
>>else if (eol_attr == EOL_CRLF)
>>ca->crlf_action = CRLF_TEXT_CRLF;
>>}
>> +   ca->working_tree_encoding = git_path_check_encoding(ccheck + 
>> 5);
> 
> ...by this code, and eventually leaked.
> 
> It's too bad it isn't cleaned up (freed), but looking at the callers,
> fixing this leak would be mildly noisy (though not particularly
> invasive). How much do we care about this leak?

Hmm. You are right. That was previously handled by the encoding struct 
linked list that I removed in this iteration. I forgot about that aspect :/
I don't like it leaking. I think I would like to reintroduce the linked
list. This way every encoding is only once in memory. What do you think?


>>} else {
>>ca->drv = NULL;
>>ca->crlf_action = CRLF_UNDEFINED;
>> diff --git a/t/t0028-working-tree-encoding.sh 
>> b/t/t0028-working-tree-encoding.sh
>> @@ -0,0 +1,135 @@
>> +test_expect_success 'check $GIT_DIR/info/attributes support' '
>> +   test_when_finished "rm -f test.utf8.raw test.utf32.raw 
>> test.utf32.git" &&
> 
> It seems weird to be cleaning up files this test didn't create
> (test.utf8.raw and test.utf32.raw).

Agreed.


>> +   test_when_finished "git reset --hard HEAD" &&
>> +
>> +   echo "*.utf32 text working-tree-encoding=utf-32" 
>> >.git/info/attributes &&
>> +   git add test.utf32 &&
>> +
>> +   git cat-file -p :test.utf32 >test.utf32.git &&
>> +   test_cmp_bin test.utf8.raw test.utf32.git
>> +'
>> +
>> +test_expect_success 'check unsupported encodings' '
>> +   test_when_finished "rm -f err.out" &&
>> +   test_when_finished "git reset --hard HEAD" &&
> 
> Resetting to HEAD here is an important cleanup action, but tests don't
> usually clean up files such as 'err.out' since such detritus doesn't
> usually impact subsequent tests negatively. (Just an observation; no
> re-roll needed.)

OK. I'll fix it if I reroll.

- Lars

QUICK REPLY

2018-03-06 Thread Kuan Wai Chang




Would you like to be part of our supply? 

Please respond back to my private email 

on: kuanwaich...@gmail.com






Re: [PATCH v2 1/3] add -p: select individual hunk lines

2018-03-06 Thread Igor Djordjevic
On 06/03/2018 21:29, Igor Djordjevic wrote:
> 
> > diff --git a/git-add--interactive.perl b/git-add--interactive.perl
> > index f83e7450ad..a273b41e95 100755
> > --- a/git-add--interactive.perl
> > +++ b/git-add--interactive.perl
> > 
> > [...]
> > 
> > @@ -1255,6 +1382,7 @@ j - leave this hunk undecided, see next undecided hunk
> >  J - leave this hunk undecided, see next hunk
> >  k - leave this hunk undecided, see previous undecided hunk
> >  K - leave this hunk undecided, see previous hunk
> > +l - select hunk lines to use
> 
> s/select hunk lines to use/stage hunk lines/

I was wrong here - in the context of Junio`s remark, I now think this 
might even belong to context-aware "help_patch_modes" instead, 
phrased accordingly in there (stage/stash/unstage... etc.).


Re: [GSoC][PATCH v4] Make options that expect object ids less chatty if id is invalid

2018-03-06 Thread Junio C Hamano
Junio C Hamano  writes:

> I kind of find it surprising that the one single case I happened to
> have noticed is the only one that needed special treatment.  Did you
> go though all the codepath and made sure that the ones that still
> return -1 (not -2 and not -3) to parse_options_step() are all OK (in
> other words, I was just lucky) or does this version change only the
> "ambiguous" case, simply because that was the only one I noticed in
> my review (in other words, this may still not be sufficient)?
>
> Just double checking.
>
> The changes to existing tests have become a lot less noisy, compared
> to the previous one, which is probably a good thing.

I guess I should stop reading my inbox in reverse order.  In your
reply to my v3 review you said you studied all the codepaths from
parse_short_opt() and parse_long_opt() and addition of -3 was needed
only for the "ambiguous" case, so the answer to my question above is
"I was just lucky and happened to have hit the only problematic
case" ;-)

Will revert what is in 'next' and queue this one.

Thanks.


Re: [PATCH v2 0/3] add -p: select individual hunk lines

2018-03-06 Thread Igor Djordjevic
Hi Junio,

On 06/03/2018 22:03, Junio C Hamano wrote:
> 
> > A small nitpick - I see you use phrasing like "select lines", where 
> > the other commands usually talk about "staging", instead, so "stage 
> > lines" might be more aligned with the existing text.
> 
> Isn't this machinery shared across "add -p" and "reset -p"?  What is
> done to the selected lines when you are using this UI while running
> "reset -p"?  I hope it is not "staging".  If the interface only
> "selects lines" and what is done to the selected lines depends on
> what operation is using this backend, then the current phrasing is
> perfectly fine and saying "staging" makes it actively worse.

Hmm, if that is the case, I agree, but I was merely trying to review 
the files being changed - for example, inside "Documentation/git-add.txt":

   y - stage this hunk
   n - do not stage this hunk
   q - quit; do not stage this hunk or any of the remaining ones
   a - stage this hunk and all later hunks in the file
   d - do not stage this hunk or any of the later hunks in the file
   g - select a hunk to go to
   / - search for a hunk matching the given regex
   j - leave this hunk undecided, see next undecided hunk
   J - leave this hunk undecided, see next hunk
   k - leave this hunk undecided, see previous undecided hunk
   K - leave this hunk undecided, see previous hunk
   s - split the current hunk into smaller hunks
   e - manually edit the current hunk
   ? - print help


In there, adding "l" should follow "stage" phrasing, I would think.

But you are right for "git-add--interactive.perl", for example - in 
there, I didn`t notice the line (seems to be?) added inside the shared 
"help_patch_cmd".

But if so, I guess it should then be moved to more context-related 
"help_patch_modes", being phrased accordingly in there.

Thanks for pointing this out, let me recheck my comments.

Regards, Buga


Re: [GSoC][PATCH v4] Make options that expect object ids less chatty if id is invalid

2018-03-06 Thread Junio C Hamano
Paul-Sebastian Ungureanu  writes:

> Usually, the usage should be shown only if the user does not know what
> options are available. If the user specifies an invalid value, the user
> is already aware of the available options. In this case, there is no
> point in displaying the usage anymore.
>
> This patch applies to "git tag --contains", "git branch --contains",
> "git branch --points-at", "git for-each-ref --contains" and many more.
>
> Signed-off-by: Paul-Sebastian Ungureanu 
> ---

I notice that this version changes the way the case where an
unbiguous long option is given, compared to the previous one is
handled.  And I recall that that single case is what I happened to
have noticed during my review of the previous one, in which I was
not trying to be exhausitive.  

I kind of find it surprising that the one single case I happened to
have noticed is the only one that needed special treatment.  Did you
go though all the codepath and made sure that the ones that still
return -1 (not -2 and not -3) to parse_options_step() are all OK (in
other words, I was just lucky) or does this version change only the
"ambiguous" case, simply because that was the only one I noticed in
my review (in other words, this may still not be sufficient)?

Just double checking.

The changes to existing tests have become a lot less noisy, compared
to the previous one, which is probably a good thing.

Thanks.


Re: [PATCH v9 0/8] convert: add support for different encodings

2018-03-06 Thread Eric Sunshine
On Sun, Mar 4, 2018 at 3:14 PM,   wrote:
> Changes since v8: [...]
>
> Thanks a lot Eric for your great review! I think I fixed everything you
> objected with one exception. You noticed that the current code only
> checks for BOMs corresponding to the declared size (16 or 32 bits) [1].
> I understand your point of view and I agree that any BOM in these cases
> is *most likely* an error. However, according to the spec it might
> still be valid. The comments on my related question on StackOverflow
> seem to support that view [2]. Therefore, I would like to leave it as
> it is in this series. If it turns out to be a problem in practice, then
> I am happy to change it later. OK for you?

Fine. As this is not a correctness issue with the conversion itself,
but rather an attempt to detect misconfiguration by the user, it can
always be revisited later if someone comes up with a more solid
argument that one interpretation is more correct than the other.

I did leave a few review comments on v9, but I don't think any are
necessarily worth a re-roll. If anything, and if they seem actionable,
then perhaps a patch or two atop the series could address them (at
some point).


Re: [PATCH v2 0/3] add -p: select individual hunk lines

2018-03-06 Thread Junio C Hamano
Igor Djordjevic  writes:

> A small nitpick - I see you use phrasing like "select lines", where 
> the other commands usually talk about "staging", instead, so "stage 
> lines" might be more aligned with the existing text.

Isn't this machinery shared across "add -p" and "reset -p"?  What is
done to the selected lines when you are using this UI while running
"reset -p"?  I hope it is not "staging".  If the interface only
"selects lines" and what is done to the selected lines depends on
what operation is using this backend, then the current phrasing is
perfectly fine and saying "staging" makes it actively worse.



Re: [PATCH v9 6/8] convert: check for detectable errors in UTF encodings

2018-03-06 Thread Eric Sunshine
On Sun, Mar 4, 2018 at 3:14 PM,   wrote:
> Check that new content is valid with respect to the user defined
> 'working-tree-encoding' attribute.
>
> Signed-off-by: Lars Schneider 
> ---
> diff --git a/convert.c b/convert.c
> @@ -266,6 +266,53 @@ static int will_convert_lf_to_crlf(size_t len, struct 
> text_stat *stats,
> +static int validate_encoding(const char *path, const char *enc,
> + const char *data, size_t len, int die_on_error)
> +{
> +   if (!memcmp("UTF-", enc, 4)) {
> +   [...]
> +   if (has_prohibited_utf_bom(enc, data, len)) {
> +   [...]
> +   if (die_on_error)
> +   die(error_msg, path, enc);
> +   else {
> +   return error(error_msg, path, enc);
> +   }
> +   } [...]
> +   return 0;
> +}
> @@ -291,6 +338,9 @@ static int encode_to_git(const char *path, const char 
> *src, size_t src_len,
> +   if (validate_encoding(path, enc, src, src_len, die_on_error))
> +   return 0;

There could be a cleaner separation of responsibilities here which, as
a nice side-effect, would eliminate the repeated "if (die_on_error)
die(...); else return error(...);" pattern. Rather than passing a
'die_on_error' flag to validate_encoding(), it could accept a 'strbuf
*err' to populate in case of error:

static int validate_encoding(..., struct strbuf *err)
{
if validation error:
populate 'err'
return -1;
return 0
}

and let the caller be responsible for deciding how to handle failure:

struct strbuf err = STRBUF_INIT;
...
if (validate_encoding(..., )) {
if (die_on_error)
die(err.buf);
else {
error(err.buf);
strbuf_release();
return 0;
}
}

Not necessarily worth a re-roll, but perhaps a cleanup someone could
submit at some point if interested.

> dst = reencode_string_len(src, src_len, default_encoding, enc,
>   _len);


Re: [PATCH v9 4/8] utf8: add function to detect a missing UTF-16/32 BOM

2018-03-06 Thread Junio C Hamano
lars.schnei...@autodesk.com writes:

> +int is_missing_required_utf_bom(const char *enc, const char *data, size_t 
> len)
> +{
> + return (
> +!strcmp(enc, "UTF-16") &&
> +!(has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
> +  has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
> + ) || (
> +!strcmp(enc, "UTF-32") &&
> +!(has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
> +  has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
> + );
> +}

These strcmp() calls seem inconsistent with the principle embodied
by utf8.c::fallback_encoding(), i.e. "be lenient to what we accept",
and make the interface uneven.  I am wondering if we also want to
complain when the user gave us "utf16" and there is no byte order
mark in the contents, for example?  Also "UTF16" or other spelling
the platform may support but this code fails to recognise will go
unchecked.

Which actually may be a feature, not a bug, to be able to bypass
this check---I dunno.

The same comment applies to the previous step.



Re: [PATCH v9 5/8] convert: add 'working-tree-encoding' attribute

2018-03-06 Thread Eric Sunshine
On Sun, Mar 4, 2018 at 3:14 PM,   wrote:
> Git recognizes files encoded with ASCII or one of its supersets (e.g.
> UTF-8 or ISO-8859-1) as text files. All other encodings are usually
> interpreted as binary and consequently built-in Git text processing
> tools (e.g. 'git diff') as well as most Git web front ends do not
> visualize the content.
> [...]
> Signed-off-by: Lars Schneider 
> ---
> diff --git a/convert.c b/convert.c
> @@ -978,6 +1051,25 @@ static int ident_to_worktree(const char *path, const 
> char *src, size_t len,
> +static const char *git_path_check_encoding(struct attr_check_item *check)
> +{
> +   [...]
> +   /*
> +* Ensure encoding names are always upper case (e.g. UTF-8) to
> +* simplify subsequent string comparisons.
> +*/
> +   return xstrdup_toupper(value);

xstrdup_toupper() allocates memory...

> +}
> @@ -1033,6 +1125,7 @@ struct conv_attrs {
> enum crlf_action attr_action; /* What attr says */
> enum crlf_action crlf_action; /* When no attr is set, use 
> core.autocrlf */
> int ident;
> +   const char *working_tree_encoding; /* Supported encoding or default 
> encoding if NULL */

...which is assigned to 'const char *'...

>  };
> @@ -1064,6 +1158,7 @@ static void convert_attrs(struct conv_attrs *ca, const 
> char *path)
> else if (eol_attr == EOL_CRLF)
> ca->crlf_action = CRLF_TEXT_CRLF;
> }
> +   ca->working_tree_encoding = git_path_check_encoding(ccheck + 
> 5);

...by this code, and eventually leaked.

It's too bad it isn't cleaned up (freed), but looking at the callers,
fixing this leak would be mildly noisy (though not particularly
invasive). How much do we care about this leak?

> } else {
> ca->drv = NULL;
> ca->crlf_action = CRLF_UNDEFINED;
> diff --git a/t/t0028-working-tree-encoding.sh 
> b/t/t0028-working-tree-encoding.sh
> @@ -0,0 +1,135 @@
> +test_expect_success 'check $GIT_DIR/info/attributes support' '
> +   test_when_finished "rm -f test.utf8.raw test.utf32.raw 
> test.utf32.git" &&

It seems weird to be cleaning up files this test didn't create
(test.utf8.raw and test.utf32.raw).

> +   test_when_finished "git reset --hard HEAD" &&
> +
> +   echo "*.utf32 text working-tree-encoding=utf-32" 
> >.git/info/attributes &&
> +   git add test.utf32 &&
> +
> +   git cat-file -p :test.utf32 >test.utf32.git &&
> +   test_cmp_bin test.utf8.raw test.utf32.git
> +'
> +
> +test_expect_success 'check unsupported encodings' '
> +   test_when_finished "rm -f err.out" &&
> +   test_when_finished "git reset --hard HEAD" &&

Resetting to HEAD here is an important cleanup action, but tests don't
usually clean up files such as 'err.out' since such detritus doesn't
usually impact subsequent tests negatively. (Just an observation; no
re-roll needed.)

> +   echo "*.nothing text working-tree-encoding=" >>.gitattributes &&
> +   printf "nothing" >t.nothing &&
> +   git add t.nothing &&
> +
> +   echo "*.garbage text working-tree-encoding=garbage" >>.gitattributes 
> &&
> +   printf "garbage" >t.garbage &&
> +   test_must_fail git add t.garbage 2>err.out &&
> +   test_i18ngrep "fatal: failed to encode" err.out
> +'


Re: [PATCH v2 2/3] add -p: allow line selection to be inverted

2018-03-06 Thread Igor Djordjevic
On 06/03/2018 11:17, Phillip Wood wrote:
> From: Phillip Wood 
> 
> If the list of lines to be selected begins with '^' select all the
> lines except the ones listed.

s/to be selected begins with '^' select all/to be staged begins with '^' stage 
all/

> 
> Signed-off-by: Phillip Wood 
> ---
>  Documentation/git-add.txt  |  3 ++-
>  git-add--interactive.perl  | 17 -
>  t/t3701-add-interactive.sh |  2 +-
>  3 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
> index ad33fda9a2..0e2c11e97b 100644
> --- a/Documentation/git-add.txt
> +++ b/Documentation/git-add.txt
> @@ -341,7 +341,8 @@ If you press "l" then the hunk will be reprinted with 
> each insertion
>  or deletion labelled with a number and you will be prompted to enter
>  which lines you wish to select. Individual line numbers should be
>  separated by a space or comma, to specify a range of lines use a dash
> -between them.
> +between them. To invert the selection prefix it with "\^" so "^3-5,8"
> +will select everything except lines 3, 4, 5 and 8.

Hmm, here, first "selection" seems to make sense as it is (I guess),
but might still be better to later say 
s/will select everything/will stage everything/ ...?

That said, might be "to invert the selection" could rather be "to unstage," 
instead? Not sure, though.

>  +
>  After deciding the fate for all hunks, if there is any hunk
>  that was chosen, the index is updated with the selected hunks.
> diff --git a/git-add--interactive.perl b/git-add--interactive.perl
> index a273b41e95..6fa3d0a87c 100755
> --- a/git-add--interactive.perl
> +++ b/git-add--interactive.perl
> @@ -1085,9 +1085,21 @@ sub check_hunk_label {
>  sub parse_hunk_selection {
>   local $_;
>   my ($hunk, $line) = @_;
> - my $max_label = $hunk->{MAX_LABEL};
> + my ($max_label, $invert) = ($hunk->{MAX_LABEL}, undef);
>   my @selected = (0) x ($max_label + 1);
>   my @fields = split(/[,\s]+/, $line);
> + if ($fields[0] =~ /^\^(.*)/) {
> + $invert = 1;
> + if ($1 ne '') {
> + $fields[0] = $1;
> + } else {
> + shift @fields;
> + unless (@fields) {
> + error_msg __("no lines to invert\n");
> + return undef;
> + }
> + }
> + }
>   for (@fields) {
>   if (/^([0-9]*)-([0-9]*)$/) {
>   if ($1 eq '' and $2 eq '') {
> @@ -1110,6 +1122,9 @@ sub parse_hunk_selection {
>   return undef;
>   }
>   }
> + if ($invert) {
> + @selected = map { !$_ } @selected;
> + }
>   return \@selected;
>  }
>  
> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> index 65c8c3354b..89c0e73f2b 100755
> --- a/t/t3701-add-interactive.sh
> +++ b/t/t3701-add-interactive.sh
> @@ -410,7 +410,7 @@ test_expect_success 'setup expected diff' '
>  '
>  
>  test_expect_success 'can reset individual lines of patch' '
> - printf "%s\n" l 2 |
> + printf "%s\n" l "^1 3" |
>   EDITOR=: git reset -p 2>error &&
>   test_must_be_empty error &&
>   git diff --cached HEAD >actual &&
> 


Re: [PATCH v2 3/3] add -p: optimize line selection for short hunks

2018-03-06 Thread Igor Djordjevic
On 06/03/2018 11:17, Phillip Wood wrote:
> From: Phillip Wood 
> 
> If there are fewer than ten changes in a hunk then make spaces
> optional when selecting individual lines. This means that for short

Not sure if using s/selecting individual lines/staging individual lines/ 
would make sense here, too, but not that important (as you later do 
say "to stage lines").

> hunks one can just type -357 to stage lines 1, 2, 3, 5 & 7.
> 
> Signed-off-by: Phillip Wood 
> ---
>  Documentation/git-add.txt  |  3 ++-
>  git-add--interactive.perl  | 30 ++
>  t/t3701-add-interactive.sh |  2 +-
>  3 files changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
> index 0e2c11e97b..d52acfc722 100644
> --- a/Documentation/git-add.txt
> +++ b/Documentation/git-add.txt
> @@ -340,7 +340,8 @@ patch::
>  If you press "l" then the hunk will be reprinted with each insertion
>  or deletion labelled with a number and you will be prompted to enter
>  which lines you wish to select. Individual line numbers should be
> -separated by a space or comma, to specify a range of lines use a dash
> +separated by a space or comma (these can be omitted if there are fewer
> +than ten labelled lines), to specify a range of lines use a dash
>  between them. To invert the selection prefix it with "\^" so "^3-5,8"
>  will select everything except lines 3, 4, 5 and 8.
>  +
> diff --git a/git-add--interactive.perl b/git-add--interactive.perl
> index 6fa3d0a87c..9a6bcd5085 100755
> --- a/git-add--interactive.perl
> +++ b/git-add--interactive.perl
> @@ -1082,6 +1082,33 @@ sub check_hunk_label {
>   return 1;
>  }
>  
> +sub split_hunk_selection {
> + local $_;
> + my @fields = @_;
> + my @ret;
> + for (@fields) {
> + if (/^(-[0-9])(.*)/) {
> + push @ret, $1;
> + $_ = $2;
> + }
> + while ($_ ne '') {
> + if (/^[0-9]-$/) {
> + push @ret, $_;
> + last;
> + } elsif (/^([0-9](?:-[0-9])?)(.*)/) {
> + push @ret, $1;
> + $_ = $2;
> + } else {
> + error_msg sprintf
> + __("invalid hunk line '%s'\n"),
> + substr($_, 0, 1);
> + return ();
> + }
> + }
> + }
> + return @ret;
> +}
> +
>  sub parse_hunk_selection {
>   local $_;
>   my ($hunk, $line) = @_;
> @@ -1100,6 +1127,9 @@ sub parse_hunk_selection {
>   }
>   }
>   }
> + if ($max_label < 10) {
> + @fields = split_hunk_selection(@fields) or return undef;
> + }
>   for (@fields) {
>   if (/^([0-9]*)-([0-9]*)$/) {
>   if ($1 eq '' and $2 eq '') {
> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> index 89c0e73f2b..d3bce154da 100755
> --- a/t/t3701-add-interactive.sh
> +++ b/t/t3701-add-interactive.sh
> @@ -410,7 +410,7 @@ test_expect_success 'setup expected diff' '
>  '
>  
>  test_expect_success 'can reset individual lines of patch' '
> - printf "%s\n" l "^1 3" |
> + printf "%s\n" l ^13 |
>   EDITOR=: git reset -p 2>error &&
>   test_must_be_empty error &&
>   git diff --cached HEAD >actual &&
> 


Re: [PATCH v2 1/3] add -p: select individual hunk lines

2018-03-06 Thread Igor Djordjevic
On 06/03/2018 11:17, Phillip Wood wrote:
> From: Phillip Wood 
> 
> When I end up editing hunks it is almost always because I want to
> stage a subset of the lines in the hunk. Doing this by editing the
> hunk is inconvenient and error prone (especially so if the patch is
> going to be reversed before being applied). Instead offer an option
> for add -p to stage individual lines. When the user presses 'l' the
> hunk is redrawn with labels by the insertions and deletions and they
> are prompted to enter a list of the lines they wish to stage. Ranges
> of lines may be specified using 'a-b' where either 'a' or 'b' may be
> omitted to mean all lines from 'a' to the end of the hunk or all lines
> from 1 upto and including 'b'.
> 
> Signed-off-by: Phillip Wood 
> ---
>  Documentation/git-add.txt  |   7 +++
>  git-add--interactive.perl  | 143 
> +
>  t/t3701-add-interactive.sh |  65 +
>  3 files changed, 215 insertions(+)
> 
> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
> index d50fa339dc..ad33fda9a2 100644
> --- a/Documentation/git-add.txt
> +++ b/Documentation/git-add.txt
> @@ -332,10 +332,17 @@ patch::
> J - leave this hunk undecided, see next hunk
> k - leave this hunk undecided, see previous undecided hunk
> K - leave this hunk undecided, see previous hunk
> +   l - select hunk lines to use

Might be more surrounding context aligned to say "stage hunk lines" 
here (phrase "stage", instead of "select to use").

> s - split the current hunk into smaller hunks
> e - manually edit the current hunk
> ? - print help
>  +
> +If you press "l" then the hunk will be reprinted with each insertion
> +or deletion labelled with a number and you will be prompted to enter
> +which lines you wish to select. Individual line numbers should be

Likewise, s/you wish to select/you wish to stage/.

> +separated by a space or comma, to specify a range of lines use a dash
> +between them.
> ++
>  After deciding the fate for all hunks, if there is any hunk
>  that was chosen, the index is updated with the selected hunks.
>  +
> diff --git a/git-add--interactive.perl b/git-add--interactive.perl
> index f83e7450ad..a273b41e95 100755
> --- a/git-add--interactive.perl
> +++ b/git-add--interactive.perl
> @@ -1013,6 +1013,133 @@ sub color_diff {
>   } @_;
>  }
>  
> +sub label_hunk_lines {
> + local $_;
> + my $hunk = shift;
> + my $i = 0;
> + my $labels = [ map { /^[-+]/ ? ++$i : 0 } @{$hunk->{TEXT}} ];
> + if ($i > 1) {
> + @{$hunk}{qw(LABELS MAX_LABEL)} = ($labels, $i);
> + return 1;
> + }
> + return 0;
> +}
> +
> +sub select_hunk_lines {

This is just something I`ve spotted, but I have no actual idea if 
renaming this to "stage_hunk_lines" might be better, too, or not 
(depending on the surrounding code context), so please take this with 
a big grain of salt.

> + my ($hunk, $selected) = @_;
> + my ($text, $labels) = @{$hunk}{qw(TEXT LABELS)};
> + my ($i, $o_cnt, $n_cnt) = (0, 0, 0);
> + my ($push_eol, @newtext);
> + # Lines with this mode will become context lines if they are
> + # not selected
> + my $context_mode = $patch_mode_flavour{IS_REVERSE} ? '+' : '-';
> + for $i (1..$#{$text}) {
> + my $mode = substr($text->[$i], 0, 1);
> + if ($mode eq '\\') {
> + push @newtext, $text->[$i] if ($push_eol);
> + undef $push_eol;
> + } elsif ($labels->[$i] and $selected->[$labels->[$i]]) {
> + push @newtext, $text->[$i];
> + if ($mode eq '+') {
> + $n_cnt++;
> + } else {
> + $o_cnt++;
> + }
> + $push_eol = 1;
> + } elsif ($mode eq ' ' or $mode eq $context_mode) {
> + push @newtext, ' ' . substr($text->[$i], 1);
> + $o_cnt++; $n_cnt++;
> + $push_eol = 1;
> + } else {
> + undef $push_eol;
> + }
> + }
> + my ($o_ofs, $orig_o_cnt, $n_ofs, $orig_n_cnt) =
> + parse_hunk_header($text->[0]);
> + unshift @newtext, format_hunk_header($o_ofs, $o_cnt, $n_ofs, $n_cnt);
> + my $newhunk = {
> + TEXT => \@newtext,
> + DISPLAY => [ color_diff(@newtext) ],
> + OFS_DELTA => $orig_o_cnt - $orig_n_cnt - $o_cnt + $n_cnt,
> + TYPE => $hunk->{TYPE},
> + USE => 1,
> + };
> + # If this hunk has previously been edited add the offset delta
> + # of the old hunk to get the real delta from the original
> + # hunk.
> + if ($hunk->{OFS_DELTA}) {
> + $newhunk->{OFS_DELTA} += $hunk->{OFS_DELTA};
> + }
> + 

Re: [RFC PATCH] git-submodule.sh:cmd_update: if submodule branch exists, fetch that instead of default

2018-03-06 Thread Jonathan Nieder
(cc list snipped)
Hi,

Eddy Petrișor wrote:

> Cc: [a lot of people]

Can you say a little about how this cc list was created?  E.g. should
"git send-email" get a feature to warn about long cc lists?

> Signed-off-by: Eddy Petrișor 
> ---
>
> There are projects such as llvm/clang which use several repositories, and they
> might be forked for providing support for various features such as adding 
> Redox
> awareness to the toolchain. This typically means the superproject will use
> another branch than master, occasionally even use an old commit from that
> non-master branch.
>
> Combined with the fact that when incorporating such a hierachy of repositories
> usually the user is interested in just the exact commit specified in the
> submodule info, it follows that a desireable usecase is to be also able to
> provide '--depth 1' to avoid waiting for ages for the clone operation to
> finish.

Some previous discussion is at
https://public-inbox.org/git/CAGZ79ka6UXKyVLmdLg_M5-sB1x96g8FRzRZy=eny5ajbqf9...@mail.gmail.com/.

In theory this should be straightforward: Git protocol allows fetching
an arbitrary commit, so "git submodule update" and similar commands
could fetch the submodule commit by SHA-1 instead of by refname.  Poof!
Problem gone.

In practice, some complications:

 - some servers do not permit fetch-by-sha1.  For example, github does
   not permit it.  This is governed by the
   uploadpack.allowReachableSHA1InWant / uploadpack.allowAnySHA1InWant
   configuration items.

   That should be surmountable by making the behavior conditional, but
   it's a complication.

 - When the user passes --depth=, do they mean that to apply to
   the superproject, to the submodules, or both?  Documentation should
   make the behavior clear.

   Fortunately I believe this complication has been takencare of using
   the --shallow-submodules option.

> Git submodule seems to be very stubborn and cloning master, although the
> wrapper script and the gitmodules-helper could work together to clone directly
> the branch specified in the .gitmodules file, if specified.

This could make sense.  For the same reason as --depth in the
superproject gives ambiguous signals about what should happen in
submodules, --single-branch in the superproject gives ambiguous
signals about what branch to fetch in submodules.

> Another wrinkle is that when the commit is not the tip of the branch, the 
> depth
> parameter should somehow be stored in the .gitmodules info, but any change in
> the submodule will break the supermodule submodule depth info sooner or later,
> which is definitly frigile.

Hm, this seems to go in another direction.  I don't think we should
store the depth parameter in the .gitmodules file, since different
users are likely to have different preferences about what to make
shallow.  If we make --depth easy enough to use at the superproject
level then the user can specify what they want there.

> I tried digging into this section of the code and debugging with bashdb to see
> where --depth might fit, but I got stuck on the shell-to-helper interaction 
> and
> the details of the submodule implementation, so I want to lay out this first
> patch as starting point for the discussion in the hope somebody else picks it
> up or can provide some inputs. I have the feeling there are multiple code 
> paths
> that are being ran, depending on the moment (initial clone, submodule
> recursive, post-clone update etc.) and I have a gut feeling there shouldn't be
> any code duplication just because the operation is different.
>
> This first patch is only trying to use a non-master branch, I have some 
> changes
> for the --depth part, but I stopped working on it due to the "default depth"
> issue above.
>
> Does any of this sound reasonable?
> Is this patch idea usable or did I managed to touch the part of the code that
> should not be touched?

I agree with the goal.  As mentioned above, I'm not confident about
the particular mechanism --- e.g. something using fetch-by-sha1 seems
likely to be more intuitive.

Today, the 'branch' setting in .gitmodules is only for "git submodule
update --remote".  This patch would be a significant expansion in
scope for it.  Hopefully others on the list can talk more about how
that fits into various workflows and whether it would work out well.

Thanks and hope that helps,
Jonathan

>  git-submodule.sh | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 2491496..370f19e 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -589,8 +589,11 @@ cmd_update()
>   branch=$(git submodule--helper remote-branch "$sm_path")
>   if test -z "$nofetch"
>   then
> + # non-default branch
> + rbranch=$(git config -f .gitmodules 
> submodule.$sm_path.branch)
> + 
> 

Re: [PATCH v2 0/3] add -p: select individual hunk lines

2018-03-06 Thread Igor Djordjevic
Hi Phillip,

On 06/03/2018 11:17, Phillip Wood wrote:
> 
> From: Phillip Wood 
> 
> I've added some documentation to git-add.txt for the new selection
> mode and cleaned up some style issues, otherwise these are unchanged
> since v1.  These patches build on top of the recount fixes in [1]. The
> commit message for the first patch describes the motivation:
> 
> "When I end up editing hunks it is almost always because I want to
> stage a subset of the lines in the hunk. Doing this by editing the
> hunk is inconvenient and error prone (especially so if the patch is
> going to be reversed before being applied). Instead offer an option
> for add -p to stage individual lines. When the user presses 'l' the
> hunk is redrawn with labels by the insertions and deletions and they
> are prompted to enter a list of the lines they wish to stage. Ranges
> of lines may be specified using 'a-b' where either 'a' or 'b' may be
> omitted to mean all lines from 'a' to the end of the hunk or all lines
> from 1 upto and including 'b'."
> 
> [1] 
> https://public-inbox.org/git/xmqqbmg29x1n@gitster-ct.c.googlers.com/T/#m01d0f1af90f32b698e583b56f8e53b986bcec7c6

Nice, thank you :)

A small nitpick - I see you use phrasing like "select lines", where 
the other commands usually talk about "staging", instead, so "stage 
lines" might be more aligned with the existing text.

I`ll quickly go through the patches regarding this (not being of much 
help for the code itself at the moment, sorry!).

Regards, Buga


Re: [PATCH v2 2/3] add -p: allow line selection to be inverted

2018-03-06 Thread Junio C Hamano
Phillip Wood  writes:

> From: Phillip Wood 
>
> If the list of lines to be selected begins with '^' select all the
> lines except the ones listed.

There is "# Input that begins with '-'; unchoose" in list_and_choose
helper.  Does it make things inconsistent to use '^' for negation
like this patch does with it?


Re: [GSoC][PATCH v3] Make options that expect object ids less chatty if id is invalid

2018-03-06 Thread Paul-Sebastian Ungureanu
Hello Junio,

Thank you for reviewing my code. I appreciate it. I made the changes
here [1].

On Mon, 2018-03-05 at 16:19 -0800, Junio C Hamano wrote:
> Taking these together, I _think_ this patch is moving things in the
> right direction, in that it allows callers of parse_options_step()
> to tell "user knew what option s/he wanted, and sufficient error
> message has already be given" and "user gave us a nonsense option
> and would be helped by usage text" cases apart by introducing a new
> value PARSE_OPT_ERROR, but in order to be able to correctly give
> PARSE_OPT_ERROR back to the caller, parse_long_opt() and
> parse_short_opt() (possibly, but I didn't check) would need a bit of
> tweak to help their callers in this function.

I am not sure I got this right, but I believe this is already done.

- If an error occurs during value parsing in parse_short_opt or
parse_long_opt then -1 is returned and parse_option_step returns
PARSE_OPT_ERROR.

- If parse_short_opt and parse_long_opt encounter an unknown option
then -2 is returned and parse_option_step returns PARSE_OPT_UNKNOWN
(but only if PARSE_OPT_KEEP_UNKNOWN is not specified).

- If usage is shown by calling usage_with_options_internal then
PARSE_OPT_HELP is going to be forwarded and also returned by
parse_options_step.

What I also changed in the new patch [1] is to make parse_long_opt
return -3 when ambiguous option are found, in which case
parse_options_step will handle this return value by showing usage and
returning PARSE_OPT_HELP.

Please correct me if I am wrong.

[1] https://public-inbox.org/git/20180306193116.23876-1-ungureanupaulse
bast...@gmail.com/T/#u

Best regards,
Paul Ungureanu



Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)

2018-03-06 Thread Igor Djordjevic
On 06/03/2018 19:12, Johannes Schindelin wrote:
> 
> > > And I guess being consistent is pretty important, too - if you add new
> > > content during merge rebase, it should always show up in the merge,
> > > period. 
> >
> > Yes, that should make it easy for the user to know what to expect from
> > rebase.
> 
> Indeed. We have seen time and time again that consistent behavior is the
> only thing that lets us adhere to the Law of Least Surprise.
> 
> And here lies the rub: do we really want to let `merge -C ` behave
> completely differently than `merge`? Granted, in one case we provide a
> template merge commit, in the other case, we do not. And the idea is
> already to behave differently, although that difference only extends to
> the commit message so far.
> 
> But given the benefit (i.e. that the strategy to transform the original
> merge commit into the new merge commit), I am willing to run that risk,
> especially since I foresee only few users wanting to create new merge
> commits from scratch using the `merge` todo command.
> 
> Of course, even then we need to be careful: the user might have
> *changed* or *moved* the original `merge` command. For example, if the
> merge command read:
> 
>   merge -C deadbee cafecafe bedbedbed
> 
> and the user switched the order of the merged branches into
> 
>   merge -C deadbee bedbedbed cafecafe
> 
> we would have to detect the changed order of the arguments so that we
> could still find the original branch tips.
> 
> But the user might also have changed the branch(es) to merge completely,
> in which case we might not even be able to find original branch tips.
> 
> My preferred solution would be to let the `merge` command figure out
> whether the passed arguments correspond to the rewritten versions of the
> original merge parents. And only in that case would we use the fancy
> strategy, in all other cases we would fall back to performing a regular
> recursive (or octopus) merge.
> 
> How does that sound?
> 
> It will be slightly inconsistent. But in a defendable way, I think.

I like where this discussion is heading, and here`s what I thought 
about it :)

First, starting from non-interactive rebase, I guess we may now agree 
that _rebasing_ merges is an actually expected behavior, not recreating 
them (thus keeping manual conflict resolutions and amendments, not 
losing them).

Now, interactive rebase is a totally different story, we already said 
user can change pretty much about everything, making merge 
_recreation_ to be a more sane choice, but let`s leave this other 
extreme for a brief moment.

In the least interesting situation, though, user could just review 
and close todo list, without changing anything - and in that case it 
would be important, consistency wise, to behave exactly like in case 
of non-interactive rebase, meaning still rebasing merges, not 
recreating them.

Ok, so that still aligns with what`s written so far - we need to be 
able to rebase merges interactively, too (not just recreate them), to 
stay consistent in less complex interactive rebases.

But, what if user really wants to _recreate_ merges, for whatever 
reason? Come on, this is interactive rebase we`re talking about, why 
being restrictive? :)

Here`s a twist - not letting `merge` trying to be too smart by 
figuring out whether passed arguments correspond to rewritten 
versions of the original merge parents (which would be too 
restrictive, too, I`m afraid), but just be explicit about it, instead!

So, it could be something like:

merge -C deadbee 123abc:cafecafe 234bcd:bedbedbed


The format is still something to think about, but the point is rather 
simple - explicitly map old and new merge parents, showing this 
inside todo list by default.

This makes it much easier for later processing (and correct, no need 
to guess which one goes where), but also gives more power to the 
user, being able to decide which merge parents get "rebased", and 
which ones should go into the merge just like "new".

So if a user gets an interactive todo list like that and just closes 
it, we still have exact situation like non-interactive rebase (and no 
guessing on implementation side).

But, user might still decide to introduce new merge parents into the 
mix, even, where we could then just be merging those (as there is no 
old merge parent to actually rebase from):

merge -C deadbee 123abc:cafecafe 234bcd:bedbedbed new-branch

Here, "new-branch" is something new, introduced inside interactive 
rebase, and it will be just merged into the other two (which are 
still being rebased).

Also, another example - if original merge parent "123abc" was merged 
from the other side using `-s ours` strategy, that means all the 
content this branch originally had will still be missing from the 
rebased merge (expect for what`s been cherry-picked elsewhere).

But, I would argue it`s quite legit to want to revise that decision, 
and let that content in this time. To make that 

[GSoC][PATCH v4] Make options that expect object ids less chatty if id is invalid

2018-03-06 Thread Paul-Sebastian Ungureanu
Usually, the usage should be shown only if the user does not know what
options are available. If the user specifies an invalid value, the user
is already aware of the available options. In this case, there is no
point in displaying the usage anymore.

This patch applies to "git tag --contains", "git branch --contains",
"git branch --points-at", "git for-each-ref --contains" and many more.

Signed-off-by: Paul-Sebastian Ungureanu 
---
 builtin/blame.c   |  1 +
 builtin/shortlog.c|  1 +
 builtin/update-index.c|  1 +
 parse-options.c   | 20 
 parse-options.h   |  1 +
 t/t0040-parse-options.sh  |  2 +-
 t/t3404-rebase-interactive.sh |  6 +--
 t/tcontains.sh| 92 +++
 8 files changed, 110 insertions(+), 14 deletions(-)
 create mode 100755 t/tcontains.sh

diff --git a/builtin/blame.c b/builtin/blame.c
index 9dcb367b9..a5fbec8fb 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -728,6 +728,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
PARSE_OPT_KEEP_DASHDASH | PARSE_OPT_KEEP_ARGV0);
for (;;) {
switch (parse_options_step(, options, blame_opt_usage)) {
+   case PARSE_OPT_ERROR:
case PARSE_OPT_HELP:
exit(129);
case PARSE_OPT_DONE:
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index e29875b84..0789e2eea 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -282,6 +282,7 @@ int cmd_shortlog(int argc, const char **argv, const char 
*prefix)
 
for (;;) {
switch (parse_options_step(, options, shortlog_usage)) {
+   case PARSE_OPT_ERROR:
case PARSE_OPT_HELP:
exit(129);
case PARSE_OPT_DONE:
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 58d1c2d28..34adf55a7 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1059,6 +1059,7 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
break;
switch (parseopt_state) {
case PARSE_OPT_HELP:
+   case PARSE_OPT_ERROR:
exit(129);
case PARSE_OPT_NON_OPTION:
case PARSE_OPT_DONE:
diff --git a/parse-options.c b/parse-options.c
index d02eb8b01..47c09a82b 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -317,14 +317,16 @@ static int parse_long_opt(struct parse_opt_ctx_t *p, 
const char *arg,
return get_value(p, options, all_opts, flags ^ opt_flags);
}
 
-   if (ambiguous_option)
-   return error("Ambiguous option: %s "
+   if (ambiguous_option) {
+   error("Ambiguous option: %s "
"(could be --%s%s or --%s%s)",
arg,
(ambiguous_flags & OPT_UNSET) ?  "no-" : "",
ambiguous_option->long_name,
(abbrev_flags & OPT_UNSET) ?  "no-" : "",
abbrev_option->long_name);
+   return -3;
+   }
if (abbrev_option)
return get_value(p, abbrev_option, all_opts, abbrev_flags);
return -2;
@@ -434,7 +436,6 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
   const char * const usagestr[])
 {
int internal_help = !(ctx->flags & PARSE_OPT_NO_INTERNAL_HELP);
-   int err = 0;
 
/* we must reset ->opt, unknown short option leave it dangling */
ctx->opt = NULL;
@@ -459,7 +460,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
ctx->opt = arg + 1;
switch (parse_short_opt(ctx, options)) {
case -1:
-   goto show_usage_error;
+   return PARSE_OPT_ERROR;
case -2:
if (ctx->opt)
check_typos(arg + 1, options);
@@ -472,7 +473,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
while (ctx->opt) {
switch (parse_short_opt(ctx, options)) {
case -1:
-   goto show_usage_error;
+   return PARSE_OPT_ERROR;
case -2:
if (internal_help && *ctx->opt == 'h')
goto show_usage;
@@ -504,9 +505,11 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
goto show_usage;
switch (parse_long_opt(ctx, arg + 2, options)) {
case -1:
-   goto show_usage_error;
+   return PARSE_OPT_ERROR;

Re: [PATCH] xdiff: improve trimming preprocessing

2018-03-06 Thread Junio C Hamano
Eric Sunshine  writes:

> On Tue, Mar 6, 2018 at 6:53 AM, Jun Wu  wrote:
>> xdiff-interface trims common suffix if ctxlen is 0. Teach it to also
>> trim common prefix, and trim less lines if ctxlen > 0. So it can benefit
>> the default diff command, as seen by profiling: [...]

I vaguely recall that we have tried this in the distant past, found
that it produced incorrect result, and that is why we limit the
optimization for no-context case.

Does anybody have an archive reference?


Bug: moving submodules that have submodules inside them causes a fatal error in git status

2018-03-06 Thread Sean Behan
Hello,

I encountered this error when moving some submodules in vim, basically if you
have a submodule that has submodules inside it and you try to move it you'll
encouter a fatal error with `git status`. I have a pastebin example of this
here: https://ptpb.pw/5g9-/bash

Appologies if this is an invalid bug report.

I'm running Git version: 2.16.2 (Debian Unstable)

--

Sean Behan


signature.asc
Description: PGP signature


Re: [PATCH] git-gui: Add hotkeys to change focus between ui widgets

2018-03-06 Thread Junio C Hamano
Birger Skogeng Pedersen  writes:

> Thanks for the feedback.
>
> On Mon, Mar 5, 2018 at 5:55 PM, Johannes Schindelin
>  wrote:
>> Do you think there is a way to focus on the last-selected path? That would
>> make this feature even more convenient, I think.
>
> Yes, good idea. I'll implement it and create a second version.

Please make it a patch against the main git-gui project, not against
our tree.  That is, your patch text would look like this:

diff --git a/git-gui.sh b/git-gui.sh
index 5bc21b878d..39e80ebafa 100755
--- a/git-gui.sh
+++ b/git-gui.sh
@@ -3843,6 +3843,7 @@ bind .   <$M1B-Key-equal> {show_more_context;break}
 bind .   <$M1B-Key-plus> {show_more_context;break}
 bind .   <$M1B-Key-KP_Add> {show_more_context;break}
 bind .   <$M1B-Key-Return> do_commit
+bind .   <$M1B-Key-KP_Enter> do_commit
 foreach i [list $ui_index $ui_workdir] {
bind $i{ toggle_or_diff click %W %x %y; break }
bind $i <$M1B-Button-1>  { add_one_to_selection %W %x %y; break }

We've seen three patches to git-gui from three different people in
the past week.  The project seems to be abandoned and we need to
find a volunteer (or a few) to take it over, it seems.  In the
meantime I have blindly been picking and queuing git-gui changes
but because I am not even a casual user of it, I know I will not do
a good job maintaining it in the longer term.


Re: [PATCH] xdiff: improve trimming preprocessing

2018-03-06 Thread Eric Sunshine
On Tue, Mar 6, 2018 at 6:53 AM, Jun Wu  wrote:
> xdiff-interface trims common suffix if ctxlen is 0. Teach it to also
> trim common prefix, and trim less lines if ctxlen > 0. So it can benefit
> the default diff command, as seen by profiling: [...]

A few comments (below) based upon a quick scan of the patch...

> Signed-off-by: Jun Wu 
> ---
> diff --git a/t/t4066-diff-trimming.sh b/t/t4066-diff-trimming.sh
> @@ -0,0 +1,49 @@
> +test_expect_success 'setup' '
> +  printf "x\n%.0s" {1..1000} >a &&
> +  printf "x\n%.0s" {1..1001} >b &&
> +  cat >c >c &&\
> +  printf "x%.0s" {1..934} >>d # pad common suffix to 1024 bytes

The expression {x..y} is not portable to non-POSIX shells.

> +test_expect_success 'git diff -U0 shifts hunk towards the end' '
> +   test_expect_code 1 git diff -U0 --no-index a b |\
> +fgrep "@@ -1000,0 +1001 @@" &&
> +   test_expect_code 1 git diff -U0 --no-index b a |\
> +fgrep "@@ -1001 +1000,0 @@"
> +'

Style: Mix of tabs and spaces for indentation. Please indent only with
tabs throughout the patch.

> diff --git a/xdiff-interface.c b/xdiff-interface.c
> @@ -101,42 +101,64 @@ static int xdiff_outf(void *priv_, mmbuffer_t *mb, int 
> nbuf)
> +static void trim_common(mmfile_t *a, mmfile_t *b, long reserved,
> +   xdemitconf_t *xecfg)
>  {
> +   /* prefix - need line count for xecfg->ptrimmed */
> +   for (i = 0; ++i < smaller && *ap == *bp;) {
> +   lines += (*ap == '\n');
> +   ap++, bp++;

Is there a good reason for not placing 'ap++, bp++' directly in the 'for'?

for (i = 0; ++i < smaller && *ap == *bp; ap++, bp++)


Re: [PATCH v2 3/5] gc --auto: exclude base pack if not enough mem to "repack -ad"

2018-03-06 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> On linux-2.6.git, valgrind massif reports 1.6GB heap in "pack all"
> case, and 535MB [1] in "pack all except the base pack" case. We save
> roughly 1GB memory by excluding the base pack.

;-)

> gc --auto decides to do this based on an estimation of pack-objects
> memory usage, which is quite accurate at least for the heap part, and
> whether that fits in half of system memory (the assumption here is for
> desktop environment where there are many other applications running).

I was still confused by "decides to do this..." after reading the
above three times.  If this is describing the state _with_ this
patch applied, then "Teach 'gc --auto' to do this automatically..."
would make it clear that is what is going on.

> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 571b5a7e3c..35ad420d5c 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -59,6 +59,11 @@ then existing packs (except those marked with a `.keep` 
> file)
>  are consolidated into a single pack by using the `-A` option of
>  'git repack'. Setting `gc.autoPackLimit` to 0 disables
>  automatic consolidation of packs.
> ++
> +If the physical amount of memory is considered not enough for `git
> +repack` to run smoothly, `--keep-base-pack` is enabled. This could be
> +overridden by setting `gc.bigBasePackThreshold` which only enables
> +`--keep-base-pack` when the base pack is larger the specified limit.

I somehow find the flow of logic in these two sentences harder to
follow than necessary.  Perhaps swapping the order may make it
easier to grok?  That is:

 - When gc.bigBasePackThreshold is set, packs larger than that will
   automatically be kept (i.e. not considered for repacking);

 - When it is not set, we try to guess how memory constrained we are,
   and behave as if the threshold were set to the size of the
   largest packfile we have (i.e. that single pack is kept).

I think another and bigger reason why I found the original hard to
read is because it is described for those who already understand
what "--keep-base-pack" option does.  Rewriting it not to require
the pre-existing knowledge of that option would make it a lot easier
to grok, I would think (you may not realize it because you wrote the
feature and are very familiar with it, though).

> +--keep-base-pack::
> + All packs except the base pack are consolidated into a single
> + pack. The largest pack is considered the base pack.

This makes it sound as if packs with .keep are also repacked unless
they meet the threshold for "base pack".  Is that what you actually
implemented?

In order to do so, [2/5] needs to allow the "--keep-pack" option
override the on-disk .keep files.  In an earlier message, I wondered
if such an arrangement is useful in some use cases; I think it is,
and because those who do want to see the on-disk .keep files honored
can collect and include them in the set of packs to be kept via
"--keep-pack" (after all this is an option for low-level scripting
anyway).

> +Set environment variable `GIT_TRACE` in order to see the memory usage
> +estimation in `git gc --auto` that determines whether the base pack is
> +kept.

This is somewhat a puzzling use of trace.  It sounds more like a way
to find out "how" memory usage estimation is done and arriving at a
wrong value for those who want to debug the feature.

> +static unsigned long big_base_pack_threshold;
> +static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;

A new symbol, which is a good addition.

> +static struct packed_git *find_the_base_pack(void)
> +{
> + struct packed_git *p, *base = NULL;
> +
> + prepare_packed_git();
> +
> + for (p = packed_git; p; p = p->next) {
> + if (p->pack_local &&
> + (!base || base->pack_size < p->pack_size))
> + base = p;
> + }
> +
> + return base;
> +}

This is finding the largest pack.

> @@ -187,7 +211,101 @@ static int too_many_packs(void)
>   return gc_auto_pack_limit < cnt;
>  }
>  
> -static void add_repack_all_option(void)
> +static inline unsigned long total_ram(void)

"inline"?  We'd rather have compiler figure it out, no?

> +{
> + unsigned long default_ram = 4;

4 what?  4 bytes?  Name it perhaps "default_ram_gb" or something?

> +#ifdef HAVE_SYSINFO
> + struct sysinfo si;
> +
> + if (!sysinfo())
> + return si.totalram;
> +#elif defined(HAVE_BSD_SYSCTL) && defined(HW_MEMSIZE)
> + int64_t physical_memory;
> + int mib[2];
> + size_t length;
> +
> + mib[0] = CTL_HW;
> + mib[1] = HW_MEMSIZE;
> + length = sizeof(int64_t);
> + if (!sysctl(mib, 2, _memory, , NULL, 0))
> + return physical_memory;
> +#elif defined(GIT_WINDOWS_NATIVE)
> + MEMORYSTATUSEX memInfo;
> +
> + memInfo.dwLength = sizeof(MEMORYSTATUSEX);
> + if (GlobalMemoryStatusEx())
> + return memInfo;ullTotalPhys;

Is this legal C in 

Re: [PATCH v2 2/5] repack: add --keep-pack option

2018-03-06 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> +--keep-pack=::
> + Ignore the given pack. This is the equivalent of having
> + `.keep` file on the pack. Implies `--honor-pack-keep`.
> +

A few questions I am not sure how I would answer:

 - Do we want to have this listed in the SYNOPSIS section, too?

 - We would want to make the SP in "" consistent with
   the dash in "" in the same document; which way do
   we make it uniform?

 - Is this description clear enough to convey that we allow more
   than one instance of this option specified, and the pack names
   accumulate?

 - Are there use cases where we want to _ignore_ on-disk ".keep" and
   only honor the ones given via the "--keep-pack" options?

 - Is this description clear enough to convey that  is
   just the filename part (i.e. "pack-[0-9a-f]{40}.pack") in our
   local $GIT_OBJECT_DIRECTORY/pack/ and not a full path to the
   packfile?  I think that design is sensible, simplifies the
   implementation and reduces mistakes.

> +static void add_extra_kept_packs(const struct string_list *names)
> +{
> + struct packed_git *p;
> +
> + if (!names->nr)
> + return;
> +
> + prepare_packed_git();
> + for (p = packed_git; p; p = p->next) {
> + const char *name = basename(p->pack_name);
> + int i;
> +
> + if (!p->pack_local)
> + continue;
> +
> + for (i = 0; i < names->nr; i++) {
> + if (fspathcmp(name, names->items[i].string))
> + continue;
> +
> + p->pack_keep = 1;
> + ignore_packed_keep = 1;
> + break;
> + }
> + }
> +}

OK.

> diff --git a/builtin/repack.c b/builtin/repack.c
> index 7bdb40142f..6a1dade0e1 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -86,7 +86,8 @@ static void remove_pack_on_signal(int signo)
>   * have a corresponding .keep or .promisor file. These packs are not to
>   * be kept if we are going to pack everything into one file.
>   */
> -static void get_non_kept_pack_filenames(struct string_list *fname_list)
> +static void get_non_kept_pack_filenames(struct string_list *fname_list,
> + const struct string_list *extra_keep)
>  {
>   DIR *dir;
>   struct dirent *e;
> @@ -97,6 +98,14 @@ static void get_non_kept_pack_filenames(struct string_list 
> *fname_list)
>  
>   while ((e = readdir(dir)) != NULL) {
>   size_t len;
> + int i;
> +
> + for (i = 0;i < extra_keep->nr; i++)

Style: SP after ';' before 'i'.

> + if (!fspathcmp(e->d_name, extra_keep->items[i].string))
> + break;
> + if (extra_keep->nr > 0 && i < extra_keep->nr)
> + continue;
> +
>   if (!strip_suffix(e->d_name, ".pack", ))
>   continue;

> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 38247afbec..553d907d34 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -196,5 +196,24 @@ test_expect_success 'objects made unreachable by grafts 
> only are kept' '
>   git cat-file -t $H1
>  '
>  
> +test_expect_success 'repack --keep-pack' '
> + test_create_repo keep-pack &&
> + (
> + cd keep-pack &&
> + for cmit in one two three four; do
> + test_commit $cmit &&
> + git repack -d
> + done &&

Style: replace "; " before do with LF followed by a few HT.

This 'for' loop would not exit and report error if an early
test_commit or "git repack -d" fails, would it?

> + ( cd .git/objects/pack && ls *.pack ) >pack-list &&
> + test_line_count = 4 pack-list &&
> + KEEP1=`head -n1 pack-list` &&
> + KEEP4=`tail -n1 pack-list` &&

Style: $()

> + git repack -a -d --keep-pack $KEEP1 --keep-pack $KEEP4 &&
> + ls .git/objects/pack/*.pack >new-counts &&
> + test_line_count = 3 new-counts &&
> + git fsck

One important invariant for this new feature is that $KEEP1 and
$KEEP4 will both appear in new-counts file, no?  Rename new-counts
to new-pack-list and inspect the contents, not just line count,
perhaps?

> + )
> +'
> +
>  test_done


Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)

2018-03-06 Thread Johannes Schindelin
Hi Phillip & Buga,

On Tue, 6 Mar 2018, Phillip Wood wrote:

> On 02/03/18 23:33, Igor Djordjevic wrote:
> > 
> > [...]
> > Otherwise, I would be interested to learn how context/semantics 
> > guessing could provide a better default action (without introducing 
> > more complexity for might not be much benefit, if any).
> 
> I don't think its possible to guess the semantics of the original merge
> as users can use custom merge strategies and amend the result. It would
> be possible to detect and unamended '-s ours' merge but special casing
> that may end up causing users more confusion rather than helping them.

FWIW I agree.

My original plan was to always merge recursively and suggest to use `exec`
commands if anything else is needed.

But now with that excellent new idea to perform successive three-way
merges of the original merge commit with the new tips, using the old tips
as merge base, I am considering to change that.

There is a big problem here, though: consistency. See below for more
musings about that.

> > And I guess being consistent is pretty important, too - if you add new
> > content during merge rebase, it should always show up in the merge,
> > period. 
> 
> Yes, that should make it easy for the user to know what to expect from
> rebase.

Indeed. We have seen time and time again that consistent behavior is the
only thing that lets us adhere to the Law of Least Surprise.

And here lies the rub: do we really want to let `merge -C ` behave
completely differently than `merge`? Granted, in one case we provide a
template merge commit, in the other case, we do not. And the idea is
already to behave differently, although that difference only extends to
the commit message so far.

But given the benefit (i.e. that the strategy to transform the original
merge commit into the new merge commit), I am willing to run that risk,
especially since I foresee only few users wanting to create new merge
commits from scratch using the `merge` todo command.

Of course, even then we need to be careful: the user might have
*changed* or *moved* the original `merge` command. For example, if the
merge command read:

merge -C deadbee cafecafe bedbedbed

and the user switched the order of the merged branches into

merge -C deadbee bedbedbed cafecafe

we would have to detect the changed order of the arguments so that we
could still find the original branch tips.

But the user might also have changed the branch(es) to merge completely,
in which case we might not even be able to find original branch tips.

My preferred solution would be to let the `merge` command figure out
whether the passed arguments correspond to the rewritten versions of the
original merge parents. And only in that case would we use the fancy
strategy, in all other cases we would fall back to performing a regular
recursive (or octopus) merge.

How does that sound?

It will be slightly inconsistent. But in a defendable way, I think.

Ciao,
Dscho


Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Johannes Schindelin
Hi Phillip,

On Tue, 6 Mar 2018, Phillip Wood wrote:

> On 03/03/18 00:29, Igor Djordjevic wrote:
> > 
> > On 02/03/2018 12:31, Phillip Wood wrote:
> >>
> >>> Thinking about it overnight, I now suspect that original proposal
> >>> had a mistake in the final merge step. I think that what you did is
> >>> a way to fix it, and I want to try to figure what exactly was wrong
> >>> in the original proposal and to find simpler way of doing it right.
> >>>
> >>> The likely solution is to use original UM as a merge-base for final
> >>> 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty
> >>> natural though, as that's exactly UM from which both U1' and U2'
> >>> have diverged due to rebasing and other history editing.
> >>
> >> Hi Sergey, I've been following this discussion from the sidelines,
> >> though I haven't had time to study all the posts in this thread in
> >> detail. I wonder if it would be helpful to think of rebasing a merge
> >> as merging the changes in the parents due to the rebase back into the
> >> original merge. So for a merge M with parents A B C that are rebased
> >> to A' B' C' the rebased merge M' would be constructed by (ignoring
> >> shell quoting issues)
> >>
> >> git checkout --detach M
> >> git merge-recursive A -- M A'
> >> tree=$(git write-tree)
> >> git merge-recursive B -- $tree B'
> >> tree=$(git write-tree)
> >> git merge-recursive C -- $tree C'
> >> tree=$(git write-tree)
> >> M'=$(git log --pretty=%B -1 M | git commit-tree -pA' -pB' -pC')
> >>
> >> This should pull in all the changes from the parents while preserving
> >> any evil conflict resolution in the original merge. It superficially
> >> reminds me of incremental merging [1] but it's so long since I looked at
> >> that I'm not sure if there are any significant similarities.
> >>
> >> [1] https://github.com/mhagger/git-imerge
> > 
> > Interesting, from quick test[3], this seems to produce the same 
> > result as that other test I previously provided[2], where temporary 
> > commits U1' and U2' are finally merged with original M as a base :)
> > 
> > Just that this looks like even more straight-forward approach...?
> > 
> > The only thing I wonder of here is how would we check if the 
> > "rebased" merge M' was "clean", or should we stop for user amendment? 
> > With that other approach Sergey described, we have U1'==U2' to test with.
> 
> I think (though I haven't rigorously proved to myself) that in the
> absence of conflicts this scheme has well defined semantics (the merges
> can be commuted), so the result should be predicable from the users
> point of view so maybe it could just offer an option to stop.

I am not so sure that the result is independent of the order of the
merges. In other words, I am not necessarily certain that it is impossible
to concoct A,A',B,B' commits where merging B'/B before A'/A has a
different result than merging A'/A before B'/B.

Remember, when constructing counter-examples to hypotheses, those
counter-examples do not really *have* to make sense on their own. For
example, A' could introduce *completely different* changes from A, and the
same is true for B' and B.

I could imagine, for example, that using a ton of consecutive empty lines,
and using patches that insert something into these empty lines (and are
thusly inherently ambiguous when said set of empty lines has changed),
could even introduce a merge conflict in one order, but no conflict in the
other.

Even so, I think that merging in the order of the parents makes the most
sense, and that using that strategy makes sense, too, because you really
have to try hard to make it fail.

Ciao,
Dscho


Re: [PATCH v2 1/5] fixup! Add a test showing that 'git repack' throws away grafted-away parents

2018-03-06 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> The closing quote of a test body by convention is always at the start
> of line.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---

Obviously correct.  While the title may technically be correct, the
original commit is age old that we won't be amending it anyway, and
it is not very helpful not to say which aspect of the original is
being fixed up.  Perhaps something like:

t7700: have closing quote of a test at the beginning of line

1ec64827 ("Add a test showing that 'git repack' throws away
grafted-away parents", 2009-07-23) added this test but indented
the closing quote by mistake.

if we really cared documenting where the blame lies, but I do not
think it is quite worth it; anybody who cares deeply can ask "git
blame" about it, so I'll just retitle and use your original log
message body.

Thanks.

>  t/t7700-repack.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 6061a04147..38247afbec 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -194,7 +194,7 @@ test_expect_success 'objects made unreachable by grafts 
> only are kept' '
>   git reflog expire --expire=$test_tick --expire-unreachable=$test_tick 
> --all &&
>   git repack -a -d &&
>   git cat-file -t $H1
> - '
> +'
>  
>  test_done


Re: [PATCH v2 0/5] Avoid expensive 'repack -ad' in gc --auto

2018-03-06 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> I'm pretty happy with this now. Like v1, this is about not touching
> the giant base pack when doing background gc. This saves about 2/3 of
> memory, which in turn should improve performance if you're under
> memory pressure.

Thanks.  I've quickly scanned them and the overall idea looks quite
sound.  I'll comment on individual changes if needed.

Will queue.



Re: [PATCH v9 6/8] convert: check for detectable errors in UTF encodings

2018-03-06 Thread Lars Schneider

> On 06 Mar 2018, at 02:23, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>>> On 05 Mar 2018, at 22:50, Junio C Hamano  wrote:
>>> 
>>> lars.schnei...@autodesk.com writes:
>>> 
 +static int validate_encoding(const char *path, const char *enc,
 +const char *data, size_t len, int die_on_error)
 +{
 +  if (!memcmp("UTF-", enc, 4)) {
>>> 
>>> Does the caller already know that enc is sufficiently long that
>>> using memcmp is safe?
>> 
>> No :-(
>> 
>> Would you be willing to squash that in?
>> 
>>if (strlen(enc) > 4 && !memcmp("UTF-", enc, 4)) {
>> 
>> I deliberately used "> 4" as plain "UTF-" is not even valid.
> 
> I'd rather not.  The code does not have to even look at 6th and
> later bytes in the enc[] even if it wanted to reject "UTF-" followed
> by nothing, but use of strlen() forces it to look at everything.
> 
> Stepping back, shouldn't
> 
>   if (starts_with(enc, "UTF-") 
> 
> be sufficient?  If you really care about the case where "UTF-" alone
> comes here, you could write
> 
>   if (starts_with(enc, "UTF-") && enc[4])
> 
> but I do not think "&& enc[4]" is even needed.  The functions called
> from this block would not consider "UTF-" alone as something valid
> anyway, so with that "&& enf[4]" we would be piling more code only
> for invalid/rare case.

Agreed, "if (starts_with(enc, "UTF-"))" is sufficient. Can you squash
that in? Thanks for pointing me to starts_with() as I forgot about this
function!

- Lars


Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Junio C Hamano
Phillip Wood  writes:

> I wonder if just having a predicable result rather than forcing the
> rebase to stop if the user just squashes a fixup commit into a topic
> branch that is the parent of a merge might be more convenient in practice.

Unless I am misunderstanding what you are saying, that is pretty
much what I have automated for my daily rebuild of the 'pu' branch

Non-textual semantic conflicts are made (in the best case just once)
as a separate commit on top of mechanical auto-merge whose focus is
predictability (rather than cleverness) done by Git, and then that
separate commit is kept outside the history.  When replaying these
merges to rebuild the 'pu' branch, after resetting the tip to
'master', each topic is merged mechanically, and if such a fix-up
commit is present, "cherry-pick --no-commit" applies it and then
"commit --amend --no-edit" to adjust the merge.  I find it quite
valuable to have a separate record of what "evil" non-mechanical
adjustment was done, which I know won't be lost in the noise when
these merges need to be redone daily or more often.

The Appendix in Documentation/howto/maintain-git.txt talks about
this process.  You can see what topics have such merge-fix defined
by peeking https://github.com/gitster/git/ repository.



Improve `git log -L` functionality

2018-03-06 Thread KES
Hi.
 I want to `Trace the evolution of the line range`. 
And not committed change is sort of evolution and should be taken into account 
by -L option. 

Currently I MUST `stash save` change, 
look actual line number, 
trace evolution, 
`stash pop` to bring back current change. 

EXPECTED:
Allow to use those line numbers which I see in my editor
without excess `stash save/stash pop` commands

If file has not committed change then this change maybe shown by `-L` as commit 
NOT COMMITTED YET
If file staged 'commit STAGED'

More description what is comming on:
https://stackoverflow.com/q/49130112/4632019


How to use filter-branch with --state-branch?

2018-03-06 Thread Michele Locati
Recent versions of git filter-branch command introduced the --state-branch
option.
BTW I can't find any info about how this can be actually used.

We have this repository on github:
https://github.com/concrete5/concrete5

When someone pushes to that repo, we clone it and execute
`git filter-branch --subdirectory-filter concrete`
to extract the concrete directory, and we push the result to
https://github.com/concrete5/concrete5-core
(including all the branches and tags)

The script at the moment is this one:
https://github.com/concrete5/core_splitter/blob/70879e676b95160f7fc5d0ffc22b8f7420b0580b/bin/splitcore

I tried to use the --state-branch option on a local mirror, so that we could
do an incremental filtering. Here's the script:

# Executed just one time
git clone --no-checkout --mirror \
   https://github.com/concrete5/concrete5.git work
cd work
git filter-branch \
   --subdirectory-filter concrete \
   --tag-name-filter cat \
   --prune-empty \
   --state-branch FILTERBRANCH_STATE \
   -- --all
# Executed every time the repo is updated
git remote update --prune
git filter-branch \
   --subdirectory-filter concrete \
   --tag-name-filter cat \
   --prune-empty \
   --state-branch FILTERBRANCH_STATE \
   -- --all

The first filter-branch call required 7168 steps, so did the second call...
I also tried without the --prune option of remote update (I had to add
--force to the second filter-branch), but nothing changed.


Re: [PATCH] git-gui: Add hotkeys to change focus between ui widgets

2018-03-06 Thread Birger Skogeng Pedersen
Thanks for the feedback.

On Mon, Mar 5, 2018 at 5:55 PM, Johannes Schindelin
 wrote:
> Do you think there is a way to focus on the last-selected path? That would
> make this feature even more convenient, I think.

Yes, good idea. I'll implement it and create a second version.

> I am not sure that this information is still there if switching back from
> another component...

I don't think so. But I can add a variable to hold the last selected
(clicked) path for both widgets.

Thanks (again),
Birger

On Mon, Mar 5, 2018 at 5:55 PM, Johannes Schindelin
 wrote:
> Hi Birger,
>
> On Wed, 28 Feb 2018, Birger Skogeng Pedersen wrote:
>
>> The user cannot change focus between the list of files, the diff view
>> and the commit message widgets without using the mouse (clicking either of
>> the four widgets ).
>>
>> Hotkeys CTRL/CMD+number (1-4) now focuses the first file of either the
>> "Unstaged Changes" or "Staged Changes", the diff view or the
>> commit message dialog widgets, respectively. This enables the user to
>> select/unselect files, view the diff and create a commit in git-gui
>> using keyboard-only.
>
> I like this!
>
>> diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh
>> index 91c00e648..f96c0a6b8 100755
>>
>> (This is my first patch ever, any feedback is highly appreciated)
>
> I am not an expert in Tcl/Tk, but I'll do my best to comment on this
> patch.
>
>> --- a/git-gui/git-gui.sh
>> +++ b/git-gui/git-gui.sh
>> @@ -2664,6 +2664,38 @@ proc show_less_context {} {
>>   }
>>  }
>>
>> +proc select_first_path {w} {
>> + global file_lists last_clicked selected_paths
>> + if {[llength $file_lists($w)] > 0} {
>> + focus $w
>> + set last_clicked [list $w 1]
>> + set path [lindex $file_lists($w) 0]
>> + array unset selected_paths
>> + set selected_paths($path) 1
>> + show_diff $path $w
>> + }
>> +}
>
> Do you think there is a way to focus on the last-selected path? That would
> make this feature even more convenient, I think.
>
> I am not sure that this information is still there if switching back from
> another component...
>
>> +proc select_first_unstaged_changes_path {} {
>> + global ui_workdir
>> + select_first_path $ui_workdir
>> +}
>> +
>> +proc select_first_staged_changes_path {} {
>> + global ui_index
>> + select_first_path $ui_index
>> +}
>> +
>> +proc focus_diff {} {
>> + global ui_diff
>> + focus $ui_diff
>> +}
>> +
>> +proc focus_commit_message {} {
>> + global ui_comm
>> + focus $ui_comm
>> +}
>> +
>>  ##
>>  ##
>>  ## ui construction
>> @@ -3876,6 +3908,11 @@ foreach i [list $ui_index $ui_workdir] {
>>  }
>>  unset i
>>
>> +bind . <$M1B-Key-1> {select_first_unstaged_changes_path}
>> +bind . <$M1B-Key-2> {select_first_staged_changes_path}
>> +bind . <$M1B-Key-3> {focus_diff}
>> +bind . <$M1B-Key-4> {focus_commit_message}
>> +
>>  set file_lists($ui_index) [list]
>>  set file_lists($ui_workdir) [list]
>
> Looks good!
>
> We are currently without an active Git GUI maintainer, so I hope that
> Junio (the Git maintainer) will pick this up.
>
> Ciao,
> Johannes


[RFC v2] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Sergey Organov
Hi,

This is v2 of my "Rebasing merges" proposal.

Significant changes are:

1. Fixed mistake in the final merge step in the original proposal: wrong
   merge base was used. Thanks everybody who provided test-cases, and
   special thanks to Igor Djordjevic  for
   implementing and testing a few variants of the method.

2. Added discussion of the exact place where handling of special
   frequent cases such as "git merge -ours", if any, should go.

3. I now use "True Merge" name instead of former "Trivial Merge", to
   avoid confusion with what Git documentation calls "trivial merge",
   thanks to Junio C Hamano  for pointing this out.

During discussion of the original proposal, yet another way of
implementing a true rebase of a merge commit has been suggested by
Phillip Wood [1]. His method also gathers
the changes on both sides of the merge and then merges them back to the
original merge, so both methods have similar concept and differ in
implementation. It looks like both implementations bring the same
result, at least it was so in the limited testing that Igor performed.

[1] 
https://public-inbox.org/git/6c8749ca-ec5d-b4b7-f1a0-50d9ad294...@talktalk.net/


---8<-8<--

By accepting the challenges raised in recent discussion of advanced
support for history rebasing and editing in Git, I hopefully figured out
a clean and elegant method of rebasing merges that I think is "The Right
Way (TM)" to perform this so far troublesome operation. ["(TM)" here has
second meaning: "True Merge" (TM), see below.]

Let me begin with quick outline of the method, to illustrate the
simplicity of the approach, and special thanks here must go to "Johannes
Sixt"  for his original bright idea to use "cherry-pick
-m1" to rebase merge commits.

Given 2 original branches, b1 and b2, and a merge commit M that joins
them, suppose we've already rebased b1 to b1', and b2 to b2'. Suppose
also that B1' and B2' happen to be the tip commits on b1' and b2',
respectively.

To produce merge commit M' that joins b1' and b2', the following
operations will suffice:

1. Checkout b2' and cherry-pick -m2 M, to produce U2' (and new b2').
2. Checkout b1' and cherry-pick -m1 M, to produce U1' (and new b1').
3. Perform 3-way merge of U1' and U2' using original M as merge base,
   to get UM'.
4. Get rid of U1' and U2' by re-writing parent references of UM' from
   U1' and U2' to  B1' and B2', respectively, to produce M'.
5. Mission complete.

Let's now turn to the method itself and see why and how it actually
works.

First off, let me introduce you to my new friend, the True Merge, or
(TM) for short. By definition, (TM) is a merge that brings absolutely
no differences to the sides of the merge. (I also like to call him
"Angel Merge" (AM), both as being the most beautiful of all merges, and
as direct antithesis to "[d]evil merge"; or even "Perfect Merge" (PM),
but the latter goes after lunch time.)

Being trivial history joint and nothing else, (TM)/(AM)/(PM) is safe and
easy to be rebased (see below). However, since most of us have never met
(TM) in practice, you probably wonder how (TM) can actually help us
handle general case of rebasing of some random merge.

Let's start with this history:

M
   / \
  B1  B2

where B1 and B2 are tip commits of 2 branches, and M is the merge commit
that joins them. Let's transform this history to the following one,
contextually equivalent to the original, by introducing 2 non-merge
utility commits U1 and U2, and a new utility merge commit UM:

UM
   /  \
  U1   U2
  ||
  B1   B2

were contents of all the created commits match, and are exact copies of
the original content of M. I.e., provided [A] denotes "content of commit
A", we have:

  [UM] = [U1] = [U2] = [M]

Stress again how these changes to the history preserve the exact content
of the original merge ([UM] = [M]), how U1 an U2 represent content
changes due to the merge on either side[*], and how content of neither
preceding nor subsequent commits is affected by the change of
representation.

Now observe that as [U1] = [UM], and [U2] = [UM], the UM happens to be
exactly our new friend -- the "True Merge (TM)" his true self,
introducing exactly zero changes to content. Pure history joint.

Next, we separately rebase both branches of our new representation of
the history to whatever new base we need, and we get:

  U1'  U2'
  ||
  B1'  B2'

where U1' and U2' are rebased versions of U1 and U2, obtained by usual
rebasing methods for non-merge commits.

Finally, let's merge back our branches.

To perform the right kind of merge, notice that U1' and U2' have
diverged from U1 and U2, respectively. Further, provided [U1] = [U2] =
[UM] = [M], they have both diverged from the original merge commit
M. Therefore, to merge U1' and U2' into UM', it suffices to use 3-way
merge using original M as the merge base:

UM'
   /  \
  U1'  U2'
   \  /

Re: [PATCH] git.el: handle default excludesfile properly

2018-03-06 Thread Alexandre Julliard
Junio C Hamano  writes:

> Having said that, I am sorry to say that I am not sure if the copy
> we have is the one to be patched, so I would appreciate if Alexandre
> (cc'ed) can clarify the situation.  The last change done to our copy
> of the script is from 2012, and I do not know if Alexandre is still
> taking care of it here but the script is so perfect that there was
> no need to update it for the past 5 years and we haven't seen an
> update, or the caninical copy is now being maintained elsewhere and
> we only have a stale copy, or what.

This is the canonical version, and I guess in theory I'm still taking
care of it. However, the need that git.el was originally addressing is
now fulfilled by better tools. As such, I feel that it has outlived its
usefulness, and I'm no longer actively developing it.

I'd recommend that anybody still using it switch to Magit, which is
being actively maintained, and IMO superior to git.el in all respects.

-- 
Alexandre Julliard
julli...@winehq.org


[PATCH] xdiff: improve trimming preprocessing

2018-03-06 Thread Jun Wu
xdiff-interface trims common suffix if ctxlen is 0. Teach it to also
trim common prefix, and trim less lines if ctxlen > 0. So it can benefit
the default diff command, as seen by profiling:

  $ GIT_PERF_REPEAT_COUNT=10 ./run origin/master . -- p4000-diff-algorithms.sh
  [...]
  Test  origin/master this tree
  --
  4000.1: log -3000 (baseline)  0.07(0.06+0.01)   0.06(0.05+0.01) -14.3%
  4000.2: log --raw -3000 (tree-only)   0.31(0.28+0.02)   0.31(0.27+0.02) +0.0%
  4000.3: log -p -3000 (Myers)  2.25(2.05+0.17)   1.66(1.52+0.11) -26.2%
  4000.4: log -p -3000 --histogram  2.51(2.36+0.10)   1.90(1.79+0.09) -24.3%
  4000.5: log -p -3000 --patience   2.63(2.44+0.16)   1.70(1.56+0.12) -35.4%

Diff shifting behaviors (hunk merging, indent heuristic, shift towards
the end for repetitive content) are preserved as a best effort by:

  1. Reserve extra 100 lines to not be trimmed. Leave room for shifting.
  2. Calculate common prefix first. Disallow suffix overlap with prefix
 in the smaller file. So diff hunk tends to be shifted down.

Add tests for those. Make sure they fail on the old code.

Backported from a Mercurial patch [1]. Adjusted to git's existing
xdiff-interface.c (which isn't ported to Mercurial).

Note: xdl_trim_ends does a similar job. But too late - it's after line
splitting, hashing, correcting hash values. Those steps have visible
overhead.

[1]: https://phab.mercurial-scm.org/D2686

Signed-off-by: Jun Wu 
---
 t/t4066-diff-trimming.sh | 49 +++
 xdiff-interface.c| 60 +---
 xdiff/xdiff.h|  3 +++
 xdiff/xdiffi.c   |  5 ++--
 xdiff/xemit.c|  3 ++-
 5 files changed, 98 insertions(+), 22 deletions(-)
 create mode 100755 t/t4066-diff-trimming.sh

diff --git a/t/t4066-diff-trimming.sh b/t/t4066-diff-trimming.sh
new file mode 100755
index 0..3d90175ac
--- /dev/null
+++ b/t/t4066-diff-trimming.sh
@@ -0,0 +1,49 @@
+#!/bin/sh
+
+test_description='diff trimming optimization'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+  printf "x\n%.0s" {1..1000} >a &&
+  printf "x\n%.0s" {1..1001} >b &&
+  cat >c c &&\
+  printf "x%.0s" {1..934} >>d # pad common suffix to 1024 bytes
+  try:
+  import foo
+  except ImportError:
+  pass
+  try:
+  import bar
+  except ImportError:
+  pass
+EOF1
+  try:
+  import foo
+  except ImportError:
+  pass
+  try:
+  import baz
+  except ImportError:
+  pass
+  try:
+  import bar
+  except ImportError:
+  pass
+EOF2
+'
+
+test_expect_success 'git diff -U0 shifts hunk towards the end' '
+   test_expect_code 1 git diff -U0 --no-index a b |\
+fgrep "@@ -1000,0 +1001 @@" &&
+   test_expect_code 1 git diff -U0 --no-index b a |\
+fgrep "@@ -1001 +1000,0 @@"
+'
+
+test_expect_success 'git diff -U0 --indent-heuristic' '
+   test_expect_code 1 git diff -U0 --no-index --indent-heuristic c d |\
+fgrep "@@ -4,0 +5,4 @@"
+'
+
+test_done
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 770e1f7f8..a4141e2db 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -101,42 +101,64 @@ static int xdiff_outf(void *priv_, mmbuffer_t *mb, int 
nbuf)
 }
 
 /*
- * Trim down common substring at the end of the buffers,
- * but end on a complete line.
+ * Trim down common prefix and suffix, except for reserved lines.
+ * Align to line boundary.
  */
-static void trim_common_tail(mmfile_t *a, mmfile_t *b)
+static void trim_common(mmfile_t *a, mmfile_t *b, long reserved,
+   xdemitconf_t *xecfg)
 {
const int blk = 1024;
-   long trimmed = 0, recovered = 0;
-   char *ap = a->ptr + a->size;
-   char *bp = b->ptr + b->size;
+   long i = 0, lines = 0;
+   char *ap = a->ptr, *ae = a->ptr + a->size - 1;
+   char *bp = b->ptr, *be = b->ptr + b->size - 1;
long smaller = (a->size < b->size) ? a->size : b->size;
+   size_t prefix = 0, suffix = 0;
 
-   while (blk + trimmed <= smaller && !memcmp(ap - blk, bp - blk, blk)) {
-   trimmed += blk;
-   ap -= blk;
-   bp -= blk;
+   if (smaller == 0)
+   return;
+
+   /* prefix - need line count for xecfg->ptrimmed */
+   for (i = 0; ++i < smaller && *ap == *bp;) {
+   lines += (*ap == '\n');
+   ap++, bp++;
+   }
+   for (i = 0; i <= reserved && --ap >= a->ptr;)
+   i += (*ap == '\n');
+   if (ap > a->ptr) {
+   prefix = ap + 1 - a->ptr;
+   xecfg->ptrimmed = lines + 1 - i;
}
 
-   while (recovered < trimmed)
-   if (ap[recovered++] == '\n')
-   break;
-   a->size -= trimmed - recovered;
-   b->size -= trimmed - recovered;
-}
+   /* suffix - no line count, but do not 

Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Sergey Organov
Hi Phillip,

Phillip Wood  writes:

> On 03/03/18 00:29, Igor Djordjevic wrote:
>> Hi Phillip,

[...]

>> 
>> The only thing I wonder of here is how would we check if the 
>> "rebased" merge M' was "clean", or should we stop for user amendment? 
>> With that other approach Sergey described, we have U1'==U2' to test with.
>
> I think (though I haven't rigorously proved to myself) that in the
> absence of conflicts this scheme has well defined semantics (the merges
> can be commuted), so the result should be predicable from the users
> point of view so maybe it could just offer an option to stop.

Yes, hopefully it's predictable, but is it the intended one? We don't
know, so there is still some level of uncertainty.

When in doubt, I try to find similar cases. There are two I'm aware of:

1. "git merge" just commits the result when there are no conflicts.
However, it supposedly has been run by the user just now, and thus user
can amend what he gets. That's effectively a stop for amendment from our
POV.

2. When rebasing, "rerere", when fires, stages the changes, and rebasing
stops for amendment. For me "rerere" behavior is rather annoying (I've
never in fact amended what it prepared), but I always assumed there are
good reasons it behaves this way.

Overall, to be consistent, it seems we do need to stop at U1' != U2', at
least by default. Additional options could be supported then to specify
user intentions, both on the command level and in the todo list,
provided it proves to be useful.

-- Sergey


Reply

2018-03-06 Thread Tamale David
Dear Doron,
I plead an indulgence if I have invaded your privacy by receiving this
mail from me without prior permission. With due respect, I contact you
purposely based on the similarities of names between you and my
deceased client who was an oil servicing contractor with shell
petroleum in West Africa.

This is about Nine years I have been searching the contacts of the
relatives to my late client in order that they may come forward for
the repatriation of the estate of my late client Engineer Victor Doron
 valued $16.4 Million but unfortunately all my search proved abortive
and the Togo bank holding these funds issued me a last notice to bring
the supposed heir/relatives before end of this year or they will have
the account declared un serviceable thereby confiscating the funds
hence my contact to you so that you may stand as the heir and receive
the funds into your bank account since you are bearing same surname
with my deceased client. I have the death certificate to send to you
as well as deposit document as proof. The sharing ratio of the funds
after a successful transfer to your bank account shall be 40/60 of
which I do have trust that the funds will be secured pending my
arrival to meet you in your country.
Waiting to hear from you.
Sincerely yours,
Barrister Tamale David


Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Phillip Wood
On 05/03/18 05:00, Sergey Organov wrote:
> Hi Plillip and Igor,
> 
> Igor Djordjevic  writes:
>> Hi Phillip,
>>
>> On 02/03/2018 12:31, Phillip Wood wrote:
>>>
 Thinking about it overnight, I now suspect that original proposal had a
 mistake in the final merge step. I think that what you did is a way to
 fix it, and I want to try to figure what exactly was wrong in the
 original proposal and to find simpler way of doing it right.

 The likely solution is to use original UM as a merge-base for final
 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty natural
 though, as that's exactly UM from which both U1' and U2' have diverged
 due to rebasing and other history editing.
>>>
>>> Hi Sergey, I've been following this discussion from the sidelines,
>>> though I haven't had time to study all the posts in this thread in
>>> detail. I wonder if it would be helpful to think of rebasing a merge as
>>> merging the changes in the parents due to the rebase back into the
>>> original merge. So for a merge M with parents A B C that are rebased to
>>> A' B' C' the rebased merge M' would be constructed by (ignoring shell
>>> quoting issues)
>>>
>>> git checkout --detach M
>>> git merge-recursive A -- M A'
>>> tree=$(git write-tree)
>>> git merge-recursive B -- $tree B'
>>> tree=$(git write-tree)
>>> git merge-recursive C -- $tree C'
>>> tree=$(git write-tree)
>>> M'=$(git log --pretty=%B -1 M | git commit-tree -pA' -pB' -pC')
>>>
>>> This should pull in all the changes from the parents while preserving
>>> any evil conflict resolution in the original merge. It superficially
>>> reminds me of incremental merging [1] but it's so long since I looked at
>>> that I'm not sure if there are any significant similarities.
>>>
>>> [1] https://github.com/mhagger/git-imerge
>>
>> Interesting, from quick test[3], this seems to produce the same 
>> result as that other test I previously provided[2], where temporary 
>> commits U1' and U2' are finally merged with original M as a base :)
> 
> Looks like sound approach and it's interesting if these 2 methods do in
> fact always bring the same result. Because if we look at the (now fixed)
> original approach closely, it also just gathers the changes in merge
> parents into U1' and U2', then merges the changes back into the original
> M (=U1=U2=UM).
> 
> Overall, this one looks like another implementation of essentially the
> same method and confirms that we all have the right thought direction
> here.
> 

Yes, I think they are doing the same thing. If there are no conflicts
then U1' is the should as "git merge-recursive A -- M A'". My patch
algebra isn't very good, but I think one ought to be able to show that
in the absence of conflicts the two approaches are equivalent.

>>
>> Just that this looks like even more straight-forward approach...?
>>
>> The only thing I wonder of here is how would we check if the 
>> "rebased" merge M' was "clean", or should we stop for user amendment? 
>> With that other approach Sergey described, we have U1'==U2' to test
>> with.
> 
> That's an advantage of the original, yes.

I wonder if just having a predicable result rather than forcing the
rebase to stop if the user just squashes a fixup commit into a topic
branch that is the parent of a merge might be more convenient in practice.

Best Wishes

Phillip

> -- Sergey
> 



Re: [RFC] Rebasing merges: a jorney to the ultimate solution (Road Clear)

2018-03-06 Thread Phillip Wood
On 03/03/18 00:29, Igor Djordjevic wrote:
> Hi Phillip,
> 
> On 02/03/2018 12:31, Phillip Wood wrote:
>>
>>> Thinking about it overnight, I now suspect that original proposal had a
>>> mistake in the final merge step. I think that what you did is a way to
>>> fix it, and I want to try to figure what exactly was wrong in the
>>> original proposal and to find simpler way of doing it right.
>>>
>>> The likely solution is to use original UM as a merge-base for final
>>> 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty natural
>>> though, as that's exactly UM from which both U1' and U2' have diverged
>>> due to rebasing and other history editing.
>>
>> Hi Sergey, I've been following this discussion from the sidelines,
>> though I haven't had time to study all the posts in this thread in
>> detail. I wonder if it would be helpful to think of rebasing a merge as
>> merging the changes in the parents due to the rebase back into the
>> original merge. So for a merge M with parents A B C that are rebased to
>> A' B' C' the rebased merge M' would be constructed by (ignoring shell
>> quoting issues)
>>
>> git checkout --detach M
>> git merge-recursive A -- M A'
>> tree=$(git write-tree)
>> git merge-recursive B -- $tree B'
>> tree=$(git write-tree)
>> git merge-recursive C -- $tree C'
>> tree=$(git write-tree)
>> M'=$(git log --pretty=%B -1 M | git commit-tree -pA' -pB' -pC')
>>
>> This should pull in all the changes from the parents while preserving
>> any evil conflict resolution in the original merge. It superficially
>> reminds me of incremental merging [1] but it's so long since I looked at
>> that I'm not sure if there are any significant similarities.
>>
>> [1] https://github.com/mhagger/git-imerge
> 
> Interesting, from quick test[3], this seems to produce the same 
> result as that other test I previously provided[2], where temporary 
> commits U1' and U2' are finally merged with original M as a base :)
> 
> Just that this looks like even more straight-forward approach...?
> 
> The only thing I wonder of here is how would we check if the 
> "rebased" merge M' was "clean", or should we stop for user amendment? 
> With that other approach Sergey described, we have U1'==U2' to test with.

I think (though I haven't rigorously proved to myself) that in the
absence of conflicts this scheme has well defined semantics (the merges
can be commuted), so the result should be predicable from the users
point of view so maybe it could just offer an option to stop.

> 
> By the way, is there documentation for `git merge-recursive` 
> anywhere, besides the code itself...? :$

Not that I'm aware of unfortunately. It's pretty simple though

git merge-recursive []  --  

The options are listed on the 'git merge' man page but you specify them
as '--option' rather than '-Xoption'. I'm not sure what the exact
requirements are for the index, having it match  should always
be safe.

> 
> Thanks, Buga
> 
> [2] 
> https://public-inbox.org/git/f1a960dc-cc5c-e7b0-10b6-39e551665...@gmail.com/
> [3] Quick test script:
> -- >8 --
> #!/bin/sh
> 
> # rm -rf ./.git
> # rm -f ./test.txt
> 
> git init
> 
> touch ./test.txt
> git add -- test.txt
> 
> # prepare repository
> for i in {1..8}
> do
>   echo X$i >>test.txt
>   git commit -am "X$i"
> done
> 
> # prepare branch A
> git checkout -b A
> sed -i '2iA1' test.txt
> git commit -am "A1"
> sed -i '4iA2' test.txt
> git commit -am "A2"
> sed -i '6iA3' test.txt
> git commit -am "A3"
> 
> # prepare branch B
> git checkout -b B master
> sed -i '5iB1' test.txt
> git commit -am "B1"
> sed -i '7iB2' test.txt
> git commit -am "B2"
> sed -i '9iB3' test.txt
> git commit -am "B3"
> 
> git checkout -b topic A
> git merge -s ours --no-commit B # merge A and B with `-s ours`
> sed -i '8iM' test.txt   # amend merge commit ("evil merge")
> git commit -am "M"
> git tag M #original-merge
> 
> # master moves on...
> git checkout master
> git cherry-pick B^ # cherry-pick B2 into master
> sed -i "1iX9" test.txt # add X9
> git commit -am "X9"
> 
> # (0) ---X8--B2'--X9 (master)
> #|\
> #| A1---A2---A3 (A)
> #| \
> #|  M (topic)
> #| /
> #\-B1---B2---B3 (B)
> 
> # simple/naive demonstration of proposed merge rebasing logic
> # using iterative merge-recursive, preserving merge commit manual
> # amendments, testing `-s ours` merge with cherry-picking from
> # obsoleted part, but still respecting interactively rebased
> # added/modified/dropped/cherry-picked commits :)
> 
> git checkout -b A-prime A
> git reset --hard HEAD^ # drop A3 from A
> sed -i '/A1/c\A12' test.txt# amend A1 to A12
> git commit -a --amend --no-edit
> git rebase master  # rebase A onto master
> git cherry-pick B  # cherry-pick B3 into A
> 
> git checkout -b B-prime B
> git rebase master  # rebase B onto master
> sed -i '12iB4' test.txt# add B4
> git commit 

[PATCH v2 5/5] pack-objects: display progress in get_object_details()

2018-03-06 Thread Nguyễn Thái Ngọc Duy
This code is mostly about reading object headers, which is cheap. But
when the number of objects is very large (e.g. 6.5M on linux-2.6.git)
and the system is under memory pressure, this could take some time (86
seconds on my system).

Show something during this time to let the user know pack-objects is
still going strong.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 24af4068a9..2ec911bf10 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1713,6 +1713,10 @@ static void get_object_details(void)
uint32_t i;
struct object_entry **sorted_by_offset;
 
+   if (progress)
+   progress_state = start_progress(_("Getting object details"),
+   to_pack.nr_objects);
+
sorted_by_offset = xcalloc(to_pack.nr_objects, sizeof(struct 
object_entry *));
for (i = 0; i < to_pack.nr_objects; i++)
sorted_by_offset[i] = to_pack.objects + i;
@@ -1723,7 +1727,9 @@ static void get_object_details(void)
check_object(entry);
if (big_file_threshold < entry->size)
entry->no_try_delta = 1;
+   display_progress(progress_state, i + 1);
}
+   stop_progress(_state);
 
/*
 * This must happen in a second pass, since we rely on the delta
-- 
2.16.2.784.gb291bd247e



[PATCH v2 4/5] pack-objects: show some progress when counting kept objects

2018-03-06 Thread Nguyễn Thái Ngọc Duy
We only show progress when there are new objects to be packed. But
when --keep-pack is specified on the base pack, we will exclude most
of objects. This makes 'pack-objects' stay silent for a long time
while the counting phase is going.

Let's show some progress whenever we visit an object instead. The
number of packed objects will be shown after if it's not the same as
the number of visited objects.

Since the meaning of this number has changed, use another word instead
of "Counting" to hint about the change.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index fcdd398eb7..24af4068a9 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -44,7 +44,7 @@ static const char *pack_usage[] = {
 static struct packing_data to_pack;
 
 static struct pack_idx_entry **written_list;
-static uint32_t nr_result, nr_written;
+static uint32_t nr_result, nr_written, nr_seen;
 
 static int non_empty;
 static int reuse_delta = 1, reuse_object = 1;
@@ -1092,6 +1092,8 @@ static int add_object_entry(const struct object_id *oid, 
enum object_type type,
off_t found_offset = 0;
uint32_t index_pos;
 
+   display_progress(progress_state, nr_seen++);
+
if (have_duplicate_entry(oid, exclude, _pos))
return 0;
 
@@ -1107,8 +1109,6 @@ static int add_object_entry(const struct object_id *oid, 
enum object_type type,
create_object_entry(oid, type, pack_name_hash(name),
exclude, name && no_try_delta(name),
index_pos, found_pack, found_offset);
-
-   display_progress(progress_state, nr_result);
return 1;
 }
 
@@ -1119,6 +1119,8 @@ static int add_object_entry_from_bitmap(const struct 
object_id *oid,
 {
uint32_t index_pos;
 
+   display_progress(progress_state, nr_seen++);
+
if (have_duplicate_entry(oid, 0, _pos))
return 0;
 
@@ -1126,8 +1128,6 @@ static int add_object_entry_from_bitmap(const struct 
object_id *oid,
return 0;
 
create_object_entry(oid, type, name_hash, 0, 0, index_pos, pack, 
offset);
-
-   display_progress(progress_state, nr_result);
return 1;
 }
 
@@ -3205,7 +3205,7 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
}
 
if (progress)
-   progress_state = start_progress(_("Counting objects"), 0);
+   progress_state = start_progress(_("Enumerating objects"), 0);
if (!use_internal_rev_list)
read_object_list_from_stdin();
else {
-- 
2.16.2.784.gb291bd247e



[PATCH v2 3/5] gc --auto: exclude base pack if not enough mem to "repack -ad"

2018-03-06 Thread Nguyễn Thái Ngọc Duy
pack-objects could be a big memory hog especially on large repos,
everybody knows that. The suggestion to stick a .keep file on the
giant base pack to avoid this problem is also known for a long time.

Let's do the suggestion automatically instead of waiting for people to
come to Git mailing list and get the advice. When a certain condition
is met, "gc --auto" tells "git repack" to keep the base pack around.
The end result would be two packs instead of one.

On linux-2.6.git, valgrind massif reports 1.6GB heap in "pack all"
case, and 535MB [1] in "pack all except the base pack" case. We save
roughly 1GB memory by excluding the base pack.

gc --auto decides to do this based on an estimation of pack-objects
memory usage, which is quite accurate at least for the heap part, and
whether that fits in half of system memory (the assumption here is for
desktop environment where there are many other applications running).

Since the estimation may be inaccurate and that 1/2 threshold is
really arbitrary, give the user a finer control over this mechanism:
if the largest pack is larger than gc.bigBasePackThreshold, it's kept.

PS. A big chunk of the remaining 535MB is the result of pack-objects
running rev-list internally. This will be dealt with when we could run
rev-list externally. Right now we can't because pack-objects internal
rev-list does more regarding unreachable objects, which cannot be done
by "git rev-list".

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt |   7 ++
 Documentation/git-gc.txt |  13 
 builtin/gc.c | 153 +--
 builtin/pack-objects.c   |   2 +-
 config.mak.uname |   1 +
 git-compat-util.h|   4 +
 pack-objects.h   |   2 +
 t/t6500-gc.sh|  29 
 8 files changed, 204 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f57e9cf10c..120cf6bac9 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1549,6 +1549,13 @@ gc.autoDetach::
Make `git gc --auto` return immediately and run in background
if the system supports it. Default is true.
 
+gc.bigBasePackThreshold::
+   Make `git gc --auto` only enable `--keep-base-pack` when the
+   base pack's size is larger than this limit (in bytes).
+   Defaults to zero, which disables this check and lets
+   `git gc --auto` determine when to enable `--keep-base-pack`
+   based on memory usage.
+
 gc.logExpiry::
If the file gc.log exists, then `git gc --auto` won't run
unless that file is more than 'gc.logExpiry' old.  Default is
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 571b5a7e3c..35ad420d5c 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -59,6 +59,11 @@ then existing packs (except those marked with a `.keep` file)
 are consolidated into a single pack by using the `-A` option of
 'git repack'. Setting `gc.autoPackLimit` to 0 disables
 automatic consolidation of packs.
++
+If the physical amount of memory is considered not enough for `git
+repack` to run smoothly, `--keep-base-pack` is enabled. This could be
+overridden by setting `gc.bigBasePackThreshold` which only enables
+`--keep-base-pack` when the base pack is larger the specified limit.
 
 --prune=::
Prune loose objects older than date (default is 2 weeks ago,
@@ -78,6 +83,10 @@ automatic consolidation of packs.
Force `git gc` to run even if there may be another `git gc`
instance running on this repository.
 
+--keep-base-pack::
+   All packs except the base pack are consolidated into a single
+   pack. The largest pack is considered the base pack.
+
 Configuration
 -
 
@@ -167,6 +176,10 @@ run commands concurrently have to live with some risk of 
corruption (which
 seems to be low in practice) unless they turn off automatic garbage
 collection with 'git config gc.auto 0'.
 
+Set environment variable `GIT_TRACE` in order to see the memory usage
+estimation in `git gc --auto` that determines whether the base pack is
+kept.
+
 HOOKS
 -
 
diff --git a/builtin/gc.c b/builtin/gc.c
index 77fa720bd0..273657ddf4 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -20,6 +20,10 @@
 #include "argv-array.h"
 #include "commit.h"
 #include "packfile.h"
+#include "pack.h"
+#include "pack-objects.h"
+#include "blob.h"
+#include "tree.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -39,6 +43,8 @@ static timestamp_t gc_log_expire_time;
 static const char *gc_log_expire = "1.day.ago";
 static const char *prune_expire = "2.weeks.ago";
 static const char *prune_worktrees_expire = "3.months.ago";
+static unsigned long big_base_pack_threshold;
+static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
 static struct argv_array pack_refs_cmd = ARGV_ARRAY_INIT;
 static struct argv_array reflog = ARGV_ARRAY_INIT;
@@ -126,6 +132,9 @@ static void gc_config(void)

[PATCH v2 0/5] Avoid expensive 'repack -ad' in gc --auto

2018-03-06 Thread Nguyễn Thái Ngọc Duy
I'm pretty happy with this now. Like v1, this is about not touching
the giant base pack when doing background gc. This saves about 2/3 of
memory, which in turn should improve performance if you're under
memory pressure.

v2 changes:

- the core idea remains the same, mem_have is increased to half total
  memory though. I figure including the whole mmap'd pack in the
  memory estimation may be a bit much, which is why I make this
  change.
- no creating .keep files temporarily
- the config key is renamed gc.bigBasePackThreshold (named after
  core.bigFileThreshold)
- note that if you set gc.bigFileThreshold, then normal gc (without
  --auto) can trigger this mode too.
- you can also control this with --[no-]keep-base-pack
- documents and tests
- some more progress output improvements

I'm _not_ doing external rev-list in this series though. I found out
that we have added more and more stuff in the internal rev-list code
path over the year and simply running

git rev-list  | git pack-objects

will break stuff (the test suite first for example). I will do it
because it does help. But it will take some time.

PS. This conflicts with sb/packfiles-in-repository on 'pu' because I
introduced new references to the global variable "packed_git" and
prepare_packed_git(). Resolving this should be simple though:

- drop prepare_packed_git()
- replace packed_git with get_packed_git(the_repository)

Nguyễn Thái Ngọc Duy (5):
  fixup! Add a test showing that 'git repack' throws away grafted-away
parents
  repack: add --keep-pack option
  gc --auto: exclude base pack if not enough mem to "repack -ad"
  pack-objects: show some progress when counting kept objects
  pack-objects: display progress in get_object_details()

 Documentation/config.txt   |   7 ++
 Documentation/git-gc.txt   |  13 +++
 Documentation/git-pack-objects.txt |   4 +
 Documentation/git-repack.txt   |   4 +
 builtin/gc.c   | 153 +++--
 builtin/pack-objects.c |  51 --
 builtin/repack.c   |  23 -
 config.mak.uname   |   1 +
 git-compat-util.h  |   4 +
 pack-objects.h |   2 +
 t/t6500-gc.sh  |  29 ++
 t/t7700-repack.sh  |  21 +++-
 12 files changed, 295 insertions(+), 17 deletions(-)

-- 
2.16.2.784.gb291bd247e



[PATCH v2 2/5] repack: add --keep-pack option

2018-03-06 Thread Nguyễn Thái Ngọc Duy
We allow to keep existing packs by having companion .keep files. This
is helpful when a pack is permanently kept. In the next patch, git-gc
just wants to keep a pack temporarily, for one pack-objects
run. git-gc can use --keep-pack for this use case.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-pack-objects.txt |  4 
 Documentation/git-repack.txt   |  4 
 builtin/pack-objects.c | 31 ++
 builtin/repack.c   | 23 +++---
 t/t7700-repack.sh  | 19 ++
 5 files changed, 78 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index 81bc490ac5..1975477160 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -126,6 +126,10 @@ base-name::
has a .keep file to be ignored, even if it would have
otherwise been packed.
 
+--keep-pack=::
+   Ignore the given pack. This is the equivalent of having
+   `.keep` file on the pack. Implies `--honor-pack-keep`.
+
 --incremental::
This flag causes an object already in a pack to be ignored
even if it would have otherwise been packed.
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index ae750e9e11..12b073e115 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -133,6 +133,10 @@ other objects in that pack they already have locally.
with `-b` or `repack.writeBitmaps`, as it ensures that the
bitmapped packfile has the necessary objects.
 
+--keep-pack=::
+   Exclude the given pack from repacking. This is the equivalent
+   of having `.keep` file on the pack. Implies `--pack-kept-objects`.
+
 --unpack-unreachable=::
When loosening unreachable objects, do not bother loosening any
objects older than ``. This can be used to optimize out
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5c674b2843..8e3f870d71 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -28,6 +28,7 @@
 #include "argv-array.h"
 #include "list.h"
 #include "packfile.h"
+#include "dir.h"
 
 static const char *pack_usage[] = {
N_("git pack-objects --stdout [...] [<  | < 
]"),
@@ -2917,6 +2918,32 @@ static void get_object_list(int ac, const char **av)
oid_array_clear(_objects);
 }
 
+static void add_extra_kept_packs(const struct string_list *names)
+{
+   struct packed_git *p;
+
+   if (!names->nr)
+   return;
+
+   prepare_packed_git();
+   for (p = packed_git; p; p = p->next) {
+   const char *name = basename(p->pack_name);
+   int i;
+
+   if (!p->pack_local)
+   continue;
+
+   for (i = 0; i < names->nr; i++) {
+   if (fspathcmp(name, names->items[i].string))
+   continue;
+
+   p->pack_keep = 1;
+   ignore_packed_keep = 1;
+   break;
+   }
+   }
+}
+
 static int option_parse_index_version(const struct option *opt,
  const char *arg, int unset)
 {
@@ -2956,6 +2983,7 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
struct argv_array rp = ARGV_ARRAY_INIT;
int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
int rev_list_index = 0;
+   struct string_list keep_pack_list = STRING_LIST_INIT_NODUP;
struct option pack_objects_options[] = {
OPT_SET_INT('q', "quiet", ,
N_("do not show progress meter"), 0),
@@ -3022,6 +3050,8 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
 N_("create packs suitable for shallow fetches")),
OPT_BOOL(0, "honor-pack-keep", _packed_keep,
 N_("ignore packs that have companion .keep file")),
+   OPT_STRING_LIST(0, "keep-pack", _pack_list, N_("name"),
+   N_("ignore this pack")),
OPT_INTEGER(0, "compression", _compression_level,
N_("pack compression level")),
OPT_SET_INT(0, "keep-true-parents", _replace_parents,
@@ -3150,6 +3180,7 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
progress = 2;
 
prepare_packed_git();
+   add_extra_kept_packs(_pack_list);
if (ignore_packed_keep) {
struct packed_git *p;
for (p = packed_git; p; p = p->next)
diff --git a/builtin/repack.c b/builtin/repack.c
index 7bdb40142f..6a1dade0e1 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -86,7 +86,8 @@ static void remove_pack_on_signal(int signo)
  * have a corresponding .keep or .promisor file. These packs are not to
  * be kept if we are going to 

[PATCH v2 1/5] fixup! Add a test showing that 'git repack' throws away grafted-away parents

2018-03-06 Thread Nguyễn Thái Ngọc Duy
The closing quote of a test body by convention is always at the start
of line.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 t/t7700-repack.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 6061a04147..38247afbec 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -194,7 +194,7 @@ test_expect_success 'objects made unreachable by grafts 
only are kept' '
git reflog expire --expire=$test_tick --expire-unreachable=$test_tick 
--all &&
git repack -a -d &&
git cat-file -t $H1
-   '
+'
 
 test_done
 
-- 
2.16.2.784.gb291bd247e



Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)

2018-03-06 Thread Phillip Wood
On 02/03/18 23:33, Igor Djordjevic wrote:
> Hi Phillip,
> 
>> On Fri, Mar 2, 2018 at 4:36 AM, Phillip Wood wrote:
>>>
 It is interesting to think what it means to faithfully rebase a '-s
 ours' merge.
>>>
>>> I should have explained that I mean is a faithful rebase one that
>>> adheres to the semantics of '-s ours' (i.e. ignores any changes in the
>>> side branch) or one that merges new changes from the side branch into
>>> the content of the original merge? In your example you add B4 to B. If
>>> M' preserves the semantics of '-s ours' then it will not contain the
>>> changes in B4. I think your result does (correct me if I'm wrong) so it
>>> is preserving the content of the original merge rather than the
>>> semantics of it.
> 
> Yeah, I understood what you mean, and I see you noticed that B4 
> commit, for which I did anticipate possibly bringing up a discussion 
> like this ;)
> 
> I agree with Jake here, my thoughts exactly (what I wrote in that 
> other subthread[1], too):
> 
> On 02/03/2018 17:02, Jacob Keller wrote:
>>
>> We only have the content, and we don't know the semantics (nor, I
>> think, should we attempt to understand or figure out the semantics).
> 
> Hmm, I wanted to elaborate a bit here, but that sentence seems to 
> summarize the pure essence of it, and whatever I write looks like 
> just repeating the same stuff again...
> 
> That`s just it. And stopping to give the user a chance to 
> review/amend the result, where he might decide he actually did want 
> something else - so all good.
> 
> Otherwise, I would be interested to learn how context/semantics 
> guessing could provide a better default action (without introducing 
> more complexity for might not be much benefit, if any).

I don't think its possible to guess the semantics of the original merge
as users can use custom merge strategies and amend the result. It would
be possible to detect and unamended '-s ours' merge but special casing
that may end up causing users more confusion rather than helping them.

> But in the end, I guess we can just discuss the "most sane default" 
> to present to the user (introduce or obsolete that new commit B4, in 
> the discussed example[2]), as we should definitely stop for amending 
> anyway, not proceeding automatically whenever U1' != U2'.

I can see the reason for that but I'm concerned that it might get
annoying with an interactive rebase as it would stop whenever one of the
commits on a topic branch that is a parent of a merge gets amended.
(squashing and reordering existing commits on a topic branch would be OK
though)

> Oh, and what about situations where we introduce new or drop existing 
> branches (which should be possible with new `--recreate-merges`)...? 
> "Preserving old branch semantics" may have even less sense here - the 
> (possibly heavily reorganized) content is the only thing we have, 
> where context will (and should) be provided by the user.

In this scheme there is now way to change the parents of a merge so
preserving the old branch sementics is well defined. If the user wants
to change the parents of the merge then this scheme wont help them.

> And I guess being consistent is pretty important, too - if you add 
> new content during merge rebase, it should always show up in the 
> merge, period. 

Yes, that should make it easy for the user to know what to expect from
rebase.

> It seems pretty confusing to find out one of the 
> branches "declared special" (even more if it`s based on uncertain 
> guess-work), so when you add something to it it`s just "swallowed", 
> as the whole branch is always obsoleted, for now and ever.
> 
> I might even see a value in such behavior, but only as a conscious 
> user action, not something done automatically... I guess? :)
> 
> Regards, Buga
> 
> [1] 
> https://public-inbox.org/git/f26cdbe2-1bc3-02ff-7b99-12a6ebab5...@gmail.com/
> [2] 
> https://public-inbox.org/git/f1a960dc-cc5c-e7b0-10b6-39e551665...@gmail.com/
> 



Re: [PATCH v5 0/9] Correct offsets of hunks when one is skipped

2018-03-06 Thread Phillip Wood
On 05/03/18 18:50, Junio C Hamano wrote:
> Phillip Wood  writes:
> 
>> From: Phillip Wood 
>>
>> I've updated these to clean up the perl style in response to Junio's
>> comments on v4.
> 
> I've considered the use of "apply --recount" in this script (eh,
> rather, the existence of that option in "apply" command itself ;-))
> a bug that need to be eventually fixed for a long time, so the copy
> of earlier rounds I've been carrying in my tree were forked from
> 'maint'.

Oh I've been basing this on master, I hope that wasn't a problem.

> I'll queue this round on the same base commit as before to replace;
> I _think_ this is ready for 'next'.

Yes I think so (I've not come up with any new ways to break it, lets see
if someone else can)

> Thanks for working on this.
> 
Thanks, I use add -p quite a lot so it's nice to be able to contribute
something back.

Best Wishes

Phillip


Re: [PATCH 1/2] completion: don't set PARSE_OPT_NOCOMPLETE on --rerere-autoupdate

2018-03-06 Thread Phillip Wood
On 03/03/18 09:23, Nguyễn Thái Ngọc Duy wrote:
> 
> There is not a strong reason to hide this option, and git-merge already
> completes this one. Let's allow to complete this for all commands (and
> let git-completion.bash do the suppressing if neede).
> 
> This makes --rerere-autoupdate completable for am, cherry-pick and
> revert.
> 

This is slightly off topic as it doesn't use OPT_RERERE_AUTOUPDATE but
it looks[1] as if rebase is missing completion for --rerere-autoupdate
(and --signoff which is definitely off topic)

Best Wishes

Phillip

[1]
https://git.kernel.org/pub/scm/git/git.git/tree/contrib/completion/git-completion.bash#n2010


> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  contrib/completion/git-completion.bash | 3 +--
>  parse-options.h| 4 ++--
>  rerere.h   | 3 +--
>  3 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index 0ddf40063b..c310b241d3 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -1754,8 +1754,7 @@ _git_merge ()
>  
>   case "$cur" in
>   --*)
> - __gitcomp_builtin merge "--rerere-autoupdate
> - --no-rerere-autoupdate
> + __gitcomp_builtin merge "--no-rerere-autoupdate
>   --no-commit --no-edit --no-ff
>   --no-log --no-progress
>   --no-squash --no-stat
> diff --git a/parse-options.h b/parse-options.h
> index 0ba08691e6..ab1cc362bf 100644
> --- a/parse-options.h
> +++ b/parse-options.h
> @@ -148,8 +148,8 @@ struct option {
>  #define OPT_STRING_LIST(s, l, v, a, h) \
>   { OPTION_CALLBACK, (s), (l), (v), (a), \
> (h), 0, _opt_string_list }
> -#define OPT_UYN(s, l, v, h, f)  { OPTION_CALLBACK, (s), (l), (v), NULL, \
> -   (h), PARSE_OPT_NOARG|(f), 
> _opt_tertiary }
> +#define OPT_UYN(s, l, v, h) { OPTION_CALLBACK, (s), (l), (v), NULL, \
> +   (h), PARSE_OPT_NOARG, _opt_tertiary 
> }
>  #define OPT_DATE(s, l, v, h) \
>   { OPTION_CALLBACK, (s), (l), (v), N_("time"),(h), 0,\
> parse_opt_approxidate_cb }
> diff --git a/rerere.h b/rerere.h
> index 5e5a312e4c..c2961feaaa 100644
> --- a/rerere.h
> +++ b/rerere.h
> @@ -37,7 +37,6 @@ extern void rerere_clear(struct string_list *);
>  extern void rerere_gc(struct string_list *);
>  
>  #define OPT_RERERE_AUTOUPDATE(v) OPT_UYN(0, "rerere-autoupdate", (v), \
> - N_("update the index with reused conflict resolution if possible"), \
> - PARSE_OPT_NOCOMPLETE)
> + N_("update the index with reused conflict resolution if possible"))
>  
>  #endif
> 



[PATCH v2 1/3] add -p: select individual hunk lines

2018-03-06 Thread Phillip Wood
From: Phillip Wood 

When I end up editing hunks it is almost always because I want to
stage a subset of the lines in the hunk. Doing this by editing the
hunk is inconvenient and error prone (especially so if the patch is
going to be reversed before being applied). Instead offer an option
for add -p to stage individual lines. When the user presses 'l' the
hunk is redrawn with labels by the insertions and deletions and they
are prompted to enter a list of the lines they wish to stage. Ranges
of lines may be specified using 'a-b' where either 'a' or 'b' may be
omitted to mean all lines from 'a' to the end of the hunk or all lines
from 1 upto and including 'b'.

Signed-off-by: Phillip Wood 
---
 Documentation/git-add.txt  |   7 +++
 git-add--interactive.perl  | 143 +
 t/t3701-add-interactive.sh |  65 +
 3 files changed, 215 insertions(+)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index d50fa339dc..ad33fda9a2 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -332,10 +332,17 @@ patch::
J - leave this hunk undecided, see next hunk
k - leave this hunk undecided, see previous undecided hunk
K - leave this hunk undecided, see previous hunk
+   l - select hunk lines to use
s - split the current hunk into smaller hunks
e - manually edit the current hunk
? - print help
 +
+If you press "l" then the hunk will be reprinted with each insertion
+or deletion labelled with a number and you will be prompted to enter
+which lines you wish to select. Individual line numbers should be
+separated by a space or comma, to specify a range of lines use a dash
+between them.
++
 After deciding the fate for all hunks, if there is any hunk
 that was chosen, the index is updated with the selected hunks.
 +
diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index f83e7450ad..a273b41e95 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1013,6 +1013,133 @@ sub color_diff {
} @_;
 }
 
+sub label_hunk_lines {
+   local $_;
+   my $hunk = shift;
+   my $i = 0;
+   my $labels = [ map { /^[-+]/ ? ++$i : 0 } @{$hunk->{TEXT}} ];
+   if ($i > 1) {
+   @{$hunk}{qw(LABELS MAX_LABEL)} = ($labels, $i);
+   return 1;
+   }
+   return 0;
+}
+
+sub select_hunk_lines {
+   my ($hunk, $selected) = @_;
+   my ($text, $labels) = @{$hunk}{qw(TEXT LABELS)};
+   my ($i, $o_cnt, $n_cnt) = (0, 0, 0);
+   my ($push_eol, @newtext);
+   # Lines with this mode will become context lines if they are
+   # not selected
+   my $context_mode = $patch_mode_flavour{IS_REVERSE} ? '+' : '-';
+   for $i (1..$#{$text}) {
+   my $mode = substr($text->[$i], 0, 1);
+   if ($mode eq '\\') {
+   push @newtext, $text->[$i] if ($push_eol);
+   undef $push_eol;
+   } elsif ($labels->[$i] and $selected->[$labels->[$i]]) {
+   push @newtext, $text->[$i];
+   if ($mode eq '+') {
+   $n_cnt++;
+   } else {
+   $o_cnt++;
+   }
+   $push_eol = 1;
+   } elsif ($mode eq ' ' or $mode eq $context_mode) {
+   push @newtext, ' ' . substr($text->[$i], 1);
+   $o_cnt++; $n_cnt++;
+   $push_eol = 1;
+   } else {
+   undef $push_eol;
+   }
+   }
+   my ($o_ofs, $orig_o_cnt, $n_ofs, $orig_n_cnt) =
+   parse_hunk_header($text->[0]);
+   unshift @newtext, format_hunk_header($o_ofs, $o_cnt, $n_ofs, $n_cnt);
+   my $newhunk = {
+   TEXT => \@newtext,
+   DISPLAY => [ color_diff(@newtext) ],
+   OFS_DELTA => $orig_o_cnt - $orig_n_cnt - $o_cnt + $n_cnt,
+   TYPE => $hunk->{TYPE},
+   USE => 1,
+   };
+   # If this hunk has previously been edited add the offset delta
+   # of the old hunk to get the real delta from the original
+   # hunk.
+   if ($hunk->{OFS_DELTA}) {
+   $newhunk->{OFS_DELTA} += $hunk->{OFS_DELTA};
+   }
+   return $newhunk;
+}
+
+sub check_hunk_label {
+   my ($max_label, $label) = ($_[0]->{MAX_LABEL}, $_[1]);
+   if ($label < 1 or $label > $max_label) {
+   error_msg sprintf(__("invalid hunk line '%d'\n"), $label);
+   return 0;
+   }
+   return 1;
+}
+
+sub parse_hunk_selection {
+   local $_;
+   my ($hunk, $line) = @_;
+   my $max_label = $hunk->{MAX_LABEL};
+   my @selected = (0) x ($max_label + 1);
+   my @fields = split(/[,\s]+/, $line);
+   for (@fields) {
+   if 

[PATCH v2 3/3] add -p: optimize line selection for short hunks

2018-03-06 Thread Phillip Wood
From: Phillip Wood 

If there are fewer than ten changes in a hunk then make spaces
optional when selecting individual lines. This means that for short
hunks one can just type -357 to stage lines 1, 2, 3, 5 & 7.

Signed-off-by: Phillip Wood 
---
 Documentation/git-add.txt  |  3 ++-
 git-add--interactive.perl  | 30 ++
 t/t3701-add-interactive.sh |  2 +-
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index 0e2c11e97b..d52acfc722 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -340,7 +340,8 @@ patch::
 If you press "l" then the hunk will be reprinted with each insertion
 or deletion labelled with a number and you will be prompted to enter
 which lines you wish to select. Individual line numbers should be
-separated by a space or comma, to specify a range of lines use a dash
+separated by a space or comma (these can be omitted if there are fewer
+than ten labelled lines), to specify a range of lines use a dash
 between them. To invert the selection prefix it with "\^" so "^3-5,8"
 will select everything except lines 3, 4, 5 and 8.
 +
diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index 6fa3d0a87c..9a6bcd5085 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1082,6 +1082,33 @@ sub check_hunk_label {
return 1;
 }
 
+sub split_hunk_selection {
+   local $_;
+   my @fields = @_;
+   my @ret;
+   for (@fields) {
+   if (/^(-[0-9])(.*)/) {
+   push @ret, $1;
+   $_ = $2;
+   }
+   while ($_ ne '') {
+   if (/^[0-9]-$/) {
+   push @ret, $_;
+   last;
+   } elsif (/^([0-9](?:-[0-9])?)(.*)/) {
+   push @ret, $1;
+   $_ = $2;
+   } else {
+   error_msg sprintf
+   __("invalid hunk line '%s'\n"),
+   substr($_, 0, 1);
+   return ();
+   }
+   }
+   }
+   return @ret;
+}
+
 sub parse_hunk_selection {
local $_;
my ($hunk, $line) = @_;
@@ -1100,6 +1127,9 @@ sub parse_hunk_selection {
}
}
}
+   if ($max_label < 10) {
+   @fields = split_hunk_selection(@fields) or return undef;
+   }
for (@fields) {
if (/^([0-9]*)-([0-9]*)$/) {
if ($1 eq '' and $2 eq '') {
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 89c0e73f2b..d3bce154da 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -410,7 +410,7 @@ test_expect_success 'setup expected diff' '
 '
 
 test_expect_success 'can reset individual lines of patch' '
-   printf "%s\n" l "^1 3" |
+   printf "%s\n" l ^13 |
EDITOR=: git reset -p 2>error &&
test_must_be_empty error &&
git diff --cached HEAD >actual &&
-- 
2.16.2



[PATCH v2 2/3] add -p: allow line selection to be inverted

2018-03-06 Thread Phillip Wood
From: Phillip Wood 

If the list of lines to be selected begins with '^' select all the
lines except the ones listed.

Signed-off-by: Phillip Wood 
---
 Documentation/git-add.txt  |  3 ++-
 git-add--interactive.perl  | 17 -
 t/t3701-add-interactive.sh |  2 +-
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index ad33fda9a2..0e2c11e97b 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -341,7 +341,8 @@ If you press "l" then the hunk will be reprinted with each 
insertion
 or deletion labelled with a number and you will be prompted to enter
 which lines you wish to select. Individual line numbers should be
 separated by a space or comma, to specify a range of lines use a dash
-between them.
+between them. To invert the selection prefix it with "\^" so "^3-5,8"
+will select everything except lines 3, 4, 5 and 8.
 +
 After deciding the fate for all hunks, if there is any hunk
 that was chosen, the index is updated with the selected hunks.
diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index a273b41e95..6fa3d0a87c 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1085,9 +1085,21 @@ sub check_hunk_label {
 sub parse_hunk_selection {
local $_;
my ($hunk, $line) = @_;
-   my $max_label = $hunk->{MAX_LABEL};
+   my ($max_label, $invert) = ($hunk->{MAX_LABEL}, undef);
my @selected = (0) x ($max_label + 1);
my @fields = split(/[,\s]+/, $line);
+   if ($fields[0] =~ /^\^(.*)/) {
+   $invert = 1;
+   if ($1 ne '') {
+   $fields[0] = $1;
+   } else {
+   shift @fields;
+   unless (@fields) {
+   error_msg __("no lines to invert\n");
+   return undef;
+   }
+   }
+   }
for (@fields) {
if (/^([0-9]*)-([0-9]*)$/) {
if ($1 eq '' and $2 eq '') {
@@ -1110,6 +1122,9 @@ sub parse_hunk_selection {
return undef;
}
}
+   if ($invert) {
+   @selected = map { !$_ } @selected;
+   }
return \@selected;
 }
 
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 65c8c3354b..89c0e73f2b 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -410,7 +410,7 @@ test_expect_success 'setup expected diff' '
 '
 
 test_expect_success 'can reset individual lines of patch' '
-   printf "%s\n" l 2 |
+   printf "%s\n" l "^1 3" |
EDITOR=: git reset -p 2>error &&
test_must_be_empty error &&
git diff --cached HEAD >actual &&
-- 
2.16.2



[PATCH v2 0/3] add -p: select individual hunk lines

2018-03-06 Thread Phillip Wood
From: Phillip Wood 

I've added some documentation to git-add.txt for the new selection
mode and cleaned up some style issues, otherwise these are unchanged
since v1.  These patches build on top of the recount fixes in [1]. The
commit message for the first patch describes the motivation:

"When I end up editing hunks it is almost always because I want to
stage a subset of the lines in the hunk. Doing this by editing the
hunk is inconvenient and error prone (especially so if the patch is
going to be reversed before being applied). Instead offer an option
for add -p to stage individual lines. When the user presses 'l' the
hunk is redrawn with labels by the insertions and deletions and they
are prompted to enter a list of the lines they wish to stage. Ranges
of lines may be specified using 'a-b' where either 'a' or 'b' may be
omitted to mean all lines from 'a' to the end of the hunk or all lines
from 1 upto and including 'b'."

[1] 
https://public-inbox.org/git/xmqqbmg29x1n@gitster-ct.c.googlers.com/T/#m01d0f1af90f32b698e583b56f8e53b986bcec7c6

Interdiff from v1:

 Documentation/git-add.txt  |  9 +
 git-add--interactive.perl  | 19 ++-
 t/t3701-add-interactive.sh | 12 +++-
 3 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index d50fa339dc..d52acfc722 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -332,10 +332,19 @@ patch::
J - leave this hunk undecided, see next hunk
k - leave this hunk undecided, see previous undecided hunk
K - leave this hunk undecided, see previous hunk
+   l - select hunk lines to use
s - split the current hunk into smaller hunks
e - manually edit the current hunk
? - print help
 +
+If you press "l" then the hunk will be reprinted with each insertion
+or deletion labelled with a number and you will be prompted to enter
+which lines you wish to select. Individual line numbers should be
+separated by a space or comma (these can be omitted if there are fewer
+than ten labelled lines), to specify a range of lines use a dash
+between them. To invert the selection prefix it with "\^" so "^3-5,8"
+will select everything except lines 3, 4, 5 and 8.
++
 After deciding the fate for all hunks, if there is any hunk
 that was chosen, the index is updated with the selected hunks.
 +
diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index aa474c5149..9a6bcd5085 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1018,8 +1018,11 @@ sub label_hunk_lines {
my $hunk = shift;
my $i = 0;
my $labels = [ map { /^[-+]/ ? ++$i : 0 } @{$hunk->{TEXT}} ];
-   $i > 1 and @{$hunk}{qw(LABELS MAX_LABEL)} = ($labels, $i);
-   return $i > 1;
+   if ($i > 1) {
+   @{$hunk}{qw(LABELS MAX_LABEL)} = ($labels, $i);
+   return 1;
+   }
+   return 0;
 }
 
 sub select_hunk_lines {
@@ -1064,7 +1067,9 @@ sub select_hunk_lines {
# If this hunk has previously been edited add the offset delta
# of the old hunk to get the real delta from the original
# hunk.
-   $hunk->{OFS_DELTA} and $newhunk->{OFS_DELTA} += $hunk->{OFS_DELTA};
+   if ($hunk->{OFS_DELTA}) {
+   $newhunk->{OFS_DELTA} += $hunk->{OFS_DELTA};
+   }
return $newhunk;
 }
 
@@ -1135,7 +1140,9 @@ sub parse_hunk_selection {
my $hi = $2 eq '' ? $max_label : $2;
check_hunk_label($hunk, $lo) or return undef;
check_hunk_label($hunk, $hi) or return undef;
-   $hi < $lo and ($lo, $hi) = ($hi, $lo);
+   if ($hi < $lo) {
+   ($lo, $hi) = ($hi, $lo);
+   }
@selected[$lo..$hi] = (1) x (1 + $hi - $lo);
} elsif (/^([0-9]+)$/) {
check_hunk_label($hunk, $1) or return undef;
@@ -1145,7 +1152,9 @@ sub parse_hunk_selection {
return undef;
}
}
-   $invert and @selected = map { !$_ } @selected;
+   if ($invert) {
+   @selected = map { !$_ } @selected;
+   }
return \@selected;
 }
 
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 7bea4a2341..d3bce154da 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -363,6 +363,7 @@ test_expect_failure 'split hunk "add -p (no, yes, edit)"' '
 test_expect_success 'setup expected diff' '
cat >expected <<-\EOF
diff --git a/test b/test
+   index 0889435..341cc6b 100644
--- a/test
+++ b/test
@@ -1,6 +1,9 @@
@@ -385,13 +386,14 @@ test_expect_success 'can stage individual lines of patch' 
'
printf "%s\n" l "-2 4" |
EDITOR=: git add -p 2>error &&
test_must_be_empty error &&
-   

[PATCH 1/2] object.h: update flag allocation comment

2018-03-06 Thread Nguyễn Thái Ngọc Duy
Since the "flags" is shared, it's a good idea to keep track of who
uses what bit. When we need to use more flags in library code, we can
be sure it won't be re-used for another purpose by some caller.

While at there, fix the location of "5" (should be in a different
column than "4" two lines down)

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/index-pack.c | 1 +
 builtin/pack-objects.c   | 1 +
 builtin/reflog.c | 1 +
 builtin/unpack-objects.c | 1 +
 object.h | 6 +-
 5 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7e3e1a461c..b4239a633c 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -49,6 +49,7 @@ struct thread_local {
int pack_fd;
 };
 
+/* Remember to update object flag allocation in object.h */
 #define FLAG_LINK (1u<<20)
 #define FLAG_CHECKED (1u<<21)
 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5c674b2843..833126e671 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2549,6 +2549,7 @@ static void read_object_list_from_stdin(void)
}
 }
 
+/* Remember to update object flag allocation in object.h */
 #define OBJECT_ADDED (1u<<20)
 
 static void show_commit(struct commit *commit, void *data)
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 2233725315..95becf0e7e 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -52,6 +52,7 @@ struct collect_reflog_cb {
int nr;
 };
 
+/* Remember to update object flag allocation in object.h */
 #define INCOMPLETE (1u<<10)
 #define STUDYING   (1u<<11)
 #define REACHABLE  (1u<<12)
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 7235d2ffbf..fba9498ffe 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -158,6 +158,7 @@ struct obj_info {
struct object *obj;
 };
 
+/* Remember to update object flag allocation in object.h */
 #define FLAG_OPEN (1u<<20)
 #define FLAG_WRITTEN (1u<<21)
 
diff --git a/object.h b/object.h
index 87563d9056..15901d2901 100644
--- a/object.h
+++ b/object.h
@@ -29,7 +29,7 @@ struct object_array {
 /*
  * object flag allocation:
  * revision.h:  0-1026
- * fetch-pack.c:0---5
+ * fetch-pack.c:05
  * walker.c:0-2
  * upload-pack.c:   4   1119
  * builtin/blame.c:   12-13
@@ -40,6 +40,10 @@ struct object_array {
  * sha1_name.c: 20
  * list-objects-filter.c: 21
  * builtin/fsck.c:  0--3
+ * builtin/index-pack.c:2021
+ * builtin/pack-objects.c:  20
+ * builtin/reflog.c:  10--12
+ * builtin/unpack-objects.c:2021
  */
 #define FLAG_BITS  27
 
-- 
2.16.2.784.gb291bd247e



[PATCH 2/2] object.h: realign object flag allocation comment

2018-03-06 Thread Nguyễn Thái Ngọc Duy
Some new path names are too long and eat into the graph part. Move the
graph 9 columns to the right to avoid this.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 object.h | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/object.h b/object.h
index 15901d2901..6f56a86937 100644
--- a/object.h
+++ b/object.h
@@ -28,22 +28,22 @@ struct object_array {
 #define TYPE_BITS   3
 /*
  * object flag allocation:
- * revision.h:  0-1026
- * fetch-pack.c:05
- * walker.c:0-2
- * upload-pack.c:   4   1119
- * builtin/blame.c:   12-13
- * bisect.c:   16
- * bundle.c:   16
- * http-push.c:16-19
- * commit.c:   16-19
- * sha1_name.c: 20
- * list-objects-filter.c: 21
- * builtin/fsck.c:  0--3
- * builtin/index-pack.c:2021
- * builtin/pack-objects.c:  20
- * builtin/reflog.c:  10--12
- * builtin/unpack-objects.c:2021
+ * revision.h:   0-1026
+ * fetch-pack.c: 05
+ * walker.c: 0-2
+ * upload-pack.c:4   1119
+ * builtin/blame.c:12-13
+ * bisect.c:16
+ * bundle.c:16
+ * http-push.c: 16-19
+ * commit.c:16-19
+ * sha1_name.c:  20
+ * list-objects-filter.c:  21
+ * builtin/fsck.c:   0--3
+ * builtin/index-pack.c: 2021
+ * builtin/pack-objects.c:   20
+ * builtin/reflog.c:   10--12
+ * builtin/unpack-objects.c: 2021
  */
 #define FLAG_BITS  27
 
-- 
2.16.2.784.gb291bd247e



Re: [Bug] git log --show-signature print extra carriage return ^M

2018-03-06 Thread Larry Hunter
The same ^M is shown in the output of tutorial


https://www.geekality.net/2017/08/23/setting-up-gpg-signing-for-gitgithub-on-windows/

at item "4. Verify commit was signed"

I confirm the output is right (no ^M characters) with commands

git verify-commit HEAD
git -p verify-commit HEAD
git verify-commit --v HEAD
git verify-commit --raw HEAD

and wrong (ending with ^M characters) with

git  log --show-signature -1 HEAD
git  -p log --show-signature -1 HEAD

I need gpg version 2.1 or greater to generate a gpg key for my windows
system, as stated by the github documentation:

https://help.github.com/articles/generating-a-new-gpg-key/

that saves my keys in ~/AppData/Roaming/GnuPG.

2018-03-05 15:29 GMT+01:00 Johannes Schindelin :
> Hi Larry,
>
> On Sun, 4 Mar 2018, Larry Hunter wrote:
>
>> There is bug using "git log --show-signature" in my installation
>>
>> git 2.16.2.windows.1
>> gpg (GnuPG) 2.2.4
>> libgcrypt 1.8.2
>
> The gpg.exe shipped in Git for Windows should say something like this:
>
> $ gpg --version
> gpg (GnuPG) 1.4.22
> Copyright (C) 2015 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> 
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
>
> Home: ~/.gnupg
> Supported algorithms:
> Pubkey: RSA, RSA-E, RSA-S, ELG-E, DSA
> Cipher: IDEA, 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH,
> CAMELLIA128, CAMELLIA192, CAMELLIA256
> Hash: MD5, SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224
> Compression: Uncompressed, ZIP, ZLIB, BZIP2
>
> Therefore, the GNU Privacy Guard version you use is not the one shipped
> and supported by the Git for Windows project.
>
>> that prints (with colors) an extra ^M (carriage return?) at the end of
>> the gpg lines. As an example, the output of "git log --show-signature
>> HEAD" looks like:
>>
>> $ git log --show-signature HEAD
>> commit 46c490188ebd216f20c454ee61108e51b481844e (HEAD -> master)
>> gpg: Signature made 03/04/18 16:53:06 ora solare Europa occidentale^M
>> gpg:using RSA key ...^M
>> gpg: Good signature from "..." [ultimate]^M
>> Author: ... <...>
>> Date:   Sun Mar 4 16:53:06 2018 +0100
>> ...
>>
>> To help find a fix, I tested the command "git verify-commit HEAD" that
>> prints (without colors) the same lines without extra ^M characters.
>>
>> $ git verify-commit HEAD
>> gpg: Signature made 03/04/18 16:53:06 ora solare Europa occidentale
>> gpg:using RSA key ...
>> gpg: Good signature from "..." [ultimate]
>
> My guess is that the latter command simply does not go through the pager
> while the former does.
>
> Do you see the ^M in the output of `git -p verify-commit HEAD`?
>
> Ciao,
> Johannes


Re: [GSoC] [PATCH] travis-ci: added clang static analysis

2018-03-06 Thread Siddhartha Mishra
On Tue, Mar 6, 2018 at 6:57 AM, Junio C Hamano  wrote:
> SiddharthaMishra  writes:
>
>> Added a job to run clang static code analysis on the master and maint branch
>>
>> Signed-off-by: SiddharthaMishra 
>> ---
>
> Why on 'master' and 'maint' and not others?  Quite frankly, I find
> this choice of branches rather odd, as these two branches are not
> where the real development happens.  If we do not want to increase
> the number of jobs and limit the test only to a single branch, I
> would probably pick 'next', and if we can afford two, probably
> 'pu' and 'next'.

I might have been misinterpreting it, but I did so because that's what
it said in the microproject description. Thinking about it, I guess it
does makes more sense to work on the other two branches instead. Are
there any other glaring issues you see in the code?


Re: git-send-email: Support for Reply-To

2018-03-06 Thread Junio C Hamano
Christian Ludwig  writes:

> this is the third iteration of this series. There was a request to
> rebase the changes on the refactoring patch b6049542 ("send-email:
> extract email-parsing code into a subroutine", 2017-12-15). This is
> the result.

Thanks.  Let me Cc the party who did the refactoring, as it was
unclear how much value the change that did only refactoring without
having an actual user of the end result---now we do have code that
uses it.

> The diffstat is the same compared to the last revision. It could be
> made smaller by sacrificing readibility and breaking the scheme
> introduced by the refactoring patch. But I do agree that send-email's
> readability could benefit from slicing it into handy functions. And the
> refactoring patch reduces the nesting of loops/conditionals.

Thanks.

I compared the result of applying these on top of the refactoring
commit, and cherry-picking the previous round on top of the same
refactoring commit, and found that they pretty much result in the
same code (which was an exercise for me to gain confidence in this
code).