Re: Bug: rebase -i creates committer time inversions on 'reword'

2018-04-15 Thread Johannes Sixt

Am 15.04.2018 um 23:35 schrieb Junio C Hamano:

Ah, do you mean we have an internal sequence like this, when "rebase
--continue" wants to conclude an edit/reword?


Yes, it's only 'reword' that is affected, because then subsequent picks 
are processed by the original process.



  - we figure out the committer ident, which grabs a timestamp and
cache it;

  - we spawn "commit" to conclude the stopped step, letting it record
its beginning time (which is a bit older than the above) or its
ending time (which is much older due to human typing speed);


Younger in both cases, of course. According to my tests, we seem to pick 
the beginning time, because the first 'reword'ed commit typically has 
the same timestamp as the preceding picks. Later 'reword'ed commits have 
noticably younger timestamps.



  - subsequent "picks" are made in the same process, and share the
timestamp we grabbed in the first step, which is older than the
second one.

I guess we'd want a mechanism to tell ident.c layer "discard the
cached one, as we are no longer in the same automated sequence", and
use that whenever we spawn an editor (or otherwise go interactive).


Frankly, I think that this caching is overengineered (or prematurly 
optimized). If the design requires that different callers of datestamp() 
must see the same time, then the design is broken. In a fixed design, 
there would be a single call of datestamp() in advance, and then the 
timestamp, which then obviously is a very important piece of data, would 
be passed along as required.


-- Hannes


[PATCH] glossary: substitute "ancestor" for "direct ancestor" in 'push' description.

2018-04-15 Thread Sergey Organov

Even though "direct ancestor" is not defined in the glossary, the
common meaning of the term is simply "parent", parents being the only
direct ancestors, and the rest of ancestors being indirect ancestors.

As "parent" is obviously wrong in this place in the description, we
should simply say "ancestor", as everywhere else.

Signed-off-by: Sergey Organov 
---
 Documentation/glossary-content.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/glossary-content.txt 
b/Documentation/glossary-content.txt
index 6bd..6c2d23d 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -463,7 +463,7 @@ exclude;;
 [[def_push]]push::
Pushing a <> means to get the branch's
<> from a remote <>,
-   find out if it is a direct ancestor to the branch's local
+   find out if it is an ancestor to the branch's local
head ref, and in that case, putting all
objects, which are <> from the local
head ref, and which are missing from the remote
-- 
2.10.0.1.g57b01a3



Re: [PATCH] completion: reduce overhead of clearing cached --options

2018-04-15 Thread Junio C Hamano
SZEDER Gábor  writes:

> On Fri, Apr 13, 2018 at 11:44 PM, Jakub Narebski  wrote:
>> SZEDER Gábor  writes:
>>> In Bash we can do better: run the 'compgen -v __gitcomp_builtin_'
>>> builtin command, which lists the same variables, but without a
>>> pipeline and 'sed' it can do so with lower overhead.
>>
>> What about ZSH?
>
> Nothing, ZSH is unaffected by this patch.

OK, do we want to follow it up with [PATCH 2/1] to add the LC_ALL=C
thing to the ZSH side to help that "sed", then?


Re: [PATCH 8/8] gpg-interface: handle alternative signature types

2018-04-15 Thread Junio C Hamano
"brian m. carlson"  writes:

> On Tue, Apr 10, 2018 at 04:24:27AM -0400, Eric Sunshine wrote:
>> How confident are we that _all_ possible signing programs will conform
>> to the "-BEGIN %s-" pattern? If we're not confident, then
>> perhaps the user should be providing the full string here, not just
>> the '%s' part?
>
> This is not likely to be true of other signing schemes.  In fact, other
> than OpenPGP, PEM, and CMS (S/MIME), this is probably not true at all.

Hmph.  

That argues more strongly that we would regret unless we make the
end-user configuration to at least the whole string (which later can
be promoted to "a pattern that matches the whole string"), not just
the part after mandatory "-BEGIN ", methinks.



Re: [PATCH v4 0/4] Make DEVELOPER more more flexible with DEVOPTS

2018-04-15 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> This is a v4 and replacement of gitster/nd/warn-more-for-devs. I'm
> sending this with Duy's blessing.
>
> The first two patches are the same, except for one trivial
> s/faimily/family/ typo fix.
>
> The third patch in gitster/nd/warn-more-for-devs ("Makefile: add
> EAGER_DEVELOPER mode") is gone, instead there's now a DEVOPTS
> option. The 3/4 and 4/4 add a way to turn off -Werror & the -Wextra
> suppression, respectively.

Looks sensible, mostly.  Will queue.  Thanks.


Re: [PATCH v2 0/7] nd/repack-keep-pack update

2018-04-15 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> On Sat, Apr 14, 2018 at 9:47 PM, Ævar Arnfjörð Bjarmason  
> wrote:
>> The only (trivial) issue I found in the patches themselves was that
>> between 4/5 and 5/7 you're adding an empty line to config.txt in 4/7
>> just to erase it in 5/7, better not to add it to begin with, but
>> hopefully Junio can fix that up (if he cares).
>
> Fixed (and is the only change in v2) in case Junio has not picked up
> the last round and fixed it himself yet.

Thanks.


Re: [PATCH v2 3/6] generate-cmdlist.sh: keep all information in common-cmds.h

2018-04-15 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> +category_list () {
> + command_list "$1" | awk '{print $2;}' | sort | uniq
> +}

Piping output of awk to sort/uniq, instead of processing all inside
awk within the END block of the script, means that we are wasting
two processes---I do not think we care too much about it, but some
people might.


Re: [PATCH v2 2/6] git.c: implement --list-cmds=all and use it in git-completion.bash

2018-04-15 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  contrib/completion/git-completion.bash |  2 +-
>  git.c  |  2 ++
>  help.c | 18 ++
>  help.h |  1 +
>  4 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index 3556838759..a5f13ade20 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -839,7 +839,7 @@ __git_commands () {
>   then
>   printf "%s" "${GIT_TESTING_COMMAND_COMPLETION}"
>   else
> - git help -a|egrep '^  [a-zA-Z0-9]'
> + git --list-cmds=all
>   fi
>  }

To those of us who install a copy of git-completion.bash somewhere
in $HOME and forget about it, while installing different versions of
Git all the time for testing, may see breakage caused by an invalid
combination of having new completion script with Git that does not
know about --list=cmds= option.  I do not think it matters too
much, though ;-)

> +void list_all_cmds(void)
> +{
> + struct cmdnames main_cmds, other_cmds;
> + int i;
> +
> + memset(_cmds, 0, sizeof(main_cmds));
> + memset(_cmds, 0, sizeof(other_cmds));
> + load_command_list("git-", _cmds, _cmds);
> +
> + for (i = 0; i < main_cmds.cnt; i++)
> + puts(main_cmds.names[i]->name);
> + for (i = 0; i < other_cmds.cnt; i++)
> + puts(other_cmds.names[i]->name);
> +
> + clean_cmdnames(_cmds);
> + clean_cmdnames(_cmds);
> +}
> +

OK.

By reusing load_command_list(), the duplicate-removal logic at its
end that is used for "help -a" kicks in; the above looks good.


Re: [PATCH v7 2/4] worktree: improve message when creating a new worktree

2018-04-15 Thread Junio C Hamano
Thomas Gummerer  writes:

> Currently 'git worktree add' produces output like the following:
>
> Preparing ../foo (identifier foo)
> HEAD is now at 26da330922 
>
> The '../foo' is the path where the worktree is created, which the user
> has just given on the command line.  The identifier is an internal
> implementation detail, which is not particularly relevant for the user
> and indeed isn't mentioned explicitly anywhere in the man page.

OK.  Maybe there once was a place or two that took the identifier as
an input to name a specific worktree to work on (perhaps "prune"?),
but if we no longer do that (which makes sense, as we should be able
to uniquely identify a worktree by the path to it), then it makes
perfect sense to prevent it from appearing in the UI.

> Instead of this message, print a message that gives the user a bit more
> detail of what exactly 'git worktree' is doing.  There are various dwim
> modes which are perform some magic under the hood, which should be

s/are perform/perform/, I think (no need to reroll, only to fix this).

> Additionally currently the "Preparing ..." line is printed to stderr,
> while the "HEAD is now at ..." line is printed to stdout by 'git reset
> --hard', which is used internally by 'git worktree add'.  Fix this
> inconsistency by printing the "Preparing ..." message to stdout as
> well.  As "Preparing ..." is not an error, stdout also seems like the
> more appropriate output stream.

I think it is a bug for reset to give this kind of informational
message to the standard output stream.  A rule of thumb I use is "is
this something that a user who wants quiet operation would wish to
squelch with --quiet option?" and if the answer is yes, it should go
to the standard error output, so info and progress should go to the
standard output in general.

But I am OK to unify to the standard output with this patch.  We may
want to come back and correct _both_ this new message and what reset
says, but that is outside the scope of this topic.

Thanks.


Re: Optimizing writes to unchanged files during merges?

2018-04-15 Thread Linus Torvalds
On Sun, Apr 15, 2018 at 6:44 PM, Junio C Hamano  wrote:
>
> I think Elijah's corrected was_tracked() also does not care "has
> this been renamed".

I'm perfectly happy with the slightly smarter patches. My patch was
really just an RFC and because I had tried it out.

> One thing that makes me curious is what happens (and what we want to
> happen) when such a "we already have the changes the side branch
> tries to bring in" path has local (i.e. not yet in the index)
> changes.  For a dirty file that trivially merges (e.g. a path we
> modified since our histories forked, while the other side didn't do
> anything, has local changes in the working tree), we try hard to
> make the merge succeed while keeping the local changes, and we
> should be able to do the same in this case, too.

I think it might be nice, but probably not really worth it.

I find the "you can merge even if some files are dirty" to be really
convenient, because I often keep stupid test patches in my tree that I
may not even intend to commit, and I then use the same tree for
merging.

For example, I sometimes end up editing the Makefile for the release
version early, but I won't *commit* that until I actually cut the
release. But if I pull some branch that has also changed the Makefile,
it's not worth any complexity to try to be nice about the dirty state.

If it's a file that actually *has* been changed in the branch I'm
merging, and I'm more than happy to just stage the patch (or throw it
away - I think it's about 50:50 for me).

So I don't think it's a big deal, and I'd rather have the merge fail
very early with "that file has seen changes in the branch you are
merging" than add any real complexity to the merge logic.

Linus


Re: Optimizing writes to unchanged files during merges?

2018-04-15 Thread Junio C Hamano
Linus Torvalds  writes:

> How about we take a completely different approach? Instead of relying
> on fragile (but clever) tests, why not rely on stupid brute force?
>
> Yeah, yeah, it's bad to be stupid, but sometimes simple and stupid
> really does work.
>
> See the attached patch. It gets rid of all the subtle "has this been
> renamed" tests entirely, and just _always_ does that final
> update_file().

I think Elijah's corrected was_tracked() also does not care "has
this been renamed".

One thing that makes me curious is what happens (and what we want to
happen) when such a "we already have the changes the side branch
tries to bring in" path has local (i.e. not yet in the index)
changes.  For a dirty file that trivially merges (e.g. a path we
modified since our histories forked, while the other side didn't do
anything, has local changes in the working tree), we try hard to
make the merge succeed while keeping the local changes, and we
should be able to do the same in this case, too.  

Your illustration patch that reads from the working tree to compare
with the contents we are about to write out of course won't cope
with this case ;-).  If we do things in the index like the approach
to fix was_tracked(), I suspect that we would just say "as long as
the index and the HEAD matches, i.e. we are keeping the pathas it is
found in HEAD as the merge result, we do not touch the working tree"
and leave such a local modification alone.



Re: [PATCH] t1510-repo-setup.sh: rm useless mkdir

2018-04-15 Thread Junio C Hamano
qingyu...@foxmail.com writes:

> From: Tao Qingyun <845767...@qq.com>
>
> Signed-off-by: Tao Qingyun <845767...@qq.com>
> ---
>  t/t1510-repo-setup.sh | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/t/t1510-repo-setup.sh b/t/t1510-repo-setup.sh
> index e6854b828..972bd9c78 100755
> --- a/t/t1510-repo-setup.sh
> +++ b/t/t1510-repo-setup.sh
> @@ -238,7 +238,6 @@ test_expect_success '#0: nonbare repo, no explicit 
> configuration' '
>  '
>  
>  test_expect_success '#1: GIT_WORK_TREE without explicit GIT_DIR is accepted' 
> '
> - mkdir -p wt &&
>   try_repo 1 "$here" unset unset "" unset \
>   "$here/1/.git" "$here" "$here" 1/ \
>   "$here/1/.git" "$here" "$here" 1/sub/ 2>message &&

Interesting.  

This was added by 4868b2ea ("setup: officially support --work-tree
without --git-dir", 2011-01-19), together with others like 9/wt that
do get used.  I am not sure if the old codebase needed this directory
created and over time we lost the need for it, or what.


Re: [PATCH v1] fsmonitor: fix incorrect buffer size when printing version number

2018-04-15 Thread Junio C Hamano
Ben Peart  writes:

> On 4/10/2018 4:17 PM, Eric Sunshine wrote:
>> On Tue, Apr 10, 2018 at 2:43 PM, Ben Peart  wrote:
>>> This is a trivial bug fix for passing the incorrect size to snprintf() when
>>> outputing the version.  It should be passing the size of the destination 
>>> buffer
>>
>> s/outputing/outputting/
>>
>>> rather than the size of the value being printed.
>>>
>>> Signed-off-by: Ben Peart 
>
> Junio, do you want a re-roll of this simple fix or can you fix the
> spelling of the commit message?

I think I already have that typofix in what is queued.

If there is anything more, I'd prefer to replace with a reroll ;-)

Thanks.


Re: [PATCH v9 29.75/30] merge-recursive: Fix was_tracked() to quit lying with some renamed paths

2018-04-15 Thread Junio C Hamano
Elijah Newren  writes:

> @@ -362,13 +363,17 @@ static int git_merge_trees(struct merge_options *o,
>   init_tree_desc_from_tree(t+2, merge);
>  
>   rc = unpack_trees(3, t, >unpack_opts);
> + cache_tree_free(_cache_tree);
> +
> + o->orig_index = the_index;
> + the_index = tmp_index;
> +
>   /*
> -  * unpack_trees NULLifies src_index, but it's used in verify_uptodate,
> -  * so set to the new index which will usually have modification
> -  * timestamp info copied over.
> +  * src_index is used in verify_uptodate, but was NULLified in
> +  * unpack_trees, so we need to set it back to the original index.
>*/

Was NULLified?  I thought that the point of src/dst distinction
Linus introduced long time ago at 34110cd4 ("Make 'unpack_trees()'
have a separate source and destination index", 2008-03-06) was that
we can then keep the source side of the traversal unmodified.

> - o->unpack_opts.src_index = _index;
> - cache_tree_free(_cache_tree);
> + o->unpack_opts.src_index = >orig_index;

> -static int was_tracked(const char *path)
> +/*
> + * Returns whether path was tracked in the index before the merge started
> + */
> +static int was_tracked(struct merge_options *o, const char *path)
>  {
> - int pos = cache_name_pos(path, strlen(path));
> + int pos = index_name_pos(>orig_index, path, strlen(path));
>  
>   if (0 <= pos)
> - /* we have been tracking this path */
> + /* we were tracking this path before the merge */
>   return 1;
>  
> - /*
> -  * Look for an unmerged entry for the path,
> -  * specifically stage #2, which would indicate
> -  * that "our" side before the merge started
> -  * had the path tracked (and resulted in a conflict).
> -  */
> - for (pos = -1 - pos;
> -  pos < active_nr && !strcmp(path, active_cache[pos]->name);
> -  pos++)
> - if (ce_stage(active_cache[pos]) == 2)
> - return 1;
>   return 0;
>  }

I do agree with the simplicity of the new code that directly asks
exactly what we want to ask.  However, there is one thing that is
puzzling below...

>  static int would_lose_untracked(const char *path)
>  {
> - return !was_tracked(path) && file_exists(path);
> + /*
> +  * This may look like it can be simplified to:
> +  *   return !was_tracked(o, path) && file_exists(path)
> +  * but it can't.  This function needs to know whether path was
> +  * in the working tree due to EITHER having been tracked in the
> +  * index before the merge OR having been put into the working copy
> +  * and index by unpack_trees().  Due to that either-or requirement,
> +  * we check the current index instead of the original one.
> +  */

If this path was created by merge-recursive, not by unpack_trees(),
what does this function want to say?  Say, we are looking at path P,
the other branch we are merging moved some other path Q to P (while
our side modified contents at path Q).  Then path P we are looking
at has contents of Q at the merge base at stage #1, the contents of
Q from our HEAD at stage #2 and the contents of P from the other
branch at stage #3.  The code below says "path P is OK, we won't
lose it" in such a case, but it is unclear if the above comment
wants to also cover that case.

> + int pos = cache_name_pos(path, strlen(path));
> +
> + if (pos < 0)
> + pos = -1 - pos;
> + while (pos < active_nr &&
> +!strcmp(path, active_cache[pos]->name)) {
> + /*
> +  * If stage #0, it is definitely tracked.
> +  * If it has stage #2 then it was tracked
> +  * before this merge started.  All other
> +  * cases the path was not tracked.
> +  */
> + switch (ce_stage(active_cache[pos])) {
> + case 0:
> + case 2:
> + return 0;
> + }
> + pos++;
> + }
> + return file_exists(path);
>  }
>  
>  static int was_dirty(struct merge_options *o, const char *path)


Re: [PATCH v8 04/14] graph: add commit graph design document

2018-04-15 Thread Jakub Narebski
Derrick Stolee  writes:

> +Future Work
> +---
> +
> +- The commit graph feature currently does not honor commit grafts. This can
> +  be remedied by duplicating or refactoring the current graft logic.

The problem in my opinion lies in different direction, namely that
commit grafts can change, changing the view of the history.  If we want
commit-graph file to follow user-visible view of the history of the
project, it needs to respect current version of commit grafts - but what
if commit grafts changed since commit-graph file was generated?

Actually, there are currently three ways to affect the view of the
history:

a. legacy commit grafts mechanism; it was first, but it is not safe,
   cannot be transferred on fetch / push, and is now deprecated.

b. shallow clones, which are kind of specialized and limited grafts;
   they used to limit available functionality, but restrictions are
   being lifted (or perhaps even got lifted)

c. git-replace mechanism, where we can create an "overlay" of any
   object, and is intended to be among others replacement for commit
   grafts; safe, transferable, can be turned off with "git
   --no-replace-objects "

All those can change; some more likely than others.  The problem is if
they change between writing commit-graph file (respecting old view of
the history) and reading it (where we expect to see the new view).

a. grafts file can change: lines can be added, removed or changed

b. shallow clones can be deepened or shortened, or even make
   not shallow

c. new replacements can be added, old removed, and existing edited


There are, as far as I can see, two ways of handling the issue of Git
features that can change the view of the project's history, namely:

 * Disable commit-graph reading when any of this features are used, and
   always write full graph info.

   This may not matter much for shallow clones, where commit count
   should be small anyway, but when using git-replace to stitch together
   current repository with historical one, commit-graph would be
   certainly useful.  Also, git-replace does not need to change history.

   On the other hand I think it is the easier solution.

Or

 * Detect somehow that the view of the history changed, and invalidate
   commit-graph (perhaps even automatically regenerate it).

   For shallow clone changes I think one can still use the old
   commit-graph file to generate the new one.  For other cases, the
   metadata is simple to modify, but indices such as generation number
   would need to be at least partially calculated anew.

Happily, you don't need to do this now.  It can be done in later series;
on the other hand this would be required before the switch can be turned
from "default off" to "default on" for commit-graph feature (configured
with core.commitGraph).

So please keep up the good work with sending nicely digestible patch
series.

> +
> +- The 'commit-graph' subcommand does not have a "verify" mode that is
> +  necessary for integration with fsck.

The "read" mode has beginnings of "verify", or at least "fsck", isn't
it?

[...]
> +- The current design uses the 'commit-graph' subcommand to generate the 
> graph.
> +  When this feature stabilizes enough to recommend to most users, we should
> +  add automatic graph writes to common operations that create many commits.
> +  For example, one could compute a graph on 'clone', 'fetch', or 'repack'
> +  commands.

Automatic is good ("git gc --auto").

But that needs handling of view chaning features such as commit grafts,
isn't it?

> +
> +- A server could provide a commit graph file as part of the network protocol
> +  to avoid extra calculations by clients. This feature is only of benefit if
> +  the user is willing to trust the file, because verifying the file is 
> correct
> +  is as hard as computing it from scratch.

Should server send different commit-graph file / info depending on
whether client fetches from refs/replaces/* nameespace?

> +
> +Related Links
> +-

Thank you for providing them (together with summary).

> +[0] https://bugs.chromium.org/p/git/issues/detail?id=8
> +Chromium work item for: Serialized Commit Graph
> +
> +[1] 
> https://public-inbox.org/git/20110713070517.gc18...@sigill.intra.peff.net/
> +An abandoned patch that introduced generation numbers.
> +
> +[2] 
> https://public-inbox.org/git/20170908033403.q7e6dj7benasr...@sigill.intra.peff.net/
> +Discussion about generation numbers on commits and how they interact
> +with fsck.
> +
> +[3] 
> https://public-inbox.org/git/20170908034739.4op3w4f2ma5s6...@sigill.intra.peff.net/
> +More discussion about generation numbers and not storing them inside
> +commit objects. A valuable quote:
> +
> +"I think we should be moving more in the direction of keeping
> + repo-local caches for optimizations. Reachability bitmaps have been
> + a big performance win. I think we should be doing the same with our
> + properties of 

Re: .gitattributes lookup doesn't respect GIT_WORK_TREE

2018-04-15 Thread Andreas Schwab
On Apr 16 2018, Junio C Hamano  wrote:

> I may be mistaken (I do not have the code in front of me right now)
> but IIRC after the setup.c code runs (which happens quite early in
> the sequence that starts from git.c::cmd_main()), the Git process
> moves to the top level of the working tree,

git log/show don't appear to do that.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll

2018-04-15 Thread Junio C Hamano
Junio C Hamano  writes:

> I think you identified the problem and diagnosed it correctly, but I
> find that the change proposed here introduces a severe layering
> violation.  The code is still calling what is called poll(), which
> should not have such a broken semantics.

I only mentioned a piece of fact (i.e. "the code calls poll() after
the patch"), but I guess I should have made it clear what makes that
a bad thing.  Future readers of the code in daemon.c are required to
be aware of the limitation of some poll() emulation; they cannot
"optimize" out and made the code unware of the (non-)existence of
remaining children, for example.  When the callsite uses poll(),
those who know how poll() ought to work won't be.  The reason why
the xpoll() I mentioned as a possible alternative would be better is
because they will learn why we do not use normal poll() there and
why we maintain and pass live_children (and those who cut and paste
without understanding the existing code _will_ copy the calling site
of xpoll(), which will automatically copy the need to maintain the
number of remaining children ;-).




Draft of Git Rev News edition 38

2018-04-15 Thread Christian Couder
Hi,

A draft of a new Git Rev News edition is available here:

  https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-38.md

Everyone is welcome to contribute in any section either by editing the
above page on GitHub and sending a pull request, or by commenting on
this GitHub issue:

  https://github.com/git/git.github.io/issues/285

You can also reply to this email.

In general all kinds of contribution, for example proofreading,
suggestions for articles or links, help on the issues in GitHub, and
so on, are very much appreciated.

I tried to cc everyone who appears in this edition, but maybe I missed
some people, sorry about that.

Jakub, Markus, Gabriel and me plan to publish this edition on
Wednesday April 18th.

Thanks,
Christian.


Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll

2018-04-15 Thread Junio C Hamano
Kim Gybels  writes:

> The poll provided in compat/poll.c is not interrupted by receiving
> SIGCHLD. Use a timeout for cleaning up dead children in a timely manner.

I think you identified the problem and diagnosed it correctly, but I
find that the change proposed here introduces a severe layering
violation.  The code is still calling what is called poll(), which
should not have such a broken semantics.

The ideal solution would be to fix the emulation so that it also
properly works for reaping a dead child process, but if that is not
possible, another solution that does not break the API layering
would probably be to introduce our own version of something similar
to poll() that helps various platforms that cannot implement the
real poll() faithfully for whatever reason.  Such an xpoll() API
function we introduce (and implement in compat/poll.c) may take, in
addition to the usual parameters to reall poll(), the value of
live_children we have at this call site.  With that

 - On platforms whose poll() does work correctly for culling dead
   children will just ignore the live_children paramater in its
   implementation of xpoll()

 - On other platforms, it will shorten the timeout depending on the
   need to cull dead children, just like your patch did.

Thanks.


>
> Signed-off-by: Kim Gybels 
> ---
>  daemon.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/daemon.c b/daemon.c
> index fe833ea7de..6dc95c1b2f 100644
> --- a/daemon.c
> +++ b/daemon.c
> @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist)
>  {
>   struct pollfd *pfd;
>   int i;
> + int poll_timeout = -1;
>  
>   pfd = xcalloc(socklist->nr, sizeof(struct pollfd));
>  
> @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist)
>   int i;
>  
>   check_dead_children();
> -
> - if (poll(pfd, socklist->nr, -1) < 0) {
> +#ifdef NO_POLL
> + poll_timeout = live_children ? 100 : -1;
> +#endif
> + int ret = poll(pfd, socklist->nr, poll_timeout);
> + if  (ret == 0) {
> + continue;
> + } else if (ret < 0) {
>   if (errno != EINTR) {
>   logerror("Poll failed, resuming: %s",
> strerror(errno));


Re: Bug: rebase -i creates committer time inversions on 'reword'

2018-04-15 Thread Junio C Hamano
Johannes Sixt  writes:

> I just noticed that all commits in a 70-commit branch have the same
> committer timestamp. This is very unusual on Windows, where rebase -i of
> such a long branch takes more than one second (but not more than 3 or
> so thanks to the builtin nature of the command!).
>
> And, in fact, if you mark some commits with 'reword' to delay the quick
> processing of the patches, then the reworded commits have later time
> stamps, but subsequent not reworded commits receive the earlier time
> stamp. This is clearly not intended.

Hmm, I may be missing something without enough caffeine but I am
puzzled how that would be possible.  With a "few picks, an edit, and
a yet more picks" sequence, the first picks may share the same
timestamp due to the git_default_date caching (which I think is a
deliberate design choice we made), an edit that stops will let the
concluding "commit" (either by the end user or invoked internally
via "rebase --continue"), but because that process restarts afresh,
the commits made by "yet more picks" cannot share the timestamp that
was cached for the earliest ones from the same series, no?

Ah, do you mean we have an internal sequence like this, when "rebase
--continue" wants to conclude an edit/reword?

 - we figure out the committer ident, which grabs a timestamp and
   cache it;

 - we spawn "commit" to conclude the stopped step, letting it record
   its beginning time (which is a bit older than the above) or its
   ending time (which is much older due to human typing speed);

 - subsequent "picks" are made in the same process, and share the
   timestamp we grabbed in the first step, which is older than the
   second one.

I guess we'd want a mechanism to tell ident.c layer "discard the
cached one, as we are no longer in the same automated sequence", and
use that whenever we spawn an editor (or otherwise go interactive).

>
> Perhaps something like this below is needed.
>
> diff --git a/ident.c b/ident.c
> index 327abe557f..2c6bff7b9d 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -178,8 +178,8 @@ const char *ident_default_email(void)
>  
>  static const char *ident_default_date(void)
>  {
> - if (!git_default_date.len)
> - datestamp(_default_date);
> + strbuf_reset(_default_date);
> + datestamp(_default_date);
>   return git_default_date.buf;
>  }
>  


Re: [PATCH/RFC 0/5] Keep all info in command-list.txt in git binary

2018-04-15 Thread Philip Oakley

From: "Duy Nguyen"  : Saturday, April 14, 2018 4:44 PM

On Thu, Apr 12, 2018 at 12:06 AM, Philip Oakley 
wrote:

I'm only just catching up, but does/can this series also capture the
non-command guides that are available in git so that the 'git help -g'
can
begin to list them all?


It currently does not. But I don't see why it should not. This should
allow git.txt to list all the guides too, for people who skip "git
help" and go hard core mode with "man git". Thanks for bringing this
up.
--
Duy


Is that something I should add to my todo to add a 'guide' category etc.?

A quick search of public-inbox suggests
https://public-inbox.org/git/1361660761-1932-1-git-send-email-philipoak...@iee.org/
as being where I first made the suggestions, but it got trimmed back to not
update (be embedded in) the command-list.txt

Philip



Re: Feature Request: Add diff.word-diff and perhaps diff.word-diff-regex configuration options to enable always using word-diffs in git diff

2018-04-15 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> On Sat, Apr 14 2018, Doron Behar wrote:
>
>> I've just came across the wonderful command line option for `git diff`:
>> `--word-diff`. It could be great to have a configuration option that
>> will enable this feature by default when running `git diff`.
>
> Do you know you can do:
>
> git config --global alias.wdiff "diff --word-diff"

Correct, but it is a "you could instead do it this way" that is not
accompanied by a "you do not want to do what you said you want to
because.." answer.  Having the latter is often helpful to learn why
the suggestion is actually not a mere "you could" but is a "you are
better off doing it this way".

Even though it is discouraged, people script using the Porcelain
"git diff" command, and once users of such a configuration variable
is used in Doron's fork of Git, those scripts will break for them.

Using alias like you showed won't have such a problem.


Re: .gitattributes lookup doesn't respect GIT_WORK_TREE

2018-04-15 Thread Junio C Hamano
Andreas Schwab  writes:

> The functions in attr.c do not look at $GIT_WORK_TREE when trying to
> find the .gitattributes file.  Thus if you are not inside the work tree,
> but have GIT_WORK_TREE set attribute lookup will be wrong.

I may be mistaken (I do not have the code in front of me right now)
but IIRC after the setup.c code runs (which happens quite early in
the sequence that starts from git.c::cmd_main()), the Git process
moves to the top level of the working tree, and at that point there
is no reason to pay attention to $GIT_WORK_TREE or other mechanisms
that places the working tree at a non-standard location, because the
attributes (or any in-tree files like .gitignore or .gitmodules)
will be found relative to the current working directory no matter
what.  Did we break that recently?



[PATCH v7 4/4] worktree: teach "add" to check out existing branches

2018-04-15 Thread Thomas Gummerer
Currently 'git worktree add ' creates a new branch named after the
basename of the path by default.  If a branch with that name already
exists, the command refuses to do anything, unless the '--force' option
is given.

However we can do a little better than that, and check the branch out if
it is not checked out anywhere else.  This will help users who just want
to check an existing branch out into a new worktree, and save a few
keystrokes.

As the current behaviour is to simply 'die()' when a branch with the name
of the basename of the path already exists, there are no backwards
compatibility worries here.

We will still 'die()' if the branch is checked out in another worktree,
unless the --force flag is passed.

Signed-off-by: Thomas Gummerer 
---
 Documentation/git-worktree.txt |  9 +++--
 builtin/worktree.c | 30 --
 t/t2025-worktree-add.sh| 26 +++---
 3 files changed, 50 insertions(+), 15 deletions(-)

diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index 41585f535d..5d54f36a71 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -61,8 +61,13 @@ $ git worktree add --track -b   
/
 
 +
 If `` is omitted and neither `-b` nor `-B` nor `--detach` used,
-then, as a convenience, a new branch based at HEAD is created automatically,
-as if `-b $(basename )` was specified.
+then, as a convenience, the new worktree is associated with a branch
+(call it ``) named after `$(basename )`.  If ``
+doesn't exist, a new branch based on HEAD is automatically created as
+if `-b ` was given.  If `` does exist, it will be
+checked out in the new worktree, if it's not checked out anywhere
+else, otherwise the command will refuse to create the worktree (unless
+`--force` is used).
 
 list::
 
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 893139629a..f5a5283b39 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -355,9 +355,12 @@ static int add_worktree(const char *path, const char 
*refname,
 
 static void print_preparing_worktree_line(const char *branch,
  const char *new_branch,
- const char *new_branch_force)
+ const char *new_branch_force,
+ int checkout_existing_branch)
 {
-   if (new_branch_force) {
+   if (checkout_existing_branch) {
+   printf_ln(_("Preparing worktree (checking out '%s')"), branch);
+   } else if (new_branch_force) {
struct commit *commit = 
lookup_commit_reference_by_name(new_branch_force);
if (!commit)
printf_ln(_("Preparing worktree (new branch '%s')"), 
new_branch_force);
@@ -378,11 +381,23 @@ static void print_preparing_worktree_line(const char 
*branch,
}
 }
 
-static const char *dwim_branch(const char *path, const char **new_branch)
+static const char *dwim_branch(const char *path, const char **new_branch,
+  int *checkout_existing_branch)
 {
int n;
const char *s = worktree_basename(path, );
-   *new_branch = xstrndup(s, n);
+   const char *branchname = xstrndup(s, n);
+   struct strbuf ref = STRBUF_INIT;
+
+   if (!strbuf_check_branch_ref(, branchname) &&
+   ref_exists(ref.buf)) {
+   *checkout_existing_branch = 1;
+   strbuf_release();
+   UNLEAK(branchname);
+   return branchname;
+   }
+
+   *new_branch = branchname;
if (guess_remote) {
struct object_id oid;
const char *remote =
@@ -397,6 +412,7 @@ static int add(int ac, const char **av, const char *prefix)
struct add_opts opts;
const char *new_branch_force = NULL;
char *path;
+   int checkout_existing_branch = 0;
const char *branch;
const char *new_branch = NULL;
const char *opt_track = NULL;
@@ -444,7 +460,8 @@ static int add(int ac, const char **av, const char *prefix)
}
 
if (ac < 2 && !new_branch && !opts.detach) {
-   const char *s = dwim_branch(path, _branch);
+   const char *s = dwim_branch(path, _branch,
+   _existing_branch);
if (s)
branch = s;
}
@@ -464,7 +481,8 @@ static int add(int ac, const char **av, const char *prefix)
}
}
 
-   print_preparing_worktree_line(branch, new_branch, new_branch_force);
+   print_preparing_worktree_line(branch, new_branch, new_branch_force,
+ checkout_existing_branch);
 
if (new_branch) {
struct child_process cp = CHILD_PROCESS_INIT;
diff --git a/t/t2025-worktree-add.sh b/t/t2025-worktree-add.sh
index 2b95944973..ad38507d00 100755
--- 

[PATCH v7 1/4] worktree: remove extra members from struct add_opts

2018-04-15 Thread Thomas Gummerer
There are two members of 'struct add_opts', which are only used inside
the 'add()' function, but being part of 'struct add_opts' they are
needlessly also passed to the 'add_worktree' function.

Make them local to the 'add()' function to make it clearer where they
are used.

Signed-off-by: Thomas Gummerer 
---
 builtin/worktree.c | 32 +++-
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 7cef5b120b..4d96a21a45 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -27,8 +27,6 @@ struct add_opts {
int detach;
int checkout;
int keep_locked;
-   const char *new_branch;
-   int force_new_branch;
 };
 
 static int show_only;
@@ -363,10 +361,11 @@ static int add(int ac, const char **av, const char 
*prefix)
const char *new_branch_force = NULL;
char *path;
const char *branch;
+   const char *new_branch = NULL;
const char *opt_track = NULL;
struct option options[] = {
OPT__FORCE(, N_("checkout  even if already 
checked out in other worktree")),
-   OPT_STRING('b', NULL, _branch, N_("branch"),
+   OPT_STRING('b', NULL, _branch, N_("branch"),
   N_("create a new branch")),
OPT_STRING('B', NULL, _branch_force, N_("branch"),
   N_("create or reset a branch")),
@@ -384,7 +383,7 @@ static int add(int ac, const char **av, const char *prefix)
memset(, 0, sizeof(opts));
opts.checkout = 1;
ac = parse_options(ac, av, prefix, options, worktree_usage, 0);
-   if (!!opts.detach + !!opts.new_branch + !!new_branch_force > 1)
+   if (!!opts.detach + !!new_branch + !!new_branch_force > 1)
die(_("-b, -B, and --detach are mutually exclusive"));
if (ac < 1 || ac > 2)
usage_with_options(worktree_usage, options);
@@ -395,33 +394,32 @@ static int add(int ac, const char **av, const char 
*prefix)
if (!strcmp(branch, "-"))
branch = "@{-1}";
 
-   opts.force_new_branch = !!new_branch_force;
-   if (opts.force_new_branch) {
+   if (new_branch_force) {
struct strbuf symref = STRBUF_INIT;
 
-   opts.new_branch = new_branch_force;
+   new_branch = new_branch_force;
 
if (!opts.force &&
-   !strbuf_check_branch_ref(, opts.new_branch) &&
+   !strbuf_check_branch_ref(, new_branch) &&
ref_exists(symref.buf))
die_if_checked_out(symref.buf, 0);
strbuf_release();
}
 
-   if (ac < 2 && !opts.new_branch && !opts.detach) {
+   if (ac < 2 && !new_branch && !opts.detach) {
int n;
const char *s = worktree_basename(path, );
-   opts.new_branch = xstrndup(s, n);
+   new_branch = xstrndup(s, n);
if (guess_remote) {
struct object_id oid;
const char *remote =
-   unique_tracking_name(opts.new_branch, );
+   unique_tracking_name(new_branch, );
if (remote)
branch = remote;
}
}
 
-   if (ac == 2 && !opts.new_branch && !opts.detach) {
+   if (ac == 2 && !new_branch && !opts.detach) {
struct object_id oid;
struct commit *commit;
const char *remote;
@@ -430,25 +428,25 @@ static int add(int ac, const char **av, const char 
*prefix)
if (!commit) {
remote = unique_tracking_name(branch, );
if (remote) {
-   opts.new_branch = branch;
+   new_branch = branch;
branch = remote;
}
}
}
 
-   if (opts.new_branch) {
+   if (new_branch) {
struct child_process cp = CHILD_PROCESS_INIT;
cp.git_cmd = 1;
argv_array_push(, "branch");
-   if (opts.force_new_branch)
+   if (new_branch_force)
argv_array_push(, "--force");
-   argv_array_push(, opts.new_branch);
+   argv_array_push(, new_branch);
argv_array_push(, branch);
if (opt_track)
argv_array_push(, opt_track);
if (run_command())
return -1;
-   branch = opts.new_branch;
+   branch = new_branch;
} else if (opt_track) {
die(_("--[no-]track can only be used if a new branch is 
created"));
}
-- 
2.17.0.252.gfe0a9eaf31



[PATCH v7 0/4] worktree: teach "add" to check out existing branches

2018-04-15 Thread Thomas Gummerer
Thanks Eric for the review and all of the suggestions in the last
round.

Previous rounds are at <20180121120208.12760-1-t.gumme...@gmail.com>,
<20180204221305.28300-1-t.gumme...@gmail.com>,
<20180317220830.30963-1-t.gumme...@gmail.com>,
<2018031719.4940-1-t.gumme...@gmail.com>,
<20180325134947.25828-1-t.gumme...@gmail.com> and
<20180331151804.30380-1-t.gumme...@gmail.com>.

The main change once again in this series is the user interface.  It
feels like we went in a complete circle here now, as this iteration is
bringing the "Preparing ..." line back (although in a slightly
different form than the original), and is moving away from printing
it's own "HEAD is now at..." line again.  This also means we don't
need the new hidden option to 'git reset' anymore, which is nice.

I do like the new UI more than what we had in the last round (which I
already liked more than the original UI) :)

Other than those changes, it also includes Eric's suggestion for a
better wording in the documentation, fixes the argument order to
test_cmd_rev in a test, and makes a test more robust.

To demonstrate the UI updates, let's compare the new UI and the old UI
again.  This is the same commands as used for the demonstration in the
last iteration, so please have a look at 
<20180331151804.30380-1-t.gumme...@gmail.com> 
for an example of what it looked like after the last round.

New UI:

 - guess-remote mode

$ git worktree add --guess-remote ../next
Preparing worktree (new branch 'next')
Branch 'next' set up to track remote branch 'next' from 'origin'.
HEAD is now at caa68db14 Merge branch 'sb/packfiles-in-repository' into next

 - original dwim (create a branch based on the current HEAD)

$ git worktree add ../test
Preparing worktree (new branch 'test')
HEAD is now at c2a499e6c Merge branch 'jh/partial-clone'

 - new dwim (check out existing branch)

$ git worktree add ../test
Preparing worktree (checking out 'test')
HEAD is now at c2a499e6c Merge branch 'jh/partial-clone'

 - detached HEAD

$ git worktree add ../test2 origin/master
Preparing worktree (detached HEAD c2a499e6c)
HEAD is now at c2a499e6c Merge branch 'jh/partial-clone'

 - resetting existing branch

$ git worktree add -B next ../test3 origin/master
Preparing worktree (resetting branch 'next' (was at caa68db14))
Branch 'next' set up to track remote branch 'master' from 'origin'.
HEAD is now at c2a499e6c Merge branch 'jh/partial-clone'

This output is new in this round and wasn't previously discussed.
While working on the "Preparing ..." line I noticed this, and
thought it would be a good idea.  I feel like this fits in the
scope of the series quite well, as it's improving the UI, but I'm
happy to split it out if that's preferred.

It may also be worth displaying the title of the commit here, but
at that point the line would get a bit long, so dunno.

 - large repository (with original dwim)

$ git worktree add ../test
Preparing worktree (new branch 'test')
Checking out files: 100% (62915/62915), done.
HEAD is now at c2a9838452a4 Merge tag 'for-4.16/dm-fixes-4' of 
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

 - error when directory already exists

$ git worktree add ../test
Preparing worktree (checking out 'test')
fatal: '../test' already exists

Compare this to the old UI (new dwim omitted, as there's no old
version of that):

 - guess-remote mode

$ git worktree add --guess-remote ../next
Branch 'next' set up to track remote branch 'next' from 'origin'.
Preparing ../next (identifier next)
HEAD is now at caa68db14 Merge branch 'sb/packfiles-in-repository' into next

 - original dwim (create a branch based on the current HEAD)

$ git worktree add ../test
Preparing ../test (identifier test)
HEAD is now at c2a499e6c Merge branch 'jh/partial-clone'

 - no new branch created

$ git worktree add ../test2 origin/master
Preparing ../test2 (identifier test2)
HEAD is now at c2a499e6c Merge branch 'jh/partial-clone'

 - large repository

$ git worktree add ../test
Preparing ../test (identifier test)
Checking out files: 100% (62915/62915), done.
HEAD is now at c2a9838452a4 Merge tag 'for-4.16/dm-fixes-4' of 
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

 - error when directory already exists

$ git worktree add ../test
fatal: '../test' already exists

Interdiff below:

diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index eaa6bf713f..5d54f36a71 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -61,13 +61,13 @@ $ git worktree add --track -b   
/
 
 +
 If `` is omitted and neither `-b` nor `-B` nor `--detach` used,
-then, as a convenience, a worktree with a branch named after
-`$(basename )` (call it ``) is created.  If ``
+then, as a convenience, the new worktree is 

[PATCH v7 2/4] worktree: improve message when creating a new worktree

2018-04-15 Thread Thomas Gummerer
Currently 'git worktree add' produces output like the following:

Preparing ../foo (identifier foo)
HEAD is now at 26da330922 

The '../foo' is the path where the worktree is created, which the user
has just given on the command line.  The identifier is an internal
implementation detail, which is not particularly relevant for the user
and indeed isn't mentioned explicitly anywhere in the man page.

Instead of this message, print a message that gives the user a bit more
detail of what exactly 'git worktree' is doing.  There are various dwim
modes which are perform some magic under the hood, which should be
helpful to users.  Just from the output of the command it is not always
visible to users what exactly has happened.

Help the users a bit more by modifying the "Preparing ..." message and
adding some additional information of what 'git worktree add' did under
the hood, while not displaying the identifier anymore.

Currently this ends up in three different cases:

  - 'git worktree add -b ...' or 'git worktree add ', both of
which create a new branch, either through the user explicitly
requesting it, or through 'git worktree add' implicitly creating
it.  This will end up with the following output:

  Preparing worktree (new branch '')
  HEAD is now at 26da330922 

  - 'git worktree add -B ...', which may either create a new branch if
the branch with the given name does not exist yet, or resets an
existing branch to the current HEAD, or the commit-ish given.
Depending on which action is taken, we'll end up with the following
output:

  Preparing worktree (resetting branch 'next' (was at caa68db14))
  HEAD is now at 26da330922 

or:

  Preparing worktree (new branch '')
  HEAD is now at 26da330922 

  - 'git worktree add --detach' or 'git worktree add  ',
both of which create a new worktree with a detached HEAD, for which
we will print the following output:

  Preparing worktree (detached HEAD 26da330922)
  HEAD is now at 26da330922 

Additionally currently the "Preparing ..." line is printed to stderr,
while the "HEAD is now at ..." line is printed to stdout by 'git reset
--hard', which is used internally by 'git worktree add'.  Fix this
inconsistency by printing the "Preparing ..." message to stdout as
well.  As "Preparing ..." is not an error, stdout also seems like the
more appropriate output stream.

Helped-by: Eric Sunshine 
Signed-off-by: Thomas Gummerer 
---
 builtin/worktree.c | 29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 4d96a21a45..a2667d74fb 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -301,8 +301,6 @@ static int add_worktree(const char *path, const char 
*refname,
strbuf_addf(, "%s/commondir", sb_repo.buf);
write_file(sb.buf, "../..");
 
-   fprintf_ln(stderr, _("Preparing %s (identifier %s)"), path, name);
-
argv_array_pushf(_env, "%s=%s", GIT_DIR_ENVIRONMENT, sb_git.buf);
argv_array_pushf(_env, "%s=%s", GIT_WORK_TREE_ENVIRONMENT, path);
cp.git_cmd = 1;
@@ -355,6 +353,31 @@ static int add_worktree(const char *path, const char 
*refname,
return ret;
 }
 
+static void print_preparing_worktree_line(const char *branch,
+ const char *new_branch,
+ const char *new_branch_force)
+{
+   if (new_branch_force) {
+   struct commit *commit = 
lookup_commit_reference_by_name(new_branch_force);
+   if (!commit)
+   printf_ln(_("Preparing worktree (new branch '%s')"), 
new_branch_force);
+   else
+   printf_ln(_("Preparing worktree (resetting branch '%s' 
(was at %s))"),
+ new_branch_force,
+ find_unique_abbrev(commit->object.oid.hash,
+DEFAULT_ABBREV));
+   } else if (new_branch) {
+   printf_ln(_("Preparing worktree (new branch '%s')"), 
new_branch);
+   } else {
+   struct commit *commit = lookup_commit_reference_by_name(branch);
+   if (!commit)
+   die(_("invalid reference: %s"), branch);
+   printf_ln(_("Preparing worktree (detached HEAD %s)"),
+ find_unique_abbrev(commit->object.oid.hash,
+DEFAULT_ABBREV));
+   }
+}
+
 static int add(int ac, const char **av, const char *prefix)
 {
struct add_opts opts;
@@ -434,6 +457,8 @@ static int add(int ac, const char **av, const char *prefix)
}
}
 
+   print_preparing_worktree_line(branch, new_branch, new_branch_force);
+
if (new_branch) {
struct child_process cp = CHILD_PROCESS_INIT;

[PATCH v7 3/4] worktree: factor out dwim_branch function

2018-04-15 Thread Thomas Gummerer
Factor out a dwim_branch function, which takes care of the dwim'ery in
'git worktree add '.  It's not too much code currently, but we're
adding a new kind of dwim in a subsequent patch, at which point it makes
more sense to have it as a separate function.

Factor it out now to reduce the patch noise in the next patch.

Signed-off-by: Thomas Gummerer 
---
 builtin/worktree.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index a2667d74fb..893139629a 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -378,6 +378,20 @@ static void print_preparing_worktree_line(const char 
*branch,
}
 }
 
+static const char *dwim_branch(const char *path, const char **new_branch)
+{
+   int n;
+   const char *s = worktree_basename(path, );
+   *new_branch = xstrndup(s, n);
+   if (guess_remote) {
+   struct object_id oid;
+   const char *remote =
+   unique_tracking_name(*new_branch, );
+   return remote;
+   }
+   return NULL;
+}
+
 static int add(int ac, const char **av, const char *prefix)
 {
struct add_opts opts;
@@ -430,16 +444,9 @@ static int add(int ac, const char **av, const char *prefix)
}
 
if (ac < 2 && !new_branch && !opts.detach) {
-   int n;
-   const char *s = worktree_basename(path, );
-   new_branch = xstrndup(s, n);
-   if (guess_remote) {
-   struct object_id oid;
-   const char *remote =
-   unique_tracking_name(new_branch, );
-   if (remote)
-   branch = remote;
-   }
+   const char *s = dwim_branch(path, _branch);
+   if (s)
+   branch = s;
}
 
if (ac == 2 && !new_branch && !opts.detach) {
-- 
2.17.0.252.gfe0a9eaf31



Re: [PATCH 2/2] daemon: graceful shutdown of client connection

2018-04-15 Thread Kim Gybels
On (13/04/18 15:03), Johannes Schindelin wrote:
> I wonder whether you found a reliable way to trigger this? It would be
> nice to have a regression test for this.

On my system, it reproduced reliably using Oleg's example [1], below is my bash
version of it.

Script to generate repository with some history:

  $ cat example.sh
  #!/bin/bash
  
  git init example
  cd example
  
  git --help > foo.txt
  
  for i in $(seq 1 12); do
  cat foo.txt foo.txt > bar.txt
  mv bar.txt foo.txt
  git add foo.txt
  git commit -m v$i
  done
  
  $ ./example.sh
  Initialized empty Git repository in C:/git/bug/example/.git/
  [master (root-commit) 2e44b4a] v1
   1 file changed, 84 insertions(+)
   create mode 100644 foo.txt
  [master 9791332] v2
   1 file changed, 84 insertions(+)
  [master 524e672] v3
   1 file changed, 168 insertions(+)
  [master afec6ef] v4
   1 file changed, 336 insertions(+)
  [master 1bcd9cc] v5
   1 file changed, 672 insertions(+)
  [master 2f38a8e] v6
   1 file changed, 1344 insertions(+)
  [master 33382fe] v7
   1 file changed, 2688 insertions(+)
  [master 6c2cbd6] v8
   1 file changed, 5376 insertions(+)
  [master 8d0770f] v9
   1 file changed, 10752 insertions(+)
  [master 517d650] v10
   1 file changed, 21504 insertions(+)
  [master 9e12406] v11
   1 file changed, 43008 insertions(+)
  [master 4c4f600] v12
   1 file changed, 86016 insertions(+)

Server side:

  $ git daemon --verbose --reuseaddr --base-path=$(pwd) --export-all
  [4760] Ready to rumble
  [696] Connection from 127.0.0.1:2054
  [696] unable to set SO_KEEPALIVE on socket: No such file or directory
  [696] Extended attribute "host": 127.0.0.1
  [696] Request upload-pack for '/example'

Client side:

  $ git clone git://127.0.0.1/example
  Cloning into 'example'...
  remote: Counting objects: 36, done.
  remote: Compressing objects: 100% (24/24), done.
  fatal: read error: Invalid argument
  fatal: early EOF
  fatal: index-pack failed

System information:

  $ git --version --build-options
  git version 2.17.0.windows.1
  cpu: x86_64
  built from commit: e7621d891d081acff6acd1f0ba6ae0adce06dd09
  sizeof-long: 4
  
  $ cmd.exe /c ver
  
  Microsoft Windows [Version 10.0.16299.371]

Best regards,
Kim

[1] https://github.com/git-for-windows/git/issues/304#issuecomment-274266897


.gitattributes lookup doesn't respect GIT_WORK_TREE

2018-04-15 Thread Andreas Schwab
The functions in attr.c do not look at $GIT_WORK_TREE when trying to
find the .gitattributes file.  Thus if you are not inside the work tree,
but have GIT_WORK_TREE set attribute lookup will be wrong.

Password Store  is using this feature
when the password store is a git repository.  It sets up a diff
attribute that decrypts the contents, but that works only if the command
is issued while inside the repository.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


[PATCH v13 10/10] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-04-15 Thread lars . schneider
From: Lars Schneider 

UTF supports lossless conversion round tripping and conversions between
UTF and other encodings are mostly round trip safe as Unicode aims to be
a superset of all other character encodings. However, certain encodings
(e.g. SHIFT-JIS) are known to have round trip issues [1].

Add 'core.checkRoundtripEncoding', which contains a comma separated
list of encodings, to define for what encodings Git should check the
conversion round trip if they are used in the 'working-tree-encoding'
attribute.

Set SHIFT-JIS as default value for 'core.checkRoundtripEncoding'.

[1] 
https://support.microsoft.com/en-us/help/170559/prb-conversion-problem-between-shift-jis-and-unicode

Signed-off-by: Lars Schneider 
---
 Documentation/config.txt |  6 
 Documentation/gitattributes.txt  |  8 +
 config.c |  5 +++
 convert.c| 77 
 convert.h|  1 +
 environment.c|  1 +
 t/t0028-working-tree-encoding.sh | 39 
 7 files changed, 137 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0e25b2c92b..7dcac9b540 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -530,6 +530,12 @@ core.autocrlf::
This variable can be set to 'input',
in which case no output conversion is performed.
 
+core.checkRoundtripEncoding::
+   A comma and/or whitespace separated list of encodings that Git
+   performs UTF-8 round trip checks on if they are used in an
+   `working-tree-encoding` attribute (see linkgit:gitattributes[5]).
+   The default value is `SHIFT-JIS`.
+
 core.symlinks::
If false, symbolic links are checked out as small plain files that
contain the link text. linkgit:git-update-index[1] and
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 31a4f92840..aa3deae392 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -312,6 +312,14 @@ number of pitfalls:
   internal contents as UTF-8 and try to convert it to UTF-16 on checkout.
   That operation will fail and cause an error.
 
+- Reencoding content to non-UTF encodings can cause errors as the
+  conversion might not be UTF-8 round trip safe. If you suspect your
+  encoding to not be round trip safe, then add it to
+  `core.checkRoundtripEncoding` to make Git check the round trip
+  encoding (see linkgit:git-config[1]). SHIFT-JIS (Japanese character
+  set) is known to have round trip issues with UTF-8 and is checked by
+  default.
+
 - Reencoding content requires resources that might slow down certain
   Git operations (e.g 'git checkout' or 'git add').
 
diff --git a/config.c b/config.c
index 1f003fbb90..d0ada9fcd4 100644
--- a/config.c
+++ b/config.c
@@ -1172,6 +1172,11 @@ static int git_default_core_config(const char *var, 
const char *value)
return 0;
}
 
+   if (!strcmp(var, "core.checkroundtripencoding")) {
+   check_roundtrip_encoding = xstrdup(value);
+   return 0;
+   }
+
if (!strcmp(var, "core.notesref")) {
notes_ref_name = xstrdup(value);
return 0;
diff --git a/convert.c b/convert.c
index bc35c33249..1ae6301629 100644
--- a/convert.c
+++ b/convert.c
@@ -347,6 +347,42 @@ static void trace_encoding(const char *context, const char 
*path,
strbuf_release();
 }
 
+static int check_roundtrip(const char *enc_name)
+{
+   /*
+* check_roundtrip_encoding contains a string of comma and/or
+* space separated encodings (eg. "UTF-16, ASCII, CP1125").
+* Search for the given encoding in that string.
+*/
+   const char *found = strcasestr(check_roundtrip_encoding, enc_name);
+   const char *next;
+   int len;
+   if (!found)
+   return 0;
+   next = found + strlen(enc_name);
+   len = strlen(check_roundtrip_encoding);
+   return (found && (
+   /*
+* check that the found encoding is at the
+* beginning of check_roundtrip_encoding or
+* that it is prefixed with a space or comma
+*/
+   found == check_roundtrip_encoding || (
+   (isspace(found[-1]) || found[-1] == ',')
+   )
+   ) && (
+   /*
+* check that the found encoding is at the
+* end of check_roundtrip_encoding or
+* that it is suffixed with a space or comma
+*/
+   next == check_roundtrip_encoding + len || (
+   next < check_roundtrip_encoding + len &&
+   (isspace(next[0]) || next[0] == 

[PATCH v13 09/10] convert: add tracing for 'working-tree-encoding' attribute

2018-04-15 Thread lars . schneider
From: Lars Schneider 

Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable
tracing for content that is reencoded with the 'working-tree-encoding'
attribute. This is useful to debug encoding issues.

Signed-off-by: Lars Schneider 
---
 convert.c| 25 +
 t/t0028-working-tree-encoding.sh |  2 ++
 2 files changed, 27 insertions(+)

diff --git a/convert.c b/convert.c
index 0e7930c154..bc35c33249 100644
--- a/convert.c
+++ b/convert.c
@@ -324,6 +324,29 @@ static int validate_encoding(const char *path, const char 
*enc,
return 0;
 }
 
+static void trace_encoding(const char *context, const char *path,
+  const char *encoding, const char *buf, size_t len)
+{
+   static struct trace_key coe = TRACE_KEY_INIT(WORKING_TREE_ENCODING);
+   struct strbuf trace = STRBUF_INIT;
+   int i;
+
+   strbuf_addf(, "%s (%s, considered %s):\n", context, path, 
encoding);
+   for (i = 0; i < len && buf; ++i) {
+   strbuf_addf(
+   ,"| \e[2m%2i:\e[0m %2x \e[2m%c\e[0m%c",
+   i,
+   (unsigned char) buf[i],
+   (buf[i] > 32 && buf[i] < 127 ? buf[i] : ' '),
+   ((i+1) % 8 && (i+1) < len ? ' ' : '\n')
+   );
+   }
+   strbuf_addchars(, '\n', 1);
+
+   trace_strbuf(, );
+   strbuf_release();
+}
+
 static const char *default_encoding = "UTF-8";
 
 static int encode_to_git(const char *path, const char *src, size_t src_len,
@@ -352,6 +375,7 @@ static int encode_to_git(const char *path, const char *src, 
size_t src_len,
if (validate_encoding(path, enc, src, src_len, die_on_error))
return 0;
 
+   trace_encoding("source", path, enc, src, src_len);
dst = reencode_string_len(src, src_len, default_encoding, enc,
  _len);
if (!dst) {
@@ -369,6 +393,7 @@ static int encode_to_git(const char *path, const char *src, 
size_t src_len,
return 0;
}
}
+   trace_encoding("destination", path, default_encoding, dst, dst_len);
 
strbuf_attach(buf, dst, dst_len, dst_len + 1);
return 1;
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index a318d03232..026544ce09 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -4,6 +4,8 @@ test_description='working-tree-encoding conversion via 
gitattributes'
 
 . ./test-lib.sh
 
+GIT_TRACE_WORKING_TREE_ENCODING=1 && export GIT_TRACE_WORKING_TREE_ENCODING
+
 test_expect_success 'setup test files' '
git config core.eol lf &&
 
-- 
2.16.2



[PATCH v13 07/10] convert: add 'working-tree-encoding' attribute

2018-04-15 Thread lars . schneider
From: Lars Schneider 

Git recognizes files encoded with ASCII or one of its supersets (e.g.
UTF-8 or ISO-8859-1) as text files. All other encodings are usually
interpreted as binary and consequently built-in Git text processing
tools (e.g. 'git diff') as well as most Git web front ends do not
visualize the content.

Add an attribute to tell Git what encoding the user has defined for a
given file. If the content is added to the index, then Git reencodes
the content to a canonical UTF-8 representation. On checkout Git will
reverse this operation.

Signed-off-by: Lars Schneider 
---
 Documentation/gitattributes.txt  |  80 ++
 convert.c| 113 ++-
 convert.h|   1 +
 sha1_file.c  |   2 +-
 t/t0028-working-tree-encoding.sh | 142 +++
 5 files changed, 336 insertions(+), 2 deletions(-)
 create mode 100755 t/t0028-working-tree-encoding.sh

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 30687de81a..31a4f92840 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -272,6 +272,86 @@ few exceptions.  Even though...
   catch potential problems early, safety triggers.
 
 
+`working-tree-encoding`
+^^^
+
+Git recognizes files encoded in ASCII or one of its supersets (e.g.
+UTF-8, ISO-8859-1, ...) as text files. Files encoded in certain other
+encodings (e.g. UTF-16) are interpreted as binary and consequently
+built-in Git text processing tools (e.g. 'git diff') as well as most Git
+web front ends do not visualize the contents of these files by default.
+
+In these cases you can tell Git the encoding of a file in the working
+directory with the `working-tree-encoding` attribute. If a file with this
+attribute is added to Git, then Git reencodes the content from the
+specified encoding to UTF-8. Finally, Git stores the UTF-8 encoded
+content in its internal data structure (called "the index"). On checkout
+the content is reencoded back to the specified encoding.
+
+Please note that using the `working-tree-encoding` attribute may have a
+number of pitfalls:
+
+- Alternative Git implementations (e.g. JGit or libgit2) and older Git
+  versions (as of March 2018) do not support the `working-tree-encoding`
+  attribute. If you decide to use the `working-tree-encoding` attribute
+  in your repository, then it is strongly recommended to ensure that all
+  clients working with the repository support it.
+
+  For example, Microsoft Visual Studio resources files (`*.rc`) or
+  PowerShell script files (`*.ps1`) are sometimes encoded in UTF-16.
+  If you declare `*.ps1` as files as UTF-16 and you add `foo.ps1` with
+  a `working-tree-encoding` enabled Git client, then `foo.ps1` will be
+  stored as UTF-8 internally. A client without `working-tree-encoding`
+  support will checkout `foo.ps1` as UTF-8 encoded file. This will
+  typically cause trouble for the users of this file.
+
+  If a Git client, that does not support the `working-tree-encoding`
+  attribute, adds a new file `bar.ps1`, then `bar.ps1` will be
+  stored "as-is" internally (in this example probably as UTF-16).
+  A client with `working-tree-encoding` support will interpret the
+  internal contents as UTF-8 and try to convert it to UTF-16 on checkout.
+  That operation will fail and cause an error.
+
+- Reencoding content requires resources that might slow down certain
+  Git operations (e.g 'git checkout' or 'git add').
+
+Use the `working-tree-encoding` attribute only if you cannot store a file
+in UTF-8 encoding and if you want Git to be able to process the content
+as text.
+
+As an example, use the following attributes if your '*.ps1' files are
+UTF-16 encoded with byte order mark (BOM) and you want Git to perform
+automatic line ending conversion based on your platform.
+
+
+*.ps1  text working-tree-encoding=UTF-16
+
+
+Use the following attributes if your '*.ps1' files are UTF-16 little
+endian encoded without BOM and you want Git to use Windows line endings
+in the working directory. Please note, it is highly recommended to
+explicitly define the line endings with `eol` if the `working-tree-encoding`
+attribute is used to avoid ambiguity.
+
+
+*.ps1  text working-tree-encoding=UTF-16LE eol=CRLF
+
+
+You can get a list of all available encodings on your platform with the
+following command:
+
+
+iconv --list
+
+
+If you do not know the encoding of a file, then you can use the `file`
+command to guess the encoding:
+
+
+file foo.ps1
+
+
+
 `ident`
 ^^^
 
diff --git a/convert.c b/convert.c
index b976eb968c..21d5cb60da 100644
--- a/convert.c
+++ b/convert.c
@@ -7,6 +7,7 

[PATCH v13 06/10] utf8: add function to detect a missing UTF-16/32 BOM

2018-04-15 Thread lars . schneider
From: Lars Schneider 

If the endianness is not defined in the encoding name, then let's
be strict and require a BOM to avoid any encoding confusion. The
is_missing_required_utf_bom() function returns true if a required BOM
is missing.

The Unicode standard instructs to assume big-endian if there in no BOM
for UTF-16/32 [1][2]. However, the W3C/WHATWG encoding standard used
in HTML5 recommends to assume little-endian to "deal with deployed
content" [3]. Strictly requiring a BOM seems to be the safest option
for content in Git.

This function is used in a subsequent commit.

[1] http://unicode.org/faq/utf_bom.html#gen6
[2] http://www.unicode.org/versions/Unicode10.0.0/ch03.pdf
 Section 3.10, D98, page 132
[3] https://encoding.spec.whatwg.org/#utf-16le

Signed-off-by: Lars Schneider 
---
 utf8.c | 13 +
 utf8.h | 19 +++
 2 files changed, 32 insertions(+)

diff --git a/utf8.c b/utf8.c
index 83af3a9f6b..25d366d6b3 100644
--- a/utf8.c
+++ b/utf8.c
@@ -586,6 +586,19 @@ int has_prohibited_utf_bom(const char *enc, const char 
*data, size_t len)
);
 }
 
+int is_missing_required_utf_bom(const char *enc, const char *data, size_t len)
+{
+   return (
+  (same_utf_encoding(enc, "UTF-16")) &&
+  !(has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
+has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
+   ) || (
+  (same_utf_encoding(enc, "UTF-32")) &&
+  !(has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
+has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
+   );
+}
+
 /*
  * Returns first character length in bytes for multi-byte `text` according to
  * `encoding`.
diff --git a/utf8.h b/utf8.h
index 0db1db4519..cce654a64a 100644
--- a/utf8.h
+++ b/utf8.h
@@ -79,4 +79,23 @@ void strbuf_utf8_align(struct strbuf *buf, align_type 
position, unsigned int wid
  */
 int has_prohibited_utf_bom(const char *enc, const char *data, size_t len);
 
+/*
+ * If the endianness is not defined in the encoding name, then we
+ * require a BOM. The function returns true if a required BOM is missing.
+ *
+ * The Unicode standard instructs to assume big-endian if there in no
+ * BOM for UTF-16/32 [1][2]. However, the W3C/WHATWG encoding standard
+ * used in HTML5 recommends to assume little-endian to "deal with
+ * deployed content" [3].
+ *
+ * Therefore, strictly requiring a BOM seems to be the safest option for
+ * content in Git.
+ *
+ * [1] http://unicode.org/faq/utf_bom.html#gen6
+ * [2] http://www.unicode.org/versions/Unicode10.0.0/ch03.pdf
+ * Section 3.10, D98, page 132
+ * [3] https://encoding.spec.whatwg.org/#utf-16le
+ */
+int is_missing_required_utf_bom(const char *enc, const char *data, size_t len);
+
 #endif
-- 
2.16.2



[PATCH v13 03/10] strbuf: add a case insensitive starts_with()

2018-04-15 Thread lars . schneider
From: Lars Schneider 

Check in a case insensitive manner if one string is a prefix of another
string.

This function is used in a subsequent commit.

Signed-off-by: Lars Schneider 
---
 git-compat-util.h | 1 +
 strbuf.c  | 9 +
 2 files changed, 10 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 68b2ad531e..95c9b34832 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -455,6 +455,7 @@ extern void (*get_warn_routine(void))(const char *warn, 
va_list params);
 extern void set_die_is_recursing_routine(int (*routine)(void));
 
 extern int starts_with(const char *str, const char *prefix);
+extern int istarts_with(const char *str, const char *prefix);
 
 /*
  * If the string "str" begins with the string found in "prefix", return 1.
diff --git a/strbuf.c b/strbuf.c
index b635f0bdc4..99812b8488 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -11,6 +11,15 @@ int starts_with(const char *str, const char *prefix)
return 0;
 }
 
+int istarts_with(const char *str, const char *prefix)
+{
+   for (; ; str++, prefix++)
+   if (!*prefix)
+   return 1;
+   else if (tolower(*str) != tolower(*prefix))
+   return 0;
+}
+
 int skip_to_optional_arg_default(const char *str, const char *prefix,
 const char **arg, const char *def)
 {
-- 
2.16.2



[PATCH v13 04/10] utf8: teach same_encoding() alternative UTF encoding names

2018-04-15 Thread lars . schneider
From: Lars Schneider 

The function same_encoding() could only recognize alternative names for
UTF-8 encodings. Teach it to recognize all kinds of alternative UTF
encoding names (e.g. utf16).

While we are at it, fix a crash that would occur if same_encoding() was
called with a NULL argument and a non-NULL argument.

This function is used in a subsequent commit.

Signed-off-by: Lars Schneider 
---
 utf8.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/utf8.c b/utf8.c
index 2c27ce0137..40f4b142d6 100644
--- a/utf8.c
+++ b/utf8.c
@@ -401,18 +401,40 @@ void strbuf_utf8_replace(struct strbuf *sb_src, int pos, 
int width,
strbuf_release(_dst);
 }
 
+/*
+ * Returns true (1) if the src encoding name matches the dst encoding
+ * name directly or one of its alternative names. E.g. UTF-16BE is the
+ * same as UTF16BE.
+ */
+static int same_utf_encoding(const char *src, const char *dst)
+{
+   if (istarts_with(src, "utf") && istarts_with(dst, "utf")) {
+   /* src[3] or dst[3] might be '\0' */
+   int i = (src[3] == '-' ? 4 : 3);
+   int j = (dst[3] == '-' ? 4 : 3);
+   return !strcasecmp(src+i, dst+j);
+   }
+   return 0;
+}
+
 int is_encoding_utf8(const char *name)
 {
if (!name)
return 1;
-   if (!strcasecmp(name, "utf-8") || !strcasecmp(name, "utf8"))
+   if (same_utf_encoding("utf-8", name))
return 1;
return 0;
 }
 
 int same_encoding(const char *src, const char *dst)
 {
-   if (is_encoding_utf8(src) && is_encoding_utf8(dst))
+   static const char utf8[] = "UTF-8";
+
+   if (!src)
+   src = utf8;
+   if (!dst)
+   dst = utf8;
+   if (same_utf_encoding(src, dst))
return 1;
return !strcasecmp(src, dst);
 }
-- 
2.16.2



[PATCH v13 01/10] strbuf: remove unnecessary NUL assignment in xstrdup_tolower()

2018-04-15 Thread lars . schneider
From: Lars Schneider 

Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we
allocate the buffer for the lower case string with xmallocz(). This
already ensures a NUL at the end of the allocated buffer.

Remove the unnecessary assignment.

Signed-off-by: Lars Schneider 
---
 strbuf.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/strbuf.c b/strbuf.c
index 1df674e919..55b7daeb35 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -781,7 +781,6 @@ char *xstrdup_tolower(const char *string)
result = xmallocz(len);
for (i = 0; i < len; i++)
result[i] = tolower(string[i]);
-   result[i] = '\0';
return result;
 }
 
-- 
2.16.2



[PATCH v13 05/10] utf8: add function to detect prohibited UTF-16/32 BOM

2018-04-15 Thread lars . schneider
From: Lars Schneider 

Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
or UTF-32LE a BOM must not be used [1]. The function returns true if
this is the case.

This function is used in a subsequent commit.

[1] http://unicode.org/faq/utf_bom.html#bom10

Signed-off-by: Lars Schneider 
---
 utf8.c | 26 ++
 utf8.h |  9 +
 2 files changed, 35 insertions(+)

diff --git a/utf8.c b/utf8.c
index 40f4b142d6..83af3a9f6b 100644
--- a/utf8.c
+++ b/utf8.c
@@ -560,6 +560,32 @@ char *reencode_string_len(const char *in, int insz,
 }
 #endif
 
+static int has_bom_prefix(const char *data, size_t len,
+ const char *bom, size_t bom_len)
+{
+   return data && bom && (len >= bom_len) && !memcmp(data, bom, bom_len);
+}
+
+static const char utf16_be_bom[] = {0xFE, 0xFF};
+static const char utf16_le_bom[] = {0xFF, 0xFE};
+static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
+static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
+
+int has_prohibited_utf_bom(const char *enc, const char *data, size_t len)
+{
+   return (
+ (same_utf_encoding("UTF-16BE", enc) ||
+  same_utf_encoding("UTF-16LE", enc)) &&
+ (has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
+  has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
+   ) || (
+ (same_utf_encoding("UTF-32BE",  enc) ||
+  same_utf_encoding("UTF-32LE", enc)) &&
+ (has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
+  has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
+   );
+}
+
 /*
  * Returns first character length in bytes for multi-byte `text` according to
  * `encoding`.
diff --git a/utf8.h b/utf8.h
index 6bbcf31a83..0db1db4519 100644
--- a/utf8.h
+++ b/utf8.h
@@ -70,4 +70,13 @@ typedef enum {
 void strbuf_utf8_align(struct strbuf *buf, align_type position, unsigned int 
width,
   const char *s);
 
+/*
+ * If a data stream is declared as UTF-16BE or UTF-16LE, then a UTF-16
+ * BOM must not be used [1]. The same applies for the UTF-32 equivalents.
+ * The function returns true if this rule is violated.
+ *
+ * [1] http://unicode.org/faq/utf_bom.html#bom10
+ */
+int has_prohibited_utf_bom(const char *enc, const char *data, size_t len);
+
 #endif
-- 
2.16.2



[PATCH v13 00/10] convert: add support for different encodings

2018-04-15 Thread lars . schneider
From: Lars Schneider 

Hi,

Patches 1-6,9 are preparation and helper functions.
Patch 7,8,10 are the actual change.

This series is based on v2.16.0 and Torsten's 8462ff43e4 (convert_to_git():
safe_crlf/checksafe becomes int conv_flags, 2018-01-13).

The series can be rebased without conflicts on top of v2.17.0:
https://github.com/larsxschneider/git/tree/encoding-2.17

Changes since v12:

* commit message improvement (Torsten)
* prevent undefined memcpy behavior in has_bom_prefix (Avar)
* improve error message: true/false are no valid working-tree-encodings 
(Torsten)
* fix crash in same_encoding() if only one argument is NULL (this bug
  was already present before this series, Eric)

Thanks,
Lars

  RFC: 
https://public-inbox.org/git/bdb9b884-6d17-4be3-a83c-f67e2afa2...@gmail.com/
   v1: 
https://public-inbox.org/git/20171211155023.1405-1-lars.schnei...@autodesk.com/
   v2: 
https://public-inbox.org/git/2017122915.39680-1-lars.schnei...@autodesk.com/
   v3: 
https://public-inbox.org/git/20180106004808.77513-1-lars.schnei...@autodesk.com/
   v4: 
https://public-inbox.org/git/20180120152418.52859-1-lars.schnei...@autodesk.com/
   v5: https://public-inbox.org/git/20180129201855.9182-1-tbo...@web.de/
   v6: 
https://public-inbox.org/git/20180209132830.55385-1-lars.schnei...@autodesk.com/
   v7: 
https://public-inbox.org/git/20180215152711.158-1-lars.schnei...@autodesk.com/
   v8: 
https://public-inbox.org/git/20180224162801.98860-1-lars.schnei...@autodesk.com/
   v9: 
https://public-inbox.org/git/20180304201418.60958-1-lars.schnei...@autodesk.com/
  v10: 
https://public-inbox.org/git/20180307173026.30058-1-lars.schnei...@autodesk.com/
  v11: 
https://public-inbox.org/git/20180309173536.62012-1-lars.schnei...@autodesk.com/
  v12: 
https://public-inbox.org/git/20180315225746.18119-1-lars.schnei...@autodesk.com/

Base Ref:
Web-Diff: https://github.com/larsxschneider/git/commit/3aa98e6975
Checkout: git fetch https://github.com/larsxschneider/git encoding-v13 && git 
checkout 3aa98e6975


### Interdiff (v12..v13):

diff --git a/convert.c b/convert.c
index 2a002af66d..1ae6301629 100644
--- a/convert.c
+++ b/convert.c
@@ -1222,7 +1222,7 @@ static const char *git_path_check_encoding(struct 
attr_check_item *check)
return NULL;

if (ATTR_TRUE(value) || ATTR_FALSE(value)) {
-   die(_("working-tree-encoding attribute requires a value"));
+   die(_("true/false are no valid working-tree-encodings"));
}

/* Don't encode to the default encoding */
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index 884f0878b1..12b8eb963a 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -152,7 +152,7 @@ test_expect_success 'check unsupported encodings' '
echo "*.set text working-tree-encoding" >.gitattributes &&
printf "set" >t.set &&
test_must_fail git add t.set 2>err.out &&
-   test_i18ngrep "working-tree-encoding attribute requires a value" 
err.out &&
+   test_i18ngrep "true/false are no valid working-tree-encodings" err.out 
&&

echo "*.unset text -working-tree-encoding" >.gitattributes &&
printf "unset" >t.unset &&
diff --git a/utf8.c b/utf8.c
index 2d8821d36e..25d366d6b3 100644
--- a/utf8.c
+++ b/utf8.c
@@ -428,8 +428,12 @@ int is_encoding_utf8(const char *name)

 int same_encoding(const char *src, const char *dst)
 {
-   if (is_encoding_utf8(src) && is_encoding_utf8(dst))
-   return 1;
+   static const char utf8[] = "UTF-8";
+
+   if (!src)
+   src = utf8;
+   if (!dst)
+   dst = utf8;
if (same_utf_encoding(src, dst))
return 1;
return !strcasecmp(src, dst);
@@ -559,7 +563,7 @@ char *reencode_string_len(const char *in, int insz,
 static int has_bom_prefix(const char *data, size_t len,
  const char *bom, size_t bom_len)
 {
-   return (len >= bom_len) && !memcmp(data, bom, bom_len);
+   return data && bom && (len >= bom_len) && !memcmp(data, bom, bom_len);
 }

 static const char utf16_be_bom[] = {0xFE, 0xFF};


### Patches

Lars Schneider (10):
  strbuf: remove unnecessary NUL assignment in xstrdup_tolower()
  strbuf: add xstrdup_toupper()
  strbuf: add a case insensitive starts_with()
  utf8: teach same_encoding() alternative UTF encoding names
  utf8: add function to detect prohibited UTF-16/32 BOM
  utf8: add function to detect a missing UTF-16/32 BOM
  convert: add 'working-tree-encoding' attribute
  convert: check for detectable errors in UTF encodings
  convert: add tracing for 'working-tree-encoding' attribute
  convert: add round trip check based on 'core.checkRoundtripEncoding'

 Documentation/config.txt |   6 +
 Documentation/gitattributes.txt  |  88 +
 config.c |   5 +
 convert.c| 276 ++-
 

[PATCH v13 02/10] strbuf: add xstrdup_toupper()

2018-04-15 Thread lars . schneider
From: Lars Schneider 

Create a copy of an existing string and make all characters upper case.
Similar xstrdup_tolower().

This function is used in a subsequent commit.

Signed-off-by: Lars Schneider 
---
 strbuf.c | 12 
 strbuf.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index 55b7daeb35..b635f0bdc4 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -784,6 +784,18 @@ char *xstrdup_tolower(const char *string)
return result;
 }
 
+char *xstrdup_toupper(const char *string)
+{
+   char *result;
+   size_t len, i;
+
+   len = strlen(string);
+   result = xmallocz(len);
+   for (i = 0; i < len; i++)
+   result[i] = toupper(string[i]);
+   return result;
+}
+
 char *xstrvfmt(const char *fmt, va_list ap)
 {
struct strbuf buf = STRBUF_INIT;
diff --git a/strbuf.h b/strbuf.h
index 14c8c10d66..df7ced53ed 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -607,6 +607,7 @@ __attribute__((format (printf,2,3)))
 extern int fprintf_ln(FILE *fp, const char *fmt, ...);
 
 char *xstrdup_tolower(const char *);
+char *xstrdup_toupper(const char *);
 
 /**
  * Create a newly allocated string using printf format. You can do this easily
-- 
2.16.2



[PATCH v13 08/10] convert: check for detectable errors in UTF encodings

2018-04-15 Thread lars . schneider
From: Lars Schneider 

Check that new content is valid with respect to the user defined
'working-tree-encoding' attribute.

Signed-off-by: Lars Schneider 
---
 convert.c| 61 +++
 t/t0028-working-tree-encoding.sh | 62 
 2 files changed, 123 insertions(+)

diff --git a/convert.c b/convert.c
index 21d5cb60da..0e7930c154 100644
--- a/convert.c
+++ b/convert.c
@@ -266,6 +266,64 @@ static int will_convert_lf_to_crlf(size_t len, struct 
text_stat *stats,
 
 }
 
+static int validate_encoding(const char *path, const char *enc,
+ const char *data, size_t len, int die_on_error)
+{
+   /* We only check for UTF here as UTF?? can be an alias for UTF-?? */
+   if (istarts_with(enc, "UTF")) {
+   /*
+* Check for detectable errors in UTF encodings
+*/
+   if (has_prohibited_utf_bom(enc, data, len)) {
+   const char *error_msg = _(
+   "BOM is prohibited in '%s' if encoded as %s");
+   /*
+* This advice is shown for UTF-??BE and UTF-??LE 
encodings.
+* We cut off the last two characters of the encoding 
name
+* to generate the encoding name suitable for BOMs.
+*/
+   const char *advise_msg = _(
+   "The file '%s' contains a byte order "
+   "mark (BOM). Please use UTF-%s as "
+   "working-tree-encoding.");
+   const char *stripped = NULL;
+   char *upper = xstrdup_toupper(enc);
+   upper[strlen(upper)-2] = '\0';
+   if (!skip_prefix(upper, "UTF-", ))
+   skip_prefix(stripped, "UTF", );
+   advise(advise_msg, path, stripped);
+   free(upper);
+   if (die_on_error)
+   die(error_msg, path, enc);
+   else {
+   return error(error_msg, path, enc);
+   }
+
+   } else if (is_missing_required_utf_bom(enc, data, len)) {
+   const char *error_msg = _(
+   "BOM is required in '%s' if encoded as %s");
+   const char *advise_msg = _(
+   "The file '%s' is missing a byte order "
+   "mark (BOM). Please use UTF-%sBE or UTF-%sLE "
+   "(depending on the byte order) as "
+   "working-tree-encoding.");
+   const char *stripped = NULL;
+   char *upper = xstrdup_toupper(enc);
+   if (!skip_prefix(upper, "UTF-", ))
+   skip_prefix(stripped, "UTF", );
+   advise(advise_msg, path, stripped, stripped);
+   free(upper);
+   if (die_on_error)
+   die(error_msg, path, enc);
+   else {
+   return error(error_msg, path, enc);
+   }
+   }
+
+   }
+   return 0;
+}
+
 static const char *default_encoding = "UTF-8";
 
 static int encode_to_git(const char *path, const char *src, size_t src_len,
@@ -291,6 +349,9 @@ static int encode_to_git(const char *path, const char *src, 
size_t src_len,
if (!buf && !src)
return 1;
 
+   if (validate_encoding(path, enc, src, src_len, die_on_error))
+   return 0;
+
dst = reencode_string_len(src, src_len, default_encoding, enc,
  _len);
if (!dst) {
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index 8e574ccdd8..a318d03232 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -62,6 +62,52 @@ test_expect_success 'check $GIT_DIR/info/attributes support' 
'
 
 for i in 16 32
 do
+   test_expect_success "check prohibited UTF-${i} BOM" '
+   test_when_finished "git reset --hard HEAD" &&
+
+   echo "*.utf${i}be text working-tree-encoding=utf-${i}be" 
>>.gitattributes &&
+   echo "*.utf${i}le text working-tree-encoding=utf-${i}LE" 
>>.gitattributes &&
+
+   # Here we add a UTF-16 (resp. UTF-32) files with BOM 
(big/little-endian)
+   # but we tell Git to treat it as UTF-16BE/UTF-16LE (resp. 
UTF-32).
+   # In these cases the BOM is prohibited.
+   cp bebom.utf${i}be.raw bebom.utf${i}be &&
+   test_must_fail git add bebom.utf${i}be 2>err.out &&
+   test_i18ngrep 

Re: [PATCH 1/2] daemon: use timeout for uninterruptible poll

2018-04-15 Thread Kim Gybels
On (13/04/18 14:36), Johannes Schindelin wrote:
> > The poll provided in compat/poll.c is not interrupted by receiving
> > SIGCHLD. Use a timeout for cleaning up dead children in a timely manner.
> 
> Maybe say "When using this poll emulation, use a timeout ..."?

I will rewrite the commit message when I reroll the patch. Calling the
poll "uninterruptible" might be wrong as well, although the poll
doesn't return with EINTR when a child process terminates, it might
still be interruptible in other ways. On a related note, the handler
for SIGCHLD is simply not called in Git-for-Windows' daemon.

> > diff --git a/daemon.c b/daemon.c
> > index fe833ea7de..6dc95c1b2f 100644
> > --- a/daemon.c
> > +++ b/daemon.c
> > @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist)
> >  {
> > struct pollfd *pfd;
> > int i;
> > +   int poll_timeout = -1;
> 
> Just reuse the line above:
> 
>   int poll_timeout = -1, i;

Sure.

> > @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist)
> > int i;
> >  
> > check_dead_children();
> > -
> > -   if (poll(pfd, socklist->nr, -1) < 0) {
> > +#ifdef NO_POLL
> > +   poll_timeout = live_children ? 100 : -1;
> > +#endif
> > +   int ret = poll(pfd, socklist->nr, poll_timeout);
> > +   if  (ret == 0) {
> > +   continue;
> > +   } else if (ret < 0) {
> 
> I would find it a bit easier on the eyes if this did not use curlies, and
> dropped the unnecessary `else` (`continue` will take care of that):
> 
>   if (!ret)
>   continue;
>   if (ret < 0)
>   [...]

Funny, that's how I would normally write it, if I wasn't so focused on
trying to follow the coding quidelines. While I'm at it, I will also
fix that sneaky double space after the if.

Is it ok to add the timeout for all platforms using the poll
emulation, since I only tested for Windows?

Best regards,
Kim


Re: [PATCH v6 04/15] sequencer: introduce new commands to reset the revision

2018-04-15 Thread Philip Oakley

From: "Phillip Wood" 
: Friday, April 13, 2018 11:03 AM

If a label or reset command fails it is likely to be due to a
typo. Rescheduling the command would make it easier for the user to fix
the problem as they can just run 'git rebase --edit-todo'. 


Is this worth noting in the command documentation? 
"If the label or reset command fails then fix

the problem by runnning 'git rebase --edit-todo'." ?

Just a thought.


It also
ensures that the problem has actually been fixed when the rebase
continues. I think you could do it like this



--
Philip
(also @dunelm, 73-79..)


Re: [PATCH v11 06/10] convert: add 'working-tree-encoding' attribute

2018-04-15 Thread Lars Schneider

> On 05 Apr 2018, at 18:41, Torsten Bögershausen  wrote:
> 
> On 01.04.18 15:24, Lars Schneider wrote:
>>> TRUE or false are values, but just wrong ones.
>>> If this test is removed, the user will see "failed to encode "TRUE" to 
>>> "UTF-8",
>>> which should give enough information to fix it.
>> 
>> I see your point. However, I would like to stop the processing right
>> there for these invalid values. How about 
>> 
>>  error(_("true/false are no valid working-tree-encodings"));
>> 
>> I think that is the most straight forward/helpful error message
>> for the enduser (I consider the term "boolean" but dismissed it
>> as potentially confusing to folks not familiar with the term).
>> 
>> OK with you?
> 
> Yes.

Great!


> Another thing that came up recently, independent of your series:
> 
> What should happen if a user specifies "UTF-8" and the file
> has an UTF-8 encoded BOM ?
> I ask because I stumbled over such a file coming from a Windows
> which the java compiler under Linux didn't accept.
> 
> And because some tools love to put an UTF-8 encoded BOM
> into text files.
> 
> The clearest thing would be to extend the BOM check in 5/9
> to cover UTF-32, UTF-16 and UTF-8.
> 
> Are there any plans to do so?

If `working-tree-encoding` is not defined or defined as UTF-8,
then we would return from encode_to_git() early. That means we
would never run validate_encoding() which would check the BOM.

However, adding the UTF-8 BOM would still make sense. This way
Git could scream if a user set `working-tree-encoding` to UTF-16
but the file is really UTF-8 encoded.


> And thanks for the work.

Thanks :-)


- Lars

[PATCH v2 3/6] generate-cmdlist.sh: keep all information in common-cmds.h

2018-04-15 Thread Nguyễn Thái Ngọc Duy
common-cmds.h is used to extract the list of common commands (by
group) and a one-line summary of each command. Some information is
dropped, for example command category or summary of other commands.
Update generate-cmdlist.sh to keep all the information. The extra info
will be used shortly.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 generate-cmdlist.sh | 61 +
 help.c  | 42 ++-
 2 files changed, 81 insertions(+), 22 deletions(-)

diff --git a/generate-cmdlist.sh b/generate-cmdlist.sh
index eeea4b67ea..e0893e979a 100755
--- a/generate-cmdlist.sh
+++ b/generate-cmdlist.sh
@@ -1,17 +1,30 @@
 #!/bin/sh
 
+# Don't let locale affect this script.
+LC_ALL=C
+LANG=C
+export LC_ALL LANG
+
+command_list () {
+   sed '1,/^### command list/d;/^#/d' "$1"
+}
+
+category_list () {
+   command_list "$1" | awk '{print $2;}' | sort | uniq
+}
+
 echo "/* Automatically generated by generate-cmdlist.sh */
 struct cmdname_help {
-   char name[16];
+   char name[32];
char help[80];
-   unsigned char group;
+   unsigned int category;
+   unsigned int group;
 };
 
 static const char *common_cmd_groups[] = {"
 
 grps=grps$$.tmp
-match=match$$.tmp
-trap "rm -f '$grps' '$match'" 0 1 2 3 15
+trap "rm -f '$grps'" 0 1 2 3 15
 
 sed -n '
1,/^### common groups/b
@@ -23,28 +36,44 @@ sed -n '
' "$1"
 printf '};\n\n'
 
+echo "#define GROUP_NONE 0xff /* no common group */"
 n=0
-substnum=
 while read grp
 do
-   echo "^git-..*[ ]$grp"
-   substnum="$substnum${substnum:+;}s/[]$grp/$n/"
+   echo "#define GROUP_${grp:-NONE} $n"
n=$(($n+1))
-done <"$grps" >"$match"
+done <"$grps"
+echo
+
+echo '/*'
+printf 'static const char *cmd_categories[] = {\n'
+category_list "$1" |
+while read category; do
+   printf '\t\"'$category'\",\n'
+done
+printf '\tNULL\n};\n\n'
+echo '*/'
+
+n=0
+category_list "$1" |
+while read category; do
+   echo "#define CAT_$category $n"
+   n=$(($n+1))
+done
+echo
 
-printf 'static struct cmdname_help common_cmds[] = {\n'
-grep -f "$match" "$1" |
-sed 's/^git-//' |
+printf 'static struct cmdname_help command_list[] = {\n'
+command_list "$1" |
 sort |
-while read cmd tags
+while read cmd category tags
 do
-   tag=$(echo "$tags" | sed "$substnum; s/[^0-9]//g")
+   name=${cmd/git-}
sed -n '
-   /^NAME/,/git-'"$cmd"'/H
+   /^NAME/,/'"$cmd"'/H
${
x
-   s/.*git-'"$cmd"' - \(.*\)/  {"'"$cmd"'", N_("\1"), 
'$tag'},/
+   s/.*'"$cmd"' - \(.*\)/  {"'"$name"'", N_("\1"), 
CAT_'$category', GROUP_'${tags:-NONE}' },/
p
-   }' "Documentation/git-$cmd.txt"
+   }' "Documentation/$cmd.txt"
 done
 echo "};"
diff --git a/help.c b/help.c
index e155c39870..b5da7fa013 100644
--- a/help.c
+++ b/help.c
@@ -190,6 +190,27 @@ void list_commands(unsigned int colopts,
}
 }
 
+static void extract_common_cmds(struct cmdname_help **p_common_cmds,
+   int *p_nr)
+{
+   int i, nr = 0;
+   struct cmdname_help *common_cmds;
+
+   ALLOC_ARRAY(common_cmds, ARRAY_SIZE(command_list));
+
+   for (i = 0; i < ARRAY_SIZE(command_list); i++) {
+   const struct cmdname_help *cmd = command_list + i;
+
+   if (cmd->category != CAT_mainporcelain)
+   continue;
+
+   common_cmds[nr++] = *cmd;
+   }
+
+   *p_common_cmds = common_cmds;
+   *p_nr = nr;
+}
+
 static int cmd_group_cmp(const void *elem1, const void *elem2)
 {
const struct cmdname_help *e1 = elem1;
@@ -206,17 +227,21 @@ void list_common_cmds_help(void)
 {
int i, longest = 0;
int current_grp = -1;
+   int nr = 0;
+   struct cmdname_help *common_cmds;
+
+   extract_common_cmds(_cmds, );
 
-   for (i = 0; i < ARRAY_SIZE(common_cmds); i++) {
+   for (i = 0; i < nr; i++) {
if (longest < strlen(common_cmds[i].name))
longest = strlen(common_cmds[i].name);
}
 
-   QSORT(common_cmds, ARRAY_SIZE(common_cmds), cmd_group_cmp);
+   QSORT(common_cmds, nr, cmd_group_cmp);
 
puts(_("These are common Git commands used in various situations:"));
 
-   for (i = 0; i < ARRAY_SIZE(common_cmds); i++) {
+   for (i = 0; i < nr; i++) {
if (common_cmds[i].group != current_grp) {
printf("\n%s\n", 
_(common_cmd_groups[common_cmds[i].group]));
current_grp = common_cmds[i].group;
@@ -226,6 +251,7 @@ void list_common_cmds_help(void)
mput_char(' ', longest - strlen(common_cmds[i].name));
puts(_(common_cmds[i].help));
}
+   free(common_cmds);
 }
 
 void list_all_cmds(void)
@@ -301,8 +327,9 @@ static const char bad_interpreter_advice[] =
 
 const 

[PATCH v2 4/6] git.c: implement --list-cmds=porcelain

2018-04-15 Thread Nguyễn Thái Ngọc Duy
This is useful for git-completion.bash because it needs this set of
commands. Right now we have to maintain a separate command category in
there.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash | 94 ++
 git.c  |  2 +
 help.c | 12 
 help.h |  1 +
 t/t9902-completion.sh  |  4 +-
 5 files changed, 20 insertions(+), 93 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index a5f13ade20..9f17703aa7 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -839,14 +839,15 @@ __git_commands () {
then
printf "%s" "${GIT_TESTING_COMMAND_COMPLETION}"
else
-   git --list-cmds=all
+   git --list-cmds=$1
fi
 }
 
 __git_list_all_commands ()
 {
local i IFS=" "$'\n'
-   for i in $(__git_commands)
+   local category=${1-all}
+   for i in $(__git_commands $category)
do
case $i in
*--*) : helper pattern;;
@@ -862,98 +863,11 @@ __git_compute_all_commands ()
__git_all_commands=$(__git_list_all_commands)
 }
 
-__git_list_porcelain_commands ()
-{
-   local i IFS=" "$'\n'
-   __git_compute_all_commands
-   for i in $__git_all_commands
-   do
-   case $i in
-   *--*) : helper pattern;;
-   applymbox): ask gittus;;
-   applypatch)   : ask gittus;;
-   archimport)   : import;;
-   cat-file) : plumbing;;
-   check-attr)   : plumbing;;
-   check-ignore) : plumbing;;
-   check-mailmap): plumbing;;
-   check-ref-format) : plumbing;;
-   checkout-index)   : plumbing;;
-   column)   : internal helper;;
-   commit-tree)  : plumbing;;
-   count-objects): infrequent;;
-   credential)   : credentials;;
-   credential-*) : credentials helper;;
-   cvsexportcommit)  : export;;
-   cvsimport): import;;
-   cvsserver): daemon;;
-   daemon)   : daemon;;
-   diff-files)   : plumbing;;
-   diff-index)   : plumbing;;
-   diff-tree): plumbing;;
-   fast-import)  : import;;
-   fast-export)  : export;;
-   fsck-objects) : plumbing;;
-   fetch-pack)   : plumbing;;
-   fmt-merge-msg): plumbing;;
-   for-each-ref) : plumbing;;
-   hash-object)  : plumbing;;
-   http-*)   : transport;;
-   index-pack)   : plumbing;;
-   init-db)  : deprecated;;
-   local-fetch)  : plumbing;;
-   ls-files) : plumbing;;
-   ls-remote): plumbing;;
-   ls-tree)  : plumbing;;
-   mailinfo) : plumbing;;
-   mailsplit): plumbing;;
-   merge-*)  : plumbing;;
-   mktree)   : plumbing;;
-   mktag): plumbing;;
-   pack-objects) : plumbing;;
-   pack-redundant)   : plumbing;;
-   pack-refs): plumbing;;
-   parse-remote) : plumbing;;
-   patch-id) : plumbing;;
-   prune): plumbing;;
-   prune-packed) : plumbing;;
-   quiltimport)  : import;;
-   read-tree): plumbing;;
-   receive-pack) : plumbing;;
-   remote-*) : transport;;
-   rerere)   : plumbing;;
-   rev-list) : plumbing;;
-   rev-parse): plumbing;;
-   runstatus): plumbing;;
-   sh-setup) : internal;;
-   shell): daemon;;
-   show-ref) : plumbing;;
-   send-pack): plumbing;;
-   show-index)   : plumbing;;
-   ssh-*): transport;;
-   stripspace)   : plumbing;;
-   symbolic-ref) : plumbing;;
-   unpack-file)  : plumbing;;
-   unpack-objects)   : plumbing;;
-   update-index) : plumbing;;
-   update-ref)   : plumbing;;
-   update-server-info) : daemon;;
-   upload-archive)   : plumbing;;
-   upload-pack)  : plumbing;;
-   write-tree)   : plumbing;;
-   var)  : infrequent;;
-   verify-pack)  

[PATCH v2 6/6] help: use command-list.txt for the source of guides

2018-04-15 Thread Nguyễn Thái Ngọc Duy
The help command currently hard codes the list of guides and their
summary in C. Let's move this list to command-list.txt. This lets us
extract summary lines from Documentation/git*.txt. This also
potentially lets us lists guides in git.txt, but I'll leave that for
now.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/gitattributes.txt |  2 +-
 Documentation/gitmodules.txt|  2 +-
 Documentation/gitrevisions.txt  |  2 +-
 builtin/help.c  | 32 
 command-list.txt|  8 
 generate-cmdlist.sh |  6 +-
 help.c  | 22 ++
 help.h  |  1 +
 8 files changed, 39 insertions(+), 36 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 1094fe2b5b..083c2f380d 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -3,7 +3,7 @@ gitattributes(5)
 
 NAME
 
-gitattributes - defining attributes per path
+gitattributes - Defining attributes per path
 
 SYNOPSIS
 
diff --git a/Documentation/gitmodules.txt b/Documentation/gitmodules.txt
index db5d47eb19..4d63def206 100644
--- a/Documentation/gitmodules.txt
+++ b/Documentation/gitmodules.txt
@@ -3,7 +3,7 @@ gitmodules(5)
 
 NAME
 
-gitmodules - defining submodule properties
+gitmodules - Defining submodule properties
 
 SYNOPSIS
 
diff --git a/Documentation/gitrevisions.txt b/Documentation/gitrevisions.txt
index 27dec5b91d..1f6cceaefb 100644
--- a/Documentation/gitrevisions.txt
+++ b/Documentation/gitrevisions.txt
@@ -3,7 +3,7 @@ gitrevisions(7)
 
 NAME
 
-gitrevisions - specifying revisions and ranges for Git
+gitrevisions - Specifying revisions and ranges for Git
 
 SYNOPSIS
 
diff --git a/builtin/help.c b/builtin/help.c
index 0e0af8426a..5727fb5e51 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -402,38 +402,6 @@ static void show_html_page(const char *git_cmd)
open_html(page_path.buf);
 }
 
-static struct {
-   const char *name;
-   const char *help;
-} common_guides[] = {
-   { "attributes", N_("Defining attributes per path") },
-   { "everyday", N_("Everyday Git With 20 Commands Or So") },
-   { "glossary", N_("A Git glossary") },
-   { "ignore", N_("Specifies intentionally untracked files to ignore") },
-   { "modules", N_("Defining submodule properties") },
-   { "revisions", N_("Specifying revisions and ranges for Git") },
-   { "tutorial", N_("A tutorial introduction to Git (for version 1.5.1 or 
newer)") },
-   { "workflows", N_("An overview of recommended workflows with Git") },
-};
-
-static void list_common_guides_help(void)
-{
-   int i, longest = 0;
-
-   for (i = 0; i < ARRAY_SIZE(common_guides); i++) {
-   if (longest < strlen(common_guides[i].name))
-   longest = strlen(common_guides[i].name);
-   }
-
-   puts(_("The common Git guides are:\n"));
-   for (i = 0; i < ARRAY_SIZE(common_guides); i++) {
-   printf("   %s   ", common_guides[i].name);
-   mput_char(' ', longest - strlen(common_guides[i].name));
-   puts(_(common_guides[i].help));
-   }
-   putchar('\n');
-}
-
 static const char *check_git_cmd(const char* cmd)
 {
char *alias;
diff --git a/command-list.txt b/command-list.txt
index a1fad28fd8..0809a19184 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -149,3 +149,11 @@ gitweb  
ancillaryinterrogators
 git-whatchanged ancillaryinterrogators
 git-worktreemainporcelain
 git-write-tree  plumbingmanipulators
+gitattributes   guide
+giteveryday guide
+gitglossary guide
+gitignore   guide
+gitmodules  guide
+gitrevisionsguide
+gittutorial guide
+gitworkflowsguide
diff --git a/generate-cmdlist.sh b/generate-cmdlist.sh
index e0893e979a..e35f3e357b 100755
--- a/generate-cmdlist.sh
+++ b/generate-cmdlist.sh
@@ -67,7 +67,11 @@ command_list "$1" |
 sort |
 while read cmd category tags
 do
-   name=${cmd/git-}
+   if [ "$category" = guide ]; then
+   name=${cmd/git}
+   else
+   name=${cmd/git-}
+   fi
sed -n '
/^NAME/,/'"$cmd"'/H
${
diff --git a/help.c b/help.c
index 7f72051641..a44f4a113e 100644
--- a/help.c
+++ b/help.c
@@ -313,6 +313,28 @@ static void list_commands_by_category(int cat, struct 
cmdname_help *cmds,
}
 }
 
+void list_common_guides_help(void)
+{
+   int i, longest = 0;
+   int nr = ARRAY_SIZE(command_list);
+   struct cmdname_help *cmds = command_list;
+
+   QSORT(cmds, nr, 

[PATCH v2 0/6] Keep all info in command-list.txt in git binary

2018-04-15 Thread Nguyễn Thái Ngọc Duy
v2 changes

- bug fixes spotted by Eric
- keep 'git help -av' layout close to what's in git.txt
- enable pager for 'git help -av' because it's usually long
- move guide list (aka 'help -g') to command-list.txt

Nguyễn Thái Ngọc Duy (6):
  git.c: convert --list-builtins to --list-cmds=builtins
  git.c: implement --list-cmds=all and use it in git-completion.bash
  generate-cmdlist.sh: keep all information in common-cmds.h
  git.c: implement --list-cmds=porcelain
  help: add "-a --verbose" to list all commands with synopsis
  help: use command-list.txt for the source of guides

 Documentation/git-help.txt |   4 +-
 Documentation/gitattributes.txt|   2 +-
 Documentation/gitmodules.txt   |   2 +-
 Documentation/gitrevisions.txt |   2 +-
 builtin/help.c |  39 ++
 command-list.txt   |   8 ++
 contrib/completion/git-completion.bash |  96 +--
 generate-cmdlist.sh|  65 +++---
 git.c  |  16 ++-
 help.c | 163 -
 help.h |   4 +
 t/t0012-help.sh|   2 +-
 t/t9902-completion.sh  |   4 +-
 13 files changed, 249 insertions(+), 158 deletions(-)

-- 
2.17.0.367.g5dd2e386c3



[PATCH v2 1/6] git.c: convert --list-builtins to --list-cmds=builtins

2018-04-15 Thread Nguyễn Thái Ngọc Duy
Even if this is a hidden option, let's make it a bit more generic
since we're introducing more listing types.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash |  2 +-
 git.c  | 12 +++-
 t/t0012-help.sh|  2 +-
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index a757073945..3556838759 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -3049,7 +3049,7 @@ __git_complete_common () {
 __git_cmds_with_parseopt_helper=
 __git_support_parseopt_helper () {
test -n "$__git_cmds_with_parseopt_helper" ||
-   __git_cmds_with_parseopt_helper="$(__git 
--list-parseopt-builtins)"
+   __git_cmds_with_parseopt_helper="$(__git --list-cmds=parseopt)"
 
case " $__git_cmds_with_parseopt_helper " in
*" $1 "*)
diff --git a/git.c b/git.c
index 3a89893712..28bfa96d87 100644
--- a/git.c
+++ b/git.c
@@ -223,11 +223,13 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
}
(*argv)++;
(*argc)--;
-   } else if (!strcmp(cmd, "--list-builtins")) {
-   list_builtins(0, '\n');
-   exit(0);
-   } else if (!strcmp(cmd, "--list-parseopt-builtins")) {
-   list_builtins(NO_PARSEOPT, ' ');
+   } else if (skip_prefix(cmd, "--list-cmds=", )) {
+   if (!strcmp(cmd, "builtins"))
+   list_builtins(0, '\n');
+   else if (!strcmp(cmd, "parseopt"))
+   list_builtins(NO_PARSEOPT, ' ');
+   else
+   die("unsupported command listing type '%s'", 
cmd);
exit(0);
} else {
fprintf(stderr, _("unknown option: %s\n"), cmd);
diff --git a/t/t0012-help.sh b/t/t0012-help.sh
index 487b92a5de..fd2a7f27dc 100755
--- a/t/t0012-help.sh
+++ b/t/t0012-help.sh
@@ -50,7 +50,7 @@ test_expect_success "--help does not work for guides" "
 "
 
 test_expect_success 'generate builtin list' '
-   git --list-builtins >builtins
+   git --list-cmds=builtins >builtins
 '
 
 while read builtin
-- 
2.17.0.367.g5dd2e386c3



[PATCH v2 5/6] help: add "-a --verbose" to list all commands with synopsis

2018-04-15 Thread Nguyễn Thái Ngọc Duy
This lists all recognized commands [1] by category. The group order
follows closely git.txt.

[1] We may actually show commands that are not built (e.g. if you set
NO_PERL you don't have git-instaweb but it's still listed here). I
ignore the problem because on Linux a git package could be split
anyway. The "git-core" package may not contain git-instaweb even if
it's built because it may end up in a separate package. We can't know
anyway.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-help.txt |  4 ++-
 builtin/help.c |  7 
 help.c | 69 ++
 help.h |  1 +
 4 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-help.txt b/Documentation/git-help.txt
index 40d328a4b3..a40fc38d8b 100644
--- a/Documentation/git-help.txt
+++ b/Documentation/git-help.txt
@@ -8,7 +8,7 @@ git-help - Display help information about Git
 SYNOPSIS
 
 [verse]
-'git help' [-a|--all] [-g|--guide]
+'git help' [-a|--all [--verbose]] [-g|--guide]
   [-i|--info|-m|--man|-w|--web] [COMMAND|GUIDE]
 
 DESCRIPTION
@@ -42,6 +42,8 @@ OPTIONS
 --all::
Prints all the available commands on the standard output. This
option overrides any given command or guide name.
+   When used with `--verbose` print description for all recognized
+   commands.
 
 -g::
 --guides::
diff --git a/builtin/help.c b/builtin/help.c
index 598867cfea..0e0af8426a 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -36,6 +36,7 @@ static const char *html_path;
 
 static int show_all = 0;
 static int show_guides = 0;
+static int verbose;
 static unsigned int colopts;
 static enum help_format help_format = HELP_FORMAT_NONE;
 static int exclude_guides;
@@ -48,6 +49,7 @@ static struct option builtin_help_options[] = {
HELP_FORMAT_WEB),
OPT_SET_INT('i', "info", _format, N_("show info page"),
HELP_FORMAT_INFO),
+   OPT__VERBOSE(, N_("print command description")),
OPT_END(),
 };
 
@@ -463,6 +465,11 @@ int cmd_help(int argc, const char **argv, const char 
*prefix)
 
if (show_all) {
git_config(git_help_config, NULL);
+   if (verbose) {
+   setup_pager();
+   list_all_cmds_help();
+   return 0;
+   }
printf(_("usage: %s%s"), _(git_usage_string), "\n\n");
load_command_list("git-", _cmds, _cmds);
list_commands(colopts, _cmds, _cmds);
diff --git a/help.c b/help.c
index 1523ca175c..7f72051641 100644
--- a/help.c
+++ b/help.c
@@ -284,6 +284,75 @@ void list_porcelain_cmds(void)
}
 }
 
+static int cmd_category_cmp(const void *elem1, const void *elem2)
+{
+   const struct cmdname_help *e1 = elem1;
+   const struct cmdname_help *e2 = elem2;
+
+   if (e1->category < e2->category)
+   return -1;
+   if (e1->category > e2->category)
+   return 1;
+   return strcmp(e1->name, e2->name);
+}
+
+static void list_commands_by_category(int cat, struct cmdname_help *cmds,
+ int nr, int longest)
+{
+   int i;
+
+   for (i = 0; i < nr; i++) {
+   struct cmdname_help *cmd = cmds + i;
+
+   if (cmd->category != cat)
+   continue;
+
+   printf("   %s   ", cmd->name);
+   mput_char(' ', longest - strlen(cmd->name));
+   puts(_(cmd->help));
+   }
+}
+
+void list_all_cmds_help(void)
+{
+   int i, longest = 0;
+   int nr = ARRAY_SIZE(command_list);
+   struct cmdname_help *cmds = command_list;
+
+   for (i = 0; i < nr; i++) {
+   struct cmdname_help *cmd = cmds + i;
+
+   if (longest < strlen(cmd->name))
+   longest = strlen(cmd->name);
+   }
+
+   QSORT(cmds, nr, cmd_category_cmp);
+
+   printf("%s\n\n", _("Main Porcelain Commands"));
+   list_commands_by_category(CAT_mainporcelain, cmds, nr, longest);
+
+   printf("\n%s\n\n", _("Ancillary Commands / Manipulators"));
+   list_commands_by_category(CAT_ancillarymanipulators, cmds, nr, longest);
+
+   printf("\n%s\n\n", _("Ancillary Commands / Interrogators"));
+   list_commands_by_category(CAT_ancillaryinterrogators, cmds, nr, 
longest);
+
+   printf("\n%s\n\n", _("Interacting with Others"));
+   list_commands_by_category(CAT_foreignscminterface, cmds, nr, longest);
+
+   printf("\n%s\n\n", _("Low-level Commands / Manipulators"));
+   list_commands_by_category(CAT_plumbingmanipulators, cmds, nr, longest);
+
+   printf("\n%s\n\n", _("Low-level Commands / Interrogators"));
+   list_commands_by_category(CAT_plumbinginterrogators, cmds, nr, longest);
+
+   printf("\n%s\n\n", _("Low-level Commands / Synching Repositories"));
+   

[PATCH v2 2/6] git.c: implement --list-cmds=all and use it in git-completion.bash

2018-04-15 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash |  2 +-
 git.c  |  2 ++
 help.c | 18 ++
 help.h |  1 +
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 3556838759..a5f13ade20 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -839,7 +839,7 @@ __git_commands () {
then
printf "%s" "${GIT_TESTING_COMMAND_COMPLETION}"
else
-   git help -a|egrep '^  [a-zA-Z0-9]'
+   git --list-cmds=all
fi
 }
 
diff --git a/git.c b/git.c
index 28bfa96d87..64f67e7f7f 100644
--- a/git.c
+++ b/git.c
@@ -228,6 +228,8 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
list_builtins(0, '\n');
else if (!strcmp(cmd, "parseopt"))
list_builtins(NO_PARSEOPT, ' ');
+   else if (!strcmp(cmd, "all"))
+   list_all_cmds();
else
die("unsupported command listing type '%s'", 
cmd);
exit(0);
diff --git a/help.c b/help.c
index 60071a9bea..e155c39870 100644
--- a/help.c
+++ b/help.c
@@ -228,6 +228,24 @@ void list_common_cmds_help(void)
}
 }
 
+void list_all_cmds(void)
+{
+   struct cmdnames main_cmds, other_cmds;
+   int i;
+
+   memset(_cmds, 0, sizeof(main_cmds));
+   memset(_cmds, 0, sizeof(other_cmds));
+   load_command_list("git-", _cmds, _cmds);
+
+   for (i = 0; i < main_cmds.cnt; i++)
+   puts(main_cmds.names[i]->name);
+   for (i = 0; i < other_cmds.cnt; i++)
+   puts(other_cmds.names[i]->name);
+
+   clean_cmdnames(_cmds);
+   clean_cmdnames(_cmds);
+}
+
 int is_in_cmdlist(struct cmdnames *c, const char *s)
 {
int i;
diff --git a/help.h b/help.h
index b21d7c94e8..0bf29f8dc5 100644
--- a/help.h
+++ b/help.h
@@ -17,6 +17,7 @@ static inline void mput_char(char c, unsigned int num)
 }
 
 extern void list_common_cmds_help(void);
+extern void list_all_cmds(void);
 extern const char *help_unknown_cmd(const char *cmd);
 extern void load_command_list(const char *prefix,
  struct cmdnames *main_cmds,
-- 
2.17.0.367.g5dd2e386c3



Re: [PATCH/RFC 3/5] generate-cmdlist.sh: keep all information in common-cmds.h

2018-04-15 Thread Duy Nguyen
On Mon, Apr 9, 2018 at 6:59 AM, Eric Sunshine  wrote:
>> +awk '{print $2;}' |
>
> At one time, Junio expressed concerns[2] about having an 'awk'
> dependency in the build system (in fact, with regards to this same
> generation process). Whether he still has such concerns is unknown,
> but it should be easy enough to avoid it here (and below).
>
> [2]: https://public-inbox.org/git/20150519004356.GA12854@flurp.local/

I'll stick with awk to avoid too much headache with regular
expressions (replacements are welcome though). We do use awk in our
test suite so it should be ok (who builds git and runs it without
testing?)
-- 
Duy


[PATCH v2 4/7] gc: add gc.bigPackThreshold config

2018-04-15 Thread Nguyễn Thái Ngọc Duy
The --keep-largest-pack option is not very convenient to use because
you need to tell gc to do this explicitly (and probably on just a few
large repos).

Add a config key that enables this mode when packs larger than a limit
are found. Note that there's a slight behavior difference compared to
--keep-largest-pack: all packs larger than the threshold are kept, not
just the largest one.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt |  7 +++
 Documentation/git-gc.txt |  6 --
 builtin/gc.c | 26 --
 3 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb3..728ae902e6 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1558,6 +1558,13 @@ gc.autoDetach::
Make `git gc --auto` return immediately and run in background
if the system supports it. Default is true.
 
+gc.bigPackThreshold::
+   If non-zero, all packs larger than this limit are kept when
+   `git gc` is run. This is very similar to `--keep-base-pack`
+   except that all packs that meet the threshold are kept, not
+   just the base pack. Defaults to zero. Common unit suffixes of
+   'k', 'm', or 'g' are supported.
+
 gc.logExpiry::
If the file gc.log exists, then `git gc --auto` won't run
unless that file is more than 'gc.logExpiry' old.  Default is
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 8f903231da..649faddfa6 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -56,7 +56,8 @@ single pack using `git repack -d -l`.  Setting the value of 
`gc.auto`
 to 0 disables automatic packing of loose objects.
 +
 If the number of packs exceeds the value of `gc.autoPackLimit`,
-then existing packs (except those marked with a `.keep` file)
+then existing packs (except those marked with a `.keep` file
+or over `gc.bigPackThreshold` limit)
 are consolidated into a single pack by using the `-A` option of
 'git repack'. Setting `gc.autoPackLimit` to 0 disables
 automatic consolidation of packs.
@@ -86,7 +87,8 @@ be performed as well.
 
 --keep-largest-pack::
All packs except the largest pack and those marked with a
-   `.keep` files are consolidated into a single pack.
+   `.keep` files are consolidated into a single pack. When this
+   option is used, `gc.bigPackThreshold` is ignored.
 
 Configuration
 -
diff --git a/builtin/gc.c b/builtin/gc.c
index f251662a8f..81ecdc5ffa 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -41,6 +41,7 @@ static timestamp_t gc_log_expire_time;
 static const char *gc_log_expire = "1.day.ago";
 static const char *prune_expire = "2.weeks.ago";
 static const char *prune_worktrees_expire = "3.months.ago";
+static unsigned long big_pack_threshold;
 
 static struct argv_array pack_refs_cmd = ARGV_ARRAY_INIT;
 static struct argv_array reflog = ARGV_ARRAY_INIT;
@@ -128,6 +129,8 @@ static void gc_config(void)
git_config_get_expiry("gc.worktreepruneexpire", 
_worktrees_expire);
git_config_get_expiry("gc.logexpiry", _log_expire);
 
+   git_config_get_ulong("gc.bigpackthreshold", _pack_threshold);
+
git_config(git_default_config, NULL);
 }
 
@@ -166,14 +169,17 @@ static int too_many_loose_objects(void)
return needed;
 }
 
-static void find_base_packs(struct string_list *packs)
+static void find_base_packs(struct string_list *packs, unsigned long limit)
 {
struct packed_git *p, *base = NULL;
 
for (p = get_packed_git(the_repository); p; p = p->next) {
if (!p->pack_local)
continue;
-   if (!base || base->pack_size < p->pack_size) {
+   if (limit) {
+   if (p->pack_size >= limit)
+   string_list_append(packs, p->pack_name);
+   } else if (!base || base->pack_size < p->pack_size) {
base = p;
}
}
@@ -244,9 +250,15 @@ static int need_to_gc(void)
 * we run "repack -A -d -l".  Otherwise we tell the caller
 * there is no need.
 */
-   if (too_many_packs())
-   add_repack_all_option(NULL);
-   else if (too_many_loose_objects())
+   if (too_many_packs()) {
+   struct string_list keep_pack = STRING_LIST_INIT_NODUP;
+
+   if (big_pack_threshold)
+   find_base_packs(_pack, big_pack_threshold);
+
+   add_repack_all_option(_pack);
+   string_list_clear(_pack, 0);
+   } else if (too_many_loose_objects())
add_repack_incremental_option();
else
return 0;
@@ -464,7 +476,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 
if (keep_base_pack != -1) {
if (keep_base_pack)
-   find_base_packs(_pack);
+

[PATCH v2 6/7] gc --auto: exclude base pack if not enough mem to "repack -ad"

2018-04-15 Thread Nguyễn Thái Ngọc Duy
pack-objects could be a big memory hog especially on large repos,
everybody knows that. The suggestion to stick a .keep file on the
giant base pack to avoid this problem is also known for a long time.

Recent patches add an option to do just this, but it has to be either
configured or activated manually. This patch lets `git gc --auto`
activate this mode automatically when it thinks `repack -ad` will use
a lot of memory and start affecting the system due to swapping or
flushing OS cache.

gc --auto decides to do this based on an estimation of pack-objects
memory usage, which is quite accurate at least for the heap part, and
whether that fits in half of system memory (the assumption here is for
desktop environment where there are many other applications running).

This mechanism only kicks in if gc.bigBasePackThreshold is not configured.
If it is, it is assumed that the user already knows what they want.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-gc.txt |  9 +++-
 builtin/gc.c | 98 +++-
 builtin/pack-objects.c   |  2 +-
 config.mak.uname |  1 +
 git-compat-util.h|  4 ++
 pack-objects.h   |  2 +
 t/t6500-gc.sh|  7 +++
 7 files changed, 119 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 649faddfa6..0fb1d43b2c 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -59,8 +59,13 @@ If the number of packs exceeds the value of 
`gc.autoPackLimit`,
 then existing packs (except those marked with a `.keep` file
 or over `gc.bigPackThreshold` limit)
 are consolidated into a single pack by using the `-A` option of
-'git repack'. Setting `gc.autoPackLimit` to 0 disables
-automatic consolidation of packs.
+'git repack'.
+If the amount of memory is estimated not enough for `git repack` to
+run smoothly and `gc.bigPackThreshold` is not set, the largest
+pack will also be excluded (this is the equivalent of running `git gc`
+with `--keep-base-pack`).
+Setting `gc.autoPackLimit` to 0 disables automatic consolidation of
+packs.
 +
 If houskeeping is required due to many loose objects or packs, all
 other housekeeping tasks (e.g. rerere, working trees, reflog...) will
diff --git a/builtin/gc.c b/builtin/gc.c
index 23c17ba7e9..3c7c93e961 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -22,6 +22,10 @@
 #include "commit.h"
 #include "packfile.h"
 #include "object-store.h"
+#include "pack.h"
+#include "pack-objects.h"
+#include "blob.h"
+#include "tree.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -42,6 +46,7 @@ static const char *gc_log_expire = "1.day.ago";
 static const char *prune_expire = "2.weeks.ago";
 static const char *prune_worktrees_expire = "3.months.ago";
 static unsigned long big_pack_threshold;
+static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
 static struct argv_array pack_refs_cmd = ARGV_ARRAY_INIT;
 static struct argv_array reflog = ARGV_ARRAY_INIT;
@@ -130,6 +135,7 @@ static void gc_config(void)
git_config_get_expiry("gc.logexpiry", _log_expire);
 
git_config_get_ulong("gc.bigpackthreshold", _pack_threshold);
+   git_config_get_ulong("pack.deltacachesize", _delta_cache_size);
 
git_config(git_default_config, NULL);
 }
@@ -169,7 +175,8 @@ static int too_many_loose_objects(void)
return needed;
 }
 
-static void find_base_packs(struct string_list *packs, unsigned long limit)
+static struct packed_git *find_base_packs(struct string_list *packs,
+ unsigned long limit)
 {
struct packed_git *p, *base = NULL;
 
@@ -186,6 +193,8 @@ static void find_base_packs(struct string_list *packs, 
unsigned long limit)
 
if (base)
string_list_append(packs, base->pack_name);
+
+   return base;
 }
 
 static int too_many_packs(void)
@@ -210,6 +219,79 @@ static int too_many_packs(void)
return gc_auto_pack_limit < cnt;
 }
 
+static uint64_t total_ram(void)
+{
+#if defined(HAVE_SYSINFO)
+   struct sysinfo si;
+
+   if (!sysinfo())
+   return si.totalram;
+#elif defined(HAVE_BSD_SYSCTL) && (defined(HW_MEMSIZE) || defined(HW_PHYSMEM))
+   int64_t physical_memory;
+   int mib[2];
+   size_t length;
+
+   mib[0] = CTL_HW;
+# if defined(HW_MEMSIZE)
+   mib[1] = HW_MEMSIZE;
+# else
+   mib[1] = HW_PHYSMEM;
+# endif
+   length = sizeof(int64_t);
+   if (!sysctl(mib, 2, _memory, , NULL, 0))
+   return physical_memory;
+#elif defined(GIT_WINDOWS_NATIVE)
+   MEMORYSTATUSEX memInfo;
+
+   memInfo.dwLength = sizeof(MEMORYSTATUSEX);
+   if (GlobalMemoryStatusEx())
+   return memInfo.ullTotalPhys;
+#endif
+   return 0;
+}
+
+static uint64_t estimate_repack_memory(struct packed_git *pack)
+{
+   unsigned long nr_objects = approximate_object_count();
+   size_t os_cache, heap;
+
+   if (!pack || !nr_objects)
+  

[PATCH v2 3/7] gc: add --keep-largest-pack option

2018-04-15 Thread Nguyễn Thái Ngọc Duy
This adds a new repack mode that combines everything into a secondary
pack, leaving the largest pack alone.

This could help reduce memory pressure. On linux-2.6.git, valgrind
massif reports 1.6GB heap in "pack all" case, and 535MB in "pack
all except the base pack" case. We save roughly 1GB memory by
excluding the base pack.

This should also lower I/O because we don't have to rewrite a giant
pack every time (e.g. for linux-2.6.git that's a 1.4GB pack file)..

PS. The use of string_list here seems overkill, but we'll need it in
the next patch...

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-gc.txt |  6 +-
 builtin/gc.c | 45 
 t/t6500-gc.sh| 25 ++
 3 files changed, 71 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 3126e0dd00..8f903231da 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -9,7 +9,7 @@ git-gc - Cleanup unnecessary files and optimize the local 
repository
 SYNOPSIS
 
 [verse]
-'git gc' [--aggressive] [--auto] [--quiet] [--prune= | --no-prune] 
[--force]
+'git gc' [--aggressive] [--auto] [--quiet] [--prune= | --no-prune] 
[--force] [--keep-largest-pack]
 
 DESCRIPTION
 ---
@@ -84,6 +84,10 @@ be performed as well.
Force `git gc` to run even if there may be another `git gc`
instance running on this repository.
 
+--keep-largest-pack::
+   All packs except the largest pack and those marked with a
+   `.keep` files are consolidated into a single pack.
+
 Configuration
 -
 
diff --git a/builtin/gc.c b/builtin/gc.c
index 3e67124eaa..f251662a8f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -166,6 +166,22 @@ static int too_many_loose_objects(void)
return needed;
 }
 
+static void find_base_packs(struct string_list *packs)
+{
+   struct packed_git *p, *base = NULL;
+
+   for (p = get_packed_git(the_repository); p; p = p->next) {
+   if (!p->pack_local)
+   continue;
+   if (!base || base->pack_size < p->pack_size) {
+   base = p;
+   }
+   }
+
+   if (base)
+   string_list_append(packs, base->pack_name);
+}
+
 static int too_many_packs(void)
 {
struct packed_git *p;
@@ -188,7 +204,13 @@ static int too_many_packs(void)
return gc_auto_pack_limit < cnt;
 }
 
-static void add_repack_all_option(void)
+static int keep_one_pack(struct string_list_item *item, void *data)
+{
+   argv_array_pushf(, "--keep-pack=%s", basename(item->string));
+   return 0;
+}
+
+static void add_repack_all_option(struct string_list *keep_pack)
 {
if (prune_expire && !strcmp(prune_expire, "now"))
argv_array_push(, "-a");
@@ -197,6 +219,9 @@ static void add_repack_all_option(void)
if (prune_expire)
argv_array_pushf(, "--unpack-unreachable=%s", 
prune_expire);
}
+
+   if (keep_pack)
+   for_each_string_list(keep_pack, keep_one_pack, NULL);
 }
 
 static void add_repack_incremental_option(void)
@@ -220,7 +245,7 @@ static int need_to_gc(void)
 * there is no need.
 */
if (too_many_packs())
-   add_repack_all_option();
+   add_repack_all_option(NULL);
else if (too_many_loose_objects())
add_repack_incremental_option();
else
@@ -354,6 +379,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
const char *name;
pid_t pid;
int daemonized = 0;
+   int keep_base_pack = -1;
 
struct option builtin_gc_options[] = {
OPT__QUIET(, N_("suppress progress reporting")),
@@ -366,6 +392,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
OPT_BOOL_F(0, "force", ,
   N_("force running gc even if there may be another gc 
running"),
   PARSE_OPT_NOCOMPLETE),
+   OPT_BOOL(0, "keep-largest-pack", _base_pack,
+N_("repack all other packs except the largest pack")),
OPT_END()
};
 
@@ -431,8 +459,17 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 */
daemonized = !daemonize();
}
-   } else
-   add_repack_all_option();
+   } else {
+   struct string_list keep_pack = STRING_LIST_INIT_NODUP;
+
+   if (keep_base_pack != -1) {
+   if (keep_base_pack)
+   find_base_packs(_pack);
+   }
+
+   add_repack_all_option(_pack);
+   string_list_clear(_pack, 0);
+   }
 
name = lock_repo_for_gc(force, );
if (name) {
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index d5255dd576..c42f60bc5b 100755
--- a/t/t6500-gc.sh
+++ 

[PATCH v2 7/7] pack-objects: show some progress when counting kept objects

2018-04-15 Thread Nguyễn Thái Ngọc Duy
We only show progress when there are new objects to be packed. But
when --keep-pack is specified on the base pack, we will exclude most
of objects. This makes 'pack-objects' stay silent for a long time
while the counting phase is going.

Let's show some progress whenever we visit an object instead. The old
"Counting objects" is renamed to "Enumerating objects" and a new
progress "Counting objects" line is added.

This new "Counting objects" line should progress pretty quick when the
system is beefy. But when the system is under pressure, the reading
object header done in this phase could be slow and showing progress is
an improvement over staying silent in the current code.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index c77bea404d..6a1346c41f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -46,7 +46,7 @@ static const char *pack_usage[] = {
 static struct packing_data to_pack;
 
 static struct pack_idx_entry **written_list;
-static uint32_t nr_result, nr_written;
+static uint32_t nr_result, nr_written, nr_seen;
 
 static int non_empty;
 static int reuse_delta = 1, reuse_object = 1;
@@ -1096,6 +1096,8 @@ static int add_object_entry(const struct object_id *oid, 
enum object_type type,
off_t found_offset = 0;
uint32_t index_pos;
 
+   display_progress(progress_state, ++nr_seen);
+
if (have_duplicate_entry(oid, exclude, _pos))
return 0;
 
@@ -,8 +1113,6 @@ static int add_object_entry(const struct object_id *oid, 
enum object_type type,
create_object_entry(oid, type, pack_name_hash(name),
exclude, name && no_try_delta(name),
index_pos, found_pack, found_offset);
-
-   display_progress(progress_state, nr_result);
return 1;
 }
 
@@ -1123,6 +1123,8 @@ static int add_object_entry_from_bitmap(const struct 
object_id *oid,
 {
uint32_t index_pos;
 
+   display_progress(progress_state, ++nr_seen);
+
if (have_duplicate_entry(oid, 0, _pos))
return 0;
 
@@ -1130,8 +1132,6 @@ static int add_object_entry_from_bitmap(const struct 
object_id *oid,
return 0;
 
create_object_entry(oid, type, name_hash, 0, 0, index_pos, pack, 
offset);
-
-   display_progress(progress_state, nr_result);
return 1;
 }
 
@@ -1716,6 +1716,10 @@ static void get_object_details(void)
uint32_t i;
struct object_entry **sorted_by_offset;
 
+   if (progress)
+   progress_state = start_progress(_("Counting objects"),
+   to_pack.nr_objects);
+
sorted_by_offset = xcalloc(to_pack.nr_objects, sizeof(struct 
object_entry *));
for (i = 0; i < to_pack.nr_objects; i++)
sorted_by_offset[i] = to_pack.objects + i;
@@ -1726,7 +1730,9 @@ static void get_object_details(void)
check_object(entry);
if (big_file_threshold < entry->size)
entry->no_try_delta = 1;
+   display_progress(progress_state, i + 1);
}
+   stop_progress(_state);
 
/*
 * This must happen in a second pass, since we rely on the delta
@@ -3209,7 +3215,7 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
}
 
if (progress)
-   progress_state = start_progress(_("Counting objects"), 0);
+   progress_state = start_progress(_("Enumerating objects"), 0);
if (!use_internal_rev_list)
read_object_list_from_stdin();
else {
-- 
2.17.0.367.g5dd2e386c3



[PATCH v2 5/7] gc: handle a corner case in gc.bigPackThreshold

2018-04-15 Thread Nguyễn Thái Ngọc Duy
This config allows us to keep  packs back if their size is larger
than a limit. But if this N >= gc.autoPackLimit, we may have a
problem. We are supposed to reduce the number of packs after a
threshold because it affects performance.

We could tell the user that they have incompatible gc.bigPackThreshold
and gc.autoPackLimit, but it's kinda hard when 'git gc --auto' runs in
background. Instead let's fall back to the next best stategy: try to
reduce the number of packs anyway, but keep the base pack out. This
reduces the number of packs to two and hopefully won't take up too
much resources to repack (the assumption still is the base pack takes
most resources to handle).

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt | 5 +
 builtin/gc.c | 8 +++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 728ae902e6..4c3f1d7651 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1564,6 +1564,11 @@ gc.bigPackThreshold::
except that all packs that meet the threshold are kept, not
just the base pack. Defaults to zero. Common unit suffixes of
'k', 'm', or 'g' are supported.
++
+Note that if the number of kept packs is more than gc.autoPackLimit,
+this configuration variable is ignored, all packs except the base pack
+will be repacked. After this the number of packs should go below
+gc.autoPackLimit and gc.bigPackThreshold should be respected again.
 
 gc.logExpiry::
If the file gc.log exists, then `git gc --auto` won't run
diff --git a/builtin/gc.c b/builtin/gc.c
index 81ecdc5ffa..23c17ba7e9 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -253,8 +253,14 @@ static int need_to_gc(void)
if (too_many_packs()) {
struct string_list keep_pack = STRING_LIST_INIT_NODUP;
 
-   if (big_pack_threshold)
+   if (big_pack_threshold) {
find_base_packs(_pack, big_pack_threshold);
+   if (keep_pack.nr >= gc_auto_pack_limit) {
+   big_pack_threshold = 0;
+   string_list_clear(_pack, 0);
+   find_base_packs(_pack, 0);
+   }
+   }
 
add_repack_all_option(_pack);
string_list_clear(_pack, 0);
-- 
2.17.0.367.g5dd2e386c3



[PATCH v2 0/7] nd/repack-keep-pack update

2018-04-15 Thread Nguyễn Thái Ngọc Duy
On Sat, Apr 14, 2018 at 9:47 PM, Ævar Arnfjörð Bjarmason  
wrote:
> The only (trivial) issue I found in the patches themselves was that
> between 4/5 and 5/7 you're adding an empty line to config.txt in 4/7
> just to erase it in 5/7, better not to add it to begin with, but
> hopefully Junio can fix that up (if he cares).

Fixed (and is the only change in v2) in case Junio has not picked up
the last round and fixed it himself yet.

Nguyễn Thái Ngọc Duy (7):
  t7700: have closing quote of a test at the beginning of line
  repack: add --keep-pack option
  gc: add --keep-largest-pack option
  gc: add gc.bigPackThreshold config
  gc: handle a corner case in gc.bigPackThreshold
  gc --auto: exclude base pack if not enough mem to "repack -ad"
  pack-objects: show some progress when counting kept objects

 Documentation/config.txt   |  12 +++
 Documentation/git-gc.txt   |  19 +++-
 Documentation/git-pack-objects.txt |   9 +-
 Documentation/git-repack.txt   |   9 +-
 builtin/gc.c   | 165 +++--
 builtin/pack-objects.c |  83 +++
 builtin/repack.c   |  21 +++-
 config.mak.uname   |   1 +
 git-compat-util.h  |   4 +
 object-store.h |   1 +
 pack-objects.h |   2 +
 t/t6500-gc.sh  |  32 ++
 t/t7700-repack.sh  |  27 -
 13 files changed, 349 insertions(+), 36 deletions(-)

-- 
2.17.0.367.g5dd2e386c3



[PATCH v2 1/7] t7700: have closing quote of a test at the beginning of line

2018-04-15 Thread Nguyễn Thái Ngọc Duy
The closing quote of a test body by convention is always at the start
of line.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
---
 t/t7700-repack.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 6061a04147..38247afbec 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -194,7 +194,7 @@ test_expect_success 'objects made unreachable by grafts 
only are kept' '
git reflog expire --expire=$test_tick --expire-unreachable=$test_tick 
--all &&
git repack -a -d &&
git cat-file -t $H1
-   '
+'
 
 test_done
 
-- 
2.17.0.367.g5dd2e386c3



Re: Regression in patch add?

2018-04-15 Thread Martin Ågren
Hi Mahmoud

On 15 April 2018 at 14:21,   wrote:
> I first run `git add -p`, then manually edit a chunk (after hitting `s`
> once, if it matters). The chunk originally contains the following:

[...]

> Under git 2.7.4, I can edit it to the following, which is accepted
> without a problem:
>
> ```diff
> # Manual hunk edit mode -- see bottom for a quick guide
> @@ -20,7 +20,7 @@
> "call dein#add('Shougo/dein.vim', {'rev': 'master'})
>
> " Add or remove your plugins here:
> -   " call dein#add('flazz/vim-colorschemes')
> -   call dein#add('Haron-Prime/evening_vim')
> +   call dein#add('flazz/vim-colorschemes')
> +   call dein#add('Haron-Prime/evening_vim')
>
> "core plugins that change the behavior of vim and how we use it 
> globally
> ```
>
> All I did here was remove one `+` line and manually add another (which
> is a variant of the second `-` line).

So the line is identical (sans s/^-/+/). Interesting.

> Under git 2.17.0.252.gfe0a9ea, the same piece is opened in $VISUAL for
> editing (and if left unmodified applies OK), but when modified in the
> to the same exact value, after exiting the editor I receive the
> following error from git:
>
> error: patch fragment without header at line 15: @@ -25,7 +25,8 @@

I can't seem to reproduce this with some very simple testing. Are you
able to share your files? Or even better, derive a minimal reproduction
recipe?

What happens if you do not do a "remove this line, then add it again",
but instead turn that unchanged line into context? That is, you edit the
hunk into something like this (but without white-space damage):

...
-   " call dein#add('flazz/vim-colorschemes')
+   call dein#add('flazz/vim-colorschemes')
call dein#add('Haron-Prime/evening_vim')
...

Adding Phillip to cc, since he was recently working in this area and
might have an idea.

Martin


[PATCH v3] bisect: create 'bisect_flags' parameter in find_bisection()

2018-04-15 Thread Harald Nordgren
Make it possible to implement bisecting only on first parents or on
merge commits by passing flags to find_bisection(), instead of just
a 'find_all' boolean.

Signed-off-by: Harald Nordgren 
---

Notes:
Updating commit message

 bisect.c   | 20 
 bisect.h   |  6 --
 builtin/rev-list.c |  6 +++---
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/bisect.c b/bisect.c
index a579b50884..19dac7491d 100644
--- a/bisect.c
+++ b/bisect.c
@@ -254,7 +254,7 @@ static struct commit_list *best_bisection_sorted(struct 
commit_list *list, int n
  */
 static struct commit_list *do_find_bisection(struct commit_list *list,
 int nr, int *weights,
-int find_all)
+unsigned int bisect_flags)
 {
int n, counted;
struct commit_list *p;
@@ -313,7 +313,7 @@ static struct commit_list *do_find_bisection(struct 
commit_list *list,
clear_distance(list);
 
/* Does it happen to be at exactly half-way? */
-   if (!find_all && halfway(p, nr))
+   if (!(bisect_flags & BISECT_FIND_ALL) && halfway(p, nr))
return p;
counted++;
}
@@ -351,21 +351,21 @@ static struct commit_list *do_find_bisection(struct 
commit_list *list,
weight_set(p, weight(q));
 
/* Does it happen to be at exactly half-way? */
-   if (!find_all && halfway(p, nr))
+   if (!(bisect_flags & BISECT_FIND_ALL) && halfway(p, nr))
return p;
}
}
 
show_list("bisection 2 counted all", counted, nr, list);
 
-   if (!find_all)
+   if (!(bisect_flags & BISECT_FIND_ALL))
return best_bisection(list, nr);
else
return best_bisection_sorted(list, nr);
 }
 
 void find_bisection(struct commit_list **commit_list, int *reaches,
-   int *all, int find_all)
+   int *all, unsigned int bisect_flags)
 {
int nr, on_list;
struct commit_list *list, *p, *best, *next, *last;
@@ -400,9 +400,9 @@ void find_bisection(struct commit_list **commit_list, int 
*reaches,
weights = xcalloc(on_list, sizeof(*weights));
 
/* Do the real work of finding bisection commit. */
-   best = do_find_bisection(list, nr, weights, find_all);
+   best = do_find_bisection(list, nr, weights, bisect_flags);
if (best) {
-   if (!find_all) {
+   if (!(bisect_flags & BISECT_FIND_ALL)) {
list->item = best->item;
free_commit_list(list->next);
best = list;
@@ -943,6 +943,7 @@ int bisect_next_all(const char *prefix, int no_checkout)
struct rev_info revs;
struct commit_list *tried;
int reaches = 0, all = 0, nr, steps;
+   unsigned int bisect_flags = 0;
struct object_id *bisect_rev;
char *steps_msg;
 
@@ -957,7 +958,10 @@ int bisect_next_all(const char *prefix, int no_checkout)
 
bisect_common();
 
-   find_bisection(, , , !!skipped_revs.nr);
+   if (skipped_revs.nr)
+   bisect_flags |= BISECT_FIND_ALL;
+
+   find_bisection(, , , bisect_flags);
revs.commits = managed_skipped(revs.commits, );
 
if (!revs.commits) {
diff --git a/bisect.h b/bisect.h
index a5d9248a47..8efe243a34 100644
--- a/bisect.h
+++ b/bisect.h
@@ -1,15 +1,17 @@
 #ifndef BISECT_H
 #define BISECT_H
 
+#define BISECT_FIND_ALL(1u<<0)
+
 /*
  * Find bisection. If something is found, `reaches` will be the number of
  * commits that the best commit reaches. `all` will be the count of
  * non-SAMETREE commits. If nothing is found, `list` will be NULL.
  * Otherwise, it will be either all non-SAMETREE commits or the single
- * best commit, as chosen by `find_all`.
+ * best commit, as chosen by flag `BISECT_FIND_ALL`.
  */
 extern void find_bisection(struct commit_list **list, int *reaches, int *all,
-  int find_all);
+  unsigned int bisect_flags);
 
 extern struct commit_list *filter_skipped(struct commit_list *list,
  struct commit_list **tried,
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index fadd3ec14c..22c3d479fb 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -360,8 +360,8 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
int i;
int bisect_list = 0;
int bisect_show_vars = 0;
-   int bisect_find_all = 0;
int use_bitmap_index = 0;
+   unsigned int bisect_flags = 0;
const char *show_progress = NULL;
 
if (argc == 2 && !strcmp(argv[1], "-h"))
@@ -426,7 +426,7 @@ int cmd_rev_list(int 

Re: [PATCH v2] Make it possible to implement bisecting only on first parents or on merge commits by passing flags to find_bisection(), instead of just a find_all boolean

2018-04-15 Thread Christian Couder
On Sun, Apr 15, 2018 at 12:44 PM, Harald Nordgren
 wrote:
> Signed-off-by: Harald Nordgren 
> ---
>
> Notes:
> Updates according to Christian Couder's code review

About the commit message (its first line and its body), I was
suggesting a patch like the following:

>From 3a0ced0953ccc6038a43885d98a507eb18c19e42 Mon Sep 17 00:00:00 2001
From: Harald Nordgren 
Date: Sun, 15 Apr 2018 12:44:38 +0200
Subject: [PATCH] bisect: create 'bisect_flags' parameter in find_bisection()

Make it possible to implement bisecting only on first parents or on
merge commits by passing flags to find_bisection(), instead of just
a 'find_all' boolean.

Signed-off-by: Harald Nordgren 
---
 bisect.c   | 20 
 bisect.h   |  6 --
 builtin/rev-list.c |  6 +++---
 3 files changed, 19 insertions(+), 13 deletions(-)
...

The rest of the patch looks good to me.

Thanks!


Regression in patch add?

2018-04-15 Thread mqudsi
Hello all,

I'm currently running the latest version of git built from `master`, and
I'm running into what appears to be a regression in the behavior of the
piecewise `git add -p` when applying a manually edited chunk.

I first run `git add -p`, then manually edit a chunk (after hitting `s`
once, if it matters). The chunk originally contains the following:

```diff
# Manual hunk edit mode -- see bottom for a quick guide
@@ -20,7 +20,7 @@
"call dein#add('Shougo/dein.vim', {'rev': 'master'})

" Add or remove your plugins here:
-   " call dein#add('flazz/vim-colorschemes')
-   call dein#add('Haron-Prime/evening_vim')
+   call dein#add('flazz/vim-colorschemes')
+   call dein#add('danilo-augusto/vim-afterglow')

"core plugins that change the behavior of vim and how we use it globally
```

Under git 2.7.4, I can edit it to the following, which is accepted
without a problem:

```diff
# Manual hunk edit mode -- see bottom for a quick guide
@@ -20,7 +20,7 @@
"call dein#add('Shougo/dein.vim', {'rev': 'master'})

" Add or remove your plugins here:
-   " call dein#add('flazz/vim-colorschemes')
-   call dein#add('Haron-Prime/evening_vim')
+   call dein#add('flazz/vim-colorschemes')
+   call dein#add('Haron-Prime/evening_vim')

"core plugins that change the behavior of vim and how we use it globally
```

All I did here was remove one `+` line and manually add another (which
is a variant of the second `-` line).

Under git 2.17.0.252.gfe0a9ea, the same piece is opened in $VISUAL for
editing (and if left unmodified applies OK), but when modified in the
to the same exact value, after exiting the editor I receive the
following error from git:

error: patch fragment without header at line 15: @@ -25,7 +25,8 @@

I'm not sure what to make of this.

Thank you,

Mahmoud Al-Qudsi
NeoSmart Technologies




Can i have a word with you?

2018-04-15 Thread pradeep . bhardwaj


Disclaimer: This message may contain privileged and confidential information 
and 
is solely for the use of intended recipient. The views expressed in 
this email  
   are those of the sender and not Future Group's. The 
recipient should check this 
email and attachments for the presence of 
viruses. Future Group accepts no liabi  
  lity for any damage caused by any 
virus transmitted by this email. Future Group   
  may monitor and record 
all emails.



Generate more revenue from Git

2018-04-15 Thread Michal Sapozhnikov
Hi,

​​My name is Michal and I lead the SDK partnerships at Luminati.​ I assume your
software earns money by charging users for a premium subscription or by showing
ads - both models do not pay out much and harm the user experience.

We now offer you a third option.

Luminati’s monetization SDK for Windows desktop provides your users the option
to use the software for free, and in exchange we pay you $3,000 USD per month,
for every 100K daily active users.
Luminati is the largest proxy network, having millions of IPs based on other
developers who embedded our SDK. More information is available on
http://luminati.io/sdk_win.

I’d like to schedule a 15 minute call to let you know how we can start. Are you
available tomorrow at 12:30pm your local time?

Best Regards,
Michal
-- 
Michal Sapozhnikov | Business Manager, Luminati SDK | +972-50-2826778 | Skype:
live:michals_43
http://luminati.io/sdk_win


[PATCH v2] Make it possible to implement bisecting only on first parents or on merge commits by passing flags to find_bisection(), instead of just a find_all boolean

2018-04-15 Thread Harald Nordgren
Signed-off-by: Harald Nordgren 
---

Notes:
Updates according to Christian Couder's code review

 bisect.c   | 20 
 bisect.h   |  6 --
 builtin/rev-list.c |  6 +++---
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/bisect.c b/bisect.c
index a579b50884..19dac7491d 100644
--- a/bisect.c
+++ b/bisect.c
@@ -254,7 +254,7 @@ static struct commit_list *best_bisection_sorted(struct 
commit_list *list, int n
  */
 static struct commit_list *do_find_bisection(struct commit_list *list,
 int nr, int *weights,
-int find_all)
+unsigned int bisect_flags)
 {
int n, counted;
struct commit_list *p;
@@ -313,7 +313,7 @@ static struct commit_list *do_find_bisection(struct 
commit_list *list,
clear_distance(list);
 
/* Does it happen to be at exactly half-way? */
-   if (!find_all && halfway(p, nr))
+   if (!(bisect_flags & BISECT_FIND_ALL) && halfway(p, nr))
return p;
counted++;
}
@@ -351,21 +351,21 @@ static struct commit_list *do_find_bisection(struct 
commit_list *list,
weight_set(p, weight(q));
 
/* Does it happen to be at exactly half-way? */
-   if (!find_all && halfway(p, nr))
+   if (!(bisect_flags & BISECT_FIND_ALL) && halfway(p, nr))
return p;
}
}
 
show_list("bisection 2 counted all", counted, nr, list);
 
-   if (!find_all)
+   if (!(bisect_flags & BISECT_FIND_ALL))
return best_bisection(list, nr);
else
return best_bisection_sorted(list, nr);
 }
 
 void find_bisection(struct commit_list **commit_list, int *reaches,
-   int *all, int find_all)
+   int *all, unsigned int bisect_flags)
 {
int nr, on_list;
struct commit_list *list, *p, *best, *next, *last;
@@ -400,9 +400,9 @@ void find_bisection(struct commit_list **commit_list, int 
*reaches,
weights = xcalloc(on_list, sizeof(*weights));
 
/* Do the real work of finding bisection commit. */
-   best = do_find_bisection(list, nr, weights, find_all);
+   best = do_find_bisection(list, nr, weights, bisect_flags);
if (best) {
-   if (!find_all) {
+   if (!(bisect_flags & BISECT_FIND_ALL)) {
list->item = best->item;
free_commit_list(list->next);
best = list;
@@ -943,6 +943,7 @@ int bisect_next_all(const char *prefix, int no_checkout)
struct rev_info revs;
struct commit_list *tried;
int reaches = 0, all = 0, nr, steps;
+   unsigned int bisect_flags = 0;
struct object_id *bisect_rev;
char *steps_msg;
 
@@ -957,7 +958,10 @@ int bisect_next_all(const char *prefix, int no_checkout)
 
bisect_common();
 
-   find_bisection(, , , !!skipped_revs.nr);
+   if (skipped_revs.nr)
+   bisect_flags |= BISECT_FIND_ALL;
+
+   find_bisection(, , , bisect_flags);
revs.commits = managed_skipped(revs.commits, );
 
if (!revs.commits) {
diff --git a/bisect.h b/bisect.h
index a5d9248a47..8efe243a34 100644
--- a/bisect.h
+++ b/bisect.h
@@ -1,15 +1,17 @@
 #ifndef BISECT_H
 #define BISECT_H
 
+#define BISECT_FIND_ALL(1u<<0)
+
 /*
  * Find bisection. If something is found, `reaches` will be the number of
  * commits that the best commit reaches. `all` will be the count of
  * non-SAMETREE commits. If nothing is found, `list` will be NULL.
  * Otherwise, it will be either all non-SAMETREE commits or the single
- * best commit, as chosen by `find_all`.
+ * best commit, as chosen by flag `BISECT_FIND_ALL`.
  */
 extern void find_bisection(struct commit_list **list, int *reaches, int *all,
-  int find_all);
+  unsigned int bisect_flags);
 
 extern struct commit_list *filter_skipped(struct commit_list *list,
  struct commit_list **tried,
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index fadd3ec14c..22c3d479fb 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -360,8 +360,8 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
int i;
int bisect_list = 0;
int bisect_show_vars = 0;
-   int bisect_find_all = 0;
int use_bitmap_index = 0;
+   unsigned int bisect_flags = 0;
const char *show_progress = NULL;
 
if (argc == 2 && !strcmp(argv[1], "-h"))
@@ -426,7 +426,7 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
}
if (!strcmp(arg, "--bisect-all")) {
   

Re: [PATCH] Create 'bisect_flags' parameter in find_bisection() in preparation of passing multiple options

2018-04-15 Thread Christian Couder
Hi,

On Sun, Apr 15, 2018 at 10:58 AM, Harald Nordgren
 wrote:
> ---
>
> Notes:
> Preperatory patch to enable either Tiago Botelho's or my patch, to do 
> bisection on first parents / merge commits

It would be nice if you could move some part of the above note into
the commit message (above the ---). For example:

"Make it possible to implement bisecting only on first parents or on
merge commits by passing flags to find_bisection(), instead of just a
find_all boolean".

While at it maybe the subject line of the commit message could start
with "bisect: create 'bisect_flags' parameter ..." so that we can
quickly tell which area of the code it is about.

>  bisect.c   | 21 -
>  bisect.h   |  5 +++--
>  builtin/rev-list.c |  6 +++---
>  3 files changed, 18 insertions(+), 14 deletions(-)
>
> diff --git a/bisect.c b/bisect.c
> index a579b50884..d85550fd89 100644
> --- a/bisect.c
> +++ b/bisect.c
> @@ -254,7 +254,7 @@ static struct commit_list *best_bisection_sorted(struct 
> commit_list *list, int n
>   */
>  static struct commit_list *do_find_bisection(struct commit_list *list,
>  int nr, int *weights,
> -int find_all)
> +int bisect_flags)

I think it's better to use "unsigned int" rather than just "int" for flags.

>  {
> int n, counted;
> struct commit_list *p;

[...]

> @@ -19,6 +19,7 @@ extern struct commit_list *filter_skipped(struct 
> commit_list *list,
>
>  #define BISECT_SHOW_ALL(1<<0)
>  #define REV_LIST_QUIET (1<<1)
> +#define BISECT_FIND_ALL(1<<2)

Is BISECT_FIND_ALL really related to the other flags, or is this
mixing rev list flags with bisect flags?

Thanks for working on this,
Christian.


[PATCH] Create 'bisect_flags' parameter in find_bisection() in preparation of passing multiple options

2018-04-15 Thread Harald Nordgren
---

Notes:
Preperatory patch to enable either Tiago Botelho's or my patch, to do 
bisection on first parents / merge commits

 bisect.c   | 21 -
 bisect.h   |  5 +++--
 builtin/rev-list.c |  6 +++---
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/bisect.c b/bisect.c
index a579b50884..d85550fd89 100644
--- a/bisect.c
+++ b/bisect.c
@@ -254,7 +254,7 @@ static struct commit_list *best_bisection_sorted(struct 
commit_list *list, int n
  */
 static struct commit_list *do_find_bisection(struct commit_list *list,
 int nr, int *weights,
-int find_all)
+int bisect_flags)
 {
int n, counted;
struct commit_list *p;
@@ -313,7 +313,7 @@ static struct commit_list *do_find_bisection(struct 
commit_list *list,
clear_distance(list);
 
/* Does it happen to be at exactly half-way? */
-   if (!find_all && halfway(p, nr))
+   if (!(bisect_flags & BISECT_FIND_ALL) && halfway(p, nr))
return p;
counted++;
}
@@ -351,21 +351,21 @@ static struct commit_list *do_find_bisection(struct 
commit_list *list,
weight_set(p, weight(q));
 
/* Does it happen to be at exactly half-way? */
-   if (!find_all && halfway(p, nr))
+   if (!(bisect_flags & BISECT_FIND_ALL) && halfway(p, nr))
return p;
}
}
 
show_list("bisection 2 counted all", counted, nr, list);
 
-   if (!find_all)
+   if (!(bisect_flags & BISECT_FIND_ALL))
return best_bisection(list, nr);
else
return best_bisection_sorted(list, nr);
 }
 
 void find_bisection(struct commit_list **commit_list, int *reaches,
-   int *all, int find_all)
+   int *all, int bisect_flags)
 {
int nr, on_list;
struct commit_list *list, *p, *best, *next, *last;
@@ -400,9 +400,9 @@ void find_bisection(struct commit_list **commit_list, int 
*reaches,
weights = xcalloc(on_list, sizeof(*weights));
 
/* Do the real work of finding bisection commit. */
-   best = do_find_bisection(list, nr, weights, find_all);
+   best = do_find_bisection(list, nr, weights, bisect_flags);
if (best) {
-   if (!find_all) {
+   if (!(bisect_flags & BISECT_FIND_ALL)) {
list->item = best->item;
free_commit_list(list->next);
best = list;
@@ -942,7 +942,7 @@ int bisect_next_all(const char *prefix, int no_checkout)
 {
struct rev_info revs;
struct commit_list *tried;
-   int reaches = 0, all = 0, nr, steps;
+   int reaches = 0, all = 0, bisect_flags = 0, nr, steps;
struct object_id *bisect_rev;
char *steps_msg;
 
@@ -957,7 +957,10 @@ int bisect_next_all(const char *prefix, int no_checkout)
 
bisect_common();
 
-   find_bisection(, , , !!skipped_revs.nr);
+   if (skipped_revs.nr)
+   bisect_flags |= BISECT_FIND_ALL;
+
+   find_bisection(, , , bisect_flags);
revs.commits = managed_skipped(revs.commits, );
 
if (!revs.commits) {
diff --git a/bisect.h b/bisect.h
index a5d9248a47..d46b078871 100644
--- a/bisect.h
+++ b/bisect.h
@@ -6,10 +6,10 @@
  * commits that the best commit reaches. `all` will be the count of
  * non-SAMETREE commits. If nothing is found, `list` will be NULL.
  * Otherwise, it will be either all non-SAMETREE commits or the single
- * best commit, as chosen by `find_all`.
+ * best commit, as chosen by flag `BISECT_FIND_ALL`.
  */
 extern void find_bisection(struct commit_list **list, int *reaches, int *all,
-  int find_all);
+  int bisect_flags);
 
 extern struct commit_list *filter_skipped(struct commit_list *list,
  struct commit_list **tried,
@@ -19,6 +19,7 @@ extern struct commit_list *filter_skipped(struct commit_list 
*list,
 
 #define BISECT_SHOW_ALL(1<<0)
 #define REV_LIST_QUIET (1<<1)
+#define BISECT_FIND_ALL(1<<2)
 
 struct rev_list_info {
struct rev_info *revs;
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index fadd3ec14c..88700e897d 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -360,7 +360,7 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
int i;
int bisect_list = 0;
int bisect_show_vars = 0;
-   int bisect_find_all = 0;
+   int bisect_flags = 0;
int use_bitmap_index = 0;
const char *show_progress = NULL;
 
@@ -426,7 +426,7 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
}