date:20171019

Re: [PATCH 2/3] t5615: avoid re-using descriptor 4

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 07:23:37PM -0400, Jeff King wrote:

> So one trick is that we can't just set it to a higher number. We have to
> also open and manage that descriptor. It might be enough to do:
> 
>   if test -n "$BASH_VERSION"
>   then
>   exec 999>&4
>   BASH_XTRACEFD=999
>   fi
> [...]
> I think it might be workable, but I'm worried we're opening a can of
> worms. Or continuing to dig into the can of worms from the original
> BASH_XTRACEFD, if you prefer. :)

So this is the best I came up with (on top of my other patches for
ease of reading, though I'd re-work them if this is the route we're
going to go):

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 14fac6d6f2..833ceaefd2 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -377,7 +377,12 @@ fi
 #
 # Note also that we don't need or want to export it. The tracing is local to
 # this shell, and we would not want to influence any shells we exec.
-BASH_XTRACEFD=4
+if test -n "$BASH_VERSION"
+then
+   exec 999>&4
+   BASH_XTRACEFD=999
+   silence_trace="999>/dev/null"
+fi

 test_failure=0
 test_count=0
@@ -627,14 +632,16 @@ test_eval_ () {
# be _inside_ the block to avoid polluting the "set -x" output
#

-   test_eval_inner_ "$@" &3 2>&4
-   {
-   test_eval_ret_=$?
-   if want_trace
-   then
-   set +x
-   fi
-   } 2>/dev/null 4>&2
+   eval '
+   test_eval_inner_ "$@" &3 2>&4
+   {
+   test_eval_ret_=$?
+   if want_trace
+   then
+   set +x
+   fi
+   } 2>/dev/null '"$silence_trace"'
+   '

if test "$test_eval_ret_" != 0 && want_trace
then

I really hate that extra eval, since it adds an extra layer of "+" to
the tracing output in bash.

But I can't find a way to make it work without it. The "999>/dev/null"
is syntactically bogus in non-bash shells, so it _must_ appear as part
of an eval.  We can't conditionally stick it at the end of the eval run
by test_eval_inner_, because that one is running the actual test code
(so it may bail early with "return", for example).

TBH, I'm not 100% sure that the initial "exec 999>&4" won't cause
complaints in some shells. Dash seems to handle it, but others might
not. That can be worked around with another eval, though.

So I dunno. It does solve the problem in a way that the individual test
scripts wouldn't have to care about. But it's a lot of eval trickery.

-Peff

Re: [Alt. PATCH] ls-remote: deprecate -h as short for --heads

2017-10-19 Thread Junio C Hamano

Jeff King  writes:

>  It seems weird and inconsistent to me that the meaning of "-h"
> depends on the position and presence of other unrelated options. Maybe
> it's just me. I know _why_ it's that way, but this seems like one of
> those weird corners of the interface that end up confusing people and
> giving Git's interface the reputation of being full of mysterious
> inconsistencies. I suspect one of the reasons nobody has complained
> about it is that "ls-remote" is not commonly used, and "ls-remote -h"
> less so (I didn't even know it was there until this conversation).

Technically, yes, it may be weird.

I may be biased as every time I think about this one, the first one
that comes to my mind is how "grep -h" (not "git grep", but GNU
grep) behaves.  Yes, "-h" means something else, but by itself, the
command does not make sense, and "ls-remote -h" is an exception to
the rule: most sane commands do not have a sensible semantics when
they take only "-h" and nothing else.  And even for ls-remote it is
true only when you try to be extra lazy and do not say which remote
you are asking.

And that is why I think "'cmd -h" and nothing else consistently
gives help" may overall be positive in usability, than a rule that
says "'cmd -h' is for short-help unless -h means something else for
the command".

Re: "Cannot fetch git.git" (worktrees at fault? or origin/HEAD) ?

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 8:16 PM, Jeff King  wrote:
> On Thu, Oct 19, 2017 at 06:43:57PM -0700, Stefan Beller wrote:

>
> So that's the right immediate workaround (and hoping there was nothing
> valuable in that worktree!).

How would I know? ;)

Without the commit available I do not even know if the code there
is well reasoned about. (My commit messages usually tell my future
self how much thought I put into the code)

>
>> Any idea which direction we need to heading for a real fix?
>
> If my analysis above is correct, then it's already fixed. You just had
> leftover corruption.

Well fetching yesterday worked and the commit in question is from
8/23, the merge  8a044c7f1d56cef657be342e40de0795d688e882
occurred 9/18, so I suspect there is something else at play.
(I do not remember having a gc between yesterday and today.
Though maybe one in the background?)

I am curious how you can have a worktree owned by multiple
repositories [1] (?).

Thanks,
Stefan

Re: "Cannot fetch git.git" (worktrees at fault? or origin/HEAD) ?

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 06:43:57PM -0700, Stefan Beller wrote:

> So I ran "git fetch --all" inside git.git that you are all familiar with.
> All fetches failed with a similar error as:
> Fetching kernelorg
> fatal: bad object HEAD
> error: https://kernel.googlesource.com/pub/scm/git/git did not send
> all necessary objects
> 
> error: Could not fetch kernelorg

I ran into this a while back with worktrees, too. In my case I had a
worktree which was "owned" by multiple repositories[1]. One would write
the HEAD, but the other did not necessarily have those objects.

I don't quite recall how I ended up there. Doing "git worktree add"
multiple times does seem to complain about the directory already
existing. My case did involve relative symlinks in both repositories, so
maybe that confused things. Anyway, I can't seem to replicate it now.

> And here we go, I do use worktrees nowadays!
> $ git worktree list
> /usr/local/google/home/sbeller/OSS/git   117ddefdb4
> (detached HEAD)
> /u/git   d0c39a49cc
> (detached HEAD)
> /usr/local/google/home/sbeller/OSS/git_origin_master b14f27f917
> (detached HEAD)
> /usr/local/google/home/sbeller/OSS/submodule_remote_dot  3d9025bd69
> (detached HEAD)
> 
> $ git show 3d9025bd69
> fatal: ambiguous argument '3d9025bd69': unknown revision or path not
> in the working tree.

So maybe you've ended up in the same situation and somehow "reused"
submodule_remote_dot.

But I think until that commit you mentioned (d0c39a49cc), "gc" in the
parent repo could drop objects accessible only from the worktree. So if
that happened _before_ d0c39a49cc, then at that point your worktree was
corrupted. And now of course further operations will complain.

> ok, so I presume I just delete that working tree to "fix my copy of git"
> sbeller@sbeller:/u/git$ rm -rf
> /usr/local/google/home/sbeller/OSS/submodule_remote_dot
> sbeller@sbeller:/u/git$ git worktree prune
> sbeller@sbeller:/u/git$ git worktree list
> /usr/local/google/home/sbeller/OSS/git117ddefdb4 (detached 
> HEAD)
> /u/gitd0c39a49cc (detached 
> HEAD)
> /usr/local/google/home/sbeller/OSS/git_origin_master  b14f27f917 (detached 
> HEAD)

So that's the right immediate workaround (and hoping there was nothing
valuable in that worktree!).

> Any idea which direction we need to heading for a real fix?

If my analysis above is correct, then it's already fixed. You just had
leftover corruption.

-Peff

Re: [Alt. PATCH] ls-remote: deprecate -h as short for --heads

2017-10-19 Thread Jeff King

On Fri, Oct 20, 2017 at 10:04:23AM +0900, Junio C Hamano wrote:

> > Yuck. This "we only treat -h as special in certain cases" rule is
> > sufficiently magical that I don't think we want to advertise and lock
> > ourselves into it.
> 
> Hmph.  I think it is way too late to be worried about "locked into"
> the convention---hasn't the "git cmd -h" been with us forever?

I guess. I still find it pretty nasty (not that "-h" works for help, but
that it can override the normal usage).

> Besides, I personally feel that there is nothing magical in the rule
> that is "we always treat 'git $cmd -h' as asking for short help, but
> individual subcommand may choose to use -h for, perhaps to be
> compatible with other tools and conventions, in other situations".

 It seems weird and inconsistent to me that the meaning of "-h"
depends on the position and presence of other unrelated options. Maybe
it's just me. I know _why_ it's that way, but this seems like one of
those weird corners of the interface that end up confusing people and
giving Git's interface the reputation of being full of mysterious
inconsistencies. I suspect one of the reasons nobody has complained
about it is that "ls-remote" is not commonly used, and "ls-remote -h"
less so (I didn't even know it was there until this conversation).

-Peff

Re: [PATCH] builtin/push.c: add push.pushOption config

2017-10-19 Thread Junio C Hamano

Stefan Beller  writes:

>> @@ -161,6 +161,9 @@ already exists on the remote side.
>> Transmit the given string to the server, which passes them to
>> the pre-receive as well as the post-receive hook. The given string
>> must not contain a NUL or LF character.
>> +   When no `--push-option ` is given from the command
>> +   line, the values of configuration variable `push.pushOption`
>> +   are used instead.
>
> We'd also want to document how push.pushOption works in
> Documentation/config.txt (that contains all the configs)

Perhaps.

> So in the config, we have to explicitly give an empty option to
> clear the previous options, but on the command line we do not need
> that, but instead we'd have to repeat any push options that we desire
> that were configured?

It is not wrong per-se to phrase it like so, but I think that is
making it unnecessarily confusing by conflating two things.  (1)
configured values are overridden from the command line, just like
any other --option/config.variable pair and (2) unlike usual single
value variables where "last one wins" rule is simple enough to
explain,, multi-value variables need a way to "forget everything we
said so far and start from scratch" syntax, especially when multiple
input files are involved.

> Example:
>
>   /etc/gitconfig
>   push.pushoption = a
>   push.pushoption = b
>
>   ~/.gitconfig
>   push.pushoption = c
>
>   repo/.git/config
>   push.pushoption =
>   push.pushoption = b
>
> will result in only b as a and c are
> cleared.

The above is correct, and it might be worth giving it as an example
in the doc, because not just "give an empty entry to clear what has
been accumulated so far" but a multi-valued option in general is a
rather rare thing.

> If I were to run
>   git -c push.pushOption=d push ... (in repo)
> it would be b and d, but
>   git push --push-option=d
> would be d only?

>> @@ -584,12 +599,13 @@ int cmd_push(int argc, const char **argv, const char 
>> *prefix)
>> set_refspecs(argv + 1, argc - 1, repo);
>> }
>>
>> -   for_each_string_list_item(item, _options)
>> +   for_each_string_list_item(item, push_options)
>
> We have to do the same for _cmdline here, too?

I do not think so.  The point of these lines that appear before this
loop:

git_config(git_push_config, );
argc = parse_options(argc, argv, prefix, options, push_usage, 0);
+   push_options = (push_options_cmdline.nr
+   ? _options_cmdline
+   : _options_config);

is that the command line overrides configured values, just like any
other configuration.  Adding _cmdline variant here is doubly wrong
when command line options are given in that it (1) duplicates what
was obtained from the command line, and (2) does not clear the
configured values.

"Cannot fetch git.git" (worktrees at fault? or origin/HEAD) ?

2017-10-19 Thread Stefan Beller

So I ran "git fetch --all" inside git.git that you are all familiar with.
All fetches failed with a similar error as:
Fetching kernelorg
fatal: bad object HEAD
error: https://kernel.googlesource.com/pub/scm/git/git did not send
all necessary objects

error: Could not fetch kernelorg

Working with an older version of Git (2.13.0) gives me

$ GIT_TRACE=/u/trace GIT_TRACE_PACKET=/u/out git fetch git://github.com/git/git
>From git://github.com/git/git
 * branch  HEAD   -> FETCH_HEAD

After fiddling around and some printf debugging,
clean fsck(!)
I could not make sense of it, so bisecting

$ git bisect bad
d0c39a49ccb5dfe7feba4325c3374d99ab123c59 is the first bad commit
commit d0c39a49ccb5dfe7feba4325c3374d99ab123c59
Author: Nguyễn Thái Ngọc Duy 
Date:   Wed Aug 23 19:36:59 2017 +0700

revision.c: --all adds HEAD from all worktrees

Unless single_worktree is set, --all now adds HEAD from all worktrees.

Since reachable.c code does not use setup_revisions(), we need to call
other_head_refs_submodule() explicitly there to have the same effect on
"git prune", so that we won't accidentally delete objects needed by some
other HEADs.

A new FIXME is added because we would need something like

int refs_other_head_refs(struct ref_store *, each_ref_fn, cb_data);

in addition to other_head_refs() to handle it, which might require

int get_submodule_worktrees(const char *submodule, int flags);

It could be a separate topic to reduce the scope of this one.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 

And here we go, I do use worktrees nowadays!
$ git worktree list
/usr/local/google/home/sbeller/OSS/git   117ddefdb4
(detached HEAD)
/u/git   d0c39a49cc
(detached HEAD)
/usr/local/google/home/sbeller/OSS/git_origin_master b14f27f917
(detached HEAD)
/usr/local/google/home/sbeller/OSS/submodule_remote_dot  3d9025bd69
(detached HEAD)

$ git show 3d9025bd69
fatal: ambiguous argument '3d9025bd69': unknown revision or path not
in the working tree.

ok, so I presume I just delete that working tree to "fix my copy of git"
sbeller@sbeller:/u/git$ rm -rf
/usr/local/google/home/sbeller/OSS/submodule_remote_dot
sbeller@sbeller:/u/git$ git worktree prune
sbeller@sbeller:/u/git$ git worktree list
/usr/local/google/home/sbeller/OSS/git117ddefdb4 (detached HEAD)
/u/gitd0c39a49cc (detached HEAD)
/usr/local/google/home/sbeller/OSS/git_origin_master  b14f27f917 (detached HEAD)
$ git bisect reset
Previous HEAD position was d0c39a49cc... revision.c: --all adds HEAD
from all worktrees
HEAD is now at 660fb3dfa8... Sync with maint
$ make install
$ git fetch --all
# works fine!

Any idea which direction we need to heading for a real fix?

Do you need financial help?

2017-10-19 Thread Mr . Robert

Do you need a loan to enhance your business?, Loan to consolidate your 
debt,Loan for personal use, Loan for credit Card, Medical Care loan,Car 
Loan,Mortgage Loan, Student Loan, loan for any purposes? e.t.c.Get loan at 1% 
interest rate annually, Hurry up now and fill out this below application 
details if interested: Contact us at: iceblueinvestmentgrou...@gmail.com

APPLICATION DETAILS
Full Name---
Gender:___
Contact Address:
Country:__
State:__
Amount Needed as Loan:___
Loan Duration:
Occupation:___ _
Purpose for Loan:
Phone: ___
Regard,

Re: [PATCH v1] read_index_from(): Skip verification of the cache entry order to speed index loading

2017-10-19 Thread Junio C Hamano

Jeff King  writes:

> It should be easy enough to check that; the patch below implements it.
> I couldn't measure any speedup with it running "git ls-files >/dev/null"
> on linux.git (60k files). But nor could I get any by dropping the check
> entirely.

I would expect that the speedup (due to possible cache effect)
wouldn't be measurable if the overhead of the existing check itself
is unmeasuably not-expensive.  No suprise here.

> This is mostly just a curiosity to me. For the record, I have no real
> problem with dropping this kind of on-disk data-structure validation
> when it's expensive. After all, we do not check the sort on pack .idx
> files on each run (nor pack sha1 checksums, etc) because it's too
> expensive to do so.

Yes, I agree with that stance, too.  If this were expensive in the
overall picture to be measurable, I think we are OK omitting when
the index is read, especially if we make sure we are not writing out
nonsense trees out of it.  Local damage to the index is very
contained as long as we do not spread breakages to trees and
commits.

[PATCH 4/4] fsmonitor: Delay updating state until after split index is merged

2017-10-19 Thread Alex Vandiver

If the fsmonitor extension is used in conjunction with the split index
extension, the set of entries in the index when it is first loaded is
only a subset of the real index.  This leads to only the non-"base"
index being marked as CE_FSMONITOR_VALID.

Delay the expansion of the ewah bitmap until after tweak_split_index
has been called to merge in the base index as well.

Signed-off-by: Alex Vandiver 
---
 cache.h  |  1 +
 fsmonitor.c  | 38 --
 read-cache.c |  4 
 3 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/cache.h b/cache.h
index 25adcf681..0a4f43ec2 100644
--- a/cache.h
+++ b/cache.h
@@ -348,6 +348,7 @@ struct index_state {
unsigned char sha1[20];
struct untracked_cache *untracked;
uint64_t fsmonitor_last_update;
+   struct ewah_bitmap *fsmonitor_dirty;
 };
 
 extern struct index_state the_index;
diff --git a/fsmonitor.c b/fsmonitor.c
index 7c1540c05..4c2668f57 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -49,20 +49,7 @@ int read_fsmonitor_extension(struct index_state *istate, 
const void *data,
ewah_free(fsmonitor_dirty);
return error("failed to parse ewah bitmap reading fsmonitor 
index extension");
}
-
-   if (git_config_get_fsmonitor()) {
-   /* Mark all entries valid */
-   for (i = 0; i < istate->cache_nr; i++)
-   istate->cache[i]->ce_flags |= CE_FSMONITOR_VALID;
-
-   /* Mark all previously saved entries as dirty */
-   ewah_each_bit(fsmonitor_dirty, fsmonitor_ewah_callback, istate);
-
-   /* Now mark the untracked cache for fsmonitor usage */
-   if (istate->untracked)
-   istate->untracked->use_fsmonitor = 1;
-   }
-   ewah_free(fsmonitor_dirty);
+   istate->fsmonitor_dirty = fsmonitor_dirty;
 
trace_printf_key(_fsmonitor, "read fsmonitor extension 
successful");
return 0;
@@ -238,6 +225,29 @@ void remove_fsmonitor(struct index_state *istate)
 
 void tweak_fsmonitor(struct index_state *istate)
 {
+   int i;
+
+   if (istate->fsmonitor_dirty) {
+   /* Mark all entries valid */
+   trace_printf_key(_fsmonitor, "fsmonitor is enabled; cache 
is %d", istate->cache_nr);
+   for (i = 0; i < istate->cache_nr; i++) {
+   istate->cache[i]->ce_flags |= CE_FSMONITOR_VALID;
+   }
+   trace_printf_key(_fsmonitor, "marked all as valid");
+
+   /* Mark all previously saved entries as dirty */
+   trace_printf_key(_fsmonitor, "setting each bit on %p", 
istate->fsmonitor_dirty);
+   ewah_each_bit(istate->fsmonitor_dirty, fsmonitor_ewah_callback, 
istate);
+
+   /* Now mark the untracked cache for fsmonitor usage */
+   trace_printf_key(_fsmonitor, "setting untracked state");
+   if (istate->untracked)
+   istate->untracked->use_fsmonitor = 1;
+   ewah_free(istate->fsmonitor_dirty);
+   } else {
+   trace_printf_key(_fsmonitor, "fsmonitor not enabled");
+   }
+
switch (git_config_get_fsmonitor()) {
case -1: /* keep: do nothing */
break;
diff --git a/read-cache.c b/read-cache.c
index c18e5e623..3b5cd00a2 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -330,6 +330,10 @@ int ie_match_stat(struct index_state *istate,
return 0;
if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID))
return 0;
+   if (ce->ce_flags & CE_FSMONITOR_VALID)
+   trace_printf_key(_fsmonitor, "fsmon marked valid for %s", 
ce->name);
+   if (ignore_fsmonitor)
+   trace_printf_key(_fsmonitor, "Ignoring fsmonitor for %s", 
ce->name);
 
/*
 * Intent-to-add entries have not been added, so the index entry
-- 
2.15.0.rc1.417.g05670bb6e

[PATCH 1/4] fsmonitor: Watch, and ask about, the top of the repo, not the CWD

2017-10-19 Thread Alex Vandiver

Signed-off-by: Alex Vandiver 
---
 t/t7519/fsmonitor-watchman | 3 ++-
 templates/hooks--fsmonitor-watchman.sample | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index a3e30bf54..377edc7be 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -41,7 +41,8 @@ if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
$git_work_tree =~ s/[\r\n]+//g;
$git_work_tree =~ s,\\,/,g;
 } else {
-   $git_work_tree = $ENV{'PWD'};
+   $git_work_tree = `git rev-parse --show-toplevel`;
+   chomp $git_work_tree;
 }
 
 my $retry = 1;
diff --git a/templates/hooks--fsmonitor-watchman.sample 
b/templates/hooks--fsmonitor-watchman.sample
index 9eba8a740..7df590d29 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -40,7 +40,8 @@ if ($system =~ m/^MSYS_NT/ || $system =~ m/^MINGW/) {
$git_work_tree =~ s/[\r\n]+//g;
$git_work_tree =~ s,\\,/,g;
 } else {
-   $git_work_tree = $ENV{'PWD'};
+   $git_work_tree = `git rev-parse --show-toplevel`;
+   chomp $git_work_tree;
 }
 
 my $retry = 1;
-- 
2.15.0.rc1.417.g05670bb6e

[PATCH 0/4] fsmonitor fixes

2017-10-19 Thread Alex Vandiver

A few fixes found from playing around with the fsmonitor branch in
next.
 - Alex

[PATCH 3/4] fsmonitor: Document GIT_TRACE_FSMONITOR

2017-10-19 Thread Alex Vandiver

Signed-off-by: Alex Vandiver 
---
 Documentation/git.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 1fca63634..720db196e 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -594,6 +594,10 @@ into it.
 Unsetting the variable, or setting it to empty, "0" or
 "false" (case insensitive) disables trace messages.
 
+`GIT_TRACE_FSMONITOR`::
+   Enables trace messages for the filesystem monitor extension.
+   See `GIT_TRACE` for available trace output options.
+
 `GIT_TRACE_PACK_ACCESS`::
Enables trace messages for all accesses to any packs. For each
access, the pack file name and an offset in the pack is
-- 
2.15.0.rc1.417.g05670bb6e

[PATCH 2/4] fsmonitor: Don't bother pretty-printing JSON from watchman

2017-10-19 Thread Alex Vandiver

This provides small performance savings.

Signed-off-by: Alex Vandiver 
---
 t/t7519/fsmonitor-watchman | 2 +-
 templates/hooks--fsmonitor-watchman.sample | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index 377edc7be..eba46c78b 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -51,7 +51,7 @@ launch_watchman();
 
 sub launch_watchman {
 
-   my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
+   my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j --no-pretty')
or die "open2() failed: $!\n" .
"Falling back to scanning...\n";
 
diff --git a/templates/hooks--fsmonitor-watchman.sample 
b/templates/hooks--fsmonitor-watchman.sample
index 7df590d29..60eb7e70b 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -50,7 +50,7 @@ launch_watchman();
 
 sub launch_watchman {
 
-   my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j')
+   my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j --no-pretty')
or die "open2() failed: $!\n" .
"Falling back to scanning...\n";
 
-- 
2.15.0.rc1.417.g05670bb6e

Re: [Alt. PATCH] ls-remote: deprecate -h as short for --heads

2017-10-19 Thread Junio C Hamano

Jeff King  writes:

> On Thu, Oct 19, 2017 at 08:26:47PM +0200, René Scharfe wrote:
>
>> diff --git a/Documentation/git-ls-remote.txt 
>> b/Documentation/git-ls-remote.txt
>> index 5f2628c8f8..82622e7fbc 100644
>> --- a/Documentation/git-ls-remote.txt
>> +++ b/Documentation/git-ls-remote.txt
>> @@ -29,6 +29,9 @@ OPTIONS
>>  These options are _not_ mutually exclusive; when given
>>  both, references stored in refs/heads and refs/tags are
>>  displayed.
>> ++
>> +*NOTE*: -h without any other parameters shows a short help text instead
>> +of any refs.
>
> Yuck. This "we only treat -h as special in certain cases" rule is
> sufficiently magical that I don't think we want to advertise and lock
> ourselves into it.

Hmph.  I think it is way too late to be worried about "locked into"
the convention---hasn't the "git cmd -h" been with us forever?

Besides, I personally feel that there is nothing magical in the rule
that is "we always treat 'git $cmd -h' as asking for short help, but
individual subcommand may choose to use -h for, perhaps to be
compatible with other tools and conventions, in other situations".

Re: [PATCH v4 2/3] implement fetching of moved submodules

2017-10-19 Thread Stefan Beller

On Wed, Oct 18, 2017 at 5:35 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>>> but if we already have a submodule with that name (the most likely
>>> explanation for its existence is because it started its life there
>>> and then later moved), and the submodule is bound to a different
>>> path, then that is a different submodule.  Skipping and warning both
>>> are sensible thing to do.
>>
>> Skipping and warning is sensible once we decide to go this way.
>>
>> I propose to take a step back and not throw away the information
>> whether the given string is a name or path, as then we do not have
>> to warn, but we can treat both correctly.
>
> Now either one of us is utterly confused, and I suspect it is me, as
> I do not see how "treat both correctly" could possibly work in the
> case this code warns and skips.
>
> At this point in the flow, we already know that it is not name,
> because we asked and got a "Nah, there is no submodule registered in
> .gitmodules at that path" from submodule_from_path().  Then we ask
> submodule_from_name() if there is any submodule registered under the
> name it would have got if it were added there, and we indeed find
> one.  And that is definitely *not* a submodule we are looking for,
> because if it were, its .path would have pointed at the path we were
> using to ask in the first place.  The one we originally found at
> path and are interested in finding out the details is not known to
> .gitmodules, and the one under that name is not the one that we are
> intereted in, so fetching from the repository the other one that
> happens to have the same name but is different from the submodule we
> are interested in would simply be wrong.

Eventually we'd want to also init new submodules on fetch
(if you use submodule.active to specify the interesting submodules),
and in that case I would imagine to fetch both submodules.

As I wrote the code to further improve this series,
I realized that this is maybe "good enough" for now,
so assume that I have reviewed this series and found it good.

> If we only have path without any .gitmodules entry (hence there is
> not even URL), how would we proceed from that point on?

Oh well, right. We can only offer to keep the existing behavior
which means supporting existing repos in place at that path.

Stefan

>

Re: [PATCH 2/3] t5615: avoid re-using descriptor 4

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 02:46:33PM -0700, Stefan Beller wrote:

> > I also considered trying to bump the "set -x" output descriptor to "9".
> > That just moves the problem around, but presumably scripts are less
> > likely to go that high. :)
> >
> > It would also be possible to pick something insanely high, like "999".
> > Many shells choke on descriptors higher than 9, but since the issue is
> > related to BASH_XTRACEFD, we could make it a bash-only thing. I don't
> > know if it's worth the trouble. I hate leaving a subtle "don't use
> > descriptor 4 in a subshell or your script will break" hand-grenade like
> > this lying around, but we do seem to have only one instance of it over
> > the whole test suite.
> 
> I would imagine that a higher fd for BASH_XTRACEFD
> would be less explosive than requiring the tests to skip
> the low number 4. (It is not *that* low. In e.g. git-submodule.sh
> we make use of 3 in cmd_foreach, not needing more.)

So one trick is that we can't just set it to a higher number. We have to
also open and manage that descriptor. It might be enough to do:

  if test -n "$BASH_VERSION"
  then
exec 999>&4
BASH_XTRACEFD=999
  fi

but since we fiddle with descriptors on a per-test basis, I'm not sure
if that's it or not. I think for convincing output to go to it, it
probably is. We tend to point things _at_ descriptor 4, not point
descriptor 4 elsewhere. But the fix in patch 1 is an exception, where we
try to suppress the "set +x" output. For that we have to redirect 999,
too (which is hard because ">&999" is syntactically invalid in other
shells, so we have to follow a separate code path for bash or get into
evals).

Or start playing games with unsetting BASH_XTRACEFD (which closes 999,
and then we re-open 999>&4 and reset XTRACEFD).

I think it might be workable, but I'm worried we're opening a can of
worms. Or continuing to dig into the can of worms from the original
BASH_XTRACEFD, if you prefer. :)

-Peff

Re: Minor man page weirdness?

2017-10-19 Thread Lars Schneider


> On 18 Oct 2017, at 05:21, Jeff King  wrote:
> 
> On Wed, Oct 18, 2017 at 11:34:31AM +0900, Junio C Hamano wrote:
> 
>> -- >8 --
>> branch doc: sprinkle a few commas for readability
>> 
>> The "--force" option can also be used when the named branch does not
>> yet exist, and the point of the option is the user can (re)point the
>> branch to the named commit even if it does.  Add 'even' before 'if'
>> to clarify.  Also, insert another comma after "Without -f" before
>> "the command refuses..." to make the text easier to parse.
>> 
>> Incidentally, this change should help certain versions of
>> docbook-xsl-stylesheets that renders the original without any
>> whitespace between "-f" and "git".
> 
> Thanks, this looks good.

To me, too! Thanks for investigating and fixing the issue :-)

- Lars

Re: [PATCH v1] read_index_from(): Skip verification of the cache entry order to speed index loading

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 8:12 AM, Ben Peart  wrote:

> If we are guarding against "git" writing out an invalid index, we can move
> this into an assert so that only git developers pay the cost of validating
> they haven't created a new bug.  I think this is better than just adding a
> new test case as a new test case would not achieve the same coverage.  This
> is my preferred solution.
>
> If we are guarding against "some other application" writing out an invalid
> index, then everyone will have to pay the cost as we can't insert the test
> into "some other applications."  Without user reports of it happening or any
> telemetry saying it has happened I really have no idea if it every actually
> happens in the wild anymore and whether the cost on every index load is
> still justified.

How well does this play out in the security realm?, c.f.
https://public-inbox.org/git/20171002234517.gv19...@aiede.mtv.corp.google.com/

Re: [PATCH 6/6] Add Git/Packet.pm from parts of t0021/rot13-filter.pl

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 5:30 AM, Christian Couder
 wrote:
> And while at it let's simplify t0021/rot13-filter.pl by
> using Git/Packet.pm.
>
> This will make it possible to reuse packet related
> functions in other test scripts.
>
> Signed-off-by: Christian Couder 
> ---
>  perl/Git/Packet.pm  | 118 
> 
>  t/t0021/rot13-filter.pl |  94 ++
>  2 files changed, 121 insertions(+), 91 deletions(-)
>  create mode 100644 perl/Git/Packet.pm
>
> diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
> new file mode 100644
> index 00..b1e67477a0
> --- /dev/null
> +++ b/perl/Git/Packet.pm
> @@ -0,0 +1,118 @@
> +package Git::Packet;
> +use 5.008;

+Avar, who knows more about perl versioning.

As we move it from our tests to the official part of the code
we'd want to keep aligned to the rest of the perl parts
of the code base?

This whole series looks good to me as a perl bystander.

Thanks,
Stefan

Re: [PATCH 1/6] t0021/rot13-filter: refactor packet reading functions

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 5:30 AM, Christian Couder
 wrote:
> To make it possible in a following commit to move packet
> reading and writing functions into a Packet.pm module,
> let's refactor these functions, so they don't handle
> printing debug output and exiting.
>
> Signed-off-by: Christian Couder 
> ---

It looks good to me, though I am no perl expert by far.

Re: [PATCH 5/5] diff: handle NULs in get_string_hash()

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 2:39 PM, Jeff King  wrote:
> On Thu, Oct 19, 2017 at 02:31:20PM -0700, Stefan Beller wrote:
>
>> > This is unlikely to ever come up in practice since our line
>> > boundaries generally come from calling strlen() in the first
>> > place.
>>
>> get_string_hash is called from
>>  prepare_entry, which in turn is called from
>>   add_lines_to_move_detection or mark_color_as_moved
>>diff_flush_patch_all_file_pairs
>>
>> that constructs the arguments in
>> diff_flush_patch
>>  run_diff
>>   run_diff_cmd
>>builtin_diff (part "/* Crazy xdl interfaces.. */")
>> xdi_diff_outf( fn_out_consume as arg!)
>>  xdi_diff
>>   xdl_diff
>>xdl_call_hunk_func
>> -> fn_out_consume(cb, line, len)
>>
>> xdl_call_hunk_func however uses pointer arithmetic instead
>> of strlen. So I think this sentence is not a good idea to put in
>> the commit message.
>>
>> It may not occur in practice, due to binary files detection using
>> NUL as a signal, but conceptually our move-colored(!) diffs
>> should be compatible with NULs with this patch now.
>
> Good catch. I just skimmed over all the diff_emit_* functions, which use
> strlen(). But at the bottom is emit_add_line(), which has a real length
> marker from xdiff.

Doh!
There is diff_emit_submodule_* but I presume you meant
emit_diff_* actually as there the strlen is legit, as we generate
known non-NUL data to print.

The submodule output may get mangled in diff_emit_submodule_{add,del}
as the input is coming from user data.

> I agree we wouldn't see NULs in general, but this is maybe fixing "diff
> --color-moved -a". I dunno. It's probably not worth digging too much,
> since it seems like the right thing to do regardless.

I agree, that this digging seems not worth it; I was just agitated at
the seemingly incorrect commit message.

Thanks,
Stefan

>
> -Peff

Re: [PATCH 2/3] t5615: avoid re-using descriptor 4

2017-10-19 Thread Stefan Beller

> I also considered trying to bump the "set -x" output descriptor to "9".
> That just moves the problem around, but presumably scripts are less
> likely to go that high. :)
>
> It would also be possible to pick something insanely high, like "999".
> Many shells choke on descriptors higher than 9, but since the issue is
> related to BASH_XTRACEFD, we could make it a bash-only thing. I don't
> know if it's worth the trouble. I hate leaving a subtle "don't use
> descriptor 4 in a subshell or your script will break" hand-grenade like
> this lying around, but we do seem to have only one instance of it over
> the whole test suite.

I would imagine that a higher fd for BASH_XTRACEFD
would be less explosive than requiring the tests to skip
the low number 4. (It is not *that* low. In e.g. git-submodule.sh
we make use of 3 in cmd_foreach, not needing more.)

Thanks,
Stefan

Re: [PATCH 5/5] diff: handle NULs in get_string_hash()

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 02:31:20PM -0700, Stefan Beller wrote:

> > This is unlikely to ever come up in practice since our line
> > boundaries generally come from calling strlen() in the first
> > place.
> 
> get_string_hash is called from
>  prepare_entry, which in turn is called from
>   add_lines_to_move_detection or mark_color_as_moved
>diff_flush_patch_all_file_pairs
> 
> that constructs the arguments in
> diff_flush_patch
>  run_diff
>   run_diff_cmd
>builtin_diff (part "/* Crazy xdl interfaces.. */")
> xdi_diff_outf( fn_out_consume as arg!)
>  xdi_diff
>   xdl_diff
>xdl_call_hunk_func
> -> fn_out_consume(cb, line, len)
> 
> xdl_call_hunk_func however uses pointer arithmetic instead
> of strlen. So I think this sentence is not a good idea to put in
> the commit message.
> 
> It may not occur in practice, due to binary files detection using
> NUL as a signal, but conceptually our move-colored(!) diffs
> should be compatible with NULs with this patch now.

Good catch. I just skimmed over all the diff_emit_* functions, which use
strlen(). But at the bottom is emit_add_line(), which has a real length
marker from xdiff.

I agree we wouldn't see NULs in general, but this is maybe fixing "diff
--color-moved -a". I dunno. It's probably not worth digging too much,
since it seems like the right thing to do regardless.

-Peff

Re: [PATCH 5/5] diff: handle NULs in get_string_hash()

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 1:31 PM, Jeff King  wrote:
> For computing moved lines, we feed the characters of each
> line into a hash. When we've been asked to ignore
> whitespace, then we pick each character using next_byte(),
> which returns -1 on end-of-string, which it determines using
> the start/end pointers we feed it.
>
> However our check of its return value treats "0" the same as
> "-1", meaning we'd quit if the string has an embedded NUL.

I agree. The code looks correct.

> This is unlikely to ever come up in practice since our line
> boundaries generally come from calling strlen() in the first
> place.

get_string_hash is called from
 prepare_entry, which in turn is called from
  add_lines_to_move_detection or mark_color_as_moved
   diff_flush_patch_all_file_pairs

that constructs the arguments in
diff_flush_patch
 run_diff
  run_diff_cmd
   builtin_diff (part "/* Crazy xdl interfaces.. */")
xdi_diff_outf( fn_out_consume as arg!)
 xdi_diff
  xdl_diff
   xdl_call_hunk_func
-> fn_out_consume(cb, line, len)

xdl_call_hunk_func however uses pointer arithmetic instead
of strlen. So I think this sentence is not a good idea to put in
the commit message.

It may not occur in practice, due to binary files detection using
NUL as a signal, but conceptually our move-colored(!) diffs
should be compatible with NULs with this patch now.

> But it was a bit surprising to me as a reader of the
> next_byte() code. And it's possible that we may one day feed
> this function with more exotic input, which otherwise works
> with arbitrary ptr/len pairs.

Good point.

>
> Signed-off-by: Jeff King 
> ---
> I noticed that we make an extra copy of each line here, just to feed it
> to memihash! I guess "-w" is not a critical-performance code path, but
> this could be fixed if we could do memhash() incrementally (e.g., by
> putting the FNV state into a struct and letting callers "add" to it
> incrementally). Maybe an interesting #leftoverbits, though I'd want to
> see timing tests that show it's worth doing.
>

I agree.

Thanks,
Stefan

Re: [PATCH 4/5] diff: fix whitespace-skipping with --color-moved

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 02:15:12PM -0700, Stefan Beller wrote:

> > +test_expect_failure 'move detection ignoring whitespace at eol' '
> > +   git reset --hard &&
> > +   # Lines 6-9 have new eol whitespace, but 9 also has it in the middle
> > +   q_to_tab <<-\EOF >lines.txt &&
> > +   long line 6Q
> > +   long line 7Q
> > +   long line 8Q
> > +   longQline 9Q
> > +   line 1
> > +   line 2
> > +   line 3
> > +   line 4
> > +   line 5
> > +   EOF
> > +
> > +   # avoid cluttering the output with complaints about our eol 
> > whitespace
> > +   test_config core.whitespace -blank-at-eol &&
> 
> We avoid the eol space change as we want to test the move detection
> without interference. Do we want to test it with that as well?

I don't think it creates interference. It's just that the expected
output becomes more like:

  +long line 6 

and the extra coloring just made it harder for me to read and write the
test. :)

I agree it might be worth checking the interaction between whitespace
coloring and --color-moved, but I think we'd want it to be more thorough
(e.g., covering the beginning of line with indent-with-non-tab or
similar).

> The commit message really enlightened me,
> Thanks!

Thanks for reviewing.

-Peff

I NEED YOUR URGENT HELP AND CORPERATION

2017-10-19 Thread IBRAHIM KABORE

Dear Friend

I am contacting you on a business deal of $9,500,000.00 Million United States 
Dollars, ready for transfer into your own personal account and if we make this 
claim, we will share it on the ratio of 50% / 50% basis, I would like to assure 
you that it be 100% risk free and it will be legally backed up with government 
approval. Once you are interested to transact this business with me, kindly 
give me your consent response immediately.

Hoping to hear from you.

My regards,
Mr Ibrahim Kabore
EMAIL,ibrakabor...@gmail.com

Re: [PATCH 4/5] diff: fix whitespace-skipping with --color-moved

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 1:29 PM, Jeff King  wrote:
> The code for handling whitespace with --color-moved
> represents partial strings as a pair of pointers. There are
> two possible conventions for the end pointer:
>
>   1. It points to the byte right after the end of the
>  string.
>
>   2. It points to the final byte of the string.
>
> But we seem to use both conventions in the code:
>
>   a. we assign the initial pointers from the NUL-terminated
>  string using (1)
>
>   b. we eat trailing whitespace by checking the second
>  pointer for isspace(), which needs (2)
>
>   c. the next_byte() function checks for end-of-string with
>  "if (cp > endp)", which is (2)
>
>   d. in next_byte() we skip past internal whitespace with
>  "while (cp < end)", which is (1)
>
> This creates fewer bugs than you might think, because there
> are some subtle interactions. Because of (a) and (c), we
> always return the NUL-terminator from next_byte(). But all
> of the callers of next_byte() happen to handle that
> gracefully.
>
> Because of the mismatch between (d) and (c), next_byte()
> could accidentally return a whitespace character right at
> endp. But because of the interaction of (a) and (b), we fail
> to actually chomp trailing whitespace, meaning our endp
> _always_ points to a NUL, canceling out the problem.
>
> But that does leave (b) as a real bug: when ignoring
> whitespace only at the end-of-line, we don't correctly trim
> it, and fail to match up lines.
>
> We can fix the whole thing by moving consistently to one
> convention. Since convention (1) is idiomatic in our code
> base, we'll pick that one.
>
> The existing "-w" and "-b" tests continue to pass, and a new
> "--ignore-space-at-eol" shows off the breakage we're fixing.
>
> Signed-off-by: Jeff King 
> ---
>  diff.c | 15 +++
>  t/t4015-diff-whitespace.sh | 67 
> ++
>  2 files changed, 77 insertions(+), 5 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index 6fd288420b..09081a207c 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -712,7 +712,7 @@ static int next_byte(const char **cp, const char **endp,
>  {
> int retval;
>
> -   if (*cp > *endp)
> +   if (*cp >= *endp)
> return -1;

This converts (c) from (2) to (1).

> while (*cp < *endp && isspace(**cp))
> (*cp)++;
> -   /* return the first non-ws character via the usual 
> below */
> +   /*
> +* return the first non-ws character via the usual
> +* below, unless we ate all of the bytes
> +*/
> +   if (*cp >= *endp)
> +   return -1;

This fixes the mismatch between (d) and (c).

When I wrote the code, I did not follow proper commenting style
by capitalizing the sentence start and putting periods. :(
(Well it was a single line comment, which I have seen to not
follow style occasionally unlike any multi line comment. Anyway
no need to fix it here and now, just pointing out my bad code
in the beginning.)

> -   while (ae > ap && isspace(*ae))
> +   while (ae > ap && isspace(ae[-1]))
> ae--;
> -   while (be > bp && isspace(*be))
> +   while (be > bp && isspace(be[-1]))
> be--;
...
> -   while (ae > ap && isspace(*ae))
> +   while (ae > ap && isspace(ae[-1]))

These fixes convert (b) to (1), so we're all on (1).
As we check for strict endpointer > firstpointer
(and not >=), the check of -1 is fine, too.


> @@ -1463,6 +1463,73 @@ test_expect_success 'move detection ignoring 
> whitespace changes' '
> test_cmp expected actual
>  '
>
> +test_expect_failure 'move detection ignoring whitespace at eol' '
> +   git reset --hard &&
> +   # Lines 6-9 have new eol whitespace, but 9 also has it in the middle
> +   q_to_tab <<-\EOF >lines.txt &&
> +   long line 6Q
> +   long line 7Q
> +   long line 8Q
> +   longQline 9Q
> +   line 1
> +   line 2
> +   line 3
> +   line 4
> +   line 5
> +   EOF
> +
> +   # avoid cluttering the output with complaints about our eol whitespace
> +   test_config core.whitespace -blank-at-eol &&

We avoid the eol space change as we want to test the move detection
without interference. Do we want to test it with that as well?

> +   git diff HEAD --no-renames --color-moved --color |
> +   grep -v "index" |
> +   test_decode_color >actual &&
..
> +   git diff HEAD --no-renames --ignore-space-at-eol --color-moved 
> --color |
> +   grep -v "index" |
> +   test_decode_color >actual &&
..
> +   +long  line 9  

ok, we also have no interference with space changes,
which we assume is orthogonal.

Re: [PATCH 3/5] t4015: test the output of "diff --color-moved -b"

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 02:03:54PM -0700, Stefan Beller wrote:

> > +
> > +   git diff HEAD --no-renames --color-moved --color |
> > +   grep -v "index" |
> > +   test_decode_color >actual &&
> 
> The -v index grep makes it future proof (for a new hash)!
> I like that. What I do not like is the pipe at the end of
> git-diff as we certainly want to inspect the return code
> (or for a segfault ;)

I can claim neither credit nor failure here. These lines are copied
straight from your existing test. :)

> > +   git diff HEAD --no-renames -b --color-moved --color |
> > +   grep -v "index" |
> > +   test_decode_color >actual &&
> 
> Do we need (or rather want) to give --color additionally,
> instead of having --color-moved imply it? I guess this is
> extra carefulness from the "color always" series?

I do think --color-moved probably ought to imply --color (or at least
--color=auto). But AFAIK it doesn't do so now, so we need --color (and
if it implied "auto", we'd have to overcome that anyway).

But again, this comes straight from the existing test.

> > -test_expect_success 'move detection with whitespace changes' '
> > -   test_when_finished "git reset --hard" &&
> > -   test_seq 10 >test &&
> > -   git add test &&
> > -   sed s/3/42/ test.tmp &&
> > -   mv test.tmp test &&
> > -   git -c diff.colormoved diff --ignore-space-change -- test
> > -'
> 
> Ok, this is covered above in testing for the unchanged lines (1-5)

Yes, though I think it's actually the _changed_ lines which do it (the
problem was in next_byte(), so any diff with "--color-moved -b", whether
it had moves or not, would infinite loop as long as it had lines without
whitespace at the end).

-Peff

Re: [PATCH 1/5] t4015: refactor --color-moved whitespace test

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 01:56:55PM -0700, Stefan Beller wrote:

> On Thu, Oct 19, 2017 at 1:24 PM, Jeff King  wrote:
> 
> >
> > +test_expect_success 'clean up whitespace-test colors' '
> > +   git config --unset color.diff.oldMoved &&
> > +   git config --unset color.diff.newMoved
> > +'
> 
> This could be part of the previous test as
> 
>   test_config color.diff.oldMoved "magenta" &&
>   test_config color.diff.newMoved "cyan" &&
> 
> in the beginning. (That way we also do not pollute the setup,
> but keeping it test local).

It used to be in the previous test as test_config, but part of the setup
refactoring is to keep these common bits across several tests.  Hence we
"git config" in the setup step, and we must "git config --unset" here in
the cleanup. There's only the one test between setup/cleanup right now,
but the point is to add more.

The other alternative (besides repeating ourselves) would be to put all
those common bits into a function and call the function at the top of
each test.

-Peff

[PATCH 3/3] test-lib: make "-x" work with "--verbose-log"

2017-10-19 Thread Jeff King

The "-x" tracing option implies "--verbose". This is a
problem when running under a TAP harness like "prove", where
we need to use "--verbose-log" instead. Instead, let's
handle this the same way we do for --valgrind, including the
recent fix from 88c6e9d31c (test-lib: --valgrind should not
override --verbose-log, 2017-09-05). Namely, let's enable
--verbose only when we there isn't a more specific verbosity
option indicated.

Note that we also have to tweak `want_trace` to turn it on
(we re-check $verbose in each test because that's how we
handle --verbose-only, but --verbose-log is always on for
all tests).

Signed-off-by: Jeff King 
---
 t/test-lib.sh | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 3399fc3c88..14fac6d6f2 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -263,7 +263,6 @@ do
shift ;;
-x)
trace=t
-   verbose=t
shift ;;
--verbose-log)
verbose_log=t
@@ -282,6 +281,11 @@ then
test -z "$verbose_log" && verbose=t
 fi
 
+if test -n "$trace" && test -z "$verbose_log"
+then
+   verbose=t
+fi
+
 if test -n "$color"
 then
# Save the color control sequences now rather than run tput
@@ -585,7 +589,9 @@ maybe_setup_valgrind () {
 }
 
 want_trace () {
-   test "$trace" = t && test "$verbose" = t
+   test "$trace" = t && {
+   test "$verbose" = t || test "$verbose_log" = t
+   }
 }
 
 # This is a separate function because some tests use
-- 
2.15.0.rc1.560.g5f0609e481

[PATCH 2/3] t5615: avoid re-using descriptor 4

2017-10-19 Thread Jeff King

File descriptors 3 and 4 are special in our test suite, as
they link back to the test script's original stdout and
stderr. Normally this isn't something tests need to worry
about: they are free to clobber these descriptors for
sub-commands without affecting the overall script.

But there's one very special thing about descriptor 4: since
d88785e424 (test-lib: set BASH_XTRACEFD automatically,
2016-05-11), we ask bash to output "set -x" output to it by
number. This goes to _any_ descriptor 4, even if it no
longer points to the place it did when we set BASH_XTRACEFD.

But in t5615, we run a shell loop with descriptor 4
redirected.  As a result, t5615 works with non-bash shells
even with "-x". And it works with bash without "-x". But the
combination of "bash t5615-alternate-env.sh -x" gets a test
failure (because our "set -x" output pollutes one of the
files).

We can fix this by using any descriptor _except_ the magical
4. So let's switch arbitrarily to using 5/6 in this loop,
not 3/4.

Signed-off-by: Jeff King 
---
I also considered trying to bump the "set -x" output descriptor to "9".
That just moves the problem around, but presumably scripts are less
likely to go that high. :)

It would also be possible to pick something insanely high, like "999".
Many shells choke on descriptors higher than 9, but since the issue is
related to BASH_XTRACEFD, we could make it a bash-only thing. I don't
know if it's worth the trouble. I hate leaving a subtle "don't use
descriptor 4 in a subshell or your script will break" hand-grenade like
this lying around, but we do seem to have only one instance of it over
the whole test suite.

 t/t5615-alternate-env.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/t/t5615-alternate-env.sh b/t/t5615-alternate-env.sh
index d2d883f3a1..b4905b822c 100755
--- a/t/t5615-alternate-env.sh
+++ b/t/t5615-alternate-env.sh
@@ -7,9 +7,9 @@ check_obj () {
alt=$1; shift
while read obj expect
do
-   echo "$obj" >&3 &&
-   echo "$obj $expect" >&4
-   done 3>input 4>expect &&
+   echo "$obj" >&5 &&
+   echo "$obj $expect" >&6
+   done 5>input 6>expect &&
GIT_ALTERNATE_OBJECT_DIRECTORIES=$alt \
git "$@" cat-file --batch-check='%(objectname) %(objecttype)' \
actual &&
-- 
2.15.0.rc1.560.g5f0609e481

Re: [PATCH 3/5] t4015: test the output of "diff --color-moved -b"

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 1:26 PM, Jeff King  wrote:
> Commit fa5ba2c1dd (diff: fix infinite loop with
> --color-moved --ignore-space-change, 2017-10-12) added a
> test to make sure that "--color-moved -b" doesn't run
> forever, but the test in question doesn't actually have any
> moved lines in it.
>
> Let's scrap that test and add a variant of the existing
> "--color-moved -w" test, but this time we'll check that we
> find the move with whitespace changes, but not arbitrary
> whitespace additions.
>
> Signed-off-by: Jeff King 
> ---

> +
> +   git diff HEAD --no-renames --color-moved --color |
> +   grep -v "index" |
> +   test_decode_color >actual &&

The -v index grep makes it future proof (for a new hash)!
I like that. What I do not like is the pipe at the end of
git-diff as we certainly want to inspect the return code
(or for a segfault ;)

> +   git diff HEAD --no-renames -b --color-moved --color |
> +   grep -v "index" |
> +   test_decode_color >actual &&

Do we need (or rather want) to give --color additionally,
instead of having --color-moved imply it? I guess this is
extra carefulness from the "color always" series?


>  test_expect_success 'clean up whitespace-test colors' '
> git config --unset color.diff.oldMoved &&
> git config --unset color.diff.newMoved
> @@ -1549,13 +1613,4 @@ test_expect_success 'move detection with submodules' '
> test_cmp expect decoded_actual
>  '
>
> -test_expect_success 'move detection with whitespace changes' '
> -   test_when_finished "git reset --hard" &&
> -   test_seq 10 >test &&
> -   git add test &&
> -   sed s/3/42/ test.tmp &&
> -   mv test.tmp test &&
> -   git -c diff.colormoved diff --ignore-space-change -- test
> -'

Ok, this is covered above in testing for the unchanged lines (1-5)

[PATCH 1/3] test-lib: silence "-x" cleanup under bash

2017-10-19 Thread Jeff King

When the test suite's "-x" option is used with bash, we end
up seeing cleanup cruft in the output:

  $ bash t0001-init.sh -x
  [...]
  ++ diff -u expected actual
  + test_eval_ret_=0
  + want_trace
  + test t = t
  + test t = t
  + set +x
  ok 42 - re-init from a linked worktree

This ranges from mildly annoying (for a successful test) to
downright confusing (when we say "last command exited with
error", but it's really 5 commands back).

We normally are able to suppress this cleanup. As the
in-code comment explains, we can't convince the shell not to
print it, but we can redirect its stderr elsewhere.

But since d88785e424 (test-lib: set BASH_XTRACEFD
automatically, 2016-05-11), that doesn't hold for bash. It
sends the "set -x" output directly to descriptor 4, not to
stderr.

We can fix this by also redirecting descriptor 4, and
paying close attention to which commands redirect and
which do not (see the updated comment).

Two alternatives I considered and rejected:

  - unsetting and setting BASH_XTRACEFD; doing so closes the
descriptor, which we must avoid

  - we could keep everything in a single block as before,
redirect 4>/dev/null there, but retain 5>&4 as a copy.
And then selectively restore 4>&5 for commands which
should be allowed to trace. This would work, but the
descriptor swapping seems unnecessarily confusing.

Signed-off-by: Jeff King 
---
 t/test-lib.sh | 34 --
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 9b61f16f7a..3399fc3c88 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -600,26 +600,40 @@ test_eval_inner_ () {
 }
 
 test_eval_ () {
-   # We run this block with stderr redirected to avoid extra cruft
-   # during a "-x" trace. Once in "set -x" mode, we cannot prevent
+   # If "-x" tracing is in effect, then we want to avoid polluting stderr
+   # with non-test commands. But once in "set -x" mode, we cannot prevent
# the shell from printing the "set +x" to turn it off (nor the saving
# of $? before that). But we can make sure that the output goes to
# /dev/null.
#
-   # The test itself is run with stderr put back to &4 (so either to
-   # /dev/null, or to the original stderr if --verbose was used).
+   # There are a few subtleties here:
+   #
+   #   - we have to redirect descriptor 4 in addition to 2, to cover
+   # BASH_XTRACEFD
+   #
+   #   - the actual eval has to come before the redirection block (since
+   # it needs to see descriptor 4 to set up its stderr)
+   #
+   #   - likewise, any error message we print must be outside the block to
+   # access descriptor 4
+   #
+   #   - checking $? has to come immediately after the eval, but it must
+   # be _inside_ the block to avoid polluting the "set -x" output
+   #
+
+   test_eval_inner_ "$@" &3 2>&4
{
-   test_eval_inner_ "$@" &3 2>&4
test_eval_ret_=$?
if want_trace
then
set +x
-   if test "$test_eval_ret_" != 0
-   then
-   say_color error >&4 "error: last command exited 
with \$?=$test_eval_ret_"
-   fi
fi
-   } 2>/dev/null
+   } 2>/dev/null 4>&2
+
+   if test "$test_eval_ret_" != 0 && want_trace
+   then
+   say_color error >&4 "error: last command exited with 
\$?=$test_eval_ret_"
+   fi
return $test_eval_ret_
 }
 
-- 
2.15.0.rc1.560.g5f0609e481

[PATCH 0/3] making test-suite tracing more useful

2017-10-19 Thread Jeff King

I sometimes run git's test suite as part of an automated testing
process. I was hoping to add "-x" support to get more details when a
test fails (since failures are sometimes hard to reproduce). But I hit a
few small snags:

  - you have to run with bash, since BASH_XTRACEFD is required to avoid
failures in some tests when we capture the stderr of shell functions
or subshells (which get polluted with the "set -x" outupt). This
requirement isn't a big deal for me, but it showed some other
issues.

  - the output with BASH_XTRACEFD is a little confusing; fixed by patch
1.

  - there's one test that _only_ fails with BASH_XTRACEFD. That's fixed
in patch 2.

  - I wanted to use "prove" since the output of "make -j64 test" is
unreadable. But that means using "--verbose-log", which was
incompatible with "-x". That's patch 3.

With these patches, I can now do:

  make -j64 test \
SHELL_PATH=/bin/bash \
GIT_TEST_OPTS="--verbose-log -x" \
DEFAULT_TEST_TARGET=prove \
GIT_PROVE_OPTS=-j64 || {
  echo "Failing tests:"
  echo "--"
  grep -l '[^0]' t/test-results/*.exit |
  while read failed; do
  base=${failed%.exit}
  name=${base#t/test-results/}
  echo "==> $name"
  cat "$base.out"
  done
  exit 1
}

and get fairly readable output (a nice summary from prove, and then a
dump of any failing test output).

Johannes, I've seen that you do "-x" in the tests that the
git-for-windows bot uses to comment on GitHub. You may have seen the
bogus failure in t5615, which this series should fix (you may also have
seen the "set +x" cruft at the end of each test, which is fixed here,
too).

Lars, I think with this it should be possible to turn on "-x" for the
Travis build.

  [1/3]: test-lib: silence "-x" cleanup under bash
  [2/3]: t5615: avoid re-using descriptor 4
  [3/3]: test-lib: make "-x" work with "--verbose-log"

 t/t5615-alternate-env.sh |  6 +++---
 t/test-lib.sh| 44 
 2 files changed, 35 insertions(+), 15 deletions(-)

-Peff

Re: [PATCH 1/5] t4015: refactor --color-moved whitespace test

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 1:24 PM, Jeff King  wrote:

>
> +test_expect_success 'clean up whitespace-test colors' '
> +   git config --unset color.diff.oldMoved &&
> +   git config --unset color.diff.newMoved
> +'

This could be part of the previous test as

  test_config color.diff.oldMoved "magenta" &&
  test_config color.diff.newMoved "cyan" &&

in the beginning. (That way we also do not pollute the setup,
but keeping it test local).

With or without this nit, this is
Reviewed-by: Stefan Beller 

Thanks,
Stefan

Re: [PATCH 2/5] t4015: check "negative" case for "-w --color-moved"

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 1:25 PM, Jeff King  wrote:
> We test that lines with whitespace changes are not found by
> "--color-moved" by default, but are found if "-w" is added.
> Let's add one more twist: a line that has non-whitespace
> changes should not be marked as a pure move.
>
> This is perhaps an obvious case for us to get right (and we
> do), but as we add more whitespace tests, they will form a
> pattern of "make sure this case is a move and this other
> case is not".
>
> Note that we have to add a line to our moved block, since
> having a too-small block doesn't trigger the "moved"
> heuristics.  And we also add a line of context to ensure
> that there's more context lines than moved lines (so the
> diff shows us moving the lines up, rather than moving the
> context down).
>
> Signed-off-by: Jeff King 

This patch is
Reviewed-by: Stefan Beller 

Thanks!

Re: [Alt. PATCH] ls-remote: deprecate -h as short for --heads

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 08:26:47PM +0200, René Scharfe wrote:

> diff --git a/Documentation/git-ls-remote.txt b/Documentation/git-ls-remote.txt
> index 5f2628c8f8..82622e7fbc 100644
> --- a/Documentation/git-ls-remote.txt
> +++ b/Documentation/git-ls-remote.txt
> @@ -29,6 +29,9 @@ OPTIONS
>   These options are _not_ mutually exclusive; when given
>   both, references stored in refs/heads and refs/tags are
>   displayed.
> ++
> +*NOTE*: -h without any other parameters shows a short help text instead
> +of any refs.

Yuck. This "we only treat -h as special in certain cases" rule is
sufficiently magical that I don't think we want to advertise and lock
ourselves into it.

-Peff

[PATCH 5/5] diff: handle NULs in get_string_hash()

2017-10-19 Thread Jeff King

For computing moved lines, we feed the characters of each
line into a hash. When we've been asked to ignore
whitespace, then we pick each character using next_byte(),
which returns -1 on end-of-string, which it determines using
the start/end pointers we feed it.

However our check of its return value treats "0" the same as
"-1", meaning we'd quit if the string has an embedded NUL.
This is unlikely to ever come up in practice since our line
boundaries generally come from calling strlen() in the first
place.

But it was a bit surprising to me as a reader of the
next_byte() code. And it's possible that we may one day feed
this function with more exotic input, which otherwise works
with arbitrary ptr/len pairs.

Signed-off-by: Jeff King 
---
I noticed that we make an extra copy of each line here, just to feed it
to memihash! I guess "-w" is not a critical-performance code path, but
this could be fixed if we could do memhash() incrementally (e.g., by
putting the FNV state into a struct and letting callers "add" to it
incrementally). Maybe an interesting #leftoverbits, though I'd want to
see timing tests that show it's worth doing.

 diff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 09081a207c..c4a669ffa8 100644
--- a/diff.c
+++ b/diff.c
@@ -782,7 +782,7 @@ static unsigned get_string_hash(struct emitted_diff_symbol 
*es, struct diff_opti
strbuf_reset();
while (ae > ap && isspace(ae[-1]))
ae--;
-   while ((c = next_byte(, , o)) > 0)
+   while ((c = next_byte(, , o)) >= 0)
strbuf_addch(, c);
 
return memhash(sb.buf, sb.len);
-- 
2.15.0.rc1.560.g5f0609e481

[PATCH 4/5] diff: fix whitespace-skipping with --color-moved

2017-10-19 Thread Jeff King

The code for handling whitespace with --color-moved
represents partial strings as a pair of pointers. There are
two possible conventions for the end pointer:

  1. It points to the byte right after the end of the
 string.

  2. It points to the final byte of the string.

But we seem to use both conventions in the code:

  a. we assign the initial pointers from the NUL-terminated
 string using (1)

  b. we eat trailing whitespace by checking the second
 pointer for isspace(), which needs (2)

  c. the next_byte() function checks for end-of-string with
 "if (cp > endp)", which is (2)

  d. in next_byte() we skip past internal whitespace with
 "while (cp < end)", which is (1)

This creates fewer bugs than you might think, because there
are some subtle interactions. Because of (a) and (c), we
always return the NUL-terminator from next_byte(). But all
of the callers of next_byte() happen to handle that
gracefully.

Because of the mismatch between (d) and (c), next_byte()
could accidentally return a whitespace character right at
endp. But because of the interaction of (a) and (b), we fail
to actually chomp trailing whitespace, meaning our endp
_always_ points to a NUL, canceling out the problem.

But that does leave (b) as a real bug: when ignoring
whitespace only at the end-of-line, we don't correctly trim
it, and fail to match up lines.

We can fix the whole thing by moving consistently to one
convention. Since convention (1) is idiomatic in our code
base, we'll pick that one.

The existing "-w" and "-b" tests continue to pass, and a new
"--ignore-space-at-eol" shows off the breakage we're fixing.

Signed-off-by: Jeff King 
---
 diff.c | 15 +++
 t/t4015-diff-whitespace.sh | 67 ++
 2 files changed, 77 insertions(+), 5 deletions(-)

diff --git a/diff.c b/diff.c
index 6fd288420b..09081a207c 100644
--- a/diff.c
+++ b/diff.c
@@ -712,7 +712,7 @@ static int next_byte(const char **cp, const char **endp,
 {
int retval;
 
-   if (*cp > *endp)
+   if (*cp >= *endp)
return -1;
 
if (isspace(**cp)) {
@@ -729,7 +729,12 @@ static int next_byte(const char **cp, const char **endp,
if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE)) {
while (*cp < *endp && isspace(**cp))
(*cp)++;
-   /* return the first non-ws character via the usual 
below */
+   /*
+* return the first non-ws character via the usual
+* below, unless we ate all of the bytes
+*/
+   if (*cp >= *endp)
+   return -1;
}
}
 
@@ -750,9 +755,9 @@ static int moved_entry_cmp(const struct diff_options 
*diffopt,
return a->es->len != b->es->len  || memcmp(ap, bp, a->es->len);
 
if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE_AT_EOL)) {
-   while (ae > ap && isspace(*ae))
+   while (ae > ap && isspace(ae[-1]))
ae--;
-   while (be > bp && isspace(*be))
+   while (be > bp && isspace(be[-1]))
be--;
}
 
@@ -775,7 +780,7 @@ static unsigned get_string_hash(struct emitted_diff_symbol 
*es, struct diff_opti
int c;
 
strbuf_reset();
-   while (ae > ap && isspace(*ae))
+   while (ae > ap && isspace(ae[-1]))
ae--;
while ((c = next_byte(, , o)) > 0)
strbuf_addch(, c);
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 1f54816c6b..eb9431da7d 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1463,6 +1463,73 @@ test_expect_success 'move detection ignoring whitespace 
changes' '
test_cmp expected actual
 '
 
+test_expect_failure 'move detection ignoring whitespace at eol' '
+   git reset --hard &&
+   # Lines 6-9 have new eol whitespace, but 9 also has it in the middle
+   q_to_tab <<-\EOF >lines.txt &&
+   long line 6Q
+   long line 7Q
+   long line 8Q
+   longQline 9Q
+   line 1
+   line 2
+   line 3
+   line 4
+   line 5
+   EOF
+
+   # avoid cluttering the output with complaints about our eol whitespace
+   test_config core.whitespace -blank-at-eol &&
+
+   git diff HEAD --no-renames --color-moved --color |
+   grep -v "index" |
+   test_decode_color >actual &&
+   cat <<-\EOF >expected &&
+   diff --git a/lines.txt b/lines.txt
+   --- a/lines.txt
+   +++ b/lines.txt
+   @@ -1,9 +1,9 @@
+   +long line 6   
+   +long line 7   
+   +long line 8   
+   +long  line 9  
+line 1
+line 2
+line 3
+line 4
+line 5
+   -long

[PATCH 3/5] t4015: test the output of "diff --color-moved -b"

2017-10-19 Thread Jeff King

Commit fa5ba2c1dd (diff: fix infinite loop with
--color-moved --ignore-space-change, 2017-10-12) added a
test to make sure that "--color-moved -b" doesn't run
forever, but the test in question doesn't actually have any
moved lines in it.

Let's scrap that test and add a variant of the existing
"--color-moved -w" test, but this time we'll check that we
find the move with whitespace changes, but not arbitrary
whitespace additions.

Signed-off-by: Jeff King 
---
 t/t4015-diff-whitespace.sh | 73 --
 1 file changed, 64 insertions(+), 9 deletions(-)

diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 503c9bc7f3..1f54816c6b 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1399,6 +1399,70 @@ test_expect_success 'move detection ignoring whitespace 
' '
test_cmp expected actual
 '
 
+test_expect_success 'move detection ignoring whitespace changes' '
+   git reset --hard &&
+   # Lines 6-8 have a space change, but 9 is new whitespace
+   q_to_tab <<-\EOF >lines.txt &&
+   longQline 6
+   longQline 7
+   longQline 8
+   long liQne 9
+   line 1
+   line 2
+   line 3
+   line 4
+   line 5
+   EOF
+
+   git diff HEAD --no-renames --color-moved --color |
+   grep -v "index" |
+   test_decode_color >actual &&
+   cat <<-\EOF >expected &&
+   diff --git a/lines.txt b/lines.txt
+   --- a/lines.txt
+   +++ b/lines.txt
+   @@ -1,9 +1,9 @@
+   +long  line 6
+   +long  line 7
+   +long  line 8
+   +long li   ne 9
+line 1
+line 2
+line 3
+line 4
+line 5
+   -long line 6
+   -long line 7
+   -long line 8
+   -long line 9
+   EOF
+   test_cmp expected actual &&
+
+   git diff HEAD --no-renames -b --color-moved --color |
+   grep -v "index" |
+   test_decode_color >actual &&
+   cat <<-\EOF >expected &&
+   diff --git a/lines.txt b/lines.txt
+   --- a/lines.txt
+   +++ b/lines.txt
+   @@ -1,9 +1,9 @@
+   +longline 6
+   +longline 7
+   +longline 8
+   +long li   ne 9
+line 1
+line 2
+line 3
+line 4
+line 5
+   -long line 6
+   -long line 7
+   -long line 8
+   -long line 9
+   EOF
+   test_cmp expected actual
+'
+
 test_expect_success 'clean up whitespace-test colors' '
git config --unset color.diff.oldMoved &&
git config --unset color.diff.newMoved
@@ -1549,13 +1613,4 @@ test_expect_success 'move detection with submodules' '
test_cmp expect decoded_actual
 '
 
-test_expect_success 'move detection with whitespace changes' '
-   test_when_finished "git reset --hard" &&
-   test_seq 10 >test &&
-   git add test &&
-   sed s/3/42/ test.tmp &&
-   mv test.tmp test &&
-   git -c diff.colormoved diff --ignore-space-change -- test
-'
-
 test_done
-- 
2.15.0.rc1.560.g5f0609e481

[PATCH 2/5] t4015: check "negative" case for "-w --color-moved"

2017-10-19 Thread Jeff King

We test that lines with whitespace changes are not found by
"--color-moved" by default, but are found if "-w" is added.
Let's add one more twist: a line that has non-whitespace
changes should not be marked as a pure move.

This is perhaps an obvious case for us to get right (and we
do), but as we add more whitespace tests, they will form a
pattern of "make sure this case is a move and this other
case is not".

Note that we have to add a line to our moved block, since
having a too-small block doesn't trigger the "moved"
heuristics.  And we also add a line of context to ensure
that there's more context lines than moved lines (so the
diff shows us moving the lines up, rather than moving the
context down).

Signed-off-by: Jeff King 
---
 t/t4015-diff-whitespace.sh | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 164b502405..503c9bc7f3 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1326,9 +1326,11 @@ test_expect_success 'set up whitespace tests' '
line 2
line 3
line 4
-   long line 5
+   line 5
long line 6
long line 7
+   long line 8
+   long line 9
EOF
git add lines.txt &&
git commit -m "add poetry" &&
@@ -1338,13 +1340,15 @@ test_expect_success 'set up whitespace tests' '
 
 test_expect_success 'move detection ignoring whitespace ' '
q_to_tab <<-\EOF >lines.txt &&
-   Qlong line 5
Qlong line 6
Qlong line 7
+   Qlong line 8
+   Qchanged long line 9
line 1
line 2
line 3
line 4
+   line 5
EOF
git diff HEAD --no-renames --color-moved --color |
grep -v "index" |
@@ -1353,17 +1357,20 @@ test_expect_success 'move detection ignoring whitespace 
' '
diff --git a/lines.txt b/lines.txt
--- a/lines.txt
+++ b/lines.txt
-   @@ -1,7 +1,7 @@
-   + long line 5
+   @@ -1,9 +1,9 @@
+ long line 6
+ long line 7
+   + long line 8
+   + changed long line 9
 line 1
 line 2
 line 3
 line 4
-   -long line 5
+line 5
-long line 6
-long line 7
+   -long line 8
+   -long line 9
EOF
test_cmp expected actual &&
 
@@ -1374,17 +1381,20 @@ test_expect_success 'move detection ignoring whitespace 
' '
diff --git a/lines.txt b/lines.txt
--- a/lines.txt
+++ b/lines.txt
-   @@ -1,7 +1,7 @@
-   +  long line 5
+   @@ -1,9 +1,9 @@
+  long line 6
+  long line 7
+   +  long line 8
+   + changed long line 9
 line 1
 line 2
 line 3
 line 4
-   -long line 5
+line 5
-long line 6
-long line 7
+   -long line 8
+   -long line 9
EOF
test_cmp expected actual
 '
-- 
2.15.0.rc1.560.g5f0609e481

[PATCH 1/5] t4015: refactor --color-moved whitespace test

2017-10-19 Thread Jeff King

In preparation for testing several different whitespace
options, let's split out the setup and cleanup steps of the
whitespace test.

While we're here, let's also switch to using "<<-" to indent
our here-documents properly, and use q_to_tab to more
explicitly mark where we expect whitespace to appear.

Signed-off-by: Jeff King 
---
 t/t4015-diff-whitespace.sh | 49 +++---
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 87083f728f..164b502405 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1318,30 +1318,34 @@ test_expect_success 'no effect from --color-moved with 
--word-diff' '
test_cmp expect actual
 '
 
-test_expect_success 'move detection ignoring whitespace ' '
+test_expect_success 'set up whitespace tests' '
git reset --hard &&
-   cat <<\EOF >lines.txt &&
-line 1
-line 2
-line 3
-line 4
-long line 5
-long line 6
-long line 7
-EOF
-   git add lines.txt &&
-   git commit -m "add poetry" &&
-   cat <<\EOF >lines.txt &&
+   # Note that these lines have no leading or trailing whitespace.
+   cat <<-\EOF >lines.txt &&
+   line 1
+   line 2
+   line 3
+   line 4
long line 5
long line 6
long line 7
-line 1
-line 2
-line 3
-line 4
-EOF
-   test_config color.diff.oldMoved "magenta" &&
-   test_config color.diff.newMoved "cyan" &&
+   EOF
+   git add lines.txt &&
+   git commit -m "add poetry" &&
+   git config color.diff.oldMoved "magenta" &&
+   git config color.diff.newMoved "cyan"
+'
+
+test_expect_success 'move detection ignoring whitespace ' '
+   q_to_tab <<-\EOF >lines.txt &&
+   Qlong line 5
+   Qlong line 6
+   Qlong line 7
+   line 1
+   line 2
+   line 3
+   line 4
+   EOF
git diff HEAD --no-renames --color-moved --color |
grep -v "index" |
test_decode_color >actual &&
@@ -1385,6 +1389,11 @@ EOF
test_cmp expected actual
 '
 
+test_expect_success 'clean up whitespace-test colors' '
+   git config --unset color.diff.oldMoved &&
+   git config --unset color.diff.newMoved
+'
+
 test_expect_success '--color-moved block at end of diff output respects 
MIN_ALNUM_COUNT' '
git reset --hard &&
>bar &&
-- 
2.15.0.rc1.560.g5f0609e481

[PATCH 0/5] fix "diff --color-moved --ignore-space-at-eol"

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 01:42:46AM -0400, Jeff King wrote:

> It's late here, so I'll wait for comments from Stefan and then try to
> wrap it up with a commit message and test tomorrow.

Here it is.

The fix is in patch 4. The earlier ones are just beefing up the test
coverage, and the last one is a minor cleanup we can do on top.

  [1/5]: t4015: refactor --color-moved whitespace test
  [2/5]: t4015: check "negative" case for "-w --color-moved"
  [3/5]: t4015: test the output of "diff --color-moved -b"
  [4/5]: diff: fix whitespace-skipping with --color-moved
  [5/5]: diff: handle NULs in get_string_hash()

 diff.c |  17 ++--
 t/t4015-diff-whitespace.sh | 213 +
 2 files changed, 188 insertions(+), 42 deletions(-)

-Peff

Re: [PATCH] diff.c: increment buffer pointer in all code path

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 12:53:03PM -0700, Stefan Beller wrote:

> > @@ -771,7 +771,7 @@ static unsigned get_string_hash(struct 
> > emitted_diff_symbol *es, struct diff_opti
> >  {
> > if (o->xdl_opts & XDF_WHITESPACE_FLAGS) {
> > static struct strbuf sb = STRBUF_INIT;
> > -   const char *ap = es->line, *ae = es->line + es->len;
> > +   const char *ap = es->line, *ae = es->line + es->len - 1;
> > int c;
> >
> > strbuf_reset();
> >
> > it does. It just adjusts our "end pointer" to point to the last valid
> > character in the string (rather than one past),
> 
> Thanks for spotting. I can send a proper patch with tests if you'd like.

I'm putting the finishing touches on a series that should be out in a
few minutes. Hold off until then.

-Peff

Re: [PATCH] diff.c: increment buffer pointer in all code path

2017-10-19 Thread Stefan Beller

On Wed, Oct 18, 2017 at 10:42 PM, Jeff King  wrote:

>
> So I think the right fix is this:
>
[...]
>
> It's late here, so I'll wait for comments from Stefan and then try to
> wrap it up with a commit message and test tomorrow.
>
> -Peff

I agree that this is better and looks correct.
Thanks for offering to send a patch. I'd be happy to review it.

Thanks,
Stefan

[PATCH] add a test for "describe --contains" with mixed tags

2017-10-19 Thread orgads

From: Orgad Shaneh 

---
 t/t6120-describe.sh | 16 
 1 file changed, 16 insertions(+)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 1c0e865..08427f4 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -340,4 +340,20 @@ test_expect_success ULIMIT_STACK_SIZE 'describe works in a 
deep repo' '
test_cmp expect actual
 '
 
+test_expect_success 'describe --contains for light before annotated' '
+   test_tick &&
+   git commit --allow-empty -m First &&
+   test_tick &&
+   git commit --allow-empty -m Second &&
+   test_tick &&
+   git commit --allow-empty -m Third &&
+   test_tick &&
+   git tag light-before-annotated HEAD^ &&
+   test_tick &&
+   git tag -a -m annotated annotated-after-light
+
+'
+
+check_describe light-before-annotated~1 --contains light-before-annotated~1
+
 test_done
-- 
1.9.1

Re: [PATCH] diff.c: increment buffer pointer in all code path

2017-10-19 Thread Stefan Beller

On Wed, Oct 18, 2017 at 10:24 PM, Jeff King  wrote:
> On Thu, Oct 19, 2017 at 01:04:59AM -0400, Jeff King wrote:
>
>> So. That leaves me with:
>>
>>   - I'm unclear on whether next_byte() is meant to return that trailing
>> NUL or not. I don't think it causes any bugs, but it certainly
>> confused me for a function to take a cp/endp pair of pointers, and
>> then dereference endp. It might be worth either fixing or clarifying
>> with a comment.
>>
>>   - Those loops to eat trailing whitespace are doing nothing. I'm not
>> sure if that all works out because next_byte() eats whitespaces or
>> not (I think not, because it doesn't eat whitespace for the
>> IGNORE_WHITESPACE_AT_EOL case). But I'm not quite sure what a test
>> would look like.
>
> I had trouble constructing a test at first, but I think my test lines
> just weren't long enough to trigger the movement heuristics. If I switch
> to something besides seq, I can do:
>
>   # any input that has reasonably sized lines
>   look e | head -50 >file
>   git add file
>
>   perl -i -ne '
> # pick up lines 20-25 to move to line 40, and
> # add some trailing whitespace to them
> if ($. >= 20 && $. <= 25) {
>   s/$/ /;
>   $hold .= $_;
> } else {
>   print $hold if ($. == 40);
>   print;
> }
>   ' file
>
>   git diff --color-moved --ignore-space-at-eol
>
> I think that _should_ show the block as moved, but it doesn't. But if I
> apply this patch:
>
> diff --git a/diff.c b/diff.c
> index 93dccd1817..375d9cf447 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -743,8 +743,8 @@ static int moved_entry_cmp(const struct diff_options 
> *diffopt,
>const struct moved_entry *b,
>const void *keydata)
>  {
> -   const char *ap = a->es->line, *ae = a->es->line + a->es->len;
> -   const char *bp = b->es->line, *be = b->es->line + b->es->len;
> +   const char *ap = a->es->line, *ae = a->es->line + a->es->len - 1;
> +   const char *bp = b->es->line, *be = b->es->line + b->es->len - 1;
>
> if (!(diffopt->xdl_opts & XDF_WHITESPACE_FLAGS))
> return a->es->len != b->es->len  || memcmp(ap, bp, 
> a->es->len);
> @@ -771,7 +771,7 @@ static unsigned get_string_hash(struct 
> emitted_diff_symbol *es, struct diff_opti
>  {
> if (o->xdl_opts & XDF_WHITESPACE_FLAGS) {
> static struct strbuf sb = STRBUF_INIT;
> -   const char *ap = es->line, *ae = es->line + es->len;
> +   const char *ap = es->line, *ae = es->line + es->len - 1;
> int c;
>
> strbuf_reset();
>
> it does. It just adjusts our "end pointer" to point to the last valid
> character in the string (rather than one past),

Thanks for spotting. I can send a proper patch with tests if you'd like.


> which seems to be the
> convention that those loops (and next_byte) expect.

I'll look at that again.

Thanks for poking!
Stefan

>
> -Peff

Re: [PATCH/RFC] grep: turn off threading when recursing submodules

2017-10-19 Thread Martin Ågren

On 19 October 2017 at 21:34, Jeff King  wrote:
> On Thu, Oct 19, 2017 at 12:26:18PM -0700, Brandon Williams wrote:
>
>> One alternative to turning off threading would be to employ proper
>> locking (like I failed to do) by wrapping the call the
>> 'add_to_alternates_memory()' in calls to grep_read_lock/unlock:
>>
>>   +   grep_read_lock();
>>   add_to_alternates_memory(submodule.objectdir);
>>   +   grep_read_unlock();
>>
>> That way adding the submodule to the object database won't happen at the
>> same time another thread is trying to read from the object store.
>
> Yeah, I think that is the right approach. Many of Git's APIs aren't
> thread-safe, so we need a big mutex around "I am doing anything except
> actually looking at the already-read blob contents". We just missed this
> spot.

That's a big glaring lock I missed, right in front of me. Looks like the
right solution!

Re: [PATCH] builtin/push.c: add push.pushOption config

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 10:47 AM, Marius Paliga  wrote:
> Push options need to be given explicitly, via the command line as "git
> push --push-option ".  Add the config option push.pushOption,
> which is a multi-valued option, containing push options that are sent
> by default.
>
> When push options are set in the lower-priority configulation file
> (e.g. /etc/gitconfig, or $HOME/.gitconfig), they can be unset later in
> the more specific repository config by the empty string.
>
> Add tests and update documentation as well.

Thanks for keeping working on this feature!


> @@ -161,6 +161,9 @@ already exists on the remote side.
> Transmit the given string to the server, which passes them to
> the pre-receive as well as the post-receive hook. The given string
> must not contain a NUL or LF character.
> +   When no `--push-option ` is given from the command
> +   line, the values of configuration variable `push.pushOption`
> +   are used instead.

We'd also want to document how push.pushOption works in
Documentation/config.txt (that contains all the configs)

So in the config, we have to explicitly give an empty option to
clear the previous options, but on the command line we do not need
that, but instead we'd have to repeat any push options that we desire
that were configured?

Example:

  /etc/gitconfig
  push.pushoption = a
  push.pushoption = b

  ~/.gitconfig
  push.pushoption = c

  repo/.git/config
  push.pushoption =
  push.pushoption = b

will result in only b as a and c are
cleared.

If I were to run
  git -c push.pushOption=d push ... (in repo)
it would be b and d, but
  git push --push-option=d
would be d only?

As a user I might have expected it to be the same,
expecting
  git push --push-option='' --push-option=d
to suppress b.


> @@ -584,12 +599,13 @@ int cmd_push(int argc, const char **argv, const char 
> *prefix)
> set_refspecs(argv + 1, argc - 1, repo);
> }
>
> -   for_each_string_list_item(item, _options)
> +   for_each_string_list_item(item, push_options)

We have to do the same for _cmdline here, too?

The tests look good!

Thanks,
Stefan

Re: [PATCH/RFC] grep: turn off threading when recursing submodules

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 12:26:18PM -0700, Brandon Williams wrote:

> One alternative to turning off threading would be to employ proper
> locking (like I failed to do) by wrapping the call the
> 'add_to_alternates_memory()' in calls to grep_read_lock/unlock:
> 
>   +   grep_read_lock();
>   add_to_alternates_memory(submodule.objectdir);
>   +   grep_read_unlock();
> 
> That way adding the submodule to the object database won't happen at the
> same time another thread is trying to read from the object store.

Yeah, I think that is the right approach. Many of Git's APIs aren't
thread-safe, so we need a big mutex around "I am doing anything except
actually looking at the already-read blob contents". We just missed this
spot.

-Peff

Re: [PATCH v3 4/4] status: test --ignored=matching

2017-10-19 Thread Stefan Beller

On Thu, Oct 19, 2017 at 9:06 AM, Jameson Miller
 wrote:

> +test_expect_success 'Verify behavior of status on folders with ignored 
> files' '

https://stackoverflow.com/questions/5078676/what-is-the-difference-between-a-directory-and-a-folder
We test directories here?

All tests are testing --ignored=matching, do we want to have at least one test
of "no" and "traditional" as well? (Just to see if the parsing works;
we can modify
an existing test for that?)

Re: [PATCH/RFC] grep: turn off threading when recursing submodules

2017-10-19 Thread Brandon Williams

On 10/19, Martin Ågren wrote:
> With --recurse-submodules, we use the machinery for alternate object
> databases and pretend that the repository is an alternate ODB. This has
> some drawbacks since the list of alternates is global and will grow as
> we proceed. Still, that's a problem with performance, not correctness.
> The other immediately available option is to spawn an external process.
> We actually used to do this until f9ee2fcdf (grep: recurse in-process
> using 'struct repository', 2017-08-02).
>
> So, as we encounter submodules during our recursing, we add them to the
> list of ODBs. But with threading, our changes to the list are not
> protected against races. Indeed, ThreadSanitizer reports a problem in
> this area with t7814.
>
> The race which ThreadSanitizer detects is that `grep_submodule()` calls
> `add_to_alternates_memory()` around the same time that
> `grep_source_load_oid()` ends up calling `prepare_packed_git()`. A
> similar race might occur if two threads encounter a submodule each at
> the same time and add them to the list of ODBs. This might end up losing
> one of them and could give incorrect results.
>
> Turn off threading when we're recursing submodules to avoid this race.
> Long term, a better approach will be to address the existing NEEDSWORK
> in builtin/grep.c to introduce per-repo object stores. Then we should be
> able to revert this commit.
>
> Alternatively, we could equip the list with a mutex (or maybe do some
> lock-less cleverness), but it seems a bit sad considering it shouldn't
> really be needed: the list of ODBs would normally be fully populated
> before we start using it.
>
> Another approach would be to first recurse the submodules and collect
> the ODBs, then recurse again to do the actual grepping. That would be a
> more involved change. Or, we could revert f9ee2fcdf. That would hurt
> those who do not use threading.
>
> Signed-off-by: Martin Ågren 
> ---
> This is a simple, stupid solution. I'm posting this in patch-form so
> that I can do one better than just mailing about the race and waving my
> hands. Maybe threading is common enough that reverting could be a better
> approach. Or implementing some (optional) locking...
>
> Output from ThreadSanitizer below the patch.

Thanks for finding this mistake of mine.  I would think that even after
moving towards removing the 'add_to_alternates_memory()' calls and
instead having the object store embedded in the repository struct we
would still need to use locking to ensure that multiple threads are able
to read from the object store properly (because even now two threads
can't read from it without a lock).

One alternative to turning off threading would be to employ proper
locking (like I failed to do) by wrapping the call the
'add_to_alternates_memory()' in calls to grep_read_lock/unlock:

  +   grep_read_lock();
  add_to_alternates_memory(submodule.objectdir);
  +   grep_read_unlock();

That way adding the submodule to the object database won't happen at the
same time another thread is trying to read from the object store.

>
>  builtin/grep.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 2d65f27d0..29f79104d 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -1023,6 +1023,15 @@ int cmd_grep(int argc, const char **argv, const char 
> *prefix)
>   die(_("invalid number of threads specified (%d)"), num_threads);
>   if (num_threads == 1)
>   num_threads = 0;
> + /*
> +  * NEEDSWORK: When we recurse through submodules we misuse the
> +  * alt-odb-mechanism (alternate object databases). Doing so may hit
> +  * data-races if we are threading. This can be removed once the object
> +  * store is no longer global and instead is a member of the repository
> +  * object.
> +  */
> + if (recurse_submodules)
> + num_threads = 0;
>  #else
>   if (num_threads)
>   warning(_("no threads support, ignoring --threads"));
> --
> WARNING: ThreadSanitizer: data race (pid=27207)
>   Write of size 8 at 0x7d4cde40 by main thread:
> #0 link_alt_odb_entry sha1_file.c:361
> #1 link_alt_odb_entries.lto_priv.1377 sha1_file.c:422
> #2 add_to_alternates_memory sha1_file.c:510
> #3 grep_submodule builtin/grep.c:434
> #4 grep_cache builtin/grep.c:508
> #5 grep_submodule builtin/grep.c:462
> #6 grep_cache builtin/grep.c:508
> #7 cmd_grep builtin/grep.c:1092
> #8 run_builtin git.c:346
> #9 handle_builtin.lto_priv.1929 git.c:554
> #10 run_argv git.c:606
> #11 cmd_main git.c:683
> #12 main common-main.c:43
>
>   Previous read of size 8 at 0x7d4cde40 by thread T4 (mutexes: write
> M6):
> #0 prepare_packed_git.part.9.lto_priv.953 packfile.c:883
> #1 prepare_packed_git packfile.c:289
> #2 find_pack_entry packfile.c:1836
> #3 sha1_object_info_extended sha1_file.c:1179
> #4

Re: [PATCH v4 3/3] submodule: simplify decision tree whether to or not to fetch

2017-10-19 Thread Brandon Williams

On 10/19, Heiko Voigt wrote:
> On Thu, Oct 19, 2017 at 09:36:47AM +0900, Junio C Hamano wrote:
> > Brandon Williams  writes:
> > 
> > > On 10/16, Heiko Voigt wrote:
> > >> To make extending this logic later easier.
> > >
> > > This makes things so much clearer, thanks!
> > 
> > I agree that it is clear to see what the code after the patch does,
> > but the code before the patch is so convoluted to follow that it is
> > a bit hard to see if the code before and after are doing the same
> > thing, though ;-)
> 
> That is why I would appreciate some extra pairs of eyes on this :) I
> tried to be as careful as possible when refactoring this, but since it
> is quite convoluted something might have slipped through. The testsuite
> does not show anything, but there might be corner cases that are not
> tested I guess.
> 
> Will hopefully have time to look into the comments to the main patch of
> this series tomorrow. Did not get around to properly do that yet.
> 
> Cheers Heiko
> 
> > >> Signed-off-by: Heiko Voigt 
> > >> ---
> > >>  submodule.c | 74 
> > >> ++---
> > >>  1 file changed, 37 insertions(+), 37 deletions(-)
> > >> 
> > >> diff --git a/submodule.c b/submodule.c
> > >> index 71d1773e2e..82d206eb65 100644
> > >> --- a/submodule.c
> > >> +++ b/submodule.c
> > >> @@ -1187,6 +1187,31 @@ struct submodule_parallel_fetch {
> > >>  };
> > >>  #define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0}
> > >>  
> > >> +static int get_fetch_recurse_config(const struct submodule *submodule,
> > >> +struct submodule_parallel_fetch 
> > >> *spf)
> > >> +{
> > >> +if (spf->command_line_option != RECURSE_SUBMODULES_DEFAULT)
> > >> +return spf->command_line_option;
> > >> +
> > >> +if (submodule) {
> > >> +char *key;
> > >> +const char *value;
> > >> +
> > >> +int fetch_recurse = submodule->fetch_recurse;
> > >> +key = xstrfmt("submodule.%s.fetchRecurseSubmodules", 
> > >> submodule->name);
> > >> +if (!repo_config_get_string_const(the_repository, key, 
> > >> )) {
> > >> +fetch_recurse = 
> > >> parse_fetch_recurse_submodules_arg(key, value);
> > >> +}
> > >> +free(key);
> > >> +
> > >> +if (fetch_recurse != RECURSE_SUBMODULES_NONE)
> > >> +/* local config overrules everything except 
> > >> commandline */
> > >> +return fetch_recurse;
> > >> +}
> > >> +
> > >> +return spf->default_option;
> > >> +}
> > >> +
> > >>  static int get_next_submodule(struct child_process *cp,
> > >>struct strbuf *err, void *data, void 
> > >> **task_cb)
> > >>  {
> > >> @@ -1214,46 +1239,21 @@ static int get_next_submodule(struct 
> > >> child_process *cp,
> > >>  }
> > >>  }
> > >>  
> > >> -default_argv = "yes";
> > >> -if (spf->command_line_option == 
> > >> RECURSE_SUBMODULES_DEFAULT) {
> > >> -int fetch_recurse = RECURSE_SUBMODULES_NONE;
> > >> -
> > >> -if (submodule) {
> > >> -char *key;
> > >> -const char *value;
> > >> -
> > >> -fetch_recurse = 
> > >> submodule->fetch_recurse;
> > >> -key = 
> > >> xstrfmt("submodule.%s.fetchRecurseSubmodules", submodule->name);
> > >> -if 
> > >> (!repo_config_get_string_const(the_repository, key, )) {
> > >> -fetch_recurse = 
> > >> parse_fetch_recurse_submodules_arg(key, value);
> > >> -}
> > >> -free(key);
> > >> -}
> > >> -
> > >> -if (fetch_recurse != RECURSE_SUBMODULES_NONE) {
> > >> -if (fetch_recurse == 
> > >> RECURSE_SUBMODULES_OFF)
> > >> -continue;
> > >> -if (fetch_recurse == 
> > >> RECURSE_SUBMODULES_ON_DEMAND) {
> > >> -if 
> > >> (!unsorted_string_list_lookup(_submodule_names,
> > >> -
> > >>  submodule->name))
> > >> -continue;
> > >> -default_argv = "on-demand";
> > >> -}
> > >> -} else {
> > >> -if (spf->default_option == 
> > >> RECURSE_SUBMODULES_OFF)
> > >> -continue;
> > >> -if

[PATCH/RFC] grep: turn off threading when recursing submodules

2017-10-19 Thread Martin Ågren

With --recurse-submodules, we use the machinery for alternate object
databases and pretend that the repository is an alternate ODB. This has
some drawbacks since the list of alternates is global and will grow as
we proceed. Still, that's a problem with performance, not correctness.
The other immediately available option is to spawn an external process.
We actually used to do this until f9ee2fcdf (grep: recurse in-process
using 'struct repository', 2017-08-02).

So, as we encounter submodules during our recursing, we add them to the
list of ODBs. But with threading, our changes to the list are not
protected against races. Indeed, ThreadSanitizer reports a problem in
this area with t7814.

The race which ThreadSanitizer detects is that `grep_submodule()` calls
`add_to_alternates_memory()` around the same time that
`grep_source_load_oid()` ends up calling `prepare_packed_git()`. A
similar race might occur if two threads encounter a submodule each at
the same time and add them to the list of ODBs. This might end up losing
one of them and could give incorrect results.

Turn off threading when we're recursing submodules to avoid this race.
Long term, a better approach will be to address the existing NEEDSWORK
in builtin/grep.c to introduce per-repo object stores. Then we should be
able to revert this commit.

Alternatively, we could equip the list with a mutex (or maybe do some
lock-less cleverness), but it seems a bit sad considering it shouldn't
really be needed: the list of ODBs would normally be fully populated
before we start using it.

Another approach would be to first recurse the submodules and collect
the ODBs, then recurse again to do the actual grepping. That would be a
more involved change. Or, we could revert f9ee2fcdf. That would hurt
those who do not use threading.

Signed-off-by: Martin Ågren 
---
This is a simple, stupid solution. I'm posting this in patch-form so
that I can do one better than just mailing about the race and waving my
hands. Maybe threading is common enough that reverting could be a better
approach. Or implementing some (optional) locking...

Output from ThreadSanitizer below the patch.

 builtin/grep.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 2d65f27d0..29f79104d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1023,6 +1023,15 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
die(_("invalid number of threads specified (%d)"), num_threads);
if (num_threads == 1)
num_threads = 0;
+   /*
+* NEEDSWORK: When we recurse through submodules we misuse the
+* alt-odb-mechanism (alternate object databases). Doing so may hit
+* data-races if we are threading. This can be removed once the object
+* store is no longer global and instead is a member of the repository
+* object.
+*/
+   if (recurse_submodules)
+   num_threads = 0;
 #else
if (num_threads)
warning(_("no threads support, ignoring --threads"));
-- 
WARNING: ThreadSanitizer: data race (pid=27207)
  Write of size 8 at 0x7d4cde40 by main thread:
#0 link_alt_odb_entry sha1_file.c:361
#1 link_alt_odb_entries.lto_priv.1377 sha1_file.c:422
#2 add_to_alternates_memory sha1_file.c:510
#3 grep_submodule builtin/grep.c:434
#4 grep_cache builtin/grep.c:508
#5 grep_submodule builtin/grep.c:462
#6 grep_cache builtin/grep.c:508
#7 cmd_grep builtin/grep.c:1092
#8 run_builtin git.c:346
#9 handle_builtin.lto_priv.1929 git.c:554
#10 run_argv git.c:606
#11 cmd_main git.c:683
#12 main common-main.c:43

  Previous read of size 8 at 0x7d4cde40 by thread T4 (mutexes: write
M6):
#0 prepare_packed_git.part.9.lto_priv.953 packfile.c:883
#1 prepare_packed_git packfile.c:289
#2 find_pack_entry packfile.c:1836
#3 sha1_object_info_extended sha1_file.c:1179
#4 read_object.lto_priv.900 packfile.c:1453
#5 read_sha1_file_extended.constprop.779 sha1_file.c:1279
#6 read_sha1_file cache.h:1162
#7 grep_source_load_oid grep.c:1966
#8 grep_source_load grep.c:2019
#9 grep_source_is_binary grep.c:2045
#10 grep_source_1 grep.c:1689
#11 grep_source grep.c:1897
#12 run builtin/grep.c:183
#13  

  Location is heap block of size 408 at 0x7d4cde40 allocated by main
thread:
#0 calloc 
#1 xcalloc.constprop.820 wrapper.c:160
#2 alloc_alt_odb sha1_file.c:449
#3 link_alt_odb_entry sha1_file.c:358
#4 link_alt_odb_entries.lto_priv.1377 sha1_file.c:422
#5 add_to_alternates_memory sha1_file.c:510
#6 grep_submodule builtin/grep.c:434
#7 grep_cache builtin/grep.c:508
#8 cmd_grep builtin/grep.c:1092
#9 run_builtin git.c:346
#10 handle_builtin.lto_priv.1929 git.c:554
#11 run_argv git.c:606
#12 cmd_main git.c:683
#13 main common-main.c:43

  Mutex M6 (0x00959340) created at:
#0

This is Mr.Frankly Ejero. am writing to inform you that am back from holidays,

2017-10-19 Thread Mr.Frankly Ejero

Good Day

This is Mr.Frankly Ejero. am writing to inform you that am back from
holidays, and regarding the last meeting it came out good, so the
bank is read to transfer the fund into your account.I am only
waiting for your directions(ejerofrankly...@gmail.com)


Best Regard
Mr.Frankly Ejero

[PATCH v2 2/2] mark git stash push deprecated in the man page

2017-10-19 Thread Thomas Gummerer

'git stash push' fixes a historical wart in the interface of 'git stash
save'.  As 'git stash push' has all functionality of 'git stash save',
with a nicer, more consistent user interface deprecate 'git stash
save'.  To do this, remove it from the synopsis of the man page, and
move it to a separate section, stating that it is deprecated.

Helped-by: Robert P. J. Day 
Signed-off-by: Thomas Gummerer 
---
 Documentation/git-stash.txt | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index 53b2e60aeb..89b6a0e672 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -13,8 +13,6 @@ SYNOPSIS
 'git stash' drop [-q|--quiet] []
 'git stash' ( pop | apply ) [--index] [-q|--quiet] []
 'git stash' branch  []
-'git stash' save [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
-[-u|--include-untracked] [-a|--all] []
 'git stash' [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
 [-u|--include-untracked] [-a|--all] [-m|--message ]]
 [--] [...]]
@@ -48,7 +46,6 @@ stash index (e.g. the integer `n` is equivalent to 
`stash@{n}`).
 OPTIONS
 ---
 
-save [-p|--patch] [-k|--[no-]keep-index] [-u|--include-untracked] [-a|--all] 
[-q|--quiet] []::
 push [-p|--patch] [-k|--[no-]keep-index] [-u|--include-untracked] [-a|--all] 
[-q|--quiet] [-m|--message ] [--] [...]::
 
Save your local modifications to a new 'stash entry' and roll them
@@ -87,6 +84,10 @@ linkgit:git-add[1] to learn how to operate the `--patch` 
mode.
 The `--patch` option implies `--keep-index`.  You can use
 `--no-keep-index` to override this.
 
+save [-p|--patch] [-k|--[no-]keep-index] [-u|--include-untracked] [-a|--all] 
[-q|--quiet] []::
+
+   This option is deprecated in favour of 'git stash push'.
+
 list []::
 
List the stash entries that you currently have.  Each 'stash entry' is
-- 
2.15.0.rc1.65.g660fb3dfa8

[PATCH v2 1/2] replace git stash save with git stash push in the documentation

2017-10-19 Thread Thomas Gummerer

git stash push is the newer interface for creating a stash.  While we
are still keeping git stash save around for the time being, it's better
to point new users of git stash to the more modern (and more feature
rich) interface, instead of teaching them the older version that we
might want to phase out in the future.

Signed-off-by: Thomas Gummerer 
---
 Documentation/git-stash.txt| 12 ++--
 Documentation/gitworkflows.txt |  2 +-
 Documentation/user-manual.txt  |  2 +-
 git-stash.sh   | 10 +-
 4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index 00f95fee1f..53b2e60aeb 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -33,7 +33,7 @@ and reverts the working directory to match the `HEAD` commit.
 The modifications stashed away by this command can be listed with
 `git stash list`, inspected with `git stash show`, and restored
 (potentially on top of a different commit) with `git stash apply`.
-Calling `git stash` without any arguments is equivalent to `git stash save`.
+Calling `git stash` without any arguments is equivalent to `git stash push`.
 A stash is by default listed as "WIP on 'branchname' ...", but
 you can give a more descriptive message on the command line when
 you create one.
@@ -118,7 +118,7 @@ pop [--index] [-q|--quiet] []::
 
Remove a single stashed state from the stash list and apply it
on top of the current working tree state, i.e., do the inverse
-   operation of `git stash save`. The working directory must
+   operation of `git stash push`. The working directory must
match the index.
 +
 Applying the state can fail with conflicts; in this case, it is not
@@ -137,7 +137,7 @@ apply [--index] [-q|--quiet] []::
 
Like `pop`, but do not remove the state from the stash list. Unlike 
`pop`,
`` may be any commit that looks like a commit created by
-   `stash save` or `stash create`.
+   `stash push` or `stash create`.
 
 branch  []::
 
@@ -148,7 +148,7 @@ branch  []::
`stash@{}`, it then drops the ``. When no ``
is given, applies the latest one.
 +
-This is useful if the branch on which you ran `git stash save` has
+This is useful if the branch on which you ran `git stash push` has
 changed enough that `git stash apply` fails due to conflicts. Since
 the stash entry is applied on top of the commit that was HEAD at the
 time `git stash` was run, it restores the originally stashed state
@@ -255,14 +255,14 @@ $ git stash pop
 
 Testing partial commits::
 
-You can use `git stash save --keep-index` when you want to make two or
+You can use `git stash push --keep-index` when you want to make two or
 more commits out of the changes in the work tree, and you want to test
 each change before committing:
 +
 
 # ... hack hack hack ...
 $ git add --patch foo# add just first part to the index
-$ git stash save --keep-index# save all other changes to the stash
+$ git stash push --keep-index# save all other changes to the stash
 $ edit/build/test first part
 $ git commit -m 'First part' # commit fully tested change
 $ git stash pop  # prepare to work on all other changes
diff --git a/Documentation/gitworkflows.txt b/Documentation/gitworkflows.txt
index 177610e44e..02569d0614 100644
--- a/Documentation/gitworkflows.txt
+++ b/Documentation/gitworkflows.txt
@@ -40,7 +40,7 @@ beginning. It is always easier to squash a few commits 
together than
 to split one big commit into several.  Don't be afraid of making too
 small or imperfect steps along the way. You can always go back later
 and edit the commits with `git rebase --interactive` before you
-publish them.  You can use `git stash save --keep-index` to run the
+publish them.  You can use `git stash push --keep-index` to run the
 test suite independent of other uncommitted changes; see the EXAMPLES
 section of linkgit:git-stash[1].
 
diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index b4d88af133..3a03e63eb0 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -1556,7 +1556,7 @@ so on a different branch and then coming back), unstash 
the
 work-in-progress changes.
 
 
-$ git stash save "work in progress for foo feature"
+$ git stash push -m "work in progress for foo feature"
 
 
 This command will save your changes away to the `stash`, and
diff --git a/git-stash.sh b/git-stash.sh
index 8b2ce9afda..16919277ba 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -267,11 +267,11 @@ push_stash () {
# translation of "error: " takes in your language. E.g. 
in
# English this is:
#
-   #$ git stash

Re: [Alt. PATCH] ls-remote: deprecate -h as short for --heads

2017-10-19 Thread René Scharfe

Am 18.10.2017 um 01:22 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> Stop advertising -h as the short equivalent of --heads, because it's
>> used for showing a short help text for almost all other git commands.
>> Since the ba5f28bf79 (ls-remote: use parse-options api) it has only
>> been working when used together with other parameters anyway.
>>
>> Signed-off-by: Rene Scharfe 
>> ---
>> That would be step on the way towards more consistent command line
>> switches, in the same vein as d69155119 (t0012: test "-h" with
>> builtins).
> 
> Sorry, but I am not sure whom this and the other approach are trying
> to help.
> 
> The rule we have currently seems to be that "git cmd -h" (no other
> arguments) consistently gives a short help, and if a subcommand
> supports an option "-h" specific to it that is not about giving a
> short help, the caller needs to be aware of it to invoke the option,
> by making sure that there is some other arguments on the command
> line if "-h" form of that subcommand specific option is used, by
> doing e.g. "git ls-remote -h origin" or "git ls-remote --head".
> 
> I can see that this "alternative" approach makes it less convenent
> for folks who have followed that rule by hiding "-h" (with the
> intention of deprecating and possibly removing it in the future) and
> encouraging (and later foring) "--head" to be used instead.
> 
> The other approach burdens new users by changing the rule to "some
> subcommands that have their own '-h' option cannot be asked for a
> brief usage with 'git cmd -h'".  But the thing is, these new users
> who do not know which subcommands do have their own '-h' and which
> ones do not are the ones that benefit most from the consistent "'git
> cmd -h' with no other argument gets short help" rule.

They would help by aligning documentation and code, at a price.  But
of course there is another way to do that -- I just didn't see it the
other day.  We don't have to take anything away:

-- >8 --
Subject: [PATCH] ls-remote: document behavior of lone parameter -h

Reported-by: Thomas Rikl 
Analyzed-by: Martin Ågren 
Analyzed-by: Junio C Hamano 
Signed-off-by: Rene Scharfe 
---
 Documentation/git-ls-remote.txt | 10 ++
 builtin/ls-remote.c |  5 -
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-ls-remote.txt b/Documentation/git-ls-remote.txt
index 5f2628c8f8..82622e7fbc 100644
--- a/Documentation/git-ls-remote.txt
+++ b/Documentation/git-ls-remote.txt
@@ -29,6 +29,9 @@ OPTIONS
These options are _not_ mutually exclusive; when given
both, references stored in refs/heads and refs/tags are
displayed.
++
+*NOTE*: -h without any other parameters shows a short help text instead
+of any refs.
 
 --refs::
Do not show peeled tags or pseudorefs like HEAD in the output.
@@ -89,6 +92,13 @@ EXAMPLES
f25a265a342aed6041ab0cc484224d9ca54b6f41refs/tags/v0.99.1
c5db5456ae3b0873fc659c19fafdde22313cc441refs/tags/v0.99.2
7ceca275d047c90c0c7d5afb13ab97efdf51bd6erefs/tags/v0.99.3
+   $ git ls-remote -hh
+   From https://git.kernel.org/pub/scm/git/git.git/
+   4c2224e83951a685185bb8c1f83b28e22fee0e27refs/heads/maint
+   5fe978a5381f1fbad26a80e682ddd2a401966740refs/heads/master
+   76aedb4517c834be2dc89efb5f9d15908e324422refs/heads/next
+   c781a84b5204fb294c9ccc79f8b3baceeb32c061refs/heads/pu
+   5d38b589ccc7a8355c62f1577865df5b8216c00drefs/heads/todo
 
 GIT
 ---
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c4be98ab9e..0438dfec05 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -56,7 +56,10 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
   N_("path of git-upload-pack on the remote host"),
   PARSE_OPT_HIDDEN },
OPT_BIT('t', "tags", , N_("limit to tags"), REF_TAGS),
-   OPT_BIT('h', "heads", , N_("limit to heads"), REF_HEADS),
+   OPT_BIT(0, "heads", , N_("limit to heads"), REF_HEADS),
+   OPT_BIT('h', NULL, ,
+   N_("if only parameter: show this help text, otherwise 
limit to heads"),
+   REF_HEADS),
OPT_BIT(0, "refs", , N_("do not show peeled tags"), 
REF_NORMAL),
OPT_BOOL(0, "get-url", _url,
 N_("take url..insteadOf into account")),
-- 
2.14.2

[PATCH 2/2] fetch, push: keep separate lists of submodules and gitlinks

2017-10-19 Thread Stefan Beller

Currently when fetching we collect the names of submodules to be fetched
in a list. As we also want to support fetching 'gitlinks, that happen to
have a repo checked out at the right place', we'll just pretend that these
are submodules. We do that by assuming their path is their name. This in
turn can yield collisions between the name-namespace and the
path-namespace. (See the previous test for a demonstration.)

This patch rewrites the code such that we treat the 'real submodule' case
differently from the 'gitlink, but ok' case. This introduces a bit
of code duplication, but gets rid of the confusing mapping between names
and paths.

The test is incomplete as the long term vision is not achieved yet.
(which would be fetching both the renamed submodule as well as
the gitlink thing, putting them in place via e.g. git-pull)

Signed-off-by: Stefan Beller 
---

 Heiko,
 Junio,

 I assumed the code would ease up a lot more, but now I am undecided if
 I want to keep arguing as the code is not stopping to be ugly. :)
 
 The idea is to treat submodule and gitlinks separately, with submodules
 supporting renames, and gitlinks as a historic artefact.
 
 Sorry for the noise about code ugliness.
 
 Thanks,
 Stefan
 

 submodule.c | 168 +---
 t/t5526-fetch-submodules.sh |   1 -
 2 files changed, 81 insertions(+), 88 deletions(-)

diff --git a/submodule.c b/submodule.c
index 82d206eb65..115df82f32 100644
--- a/submodule.c
+++ b/submodule.c
@@ -22,6 +22,7 @@
 
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
 static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP;
+static struct string_list changed_gitlink_paths = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
 static struct oid_array ref_tips_before_fetch;
 static struct oid_array ref_tips_after_fetch;
@@ -674,11 +675,11 @@ const struct submodule *submodule_from_ce(const struct 
cache_entry *ce)
 }
 
 static struct oid_array *submodule_commits(struct string_list *submodules,
-  const char *name)
+  const char *key)
 {
struct string_list_item *item;
 
-   item = string_list_insert(submodules, name);
+   item = string_list_insert(submodules, key);
if (item->util)
return (struct oid_array *) item->util;
 
@@ -688,33 +689,20 @@ static struct oid_array *submodule_commits(struct 
string_list *submodules,
 }
 
 struct collect_changed_submodules_cb_data {
-   struct string_list *changed;
-   const struct object_id *commit_oid;
-};
+   /* used for submodules, supports renames: */
+   struct string_list *changed_by_name;
 
-/*
- * this would normally be two functions: default_name_from_path() and
- * path_from_default_name(). Since the default name is the same as
- * the submodule path we can get away with just one function which only
- * checks whether there is a submodule in the working directory at that
- * location.
- */
-static const char *default_name_or_path(const char *path_or_name)
-{
-   int error_code;
+   /* support old 'gitlink' with repo in-place, no rename support*/
+   struct string_list *changed_by_path;
 
-   if (!is_submodule_populated_gently(path_or_name, _code))
-   return NULL;
-
-   return path_or_name;
-}
+   const struct object_id *commit_oid;
+};
 
 static void collect_changed_submodules_cb(struct diff_queue_struct *q,
  struct diff_options *options,
  void *data)
 {
struct collect_changed_submodules_cb_data *me = data;
-   struct string_list *changed = me->changed;
const struct object_id *commit_oid = me->commit_oid;
int i;
 
@@ -722,42 +710,35 @@ static void collect_changed_submodules_cb(struct 
diff_queue_struct *q,
struct diff_filepair *p = q->queue[i];
struct oid_array *commits;
const struct submodule *submodule;
-   const char *name;
 
if (!S_ISGITLINK(p->two->mode))
continue;
 
submodule = submodule_from_path(commit_oid, p->two->path);
-   if (submodule)
-   name = submodule->name;
-   else {
-   name = default_name_or_path(p->two->path);
-   /* make sure name does not collide with existing one */
-   submodule = submodule_from_name(commit_oid, name);
-   if (submodule) {
-   warning("Submodule in commit %s at path: "
-   "'%s' collides with a submodule named "
-   "the same. Skipping it.",
-   oid_to_hex(commit_oid), name);
-   name = NULL;
-

[PATCH 1/2] t5526: check for name/path collision in submodule fetch

2017-10-19 Thread Stefan Beller

Signed-off-by: Stefan Beller 
---

This is just to test the corner case we're discussing.
Applies on top of origin/hv/fetch-moved-submodules-on-demand.


 t/t5526-fetch-submodules.sh | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index a552ad4ead..c82d519e06 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -571,6 +571,7 @@ test_expect_success 'fetching submodule into a broken 
repository' '
 '
 
 test_expect_success "fetch new commits when submodule got renamed" '
+   test_when_finished "rm -rf downstream_rename" &&
git clone . downstream_rename &&
(
cd downstream_rename &&
@@ -605,4 +606,45 @@ test_expect_success "fetch new commits when submodule got 
renamed" '
test_cmp expect actual
 '
 
+test_expect_success "warn on submodule name/path clash, but new commits 
fetched in renamed" '
+   test_when_finished "rm -rf downstream_rename" &&
+   git clone . downstream_rename &&
+   (
+   cd downstream_rename &&
+   git submodule update --init &&
+# NEEDSWORK: we omitted --recursive for the submodule update here since
+# that does not work. See test 7001 for mv "moving nested submodules"
+# for details. Once that is fixed we should add the --recursive option
+# here.
+   git checkout -b rename &&
+   git mv submodule submodule_renamed &&
+   (
+   cd submodule_renamed &&
+   git checkout -b rename_sub &&
+   echo a >a &&
+   git add a &&
+   git commit -ma &&
+   git push origin rename_sub &&
+   git rev-parse HEAD >../../expect
+   ) &&
+   git add submodule_renamed &&
+   git commit -m "update renamed submodule" &&
+   # produce collision, note that we use no submodule command
+   git clone ../submodule submodule &&
+   git add submodule &&
+   git commit -m "have new submodule at old path " &&
+   git push origin rename
+   ) &&
+   (
+   cd downstream &&
+   git fetch --recurse-submodules=on-demand 2>err &&
+   grep "collides with a submodule named" err &&
+   (
+   cd submodule &&
+   git rev-parse origin/rename_sub >../../actual
+   )
+   ) &&
+   test_cmp expect actual
+'
+
 test_done
-- 
2.14.0.rc0.3.g6c2e499285

Re: [PATCH 2/1] mention git stash push first in the man page

2017-10-19 Thread Thomas Gummerer

On 10/17, Jeff King wrote:
> On Tue, Oct 17, 2017 at 10:45:15PM +0100, Thomas Gummerer wrote:
> 
> > > Seems reasonable, though if we are deprecating "save" should we demote
> > > it from being in the synopsis entirely?
> > 
> > I saw that as a next step, with the "official" deprecation of "save".
> > I thought we might want to advertise "push" a bit more before actually
> > officially deprecating "save" and sending the deprecation notice out
> > in the release notes.
> 
> Right, my thinking was that it would still be documented in the manpage,
> just lower down. And that would probably say something like "save is a
> historical synonym for push, except that it differs in these ways...".
> 
> > OTOH, dropping it from the synopsis in the man page probably wouldn't
> > cause much issue, as "push" directly replaces it, and is easily
> > visible.  Not sure how slow we want to take the deprecation?
> 
> I don't think there's any reason to go slow in marking something as
> deprecated. It's the part where we follow up and remove or change the
> feature that must take a while.

Makes sense, let me drop it from the synopsis then.

Thanks!

> -Peff

Re: [PATCH 2/4] remote: handle broken symrefs

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 01:47:30PM -0400, Jeff King wrote:

> This is hard to trigger in practice, since this function is
> used as a callback to for_each_ref(), which will skip broken
> refs in the first place (so it would have to be broken
> racily, or for us to see a transient filesystem error).
> 
> If we see such a racy broken outcome let's treat it as "not
> a symref". This is exactly the same thing that would happen
> in the non-racy case (our function would not be called at
> all, as for_each_ref would skip the broken symref).

The fact that we have to re-resolve the ref here to find the symref
points to a short-coming in the for_each_ref() interface. It resolved
the ref already to get us the oid, so it should (or at least could) know
the symref details already. But it doesn't record them or make them
available to callers.

Ditto for patch 3. It doesn't use for_each_ref(), but I suspect it could
be recording the value of HEAD more carefully from the prior lookup,
avoiding the re-resolution completely.

Refactoring for_each_ref() is probably a bit big for a #leftoverbits,
but looking into the case in patch 3 might not be.

-Peff

[PATCH 4/4] worktree: handle broken symrefs in find_shared_symref()

2017-10-19 Thread Jeff King

The refs_resolve_ref_unsafe() function may return NULL even
with a REF_ISSYMREF flag if a symref points to a broken ref.
As a result, it's possible for find_shared_symref() to
segfault when it passes NULL to strcmp().

This is hard to trigger for most code paths. We typically
pass HEAD to the function as the symref to resolve, and
programs like "git branch" will bail much earlier if HEAD
isn't valid.

I did manage to trigger it through one very obscure
sequence:

  # You have multiple notes refs which conflict.
  git notes add -m base
  git notes --ref refs/notes/foo add -m foo

  # There's left-over cruft in NOTES_MERGE_REF that
  # makes it a broken symref (in this case we point
  # to a syntactically invalid ref).
  echo "ref: refs/heads/master.lock" >.git/NOTES_MERGE_REF

  # You try to merge the notes. We read the broken value in
  # order to complain that another notes-merge is
  # in-progress, but we segfault in find_shared_symref().
  git notes merge refs/notes/foo

This is obviously silly and almost certainly impossible to
trigger accidentally, but it does show that the bug is
triggerable from at least one code path. In addition, it
would trigger if we saw a transient filesystem error when
resolving the pointed-to ref.

We can fix this by treating NULL the same as a non-matching
symref. Arguably we'd prefer to tell know if a symref points
to "refs/heads/foo", but "refs/heads/foo" is broken. But
refs_resolve_ref_unsafe() isn't capable of giving us that
information, so this is the best we can do.

Signed-off-by: Jeff King 
---
 worktree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/worktree.c b/worktree.c
index 70015629dc..f8c40f2f5f 100644
--- a/worktree.c
+++ b/worktree.c
@@ -327,7 +327,8 @@ const struct worktree *find_shared_symref(const char 
*symref,
refs = get_worktree_ref_store(wt);
symref_target = refs_resolve_ref_unsafe(refs, symref, 0,
NULL, );
-   if ((flags & REF_ISSYMREF) && !strcmp(symref_target, target)) {
+   if ((flags & REF_ISSYMREF) &&
+   symref_target && !strcmp(symref_target, target)) {
existing = wt;
break;
}
-- 
2.15.0.rc1.560.g5f0609e481

[PATCH] builtin/push.c: add push.pushOption config

2017-10-19 Thread Marius Paliga

Push options need to be given explicitly, via the command line as "git
push --push-option ".  Add the config option push.pushOption,
which is a multi-valued option, containing push options that are sent
by default.

When push options are set in the lower-priority configulation file
(e.g. /etc/gitconfig, or $HOME/.gitconfig), they can be unset later in
the more specific repository config by the empty string.

Add tests and update documentation as well.

Signed-off-by: Marius Paliga 
---
 Documentation/git-push.txt |  3 ++
 builtin/push.c | 26 +---
 t/t5545-push-options.sh| 77 ++
 3 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 3e76e99f3..aa78057dc 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -161,6 +161,9 @@ already exists on the remote side.
Transmit the given string to the server, which passes them to
the pre-receive as well as the post-receive hook. The given string
must not contain a NUL or LF character.
+   When no `--push-option ` is given from the command
+   line, the values of configuration variable `push.pushOption`
+   are used instead.
 
 --receive-pack=::
 --exec=::
diff --git a/builtin/push.c b/builtin/push.c
index 2ac810422..1c28427d8 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -32,6 +32,8 @@ static const char **refspec;
 static int refspec_nr;
 static int refspec_alloc;
 
+static struct string_list push_options_config = STRING_LIST_INIT_DUP;
+
 static void add_refspec(const char *ref)
 {
refspec_nr++;
@@ -503,6 +505,15 @@ static int git_push_config(const char *k, const char *v, 
void *cb)
int val = git_config_bool(k, v) ?
RECURSE_SUBMODULES_ON_DEMAND : RECURSE_SUBMODULES_OFF;
recurse_submodules = val;
+   } else if (!strcmp(k, "push.pushoption")) {
+   if (!v)
+   return config_error_nonbool(k);
+   else
+   if (!*v)
+   string_list_clear(_options_config, 0);
+   else
+   string_list_append(_options_config, v);
+   return 0;
}
 
return git_default_config(k, v, NULL);
@@ -515,7 +526,8 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
int push_cert = -1;
int rc;
const char *repo = NULL;/* default repository */
-   struct string_list push_options = STRING_LIST_INIT_DUP;
+   struct string_list push_options_cmdline = STRING_LIST_INIT_DUP;
+   struct string_list *push_options;
const struct string_list_item *item;
 
struct option options[] = {
@@ -551,7 +563,7 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
  0, "signed", _cert, "yes|no|if-asked", N_("GPG sign the 
push"),
  PARSE_OPT_OPTARG, option_parse_push_signed },
OPT_BIT(0, "atomic", , N_("request atomic transaction on 
remote side"), TRANSPORT_PUSH_ATOMIC),
-   OPT_STRING_LIST('o', "push-option", _options, 
N_("server-specific"), N_("option to transmit")),
+   OPT_STRING_LIST('o', "push-option", _options_cmdline, 
N_("server-specific"), N_("option to transmit")),
OPT_SET_INT('4', "ipv4", , N_("use IPv4 addresses only"),
TRANSPORT_FAMILY_IPV4),
OPT_SET_INT('6', "ipv6", , N_("use IPv6 addresses only"),
@@ -562,6 +574,9 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
packet_trace_identity("push");
git_config(git_push_config, );
argc = parse_options(argc, argv, prefix, options, push_usage, 0);
+   push_options = (push_options_cmdline.nr
+   ? _options_cmdline
+   : _options_config);
set_push_cert_flags(, push_cert);
 
if (deleterefs && (tags || (flags & (TRANSPORT_PUSH_ALL | 
TRANSPORT_PUSH_MIRROR
@@ -584,12 +599,13 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
set_refspecs(argv + 1, argc - 1, repo);
}
 
-   for_each_string_list_item(item, _options)
+   for_each_string_list_item(item, push_options)
if (strchr(item->string, '\n'))
die(_("push options must not have new line 
characters"));
 
-   rc = do_push(repo, flags, _options);
-   string_list_clear(_options, 0);
+   rc = do_push(repo, flags, push_options);
+   string_list_clear(_options_cmdline, 0);
+   string_list_clear(_options_config, 0);
if (rc == -1)
usage_with_options(push_usage, options);
else
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index 90a4b0d2f..463783789 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@

[PATCH 3/4] log: handle broken HEAD in decoration check

2017-10-19 Thread Jeff King

The resolve_ref_unsafe() function may return NULL even with
a REF_ISSYMREF flag if a symref points to a broken ref. As a
result, it's possible for the decoration code's "is this
branch the current HEAD" check to segfault when it passes
the NULL to starts_with().

This is unlikely in practice, since we can only reach this
code if we already resolved HEAD to a matching sha1 earlier.
But it's possible if HEAD racily becomes broken, or if
there's a transient filesystem error.

We can fix this by returning early in the broken case, since
NULL could not possibly match any of our branch names.

Signed-off-by: Jeff King 
---
 log-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/log-tree.c b/log-tree.c
index cea056234d..580b3a98a0 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -198,7 +198,7 @@ static const struct name_decoration 
*current_pointed_by_HEAD(const struct name_d
 
/* Now resolve and find the matching current branch */
branch_name = resolve_ref_unsafe("HEAD", 0, NULL, _flags);
-   if (!(rru_flags & REF_ISSYMREF))
+   if (!branch_name || !(rru_flags & REF_ISSYMREF))
return NULL;
 
if (!starts_with(branch_name, "refs/"))
-- 
2.15.0.rc1.560.g5f0609e481

[PATCH 2/4] remote: handle broken symrefs

2017-10-19 Thread Jeff King

It's possible for resolve_ref_unsafe() to return NULL with a
REF_ISSYMREF flag if a symref points to a broken ref.  In
this case, the read_remote_branches() function will segfault
passing the name to xstrdup().

This is hard to trigger in practice, since this function is
used as a callback to for_each_ref(), which will skip broken
refs in the first place (so it would have to be broken
racily, or for us to see a transient filesystem error).

If we see such a racy broken outcome let's treat it as "not
a symref". This is exactly the same thing that would happen
in the non-racy case (our function would not be called at
all, as for_each_ref would skip the broken symref).

Signed-off-by: Jeff King 
---
 builtin/remote.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/remote.c b/builtin/remote.c
index 4f5cac96b0..bc89623695 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -565,7 +565,7 @@ static int read_remote_branches(const char *refname,
item = string_list_append(rename->remote_branches, 
xstrdup(refname));
symref = resolve_ref_unsafe(refname, RESOLVE_REF_READING,
NULL, );
-   if (flag & REF_ISSYMREF)
+   if (symref && (flag & REF_ISSYMREF))
item->util = xstrdup(symref);
else
item->util = NULL;
-- 
2.15.0.rc1.560.g5f0609e481

[PATCH 1/4] test-ref-store: avoid passing NULL to printf

2017-10-19 Thread Jeff King

It's possible for resolve_ref_unsafe() to return NULL (e.g.,
if we are reading and the ref does not exist), in which case
we'll pass NULL to printf. On glibc systems this produces
"(null)", but on others it may segfault.

The tests don't expect any such case, but if we ever did
trigger this, we would prefer to cleanly fail the test with
unexpected input rather than segfault. Let's manually
replace NULL with "(null)". The exact value doesn't matter,
as it won't match any possible ref the caller could expect
(and anyway, the exit code of the program will tell whether
"ref" is valid or not).

Signed-off-by: Jeff King 
---
 t/helper/test-ref-store.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index 05d8c4d8af..6ec2670044 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -135,7 +135,7 @@ static int cmd_resolve_ref(struct ref_store *refs, const 
char **argv)
 
ref = refs_resolve_ref_unsafe(refs, refname, resolve_flags,
  sha1, );
-   printf("%s %s 0x%x\n", sha1_to_hex(sha1), ref, flags);
+   printf("%s %s 0x%x\n", sha1_to_hex(sha1), ref ? ref : "(null)", flags);
return ref ? 0 : 1;
 }
 
-- 
2.15.0.rc1.560.g5f0609e481

Re: [PATCH v2] commit: check result of resolve_ref_unsafe

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 12:36:50PM +0300, Andrey Okoshkin wrote:

> Add check of the resolved HEAD reference while printing of a commit summary.
> resolve_ref_unsafe() may return NULL pointer if underlying calls of lstat()
> or
> open() fail in files_read_raw_ref().
> Such situation can be caused by race: file becomes inaccessible to this
> moment.
> 
> Signed-off-by: Andrey Okoshkin 
> ---
> Thank you for your review.
> 
> Changes since the previous patch:
> * BUG is replaced with die, message;
> * Message is changed.

Thanks, this looks good to me. One other possible minor improvement:

>   head = resolve_ref_unsafe("HEAD", 0, junk_oid.hash, NULL);
> + if (!head)
> + die(_("unable to resolve HEAD after creating commit"));

Should we use die_errno() here to report the value of errno? I think
resolve_ref_unsafe() should set it consistently (even an internal
problem, like an illegally-formatted refname, yields EINVAL).

I grepped the code base looking for other instances of the same problem,
and found four of them. Patches to follow.

Unlike this one, I ended up quietly returning an error in most cases.
The individual commit messages discuss the reasoning for each case, but
I do wonder if we ought to simply die() in each case out of an abundance
of caution (either the repo has a broken symref, or some weird
filesystem error occurred, but either way it may be best not to
continue). I dunno.

These are all independent, so can be applied in any order or combination
with respect to each other and to your patch.

  [1/4]: test-ref-store: avoid passing NULL to printf
  [2/4]: remote: handle broken symrefs
  [3/4]: log: handle broken HEAD in decoration check
  [4/4]: worktree: handle broken symrefs in find_shared_symref()

 builtin/remote.c  | 2 +-
 log-tree.c| 2 +-
 t/helper/test-ref-store.c | 2 +-
 worktree.c| 3 ++-
 4 files changed, 5 insertions(+), 4 deletions(-)

-Peff

1 Unread Message

2017-10-19 Thread Mr James Amin

Good day,

I am Mr James Amin, I am Requesting for your partnership in re-profiling funds, 
Contact me for more details.

Thanks,
Mr James Amin

Re: [PATCH] use filetest pragma to work with ACL

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 09:53:28AM +0200, Guillaume Castagnino wrote:

> > This "use" will unconditionally at compile-time (such as "compile" is
> > for perl, anyway). Which raises a few questions:
> > 
> >   - would we want to use "require" instead to avoid loading when we
> > don't enter this function?
> 
> I was under the impression that as this is a pragma affecting perl
> “compiler”, you have to always use “use”, not require, as the pragma
> initialisation has to be done before code is run.

Yeah, I think you're right.

> I quickly grepped the code, I did not see other calls that could
> benefits from the pragma, but it could be indeed nice to move it at the
> top to avoid future issues with such tests and POSIX ACL.

Makes sense.

> >   - Do all relevant versions of perl ship with filetest? According to
> > Module::Corelist, it first shipped with perl 5.6. In general I
> > think
> > we treat that as a minimum for our perl scripts, though I do
> > notice
> > that the gitweb script says "use 5.008". I'm not sure how
> > realistic
> > that is.
> 
> So the CGI already requires perl 5.8, so it’s fine I think.

Right, I totally forgot about perl's funny version-numbering scheme.

> Attached a cleanup proposal and moving the use at the top.

Thanks, it looks good to me.

For future reference, we usually prefer patches inline, separated by
"-- >8 --" scissors if there's other material (like your reply).

-Peff

[PATCH v3 4/4] status: test --ignored=matching

2017-10-19 Thread Jameson Miller

Add tests around status reporting ignord files that match an exclude
pattern for both --untracked-files=normal and --untracked-files=all.

Signed-off-by: Jameson Miller 
---
 t/t7519-ignored-mode.sh | 183 
 1 file changed, 183 insertions(+)
 create mode 100755 t/t7519-ignored-mode.sh

diff --git a/t/t7519-ignored-mode.sh b/t/t7519-ignored-mode.sh
new file mode 100755
index 00..6be7701d79
--- /dev/null
+++ b/t/t7519-ignored-mode.sh
@@ -0,0 +1,183 @@
+#!/bin/sh
+
+test_description='git status collapse ignored'
+
+. ./test-lib.sh
+
+# commit initial ignore file
+test_expect_success 'setup initial commit and ignore file' '
+   cat >.gitignore <<-\EOF &&
+   *.ign
+   ignored_dir/
+   !*.unignore
+   EOF
+   git add . &&
+   git commit -m "Initial commit"
+'
+
+test_expect_success 'Verify behavior of status on folders with ignored files' '
+   test_when_finished "git clean -fdx" &&
+   cat >expect <<-\EOF &&
+   ? expect
+   ? output
+   ! dir/ignored/ignored_1.ign
+   ! dir/ignored/ignored_2.ign
+   ! ignored/ignored_1.ign
+   ! ignored/ignored_2.ign
+   EOF
+
+   mkdir -p ignored dir/ignored &&
+   touch ignored/ignored_1.ign ignored/ignored_2.ign \
+   dir/ignored/ignored_1.ign dir/ignored/ignored_2.ign &&
+
+   git status --porcelain=v2 --ignored=matching --untracked-files=all 
>output &&
+   test_i18ncmp expect output
+'
+
+test_expect_success 'Verify status behavior on folder with tracked & ignored 
files' '
+   test_when_finished "git clean -fdx && git reset HEAD~1 --hard" &&
+   cat >expect <<-\EOF &&
+   ? expect
+   ? output
+   ! dir/tracked_ignored/ignored_1.ign
+   ! dir/tracked_ignored/ignored_2.ign
+   ! tracked_ignored/ignored_1.ign
+   ! tracked_ignored/ignored_2.ign
+   EOF
+
+   mkdir -p tracked_ignored dir/tracked_ignored &&
+   touch tracked_ignored/tracked_1 tracked_ignored/tracked_2 \
+   tracked_ignored/ignored_1.ign tracked_ignored/ignored_2.ign \
+   dir/tracked_ignored/tracked_1 dir/tracked_ignored/tracked_2 \
+   dir/tracked_ignored/ignored_1.ign 
dir/tracked_ignored/ignored_2.ign &&
+
+   git add tracked_ignored/tracked_1 tracked_ignored/tracked_2 \
+   dir/tracked_ignored/tracked_1 dir/tracked_ignored/tracked_2 &&
+   git commit -m "commit tracked files" &&
+
+   git status --porcelain=v2 --ignored=matching --untracked-files=all 
>output &&
+   test_i18ncmp expect output
+'
+
+test_expect_success 'Verify status behavior on folder with untracked and 
ignored files' '
+   test_when_finished "git clean -fdx" &&
+   cat >expect <<-\EOF &&
+   ? dir/untracked_ignored/untracked_1
+   ? dir/untracked_ignored/untracked_2
+   ? expect
+   ? output
+   ? untracked_ignored/untracked_1
+   ? untracked_ignored/untracked_2
+   ! dir/untracked_ignored/ignored_1.ign
+   ! dir/untracked_ignored/ignored_2.ign
+   ! untracked_ignored/ignored_1.ign
+   ! untracked_ignored/ignored_2.ign
+   EOF
+
+   mkdir -p untracked_ignored dir/untracked_ignored &&
+   touch untracked_ignored/untracked_1 untracked_ignored/untracked_2 \
+   untracked_ignored/ignored_1.ign untracked_ignored/ignored_2.ign 
\
+   dir/untracked_ignored/untracked_1 
dir/untracked_ignored/untracked_2 \
+   dir/untracked_ignored/ignored_1.ign 
dir/untracked_ignored/ignored_2.ign &&
+
+   git status --porcelain=v2 --ignored=matching --untracked-files=all 
>output &&
+   test_i18ncmp expect output
+'
+
+test_expect_success 'Verify status matching ignored files on ignored folder' '
+   test_when_finished "git clean -fdx" &&
+   cat >expect <<-\EOF &&
+   ? expect
+   ? output
+   ! ignored_dir/
+   EOF
+
+   mkdir ignored_dir &&
+   touch ignored_dir/ignored_1 ignored_dir/ignored_2 \
+   ignored_dir/ignored_1.ign ignored_dir/ignored_2.ign &&
+
+   git status --porcelain=v2 --ignored=matching --untracked-files=all 
>output &&
+   test_i18ncmp expect output
+'
+
+test_expect_success 'Verify status behavior on ignored folder containing 
tracked file' '
+   test_when_finished "git clean -fdx && git reset HEAD~1 --hard" &&
+   cat >expect <<-\EOF &&
+   ? expect
+   ? output
+   ! ignored_dir/ignored_1
+   ! ignored_dir/ignored_1.ign
+   ! ignored_dir/ignored_2
+   ! ignored_dir/ignored_2.ign
+   EOF
+
+   mkdir ignored_dir &&
+   touch ignored_dir/ignored_1 ignored_dir/ignored_2 \
+   ignored_dir/ignored_1.ign ignored_dir/ignored_2.ign \
+   ignored_dir/tracked &&
+   git add -f ignored_dir/tracked &&
+   git commit -m "Force add file in ignored directory" &&
+   git status --porcelain=v2 --ignored=matching --untracked-files=all 
>output &&
+

[PATCH v3 2/4] status: report matching ignored and normal untracked

2017-10-19 Thread Jameson Miller

Teach status command to handle `--ignored=matching` with
`--untracked-files=normal`

Signed-off-by: Jameson Miller 
---
 dir.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index b9af87eca9..20457724c0 100644
--- a/dir.c
+++ b/dir.c
@@ -1585,6 +1585,7 @@ static enum path_treatment treat_one_path(struct 
dir_struct *dir,
 {
int exclude;
int has_path_in_index = !!index_file_exists(istate, path->buf, 
path->len, ignore_case);
+   enum path_treatment path_treatment;
 
if (dtype == DT_UNKNOWN)
dtype = get_dtype(de, istate, path->buf, path->len);
@@ -1631,8 +1632,23 @@ static enum path_treatment treat_one_path(struct 
dir_struct *dir,
return path_none;
case DT_DIR:
strbuf_addch(path, '/');
-   return treat_directory(dir, istate, untracked, path->buf, 
path->len,
-  baselen, exclude, pathspec);
+   path_treatment = treat_directory(dir, istate, untracked,
+path->buf, path->len,
+baselen, exclude, pathspec);
+   /*
+* If 1) we only want to return directories that
+* match an exclude pattern and 2) this directory does
+* not match an exclude pattern but all of its
+* contents are excluded, then indicate that we should
+* recurse into this directory (instead of marking the
+* directory itself as an ignored path).
+*/
+   if (!exclude &&
+   path_treatment == path_excluded &&
+   (dir->flags & DIR_SHOW_IGNORED_TOO) &&
+   (dir->flags & DIR_SHOW_IGNORED_TOO_MODE_MATCHING))
+   return path_recurse;
+   return path_treatment;
case DT_REG:
case DT_LNK:
return exclude ? path_excluded : path_untracked;
-- 
2.13.6

[PATCH v3 3/4] status: document options to show matching ignored files

2017-10-19 Thread Jameson Miller

Signed-off-by: Jameson Miller 
---
 Documentation/git-status.txt  | 21 +-
 Documentation/technical/api-directory-listing.txt | 27 +++
 2 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index 9f3a78a36c..fc282e0a92 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -97,8 +97,27 @@ configuration variable documented in linkgit:git-config[1].
(and suppresses the output of submodule summaries when the config option
`status.submoduleSummary` is set).
 
---ignored::
+--ignored[=]::
Show ignored files as well.
++
+The mode parameter is used to specify the handling of ignored files.
+It is optional: it defaults to 'traditional'.
++
+The possible options are:
++
+   - 'traditional' - Shows ignored files and directories, unless
+ --untracked-files=all is specifed, in which case
+ individual files in ignored directories are
+ displayed.
+   - 'no'  - Show no ignored files.
+   - 'matching'- Shows ignored files and directories matching an
+ ignore pattern.
++
+When 'matching' mode is specified, paths that explicity match an
+ignored pattern are shown. If a directory matches an ignore pattern,
+then it is shown, but not paths contained in the ignored directory. If
+a directory does not match an ignore pattern, but all contents are
+ignored, then the directory is not shown, but all contents are shown.
 
 -z::
Terminate entries with NUL, instead of LF.  This implies
diff --git a/Documentation/technical/api-directory-listing.txt 
b/Documentation/technical/api-directory-listing.txt
index 6c77b4920c..7fae00f44f 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -22,16 +22,20 @@ The notable options are:
 
 `flags`::
 
-   A bit-field of options (the `*IGNORED*` flags are mutually exclusive):
+   A bit-field of options:
 
 `DIR_SHOW_IGNORED`:::
 
-   Return just ignored files in `entries[]`, not untracked files.
+   Return just ignored files in `entries[]`, not untracked
+   files. This flag is mutually exclusive with
+   `DIR_SHOW_IGNORED_TOO`.
 
 `DIR_SHOW_IGNORED_TOO`:::
 
-   Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]`
-   in addition to untracked files in `entries[]`.
+   Similar to `DIR_SHOW_IGNORED`, but return ignored files in
+   `ignored[]` in addition to untracked files in
+   `entries[]`. This flag is mutually exclusive with
+   `DIR_SHOW_IGNORED`.
 
 `DIR_KEEP_UNTRACKED_CONTENTS`:::
 
@@ -39,6 +43,21 @@ The notable options are:
untracked contents of untracked directories are also returned in
`entries[]`.
 
+`DIR_SHOW_IGNORED_TOO_MODE_MATCHING`:::
+
+   Only has meaning if `DIR_SHOW_IGNORED_TOO` is also set; if
+   this is set, returns ignored files and directories that match
+   an exclude pattern. If a directory matches an exclude pattern,
+   then the directory is returned and the contained paths are
+   not. A directory that does not match an exclude pattern will
+   not be returned even if all of its contents are ignored. In
+   this case, the contents are returned as individual entries.
++
+If this is set, files and directories that explicity match an ignore
+pattern are reported. Implicity ignored directories (directories that
+do not match an ignore pattern, but whose contents are all ignored)
+are not reported, instead all of the contents are reported.
+
 `DIR_COLLECT_IGNORED`:::
 
Special mode for git-add. Return ignored files in `ignored[]` and
-- 
2.13.6

[PATCH v3 1/4] status: add option to show ignored files differently

2017-10-19 Thread Jameson Miller

Teach the status command more flexibility in how ignored files are
reported. Currently, the reporting of ignored files and untracked
files are linked. You cannot control how ignored files are reported
independently of how untracked files are reported (i.e. `all` vs
`normal`). This makes it impossible to show untracked files with the
`all` option, but show ignored files with the `normal` option.

This work 1) adds the ability to control the reporting of ignored
files independently of untracked files and 2) introduces the concept
of status reporting ignored paths that explicitly match an ignored
pattern. There are 2 benefits to these changes: 1) if a consumer needs
all untracked files but not all ignored files, there is a performance
benefit to not scanning all contents of an ignored directory and 2)
returning ignored files that explicitly match a path allow a consumer
to make more informed decisions about when a status result might be
stale.

This commit implements --ignored=matching with --untracked-files=all.
The following commit will implement --ignored=matching with
--untracked=files=normal.

As an example of where this flexibility could be useful is that our
application (Visual Studio) runs the status command and presents the
output. It shows all untracked files individually (e.g. using the
'--untracked-files==all' option), and would like to know about which
paths are ignored. It uses information about ignored paths to make
decisions about when the status result might have changed.
Additionally, many projects place build output into directories inside
a repository's working directory (e.g. in "bin/" and "obj/"
directories). Normal usage is to explicitly ignore these 2 directory
names in the .gitignore file (rather than or in addition to the *.obj
pattern).If an application could know that these directories are
explicitly ignored, it could infer that all contents are ignored as
well and make better informed decisions about files in these
directories. It could infer that any changes under these paths would
not affect the output of status. Additionally, there can be a
significant performance benefit by avoiding scanning through ignored
directories.

When status is set to report matching ignored files, it has the
following behavior. Ignored files and directories that explicitly
match an exclude pattern are reported. If an ignored directory matches
an exclude pattern, then the path of the directory is returned. If a
directory does not match an exclude pattern, but all of its contents
are ignored, then the contained files are reported instead of the
directory.

Signed-off-by: Jameson Miller 
---
 builtin/commit.c | 31 +--
 dir.c| 24 
 dir.h|  3 ++-
 wt-status.c  | 11 ---
 wt-status.h  |  8 +++-
 5 files changed, 66 insertions(+), 11 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index d75b3805ea..98d84d0277 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -118,7 +118,7 @@ static int edit_flag = -1; /* unspecified */
 static int quiet, verbose, no_verify, allow_empty, dry_run, renew_authorship;
 static int config_commit_verbose = -1; /* unspecified */
 static int no_post_rewrite, allow_empty_message;
-static char *untracked_files_arg, *force_date, *ignore_submodule_arg;
+static char *untracked_files_arg, *force_date, *ignore_submodule_arg, 
*ignored_arg;
 static char *sign_commit;
 
 /*
@@ -139,7 +139,7 @@ static const char *cleanup_arg;
 static enum commit_whence whence;
 static int sequencer_in_use;
 static int use_editor = 1, include_status = 1;
-static int show_ignored_in_status, have_option_m;
+static int have_option_m;
 static struct strbuf message = STRBUF_INIT;
 
 static enum wt_status_format status_format = STATUS_FORMAT_UNSPECIFIED;
@@ -1075,6 +1075,19 @@ static const char *find_author_by_nickname(const char 
*name)
die(_("--author '%s' is not 'Name ' and matches no existing 
author"), name);
 }
 
+static void handle_ignored_arg(struct wt_status *s)
+{
+   if (!ignored_arg)
+   ; /* default already initialized */
+   else if (!strcmp(ignored_arg, "traditional"))
+   s->show_ignored_mode = SHOW_TRADITIONAL_IGNORED;
+   else if (!strcmp(ignored_arg, "no"))
+   s->show_ignored_mode = SHOW_NO_IGNORED;
+   else if (!strcmp(ignored_arg, "matching"))
+   s->show_ignored_mode = SHOW_MATCHING_IGNORED;
+   else
+   die(_("Invalid ignored mode '%s'"), ignored_arg);
+}
 
 static void handle_untracked_files_arg(struct wt_status *s)
 {
@@ -1363,8 +1376,10 @@ int cmd_status(int argc, const char **argv, const char 
*prefix)
  N_("mode"),
  N_("show untracked files, optional modes: all, normal, no. 
(Default: all)"),
  PARSE_OPT_OPTARG, NULL, (intptr_t)"all" },
-   OPT_BOOL(0, "ignored", _ignored_in_status,
-

[PATCH v3 0/4] status: add option to show ignored files differently

2017-10-19 Thread Jameson Miller

The previous iteration can be found here:

https://public-inbox.org/git/20171011133504.15049-1-jam...@microsoft.com/

The main difference is to address feedback around commit organization
and wording.

Jameson Miller (4):
  status: add option to show ignored files differently
  status: report matching ignored and normal untracked
  status: document options to show matching ignored files
  status: test --ignored=matching

 Documentation/git-status.txt  |  21 ++-
 Documentation/technical/api-directory-listing.txt |  27 +++-
 builtin/commit.c  |  31 +++-
 dir.c |  44 +-
 dir.h |   3 +-
 t/t7519-ignored-mode.sh   | 183 ++
 wt-status.c   |  11 +-
 wt-status.h   |   8 +-
 8 files changed, 310 insertions(+), 18 deletions(-)
 create mode 100755 t/t7519-ignored-mode.sh

-- 
2.13.6

Re: [PATCH v1] read_index_from(): Skip verification of the cache entry order to speed index loading

2017-10-19 Thread Jeff King

On Thu, Oct 19, 2017 at 11:12:03AM -0400, Ben Peart wrote:

> > So, I think I like the direction of getting rid of the order
> > validation in post_read_index_from(), not only during the normal
> > operation but also in fsck.  I think it makes more sense to do so
> > incrementally inside do_read_index() all the time and see how fast
> > we can make it do so.
> 
> Unfortunately, all the cost is in the strcmp() - the other tests are
> negligible so moving it to be done incrementally inside do_read_index()
> won't reduce the cost, it just moves it and makes it harder to identify.

It's plausible that doing the strcmp() closer to where we are otherwise
manipulating the data may show an improvement due to memory cache
effects.

It should be easy enough to check that; the patch below implements it.
I couldn't measure any speedup with it running "git ls-files >/dev/null"
on linux.git (60k files). But nor could I get any by dropping the check
entirely.

This is mostly just a curiosity to me. For the record, I have no real
problem with dropping this kind of on-disk data-structure validation
when it's expensive. After all, we do not check the sort on pack .idx
files on each run (nor pack sha1 checksums, etc) because it's too
expensive to do so.

---
diff --git a/read-cache.c b/read-cache.c
index 65f4fe8375..ac8c8d2e93 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1664,25 +1664,19 @@ static struct cache_entry *create_from_disk(struct 
ondisk_cache_entry *ondisk,
return ce;
 }
 
-static void check_ce_order(struct index_state *istate)
-{
-   unsigned int i;
-
-   for (i = 1; i < istate->cache_nr; i++) {
-   struct cache_entry *ce = istate->cache[i - 1];
-   struct cache_entry *next_ce = istate->cache[i];
-   int name_compare = strcmp(ce->name, next_ce->name);
-
-   if (0 < name_compare)
-   die("unordered stage entries in index");
-   if (!name_compare) {
-   if (!ce_stage(ce))
-   die("multiple stage entries for merged file 
'%s'",
-   ce->name);
-   if (ce_stage(ce) > ce_stage(next_ce))
-   die("unordered stage entries for '%s'",
-   ce->name);
-   }
+static void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)
+{
+   int name_compare = strcmp(ce->name, next_ce->name);
+
+   if (0 < name_compare)
+   die("unordered stage entries in index");
+   if (!name_compare) {
+   if (!ce_stage(ce))
+   die("multiple stage entries for merged file '%s'",
+   ce->name);
+   if (ce_stage(ce) > ce_stage(next_ce))
+   die("unordered stage entries for '%s'",
+   ce->name);
}
 }
 
@@ -1720,7 +1714,6 @@ static void tweak_split_index(struct index_state *istate)
 
 static void post_read_index_from(struct index_state *istate)
 {
-   check_ce_order(istate);
tweak_untracked_cache(istate);
tweak_split_index(istate);
 }
@@ -1784,6 +1777,8 @@ int do_read_index(struct index_state *istate, const char 
*path, int must_exist)
 
disk_ce = (struct ondisk_cache_entry *)((char *)mmap + 
src_offset);
ce = create_from_disk(disk_ce, , previous_name);
+   if (i > 0)
+   check_ce_order(istate->cache[i - 1], ce);
set_index_entry(istate, i, ce);
 
src_offset += consumed;

Re: [PATCH v4 3/3] submodule: simplify decision tree whether to or not to fetch

2017-10-19 Thread Heiko Voigt

On Thu, Oct 19, 2017 at 09:36:47AM +0900, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > On 10/16, Heiko Voigt wrote:
> >> To make extending this logic later easier.
> >
> > This makes things so much clearer, thanks!
> 
> I agree that it is clear to see what the code after the patch does,
> but the code before the patch is so convoluted to follow that it is
> a bit hard to see if the code before and after are doing the same
> thing, though ;-)

That is why I would appreciate some extra pairs of eyes on this :) I
tried to be as careful as possible when refactoring this, but since it
is quite convoluted something might have slipped through. The testsuite
does not show anything, but there might be corner cases that are not
tested I guess.

Will hopefully have time to look into the comments to the main patch of
this series tomorrow. Did not get around to properly do that yet.

Cheers Heiko

> >> Signed-off-by: Heiko Voigt 
> >> ---
> >>  submodule.c | 74 
> >> ++---
> >>  1 file changed, 37 insertions(+), 37 deletions(-)
> >> 
> >> diff --git a/submodule.c b/submodule.c
> >> index 71d1773e2e..82d206eb65 100644
> >> --- a/submodule.c
> >> +++ b/submodule.c
> >> @@ -1187,6 +1187,31 @@ struct submodule_parallel_fetch {
> >>  };
> >>  #define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0}
> >>  
> >> +static int get_fetch_recurse_config(const struct submodule *submodule,
> >> +  struct submodule_parallel_fetch *spf)
> >> +{
> >> +  if (spf->command_line_option != RECURSE_SUBMODULES_DEFAULT)
> >> +  return spf->command_line_option;
> >> +
> >> +  if (submodule) {
> >> +  char *key;
> >> +  const char *value;
> >> +
> >> +  int fetch_recurse = submodule->fetch_recurse;
> >> +  key = xstrfmt("submodule.%s.fetchRecurseSubmodules", 
> >> submodule->name);
> >> +  if (!repo_config_get_string_const(the_repository, key, )) 
> >> {
> >> +  fetch_recurse = parse_fetch_recurse_submodules_arg(key, 
> >> value);
> >> +  }
> >> +  free(key);
> >> +
> >> +  if (fetch_recurse != RECURSE_SUBMODULES_NONE)
> >> +  /* local config overrules everything except commandline 
> >> */
> >> +  return fetch_recurse;
> >> +  }
> >> +
> >> +  return spf->default_option;
> >> +}
> >> +
> >>  static int get_next_submodule(struct child_process *cp,
> >>  struct strbuf *err, void *data, void **task_cb)
> >>  {
> >> @@ -1214,46 +1239,21 @@ static int get_next_submodule(struct child_process 
> >> *cp,
> >>}
> >>}
> >>  
> >> -  default_argv = "yes";
> >> -  if (spf->command_line_option == RECURSE_SUBMODULES_DEFAULT) {
> >> -  int fetch_recurse = RECURSE_SUBMODULES_NONE;
> >> -
> >> -  if (submodule) {
> >> -  char *key;
> >> -  const char *value;
> >> -
> >> -  fetch_recurse = submodule->fetch_recurse;
> >> -  key = 
> >> xstrfmt("submodule.%s.fetchRecurseSubmodules", submodule->name);
> >> -  if 
> >> (!repo_config_get_string_const(the_repository, key, )) {
> >> -  fetch_recurse = 
> >> parse_fetch_recurse_submodules_arg(key, value);
> >> -  }
> >> -  free(key);
> >> -  }
> >> -
> >> -  if (fetch_recurse != RECURSE_SUBMODULES_NONE) {
> >> -  if (fetch_recurse == RECURSE_SUBMODULES_OFF)
> >> -  continue;
> >> -  if (fetch_recurse == 
> >> RECURSE_SUBMODULES_ON_DEMAND) {
> >> -  if 
> >> (!unsorted_string_list_lookup(_submodule_names,
> >> -   
> >> submodule->name))
> >> -  continue;
> >> -  default_argv = "on-demand";
> >> -  }
> >> -  } else {
> >> -  if (spf->default_option == 
> >> RECURSE_SUBMODULES_OFF)
> >> -  continue;
> >> -  if (spf->default_option == 
> >> RECURSE_SUBMODULES_ON_DEMAND) {
> >> -  if 
> >> (!unsorted_string_list_lookup(_submodule_names,
> >> -
> >> submodule->name))
> >> -  continue;
> >> -  default_argv = "on-demand";
> >> -  }
> >> -  }
> >> -  } else if (spf->command_line_option == 
> >> RECURSE_SUBMODULES_ON_DEMAND) {
> >> -  if 
> >>

Re: [PATCH v1] read_index_from(): Skip verification of the cache entry order to speed index loading

2017-10-19 Thread Ben Peart




On 10/19/2017 1:22 AM, Junio C Hamano wrote:

Ben Peart  writes:


There is code in post_read_index_from() to catch out of order entries
when reading an index file.  This order verification is ~13% of the cost
of every call to read_index_from().


I find this a bit over-generalized claim---wouldn't the overhead
depend on various conditions, e.g. the size of the index and if
split-index is in effect?



I tested it against 10K, 100K and 1,000K files and the absolute time 
varies with different sized indexes, but the percentage is relatively 
consistent (after doing multiple runs and averaging the results to 
reduce noise).  I didn't measure it with split index so can't say how 
that would impact performance.



In general, I get very skeptical towards any change that makes the
integrity of the data less certain based only on microbenchmarks,
and prefer to see a solution that can absorb the overhead in some
other way.

When we are using split-index, the current code is not validating
the two input files from the disk. Because merge_base_index()
depends on the base to be properly sorted before the overriding
entries are added into it, if the input from disk is in a wrong
order, we are screwed already, and the order check in post
processing is pointless.  


The original commit message doesn't say *why* this test was added so I 
have to make some educated guesses.  Given no attempt is made to recover 
or continue and instead we just die when an out-of-order entry is 
detected, I'm assuming check_ce_order() is protecting against buggy code 
that incorrectly wrote out an invalid index (the sha check would have 
detected file corruption or torn writes).


If we are guarding against "git" writing out an invalid index, we can 
move this into an assert so that only git developers pay the cost of 
validating they haven't created a new bug.  I think this is better than 
just adding a new test case as a new test case would not achieve the 
same coverage.  This is my preferred solution.


If we are guarding against "some other application" writing out an 
invalid index, then everyone will have to pay the cost as we can't 
insert the test into "some other applications."  Without user reports of 
it happening or any telemetry saying it has happened I really have no 
idea if it every actually happens in the wild anymore and whether the 
cost on every index load is still justified.



If we want to do this order validation, I
think we should be doing it in do_read_index() where it does
create_from_disk() and the set_index_entry(), instead of having it
as a separate phase that scans a potentially large index array one
more time.  And doing so will not penalize the case where we do not
use split-index, either.

So, I think I like the direction of getting rid of the order
validation in post_read_index_from(), not only during the normal
operation but also in fsck.  I think it makes more sense to do so
incrementally inside do_read_index() all the time and see how fast
we can make it do so.



Unfortunately, all the cost is in the strcmp() - the other tests are 
negligible so moving it to be done incrementally inside do_read_index() 
won't reduce the cost, it just moves it and makes it harder to identify.


The only idea I could come up with for reducing the cost to our end 
users is to keep it separate and split the test across multiple threads 
with some minimum index size threshold as we have done elsewhere.  This 
adds code and complexity that we'll have to maintain forever so is less 
preferred than making it an assert.

[PATCH 1/6] t0021/rot13-filter: refactor packet reading functions

2017-10-19 Thread Christian Couder

To make it possible in a following commit to move packet
reading and writing functions into a Packet.pm module,
let's refactor these functions, so they don't handle
printing debug output and exiting.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index ad685d92f8..e4495a52f3 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -60,8 +60,7 @@ sub packet_bin_read {
my $bytes_read = read STDIN, $buffer, 4;
if ( $bytes_read == 0 ) {
# EOF - Git stopped talking to us!
-   print $debug "STOP\n";
-   exit();
+   return ( -1, "" );
}
elsif ( $bytes_read != 4 ) {
die "invalid packet: '$buffer'";
@@ -85,7 +84,7 @@ sub packet_bin_read {
 
 sub packet_txt_read {
my ( $res, $buf ) = packet_bin_read();
-   unless ( $buf eq '' or $buf =~ s/\n$// ) {
+   unless ( $res == -1 or $buf eq '' or $buf =~ s/\n$// ) {
die "A non-binary line MUST be terminated by an LF.";
}
return ( $res, $buf );
@@ -131,7 +130,12 @@ print $debug "init handshake complete\n";
 $debug->flush();
 
 while (1) {
-   my ( $command ) = packet_txt_read() =~ /^command=(.+)$/;
+   my ( $res, $command ) = packet_txt_read();
+   if ( $res == -1 ) {
+   print $debug "STOP\n";
+   exit();
+   }
+   $command =~ s/^command=//;
print $debug "IN: $command";
$debug->flush();
 
-- 
2.15.0.rc1.106.g7e97f58a41

[PATCH 0/6] Create Git/Packet.pm

2017-10-19 Thread Christian Couder

Goal


Packet related functions in Perl can be useful to write new filters or
to debug or test existing filters. So instead of having them in
t0021/rot13-filter.pl, let's extract them into a new Git/Packet.pm
module.

Links
~

This patch series has been extracted from previous "Add initial
experimental external ODB support" patch series.

Version 1, 2, 3, 4, 5 and 6 of this previous series are on the mailing
list here:

https://public-inbox.org/git/20160613085546.11784-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20160628181933.24620-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20161130210420.15982-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20170620075523.26961-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20170803091926.1755-1-chrisc...@tuxfamily.org/
https://public-inbox.org/git/20170916080731.13925-1-chrisc...@tuxfamily.org/

They are also available in the following branches:

https://github.com/chriscool/git/commits/gl-external-odb12
https://github.com/chriscool/git/commits/gl-external-odb22
https://github.com/chriscool/git/commits/gl-external-odb61
https://github.com/chriscool/git/commits/gl-external-odb239
https://github.com/chriscool/git/commits/gl-external-odb373
https://github.com/chriscool/git/commits/gl-external-odb411


Christian Couder (6):
  t0021/rot13-filter: refactor packet reading functions
  t0021/rot13-filter: improve 'if .. elsif .. else' style
  t0021/rot13-filter: improve error message
  t0021/rot13-filter: add packet_initialize()
  t0021/rot13-filter: add capability functions
  Add Git/Packet.pm from parts of t0021/rot13-filter.pl

 perl/Git/Packet.pm  | 118 
 t/t0021/rot13-filter.pl | 110 +---
 2 files changed, 140 insertions(+), 88 deletions(-)
 create mode 100644 perl/Git/Packet.pm

-- 
2.15.0.rc1.106.g7e97f58a41

[PATCH 4/6] t0021/rot13-filter: add packet_initialize()

2017-10-19 Thread Christian Couder

Let's refactor the code to initialize communication into its own
packet_initialize() function, so that we can reuse this
functionality in following patches.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 3b3da8a03d..278fc6f534 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -104,16 +104,22 @@ sub packet_flush {
STDOUT->flush();
 }
 
+sub packet_initialize {
+   my ($name, $version) = @_;
+
+   ( packet_txt_read() eq ( 0, $name . "-client" ) )   || die "bad 
initialize";
+   ( packet_txt_read() eq ( 0, "version=" . $version ) )   || die "bad 
version";
+   ( packet_bin_read() eq ( 1, "" ) )  || die "bad 
version end";
+
+   packet_txt_write( $name . "-server" );
+   packet_txt_write( "version=" . $version );
+   packet_flush();
+}
+
 print $debug "START\n";
 $debug->flush();
 
-( packet_txt_read() eq ( 0, "git-filter-client" ) ) || die "bad initialize";
-( packet_txt_read() eq ( 0, "version=2" ) ) || die "bad version";
-( packet_bin_read() eq ( 1, "" ) )  || die "bad version end";
-
-packet_txt_write("git-filter-server");
-packet_txt_write("version=2");
-packet_flush();
+packet_initialize("git-filter", 2);
 
 ( packet_txt_read() eq ( 0, "capability=clean" ) )  || die "bad capability";
 ( packet_txt_read() eq ( 0, "capability=smudge" ) ) || die "bad capability";
-- 
2.15.0.rc1.106.g7e97f58a41

[PATCH 6/6] Add Git/Packet.pm from parts of t0021/rot13-filter.pl

2017-10-19 Thread Christian Couder

And while at it let's simplify t0021/rot13-filter.pl by
using Git/Packet.pm.

This will make it possible to reuse packet related
functions in other test scripts.

Signed-off-by: Christian Couder 
---
 perl/Git/Packet.pm  | 118 
 t/t0021/rot13-filter.pl |  94 ++
 2 files changed, 121 insertions(+), 91 deletions(-)
 create mode 100644 perl/Git/Packet.pm

diff --git a/perl/Git/Packet.pm b/perl/Git/Packet.pm
new file mode 100644
index 00..b1e67477a0
--- /dev/null
+++ b/perl/Git/Packet.pm
@@ -0,0 +1,118 @@
+package Git::Packet;
+use 5.008;
+use strict;
+use warnings;
+BEGIN {
+   require Exporter;
+   if ($] < 5.008003) {
+   *import = \::import;
+   } else {
+   # Exporter 5.57 which supports this invocation was
+   # released with perl 5.8.3
+   Exporter->import('import');
+   }
+}
+
+our @EXPORT = qw(
+   packet_bin_read
+   packet_txt_read
+   packet_bin_write
+   packet_txt_write
+   packet_flush
+   packet_initialize
+   packet_read_capabilities
+   packet_write_capabilities
+   packet_read_and_check_capabilities
+   );
+our @EXPORT_OK = @EXPORT;
+
+sub packet_bin_read {
+   my $buffer;
+   my $bytes_read = read STDIN, $buffer, 4;
+   if ( $bytes_read == 0 ) {
+   # EOF - Git stopped talking to us!
+   return ( -1, "" );
+   } elsif ( $bytes_read != 4 ) {
+   die "invalid packet: '$buffer'";
+   }
+   my $pkt_size = hex($buffer);
+   if ( $pkt_size == 0 ) {
+   return ( 1, "" );
+   } elsif ( $pkt_size > 4 ) {
+   my $content_size = $pkt_size - 4;
+   $bytes_read = read STDIN, $buffer, $content_size;
+   if ( $bytes_read != $content_size ) {
+   die "invalid packet ($content_size bytes expected; 
$bytes_read bytes read)";
+   }
+   return ( 0, $buffer );
+   } else {
+   die "invalid packet size: $pkt_size";
+   }
+}
+
+sub packet_txt_read {
+   my ( $res, $buf ) = packet_bin_read();
+   unless ( $res == -1 or $buf eq '' or $buf =~ s/\n$// ) {
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
+   }
+   return ( $res, $buf );
+}
+
+sub packet_bin_write {
+   my $buf = shift;
+   print STDOUT sprintf( "%04x", length($buf) + 4 );
+   print STDOUT $buf;
+   STDOUT->flush();
+}
+
+sub packet_txt_write {
+   packet_bin_write( $_[0] . "\n" );
+}
+
+sub packet_flush {
+   print STDOUT sprintf( "%04x", 0 );
+   STDOUT->flush();
+}
+
+sub packet_initialize {
+   my ($name, $version) = @_;
+
+   ( packet_txt_read() eq ( 0, $name . "-client" ) )   || die "bad 
initialize";
+   ( packet_txt_read() eq ( 0, "version=" . $version ) )   || die "bad 
version";
+   ( packet_bin_read() eq ( 1, "" ) )  || die "bad 
version end";
+
+   packet_txt_write( $name . "-server" );
+   packet_txt_write( "version=" . $version );
+   packet_flush();
+}
+
+sub packet_read_capabilities {
+   my @cap;
+   while (1) {
+   my ( $res, $buf ) = packet_bin_read();
+   return ( $res, @cap ) if ( $res != 0 );
+   unless ( $buf =~ s/\n$// ) {
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
+   }
+   die "bad capability buf: '$buf'" unless ( $buf =~ 
s/capability=// );
+   push @cap, $buf;
+   }
+}
+
+sub packet_read_and_check_capabilities {
+   my @local_caps = @_;
+   my @remote_res_caps = packet_read_capabilities();
+   my $res = shift @remote_res_caps;
+   my %remote_caps = map { $_ => 1 } @remote_res_caps;
+   foreach (@local_caps) {
+   die "'$_' capability not available" unless 
(exists($remote_caps{$_}));
+   }
+   return $res;
+}
+
+sub packet_write_capabilities {
+   packet_txt_write( "capability=" . $_ ) foreach (@_);
+   packet_flush();
+}
diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index ba18b207c6..2e8ad4d496 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -30,9 +30,12 @@
 # to the "list_available_blobs" response.
 #
 
+use 5.008;
+use lib (split(/:/, $ENV{GITPERLLIB}));
 use strict;
 use warnings;
 use IO::File;
+use Git::Packet;
 
 my $MAX_PACKET_CONTENT_SIZE = 65516;
 my $log_file= shift @ARGV;
@@ -55,97 +58,6 @@ sub rot13 {
return $str;
 }
 
-sub packet_bin_read {
-   my $buffer;
-   my $bytes_read = read STDIN, $buffer, 4;
-   if (

[PATCH 2/6] t0021/rot13-filter: improve 'if .. elsif .. else' style

2017-10-19 Thread Christian Couder

Before further refactoring the "t0021/rot13-filter.pl" script,
let's modernize the style of its 'if .. elsif .. else' clauses
to improve its readability by making it more similar to our
other perl scripts.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 39 +--
 1 file changed, 13 insertions(+), 26 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index e4495a52f3..82882392ae 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -61,23 +61,20 @@ sub packet_bin_read {
if ( $bytes_read == 0 ) {
# EOF - Git stopped talking to us!
return ( -1, "" );
-   }
-   elsif ( $bytes_read != 4 ) {
+   } elsif ( $bytes_read != 4 ) {
die "invalid packet: '$buffer'";
}
my $pkt_size = hex($buffer);
if ( $pkt_size == 0 ) {
return ( 1, "" );
-   }
-   elsif ( $pkt_size > 4 ) {
+   } elsif ( $pkt_size > 4 ) {
my $content_size = $pkt_size - 4;
$bytes_read = read STDIN, $buffer, $content_size;
if ( $bytes_read != $content_size ) {
die "invalid packet ($content_size bytes expected; 
$bytes_read bytes read)";
}
return ( 0, $buffer );
-   }
-   else {
+   } else {
die "invalid packet size: $pkt_size";
}
 }
@@ -165,8 +162,7 @@ while (1) {
$debug->flush();
packet_txt_write("status=success");
packet_flush();
-   }
-   else {
+   } else {
my ( $pathname ) = packet_txt_read() =~ /^pathname=(.+)$/;
print $debug " $pathname";
$debug->flush();
@@ -205,17 +201,13 @@ while (1) {
my $output;
if ( exists $DELAY{$pathname} and exists 
$DELAY{$pathname}{"output"} ) {
$output = $DELAY{$pathname}{"output"}
-   }
-   elsif ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
+   } elsif ( $pathname eq "error.r" or $pathname eq "abort.r" ) {
$output = "";
-   }
-   elsif ( $command eq "clean" and grep( /^clean$/, @capabilities 
) ) {
+   } elsif ( $command eq "clean" and grep( /^clean$/, 
@capabilities ) ) {
$output = rot13($input);
-   }
-   elsif ( $command eq "smudge" and grep( /^smudge$/, 
@capabilities ) ) {
+   } elsif ( $command eq "smudge" and grep( /^smudge$/, 
@capabilities ) ) {
$output = rot13($input);
-   }
-   else {
+   } else {
die "bad command '$command'";
}
 
@@ -224,25 +216,21 @@ while (1) {
$debug->flush();
packet_txt_write("status=error");
packet_flush();
-   }
-   elsif ( $pathname eq "abort.r" ) {
+   } elsif ( $pathname eq "abort.r" ) {
print $debug "[ABORT]\n";
$debug->flush();
packet_txt_write("status=abort");
packet_flush();
-   }
-   elsif ( $command eq "smudge" and
+   } elsif ( $command eq "smudge" and
exists $DELAY{$pathname} and
-   $DELAY{$pathname}{"requested"} == 1
-   ) {
+   $DELAY{$pathname}{"requested"} == 1 ) {
print $debug "[DELAYED]\n";
$debug->flush();
packet_txt_write("status=delayed");
packet_flush();
$DELAY{$pathname}{"requested"} = 2;
$DELAY{$pathname}{"output"} = $output;
-   }
-   else {
+   } else {
packet_txt_write("status=success");
packet_flush();
 
@@ -262,8 +250,7 @@ while (1) {
print $debug ".";
if ( length($output) > $MAX_PACKET_CONTENT_SIZE 
) {
$output = substr( $output, 
$MAX_PACKET_CONTENT_SIZE );
-   }
-   else {
+   } else {
$output = "";
}
}
-- 
2.15.0.rc1.106.g7e97f58a41

[PATCH 3/6] t0021/rot13-filter: improve error message

2017-10-19 Thread Christian Couder

If there is no new line at the end of something it receives,
the packet_txt_read() function die()s, but it's difficult to
debug without much context.

Let's give a bit more information when that happens.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 82882392ae..3b3da8a03d 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -82,7 +82,8 @@ sub packet_bin_read {
 sub packet_txt_read {
my ( $res, $buf ) = packet_bin_read();
unless ( $res == -1 or $buf eq '' or $buf =~ s/\n$// ) {
-   die "A non-binary line MUST be terminated by an LF.";
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
}
return ( $res, $buf );
 }
-- 
2.15.0.rc1.106.g7e97f58a41

[PATCH 5/6] t0021/rot13-filter: add capability functions

2017-10-19 Thread Christian Couder

Add functions to help read and write capabilities.
These functions will be reused in following patches.

Signed-off-by: Christian Couder 
---
 t/t0021/rot13-filter.pl | 40 
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 278fc6f534..ba18b207c6 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -116,20 +116,44 @@ sub packet_initialize {
packet_flush();
 }
 
+sub packet_read_capabilities {
+   my @cap;
+   while (1) {
+   my ( $res, $buf ) = packet_bin_read();
+   return ( $res, @cap ) if ( $res != 0 );
+   unless ( $buf =~ s/\n$// ) {
+   die "A non-binary line MUST be terminated by an LF.\n"
+   . "Received: '$buf'";
+   }
+   die "bad capability buf: '$buf'" unless ( $buf =~ 
s/capability=// );
+   push @cap, $buf;
+   }
+}
+
+sub packet_read_and_check_capabilities {
+   my @local_caps = @_;
+   my @remote_res_caps = packet_read_capabilities();
+   my $res = shift @remote_res_caps;
+   my %remote_caps = map { $_ => 1 } @remote_res_caps;
+   foreach (@local_caps) {
+   die "'$_' capability not available" unless 
(exists($remote_caps{$_}));
+   }
+   return $res;
+}
+
+sub packet_write_capabilities {
+   packet_txt_write( "capability=" . $_ ) foreach (@_);
+   packet_flush();
+}
+
 print $debug "START\n";
 $debug->flush();
 
 packet_initialize("git-filter", 2);
 
-( packet_txt_read() eq ( 0, "capability=clean" ) )  || die "bad capability";
-( packet_txt_read() eq ( 0, "capability=smudge" ) ) || die "bad capability";
-( packet_txt_read() eq ( 0, "capability=delay" ) )  || die "bad capability";
-( packet_bin_read() eq ( 1, "" ) )  || die "bad capability 
end";
+packet_read_and_check_capabilities("clean", "smudge", "delay");
+packet_write_capabilities(@capabilities);
 
-foreach (@capabilities) {
-   packet_txt_write( "capability=" . $_ );
-}
-packet_flush();
 print $debug "init handshake complete\n";
 $debug->flush();
 
-- 
2.15.0.rc1.106.g7e97f58a41

Re:

2017-10-19 Thread Amos Kalonzo

Attn:

I am wondering If You receive my email for some days now. reference to
the transfer of my client's contract balance payment of (11.7M) Kindly
get back to me for more details.

Best Regards

Amos Kalonzo

[PATCH v2] commit: check result of resolve_ref_unsafe

2017-10-19 Thread Andrey Okoshkin


Add check of the resolved HEAD reference while printing of a commit summary.
resolve_ref_unsafe() may return NULL pointer if underlying calls of 
lstat() or

open() fail in files_read_raw_ref().
Such situation can be caused by race: file becomes inaccessible to this 
moment.


Signed-off-by: Andrey Okoshkin 
---
Thank you for your review.

Changes since the previous patch:
* BUG is replaced with die, message;
* Message is changed.

 builtin/commit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/commit.c b/builtin/commit.c
index 1a0da71a4..cc27c9af7 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1483,6 +1483,8 @@ static void print_summary(const char *prefix, 
const struct object_id *oid,

diff_setup_done();

head = resolve_ref_unsafe("HEAD", 0, junk_oid.hash, NULL);
+   if (!head)
+   die(_("unable to resolve HEAD after creating commit"));
if (!strcmp(head, "HEAD"))
head = _("detached HEAD");
else
--
2.14.2

Re: [PATCH] commit: check result of resolve_ref_unsafe

2017-10-19 Thread Andrey Okoshkin


I also think it's a good idea to record the destination of the updated
symref. But for now the most simple solution is just to catch the
unreadable HEAD error.
Maybe for the future improvement it would be nice to redesign
print_summary passing some ref_transaction results to it.

19.10.2017 05:49, Jeff King wrote:

TBH, I think the "most right" thing would be to actually capture the ref
that HEAD points to when we actually make the commit, remember it, and
then report it here (the whole function of this code is just to say "I
made a commit on branch X"). But I don't know how much trouble it is
worth going to for that. It's buried behind a ref_transaction, and I
don't think builtin/commit.c ever sees the real name (though maybe it
would be a good feature of the transaction to record the destinations of
any symrefs we updated).


--
Best regards,
Andrey Okoshkin

Re: [PATCH] use filetest pragma to work with ACL

2017-10-19 Thread Guillaume Castagnino

Hi,

Le mercredi 18 octobre 2017 à 17:24 -0400, Jeff King a écrit :
> On Wed, Oct 18, 2017 at 07:55:31PM +, Guillaume Castagnino wrote:
> 
> > From: Guillaume Castagnino 
> > [...]
> 
> Stefan raised a few meta issues, all of which I agree with. But I had
> some questions about the patch itself:
> 
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index 9208f42ed1753..0ee7f304ce2b1 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -3072,6 +3072,7 @@ sub git_get_projects_list {
> > # only directories can be git
> > repositories
> > return unless (-d $_);
> > # need search permission
> > +   use filetest 'access';
> > return unless (-x $_);
> 
> This "use" will unconditionally at compile-time (such as "compile" is
> for perl, anyway). Which raises a few questions:
> 
>   - would we want to use "require" instead to avoid loading when we
> don't enter this function?

I was under the impression that as this is a pragma affecting perl
“compiler”, you have to always use “use”, not require, as the pragma
initialisation has to be done before code is run.

>   - If the answer to the above is "no" (e.g., because we basically
> always need it; I didn't check), should it go at the top of the
> script with the other "use" directives?
> 
> I think this is a scoped pragma, so what you have here affects
> only
> this particular "-x". But wouldn't other uses of "-x" potentially
> want the same benefit?

I quickly grepped the code, I did not see other calls that could
benefits from the pragma, but it could be indeed nice to move it at the
top to avoid future issues with such tests and POSIX ACL.

>   - Do all relevant versions of perl ship with filetest? According to
> Module::Corelist, it first shipped with perl 5.6. In general I
> think
> we treat that as a minimum for our perl scripts, though I do
> notice
> that the gitweb script says "use 5.008". I'm not sure how
> realistic
> that is.

So the CGI already requires perl 5.8, so it’s fine I think.

Attached a cleanup proposal and moving the use at the top.

Thanks
Guillaume

-- 
Guillaume Castagnino
ca...@xwing.info / guilla...@castagnino.orgFrom 4d5a082970063b34d3dbdf5b9a577624310057d6 Mon Sep 17 00:00:00 2001
From: Guillaume Castagnino 
Date: Thu, 19 Oct 2017 09:32:46 +0200
Subject: [PATCH] gitweb: use filetest to allow ACLs

In commit 46a1385 (gitweb: skip unreadable subdirectories, 2017-07-18)
we forgot to handle non-unix ACLs as well. Fix this.

Signed-off-by: Guillaume Castagnino 
---
 gitweb/gitweb.perl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 9208f42ed..6ac49eaf3 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -10,6 +10,8 @@
 use 5.008;
 use strict;
 use warnings;
+# handle ACL in file access tests
+use filetest 'access';
 use CGI qw(:standard :escapeHTML -nosticky);
 use CGI::Util qw(unescape);
 use CGI::Carp qw(fatalsToBrowser set_message);
-- 
2.14.2

90 matches

Mail list logo