date:20171123

Re: [PATCH 2/2] grep: fix segfault under -P + PCRE2 + (*NO_JIT)

2017-11-23 Thread Simon Ruderich

On Wed, Nov 22, 2017 at 01:36:30PM +, Ævar Arnfjörð Bjarmason wrote:
> +  *
> +  * This is because if the pattern contains the
> +  * (*NO_JIT) verb (see pcre2syntax(3))
> +  * pcre2_jit_compile() will exit early with 0. If we
> +  * then proceed to call pcre2_jit_match() further down
> +  * the line instead of pcre2_match() we'll segfault.
> +  */
> + patinforet = pcre2_pattern_info(p->pcre2_pattern, 
> PCRE2_INFO_JITSIZE, &jitsizearg);
> + if (patinforet)
> + die("BUG: The patinforet variable should be 0 after the 
> pcre2_pattern_info() call, not %d",
> + patinforet);

I think BUG() should be used here, and maybe shorten the error
message:

BUG("pcre2_pattern_info() failed: %d", patinforet);

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9

Re: [PATCH 1/2] test-lib: add LIBPCRE1 & LIBPCRE2 prerequisites

2017-11-23 Thread Ævar Arnfjörð Bjarmason


On Wed, Nov 22 2017, Jonathan Nieder jotted:

> Hi,
>
> Ævar Arnfjörð Bjarmason wrote:
>
>> Add LIBPCRE1 and LIBPCRE2 prerequisites which are true when git is
>> compiled with USE_LIBPCRE1=YesPlease or USE_LIBPCRE2=YesPlease,
>> respectively.
>>
>> The syntax of PCRE1 and PCRE2 isn't the same in all cases (see
>> pcresyntax(3) and pcre2syntax(3)). If test are added that test for
>> those they'll need to be guarded by these new prerequisites.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  t/README  | 12 
>>  t/test-lib.sh |  2 ++
>>  2 files changed, 14 insertions(+)
>>
>> diff --git a/t/README b/t/README
>> index 4b079e4494..599cd9808c 100644
>> --- a/t/README
>> +++ b/t/README
>> @@ -808,6 +808,18 @@ use these, and "test_set_prereq" for how to define your 
>> own.
>> Git was compiled with support for PCRE. Wrap any tests
>> that use git-grep --perl-regexp or git-grep -P in these.
>>
>> + - LIBPCRE1
>> +
>> +   Git was compiled with PCRE v1 support via
>> +   USE_LIBPCRE1=YesPlease. Wrap any PCRE using tests that for some
>> +   reason need v1 of the PCRE library instead of v2 in these.
>
> Are there plans to use the LIBPCRE1 prereq?  It might be simpler to
> only have LIBPCRE2, and LIBPCRE1 can still be expressed as
>
>   PCRE,!LIBPCRE2
>
> which I think is clearer about the intent.

I prefer to keep it as it is. It's more obvious to me to have a 1=1
mapping between the ${USE,NO}_* variables and the prerequisites, and
it's future-proof if there's ever a PCRE v3, since tests that use this
will mean v1 specifically, not just any non-v2 version (although now v1
is the only one).

Re: [PATCH 2/2] grep: fix segfault under -P + PCRE2 + (*NO_JIT)

2017-11-23 Thread Ævar Arnfjörð Bjarmason


On Wed, Nov 22 2017, Eric Sunshine jotted:

> On Wed, Nov 22, 2017 at 8:36 AM, Ævar Arnfjörð Bjarmason
>  wrote:
>> Fix a bug in the compilation of PCRE2 patterns under JIT (the most
>> common runtime configuration), any pattern with a (*NO_JIT) verb would
>> segfault. This bug dates back to my 94da9193a6 ("grep: add support for
>> PCRE v2", 2017-06-01):
>>
>> $ git grep -P '(*NO_JIT)hi.*there'
>> Segmentation fault
>>
>> As explained ad more length in the comment being added here it isn't
>
> s/ad/at/
> s/here/here,/

Thanks. I'll let this sit for a bit and submit a v2 soon. There's also
an upstream fix in pcre2 to prevent the segfault that'll be in future
versions & I'm going to note in the amended commit message.

>> sufficient to just check pcre2_config() to see whether the JIT should
>> be used, pcre2_pattern_info() also has to be asked.
>>
>> This is something I discovered myself when fiddling around with PCRE2
>> verbs in patterns passed to git. I don't expect that any user of git
>> has encountered this given the obscurity of passing PCRE2 verbs
>> through to the library, along with the relative obscurity of (*NO_JIT)
>> itself.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason

From Precious

2017-11-23 Thread preciouskaz...@myself.com

Hi dear,
My name is Precious, I saw your profile and i became interested in
you, I will also like to know you more, please contact me with this
e-mail address so I can give you my picture for you to know whom I am,
because I have something very important to tell you.
Precious.

[no subject]

2017-11-23 Thread HEARTLAND LOAN FIRM

Welcome to HEARTLAND LOAN FIRM, we give out loans from the range of € 9,000 to 
€ 900,000,000
at 2% interest rate.

Kindly apply by providing the details below -

Your name ..
Your country ...
Amount of loan needed .
Phone number ...

REGARDS
CHARLENE TAYLOR
HEARTLAND LOAN FIRM 
HEAD OFFICE --52, South Campbell Springfield UNITED KINGDOM
Email: heartlandloancenter8...@gmail.com

Re: [PATCH v3 00/33] Add directory rename detection to git

2017-11-23 Thread Adam Dinwoodie

On Tuesday 21 November 2017 at 12:00 am -0800, Elijah Newren wrote:
> 
> 
>  merge-recursive.c   | 1243 +++-
>  merge-recursive.h   |   17 +
>  t/t3501-revert-cherry-pick.sh   |5 +-
>  t/t6043-merge-rename-directories.sh | 3821 
> +++
>  t/t7607-merge-overwrite.sh  |7 +-
>  unpack-trees.c  |4 +-
>  unpack-trees.h  |4 +
>  7 files changed, 4985 insertions(+), 116 deletions(-)
>  create mode 100755 t/t6043-merge-rename-directories.sh

The new t6043.44 introduced in this branch is failing on my Cygwin
system.  I can't immeditely see what's causing the failure, but I've
copied the relevant verbose + shell tracing output below in the hope it
makes more sense to you:

expecting success:
(
cd 7b &&

git checkout A^0 &&

test_must_fail git merge -s recursive B^0 >out &&
test_i18ngrep "CONFLICT (rename/rename)" out &&

test 4 -eq $(git ls-files -s | wc -l) &&
test 2 -eq $(git ls-files -u | wc -l) &&
test 3 -eq $(git ls-files -o | wc -l) &&

git rev-parse >actual \
:0:y/b :0:y/c :2:y/d :3:y/d &&
git rev-parse >expect \
O:z/b O:z/c O:w/d O:x/d &&
test_cmp expect actual &&

test ! -f y/d &&
test -f y/d~HEAD &&
test -f y/d~B^0 &&

git hash-object >actual \
y/d~HEAD y/d~B^0 &&
git rev-parse >expect \
O:w/d O:x/d &&
test_cmp expect actual
)

++ cd 7b
++ git checkout 'A^0'
Note: checking out 'A^0'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b 

HEAD is now at 0808992 A
++ test_must_fail git merge -s recursive 'B^0'
++ case "$1" in
++ _test_ok=
++ git merge -s recursive 'B^0'
++ exit_code=1
++ test 1 -eq 0
++ test_match_signal 13 1
++ test 1 = 141
++ test 1 = 269
++ return 1
++ test 1 -gt 129
++ test 1 -eq 127
++ test 1 -eq 126
++ return 0
++ test_i18ngrep 'CONFLICT (rename/rename)' out
++ test -n ''
++ test 'x!' = 'xCONFLICT (rename/rename)'
++ grep 'CONFLICT (rename/rename)' out
CONFLICT (rename/rename): Rename w/d->y/d in HEAD. Rename x/d->y/d in B^0
+++ git ls-files -s
+++ wc -l
++ test 4 -eq 4
+++ git ls-files -u
+++ wc -l
++ test 2 -eq 2
+++ git ls-files -o
+++ wc -l
++ test 3 -eq 3
++ git rev-parse :0:y/b :0:y/c :2:y/d :3:y/d
++ git rev-parse O:z/b O:z/c O:w/d O:x/d
++ test_cmp expect actual
++ diff -u expect actual
++ test '!' -f y/d
+ test_eval_ret_=1
+ want_trace
+ test t = t
+ test t = t
+ set +x
error: last command exited with $?=1
not ok 44 - 7b-check: rename/rename(2to1), but only due to transitive rename

Re: [PATCH v2] gitcli: tweak "man gitcli" for clarity

2017-11-23 Thread Jeff King

On Thu, Nov 23, 2017 at 02:45:44AM -0500, Robert P. J. Day wrote:

> > It's pretty clear to me as it is, but maybe we can write it differently.
> > Like:
> >
> >   Without a disambiguating `--`, Git makes a reasonable guess. If it
> >   cannot guess (because your request is ambiguous), then it will error
> >   out.
> 
>   ok, i'll give this another try, given that there are two independent
> points to be made here:
> 
> 1) even without the "--", git can generally parse the command and do
> the right thing (or do a *valid* thing, given its heuristics)
> 
> 2) occasionally, without the "--", the command is really and truly
> ambiguous, at which point git will fail and tell you to disambiguate
> 
>   not the wording i will use, but can we agree that those are the two
> points to be made here?

Yep, I think so.

-Peff

[PATCH v2 1/2] test-lib: add LIBPCRE1 & LIBPCRE2 prerequisites

2017-11-23 Thread Ævar Arnfjörð Bjarmason

Add LIBPCRE1 and LIBPCRE2 prerequisites which are true when git is
compiled with USE_LIBPCRE1=YesPlease or USE_LIBPCRE2=YesPlease,
respectively.

The syntax of PCRE1 and PCRE2 isn't the same in all cases (see
pcresyntax(3) and pcre2syntax(3)). If test are added that test for
those they'll need to be guarded by these new prerequisites.

The subsequent patch will make use of LIBPCRE2, so LIBPCRE1 isn't
strictly needed for now, but let's add it for consistency and so that
checking for it doesn't have to be done with the less obvious "PCRE,
!LIBPCRE2", which while semantically the same is more confusing, and
would lead to bugs if PCRE v3 is ever released as the tests would mean
v1, not any non-v2 version.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

As noted on-list I changed nothing here in the code, but noted in the
commit message why I'm keeping LIBPCRE1.

 t/README  | 12 
 t/test-lib.sh |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/t/README b/t/README
index 4b079e4494..599cd9808c 100644
--- a/t/README
+++ b/t/README
@@ -808,6 +808,18 @@ use these, and "test_set_prereq" for how to define your 
own.
Git was compiled with support for PCRE. Wrap any tests
that use git-grep --perl-regexp or git-grep -P in these.
 
+ - LIBPCRE1
+
+   Git was compiled with PCRE v1 support via
+   USE_LIBPCRE1=YesPlease. Wrap any PCRE using tests that for some
+   reason need v1 of the PCRE library instead of v2 in these.
+
+ - LIBPCRE2
+
+   Git was compiled with PCRE v2 support via
+   USE_LIBPCRE2=YesPlease. Wrap any PCRE using tests that for some
+   reason need v2 of the PCRE library instead of v1 in these.
+
  - CASE_INSENSITIVE_FS
 
Test is run on a case insensitive file system.
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 116bd6a70c..e7065df2bb 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1028,6 +1028,8 @@ test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PTHREADS" && test_set_prereq PTHREADS
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE1$USE_LIBPCRE2" && test_set_prereq PCRE
+test -n "$USE_LIBPCRE1" && test_set_prereq LIBPCRE1
+test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.15.0.403.gc27cc4dac6

[PATCH v2 2/2] grep: fix segfault under -P + PCRE2 <=10.30 + (*NO_JIT)

2017-11-23 Thread Ævar Arnfjörð Bjarmason

Fix a bug in the compilation of PCRE2 patterns under JIT (the most
common runtime configuration). Any pattern with a (*NO_JIT) verb would
segfault in any currently released PCRE2 version:

$ git grep -P '(*NO_JIT)hi.*there'
Segmentation fault

That this segfaulted was a bug in PCRE2 itself, after reporting it[1]
on pcre-dev it's been fixed in a yet-to-be-released version of
PCRE (presumably released first as 10.31). Now it'll die with:

$ git grep -P '(*NO_JIT)hi.*there'
fatal: pcre2_jit_match failed with error code -45: bad JIT option

But the cause of the bug is in our own code dating back to my
94da9193a6 ("grep: add support for PCRE v2", 2017-06-01).

As explained at more length in the comment being added here, it isn't
sufficient to just check pcre2_config() to see whether the JIT should
be used, pcre2_pattern_info() also has to be asked.

This is something I discovered myself when fiddling around with PCRE2
verbs in patterns passed to git. I don't expect that any user of git
has encountered this given the obscurity of passing PCRE2 verbs
through to the library, along with the relative obscurity of (*NO_JIT)
itself.

1. "How am I supposed to use PCRE2 JIT in the face of (*NO_JIT) ?"
   ( &
https://lists.exim.org/lurker/thread/20171123.101502.7f0d38ca.en.html)
   on the pcre-dev mailing list

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

Incorporates feedback from Eric & Simon. Thanks both. I also amended
the commit message / comment to note that this was also a bug in PCRE2
upstream, which has been fixed after I reported it.

 grep.c  | 26 ++
 t/t7810-grep.sh |  6 ++
 2 files changed, 32 insertions(+)

diff --git a/grep.c b/grep.c
index d0b9b6cdfa..e8ae0b5d8f 100644
--- a/grep.c
+++ b/grep.c
@@ -477,6 +477,8 @@ static void compile_pcre2_pattern(struct grep_pat *p, const 
struct grep_opt *opt
int options = PCRE2_MULTILINE;
const uint8_t *character_tables = NULL;
int jitret;
+   int patinforet;
+   size_t jitsizearg;
 
assert(opt->pcre2);
 
@@ -511,6 +513,30 @@ static void compile_pcre2_pattern(struct grep_pat *p, 
const struct grep_opt *opt
jitret = pcre2_jit_compile(p->pcre2_pattern, 
PCRE2_JIT_COMPLETE);
if (jitret)
die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", 
p->pattern, jitret);
+
+   /*
+* The pcre2_config(PCRE2_CONFIG_JIT, ...) call just
+* tells us whether the library itself supports JIT,
+* but to see whether we're going to be actually using
+* JIT we need to extract PCRE2_INFO_JITSIZE from the
+* pattern *after* we do pcre2_jit_compile() above.
+*
+* This is because if the pattern contains the
+* (*NO_JIT) verb (see pcre2syntax(3))
+* pcre2_jit_compile() will exit early with 0. If we
+* then proceed to call pcre2_jit_match() further down
+* the line instead of pcre2_match() we'll either
+* segfault (pre PCRE 10.31) or run into a fatal error
+* (post PCRE2 10.31)
+*/
+   patinforet = pcre2_pattern_info(p->pcre2_pattern, 
PCRE2_INFO_JITSIZE, &jitsizearg);
+   if (patinforet)
+   BUG("pcre2_pattern_info() failed: %d", patinforet);
+   if (jitsizearg == 0) {
+   p->pcre2_jit_on = 0;
+   return;
+   }
+
p->pcre2_jit_stack = pcre2_jit_stack_create(1, 1024 * 1024, 
NULL);
if (!p->pcre2_jit_stack)
die("Couldn't allocate PCRE2 JIT stack");
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 2a6679c2f5..c8ff50cc30 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1110,6 +1110,12 @@ test_expect_success PCRE 'grep -P pattern' '
test_cmp expected actual
 '
 
+test_expect_success LIBPCRE2 "grep -P with (*NO_JIT) doesn't error out" '
+   git grep -P "(*NO_JIT)\p{Ps}.*?\p{Pe}" hello.c >actual &&
+   test_cmp expected actual
+
+'
+
 test_expect_success !PCRE 'grep -P pattern errors without PCRE' '
test_must_fail git grep -P "foo.*bar"
 '
-- 
2.15.0.403.gc27cc4dac6

RE: Re: Unify annotated and non-annotated tags

2017-11-23 Thread Randall S. Becker

On 2017-11-23 02:31 (GMT-05:00) anatoly techtonik wrote
>Subject: Re: Unify annotated and non-annotated tags 
>On Sat, Nov 11, 2017 at 5:06 AM, Junio C Hamano  wrote:
>> Igor Djordjevic  writes:
>>
>>> If you would like to mimic output of "git show-ref", repeating
>>> commits for each tag pointing to it and showing full tag name as
>>> well, you could do something like this, for example:
>>>
>>>   for tag in $(git for-each-ref --format="%(refname)" refs/tags)
>>>   do
>>>   printf '%s %s\n' "$(git rev-parse $tag^0)" "$tag"
>>>   done
>>>
>>>
>>> Hope that helps a bit.
>>
>> If you use for-each-ref's --format option, you could do something
>> like (pardon a long line):
>>
>> git for-each-ref 
>> --format='%(if)%(*objectname)%(then)%(*objectname)%(else)%(objectname)%(end) 
>> %(refname)' refs/tags
>>
>> without any loop, I would think.
>Thanks. That helps.
>So my proposal is to get rid of non-annotated tags, so to get all
>tags with commits that they point to, one would use:
>git for-each-ref --format='%(*objectname) %(refname)' refs/tags>
>For so-called non-annotated tags just leave the message empty.
>I don't see why anyone would need non-annotated tags though.

I have seen non-annotated tags used in automations (not necessarily well 
written ones) that create tags as a record of automation activity. I am not 
sure we should be writing off the concept of unannotated tags entirely. This 
may cause breakage based on existing expectations of how tags work at present. 
My take is that tags should include whodunnit, even if it's just the version of 
the automation being used, but I don't always get to have my wishes fulfilled. 
In essence, whatever behaviour a non-annotated tag has now may need to be 
emulated in future even if reconciliation happens. An option to preserve empty 
tag compatibility with pre-2.16 behaviour, perhaps? Sadly, I cannot supply 
examples of this usage based on a human memory page-fault and NDAs.

Cheers,
Randall

-- Brief whoami: NonStop&UNIX developer since approximately 
UNIX(421664400)/NonStop(2112884442)
-- In my real life, I talk too much.

RFC: Native clean/smudge filter for UTF-16 files

2017-11-23 Thread Lars Schneider

Hi,

I am working with a team that owns a repository with lots of UTF-16 files.
Converting these files to UTF-8 is no good option as downstream applications
require the UTF-16 encoding. Keeping the files in UTF-16 is no good option
either as Git and Git related tools (e.g. GitHub) consider the files binary
and consequently do not render diffs.

The obvious solution is to setup a clean/smudge filter like this [1]:
[filter "winutf16"]
clean = iconv -f utf-16 -t utf-8
smudge = iconv -f utf-8 -t utf-16

In general this works well but the "per-file" clean/smudge process invocation 
can be slow for many files. I could apply the same trick that we used for Git
LFS and write a iconv that processes all files with a single invocation (see
"edcc85814c convert: add filter..process option" [2]). 

Alternatively, I could add a native attribute to Git that translates UTF-16 
to UTF-8 and back. A conversion function is already available in "mingw.h" [3]
on Windows. Limiting this feature to Windows wouldn't be a problem from my
point of view as UTF-16 is only relevant on Windows anyways. The attribute 
could look like this:

*.txttext encoding=utf-16

There was a previous discussion on the topic and Jonathan already suggested
a "native" clean/smudge filter in 2010 [4]. Also the "encoding" attribute
is already present but, as far as I can tell, is only used by the git gui
for viewing [5].

Do you think a patch that converts UTF-16 files to UTF-8 via an attribute
"encoding=utf-16" on Windows would have a chance to get accepted?

Thanks,
Lars

[1] https://github.com/msysgit/msysgit/issues/113#issue-13142846
[2] https://github.com/git/git/commit/edcc85814c87ebd7f3b1b7d3979fac3dfb84d308
[3] 
https://github.com/git/git/blob/14c63a9dc093d6738454f6369a4f5663ca732cf7/compat/mingw.h#L501-L533
[4] https://public-inbox.org/git/20101022195331.GA12014@burratino/
[5] https://github.com/git/git/commit/1ffca60f0b0395e1e593e64d66e7ed3c47d8517e

[PATCH] grep: Add option --max-line-len

2017-11-23 Thread Marc-Antoine Ruel

This tells git grep to skip files longer than a specified length,
which is often the result of generators and not actual source files.

Signed-off-by: Marc-Antoine Ruel 
---
 Documentation/git-grep.txt | 5 +
 builtin/grep.c | 2 ++
 grep.c | 4 
 grep.h | 1 +
 t/t7810-grep.sh| 5 +
 5 files changed, 17 insertions(+)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 18b494731..75081defb 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -10,6 +10,7 @@ SYNOPSIS
 
 [verse]
 'git grep' [-a | --text] [-I] [--textconv] [-i | --ignore-case] [-w | 
--word-regexp]
+  [-M | --max-line-len ]
   [-v | --invert-match] [-h|-H] [--full-name]
   [-E | --extended-regexp] [-G | --basic-regexp]
   [-P | --perl-regexp]
@@ -127,6 +128,10 @@ OPTIONS
beginning of a line, or preceded by a non-word character; end at
the end of a line or followed by a non-word character).
 
+-M::
+--max-line-len::
+   Match the pattern only for line shorter or equal to this length.
+
 -v::
 --invert-match::
Select non-matching lines.
diff --git a/builtin/grep.c b/builtin/grep.c
index 5a6cfe6b4..cc5c70be5 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -796,6 +796,8 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
N_("case insensitive matching")),
OPT_BOOL('w', "word-regexp", &opt.word_regexp,
N_("match patterns only at word boundaries")),
+   OPT_INTEGER('M', "max-line-len", &opt.max_line_length,
+   N_("ignore lines longer than ")),
OPT_SET_INT('a', "text", &opt.binary,
N_("process binary files as text"), GREP_BINARY_TEXT),
OPT_SET_INT('I', NULL, &opt.binary,
diff --git a/grep.c b/grep.c
index d0b9b6cdf..881078b82 100644
--- a/grep.c
+++ b/grep.c
@@ -36,6 +36,7 @@ void init_grep_defaults(void)
opt->relative = 1;
opt->pathname = 1;
opt->max_depth = -1;
+   opt->max_line_length = -1;
opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
color_set(opt->color_context, "");
color_set(opt->color_filename, "");
@@ -151,6 +152,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
opt->pattern_type_option = def->pattern_type_option;
opt->linenum = def->linenum;
opt->max_depth = def->max_depth;
+   opt->max_line_length = def->max_line_length;
opt->pathname = def->pathname;
opt->relative = def->relative;
opt->output = def->output;
@@ -1273,6 +1275,8 @@ static int match_line(struct grep_opt *opt, char *bol, 
char *eol,
struct grep_pat *p;
regmatch_t match;
 
+   if (opt->max_line_length > 0 && eol-bol > opt->max_line_length)
+   return 0;
if (opt->extended)
return match_expr(opt, bol, eol, ctx, collect_hits);
 
diff --git a/grep.h b/grep.h
index 399381c90..0e76c0a19 100644
--- a/grep.h
+++ b/grep.h
@@ -151,6 +151,7 @@ struct grep_opt {
int null_following_name;
int color;
int max_depth;
+   int max_line_length;
int funcname;
int funcbody;
int extended_regexp_option;
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 2a6679c2f..c514bd388 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -766,6 +766,11 @@ test_expect_success 'grep -W shows no trailing empty 
lines' '
test_cmp expected actual
 '
 
+test_expect_success 'grep skips long lines' '
+   git grep -M18 -W include >actual &&
+   test_cmp expected actual
+'
+
 cat >expected <

Re: Changing encoding of a file : What should happen to CRLF in file ?

2017-11-23 Thread Ashish Negi

Thanks for confirming.
Is it possible to track this via a bug number ?
It will help me to try out the fix when its available.


On Thu, Nov 16, 2017 at 9:45 PM, Torsten Bögershausen  wrote:
> On Thu, Nov 16, 2017 at 12:35:33AM +0530, Ashish Negi wrote:
>> On windows :
>> > git --version
>> git version 2.14.2.windows.2
>>
>> On linux :
>> > git --version
>> git version 2.7.4
>>
>> I would like to understand the solution :
>> If i understood it correctly : it removes file_name.txt from index, so
>> git forgets about it.
>> we then add the file again after changing encoding. This time, git
>> takes it as utf-8 file and converts crlf to lf when storing it
>> internally.
>> Right ?
>
> Yes, exactly.
> (In a coming release of Git there will be a "git add --renormalize 
> " )
>
>>
>> Thank you for the support.
>>
>
> Thanks for a clean bug report.
> Actually it is a bug, I put it on my to do list
>
>

I wait for your prompt response.

2017-11-23 Thread SAM AZADA

Good day,

I am Mr. Sam Azada from Burkina Faso  a Minister confide on me to look
for foreign partner who will assist him to invest the sum of  Fifty
Million  Dollars  ($50,000,000) in your country.

He has investment interest in mining, exotic properties for commercial
resident, development properties, hotels and any other viable
investment opportunities in your country based on your recommendation
will be highly welcomed.

Hence your co -operation is highly needed to actualize this investment project

I wait for your prompt response.

Sincerely yours

Mr Sam Azada.

Problem installing Git

2017-11-23 Thread Phil Martel


Hi,

I'm trying to install Git-2.15.0-64-bit.exe onto my Windows 10 machine.  
When I run this installer program no matter what options I try or 
whether I run as administrator it ends up with an error box saying "The 
drive or UNC share you selected does not exist or is not accessible. 
Please select another".  I do not see any way of selecting a drive.  Any 
suggestions?


Best wishes,

--Phil Martel

Re: [PATCH] grep: Add option --max-line-len

2017-11-23 Thread Eric Sunshine

On Thu, Nov 23, 2017 at 10:41 AM, Marc-Antoine Ruel  wrote:
> This tells git grep to skip files longer than a specified length,

s/files/lines/

> which is often the result of generators and not actual source files.
>
> Signed-off-by: Marc-Antoine Ruel 
> ---
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> @@ -127,6 +128,10 @@ OPTIONS
> +-M::
> +--max-line-len::
> +   Match the pattern only for line shorter or equal to this length.

This documentation doesn't explain what it means by a line's length.
This implementation seems to take into consideration only the line's
byte count, however, given that displayed lines are normally
tab-expanded, people might intuitively expect that expansion to count
toward the length. A similar question arises for Unicode characters.

Should this option take tab-expansion and Unicode into account?
Regardless of the answer to that question, the documentation should
make it clear what "line length" means.

> diff --git a/builtin/grep.c b/builtin/grep.c
> @@ -796,6 +796,8 @@ int cmd_grep(int argc, const char **argv, const char 
> *prefix)
> N_("case insensitive matching")),
> OPT_BOOL('w', "word-regexp", &opt.word_regexp,
> N_("match patterns only at word boundaries")),
> +   OPT_INTEGER('M', "max-line-len", &opt.max_line_length,
> +   N_("ignore lines longer than ")),

On this project, it is typical to have only the long form of an option
name when the option is first implemented so as not to squat on one of
the relatively limited number of short option names. Only after it is
seen that an option is popular, does it get a short name. Whether this
actually matters in this case, I don't know, but is something to take
into consideration.

Why the choice of 'M'? Is there precedence in other Git or Unix
commands for that name over, say, 'N' or
? Is 'M' is for "max"? If so, what would
a short name for --max-depth be (if it ever got one)? (These are
somewhat rhetorical questions, but I'm interested in the answers if
you have any.)

> diff --git a/grep.c b/grep.c
> @@ -1273,6 +1275,8 @@ static int match_line(struct grep_opt *opt, char *bol, 
> char *eol,
> +   if (opt->max_line_length > 0 && eol-bol > opt->max_line_length)
> +   return 0;

If the user specifies "-M0", should that error out or at least warn
the user that the value is non-sensical? What about -1, etc.? (These
are UX-related questions; the implementation obviously doesn't care
one way or the other.)

> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> @@ -766,6 +766,11 @@ test_expect_success 'grep -W shows no trailing empty 
> lines' '
> +test_expect_success 'grep skips long lines' '
> +   git grep -M18 -W include >actual &&
> +   test_cmp expected actual
> +'

Meh, what is this actually testing? The output of "git grep -M18 -W
include" is no different that the output of "git grep -W include" in
the test just above this one. And, why is -W in this test at all? Its
presence confuses the reader into thinking that it is somehow
significant. I would have expected this test to look like this:

cat >expected <<-\EOF
hello.c:#include 
EOF

test_expect_success 'grep -M skips long lines' '
git grep -M18 include >actual &&
test_cmp expected actual
'

Re: RFC: Native clean/smudge filter for UTF-16 files

2017-11-23 Thread Torsten Bögershausen

On Thu, Nov 23, 2017 at 04:18:59PM +0100, Lars Schneider wrote:
> Hi,
> 
> I am working with a team that owns a repository with lots of UTF-16 files.
> Converting these files to UTF-8 is no good option as downstream applications
> require the UTF-16 encoding. Keeping the files in UTF-16 is no good option
> either as Git and Git related tools (e.g. GitHub) consider the files binary
> and consequently do not render diffs.
> 
> The obvious solution is to setup a clean/smudge filter like this [1]:
> [filter "winutf16"]
> clean = iconv -f utf-16 -t utf-8
> smudge = iconv -f utf-8 -t utf-16
> 
> In general this works well but the "per-file" clean/smudge process invocation 
> can be slow for many files. I could apply the same trick that we used for Git
> LFS and write a iconv that processes all files with a single invocation (see
> "edcc85814c convert: add filter..process option" [2]). 
> 
> Alternatively, I could add a native attribute to Git that translates UTF-16 
> to UTF-8 and back. A conversion function is already available in "mingw.h" [3]
> on Windows. Limiting this feature to Windows wouldn't be a problem from my
> point of view as UTF-16 is only relevant on Windows anyways. The attribute 
> could look like this:
> 
> *.txttext encoding=utf-16
> 

I would probably vote for UTF-16LE.
But we can specify whatever iconv allows, see below.

[]
> Do you think a patch that converts UTF-16 files to UTF-8 via an attribute
> "encoding=utf-16" on Windows would have a chance to get accepted?

Yes.
The thing is that we have convert.c and utf8.c (reencode_string_iconv()
and possible strbuf.c, which feels that they are in charge here.

Having a path using mingw.h would mean that this is Windows only,
and that is not a good solution.
(I sometimes host files on a Linux box, export them via SAMBA to
 Mac and Windows. Having that kind of setup allows to commit
 directly on the Linux fileserver)

But even if you don't have that setup, cross platform is a must, I would say.
There may be possible optimizations using xutftowcsn() and friends under
Windows, but they may come later into the picture (or are not needed)
What file sizes are we talking about ?
100K ?
100M ?

It is even possible to hook into the streaming interface.

> 
> Thanks,
> Lars
>

Re: Changing encoding of a file : What should happen to CRLF in file ?

2017-11-23 Thread Torsten Bögershausen

On Thu, Nov 23, 2017 at 10:01:40PM +0530, Ashish Negi wrote:
> Thanks for confirming.
> Is it possible to track this via a bug number ?
> It will help me to try out the fix when its available.
> 

No worry, the fix is nearly complete and will come out in a couple of days.

Re: [PATCH v2] gitcli: tweak "man gitcli" for clarity

2017-11-23 Thread Kevin Daudt

On Thu, Nov 23, 2017 at 08:51:55AM -0500, Jeff King wrote:
> On Thu, Nov 23, 2017 at 02:45:44AM -0500, Robert P. J. Day wrote:
> 
> > > It's pretty clear to me as it is, but maybe we can write it differently.
> > > Like:
> > >
> > >   Without a disambiguating `--`, Git makes a reasonable guess. If it
> > >   cannot guess (because your request is ambiguous), then it will error
> > >   out.
> > 
> >   ok, i'll give this another try, given that there are two independent
> > points to be made here:
> > 
> > 1) even without the "--", git can generally parse the command and do
> > the right thing (or do a *valid* thing, given its heuristics)
> > 
> > 2) occasionally, without the "--", the command is really and truly
> > ambiguous, at which point git will fail and tell you to disambiguate
> > 
> >   not the wording i will use, but can we agree that those are the two
> > points to be made here?
> 
> Yep, I think so.
> 
> -Peff

Just for completeness, as it is somewhat covered by point 1 already, but
there are cases where there is no real ambiguity but you are required to
add '--' to tell git that it should not look for the file in the working
tree:

  $ git show abc123 deleted_file.txt
  fatal: ambiguous argument 'deleted_file.txt':
  unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git  [...] -- [...]'

There might be good reasons why this is, but I don't consider this to be
actually ambiguous: there is no branch called 'deleted_file.txt' and git
could know that the files exists in the mentioned commit, so it should
be pretty clear what is meant.

Might be worth documenting this.

Kevin

Re: Unify annotated and non-annotated tags

2017-11-23 Thread Thomas Braun

Am 23.11.2017 um 16:08 schrieb Randall S. Becker:

[...]

>> So my proposal is to get rid of non-annotated tags, so to get all
>> tags with commits that they point to, one would use:
>> git for-each-ref --format='%(*objectname) %(refname)' refs/tags>
>> For so-called non-annotated tags just leave the message empty.
>> I don't see why anyone would need non-annotated tags though.

I'm using exclusively non-annotated tags. Why? Because that is the
default of "git tag" and I also, until now, never needed more than
giving a commit a "name".

Many annotated tag messages just repeat the tag name itself, so the
message seems quite redundant to me.

Just my 2 cents,
Thomas

Re: Problem installing Git

2017-11-23 Thread Igor Djordjevic

[ +Cc:  Git for Windows mailing list ]

Hi Phil,

On 23/11/2017 19:51, Phil Martel wrote:
> I'm trying to install Git-2.15.0-64-bit.exe onto my Windows 10 
> machine.  When I run this installer program no matter what options I 
> try or whether I run as administrator it ends up with an error box 
> saying "The drive or UNC share you selected does not exist or is not 
> accessible. Please select another".  I do not see any way of 
> selecting a drive.  Any suggestions?

>From what I could Google around, this seems to be (Inno Setup?) 
installation related issue...?

Do you already have "Git for Windows" installed? If so, does it work 
if you try uninstalling it first?

Regards, Buga

p.s. Note the existence of "Git for Windows"[1] specific mailing list 
as well, where this issue might belong better.

[1] git-for-wind...@googlegroups.com

Re: Delivery Status Notification (Failure)

2017-11-23 Thread Igor Djordjevic

On 23/11/2017 22:30, Mail Delivery Subsystem wrote:
> Hello igor.d.djordje...@gmail.com,
> 
> We're writing to let you know that the group you tried to contact 
> (git-for-windows) may not exist, or you may not have permission to post 
> messages to the group. A few more details on why you weren't able to post:
> 
>  * You might have spelled or formatted the group name incorrectly.
>  * The owner of the group may have removed this group.
>  * You may need to join the group before receiving permission to post.
>  * This group may not be open to posting.
> 
> If you have questions related to this or any other Google Group, visit the 
> Help Center at https://groups.google.com/support/.
> 
> Thanks,
> 
> Google Groups

Well, seems that +Cc-ing wasn`t the most successful one... :P

Re: [PATCH v3 00/33] Add directory rename detection to git

2017-11-23 Thread Elijah Newren

On Thu, Nov 23, 2017 at 3:52 AM, Adam Dinwoodie  wrote:
> On Tuesday 21 November 2017 at 12:00 am -0800, Elijah Newren wrote:
>> 
>>
>>  merge-recursive.c   | 1243 +++-
>>  merge-recursive.h   |   17 +
>>  t/t3501-revert-cherry-pick.sh   |5 +-
>>  t/t6043-merge-rename-directories.sh | 3821 
>> +++
>>  t/t7607-merge-overwrite.sh  |7 +-
>>  unpack-trees.c  |4 +-
>>  unpack-trees.h  |4 +
>>  7 files changed, 4985 insertions(+), 116 deletions(-)
>>  create mode 100755 t/t6043-merge-rename-directories.sh
>
> The new t6043.44 introduced in this branch is failing on my Cygwin
> system.  I can't immeditely see what's causing the failure, but I've
> copied the relevant verbose + shell tracing output below in the hope it
> makes more sense to you:

Thanks for reporting.  Unfortunately, I have been unable to locate or
create a cygwin system on which to replicate the testing.  Valgrind is
running clean for me, and I find it interesting that the output shows
it did detect the rename/rename(2to1) conflict on y/d for you, and the
index has all the right values, but somehow the 'test ! -f y/d' line
isn't passing for you.

Out of curiosity:
  * What is in the y/ directory in that test?  (That is, the y/
directory within the 7b/ directory)
  * What are the full contents of the 'out' file?  (again, within the
7b/ directory)
  * What commit have you actually built and run with (maybe I'm trying
with something slightly different?)

Re: Problem installing Git

2017-11-23 Thread Phil Martel


Thank you.  I'll look into it.

Best wishes,

--Phil Martel


On 11/23/2017 4:30 PM, Igor Djordjevic wrote:

[ +Cc:  Git for Windows mailing list ]

Hi Phil,

On 23/11/2017 19:51, Phil Martel wrote:

I'm trying to install Git-2.15.0-64-bit.exe onto my Windows 10
machine.  When I run this installer program no matter what options I
try or whether I run as administrator it ends up with an error box
saying "The drive or UNC share you selected does not exist or is not
accessible. Please select another".  I do not see any way of
selecting a drive.  Any suggestions?

 From what I could Google around, this seems to be (Inno Setup?)
installation related issue...?

Do you already have "Git for Windows" installed? If so, does it work
if you try uninstalling it first?

Regards, Buga

p.s. Note the existence of "Git for Windows"[1] specific mailing list
as well, where this issue might belong better.

[1] git-for-wind...@googlegroups.com

Re: Problem installing Git (followup)

2017-11-23 Thread Phil Martel


Hi Buga,

That solved my problem.  I apparently had enough cruft left over from a 
hard disk issue for Windows to think I still had a copy of Git 
installed.  when I got rid of it, the new version installed with no 
problems.


Thanks again,

--Phil Martel


On 11/23/2017 4:30 PM, Igor Djordjevic wrote:

[ +Cc:  Git for Windows mailing list ]

Hi Phil,

On 23/11/2017 19:51, Phil Martel wrote:

I'm trying to install Git-2.15.0-64-bit.exe onto my Windows 10
machine.  When I run this installer program no matter what options I
try or whether I run as administrator it ends up with an error box
saying "The drive or UNC share you selected does not exist or is not
accessible. Please select another".  I do not see any way of
selecting a drive.  Any suggestions?

 From what I could Google around, this seems to be (Inno Setup?)
installation related issue...?

Do you already have "Git for Windows" installed? If so, does it work
if you try uninstalling it first?

Regards, Buga

p.s. Note the existence of "Git for Windows"[1] specific mailing list
as well, where this issue might belong better.

[1] git-for-wind...@googlegroups.com

[PATCH] http-backend: respect CONTENT_LENGTH as specified by rfc3875

2017-11-23 Thread Max Kirillov

http-backend reads whole input until EOF. However, the RFC 3875 specifies
that a script must read only as many bytes as specified by CONTENT_LENGTH
environment variable. This causes hang under IIS/Windows, for example.

Make http-backend read only CONTENT_LENGTH bytes, if it's defined, rather than
the whole input until EOF. If the varibale is not defined, keep older behavior
of reading until EOF because it is used to support chunked transfer-encoding.

Signed-off-by: Florian Manschwetus 
Authored-by: Florian Manschwetus 
Fixed-by: Max Kirillov 
Signed-off-by: Max Kirillov 
---
Hi

I came across this issue, and I think is should be good to restore the patch.
It is basically same but I squashed them, fixed the thing you mentioned and
also some trivial build failures (null -> NULL and missing return from the 
wrapper).
I hope I marked it correctly in the trailers.
 config.c   |  8 +++
 config.h   |  1 +
 http-backend.c | 72 +-
 3 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 231f9a750b..925bb65dfa 100644
--- a/config.c
+++ b/config.c
@@ -1525,6 +1525,14 @@ unsigned long git_env_ulong(const char *k, unsigned long 
val)
return val;
 }
 
+ssize_t git_env_ssize_t(const char *k, ssize_t val)
+{
+   const char *v = getenv(k);
+   if (v && !git_parse_ssize_t(v, &val))
+   die("failed to parse %s", k);
+   return val;
+}
+
 int git_config_system(void)
 {
return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
diff --git a/config.h b/config.h
index 0352da117b..947695c304 100644
--- a/config.h
+++ b/config.h
@@ -74,6 +74,7 @@ extern int git_config_rename_section_in_file(const char *, 
const char *, const c
 extern const char *git_etc_gitconfig(void);
 extern int git_env_bool(const char *, int);
 extern unsigned long git_env_ulong(const char *, unsigned long);
+extern ssize_t git_env_ssize_t(const char *, ssize_t);
 extern int git_config_system(void);
 extern int config_error_nonbool(const char *);
 #if defined(__GNUC__)
diff --git a/http-backend.c b/http-backend.c
index 519025d2c3..317b99b87c 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -280,7 +280,7 @@ static struct rpc_service *select_service(struct strbuf 
*hdr, const char *name)
  * hit max_request_buffer we die (we'd rather reject a
  * maliciously large request than chew up infinite memory).
  */
-static ssize_t read_request(int fd, unsigned char **out)
+static ssize_t read_request_eof(int fd, unsigned char **out)
 {
size_t len = 0, alloc = 8192;
unsigned char *buf = xmalloc(alloc);
@@ -317,6 +317,76 @@ static ssize_t read_request(int fd, unsigned char **out)
}
 }
 
+/*
+ * replacement for original read_request, now renamed to read_request_eof,
+ * honoring given content_length (req_len),
+ * provided by new wrapper function read_request
+ */
+static ssize_t read_request_fix_len(int fd, size_t req_len, unsigned char 
**out)
+{
+   unsigned char *buf = NULL;
+   size_t len = 0;
+
+   /* check request size */
+   if (max_request_buffer < req_len) {
+   die("request was larger than our maximum size (%lu);"
+   " try setting GIT_HTTP_MAX_REQUEST_BUFFER",
+   max_request_buffer);
+   }
+
+   if (req_len <= 0) {
+   *out = NULL;
+   return 0;
+   }
+
+   /* allocate buffer */
+   buf = xmalloc(req_len);
+
+
+   while (1) {
+   ssize_t cnt;
+
+   cnt = read_in_full(fd, buf + len, req_len - len);
+   if (cnt < 0) {
+   free(buf);
+   return -1;
+   }
+
+   /* partial read from read_in_full means we hit EOF */
+   len += cnt;
+   if (len < req_len) {
+   /* TODO request incomplete?? */
+   /* maybe just remove this block and condition along 
with the loop, */
+   /* if read_in_full is prooven reliable */
+   *out = buf;
+   return len;
+   } else {
+   /* request complete */
+   *out = buf;
+   return len;
+   
+   }
+   }
+}
+
+/**
+ * wrapper function, whcih determines based on CONTENT_LENGTH value,
+ * to
+ * - use old behaviour of read_request, to read until EOF
+ * => read_request_eof(...)
+ * - just read CONTENT_LENGTH-bytes, when provided
+ * => read_request_fix_len(...)
+ */
+static ssize_t read_request(int fd, unsigned char **out)
+{
+   /* get request size */
+   ssize_t req_len = git_env_ssize_t("CONTENT_LENGTH", -1);
+   if (req_len < 0)
+   return read_request_eof(fd, out);
+   else
+   return read_request_fix_len(fd, req_len, out);
+}
+
 static void inflate_request(const char *prog_name, int out, int bu

Re: Problem installing Git (followup)

2017-11-23 Thread Igor Djordjevic

Hi Phil,

On 24/11/2017 00:44, Phil Martel wrote:
> On 11/23/2017 4:30 PM, Igor Djordjevic wrote:
>> On 23/11/2017 19:51, Phil Martel wrote:
>>> I'm trying to install Git-2.15.0-64-bit.exe onto my Windows 10 
>>> machine.  When I run this installer program no matter what 
>>> options I try or whether I run as administrator it ends up with 
>>> an error box saying "The drive or UNC share you selected does 
>>> not exist or is not accessible. Please select another".  I do 
>>> not see any way of selecting a drive.  Any suggestions?
>> 
>> Do you already have "Git for Windows" installed? If so, does it work
>> if you try uninstalling it first?
> 
> That solved my problem.  I apparently had enough cruft left over 
> from a hard disk issue for Windows to think I still had a copy of 
> Git installed.  when I got rid of it, the new version installed
> with no problems.
> 
> Thanks again,

No problem, it was more of a lucky shot, but glad it helped :)

Regards, Buga

p.s. When discussing here, it`s more customary to quote inline, instead of 
top-posting[1].

[1] https://en.wikipedia.org/wiki/Posting_style

Re: [PATCH v3 5/5] Testing: provide tests requiring them with ellipses after SHA-1 values

2017-11-23 Thread Junio C Hamano

"Philip Oakley"  writes:

>> We do not know until this change is released to the wild, at which
>> time we will hear noises about the lack of expected ellipses their
>> (poorly written) scripts rely on and tell them to set the workaround
>> environment variable.  We may not hear from such people at all, in
>> which case we may be able to remove it within a year or so, but it
>> is too early to tell.
>
> I was wondering if there should be a small documentation change for
> the env variable and states that it is a temporary measure for short
> term compatibility. Though I'm not sure where the 'right' place would
> be for it.

We list environment variables that applies git-wide in git(1), I
think.  If this is going to be only applicable for the "diff"
family, Documentation/diff-format.txt may be a better place, and
in that case the name of the environment variable may need to
include a word DIFF somewhere.

Re: [PATCH v2] gitcli: tweak "man gitcli" for clarity

2017-11-23 Thread Junio C Hamano

Kevin Daudt  writes:

> Just for completeness, as it is somewhat covered by point 1 already, but
> there are cases where there is no real ambiguity but you are required to
> add '--' to tell git that it should not look for the file in the working
> tree:
>
>   $ git show abc123 deleted_file.txt
>   fatal: ambiguous argument 'deleted_file.txt':
>   unknown revision or path not in the working tree.
>   Use '--' to separate paths from revisions, like this:
>   'git  [...] -- [...]'
>
> There might be good reasons why this is, but I don't consider this to be
> actually ambiguous: there is no branch called 'deleted_file.txt' and git
> could know that the files exists in the mentioned commit, so it should
> be pretty clear what is meant.

I know you understand all of this, but your "git could" needs to be
examined carefully.

The same can be said for "git log master deleted_file.txt" whose
intention is to refer to a file that the user _thinks_ existed once
in the past but may never have been there.

Surely, "git could" know if it runs "git log master" to the root of
the history to see if it ever existed.  Also, against "git log
master next deleted_file.txt", "git could" do the same traversal from
two tips of the history to check that.  But it requires us to do the
same work twice to materialize that "git could".

Actually the second example is a lot worse (and that is why I am
bringing it up).  If git does spend cycles to realize that "git
could", for consistency, it must also check if "next" is unambiguous
between a path or a rev, i.e. it must dig history from "master" and
see if "next" appears as a path ever in the history, and if so, die
with "ambiguous argument".

The message "unknown revision or path not in the working tree."
clearly shows why we decided to ask.  Even if deleted_file.txt could
have been a valid path somewhere in the history, "not in the working
tree" is the condition we check to see if it is unambiguous.  And we
stop and ask when it cannot be proven cheaply that it is not.

"git stops and asks when ambiguous" is a white lie to explain the
safety feature in a way that is easier to understand.  If somebody
wants to make the documentation more "technically correct", the
condition is not "when ambiguous", but "when git cannot prove it is
not unambiguous with inexpensive tests".

IIRC, I think Peff mentioned in this or nearby thread that we do not
want to spell out the implementation details and leave the exact
conditions vague on purpose, and that principle probably would apply
here, too.  We might later discover that checking not just the working
tree but also for the index to see if user meant by deleted_file.txt
a path and not a revision is not overly expensive, for example, and
at that point, we are still trying to prove that it is unambiguously
a path and not a rev with inexpensive tests.

Re: [PATCH] http-backend: respect CONTENT_LENGTH as specified by rfc3875

2017-11-23 Thread Eric Sunshine

On Thu, Nov 23, 2017 at 6:45 PM, Max Kirillov  wrote:
> [PATCH] http-backend: respect CONTENT_LENGTH as specified by rfc3875

The "RFC" seems to be missing from the subject line of this unpolished patch.

> http-backend reads whole input until EOF. However, the RFC 3875 specifies
> that a script must read only as many bytes as specified by CONTENT_LENGTH
> environment variable. This causes hang under IIS/Windows, for example.

By "_this_ causes a hang", I presume you mean "not respecting
CONTENT_LENGTH causes a hang"? Perhaps that could be spelled out
explicitly.

> Make http-backend read only CONTENT_LENGTH bytes, if it's defined, rather than
> the whole input until EOF. If the varibale is not defined, keep older behavior

s/varibale/variable/

> of reading until EOF because it is used to support chunked transfer-encoding.
>
> Signed-off-by: Florian Manschwetus 
> Authored-by: Florian Manschwetus 
> Fixed-by: Max Kirillov 
> Signed-off-by: Max Kirillov 
> ---
> diff --git a/http-backend.c b/http-backend.c
> @@ -317,6 +317,76 @@ static ssize_t read_request(int fd, unsigned char **out)
> +/*
> + * replacement for original read_request, now renamed to read_request_eof,
> + * honoring given content_length (req_len),
> + * provided by new wrapper function read_request
> + */

This comment has value only to someone who knew what the code was like
before this change, and it merely repeats what is already implied by
the commit message, rather than providing any valuable information
about this new function itself. Therefore, it should be dropped.

> +static ssize_t read_request_fix_len(int fd, size_t req_len, unsigned char 
> **out)

Wrong data type: s/size_t req_len/ssize_t req_len/

Also: s/fix/fixed/

> +{
> +   unsigned char *buf = NULL;
> +   size_t len = 0;
> +
> +   /* check request size */

Comment merely repeats what code says, thus has no value. Please drop.

> +   if (max_request_buffer < req_len) {
> +   die("request was larger than our maximum size (%lu);"
> +   " try setting GIT_HTTP_MAX_REQUEST_BUFFER",
> +   max_request_buffer);

This error message neglects to say what the request size was. Such
information would be useful given that it suggests bumping
GIT_HTTP_MAX_REQUEST_BUFFER to a larger value.

> +   }
> +
> +   if (req_len <= 0) {
> +   *out = NULL;
> +   return 0;
> +   }
> +
> +   /* allocate buffer */

Drop valueless comment.

> +   buf = xmalloc(req_len);
> +
> +

Style: Too many blank lines.

> +   while (1) {
> +   ssize_t cnt;
> +
> +   cnt = read_in_full(fd, buf + len, req_len - len);
> +   if (cnt < 0) {
> +   free(buf);
> +   return -1;
> +   }
> +
> +   /* partial read from read_in_full means we hit EOF */
> +   len += cnt;
> +   if (len < req_len) {
> +   /* TODO request incomplete?? */
> +   /* maybe just remove this block and condition along 
> with the loop, */
> +   /* if read_in_full is prooven reliable */

s/prooven/proven/

> +   *out = buf;
> +   return len;
> +   } else {
> +   /* request complete */
> +   *out = buf;
> +   return len;
> +
> +   }
> +   }

What is the purpose of the while(1) loop? Every code path inside the
loop returns, so it will never execute more than once. Likewise, why
is 'len' needed?

Rather than writing an entirely new "read" function, how about just
modifying the existing read_request() to optionally limit the read to
a specified number of bytes?

> +}
> +
> +/**
> + * wrapper function, whcih determines based on CONTENT_LENGTH value,

s/whcih/which/

Also, the placement of commas needs some attention.

> + * to
> + * - use old behaviour of read_request, to read until EOF
> + * => read_request_eof(...)
> + * - just read CONTENT_LENGTH-bytes, when provided
> + * => read_request_fix_len(...)
> + */

When talking about "old behavior", this comment is repeating
information more suitable to the commit message (and effectively
already covered there); information which only has value to someone
who knew what the old code/behavior was like. The rest of this comment
is merely repeating what the code itself already says, thus adds no
value, so should be dropped.

> +static ssize_t read_request(int fd, unsigned char **out)
> +{
> +   /* get request size */

Drop valueless comment.

> +   ssize_t req_len = git_env_ssize_t("CONTENT_LENGTH", -1);
> +   if (req_len < 0)
> +   return read_request_eof(fd, out);
> +   else
> +   return read_request_fix_len(fd, req_len, out);
> +}

Re: [PATCH] grep: Add option --max-line-len

2017-11-23 Thread Junio C Hamano

Marc-Antoine Ruel  writes:

> This tells git grep to skip files longer than a specified length,
> which is often the result of generators and not actual source files.
>
> ...
> +-M::
> +--max-line-len::
> + Match the pattern only for line shorter or equal to this length.
> +

All the excellent review comments from Eric I agree with.

With the name of the option and the above end-user facing
description, it is very clear that the only thing this feature does
is to declare that an overlong line does _not_ match when trying to
check against any pattern.

That is a much clearer definition and description than random new
features people propose here (and kicked back by reviewers, telling
them to make the specification clearer), and I'd commend you for that.

But it still leaves at least one thing unclear.  How should it
interact with "-v"?  If we consider an overlong line never matches,
would "git grep -v " should include the line in its output?

Speaking of the output, it also makes me wonder if the feature
really wants to include an overlong line as a context line when
showing a near-by line that matches the pattern when -A/-B/-C/-W
options are in use. Even though it is clear that it does from the
above description, is it really the best thing the feature can do to
help the end users?

Which leads me to suspect that this "feature" might not be the ideal
you wanted to achive, but is an approximate substitution that you
found is "good enough" to simulate what the real thing you wanted to
do, especially when I go back and read the justfication in the
proposed log message that talks about "result of generators".

Isn't it a property of the entire file, not individual lines, if you
find it uninteresting to see reported by "git grep"?  I cannot shake
the suspicion that this feature happened to have ended up in this
shape, instead of "ignore a file with a line this long", only
because your starting point was to use "has overlong lines" as the
heuristic for "not interesting", and because "git grep" code is not
structured to first scan the entire file to decide if it is worth
working on it, and it is extra work to restructure the codeflow to
make it so (which you avoided).

If your real motivation was either

 (1) whether the file has or does not have the pattern for certain
 class of files are uninteresting; do not even run "grep"
 processing for them; or

 (2) hits or no-hits may be intereseting but output of overlong
 lines from certain class of files I do not wish to see;

then I can think of two alternatives.

For (1), can't we tell "result of generators" and other files with
pathspec?  If so, perhaps a negative pathspec can rescue.  e.g.

git grep  -- '*.cc' ':!*-autogen.cc'

For (2), can't we model this after how users can tell "git diff"
that certain paths are not worth computing and showing textual
patches for, which is to Unset the 'diff' attribute?  When you have

*-autogen.cc-diff

in your .gitattributes, "git diff" would say "Binary files A and B
differ" instead of explaining line-by-line differences in the patch
form.  Perhaps we can also have a 'grep' attribute and squelch the
output if it is Unset?  

It is debatable but one could propose extending the use of existing
'diff' attribute to cover 'grep' too, with an argument that anything
not worth showing patch (i.e. 'diff' attribute is Unset) is not
worth showing grep hits from.

Re*: Documentation of post-receive hook

2017-11-23 Thread Junio C Hamano

Junio C Hamano  writes:

> Your suggesting to mention that particular message hints at me that
> you feel that the users may not necessarily understand that push did
> not result in any update of references on the other side when they
> see it.  If the message was clear enough to them, "when it reacts to
> push and updates" ought to be clear enough description, too.
>
> And if that indeed is the case (and I would not be surprised if it
> is, but I suspect that most users are clueful enough), it is not the
> documentation, but the "Already up-to-date" message, that needs to
> be clarified, no?

I do not know if it helps to _also_ clarify the message, but at
least, let's tie the loose end by updating the documentation.

-- >8 --
Subject: hooks doc: clarify when receive-pack invokes its hooks

The text meant to say that receive-pack runs these hooks, and only
because receive-pack is not a command the end users use every day
(ever), as an explanation also meantioned that it is run in response
to 'git push', which is an end-user facing command readers hopefully
know about.

This unfortunately gave an incorrect impression that 'git push'
always result in the hook to run.  If the refs push wanted to update
all already had the desired value, these hooks are not run.  

Explicitly mention "... and updates reference(s)" as a precondition
to avoid this ambiguity.

Helped-by: Christoph Michelbach 
Signed-off-by: Junio C Hamano 
---

 Documentation/githooks.txt | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 9565dc3fda..8f6a3cd63e 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -222,8 +222,8 @@ to the user by writing to standard error.
 pre-receive
 ~~~
 
-This hook is invoked by 'git-receive-pack' on the remote repository,
-which happens when a 'git push' is done on a local repository.
+This hook is invoked by 'git-receive-pack' when it reacts to
+'git push' and updates reference(s) in its repository.
 Just before starting to update refs on the remote repository, the
 pre-receive hook is invoked.  Its exit status determines the success
 or failure of the update.
@@ -260,8 +260,8 @@ will be set to zero, `GIT_PUSH_OPTION_COUNT=0`.
 update
 ~~
 
-This hook is invoked by 'git-receive-pack' on the remote repository,
-which happens when a 'git push' is done on a local repository.
+This hook is invoked by 'git-receive-pack' when it reacts to
+'git push' and updates reference(s) in its repository.
 Just before updating the ref on the remote repository, the update hook
 is invoked.  Its exit status determines the success or failure of
 the ref update.
@@ -305,8 +305,8 @@ unannotated tags to be pushed.
 post-receive
 
 
-This hook is invoked by 'git-receive-pack' on the remote repository,
-which happens when a 'git push' is done on a local repository.
+This hook is invoked by 'git-receive-pack' when it reacts to
+'git push' and updates reference(s) in its repository.
 It executes on the remote repository once after all the refs have
 been updated.
 
@@ -344,8 +344,8 @@ will be set to zero, `GIT_PUSH_OPTION_COUNT=0`.
 post-update
 ~~~
 
-This hook is invoked by 'git-receive-pack' on the remote repository,
-which happens when a 'git push' is done on a local repository.
+This hook is invoked by 'git-receive-pack' when it reacts to
+'git push' and updates reference(s) in its repository.
 It executes on the remote repository once after all the refs have
 been updated.
 
@@ -375,8 +375,8 @@ for the user.
 push-to-checkout
 
 
-This hook is invoked by 'git-receive-pack' on the remote repository,
-which happens when a 'git push' is done on a local repository, when
+This hook is invoked by 'git-receive-pack' when it reacts to
+'git push' and updates reference(s) in its repository, and when
 the push tries to update the branch that is currently checked out
 and the `receive.denyCurrentBranch` configuration variable is set to
 `updateInstead`.  Such a push by default is refused if the working

Re: [PATCH v2] doc: clarify that "git bisect" accepts one or more good commits

2017-11-23 Thread Junio C Hamano

"Robert P. J. Day"  writes:

>  This command uses a binary search algorithm to find which commit in
> -your project's history introduced a bug. You use it by first telling
> -it a "bad" commit that is known to contain the bug, and a "good"
> -commit that is known to be before the bug was introduced. Then `git
> -bisect` picks a commit between those two endpoints and asks you
> +your project's history introduced a bug. You use it by first telling it
> +a "bad" commit that is known to contain the bug, and one or more "good"
> +commits that are known to be before the bug was introduced. Then `git
> +bisect` picks a commit somewhere in between those commits and asks you

Good.

> -Once you have specified at least one bad and one good commit, `git
> +Once you have specified one bad and one or more good commits, `git
>  bisect` selects a commit in the middle of that range of history,
>  checks it out, and outputs something similar to the following:

Good.

> @@ -137,7 +137,7 @@ respectively, in place of "good" and "bad". (But note 
> that you cannot
>  mix "good" and "bad" with "old" and "new" in a single session.)
>
>  In this more general usage, you provide `git bisect` with a "new"
> -commit that has some property and an "old" commit that doesn't have that
> +commit with some property and some "old" commits that don't have that
>  property. Each time `git bisect` checks out a commit, you test if that

Good.

> @@ -145,19 +145,19 @@ will report which commit introduced the property.
>
>  To use "old" and "new" instead of "good" and bad, you must run `git
>  bisect start` without commits as argument and then run the following
> -commands to add the commits:
> +commands to identify the commits:

I am not sure if this is an improvement (see below).

>
>  
> -git bisect old []
> +git bisect old [...]
>  

Good.

> -to indicate that a commit was before the sought change, or
> +to identify one or more commits before the sought change, or
>
>  
> -git bisect new [...]
> +git bisect new []
>  

Good.

> -to indicate that it was after.
> +to indicate a single commit after that change.

As to "identify", I would say it is better to consistently use
"indicate" like the original of these two hunks at the end says,
i.e. "indicate that it is bad/new (or they are good/old)".

Regarding the earlier "add the commits", I do not think the original
is confusing and any reasonable reader would get that the verb is a
casually (or "carelessly") used short-hand for "add the commits to
the set of commits the bisect algorithm cares about", and turning it
to "identify" adds much clarity.

As it is immediately followed by two illustrations to use old and
new, I would think that we could just stop the sentence at "then run
the following commands:" without saying anything else.

If you really want to phrase it differently from the two sentences
to describe use of old and new, because this is acting as a headline
for these two, perhaps it is an improvement to say something like
"then run the following commands to limit the bisection range"; that
would explain _why_ these commits are "added" and would give additional
information to the readers.

Re: [PATCH v3] doc: tweak "man gitcli" for clarity

2017-11-23 Thread Junio C Hamano

"Robert P. J. Day"  writes:

> -This manual describes the convention used throughout Git CLI.
> +This manual describes the conventions used throughout Git CLI.

OK.

>  Many commands take revisions (most often "commits", but sometimes
>  "tree-ish", depending on the context and command) and paths as their
> @@ -32,32 +32,35 @@ arguments.  Here are the rules:
> between the HEAD commit and the work tree as a whole".  You can say
> `git diff HEAD --` to ask for the latter.
>
> - * Without disambiguating `--`, Git makes a reasonable guess, but errors
> -   out and asking you to disambiguate when ambiguous.  E.g. if you have a
> -   file called HEAD in your work tree, `git diff HEAD` is ambiguous, and
> + * In cases where a Git command is truly ambiguous, Git will error out
> +   and ask you to disambiguate the command.  E.g. if you have a file
> +   called HEAD in your work tree, `git diff HEAD` is ambiguous, and
> you have to say either `git diff HEAD --` or `git diff -- HEAD` to
> disambiguate.
>  +
>  When writing a script that is expected to handle random user-input, it is
>  a good practice to make it explicit which arguments are which by placing
> -disambiguating `--` at appropriate places.
> +a disambiguating `--` at appropriate places.

The above "truly" is misleading by giving the information the other
way around.  We ask to disambiguate when we cannot readily say the
command line is *not* unambiguous.  That is different from asking
when we know it is truly ambiguous.

Also see  if you want
to know more.

>   * Many commands allow wildcards in paths, but you need to protect
> -   them from getting globbed by the shell.  These two mean different
> -   things:
> +   them from getting globbed by the shell.  The following commands have
> +   two different meanings:
>  +
>  
>  $ git checkout -- *.c
> +
>  $ git checkout -- \*.c
> +$ git checkout -- "*.c"
> +$ git checkout -- '*.c'

I do not think these two additional ones add any value.

And if you do not add these two, you do not need any of the change
we see before or after this example.  The changes you made to these
paragraphs are primarily there because you need to explain that the
latter three are equivalent to each other now because of these two
extra ones, while the original did not have to say anything like
that.

Because this is not a tutorial for shell scripting and its quoting
rules, highlighting the difference between the case where Git sees
the asterisk and you let shell to expand asterisk and do not let Git
see the asterisk _is_ important, but the fact that you can quote the
asterisk in different ways from the shell is *not* important.  We
shouldn't clutter the description with the latter, I would think.

I would however be receptive if the change were to only replace the
backslash quoting of asterisk with the one that uses a single quote
pair (and with no changes in the surrounding paragraphs).  That
change could be justified by saying that a pair of single (or
double) quotes would be more familiar for people new to the shell.

Thanks.

Re: [PATCH v3 00/33] Add directory rename detection to git

2017-11-23 Thread Elijah Newren

On Thu, Nov 23, 2017 at 2:28 PM, Elijah Newren  wrote:
> On Thu, Nov 23, 2017 at 3:52 AM, Adam Dinwoodie  wrote:
>> On Tuesday 21 November 2017 at 12:00 am -0800, Elijah Newren wrote:
>>> 
>>>
>>>  merge-recursive.c   | 1243 +++-
>>>  merge-recursive.h   |   17 +
>>>  t/t3501-revert-cherry-pick.sh   |5 +-
>>>  t/t6043-merge-rename-directories.sh | 3821 
>>> +++
>>>  t/t7607-merge-overwrite.sh  |7 +-
>>>  unpack-trees.c  |4 +-
>>>  unpack-trees.h  |4 +
>>>  7 files changed, 4985 insertions(+), 116 deletions(-)
>>>  create mode 100755 t/t6043-merge-rename-directories.sh
>>
>> The new t6043.44 introduced in this branch is failing on my Cygwin
>> system.  I can't immeditely see what's causing the failure, but I've
>> copied the relevant verbose + shell tracing output below in the hope it
>> makes more sense to you:
>
> Thanks for reporting.  Unfortunately, I have been unable to locate or
> create a cygwin system on which to replicate the testing.  Valgrind is

Nevermind; found a system that allowed me to reproduce the problem,
even if it far more wrangling than I'd like.  I believe I have a
one-line fix, but I'm worried that attempting to fully explain it will
result in a novel-length commit message.

Re: Bisect marking new commits incorrectly

2017-11-23 Thread Junio C Hamano

Jeff King  writes:

> Yeah, I really would have expected this to work, too. Should we be
> taking the merge base of the set of "new" commits, and using that as the
> true "new"?
> ...
> So maybe that's not workable.
>
> (I've never really dug into the bisect algorithm, and this is written
> largely off the cuff, so I'm fine if the response is "nope, you have no
> idea what you are talking about").

You reached the right conclusion, I think.  

A single merge-base case still might be worth optimizing for (and
IIRC, unlike the "git log A..B" traversal that could be fooled by
clock skew, merge-base reliably rejects falsely independent bases in
the presence of clock skew, so it should be safe to do), but in a
scenario your second illustration depicts, the merge-bases would not
help us that much.  

When starting from D and F, both of which are known to be "bad", all
we know is that some of the merge bases between them are "bad",
while other bases are not "bad" as they do not inherit the badness
in the common ancestor of those "bad" bases.  

And taking a merge base of these multiple merge-bases recursively
would not help us there.  We started from "all of these, i.e. D and
F, are known to be bad", which is different from "some of these
among multiple bases are bad, but we do not know which ones", so
even if we were to go recursive, the first step needs to be "now
let's see which one of these multiple bases are bad, so that we can
take merge-base across them".

Re: [PATCH v2] Teach stash to parse -m/--message like commit does

2017-11-23 Thread Junio C Hamano

Phil Hord  writes:

> `git stash push -m foo` uses "foo" as the message for the stash. But
> `git stash push -m"foo"` does not parse successfully.  Similarly
> `git stash push --message="My stash message"` also fails.  The stash
> documentation doesn't suggest this syntax should work, but gitcli
> does and my fingers have learned this pattern long ago for `commit`.
>
> Teach `git stash` and `git store` to parse -mFoo and --message=Foo
> the same as `git commit` would do.  Even though it's an internal
> function, add similar support to create_stash() for consistency.

I sense some typo around "git store", but I am not exactly sure what
the right spelling should be. "git stash -m..", "git stash save -m..",
"git stash push -m.." and "git stash store -m..", if you want a full
enumeration, but stepping back a bit, mentioning "git stash" ought
to be sufficient.  A need to spell all of them whose handling of -m
you fixed would imply there may be some others whose handling of -m
is still broken, which is not a good place for us to end up with.

> Reviewd-by: Junio C Hamano 

Heh, I didn't review this version and certainly not its test.  

I didn't see anything questionable there after a quick read, though.

Thanks.

Re: [PATCH v2 2/2] stash-store: add failing test for same-ref

2017-11-23 Thread Junio C Hamano

Phil Hord  writes:

> stash-store cannot create a new stash with the same ref as stash@{0}. No
> error is returned even though no new stash log is created. Add a failing
> test to track.
>
> Signed-off-by: Phil Hord 
> ---

Sorry, I lost track.  Where is v1 of this series and 1/2 of v2?
IOW, where does this patch fit in in the larger picture?

>  t/t3903-stash.sh | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> index 279e31717..7d511afd3 100755
> --- a/t/t3903-stash.sh
> +++ b/t/t3903-stash.sh
> @@ -813,6 +813,17 @@ test_expect_success 'push -m also works without space' '
>   test_cmp expect actual
>  '
>  
> +test_expect_failure 'store same ref twice' '
> + >foo &&
> + git add foo &&
> + STASH_ID=$(git stash create) &&
> + git stash store -m "original message" $STASH_ID &&
> + git stash store -m "custom message" $STASH_ID &&
> + echo "stash@{0}: custom message" >expect &&
> + git stash list -1 >actual &&
> + test_cmp expect actual
> +'
> +
>  test_expect_success 'store -m foo shows right message' '
>   >foo &&
>   git add foo &&

$27M USD

2017-11-23 Thread Sgt. Britta Lopez

Apologies! I am a military woman ,seeking your kind assistance.

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

Re: [PATCH] http-backend: respect CONTENT_LENGTH as specified by rfc3875

2017-11-23 Thread Junio C Hamano

Max Kirillov  writes:

> http-backend reads whole input until EOF. However, the RFC 3875 specifies
> that a script must read only as many bytes as specified by CONTENT_LENGTH
> environment variable. This causes hang under IIS/Windows, for example.
>
> Make http-backend read only CONTENT_LENGTH bytes, if it's defined, rather than
> the whole input until EOF. If the varibale is not defined, keep older behavior
> of reading until EOF because it is used to support chunked transfer-encoding.
>
> Signed-off-by: Florian Manschwetus 
> Authored-by: Florian Manschwetus 
> Fixed-by: Max Kirillov 
> Signed-off-by: Max Kirillov 
> ---
> ...
> I hope I marked it correctly in the trailers.

It is probably more conventional to do it like so:

From: Florian Manschwetus 
Date: 

http-backend reads whole input until EOF. However, the RFC 3875...
... chunked transfer-encoding.

Signed-off-by: Florian Manschwetus 
[mk: fixed trivial build failures and stuff]
Signed-off-by: Max Kirillov 
---

>  
> +/*
> + * replacement for original read_request, now renamed to read_request_eof,
> + * honoring given content_length (req_len),
> + * provided by new wrapper function read_request
> + */

I agree with Eric's suggestion.  In-code comment is read by those
who have the current code, without knowing/caring what it used to
be.  "It used to do this, but replace it with this new thing
because..." is a valuable thing to record in the log message, but
not here.

Re: [PATCH v3] doc: tweak "man gitcli" for clarity

2017-11-23 Thread Junio C Hamano

Junio C Hamano  writes:

>> - * Without disambiguating `--`, Git makes a reasonable guess, but errors
>> -   out and asking you to disambiguate when ambiguous.  E.g. if you have a
>> -   file called HEAD in your work tree, `git diff HEAD` is ambiguous, and
>> + * In cases where a Git command is truly ambiguous, Git will error out
>> +   and ask you to disambiguate the command.  E.g. if you have a file
>> +   called HEAD in your work tree, `git diff HEAD` is ambiguous, and
>> you have to say either `git diff HEAD --` or `git diff -- HEAD` to
>> disambiguate.
>>  ...
>
> The above "truly" is misleading by giving the information the other
> way around.  We ask to disambiguate when we cannot readily say the
> command line is *not* unambiguous.  That is different from asking
> when we know it is truly ambiguous.
>
> Also see  if you want
> to know more.

So, here is my attempt (other than "the original reads clear enough
to me, so I am not sure what's there to improve"):

When the command line does not have the disambiguating `--`, Git
needs to find where the revisions end and paths begin.  In order
to make sure it does not guess incorrectly, Git errors out when
it cannot cheaply determine that there is no ambiguity, and asks
you to disambiguate with `--`.  If you have a file whose name is
HEAD, "git diff HEAD" gets an error for this reason; you need to
say "git diff -- HEAD" or "git diff HEAD --" to disambiguate.

Also, if you do not have a file whose name is Nakefile and it is
not a branch or tag name, "git log Nakefile" is flagged as an
error.  If you know there used to be a file called Nakefile but
not in the current working tree, "git log -- Nakefile" is how
you tell Git that you did not make a typo and you mean to find
commits that touch the path.

I briefly considered the following as a more technically correct
description for "cheaply determine", but I am not sure if we want to
go down to that level of detail:

... when it cannot determine that every argument before one
point on the command line names a revision and does not have a
file with that name in the working tree, and every argument
after that point does have a file with that name and cannot be
interpreted as a valid revision.

>>  When writing a script that is expected to handle random user-input, it is
>>  a good practice to make it explicit which arguments are which by placing
>> -disambiguating `--` at appropriate places.
>> +a disambiguating `--` at appropriate places.

I forgot to mention in my earlier message, but this part of the
patch is obviously good ;-)

Re: Changing encoding of a file : What should happen to CRLF in file ?

2017-11-23 Thread Ashish Negi

Great work !!!
Thanks

On Fri, Nov 24, 2017 at 1:55 AM, Torsten Bögershausen  wrote:
> On Thu, Nov 23, 2017 at 10:01:40PM +0530, Ashish Negi wrote:
>> Thanks for confirming.
>> Is it possible to track this via a bug number ?
>> It will help me to try out the fix when its available.
>>
>
> No worry, the fix is nearly complete and will come out in a couple of days.

Re: [PATCH v2] launch_editor(): indicate that Git waits for user input

2017-11-23 Thread Junio C Hamano

Lars Schneider  writes:

>> Of course it is possible, if you really wanted to.  The code knows
>> if it gave the "we launched and waiting for you" notice, so it can
>> maintain not just one (i.e. "how I close the notice?") but another
>> one (i.e. "how I do so upon an error?") and use it in the error
>> codepath.
>
> I think a newline makes sense. I'll look into this for v3.

As this is an error codepath, I am OK to waste one more line with a
line break after the "we launched and are waiting..." message, but
with a shorter "we are waiting..." message, I do not know if it is
bad enough to have these two on the same line, so I am on the fence.

Re: [PATCH v4 1/4] checkout: factor out functions to new lib file

2017-11-23 Thread Junio C Hamano

Thomas Gummerer  writes:

> Factor the functions out, so they can be re-used from other places.  In
> particular these functions will be re-used in builtin/worktree.c to make
> git worktree add dwim more.
>
> While there add some docs to the function.
>
> Signed-off-by: Thomas Gummerer 
> ---
>  Makefile   |  1 +
>  builtin/checkout.c | 41 +
>  checkout.c | 42 ++
>  checkout.h | 13 +
>  4 files changed, 57 insertions(+), 40 deletions(-)
>  create mode 100644 checkout.c
>  create mode 100644 checkout.h
>
> diff --git a/Makefile b/Makefile
> index cd75985991..8d603c7443 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -757,6 +757,7 @@ LIB_OBJS += branch.o
>  LIB_OBJS += bulk-checkin.o
>  LIB_OBJS += bundle.o
>  LIB_OBJS += cache-tree.o
> +LIB_OBJS += checkout.o
>  LIB_OBJS += color.o
>  LIB_OBJS += column.o
>  LIB_OBJS += combine-diff.o
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index fc4f8fd2ea..9e1cfd10b3 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -1,5 +1,6 @@
>  #include "builtin.h"
>  #include "config.h"
> +#include "checkout.h"
>  #include "lockfile.h"
>  #include "parse-options.h"
>  #include "refs.h"

With this, and also ...

> diff --git a/checkout.c b/checkout.c
> new file mode 100644
> index 00..b0c744d37a
> --- /dev/null
> +++ b/checkout.c
> @@ -0,0 +1,42 @@
> +#include "cache.h"
> +#include "remote.h"

... with this, I sort of expected that builtin/checkout.c no longer
has to include "remote.h" but can now rely on the common helpers in
this new file to perform anything remote-related operation.  But it
seems that it is not the case (yet).

Just recording my observation for future reference, as we might also
want to move report_tracking(), etc., to this new file in the future.

Re: [PATCH v4 2/4] worktree: add --[no-]track option to the add subcommand

2017-11-23 Thread Junio C Hamano

Thomas Gummerer  writes:

> diff --git a/t/t2025-worktree-add.sh b/t/t2025-worktree-add.sh
> index b5c47ac602..53042ce565 100755
> --- a/t/t2025-worktree-add.sh
> +++ b/t/t2025-worktree-add.sh
> @@ -313,5 +313,60 @@ test_expect_success 'checkout a branch under bisect' '
>  test_expect_success 'rename a branch under bisect not allowed' '
>   test_must_fail git branch -M under-bisect bisect-with-new-name
>  '
> +# Is branch "refs/heads/$1" set to pull from "$2/$3"?
> +test_branch_upstream () {
> + printf "%s\n" "$2" "refs/heads/$3" >expect.upstream &&
> + {
> + git config "branch.$1.remote" &&
> + git config "branch.$1.merge"
> + } >actual.upstream &&
> + test_cmp expect.upstream actual.upstream
> +}

OK.

> +test_expect_success '--track sets up tracking' '
> + test_when_finished rm -rf track &&
> + git worktree add --track -b track track master &&
> + git config "branch.track.merge" &&
> + (
> + test_branch_upstream track . master
> + )
> +'

Is this "git config" necessary, or is it a remnant of a debugging
session?  It is tested in the helper that branch.track.merge is set
to something, and otherwise the helper would fail the same way as
this standalnoe "git config" would, no?

> +# setup remote repository $1 and repository $2 with $1 set up as
> +# remote.  The remote has two branches, master and foo.
> +setup_remote_repo () {
> + git init $1 &&
> + (
> + cd $1 &&
> + test_commit $1_master &&
> + git checkout -b foo &&
> + test_commit upstream_foo
> + ) &&
> + git init $2 &&
> + (
> + cd $2 &&
> + test_commit $2_master &&
> + git remote add $1 ../$1 &&
> + git config remote.$1.fetch \
> + "refs/heads/*:refs/remotes/$1/*" &&
> + git fetch --all
> + )
> +}
> +
> +test_expect_success '--no-track avoids setting up tracking' '
> + test_when_finished rm -rf repo_upstream repo_local foo &&
> + setup_remote_repo repo_upstream repo_local &&
> + (
> + cd repo_local &&
> + git worktree add --no-track -b foo ../foo repo_upstream/foo
> + ) &&
> + (
> + cd foo &&
> + ! test_branch_upstream foo repo_upstream foo &&

It is true that this test helper must yield failure.  But what you
expect probably is more than that, no?  For example, the test helper
would fail even if branch.foo.remote is set to the upstream as long
as branch.foo.merge is not set to point at their foo, but what you
really want to make sure is that neither configuration variable is
set.

> + git rev-parse repo_upstream/foo >expect &&
> + git rev-parse foo >actual &&
> + test_cmp expect actual
> + )
> +'
>  
>  test_done

Re: [PATCH v4 3/4] worktree: make add dwim

2017-11-23 Thread Junio C Hamano

Thomas Gummerer  writes:

> Currently 'git worktree add  ', errors out when 'branch'
> is not a local branch.   It has no additional dwim'ing features that one
> might expect.
>
> Make it behave more like 'git checkout ' when the branch doesn't
> exist locally, but a remote tracking branch uniquely matches the desired
> branch name, i.e. create a new branch from the remote tracking branch
> and set the upstream to the remote tracking branch.
>
> As 'git worktree add' currently just dies in this situation, there are
> no backwards compatibility worries when introducing this feature.
>
> Signed-off-by: Thomas Gummerer 
> ---

This step makes sense and I did not spot anything controversial.

Re: [PATCH v4 4/4] worktree: make add dwim

2017-11-23 Thread Junio C Hamano

Thomas Gummerer  writes:

> Currently 'git worktree add ' creates a new branch named after the
> basename of the , that matches the HEAD of whichever worktree we
> were on when calling "git worktree add ".
>
> Make 'git worktree add  behave more like the dwim machinery in
> 'git checkout ', i.e. check if the new branch name uniquely
> matches the branch name of a remote tracking branch, and if so check out
> that branch and set the upstream to the remote tracking branch.
>
> This is a change of behaviour compared to the current behaviour, where
> we create a new branch matching HEAD.  However as 'git worktree' is
> still an experimental feature, and it's easy to notice/correct the
> behaviour in case it's not what the user desired it's probably okay to
> break existing behaviour here.

Is it "easy to notice"?  I doubt it.  Even if you assume that
everybody uses bash prompt that shows the name of the branch, the
user sees the same name of the branch in either mode.

> In order to also satisfy users who want the current behaviour of
> creating a new branch from HEAD, add a '--no-track' flag, which disables
> the new behaviour, and keeps the old behaviour of creating a new branch
> from the head of the current worktree.

I am not sure if this is a good match for "--track/--no-track";
which branch is to be checked out (either "automatically from the
unique remote-tracking branch" or "the current one") is one choice,
and whether the resulting branch is marked explicitly as integrating
with the remote or not is another choice within one branch of the
first choice.  IOW, this makes it impossible to say "create the branch
based on the unique remote-tracking branch, but do not add the two
branch.*.{merge,remote} variables".

Also, you have several mention of "remote tracking branch" in these
patches.  Please consistently spell them as "remote-tracking branch"
to be consistent with Documentation/glossary-content.txt and avoid a
casual/careful reference to "tracking branch" if possible, unless it
is quite clear to the readers that you are being loose for the sake
of brevity.  Some people used "tracking branch" to mean the local
branch that is marked as the branch to integrate with the work on
a branch at a remote that caused user confusion in the past.

That is

refs/remotes/origin/topic is a remote-tracking branch for the
branch 'topic' that came from the 'origin' remote.

when you have branch.foo.remote=origin and
branch.foo.merge=refs/heads/topic, then your local branch foo is
marked to integrate with the 'topic' branch at the 'origin'
remote.

and these two are quite different things that people in the past and
over time loosely used a phrase "tracking branch" to cause confusion.

Re: [PATCHv5 7/7] builtin/describe.c: describe a blob

2017-11-23 Thread Junio C Hamano

Stefan Beller  writes:

> On Tue, Nov 21, 2017 at 11:55 PM, Junio C Hamano  wrote:
>> Stefan Beller  writes:
>>
>>> ...
>>> fixed.
>>> ...
>>> fixed the text...
>>> ...
>>> I am not fully convinced all descriptions are in recent history, but I
>>> tend to agree that most are, so probably the trade off is a wash.
>>
>> So what do we want with this topic?  I think the "teach 'git log' to
>> highlight commits whose changes involve the given blob" is a more or
>> less an orthogonal thing,
>
> Well, both of them solve our immediate needs, so I'd be fine with pursuing
> just one of them, but I do not oppose taking both.
>
>> and I suspect that it is something users
>> may (although I personally do not) find valuable to have a related
>> but different feature in "git describe".
>
> agreed.

I was reacting to your "fixed".  So will we see a rerolled series or
not?

Make patch-id more flexible?

2017-11-23 Thread Eugeniu Rosca

Dear git Community,

This is my first post to the git mailing list, so I would first like to
express my gratitude to everyone involved in developing one of my
favorite development tools.

I will make my question short and concrete. My day to day job is doing
Linux kernel integration, which also includes importing of out-of-tree
kernel modules into the kernel tree. Our team extensively uses cherry
picking for integration purpose, since most often merging work is simply
not possible because of a different kernel base used by our suppliers.
We don't rebase remote commits --onto our repository/branch, since
(compared to `git cherry-pick -x`) `git rebase --onto` doesn't
add source/origin information to commit description. The `(cherry
picked from *)` line is extremely helpful in generating proper commit
statistics on a given branch, which is interesting because of a high
amount of commits coming from various non-vanilla remotes.

Reviewing the cherry picked commits, we extensively rely on patch id
comparison. We've developed scripts that extract the remote commit hash
from the `(cherry picked from )` line in the commit
description, in order to produce tables like below:

Remote-commit-id   Local-commit-idPatch-id-mismatch?
No
Yes
-
No

This information helps the reviewer identify the non-clean picks, which
are oftentimes (but not always) caused by manual conflict resolution,
which we try to briefly document in square brackets above the
`Signed-off-by` signature. We feel that documenting any manual conflict
resolution is important, as it can be source of bugs if not done
properly.

Troubles begin when we import out-of-tree kernel modules in-tree (some
suppliers delivery many of them). We use subtree cherry picking [1] for
that. Because subtree strategy alters the file-names, there will always
be a patch id mismatch between the origin commit and its pick. To
overcome this, we are using alternatives to `git patch-id`, which ignore
file-names. Here comes my actual question. Would it be conceptually fine
to implement some `git patch-id` parameter, which would allow ignoring
the file-names (or reducing those to their `basename`) before computing
the patch id? Or would it break the concept of patch id (which shouldn't
accept variations)?

Thank you.
Eugeniu.

[1] git cherry-pick -x -s --no-merges --strategy=subtree 
-Xsubtree=drivers/staging/mymodule ..

Re: [PATCH 1/1] diffcore: add a filter to find a specific blob

2017-11-23 Thread Junio C Hamano

Stefan Beller  writes:

> Sometimes users are given a hash of an object and they want to
> identify it further (ex.: Use verify-pack to find the largest blobs,
> but what are these? or [1])
>
> One might be tempted to extend git-describe to also work with blobs,
> such that `git describe ` gives a description as
> ':'.  This was implemented at [2]; as seen by the sheer
> number of responses (>110), it turns out this is tricky to get right.
> The hard part to get right is picking the correct 'commit-ish' as that
> could be the commit that (re-)introduced the blob or the blob that
> removed the blob; the blob could exist in different branches.
>
> Junio hinted at a different approach of solving this problem, which this
> patch implements. Teach the diff machinery another flag for restricting
> the information to what is shown. For example:
>
>   $ ./git log --oneline --blobfind=v2.0.0:Makefile
>   b2feb64309 Revert the whole "ask curl-config" topic for now
>   47fbfded53 i18n: only extract comments marked with "TRANSLATORS:"
>
> we observe that the Makefile as shipped with 2.0 was introduced in
> v1.9.2-471-g47fbfded53 and replaced in v2.0.0-rc1-5-gb2feb64309 by
> a different blob.
>
> [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
> [2] https://public-inbox.org/git/20171028004419.10139-1-sbel...@google.com/
>
> Signed-off-by: Stefan Beller 
> ---
>
> On playing around with this, trying to find more interesting cases, I 
> observed:
>
> git log --oneline --blobfind=HEAD:COPYING
> 703601d678 Update COPYING with GPLv2 with new FSF address
> 
> git log --oneline --blobfind=703601d678^:COPYING
> 459b8d22e5 tests: do not borrow from COPYING and README from the real 
> source
> 703601d678 Update COPYING with GPLv2 with new FSF address
> 075b845a85 Add a COPYING notice, making it explicit that the license is 
> GPLv2.
>
> t/diff-lib/COPYING may need an update of the adress of the FSF,
> # leftoverbits I guess.

I do not think so.  See tz/fsf-address-update topic for details.

Please do not contaminate the list archive with careless mention of 
"hash-mark plus left over bits", as it will make searching the real
good bits harder.  Thanks.

> Another interesting case that I found was
>git log --oneline --blobfind=v2.14.0:Makefile
>3921a0b3c3 perf: add test for writing the index
>36f048c5e4 sha1dc: build git plumbing code more explicitly
>2118805b92 Makefile: add style build rule
>
> all of which were after v2.14, such that the introduction of that blob doesn't
> show up; I suspect it came in via a merge as unrelated series may have updated
> the Makefile in parallel, though git-log should have told me?

If that is the case, shouldn't we make this new mode imply
--full-history to forbid history simplification?  "git log" is a
tool to find _an_ explanation of the current state, and the usual
history simplification makes tons of sense there, but blobfind is
run most likely in order to find _all_ mention of the set of blobs
given.

> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index dd0dba5b1d..252a21cc19 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -500,6 +500,10 @@ information.
>  --pickaxe-regex::
>   Treat the  given to `-S` as an extended POSIX regular
>   expression to match.
> +--blobfind=::
> + Restrict the output such that one side of the diff
> + matches the given blob-id.
> +
>  endif::git-format-patch[]

Can we have a blank line between these enumerations to make the
source easier to read?  Thanks.

> diff --git a/diffcore-blobfind.c b/diffcore-blobfind.c
> new file mode 100644
> index 00..5d222fc336
> --- /dev/null
> +++ b/diffcore-blobfind.c
> @@ -0,0 +1,51 @@
> +/*
> + * Copyright (c) 2017 Google Inc.
> + */
> +#include "cache.h"
> +#include "diff.h"
> +#include "diffcore.h"
> +
> +static void diffcore_filter_blobs(struct diff_queue_struct *q,
> +   struct diff_options *options)
> +{
> + int i, j = 0, c = q->nr;
> +
> + if (!options->blobfind)
> + BUG("blobfind oidset not initialized???");
> +
> + for (i = 0; i < q->nr; i++) {
> + struct diff_filepair *p = q->queue[i];
> +
> + if (DIFF_PAIR_UNMERGED(p) ||
> + (DIFF_FILE_VALID(p->one) &&
> +  oidset_contains(options->blobfind, &p->one->oid)) ||
> + (DIFF_FILE_VALID(p->two) &&
> +  oidset_contains(options->blobfind, &p->two->oid)))
> + continue;

So, we keep an unmerged pair, a pair that mentions a sought-blob on
one side or the other side?  I am not sure if we want to keep the
unmerged pair for the purpose of this one.

> + diff_free_filepair(p);
> + q->queue[i] = NULL;
> + c--;

Also, if you are doing the in-place shrinking and have already
introduced another counter 'j' that is initia

Re: Make patch-id more flexible?

2017-11-23 Thread Junio C Hamano

Eugeniu Rosca  writes:

> file-names. Here comes my actual question. Would it be conceptually fine
> to implement some `git patch-id` parameter, which would allow ignoring
> the file-names (or reducing those to their `basename`) before computing
> the patch id? Or would it break the concept of patch id (which shouldn't
> accept variations)?

My gut feeling is that a tool like that would be fine as long as it
is local to your organization and is not called "git patch-id"; it
may be useful in the situation you described, but as you mention
above, it feels that it is differnt from what a patch-id is.

51 matches

Mail list logo