Re: [PATCH 5/6] builtin/grep.c: show column numbers via --column-number

2018-04-20 Thread René Scharfe
Am 21.04.2018 um 06:14 schrieb Junio C Hamano:
> Junio C Hamano  writes:
> 
>> Taylor Blau  writes:
>>
>>> This commit teaches 'git-grep(1)' a new option, '--column-number'. This
>>> ...
>>> +`columnnumber`;;
>>> +   column number prefix (when using `-m`)
>>
>> Is there other people's tool (preferrably some variant of "grep")
>> that has an option to tell it to show the column number of the hit?
>> What is the option called there?  Does that tool let the option
>> squat on short-and-sweet '-m'?
>>
>> Thanks.
> 
> I still do not know if we have a good existing model system to take
> the longer optoin name from, but at least, GNU grep seems to use -m
> to mean a completely different thing, so I'd think that we would not
> assign '-m' to this new feature.

https://beyondgrep.com/feature-comparison/ is a good resource for such
considerations.

ack and ripgrep use --column and no short option.  If git grep got
--column-number, then users could abbreviate it to --column (until it
gets some other option that makes it ambiguous).

-m seems to be used for --max-count across the board ("Stop searching
in each file after NUM matches") .

René


Re: [PATCH v1 1/2] merge: Add merge.renames config setting

2018-04-20 Thread Junio C Hamano
Elijah Newren  writes:

 +merge.renames::
 +   Whether and how Git detects renames.  If set to "false",
 +   rename detection is disabled. If set to "true", basic rename
 +   detection is enabled. This is the default.
>>>
>>>
>>> One can already control o->detect_rename via the -Xno-renames and
>>> -Xfind-renames options.
> ...
> Sorry, I think I wasn't being clear.  The documentation for the config
> options for e.g. diff.renameLimit, fetch.prune, log.abbrevCommit, and
> merge.ff all mention the equivalent command line parameters.  Your
> patch doesn't do that for merge.renames, but I think it would be
> helpful if it did.

Yes, and if we are adding a new configuration, we should do so in
such a way that we do not have to come back and extend it when we
know what the command line option does and the configuration being
proposed is less capable already.  I wonder if we can just add a
single configuration whose value can be "never" to pretend as if
"--Xno-renames" were given, and some similarity score like "50" to
pretend as if "--Xfind-renames=50" were given.  

That is, merge.renames does not have to a simple "yes-no to control
the --Xno-renames option".  And it would of course be better to
document it.

I also had to wonder how "merge -s resolve" faired, if the project
is not interested in renamed paths at all.

Thanks.


Re: [PATCH 5/6] builtin/grep.c: show column numbers via --column-number

2018-04-20 Thread Junio C Hamano
Junio C Hamano  writes:

> Taylor Blau  writes:
>
>> This commit teaches 'git-grep(1)' a new option, '--column-number'. This
>> ...
>> +`columnnumber`;;
>> +column number prefix (when using `-m`)
>
> Is there other people's tool (preferrably some variant of "grep")
> that has an option to tell it to show the column number of the hit?
> What is the option called there?  Does that tool let the option
> squat on short-and-sweet '-m'?
>
> Thanks.

I still do not know if we have a good existing model system to take
the longer optoin name from, but at least, GNU grep seems to use -m
to mean a completely different thing, so I'd think that we would not
assign '-m' to this new feature.



Re: [PATCH 5/6] builtin/grep.c: show column numbers via --column-number

2018-04-20 Thread Junio C Hamano
Taylor Blau  writes:

> This commit teaches 'git-grep(1)' a new option, '--column-number'. This
> ...
> +`columnnumber`;;
> + column number prefix (when using `-m`)

Is there other people's tool (preferrably some variant of "grep")
that has an option to tell it to show the column number of the hit?
What is the option called there?  Does that tool let the option
squat on short-and-sweet '-m'?

Thanks.


Re: [PATCH v1] perf/aggregate: tighten option parsing

2018-04-20 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>> Not necessarily worth a re-roll.
>
> Not that it matters in this case, but just as a bit of Perl rx pedantry,
> yes his is tighter & more correct. You didn't consider how "." interacts
> with newlines:
>
> $ perl -wE 'my @rx = (qr/^--./, qr/^--.+$/, qr/^--./m, qr/^--.+$/m, 
> qr/^--./s, qr/^--.+$/s); for (@rx) { my $s = "--foo\n--bar"; say $_, "\t", 
> ($s =~ $_ ? 1 : 0) }'
> (?^u:^--.)  1
> (?^u:^--.+$)0
> (?^um:^--.) 1
> (?^um:^--.+$)   1
> (?^us:^--.) 1
> (?^us:^--.+$)   1
>
> I don't think it matters here, not like someone will pass \n in options
> to aggregate.perl...

Hmph, do we want the command not to barf when "--foo\n--bar" is
given from the command line and we cannot find such an option?

I thought that the location the match under discussion is used does
want to see a hit with any option looking string that begins with
double dashes.  I would have expected "tigher and hence incorrect",
in other words.

Somewhat puzzled...


[PATCH 3/6] grep.[ch]: teach columnnum, color_columnno to grep_opt

2018-04-20 Thread Taylor Blau
In preparation of adding --column-number to 'git-grep(1)', we extend
grep_opt to take in the requisite new members.

We additionally teach the 'grep.columnnumber' and
'color.grep.columnumber' configuration variables to configure showing
and coloring the column number, respectively. (These options remain
undocumented until 'git-grep(1)' learns the --column option in a
forthcoming commit.)

Signed-off-by: Taylor Blau 
---
 grep.c | 8 
 grep.h | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/grep.c b/grep.c
index 29bc799ecf..7872a5d868 100644
--- a/grep.c
+++ b/grep.c
@@ -95,6 +95,10 @@ int grep_config(const char *var, const char *value, void *cb)
opt->linenum = git_config_bool(var, value);
return 0;
}
+   if (!strcmp(var, "grep.columnnumber")) {
+   opt->columnnum = git_config_bool(var, value);
+   return 0;
+   }
 
if (!strcmp(var, "grep.fullname")) {
opt->relative = !git_config_bool(var, value);
@@ -111,6 +115,8 @@ int grep_config(const char *var, const char *value, void 
*cb)
color = opt->color_function;
else if (!strcmp(var, "color.grep.linenumber"))
color = opt->color_lineno;
+   else if (!strcmp(var, "color.grep.columnumber"))
+   color = opt->color_columnno;
else if (!strcmp(var, "color.grep.matchcontext"))
color = opt->color_match_context;
else if (!strcmp(var, "color.grep.matchselected"))
@@ -155,6 +161,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
opt->extended_regexp_option = def->extended_regexp_option;
opt->pattern_type_option = def->pattern_type_option;
opt->linenum = def->linenum;
+   opt->columnnum = def->columnnum;
opt->max_depth = def->max_depth;
opt->pathname = def->pathname;
opt->relative = def->relative;
@@ -164,6 +171,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
color_set(opt->color_filename, def->color_filename);
color_set(opt->color_function, def->color_function);
color_set(opt->color_lineno, def->color_lineno);
+   color_set(opt->color_columnno, def->color_columnno);
color_set(opt->color_match_context, def->color_match_context);
color_set(opt->color_match_selected, def->color_match_selected);
color_set(opt->color_selected, def->color_selected);
diff --git a/grep.h b/grep.h
index 399381c908..08a0b391c5 100644
--- a/grep.h
+++ b/grep.h
@@ -127,6 +127,7 @@ struct grep_opt {
int prefix_length;
regex_t regexp;
int linenum;
+   int columnnum;
int invert;
int ignore_case;
int status_only;
@@ -159,6 +160,7 @@ struct grep_opt {
char color_filename[COLOR_MAXLEN];
char color_function[COLOR_MAXLEN];
char color_lineno[COLOR_MAXLEN];
+   char color_columnno[COLOR_MAXLEN];
char color_match_context[COLOR_MAXLEN];
char color_match_selected[COLOR_MAXLEN];
char color_selected[COLOR_MAXLEN];
-- 
2.17.0



[PATCH 4/6] grep.c: display column number of first match

2018-04-20 Thread Taylor Blau
Building upon our work in the previous commit to add members 'columnnum'
and 'color_columno' to 'grep_opt', we teach show_line() how to respect
those options.

When requested, show_line() will display the column number of the first
match on a non-context line. show_line() differentiates between context
and non-context lines through the '&& cno' check. 'cno' will be equal to
zero if and only if show_line() is invoked on a context line. It will be
a non-zero value otherwise.

Signed-off-by: Taylor Blau 
---
 grep.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/grep.c b/grep.c
index 7872a5d868..5aeb893263 100644
--- a/grep.c
+++ b/grep.c
@@ -1404,6 +1404,12 @@ static void show_line(struct grep_opt *opt, char *bol, 
char *eol,
output_color(opt, buf, strlen(buf), opt->color_lineno);
output_sep(opt, sign);
}
+   if (opt->columnnum && cno) {
+   char buf[32];
+   xsnprintf(buf, sizeof(buf), "%d", cno);
+   output_color(opt, buf, strlen(buf), opt->color_columnno);
+   output_sep(opt, sign);
+   }
if (opt->color) {
regmatch_t match;
enum grep_context ctx = GREP_CONTEXT_BODY;
-- 
2.17.0



[PATCH 5/6] builtin/grep.c: show column numbers via --column-number

2018-04-20 Thread Taylor Blau
This commit teaches 'git-grep(1)' a new option, '--column-number'. This
option builds upon previous commits to show the column number of the
first match on a non-context line.

For example:

  $ git grep -mn example | head -n3
  .clang-format:51:14:# myFunction(foo, bar, baz);
  .clang-format:64:7:# int foo();
  .clang-format:75:8:# void foo()

Now that configuration variables such as grep.columnNumber and
color.grep.columnNumber have a visible effect, we document them in this
patch as well.

Signed-off-by: Taylor Blau 
---
 Documentation/config.txt   |  5 +
 Documentation/git-grep.txt |  9 -
 builtin/grep.c |  1 +
 t/t7810-grep.sh| 22 ++
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb3..02fd4b662b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1159,6 +1159,8 @@ color.grep.::
function name lines (when using `-p`)
 `linenumber`;;
line number prefix (when using `-n`)
+`columnnumber`;;
+   column number prefix (when using `-m`)
 `match`;;
matching text (same as setting `matchContext` and `matchSelected`)
 `matchContext`;;
@@ -1708,6 +1710,9 @@ gitweb.snapshot::
 grep.lineNumber::
If set to true, enable `-n` option by default.
 
+grep.columnNumber::
+   If set to true, enable `-m` option by default.
+
 grep.patternType::
Set the default matching behavior. Using a value of 'basic', 'extended',
'fixed', or 'perl' will enable the `--basic-regexp`, 
`--extended-regexp`,
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 18b494731f..dd90f74ded 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -13,7 +13,7 @@ SYNOPSIS
   [-v | --invert-match] [-h|-H] [--full-name]
   [-E | --extended-regexp] [-G | --basic-regexp]
   [-P | --perl-regexp]
-  [-F | --fixed-strings] [-n | --line-number]
+  [-F | --fixed-strings] [-n | --line-number] [-m | --column-number]
   [-l | --files-with-matches] [-L | --files-without-match]
   [(-O | --open-files-in-pager) []]
   [-z | --null]
@@ -44,6 +44,9 @@ CONFIGURATION
 grep.lineNumber::
If set to true, enable `-n` option by default.
 
+grep.columnNumber::
+   If set to true, enable `-m` option by default.
+
 grep.patternType::
Set the default matching behavior. Using a value of 'basic', 'extended',
'fixed', or 'perl' will enable the `--basic-regexp`, 
`--extended-regexp`,
@@ -169,6 +172,10 @@ providing this option will cause it to die.
 --line-number::
Prefix the line number to matching lines.
 
+-m::
+--column-number::
+   Prefix the 1-indexed column number of the first match on non-context 
lines.
+
 -l::
 --files-with-matches::
 --name-only::
diff --git a/builtin/grep.c b/builtin/grep.c
index 5f32d2ce84..faa65abab5 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -829,6 +829,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
GREP_PATTERN_TYPE_PCRE),
OPT_GROUP(""),
OPT_BOOL('n', "line-number", , N_("show line 
numbers")),
+   OPT_BOOL('m', "column-number", , N_("show column 
numbers")),
OPT_NEGBIT('h', NULL, , N_("don't show 
filenames"), 1),
OPT_BIT('H', NULL, , N_("show filenames"), 1),
OPT_NEGBIT(0, "full-name", ,
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 1797f632a3..0cf654824d 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -99,6 +99,28 @@ do
test_cmp expected actual
'
 
+   test_expect_success "grep -w $L" '
+   {
+   echo ${HC}file:5:foo mmap bar
+   echo ${HC}file:14:foo_mmap bar mmap
+   echo ${HC}file:5:foo mmap bar_mmap
+   echo ${HC}file:14:foo_mmap bar mmap baz
+   } >expected &&
+   git -c grep.linenumber=false grep -m -w -e mmap $H >actual &&
+   test_cmp expected actual
+   '
+
+   test_expect_success "grep -w $L" '
+   {
+   echo ${HC}file:1:5:foo mmap bar
+   echo ${HC}file:3:14:foo_mmap bar mmap
+   echo ${HC}file:4:5:foo mmap bar_mmap
+   echo ${HC}file:5:14:foo_mmap bar mmap baz
+   } >expected &&
+   git -c grep.linenumber=false grep -n -m -w -e mmap $H >actual &&
+   test_cmp expected actual
+   '
+
test_expect_success "grep -w $L" '
{
echo ${HC}file:1:foo mmap bar
-- 
2.17.0



[PATCH 2/6] grep.c: take column number as argument to show_line()

2018-04-20 Thread Taylor Blau
show_line() currently receives the line number within the
'grep_opt->buf' in order to determine which line number to display. In
order to display information about the matching column number--if
requested--we must additionally take in that information.

To do so, we extend the signature of show_line() to take in an
additional unsigned "cno". "cno" is either:

  * A 1-indexed column number of the first match on the given line, or
  * 0, if the column number is irrelevant (when displaying a function
name, context lines, etc).

We additionally modify all calls to show_line() in order to pass the new
required argument.

Signed-off-by: Taylor Blau 
---
 grep.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/grep.c b/grep.c
index 1c25782355..29bc799ecf 100644
--- a/grep.c
+++ b/grep.c
@@ -1361,7 +1361,7 @@ static int next_match(struct grep_opt *opt, char *bol, 
char *eol,
 }
 
 static void show_line(struct grep_opt *opt, char *bol, char *eol,
- const char *name, unsigned lno, char sign)
+ const char *name, unsigned lno, unsigned cno, char sign)
 {
int rest = eol - bol;
const char *match_color, *line_color = NULL;
@@ -1501,7 +1501,7 @@ static void show_funcname_line(struct grep_opt *opt, 
struct grep_source *gs,
break;
 
if (match_funcname(opt, gs, bol, eol)) {
-   show_line(opt, bol, eol, gs->name, lno, '=');
+   show_line(opt, bol, eol, gs->name, lno, 0, '=');
break;
}
}
@@ -1566,7 +1566,7 @@ static void show_pre_context(struct grep_opt *opt, struct 
grep_source *gs,
 
while (*eol != '\n')
eol++;
-   show_line(opt, bol, eol, gs->name, cur, sign);
+   show_line(opt, bol, eol, gs->name, cur, 0, sign);
bol = eol + 1;
cur++;
}
@@ -1830,7 +1830,7 @@ static int grep_source_1(struct grep_opt *opt, struct 
grep_source *gs, int colle
show_pre_context(opt, gs, bol, eol, lno);
else if (opt->funcname)
show_funcname_line(opt, gs, bol, lno);
-   show_line(opt, bol, eol, gs->name, lno, ':');
+   show_line(opt, bol, eol, gs->name, lno, match.rm_so+1, 
':');
last_hit = lno;
if (opt->funcbody)
show_function = 1;
@@ -1859,7 +1859,7 @@ static int grep_source_1(struct grep_opt *opt, struct 
grep_source *gs, int colle
/* If the last hit is within the post context,
 * we need to show this line.
 */
-   show_line(opt, bol, eol, gs->name, lno, '-');
+   show_line(opt, bol, eol, gs->name, lno, match.rm_so+1, 
'-');
}
 
next_line:
-- 
2.17.0



[PATCH 6/6] contrib/git-jump/git-jump: use column number when grep-ing

2018-04-20 Thread Taylor Blau
This patch adds the '--column-number' synonym '-m' to the default
grep command so that callers are brought to the correct line _and_
column of each matched location.

Signed-off-by: Taylor Blau 
---
 contrib/git-jump/git-jump | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/git-jump/git-jump b/contrib/git-jump/git-jump
index 80ab0590bc..2706963690 100755
--- a/contrib/git-jump/git-jump
+++ b/contrib/git-jump/git-jump
@@ -52,7 +52,7 @@ mode_merge() {
 # editor shows them to us in the status bar.
 mode_grep() {
cmd=$(git config jump.grepCmd)
-   test -n "$cmd" || cmd="git grep -n"
+   test -n "$cmd" || cmd="git grep -n -m"
$cmd "$@" |
perl -pe '
s/[ \t]+/ /g;
-- 
2.17.0


[PATCH 0/6] Teach '--column-number' to 'git-grep(1)'

2018-04-20 Thread Taylor Blau
Hi,

Attached is a series to add --column-number to 'git-grep(1)', and
support jumping to the correct column when using Peff's 'git-jump'.

I have chosen -m as the short form of --column-number, because -c was
taken, and -m is close to -n (the short form of --line-number).

The description in each patch is extensive, and can serve as a reference
during review. Thank you in advance for your time, and--as always--I
look forward to your feedback :-).


Thanks,
Taylor

Taylor Blau (6):
  grep.c: take regmatch_t as argument in match_line()
  grep.c: take column number as argument to show_line()
  grep.[ch]: teach columnnum, color_columnno to grep_opt
  grep.c: display column number of first match
  builtin/grep.c: show column numbers via --column-number
  contrib/git-jump/git-jump: use column number when grep-ing

 Documentation/config.txt   |  5 +
 Documentation/git-grep.txt |  9 -
 builtin/grep.c |  1 +
 contrib/git-jump/git-jump  |  2 +-
 grep.c | 33 -
 grep.h |  2 ++
 t/t7810-grep.sh| 22 ++
 7 files changed, 63 insertions(+), 11 deletions(-)

--
2.17.0


[PATCH 1/6] grep.c: take regmatch_t as argument in match_line()

2018-04-20 Thread Taylor Blau
In a subsequent patch, we teach show_line() to optionally include the
column number of the first match on each matched line.

The regmatch_t involved in match_line() and match_one_pattern() both
contain this information (via regmatch_t->rm_so), but their current
implementation throws this stack variable away at the end of the call.

Instead, let's teach match_line() to take in a 'regmatch_t *' so that
callers can inspect the result of their calls. This will prove useful in
a subsequent commit when a caller will forward on information from the
regmatch_t into show_line (as described above).

The return condition remains unchanged, therefore the only change
required of callers is the addition of a single argument.

Signed-off-by: Taylor Blau 
---
 grep.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/grep.c b/grep.c
index 65b90c10a3..1c25782355 100644
--- a/grep.c
+++ b/grep.c
@@ -1299,17 +1299,17 @@ static int match_expr(struct grep_opt *opt, char *bol, 
char *eol,
 }
 
 static int match_line(struct grep_opt *opt, char *bol, char *eol,
- enum grep_context ctx, int collect_hits)
+ regmatch_t *match, enum grep_context ctx,
+ int collect_hits)
 {
struct grep_pat *p;
-   regmatch_t match;
 
if (opt->extended)
return match_expr(opt, bol, eol, ctx, collect_hits);
 
/* we do not call with collect_hits without being extended */
for (p = opt->pattern_list; p; p = p->next) {
-   if (match_one_pattern(p, bol, eol, ctx, , 0))
+   if (match_one_pattern(p, bol, eol, ctx, match, 0))
return 1;
}
return 0;
@@ -1699,6 +1699,7 @@ static int grep_source_1(struct grep_opt *opt, struct 
grep_source *gs, int colle
int try_lookahead = 0;
int show_function = 0;
struct userdiff_driver *textconv = NULL;
+   regmatch_t match;
enum grep_context ctx = GREP_CONTEXT_HEAD;
xdemitconf_t xecfg;
 
@@ -1788,7 +1789,7 @@ static int grep_source_1(struct grep_opt *opt, struct 
grep_source *gs, int colle
if ((ctx == GREP_CONTEXT_HEAD) && (eol == bol))
ctx = GREP_CONTEXT_BODY;
 
-   hit = match_line(opt, bol, eol, ctx, collect_hits);
+   hit = match_line(opt, bol, eol, , ctx, collect_hits);
*eol = ch;
 
if (collect_hits)
-- 
2.17.0



Re: [PATCH v3] fast-export: fix regression skipping some merge-commits

2018-04-20 Thread Junio C Hamano
Martin Ågren  writes:

> +test_expect_success 'merge commit gets exported with --import-marks' '
> + test_create_repo merging &&
> + (
> + cd merging &&
> + test_commit initial &&
> + git checkout -b topic &&
> + test_commit on-topic &&
> + git checkout master &&
> + test_commit on-master &&
> + test_tick &&
> + git merge --no-ff -m Yeah topic &&
> +
> + echo ":1 $(git rev-parse HEAD^^)" >marks &&
> + git fast-export --import-marks=marks master >out &&
> + grep Yeah out
> + )
> +'

This test looks much better than the one in the earlier iteration,
but I do not think the updated "fix" below is better.  It might be
just aesthetics and I suspect I won't find it as disturbing if we
could push with

object_array_push(commits, (struct object *)commit);

or something that is more clearly symmetric to object_array_pop().
The "Queue again" comment is needed only because use of "add"
highlights the lack of symmetry.

With add_object_array(), it looks somewhat more odd than your
previous

peek it to check;
if (it should not be molested)
return;
pop to mark it consumed;
consume it;

sequence, in which peek() and pop() were more obviously related
operations on the same "array" object.

And I do not think it is a good idea to introduce _push() only for
symmetry (it would merely be a less capable version of add whose
name is spelled differently).  Hence my preference for peek-check-pop
over pop-oops-push-again-but-push-spelled-as-add.

Not worth a reroll, though.  I just wanted to spread better design
sense to contributors ;-)

>  test_done
> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
> index 27b2cc138e..7b8dfc5af1 100644
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
> @@ -651,8 +651,11 @@ static void handle_tail(struct object_array *commits, 
> struct rev_info *revs,
>   struct commit *commit;
>   while (commits->nr) {
>   commit = (struct commit *)object_array_pop(commits);
> - if (has_unshown_parent(commit))
> + if (has_unshown_parent(commit)) {
> + /* Queue again, to be handled later */
> + add_object_array(>object, NULL, commits);
>   return;
> + }
>   handle_commit(commit, revs, paths_of_changed_objects);
>   }
>  }


Re: Silly "git gc" UI issue.

2018-04-20 Thread Junio C Hamano
Simon Ruderich  writes:

> On Thu, Apr 19, 2018 at 02:10:40PM +0900, Junio C Hamano wrote:
>> diff --git a/parse-options-cb.c b/parse-options-cb.c
>> index c6679cb2cd..872627eafe 100644
>> --- a/parse-options-cb.c
>> +++ b/parse-options-cb.c
>> @@ -38,7 +38,11 @@ int parse_opt_approxidate_cb(const struct option *opt, 
>> const char *arg,
>>  int parse_opt_expiry_date_cb(const struct option *opt, const char *arg,
>>   int unset)
>>  {
>> -return parse_expiry_date(arg, (timestamp_t *)opt->value);
>> +if (unset)
>> +arg = "never";
>> +if (parse_expiry_date(arg, (timestamp_t *)opt->value))
>> +die("malformed expiration date '%s'", arg);
>> +return 0;
>>  }
>
> Should this error get translated?

Sure.  The new test to check this codepath even protects itself from
such a translation by using test_i18ngrep, so this is safe to mark
for translation from day one.

Thanks.

-- >8 --
Subject: [PATCH v2] parseopt: handle malformed --expire arguments more nicely

A few commands that parse --expire= command line option behave
sillily when given nonsense input.  For example

$ git prune --no-expire
Segmentation falut
$ git prune --expire=npw; echo $?
129

Both come from parse_opt_expiry_date_cb().

The former is because the function is not prepared to see arg==NULL
(for "--no-expire", it is a norm; "--expire" at the end of the
command line could be made to pass NULL, if it is told that the
argument is optional, but we don't so we do not have to worry about
that case).

The latter is because it does not check the value returned from the
underlying parse_expiry_date().

This seems to be a recent regression introduced while we attempted
to avoid spewing the entire usage message when given a correct
option but with an invalid value at 3bb0923f ("parse-options: do not
show usage upon invalid option value", 2018-03-22).  Before that, we
didn't fail silently but showed a full usage help (which arguably is
not all that better).

Also catch this error early when "git gc --prune=" is
misspelled by doing a dummy parsing before the main body of "gc"
that is time consuming even begins.  Otherwise, we'd spend time to
pack objects and then later have "git prune" first notice the error.
Aborting "gc" in the middle that way is not harmful but is ugly and
can be avoided.

Helped-by: Linus Torvalds 
Signed-off-by: Junio C Hamano 
---

 * marking the new message in parse_opt_expiry_date_cb() function
   for i18n is the only change from the previous round.

 builtin/gc.c   |  4 
 parse-options-cb.c |  6 +-
 t/t5304-prune.sh   | 10 ++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 3c5eae0edf..858aa444e1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -353,6 +353,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
const char *name;
pid_t pid;
int daemonized = 0;
+   timestamp_t dummy;
 
struct option builtin_gc_options[] = {
OPT__QUIET(, N_("suppress progress reporting")),
@@ -388,6 +389,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
if (argc > 0)
usage_with_options(builtin_gc_usage, builtin_gc_options);
 
+   if (prune_expire && parse_expiry_date(prune_expire, ))
+   die(_("Failed to parse prune expiry value %s"), prune_expire);
+
if (aggressive) {
argv_array_push(, "-f");
if (aggressive_depth > 0)
diff --git a/parse-options-cb.c b/parse-options-cb.c
index c6679cb2cd..0f9f311a7a 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -38,7 +38,11 @@ int parse_opt_approxidate_cb(const struct option *opt, const 
char *arg,
 int parse_opt_expiry_date_cb(const struct option *opt, const char *arg,
 int unset)
 {
-   return parse_expiry_date(arg, (timestamp_t *)opt->value);
+   if (unset)
+   arg = "never";
+   if (parse_expiry_date(arg, (timestamp_t *)opt->value))
+   die(_("malformed expiration date '%s'"), arg);
+   return 0;
 }
 
 int parse_opt_color_flag_cb(const struct option *opt, const char *arg,
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 6694c19a1e..af69cdc112 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -320,4 +320,14 @@ test_expect_success 'prune: handle HEAD reflog in multiple 
worktrees' '
test_cmp expected actual
 '
 
+test_expect_success 'prune: handle expire option correctly' '
+   test_must_fail git prune --expire 2>error &&
+   test_i18ngrep "requires a value" error &&
+
+   test_must_fail git prune --expire=nyah 2>error &&
+   test_i18ngrep "malformed expiration" error &&
+
+   git prune --no-expire
+'
+
 test_done





[USABILITY] git-reset removes directories

2018-04-20 Thread Yutian Li
Ok I have been bitten by this issue twice. :(

`git reset --hard` will reset modifications in the working directory
(of course). But if I remove a file, add a directory with the same
name, `git reset --hard` will erase that whole directory.
Specifically the following steps:
```
touch file
git add file
git commit -m "add file"
# now the HEAD is constructed
rm file
mkdir file
touch file/a
# now here is the issue
git reset --hard # this will remove the directory and all its files
```

I guess this has to do with Git not tracking "changing a file to a
directory", especially because one is a blob and another is a tree.

In case of `git reset --hard`, people would expect reverting
modifications, bringing up removed files, but not removing a whole
directory with the same name. Especially if the directory is already
populated. Then the whole thing is gone.
I think it would better if Git could output a warning and abort, like
in case of checkout ("Updating the following ... would lose ...").

But honestly I usually take `git reset --hard` seriously and will
double check before invocation. What actually bit me was `git stash`.
It will call `git reset --hard -q` after saving the working directory.
But in the case I just described, since the new directory is not
tracked, it will not be saved and will be erased. `git stash` usually
gives me the mental model of "saving stuff in working directory", but
not like in this case where it also deletes stuff in working
directory.


Re: [PATCH v1 0/5] Allocate cache entries from memory pool

2018-04-20 Thread Jonathan Tan
On Tue, 17 Apr 2018 16:34:39 +
Jameson Miller  wrote:

> Jameson Miller (5):
>   read-cache: teach refresh_cache_entry to take istate
>   Add an API creating / discarding cache_entry structs
>   mem-pool: fill out functionality
>   Allocate cache entries from memory pools
>   Add optional memory validations around cache_entry lifecyle

In this patch set, there is no enforcement that the cache entry created
by make_index_cache_entry() goes into the correct index when
add_index_entry() is invoked. (Junio described similar things, I
believe, in [1].) This might be an issue when we bring up and drop
multiple indexes, and dropping one index causes a cache entry in another
to become invalidated.

One solution is to store the index for which the cache entry was created
in the cache entry itself, but that does increase its size. Another is
to change the API such that a cache entry is created and added in the
same function, and then have some rollback if the cache entry turns out
to be invalid (to support add-empty-entry -> fill -> verify), but I
don't know if this is feasible. Anyway, all these alternatives should be
at least discussed in the commit message, I think.

The make_transient_cache_entry() function might be poorly named, since
as far as I can tell, the entries produced by that function are actually
the longest lasting, since they are never freed.

Along those lines, I was slightly surprised to find out in patch 4 that
cache entry freeing is a no-op. That's fine, but in that case, it would
be better to delete all the calls to the "discard" function, and
document in the others that the entries they create will only be freed
when the memory pool itself is discarded.

[1] https://public-inbox.org/git/xmqqwox5i0f7@gitster-ct.c.googlers.com/


Re: [PATCH v3 01/11] argv_array: offer to split a string by whitespace

2018-04-20 Thread Stefan Beller
On Fri, Apr 20, 2018 at 3:20 PM, Johannes Schindelin
 wrote:
> This is a simple function that will interpret a string as a whitespace
> delimited list of values, and add those values into the array.
>
> Note: this function does not (yet) offer to split by arbitrary delimiters,
> or keep empty values in case of runs of whitespace, or de-quote Unix shell
> style. All fo this functionality can be added later, when and if needed.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  argv-array.c | 20 
>  argv-array.h |  1 +
>  2 files changed, 21 insertions(+)
>
> diff --git a/argv-array.c b/argv-array.c
> index 5d370fa3366..cb5bcd2c064 100644
> --- a/argv-array.c
> +++ b/argv-array.c
> @@ -64,6 +64,26 @@ void argv_array_pop(struct argv_array *array)
> array->argc--;
>  }
>
> +void argv_array_split(struct argv_array *array, const char *to_split)
> +{
> +   while (isspace(*to_split))
> +   to_split++;
> +   for (;;) {
> +   const char *p = to_split;
> +
> +   if (!*p)
> +   break;
> +
> +   while (*p && !isspace(*p))
> +   p++;
> +   argv_array_push_nodup(array, xstrndup(to_split, p - 
> to_split));
> +
> +   while (isspace(*p))
> +   p++;
> +   to_split = p;
> +   }
> +}

The code looks correct to me.

Though this seems so low level, that I find it hard to accept
to implement yet another low level split function.
Would reuse of strbuf_split or string_list_split make sense?

Looking at the user in patch 5 you really want to have the
argv array, though, so it cannot be pushed to an even higher
abstraction layer and be solved there. You really want a
string -> argv array split, which would mean we'd have to
do the split via string -> {strbufs, stringlist} and then perform
a conversion from that to argv array and both conversions
look ugly as we'd need to iterate their specific data structure
and push each element itself again.

So I guess we rather implement it yet another time.

Looking at their function declarations:

/*
 * lots of very good comments for string list splitting
 */
int string_list_split(struct string_list *list, const char *string,
  int delim, int maxsplit);

/*
 * lots of very good comments for strbuf splitting
 */
static inline struct strbuf **strbuf_split(const struct strbuf *sb,
   int terminator)

I find they have comments in common as well as the
terminator. In the commit message you defer the
implementation of a terminating symbol, as we
can do it later. Also the isspace takes more than one
delimiter, which the others do not.

I am debating myself if I want to ask for a comment, as
argv-array.h is very short for now and it would be consistent
not to add a comment.

Thanks,
Stefan


Re: [PATCH v1 3/5] mem-pool: fill out functionality

2018-04-20 Thread Jonathan Tan
On Tue, 17 Apr 2018 16:34:42 +
Jameson Miller  wrote:

> @@ -19,8 +19,27 @@ struct mem_pool {
>  
>   /* The total amount of memory allocated by the pool. */
>   size_t pool_alloc;
> +
> + /*
> +  * Array of pointers to "custom size" memory allocations.
> +  * This is used for "large" memory allocations.
> +  * The *_end variables are used to track the range of memory
> +  * allocated.
> +  */
> + void **custom, **custom_end;
> + int nr, nr_end, alloc, alloc_end;

This seems overly complicated - the struct mem_pool already has a linked
list of pages, so couldn't you create a custom page and insert it behind
the current front page instead whenever you needed a large-size page?

Also, when combining, there could be some wasted space on one of the
pages. I'm not sure if that's worth calling out, though.


Re: [PATCH v7 06/17] sequencer: introduce new commands to reset the revision

2018-04-20 Thread Johannes Schindelin
Hi Phillip,

On Fri, 20 Apr 2018, Phillip Wood wrote:

> On 19/04/18 13:20, Johannes Schindelin wrote:
>
> [... please cull long stretches of quoted mail that is not responded to ...]
>
> > @@ -2665,6 +2846,12 @@ static int pick_commits(struct todo_list *todo_list, 
> > struct replay_opts *opts)
> > /* `current` will be incremented below */
> > todo_list->current = -1;
> > }
> > +   } else if (item->command == TODO_LABEL) {
> > +   if ((res = do_label(item->arg, item->arg_len)))
> > +   goto reschedule;
> 
> I can see why you've implemented like this but I'm uneasy with jumping
> into a block guarded with "if (item->command <= TODO_SQUASH)" when
> item->command > TODO_SQUASH. I think it works OK at the moment but it's
> possible that in the future someone will edit that block of code and add
> something like
> 
> if (item->command == TODO_PICK)
>   do_something()
> else
>   do_something_else()
> 
> assuming that item->command <= TODO_SQUASH because they haven't noticed
> the goto jumping back into that block.

I changed it by duplicating the rescheduling, as I agree that it is
somewhat dangerous what with all the code going on after the rescheduling
of a pick/fixup/squash/reword.

My plan is to go over the documentation changes once more tomorrow, with a
fresh set of eyes, and then submit the hopefully final iteration of this
patch series.

Ciao,
Dscho


[PATCH v3 10/11] technical/shallow: describe why shallow cannot use replace refs

2018-04-20 Thread Johannes Schindelin
It is tempting to do away with commit_graft altogether (in the long
haul), now that grafts are deprecated.

However, the shallow feature needs a couple of things that the replace
refs cannot fulfill. Let's point that out in the documentation.

Signed-off-by: Johannes Schindelin 
---
 Documentation/technical/shallow.txt | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/technical/shallow.txt 
b/Documentation/technical/shallow.txt
index b3ff23c25f6..cb79181c2bb 100644
--- a/Documentation/technical/shallow.txt
+++ b/Documentation/technical/shallow.txt
@@ -25,6 +25,13 @@ Each line contains exactly one SHA-1. When read, a 
commit_graft
 will be constructed, which has nr_parent < 0 to make it easier
 to discern from user provided grafts.
 
+Note that the shallow feature could not be changed easily to
+use replace refs: a commit containing a `mergetag` is not allowed
+to be replaced, not even by a root commit. Such a commit can be
+made shallow, though. Also, having a `shallow` file explicitly
+listing all the commits made shallow makes it a *lot* easier to
+do shallow-specific things such as to deepen the history.
+
 Since fsck-objects relies on the library to read the objects,
 it honours shallow commits automatically.
 
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH v3 11/11] Remove obsolete script to convert grafts to replace refs

2018-04-20 Thread Johannes Schindelin
The functionality is now implemented as `git replace
--convert-graft-file`.

Signed-off-by: Johannes Schindelin 
---
 contrib/convert-grafts-to-replace-refs.sh | 28 ---
 1 file changed, 28 deletions(-)
 delete mode 100755 contrib/convert-grafts-to-replace-refs.sh

diff --git a/contrib/convert-grafts-to-replace-refs.sh 
b/contrib/convert-grafts-to-replace-refs.sh
deleted file mode 100755
index 0cbc917b8cf..000
--- a/contrib/convert-grafts-to-replace-refs.sh
+++ /dev/null
@@ -1,28 +0,0 @@
-#!/bin/sh
-
-# You should execute this script in the repository where you
-# want to convert grafts to replace refs.
-
-GRAFTS_FILE="${GIT_DIR:-.git}/info/grafts"
-
-. $(git --exec-path)/git-sh-setup
-
-test -f "$GRAFTS_FILE" || die "Could not find graft file: '$GRAFTS_FILE'"
-
-grep '^[^# ]' "$GRAFTS_FILE" |
-while read definition
-do
-   if test -n "$definition"
-   then
-   echo "Converting: $definition"
-   git replace --graft $definition ||
-   die "Conversion failed for: $definition"
-   fi
-done
-
-mv "$GRAFTS_FILE" "$GRAFTS_FILE.bak" ||
-   die "Could not rename '$GRAFTS_FILE' to '$GRAFTS_FILE.bak'"
-
-echo "Success!"
-echo "All the grafts in '$GRAFTS_FILE' have been converted to replace refs!"
-echo "The grafts file '$GRAFTS_FILE' has been renamed: '$GRAFTS_FILE.bak'"
-- 
2.17.0.windows.1.15.gaa56ade3205


[PATCH v3 09/11] technical/shallow: describe the relationship with replace refs

2018-04-20 Thread Johannes Schindelin
Now that grafts are deprecated, we should start to assume that readers
have no idea what grafts are. So it makes more sense to describe the
"shallow" feature in terms of replace refs.

Suggested-by: Eric Sunshine 
Signed-off-by: Johannes Schindelin 
---
 Documentation/technical/shallow.txt | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/Documentation/technical/shallow.txt 
b/Documentation/technical/shallow.txt
index 5183b154229..b3ff23c25f6 100644
--- a/Documentation/technical/shallow.txt
+++ b/Documentation/technical/shallow.txt
@@ -9,14 +9,17 @@ these commits have no parents.
 *
 
 The basic idea is to write the SHA-1s of shallow commits into
-$GIT_DIR/shallow, and handle its contents like the contents
-of $GIT_DIR/info/grafts (with the difference that shallow
-cannot contain parent information).
-
-This information is stored in a new file instead of grafts, or
-even the config, since the user should not touch that file
-at all (even throughout development of the shallow clone, it
-was never manually edited!).
+$GIT_DIR/shallow, and handle its contents similar to replace
+refs (with the difference that shallow does not actually
+create those replace refs) and very much like the deprecated
+graft file (with the difference that shallow commits will
+always have their parents grafted away, not replaced by
+different parents).
+
+This information is stored in a special-purpose file because the
+user should not touch that file at all (even throughout
+development of the shallow clone, it was never manually
+edited!).
 
 Each line contains exactly one SHA-1. When read, a commit_graft
 will be constructed, which has nr_parent < 0 to make it easier
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH v3 08/11] filter-branch: stop suggesting to use grafts

2018-04-20 Thread Johannes Schindelin
The graft file is deprecated now, so let's use replace refs in the example
in filter-branch's man page instead.

Suggested-by: Eric Sunshine 
Signed-off-by: Johannes Schindelin 
---
 Documentation/git-filter-branch.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-filter-branch.txt 
b/Documentation/git-filter-branch.txt
index b634043183b..1d4d2f86045 100644
--- a/Documentation/git-filter-branch.txt
+++ b/Documentation/git-filter-branch.txt
@@ -288,7 +288,7 @@ git filter-branch --parent-filter \
 or even simpler:
 
 ---
-echo "$commit-id $graft-id" >> .git/info/grafts
+git replace --graft $commit-id $graft-id
 git filter-branch $graft-id..HEAD
 ---
 
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH v3 07/11] Deprecate support for .git/info/grafts

2018-04-20 Thread Johannes Schindelin
The grafts feature was a convenient way to "stitch together" ancient
history to the fresh start of linux.git.

Its implementation is, however, not up to Git's standards, as there are
too many ways where it can lead to surprising and unwelcome behavior.

For example, when pushing from a repository with active grafts, it is
possible to miss commits that have been "grafted out", resulting in a
broken state on the other side.

Also, the grafts feature is limited to "rewriting" commits' list of
parents, it cannot replace anything else.

The much younger feature implemented as `git replace` set out to remedy
those limitations and dangerous bugs.

Seeing as `git replace` is pretty mature by now (since 4228e8bc98
(replace: add --graft option, 2014-07-19) it can perform the graft
file's duties), it is time to deprecate support for the graft file, and
to retire it eventually.

Signed-off-by: Johannes Schindelin 
Reviewed-by: Stefan Beller 
---
 advice.c  |  2 ++
 advice.h  |  1 +
 commit.c  | 10 ++
 t/t6001-rev-list-graft.sh |  9 +
 4 files changed, 22 insertions(+)

diff --git a/advice.c b/advice.c
index 406efc183ba..4411704fd45 100644
--- a/advice.c
+++ b/advice.c
@@ -19,6 +19,7 @@ int advice_rm_hints = 1;
 int advice_add_embedded_repo = 1;
 int advice_ignored_hook = 1;
 int advice_waiting_for_editor = 1;
+int advice_graft_file_deprecated = 1;
 
 static struct {
const char *name;
@@ -42,6 +43,7 @@ static struct {
{ "addembeddedrepo", _add_embedded_repo },
{ "ignoredhook", _ignored_hook },
{ "waitingforeditor", _waiting_for_editor },
+   { "graftfiledeprecated", _graft_file_deprecated },
 
/* make this an alias for backward compatibility */
{ "pushnonfastforward", _push_update_rejected }
diff --git a/advice.h b/advice.h
index 70568fa7922..9f5064e82a8 100644
--- a/advice.h
+++ b/advice.h
@@ -21,6 +21,7 @@ extern int advice_rm_hints;
 extern int advice_add_embedded_repo;
 extern int advice_ignored_hook;
 extern int advice_waiting_for_editor;
+extern int advice_graft_file_deprecated;
 
 int git_default_advice_config(const char *var, const char *value);
 __attribute__((format (printf, 1, 2)))
diff --git a/commit.c b/commit.c
index 2952ec987c5..451d3ce8dfe 100644
--- a/commit.c
+++ b/commit.c
@@ -12,6 +12,7 @@
 #include "prio-queue.h"
 #include "sha1-lookup.h"
 #include "wt-status.h"
+#include "advice.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char 
*buf, size_t len, const char **);
 
@@ -176,6 +177,15 @@ static int read_graft_file(const char *graft_file)
struct strbuf buf = STRBUF_INIT;
if (!fp)
return -1;
+   if (advice_graft_file_deprecated)
+   advise(_("Support for /info/grafts is deprecated\n"
+"and will be removed in a future Git version.\n"
+"\n"
+"Please use \"git replace --convert-graft-file\"\n"
+"to convert the grafts into replace refs.\n"
+"\n"
+"Turn this message off by running\n"
+"\"git config advice.graftFileDeprecated false\""));
while (!strbuf_getwholeline(, fp, '\n')) {
/* The format is just "Commit Parent1 Parent2 ...\n" */
struct commit_graft *graft = read_graft_line();
diff --git a/t/t6001-rev-list-graft.sh b/t/t6001-rev-list-graft.sh
index 05ddc69cf2a..7504ba47511 100755
--- a/t/t6001-rev-list-graft.sh
+++ b/t/t6001-rev-list-graft.sh
@@ -110,4 +110,13 @@ do
"
 
 done
+
+test_expect_success 'show advice that grafts are deprecated' '
+   git show HEAD 2>err &&
+   test_i18ngrep "git replace" err &&
+   test_config advice.graftFileDeprecated false &&
+   git show HEAD 2>err &&
+   test_i18ngrep ! "git replace" err
+'
+
 test_done
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH v3 06/11] Add a test for `git replace --convert-graft-file`

2018-04-20 Thread Johannes Schindelin
The proof, as the saying goes, lies in the pudding. So here is a
regression test that not only demonstrates what the option is supposed to
accomplish, but also demonstrates that it does accomplish it.

Signed-off-by: Johannes Schindelin 
---
 t/t6050-replace.sh | 20 
 1 file changed, 20 insertions(+)

diff --git a/t/t6050-replace.sh b/t/t6050-replace.sh
index c630aba657e..8a3ee7c3db9 100755
--- a/t/t6050-replace.sh
+++ b/t/t6050-replace.sh
@@ -444,4 +444,24 @@ test_expect_success GPG '--graft on a commit with a 
mergetag' '
git replace -d $HASH10
 '
 
+test_expect_success '--convert-graft-file' '
+   : add and convert graft file &&
+   printf "%s\n%s %s\n\n# comment\n%s\n" \
+   $(git rev-parse HEAD^^ HEAD^ HEAD^^ HEAD^2) \
+   >.git/info/grafts &&
+   git replace --convert-graft-file &&
+   test_path_is_missing .git/info/grafts &&
+
+   : verify that the history is now "grafted" &&
+   git rev-list HEAD >out &&
+   test_line_count = 4 out &&
+
+   : create invalid graft file and verify that it is not deleted &&
+   test_when_finished "rm -f .git/info/grafts" &&
+   echo $EMPTY_BLOB $EMPTY_TREE >.git/info/grafts &&
+   test_must_fail git replace --convert-graft-file 2>err &&
+   grep "$EMPTY_BLOB $EMPTY_TREE" err &&
+   grep "$EMPTY_BLOB $EMPTY_TREE" .git/info/grafts
+'
+
 test_done
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH v3 04/11] replace: "libify" create_graft() and callees

2018-04-20 Thread Johannes Schindelin
File this away as yet another patch in the "libification" category.

As with all useful functions, in the next commit we want to use
create_graft() from a higher-level function where it would be
inconvenient if the called function simply die()s: if there is a
problem, we want to let the user know how to proceed, and the callee
simply has no way of knowing what to say.

Signed-off-by: Johannes Schindelin 
---
 builtin/replace.c | 135 --
 1 file changed, 84 insertions(+), 51 deletions(-)

diff --git a/builtin/replace.c b/builtin/replace.c
index e345a5a0f1c..f63df405fd0 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -79,9 +79,9 @@ static int list_replace_refs(const char *pattern, const char 
*format)
else if (!strcmp(format, "long"))
data.format = REPLACE_FORMAT_LONG;
else
-   die("invalid replace format '%s'\n"
-   "valid formats are 'short', 'medium' and 'long'\n",
-   format);
+   return error("invalid replace format '%s'\n"
+"valid formats are 'short', 'medium' and 'long'\n",
+format);
 
for_each_replace_ref(show_reference, (void *));
 
@@ -134,7 +134,7 @@ static int delete_replace_ref(const char *name, const char 
*ref,
return 0;
 }
 
-static void check_ref_valid(struct object_id *object,
+static int check_ref_valid(struct object_id *object,
struct object_id *prev,
struct strbuf *ref,
int force)
@@ -142,12 +142,13 @@ static void check_ref_valid(struct object_id *object,
strbuf_reset(ref);
strbuf_addf(ref, "%s%s", git_replace_ref_base, oid_to_hex(object));
if (check_refname_format(ref->buf, 0))
-   die("'%s' is not a valid ref name.", ref->buf);
+   return error("'%s' is not a valid ref name.", ref->buf);
 
if (read_ref(ref->buf, prev))
oidclr(prev);
else if (!force)
-   die("replace ref '%s' already exists", ref->buf);
+   return error("replace ref '%s' already exists", ref->buf);
+   return 0;
 }
 
 static int replace_object_oid(const char *object_ref,
@@ -161,28 +162,31 @@ static int replace_object_oid(const char *object_ref,
struct strbuf ref = STRBUF_INIT;
struct ref_transaction *transaction;
struct strbuf err = STRBUF_INIT;
+   int res = 0;
 
obj_type = oid_object_info(object, NULL);
repl_type = oid_object_info(repl, NULL);
if (!force && obj_type != repl_type)
-   die("Objects must be of the same type.\n"
-   "'%s' points to a replaced object of type '%s'\n"
-   "while '%s' points to a replacement object of type '%s'.",
-   object_ref, type_name(obj_type),
-   replace_ref, type_name(repl_type));
+   return error("Objects must be of the same type.\n"
+"'%s' points to a replaced object of type '%s'\n"
+"while '%s' points to a replacement object of "
+"type '%s'.",
+object_ref, type_name(obj_type),
+replace_ref, type_name(repl_type));
 
-   check_ref_valid(object, , , force);
+   if (check_ref_valid(object, , , force))
+   return -1;
 
transaction = ref_transaction_begin();
if (!transaction ||
ref_transaction_update(transaction, ref.buf, repl, ,
   0, NULL, ) ||
ref_transaction_commit(transaction, ))
-   die("%s", err.buf);
+   res = error("%s", err.buf);
 
ref_transaction_free(transaction);
strbuf_release();
-   return 0;
+   return res;
 }
 
 static int replace_object(const char *object_ref, const char *replace_ref, int 
force)
@@ -190,9 +194,11 @@ static int replace_object(const char *object_ref, const 
char *replace_ref, int f
struct object_id object, repl;
 
if (get_oid(object_ref, ))
-   die("Failed to resolve '%s' as a valid ref.", object_ref);
+   return error("Failed to resolve '%s' as a valid ref.",
+object_ref);
if (get_oid(replace_ref, ))
-   die("Failed to resolve '%s' as a valid ref.", replace_ref);
+   return error("Failed to resolve '%s' as a valid ref.",
+replace_ref);
 
return replace_object_oid(object_ref, , replace_ref, , 
force);
 }
@@ -202,7 +208,7 @@ static int replace_object(const char *object_ref, const 
char *replace_ref, int f
  * If "raw" is true, then the object's raw contents are printed according to
  * "type". Otherwise, we pretty-print the contents for human editing.
  */
-static void 

[PATCH v3 05/11] replace: introduce --convert-graft-file

2018-04-20 Thread Johannes Schindelin
This option is intended to help with the transition away from the
now-deprecated graft file.

Signed-off-by: Johannes Schindelin 
---
 Documentation/git-replace.txt | 11 ++---
 builtin/replace.c | 44 ++-
 2 files changed, 51 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-replace.txt b/Documentation/git-replace.txt
index e5c57ae6ef4..246dc9943c2 100644
--- a/Documentation/git-replace.txt
+++ b/Documentation/git-replace.txt
@@ -11,6 +11,7 @@ SYNOPSIS
 'git replace' [-f]  
 'git replace' [-f] --edit 
 'git replace' [-f] --graft  [...]
+'git replace' [-f] --convert-graft-file
 'git replace' -d ...
 'git replace' [--format=] [-l []]
 
@@ -87,9 +88,13 @@ OPTIONS
content as  except that its parents will be
[...] instead of 's parents. A replacement ref
is then created to replace  with the newly created
-   commit. See contrib/convert-grafts-to-replace-refs.sh for an
-   example script based on this option that can convert grafts to
-   replace refs.
+   commit. Use `--convert-graft-file` to convert a
+   `$GIT_DIR/info/grafts` file and use replace refs instead.
+
+--convert-graft-file::
+   Creates graft commits for all entries in `$GIT_DIR/info/grafts`
+   and deletes that file upon success. The purpose is to help users
+   with transitioning off of the now-deprecated graft file.
 
 -l ::
 --list ::
diff --git a/builtin/replace.c b/builtin/replace.c
index f63df405fd0..6b3e0301e90 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -20,6 +20,7 @@ static const char * const git_replace_usage[] = {
N_("git replace [-f]  "),
N_("git replace [-f] --edit "),
N_("git replace [-f] --graft  [...]"),
+   N_("git replace [-f] --convert-graft-file"),
N_("git replace -d ..."),
N_("git replace [--format=] [-l []]"),
NULL
@@ -454,6 +455,38 @@ static int create_graft(int argc, const char **argv, int 
force)
return replace_object_oid(old_ref, _oid, "replacement", _oid, 
force);
 }
 
+static int convert_graft_file(int force)
+{
+   const char *graft_file = get_graft_file();
+   FILE *fp = fopen_or_warn(graft_file, "r");
+   struct strbuf buf = STRBUF_INIT, err = STRBUF_INIT;
+   struct argv_array args = ARGV_ARRAY_INIT;
+
+   if (!fp)
+   return -1;
+
+   while (strbuf_getline(, fp) != EOF) {
+   if (*buf.buf == '#')
+   continue;
+
+   argv_array_split(, buf.buf);
+   if (args.argc && create_graft(args.argc, args.argv, force))
+   strbuf_addf(, "\n\t%s", buf.buf);
+   argv_array_clear();
+   }
+
+   strbuf_release();
+   argv_array_clear();
+
+   if (!err.len)
+   return unlink_or_warn(graft_file);
+
+   warning(_("could not convert the following graft(s):\n%s"), err.buf);
+   strbuf_release();
+
+   return -1;
+}
+
 int cmd_replace(int argc, const char **argv, const char *prefix)
 {
int force = 0;
@@ -465,6 +498,7 @@ int cmd_replace(int argc, const char **argv, const char 
*prefix)
MODE_DELETE,
MODE_EDIT,
MODE_GRAFT,
+   MODE_CONVERT_GRAFT_FILE,
MODE_REPLACE
} cmdmode = MODE_UNSPECIFIED;
struct option options[] = {
@@ -472,6 +506,7 @@ int cmd_replace(int argc, const char **argv, const char 
*prefix)
OPT_CMDMODE('d', "delete", , N_("delete replace refs"), 
MODE_DELETE),
OPT_CMDMODE('e', "edit", , N_("edit existing object"), 
MODE_EDIT),
OPT_CMDMODE('g', "graft", , N_("change a commit's 
parents"), MODE_GRAFT),
+   OPT_CMDMODE(0, "convert-graft-file", , N_("convert 
existing graft file"), MODE_CONVERT_GRAFT_FILE),
OPT_BOOL_F('f', "force", , N_("replace the ref if it 
exists"),
   PARSE_OPT_NOCOMPLETE),
OPT_BOOL(0, "raw", , N_("do not pretty-print contents for 
--edit")),
@@ -494,7 +529,8 @@ int cmd_replace(int argc, const char **argv, const char 
*prefix)
if (force &&
cmdmode != MODE_REPLACE &&
cmdmode != MODE_EDIT &&
-   cmdmode != MODE_GRAFT)
+   cmdmode != MODE_GRAFT &&
+   cmdmode != MODE_CONVERT_GRAFT_FILE)
usage_msg_opt("-f only makes sense when writing a replacement",
  git_replace_usage, options);
 
@@ -527,6 +563,12 @@ int cmd_replace(int argc, const char **argv, const char 
*prefix)
  git_replace_usage, options);
return create_graft(argc, argv, force);
 
+   case MODE_CONVERT_GRAFT_FILE:
+   if (argc != 0)
+   usage_msg_opt("--convert-graft-file takes no argument",
+ git_replace_usage, options);
+   return 

[PATCH v3 03/11] replace: avoid using die() to indicate a bug

2018-04-20 Thread Johannes Schindelin
We have the BUG() macro for that purpose.

Signed-off-by: Johannes Schindelin 
---
 builtin/replace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/replace.c b/builtin/replace.c
index 245d3f4164e..e345a5a0f1c 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -501,6 +501,6 @@ int cmd_replace(int argc, const char **argv, const char 
*prefix)
return list_replace_refs(argv[0], format);
 
default:
-   die("BUG: invalid cmdmode %d", (int)cmdmode);
+   BUG("invalid cmdmode %d", (int)cmdmode);
}
 }
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH v3 02/11] commit: Let the callback of for_each_mergetag return on error

2018-04-20 Thread Johannes Schindelin
This is yet another patch to be filed under the keyword "libification".

There is one subtle change in behavior here, where a `git log` that has
been asked to show the mergetags would now stop reporting the mergetags
upon the first failure, whereas previously, it would have continued to the
next mergetag, if any.

In practice, that change should not matter, as it is 1) uncommon to
perform octopus merges using multiple tags as merge heads, and 2) when the
user asks to be shown those tags, they really should be there.

Signed-off-by: Johannes Schindelin 
---
 builtin/replace.c |  8 
 commit.c  |  8 +---
 commit.h  |  4 ++--
 log-tree.c| 13 +++--
 4 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/builtin/replace.c b/builtin/replace.c
index 935647be6bd..245d3f4164e 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -345,7 +345,7 @@ struct check_mergetag_data {
const char **argv;
 };
 
-static void check_one_mergetag(struct commit *commit,
+static int check_one_mergetag(struct commit *commit,
   struct commit_extra_header *extra,
   void *data)
 {
@@ -368,20 +368,20 @@ static void check_one_mergetag(struct commit *commit,
if (get_oid(mergetag_data->argv[i], ) < 0)
die(_("Not a valid object name: '%s'"), 
mergetag_data->argv[i]);
if (!oidcmp(>tagged->oid, ))
-   return; /* found */
+   return 0; /* found */
}
 
die(_("original commit '%s' contains mergetag '%s' that is discarded; "
  "use --edit instead of --graft"), ref, oid_to_hex(_oid));
 }
 
-static void check_mergetags(struct commit *commit, int argc, const char **argv)
+static int check_mergetags(struct commit *commit, int argc, const char **argv)
 {
struct check_mergetag_data mergetag_data;
 
mergetag_data.argc = argc;
mergetag_data.argv = argv;
-   for_each_mergetag(check_one_mergetag, commit, _data);
+   return for_each_mergetag(check_one_mergetag, commit, _data);
 }
 
 static int create_graft(int argc, const char **argv, int force)
diff --git a/commit.c b/commit.c
index ca474a7c112..2952ec987c5 100644
--- a/commit.c
+++ b/commit.c
@@ -1288,17 +1288,19 @@ struct commit_extra_header 
*read_commit_extra_headers(struct commit *commit,
return extra;
 }
 
-void for_each_mergetag(each_mergetag_fn fn, struct commit *commit, void *data)
+int for_each_mergetag(each_mergetag_fn fn, struct commit *commit, void *data)
 {
struct commit_extra_header *extra, *to_free;
+   int res = 0;
 
to_free = read_commit_extra_headers(commit, NULL);
-   for (extra = to_free; extra; extra = extra->next) {
+   for (extra = to_free; !res && extra; extra = extra->next) {
if (strcmp(extra->key, "mergetag"))
continue; /* not a merge tag */
-   fn(commit, extra, data);
+   res = fn(commit, extra, data);
}
free_commit_extra_headers(to_free);
+   return res;
 }
 
 static inline int standard_header_field(const char *field, size_t len)
diff --git a/commit.h b/commit.h
index 0fb8271665c..9000895ad91 100644
--- a/commit.h
+++ b/commit.h
@@ -291,10 +291,10 @@ extern const char *find_commit_header(const char *msg, 
const char *key,
 /* Find the end of the log message, the right place for a new trailer. */
 extern int ignore_non_trailer(const char *buf, size_t len);
 
-typedef void (*each_mergetag_fn)(struct commit *commit, struct 
commit_extra_header *extra,
+typedef int (*each_mergetag_fn)(struct commit *commit, struct 
commit_extra_header *extra,
 void *cb_data);
 
-extern void for_each_mergetag(each_mergetag_fn fn, struct commit *commit, void 
*data);
+extern int for_each_mergetag(each_mergetag_fn fn, struct commit *commit, void 
*data);
 
 struct merge_remote_desc {
struct object *obj; /* the named object, could be a tag */
diff --git a/log-tree.c b/log-tree.c
index d1c0bedf244..f3a51a6e726 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -488,9 +488,9 @@ static int is_common_merge(const struct commit *commit)
&& !commit->parents->next->next);
 }
 
-static void show_one_mergetag(struct commit *commit,
- struct commit_extra_header *extra,
- void *data)
+static int show_one_mergetag(struct commit *commit,
+struct commit_extra_header *extra,
+void *data)
 {
struct rev_info *opt = (struct rev_info *)data;
struct object_id oid;
@@ -502,7 +502,7 @@ static void show_one_mergetag(struct commit *commit,
hash_object_file(extra->value, extra->len, type_name(OBJ_TAG), );
tag = lookup_tag();
if (!tag)
-   return; /* error message already given */
+   

[PATCH v3 01/11] argv_array: offer to split a string by whitespace

2018-04-20 Thread Johannes Schindelin
This is a simple function that will interpret a string as a whitespace
delimited list of values, and add those values into the array.

Note: this function does not (yet) offer to split by arbitrary delimiters,
or keep empty values in case of runs of whitespace, or de-quote Unix shell
style. All fo this functionality can be added later, when and if needed.

Signed-off-by: Johannes Schindelin 
---
 argv-array.c | 20 
 argv-array.h |  1 +
 2 files changed, 21 insertions(+)

diff --git a/argv-array.c b/argv-array.c
index 5d370fa3366..cb5bcd2c064 100644
--- a/argv-array.c
+++ b/argv-array.c
@@ -64,6 +64,26 @@ void argv_array_pop(struct argv_array *array)
array->argc--;
 }
 
+void argv_array_split(struct argv_array *array, const char *to_split)
+{
+   while (isspace(*to_split))
+   to_split++;
+   for (;;) {
+   const char *p = to_split;
+
+   if (!*p)
+   break;
+
+   while (*p && !isspace(*p))
+   p++;
+   argv_array_push_nodup(array, xstrndup(to_split, p - to_split));
+
+   while (isspace(*p))
+   p++;
+   to_split = p;
+   }
+}
+
 void argv_array_clear(struct argv_array *array)
 {
if (array->argv != empty_argv) {
diff --git a/argv-array.h b/argv-array.h
index 29056e49a12..c7c397695df 100644
--- a/argv-array.h
+++ b/argv-array.h
@@ -19,6 +19,7 @@ LAST_ARG_MUST_BE_NULL
 void argv_array_pushl(struct argv_array *, ...);
 void argv_array_pushv(struct argv_array *, const char **);
 void argv_array_pop(struct argv_array *);
+void argv_array_split(struct argv_array *, const char *);
 void argv_array_clear(struct argv_array *);
 const char **argv_array_detach(struct argv_array *);
 
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH v3 00/11] Deprecate .git/info/grafts

2018-04-20 Thread Johannes Schindelin
It is fragile, as there is no way for the revision machinery to say "but
now I want to traverse the graph ignoring the graft file" e.g. when
pushing commits to a remote repository (which, as a consequence, can
miss commits).

And we already have a better solution with `git replace --graft 
[...]`.

Changes since v2:

- Fixed tyop in the description of --graft where it suggests
  --convert-graft-file instead.

- Dropped the rant from the "libify create_graft()" commit.

- Changed a die("BUG: ...") call to BUG() (instead of libifying it).

- Truly libified create_graft() by replacing every single instance where die()
  (or lookup_commit_or_die()) was called with a proper error(), adding resource
  handling to avoid leaks, so that that rant that was dropped from the commit
  message would now truly be merited.

- Graft files with commented lines are now handled correctly.

- Graft files with empty lines are now handled correctly, too.

- The graft file parser is now based on a shiny new argv_array_split().

- The script in contrib/ whose functionality is superseded by `git replace
  --convert-graft-file` was removed.


Johannes Schindelin (11):
  argv_array: offer to split a string by whitespace
  commit: Let the callback of for_each_mergetag return on error
  replace: avoid using die() to indicate a bug
  replace: "libify" create_graft() and callees
  replace: introduce --convert-graft-file
  Add a test for `git replace --convert-graft-file`
  Deprecate support for .git/info/grafts
  filter-branch: stop suggesting to use grafts
  technical/shallow: describe the relationship with replace refs
  technical/shallow: describe why shallow cannot use replace refs
  Remove obsolete script to convert grafts to replace refs

 Documentation/git-filter-branch.txt   |   2 +-
 Documentation/git-replace.txt |  11 +-
 Documentation/technical/shallow.txt   |  24 ++-
 advice.c  |   2 +
 advice.h  |   1 +
 argv-array.c  |  20 +++
 argv-array.h  |   1 +
 builtin/replace.c | 189 +++---
 commit.c  |  18 ++-
 commit.h  |   4 +-
 contrib/convert-grafts-to-replace-refs.sh |  28 
 log-tree.c|  13 +-
 t/t6001-rev-list-graft.sh |   9 ++
 t/t6050-replace.sh|  20 +++
 14 files changed, 235 insertions(+), 107 deletions(-)
 delete mode 100755 contrib/convert-grafts-to-replace-refs.sh


base-commit: fe0a9eaf31dd0c349ae4308498c33a5c3794b293
Published-As: https://github.com/dscho/git/releases/tag/deprecate-grafts-v3
Fetch-It-Via: git fetch https://github.com/dscho/git deprecate-grafts-v3

Interdiff vs v2:
 diff --git a/Documentation/git-replace.txt b/Documentation/git-replace.txt
 index 4dc0686f7d6..246dc9943c2 100644
 --- a/Documentation/git-replace.txt
 +++ b/Documentation/git-replace.txt
 @@ -89,7 +89,7 @@ OPTIONS
[...] instead of 's parents. A replacement ref
is then created to replace  with the newly created
commit. Use `--convert-graft-file` to convert a
 -  `$GIT_DIR/info/grafts` file use replace refs instead.
 +  `$GIT_DIR/info/grafts` file and use replace refs instead.
  
  --convert-graft-file::
Creates graft commits for all entries in `$GIT_DIR/info/grafts`
 diff --git a/argv-array.c b/argv-array.c
 index 5d370fa3366..cb5bcd2c064 100644
 --- a/argv-array.c
 +++ b/argv-array.c
 @@ -64,6 +64,26 @@ void argv_array_pop(struct argv_array *array)
array->argc--;
  }
  
 +void argv_array_split(struct argv_array *array, const char *to_split)
 +{
 +  while (isspace(*to_split))
 +  to_split++;
 +  for (;;) {
 +  const char *p = to_split;
 +
 +  if (!*p)
 +  break;
 +
 +  while (*p && !isspace(*p))
 +  p++;
 +  argv_array_push_nodup(array, xstrndup(to_split, p - to_split));
 +
 +  while (isspace(*p))
 +  p++;
 +  to_split = p;
 +  }
 +}
 +
  void argv_array_clear(struct argv_array *array)
  {
if (array->argv != empty_argv) {
 diff --git a/argv-array.h b/argv-array.h
 index 29056e49a12..c7c397695df 100644
 --- a/argv-array.h
 +++ b/argv-array.h
 @@ -19,6 +19,7 @@ LAST_ARG_MUST_BE_NULL
  void argv_array_pushl(struct argv_array *, ...);
  void argv_array_pushv(struct argv_array *, const char **);
  void argv_array_pop(struct argv_array *);
 +void argv_array_split(struct argv_array *, const char *);
  void argv_array_clear(struct argv_array *);
  const char **argv_array_detach(struct argv_array *);
  
 diff --git a/builtin/replace.c b/builtin/replace.c
 index 4cdc00a96df..6b3e0301e90 100644
 --- a/builtin/replace.c
 +++ b/builtin/replace.c
 @@ -80,9 +80,9 @@ static int list_replace_refs(const char *pattern, const char 

js/rebase-recreate-merges, was Re: What's cooking in git.git (Apr 2018, #02; Tue, 17)

2018-04-20 Thread Johannes Schindelin
Hi Junio,

On Tue, 17 Apr 2018, Junio C Hamano wrote:

> * js/rebase-recreate-merge (2018-04-11) 15 commits
>  - rebase -i --rebase-merges: add a section to the man page
>  - rebase -i: introduce --rebase-merges=[no-]rebase-cousins
>  - pull: accept --rebase=merges to recreate the branch topology
>  - rebase --rebase-merges: avoid "empty merges"
>  - sequencer: handle post-rewrite for merge commands
>  - sequencer: make refs generated by the `label` command worktree-local
>  - rebase --rebase-merges: add test for --keep-empty
>  - rebase: introduce the --rebase-merges option
>  - rebase-helper --make-script: introduce a flag to rebase merges
>  - sequencer: fast-forward `merge` commands, if possible
>  - sequencer: introduce the `merge` command
>  - sequencer: introduce new commands to reset the revision
>  - git-rebase--interactive: clarify arguments
>  - sequencer: make rearrange_squash() a bit more obvious
>  - sequencer: avoid using errno clobbered by rollback_lock_file()
> 
>  "git rebase" learned "--rebase-merges" to transplant the whole
>  topology of commit graph elsewhere.
> 
>  This looked more or less ready for 'next'.  Please stop me if there
>  are remaining issues I forgot about.

Phillip pointed out some ugliness in the way I reschedule commands after
they failed fatally, and I will fix that, I will also fix the erroneous
commit message starting with "# This is a combination of ...", and then I
wanted to go over the added documentation one last time (with a fresh set
of eyes, after looking at so much other code).

So: please do not yet merge this to `next`. I think I need one more
iteration.

Ciao,
Dscho


Re: [PATCH v3] fast-export: fix regression skipping some merge-commits

2018-04-20 Thread Eric Sunshine
On Fri, Apr 20, 2018 at 6:12 PM, Martin Ågren  wrote:
> 7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
> noted that the pattern `object = array.objects[--array.nr].item` could
> be abstracted as `object = object_array_pop()`.
>
> Unfortunately, one of the conversions was horribly wrong. Between
> grabbing the last object (i.e., peeking at it) and decreasing the object
> count, the original code would sometimes return early. The updated code
> on the other hand, will always pop the last element, then maybe do the
> early return without doing anything with the object.
>
> The end result is that merge commits where all the parents have still
> not been exported will simply be dropped, meaning that they will be
> completely missing from the exported data.
>
> Re-add a commit when it is not yet time to handle it. An alternative
> that was considered was to peek-then-pop. That carries some risk with it
> since the peeking and poping need to act on the same object, in a

s/poping/popping/

> concerted fashion.
>
> Add a test that would have caught this.
>
> Reported-by: Isaac Chou 
> Analyzed-by: Isaac Chou 
> Helped-by: Johannes Schindelin 
> Signed-off-by: Martin Ågren 


[PATCH v3] fast-export: fix regression skipping some merge-commits

2018-04-20 Thread Martin Ågren
7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
noted that the pattern `object = array.objects[--array.nr].item` could
be abstracted as `object = object_array_pop()`.

Unfortunately, one of the conversions was horribly wrong. Between
grabbing the last object (i.e., peeking at it) and decreasing the object
count, the original code would sometimes return early. The updated code
on the other hand, will always pop the last element, then maybe do the
early return without doing anything with the object.

The end result is that merge commits where all the parents have still
not been exported will simply be dropped, meaning that they will be
completely missing from the exported data.

Re-add a commit when it is not yet time to handle it. An alternative
that was considered was to peek-then-pop. That carries some risk with it
since the peeking and poping need to act on the same object, in a
concerted fashion.

Add a test that would have caught this.

Reported-by: Isaac Chou 
Analyzed-by: Isaac Chou 
Helped-by: Johannes Schindelin 
Signed-off-by: Martin Ågren 
---
This v3 is similar in spirit to v1/v2, but with a reworked test and a
different fix approach, both based on Dscho's suggestions.

>> Between you cleaning up the test and providing a different
>> implementation, there's not much left for me to take credit for. Can I
>> forge your From: and Signed-off-by: on this?
>
> I disagree, all I did was to play a variation of your tune. You are the
> composer of this patch, you performed all the hard work (analysis,
> implementation & testing), and you deserve the credit.

Ok.

> It would please my ego a bit, of course, if you could add a "Helped-by:
> Dscho" line... ;-)

That's a given! Again, thanks for really helpful suggestions.

Martin

 t/t9350-fast-export.sh | 18 ++
 builtin/fast-export.c  |  5 -
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 866ddf6058..c699c88d00 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -540,4 +540,22 @@ test_expect_success 'when using -C, do not declare copy 
when source of copy is a
test_cmp expected actual
 '
 
+test_expect_success 'merge commit gets exported with --import-marks' '
+   test_create_repo merging &&
+   (
+   cd merging &&
+   test_commit initial &&
+   git checkout -b topic &&
+   test_commit on-topic &&
+   git checkout master &&
+   test_commit on-master &&
+   test_tick &&
+   git merge --no-ff -m Yeah topic &&
+
+   echo ":1 $(git rev-parse HEAD^^)" >marks &&
+   git fast-export --import-marks=marks master >out &&
+   grep Yeah out
+   )
+'
+
 test_done
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 27b2cc138e..7b8dfc5af1 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -651,8 +651,11 @@ static void handle_tail(struct object_array *commits, 
struct rev_info *revs,
struct commit *commit;
while (commits->nr) {
commit = (struct commit *)object_array_pop(commits);
-   if (has_unshown_parent(commit))
+   if (has_unshown_parent(commit)) {
+   /* Queue again, to be handled later */
+   add_object_array(>object, NULL, commits);
return;
+   }
handle_commit(commit, revs, paths_of_changed_objects);
}
 }
-- 
2.17.0



Re: Feature requst

2018-04-20 Thread Stefan Beller
On Fri, Apr 20, 2018 at 2:31 PM, Yuri Weinstein
 wrote:
> "git grep xxx" currently does not follow symlinks.
> Please consider adding this functionality

Is this related to
https://public-inbox.org/git/20180409090047.lfru2ul5fbnggfg7@bod/ ?


Feature requst

2018-04-20 Thread Yuri Weinstein
"git grep xxx" currently does not follow symlinks.
Please consider adding this functionality

Thx
YuriW


Re: [PATCH v2 3/4] sequencer: leave a tell-tale when a fixup/squash failed

2018-04-20 Thread Stefan Beller
>  static GIT_PATH_FUNC(rebase_path_amend, "rebase-merge/amend")
> +/*
> + * If there was a merge conflict in a fixup/squash series, we need to
> + * record the type so that a `git rebase --skip` can clean up the commit
> + * message as appropriate. This file will contain that type (`fixup` or
> + * `squash`), and not exist otherwise.
> + */

Thanks for the documentation here, is there some other high level doc that
describes all things to know about the internals of the rebase-merge dir
or is this the definitive guide?

> +static GIT_PATH_FUNC(rebase_path_amend_type, "rebase-merge/amend-type")
>  /*
>   * When we stop at a given patch via the "edit" command, this file contains
>   * the abbreviated commit name of the corresponding patch.
> @@ -2400,10 +2407,20 @@ static int error_with_patch(struct commit *commit,
>  static int error_failed_squash(struct commit *commit,
> struct replay_opts *opts, int subject_len, const char *subject)
>  {
> +   const char *amend_type = "squash";
> +
> +   if (file_exists(rebase_path_fixup_msg())) {
> +   unlink(rebase_path_fixup_msg());
> +   amend_type = "fixup";
> +   }
> +   if (write_message(amend_type, strlen(amend_type),
> +  rebase_path_amend_type(), 0))
> +   return error(_("could not write '%s'"),
> +rebase_path_amend_type());

Do we want to wait with unlinking rebase_path_fixup_msg()
until after we are sure there is no error returned?
I first thought so as to preserve the state as before, but
then it only signals the amend type. But we're downgrading the
amend type from "squash" to "fixup", which means that if
this error happens and the user just retries the git command
we'll end up with a "fixup", i.e. not opening their editor?


Re: [PATCH v2 2/4] rebase -i: Handle "combination of commits" with GETTEXT_POISON

2018-04-20 Thread Stefan Beller
On Fri, Apr 20, 2018 at 2:07 PM, Johannes Schindelin
 wrote:
> We previously relied on the localized versions of
>
> # This is a combination of  commits
>
> (which we write into the commit messages during fixup/squash chains)
> to contain  as ASCII.
>
> Thisis not true in general, and certainly not in GETTEXT_POISON, as

This is

Apart from this typo, this patch looks good.


[PATCH v2 4/4] rebase --skip: clean up commit message after a failed fixup/squash

2018-04-20 Thread Johannes Schindelin
During a series of fixup/squash commands, the interactive rebase builds
up a commit message with comments. This will be presented to the user in
the editor if at least one of those commands was a `squash`.

However, if the last of these fixup/squash commands fails with merge
conflicts, and if the user then decides to skip it (or resolve it to a
clean worktree and then continue the rebase), the current code fails to
clean up the commit message.

This commit fixes that behavior.

The diff is best viewed with --color-moved.

Signed-off-by: Johannes Schindelin 
---
 sequencer.c| 36 
 t/t3418-rebase-continue.sh |  2 +-
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index a6a4efeaae2..881503a6463 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2789,17 +2789,12 @@ static int continue_single_pick(void)
 
 static int commit_staged_changes(struct replay_opts *opts)
 {
-   unsigned int flags = ALLOW_EMPTY | EDIT_MSG;
+   unsigned int flags = ALLOW_EMPTY | EDIT_MSG, is_fixup = 0, is_clean;
 
if (has_unstaged_changes(1))
return error(_("cannot rebase: You have unstaged changes."));
-   if (!has_uncommitted_changes(0)) {
-   const char *cherry_pick_head = git_path_cherry_pick_head();
 
-   if (file_exists(cherry_pick_head) && unlink(cherry_pick_head))
-   return error(_("could not remove CHERRY_PICK_HEAD"));
-   return 0;
-   }
+   is_clean = !has_uncommitted_changes(0);
 
if (file_exists(rebase_path_amend())) {
struct strbuf rev = STRBUF_INIT;
@@ -2812,16 +2807,41 @@ static int commit_staged_changes(struct replay_opts 
*opts)
if (get_oid_hex(rev.buf, _amend))
return error(_("invalid contents: '%s'"),
rebase_path_amend());
-   if (oidcmp(, _amend))
+   if (!is_clean && oidcmp(, _amend))
return error(_("\nYou have uncommitted changes in your "
   "working tree. Please, commit them\n"
   "first and then run 'git rebase "
   "--continue' again."));
+   if (is_clean && !oidcmp(, _amend)) {
+   strbuf_reset();
+   /*
+* Clean tree, but we may need to finalize a
+* fixup/squash chain. A failed fixup/squash leaves the
+* file amend-type in rebase-merge/; It is okay if that
+* file is missing, in which case there is no such
+* chain to finalize.
+*/
+   read_oneliner(, rebase_path_amend_type(), 0);
+   if (!strcmp("squash", rev.buf))
+   is_fixup = TODO_SQUASH;
+   else if (!strcmp("fixup", rev.buf)) {
+   is_fixup = TODO_FIXUP;
+   flags = (flags & ~EDIT_MSG) | CLEANUP_MSG;
+   }
+   }
 
strbuf_release();
flags |= AMEND_MSG;
}
 
+   if (is_clean && !is_fixup) {
+   const char *cherry_pick_head = git_path_cherry_pick_head();
+
+   if (file_exists(cherry_pick_head) && unlink(cherry_pick_head))
+   return error(_("could not remove CHERRY_PICK_HEAD"));
+   return 0;
+   }
+
if (run_git_commit(rebase_path_message(), opts, flags))
return error(_("could not commit staged changes."));
unlink(rebase_path_amend());
diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
index 6ddf952b7b9..693f92409ec 100755
--- a/t/t3418-rebase-continue.sh
+++ b/t/t3418-rebase-continue.sh
@@ -88,7 +88,7 @@ test_expect_success 'rebase passes merge strategy options 
correctly' '
git rebase --continue
 '
 
-test_expect_failure '--skip after failed fixup cleans commit message' '
+test_expect_success '--skip after failed fixup cleans commit message' '
test_when_finished "test_might_fail git rebase --abort" &&
git checkout -b with-conflicting-fixup &&
test_commit wants-fixup &&
-- 
2.17.0.windows.1.15.gaa56ade3205


[PATCH v2 3/4] sequencer: leave a tell-tale when a fixup/squash failed

2018-04-20 Thread Johannes Schindelin
In the upcoming patch to clean up fixup/squash commit messages even when
skipping a final fixup/squash that failed with merge conflicts, we will
need to have some indicator what happened.

As we need to remove the message-fixup and message-squash files upon
failure, we cannot use those. So let's just write an explicit amend-type
file, containing either `fixup` or `squash`. The absence of that file
indicates that we were not in the middle of a fixup or squash when merge
conflicts were happening.

Signed-off-by: Johannes Schindelin 
---
 sequencer.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/sequencer.c b/sequencer.c
index dc482e76a28..a6a4efeaae2 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -106,6 +106,13 @@ static GIT_PATH_FUNC(rebase_path_author_script, 
"rebase-merge/author-script")
  * command is processed, this file is deleted.
  */
 static GIT_PATH_FUNC(rebase_path_amend, "rebase-merge/amend")
+/*
+ * If there was a merge conflict in a fixup/squash series, we need to
+ * record the type so that a `git rebase --skip` can clean up the commit
+ * message as appropriate. This file will contain that type (`fixup` or
+ * `squash`), and not exist otherwise.
+ */
+static GIT_PATH_FUNC(rebase_path_amend_type, "rebase-merge/amend-type")
 /*
  * When we stop at a given patch via the "edit" command, this file contains
  * the abbreviated commit name of the corresponding patch.
@@ -2400,10 +2407,20 @@ static int error_with_patch(struct commit *commit,
 static int error_failed_squash(struct commit *commit,
struct replay_opts *opts, int subject_len, const char *subject)
 {
+   const char *amend_type = "squash";
+
+   if (file_exists(rebase_path_fixup_msg())) {
+   unlink(rebase_path_fixup_msg());
+   amend_type = "fixup";
+   }
+   if (write_message(amend_type, strlen(amend_type),
+  rebase_path_amend_type(), 0))
+   return error(_("could not write '%s'"),
+rebase_path_amend_type());
+
if (rename(rebase_path_squash_msg(), rebase_path_message()))
return error(_("could not rename '%s' to '%s'"),
rebase_path_squash_msg(), rebase_path_message());
-   unlink(rebase_path_fixup_msg());
unlink(git_path_merge_msg());
if (copy_file(git_path_merge_msg(), rebase_path_message(), 0666))
return error(_("could not copy '%s' to '%s'"),
@@ -2580,6 +2597,7 @@ static int pick_commits(struct todo_list *todo_list, 
struct replay_opts *opts)
unlink(rebase_path_author_script());
unlink(rebase_path_stopped_sha());
unlink(rebase_path_amend());
+   unlink(rebase_path_amend_type());
delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
}
if (item->command <= TODO_SQUASH) {
@@ -2807,6 +2825,7 @@ static int commit_staged_changes(struct replay_opts *opts)
if (run_git_commit(rebase_path_message(), opts, flags))
return error(_("could not commit staged changes."));
unlink(rebase_path_amend());
+   unlink(rebase_path_amend_type());
return 0;
 }
 
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH v2 1/4] rebase -i: demonstrate bugs with fixup!/squash! commit messages

2018-04-20 Thread Johannes Schindelin
When multiple fixup/squash commands are processed and the last one
causes merge conflicts and is skipped, we leave the "This is a
combination of ..." comments in the commit message.

Noticed by Eric Sunshine.

This regression test also demonstrates that we rely on the localized
version of

# This is a combination of  commits

to contain the  in ASCII, which breaks under GETTEXT_POISON.

Signed-off-by: Johannes Schindelin 
---
 t/t3418-rebase-continue.sh | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
index 9214d0bb511..6ddf952b7b9 100755
--- a/t/t3418-rebase-continue.sh
+++ b/t/t3418-rebase-continue.sh
@@ -88,6 +88,28 @@ test_expect_success 'rebase passes merge strategy options 
correctly' '
git rebase --continue
 '
 
+test_expect_failure '--skip after failed fixup cleans commit message' '
+   test_when_finished "test_might_fail git rebase --abort" &&
+   git checkout -b with-conflicting-fixup &&
+   test_commit wants-fixup &&
+   test_commit "fixup! wants-fixup" wants-fixup.t 1 wants-fixup-1 &&
+   test_commit "fixup! wants-fixup" wants-fixup.t 2 wants-fixup-2 &&
+   test_commit "fixup! wants-fixup" wants-fixup.t 3 wants-fixup-3 &&
+   test_must_fail env FAKE_LINES="1 fixup 2 fixup 4" \
+   git rebase -i HEAD~4 &&
+
+   : now there is a conflict, and comments in the commit message &&
+   git show HEAD >out &&
+   test_i18ngrep "This is a combination of" out &&
+
+   : skip and continue &&
+   git rebase --skip &&
+
+   : now the comments in the commit message should have been cleaned up &&
+   git show HEAD >out &&
+   test_i18ngrep ! "This is a combination of" out
+'
+
 test_expect_success 'setup rerere database' '
rm -fr .git/rebase-* &&
git reset --hard commit-new-file-F3-on-topic-branch &&
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH v2 2/4] rebase -i: Handle "combination of commits" with GETTEXT_POISON

2018-04-20 Thread Johannes Schindelin
We previously relied on the localized versions of

# This is a combination of  commits

(which we write into the commit messages during fixup/squash chains)
to contain  as ASCII.

Thisis not true in general, and certainly not in GETTEXT_POISON, as
demonstrated by the regression test we just introduced in t3418.

Signed-off-by: Johannes Schindelin 
---
 sequencer.c | 36 ++--
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 667f35ebdff..dc482e76a28 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1343,19 +1343,18 @@ static int update_squash_messages(enum todo_command 
command,
eol = strchrnul(buf.buf, '\n');
if (buf.buf[0] != comment_line_char ||
(p += strcspn(p, "0123456789\n")) == eol)
-   return error(_("unexpected 1st line of squash message:"
-  "\n\n\t%.*s"),
-(int)(eol - buf.buf), buf.buf);
-   count = strtol(p, NULL, 10);
-
-   if (count < 1)
-   return error(_("invalid 1st line of squash message:\n"
-  "\n\t%.*s"),
-(int)(eol - buf.buf), buf.buf);
+   count = -1;
+   else
+   count = strtol(p, NULL, 10);
 
strbuf_addf(, "%c ", comment_line_char);
-   strbuf_addf(,
-   _("This is a combination of %d commits."), ++count);
+   if (count < 1)
+   strbuf_addf(, _("This is a combination of "
+  "several commits."));
+   else
+   strbuf_addf(,
+   _("This is a combination of %d commits."),
+   ++count);
strbuf_splice(, 0, eol - buf.buf, header.buf, header.len);
strbuf_release();
} else {
@@ -1398,13 +1397,22 @@ static int update_squash_messages(enum todo_command 
command,
if (command == TODO_SQUASH) {
unlink(rebase_path_fixup_msg());
strbuf_addf(, "\n%c ", comment_line_char);
-   strbuf_addf(, _("This is the commit message #%d:"), count);
+   if (count < 2)
+   strbuf_addf(, _("This is the next commit "
+   "message:"));
+   else
+   strbuf_addf(, _("This is the commit message #%d:"),
+   count);
strbuf_addstr(, "\n\n");
strbuf_addstr(, body);
} else if (command == TODO_FIXUP) {
strbuf_addf(, "\n%c ", comment_line_char);
-   strbuf_addf(, _("The commit message #%d will be skipped:"),
-   count);
+   if (count < 2)
+   strbuf_addf(, _("The next commit message will be "
+   "skipped:"));
+   else
+   strbuf_addf(, _("The commit message #%d will be "
+   "skipped:"), count);
strbuf_addstr(, "\n\n");
strbuf_add_commented_lines(, body, strlen(body));
} else
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH v2 0/4] rebase -i: avoid stale "# This is a combination of" in commit messages

2018-04-20 Thread Johannes Schindelin
Eric Sunshine pointed out that I had such a commit message in
https://public-inbox.org/git/CAPig+cRrS0_nYJJY=o6cbov630snqhpv5qgrqdd8mw-syzn...@mail.gmail.com/
and I went on a hunt to figure out how the heck this happened.

Turns out that if there is a fixup/squash chain where the *last* command
fails with merge conflicts, and we either --skip ahead or resolve the
conflict to a clean tree and then --continue, our code does not do a
final cleanup.

Contrary to my initial gut feeling, this bug was not introduced by my
rewrite in C of the core parts of rebase -i, but it looks to me as if
that bug was with us for a very long time (at least the --skip part).

The developer (read: user of rebase -i) in me says that we would want to
fast-track this, but the author of rebase -i in me says that we should
be cautious and cook this in `next` for a while.

Fixes since v1:

- Using test_i18ngrep instead of grep, because "This is a combination of
   commits" is marked for translation.

- Added a patch to actually fix `rebase -i` when building with
  GETTEXT_POISON, because we used to assume that numbers are encoded as
  ASCII so that we can increment it when writing the next commit message
  in the fixup/squash chain. This also seems to be a long-standing bug
  that has been with us since the the beginning of the localization of
  rebase -i's commit messages.

- The test case now starts with test_when_finished "test_might_fail git rebase
  --abort" to be allow for failing more gently.

- Fixed grammar of 2/3 (now 3/4): thanks, Eric!

- Fixed the description of the new test case (it purported to test --continue,
  but it really tests --skip).


Johannes Schindelin (4):
  rebase -i: demonstrate bugs with fixup!/squash! commit messages
  rebase -i: Handle "combination of  commits" with GETTEXT_POISON
  sequencer: leave a tell-tale when a fixup/squash failed
  rebase --skip: clean up commit message after a failed fixup/squash

 sequencer.c| 93 --
 t/t3418-rebase-continue.sh | 22 +
 2 files changed, 92 insertions(+), 23 deletions(-)


base-commit: fe0a9eaf31dd0c349ae4308498c33a5c3794b293
Published-As: 
https://github.com/dscho/git/releases/tag/clean-msg-after-fixup-continue-v2
Fetch-It-Via: git fetch https://github.com/dscho/git 
clean-msg-after-fixup-continue-v2

Interdiff vs v1:
 diff --git a/sequencer.c b/sequencer.c
 index f067b7b24c5..881503a6463 100644
 --- a/sequencer.c
 +++ b/sequencer.c
 @@ -1350,19 +1350,18 @@ static int update_squash_messages(enum todo_command 
command,
eol = strchrnul(buf.buf, '\n');
if (buf.buf[0] != comment_line_char ||
(p += strcspn(p, "0123456789\n")) == eol)
 -  return error(_("unexpected 1st line of squash message:"
 - "\n\n\t%.*s"),
 -   (int)(eol - buf.buf), buf.buf);
 -  count = strtol(p, NULL, 10);
 -
 -  if (count < 1)
 -  return error(_("invalid 1st line of squash message:\n"
 - "\n\t%.*s"),
 -   (int)(eol - buf.buf), buf.buf);
 +  count = -1;
 +  else
 +  count = strtol(p, NULL, 10);
  
strbuf_addf(, "%c ", comment_line_char);
 -  strbuf_addf(,
 -  _("This is a combination of %d commits."), ++count);
 +  if (count < 1)
 +  strbuf_addf(, _("This is a combination of "
 + "several commits."));
 +  else
 +  strbuf_addf(,
 +  _("This is a combination of %d commits."),
 +  ++count);
strbuf_splice(, 0, eol - buf.buf, header.buf, header.len);
strbuf_release();
} else {
 @@ -1405,13 +1404,22 @@ static int update_squash_messages(enum todo_command 
command,
if (command == TODO_SQUASH) {
unlink(rebase_path_fixup_msg());
strbuf_addf(, "\n%c ", comment_line_char);
 -  strbuf_addf(, _("This is the commit message #%d:"), count);
 +  if (count < 2)
 +  strbuf_addf(, _("This is the next commit "
 +  "message:"));
 +  else
 +  strbuf_addf(, _("This is the commit message #%d:"),
 +  count);
strbuf_addstr(, "\n\n");
strbuf_addstr(, body);
} else if (command == TODO_FIXUP) {
strbuf_addf(, "\n%c ", comment_line_char);
 -  strbuf_addf(, _("The commit message #%d will be skipped:"),
 -  count);
 +  if (count < 2)
 +  strbuf_addf(, _("The next commit message will be "
 +  

Re: [PATCH 1/3] rebase -i: demonstrate bug with fixup!/squash! commit messages

2018-04-20 Thread Johannes Schindelin
Hi Stefan,

On Fri, 20 Apr 2018, Stefan Beller wrote:

> > Funny thing is: I tested this with GETTEXT_POISON=1, and it succeeded.
> 
> I actually wanted to review the code leading to this commit, and to find
> where to start reviewing I had 'git grep "This is a combination of"' which
> lead me to the translation files.

Heh...

> s/grep/test_i18ngrep/ doesn't work as the syntax is slightly off,
> s/ ! grep/ test_i18n_grep !/ would work, just pointing out the obvious.

Yes, I actually knew that, but my usage of a s/// as a shorthand for the
idea to replace `grep` with `test_i18ngrep` was indeed misleading. Sorry
about that, and thank you for helping me getting this right.

Ciao,
Dscho


Re: [PATCH] fast-export: fix regression skipping some merge-commits

2018-04-20 Thread Johannes Schindelin
Hi Martin,

On Fri, 20 Apr 2018, Martin Ågren wrote:

> Thanks a lot for your comments. I will give this some testing, check
> that your proposed test fails and succeeds as it should, and so on, then
> try to wrap this up.

Thank you so much!

> Between you cleaning up the test and providing a different
> implementation, there's not much left for me to take credit for. Can I
> forge your From: and Signed-off-by: on this?

I disagree, all I did was to play a variation of your tune. You are the
composer of this patch, you performed all the hard work (analysis,
implementation & testing), and you deserve the credit.

It would please my ego a bit, of course, if you could add a "Helped-by:
Dscho" line... ;-)

Ciao,
Dscho

Re: [PATCH 1/3] rebase -i: demonstrate bug with fixup!/squash! commit messages

2018-04-20 Thread Johannes Schindelin
Hi Eric,

On Fri, 20 Apr 2018, Eric Sunshine wrote:

> On Fri, Apr 20, 2018 at 3:29 PM, Eric Sunshine  
> wrote:
> > On Fri, Apr 20, 2018 at 8:17 AM, Johannes Schindelin
> >  wrote:
> >> +   test_must_fail env FAKE_LINES="1 fixup 2 fixup 4" \
> >> +   git rebase -i HEAD~4 &&
> >> +
> >> +   : now there is a conflict, and comments in the commit message &&
> >> +   git show HEAD >out &&
> >> +   grep "This is a combination of" out &&
> >> +
> >> +   : skip and continue &&
> >> +   git rebase --skip &&
> >
> > I see that this test script doesn't utilize it, but do you want a
> >
> > test_when_finished "reset_rebase" &&
> >
> > before starting the rebase to clean up in case something before "git
> > rebase --skip" fails?

No, because then one of the next test cases fails:

not ok 15 - rebase -i --continue remembers --rerere-autoupdate

;-)

I'll use

test_when_finished "test_might_fail git rebase --abort"

instead, okay? ;-)

Ciao,
Dscho


Re: [PATCH 1/3] rebase -i: demonstrate bug with fixup!/squash! commit messages

2018-04-20 Thread Stefan Beller
Hi Johannes,

> Funny thing is: I tested this with GETTEXT_POISON=1, and it succeeded.

I actually wanted to review the code leading to this commit, and to find
where to start reviewing I had 'git grep "This is a combination of"' which
lead me to the translation files.

s/grep/test_i18ngrep/ doesn't work as the syntax is slightly off,
s/ ! grep/ test_i18n_grep !/ would work, just pointing out the obvious.


Re: [PATCH v6 00/15] rebase -i: offer to recreate commit topology

2018-04-20 Thread Jacob Keller
On Fri, Apr 20, 2018 at 1:26 AM, Johannes Schindelin
 wrote:
> Hi Jake,
>
> On Thu, 19 Apr 2018, Jacob Keller wrote:
>
>> On Wed, Apr 18, 2018 at 9:24 PM, Sergey Organov  wrote:
>> > Johannes Schindelin  writes:
>> >
>> >> On Fri, 13 Apr 2018, Phillip Wood wrote:
>> >>
>> >>> On 12/04/18 23:02, Johannes Schindelin wrote:
>> >>> >
>> >>> > [...]
>> >>> >
>> >>> > So: the order of the 3-way merges does matter.
>> >>> >
>> >>> > [...]
>> >>>
>> >>> Those conflicts certainly look intimidating (and the ones in your later
>> >>> reply with the N way merge example still look quite complicated). One
>> >>> option would be just to stop and have the user resolve the conflicts
>> >>> after each conflicting 3-way merge rather than at the end of all the
>> >>> merges. There are some downsides: there would need to be a way to
>> >>> explain to the user that this is an intermediate step (and what that
>> >>> step was); the code would have to do some book keeping to know where it
>> >>> had got to; and it would stop and prompt the user to resolve conflicts
>> >>> more often which could be annoying but hopefully they'd be clearer to
>> >>> resolve because they weren't nested.
>> >>
>> >> I thought about that. But as I pointed out: the order of the merges *does*
>> >> matter. Otherwise we force the user to resolve conflicts that they
>> >> *already* resolved during this rebase...
>> >
>> > How it's relevant to what Phillip suggested? How the order of taking 2
>> > steps, A and B, affects an ability to stop after the first step? It's
>> > still either "A,stop,B" or "B,stop,A", depending on the chosen order.
>> >
>> > What's the _actual_ problem here, if any?
>> >
>> > -- Sergey
>>
>> I believe the order of the merges changes which ones cause conflicts,
>
> That is a correct interpretation of the example I showed.
>
>> but it's possible to generate pre-images (i.e. a set of parents to
>> merge) which cause conflicts regardless of which ordering we pick, so
>> I'm not sure there is a "best ordering".
>
> In general, there is no best ordering, you are right. There is no silver
> bullet.
>
> I am not satisfied with stating that and then leaving it at that.
>
> In the example I presented, you can see that there are common cases where
> there *is* a best ordering. In the wrong order, even if you would force
> the user to resolve the merge conflict in an intermediate merge (which
> would introduce a nightmare for the user interface, I am sure you see
> that), then the next merge would *again* show merge conflicts.
>
> And I, for one, am *really* certain what my decision would be when offered
> the two options 1) force the user to resolve merge conflicts *twice*, or
> 2) reorder the intermediate merges and present the user with exactly one
> set of merge conflicts.
>
> So it is irrelevant that there might not be a "best order" in the general
> case, when in the common cases quite frequently there is.
>
> It is just another example where theory disagrees with practice. Don't get
> me wrong: it is good to start with theory. And likewise it is simply
> necessary to continue from there, and put your theory to the test. And
> then you need to turn this into something practical.
>
> Ciao,
> Dscho

I recall you suggested an approach of "try one way, if there are
conflicts, check the other way and see if it had conflicts".

And I also agree that forcing the user to resolve conflicts in the
middle of the operation is a huge nightmare of a user interface,
probably worse than the issues with nested merge conflicts.

Thanks,
Jake


Re: [PATCH 1/3] rebase -i: demonstrate bug with fixup!/squash! commit messages

2018-04-20 Thread Johannes Schindelin
Hi Stefan,

On Fri, 20 Apr 2018, Johannes Schindelin wrote:

> A brief test shows, however, that it is not quite as easy as
> s/grep/test_i18ngrep/, something more seems to be broken.

It seems that this week is my Rabbit Hole Week.

Turns out that we have a really, really long-standing bug in our rebase -i
where we construct the commit messages for fixup/squash chains.

Background: when having multiple fixup!/squash! commits for the same
original commit, the intermediate commits have messages starting with the
message

# This is a combination of  commits.

and then every fixup/squash command increments that  and adds a header

# This is the commit message #:

before writing the respective commit message.

The problem arises from the fact that we deduce  from parsing the first
number in ASCII encoding on the first line.

That breaks e.g. when compiling with GETTEXT_POISON, and it is probably
not true in general, either.

So I introduced a patch that handles the absence of an ASCII-encoded
number gracefully, and now the test passes with and without
GETTEXT_POISON.

Thanks for the review that let me find and fix this bug!
Dscho


Re: [PATCH 1/3] rebase -i: demonstrate bug with fixup!/squash! commit messages

2018-04-20 Thread Johannes Schindelin
Hi Stefan,

On Fri, 20 Apr 2018, Stefan Beller wrote:

> On Fri, Apr 20, 2018 at 5:17 AM, Johannes Schindelin
>  wrote:
> > When multiple fixup/squash commands are processed and the last one
> > causes merge conflicts and is skipped, we leave the "This is a
> > combination of ..." comments in the commit message.
> >
> > Noticed by Eric Sunshine.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  t/t3418-rebase-continue.sh | 21 +
> >  1 file changed, 21 insertions(+)
> >
> > diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
> > index 9214d0bb511..b177baee322 100755
> > --- a/t/t3418-rebase-continue.sh
> > +++ b/t/t3418-rebase-continue.sh
> > @@ -88,6 +88,27 @@ test_expect_success 'rebase passes merge strategy 
> > options correctly' '
> > git rebase --continue
> >  '
> >
> > +test_expect_failure '--continue after failed fixup cleans commit message' '
> > +   git checkout -b with-conflicting-fixup &&
> > +   test_commit wants-fixup &&
> > +   test_commit "fixup! wants-fixup" wants-fixup.t 1 wants-fixup-1 &&
> > +   test_commit "fixup! wants-fixup" wants-fixup.t 2 wants-fixup-2 &&
> > +   test_commit "fixup! wants-fixup" wants-fixup.t 3 wants-fixup-3 &&
> > +   test_must_fail env FAKE_LINES="1 fixup 2 fixup 4" \
> > +   git rebase -i HEAD~4 &&
> > +
> > +   : now there is a conflict, and comments in the commit message &&
> > +   git show HEAD >out &&
> > +   grep "This is a combination of" out &&
> 
> test_i18n_grep ?

Funny thing is: I tested this with GETTEXT_POISON=1, and it succeeded. So
I dug deeper why: I never understood that this is a *build* option. I
always thought that would be a test-time option... Oh well ;-)

> > +
> > +   : skip and continue &&
> > +   git rebase --skip &&
> > +
> > +   : now the comments in the commit message should have been cleaned 
> > up &&
> > +   git show HEAD >out &&
> > +   ! grep "This is a combination of" out
> 
> same

Will work on a fix. A brief test shows, however, that it is not quite as
easy as s/grep/test_i18ngrep/, something more seems to be broken.

Will keep you posted,
Dscho


(nessun oggetto)

2018-04-20 Thread Sheng Li Hung



--
I am Mr.Sheng Li Hung, from china I got your information while search 
for
a reliable person, I have a very profitable business proposition for 
you

and i can assure you that you will not regret been part of this mutual
beneficial transaction after completion. Kindly get back to me for more
details on this email id: shengl...@outlook.com

Thanks
Mr Sheng Li Hung




Re: commit -> public-inbox link helper

2018-04-20 Thread Johannes Schindelin
Hi Eric,

On Fri, 20 Apr 2018, Eric Wong wrote:

> Johannes Schindelin  wrote:
> > 
> > I found myself in dear need to quickly look up mails in the
> > public-inbox mail archive corresponding to any given commit in
> > git.git. Some time ago, I wrote a shell script to help me with that,
> > and I found myself using it a couple of times, so I think it might be
> > useful for others, too.
> 
> Hello, I think you can dump all the info you need more quickly
> without cloning 1G of data by dumping NNTP OVER(view)
> information instead.

That might be true for the current state of affairs.

However, there *are* cases (I think I even linked to my original mail with
my post-GitMerge 2017 analysis) where the triplet Date/Author/Email is not
enough, where even some patch series have the identical triplet for every
single patch.

Even if I did not hit those cases yet, and therefore did not implement
that part, I needed to keep the door open for that. So I need a clone.

Ciao,
Dscho


Re: [PATCH 1/3] gettext: avoid initialization if the locale dir is not present

2018-04-20 Thread Johannes Schindelin
Hi Ævar,

On Fri, 20 Apr 2018, Ævar Arnfjörð Bjarmason wrote:

> On Fri, Apr 20 2018, Martin Ågren wrote:
> 
> > On 20 April 2018 at 11:59, Ævar Arnfjörð Bjarmason  wrote:
> >>
> >> On Fri, Apr 20 2018, Johannes Schindelin wrote:
> >>
> >>> The runtime of a simple `git.exe version` call on Windows is currently
> >>> dominated by the gettext setup, adding a whopping ~150ms to the ~210ms
> >>> total.
> >>>
> >>> Given that this cost is added to each and every git.exe invocation goes
> >>> through common-main's invocation of git_setup_gettext(), and given that
> >>> scripts have to call git.exe dozens, if not hundreds, of times, this is
> >>> a substantial performance penalty.
> >>>
> >>> This is particularly pointless when considering that Git for Windows
> >>> ships without localization (to keep the installer's size to a bearable
> >>> ~34MB): all that time setting up gettext is for naught.
> >>>
> >>> So let's be smart about it and skip setting up gettext if the locale
> >>> directory is not even present.
> >
> >> If you don't ship git for windows with gettext or a podir, then compile
> >> with NO_GETTEXT, then we won't even compile this function you're
> >> patching here. See line 30 and beyond of gettext.h.
> >>
> >> Are you instead compiling git for windows with gettext support with an
> >> optional add-on package for i18n, so then this podir conditional would
> >> pass?
> >>
> >> In any case, if this is actually the desired behavior I think it's worth
> >> clearly explaining the interaction with NO_GETTEXT in the commit
> >> message, and if you *actually* don't want NO_GETTEXT I think it's useful
> >> to guard this with something like MAYBE_GETTEXT on top, so this code
> >> doesn't unintentionally hide installation errors on other
> >> platforms. I.e. something like:
> >>
> >> int have_podir = is_directory(podir);
> >> if (!have_podir)
> >> #ifdef MAYBE_GETTEXT
> >> return;
> >> #else
> >> BUG("Broken installation, can't find '%s'!", podir);
> >> #endif
> >
> > Is it fair to say that such a broken installation would function more or
> > less well before and after Dscho's patch, and that your approach would
> > render such an installation quite useless?
> 
> Yes my thown out there for the purposes of discussion suggestion may
> break things for Dscho, or not. I'm just saying that we should document
> *what* the use-case is.

I think you underestimate the scope of your willful breakage. It is not
just "break things for Dscho". It is breaking things for every single last
user of Git for Windows. If we ever do that, I will make sure to announce
that together with your name and your postal address on it.

> Because it's not just important to massage the code so it works now, but
> to document what the use-case is, so someone down the line who things
> "oh why are we doing that" has a clear answer.

Let's face it: if gettext would ever fail to work in case of a missing
podir, then it would fail for every installation using a locale for which
we just happen not to have any translation.

So we know that your patch would not only break things, but break things
totally without reason!

> > Do we have some other similar
> > cases where we go BUG("I could not find a resource. I could manage
> > fairly well without it, but you seemed to want it when you installed
> > me.")? I wonder what other well-respected software do if they can't find
> > their translations.
> 
> E.g. my recent 1aca69c019 ("perl Git::LoadCPAN: emit better errors under
> NO_PERL_CPAN_FALLBACKS", 2018-03-03) does similar things, we *could*
> carry on in certain cases instead of dying there (or not, depending on
> the codepath).
> 
> But in general, I don't think there's any reasonable use-cases for git
> needing to repair from a broken or semi-broken installation that aren't
> so specific that they can be explicitly declared, e.g. in this
> hypothetical MAYBE_GETTEXT case.

Ævar, you need to understand one thing, and you need to understand it
right now: the vast majority of Git users will not compile their Git. Not.
Ever. Really, I am not kidding. They use whatever built version they can
get most conveniently.

It is even more true on Windows, where it may be easier to build Git for
Windows now (what with all the work I put into the Git for Windows SDK),
but it is still an expensive endeavor.

And that is even assuming that every Git user is at liberty to build their
own software, which is completely untrue in a large chunk of our business.

So in order to give users who want localization what they want, without
burdening users who do not want localization, we use the exact same build
of Git for Windows for both, and the entire difference is that one comes
with the podir, and the other without.

Okay, I am lying, at the moment we do not even have anything end-user
facing that ships with the podir. But I distinctly want to leave the door
open for that.

And if you think that this 

Re: [PATCH 2/3] sequencer: leave a tell-tale when a fixup/squash failed

2018-04-20 Thread Eric Sunshine
On Fri, Apr 20, 2018 at 8:18 AM, Johannes Schindelin
 wrote:
> In the upcoming patch to clean up fixup/squash commit messages even when
> skipping a final fixup/squash that failed with merge conflicts, we will
> need to have some indicator what happened.
>
> As we need to remove the message-fixup and message-squash files upon
> failure, we cannot use those. So let's just write an explicit amend-type
> file, containing either `fixup` or `squash`. The absence of that file
> indicates that the we were not in the middle of a fixup or squash when

s/the we/we/

> merge conflicts were happening.
>
> Signed-off-by: Johannes Schindelin 


Re: [PATCH] fast-export: fix regression skipping some merge-commits

2018-04-20 Thread Martin Ågren
Hi Johannes,

On 20 April 2018 at 21:07, Johannes Schindelin
 wrote:
> On Fri, 20 Apr 2018, Martin Ågren wrote:
>
>> Reintroduce the pattern of first grabbing the last object (using a new
>> function `object_array_peek()`), then later poping it. Using
>> `..._peek()` and `..._pop()` makes it clear that we are referring to the
>> same item, i.e., we do not grab one element, then remove another one.
>
> Instead of using _peek() and _pop() and having to reason about the
> correctness, maybe we should simply re-push? See below for my suggested
> alternative.

>> diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
>> index 866ddf6058..2b46a83a49 100755
>> --- a/t/t9350-fast-export.sh
>> +++ b/t/t9350-fast-export.sh
>> @@ -540,4 +540,26 @@ test_expect_success 'when using -C, do not declare copy 
>> when source of copy is a
>>   test_cmp expected actual
>>  '
>>
>> +test_expect_success 'todo' '
>> + test_create_repo merging &&
>> + git -C merging commit --allow-empty -m initial &&
>
> I see that you copied the style of the latest test case, but I have to
> admit that I would find it much easier to read if it said:
>
> (
> cd merging &&
> test_commit initial &&
> git checkout -b topic &&
> test_commit on-branch &&
> git checkout master &&
> test_commit on-master &&
> test_tick &&
> git merge --no-ff -m Yeah topic &&
>
> echo ":1 $(git rev-parse HEAD^^)" >marks &&
> git fast-export --import-marks=marks master >out &&
> grep Yeah out
> )

This looks much more succinct and much less noisy. I was perhaps too
focused on "subshells are bad" and less on "let's make a decent
trade-off here".

>> +/*
>> + * Returns NULL if the array is empty. Otherwise, returns the last object.
>> + * That is, the returned value is what `object_array_pop()` would have 
>> returned.
>> + */
>> +inline struct object *object_array_peek(const struct object_array *array)
>
> I looked, and this would be the first use of `inline` without `static`...

Hm.

>> +{
>> + return array->nr ? array->objects[array->nr - 1].item : NULL;
>> +}
>> +
>>  typedef int (*object_array_each_func_t)(struct object_array_entry *, void 
>> *);
>>
>>  /*
>> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
>> index 27b2cc138e..8377d27b46 100644
>> --- a/builtin/fast-export.c
>> +++ b/builtin/fast-export.c
>> @@ -650,9 +650,10 @@ static void handle_tail(struct object_array *commits, 
>> struct rev_info *revs,
>>  {
>>   struct commit *commit;
>>   while (commits->nr) {
>> - commit = (struct commit *)object_array_pop(commits);
>> + commit = (struct commit *)object_array_peek(commits);
>>   if (has_unshown_parent(commit))
>>   return;
>> + (void)object_array_pop(commits);
>>   handle_commit(commit, revs, paths_of_changed_objects);
>>   }
>>  }
>
> As I stated above, I think we can make this a bit easier to reason about
> (and less easy to break by future additions) if we avoided the _peek()
> function altogether, like this:
>
>  {
> struct commit *commit;
> while (commits->nr) {
> commit = (struct commit *)object_array_pop(commits);
> -   if (has_unshown_parent(commit))
> +   if (has_unshown_parent(commit)) {
> +   /* Queue again, to be handled later */
> +   add_object_array(commits, NULL, commit);
> return;
> +   }
> handle_commit(commit, revs, paths_of_changed_objects);
> }
>  }
>
> (I did not test this, and I was honestly surprised that there is no
> object_array_push() counterpart to _pop() ;-) So this might be all wrong.)

I was initially torn 50-50 between these two approaches, but now that I
see it, sure it's more verbose and a bit more "back and forth", but most
likely it's less error-prone going forwards.

Thanks a lot for your comments. I will give this some testing, check
that your proposed test fails and succeeds as it should, and so on, then
try to wrap this up. Between you cleaning up the test and providing a
different implementation, there's not much left for me to take credit
for. Can I forge your From: and Signed-off-by: on this?

Thanks a lot for the review.

Martin


Re: [PATCH 1/3] rebase -i: demonstrate bug with fixup!/squash! commit messages

2018-04-20 Thread Eric Sunshine
On Fri, Apr 20, 2018 at 3:29 PM, Eric Sunshine  wrote:
> On Fri, Apr 20, 2018 at 8:17 AM, Johannes Schindelin
>  wrote:
>> +   test_must_fail env FAKE_LINES="1 fixup 2 fixup 4" \
>> +   git rebase -i HEAD~4 &&
>> +
>> +   : now there is a conflict, and comments in the commit message &&
>> +   git show HEAD >out &&
>> +   grep "This is a combination of" out &&
>> +
>> +   : skip and continue &&
>> +   git rebase --skip &&
>
> I see that this test script doesn't utilize it, but do you want a
>
> test_when_finished "reset_rebase" &&
>
> before starting the rebase to clean up in case something before "git
> rebase --skip" fails?

Stated less ambiguously:

... in case something fails between "git rebase -i ..."
and "git rebase --skip"?


Google Gewinner Promotion

2018-04-20 Thread Google Gewinner


Lieber Gewinner,

Wir freuen uns Ihnen mitteilen zu können, dass Ihre E-Mail-Adresse einen Preis 
gewonnen hat.

Ihre E-Mail wurde unter den 12 (zwölf) glücklichen Gewinnen von 950.000 Pfund, 
während der GOOGLE LOTTERY PROMO VEREINIGTEN KÖNIGREICH ausgewählt.


FÜLLEN SIE DIESE INFORMATIONEN FÜR DIE ÜBERPRÜFUNG

DEINE VOLLSTANDIGEN NAMEN:
IHRE KONTAKTADRESSE:
IHRE TELEFON KONTAKTNUMMER:
DEINE NATIONALITÄT:
IHR ZUSTAND:

Vielen Dank für die Verwendung von Google Mail, Google Search Matchin und 
Google anderen Produkten.
Herr Donald Williams.
Internationale Preisabteilung

Grüße,
Herr Larry
Online Google Director
Großbritannien
+44 (0) 20 7605 886,
Wollstraße 37, 67547 Worms




Re: [PATCH 1/3] rebase -i: demonstrate bug with fixup!/squash! commit messages

2018-04-20 Thread Eric Sunshine
On Fri, Apr 20, 2018 at 8:17 AM, Johannes Schindelin
 wrote:
> When multiple fixup/squash commands are processed and the last one
> causes merge conflicts and is skipped, we leave the "This is a
> combination of ..." comments in the commit message.
>
> Signed-off-by: Johannes Schindelin 
> ---
> diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
> @@ -88,6 +88,27 @@ test_expect_success 'rebase passes merge strategy options 
> correctly' '
> +test_expect_failure '--continue after failed fixup cleans commit message' '
> +   git checkout -b with-conflicting-fixup &&
> +   test_commit wants-fixup &&
> +   test_commit "fixup! wants-fixup" wants-fixup.t 1 wants-fixup-1 &&
> +   test_commit "fixup! wants-fixup" wants-fixup.t 2 wants-fixup-2 &&
> +   test_commit "fixup! wants-fixup" wants-fixup.t 3 wants-fixup-3 &&
> +   test_must_fail env FAKE_LINES="1 fixup 2 fixup 4" \
> +   git rebase -i HEAD~4 &&
> +
> +   : now there is a conflict, and comments in the commit message &&
> +   git show HEAD >out &&
> +   grep "This is a combination of" out &&
> +
> +   : skip and continue &&
> +   git rebase --skip &&

I see that this test script doesn't utilize it, but do you want a

test_when_finished "reset_rebase" &&

before starting the rebase to clean up in case something before "git
rebase --skip" fails?

> +   : now the comments in the commit message should have been cleaned up 
> &&
> +   git show HEAD >out &&
> +   ! grep "This is a combination of" out
> +'


Re: [RFC PATCH 12/12] commit-graph: update design document

2018-04-20 Thread Jakub Narebski
Derrick Stolee  writes:

> The commit-graph feature is now integrated with 'fsck' and 'gc',
> so remove those items from the "Future Work" section of the
> commit-graph design document.

See comments below, but this looks good to me.

What is missing from the "Future Work" section (and which was somewhat
implied by now removed "When this feature stabilizes enough to recommend
to most users") is safety against history [view] changing features:
git-replace, shallow clone and grafts file.  I wrote about this in
"Re: [PATCH v8 04/14] graph: add commit graph design document"
https://public-inbox.org/git/86vacsjdcg@gmail.com/

JN> The problem in my opinion lies in different direction, namely that
JN> commit grafts can change, changing the view of the history.  If we want
JN> commit-graph file to follow user-visible view of the history of the
JN> project, it needs to respect current version of commit grafts - but what
JN> if commit grafts changed since commit-graph file was generated?
JN> 
JN> Actually, there are currently three ways to affect the view of the
JN> history:
JN> 
JN> a. legacy commit grafts mechanism; it was first, but it is not safe,
JN>cannot be transferred on fetch / push, and is now deprecated.
JN> 
JN> b. shallow clones, which are kind of specialized and limited grafts;
JN>they used to limit available functionality, but restrictions are
JN>being lifted (or perhaps even got lifted)
JN> 
JN> c. git-replace mechanism, where we can create an "overlay" of any
JN>object, and is intended to be among others replacement for commit
JN>grafts; safe, transferable, can be turned off with "git
JN>--no-replace-objects "
JN> 
JN> All those can change; some more likely than others.  The problem is if
JN> they change between writing commit-graph file (respecting old view of
JN> the history) and reading it (where we expect to see the new view).
JN> 
JN> a. grafts file can change: lines can be added, removed or changed
JN> 
JN> b. shallow clones can be deepened or shortened, or even make
JN>not shallow
JN> 
JN> c. new replacements can be added, old removed, and existing edited
JN> 
JN> 
JN> There are, as far as I can see, two ways of handling the issue of Git
JN> features that can change the view of the project's history, namely:
JN> 
JN>  * Disable commit-graph reading when any of this features are used, and
JN>always write full graph info.
JN> 
JN>This may not matter much for shallow clones, where commit count
JN>should be small anyway, but when using git-replace to stitch together
JN>current repository with historical one, commit-graph would be
JN>certainly useful.  Also, git-replace does not need to change history.
JN> 
JN>On the other hand I think it is the easier solution.
JN> 
JN> Or
JN> 
JN>  * Detect somehow that the view of the history changed, and invalidate
JN>commit-graph (perhaps even automatically regenerate it).
JN> 
JN>For shallow clone changes I think one can still use the old
JN>commit-graph file to generate the new one.  For other cases, the
JN>metadata is simple to modify, but indices such as generation number
JN>would need to be at least partially calculated anew.

Note that in all cases one can simply discard generation number
information (treating it as if it was ZERO), and use commit-graph as
values before applying history-changing feature: replacements, grafts or
shallow.

Well, at least for shallow you can do that for generation numbers: using
grafts (deprecated) or replacements to join repository with historical
one would mean that we are no longer have commit-graph transitively
closed under reachability.

>
> Signed-off-by: Derrick Stolee 
> ---
>  Documentation/technical/commit-graph.txt | 9 -
>  1 file changed, 9 deletions(-)
>
> diff --git a/Documentation/technical/commit-graph.txt 
> b/Documentation/technical/commit-graph.txt
> index d9f2713efa..d04657b781 100644
> --- a/Documentation/technical/commit-graph.txt
> +++ b/Documentation/technical/commit-graph.txt
> @@ -118,9 +118,6 @@ Future Work
>  - The commit graph feature currently does not honor commit grafts. This can
>be remedied by duplicating or refactoring the current graft logic.
>  
> -- The 'commit-graph' subcommand does not have a "verify" mode that is
> -  necessary for integration with fsck.
> -

All right (though "verify" mode is actually done via "check" command).

>  - After computing and storing generation numbers, we must make graph
>walks aware of generation numbers to gain the performance benefits they
>enable. This will mostly be accomplished by swapping a commit-date-ordered
> @@ -142,12 +139,6 @@ Future Work
>such as "ensure_tree_loaded(commit)" that fully loads a tree before
>using commit->tree.
>  
> -- The current design uses the 'commit-graph' subcommand to generate the 
> graph.
> -  When this feature stabilizes enough to recommend to most users, we should

[PATCH v2] fast-export: fix regression skipping some merge-commits

2018-04-20 Thread Martin Ågren
7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
noted that the pattern `object = array.objects[--array.nr].item` could
be abstracted as `object = object_array_pop()`.

Unfortunately, one of the conversions was horribly wrong. Between
grabbing the last object (i.e., peeking at it) and decreasing the object
count, the original code would sometimes return early. The updated code
on the other hand, will always pop the last element, then maybe do the
early return before doing anything with the object.

The end result is that merge commits where all the parents have still
not been exported will simply be dropped, meaning that they will be
completely missing from the exported data.

Reintroduce the pattern of first grabbing the last object (using a new
function `object_array_peek()`), then later poping it. Using
`..._peek()` and `..._pop()` makes it clear that we are referring to the
same item, i.e., we do not grab one element, then remove another one.

Add a test that would have caught this.

Reported-by: Isaac Chou 
Analyzed-by: Isaac Chou 
Signed-off-by: Martin Ågren 
---
Hmph. Version 1 described the test as "todo". This version uses a better
description. No other changes.

 t/t9350-fast-export.sh | 22 ++
 object.h   |  9 +
 builtin/fast-export.c  |  3 ++-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 866ddf6058..194782b05b 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -540,4 +540,26 @@ test_expect_success 'when using -C, do not declare copy 
when source of copy is a
test_cmp expected actual
 '
 
+test_expect_success 'merge commit gets exported with --import-marks' '
+   test_create_repo merging &&
+   git -C merging commit --allow-empty -m initial &&
+
+   git -C merging checkout -b topic &&
+   >merging/topic-file &&
+   git -C merging add topic-file &&
+   git -C merging commit -m topic-file &&
+
+   git -C merging checkout master &&
+   >merging/master-file &&
+   git -C merging add master-file &&
+   git -C merging commit -m master-file &&
+
+   git -C merging merge --no-ff topic -m "merge the topic" &&
+
+   oid=$(git -C merging rev-parse HEAD^^) &&
+   echo :1 $oid >merging/git-marks &&
+   git -C merging fast-export --import-marks=git-marks refs/heads/master 
>out &&
+   grep "merge the topic" out
+'
+
 test_done
diff --git a/object.h b/object.h
index f13f85b2a9..4d8ce280d9 100644
--- a/object.h
+++ b/object.h
@@ -129,6 +129,15 @@ void add_object_array_with_path(struct object *obj, const 
char *name, struct obj
  */
 struct object *object_array_pop(struct object_array *array);
 
+/*
+ * Returns NULL if the array is empty. Otherwise, returns the last object.
+ * That is, the returned value is what `object_array_pop()` would have 
returned.
+ */
+inline struct object *object_array_peek(const struct object_array *array)
+{
+   return array->nr ? array->objects[array->nr - 1].item : NULL;
+}
+
 typedef int (*object_array_each_func_t)(struct object_array_entry *, void *);
 
 /*
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 27b2cc138e..8377d27b46 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -650,9 +650,10 @@ static void handle_tail(struct object_array *commits, 
struct rev_info *revs,
 {
struct commit *commit;
while (commits->nr) {
-   commit = (struct commit *)object_array_pop(commits);
+   commit = (struct commit *)object_array_peek(commits);
if (has_unshown_parent(commit))
return;
+   (void)object_array_pop(commits);
handle_commit(commit, revs, paths_of_changed_objects);
}
 }
-- 
2.17.0



Re: [PATCH] fast-export: fix regression skipping some merge-commits

2018-04-20 Thread Johannes Schindelin
Hi Martin,

On Fri, 20 Apr 2018, Martin Ågren wrote:

> 7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
> noted that the pattern `object = array.objects[--array.nr].item` could
> be abstracted as `object = object_array_pop()`.
> 
> Unfortunately, one of the conversions was horribly wrong. Between
> grabbing the last object (i.e., peeking at it) and decreasing the object
> count, the original code would sometimes return early. The updated code
> on the other hand, will always pop the last element, then maybe do the
> early return before doing anything with the object.
> 
> The end result is that merge commits where all the parents have still
> not been exported will simply be dropped, meaning that they will be
> completely missing from the exported data.

Excellent explanation.

> Reintroduce the pattern of first grabbing the last object (using a new
> function `object_array_peek()`), then later poping it. Using
> `..._peek()` and `..._pop()` makes it clear that we are referring to the
> same item, i.e., we do not grab one element, then remove another one.

Instead of using _peek() and _pop() and having to reason about the
correctness, maybe we should simply re-push? See below for my suggested
alternative.

> My sincerest apologies for the stupid train-wreck that the original
> conversion was. Weird interactions between different components can make
> for fun bugs, but this one is just embarassing.

The only way to fail is by doing something. You did something. That is
much better than not doing anything. So please do not be sorry about
introducing a breakage. You did fix it, which makes you double awesome.

> Isaac, this should solve the problem you are seeing. Unfortunately, I do
> not have any experience with building Git for Windows [1]. I really hope
> that this bug did not take up too much of your time. Or eat your data!

It is as easy as

git clone --depth=1 https://github.com/git-for-windows/git-sdk-64

(downloading half a gigabyte of objects, but then you have almost
everything except for the Git source and one support repository for Git
for Windows), then starting git-bash.exe in its toplevel directory and
calling

sdk build git

in there. The `sdk` helper is in its infancy, so I could imagine that a
really neat thing would be to be able to build custom branches and bundle
them in a portable Git. Something like `sdk build portable-git --patch
https://public-inbox.org/git/20180420181248.2015922-1-martin.ag...@gmail.com/`.

In the meantime, it should still be doable by calling

sdk cd git
sdk init build-extra
/usr/src/build-extra/apply-from-public-inbox.sh 
https://public-inbox.org/git/20180420181248.2015922-1-martin.ag...@gmail.com/
make -j15 && make -j15 strip && make -j15 install
sdk build installer

and then running that installer.

You could also build a portable Git instead by replacing the last line
with

/usr/src/build-extra/portable/release.sh 0-test

Or if you want to avoid building a portable Git installer, and instead
copy the files directly, you could try this sequence:

pacman -S --noconfirm rsync
mkdir ~/my-test-git
(cd / && rsync -Rau $(ARCH=x86_64 BITNESS=64 \
usr/src/build-extra/make-file-list.sh) ~/my-test-git/)

Please let me know how it is going, I am always eager to make the Git for
Windows SDK easier to use, as it will ultimately save me time.

> diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
> index 866ddf6058..2b46a83a49 100755
> --- a/t/t9350-fast-export.sh
> +++ b/t/t9350-fast-export.sh
> @@ -540,4 +540,26 @@ test_expect_success 'when using -C, do not declare copy 
> when source of copy is a
>   test_cmp expected actual
>  '
>  
> +test_expect_success 'todo' '
> + test_create_repo merging &&
> + git -C merging commit --allow-empty -m initial &&

I see that you copied the style of the latest test case, but I have to
admit that I would find it much easier to read if it said:

(
cd merging &&
test_commit initial &&
git checkout -b topic &&
test_commit on-branch &&
git checkout master &&
test_commit on-master &&
test_tick &&
git merge --no-ff -m Yeah topic &&

echo ":1 $(git rev-parse HEAD^^)" >marks &&
git fast-export --import-marks=marks master >out &&
grep Yeah out
)

i.e. using the subshell where you cd into merging/ first thing, and then
making extensive use of `test_commit`.

> +
> + git -C merging checkout -b topic &&
> + >merging/topic-file &&
> + git -C merging add topic-file &&
> + git -C merging commit -m topic-file &&
> +
> + git -C merging checkout master &&
> + >merging/master-file &&
> + git -C merging add master-file &&
> + git -C merging commit -m master-file &&
> +
> + git -C merging 

RE: [PATCH] fast-export: fix regression skipping some merge-commits

2018-04-20 Thread Isaac Chou
Hi Martin,

No problem at all.  Thanks for the super quick turnaround.  :-)

Isaac

-Original Message-
From: Martin Ågren [mailto:martin.ag...@gmail.com] 
Sent: Friday, April 20, 2018 2:13 PM
To: git@vger.kernel.org; Isaac Chou 
Cc: Junio C Hamano ; Johannes Schindelin 
; Jonathan Tan 
Subject: [PATCH] fast-export: fix regression skipping some merge-commits

7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23) noted 
that the pattern `object = array.objects[--array.nr].item` could be abstracted 
as `object = object_array_pop()`.

Unfortunately, one of the conversions was horribly wrong. Between grabbing the 
last object (i.e., peeking at it) and decreasing the object count, the original 
code would sometimes return early. The updated code on the other hand, will 
always pop the last element, then maybe do the early return before doing 
anything with the object.

The end result is that merge commits where all the parents have still not been 
exported will simply be dropped, meaning that they will be completely missing 
from the exported data.

Reintroduce the pattern of first grabbing the last object (using a new function 
`object_array_peek()`), then later poping it. Using `..._peek()` and 
`..._pop()` makes it clear that we are referring to the same item, i.e., we do 
not grab one element, then remove another one.

Add a test that would have caught this.

Reported-by: Isaac Chou 
Analyzed-by: Isaac Chou 
Signed-off-by: Martin Ågren 
---
Based on maint, but applies equally well on master.

My sincerest apologies for the stupid train-wreck that the original conversion 
was. Weird interactions between different components can make for fun bugs, but 
this one is just embarassing.

Isaac, this should solve the problem you are seeing. Unfortunately, I do not 
have any experience with building Git for Windows [1]. I really hope that this 
bug did not take up too much of your time. Or eat your data!

Martin

[1] The least I can do is provide a link:
https://github.com/git-for-windows/git/wiki/Building-Git

 t/t9350-fast-export.sh | 22 ++
 object.h   |  9 +
 builtin/fast-export.c  |  3 ++-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh index 
866ddf6058..2b46a83a49 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -540,4 +540,26 @@ test_expect_success 'when using -C, do not declare copy 
when source of copy is a
test_cmp expected actual
 '
 
+test_expect_success 'todo' '
+   test_create_repo merging &&
+   git -C merging commit --allow-empty -m initial &&
+
+   git -C merging checkout -b topic &&
+   >merging/topic-file &&
+   git -C merging add topic-file &&
+   git -C merging commit -m topic-file &&
+
+   git -C merging checkout master &&
+   >merging/master-file &&
+   git -C merging add master-file &&
+   git -C merging commit -m master-file &&
+
+   git -C merging merge --no-ff topic -m "merge the topic" &&
+
+   oid=$(git -C merging rev-parse HEAD^^) &&
+   echo :1 $oid >merging/git-marks &&
+   git -C merging fast-export --import-marks=git-marks refs/heads/master 
>out &&
+   grep "merge the topic" out
+'
+
 test_done
diff --git a/object.h b/object.h
index f13f85b2a9..4d8ce280d9 100644
--- a/object.h
+++ b/object.h
@@ -129,6 +129,15 @@ void add_object_array_with_path(struct object *obj, const 
char *name, struct obj
  */
 struct object *object_array_pop(struct object_array *array);
 
+/*
+ * Returns NULL if the array is empty. Otherwise, returns the last object.
+ * That is, the returned value is what `object_array_pop()` would have 
returned.
+ */
+inline struct object *object_array_peek(const struct object_array 
+*array) {
+   return array->nr ? array->objects[array->nr - 1].item : NULL; }
+
 typedef int (*object_array_each_func_t)(struct object_array_entry *, void *);
 
 /*
diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 
27b2cc138e..8377d27b46 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -650,9 +650,10 @@ static void handle_tail(struct object_array *commits, 
struct rev_info *revs,  {
struct commit *commit;
while (commits->nr) {
-   commit = (struct commit *)object_array_pop(commits);
+   commit = (struct commit *)object_array_peek(commits);
if (has_unshown_parent(commit))
return;
+   (void)object_array_pop(commits);
handle_commit(commit, revs, paths_of_changed_objects);
}
 }
--
2.17.0



Re: [PATCH v1 1/2] merge: Add merge.renames config setting

2018-04-20 Thread Elijah Newren
Hi Ben,

On Fri, Apr 20, 2018 at 10:59 AM, Ben Peart  wrote:
>
> On 4/20/2018 1:02 PM, Elijah Newren wrote:
>>
>> On Fri, Apr 20, 2018 at 6:36 AM, Ben Peart 
>> wrote:
>>>
>>> --- a/Documentation/merge-config.txt
>>> +++ b/Documentation/merge-config.txt
>>> @@ -37,6 +37,11 @@ merge.renameLimit::
>>>  during a merge; if not specified, defaults to the value of
>>>  diff.renameLimit.
>>>
>>> +merge.renames::
>>> +   Whether and how Git detects renames.  If set to "false",
>>> +   rename detection is disabled. If set to "true", basic rename
>>> +   detection is enabled. This is the default.
>>
>>
>> One can already control o->detect_rename via the -Xno-renames and
>> -Xfind-renames options.

This statement wasn't meant to be independent of the sentence that
followed it...

> Yes, but that requires people to know they need to do that and then remember
> to pass it on the command line every time.  We've found that doesn't
> typically happen, we just get someone complaining about slow merges. :)
>
> That is why we added them as config options which change the default. That
> way we can then set them on the repo and the default behavior gives them
> better performance.  They can still always override the config setting with
> the command line options.

Sorry, I think I wasn't being clear.  The documentation for the config
options for e.g. diff.renameLimit, fetch.prune, log.abbrevCommit, and
merge.ff all mention the equivalent command line parameters.  Your
patch doesn't do that for merge.renames, but I think it would be
helpful if it did.

Also, a link in the documentation the other way, from
Documentation/merge-strategies.txt under the entries for -Xno-renames
and -Xfind-renames should probably mention this new merge.renames
config setting (much like the -Xno-renormalize flag mentions the
merge.renomralize config option).

(In general, I think having this as a configuration option makes
sense, though I hope my other performance patches would be enough to
make people consider switching back to the defaults and use rename
detection again.)


> I'm of the opinion that we shouldn't bother adding features that we aren't
> sure someone will want/use.  If it comes up, we can certainly add it at a
> later date.

Works for me; I was mostly throwing it out there for thought.

> Yes, command line options override the config settings.

Good.  :-)

>> Also, if someone sets merge.renameLimit (to anything) and sets
>> merge.renames to false, then they've got a contradictory setup.  Does
>> it make sense to check and warn about that anywhere?
>
> I don't think we need to.  The merge.renameLimit is only used if
> detect_rename it turned on no matter how that gets turned on (default,
> config setting, command line option) so there isn't really a change in
> behavior here.

I agree that's the pre-existing behavior, but prior to this patch
turning off rename detection could only be done manually with every
invocation.  I'm slightly concerned that users might be confused if
merge.renames was set to false somewhere -- perhaps even in a global
/etc/gitconfig that they had no knowledge of or control over -- and in
an attempt to get rename detection to work they started passing larger
and larger values for renameLimit all to no avail.

The easy fix here may just be documenting the diff.renameLimit and
merge.renameLimit options that they have no effect if rename detection
is turned off.

Or maybe I'm just worrying too much, but we (folks at $dayjob) were
bit pretty hard by renameLimit silently being capped at a value less
than the user specified and in a way that wasn't documented anywhere.


Re: [RFC PATCH 11/12] gc: automatically write commit-graph files

2018-04-20 Thread Ævar Arnfjörð Bjarmason

On Fri, Apr 20 2018, Jakub Narebski wrote:

> Derrick Stolee  writes:
>
>> The commit-graph file is a very helpful feature for speeding up git
>> operations. In order to make it more useful, write the commit-graph file
>> by default during standard garbage collection operations.
>>
>> Add a 'gc.commitGraph' config setting that triggers writing a
>> commit-graph file after any non-trivial 'git gc' command.
>
> Other than the question if 'gc.commitGraph' and 'core.commitGraph'
> should be independent config variables, and the exact wording of the
> git-gc docs, it looks good to me.

Sans doc errors you pointed out in other places (you need to set
core.commitGraph so it's read at all), I think it's very useful to have
these split up. It's simliar to pack.useBitmaps & pack.writeBitmaps.

Makes it easy to start writing them, and then have a quick toggle to
turn it off if there's any issues rathen than go around deleting the
files or making sure they're not written out.


Re: [PATCH v1] perf/aggregate: tighten option parsing

2018-04-20 Thread Ævar Arnfjörð Bjarmason

On Fri, Apr 20 2018, Eric Sunshine wrote:

> On Fri, Apr 20, 2018 at 8:10 AM, Christian Couder
>  wrote:
>> When passing an option '--foo' that it does not recognize, the
>> aggregate.perl script should die with an helpful error message
>> like:
>>
>>   unknown option '--foo' at ./aggregate.perl line 80.
>>
>> rather than:
>>
>>   fatal: Needed a single revision
>>   rev-parse --verify --foo: command returned error: 128
>>
>> While at it let's also prevent something like
>> 'foo--sort-by=regression' to be handled as if
>> '--sort-by=regression' had been used.
>>
>> Signed-off-by: Christian Couder 
>> ---
>> diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
>> @@ -46,7 +46,7 @@ while (scalar @ARGV) {
>> -   if ($arg =~ /--sort-by(?:=(.*))?/) {
>> +   if ($arg =~ /^--sort-by(?:=(.*))?$/) {
>
> Makes sense.
>
>> @@ -76,6 +76,9 @@ while (scalar @ARGV) {
>> +   if ($arg =~ /^--.+$/) {
>> +   die "unknown option '$arg'";
>> +   }
>
> Nit: In this expression, the trailing +$ makes the match no tighter
> than the simpler /^--./, so the latter would be good enough.
>
> Not necessarily worth a re-roll.

Not that it matters in this case, but just as a bit of Perl rx pedantry,
yes his is tighter & more correct. You didn't consider how "." interacts
with newlines:

$ perl -wE 'my @rx = (qr/^--./, qr/^--.+$/, qr/^--./m, qr/^--.+$/m, 
qr/^--./s, qr/^--.+$/s); for (@rx) { my $s = "--foo\n--bar"; say $_, "\t", ($s 
=~ $_ ? 1 : 0) }'
(?^u:^--.)  1
(?^u:^--.+$)0
(?^um:^--.) 1
(?^um:^--.+$)   1
(?^us:^--.) 1
(?^us:^--.+$)   1

I don't think it matters here, not like someone will pass \n in options
to aggregate.perl...


Re: [PATCH v1 0/2] add additional config settings for merge

2018-04-20 Thread Ben Peart



On 4/20/2018 1:34 PM, Elijah Newren wrote:

On Fri, Apr 20, 2018 at 6:36 AM, Ben Peart  wrote:

This enables the user to set a couple of additional options for merge.

1. merge.aggressive - this is to try to resolve a few more trivial
merge cases.  It is documented in read-tree and is not something you
can pass into merge itself.

2. merge.renames - this is to save git from having to go through the entire
3 trees to see if there were any renames that happened.

For the work item repro that I have been using this drops the merge time
from ~1 hour to ~5 minutes and the unmerged entries goes down from
~40,000 to 1.


Ooh, this is *very* interesting.  Is there any chance I could also get
you to test performing the same merge with the version of git at
https://github.com/newren/git/tree/big-repo-small-cherry-pick and
report on your timings?



Unfortunately, it isn't quite that simple.  My repo is _really_ big 
(3.2M files and ~100K commits per week) and requires me to use a custom 
fork of git that works with our GVFS solution for it to work at all.


I've been watching your work in this area and am hoping it pays off for 
us if/when we have users that want to do rename detection and override 
our defaults.



The 'big-repo-small-cherry-pick' name could be improved, but that
branch has a number of performance fixes for really poor rename
detection performance during merges.  From your description, I'm
pretty sure it'll apply to your case.  For my specific testcase,  I
got a speedup factor of 30.  Someone else on the list saw a factor of
24[1].  Results are highly dependent on the specific repo, but it's
certainly possible that it gets much of your factor of 12 speedup that
you saw with these new config settings you added.

However, what makes this case even more interesting to me is that my
branch may not be quite as effective as your workarounds.  There are
other other performance issues in merge that I am aware of, but for
which I haven't had the time to write the patches yet (I've been
waiting for the directory rename detection stuff to land and settle
down before working more on the performance aspects).  I do not know
how big a factor those other performance issues are, but your
workarounds (namely the aggressive setting) may get around some of
those other issues as well, so I'm very interested to see how my
current branch compares to the speedups you got with these settings.

Thanks,
Elijah


[1] 
https://public-inbox.org/git/alpine.deb.2.00.1711211303290.20...@ds9.cixit.se/



[PATCH] fast-export: fix regression skipping some merge-commits

2018-04-20 Thread Martin Ågren
7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
noted that the pattern `object = array.objects[--array.nr].item` could
be abstracted as `object = object_array_pop()`.

Unfortunately, one of the conversions was horribly wrong. Between
grabbing the last object (i.e., peeking at it) and decreasing the object
count, the original code would sometimes return early. The updated code
on the other hand, will always pop the last element, then maybe do the
early return before doing anything with the object.

The end result is that merge commits where all the parents have still
not been exported will simply be dropped, meaning that they will be
completely missing from the exported data.

Reintroduce the pattern of first grabbing the last object (using a new
function `object_array_peek()`), then later poping it. Using
`..._peek()` and `..._pop()` makes it clear that we are referring to the
same item, i.e., we do not grab one element, then remove another one.

Add a test that would have caught this.

Reported-by: Isaac Chou 
Analyzed-by: Isaac Chou 
Signed-off-by: Martin Ågren 
---
Based on maint, but applies equally well on master.

My sincerest apologies for the stupid train-wreck that the original
conversion was. Weird interactions between different components can make
for fun bugs, but this one is just embarassing.

Isaac, this should solve the problem you are seeing. Unfortunately, I do
not have any experience with building Git for Windows [1]. I really hope
that this bug did not take up too much of your time. Or eat your data!

Martin

[1] The least I can do is provide a link:
https://github.com/git-for-windows/git/wiki/Building-Git

 t/t9350-fast-export.sh | 22 ++
 object.h   |  9 +
 builtin/fast-export.c  |  3 ++-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 866ddf6058..2b46a83a49 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -540,4 +540,26 @@ test_expect_success 'when using -C, do not declare copy 
when source of copy is a
test_cmp expected actual
 '
 
+test_expect_success 'todo' '
+   test_create_repo merging &&
+   git -C merging commit --allow-empty -m initial &&
+
+   git -C merging checkout -b topic &&
+   >merging/topic-file &&
+   git -C merging add topic-file &&
+   git -C merging commit -m topic-file &&
+
+   git -C merging checkout master &&
+   >merging/master-file &&
+   git -C merging add master-file &&
+   git -C merging commit -m master-file &&
+
+   git -C merging merge --no-ff topic -m "merge the topic" &&
+
+   oid=$(git -C merging rev-parse HEAD^^) &&
+   echo :1 $oid >merging/git-marks &&
+   git -C merging fast-export --import-marks=git-marks refs/heads/master 
>out &&
+   grep "merge the topic" out
+'
+
 test_done
diff --git a/object.h b/object.h
index f13f85b2a9..4d8ce280d9 100644
--- a/object.h
+++ b/object.h
@@ -129,6 +129,15 @@ void add_object_array_with_path(struct object *obj, const 
char *name, struct obj
  */
 struct object *object_array_pop(struct object_array *array);
 
+/*
+ * Returns NULL if the array is empty. Otherwise, returns the last object.
+ * That is, the returned value is what `object_array_pop()` would have 
returned.
+ */
+inline struct object *object_array_peek(const struct object_array *array)
+{
+   return array->nr ? array->objects[array->nr - 1].item : NULL;
+}
+
 typedef int (*object_array_each_func_t)(struct object_array_entry *, void *);
 
 /*
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 27b2cc138e..8377d27b46 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -650,9 +650,10 @@ static void handle_tail(struct object_array *commits, 
struct rev_info *revs,
 {
struct commit *commit;
while (commits->nr) {
-   commit = (struct commit *)object_array_pop(commits);
+   commit = (struct commit *)object_array_peek(commits);
if (has_unshown_parent(commit))
return;
+   (void)object_array_pop(commits);
handle_commit(commit, revs, paths_of_changed_objects);
}
 }
-- 
2.17.0



Re: [PATCH v1 1/2] merge: Add merge.renames config setting

2018-04-20 Thread Ben Peart



On 4/20/2018 1:02 PM, Elijah Newren wrote:

On Fri, Apr 20, 2018 at 6:36 AM, Ben Peart  wrote:

--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -37,6 +37,11 @@ merge.renameLimit::
 during a merge; if not specified, defaults to the value of
 diff.renameLimit.

+merge.renames::
+   Whether and how Git detects renames.  If set to "false",
+   rename detection is disabled. If set to "true", basic rename
+   detection is enabled. This is the default.


One can already control o->detect_rename via the -Xno-renames and
-Xfind-renames options.  


Yes, but that requires people to know they need to do that and then 
remember to pass it on the command line every time.  We've found that 
doesn't typically happen, we just get someone complaining about slow 
merges. :)


That is why we added them as config options which change the default. 
That way we can then set them on the repo and the default behavior gives 
them better performance.  They can still always override the config 
setting with the command line options.


I think the documentation should mention that

"false" is the same as passing -Xno-renames, and "true" is the same as
passing -Xfind-renames.  However, find-renames does take similarity
threshold as a parameter, so there's a question whether this option
should provide some way to do the same.  I'm not sure the answer to
that; it may be that we'd want a separate config option for that, and
we can wait to add it until someone actually wants it.


I'm of the opinion that we shouldn't bother adding features that we 
aren't sure someone will want/use.  If it comes up, we can certainly add 
it at a later date.





  merge.renormalize::
 Tell Git that canonical representation of files in the
 repository has changed over time (e.g. earlier commits record
diff --git a/merge-recursive.c b/merge-recursive.c
index 9c05eb7f70..cd5367e890 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3256,6 +3256,7 @@ static void merge_recursive_config(struct merge_options 
*o)
 git_config_get_int("merge.verbosity", >verbosity);
 git_config_get_int("diff.renamelimit", >diff_rename_limit);
 git_config_get_int("merge.renamelimit", >merge_rename_limit);
+   git_config_get_bool("merge.renames", >detect_rename);
 git_config(git_xmerge_config, NULL);
  }


I would expect an explicitly passed -Xno-renames or -Xfind-renames to
override the config setting.  Could you check if that's the case?



Yes, command line options override the config settings.  You can see 
that in the code where the call to init_merge_options() which loads the 
config settings is followed by parse_merge_opt() which loads the command 
line options.  I've also verified the behavior in the debugger (it's on 
by default in the code, the config setting turns it off, then the 
command line option turns it back on).



Also, if someone sets merge.renameLimit (to anything) and sets
merge.renames to false, then they've got a contradictory setup.  Does
it make sense to check and warn about that anywhere?



I don't think we need to.  The merge.renameLimit is only used if 
detect_rename it turned on no matter how that gets turned on (default, 
config setting, command line option) so there isn't really a change in 
behavior here.


Re: [PATCH v1 0/5] Allocate cache entries from memory pool

2018-04-20 Thread Stefan Beller
>> base-commit: cafaccae98f749ebf33495aec42ea25060de8682
>
> I couldn't quite figure out what these five patches were based on,
> even with this line.  Basing on and referring to a commit that is
> not part of our published history with "base-commit" is not all that
> useful to others.

I'd like to second this. In the object store refactoring, I am at a point where
I'd want to migrate the memory management of {object, tree, commit, tag}.c
which currently is done in alloc.c to a memory pool, that has a dedicated
pointer to it.

So I'd either have to refactor alloc.c to take the_repository[1] or
I'd play around with the mem_pool to manage memory in the
object layer. I guess this playing around can happen with
what is at origin/jm/mem-pool, however the life cycle management
part of the third patch[2] would allow for stopping memleaks there.
So I am interested in this series as well.

[1] proof of concept in patches nearby
https://public-inbox.org/git/20180206001749.218943-31-sbel...@google.com/

[2] https://public-inbox.org/git/20180417163400.3875-5-jam...@microsoft.com/

Thanks,
Stefan


Re: [PATCH v1] perf/aggregate: tighten option parsing

2018-04-20 Thread Eric Sunshine
On Fri, Apr 20, 2018 at 8:10 AM, Christian Couder
 wrote:
> When passing an option '--foo' that it does not recognize, the
> aggregate.perl script should die with an helpful error message
> like:
>
>   unknown option '--foo' at ./aggregate.perl line 80.
>
> rather than:
>
>   fatal: Needed a single revision
>   rev-parse --verify --foo: command returned error: 128
>
> While at it let's also prevent something like
> 'foo--sort-by=regression' to be handled as if
> '--sort-by=regression' had been used.
>
> Signed-off-by: Christian Couder 
> ---
> diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
> @@ -46,7 +46,7 @@ while (scalar @ARGV) {
> -   if ($arg =~ /--sort-by(?:=(.*))?/) {
> +   if ($arg =~ /^--sort-by(?:=(.*))?$/) {

Makes sense.

> @@ -76,6 +76,9 @@ while (scalar @ARGV) {
> +   if ($arg =~ /^--.+$/) {
> +   die "unknown option '$arg'";
> +   }

Nit: In this expression, the trailing +$ makes the match no tighter
than the simpler /^--./, so the latter would be good enough.

Not necessarily worth a re-roll.


Re: [PATCH v1 0/2] add additional config settings for merge

2018-04-20 Thread Elijah Newren
On Fri, Apr 20, 2018 at 6:36 AM, Ben Peart  wrote:
> This enables the user to set a couple of additional options for merge.
>
> 1. merge.aggressive - this is to try to resolve a few more trivial
>merge cases.  It is documented in read-tree and is not something you
>can pass into merge itself.
>
> 2. merge.renames - this is to save git from having to go through the entire
>3 trees to see if there were any renames that happened.
>
> For the work item repro that I have been using this drops the merge time
> from ~1 hour to ~5 minutes and the unmerged entries goes down from
> ~40,000 to 1.

Ooh, this is *very* interesting.  Is there any chance I could also get
you to test performing the same merge with the version of git at
https://github.com/newren/git/tree/big-repo-small-cherry-pick and
report on your timings?

The 'big-repo-small-cherry-pick' name could be improved, but that
branch has a number of performance fixes for really poor rename
detection performance during merges.  From your description, I'm
pretty sure it'll apply to your case.  For my specific testcase,  I
got a speedup factor of 30.  Someone else on the list saw a factor of
24[1].  Results are highly dependent on the specific repo, but it's
certainly possible that it gets much of your factor of 12 speedup that
you saw with these new config settings you added.

However, what makes this case even more interesting to me is that my
branch may not be quite as effective as your workarounds.  There are
other other performance issues in merge that I am aware of, but for
which I haven't had the time to write the patches yet (I've been
waiting for the directory rename detection stuff to land and settle
down before working more on the performance aspects).  I do not know
how big a factor those other performance issues are, but your
workarounds (namely the aggressive setting) may get around some of
those other issues as well, so I'm very interested to see how my
current branch compares to the speedups you got with these settings.

Thanks,
Elijah


[1] 
https://public-inbox.org/git/alpine.deb.2.00.1711211303290.20...@ds9.cixit.se/


Re: [RFC PATCH 11/12] gc: automatically write commit-graph files

2018-04-20 Thread Jakub Narebski
Derrick Stolee  writes:

> The commit-graph file is a very helpful feature for speeding up git
> operations. In order to make it more useful, write the commit-graph file
> by default during standard garbage collection operations.
>
> Add a 'gc.commitGraph' config setting that triggers writing a
> commit-graph file after any non-trivial 'git gc' command.

Other than the question if 'gc.commitGraph' and 'core.commitGraph'
should be independent config variables, and the exact wording of the
git-gc docs, it looks good to me.

> Signed-off-by: Derrick Stolee 
> ---
>  Documentation/git-gc.txt | 4 
>  builtin/gc.c | 8 
>  2 files changed, 12 insertions(+)
>
> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 571b5a7e3c..17dd654a59 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -119,6 +119,10 @@ The optional configuration variable `gc.packRefs` 
> determines if
>  it within all non-bare repos or it can be set to a boolean value.
>  This defaults to true.
>  
> +The optional configuration variable 'gc.commitGraph' determines if
> +'git gc' runs 'git commit-graph write'. This can be set to a boolean
> +value. This defaults to false.
> +
>  The optional configuration variable `gc.aggressiveWindow` controls how
>  much time is spent optimizing the delta compression of the objects in
>  the repository when the --aggressive option is specified.  The larger

[...]


Re: [PATCH v1 1/2] merge: Add merge.renames config setting

2018-04-20 Thread Elijah Newren
On Fri, Apr 20, 2018 at 10:02 AM, Elijah Newren  wrote:
> On Fri, Apr 20, 2018 at 6:36 AM, Ben Peart  wrote:
>> --- a/Documentation/merge-config.txt
>> +++ b/Documentation/merge-config.txt
>> @@ -37,6 +37,11 @@ merge.renameLimit::
>> during a merge; if not specified, defaults to the value of
>> diff.renameLimit.
>>
>> +merge.renames::
>> +   Whether and how Git detects renames.  If set to "false",
>> +   rename detection is disabled. If set to "true", basic rename
>> +   detection is enabled. This is the default.
>
> One can already control o->detect_rename via the -Xno-renames and
> -Xfind-renames options.  I think the documentation should mention that
> "false" is the same as passing -Xno-renames, and "true" is the same as
> passing -Xfind-renames.  However, find-renames does take similarity
> threshold as a parameter, so there's a question whether this option
> should provide some way to do the same.  I'm not sure the answer to
> that; it may be that we'd want a separate config option for that, and
> we can wait to add it until someone actually wants it.

I just realized another issue, though it also affects -Xno-renames.
Even if rename detection is turned off for the merge, it is
unconditionally turned on for the diffstat.  In builtin/merge.c,
function finish(), there is the code:

if (new_head && show_diffstat) {
...
opts.detect_rename = DIFF_DETECT_RENAME;

It seems that this option should affect that line as well.  (Do you
have diffstat turned off by chance?  If not, you may be able to
improve your performance even more...)


Re: [PATCH v1 2/2] merge: Add merge.aggressive config setting

2018-04-20 Thread Elijah Newren
On Fri, Apr 20, 2018 at 6:36 AM, Ben Peart  wrote:
> Add the ability to control the aggressive flag passed to read-tree via a 
> config setting.

This feels like a workaround to the performance problems with index
updates in merge-recursive.c.  That said, it makes sense to me to do
this when rename detection is turned off.  In fact, I think you'd
automatically want to set aggressive to true whenever rename detection
is turned off (whether by your merge.renames option or the
-Xno-renames flag).

I can't think of any reason this setting would be useful separate from
turning rename detection off, and it'd actively harm rename detection
performance improvements I have in the pipeline.  I'd really prefer to
not add this option, and instead combine the setting of aggressive
with the other flag.  Do you have an independent reason for wanting
this?

Thanks,
Elijah


Re: [RFC PATCH 10/12] commit-graph: add '--reachable' option

2018-04-20 Thread Jakub Narebski
Derrick Stolee  writes:

> When writing commit-graph files, it can be convenient to ask for all
> reachable commits (starting at the ref set) in the resulting file. This
> is particularly helpful when writing to stdin is complicated, such as a
> future integration with 'git gc' which will call
> 'git commit-graph write --reachable' after performing cleanup of the
> object database.
>
> Signed-off-by: Derrick Stolee 

For what it is worth, it looks good to me.

> ---
>  Documentation/git-commit-graph.txt |  8 --
>  builtin/commit-graph.c | 41 +++---
>  t/t5318-commit-graph.sh| 10 
>  3 files changed, 53 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/git-commit-graph.txt 
> b/Documentation/git-commit-graph.txt
> index 93c7841ba2..1b14d40590 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -37,12 +37,16 @@ Write a commit graph file based on the commits found in 
> packfiles.
>  +
>  With the `--stdin-packs` option, generate the new commit graph by
>  walking objects only in the specified pack-indexes. (Cannot be combined
> -with --stdin-commits.)
> +with --stdin-commits or --reachable.)
>  +
>  With the `--stdin-commits` option, generate the new commit graph by
>  walking commits starting at the commits specified in stdin as a list
>  of OIDs in hex, one OID per line. (Cannot be combined with
> ---stdin-packs.)
> +--stdin-packs or --reachable.)
> ++
> +With the `--reachable` option, generate the new commit graph by walking
> +commits starting at all refs. (Cannot be combined with --stdin-commits
> +or --stind-packs.)

I wonder if this "cannot be combined" is sustainable... ;-)


[...]
> @@ -113,6 +115,25 @@ static int graph_read(int argc, const char **argv)
>   return 0;
>  }
>  
> +struct hex_list {
> + char **hex_strs;
> + int hex_nr;
> + int hex_alloc;
> +};
> +
> +static int add_ref_to_list(const char *refname,
> +const struct object_id *oid,
> +int flags, void *cb_data)
> +{
> + struct hex_list *list = (struct hex_list*)cb_data;
> +
> + ALLOC_GROW(list->hex_strs, list->hex_nr + 1, list->hex_alloc);
> + list->hex_strs[list->hex_nr] = xcalloc(GIT_MAX_HEXSZ + 1, 1);
> + strcpy(list->hex_strs[list->hex_nr], oid_to_hex(oid));
> + list->hex_nr++;
> + return 0;
> +}
> +
>  static int graph_write(int argc, const char **argv)
>  {
>   const char **pack_indexes = NULL;
> @@ -127,6 +148,8 @@ static int graph_write(int argc, const char **argv)
>   OPT_STRING(0, "object-dir", _dir,
>   N_("dir"),
>   N_("The object directory to store the graph")),
> + OPT_BOOL(0, "reachable", ,
> + N_("start walk at all refs")),
>   OPT_BOOL(0, "stdin-packs", _packs,
>   N_("scan pack-indexes listed by stdin for commits")),
>   OPT_BOOL(0, "stdin-commits", _commits,
> @@ -140,8 +163,8 @@ static int graph_write(int argc, const char **argv)
>builtin_commit_graph_write_options,
>builtin_commit_graph_write_usage, 0);
>  
> - if (opts.stdin_packs && opts.stdin_commits)
> - die(_("cannot use both --stdin-commits and --stdin-packs"));
> + if (opts.reachable + opts.stdin_packs + opts.stdin_commits > 1)
> + die(_("use at most one of --reachable, --stdin-commits, or 
> --stdin-packs"));

Nice trick, but is it worth it (in place of boolean expression)?  Though
it lines up with the error message, though...

>   if (!opts.obj_dir)
>   opts.obj_dir = get_object_directory();
>  
> @@ -164,6 +187,16 @@ static int graph_write(int argc, const char **argv)
>   commit_hex = lines;
>   commits_nr = lines_nr;
>   }
> + } else if (opts.reachable) {
> + struct hex_list list;
> + list.hex_nr = 0;
> + list.hex_alloc = 128;
> + ALLOC_ARRAY(list.hex_strs, list.hex_alloc);
> +
> + for_each_ref(add_ref_to_list, );
> +
> + commit_hex = (const char **)list.hex_strs;
> + commits_nr = list.hex_nr;

Nice trick!

>   }
>  
>   write_commit_graph(opts.obj_dir,
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index e91053271a..ccadd22f57 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh
> @@ -200,6 +200,16 @@ test_expect_success 'build graph from commits with 
> append' '
>  graph_git_behavior 'append graph, commit 8 vs merge 1' full commits/8 merge/1
>  graph_git_behavior 'append graph, commit 8 vs merge 2' full commits/8 merge/2
>  
> +test_expect_success 'build graph using --reachable' '
> + cd "$TRASH_DIRECTORY/full" &&
> + git commit-graph write --reachable &&
> + test_path_is_file 

Re: [PATCH v1 1/2] merge: Add merge.renames config setting

2018-04-20 Thread Elijah Newren
On Fri, Apr 20, 2018 at 6:36 AM, Ben Peart  wrote:
> --- a/Documentation/merge-config.txt
> +++ b/Documentation/merge-config.txt
> @@ -37,6 +37,11 @@ merge.renameLimit::
> during a merge; if not specified, defaults to the value of
> diff.renameLimit.
>
> +merge.renames::
> +   Whether and how Git detects renames.  If set to "false",
> +   rename detection is disabled. If set to "true", basic rename
> +   detection is enabled. This is the default.

One can already control o->detect_rename via the -Xno-renames and
-Xfind-renames options.  I think the documentation should mention that
"false" is the same as passing -Xno-renames, and "true" is the same as
passing -Xfind-renames.  However, find-renames does take similarity
threshold as a parameter, so there's a question whether this option
should provide some way to do the same.  I'm not sure the answer to
that; it may be that we'd want a separate config option for that, and
we can wait to add it until someone actually wants it.

>  merge.renormalize::
> Tell Git that canonical representation of files in the
> repository has changed over time (e.g. earlier commits record
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 9c05eb7f70..cd5367e890 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -3256,6 +3256,7 @@ static void merge_recursive_config(struct merge_options 
> *o)
> git_config_get_int("merge.verbosity", >verbosity);
> git_config_get_int("diff.renamelimit", >diff_rename_limit);
> git_config_get_int("merge.renamelimit", >merge_rename_limit);
> +   git_config_get_bool("merge.renames", >detect_rename);
> git_config(git_xmerge_config, NULL);
>  }

I would expect an explicitly passed -Xno-renames or -Xfind-renames to
override the config setting.  Could you check if that's the case?

Also, if someone sets merge.renameLimit (to anything) and sets
merge.renames to false, then they've got a contradictory setup.  Does
it make sense to check and warn about that anywhere?


Re: [RFC PATCH 09/12] fsck: check commit-graph

2018-04-20 Thread Jakub Narebski
Derrick Stolee  writes:

> If a commit-graph file exists, check its contents during 'git fsck'.

Is it "if a commit-graph file exists", or is it core.commitGraph feature
is turned on?

>
> Signed-off-by: Derrick Stolee 
> ---
>  builtin/fsck.c | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index ef78c6c00c..9712f230ba 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -16,6 +16,7 @@
>  #include "streaming.h"
>  #include "decorate.h"
>  #include "packfile.h"
> +#include "run-command.h"

Couln't this be done internally, without run-command?  Or is it just
preliminary implementation?

>  
>  #define REACHABLE 0x0001
>  #define SEEN  0x0002
> @@ -45,6 +46,7 @@ static int name_objects;
>  #define ERROR_REACHABLE 02
>  #define ERROR_PACK 04
>  #define ERROR_REFS 010
> +#define ERROR_COMMIT_GRAPH 020

I see that these error status codes are not documented anywhere.  Still,
I would expect at least mentioning commit-graph in the git-fsck manpage.

>  
>  static const char *describe_object(struct object *obj)
>  {
> @@ -815,5 +817,16 @@ int cmd_fsck(int argc, const char **argv, const char 
> *prefix)
>   }
>  
>   check_connectivity();
> +
> + if (core_commit_graph) {
> + struct child_process commit_graph_check = CHILD_PROCESS_INIT;
> + const char *check_argv[] = { "commit-graph", "check", NULL, 
> NULL };
> + commit_graph_check.argv = check_argv;
> + commit_graph_check.git_cmd = 1;
> +
> + if (run_command(_graph_check))
> + errors_found |= ERROR_COMMIT_GRAPH;
> + }
> +
>   return errors_found;
>  }


Re: [RFC PATCH 08/12] commit-graph: verify commit contents against odb

2018-04-20 Thread Jakub Narebski
Derrick Stolee  writes:

One more check that could been done, and which do not require accessing
the object database, would be testing correctness of the Large Edge List
(EDGE) chunk.

For each commit in the commit-graph (in the Commit Data (CDAT) chunk),
check if it has more than two parents (if the value for second parent is
different from 0x but has the most significant bit set).  If
there is any such commit, then.

1. Check that EDGE chunk exists
2. For each octopus merge:
   - check that the index into EDGE array is less than number of its
 elements (sanity check the index)
   - for each parent in the EDGE list, check if the position is valid:
 is less than number of commits in the commit-graph
   - check that list of parents in EDGE terminates
3. If feasible, also check that
   - all entries in EDGE chunk are referenced directly or indirectly
   - number of parent list terminators (with most significant bit set)
 is equal to the number of octopus merges (no overlap of parent
 lists) -- if it is considered an error

> When running 'git commit-graph check', compare the contents of the
> commits that are loaded from the commit-graph file with commits that are
> loaded directly from the object database. This includes checking the
> root tree object ID, commit date, and parents.

All right, this part requires checking the object database.

>
> In addition, verify the generation number calculation is correct for all
> commits in the commit-graph file.

But if I understand the code correctly, this one does not require
accessing the object database.  This is entirely separate check, and in
my opinion it should be a separate commit (a separate patch).

Also, there might be a problem with handling GENERATION_NUMBER_MAX.

>
> Signed-off-by: Derrick Stolee 
> ---
>  commit-graph.c | 82 ++
>  1 file changed, 82 insertions(+)

I guess testing for this would be hard - you would need to create
invalid commit-graph file...

>
> diff --git a/commit-graph.c b/commit-graph.c
> index 80a2ac2a6a..b5bce2bac4 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -899,5 +899,87 @@ int check_commit_graph(struct commit_graph *g)
>graph_commit->graph_pos, i);
>   }
>  
> + for (i = 0; i < g->num_commits; i++) {
> + struct commit *graph_commit, *odb_commit;
> + struct commit_list *graph_parents, *odb_parents;
> + int num_parents = 0;
> +
> + hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
> +
> + graph_commit = lookup_commit(_oid);
> + odb_commit = (struct commit *)create_object(cur_oid.hash, 
> alloc_commit_node());
> + if (parse_commit_internal(odb_commit, 0, 0))
> + graph_report(_("failed to parse %s from object 
> database"), oid_to_hex(_oid));

Is it an error to have commit in the commit-graph that is not present in
the object database?

It may happen (if commit-graph is not automatically updated on gc and
pruning), that since creating the commit-graph, the user have deleted
the branch and pruned object database -- then commit-graph can contain
objects that are not present in the object database.

> +
> + if (oidcmp(_commit_tree_in_graph_one(g, 
> graph_commit)->object.oid,
> +get_commit_tree_oid(odb_commit)))
> + graph_report(_("root tree object ID for commit %s in 
> commit-graph is %s != %s"),
> +  oid_to_hex(_oid),
> +  
> oid_to_hex(get_commit_tree_oid(graph_commit)),
> +  
> oid_to_hex(get_commit_tree_oid(odb_commit)));

Looks good to me, nicely detailed error message.

> +
> + if (graph_commit->date != odb_commit->date)
> + graph_report(_("commit date for commit %s in 
> commit-graph is %"PRItime" != %"PRItime""),
> +  oid_to_hex(_oid),
> +  graph_commit->date,
> +  odb_commit->date);

Looks good to me, nicely detailed error message.
It is good to have the same format of the message.

> +
> +
> + graph_parents = graph_commit->parents;
> + odb_parents = odb_commit->parents;
> +
> + while (graph_parents) {
> + num_parents++;
> +
> + if (odb_parents == NULL)
> + graph_report(_("commit-graph parent list for 
> commit %s is too long (%d)"),
> +  oid_to_hex(_oid),
> +  num_parents);
> +
> + if (oidcmp(_parents->item->object.oid, 
> _parents->item->object.oid))
> + graph_report(_("commit-graph parent for %s is 
> %s != %s"),
> +   

Re: [PATCH 1/3] rebase -i: demonstrate bug with fixup!/squash! commit messages

2018-04-20 Thread Stefan Beller
On Fri, Apr 20, 2018 at 5:17 AM, Johannes Schindelin
 wrote:
> When multiple fixup/squash commands are processed and the last one
> causes merge conflicts and is skipped, we leave the "This is a
> combination of ..." comments in the commit message.
>
> Noticed by Eric Sunshine.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  t/t3418-rebase-continue.sh | 21 +
>  1 file changed, 21 insertions(+)
>
> diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
> index 9214d0bb511..b177baee322 100755
> --- a/t/t3418-rebase-continue.sh
> +++ b/t/t3418-rebase-continue.sh
> @@ -88,6 +88,27 @@ test_expect_success 'rebase passes merge strategy options 
> correctly' '
> git rebase --continue
>  '
>
> +test_expect_failure '--continue after failed fixup cleans commit message' '
> +   git checkout -b with-conflicting-fixup &&
> +   test_commit wants-fixup &&
> +   test_commit "fixup! wants-fixup" wants-fixup.t 1 wants-fixup-1 &&
> +   test_commit "fixup! wants-fixup" wants-fixup.t 2 wants-fixup-2 &&
> +   test_commit "fixup! wants-fixup" wants-fixup.t 3 wants-fixup-3 &&
> +   test_must_fail env FAKE_LINES="1 fixup 2 fixup 4" \
> +   git rebase -i HEAD~4 &&
> +
> +   : now there is a conflict, and comments in the commit message &&
> +   git show HEAD >out &&
> +   grep "This is a combination of" out &&

test_i18n_grep ?

> +
> +   : skip and continue &&
> +   git rebase --skip &&
> +
> +   : now the comments in the commit message should have been cleaned up 
> &&
> +   git show HEAD >out &&
> +   ! grep "This is a combination of" out

same

Thanks,
Stefan


Re: [PATCH v2 1/7] replace: "libify" create_graft()

2018-04-20 Thread Johannes Schindelin
Hi Junio,

On Fri, 20 Apr 2018, Junio C Hamano wrote:

> FWIW, I fully agree with the goal of making this (or other pieces
> that would be useful if they were reusable) reusable.

Great! That's something I'll gladly take into my break.

Ciao,
Dscho


Re: [PATCH v2 3/7] Add a test for `git replace --convert-graft-file`

2018-04-20 Thread Johannes Schindelin
Hi Junio,

On Fri, 20 Apr 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > The proof, as the saying goes, lies in the pudding. So here is a
> > regression test that not only demonstrates what the option is supposed to
> > accomplish, but also demonstrates that it does accomplish it.
> 
> The above spreads the misconception that the value of the test is
> "what I wrote works, look here".  It is not.  "Here is how this
> thing is supposed to work.  You are free to improve it, but do not
> break the basic promises these tests outline" to protect the
> resulting system is.

I am but a lousy foreigner, but to me, basically we said the same.

> > diff --git a/t/t6050-replace.sh b/t/t6050-replace.sh
> > index c630aba657e..77ded6df653 100755
> > --- a/t/t6050-replace.sh
> > +++ b/t/t6050-replace.sh
> > @@ -444,4 +444,24 @@ test_expect_success GPG '--graft on a commit with a 
> > mergetag' '
> > git replace -d $HASH10
> >  '
> >  
> > +test_expect_success '--convert-graft-file' '
> > +   : add and convert graft file &&
> > +   printf "%s\n%s %s\n%s\n" \
> > +   $(git rev-parse HEAD^^ HEAD^ HEAD^^ HEAD^2) \
> > +   >.git/info/grafts &&
> 
> I find the above much less readbale than something like
> 
>   {
>   git rev-parse HEAD^^
>   git rev-parse HEAD^ HEAD^^
>   git rev-parse HEAD^2
>   } >.git/info/grafts

Well, don't you know, that is how my first version looked. It failed,
though, as `git rev-parse HEAD^ HEAD^^` outputs two *lines*.

And the version with a here-doc and inlined `$(git rev-parse ...)` really
looks even uglier.

> because printf formatting string must be deciphered and then matched
> against the order and number of rev-parse arguments (and printf's
> ability to happily accept more args than the placeholders does not
> help in readablity---the reader needs to verify that the code is not
> doing anything overly clever exploiting that ability) before I can
> figure out who's parent of whom.
> 
> Of course, it saves a few spawning of subprocesses; I am not sure if
> that is worth the loss of readability in this case, though.

I disagree that it is so horrible to read. If a regression occurs, you
will have the .git/info/grafts file as a reference anyway, so there is
little you need to decipher.

Ciao,
Dscho


Re: [PATCH] git-send-email: Cc more people

2018-04-20 Thread Mathieu Desnoyers
- On Apr 19, 2018, at 8:03 PM, Junio C Hamano gits...@pobox.com wrote:

> Mathieu Desnoyers  writes:
> 
 I'd further say that these new CC-sources should be disabled by
 default and made opt-in to avoid surprising existing users.
>>> 
>>> But I disagree with this.  The current behaviour is surprising to
>>> existing users, to the point where people are writing their own scripts
>>> to replace git send-email (which seems crazy to me).
>>
>> We could perhaps go with a whitelist approach. The four
>> main match I would be tempted to add are: Acked-by, Reported-by,
>> Reviewed-by, and Tested-by.
> 
> A tool that suddenly starts sending e-mails to more addresses
> without letting the end-users know when and why the change in
> behaviour happened is a source of irritated "somebody made a stupid
> change to git-send-email without telling us that caused unwanted
> e-mails sent to unexpected places and embarrassed me" bug reports.
> I do agree with a whitelist approach from that point of view, and in
> the initial rollout of the feature, that whitelist should be limited
> to what we already send out.
> 
> The users who learn about this new feature can opt into whitelisting
> the common 4 above before we enable them by default.  FWIW, I
> personally think these will be a sensible default (in addition to
> what we already Cc).  I however prefer an approach to introduce
> these more gradually.

Sure, introducing changes like this needs to be done gradually.

Thanks!

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


Re: [PATCH v2 2/7] replace: introduce --convert-graft-file

2018-04-20 Thread Johannes Schindelin
Hi Junio,

On Fri, 20 Apr 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > This option is intended to help with the transition away from the
> > now-deprecated graft file.
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  Documentation/git-replace.txt | 11 +--
> >  builtin/replace.c | 59 ++-
> >  2 files changed, 66 insertions(+), 4 deletions(-)
> 
> I expected you to remove convert-grafts-to-replace-refs.sh from the
> contrib/ section in the same patch, actually.

Sorry, I was still under the impression that contrib/ was somewhat off
limits when replacing scripts by C code (it used to be the rule, but I did
see that the contrib/examples/*.sh files went poof).

Will change.

> FWIW, I think it is a much better approach to give the first-class UI
> for transition like this patch does than "go fish for a good way to
> transition yourself, we may have something useful in contrib/", which is
> what we had so far.

Obviously I agree. It just makes for such a vastly superior user
experience to get an explanation that yes, something has been deprecated,
but Do Not Panic, this is what you do to get out of this fix.

In the same vein, I considered some real hackery that would make the
deprecation notice in commit.c less annoying: I wanted to use some
tentative date in the future as cut-off, and then use some arithmetic
based on the time left until that date to show the deprecation notice
increasingly more often.

However, this would have made it really frustrating for users seeing the
notice fly by, say, in a lengthy script's output, and then trying to
reproduce the message. So as much fun as it would have been to come up
with that formula, I refrained from it.

> > +   while (strbuf_getline(, fp) != EOF) {
> > +   int i = 0, j;
> > +
> > +   while (i != buf.len) {
> > +   char save;
> > +
> > +   for (j = i; j < buf.len && !isspace(buf.buf[j]); j++)
> > +   ; /* look further */
> > +   save = buf.buf[j];
> > +   buf.buf[j] = '\0';
> > +   argv_array_push(, buf.buf + i);
> > +   buf.buf[j] = save;
> 
> It's a shame that we do not have a helper that splits the contents
> of a strbuf at SP and shove the result into an argv_array(). [*1*]
> 
> *1* There is one that splits into an array of strbuf but the point
> of splitting is often that these split pieces are the final thing we
> want, and placing them in separate strbuf (whose strength is that
> contents are easily manipulatable) is pointless.

FWIW I considered introducing such a helper. But I really want to have the
full line to show to the user, so I went with the current version.

Based on your comment, I realized that since argv_array_push() duplicates
the string *anyway*, I could have implemented argv_array_split().

Briefly deterred by the insight that some readers will invariably think
that this function performs de-quoting, Unix shell style, I almost decided
against that.

But only almost.

If anybody needs a de-quoting version of argv_array_split(), or a version
that uses a different delimiter than white-space, my version should
provide a neat starting point (new parameters should be added for those
purposes, of course, since we really need a non-de-quoting version in
--convert-graft-file).

So the next iteration of this patch series will sport a shiny new
argv_array_split(). Enjoy.

> > +
> > +   while (j < buf.len && isspace(buf.buf[j]))
> > +   j++;
> > +   i = j;
> 
> One difference I notice while comparing this with what is done by
> contrib/convert-grafts-to-replace-refs.sh is that this does not
> skip a line that begins with # or SP.  I offhand do not know what
> the point of skipping a line that begins with a SP, but I suspect
> that skipping a line that begins with "#" is a desirable thing to
> do, because commit.c::read_graft_line() does know that such a line
> is a valid comment.

Rght! I meant to look at the parser to verify that I do the same, but
forgot (I had the incorrect recollection that the graft file cannot
contain comments or empty lines).

So I remedied that now. Including correct handling of empty lines.

> > +   }
> > +
> > +   if (create_graft(args.argc, args.argv, force))
> > +   strbuf_addf(, "\n\t%s", buf.buf);
> > +
> > +   argv_array_clear();
> > +   strbuf_reset();
> 
> Strictly speaking, this reset is redundant, as getline() will always
> stuff the line into a fresh buffer (and after the loop there
> correctly is a release).

Good point. I did assume that strbuf_getline() would append, and I was
wrong. I dropped the needless strbuf_reset() call.

> > +   }
> > +
> > +   strbuf_release();
> > +   argv_array_clear();
> > +
> > +   if (!err.len)
> > +   

Re: [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths

2018-04-20 Thread Elijah Newren
On Fri, Apr 20, 2018 at 5:23 AM, SZEDER Gábor  wrote:
> This patch causes memory corruption when the split index feature is in
> use, making several tests fail.  Now, while the split index feature
> sure has its own set of problems, AFAIK those are not that bad to
> cause memory corruption, they "only" tend to cause transient test
> failures due to a variant of the classic racy git issue [1].
>
> Here is a test failure:
>
>   $ GIT_TEST_SPLIT_INDEX=DareISayYes ./t3030-merge-recursive.sh

Running under valgrind shows that merge-recursive.c's add_cacheinfo
(which calls add_cache_entry()) results in data used by o->orig_index
getting free()'d.  That means that anything trying to use that memory
(whether a later call to discard_index, or just a call to was_dirty()
or was_tracked()) will be access'ing free'd memory.  (The exact same
tests run valgrind clean when GIT_TEST_SPLIT_INDEX is not turned on.)

The fact that add_cacheinfo() frees data used by o->orig_index
surprises me.  add_cacheinfo is only supposed to modify the_index.
Are o->orig_index and the_index sharing data somehow?  Did I do
something wrong or incomplete for the split index case when swapping
indexes?  My swapping logic, as shown in this patch was:

/*
 * Update the_index to match the new results, AFTER saving a copy
 * in o->orig_index.  Update src_index to point to the saved copy.
 * (verify_uptodate() checks src_index, and the original index is
 * the one that had the necessary modification timestamps.)
 */
o->orig_index = the_index;
the_index = tmp_index;
o->unpack_opts.src_index = >orig_index;

Do I need to do more?


Re: [PATCH v2 2/7] replace: introduce --convert-graft-file

2018-04-20 Thread Johannes Schindelin
Hi Christian,

On Thu, 19 Apr 2018, Christian Couder wrote:

> On Thu, Apr 19, 2018 at 10:17 AM, Johannes Schindelin
>  wrote:
> 
> > @@ -87,9 +88,13 @@ OPTIONS
> > content as  except that its parents will be
> > [...] instead of 's parents. A replacement ref
> > is then created to replace  with the newly created
> > -   commit. See contrib/convert-grafts-to-replace-refs.sh for an
> > -   example script based on this option that can convert grafts to
> > -   replace refs.
> > +   commit. Use `--convert-graft-file` to convert a
> > +   `$GIT_DIR/info/grafts` file use replace refs instead.
> 
> s/file use replace refs/file to use replace refs/

Ouch... thanks for catching this!

> > +--convert-graft-file::
> > +   Creates graft commits for all entries in `$GIT_DIR/info/grafts`
> > +   and deletes that file upon success. The purpose is to help users
> > +   with transitioning off of the now-deprecated graft file.
> 
> I wonder if it would be better to rename the file to "old_grafts" or
> something, and perhaps tell the user that we renamed it and it can now
> be either deleted or moved somewhere else.

I do not think so. Those old_grafts will only lie around uselessly... and
are no longer necessary.

Ciao,
Dscho


RE: [BUG] Git fast-export with import marks file omits merge commits

2018-04-20 Thread Isaac Chou
Hi Martin,

No problem.  I was thinking of the peek/pop pattern as well.  :)  If you don't 
mind, can you please go ahead and submit a patch for this?  Thanks so much.

Isaac

-Original Message-
From: Martin Ågren [mailto:martin.ag...@gmail.com] 
Sent: Friday, April 20, 2018 1:08 AM
To: Junio C Hamano 
Cc: Isaac Chou ; Johannes Schindelin 
; Jonathan Tan ; 
git@vger.kernel.org
Subject: Re: [BUG] Git fast-export with import marks file omits merge commits

On 20 April 2018 at 00:48, Junio C Hamano  wrote:
> Isaac Chou  writes:
>
>> I inspected the source code (builtin/fast-export.c) for the 
>> fast-export issue I encountered, and it looks like the merge commit 
>> is discarded too early by the call to object_array_pop() after only 
>> one of the two UNSHOWN parents is processed in the method 
>> handle_tail().  The poped merge commit still has one UNSHOWN parent, 
>> therefore it is not processed and is lost in the output.  Can someone 
>> advise me on how to submit a code change or bug report in order to 
>> get the fix into the code base?
>
> There indeed are some differences between v2.14 and v2.15 around the 
> code that returns early when has_unshown_parent() says "yes" [*1*], 
> but the decision to return early when the function says "yes" hasn't 
> changed between that timeperiod---it dates back to f2dc849e ("Add 'git 
> fast-export', the sister of 'git fast-import'", 2007-12-02), i.e. the 
> very beginning of the program's life.
>
> I'll CC those who wrote the original and b3e8ca89 ("fast-export: do 
> not copy from modified file", 2017-09-20) and 71992039
> ("object_array: add and use `object_array_pop()`", 2017-09-23), which 
> are the only two commits that touch the surrounding area during that 
> timeperiod, to ask if something jumps at them.
>
> Thanks.
>
>
> [Footnotes]
>
> *1* An excerpt from 'git diff v2.14.0 v2.15.0 builtin/fast-export.c'
> reads like so:
>
> diff --git a/builtin/fast-export.c b/builtin/fast-export.c index 
> d412c0a8f3..2fb60d6d48 100644
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
> ...
> @@ -630,15 +645,15 @@ static void *anonymize_tag(const void *old, size_t *len)
> return strbuf_detach(, len);  }
>
> -static void handle_tail(struct object_array *commits, struct rev_info 
> *revs)
> +static void handle_tail(struct object_array *commits, struct rev_info *revs,
> +   struct string_list *paths_of_changed_objects)
>  {
> struct commit *commit;
> while (commits->nr) {
> -   commit = (struct commit *)commits->objects[commits->nr - 
> 1].item;
> +   commit = (struct commit *)object_array_pop(commits);
> if (has_unshown_parent(commit))
> return;
> -   handle_commit(commit, revs);
> -   commits->nr--;
> +   handle_commit(commit, revs, paths_of_changed_objects);
> }
>  }

Indeed. This looks wrong and the guilty person would be me.

If my 71992039 ("object_array: add and use `object_array_pop()`",
2017-09-23) would instead have done something like 
s/commits->nr--/(void)object_array_pop(commits)/ it would not have screwed up 
as much. It could also use a peek+pop-pattern.

Isaac, are you up for submitting a patch? Just let me know if you encounter any 
issues. Otherwise, I can submit a patch shortly since I was the one who dropped 
the ball originally.

Thanks for diagnosing this.

Martin


[PATCH v1 2/2] merge: Add merge.aggressive config setting

2018-04-20 Thread Ben Peart
Add the ability to control the aggressive flag passed to read-tree via a config 
setting.

Reviewed-by: Johannes Schindelin 
Signed-off-by: Ben Peart 
---
 Documentation/merge-config.txt | 4 
 merge-recursive.c  | 1 +
 2 files changed, 5 insertions(+)

diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 656f909eb3..5a9ab969db 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -1,3 +1,7 @@
+merge.aggressive::
+   Passes "aggressive" to read-tree which makes the command resolve
+   a few more cases internally. See "--aggressive" in 
linkgit:git-read-tree[1].
+
 merge.conflictStyle::
Specify the style in which conflicted hunks are written out to
working tree files upon merge.  The default is "merge", which
diff --git a/merge-recursive.c b/merge-recursive.c
index cd5367e890..0ca84e4b82 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -355,6 +355,7 @@ static int git_merge_trees(struct merge_options *o,
o->unpack_opts.fn = threeway_merge;
o->unpack_opts.src_index = _index;
o->unpack_opts.dst_index = _index;
+   git_config_get_bool("merge.aggressive", (int 
*)>unpack_opts.aggressive);
setup_unpack_trees_porcelain(>unpack_opts, "merge");
 
init_tree_desc_from_tree(t+0, common);
-- 
2.17.0.windows.1



[PATCH v1 0/2] add additional config settings for merge

2018-04-20 Thread Ben Peart
This enables the user to set a couple of additional options for merge.

1. merge.aggressive - this is to try to resolve a few more trivial
   merge cases.  It is documented in read-tree and is not something you
   can pass into merge itself.

2. merge.renames - this is to save git from having to go through the entire
   3 trees to see if there were any renames that happened.

For the work item repro that I have been using this drops the merge time
from ~1 hour to ~5 minutes and the unmerged entries goes down from
~40,000 to 1.

Helped-by: Kevin Willford 
Reviewed-by: Johannes Schindelin 
Signed-off-by: Ben Peart 

Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/a3d157f5be
Checkout: git fetch https://github.com/benpeart/git merge-options-v1 && git 
checkout a3d157f5be

Ben Peart (2):
  merge: Add merge.renames config setting
  merge: Add merge.aggressive config setting

 Documentation/merge-config.txt | 9 +
 merge-recursive.c  | 2 ++
 2 files changed, 11 insertions(+)


base-commit: 0b0cc9f86731f894cff8dd25299a9b38c254569e
-- 
2.17.0.windows.1




[PATCH v1 1/2] merge: Add merge.renames config setting

2018-04-20 Thread Ben Peart
Add the ability to control rename detection for merge via a config setting.

Reviewed-by: Johannes Schindelin 
Signed-off-by: Ben Peart 
---
 Documentation/merge-config.txt | 5 +
 merge-recursive.c  | 1 +
 2 files changed, 6 insertions(+)

diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 12b6bbf591..656f909eb3 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -37,6 +37,11 @@ merge.renameLimit::
during a merge; if not specified, defaults to the value of
diff.renameLimit.
 
+merge.renames::
+   Whether and how Git detects renames.  If set to "false",
+   rename detection is disabled. If set to "true", basic rename
+   detection is enabled. This is the default.
+
 merge.renormalize::
Tell Git that canonical representation of files in the
repository has changed over time (e.g. earlier commits record
diff --git a/merge-recursive.c b/merge-recursive.c
index 9c05eb7f70..cd5367e890 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3256,6 +3256,7 @@ static void merge_recursive_config(struct merge_options 
*o)
git_config_get_int("merge.verbosity", >verbosity);
git_config_get_int("diff.renamelimit", >diff_rename_limit);
git_config_get_int("merge.renamelimit", >merge_rename_limit);
+   git_config_get_bool("merge.renames", >detect_rename);
git_config(git_xmerge_config, NULL);
 }
 
-- 
2.17.0.windows.1



Re: [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths

2018-04-20 Thread SZEDER Gábor
> In commit aacb82de3ff8 ("merge-recursive: Split was_tracked() out of
> would_lose_untracked()", 2011-08-11), was_tracked() was split out of
> would_lose_untracked() with the intent to provide a function that could
> answer whether a path was tracked in the index before the merge.  Sadly,
> it instead returned whether the path was in the working tree due to having
> been tracked in the index before the merge OR having been written there by
> unpack_trees().  The distinction is important when renames are involved,
> e.g. for a merge where:
> 
>HEAD:  modifies path b
>other: renames b->c
> 
> In this case, c was not tracked in the index before the merge, but would
> have been added to the index at stage 0 and written to the working tree by
> unpack_trees().  would_lose_untracked() is more interested in the
> in-working-copy-for-either-reason behavior, while all other uses of
> was_tracked() want just was-it-tracked-in-index-before-merge behavior.
> 
> Unsplit would_lose_untracked() and write a new was_tracked() function
> which answers whether a path was tracked in the index before the merge
> started.
> 
> This will also affect was_dirty(), helping it to return better results
> since it can base answers off the original index rather than an index that
> possibly only copied over some of the stat information.  However,
> was_dirty() will need an additional change that will be made in a
> subsequent patch.
> 
> Signed-off-by: Elijah Newren 
> ---

This patch causes memory corruption when the split index feature is in
use, making several tests fail.  Now, while the split index feature
sure has its own set of problems, AFAIK those are not that bad to
cause memory corruption, they "only" tend to cause transient test
failures due to a variant of the classic racy git issue [1].

Here is a test failure:

  $ GIT_TEST_SPLIT_INDEX=DareISayYes ./t3030-merge-recursive.sh
  <...>
  ok 31 - merge-recursive simple w/submodule result
  *** Error in `/home/szeder/src/git/git': free(): invalid pointer: 
0x01f646d0 ***
  === Backtrace: =
  /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f84e0c5b7e5]
  /lib/x86_64-linux-gnu/libc.so.6(+0x7f72a)[0x7f84e0c6372a]
  /lib/x86_64-linux-gnu/libc.so.6(cfree+0xf7)[0x7f84e0c685e7]
  /home/szeder/src/git/git[0x5181ee]
  /home/szeder/src/git/git[0x4f1e82]
  /home/szeder/src/git/git[0x4f394b]
  /home/szeder/src/git/git[0x44a37f]
  /home/szeder/src/git/git[0x44afa9]
  /home/szeder/src/git/git[0x406640]
  /home/szeder/src/git/git[0x4070f0]
  /home/szeder/src/git/git[0x4062a7]
  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f84e0c04830]
  /home/szeder/src/git/git[0x4062f9]
  === Memory map: 
  0040-00616000 r-xp  08:06 2255502
/home/szeder/src/git/git
  00815000-00816000 r--p 00215000 08:06 2255502
/home/szeder/src/git/git
  00816000-00823000 rw-p 00216000 08:06 2255502
/home/szeder/src/git/git
  00823000-00866000 rw-p  00:00 0 
  01f63000-01fa6000 rw-p  00:00 0  
[heap]
  7f84e09ce000-7f84e09e4000 r-xp  08:06 921674 
/lib/x86_64-linux-gnu/libgcc_s.so.1
  7f84e09e4000-7f84e0be3000 ---p 00016000 08:06 921674 
/lib/x86_64-linux-gnu/libgcc_s.so.1
  7f84e0be3000-7f84e0be4000 rw-p 00015000 08:06 921674 
/lib/x86_64-linux-gnu/libgcc_s.so.1
  7f84e0be4000-7f84e0da4000 r-xp  08:06 917791 
/lib/x86_64-linux-gnu/libc-2.23.so
  7f84e0da4000-7f84e0fa4000 ---p 001c 08:06 917791 
/lib/x86_64-linux-gnu/libc-2.23.so
  7f84e0fa4000-7f84e0fa8000 r--p 001c 08:06 917791 
/lib/x86_64-linux-gnu/libc-2.23.so
  7f84e0fa8000-7f84e0faa000 rw-p 001c4000 08:06 917791 
/lib/x86_64-linux-gnu/libc-2.23.so
  7f84e0faa000-7f84e0fae000 rw-p  00:00 0 
  7f84e0fae000-7f84e0fb5000 r-xp  08:06 917825 
/lib/x86_64-linux-gnu/librt-2.23.so
  7f84e0fb5000-7f84e11b4000 ---p 7000 08:06 917825 
/lib/x86_64-linux-gnu/librt-2.23.so
  7f84e11b4000-7f84e11b5000 r--p 6000 08:06 917825 
/lib/x86_64-linux-gnu/librt-2.23.so
  7f84e11b5000-7f84e11b6000 rw-p 7000 08:06 917825 
/lib/x86_64-linux-gnu/librt-2.23.so
  7f84e11b6000-7f84e11ce000 r-xp  08:06 917789 
/lib/x86_64-linux-gnu/libpthread-2.23.so
  7f84e11ce000-7f84e13cd000 ---p 00018000 08:06 917789 
/lib/x86_64-linux-gnu/libpthread-2.23.so
  7f84e13cd000-7f84e13ce000 r--p 00017000 08:06 917789 
/lib/x86_64-linux-gnu/libpthread-2.23.so
  7f84e13ce000-7f84e13cf000 rw-p 00018000 08:06 917789 
/lib/x86_64-linux-gnu/libpthread-2.23.so
  7f84e13cf000-7f84e13d3000 rw-p  00:00 0 
  7f84e13d3000-7f84e13ec000 r-xp  08:06 

Re: [PATCH v4] bisect: create 'bisect_flags' parameter in find_bisection()

2018-04-20 Thread Johannes Schindelin
Hi Harald,

On Wed, 18 Apr 2018, Harald Nordgren wrote:

> Make it possible to implement bisecting only on first parents or on
> merge commits by passing flags to find_bisection(), instead of just
> a 'find_all' boolean.
> 
> Signed-off-by: Harald Nordgren 
> ---
> 
> Notes:
> Use unsigned type and cache flag value

This version looks good to me.

Thanks,
Johannes


[PATCH 3/3] rebase --skip: clean up commit message after a failed fixup/squash

2018-04-20 Thread Johannes Schindelin
During a series of fixup/squash commands, the interactive rebase builds
up a commit message with comments. This will be presented to the user in
the editor if at least one of those commands was a `squash`.

However, if the last of these fixup/squash commands fails with merge
conflicts, and if the user then decides to skip it (or resolve it to a
clean worktree and then continue the rebase), the current code fails to
clean up the commit message.

This commit fixes that behavior.

The diff is best viewed with --color-moved.

Signed-off-by: Johannes Schindelin 
---
 sequencer.c| 36 
 t/t3418-rebase-continue.sh |  2 +-
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index a9c3bc26f84..f067b7b24c5 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2781,17 +2781,12 @@ static int continue_single_pick(void)
 
 static int commit_staged_changes(struct replay_opts *opts)
 {
-   unsigned int flags = ALLOW_EMPTY | EDIT_MSG;
+   unsigned int flags = ALLOW_EMPTY | EDIT_MSG, is_fixup = 0, is_clean;
 
if (has_unstaged_changes(1))
return error(_("cannot rebase: You have unstaged changes."));
-   if (!has_uncommitted_changes(0)) {
-   const char *cherry_pick_head = git_path_cherry_pick_head();
 
-   if (file_exists(cherry_pick_head) && unlink(cherry_pick_head))
-   return error(_("could not remove CHERRY_PICK_HEAD"));
-   return 0;
-   }
+   is_clean = !has_uncommitted_changes(0);
 
if (file_exists(rebase_path_amend())) {
struct strbuf rev = STRBUF_INIT;
@@ -2804,16 +2799,41 @@ static int commit_staged_changes(struct replay_opts 
*opts)
if (get_oid_hex(rev.buf, _amend))
return error(_("invalid contents: '%s'"),
rebase_path_amend());
-   if (oidcmp(, _amend))
+   if (!is_clean && oidcmp(, _amend))
return error(_("\nYou have uncommitted changes in your "
   "working tree. Please, commit them\n"
   "first and then run 'git rebase "
   "--continue' again."));
+   if (is_clean && !oidcmp(, _amend)) {
+   strbuf_reset();
+   /*
+* Clean tree, but we may need to finalize a
+* fixup/squash chain. A failed fixup/squash leaves the
+* file amend-type in rebase-merge/; It is okay if that
+* file is missing, in which case there is no such
+* chain to finalize.
+*/
+   read_oneliner(, rebase_path_amend_type(), 0);
+   if (!strcmp("squash", rev.buf))
+   is_fixup = TODO_SQUASH;
+   else if (!strcmp("fixup", rev.buf)) {
+   is_fixup = TODO_FIXUP;
+   flags = (flags & ~EDIT_MSG) | CLEANUP_MSG;
+   }
+   }
 
strbuf_release();
flags |= AMEND_MSG;
}
 
+   if (is_clean && !is_fixup) {
+   const char *cherry_pick_head = git_path_cherry_pick_head();
+
+   if (file_exists(cherry_pick_head) && unlink(cherry_pick_head))
+   return error(_("could not remove CHERRY_PICK_HEAD"));
+   return 0;
+   }
+
if (run_git_commit(rebase_path_message(), opts, flags))
return error(_("could not commit staged changes."));
unlink(rebase_path_amend());
diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
index b177baee322..4880bff82ff 100755
--- a/t/t3418-rebase-continue.sh
+++ b/t/t3418-rebase-continue.sh
@@ -88,7 +88,7 @@ test_expect_success 'rebase passes merge strategy options 
correctly' '
git rebase --continue
 '
 
-test_expect_failure '--continue after failed fixup cleans commit message' '
+test_expect_success '--continue after failed fixup cleans commit message' '
git checkout -b with-conflicting-fixup &&
test_commit wants-fixup &&
test_commit "fixup! wants-fixup" wants-fixup.t 1 wants-fixup-1 &&
-- 
2.17.0.windows.1.15.gaa56ade3205


[PATCH 2/3] sequencer: leave a tell-tale when a fixup/squash failed

2018-04-20 Thread Johannes Schindelin
In the upcoming patch to clean up fixup/squash commit messages even when
skipping a final fixup/squash that failed with merge conflicts, we will
need to have some indicator what happened.

As we need to remove the message-fixup and message-squash files upon
failure, we cannot use those. So let's just write an explicit amend-type
file, containing either `fixup` or `squash`. The absence of that file
indicates that the we were not in the middle of a fixup or squash when
merge conflicts were happening.

Signed-off-by: Johannes Schindelin 
---
 sequencer.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/sequencer.c b/sequencer.c
index 667f35ebdff..a9c3bc26f84 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -106,6 +106,13 @@ static GIT_PATH_FUNC(rebase_path_author_script, 
"rebase-merge/author-script")
  * command is processed, this file is deleted.
  */
 static GIT_PATH_FUNC(rebase_path_amend, "rebase-merge/amend")
+/*
+ * If there was a merge conflict in a fixup/squash series, we need to
+ * record the type so that a `git rebase --skip` can clean up the commit
+ * message as appropriate. This file will contain that type (`fixup` or
+ * `squash`), and not exist otherwise.
+ */
+static GIT_PATH_FUNC(rebase_path_amend_type, "rebase-merge/amend-type")
 /*
  * When we stop at a given patch via the "edit" command, this file contains
  * the abbreviated commit name of the corresponding patch.
@@ -2392,10 +2399,20 @@ static int error_with_patch(struct commit *commit,
 static int error_failed_squash(struct commit *commit,
struct replay_opts *opts, int subject_len, const char *subject)
 {
+   const char *amend_type = "squash";
+
+   if (file_exists(rebase_path_fixup_msg())) {
+   unlink(rebase_path_fixup_msg());
+   amend_type = "fixup";
+   }
+   if (write_message(amend_type, strlen(amend_type),
+  rebase_path_amend_type(), 0))
+   return error(_("could not write '%s'"),
+rebase_path_amend_type());
+
if (rename(rebase_path_squash_msg(), rebase_path_message()))
return error(_("could not rename '%s' to '%s'"),
rebase_path_squash_msg(), rebase_path_message());
-   unlink(rebase_path_fixup_msg());
unlink(git_path_merge_msg());
if (copy_file(git_path_merge_msg(), rebase_path_message(), 0666))
return error(_("could not copy '%s' to '%s'"),
@@ -2572,6 +2589,7 @@ static int pick_commits(struct todo_list *todo_list, 
struct replay_opts *opts)
unlink(rebase_path_author_script());
unlink(rebase_path_stopped_sha());
unlink(rebase_path_amend());
+   unlink(rebase_path_amend_type());
delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
}
if (item->command <= TODO_SQUASH) {
@@ -2799,6 +2817,7 @@ static int commit_staged_changes(struct replay_opts *opts)
if (run_git_commit(rebase_path_message(), opts, flags))
return error(_("could not commit staged changes."));
unlink(rebase_path_amend());
+   unlink(rebase_path_amend_type());
return 0;
 }
 
-- 
2.17.0.windows.1.15.gaa56ade3205




[PATCH 1/3] rebase -i: demonstrate bug with fixup!/squash! commit messages

2018-04-20 Thread Johannes Schindelin
When multiple fixup/squash commands are processed and the last one
causes merge conflicts and is skipped, we leave the "This is a
combination of ..." comments in the commit message.

Noticed by Eric Sunshine.

Signed-off-by: Johannes Schindelin 
---
 t/t3418-rebase-continue.sh | 21 +
 1 file changed, 21 insertions(+)

diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
index 9214d0bb511..b177baee322 100755
--- a/t/t3418-rebase-continue.sh
+++ b/t/t3418-rebase-continue.sh
@@ -88,6 +88,27 @@ test_expect_success 'rebase passes merge strategy options 
correctly' '
git rebase --continue
 '
 
+test_expect_failure '--continue after failed fixup cleans commit message' '
+   git checkout -b with-conflicting-fixup &&
+   test_commit wants-fixup &&
+   test_commit "fixup! wants-fixup" wants-fixup.t 1 wants-fixup-1 &&
+   test_commit "fixup! wants-fixup" wants-fixup.t 2 wants-fixup-2 &&
+   test_commit "fixup! wants-fixup" wants-fixup.t 3 wants-fixup-3 &&
+   test_must_fail env FAKE_LINES="1 fixup 2 fixup 4" \
+   git rebase -i HEAD~4 &&
+
+   : now there is a conflict, and comments in the commit message &&
+   git show HEAD >out &&
+   grep "This is a combination of" out &&
+
+   : skip and continue &&
+   git rebase --skip &&
+
+   : now the comments in the commit message should have been cleaned up &&
+   git show HEAD >out &&
+   ! grep "This is a combination of" out
+'
+
 test_expect_success 'setup rerere database' '
rm -fr .git/rebase-* &&
git reset --hard commit-new-file-F3-on-topic-branch &&
-- 
2.17.0.windows.1.15.gaa56ade3205




Re: [RFC PATCH 07/12] commit-graph: load a root tree from specific graph

2018-04-20 Thread Jakub Narebski
Derrick Stolee  writes:

> When lazy-loading a tree for a commit, it will be important to select
> the tree from a specific struct commit_graph. Create a new method that
> specifies the commit-graph file and use that in
> get_commit_tree_in_graph().
>
> Signed-off-by: Derrick Stolee 

Looks good to me.

> ---
>  commit-graph.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
[...]


[PATCH 0/3] rebase -i: avoid stale "# This is a combination of" in commit messages

2018-04-20 Thread Johannes Schindelin
Eric Sunshine pointed out that I had such a commit message in
https://public-inbox.org/git/CAPig+cRrS0_nYJJY=o6cbov630snqhpv5qgrqdd8mw-syzn...@mail.gmail.com/
and I went on a hunt to figure out how the heck this happened.

Turns out that if there is a fixup/squash chain where the *last* command
fails with merge conflicts, and we either --skip ahead or resolve the
conflict to a clean tree and then --continue, our code does not do a
final cleanup.

Contrary to my initial gut feeling, this bug was not introduced by my
rewrite in C of the core parts of rebase -i, but it looks to me as if
that bug was with us for a very long time (at least the --skip part).

The developer (read: user of rebase -i) in me says that we would want to
fast-track this, but the author of rebase -i in me says that we should
be cautious and cook this in `next` for a while.


Johannes Schindelin (3):
  rebase -i: demonstrate bug with fixup!/squash! commit messages
  sequencer: leave a tell-tale when a fixup/squash failed
  rebase --skip: clean up commit message after a failed fixup/squash

 sequencer.c| 57 --
 t/t3418-rebase-continue.sh | 21 ++
 2 files changed, 69 insertions(+), 9 deletions(-)


base-commit: fe0a9eaf31dd0c349ae4308498c33a5c3794b293
Published-As: 
https://github.com/dscho/git/releases/tag/clean-msg-after-fixup-continue-v1
Fetch-It-Via: git fetch https://github.com/dscho/git 
clean-msg-after-fixup-continue-v1
-- 
2.17.0.windows.1.15.gaa56ade3205



RFC: How should we handle un-deleted remote branches?

2018-04-20 Thread Ævar Arnfjörð Bjarmason
I removed a remote and its refs persisted. At first I thought this was a
bug but looking at add_branch_for_removal()'s "don't delete a branch if
another remote also uses it" logic it's intentional.

This goes very wrong if you do e.g.:

(
rm -rf /tmp/git &&
git clone --bare --mirror g...@github.com:git/git.git /tmp/git &&
cd /tmp/git &&
git remote add avar  g...@github.com:avar/git.git &&
git fetch avar &&
git remote remove avar &&
git for-each-ref|grep -c refs/remotes/avar
)

At this point you can't prune it since no remote config is left:

$ git remote prune avar
fatal: 'avar' does not appear to be a git repository
fatal: Could not read from remote repository.

But this is a possible work-around:

git init /tmp/empty.git
git remote add avar file:///tmp/empty.git
git remote prune avar
git remote remove avar

Or use update-ref -d to remove them:

for rr in $(git for-each-ref --format="%(refname)"|grep ^refs/remotes/avar/)
do
git update-ref -d $rr
done

I started to patch this, but I'm not sure what to do here. we could do
some combination of:

 0. Just document the current behavior and leave it.

 1. Dig further down to see what other remotes reference these refs, and
just ignore any refspecs that don't explicitly reference
refs/remotes//*.

I.e. isn't the intention here to preserve a case where you have two
URLs for the same effective remote, not whene you have something
like a --mirror refspec? Unfortunately I can't ask the original
author :(

 2. Warn about each ref we didn't delete, or at least warn saying
there's undeleted refs under refs/remotes//*.

 3. Make 'git remote remove --force-deletion ' (or whatever the
flag is called) be a thing. But unless we do the next item this
won't be useful.

 4. Make 'git remote prune ' work in cases where we don't have a
remote called  anymore, just falling back to deleting
refs/remotes/. In this case 'git remote remove
--force-deletion ' would also do the same thing.

In any case, the current behavior where we just leave this crap behind
without any explanation or an obvious way to delete them (can't use
git-branch -d, need update-ref -d) isn't nice.

I encountered this in the wild because GitLab will add an "upstream" ref
if you clone a repo, but if you stop using the remote to update it in
combination with their "geo" remote (which has --mirror refspecs) it'll
leave behind all these dead refs.


Re: [RFC PATCH 06/12] commit: force commit to parse from object database

2018-04-20 Thread Jakub Narebski
Derrick Stolee  writes:

> In anticipation of checking commit-graph file contents against the
> object database, create parse_commit_internal() to allow side-stepping
> the commit-graph file and parse directly from the object database.

Nitpick/Bikeshed painting: do we have any naming convention for such
functions (*_internal() here)?

>
> Due to the use of generation numbers, this method should not be called
> unless the intention is explicit in avoiding commits from the
> commit-graph file.

Looks good to me, except for some stray whitespace changes in the
patch.

>
> Signed-off-by: Derrick Stolee 
> ---
>  commit.c | 14 ++
>  commit.h |  1 +
>  2 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/commit.c b/commit.c
> index 9ef6f699bd..07752d8503 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -392,7 +392,8 @@ int parse_commit_buffer(struct commit *item, const void 
> *buffer, unsigned long s
>   return 0;
>  }
>  
> -int parse_commit_gently(struct commit *item, int quiet_on_missing)
> +

Stray empty line, though I think it may improve readability of the code
by using two empty lines between separate functions.

But to be consistent with the rest of the file, there shouldn't be this
extra empty line.

> +int parse_commit_internal(struct commit *item, int quiet_on_missing, int 
> use_commit_graph)
>  {
>   enum object_type type;
>   void *buffer;
> @@ -403,17 +404,17 @@ int parse_commit_gently(struct commit *item, int 
> quiet_on_missing)
>   return -1;
>   if (item->object.parsed)
>   return 0;
> - if (parse_commit_in_graph(item))
> + if (use_commit_graph && parse_commit_in_graph(item))
>   return 0;

All right.

>   buffer = read_sha1_file(item->object.oid.hash, , );
>   if (!buffer)
>   return quiet_on_missing ? -1 :
>   error("Could not read %s",
> -  oid_to_hex(>object.oid));
> + oid_to_hex(>object.oid));

Stray whitespace change (looks like spaces to tabs conversion).

>   if (type != OBJ_COMMIT) {
>   free(buffer);
>   return error("Object %s not a commit",
> -  oid_to_hex(>object.oid));
> + oid_to_hex(>object.oid));

Stray whitespace change (looks like spaces to tabs conversion).

>   }
>   ret = parse_commit_buffer(item, buffer, size, 0);
>   if (save_commit_buffer && !ret) {
> @@ -424,6 +425,11 @@ int parse_commit_gently(struct commit *item, int 
> quiet_on_missing)
>   return ret;
>  }
>  
> +int parse_commit_gently(struct commit *item, int quiet_on_missing)
> +{
> + return parse_commit_internal(item, quiet_on_missing, 1);
> +}

All right; if it is internal details of implementations, I don't mind
this slightly cryptic "1" as the value of last parameters.

> +
>  void parse_commit_or_die(struct commit *item)
>  {
>   if (parse_commit(item))
> diff --git a/commit.h b/commit.h
> index b5afde1ae9..5fde74fcd7 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -73,6 +73,7 @@ struct commit *lookup_commit_reference_by_name(const char 
> *name);
>  struct commit *lookup_commit_or_die(const struct object_id *oid, const char 
> *ref_name);
>  
>  int parse_commit_buffer(struct commit *item, const void *buffer, unsigned 
> long size, int check_graph);
> +int parse_commit_internal(struct commit *item, int quiet_on_missing, int 
> use_commit_graph);
>  int parse_commit_gently(struct commit *item, int quiet_on_missing);
>  static inline int parse_commit(struct commit *item)
>  {

All right.


[PATCH v1] perf/aggregate: tighten option parsing

2018-04-20 Thread Christian Couder
When passing an option '--foo' that it does not recognize, the
aggregate.perl script should die with an helpful error message
like:

  unknown option '--foo' at ./aggregate.perl line 80.

rather than:

  fatal: Needed a single revision
  rev-parse --verify --foo: command returned error: 128

While at it let's also prevent something like
'foo--sort-by=regression' to be handled as if
'--sort-by=regression' had been used.

Signed-off-by: Christian Couder 
---
 t/perf/aggregate.perl | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
index 48637ef64b..2cc9eb0ac5 100755
--- a/t/perf/aggregate.perl
+++ b/t/perf/aggregate.perl
@@ -46,7 +46,7 @@ while (scalar @ARGV) {
shift @ARGV;
next;
}
-   if ($arg =~ /--sort-by(?:=(.*))?/) {
+   if ($arg =~ /^--sort-by(?:=(.*))?$/) {
shift @ARGV;
if (defined $1) {
$sortby = $1;
@@ -76,6 +76,9 @@ while (scalar @ARGV) {
}
next;
}
+   if ($arg =~ /^--.+$/) {
+   die "unknown option '$arg'";
+   }
last if -f $arg or $arg eq "--";
if (! -d $arg) {
my $rev = Git::command_oneline(qw(rev-parse --verify), $arg);
-- 
2.17.0.257.g28b659db43



Re: [PATCH v2 1/7] replace: "libify" create_graft()

2018-04-20 Thread Jakub Narebski
Junio C Hamano  writes:
> Johannes Schindelin  writes:
>
>> It is quite convenient to simply die() in builtins, in the absence of
>> proper exception handling, because it allows us to just go belly up
>> without having to implement error handling chains.
>>
>> Of course, for reusable library functions, this is a big no-no, so we
>> (try to) restrict the usage of die() to one-shot commands, i.e. places
>> where we know that the caller does not want to, say, give the user a
>> useful high-level error message, i.e. a message that the function calling
>> die() could not possibly know.
>>
>> The problem with this reasoning is that sooner or later, pretty much all
>> useful functions will *need* to be libified: the more useful a function,
>> the more likely it is to be called from a call chain where the outer
>> function implements a high-level operation that needs to provide
>> additional advice to the user in case of failure.
>>
>> This is the case here: the create_graft() function is useful enough to be
>> called in a loop, say, in the upcoming patch to convert a graft file in
>> one fell swoop. Therefore, this function must not be allowed to die(), nor
>> any of its callees.
>
> All of the first three paragraphs are already widely known to the
> project, and I do not think you need to verbosely repeat it, if
> brevity demands it.

[taking the role of Advocatus Diaboli]

Well, this information (first three paragraphs) is maybe widely known to
the project, but do we have it explicitely documented somwehere, like
CodingGuidelines, or FAQ on some web site?

> One thing that the proposed log message for this step would be far
> more useful to say is that the current callers of create_graft() is
> fine with the behaviour change (i.e. prepared to act on an error
> return).

Right.

>
> Also it may want to say why the other die() we see in this function
> in the pre-context is OK.  I actually think it is not OK and should
> return error(not a valid object name) the same way (I suspect that
> this is mere omission/mistake, and you did not intend to leave it
> dying).  As long as the caller is prepared to deal with an error
> return, that is, which was the previous point.

[taking the role of Advocatus Diaboli]

Maybe they are errors that caller cannot handle sanely in any way?
If I understand things correctly the problem is with parameter to the
--graft option...

>
> FWIW, I fully agree with the goal of making this (or other pieces
> that would be useful if they were reusable) reusable.
>
>> Signed-off-by: Johannes Schindelin 
>> ---
>>  builtin/replace.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/builtin/replace.c b/builtin/replace.c
>> index 935647be6bd..43264f0998e 100644
>> --- a/builtin/replace.c
>> +++ b/builtin/replace.c
>> @@ -395,7 +395,9 @@ static int create_graft(int argc, const char **argv, int 
>> force)
>>  
>>  if (get_oid(old_ref, _oid) < 0)
>>  die(_("Not a valid object name: '%s'"), old_ref);
>> -commit = lookup_commit_or_die(_oid, old_ref);
>> +commit = lookup_commit_reference(_oid);
>> +if (!commit)
>> +return error(_("could not parse %s"), old_ref);
>>  
>>  buffer = get_commit_buffer(commit, );
>>  strbuf_add(, buffer, size);


  1   2   >