[PATCH] rebase docs: drop stray word in merge command description

2018-12-08 Thread Kyle Meyer
Delete a misplaced word introduced by caafecfcf1 (rebase
--rebase-merges: adjust man page for octopus support, 2018-03-09).

Signed-off-by: Kyle Meyer 
---
 Documentation/git-rebase.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index dff17b3178..2ee535fb23 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -979,7 +979,7 @@ when the merge operation did not even start), it is 
rescheduled immediately.
 
 At this time, the `merge` command will *always* use the `recursive`
 merge strategy for regular merges, and `octopus` for octopus merges,
-strategy, with no way to choose a different one. To work around
+with no way to choose a different one. To work around
 this, an `exec` command can be used to call `git merge` explicitly,
 using the fact that the labels are worktree-local refs (the ref
 `refs/rewritten/onto` would correspond to the label `onto`, for example).
-- 
2.19.2



Re: why doesn't "git reset" mention optional pathspec?

2018-12-08 Thread Duy Nguyen
On Sat, Dec 8, 2018 at 6:37 PM Robert P. J. Day  wrote:
>
> On Sat, 8 Dec 2018, Duy Nguyen wrote:
>
> > On Sat, Dec 8, 2018 at 6:32 PM Robert P. J. Day  
> > wrote:
> > >
> > > On Sat, 8 Dec 2018, Duy Nguyen wrote:
> > >
> > > > On Sat, Dec 8, 2018 at 5:08 PM Robert P. J. Day  
> > > > wrote:
> > > > >
> > > > >
> > > > >   from "man git-reset":
> > > > >
> > > > > SYNOPSIS
> > > > >   git reset [-q] [] [--] ...
> > > > >   git reset (--patch | -p) [] [--] [...]
> > > > >   git reset [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] 
> > > > > []
> > > > >
> > > > > oddly, the third form says nothing about possible "", even
> > > > > though i'm pretty sure they're valid in that third case (at least
> > > > > for "--mixed"). thoughts? is that just an oversight in the man
> > > > > page?
> > > >
> > > > --mixed prints a deprecation warning. I don't think it's worth
> > > > making the synopsis more complicated for that. All other modes
> > > > reject pathspec.
> > >
> > >   i just tested this, and i don't see a deprecation warning.
> >
> > Hmm.. maybe I misread the code. I just tried it
> >
> > $ ./git reset --mixed HEAD foo
> > warning: --mixed with paths is deprecated; use 'git reset -- ' 
> > instead.
>
>   weird ... i just tried this two ways, explicitly specifying
> "--mixed" and also without (which is the default mode, right?), and i
> got the deprecated message with the first but not the second. that
> seems ... odd.

Without --mixed, you're using the first form

git reset [-q] [] [--] ...

which accepts pathspec. If it's not clear, of course patches are welcome.
-- 
Duy


Re: why doesn't "git reset" mention optional pathspec?

2018-12-08 Thread Robert P. J. Day
On Sat, 8 Dec 2018, Duy Nguyen wrote:

> On Sat, Dec 8, 2018 at 6:32 PM Robert P. J. Day  wrote:
> >
> > On Sat, 8 Dec 2018, Duy Nguyen wrote:
> >
> > > On Sat, Dec 8, 2018 at 5:08 PM Robert P. J. Day  
> > > wrote:
> > > >
> > > >
> > > >   from "man git-reset":
> > > >
> > > > SYNOPSIS
> > > >   git reset [-q] [] [--] ...
> > > >   git reset (--patch | -p) [] [--] [...]
> > > >   git reset [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] 
> > > > []
> > > >
> > > > oddly, the third form says nothing about possible "", even
> > > > though i'm pretty sure they're valid in that third case (at least
> > > > for "--mixed"). thoughts? is that just an oversight in the man
> > > > page?
> > >
> > > --mixed prints a deprecation warning. I don't think it's worth
> > > making the synopsis more complicated for that. All other modes
> > > reject pathspec.
> >
> >   i just tested this, and i don't see a deprecation warning.
>
> Hmm.. maybe I misread the code. I just tried it
>
> $ ./git reset --mixed HEAD foo
> warning: --mixed with paths is deprecated; use 'git reset -- ' instead.

  my test:

  Changes to be committed:
(use "git reset HEAD ..." to unstage)

modified:   README.asc
modified:   Rakefile

  $ git reset -- README.asc
  Unstaged changes after reset:
  M README.asc
  $ git reset --mixed -- Rakefile
  warning: --mixed with paths is deprecated; use 'git reset -- ' instead.
  Unstaged changes after reset:
  M README.asc
  M Rakefile
  $

that definitely seems inconsistent.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
  http://crashcourse.ca/dokuwiki

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday




Re: why doesn't "git reset" mention optional pathspec?

2018-12-08 Thread Robert P. J. Day
On Sat, 8 Dec 2018, Duy Nguyen wrote:

> On Sat, Dec 8, 2018 at 6:32 PM Robert P. J. Day  wrote:
> >
> > On Sat, 8 Dec 2018, Duy Nguyen wrote:
> >
> > > On Sat, Dec 8, 2018 at 5:08 PM Robert P. J. Day  
> > > wrote:
> > > >
> > > >
> > > >   from "man git-reset":
> > > >
> > > > SYNOPSIS
> > > >   git reset [-q] [] [--] ...
> > > >   git reset (--patch | -p) [] [--] [...]
> > > >   git reset [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] 
> > > > []
> > > >
> > > > oddly, the third form says nothing about possible "", even
> > > > though i'm pretty sure they're valid in that third case (at least
> > > > for "--mixed"). thoughts? is that just an oversight in the man
> > > > page?
> > >
> > > --mixed prints a deprecation warning. I don't think it's worth
> > > making the synopsis more complicated for that. All other modes
> > > reject pathspec.
> >
> >   i just tested this, and i don't see a deprecation warning.
>
> Hmm.. maybe I misread the code. I just tried it
>
> $ ./git reset --mixed HEAD foo
> warning: --mixed with paths is deprecated; use 'git reset -- ' instead.

  weird ... i just tried this two ways, explicitly specifying
"--mixed" and also without (which is the default mode, right?), and i
got the deprecated message with the first but not the second. that
seems ... odd.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
  http://crashcourse.ca/dokuwiki

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: why doesn't "git reset" mention optional pathspec?

2018-12-08 Thread Duy Nguyen
On Sat, Dec 8, 2018 at 6:32 PM Robert P. J. Day  wrote:
>
> On Sat, 8 Dec 2018, Duy Nguyen wrote:
>
> > On Sat, Dec 8, 2018 at 5:08 PM Robert P. J. Day  
> > wrote:
> > >
> > >
> > >   from "man git-reset":
> > >
> > > SYNOPSIS
> > >   git reset [-q] [] [--] ...
> > >   git reset (--patch | -p) [] [--] [...]
> > >   git reset [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] 
> > > []
> > >
> > > oddly, the third form says nothing about possible "", even
> > > though i'm pretty sure they're valid in that third case (at least
> > > for "--mixed"). thoughts? is that just an oversight in the man
> > > page?
> >
> > --mixed prints a deprecation warning. I don't think it's worth
> > making the synopsis more complicated for that. All other modes
> > reject pathspec.
>
>   i just tested this, and i don't see a deprecation warning.

Hmm.. maybe I misread the code. I just tried it

$ ./git reset --mixed HEAD foo
warning: --mixed with paths is deprecated; use 'git reset -- ' instead.
-- 
Duy


Re: why doesn't "git reset" mention optional pathspec?

2018-12-08 Thread Robert P. J. Day
On Sat, 8 Dec 2018, Duy Nguyen wrote:

> On Sat, Dec 8, 2018 at 5:08 PM Robert P. J. Day  wrote:
> >
> >
> >   from "man git-reset":
> >
> > SYNOPSIS
> >   git reset [-q] [] [--] ...
> >   git reset (--patch | -p) [] [--] [...]
> >   git reset [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] 
> > []
> >
> > oddly, the third form says nothing about possible "", even
> > though i'm pretty sure they're valid in that third case (at least
> > for "--mixed"). thoughts? is that just an oversight in the man
> > page?
>
> --mixed prints a deprecation warning. I don't think it's worth
> making the synopsis more complicated for that. All other modes
> reject pathspec.

  i just tested this, and i don't see a deprecation warning.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
  http://crashcourse.ca/dokuwiki

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: why doesn't "git reset" mention optional pathspec?

2018-12-08 Thread Duy Nguyen
On Sat, Dec 8, 2018 at 5:08 PM Robert P. J. Day  wrote:
>
>
>   from "man git-reset":
>
> SYNOPSIS
>   git reset [-q] [] [--] ...
>   git reset (--patch | -p) [] [--] [...]
>   git reset [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] 
> []
>
> oddly, the third form says nothing about possible "", even
> though i'm pretty sure they're valid in that third case (at least for
> "--mixed"). thoughts? is that just an oversight in the man page?

--mixed prints a deprecation warning. I don't think it's worth making
the synopsis more complicated for that. All other modes reject
pathspec.
-- 
Duy


[PATCH v4 4/7] pretty: allow showing specific trailers

2018-12-08 Thread Anders Waldenborg
Adds a new "key=X" option to "%(trailers)" which will cause it to only
print trailer lines which match any of the specified keys.

Signed-off-by: Anders Waldenborg 
---
 Documentation/pretty-formats.txt |  8 +
 pretty.c | 47 ++---
 t/t4205-log-pretty-formats.sh| 51 
 trailer.c| 10 ---
 trailer.h|  2 ++
 5 files changed, 110 insertions(+), 8 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index d33b072eb2..d6add831c0 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -225,6 +225,14 @@ endif::git-rev-list[]
   linkgit:git-interpret-trailers[1]. The
   `trailers` string may be followed by a colon
   and zero or more comma-separated options:
+** 'key=': only show trailers with specified key. Matching is done
+   case-insensitively and trailing colon is optional. If option is
+   given multiple times trailer lines matching any of the keys are
+   shown. This option automatically enables the `only` option so that
+   non-trailer lines in the trailer block are hidden. If that is not
+   desired it can be disabled with `only=false`.  E.g.,
+   `%(trailers:key=Reviewed-by)` shows trailer lines with key
+   `Reviewed-by`.
 ** 'only[=val]': select whether non-trailer lines from the trailer
block should be included. The `only` keyword may optionally be
followed by an equal sign and one of `true`, `on`, `yes` to omit or
diff --git a/pretty.c b/pretty.c
index 07e6c0..541a553ccc 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1056,13 +1056,20 @@ static size_t parse_padding_placeholder(struct strbuf 
*sb,
return 0;
 }
 
-static int match_placeholder_arg(const char *to_parse, const char *candidate,
-const char **end)
+static int match_placeholder_arg_value(const char *to_parse, const char 
*candidate,
+  const char **end, const char 
**valuestart, size_t *valuelen)
 {
const char *p;
 
if (!(skip_prefix(to_parse, candidate, )))
return 0;
+   if (valuestart) {
+   if (*p != '=')
+   return 0;
+   *valuestart = p + 1;
+   *valuelen = strcspn(*valuestart, ",)");
+   p = *valuestart + *valuelen;
+   }
if (*p == ',') {
*end = p + 1;
return 1;
@@ -1074,6 +1081,12 @@ static int match_placeholder_arg(const char *to_parse, 
const char *candidate,
return 0;
 }
 
+static int match_placeholder_arg(const char *to_parse, const char *candidate,
+const char **end)
+{
+   return match_placeholder_arg_value(to_parse, candidate, end, NULL, 
NULL);
+}
+
 static int match_placeholder_bool_arg(const char *to_parse, const char 
*candidate,
  const char **end, int *val)
 {
@@ -1095,7 +1108,19 @@ static int match_placeholder_bool_arg(const char 
*to_parse, const char *candidat
*val = 1;
return 1;
}
+   return 0;
+}
 
+static int format_trailer_match_cb(const struct strbuf *key, void *ud)
+{
+   const struct string_list *list = ud;
+   const struct string_list_item *item;
+
+   for_each_string_list_item (item, list) {
+   if (key->len == (uintptr_t)item->util &&
+   !strncasecmp (item->string, key->buf, key->len))
+   return 1;
+   }
return 0;
 }
 
@@ -1337,6 +1362,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
 
if (skip_prefix(placeholder, "(trailers", )) {
struct process_trailer_options opts = 
PROCESS_TRAILER_OPTIONS_INIT;
+   struct string_list filter_list = STRING_LIST_INIT_NODUP;
size_t ret = 0;
 
opts.no_divider = 1;
@@ -1344,8 +1370,20 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
if (*arg == ':') {
arg++;
for (;;) {
-   if (!match_placeholder_bool_arg(arg, "only", 
, _trailers) &&
-   !match_placeholder_bool_arg(arg, "unfold", 
, ))
+   const char *argval;
+   size_t arglen;
+
+   if (match_placeholder_arg_value(arg, "key", 
, , )) {
+   uintptr_t len = arglen;
+   if (len && argval[len - 1] == ':')
+   len--;
+   string_list_append(_list, 
argval)->util = (char *)len;
+
+   opts.filter = format_trailer_match_cb;
+   

[PATCH v4 3/7] pretty: single return path in %(trailers) handling

2018-12-08 Thread Anders Waldenborg
No functional change intended.

This change may not seem useful on its own, but upcoming commits will do
memory allocation in there, and a single return path makes deallocation
easier.

Signed-off-by: Anders Waldenborg 
---
 pretty.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/pretty.c b/pretty.c
index 26efdba73a..07e6c0 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1337,6 +1337,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
 
if (skip_prefix(placeholder, "(trailers", )) {
struct process_trailer_options opts = 
PROCESS_TRAILER_OPTIONS_INIT;
+   size_t ret = 0;
 
opts.no_divider = 1;
 
@@ -1350,8 +1351,9 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
}
if (*arg == ')') {
format_trailers_from_commit(sb, msg + c->subject_off, 
);
-   return arg - placeholder + 1;
+   ret = arg - placeholder + 1;
}
+   return ret;
}
 
return 0;   /* unknown placeholder */
-- 
2.17.1



[PATCH v4 0/7] %(trailers) improvements in pretty format

2018-12-08 Thread Anders Waldenborg
Updated since v3:
 * multiple 'key=' matches any
 * allow overriding implicit 'only' when using key
 * minor grammar and spelling fixes
 * documentation restructuring
 * Helper functions for parsing options

Anders Waldenborg (7):
  doc: group pretty-format.txt placeholders descriptions
  pretty: allow %(trailers) options with explicit value
  pretty: single return path in %(trailers) handling
  pretty: allow showing specific trailers
  pretty: add support for "valueonly" option in %(trailers)
  strbuf: separate callback for strbuf_expand:ing literals
  pretty: add support for separator option in %(trailers)

 Documentation/pretty-formats.txt | 260 ++-
 pretty.c | 104 ++---
 strbuf.c |  21 +++
 strbuf.h |   8 +
 t/t4205-log-pretty-formats.sh| 111 +
 trailer.c|  25 ++-
 trailer.h|   4 +
 7 files changed, 400 insertions(+), 133 deletions(-)

-- 
2.17.1



[PATCH v4 6/7] strbuf: separate callback for strbuf_expand:ing literals

2018-12-08 Thread Anders Waldenborg
Expanding '%n' and '%xNN' is generic functionality, so extract that from
the pretty.c formatter into a callback that can be reused.

No functional change intended

Signed-off-by: Anders Waldenborg 
---
 pretty.c | 16 +---
 strbuf.c | 21 +
 strbuf.h |  8 
 3 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/pretty.c b/pretty.c
index c508357606..50d0b5830d 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1133,9 +1133,13 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
const char *msg = c->message;
struct commit_list *p;
const char *arg;
-   int ch;
+   size_t res;
 
/* these are independent of the commit */
+   res = strbuf_expand_literal_cb(sb, placeholder, NULL);
+   if (res)
+   return res;
+
switch (placeholder[0]) {
case 'C':
if (starts_with(placeholder + 1, "(auto)")) {
@@ -1154,16 +1158,6 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
 */
return ret;
}
-   case 'n':   /* newline */
-   strbuf_addch(sb, '\n');
-   return 1;
-   case 'x':
-   /* %x00 == NUL, %x0a == LF, etc. */
-   ch = hex2chr(placeholder + 1);
-   if (ch < 0)
-   return 0;
-   strbuf_addch(sb, ch);
-   return 3;
case 'w':
if (placeholder[1] == '(') {
unsigned long width = 0, indent1 = 0, indent2 = 0;
diff --git a/strbuf.c b/strbuf.c
index f6a6cf78b9..78eecd29f7 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -380,6 +380,27 @@ void strbuf_expand(struct strbuf *sb, const char *format, 
expand_fn_t fn,
}
 }
 
+size_t strbuf_expand_literal_cb(struct strbuf *sb,
+   const char *placeholder,
+   void *context)
+{
+   int ch;
+
+   switch (placeholder[0]) {
+   case 'n':   /* newline */
+   strbuf_addch(sb, '\n');
+   return 1;
+   case 'x':
+   /* %x00 == NUL, %x0a == LF, etc. */
+   ch = hex2chr(placeholder + 1);
+   if (ch < 0)
+   return 0;
+   strbuf_addch(sb, ch);
+   return 3;
+   }
+   return 0;
+}
+
 size_t strbuf_expand_dict_cb(struct strbuf *sb, const char *placeholder,
void *context)
 {
diff --git a/strbuf.h b/strbuf.h
index fc40873b65..52e44c9ab8 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -320,6 +320,14 @@ void strbuf_expand(struct strbuf *sb,
   expand_fn_t fn,
   void *context);
 
+/**
+ * Used as callback for `strbuf_expand` to only expand literals
+ * (i.e. %n and %xNN). The context argument is ignored.
+ */
+size_t strbuf_expand_literal_cb(struct strbuf *sb,
+   const char *placeholder,
+   void *context);
+
 /**
  * Used as callback for `strbuf_expand()`, expects an array of
  * struct strbuf_expand_dict_entry as context, i.e. pairs of
-- 
2.17.1



[PATCH v4 5/7] pretty: add support for "valueonly" option in %(trailers)

2018-12-08 Thread Anders Waldenborg
With the new "key=" option to %(trailers) it often makes little sense to
show the key, as it by definition already is knows which trailer is
printed there. This new "valueonly" option makes it omit the key when
printing trailers.

E.g.:
 $ git show -s --pretty='%s%n%(trailers:key=Signed-off-by,valueonly)' 88182
will show:
 > upload-pack: fix broken if/else chain in config callback
 > Jeff King 
 > Junio C Hamano 

Signed-off-by: Anders Waldenborg 
---
 Documentation/pretty-formats.txt | 2 ++
 pretty.c | 3 ++-
 t/t4205-log-pretty-formats.sh| 6 ++
 trailer.c| 6 --
 trailer.h| 1 +
 5 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index d6add831c0..a920dd15b1 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -243,6 +243,8 @@ endif::git-rev-list[]
option was given. In same way as to for `only` it can be followed
by an equal sign and explicit value. E.g.,
`%(trailers:only,unfold=true)` unfolds and shows all trailer lines.
+** 'valueonly[=val]': skip over the key part of the trailer line and only
+   show the value part. Also this optionally allows explicit value.
 
 NOTE: Some placeholders may depend on other options given to the
 revision traversal engine. For example, the `%g*` reflog options will
diff --git a/pretty.c b/pretty.c
index 541a553ccc..c508357606 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1383,7 +1383,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
opts.filter_data = _list;
opts.only_trailers = 1;
} else if (!match_placeholder_bool_arg(arg, 
"only", , _trailers) &&
-  !match_placeholder_bool_arg(arg, 
"unfold", , ))
+  !match_placeholder_bool_arg(arg, 
"unfold", , ) &&
+  !match_placeholder_bool_arg(arg, 
"valueonly", , _only))
break;
}
}
diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
index 54239290cf..22336c5485 100755
--- a/t/t4205-log-pretty-formats.sh
+++ b/t/t4205-log-pretty-formats.sh
@@ -667,6 +667,12 @@ test_expect_success 'pretty format 
%(trailers:key=foo,only=no) also includes non
test_cmp expect actual
 '
 
+test_expect_success '%(trailers:key=foo,valueonly) shows only value' '
+   git log --no-walk --pretty="format:%(trailers:key=Acked-by,valueonly)" 
>actual &&
+   echo "A U Thor " >expect &&
+   test_cmp expect actual
+'
+
 test_expect_success 'trailer parsing not fooled by --- line' '
git commit --allow-empty -F - <<-\EOF &&
this is the subject
diff --git a/trailer.c b/trailer.c
index d6da555cd7..d0d9e91631 100644
--- a/trailer.c
+++ b/trailer.c
@@ -1150,8 +1150,10 @@ static void format_trailer_info(struct strbuf *out,
if (!opts->filter || opts->filter(, 
opts->filter_data)) {
if (opts->unfold)
unfold_value();
-
-   strbuf_addf(out, "%s: %s\n", tok.buf, val.buf);
+   if (!opts->value_only)
+   strbuf_addf(out, "%s: ", tok.buf);
+   strbuf_addbuf(out, );
+   strbuf_addch(out, '\n');
}
strbuf_release();
strbuf_release();
diff --git a/trailer.h b/trailer.h
index 5255b676de..06d417fe93 100644
--- a/trailer.h
+++ b/trailer.h
@@ -72,6 +72,7 @@ struct process_trailer_options {
int only_input;
int unfold;
int no_divider;
+   int value_only;
int (*filter)(const struct strbuf *, void *);
void *filter_data;
 };
-- 
2.17.1



[PATCH v4 1/7] doc: group pretty-format.txt placeholders descriptions

2018-12-08 Thread Anders Waldenborg
The placeholders can be grouped into three kinds:
 * literals
 * affecting formatting of later placeholders
 * expanding to information in commit

Also change the list to a definition list (using '::')

Signed-off-by: Anders Waldenborg 
---
 Documentation/pretty-formats.txt | 235 ---
 1 file changed, 125 insertions(+), 110 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 417b638cd8..86d804fe97 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -102,118 +102,133 @@ The title was >>t4119: test autocomputing -p for 
traditional diff input.<<
 +
 The placeholders are:
 
-- '%H': commit hash
-- '%h': abbreviated commit hash
-- '%T': tree hash
-- '%t': abbreviated tree hash
-- '%P': parent hashes
-- '%p': abbreviated parent hashes
-- '%an': author name
-- '%aN': author name (respecting .mailmap, see linkgit:git-shortlog[1]
-  or linkgit:git-blame[1])
-- '%ae': author email
-- '%aE': author email (respecting .mailmap, see
-  linkgit:git-shortlog[1] or linkgit:git-blame[1])
-- '%ad': author date (format respects --date= option)
-- '%aD': author date, RFC2822 style
-- '%ar': author date, relative
-- '%at': author date, UNIX timestamp
-- '%ai': author date, ISO 8601-like format
-- '%aI': author date, strict ISO 8601 format
-- '%cn': committer name
-- '%cN': committer name (respecting .mailmap, see
-  linkgit:git-shortlog[1] or linkgit:git-blame[1])
-- '%ce': committer email
-- '%cE': committer email (respecting .mailmap, see
-  linkgit:git-shortlog[1] or linkgit:git-blame[1])
-- '%cd': committer date (format respects --date= option)
-- '%cD': committer date, RFC2822 style
-- '%cr': committer date, relative
-- '%ct': committer date, UNIX timestamp
-- '%ci': committer date, ISO 8601-like format
-- '%cI': committer date, strict ISO 8601 format
-- '%d': ref names, like the --decorate option of linkgit:git-log[1]
-- '%D': ref names without the " (", ")" wrapping.
-- '%e': encoding
-- '%s': subject
-- '%f': sanitized subject line, suitable for a filename
-- '%b': body
-- '%B': raw body (unwrapped subject and body)
+- Placeholders that expand to a single literal character:
+'%n':: newline
+'%%':: a raw '%'
+'%x00':: print a byte from a hex code
+
+- Placeholders that affect formatting of later placeholders:
+'%Cred':: switch color to red
+'%Cgreen':: switch color to green
+'%Cblue':: switch color to blue
+'%Creset':: reset color
+'%C(...)':: color specification, as described under Values in the
+"CONFIGURATION FILE" section of linkgit:git-config[1].  By
+default, colors are shown only when enabled for log output
+(by `color.diff`, `color.ui`, or `--color`, and respecting
+the `auto` settings of the former if we are going to a
+terminal). `%C(auto,...)` is accepted as a historical
+synonym for the default (e.g., `%C(auto,red)`). Specifying
+`%C(always,...) will show the colors even when color is
+not otherwise enabled (though consider just using
+`--color=always` to enable color for the whole output,
+including this format and anything else git might color).
+`auto` alone (i.e. `%C(auto)`) will turn on auto coloring
+on the next placeholders until the color is switched
+again.
+'%m':: left (`<`), right (`>`) or boundary (`-`) mark
+'%w([[,[,]]])':: switch line wrapping, like the -w option of
+linkgit:git-shortlog[1].
+'%<([,trunc|ltrunc|mtrunc])':: make the next placeholder take at
+  least N columns, padding spaces on
+  the right if necessary.  Optionally
+  truncate at the beginning (ltrunc),
+  the middle (mtrunc) or the end
+  (trunc) if the output is longer than
+  N columns.  Note that truncating
+  only works correctly with N >= 2.
+'%<|()':: make the next placeholder take at least until Nth
+ columns, padding spaces on the right if necessary
+'%>()', '%>|()':: similar to '%<()', '%<|()' respectively,
+but padding spaces on the left
+'%>>()', '%>>|()':: similar to '%>()', '%>|()'
+  respectively, except that if the next
+  placeholder takes more spaces than given and
+  there are spaces on its left, use those
+  spaces
+'%><()', '%><|()':: similar to '%<()', '%<|()'
+  respectively, but padding both sides
+  (i.e. the text is centered)
+
+- Placeholders that expand to information extracted from the commit:
+'%H':: commit hash
+'%h':: abbreviated commit hash
+'%T':: tree hash
+'%t':: abbreviated tree hash
+'%P':: parent 

[PATCH v4 2/7] pretty: allow %(trailers) options with explicit value

2018-12-08 Thread Anders Waldenborg
In addition to old %(trailers:only) it is now allowed to write
%(trailers:only=yes)

By itself this only gives (the not quite so useful) possibility to have
users change their mind in the middle of a formatting
string (%(trailers:only=true,only=false)). However, it gives users the
opportunity to override defaults from future options.

Signed-off-by: Anders Waldenborg 
---
 Documentation/pretty-formats.txt | 14 ++
 pretty.c | 32 +++-
 t/t4205-log-pretty-formats.sh| 18 ++
 3 files changed, 55 insertions(+), 9 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 86d804fe97..d33b072eb2 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -225,10 +225,16 @@ endif::git-rev-list[]
   linkgit:git-interpret-trailers[1]. The
   `trailers` string may be followed by a colon
   and zero or more comma-separated options:
-** 'only': omit non-trailer lines from the trailer block.
-** 'unfold': make it behave as if interpret-trailer's `--unfold`
-   option was given. E.g., `%(trailers:only,unfold)` unfolds and
-   shows all trailer lines.
+** 'only[=val]': select whether non-trailer lines from the trailer
+   block should be included. The `only` keyword may optionally be
+   followed by an equal sign and one of `true`, `on`, `yes` to omit or
+   `false`, `off`, `no` to show the non-trailer lines. If option is
+   given without value it is enabled. If given multiple times the last
+   value is used.
+** 'unfold[=val]': make it behave as if interpret-trailer's `--unfold`
+   option was given. In same way as to for `only` it can be followed
+   by an equal sign and explicit value. E.g.,
+   `%(trailers:only,unfold=true)` unfolds and shows all trailer lines.
 
 NOTE: Some placeholders may depend on other options given to the
 revision traversal engine. For example, the `%g*` reflog options will
diff --git a/pretty.c b/pretty.c
index b83a3ecd23..26efdba73a 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1074,6 +1074,31 @@ static int match_placeholder_arg(const char *to_parse, 
const char *candidate,
return 0;
 }
 
+static int match_placeholder_bool_arg(const char *to_parse, const char 
*candidate,
+ const char **end, int *val)
+{
+   const char *p;
+   if (!skip_prefix(to_parse, candidate, ))
+   return 0;
+
+   if (match_placeholder_arg(p, "=no", end) ||
+   match_placeholder_arg(p, "=off", end) ||
+   match_placeholder_arg(p, "=false", end)) {
+   *val = 0;
+   return 1;
+   }
+
+   if (match_placeholder_arg(p, "", end) ||
+   match_placeholder_arg(p, "=yes", end) ||
+   match_placeholder_arg(p, "=on", end) ||
+   match_placeholder_arg(p, "=true", end)) {
+   *val = 1;
+   return 1;
+   }
+
+   return 0;
+}
+
 static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
const char *placeholder,
void *context)
@@ -1318,11 +1343,8 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
if (*arg == ':') {
arg++;
for (;;) {
-   if (match_placeholder_arg(arg, "only", ))
-   opts.only_trailers = 1;
-   else if (match_placeholder_arg(arg, "unfold", 
))
-   opts.unfold = 1;
-   else
+   if (!match_placeholder_bool_arg(arg, "only", 
, _trailers) &&
+   !match_placeholder_bool_arg(arg, "unfold", 
, ))
break;
}
}
diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
index 978a8a66ff..63730a4ec0 100755
--- a/t/t4205-log-pretty-formats.sh
+++ b/t/t4205-log-pretty-formats.sh
@@ -578,6 +578,24 @@ test_expect_success '%(trailers:only) shows only "key: 
value" trailers' '
test_cmp expect actual
 '
 
+test_expect_success '%(trailers:only=yes) shows only "key: value" trailers' '
+   git log --no-walk --pretty=format:"%(trailers:only=yes)" >actual &&
+   grep -v patch.description expect &&
+   test_cmp expect actual
+'
+
+test_expect_success '%(trailers:only=no) shows all trailers' '
+   git log --no-walk --pretty=format:"%(trailers:only=no)" >actual &&
+   cat trailers >expect &&
+   test_cmp expect actual
+'
+
+test_expect_success '%(trailers:only=no,only=true) shows only "key: value" 
trailers' '
+   git log --no-walk --pretty=format:"%(trailers:only=yes)" >actual &&
+   grep -v patch.description expect &&
+   test_cmp expect actual
+'
+
 

[PATCH v4 7/7] pretty: add support for separator option in %(trailers)

2018-12-08 Thread Anders Waldenborg
By default trailer lines are terminated by linebreaks ('\n'). By
specifying the new 'separator' option they will instead be separated by
user provided string and have separator semantics rather than terminator
semantics. The separator string can contain the literal formatting codes
%n and %xNN allowing it to be things that are otherwise hard to type
such as %x00, or comma and end-parenthesis which would break parsing.

E.g:
 $ git log --pretty='%(trailers:key=Reviewed-by,valueonly,separator=%x00)'

Signed-off-by: Anders Waldenborg 
---
 Documentation/pretty-formats.txt |  9 
 pretty.c | 10 +
 t/t4205-log-pretty-formats.sh| 36 
 trailer.c| 15 +++--
 trailer.h|  1 +
 5 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index a920dd15b1..ce087dee80 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -239,6 +239,15 @@ endif::git-rev-list[]
`false`, `off`, `no` to show the non-trailer lines. If option is
given without value it is enabled. If given multiple times the last
value is used.
+** 'separator=': specify a separator inserted between trailer
+   lines. When this option is not given each trailer line is
+   terminated with a line feed character. The string SEP may contain
+   the literal formatting codes described above. To use comma as
+   separator one must use `%x2C` as it would otherwise be parsed as
+   next option. If separator option is given multiple times only the
+   last one is used. E.g., `%(trailers:key=Ticket,separator=%x2C )`
+   shows all trailer lines whose key is "Ticket" separated by a comma
+   and a space.
 ** 'unfold[=val]': make it behave as if interpret-trailer's `--unfold`
option was given. In same way as to for `only` it can be followed
by an equal sign and explicit value. E.g.,
diff --git a/pretty.c b/pretty.c
index 50d0b5830d..c7609493ee 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1357,6 +1357,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
if (skip_prefix(placeholder, "(trailers", )) {
struct process_trailer_options opts = 
PROCESS_TRAILER_OPTIONS_INIT;
struct string_list filter_list = STRING_LIST_INIT_NODUP;
+   struct strbuf sepbuf = STRBUF_INIT;
size_t ret = 0;
 
opts.no_divider = 1;
@@ -1376,6 +1377,14 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
opts.filter = format_trailer_match_cb;
opts.filter_data = _list;
opts.only_trailers = 1;
+   } else if (match_placeholder_arg_value(arg, 
"separator", , , )) {
+   char *fmt;
+
+   strbuf_reset();
+   fmt = xstrndup(argval, arglen);
+   strbuf_expand(, fmt, 
strbuf_expand_literal_cb, NULL);
+   free(fmt);
+   opts.separator = 
} else if (!match_placeholder_bool_arg(arg, 
"only", , _trailers) &&
   !match_placeholder_bool_arg(arg, 
"unfold", , ) &&
   !match_placeholder_bool_arg(arg, 
"valueonly", , _only))
@@ -1387,6 +1396,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in 
UTF-8 */
ret = arg - placeholder + 1;
}
string_list_clear (_list, 0);
+   strbuf_release();
return ret;
}
 
diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
index 22336c5485..282369dac0 100755
--- a/t/t4205-log-pretty-formats.sh
+++ b/t/t4205-log-pretty-formats.sh
@@ -673,6 +673,42 @@ test_expect_success '%(trailers:key=foo,valueonly) shows 
only value' '
test_cmp expect actual
 '
 
+test_expect_success 'pretty format %(trailers:separator) changes separator' '
+   git log --no-walk --pretty=format:"X%(trailers:separator=%x00,unfold)X" 
>actual &&
+   printf "XSigned-off-by: A U Thor \0Acked-by: A U 
Thor \0[ v2 updated patch description ]\0Signed-off-by: A U 
Thor X" >expect &&
+   test_cmp expect actual
+'
+
+test_expect_success 'pretty format %(trailers) combining 
separator/key/valueonly' '
+   git commit --allow-empty -F - <<-\EOF &&
+   Important fix
+
+   The fix is explained here
+
+   Closes: #1234
+   EOF
+
+   git commit --allow-empty -F - <<-\EOF &&
+   Another fix
+
+   The fix is explained here
+
+   Closes: #567
+   Closes: #890
+   EOF
+
+   git commit --allow-empty -F - <<-\EOF &&
+   Does 

Re: [PATCH v3 1/1] git clone C:\cygwin\home\USER\repo' is working (again)

2018-12-08 Thread Steven Penny
On Sat, Dec 8, 2018 at 9:11 AM wrote:
> Changes since V2:

latest patch still fixes original issue - thanks

> - Settled on a better name:
>   The common code is in compat/win32/path-utils.c/h
>   [...]
> - The "DOS" moniker is still used for 2 reasons:
>   Windows inherited the "drive letter" concept from DOS,
>   and everybody (tm) familar with the code and the path handling
>   in Git is used to that wording.
>   Even if there was a better name, it needed to be addressed
>   in a patch series different from this one.
>   Here I want to fix a reported regression.

i still disagree with this - but i understand if naming argument is out of scope
for thread

> And, before any cleanup is done, I sould like to ask if anybody
> can build the code with VS and confirm that it works, please ?

sorry but i am decidedly *not* interested in doing this. i use cygwin
specifically so that i can avoid VS. hopefully someone else will be able to
test. cheers


Re: "git add -p" versus "git add -i", followed by "p"

2018-12-08 Thread Robert P. J. Day
On Sun, 2 Dec 2018, Duy Nguyen wrote:

> On Sun, Dec 2, 2018 at 6:05 PM Robert P. J. Day  wrote:
> > >   Patch update>> 2
> > >  staged unstaged path
> > >   * 1:unchanged+1/-0 README.md
> > >   * 2:unchanged+1/-0 contrib/README
> > > 3:unchanged+1/-0 t/README
> > >   Patch update>>
> > >
> > > Here I hit enter.  Did you?
> >
> >   perhaps i'm just not seeing it, but from "man git-add", it
> > doesn't seem obvious that you would first select the files to work
> > with, then hit a simple CR to get into actual patch mode.
>
> I think it's the same procedure as the "update" step, which
> describes this in more detail. I agree that the "patch" section does
> not make this obvious.

  when i get a chance, i'll try to reword it.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
  http://crashcourse.ca/dokuwiki

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



why doesn't "git reset" mention optional pathspec?

2018-12-08 Thread Robert P. J. Day


  from "man git-reset":

SYNOPSIS
  git reset [-q] [] [--] ...
  git reset (--patch | -p) [] [--] [...]
  git reset [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] []

oddly, the third form says nothing about possible "", even
though i'm pretty sure they're valid in that third case (at least for
"--mixed"). thoughts? is that just an oversight in the man page?

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
  http://crashcourse.ca/dokuwiki

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



[wishlist] submodule.update config

2018-12-08 Thread Yaroslav Halchenko
Relates (but orthogonal) to my other thread

  [wishlist] git submodule update --reset-hard

ATM, it possible to specify per submodule update strategy via
configuration variable submodule.SUBMODULE.update where SUBMODULE is the name
of the corresponding submodule.  But I see no way to specify default update
strategy for all submodules.

>From our conversation in that other thread  I have discovered to myself about
existence of  submodule.recurse  configuration, and there seems to be a few
more (.fetchJobs, .active) where e.g. .active seems to complement per-submodule
submodule.*.active:

yoh@debian:~/proj/misc/git$ git grep '[^.]submodule\.[a-z]' -- 
Documentation/
Documentation/RelNotes/2.14.0.txt: * Many commands learned to pay 
attention to submodule.recurse
Documentation/RelNotes/2.15.0.txt: * "git -c submodule.recurse=yes 
pull" did not work as if the
Documentation/config.txt:include::config/submodule.txt[]
Documentation/config/submodule.txt: update'. If neither 
submodule..active or submodule.active are
Documentation/config/submodule.txt: interact with submodules; 
settings like `submodule.active`
Documentation/config/submodule.txt: submodule.active config option. 
See linkgit:gitsubmodules[7] for
Documentation/config/submodule.txt: as computed via 
`submodule.alternateLocation`. Possible values are
Documentation/git-clone.txt:of multiple entries.  The resulting 
clone has `submodule.active` set to
Documentation/git-clone.txt:Defaults to the `submodule.fetchJobs` 
option.
Documentation/git-submodule.txt:If no path is specified and 
submodule.active has been configured, submodules
Documentation/git-submodule.txt:Defaults to the 
`submodule.fetchJobs` option.
Documentation/gitsubmodules.txt:`submodule.foo.path = path/to/bar`.
Documentation/gitsubmodules.txt:The section `submodule.foo.*` in the 
`.gitmodules` file gives additional
Documentation/gitsubmodules.txt:hints to Git's porcelain layer. For 
example, the `submodule.foo.url`
Documentation/gitsubmodules.txt:  b. if the submodule's path matches 
the pathspec in `submodule.active`
Documentation/gitsubmodules.txt:submodule's path is excluded in the 
pathspec in `submodule.active`, the
Documentation/gitsubmodules.txt:  git config --global submodule.recurse 
true
Documentation/gitsubmodules.txt:your working tree. Alternatively you 
can set 'submodule.recurse' to have
Documentation/technical/api-config.txt:if 
(!git_configset_get_bool(gm_config, "submodule.frotz.ignore", )) {
Documentation/technical/http-protocol.txt:  $GIT_URL: 
http://example.com/git/repo.git/path/submodule.git
Documentation/technical/http-protocol.txt:  URL request:  
http://example.com/git/repo.git/path/submodule.git/info/refs

I wondered, if you think it would be sensible to also add of
submodule.update which would be considered before submodule.SUBMODULE.update
variable possibly defined per submodule.  That would be more inline with desire
to use any of the --merge, --rebase (and hopefully soon --reset-hard)
strategies specified as an option for submodule update, where no per-submodule
handling  is happening.

Thanks in advance for the consideration!
-- 
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834   Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik


[PATCH v3 1/1] git clone C:\cygwin\home\USER\repo' is working (again)

2018-12-08 Thread tboegi
From: Torsten Bögershausen 

A regression for cygwin users was introduced with commit 05b458c,
 "real_path: resolve symlinks by hand".

In the the commit message we read:
  The current implementation of real_path uses chdir() in order to resolve
symlinks.  Unfortunately this isn't thread-safe as chdir() affects a
  process as a whole...

The old (and non-thread-save) OS calls chdir()/pwd() had been
replaced by a string operation.
The cygwin layer "knows" that "C:\cygwin" is an absolute path,
but the new string operation does not.

"git clone  C:\cygwin\home\USER\repo" fails like this:
fatal: Invalid path '/home/USER/repo/C:\cygwin\home\USER\repo'

The solution is to implement has_dos_drive_prefix(), skip_dos_drive_prefix()
is_dir_sep(), offset_1st_component() and convert_slashes() for cygwin
in the same way as it is done in 'Git for Windows' in compat/mingw.[ch]

Extract the needed code into compat/win32/path-utils.[ch] and use it
for cygwin as well.

Reported-by: Steven Penny 
Helped-by: Johannes Schindelin 
Signed-off-by: Torsten Bögershausen 
---
Changes since V2:
- Settled on a better name:
  The common code is in compat/win32/path-utils.c/h
- Skip the 2 patches which "only" do a cleanup (for a moment)
  put those cleanups onto the "todo stack".
- The "DOS" moniker is still used for 2 reasons:
  Windows inherited the "drive letter" concept from DOS,
  and everybody (tm) familar with the code and the path handling
  in Git is used to that wording.
  Even if there was a better name, it needed to be addressed
  in a patch series different from this one.
  Here I want to fix a reported regression.
   
And, before any cleanup is done, I sould like to ask if anybody
can build the code with VS and confirm that it works, please ?

Thanks for the reviews, testing and comment.

compat/cygwin.c   | 19 ---
 compat/cygwin.h   |  2 --
 compat/mingw.c| 29 +
 compat/mingw.h| 20 
 compat/win32/path-utils.c | 28 
 compat/win32/path-utils.h | 20 
 config.mak.uname  |  3 ++-
 git-compat-util.h |  3 ++-
 8 files changed, 53 insertions(+), 71 deletions(-)
 delete mode 100644 compat/cygwin.c
 delete mode 100644 compat/cygwin.h
 create mode 100644 compat/win32/path-utils.c
 create mode 100644 compat/win32/path-utils.h

diff --git a/compat/cygwin.c b/compat/cygwin.c
deleted file mode 100644
index b9862d606d..00
--- a/compat/cygwin.c
+++ /dev/null
@@ -1,19 +0,0 @@
-#include "../git-compat-util.h"
-#include "../cache.h"
-
-int cygwin_offset_1st_component(const char *path)
-{
-   const char *pos = path;
-   /* unc paths */
-   if (is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
-   /* skip server name */
-   pos = strchr(pos + 2, '/');
-   if (!pos)
-   return 0; /* Error: malformed unc path */
-
-   do {
-   pos++;
-   } while (*pos && pos[0] != '/');
-   }
-   return pos + is_dir_sep(*pos) - path;
-}
diff --git a/compat/cygwin.h b/compat/cygwin.h
deleted file mode 100644
index 8e52de4644..00
--- a/compat/cygwin.h
+++ /dev/null
@@ -1,2 +0,0 @@
-int cygwin_offset_1st_component(const char *path);
-#define offset_1st_component cygwin_offset_1st_component
diff --git a/compat/mingw.c b/compat/mingw.c
index 34b3880b29..27e397f268 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -350,7 +350,7 @@ static inline int needs_hiding(const char *path)
return 0;
 
/* We cannot use basename(), as it would remove trailing slashes */
-   mingw_skip_dos_drive_prefix((char **));
+   win_path_utils_skip_dos_drive_prefix((char **));
if (!*path)
return 0;
 
@@ -2275,33 +2275,6 @@ pid_t waitpid(pid_t pid, int *status, int options)
return -1;
 }
 
-int mingw_skip_dos_drive_prefix(char **path)
-{
-   int ret = has_dos_drive_prefix(*path);
-   *path += ret;
-   return ret;
-}
-
-int mingw_offset_1st_component(const char *path)
-{
-   char *pos = (char *)path;
-
-   /* unc paths */
-   if (!skip_dos_drive_prefix() &&
-   is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
-   /* skip server name */
-   pos = strpbrk(pos + 2, "\\/");
-   if (!pos)
-   return 0; /* Error: malformed unc path */
-
-   do {
-   pos++;
-   } while (*pos && !is_dir_sep(*pos));
-   }
-
-   return pos + is_dir_sep(*pos) - path;
-}
-
 int xutftowcsn(wchar_t *wcs, const char *utfs, size_t wcslen, int utflen)
 {
int upos = 0, wpos = 0;
diff --git a/compat/mingw.h b/compat/mingw.h
index 8c24ddaa3e..30d9fb3e36 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -443,32 +443,12 @@ HANDLE winansi_get_osfhandle(int fd);
  * git specific compatibility
  */
 
-#define 

Documentation: update "man git-add" to include "[]"

2018-12-08 Thread Robert P. J. Day


Current "man git-add" emphasizes single letter interactive
shortcut commands with "[]".

Signed-off-by: Robert P. J. Day 

---

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index 45652fe4a6..ad9bd7c7a6 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -239,8 +239,8 @@ and type return, like this:

 
 *** Commands ***
-  1: status   2: update   3: revert   4: add untracked
-  5: patch6: diff 7: quit 8: help
+  1: [s]tatus 2: [u]pdate 3: [r]evert 4: [a]dd untracked
+  5: [p]atch  6: [d]iff   7: [q]uit   8: [h]elp
 What now> 1
 


-- 


Robert P. J. Day Ottawa, Ontario, CANADA
  http://crashcourse.ca/dokuwiki

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH] mailmap: update brandon williams's email address

2018-12-08 Thread Ævar Arnfjörð Bjarmason


On Sat, Dec 08 2018, Junio C Hamano wrote:

> Stefan Beller  writes:
>
>> On Fri, Dec 7, 2018 at 1:40 PM Jonathan Nieder  wrote:
>>>
>>> Brandon Williams wrote:
>>>
>>> > Signed-off-by: Brandon Williams 
>>> > ---
>>> >  .mailmap | 1 +
>>> >  1 file changed, 1 insertion(+)
>>>
>>> I can confirm that this is indeed the same person.
>>
>> What would be more of interest is why we'd be interested in this patch
>> as there is no commit/patch sent by Brandon with this email in gits history.
>
> Once I "git am" the message that began this thread, there will be a
> commit under this new ident, so that would be somewhat a moot point.

"Get to the top of 'git shortlog -sn' with this one easy trick" :)

(The patch makes sense, good to see you back on-list Brandon)


Re: [PATCH] mailmap: update brandon williams's email address

2018-12-08 Thread Duy Nguyen
On Sat, Dec 8, 2018 at 7:09 AM Junio C Hamano  wrote:
> If this were "Jonathan asked Brandon if we want to record an address
> we can reach him in our .mailmap file and sent a patch to add one",

Not sure about Jonathan, but I did.

> then the story is different, and I tend to agree with you that such
> a patch is more or less pointless.  That's not the purpose of the
> mailmap file.

Not directly, but when multiple commands use mailmap to show the
canonical mail addresses, then it kinda is. When I look back at an old
commit message, I may look up the author name and address, which is
mapped.

> Not until git-send-email learns to use that file to rewrite
> To/cc/etc to the "canonical" addresses, anyway ;-)

git-send-email does not have to when I copy/paste the address from
git-log anyway.

> I am not sure if there are people whose "canonical" address to be
> used as the author is not necessarily the best address they want to
> get their e-mails at, though.  If we can be reasonably sure that the
> set of such people is empty, then people can take the above mention
> about send-email as a hint about a low-hanging fruit ;-)
>
> Thanks.
>
>


-- 
Duy


Re: Retrieving a file in git that was deleted and committed

2018-12-07 Thread Jeff King
On Fri, Dec 07, 2018 at 01:50:57PM -0800, biswaranjan panda wrote:

> Thanks Jeff and Bryan! However, I am curious that if there were a way
> to tell git blame to skip a commit (the one which added the file again
> and maybe the one which deleted it originally) while it walks back
> through history, then it should just get back the
> entire history right ?

Not easily. ;)

You can feed a set of revisions to git-blame with the "-S" option, but I
don't offhand know how it handles diffs (I think it would have to still
diff each commit against its parent, since history is non-linear, and a
list is inherently linear). You might want to experiment with that.

Other than that, you can play with git-replace to produce a fake
history, as if the deletion never happened. But note that will affect
all commands, not just one particular blame. It might be a neat way to
play with blame, but I doubt I'd leave the replacement in place in the
long term.

-Peff


Re: Difficulty with parsing colorized diff output

2018-12-07 Thread Jeff King
On Fri, Dec 07, 2018 at 07:09:58PM -0500, George King wrote:

> My goal is to detect SGR color sequences, e.g. '\x1b[32m', that exist
> in the source text, and have my highlighter print escaped
> representations of those. For example, I have checked in files that
> are expected test outputs for tools that emit color codes, and diffs
> of those get very confusing.
> 
> Figuring out which color codes are from the source text and which were
> added by git is proving very difficult. The obvious solution is to
> turn git diff coloring off, but as far as I can see this also turns
> off all coloring for logs, which is undesirable.

Another gotcha that I don't think you've hit yet: whitespace
highlighting can color arbitrary parts of the line.

Try this, for example:

  printf 'foo\n' >old
  printf '\t \tfoo\n' >new
  git diff --no-index old new

So I'm not sure what you want to do can ever be completely robust. That
said, I'm not opposed to making Git's color output more regular. It
seems like a reasonable cleanup on its own.

(For the Git project itself, we long ago realized that putting raw color
codes into the source is a big pain when working with diffs, and we
instead use tools like t/test-lib-functions.sh's test_decode_color).

> * Context lines do not begin with reset code, but do end with a reset
> code. It would be preferable in my opinion if they had both (like
> every other line), or none at all.

Context lines do have both. It's just that the default color for context
lines is empty. ;)

But yes, I think it might be reasonable to recognize when an opening
color is empty, and turn the closing reset into a noop in that case (or
I guess you are also advocating the opposite: turn an empty color into a
noop \x1b[m or similar).

I think most of the coloring, including context lines, is coming from
diff.c:emit_diff_symbol_from_struct(). Instead of unconditionally
calling:

  context = diff_get_color_opt(o, DIFF_CONTEXT);
  reset = diff_get_color_opt(o, DIFF_RESET);

I suspect we could have a helper like this:

  static const char *diff_get_reset(const char *color)
  {
if (!color || !*color)
return "";
return diff_colors[DIFF_RESET];
  }
  ...
  context = diff_get_color_opt(o, DIFF_CONTEXT);
  reset = diff_get_reset(context);

> * Added lines have excess codes after the plus sign. The entire prefix
> is, `\x1b[32m+\x1b[m\x1b[32m` translating to GREEN PLUS RESET GREEN.
> Emitting codes after the plus sign makes the parsing more complex and
> idiosyncratic.

Yeah, I've run into this one, too (when working on diff-highlight, which
similarly tries to pass-through Git's existing colors).

It's unfortunately not quite trivial, due to some interactions with
whitespace coloring. See this old patch:

  https://public-inbox.org/git/20101109220136.ga5...@sigill.intra.peff.net/

and the followup:

  https://public-inbox.org/git/20101118064050.ga12...@sigill.intra.peff.net/

> * Add a config feature to turn on log coloring while leaving diff coloring 
> off.

Yes, the fact that --pretty log coloring is tied to diffs is mostly a
historical artifact. I think it would be OK to introduce a color.log
config option that defaults to the value of color.diff if unset. That
would be backwards compatible, but allow you to set them independently
if you wanted to.

-Peff


Re: [PATCH 4/4] submodule deinit: unset core.worktree

2018-12-07 Thread Junio C Hamano
Stefan Beller  writes:

> This re-introduces 984cd77ddb (submodule deinit: unset core.worktree,
> 2018-06-18), which was reverted as part of f178c13fda (Revert "Merge
> branch 'sb/submodule-core-worktree'", 2018-09-07)
>
> The whole series was reverted as the offending commit e98317508c
> (submodule: ensure core.worktree is set after update, 2018-06-18)
> was relied on by other commits such as 984cd77ddb.
>
> Keep the offending commit reverted, but its functionality came back via
> 4d6d6ef1fc (Merge branch 'sb/submodule-update-in-c', 2018-09-17), such
> that we can reintroduce 984cd77ddb now.

None of the above three explains the most important thing directly,
so readers fail to grasp what the main theme of the three-patch
series is, without looking at the commits that were reverted
already.

Is the theme of the overall series to make sure core.worktree is set
to point at the working tree when submodule's working tree is
instantiated, and unset it when it is not?

2/4 was also explained (in the original) that it wants to unset and
did so when "move_head" gets called.  This one does the unset when a
submodule is deinited.  Are these the only two cases a submodule
loses its working tree?  If so, the log message for this step should
declare victory by ending with something like

... as we covered the only other case in which a submodule
loses its working tree in the earlier step (i.e. switching
branches of top-level project to move to a commit that did
not have the submodule), this makes the code always maintain
core.worktree correctly unset when there is no working tree
for a submodule.

Thanks.  I think I agree with what the series wants to do (if I read
what it wants to do correctly, that is).

> Signed-off-by: Stefan Beller 
> ---
>  builtin/submodule--helper.c | 2 ++
>  t/lib-submodule-update.sh   | 2 +-
>  t/t7400-submodule-basic.sh  | 5 +
>  3 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 31ac30cf2f..672b74db89 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1131,6 +1131,8 @@ static void deinit_submodule(const char *path, const 
> char *prefix,
>   if (!(flags & OPT_QUIET))
>   printf(format, displaypath);
>  
> + submodule_unset_core_worktree(sub);
> +
>   strbuf_release(_rm);
>   }
>  
> diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
> index 51d449..5b56b23166 100755
> --- a/t/lib-submodule-update.sh
> +++ b/t/lib-submodule-update.sh
> @@ -235,7 +235,7 @@ reset_work_tree_to_interested () {
>   then
>   mkdir -p submodule_update/.git/modules/sub1/modules &&
>   cp -r submodule_update_repo/.git/modules/sub1/modules/sub2 
> submodule_update/.git/modules/sub1/modules/sub2
> - GIT_WORK_TREE=. git -C 
> submodule_update/.git/modules/sub1/modules/sub2 config --unset core.worktree
> + # core.worktree is unset for sub2 as it is not checked out
>   fi &&
>   # indicate we are interested in the submodule:
>   git -C submodule_update config submodule.sub1.url "bogus" &&
> diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
> index 76a7cb0af7..aba2d4d6ee 100755
> --- a/t/t7400-submodule-basic.sh
> +++ b/t/t7400-submodule-basic.sh
> @@ -984,6 +984,11 @@ test_expect_success 'submodule deinit should remove the 
> whole submodule section
>   rmdir init
>  '
>  
> +test_expect_success 'submodule deinit should unset core.worktree' '
> + test_path_is_file .git/modules/example/config &&
> + test_must_fail git config -f .git/modules/example/config core.worktree
> +'
> +
>  test_expect_success 'submodule deinit from subdirectory' '
>   git submodule update --init &&
>   git config submodule.example.foo bar &&


Re: [PATCH 3/4] submodule--helper: fix BUG message in ensure_core_worktree

2018-12-07 Thread Junio C Hamano
Stefan Beller  writes:

> Shortly after f178c13fda (Revert "Merge branch
> 'sb/submodule-core-worktree'", 2018-09-07), we had another series
> that implemented partially the same, ensuring that core.worktree was
> set in a checked out submodule, namely 74d4731da1 (submodule--helper:
> replace connect-gitdir-workingtree by ensure-core-worktree, 2018-08-13)
>
> As the series 4d6d6ef1fc (Merge branch 'sb/submodule-update-in-c',
> 2018-09-17) has different goals than the reverted series 7e25437d35
> (Merge branch 'sb/submodule-core-worktree', 2018-07-18), I'd wanted to
> replay the series on top of it to reach the goal of having `core.worktree`
> correctly set when the submodules worktree is present, and unset when the
> worktree is not present.
>
> The replay resulted in a strange merge conflict highlighting that
> the BUG message was not changed in 74d4731da1 (submodule--helper:
> replace connect-gitdir-workingtree by ensure-core-worktree, 2018-08-13).
>
> Fix the error message.
>
> Signed-off-by: Stefan Beller 
> ---

Unlike the step 2/4 I commented on, this does explain what this
wants to do and why, at least when looked from sideways.  Is the
above saying the same as the following two-liner?

An ealier mistake while rebasing to produce 74d4731da1
failed to update this BUG message.  Fix this.



>  builtin/submodule--helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index d38113a31a..31ac30cf2f 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -2045,7 +2045,7 @@ static int ensure_core_worktree(int argc, const char 
> **argv, const char *prefix)
>   struct repository subrepo;
>  
>   if (argc != 2)
> - BUG("submodule--helper connect-gitdir-workingtree  
> ");
> + BUG("submodule--helper ensure-core-worktree ");
>  
>   path = argv[1];


Re: [PATCH] mailmap: update brandon williams's email address

2018-12-07 Thread Brandon Williams
On Fri, Dec 7, 2018 at 10:08 PM Junio C Hamano  wrote:
>
> Stefan Beller  writes:
>
> > On Fri, Dec 7, 2018 at 1:40 PM Jonathan Nieder  wrote:
> >>
> >> Brandon Williams wrote:
> >>
> >> > Signed-off-by: Brandon Williams 
> >> > ---
> >> >  .mailmap | 1 +
> >> >  1 file changed, 1 insertion(+)
> >>
> >> I can confirm that this is indeed the same person.
> >
> > What would be more of interest is why we'd be interested in this patch
> > as there is no commit/patch sent by Brandon with this email in gits history.
>
> Once I "git am" the message that began this thread, there will be a
> commit under this new ident, so that would be somewhat a moot point.
>
> If this were "Jonathan asked Brandon if we want to record an address
> we can reach him in our .mailmap file and sent a patch to add one",
> then the story is different, and I tend to agree with you that such
> a patch is more or less pointless.  That's not the purpose of the
> mailmap file.
>

Turns out this is exactly the reason :) I've had a couple of people
reach out to me asking me to do this because CCing my old email
bounces and they've wanted my input/comments on something related to
work I've done.  If that's not the intended purpose then please ignore
this patch

> Not until git-send-email learns to use that file to rewrite
> To/cc/etc to the "canonical" addresses, anyway ;-)
>
> I am not sure if there are people whose "canonical" address to be
> used as the author is not necessarily the best address they want to
> get their e-mails at, though.  If we can be reasonably sure that the
> set of such people is empty, then people can take the above mention
> about send-email as a hint about a low-hanging fruit ;-)
>
> Thanks.
>
>


Re: [PATCH 2/4] submodule: unset core.worktree if no working tree is present

2018-12-07 Thread Junio C Hamano
Stefan Beller  writes:

> This reintroduces 4fa4f90ccd (submodule: unset core.worktree if no working
> tree is present, 2018-06-12), which was reverted as part of f178c13fda
> (Revert "Merge branch 'sb/submodule-core-worktree'", 2018-09-07).
>
> 4fa4f90ccd was reverted as its followup commit was faulty, but without
> the accompanying change of the followup, we'd have an incomplete workflow
> of setting `core.worktree` again, when it is needed such as checking out
> a revision that contains a submodule.
>
> So re-introduce that commit as-is, focusing on fixing up the followup

I was hoping to hear (given what 0/4 claimed) a clearer explanation
of what this change wants to achieve, but that is lacking.

No need to grumble about an earlier work was that turned out to be
inappropriate for the codebase back then.  Repeatedly saying "this
is needed" without giving further explaining why it is so, or
anything like that, would help readers.

Just pretend that the ealier commits and their reversion never
happened, and further pretend that we are doing the best thing that
should happen to our codebase.  How would we explain this change,
what the problem it tries to solve and what the solution looks like
in the larger picture?

When removing a working tree of a submodule (e.g. we may be
switching back to an earlier commit in the superproject that
did not have the submodule in question yet), we failed to
unset core.worktree of the submodule's repository.  That
caused this and that issues, exhibited by a few new tests
this commit adds.

Make sure that core.worktree gets unset so that a leftover
setting won't cause these issues.

or something like that?  I am just guessing by looking at the old
commit's text, as the above two paragraphs and one sentence does not
say much.

> diff --git a/submodule.c b/submodule.c
> index 6415cc5580..d393e947e6 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -1561,6 +1561,18 @@ int bad_to_remove_submodule(const char *path, unsigned 
> flags)
>   return ret;
>  }
>  
> +void submodule_unset_core_worktree(const struct submodule *sub)
> +{
> + char *config_path = xstrfmt("%s/modules/%s/config",
> + get_git_common_dir(), sub->name);
> +
> + if (git_config_set_in_file_gently(config_path, "core.worktree", NULL))
> + warning(_("Could not unset core.worktree setting in submodule 
> '%s'"),
> +   sub->path);
> +
> + free(config_path);
> +}
> +
>  static const char *get_super_prefix_or_empty(void)
>  {
>   const char *s = get_super_prefix();
> @@ -1726,6 +1738,8 @@ int submodule_move_head(const char *path,
>  
>   if (is_empty_dir(path))
>   rmdir_or_warn(path);
> +
> + submodule_unset_core_worktree(sub);
>   }
>   }
>  out:
> diff --git a/submodule.h b/submodule.h
> index a680214c01..9e18e9b807 100644
> --- a/submodule.h
> +++ b/submodule.h
> @@ -131,6 +131,8 @@ int submodule_move_head(const char *path,
>   const char *new_head,
>   unsigned flags);
>  
> +void submodule_unset_core_worktree(const struct submodule *sub);
> +
>  /*
>   * Prepare the "env_array" parameter of a "struct child_process" for 
> executing
>   * a submodule by clearing any repo-specific environment variables, but
> diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
> index 016391723c..51d449 100755
> --- a/t/lib-submodule-update.sh
> +++ b/t/lib-submodule-update.sh
> @@ -709,7 +709,8 @@ test_submodule_recursing_with_args_common() {
>   git branch -t remove_sub1 origin/remove_sub1 &&
>   $command remove_sub1 &&
>   test_superproject_content origin/remove_sub1 &&
> - ! test -e sub1
> + ! test -e sub1 &&
> + test_must_fail git config -f .git/modules/sub1/config 
> core.worktree
>   )
>   '
>   # ... absorbing a .git directory along the way.


Re: [RFC PATCH 2/3] Documentation/git-rev-list: s///

2018-12-07 Thread Junio C Hamano
Matthew DeVore  writes:

> $ git -C pc1 fetch-pack --stdin "file://$(pwd)/srv.bare" 
> Where observed.oids contains all the blobs that were missing. It tells
> the remote that it already has the "refs/heads/master" commit, which
> means it is excluded. Before, this worked fine, since it didn't mean
> the blobs were excluded, only the commit itself.

So 'master' has some blob in its tree, which is missing from the
repository, in this test?  Which means that such a repository is
"corrupt" and does not pass connectivity check by fsck.

I am of two minds.  If we claim that we have everything that is
needed to complete the commit sitting at the tip of 'master', I
think it is correct for the other side not to send a blob that is in
'master' (or its ancestors), so your "fix" may (at least
technically) be more correct than the status quo.  On the other
hand, if possession of commit 'master' does not defeat an explicit
request for a blob in it, that would actually be a good thing---it
would be a very straight-forward way to recover from such form of
repository corruption.

Fetching isolated objects without walking is also something that
would help backfill a lazily created clone, and I even vaguely
recall an effort to allow objects explicitly requested to be always
fetched regardless of the connectivity, if I am not mistaken (JTan?)

Anyway, thanks for a thoughtful response.


Re: [PATCH] mailmap: update brandon williams's email address

2018-12-07 Thread Junio C Hamano
Stefan Beller  writes:

> On Fri, Dec 7, 2018 at 1:40 PM Jonathan Nieder  wrote:
>>
>> Brandon Williams wrote:
>>
>> > Signed-off-by: Brandon Williams 
>> > ---
>> >  .mailmap | 1 +
>> >  1 file changed, 1 insertion(+)
>>
>> I can confirm that this is indeed the same person.
>
> What would be more of interest is why we'd be interested in this patch
> as there is no commit/patch sent by Brandon with this email in gits history.

Once I "git am" the message that began this thread, there will be a
commit under this new ident, so that would be somewhat a moot point.

If this were "Jonathan asked Brandon if we want to record an address
we can reach him in our .mailmap file and sent a patch to add one",
then the story is different, and I tend to agree with you that such
a patch is more or less pointless.  That's not the purpose of the
mailmap file.

Not until git-send-email learns to use that file to rewrite
To/cc/etc to the "canonical" addresses, anyway ;-)

I am not sure if there are people whose "canonical" address to be
used as the author is not necessarily the best address they want to
get their e-mails at, though.  If we can be reasonably sure that the
set of such people is empty, then people can take the above mention
about send-email as a hint about a low-hanging fruit ;-)

Thanks.




Re: RFE: git-patch-id should handle patches without leading "diff"

2018-12-07 Thread Junio C Hamano
Jonathan Nieder  writes:

>> So it seems most sensible to me if this is going to be supported that we
>> go a bit beyond the call of duty and fake up the start of it, namely:
>>
>> --- a/arch/x86/kernel/process.c
>> +++ b/arch/x86/kernel/process.c
>>
>> To be:
>>
>> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
>> --- a/arch/x86/kernel/process.c
>> +++ b/arch/x86/kernel/process.c
>
> Right.  We may want to handle diff.mnemonicPrefix as well.

I definitely think under the --stable option, we should pretend as
if the canonical a/ vs b/ prefixes were given with the "diff --git"
header, just like we try to reverse the effect of diff-orderfile,
etc.

I am unsure what the right behaviour under --unstable is, though.




Re: [PATCH 0/4]

2018-12-07 Thread Junio C Hamano
Stefan Beller  writes:

> A couple days before the 2.19 release we had a bug report about
> broken submodules[1] and reverted[2] the commits leading up to them.
>
> The behavior of said bug fixed itself by taking a different approach[3],
> specifically by a weaker enforcement of having `core.worktree` set in a
> submodule [4].
>
> The revert [2] was overly broad as we neared the release, such that we wanted
> to rather keep the known buggy behavior of always having `core.worktree` set,
> rather than figuring out how to fix the new bug of having 'git submodule 
> update'
> not working in old style repository setups.
>
> This series re-introduces those reverted patches, with no changes in code,
> but with drastically changed commit messages, as those focus on why it is safe
> to re-introduce them instead of explaining the desire for the change.

The above was a bit too cryptic for me to grok, so let me try
rephrasing to see if I got them all correctly.

 - three-patch series leading to 984cd77ddb were meant to fix some
   bug, but the series itself was buggy and caused problems; we got
   rid of them

 - the problem 984cd77ddb wanted to fix was fixed differently
   without reintroducing the problem three-patch series introduced.
   That fix is already with us since 4d6d6ef1fc.

 - now these three changes that were problematic in the past is
   resent without any update (other than that it has one preparatory
   patch to add tests).

Is that what is going on?  Obviously I am not getting "the other"
benefit we wanted to gain out of these three patches (because the
above description fails to explain what that is), other than to fix
the issue that was fixed by 4d6d6ef1fc.

Sorry for being puzzled...

> [1] https://public-inbox.org/git/2659750.rG6xLiZASK@twilight
> [2] f178c13fda (Revert "Merge branch 'sb/submodule-core-worktree'", 
> 2018-09-07)
> [3] 4d6d6ef1fc (Merge branch 'sb/submodule-update-in-c', 2018-09-17)
> [4] 74d4731da1 (submodule--helper: replace connect-gitdir-workingtree by 
> ensure-core-worktree, 2018-08-13)
>
> Stefan Beller (4):
>   submodule update: add regression test with old style setups
>   submodule: unset core.worktree if no working tree is present
>   submodule--helper: fix BUG message in ensure_core_worktree
>   submodule deinit: unset core.worktree
>
>  builtin/submodule--helper.c|  4 +++-
>  submodule.c| 14 ++
>  submodule.h|  2 ++
>  t/lib-submodule-update.sh  |  5 +++--
>  t/t7400-submodule-basic.sh |  5 +
>  t/t7412-submodule-absorbgitdirs.sh |  7 ++-
>  6 files changed, 33 insertions(+), 4 deletions(-)


Re: [PATCH] commit: abort before commit-msg if empty message

2018-12-07 Thread Junio C Hamano
Jonathan Tan  writes:

> When a user runs "git commit" without specifying a message, an editor
> appears with advice:
>
> Please enter the commit message for your changes. Lines starting
> with '#' will be ignored, and an empty message aborts the commit.
>
> However, if the user supplies an empty message and has a commit-msg hook
> which updates the message to be non-empty, the commit proceeds to occur,
> despite what the advice states.

When "--no-edit" is given, and when commit-msg fills that blank, the
command should go ahead and record the commit, I think.

An automation where commit-msg is used to produce whatever
appropriate message for the automation is entirely a reasonable
thing to arrange.  Of course, you can move the logic to produce an
appropriate message for the automation from commit-msg to the script
that drives the "git commit" and use the output of that logic as the
value for the "-m" option to achieve the same, so in that sense,
there is an escape hatch even if you suddenly start to forbid such a
working set-up, but it nevertheless is unnecessary busywork for those
with such a set-up to adjust to this change.

I actually think in this partcular case, the commit-msg hook that
adds Change-ID to an empty message is BUGGY.  If the hook looked at
the message contents and refrains from making an otherwise empty
message to non-empty, there is no need for any change here.

In any case, you'll have plenty of time to make your case after the
rc freeze.  I am not so sympathetic to a patch that makes us bend
backwards to support such a buggy hook to e honest.




Re: [wishlist] git submodule update --reset-hard

2018-12-07 Thread Yaroslav Halchenko

On Fri, 07 Dec 2018, Yaroslav Halchenko wrote:


> On Fri, 07 Dec 2018, Stefan Beller wrote:
> > > the initial "git submodule update --reset-hard" is pretty much a
> > > crude workaround for some of those cases, so I would just go earlier in
> > > the history, and redo some things, whenever I could just drop or revert
> > > some selected set of commits.

> > That makes sense.
> > Do you want to give the implementation a try for the --reset-hard switch?

> ok, will do, thanks for the blessing ;-)

The patch is attached (please advise if should be done
differently) and also submitted as PR
https://github.com/git/git/pull/563

I guess it would need more tests.  Took me some time to figure out
why I was getting

fatal: bad value for update parameter

after all my changes to the git-submodule.sh script after looking at an
example commit 42b491786260eb17d97ea9fb1c4b70075bca9523 which introduced
--merge to the update ;-)

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834   Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik
From 170296dc661b4bc3d942917ce27288df52ff650d Mon Sep 17 00:00:00 2001
From: Yaroslav Halchenko 
Date: Fri, 7 Dec 2018 21:28:29 -0500
Subject: [PATCH] submodule: Add --reset-hard option for git submodule update

This patch adds a --reset-hard option for the update command to hard
reset submodule(s) to the gitlink for the corresponding submodule in
the superproject.  This feature is desired e.g. to be able to discard
recent changes in the entire hierarchy of the submodules after running

   git reset --hard PREVIOUS_STATE

in the superproject which leaves submodules in their original state,
and

   git reset --hard --recurse-submodules PREVIOUS_STATE

would result in the submodules being checked into detached HEADs.

As in the original  git reset --hard  no checks or any kind of
safe-guards against jumping into some state which was never on the
current branch is done.

must_die_on_failure is not set to  yes to mimic behavior of a update
--checkout strategy, which would leave user with a non-clean state
immediately apparent via  git status  so an informed decision/actions
could be done manually.

Signed-off-by: Yaroslav Halchenko 
---
 Documentation/git-submodule.txt | 12 +++-
 Documentation/gitmodules.txt|  4 ++--
 builtin/submodule--helper.c |  3 ++-
 git-submodule.sh| 10 +-
 submodule.c |  4 
 submodule.h |  1 +
 t/t7406-submodule-update.sh | 17 -
 7 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index ba3c4df550..f90a42d265 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -124,7 +124,7 @@ If you really want to remove a submodule from the repository and commit
 that use linkgit:git-rm[1] instead. See linkgit:gitsubmodules[7] for removal
 options.
 
-update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference ] [--depth ] [--recursive] [--jobs ] [--] [...]::
+update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge|--reset-hard] [--reference ] [--depth ] [--recursive] [--jobs ] [--] [...]::
 +
 --
 Update the registered submodules to match what the superproject
@@ -358,6 +358,16 @@ the submodule itself.
 	If the key `submodule.$name.update` is set to `rebase`, this option is
 	implicit.
 
+--reset-hard::
+	This option is only valid for the update command.
+	Hard reset current state to the commit recorded in the	superproject.
+If this option is given, the submodule's HEAD will not get detached
+if it was not detached before. Note that, like with a regular
+git reset --hard  no safe-guards are in place to prevent jumping
+to a commit which was never part of the current branch.
+	If the key `submodule.$name.update` is set to `reset-hard`, this
+	option is implicit.
+
 --init::
 	This option is only valid for the update command.
 	Initialize all submodules for which "git submodule init" has not been
diff --git a/Documentation/gitmodules.txt b/Documentation/gitmodules.txt
index 312b6f9259..e085dbc01f 100644
--- a/Documentation/gitmodules.txt
+++ b/Documentation/gitmodules.txt
@@ -43,8 +43,8 @@ submodule..update::
 	command in the superproject. This is only used by `git
 	submodule init` to initialize the configuration variable of
 	the same name. Allowed values here are 'checkout', 'rebase',
-	'merge' or 'none'. See description of 'update' command in
-	linkgit:git-submodule[1] for their meaning. Note that the
+	'merge', 'reset-hard' or 'none'. See description of 'update' command
+	in linkgit:git-submodule[1] for their meaning. Note that the
 	'!command' form 

Re: bug: git fetch reports too many unreachable loose objects

2018-12-07 Thread Elijah Newren
On Fri, Dec 7, 2018 at 6:14 PM Josh Wolfe  wrote:
>
> git version 2.19.1
> steps to reproduce:
>
> # start in a brand new repo
> git init
>
> # create lots of unreachable loose objects
> for i in {1..1}; do git commit-tree -m "$(head -c 12 /dev/urandom
> | base64)" "$(git mktree <&-)" <&-; done
> # this prints a lot of output and takes a minute or so to run
>
> # trigger git gc to run in the background
> git fetch
> # Auto packing the repository in background for optimum performance.
> # See "git help gc" for manual housekeeping.
>
> # trigger it again
> git fetch
> # Auto packing the repository in background for optimum performance.
> # See "git help gc" for manual housekeeping.
> # error: The last gc run reported the following. Please correct the root cause
> # and remove .git/gc.log.
> # Automatic cleanup will not be performed until the file is removed.
> #
> # warning: There are too many unreachable loose objects; run 'git
> prune' to remove them.
>
> # to manually fix this, run git prune:
> git prune
>
> # note that `git gc` does not fix the problem, and appears to do
> nothing in this situation:
> git gc
>
>
> According to the `git fetch` output, the `git help gc` docs, and the
> `git help prune` docs, I don't think I shouldn't ever have to run `git
> prune` manually, so this behavior seems like a bug to me. Please
> correct me if this is expected behavior.

Known bug, there are a variety of other ways to trigger it too.  See
the threads here for more info:
  https://public-inbox.org/git/87inc89j38@evledraar.gmail.com/
  https://public-inbox.org/git/20180716172717.237373-1-jonathanta...@google.com/
There are probably other threads as well.


Re: [wishlist] git submodule update --reset-hard

2018-12-07 Thread Yaroslav Halchenko


On Fri, 07 Dec 2018, Stefan Beller wrote:
> > the initial "git submodule update --reset-hard" is pretty much a
> > crude workaround for some of those cases, so I would just go earlier in
> > the history, and redo some things, whenever I could just drop or revert
> > some selected set of commits.

> That makes sense.
> Do you want to give the implementation a try for the --reset-hard switch?

ok, will do, thanks for the blessing ;-)

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834   Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik


bug: git fetch reports too many unreachable loose objects

2018-12-07 Thread Josh Wolfe
git version 2.19.1
steps to reproduce:

# start in a brand new repo
git init

# create lots of unreachable loose objects
for i in {1..1}; do git commit-tree -m "$(head -c 12 /dev/urandom
| base64)" "$(git mktree <&-)" <&-; done
# this prints a lot of output and takes a minute or so to run

# trigger git gc to run in the background
git fetch
# Auto packing the repository in background for optimum performance.
# See "git help gc" for manual housekeeping.

# trigger it again
git fetch
# Auto packing the repository in background for optimum performance.
# See "git help gc" for manual housekeeping.
# error: The last gc run reported the following. Please correct the root cause
# and remove .git/gc.log.
# Automatic cleanup will not be performed until the file is removed.
#
# warning: There are too many unreachable loose objects; run 'git
prune' to remove them.

# to manually fix this, run git prune:
git prune

# note that `git gc` does not fix the problem, and appears to do
nothing in this situation:
git gc


According to the `git fetch` output, the `git help gc` docs, and the
`git help prune` docs, I don't think I shouldn't ever have to run `git
prune` manually, so this behavior seems like a bug to me. Please
correct me if this is expected behavior.

In case anyone's wondering why I'm creating unreachable loose objects,
here's the usecase: https://stackoverflow.com/a/50403179/367916 . I
would love a first-class solution to obviate that workaround, but that
is probably a separate issue.

Josh


Re: [PATCH v2 1/3] git clone C:\cygwin\home\USER\repo' is working (again)

2018-12-07 Thread Steven Penny
On Fri, Dec 7, 2018 at 11:04 AM wrote:
> The solution is to implement has_dos_drive_prefix(), skip_dos_drive_prefix()
> is_dir_sep(), offset_1st_component() and convert_slashes() for cygwin
> in the same way as it is done in 'Git for Windows' in compat/mingw.[ch]
>
> Instead of duplicating the code, it is extracted into compat/mingw-cygwin.[ch]
> Some need for refactoring and cleanup came up in the review, they are adressed
> in a seperate commit.

i have applied the 3 patches against current master, and my original test
passes, so looks good to me.

however like Johannes i take issue with the naming. as he said "mingw-cygwin"
really isnt appropriate. ideally it would be "windows.h", but as that is
conspicuously in use, something like these:

- pc-windows
- pc-win
- win

i disagree with him on using "win32" - that doesnt really make sense, as
obviously you can compile 64-bit Git for Windows. if you wanted to go that route
you would want to use something like:

- windows-api
- win-api
- winapi

further - i disagree with the "DOS" moniker being used at all. DOS is a defunkt
operating system that i dont think Git has *ever* supported, so it doesnt make
sense to be referring to it this way. again, a more approriate name would be
something like "win_drive_prefix".


Re: [PATCH] docs/gitweb.conf: config variable typo

2018-12-07 Thread FeRD
On Fri, Dec 7, 2018 at 7:45 PM Jonathan Nieder  wrote:
>
>
> Thanks for fixing it.  May we forge your sign-off?

Yes please, guess I didn't read far enough down that document. My apologies.
Consider the previous patch email:

Signed-off-by: FeRD (Frank Dana) 


Re: [PATCH] docs/gitweb.conf: config variable typo

2018-12-07 Thread Jonathan Nieder
Hi,

Frank Dana wrote:

> The documentation for the feature 'snapshot' claimed
> "This feature can be configured on a per-repository basis via
> repository's `gitweb.blame` configuration variable"
>
> Fixed to specify `gitweb.snapshot` as the variable name.
> ---
>  Documentation/gitweb.conf.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks for fixing it.  May we forge your sign-off?  See
Documentation/SubmittingPatches section "Certify your work" for what
this means.

Sincerely,
Jonathan


[PATCH] docs/gitweb.conf: config variable typo

2018-12-07 Thread Frank Dana
The documentation for the feature 'snapshot' claimed
"This feature can be configured on a per-repository basis via
repository's `gitweb.blame` configuration variable"

Fixed to specify `gitweb.snapshot` as the variable name.
---
 Documentation/gitweb.conf.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/gitweb.conf.txt b/Documentation/gitweb.conf.txt
index c0a326e3883c3..40c9563ef67af 100644
--- a/Documentation/gitweb.conf.txt
+++ b/Documentation/gitweb.conf.txt
@@ -684,7 +684,7 @@ compressed tar archive) and "zip"; please consult gitweb 
sources for
 a definitive list.  By default only "tgz" is offered.
 +
 This feature can be configured on a per-repository basis via
-repository's `gitweb.blame` configuration variable, which contains
+repository's `gitweb.snapshot` configuration variable, which contains
 a comma separated list of formats or "none" to disable snapshots.
 Unknown values are ignored.
 

--
https://github.com/git/git/pull/562


Difficulty with parsing colorized diff output

2018-12-07 Thread George King
Hello, I have a rather elaborate diff highlighter that I have implemented as a 
post-processor to regular git output. I am writing to discuss some difficult 
aspects of git diff's color output that I am observing with version 2.19.2. 
This is not a regression report; I am trying to implement a new feature and am 
stymied by these details.

My goal is to detect SGR color sequences, e.g. '\x1b[32m', that exist in the 
source text, and have my highlighter print escaped representations of those. 
For example, I have checked in files that are expected test outputs for tools 
that emit color codes, and diffs of those get very confusing.

Figuring out which color codes are from the source text and which were added by 
git is proving very difficult. The obvious solution is to turn git diff 
coloring off, but as far as I can see this also turns off all coloring for 
logs, which is undesirable.

Then I tried to remove just the color codes that git adds to the diff. This 
almost works, but there are some irregularities. Most lines begin with a 
style/color code and end with a reset code, which would be a perfect indicator 
that git is using colors. However:

* Context lines do not begin with reset code, but do end with a reset code. It 
would be preferable in my opinion if they had both (like every other line), or 
none at all.

* Added lines have excess codes after the plus sign. The entire prefix is, 
`\x1b[32m+\x1b[m\x1b[32m` translating to GREEN PLUS RESET GREEN. Emitting codes 
after the plus sign makes the parsing more complex and idiosyncratic.


In summary, I would like to suggest the following improvements:

* Remove the excess codes after the plus sign.

* When git diff is adding colors, ensure that every line begins with an SGR 
code and ends with the RESET code.

* Add a config feature to turn on log coloring while leaving diff coloring off.


I would be willing to attempt a fix for this myself, but I'd like to hear what 
the maintainers think first, and would appreciate any hints as to where I 
should start looking in the code base.


If anyone is curious about the implementation it is called `same-same` and 
lives here: https://github.com/gwk/pithy/blob/master/pithy/bin/same_same.py

I configure it like this in .gitconfig:

[core]
  pager = same-same | LESSANSIENDCHARS=mK less --RAW-CONTROL-CHARS
[interactive]
  diffFilter = same-same -interactive | LESSANSIENDCHARS=mK less 
--RAW-CONTROL-CHARS


Thank you,
George



[PATCH 1/4] submodule update: add regression test with old style setups

2018-12-07 Thread Stefan Beller
As f178c13fda (Revert "Merge branch 'sb/submodule-core-worktree'",
2018-09-07) was produced shortly before a release, nobody asked for
a regression test to be included. Add a regression test that makes sure
that the invocation of `git submodule update` on old setups doesn't
produce errors as pointed out in f178c13fda.

The place to add such a regression test may look odd in t7412, but
that is the best place as there we setup old style submodule setups
explicitly.

Signed-off-by: Stefan Beller 
---
 t/t7412-submodule-absorbgitdirs.sh | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/t/t7412-submodule-absorbgitdirs.sh 
b/t/t7412-submodule-absorbgitdirs.sh
index ce74c12da2..1cfa150768 100755
--- a/t/t7412-submodule-absorbgitdirs.sh
+++ b/t/t7412-submodule-absorbgitdirs.sh
@@ -75,7 +75,12 @@ test_expect_success 're-setup nested submodule' '
GIT_WORK_TREE=../../../nested git -C sub1/.git/modules/nested config \
core.worktree "../../../nested" &&
# make sure this re-setup is correct
-   git status --ignore-submodules=none
+   git status --ignore-submodules=none &&
+
+   # also make sure this old setup does not regress
+   git submodule update --init --recursive >out 2>err &&
+   test_must_be_empty out &&
+   test_must_be_empty err
 '
 
 test_expect_success 'absorb the git dir in a nested submodule' '
-- 
2.20.0.rc2.403.gdbc3b29805-goog



[PATCH 4/4] submodule deinit: unset core.worktree

2018-12-07 Thread Stefan Beller
This re-introduces 984cd77ddb (submodule deinit: unset core.worktree,
2018-06-18), which was reverted as part of f178c13fda (Revert "Merge
branch 'sb/submodule-core-worktree'", 2018-09-07)

The whole series was reverted as the offending commit e98317508c
(submodule: ensure core.worktree is set after update, 2018-06-18)
was relied on by other commits such as 984cd77ddb.

Keep the offending commit reverted, but its functionality came back via
4d6d6ef1fc (Merge branch 'sb/submodule-update-in-c', 2018-09-17), such
that we can reintroduce 984cd77ddb now.

Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c | 2 ++
 t/lib-submodule-update.sh   | 2 +-
 t/t7400-submodule-basic.sh  | 5 +
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 31ac30cf2f..672b74db89 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1131,6 +1131,8 @@ static void deinit_submodule(const char *path, const char 
*prefix,
if (!(flags & OPT_QUIET))
printf(format, displaypath);
 
+   submodule_unset_core_worktree(sub);
+
strbuf_release(_rm);
}
 
diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 51d449..5b56b23166 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -235,7 +235,7 @@ reset_work_tree_to_interested () {
then
mkdir -p submodule_update/.git/modules/sub1/modules &&
cp -r submodule_update_repo/.git/modules/sub1/modules/sub2 
submodule_update/.git/modules/sub1/modules/sub2
-   GIT_WORK_TREE=. git -C 
submodule_update/.git/modules/sub1/modules/sub2 config --unset core.worktree
+   # core.worktree is unset for sub2 as it is not checked out
fi &&
# indicate we are interested in the submodule:
git -C submodule_update config submodule.sub1.url "bogus" &&
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 76a7cb0af7..aba2d4d6ee 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -984,6 +984,11 @@ test_expect_success 'submodule deinit should remove the 
whole submodule section
rmdir init
 '
 
+test_expect_success 'submodule deinit should unset core.worktree' '
+   test_path_is_file .git/modules/example/config &&
+   test_must_fail git config -f .git/modules/example/config core.worktree
+'
+
 test_expect_success 'submodule deinit from subdirectory' '
git submodule update --init &&
git config submodule.example.foo bar &&
-- 
2.20.0.rc2.403.gdbc3b29805-goog



[PATCH 0/4]

2018-12-07 Thread Stefan Beller
A couple days before the 2.19 release we had a bug report about
broken submodules[1] and reverted[2] the commits leading up to them.

The behavior of said bug fixed itself by taking a different approach[3],
specifically by a weaker enforcement of having `core.worktree` set in a
submodule [4].

The revert [2] was overly broad as we neared the release, such that we wanted
to rather keep the known buggy behavior of always having `core.worktree` set,
rather than figuring out how to fix the new bug of having 'git submodule update'
not working in old style repository setups.

This series re-introduces those reverted patches, with no changes in code,
but with drastically changed commit messages, as those focus on why it is safe
to re-introduce them instead of explaining the desire for the change.

[1] https://public-inbox.org/git/2659750.rG6xLiZASK@twilight
[2] f178c13fda (Revert "Merge branch 'sb/submodule-core-worktree'", 2018-09-07)
[3] 4d6d6ef1fc (Merge branch 'sb/submodule-update-in-c', 2018-09-17)
[4] 74d4731da1 (submodule--helper: replace connect-gitdir-workingtree by 
ensure-core-worktree, 2018-08-13)

Stefan Beller (4):
  submodule update: add regression test with old style setups
  submodule: unset core.worktree if no working tree is present
  submodule--helper: fix BUG message in ensure_core_worktree
  submodule deinit: unset core.worktree

 builtin/submodule--helper.c|  4 +++-
 submodule.c| 14 ++
 submodule.h|  2 ++
 t/lib-submodule-update.sh  |  5 +++--
 t/t7400-submodule-basic.sh |  5 +
 t/t7412-submodule-absorbgitdirs.sh |  7 ++-
 6 files changed, 33 insertions(+), 4 deletions(-)

-- 
2.20.0.rc2.403.gdbc3b29805-goog



[PATCH 2/4] submodule: unset core.worktree if no working tree is present

2018-12-07 Thread Stefan Beller
This reintroduces 4fa4f90ccd (submodule: unset core.worktree if no working
tree is present, 2018-06-12), which was reverted as part of f178c13fda
(Revert "Merge branch 'sb/submodule-core-worktree'", 2018-09-07).

4fa4f90ccd was reverted as its followup commit was faulty, but without
the accompanying change of the followup, we'd have an incomplete workflow
of setting `core.worktree` again, when it is needed such as checking out
a revision that contains a submodule.

So re-introduce that commit as-is, focusing on fixing up the followup

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 submodule.c   | 14 ++
 submodule.h   |  2 ++
 t/lib-submodule-update.sh |  3 ++-
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/submodule.c b/submodule.c
index 6415cc5580..d393e947e6 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1561,6 +1561,18 @@ int bad_to_remove_submodule(const char *path, unsigned 
flags)
return ret;
 }
 
+void submodule_unset_core_worktree(const struct submodule *sub)
+{
+   char *config_path = xstrfmt("%s/modules/%s/config",
+   get_git_common_dir(), sub->name);
+
+   if (git_config_set_in_file_gently(config_path, "core.worktree", NULL))
+   warning(_("Could not unset core.worktree setting in submodule 
'%s'"),
+ sub->path);
+
+   free(config_path);
+}
+
 static const char *get_super_prefix_or_empty(void)
 {
const char *s = get_super_prefix();
@@ -1726,6 +1738,8 @@ int submodule_move_head(const char *path,
 
if (is_empty_dir(path))
rmdir_or_warn(path);
+
+   submodule_unset_core_worktree(sub);
}
}
 out:
diff --git a/submodule.h b/submodule.h
index a680214c01..9e18e9b807 100644
--- a/submodule.h
+++ b/submodule.h
@@ -131,6 +131,8 @@ int submodule_move_head(const char *path,
const char *new_head,
unsigned flags);
 
+void submodule_unset_core_worktree(const struct submodule *sub);
+
 /*
  * Prepare the "env_array" parameter of a "struct child_process" for executing
  * a submodule by clearing any repo-specific environment variables, but
diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 016391723c..51d449 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -709,7 +709,8 @@ test_submodule_recursing_with_args_common() {
git branch -t remove_sub1 origin/remove_sub1 &&
$command remove_sub1 &&
test_superproject_content origin/remove_sub1 &&
-   ! test -e sub1
+   ! test -e sub1 &&
+   test_must_fail git config -f .git/modules/sub1/config 
core.worktree
)
'
# ... absorbing a .git directory along the way.
-- 
2.20.0.rc2.403.gdbc3b29805-goog



[PATCH 3/4] submodule--helper: fix BUG message in ensure_core_worktree

2018-12-07 Thread Stefan Beller
Shortly after f178c13fda (Revert "Merge branch
'sb/submodule-core-worktree'", 2018-09-07), we had another series
that implemented partially the same, ensuring that core.worktree was
set in a checked out submodule, namely 74d4731da1 (submodule--helper:
replace connect-gitdir-workingtree by ensure-core-worktree, 2018-08-13)

As the series 4d6d6ef1fc (Merge branch 'sb/submodule-update-in-c',
2018-09-17) has different goals than the reverted series 7e25437d35
(Merge branch 'sb/submodule-core-worktree', 2018-07-18), I'd wanted to
replay the series on top of it to reach the goal of having `core.worktree`
correctly set when the submodules worktree is present, and unset when the
worktree is not present.

The replay resulted in a strange merge conflict highlighting that
the BUG message was not changed in 74d4731da1 (submodule--helper:
replace connect-gitdir-workingtree by ensure-core-worktree, 2018-08-13).

Fix the error message.

Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index d38113a31a..31ac30cf2f 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2045,7 +2045,7 @@ static int ensure_core_worktree(int argc, const char 
**argv, const char *prefix)
struct repository subrepo;
 
if (argc != 2)
-   BUG("submodule--helper connect-gitdir-workingtree  
");
+   BUG("submodule--helper ensure-core-worktree ");
 
path = argv[1];
 
-- 
2.20.0.rc2.403.gdbc3b29805-goog



Re: [PATCH] commit: abort before commit-msg if empty message

2018-12-07 Thread Jonathan Nieder
Hi,

Jonathan Tan wrote:

> (The implementation in this commit reads the commit message twice even
> if there is no commit-msg hook. I think that this is fine, since this is
> for interactive use - an alternative would be to plumb information about
> the absence of the hook from run_hook_ve() upwards, which seems more
> complicated.)

Sounds like a good followup for an interested person to do later.  Can
you include a NEEDSWORK comment describing this?

> Signed-off-by: Jonathan Tan 
> ---
> This was noticed with the commit-msg hook that comes with Gerrit, which
> basically just calls git interpret-trailers to add a Change-Id trailer.

Thanks for fixing it.

This kind of context tends to be very useful when looking back at a
commit later, so I'd be happy to see it in the commit message too.

[...]
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -652,6 +652,21 @@ static void adjust_comment_line_char(const struct strbuf 
> *sb)
>   comment_line_char = *p;
>  }
>  
> +static void read_and_clean_commit_message(struct strbuf *sb)
> +{
> + if (strbuf_read_file(sb, git_path_commit_editmsg(), 0) < 0) {
> + int saved_errno = errno;
> + rollback_index_files();
> + die(_("could not read commit message: %s"), 
> strerror(saved_errno));

Long line.

More importantly, I wonder if this can use die_errno.
rollback_index_files calls delete_tempfile, which can clobber errno, so
that would require restoring errno either here or in that function:

diff --git i/lockfile.h w/lockfile.h
index 35403ccc0d..d592f384e7 100644
--- i/lockfile.h
+++ w/lockfile.h
@@ -298,10 +298,14 @@ static inline int commit_lock_file_to(struct lock_file 
*lk, const char *path)
  * remove the lockfile. It is a NOOP to call `rollback_lock_file()`
  * for a `lock_file` object that has already been committed or rolled
  * back.
+ *
+ * Saves and restores errno for more convenient use during error handling.
  */
 static inline void rollback_lock_file(struct lock_file *lk)
 {
+   int save_errno = errno;
delete_tempfile(>tempfile);
+   errno = save_errno;
 }
 
 #endif /* LOCKFILE_H */

It doesn't make the code more obvious so what you have is probably
better.

Also, on second glance, this is just moved code.  Still hopefully some
of the above is amusing (and rewrapping would be fine to do during the
move).

[...]
> @@ -970,6 +985,22 @@ static int prepare_to_commit(const char *index_file, 
> const char *prefix,
>   argv_array_clear();
>   }
>  
> + if (use_editor && !no_verify) {
> + /*
> +  * Abort the commit if the user supplied an empty commit
> +  * message in the editor. (Because the commit-msg hook is to be
> +  * run, we need to check this now, since that hook may change
> +  * the commit message.)
> +  */
> + read_and_clean_commit_message();
> + if (message_is_empty(, cleanup_mode) && 
> !allow_empty_message) {
> + rollback_index_files();
> + fprintf(stderr, _("Aborting commit due to empty commit 
> message.\n"));
> + exit(1);

What about the template_untouched case?

Should the two call sites share code?

[...]
> + }
> + strbuf_release();
> + }
> +
>   if (!no_verify &&
>   run_commit_hook(use_editor, index_file, "commit-msg", 
> git_path_commit_editmsg(), NULL)) {
>   return 0;

This means we have a little duplication of code: first we check whether
to abort here in prepare_to_commit, and then again after the hook is run
in cmd_commit.

Is there some other code structure that would make things more clear?

cmd_commit also seems to be rather long --- is there some logical way
to split it up that would make the code clearer (as a preparatory or
followup patch)?

In cmd_commit, we spend a while (in number of lines, not actual
running time) to determine parents before deciding whether the user
has chosen to abort by writing an empty message.  Should we perform
that check sooner, closer to the prepare_to_commit call?

> @@ -1608,17 +1639,7 @@ int cmd_commit(int argc, const char **argv, const char 
> *prefix)
>  
>   /* Finally, get the commit message */
>   strbuf_reset();
> - if (strbuf_read_file(, git_path_commit_editmsg(), 0) < 0) {
> - int saved_errno = errno;
> - rollback_index_files();
> - die(_("could not read commit message: %s"), 
> strerror(saved_errno));
> - }
> -
> - if (verbose || /* Truncate the message just before the diff, if any. */
> - cleanup_mode == COMMIT_MSG_CLEANUP_SCISSORS)
> - strbuf_setlen(, wt_status_locate_end(sb.buf, sb.len));
> - if (cleanup_mode != COMMIT_MSG_CLEANUP_NONE)
> - strbuf_stripspace(, cleanup_mode == COMMIT_MSG_CLEANUP_ALL);
> + read_and_clean_commit_message();
>  
>   if (message_is_empty(, cleanup_mode) && 

Re: [RFC PATCH 2/3] Documentation/git-rev-list: s///

2018-12-07 Thread Matthew DeVore
On Sun, Oct 21, 2018 at 7:21 PM Junio C Hamano  wrote:
>
> Matthew DeVore  writes:
>
> >> It is more like "this is a set operation across commits.  We also
> >> show objects that are reachable from the commits in the resulting
> >> set and are not reachable from the commits in the set that were
> >> excluded when --objects option is given".
> >>
> > That would be correct though it wouldn't tell that you can use
> > "--objects ^foo-tree bar-tree."
>
> Yeah, but quite honestly, I consider that it is working by accident,
> not by design, in the current code (iow, you are looking at a
> behaviour of whatever the code happens to do).  "rev-list" is pretty
> much set operation across commits, and anything that deals with a
> non commit-ish given from the command line is an afterthought at
> best, and happenstance in reality.
>
> I do not mean to say that the code must stay that way, though.

I tried fixing the issue of "--objects ^commitobj treeobj" not
properly excluding objects reachable from commitobj, but this ended up
causing t5616-partial-clone.sh to fail. In the test labeled "manual
prefetch of missing objects", we create a clone of srv.bare without
blobs called "pc1", then push some new commits to srv.bare (via a
separate "local" repo), and try to fetch missing blobs with this
command:

$ git -C pc1 fetch-pack --stdin "file://$(pwd)/srv.bare" 

[PATCH] commit: abort before commit-msg if empty message

2018-12-07 Thread Jonathan Tan
When a user runs "git commit" without specifying a message, an editor
appears with advice:

Please enter the commit message for your changes. Lines starting
with '#' will be ignored, and an empty message aborts the commit.

However, if the user supplies an empty message and has a commit-msg hook
which updates the message to be non-empty, the commit proceeds to occur,
despite what the advice states.

Teach commit to also check the emptiness of the commit message before it
invokes the commit-msg hook if an editor is used and if no_verify is not
set (that is, commit-msg is not suppressed). This makes the advice true.

(The implementation in this commit reads the commit message twice even
if there is no commit-msg hook. I think that this is fine, since this is
for interactive use - an alternative would be to plumb information about
the absence of the hook from run_hook_ve() upwards, which seems more
complicated.)

Signed-off-by: Jonathan Tan 
---
This was noticed with the commit-msg hook that comes with Gerrit, which
basically just calls git interpret-trailers to add a Change-Id trailer.
---
 builtin/commit.c   | 43 --
 t/t7504-commit-msg-hook.sh | 11 ++
 2 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index c021b119bb..3681a59af8 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -652,6 +652,21 @@ static void adjust_comment_line_char(const struct strbuf 
*sb)
comment_line_char = *p;
 }
 
+static void read_and_clean_commit_message(struct strbuf *sb)
+{
+   if (strbuf_read_file(sb, git_path_commit_editmsg(), 0) < 0) {
+   int saved_errno = errno;
+   rollback_index_files();
+   die(_("could not read commit message: %s"), 
strerror(saved_errno));
+   }
+
+   if (verbose || /* Truncate the message just before the diff, if any. */
+   cleanup_mode == COMMIT_MSG_CLEANUP_SCISSORS)
+   strbuf_setlen(sb, wt_status_locate_end(sb->buf, sb->len));
+   if (cleanup_mode != COMMIT_MSG_CLEANUP_NONE)
+   strbuf_stripspace(sb, cleanup_mode == COMMIT_MSG_CLEANUP_ALL);
+}
+
 static int prepare_to_commit(const char *index_file, const char *prefix,
 struct commit *current_head,
 struct wt_status *s,
@@ -970,6 +985,22 @@ static int prepare_to_commit(const char *index_file, const 
char *prefix,
argv_array_clear();
}
 
+   if (use_editor && !no_verify) {
+   /*
+* Abort the commit if the user supplied an empty commit
+* message in the editor. (Because the commit-msg hook is to be
+* run, we need to check this now, since that hook may change
+* the commit message.)
+*/
+   read_and_clean_commit_message();
+   if (message_is_empty(, cleanup_mode) && 
!allow_empty_message) {
+   rollback_index_files();
+   fprintf(stderr, _("Aborting commit due to empty commit 
message.\n"));
+   exit(1);
+   }
+   strbuf_release();
+   }
+
if (!no_verify &&
run_commit_hook(use_editor, index_file, "commit-msg", 
git_path_commit_editmsg(), NULL)) {
return 0;
@@ -1608,17 +1639,7 @@ int cmd_commit(int argc, const char **argv, const char 
*prefix)
 
/* Finally, get the commit message */
strbuf_reset();
-   if (strbuf_read_file(, git_path_commit_editmsg(), 0) < 0) {
-   int saved_errno = errno;
-   rollback_index_files();
-   die(_("could not read commit message: %s"), 
strerror(saved_errno));
-   }
-
-   if (verbose || /* Truncate the message just before the diff, if any. */
-   cleanup_mode == COMMIT_MSG_CLEANUP_SCISSORS)
-   strbuf_setlen(, wt_status_locate_end(sb.buf, sb.len));
-   if (cleanup_mode != COMMIT_MSG_CLEANUP_NONE)
-   strbuf_stripspace(, cleanup_mode == COMMIT_MSG_CLEANUP_ALL);
+   read_and_clean_commit_message();
 
if (message_is_empty(, cleanup_mode) && !allow_empty_message) {
rollback_index_files();
diff --git a/t/t7504-commit-msg-hook.sh b/t/t7504-commit-msg-hook.sh
index 31b9c6a2c1..b44d6fc43e 100755
--- a/t/t7504-commit-msg-hook.sh
+++ b/t/t7504-commit-msg-hook.sh
@@ -122,6 +122,17 @@ test_expect_success 'with failing hook (editor)' '
 
 '
 
+test_expect_success 'hook is not run if commit message was empty' '
+   echo "yet more another" >>file &&
+   git add file &&
+   echo >FAKE_MSG &&
+   test_must_fail env GIT_EDITOR="\"\$FAKE_EDITOR\"" git commit 2>err &&
+
+   # Verify that git stopped because it saw an empty message, not because
+   # the hook exited with non-zero error code
+   test_i18ngrep "Aborting commit due to empty commit message" err
+'
+
 

Re: enhancement: support for author.email and author.name in "git config"

2018-12-07 Thread Jonathan Nieder
William Hubbs wrote:
> On Thu, Dec 06, 2018 at 11:20:07PM +0100, Johannes Schindelin wrote:

>> There *is* a way to get what you want that is super easy and will
>> definitely work: if you sit down and do it ;-)
>>
>> Please let us know if you need any additional information before you can
>> start.
>
> Which branch should I work off of in the repo?

"master".

Also, please make sure the documentation (e.g. in
Documentation/config/user.txt) describes when a user would want to set
this option.

See also:
- Documentation/SubmittingPatches
- the DISCUSSION section of "git help format-patch"
- [1]
- https://www.kernel.org/pub/software/scm/git/docs/user-manual.html#hacking-git

Thanks,
Jonathan

[1] 
https://github.com/gitgitgadget/gitgitgadget#a-bot-to-serve-as-glue-between-github-pull-requests-and-the-git-mailing-list


Re: RFE: git-patch-id should handle patches without leading "diff"

2018-12-07 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:
> On Fri, Dec 07 2018, Jonathan Nieder wrote:

>> The patch-id appears to only care about the diff text, so it should be
>> able to handle this.  So if we have a better heuristic for where the
>> diff starts, it would be good to use it.
>
> No, the patch-id doesn't just care about the diff, it cares about the
> context before the diff too.

Sorry, I did a bad job of communicating.  When I said "diff text", I was
including context.

[...]
> Observe that the diff --git line matters, we hash it:
>
> $ git diff-tree -p HEAD~.. | git patch-id
> 5870d115b7e2a9a936ab8fdc254932234413c710 
> 
> $ git diff-tree --src-prefix=a/ --dst-prefix=b/ -p HEAD~.. | git patch-id 
> --stable
> 5870d115b7e2a9a936ab8fdc254932234413c710 
> 
> $ git diff-tree --src-prefix=x/ --dst-prefix=y/ -p HEAD~.. | git patch-id 
> --stable
> 4cd136f2b98760150f700ac6a5b126389d6d05a7 
> 

Oh, hm.  That's unfortunate.

[...]
> So it seems most sensible to me if this is going to be supported that we
> go a bit beyond the call of duty and fake up the start of it, namely:
>
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
>
> To be:
>
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c

Right.  We may want to handle diff.mnemonicPrefix as well.

Jonathan


[PATCH v3 0/3] Add commit-graph fuzzer and fix buffer overflow

2018-12-07 Thread Josh Steadmon
Ad a new fuzz test for the commit graph and fix a buffer read-overflow
that it discovered. Additionally, fix the Makefile instructions for
building fuzzers.

Changes since V2:
  * Avoid pointer arithmetic overflow when checking the graph's chunk
count.
  * Merge the corrupt_graph_and_verify and
corrupt_and_zero_graph_then_verify test functions.

Josh Steadmon (3):
  commit-graph, fuzz: Add fuzzer for commit-graph
  commit-graph: fix buffer read-overflow
  Makefile: correct example fuzz build

 .gitignore  |  1 +
 Makefile|  3 +-
 commit-graph.c  | 67 +
 commit-graph.h  |  3 ++
 fuzz-commit-graph.c | 16 ++
 t/t5318-commit-graph.sh | 15 +++--
 6 files changed, 83 insertions(+), 22 deletions(-)
 create mode 100644 fuzz-commit-graph.c

Range-diff against v2:
1:  af45c2337f ! 1:  675d58ecea commit-graph: fix buffer read-overflow
@@ -22,8 +22,8 @@
 +  uint64_t chunk_offset;
int chunk_repeated = 0;
  
-+  if (chunk_lookup + GRAPH_CHUNKLOOKUP_WIDTH >
-+  data + graph_size) {
++  if (data + graph_size - chunk_lookup <
++  GRAPH_CHUNKLOOKUP_WIDTH) {
 +  error(_("chunk lookup table entry missing; graph file 
may be incomplete"));
 +  free(graph);
 +  return NULL;
@@ -40,31 +40,34 @@
  --- a/t/t5318-commit-graph.sh
  +++ b/t/t5318-commit-graph.sh
 @@
-   test_i18ngrep "$grepstr" err
- }
+ GRAPH_BYTE_OCTOPUS=$(($GRAPH_OCTOPUS_DATA_OFFSET + 4))
+ GRAPH_BYTE_FOOTER=$(($GRAPH_OCTOPUS_DATA_OFFSET + 4 * $NUM_OCTOPUS_EDGES))
  
-+
-+# usage: corrupt_and_zero_graph_then_verify   
 
-+# Manipulates the commit-graph file at  by inserting 
the data,
-+# then zeros the file starting at . Finally, runs
-+# 'git commit-graph verify' and places the output in the file 'err'. 
Tests 'err'
-+# for the given string.
-+corrupt_and_zero_graph_then_verify() {
-+  corrupt_pos=$1
-+  data="${2:-\0}"
-+  zero_pos=$3
-+  grepstr=$4
+-# usage: corrupt_graph_and_verify   
++# usage: corrupt_graph_and_verify[]
+ # Manipulates the commit-graph file at the position
+-# by inserting the data, then runs 'git commit-graph verify'
++# by inserting the data, optionally zeroing the file
++# starting at , then runs 'git commit-graph verify'
+ # and places the output in the file 'err'. Test 'err' for
+ # the given string.
+ corrupt_graph_and_verify() {
+   pos=$1
+   data="${2:-\0}"
+   grepstr=$3
 +  orig_size=$(stat --format=%s $objdir/info/commit-graph)
-+  cd "$TRASH_DIRECTORY/full" &&
-+  test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
-+  cp $objdir/info/commit-graph commit-graph-backup &&
-+  printf "$data" | dd of="$objdir/info/commit-graph" bs=1 
seek="$corrupt_pos" conv=notrunc &&
++  zero_pos=${4:-${orig_size}}
+   cd "$TRASH_DIRECTORY/full" &&
+   test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
+   cp $objdir/info/commit-graph commit-graph-backup &&
+   printf "$data" | dd of="$objdir/info/commit-graph" bs=1 seek="$pos" 
conv=notrunc &&
 +  truncate --size=$zero_pos $objdir/info/commit-graph &&
 +  truncate --size=$orig_size $objdir/info/commit-graph &&
-+  test_must_fail git commit-graph verify 2>test_err &&
-+  grep -v "^+" test_err >err &&
-+  test_i18ngrep "$grepstr" err
-+}
+   test_must_fail git commit-graph verify 2>test_err &&
+   grep -v "^+" test_err >err
+   test_i18ngrep "$grepstr" err
+ }
+ 
 +
  test_expect_success 'detect bad signature' '
corrupt_graph_and_verify 0 "\0" \
@@ -73,9 +76,9 @@
"incorrect checksum"
  '
  
-+test_expect_success 'detect truncated graph' '
-+  corrupt_and_zero_graph_then_verify $GRAPH_BYTE_CHUNK_COUNT "\xff" \
-+  $GRAPH_CHUNK_LOOKUP_OFFSET "chunk lookup table entry missing"
++test_expect_success 'detect incorrect chunk count' '
++  corrupt_graph_and_verify $GRAPH_BYTE_CHUNK_COUNT "\xff" \
++  "chunk lookup table entry missing" $GRAPH_CHUNK_LOOKUP_OFFSET
 +'
 +
  test_expect_success 'git fsck (checks commit-graph)' '
2:  7519fc76df = 2:  06a32bfe8b Makefile: correct example fuzz build
-- 
2.20.0.rc2.12.g4c11c11dec



[PATCH v3 1/3] commit-graph, fuzz: Add fuzzer for commit-graph

2018-12-07 Thread Josh Steadmon
Breaks load_commit_graph_one() into a new function,
parse_commit_graph(). The latter function operates on arbitrary buffers,
which makes it suitable as a fuzzing target. Since parse_commit_graph()
is only called by load_commit_graph_one() (and the fuzzer described
below), we omit error messages that would be duplicated by the caller.

Adds fuzz-commit-graph.c, which provides a fuzzing entry point
compatible with libFuzzer (and possibly other fuzzing engines).

Signed-off-by: Josh Steadmon 
---
 .gitignore  |  1 +
 Makefile|  1 +
 commit-graph.c  | 53 ++---
 commit-graph.h  |  3 +++
 fuzz-commit-graph.c | 16 ++
 5 files changed, 57 insertions(+), 17 deletions(-)
 create mode 100644 fuzz-commit-graph.c

diff --git a/.gitignore b/.gitignore
index 0d77ea5894..8bcf153ed9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,4 @@
+/fuzz-commit-graph
 /fuzz_corpora
 /fuzz-pack-headers
 /fuzz-pack-idx
diff --git a/Makefile b/Makefile
index 1a44c811aa..6b72f37c29 100644
--- a/Makefile
+++ b/Makefile
@@ -684,6 +684,7 @@ SCRIPTS = $(SCRIPT_SH_INS) \
 
 ETAGS_TARGET = TAGS
 
+FUZZ_OBJS += fuzz-commit-graph.o
 FUZZ_OBJS += fuzz-pack-headers.o
 FUZZ_OBJS += fuzz-pack-idx.o
 
diff --git a/commit-graph.c b/commit-graph.c
index 40c855f185..07dd410f3c 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -84,16 +84,10 @@ static int commit_graph_compatible(struct repository *r)
 struct commit_graph *load_commit_graph_one(const char *graph_file)
 {
void *graph_map;
-   const unsigned char *data, *chunk_lookup;
size_t graph_size;
struct stat st;
-   uint32_t i;
-   struct commit_graph *graph;
+   struct commit_graph *ret;
int fd = git_open(graph_file);
-   uint64_t last_chunk_offset;
-   uint32_t last_chunk_id;
-   uint32_t graph_signature;
-   unsigned char graph_version, hash_version;
 
if (fd < 0)
return NULL;
@@ -108,27 +102,55 @@ struct commit_graph *load_commit_graph_one(const char 
*graph_file)
die(_("graph file %s is too small"), graph_file);
}
graph_map = xmmap(NULL, graph_size, PROT_READ, MAP_PRIVATE, fd, 0);
+   ret = parse_commit_graph(graph_map, fd, graph_size);
+
+   if (!ret) {
+   munmap(graph_map, graph_size);
+   close(fd);
+   exit(1);
+   }
+
+   return ret;
+}
+
+struct commit_graph *parse_commit_graph(void *graph_map, int fd,
+   size_t graph_size)
+{
+   const unsigned char *data, *chunk_lookup;
+   uint32_t i;
+   struct commit_graph *graph;
+   uint64_t last_chunk_offset;
+   uint32_t last_chunk_id;
+   uint32_t graph_signature;
+   unsigned char graph_version, hash_version;
+
+   if (!graph_map)
+   return NULL;
+
+   if (graph_size < GRAPH_MIN_SIZE)
+   return NULL;
+
data = (const unsigned char *)graph_map;
 
graph_signature = get_be32(data);
if (graph_signature != GRAPH_SIGNATURE) {
error(_("graph signature %X does not match signature %X"),
  graph_signature, GRAPH_SIGNATURE);
-   goto cleanup_fail;
+   return NULL;
}
 
graph_version = *(unsigned char*)(data + 4);
if (graph_version != GRAPH_VERSION) {
error(_("graph version %X does not match version %X"),
  graph_version, GRAPH_VERSION);
-   goto cleanup_fail;
+   return NULL;
}
 
hash_version = *(unsigned char*)(data + 5);
if (hash_version != GRAPH_OID_VERSION) {
error(_("hash version %X does not match version %X"),
  hash_version, GRAPH_OID_VERSION);
-   goto cleanup_fail;
+   return NULL;
}
 
graph = alloc_commit_graph();
@@ -152,7 +174,8 @@ struct commit_graph *load_commit_graph_one(const char 
*graph_file)
if (chunk_offset > graph_size - GIT_MAX_RAWSZ) {
error(_("improper chunk offset %08x%08x"), 
(uint32_t)(chunk_offset >> 32),
  (uint32_t)chunk_offset);
-   goto cleanup_fail;
+   free(graph);
+   return NULL;
}
 
switch (chunk_id) {
@@ -187,7 +210,8 @@ struct commit_graph *load_commit_graph_one(const char 
*graph_file)
 
if (chunk_repeated) {
error(_("chunk id %08x appears multiple times"), 
chunk_id);
-   goto cleanup_fail;
+   free(graph);
+   return NULL;
}
 
if (last_chunk_id == GRAPH_CHUNKID_OIDLOOKUP)
@@ -201,11 +225,6 @@ struct commit_graph *load_commit_graph_one(const char 
*graph_file)
}
 
return graph;
-
-cleanup_fail:
-   

[PATCH v3 2/3] commit-graph: fix buffer read-overflow

2018-12-07 Thread Josh Steadmon
fuzz-commit-graph identified a case where Git will read past the end of
a buffer containing a commit graph if the graph's header has an
incorrect chunk count. A simple bounds check in parse_commit_graph()
prevents this.

Signed-off-by: Josh Steadmon 
---
 commit-graph.c  | 14 --
 t/t5318-commit-graph.sh | 15 +--
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 07dd410f3c..836d65a1d3 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -165,10 +165,20 @@ struct commit_graph *parse_commit_graph(void *graph_map, 
int fd,
last_chunk_offset = 8;
chunk_lookup = data + 8;
for (i = 0; i < graph->num_chunks; i++) {
-   uint32_t chunk_id = get_be32(chunk_lookup + 0);
-   uint64_t chunk_offset = get_be64(chunk_lookup + 4);
+   uint32_t chunk_id;
+   uint64_t chunk_offset;
int chunk_repeated = 0;
 
+   if (data + graph_size - chunk_lookup <
+   GRAPH_CHUNKLOOKUP_WIDTH) {
+   error(_("chunk lookup table entry missing; graph file 
may be incomplete"));
+   free(graph);
+   return NULL;
+   }
+
+   chunk_id = get_be32(chunk_lookup + 0);
+   chunk_offset = get_be64(chunk_lookup + 4);
+
chunk_lookup += GRAPH_CHUNKLOOKUP_WIDTH;
 
if (chunk_offset > graph_size - GIT_MAX_RAWSZ) {
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 5fe21db99f..5b6b44b78e 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -366,24 +366,30 @@ GRAPH_OCTOPUS_DATA_OFFSET=$(($GRAPH_COMMIT_DATA_OFFSET + \
 GRAPH_BYTE_OCTOPUS=$(($GRAPH_OCTOPUS_DATA_OFFSET + 4))
 GRAPH_BYTE_FOOTER=$(($GRAPH_OCTOPUS_DATA_OFFSET + 4 * $NUM_OCTOPUS_EDGES))
 
-# usage: corrupt_graph_and_verify   
+# usage: corrupt_graph_and_verify[]
 # Manipulates the commit-graph file at the position
-# by inserting the data, then runs 'git commit-graph verify'
+# by inserting the data, optionally zeroing the file
+# starting at , then runs 'git commit-graph verify'
 # and places the output in the file 'err'. Test 'err' for
 # the given string.
 corrupt_graph_and_verify() {
pos=$1
data="${2:-\0}"
grepstr=$3
+   orig_size=$(stat --format=%s $objdir/info/commit-graph)
+   zero_pos=${4:-${orig_size}}
cd "$TRASH_DIRECTORY/full" &&
test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
cp $objdir/info/commit-graph commit-graph-backup &&
printf "$data" | dd of="$objdir/info/commit-graph" bs=1 seek="$pos" 
conv=notrunc &&
+   truncate --size=$zero_pos $objdir/info/commit-graph &&
+   truncate --size=$orig_size $objdir/info/commit-graph &&
test_must_fail git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err
test_i18ngrep "$grepstr" err
 }
 
+
 test_expect_success 'detect bad signature' '
corrupt_graph_and_verify 0 "\0" \
"graph signature"
@@ -484,6 +490,11 @@ test_expect_success 'detect invalid checksum hash' '
"incorrect checksum"
 '
 
+test_expect_success 'detect incorrect chunk count' '
+   corrupt_graph_and_verify $GRAPH_BYTE_CHUNK_COUNT "\xff" \
+   "chunk lookup table entry missing" $GRAPH_CHUNK_LOOKUP_OFFSET
+'
+
 test_expect_success 'git fsck (checks commit-graph)' '
cd "$TRASH_DIRECTORY/full" &&
git fsck &&
-- 
2.20.0.rc2.12.g4c11c11dec



[PATCH v3 3/3] Makefile: correct example fuzz build

2018-12-07 Thread Josh Steadmon
Signed-off-by: Josh Steadmon 
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 6b72f37c29..bbcfc2bc9f 100644
--- a/Makefile
+++ b/Makefile
@@ -3104,7 +3104,7 @@ cover_db_html: cover_db
 # An example command to build against libFuzzer from LLVM 4.0.0:
 #
 # make CC=clang CXX=clang++ \
-#  FUZZ_CXXFLAGS="-fsanitize-coverage=trace-pc-guard -fsanitize=address" \
+#  CFLAGS="-fsanitize-coverage=trace-pc-guard -fsanitize=address" \
 #  LIB_FUZZING_ENGINE=/usr/lib/llvm-4.0/lib/libFuzzer.a \
 #  fuzz-all
 #
-- 
2.20.0.rc2.12.g4c11c11dec



Re: RFE: git-patch-id should handle patches without leading "diff"

2018-12-07 Thread Ævar Arnfjörð Bjarmason


On Fri, Dec 07 2018, Jonathan Nieder wrote:

> Hi,
>
> Konstantin Ryabitsev wrote:
>
>> Every now and again I come across a patch sent to LKML without a leading
>> "diff a/foo b/foo" -- usually produced by quilt. E.g.:
>>
>> https://lore.kernel.org/lkml/20181125185004.151077...@linutronix.de/
>>
>> I am guessing quilt does not bother including the leading "diff a/foo
>> b/foo" because it's redundant with the next two lines, however this
>> remains a valid patch recognized by git-am.
>>
>> If you pipe that patch via git-patch-id, it produces nothing, but if I
>> put in the leading "diff", like so:
>>
>> diff a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
>>
>> then it properly returns "fb3ae17451bc619e3d7f0dd647dfba2b9ce8992e".
>
> Interesting.  As Ævar mentioned, the relevant code is
>
>   /* Ignore commit comments */
>   if (!patchlen && !starts_with(line, "diff "))
>   continue;
>
> which is trying to handle a case where a line that is special to the
> parser appears before the diff begins.
>
> The patch-id appears to only care about the diff text, so it should be
> able to handle this.  So if we have a better heuristic for where the
> diff starts, it would be good to use it.

No, the patch-id doesn't just care about the diff, it cares about the
context before the diff too.

See this patch:

$ git diff-tree --src-prefix=x/ --dst-prefix=y/ -p HEAD~..
diff --git x/refs/files-backend.c y/refs/files-backend.c
index 9183875dad..dd8abe9185 100644
--- x/refs/files-backend.c
+++ y/refs/files-backend.c
@@ -180,7 +180,8 @@ static void files_reflog_path(struct files_ref_store 
*refs,
break;
case REF_TYPE_OTHER_PSEUDOREF:
case REF_TYPE_MAIN_PSEUDOREF:
-   return files_reflog_path_other_worktrees(refs, sb, refname);
+   files_reflog_path_other_worktrees(refs, sb, refname);
+   break;
case REF_TYPE_NORMAL:
strbuf_addf(sb, "%s/logs/%s", refs->gitcommondir, refname);
break;

Observe that the diff --git line matters, we hash it:

$ git diff-tree -p HEAD~.. | git patch-id
5870d115b7e2a9a936ab8fdc254932234413c710 

$ git diff-tree --src-prefix=a/ --dst-prefix=b/ -p HEAD~.. | git patch-id 
--stable
5870d115b7e2a9a936ab8fdc254932234413c710 

$ git diff-tree --src-prefix=x/ --dst-prefix=y/ -p HEAD~.. | git patch-id 
--stable
4cd136f2b98760150f700ac6a5b126389d6d05a7 


The thing it doesn't care about is the "index" between the "diff" and
patch:

$ git diff-tree --src-prefix=x/ --dst-prefix=y/ -p HEAD~.. | grep -v ^index 
| git patch-id --stable
4cd136f2b98760150f700ac6a5b126389d6d05a7 


We also care about the +++ and --- lines:

$ git diff-tree --src-prefix=x/ --dst-prefix=y/ -p HEAD~.. | grep -v ^index 
| perl -pe 's/^(\+\+\+|---).*/$1/g' | git patch-id
56985c2c38cce6079de2690082e1770a8e81214c 


Then we normalize the @@ line, e.g.:

$ git diff-tree --src-prefix=x/ --dst-prefix=y/ -p HEAD~.. | grep -v ^index 
| git patch-id
4cd136f2b98760150f700ac6a5b126389d6d05a7 

$ git diff-tree --src-prefix=x/ --dst-prefix=y/ -p HEAD~.. | grep -v ^index 
| perl -pe 's/\d+/123/g' | git patch-id
4cd136f2b98760150f700ac6a5b126389d6d05a7 



There's other caveats (see the code, e.g. "strip space") but to a first
approximation a patch id is a hash of something that looks like this:

diff --git x/refs/files-backend.c y/refs/files-backend.c
--- x/refs/files-backend.c
+++ y/refs/files-backend.c
@@ -123,123 +123,123 @@ static void files_reflog_path(struct 
files_ref_store *refs,
break;
case REF_TYPE_OTHER_PSEUDOREF:
case REF_TYPE_MAIN_PSEUDOREF:
-   return files_reflog_path_other_worktrees(refs, sb, refname);
+   files_reflog_path_other_worktrees(refs, sb, refname);
+   break;
case REF_TYPE_NORMAL:
strbuf_addf(sb, "%s/logs/%s", refs->gitcommondir, refname);
break;

Which means that accepting a patch like this as input would actually
give you a different patch-id than if it had the proper header.

So it seems most sensible to me if this is going to be supported that we
go a bit beyond the call of duty and fake up the start of it, namely:

--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c

To be:

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c

It'll make the state machine a bit more complex, but IMO it would suck
more if 

Re: [PATCH] mailmap: update brandon williams's email address

2018-12-07 Thread Jonathan Nieder
Hi,

Stefan Beller wrote:

> What would be more of interest is why we'd be interested in this patch
> as there is no commit/patch sent by Brandon with this email in gits history.

I think there's an implicit assumption in this question that isn't
spelled out.  Do I understand correctly that you're saying the main
purpose of .mailmap is to figure out whether two commits are by the
same author?

My own uses of .mailmap primarily have a different purpose: to find
out the preferred contact address for the author of a given commit.

Thanks,
Jonathan


Re: enhancement: support for author.email and author.name in "git config"

2018-12-07 Thread William Hubbs
Hi Johannes,

On Thu, Dec 06, 2018 at 11:20:07PM +0100, Johannes Schindelin wrote:
> Hi William,
> 
>[...]
> 
> There *is* a way to get what you want that is super easy and will
> definitely work: if you sit down and do it ;-)
> 
> Please let us know if you need any additional information before you can
> start.

Which branch should I work off of in the repo?

William


Re: [PATCH v2 3/3] Refactor mingw_cygwin_offset_1st_component()

2018-12-07 Thread Johannes Schindelin
Hi Torsten,

On Fri, 7 Dec 2018, tbo...@web.de wrote:

> diff --git a/compat/mingw-cygwin.c b/compat/mingw-cygwin.c
> index 5552c3ac20..c379a72775 100644
> --- a/compat/mingw-cygwin.c
> +++ b/compat/mingw-cygwin.c
> @@ -10,10 +10,8 @@ size_t mingw_cygwin_skip_dos_drive_prefix(char **path)
>  size_t mingw_cygwin_offset_1st_component(const char *path)
>  {
>   char *pos = (char *)path;
> -
> - /* unc paths */

This comment is still useful (and now even more correct), and should stay.

> - if (!skip_dos_drive_prefix() &&
> - is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
> + if (is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
> + /* unc path */
>   /* skip server name */
>   pos = strpbrk(pos + 2, "\\/");
>   if (!pos)
> @@ -22,7 +20,8 @@ size_t mingw_cygwin_offset_1st_component(const char *path)
>   do {
>   pos++;
>   } while (*pos && !is_dir_sep(*pos));
> + } else {
> + skip_dos_drive_prefix();
>   }
> -

Why remove this empty line? It structures the code quite nicely.

The rest looks correct to me,
Johannes

>   return pos + is_dir_sep(*pos) - path;
>  }
> -- 
> 2.19.0.271.gfe8321ec05
> 
> 


Re: [PATCH] mailmap: update brandon williams's email address

2018-12-07 Thread Stefan Beller
On Fri, Dec 7, 2018 at 1:40 PM Jonathan Nieder  wrote:
>
> Brandon Williams wrote:
>
> > Signed-off-by: Brandon Williams 
> > ---
> >  .mailmap | 1 +
> >  1 file changed, 1 insertion(+)
>
> I can confirm that this is indeed the same person.

What would be more of interest is why we'd be interested in this patch
as there is no commit/patch sent by Brandon with this email in gits history.

Is that so you get cc'd on your private address and can follow
things you worked on without being subscribed to the mailing list?
(I'd be interested to see the use case in the commit message;)

Thanks,
Stefan


Re: RFE: git-patch-id should handle patches without leading "diff"

2018-12-07 Thread Jonathan Nieder
Hi,

Konstantin Ryabitsev wrote:

> Every now and again I come across a patch sent to LKML without a leading
> "diff a/foo b/foo" -- usually produced by quilt. E.g.:
>
> https://lore.kernel.org/lkml/20181125185004.151077...@linutronix.de/
>
> I am guessing quilt does not bother including the leading "diff a/foo
> b/foo" because it's redundant with the next two lines, however this
> remains a valid patch recognized by git-am.
>
> If you pipe that patch via git-patch-id, it produces nothing, but if I
> put in the leading "diff", like so:
>
> diff a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
>
> then it properly returns "fb3ae17451bc619e3d7f0dd647dfba2b9ce8992e".

Interesting.  As Ævar mentioned, the relevant code is

/* Ignore commit comments */
if (!patchlen && !starts_with(line, "diff "))
continue;

which is trying to handle a case where a line that is special to the
parser appears before the diff begins.

The patch-id appears to only care about the diff text, so it should be
able to handle this.  So if we have a better heuristic for where the
diff starts, it would be good to use it.

"git apply" uses apply.c::find_header, which is more permissive.
Maybe it would be possible to unify these somehow.  (I haven't looked
closely enough to tell how painful that would be.)

Thanks and hope that helps,
Jonathan


Re: [wishlist] git submodule update --reset-hard

2018-12-07 Thread Stefan Beller
On Thu, Dec 6, 2018 at 5:23 PM Yaroslav Halchenko  wrote:

> > There was a proposal to "re-attach HEAD" in the submodule, i.e.
> > if the branch branch points at the same commit, we don't need
> > a detached HEAD, but could go with the branch instead.
>
> if I got the idea right, if we are talking about any branch, it
> would also non-deterministic since who knows what left over branch(es)
> point to that commit.  Not sure if I would have used that ;)

I would think we'd rather want to have it deterministic, i.e. something like
1) record branch name of the submodule
2) update submodules HEAD to to superprojects gitlink
3) if recorded branch (1) matches the sha1 of detached HEAD,
  have HEAD point to the branch instead.

You notice a small inefficiency here as we write HEAD twice, so it
could be reworded as:
1) compare superprojects gitlink with the submodules branch
2a) if equal, set submodules HEAD to branch
2b) if unequal set HEAD to gitlink value, resulting in detached HEAD

Note that this idea of reattaching reflects the idea (a) below.


> > a) "stay on submodule branch (i.e. HEAD still points at $branch), and
> > reset --hard" such that the submodule has a clean index and at that $branch
> > or
> > b) "stay on submodule branch (i.e. HEAD still points at $branch), but 
> > $branch is
> >set to the gitlink from the superproject, and then a reset --hard
> >will have the worktree set to it as well.


> NB "gitlink" -- just now discovered the thing for me.  Thought it would be
> called a  subproject  echoing what git diff/log -p shows for submodule 
> commits.

The terminology is messy:
The internal representation in Gits object model is a "gitlink" entry in a tree
object. Once we have a .gitmodules entry, we call it submodule.

The term 'subproject' is a historic artifact and will likely not be changed
in the diff output (or format-patch), because these diffs can be applied using
git-am for example. That makes the diff output effectively a transport
protocol, and changing protocols is hard if you have no versioning in them.

More in https://git-scm.com/docs/gitsubmodules (a rather recent new write
of a man page, going into concepts).

> > > right -- I meant the local changes and indeed reset --recurse-submodules
> > > indeed seems to recurse nicely.  Then the undesired effect remaining only
> > > the detached HEAD
>
> > For that we may want to revive discussions in
> > https://public-inbox.org/git/20170501180058.8063-5-sbel...@google.com/
>
> well, isn't that one requires a branch to be specified in .gitmodules?

Ah good point.

> >   git reset --hard --recursive=hard,keep-branch PREVIOUSPOINT
>
> 'keep-branch' (given aforementioned keeping the specified in .gitmodules
> branch) might be confusing.  Also what if a submodule already in a
> detached HEAD?  IMHO --recursive=hard, and just saying that it would do
> "reset --hard", is imho sufficient.  (that is why I like pure
> --reset hard   since it doesn't care and neither does anything to the
> branch)

For that we might want to first do the

  git submodule update --reset-hard

which runs reset --hard inside the submodule, no matter which
branch the submodule is on (if any) and resets to the given
superproject sha1.

See git-submodule.sh in git.git[1] in cmd_update.
We'd need to add a command line flag (`--reset-hard`
would be the obvious choice?) which would set the `update`
variable, which then is evaluated to what needs to be done in
the submodule, which in that case would be the hard reset.

https://github.com/git/git/blob/master/git-submodule.sh#L606

Once that is done we'd want to add a test case, presumably
in t/t7406-submodule-update.sh

> > > I would have asked for
>
> > >git revert --recursive ...
> > >git rebase --recursive [-i] ...
>
> > > which I also frequently desire (could elaborate on the use cases etc).
>
> > These would be nice to have. It would be nice if you'd elaborate on the
> > use cases for future reference in the mailing list archive. :-)
>
> ok, will try to do so ;-) In summary: they are just a logical extension
> of git support for submodules for anyone actively working with
> submodules to keep entire tree in sync.  Then quite often the need for
> reverting a specific commit (which also has changes reflected in
> submodules) arises.  The same with rebase, especially to trim away some
> no longer desired changes reflected in submodules.
>
> the initial "git submodule update --reset-hard" is pretty much a
> crude workaround for some of those cases, so I would just go earlier in
> the history, and redo some things, whenever I could just drop or revert
> some selected set of commits.

That makes sense.
Do you want to give the implementation a try for the --reset-hard switch?

> ah... so it is only   submodule  command which has --recursive, and the
> rest have --recurse-submodules   when talking about recursing into
> submodules?

I don't think we were that cautious in development as it was done by
different 

Re: [PATCH v2 1/3] git clone C:\cygwin\home\USER\repo' is working (again)

2018-12-07 Thread Johannes Schindelin
Hi Torsten,

On Fri, 7 Dec 2018, tbo...@web.de wrote:

>  compat/mingw-cygwin.c | 28 
>  compat/mingw-cygwin.h | 20 

Please use compat/win32/path.c (or .../path-utils.c) instead. This is not
so much about MINGW or Cygwin or MSys or MSYS2 or Visual C++, but about
Windows.

Thanks,
Johannes


Re: Retrieving a file in git that was deleted and committed

2018-12-07 Thread biswaranjan panda
On Thu, Dec 6, 2018 at 11:26 PM Jeff King  wrote:
>>
>> On Thu, Dec 06, 2018 at 11:07:00PM -0800, biswaranjan panda wrote:
>>
> >> Thanks! Strangely git log --follow does work.
>>
>> I suspect it would work even without --follow. When you limit a log
>> traversal with a pathspec, like:
>>
>>   git log foo
>>
>> that is not about following some continuous stream of content, but
>> rather just applying that pathspec to the diff of each commit, and
>> pruning ones where it did not change. So even if there are gaps where
>> the file did not exist, we continue to apply the pathspec to the older
>> commits.

> Ah, of course. Thanks for the clarification, Jeff. And my > apologies to
> Biswaranjan Panda for the incorrect information.

Thanks Jeff and Bryan! However, I am curious that if there were a way
to tell git blame to skip a commit (the one which added the file again
and maybe the one which deleted it originally) while it walks back
through history, then it should just get back the
entire history right ?
On Thu, Dec 6, 2018 at 11:37 PM Bryan Turner  wrote:
>
> On Thu, Dec 6, 2018 at 11:26 PM Jeff King  wrote:
> >
> > On Thu, Dec 06, 2018 at 11:07:00PM -0800, biswaranjan panda wrote:
> >
> > > Thanks! Strangely git log --follow does work.
> >
> > I suspect it would work even without --follow. When you limit a log
> > traversal with a pathspec, like:
> >
> >   git log foo
> >
> > that is not about following some continuous stream of content, but
> > rather just applying that pathspec to the diff of each commit, and
> > pruning ones where it did not change. So even if there are gaps where
> > the file did not exist, we continue to apply the pathspec to the older
> > commits.
>
> Ah, of course. Thanks for the clarification, Jeff. And my apologies to
> Biswaranjan Panda for the incorrect information.
>
> >
> > Tools like git-blame will _not_ work, though, as they really are trying
> > to track the content as they walk back through history. And Once all of
> > the content seems to appear from nowhere in your new commit, that seems
> > like a dead end.
> >
> > In theory there could be some machine-readable annotation in the commit
> > object (or in a note created after the fact) to say "even though 'foo'
> > is a new file here, it came from $commit:foo".  And then git-blame could
> > keep following the content there. But such a feature does not yet exist.
> >
> > -Peff



-- 
Thanks,
-Biswa


[PATCH on master v2] revision: use commit graph in get_reference()

2018-12-07 Thread Jonathan Tan
When fetching into a repository, a connectivity check is first made by
check_exist_and_connected() in builtin/fetch.c that runs:

  git rev-list --objects --stdin --not --all --quiet <(list of objects)

If the client repository has many refs, this command can be slow,
regardless of the nature of the server repository or what is being
fetched. A profiler reveals that most of the time is spent in
setup_revisions() (approx. 60/63), and of the time spent in
setup_revisions(), most of it is spent in parse_object() (approx.
49/60). This is because setup_revisions() parses the target of every ref
(from "--all"), and parse_object() reads the buffer of the object.

Reading the buffer is unnecessary if the repository has a commit graph
and if the ref points to a commit (which is typically the case). This
patch uses the commit graph wherever possible; on my computer, when I
run the above command with a list of 1 object on a many-ref repository,
I get a speedup from 1.8s to 1.0s.

Another way to accomplish this effect would be to modify parse_object()
to use the commit graph if possible; however, I did not want to change
parse_object()'s current behavior of always checking the object
signature of the returned object.

Signed-off-by: Jonathan Tan 
---
This patch is now on master.

v2 makes use of the optimization Stolee describes in [1], except that I
have arranged the functions slightly differently. In particular, I
didn't want to add even more ways to obtain objects, so I let
parse_commit_in_graph() be able to take in either a commit shell or an
OID, and did not create the parse_probably_commit() function he
suggested. But I'm not really attached to this design choice, and can
change it if requested.

[1] https://public-inbox.org/git/aa0cd481-c135-47aa-2a69-e3dc71661...@gmail.com/
---
 commit-graph.c | 38 --
 commit-graph.h | 12 
 commit.c   |  2 +-
 revision.c |  5 -
 t/helper/test-repository.c |  4 ++--
 5 files changed, 43 insertions(+), 18 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 40c855f185..a571b523b7 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -286,7 +286,8 @@ void close_commit_graph(struct repository *r)
r->objects->commit_graph = NULL;
 }
 
-static int bsearch_graph(struct commit_graph *g, struct object_id *oid, 
uint32_t *pos)
+static int bsearch_graph(struct commit_graph *g, const struct object_id *oid,
+uint32_t *pos)
 {
return bsearch_hash(oid->hash, g->chunk_oid_fanout,
g->chunk_oid_lookup, g->hash_len, pos);
@@ -374,24 +375,41 @@ static int find_commit_in_graph(struct commit *item, 
struct commit_graph *g, uin
}
 }
 
-static int parse_commit_in_graph_one(struct commit_graph *g, struct commit 
*item)
+static struct commit *parse_commit_in_graph_one(struct repository *r,
+   struct commit_graph *g,
+   struct commit *shell,
+   const struct object_id *oid)
 {
uint32_t pos;
 
-   if (item->object.parsed)
-   return 1;
+   if (shell && shell->object.parsed)
+   return shell;
 
-   if (find_commit_in_graph(item, g, ))
-   return fill_commit_in_graph(item, g, pos);
+   if (shell && shell->graph_pos != COMMIT_NOT_FROM_GRAPH) {
+   pos = shell->graph_pos;
+   } else if (bsearch_graph(g, shell ? >object.oid : oid, )) {
+   /* bsearch_graph sets pos */
+   } else {
+   return NULL;
+   }
 
-   return 0;
+   if (!shell) {
+   shell = lookup_commit(r, oid);
+   if (!shell)
+   return NULL;
+   }
+
+   fill_commit_in_graph(shell, g, pos);
+   return shell;
 }
 
-int parse_commit_in_graph(struct repository *r, struct commit *item)
+struct commit *parse_commit_in_graph(struct repository *r, struct commit 
*shell,
+const struct object_id *oid)
 {
if (!prepare_commit_graph(r))
return 0;
-   return parse_commit_in_graph_one(r->objects->commit_graph, item);
+   return parse_commit_in_graph_one(r, r->objects->commit_graph, shell,
+oid);
 }
 
 void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -1025,7 +1043,7 @@ int verify_commit_graph(struct repository *r, struct 
commit_graph *g)
}
 
graph_commit = lookup_commit(r, _oid);
-   if (!parse_commit_in_graph_one(g, graph_commit))
+   if (!parse_commit_in_graph_one(r, g, graph_commit, NULL))
graph_report("failed to parse %s from commit-graph",
 oid_to_hex(_oid));
}
diff --git a/commit-graph.h b/commit-graph.h
index 

Re: [PATCH] mailmap: update brandon williams's email address

2018-12-07 Thread Jonathan Nieder
Brandon Williams wrote:

> Signed-off-by: Brandon Williams 
> ---
>  .mailmap | 1 +
>  1 file changed, 1 insertion(+)

I can confirm that this is indeed the same person.

Reviewed-by: Jonathan Nieder 

Welcome back!


[PATCH] mailmap: update brandon williams's email address

2018-12-07 Thread Brandon Williams
Signed-off-by: Brandon Williams 
---
 .mailmap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index eb7b5fc7b..247a3deb7 100644
--- a/.mailmap
+++ b/.mailmap
@@ -27,6 +27,7 @@ Ben Walton  
 Benoit Sigoure  
 Bernt Hansen  
 Brandon Casey  
+Brandon Williams  
 brian m. carlson 
 brian m. carlson  

 Bryan Larsen  
-- 
2.19.1



Re: RFE: git-patch-id should handle patches without leading "diff"

2018-12-07 Thread Ævar Arnfjörð Bjarmason


On Fri, Dec 07 2018, Konstantin Ryabitsev wrote:

> Hi, all:
>
> Every now and again I come across a patch sent to LKML without a leading
> "diff a/foo b/foo" -- usually produced by quilt. E.g.:
>
> https://lore.kernel.org/lkml/20181125185004.151077...@linutronix.de/
>
> I am guessing quilt does not bother including the leading "diff a/foo
> b/foo" because it's redundant with the next two lines, however this
> remains a valid patch recognized by git-am.
>
> If you pipe that patch via git-patch-id, it produces nothing, but if I
> put in the leading "diff", like so:
>
> diff a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
>
> then it properly returns "fb3ae17451bc619e3d7f0dd647dfba2b9ce8992e".
>
> Can we please teach git-patch-id to work without the leading diff a/foo
> b/foo, same as git-am?
>
> Best,
> -K

The state machine is sensitive there being a "diff" line, then "index"
etc.

diff --git a/builtin/patch-id.c b/builtin/patch-id.c
index 970d0d30b4..b99e4455fd 100644
--- a/builtin/patch-id.c
+++ b/builtin/patch-id.c
@@ -97,7 +97,9 @@ static int get_one_patchid(struct object_id *next_oid, struct 
object_id *result,
}

/* Ignore commit comments */
-   if (!patchlen && !starts_with(line, "diff "))
+   if (!patchlen && starts_with(line, "--- a/"))
+   ;
+   else if (!patchlen && !starts_with(line, "diff "))
continue;

/* Parsing diff header?  */

This would make it produce a patch-id for that input, however note that
I've done "--- a/" there, with just "--- " (which is legit) we'd get
confused and start earlier before the diffstat.

So if you're interested in having this I leave it to you to run with
this & write tests for it, but more convincingly run it on the git &
LKML archives and see that the output is the same (or just extra in case
where we now find patches) with --stable etc.


[PATCH] l10n: de.po: fix two messages

2018-12-07 Thread Ralf Thielow
Reported-by: Phillip Szelat 
Signed-off-by: Ralf Thielow 
---
Hi Phillip,

Good catches. Thanks for the review!

Ralf

 po/de.po | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/po/de.po b/po/de.po
index eb213d742..d5113434a 100644
--- a/po/de.po
+++ b/po/de.po
@@ -3421,7 +3421,7 @@ msgstr "Fehler bei Vorbereitung der Packdatei aus 
multi-pack-index."
 #: midx.c:407
 #, c-format
 msgid "failed to add packfile '%s'"
-msgstr "Fehler beim Hinzufügen von Packdatei'%s'."
+msgstr "Fehler beim Hinzufügen von Packdatei '%s'."
 
 #: midx.c:413
 #, c-format
@@ -4559,7 +4559,7 @@ msgstr "Öffnen von /dev/null fehlgeschlagen"
 #: run-command.c:1229
 #, c-format
 msgid "cannot create async thread: %s"
-msgstr "Kann Thread für async nicht erzeugen: %s"
+msgstr "Konnte Thread für async nicht erzeugen: %s"
 
 #: run-command.c:1293
 #, c-format
-- 
2.20.0.rc2.411.g8f28e744c2



Re: [PATCH] git-rebase.txt: update note about directory rename detection and am

2018-12-07 Thread Elijah Newren
On Fri, Dec 7, 2018 at 9:51 AM Johannes Sixt  wrote:
>
> From: Elijah Newren 
>
> In commit 6aba117d5cf7 ("am: avoid directory rename detection when
> calling recursive merge machinery", 2018-08-29), the git-rebase manpage
> probably should have also been updated to note the stronger
> incompatibility between git-am and directory rename detection.  Update
> it now.
>
> Signed-off-by: Elijah Newren 
> Signed-off-by: Johannes Sixt 
> ---
>   Documentation/git-rebase.txt | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 41631df6e4..7bea21e8e3 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -569,8 +569,9 @@ it to keep commits that started empty.
>   Directory rename detection
>   ~~
>
> -The merge and interactive backends work fine with
> -directory rename detection.  The am backend sometimes does not.
> +Directory rename heuristics are enabled in the merge and interactive
> +backends.  Due to the lack of accurate tree information, directory
> +rename detection is disabled in the am backend.
>
>   include::merge-strategies.txt[]

I was intending to send this out the past couple days, was just kinda
busy.  Thanks for handling it for me.


RFE: git-patch-id should handle patches without leading "diff"

2018-12-07 Thread Konstantin Ryabitsev
Hi, all:

Every now and again I come across a patch sent to LKML without a leading
"diff a/foo b/foo" -- usually produced by quilt. E.g.:

https://lore.kernel.org/lkml/20181125185004.151077...@linutronix.de/

I am guessing quilt does not bother including the leading "diff a/foo
b/foo" because it's redundant with the next two lines, however this
remains a valid patch recognized by git-am.

If you pipe that patch via git-patch-id, it produces nothing, but if I
put in the leading "diff", like so:

diff a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c

then it properly returns "fb3ae17451bc619e3d7f0dd647dfba2b9ce8992e".

Can we please teach git-patch-id to work without the leading diff a/foo
b/foo, same as git-am?

Best,
-K


signature.asc
Description: PGP signature


[PATCH] git-rebase.txt: update note about directory rename detection and am

2018-12-07 Thread Johannes Sixt

From: Elijah Newren 

In commit 6aba117d5cf7 ("am: avoid directory rename detection when
calling recursive merge machinery", 2018-08-29), the git-rebase manpage
probably should have also been updated to note the stronger
incompatibility between git-am and directory rename detection.  Update
it now.

Signed-off-by: Elijah Newren 
Signed-off-by: Johannes Sixt 
---
 Documentation/git-rebase.txt | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 41631df6e4..7bea21e8e3 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -569,8 +569,9 @@ it to keep commits that started empty.
 Directory rename detection
 ~~

-The merge and interactive backends work fine with
-directory rename detection.  The am backend sometimes does not.
+Directory rename heuristics are enabled in the merge and interactive
+backends.  Due to the lack of accurate tree information, directory
+rename detection is disabled in the am backend.

 include::merge-strategies.txt[]

--
2.19.1.1133.g2dd3d172d2


[PATCH v2 1/3] git clone C:\cygwin\home\USER\repo' is working (again)

2018-12-07 Thread tboegi
From: Torsten Bögershausen 

A regression for cygwin users was introduced with commit 05b458c,
 "real_path: resolve symlinks by hand".

In the the commit message we read:
  The current implementation of real_path uses chdir() in order to resolve
symlinks.  Unfortunately this isn't thread-safe as chdir() affects a
  process as a whole...

The old (and non-thread-save) OS calls chdir()/pwd() had been
replaced by a string operation.
The cygwin layer "knows" that "C:\cygwin" is an absolute path,
but the new string operation does not.

"git clone  C:\cygwin\home\USER\repo" fails like this:
fatal: Invalid path '/home/USER/repo/C:\cygwin\home\USER\repo'

The solution is to implement has_dos_drive_prefix(), skip_dos_drive_prefix()
is_dir_sep(), offset_1st_component() and convert_slashes() for cygwin
in the same way as it is done in 'Git for Windows' in compat/mingw.[ch]

Instead of duplicating the code, it is extracted into compat/mingw-cygwin.[ch]
Some need for refactoring and cleanup came up in the review, they are adressed
in a seperate commit.

Reported-By: Steven Penny 
Signed-off-by: Torsten Bögershausen 
---
 compat/cygwin.c   | 19 ---
 compat/cygwin.h   |  2 --
 compat/mingw-cygwin.c | 28 
 compat/mingw-cygwin.h | 20 
 compat/mingw.c| 29 +
 compat/mingw.h| 20 
 config.mak.uname  |  4 ++--
 git-compat-util.h |  3 ++-
 8 files changed, 53 insertions(+), 72 deletions(-)
 delete mode 100644 compat/cygwin.c
 delete mode 100644 compat/cygwin.h
 create mode 100644 compat/mingw-cygwin.c
 create mode 100644 compat/mingw-cygwin.h

diff --git a/compat/cygwin.c b/compat/cygwin.c
deleted file mode 100644
index b9862d606d..00
--- a/compat/cygwin.c
+++ /dev/null
@@ -1,19 +0,0 @@
-#include "../git-compat-util.h"
-#include "../cache.h"
-
-int cygwin_offset_1st_component(const char *path)
-{
-   const char *pos = path;
-   /* unc paths */
-   if (is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
-   /* skip server name */
-   pos = strchr(pos + 2, '/');
-   if (!pos)
-   return 0; /* Error: malformed unc path */
-
-   do {
-   pos++;
-   } while (*pos && pos[0] != '/');
-   }
-   return pos + is_dir_sep(*pos) - path;
-}
diff --git a/compat/cygwin.h b/compat/cygwin.h
deleted file mode 100644
index 8e52de4644..00
--- a/compat/cygwin.h
+++ /dev/null
@@ -1,2 +0,0 @@
-int cygwin_offset_1st_component(const char *path);
-#define offset_1st_component cygwin_offset_1st_component
diff --git a/compat/mingw-cygwin.c b/compat/mingw-cygwin.c
new file mode 100644
index 00..c63d7acb9c
--- /dev/null
+++ b/compat/mingw-cygwin.c
@@ -0,0 +1,28 @@
+#include "../git-compat-util.h"
+
+int mingw_cygwin_skip_dos_drive_prefix(char **path)
+{
+   int ret = has_dos_drive_prefix(*path);
+   *path += ret;
+   return ret;
+}
+
+int mingw_cygwin_offset_1st_component(const char *path)
+{
+   char *pos = (char *)path;
+
+   /* unc paths */
+   if (!skip_dos_drive_prefix() &&
+   is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
+   /* skip server name */
+   pos = strpbrk(pos + 2, "\\/");
+   if (!pos)
+   return 0; /* Error: malformed unc path */
+
+   do {
+   pos++;
+   } while (*pos && !is_dir_sep(*pos));
+   }
+
+   return pos + is_dir_sep(*pos) - path;
+}
diff --git a/compat/mingw-cygwin.h b/compat/mingw-cygwin.h
new file mode 100644
index 00..66ccc909ae
--- /dev/null
+++ b/compat/mingw-cygwin.h
@@ -0,0 +1,20 @@
+#define has_dos_drive_prefix(path) \
+   (isalpha(*(path)) && (path)[1] == ':' ? 2 : 0)
+int mingw_cygwin_skip_dos_drive_prefix(char **path);
+#define skip_dos_drive_prefix mingw_cygwin_skip_dos_drive_prefix
+static inline int mingw_cygwin_is_dir_sep(int c)
+{
+   return c == '/' || c == '\\';
+}
+#define is_dir_sep mingw_cygwin_is_dir_sep
+static inline char *mingw_cygwin_find_last_dir_sep(const char *path)
+{
+   char *ret = NULL;
+   for (; *path; ++path)
+   if (is_dir_sep(*path))
+   ret = (char *)path;
+   return ret;
+}
+#define find_last_dir_sep mingw_cygwin_find_last_dir_sep
+int mingw_cygwin_offset_1st_component(const char *path);
+#define offset_1st_component mingw_cygwin_offset_1st_component
diff --git a/compat/mingw.c b/compat/mingw.c
index 34b3880b29..038e96af9d 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -350,7 +350,7 @@ static inline int needs_hiding(const char *path)
return 0;
 
/* We cannot use basename(), as it would remove trailing slashes */
-   mingw_skip_dos_drive_prefix((char **));
+   mingw_cygwin_skip_dos_drive_prefix((char **));
if (!*path)
return 0;
 
@@ -2275,33 

[PATCH v2 3/3] Refactor mingw_cygwin_offset_1st_component()

2018-12-07 Thread tboegi
From: Torsten Bögershausen 

The Windows version of offset_1st_component() needs to hande 3 cases:
- The path is an UNC path, starting with "//" or "".
  Skip the servername and the name of the share.
- The path is a DOS drive, starting with e.g. "X:"
  The driver letter and the ':' must be skipped
- The path is pointing to a subdirectory somewhere in the path and the
  directory seperator needs to be skipped ('/' or '\\').

Refactor the code to make it easier to read.

Suggested-by: Johannes Schindelin 
Signed-off-by: Torsten Bögershausen 
---
 compat/mingw-cygwin.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/compat/mingw-cygwin.c b/compat/mingw-cygwin.c
index 5552c3ac20..c379a72775 100644
--- a/compat/mingw-cygwin.c
+++ b/compat/mingw-cygwin.c
@@ -10,10 +10,8 @@ size_t mingw_cygwin_skip_dos_drive_prefix(char **path)
 size_t mingw_cygwin_offset_1st_component(const char *path)
 {
char *pos = (char *)path;
-
-   /* unc paths */
-   if (!skip_dos_drive_prefix() &&
-   is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
+   if (is_dir_sep(pos[0]) && is_dir_sep(pos[1])) {
+   /* unc path */
/* skip server name */
pos = strpbrk(pos + 2, "\\/");
if (!pos)
@@ -22,7 +20,8 @@ size_t mingw_cygwin_offset_1st_component(const char *path)
do {
pos++;
} while (*pos && !is_dir_sep(*pos));
+   } else {
+   skip_dos_drive_prefix();
}
-
return pos + is_dir_sep(*pos) - path;
 }
-- 
2.19.0.271.gfe8321ec05



[PATCH v2 2/3] offset_1st_component(), dos_drive_prefix() return size_t

2018-12-07 Thread tboegi
From: Torsten Bögershausen 

Change the return value for offset_1st_component(),
has_dos_drive_prefix() and skip_dos_drive_prefix from int into size_t,
which is the natural type for length of data in memory.

While at it, remove possible "parameter not used" warnings in for the
non-Windows builds in git-compat-util.h

Signed-off-by: Torsten Bögershausen 
---
 abspath.c | 2 +-
 compat/mingw-cygwin.c | 6 +++---
 compat/mingw-cygwin.h | 4 ++--
 git-compat-util.h | 8 +---
 setup.c   | 4 ++--
 5 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/abspath.c b/abspath.c
index 9857985329..12055a1d8f 100644
--- a/abspath.c
+++ b/abspath.c
@@ -51,7 +51,7 @@ static void get_next_component(struct strbuf *next, struct 
strbuf *remaining)
 /* copies root part from remaining to resolved, canonicalizing it on the way */
 static void get_root_part(struct strbuf *resolved, struct strbuf *remaining)
 {
-   int offset = offset_1st_component(remaining->buf);
+   size_t offset = offset_1st_component(remaining->buf);
 
strbuf_reset(resolved);
strbuf_add(resolved, remaining->buf, offset);
diff --git a/compat/mingw-cygwin.c b/compat/mingw-cygwin.c
index c63d7acb9c..5552c3ac20 100644
--- a/compat/mingw-cygwin.c
+++ b/compat/mingw-cygwin.c
@@ -1,13 +1,13 @@
 #include "../git-compat-util.h"
 
-int mingw_cygwin_skip_dos_drive_prefix(char **path)
+size_t mingw_cygwin_skip_dos_drive_prefix(char **path)
 {
-   int ret = has_dos_drive_prefix(*path);
+   size_t ret = has_dos_drive_prefix(*path);
*path += ret;
return ret;
 }
 
-int mingw_cygwin_offset_1st_component(const char *path)
+size_t mingw_cygwin_offset_1st_component(const char *path)
 {
char *pos = (char *)path;
 
diff --git a/compat/mingw-cygwin.h b/compat/mingw-cygwin.h
index 66ccc909ae..0e8a0c9074 100644
--- a/compat/mingw-cygwin.h
+++ b/compat/mingw-cygwin.h
@@ -1,6 +1,6 @@
 #define has_dos_drive_prefix(path) \
(isalpha(*(path)) && (path)[1] == ':' ? 2 : 0)
-int mingw_cygwin_skip_dos_drive_prefix(char **path);
+size_t mingw_cygwin_skip_dos_drive_prefix(char **path);
 #define skip_dos_drive_prefix mingw_cygwin_skip_dos_drive_prefix
 static inline int mingw_cygwin_is_dir_sep(int c)
 {
@@ -16,5 +16,5 @@ static inline char *mingw_cygwin_find_last_dir_sep(const char 
*path)
return ret;
 }
 #define find_last_dir_sep mingw_cygwin_find_last_dir_sep
-int mingw_cygwin_offset_1st_component(const char *path);
+size_t mingw_cygwin_offset_1st_component(const char *path);
 #define offset_1st_component mingw_cygwin_offset_1st_component
diff --git a/git-compat-util.h b/git-compat-util.h
index 7ece969b22..65eaaf0d50 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -355,16 +355,18 @@ static inline int noop_core_config(const char *var, const 
char *value, void *cb)
 #endif
 
 #ifndef has_dos_drive_prefix
-static inline int git_has_dos_drive_prefix(const char *path)
+static inline size_t git_has_dos_drive_prefix(const char *path)
 {
+   (void)path;
return 0;
 }
 #define has_dos_drive_prefix git_has_dos_drive_prefix
 #endif
 
 #ifndef skip_dos_drive_prefix
-static inline int git_skip_dos_drive_prefix(char **path)
+static inline size_t git_skip_dos_drive_prefix(char **path)
 {
+   (void)path;
return 0;
 }
 #define skip_dos_drive_prefix git_skip_dos_drive_prefix
@@ -379,7 +381,7 @@ static inline int git_is_dir_sep(int c)
 #endif
 
 #ifndef offset_1st_component
-static inline int git_offset_1st_component(const char *path)
+static inline size_t git_offset_1st_component(const char *path)
 {
return is_dir_sep(path[0]);
 }
diff --git a/setup.c b/setup.c
index 1be5037f12..538bc1ff99 100644
--- a/setup.c
+++ b/setup.c
@@ -29,7 +29,7 @@ static int abspath_part_inside_repo(char *path)
size_t len;
size_t wtlen;
char *path0;
-   int off;
+   size_t off;
const char *work_tree = get_git_work_tree();
 
if (!work_tree)
@@ -800,7 +800,7 @@ static const char *setup_bare_git_dir(struct strbuf *cwd, 
int offset,
  struct repository_format *repo_fmt,
  int *nongit_ok)
 {
-   int root_len;
+   size_t root_len;
 
if (check_repository_format_gently(".", repo_fmt, nongit_ok))
return NULL;
-- 
2.19.0.271.gfe8321ec05



Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()

2018-12-07 Thread Derrick Stolee

On 12/6/2018 6:36 PM, Jonathan Tan wrote:

AFAICT oid_object_info doesn't take advantage of the commit graph,
but just looks up the object header, which is still less than completely
parsing it. Then lookup_commit is overly strict, as it may return
NULL as when there still is a type mismatch (I don't think a mismatch
could happen here, as both rely on just the object store, and not the
commit graph.), so this would be just defensive programming for
the sake of it. I dunno.

 struct commit *c;

 if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT &&
 (c = lookup_commit(revs->repo, oid)) &&
 !repo_parse_commit(revs->repo, c))
 object = (struct object *) c;
 else
 object = parse_object(revs->repo, oid);

I like this way better - I'll do it in the next version.


If we do _not_ have a commit-graph or if the commit-graph does not have
that commit, this will have the same performance problem, right?

Should we instead create a direct dependence on the commit-graph, and try
to parse the oid from the graph directly? If it succeeds, then we learn
that the object is a commit, in addition to all of the parsing work. This
means we could avoid oid_object_info() loading data if we succeed. We
would fall back to parse_object() if it fails.

I was thinking this should be a simple API call to parse_commit_in_graph(),
but that requires a struct commit filled with an oid, which is not the
best idea if we don't actually know it is a commit yet.

The approach I recommend would then be more detailed:

1. Modify find_commit_in_graph() to take a struct object_id instead of a
   struct commit. This helps find the integer position in the graph. That
   position can be used in fill_commit_in_graph() to load the commit
   contents. Keep find_commit_in_graph() static as it should not be a
   public function.

2. Create a public function with prototype

struct commit *try_parse_commit_from_graph(struct repository *r, struct 
object_id *oid)


   that returns a commit struct fully parsed if and only if the repository
   has that oid. It can call find_commit_in_graph(), then 
lookup_commit() and

   fill_commit_in_graph() to create the commit and parse the data.

3. In replace of the snippet above, do:

    struct commit *c;

    if ((c = try_parse_commit_from_graph(revs->repo, oid))
        object = (struct object *)c;
    else
        object = parse_object(revs->repo, oid);

A similar pattern _could_ be used in parse_object(), but I don't recommend
doing this pattern unless we have a reasonable suspicion that we are going
to parse commits more often than other objects. (It adds an O(log(# 
commits))

binary search to each object.)

A final thought: consider making this "try the commit graph first, but fall
back to parse_object()" a library function with a name like

    struct object *parse_probably_commit(struct repository *r, struct 
object_id *oid)


so other paths that are parsing a lot of commits (but also maybe tags) could
use the logic.

Thanks!
-Stolee


Re: [PATCH v2 2/3] commit-graph: fix buffer read-overflow

2018-12-07 Thread Derrick Stolee

On 12/6/2018 3:20 PM, Josh Steadmon wrote:

+
+# usage: corrupt_and_zero_graph_then_verify   
 
+# Manipulates the commit-graph file at  by inserting the 
data,
+# then zeros the file starting at . Finally, runs
+# 'git commit-graph verify' and places the output in the file 'err'. Tests 
'err'
+# for the given string.
+corrupt_and_zero_graph_then_verify() {


This method is very similar to to 'corrupt_graph_and_verify()', the only 
difference being the zero_pos, which zeroes the graph.


Could it instead be a modification of corrupt_graph_and_verify() where 
$4 is interpreted as zero_pos, and if it is blank we don't do the 
truncation?



+test_expect_success 'detect truncated graph' '
+   corrupt_and_zero_graph_then_verify $GRAPH_BYTE_CHUNK_COUNT "\xff" \
+   $GRAPH_CHUNK_LOOKUP_OFFSET "chunk lookup table entry missing"
+'
+


Thanks for this! I think it's valuable to keep explicit tests around 
that were discovered from your fuzz tests. Specifically, I can repeat 
the test when I get around to the next file format.


Thanks,
-Stolee


Re: Get "responsible" .gitignore file / rule

2018-12-07 Thread Victor Toni
Am Fr., 7. Dez. 2018 um 13:45 Uhr schrieb Eric Sunshine
:
>
> On Fri, Dec 7, 2018 at 7:36 AM Victor Toni  wrote:
> > I'm wondering if there is any way to show which rules (ideally with
> > the .gitignore file they are coming from) are causing a specific file
> > to get ignored so I could easily fix the .gitignore file?
>
> Perhaps the "git check-ignore" command would help.

Thanks for the tip!
Works like a charm (had to use the --verbose option though, without,
it does not give much feedback)


Re: Get "responsible" .gitignore file / rule

2018-12-07 Thread Eric Sunshine
On Fri, Dec 7, 2018 at 7:36 AM Victor Toni  wrote:
> I'm wondering if there is any way to show which rules (ideally with
> the .gitignore file they are coming from) are causing a specific file
> to get ignored so I could easily fix the .gitignore file?

Perhaps the "git check-ignore" command would help.


Get "responsible" .gitignore file / rule

2018-12-07 Thread Victor Toni
In a rather complex setup with deep directory structure it happens
every now and then, that files get ignored when trying to add them.
As these files are _not_ shown in `git status` but in `git status
--ignored` so I guess the culprit is some misconfigured `.gitignore`.

Trying to ad the specific file gives a:
$ git add ignored/file/name
The following paths are ignored by one of your .gitignore files:
ignored/file/name
Use -f if you really want to add them.

Using -v doen't add any verbosity. I'm using git 2.19.1.windows.1 if
this matters

I'm wondering if there is any way to show which rules (ideally with
the .gitignore file they are coming from) are causing a specific file
to get ignored so I could easily fix the .gitignore file?


Re: [PATCH v2 2/3] commit-graph: fix buffer read-overflow

2018-12-07 Thread Jeff King
On Thu, Dec 06, 2018 at 12:20:54PM -0800, Josh Steadmon wrote:

> diff --git a/commit-graph.c b/commit-graph.c
> index 07dd410f3c..224a5f161e 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -165,10 +165,20 @@ struct commit_graph *parse_commit_graph(void 
> *graph_map, int fd,
>   last_chunk_offset = 8;
>   chunk_lookup = data + 8;
>   for (i = 0; i < graph->num_chunks; i++) {
> - uint32_t chunk_id = get_be32(chunk_lookup + 0);
> - uint64_t chunk_offset = get_be64(chunk_lookup + 4);
> + uint32_t chunk_id;
> + uint64_t chunk_offset;
>   int chunk_repeated = 0;
>  
> + if (chunk_lookup + GRAPH_CHUNKLOOKUP_WIDTH >
> + data + graph_size) {
> + error(_("chunk lookup table entry missing; graph file 
> may be incomplete"));
> + free(graph);
> + return NULL;
> + }

Is it possible to overflow the addition here? E.g., if I'm on a 32-bit
system and the truncated chunk appears right at the 4GB limit, in which
case we wrap back around? I guess that's pretty implausible, since it
would mean that the mmap is bumping up against the end of the address
space. I didn't check, but I wouldn't be surprised if sane operating
systems avoid allocating those addresses.

But I think you could write this as:

  if (data + graph_size - chunk_lookup < GRAPH_CHUNKLOOKUP_WIDTH)

to avoid overflow (we know that "data + graph_size" is sane because
that's our mmap, and chunk_lookup is somewhere between "data" and "data
+ graph_size", so the result is between 0 and graph_size).

I dunno. I think I've convinced myself it's a non-issue here, but it may
be good to get in the habit of writing these sorts of offset checks in
an overflow-proof order.

-Peff


Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()

2018-12-07 Thread Jeff King
On Thu, Dec 06, 2018 at 03:54:46PM -0800, Jonathan Tan wrote:

> This makes sense - I thought I shouldn't mention the commit graph in the
> code since it seems like a layering violation, but I felt the need to
> mention commit graph in a comment, so maybe the need to mention commit
> graph in the code is there too. Subsequently, maybe the lookup-for-type
> could be replaced by a lookup-in-commit-graph (maybe by using
> parse_commit_in_graph() directly), which should be at least slightly
> faster.

That makes more sense to me. If we don't have a commit graph at all,
it's a quick noop. If we do, we might binary search in the list of
commits for a non-commit. But that's strictly faster than finding the
object's type (which involves a binary search of a larger list, followed
by actually accessing the type info).

> > In general, it would be nice if we had a more incremental API
> > for accessing objects: open, get metadata, then read the data. That
> > would make these kinds of optimizations "free".
> 
> Would this be assuming that to read the data, you would (1) first need to
> read the metadata, and (2) there would be no redundancy in reading the
> two? It seems to me that for loose objects, you would want to perform
> all your reads at once, since any read requires opening the file, and
> for commit graphs, you just want to read what you want, since the
> metadata and the data are in separate places.

By metadata here, I don't mean the commit-graph data, but just the
object type and size. So I'm imagining an interface more like:

  - object_open() locates the object, and stores either the pack
file/offset or a descriptor to a loose path in an opaque handle
struct

  - object_size() and object_type() on that handle would do what you
expect. For loose objects, these would parse the header (the
equivalent of unpack_sha1_header()). For packed ones, they'd use the
object header in the pack (and chase down the delta bits as needed).

  - object_contents() would return the full content

  - object_read() could sequentially read a subset of the file (this
could replace the streaming interface we currently have)

We have most of the low-level bits for this already, if you poke into
what object_info_extended() is doing. We just don't have them packaged
in an interface which can persist across multiple calls.

With an interface like that, parse_object()'s large-blob check could be
something like the patch below.

But your case here is a bit more interesting. If we have a commit graph,
then we can avoid opening (or even finding!) the on-disk object at all.
So I actually think it makes sense to just check the commit-graph first,
as discussed above.

---
diff --git a/object.c b/object.c
index e54160550c..afce58c0bc 100644
--- a/object.c
+++ b/object.c
@@ -254,23 +254,31 @@ struct object *parse_object(struct repository *r, const 
struct object_id *oid)
const struct object_id *repl = lookup_replace_object(r, oid);
void *buffer;
struct object *obj;
+   struct object_handle oh;
 
obj = lookup_object(r, oid->hash);
if (obj && obj->parsed)
return obj;
 
-   if ((obj && obj->type == OBJ_BLOB && has_object_file(oid)) ||
-   (!obj && has_object_file(oid) &&
-oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
-   if (check_object_signature(repl, NULL, 0, NULL) < 0) {
+   if (object_open(, oid) < 0)
+   return NULL; /* missing object */
+
+   if (object_type() == OBJ_BLOB) {
+   /* this will call object_read() on 4k chunks */
+   if (check_object_signature_stream(, oid)) {
error(_("sha1 mismatch %s"), oid_to_hex(oid));
return NULL;
}
+   object_close(); /* we don't care about contents */
parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
return lookup_object(r, oid->hash);
}
 
-   buffer = read_object_file(oid, , );
+   type = object_type();
+   size = object_size();
+   buffer = object_contents();
+   object_close();
+
if (buffer) {
if (check_object_signature(repl, buffer, size, type_name(type)) 
< 0) {
free(buffer);


Morning

2018-12-07 Thread Joy Smith Johnson
Hello; kindly see the attached my proposal Charities donation.


Mrs. Joy Smith Johnson.rtf
Description: RTF file


HELLO DEAR

2018-12-07 Thread michel
With Due Respect,

I know that this mail will come to you as a surprise as we have never
met before, but need not to worry as I am contacting you independently
of my investigation and no one is informed of this communication. I
need your urgent assistance in transferring the sum of $11.3million
immediately to your private account.The money has been here in our
Bank lying dormant for years now without anybody coming for the claim
of it.

I want to release the money to you as the relative to our deceased
customer (the account owner) who died a long with his supposed NEXT OF
KIN since 16th October 2005. The Banking laws here does not allow such
money to stay more than 13 years, because the money will be recalled
to the Bank treasury account as unclaimed fund.

By indicating your interest I will send you the full details on how
the business will be executed.

Please respond urgently and delete if you are not interested.

Best Regards,
Mr.Michel Duku.


Re: Retrieving a file in git that was deleted and committed

2018-12-06 Thread Bryan Turner
On Thu, Dec 6, 2018 at 11:26 PM Jeff King  wrote:
>
> On Thu, Dec 06, 2018 at 11:07:00PM -0800, biswaranjan panda wrote:
>
> > Thanks! Strangely git log --follow does work.
>
> I suspect it would work even without --follow. When you limit a log
> traversal with a pathspec, like:
>
>   git log foo
>
> that is not about following some continuous stream of content, but
> rather just applying that pathspec to the diff of each commit, and
> pruning ones where it did not change. So even if there are gaps where
> the file did not exist, we continue to apply the pathspec to the older
> commits.

Ah, of course. Thanks for the clarification, Jeff. And my apologies to
Biswaranjan Panda for the incorrect information.

>
> Tools like git-blame will _not_ work, though, as they really are trying
> to track the content as they walk back through history. And Once all of
> the content seems to appear from nowhere in your new commit, that seems
> like a dead end.
>
> In theory there could be some machine-readable annotation in the commit
> object (or in a note created after the fact) to say "even though 'foo'
> is a new file here, it came from $commit:foo".  And then git-blame could
> keep following the content there. But such a feature does not yet exist.
>
> -Peff


Re: Retrieving a file in git that was deleted and committed

2018-12-06 Thread Jeff King
On Thu, Dec 06, 2018 at 11:07:00PM -0800, biswaranjan panda wrote:

> Thanks! Strangely git log --follow does work.

I suspect it would work even without --follow. When you limit a log
traversal with a pathspec, like:

  git log foo

that is not about following some continuous stream of content, but
rather just applying that pathspec to the diff of each commit, and
pruning ones where it did not change. So even if there are gaps where
the file did not exist, we continue to apply the pathspec to the older
commits.

Tools like git-blame will _not_ work, though, as they really are trying
to track the content as they walk back through history. And Once all of
the content seems to appear from nowhere in your new commit, that seems
like a dead end.

In theory there could be some machine-readable annotation in the commit
object (or in a note created after the fact) to say "even though 'foo'
is a new file here, it came from $commit:foo".  And then git-blame could
keep following the content there. But such a feature does not yet exist.

-Peff


Re: Retrieving a file in git that was deleted and committed

2018-12-06 Thread biswaranjan panda
Thanks! Strangely git log --follow does work.
On Thu, Dec 6, 2018 at 10:55 PM Bryan Turner  wrote:
>
> On Thu, Dec 6, 2018 at 10:49 PM biswaranjan panda
>  wrote:
> >
> > I have the following scenario:
> >
> > On a branch A, I deleted a file foo.txt and committed the change. Then
> > I did a bunch of other changes.
> > Now I want to undelete foo.txt.
> >
> > One way is to checkout a separate branch B where the file is present.
> > Then checkout A. Then do
> > git checkout B -- path_to_file
>
> It doesn't change anything, but note that you don't need to checkout B
> first, to restore the file. If you know a commit SHA where the file is
> present, "git checkout SHA -- path_to_file" will pull back the file as
> it existed at that commit.
>
> >
> > While this does gets the file back, the file shows up as a new file to
> > be committed. Once I commit it, git blame doesn't show the old history
> > for the file.
> >
> > I would appreciate if anyone knows how to preserve git blame history
>
> It's not possible, as far as I'm aware. While the new file has the
> same name as the old file, to Git they are two unrelated entries that
> happen to reside at the same path. Even things like "git log --follow"
> won't consider the file to be related to its previous history.
>
> Bryan



-- 
Thanks,
-Biswa


Re: Retrieving a file in git that was deleted and committed

2018-12-06 Thread Bryan Turner
On Thu, Dec 6, 2018 at 10:49 PM biswaranjan panda
 wrote:
>
> I have the following scenario:
>
> On a branch A, I deleted a file foo.txt and committed the change. Then
> I did a bunch of other changes.
> Now I want to undelete foo.txt.
>
> One way is to checkout a separate branch B where the file is present.
> Then checkout A. Then do
> git checkout B -- path_to_file

It doesn't change anything, but note that you don't need to checkout B
first, to restore the file. If you know a commit SHA where the file is
present, "git checkout SHA -- path_to_file" will pull back the file as
it existed at that commit.

>
> While this does gets the file back, the file shows up as a new file to
> be committed. Once I commit it, git blame doesn't show the old history
> for the file.
>
> I would appreciate if anyone knows how to preserve git blame history

It's not possible, as far as I'm aware. While the new file has the
same name as the old file, to Git they are two unrelated entries that
happen to reside at the same path. Even things like "git log --follow"
won't consider the file to be related to its previous history.

Bryan


Retrieving a file in git that was deleted and committed

2018-12-06 Thread biswaranjan panda
I have the following scenario:

On a branch A, I deleted a file foo.txt and committed the change. Then
I did a bunch of other changes.
Now I want to undelete foo.txt.

One way is to checkout a separate branch B where the file is present.
Then checkout A. Then do
git checkout B -- path_to_file

While this does gets the file back, the file shows up as a new file to
be committed. Once I commit it, git blame doesn't show the old history
for the file.

I would appreciate if anyone knows how to preserve git blame history.


Issue with git-gui and font used for widgets

2018-12-06 Thread Eric Work
Hi,

I have set my UI font in the git-gui preferences, but it only affects
the menus and certain widgets.  It does not apply the font to labels
and buttons.  This creates a problem for my HiDPI display because the
fonts are quite small.  I've never programmed in TCL/TK before so I
don't know exactly what is wrong, but looking at the code I see where
it's supposed to apply the font option to the widgets in a foreach
loop.

foreach class {Button Checkbutton Entry Label
Labelframe Listbox Message
Radiobutton Spinbox Text} {
  option add *$class.font font_ui
}

But that doesn't seem to work.  As an experiment I added the -font
parameter (according to the docs) to where the branch label is created
and that worked as expected.

${NS}::label .branch.l1 \
  -text [mc "Current Branch:"] \
  -anchor w \
  -justify left \
  -font font_ui

I don't know what is the expected behavior in terms of setting fonts,
but I would assume that is not expected?  It appears to be using
TkDefaultFont instead.  The default font being small is really a tk
problem, where as setting the widget font is a git-gui problem.  I
created the following bug report against tk to get some feedback on
the small default font issue as well as documented a workaround to
allow per user default fonts.

https://core.tcl-lang.org/tk/tktview/dccd82bdc70dc25bb6709a6c14880a92104dda43

Any suggestions?

-Eric


Re: [wishlist] git submodule update --reset-hard

2018-12-06 Thread Yaroslav Halchenko
Hi Stefan,

Thanks for the dialogue!  Replies are embedded below.

On Thu, 06 Dec 2018, Stefan Beller wrote:
> ...
> > > What if the branch differs from the sha1 recorded in the superproject?

> > git reset --hard  itself is an operation which should be done with some
> > level of competence in doing "the right thing" by calling it.  You
> > can hop branches even in current (without any submodules in question)
> > repository with it and cause as much chaos as you desire.

> Right.

> git reset --hard would the branch (as well as the working tree) to the
> given sha1, which is confusing as submodules get involved.

> The Right Thing as of now is the sha1 as found in the
> superprojects gitlink. But as that can be different from any branch
> in the submodule, we'd rather detach the HEAD to make it
> deterministic.

yeap, makes total sense to be the thing do that by default ;-)

> There was a proposal to "re-attach HEAD" in the submodule, i.e.
> if the branch branch points at the same commit, we don't need
> a detached HEAD, but could go with the branch instead.

if I got the idea right, if we are talking about any branch, it
would also non-deterministic since who knows what left over branch(es)
point to that commit.  Not sure if I would have used that ;)

> > If desired though, a number of prevention mechanisms could be in place (but
> > would require option(s) to overcome) to allow submodule to be reset 
> > --hard'ed
> > only when some conditions met (e.g. only to the commit which is among parent
> > commits path of the current branch).  This way wild hops would be prevented,
> > although you might still end up in some feature branch.  But since "reset
> > --hard" itself doesn't have any safe-guards, I do not really think they 
> > should
> > be implemented here either.

> So are you looking for
> a) "stay on submodule branch (i.e. HEAD still points at $branch), and
> reset --hard" such that the submodule has a clean index and at that $branch 
> or
> b) "stay on submodule branch (i.e. HEAD still points at $branch), but $branch 
> is
>set to the gitlink from the superproject, and then a reset --hard
>will have the worktree set to it as well.

yes!

NB "gitlink" -- just now discovered the thing for me.  Thought it would be
called a  subproject  echoing what git diff/log -p shows for submodule commits.

> (a) is what the referenced submodule.repoLike option implements.

sounds like it indeed, thanks for spelling out

> I'd understand the desire for (b) as well, as it is a "real" hard reset on
> the superproject level, without detaching branches.

yeap

> > >   git reset --hard --recurse-submodules HEAD

> > it does indeed some trick(s) but not all seems to be the ones I desire:

> > 1. Seems to migrate submodule's .git directories into the top level
> > .git/modules

> Ah yes, that happens too. This will help once you want to git-rm
> a submodule and checkout states before and after.

yeap ;-) 

> > > undesirable in the sense of still having local changes (that is what
> > > the above reset with `--recurse` would fix) or changed the branch
> > > state? (i.e. is detached but was on a branch before?)

> > right -- I meant the local changes and indeed reset --recurse-submodules
> > indeed seems to recurse nicely.  Then the undesired effect remaining only
> > the detached HEAD

> For that we may want to revive discussions in
> https://public-inbox.org/git/20170501180058.8063-5-sbel...@google.com/

well, isn't that one requires a branch to be specified in .gitmodules?

> > > >   git submodule update --recursive

> > > > I would end up in the detached HEADs within submodules.

> > > > What I want is to retain current branch they are at (or may be possible
> > > > "were in"? reflog records might have that information)

> > > So something like

> > >   git submodule foreach --recursive git reset --hard

> > > ?

> > not quite  -- this would just kill all local changes within each submodule, 
> > not
> > to reset it to the desired state, which wouldn't be specified in such
> > invocation, and is only known to the repo containing it

> With this answer it sounds like you'd want (b) from above.

yeap

> > > You may be interested in
> > > https://public-inbox.org/git/20180927221603.148025-1-sbel...@google.com/
> > > which introduces a switch `submodule.repoLike [ = true]`, which
> > > when set would not detach HEAD in submodules.

> > Thanks! looks interesting -- was there more discussion/activity beyond 
> > those 5
> > posts in the thread?

> Unfortunately there was not.

pity

> > This feature might indeed come handy but if I got it right, it is somewhat
> > complimentary to just having submodule update --reset-hard .  E.g.  
> > submodules
> > might be in different branches (if I am not tracking based on branch 
> > names), so
> > I would not want a recursive checkout with -b|-B.  But we would indeed 
> > benefit
> > from such functionality, since this difficulty of managing branches of
> > submodules I 

Re: [RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load

2018-12-06 Thread Jeff King
On Fri, Dec 07, 2018 at 12:10:05AM +0100, SZEDER Gábor wrote:

> On Wed, Dec 05, 2018 at 04:56:21PM -0500, Jeff King wrote:
> > Could we just kill them all?
> > 
> > I guess it's a little tricky, because $! is going to give us the pid of
> > each subshell. We actually want to kill the test sub-process. That takes
> > a few contortions, but the output is nice (every sub-job immediately
> > says "ABORTED ...", and the final process does not exit until the whole
> > tree is done):
> 
> It's trickier than that:
> 
>   - Nowadays our test lib translates SIGINT to exit, but not TERM (or
> HUP, in case a user decides to close the terminal window), which
> means that if the test runs any daemons in the background, then
> those won't be killed when the stress test is aborted.
> 
> This is easy enough to address, and I've been meaning to do this
> in an other patch series anyway.

Yeah, trapping on more signals seems reasonable (or just propagating INT
instead of TERM). I'm more worried about this one:

>   - With the (implied) '--verbose-log' the process killed in the
> background subshell's SIGTERM handler is not the actual test
> process, because '--verbose-log' runs the test again in a piped
> subshell to save its output:

Hmm, yeah. The patch I sent earlier already propagates through the
subshell, so this is really just another layer there. I.e., would it be
reasonable when we are using --verbose-log (or --tee) for the parent
process to propagate signals down to child it spawned?

That would fix this case, and it would make things more predictable in
general (e.g., a user who manually runs kill).

It does feel like a lot of boilerplate to be propagating these signals
manually. I think the "right" Unix-y solution here is process groups:
each sub-job should get its own group, and then the top-level runner
script can kill the whole group. We may run into portability issues,
though (setsid is in util-linux, but I don't know if there's an
equivalent elsewhere; a small perl or C helper could do it, but I have
no idea what that would mean on Windows).

>   (GIT_TEST_TEE_STARTED=done ${TEST_SHELL_PATH} "$0" "$@" 2>&1;
>echo $? >"$TEST_RESULTS_BASE.exit") | tee -a 
> "$GIT_TEST_TEE_OUTPUT_FILE"
> 
> That 'kill' kills the "outer" shell process.
> And though I get "ABORTED..." immediately and I get back my
> prompt right away, the commands involved in the above construct
> are still running in the background:
> 
>   $ ./t3903-stash.sh --stress=1
>   ^CABORTED  0.0
>   $ ps a |grep t3903
>   1280 pts/17   S  0:00 /bin/sh ./t3903-stash.sh --stress=1
>   1281 pts/17   S  0:00 tee -a 
> <...>/test-results/t3903-stash.stress-0.out
>   1282 pts/17   S  0:00 /bin/sh ./t3903-stash.sh --stress=1
>   4173 pts/17   S+ 0:00 grep t3903
> 
> I see this behavior with all shells I tried.
> I haven't yet started to think it through why this happens :)

I think you get ABORTED because we are just watching for the
parent-process to exit (and reporting its status). The fact that that
the subshell (and the tee) are still running is not important. That all
makes sense to me.

> Not implying '--verbose-log' but redirecting the test script's
> output, like you did in your 'stress' script, seems to work in
> dash, ksh, and ks93, but not in Bash v4.3 or later, where, for
> whatever reason, the subshell get SIGINT before the SIGTERM would
> arrive.
> While we could easily handle SIGINT in the subshell with the same
> signal handler as SIGTERM, it bothers me that something
> fundamental is wrong here.

Yeah, I don't understand the SIGINT behavior you're seeing with bash. I
can't replicate it with bash 4.4.23 (from Debian unstable).

-Peff


Re: [RFC PATCH 3/3] test-lib: add the '--stress' option to run a test repeatedly under load

2018-12-06 Thread Jeff King
On Thu, Dec 06, 2018 at 11:56:01PM +0100, SZEDER Gábor wrote:

> > +test_expect_success 'roll those dice' '
> > +   case "$(openssl rand -base64 1)" in
> > +   z*)
> > +   return 1
> > +   esac
> > +'
> 
> Wasteful :)
> 
>   test $(($$ % 42)) -ne 0

Oh, indeed, that is much more clever. :)

-Peff


Re: [PATCHv2 0/9] Resending sb/submodule-recursive-fetch-gets-the-tip

2018-12-06 Thread Josh Steadmon
On 2018.11.28 16:27, Stefan Beller wrote:
> This is a resend of sb/submodule-recursive-fetch-gets-the-tip,
> with all feedback addressed. As it took some time, I'll send it
> without range-diff, but would ask for full review.
> 
> I plan on resending after the next release as this got delayed quite a bit,
> which is why I also rebased it to master.
> 
> Thanks,
> Stefan

I am not very familiar with most of the submodule code, but for what
it's worth, this entire series looks good to me. I'll note that most of
the commits caused some style complaints, but I'll leave it up to your
judgement as to whether they're valid or not.

Reviewed-by: Josh Steadmon 


Re: [WIP RFC 4/5] upload-pack: refactor writing of "packfile" line

2018-12-06 Thread Junio C Hamano
Jonathan Tan  writes:

>> Jonathan Tan  writes:
>> 
>> > @@ -126,6 +129,12 @@ static int read_pack_objects_stdout(int outfd, struct 
>> > output_state *os)
>> >}
>> >os->used += readsz;
>> >  
>> > +  if (!os->packfile_started) {
>> > +  os->packfile_started = 1;
>> > +  if (use_protocol_v2)
>> > +  packet_write_fmt(1, "packfile\n");
>> 
>> If we fix this function so that the only byte in the buffer is held
>> back without emitted when os->used == 1 as I alluded to, this may
>> have to be done a bit later, as with such a change, it is no longer
>> guaranteed that send_client_data() will be called after this point.
>
> I'm not sure what you mean about there being no guarantee that
> send_client_data() is not called - in create_pack_file(), there is an
> "if (output_state.used > 0)" line (previously "if (0 <= buffered)") that
> outputs anything remaining.

I was referring to this part of the review on the previous step,
which you may not yet have read.

OK, this corresponds to the "*cp++ = buffered"  in the original just
before xread().

> + os->used = 1;
> + } else {
> + send_client_data(1, os->buffer, os->used);
> + os->used = 0;

I am not sure if the code is correct when os->used happens to be 1
(shouldn't we hold the byte, skip the call to send_client_data(),
and go back to poll() to expect more data?), but this is a faithful
code movement and rewrite of the original.

The point of this logic is to make sure we always hold back some
bytes and do not feed *all* the bytes to the other side by calling
"send-client-data" until we made sure the upstream of what we are
relaying (pack-objects?) successfully exited, but it looks to me
that the "else" clause above ends up flushing everything when
os->used is 1, which goes against the whole purpose of the code.

And the "fix" I was alluding to was to update that "else" clause to
make it a no-op that keeps os->used non-zero, which would not call
send-client-data.

When that fix happens, the part that early in the function this
patch added "now we know we will call send-client-data, so let's say
'here comes packdata' unless we have already said that" is making
the decision too early.  Depending on the value of os->used when we
enter the code and the number of bytes xread() reads from the
upstream, we might not call send-client-data yet (namely, when we
have no buffered data and we happened to get only one byte).

> ... it might be
> better if the server can send sideband throughout the whole response -
> perhaps that should be investigated first.

Yup.  It just looked quite crazy, and it is even more crazy to
buffer keepalives ;-)


Re: Any way to make git-log to enumerate commits?

2018-12-06 Thread Junio C Hamano
Konstantin Khomoutov  writes:

>> I do not see why the "name each rev relative to HEAD" formatting
>> option cannot produce HEAD^2~2 etc.
>>  ...
> My reading was that the OP explicitly wanted to just glance at a single
> integer number and use it right away in a subsequent rebase command.
>
> I mean, use one's own short memory instead of copying and pasting.

As everybody pointed out, a single integer number will fundamentally
unworkable with distributed nature of Git that inherently makes the
history with forks and merges.  Besides, there is no way to feed
"git log" with "a single integer number", without which "making
git-log to enumerate" would not be all that useful ("git show
HEAD~4" works but "git show 4" does not).

All of these name-rev based suggestions were about using anchoring
point with memorable name plus a short path to reach from there,
which I think is the closest thing to "a single integer number" and
still is usable for the exact purpose.  "HEAD~48^2" on an
integration branch would be "the tip of the side branch that was
merged some 48 commits ago", for example.


Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()

2018-12-06 Thread Jonathan Tan
Also CC-ing Stolee since I mention multi-pack indices at the end.

> This seems like a reasonable thing to do, but I have sort of a
> meta-comment. In several places we've started doing this kind of "if
> it's this type of object, do X, otherwise do Y" optimization (e.g.,
> handling large blobs for streaming).
> 
> And in the many cases we end up doubling the effort to do object
> lookups: here we do one lookup to get the type, and then if it's not a
> commit (or if we don't have a commit graph) we end up parsing it anyway.
> 
> I wonder if we could do better. In this instance, it might make sense
> to first see if we actually have a commit graph available (it might not
> have this object, of course, but at least we'd expect it to have most
> commits).

This makes sense - I thought I shouldn't mention the commit graph in the
code since it seems like a layering violation, but I felt the need to
mention commit graph in a comment, so maybe the need to mention commit
graph in the code is there too. Subsequently, maybe the lookup-for-type
could be replaced by a lookup-in-commit-graph (maybe by using
parse_commit_in_graph() directly), which should be at least slightly
faster.

> In general, it would be nice if we had a more incremental API
> for accessing objects: open, get metadata, then read the data. That
> would make these kinds of optimizations "free".

Would this be assuming that to read the data, you would (1) first need to
read the metadata, and (2) there would be no redundancy in reading the
two? It seems to me that for loose objects, you would want to perform
all your reads at once, since any read requires opening the file, and
for commit graphs, you just want to read what you want, since the
metadata and the data are in separate places.

> I don't have numbers for how much the extra lookups cost. The lookups
> are probably dwarfed by parse_object() in general, so even if we save
> only a few full object loads, it may be a win. It just seems a shame
> that we may be making the "slow" paths (when our type-specific check
> doesn't match) even slower.

I agree. I think it will always remain a tradeoff when we have multiple
data sources of objects (loose, packed, commit graph - and we can't
unify them all, since they each have their uses). Unless the multi-pack
index can reference commit graphs as well...then it could be our first
point of reference without introducing any inefficiencies...


Re: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()

2018-12-06 Thread Jonathan Tan
> > This is on sb/more-repo-in-api because I'm using the repo_parse_commit()
> > function.
> 
> This is a mere nicety, not strictly required.
> Before we had parse_commit(struct commit *) which would accomplish the
> same, (and we'd still have that afterwards as a #define falling back onto
> the_repository). As the function get_reference() is not the_repository safe
> as it contains a call to is_promisor_object() that is repository
> agnostic, I think
> it would be fair game to not depend on that series. I am not
> complaining, though.

Good point - I'll base the next version on master (and add a TODO
explaining which functions are not yet converted).

> AFAICT oid_object_info doesn't take advantage of the commit graph,
> but just looks up the object header, which is still less than completely
> parsing it. Then lookup_commit is overly strict, as it may return
> NULL as when there still is a type mismatch (I don't think a mismatch
> could happen here, as both rely on just the object store, and not the
> commit graph.), so this would be just defensive programming for
> the sake of it. I dunno.
> 
> struct commit *c;
> 
> if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT &&
> (c = lookup_commit(revs->repo, oid)) &&
> !repo_parse_commit(revs->repo, c))
> object = (struct object *) c;
> else
> object = parse_object(revs->repo, oid);

I like this way better - I'll do it in the next version.


  1   2   3   4   5   6   7   8   9   10   >