Re: [PATCH v3] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Jeff King
On Mon, Sep 19, 2016 at 09:23:25PM -0700, Josh Triplett wrote:

> This provides a shorter and more convenient alias for
> --subject-prefix='RFC PATCH'.
> 
> Includes documentation in the format-patch manpage, and a new test
> covering --rfc.
> 
> Signed-off-by: Josh Triplett 
> ---
> v3:
> - Fix an error message referring to --subject-prefix
> - Expand the acronym "RFC"

This looks fine to me. Thanks.

-Peff


Re: [PATCH v2 2/2] mailinfo: unescape quoted-pair in header fields

2016-09-19 Thread Jeff King
On Mon, Sep 19, 2016 at 08:54:40PM +0200, Kevin Daudt wrote:

> diff --git a/t/t5100/comment.expect b/t/t5100/comment.expect
> new file mode 100644
> index 000..1197e76
> --- /dev/null
> +++ b/t/t5100/comment.expect
> @@ -0,0 +1,5 @@
> +Author: A U Thor (this is a comment (really))

Hmm. I don't see any recursion in your parsing, so after the first ")"
our escape_context would be 0 again, right? So a more tricky test is:

  Author: A U Thor (this is a comment (really) with \(quoted\) pairs)

We are still inside "ctext" when we hit those quoted pairs, and they
should be unquoted, but your code would not do so (unless we go the
route of simply unquoting pairs everywhere).

I think your parser would have to follow the BNF more closely with a
recursive descent parser, like:

  const char *parse_comment(const char *in, struct strbuf *out)
  {
size_t orig_out = out->len;

if ((in = parse_char('(', in, out))) &&
(in = parse_ccontent(in, out)) &&
(in = parse_char(')', in, out
return in;

strbuf_setlen(out, orig_out);
return NULL;
  }

  const char *parse_ccontent(const char *in, struct strbuf *out)
  {
while (*in && *in != ')') {
const char *next;

if ((next = parse_quoted_pair(in, out)) ||
(next = parse_comment(in, out)) ||
(next = parse_ctext(in, out))) {
in = next;
continue;
}
}

/*
 * if "in" is NUL here we have an unclosed comment; but we'll
 * just silently ignore and accept it
 */
return in;
  }

  const char *parse_char(char c, const char *in, struct strbuf *out)
  {
if (*in != c)
return NULL;
strbuf_addch(out, c);
return in + 1;
  }

You can probably guess at the implementation of parse_quoted_pair(),
parse_ctext(), etc (and naturally, the above is completely untested and
probably has some bugs in it).

In a former life (back when it was still rfc822!) I remember
implementing a similar parser, which I think was in turn based on the
cclient code in pine. It's not _too_ hard to get it all right based on
the BNF in the RFC, but as you can see it's a bit tedious. And I'm not
convinced we actually need it to be completely right for our purposes.
We really are looking for a single address, with the email in "<>" and
the name as everything before that, but de-quoted.

-Peff


[PATCH v3] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Josh Triplett
This provides a shorter and more convenient alias for
--subject-prefix='RFC PATCH'.

Includes documentation in the format-patch manpage, and a new test
covering --rfc.

Signed-off-by: Josh Triplett 
---
v3:
- Fix an error message referring to --subject-prefix
- Expand the acronym "RFC"
v2:
- Add documentation to the format-patch manpage
- Call subject_prefix_callback rather than reimplementing it
- Update test to move expectations inside

 Documentation/git-format-patch.txt |  8 +++-
 builtin/log.c  | 10 +-
 t/t4014-format-patch.sh|  9 +
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-format-patch.txt 
b/Documentation/git-format-patch.txt
index 9624c84..9b200b3 100644
--- a/Documentation/git-format-patch.txt
+++ b/Documentation/git-format-patch.txt
@@ -19,7 +19,8 @@ SYNOPSIS
   [--start-number ] [--numbered-files]
   [--in-reply-to=Message-Id] [--suffix=.]
   [--ignore-if-in-upstream]
-  [--subject-prefix=Subject-Prefix] [(--reroll-count|-v) ]
+  [--rfc] [--subject-prefix=Subject-Prefix]
+  [(--reroll-count|-v) ]
   [--to=] [--cc=]
   [--[no-]cover-letter] [--quiet] [--notes[=]]
   []
@@ -172,6 +173,11 @@ will want to ensure that threading is disabled for `git 
send-email`.
allows for useful naming of a patch series, and can be
combined with the `--numbered` option.
 
+--rfc::
+   Alias for `--subject-prefix="RFC PATCH"`. RFC means "Request For
+   Comments"; use this when sending an experimental patch for
+   discussion rather than application.
+
 -v ::
 --reroll-count=::
Mark the series as the -th iteration of the topic. The
diff --git a/builtin/log.c b/builtin/log.c
index 92dc34d..c657900 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1112,6 +1112,11 @@ static int subject_prefix_callback(const struct option 
*opt, const char *arg,
return 0;
 }
 
+static int rfc_callback(const struct option *opt, const char *arg, int unset)
+{
+   return subject_prefix_callback(opt, "RFC PATCH", unset);
+}
+
 static int numbered_cmdline_opt = 0;
 
 static int numbered_callback(const struct option *opt, const char *arg,
@@ -1419,6 +1424,9 @@ int cmd_format_patch(int argc, const char **argv, const 
char *prefix)
N_("start numbering patches at  instead of 1")),
OPT_INTEGER('v', "reroll-count", _count,
N_("mark the series as Nth re-roll")),
+   { OPTION_CALLBACK, 0, "rfc", , NULL,
+   N_("Use [RFC PATCH] instead of [PATCH]"),
+   PARSE_OPT_NOARG | PARSE_OPT_NONEG, rfc_callback },
{ OPTION_CALLBACK, 0, "subject-prefix", , N_("prefix"),
N_("Use [] instead of [PATCH]"),
PARSE_OPT_NONEG, subject_prefix_callback },
@@ -1557,7 +1565,7 @@ int cmd_format_patch(int argc, const char **argv, const 
char *prefix)
if (numbered && keep_subject)
die (_("-n and -k are mutually exclusive."));
if (keep_subject && subject_prefix)
-   die (_("--subject-prefix and -k are mutually exclusive."));
+   die (_("--subject-prefix/--rfc and -k are mutually 
exclusive."));
rev.preserve_subject = keep_subject;
 
argc = setup_revisions(argc, argv, , _r_opt);
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index b0579dd..ed4d3c2 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -1073,6 +1073,15 @@ test_expect_success 'empty subject prefix does not have 
extra space' '
test_cmp expect actual
 '
 
+test_expect_success '--rfc' '
+   cat >expect <<-\EOF &&
+   Subject: [RFC PATCH 1/1] header with . in it
+   EOF
+   git format-patch -n -1 --stdout --rfc >patch &&
+   grep ^Subject: patch >actual &&
+   test_cmp expect actual
+'
+
 test_expect_success '--from=ident notices bogus ident' '
test_must_fail git format-patch -1 --stdout --from=foo >patch
 '

base-commit: 6ebdac1bab966b720d776aa43ca188fe378b1f4b
-- 
git-series 0.8.10


Re: [PATCH v2 1/2] t5100-mailinfo: replace common path prefix with variable

2016-09-19 Thread Jeff King
On Mon, Sep 19, 2016 at 02:16:23PM -0700, Junio C Hamano wrote:

> Kevin Daudt  writes:
> 
> > Many tests need to store data in a file, and repeat the same pattern to
> > refer to that path:
> >
> > "$TEST_DATA"/t5100/
> 
> That obviously is a typo of
> 
>   "$TEST_DIRECTORY/t5100"
> 
> It is a good change, even though I would have chosen a name
> that is a bit more descriptive than "$DATA".

The name "$DATA" was my suggestion. I was shooting for something short
since this is used a lot and is really a script-local variable (I'd have
kept it lowercase to indicate that, but maybe that is just me).
Something like "$root" would also work. I dunno.

> > -   test_cmp "$TEST_DIRECTORY"/t5100/msg$mo msg$mo &&
> > -   test_cmp "$TEST_DIRECTORY"/t5100/patch$mo patch$mo &&
> > -   test_cmp "$TEST_DIRECTORY"/t5100/info$mo info$mo
> > +   test_cmp "$DATA"/msg$mo msg$mo &&
> > +   test_cmp "$DATA"/patch$mo patch$mo &&
> > +   test_cmp "$DATA"/info$mo info$mo
> 
> make me wonder why we don't quote the whole thing, i.e.
> 
>   test_cmp "$TEST_DATA/info$mo" "info$mo"
> 
> as leaving $mo part unquoted forces reader to wonder if it is our
> deliberate attempt to allow shell $IFS in $mo and have the argument
> split when that happens, which can be avoided if we quoted more
> explicitly.
> 
> Perhaps we'd leave that as a low-hanging fruit for future people.

Yeah, I agree that quoting the whole thing makes it more obvious (though
I guess quoting the second info$mo does add two characters).

-Peff


Re: [PATCH] mailinfo: unescape quoted-pair in header fields

2016-09-19 Thread Jeff King
On Mon, Sep 19, 2016 at 12:51:33PM +0200, Kevin Daudt wrote:

> > I didn't look in the RFC. Is:
> > 
> >   From: my \"name\" 
> > 
> > really the same as:
> > 
> >   From: "my \\\"name\\\"" 
> > 
> > ? That seems weird, but I think it may be that the former is simply
> > bogus (you are not supposed to use backslashes outside of the quoted
> > section at all).
> 
> Correct, the quoted-pair (escape sequence) can only occur in a quoted
> string or a comment. Even more so, the display name *needs* to be quoted
> when consisting of more then one word according to the RFC.

Hmm. So, I guess a follow-up question is: what would it be OK to do if
we see a quoted-pair outside of quotes? If the top one above violates
the RFC, it seems like stripping the backslashes would be a reasonable
outcome.

So if that's the case, do we actually need to care if we see any
parenthesized comments? I think we should just leave comments in place
either way, so syntactically they are only interesting insofar as we
replace quoted pairs or not.

IOW, I wonder if:

  while ((c = *in++)) {
switch (c) {
case '\\':
if (!*in)
return 0; /* ignore trailing backslash */
/* quoted pair */
strbuf_addch(out, *in++);
break;
case '"':
/*
 * This may be starting or ending a quoted section,
 * but we do not care whether we are in such a section.
 * We _do_ need to remove the quotes, though, as they
 * are syntactic.
 */
break;
default:
/*
 * Anything else is a normal character we keep. These
 * _might_ be violating the RFC if they are magic
 * characters outside of a quoted section, but we'd
 * rather be liberal and pass them through.
 */
strbuf_addch(out, c);
break;
}
  }

would work. I certainly do not mind following the RFC more closely, but
AFAICT the very simple code above gives a pretty forgiving outcome.

> > This is obviously getting pretty silly, but if we are going to follow
> > the RFC, I think you actually have to do a recursive parse, and keep
> > track of an arbitrary depth of context.
> > 
> > I dunno. This method probably covers most cases in practice, and it's
> > easy to reason about.
> 
> The problem is, how do you differentiate between nested comments, and
> escaped braces within a comment after one run?

I'm not sure what you mean. Escaped characters are always handled first
in your loop. Can you give an example (although if you agree with what I
wrote above, it may not be worth discussing further)?

-Peff


Re: Bug: pager. doesn't work well with editors

2016-09-19 Thread Jeff King
On Mon, Sep 19, 2016 at 09:03:05AM -0700, Junio C Hamano wrote:

> Anatoly Borodin  writes:
> 
> >> I think, the pagination should be turned off when the editor is being
> >> called.
> 
> This is a fun one.  IIRC, we decide to spawn a pager and run our
> output via pipe thru it fairly early, even before we figure out
> which subcommand is being run (especially if you do "git -p
> subcommand"), which by definition is way before we know if that
> subcommand wants to let the user edit things with the editor.

Right. Once the pager is spawned, it is too late to change things (even
if we spawned the editor directed to /dev/tty, it would be racily
competing for input with the pager).

And this isn't really limited to the editor. It's more _annoying_ with
the editor, but really "pager.tag" does not make any sense to set right
now, because it is handled outside of the "tag" command entirely, and
doesn't know what mode the tag command will be running in. So it's
_also_ the wrong thing to do with "git tag foo", which doesn't run an
editor (but you don't tend to notice if you use "less -F" because the
pager just quits immediately).

So you really only want to page tag-listing (and the same for branch,
config, etc). A long time ago I had a patch to add "pager.tag.list", and
the tag command would decide whether and how to page after realizing it
was in listing mode. Looks like I did send it to the list:

  http://public-inbox.org/git/20111007144438.ga30...@sigill.intra.peff.net/

though I think there are some rough edges (like handling "git stash
list").

I also wonder if there are any commands that actually have more than one
sub-command. So another option would be to teach the main git.c a
blacklist of "do not respect pager config" commands (like tag), and then
tag itself could decide to respect pager.tag at the right moment.

I think that makes things worse for a third-party command, though; we
cannot know whether a script "git-foobar" dropped into the $PATH would
like us to respect pager.foobar or if it would prefer to decide itself
later.

-Peff


Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Jeff King
On Mon, Sep 19, 2016 at 04:55:50PM -0700, Josh Triplett wrote:

> On Mon, Sep 19, 2016 at 04:46:06PM -0700, Jacob Keller wrote:
> > On Mon, Sep 19, 2016 at 4:40 PM, Josh Triplett  
> > wrote:
> > > On Mon, Sep 19, 2016 at 04:34:35PM -0700, Jeff King wrote:
> > >> As far as your patch goes, I'd be OK with defining:
> > >>
> > >>   --rfc::
> > >>   Pretend as if `--subject-prefix='RFC PATCH'` was given.
> > >>
> > 
> > Would:
> > 
> > Shorthand for `--subject-prefix='RFC PATCH'`
> > 
> > be a better reading? I feel like using "pretend" is a bit weird here.
> 
> My patch used "Alias for"; if you prefer "Shorthand for" I'm
> indifferent. :)

Or maybe "Act as if...".

-Peff


Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Jeff King
On Mon, Sep 19, 2016 at 04:40:22PM -0700, Josh Triplett wrote:

> >   - there are a non-trivial number of patches for other projects (JGIT,
> > EGIT, StGit, etc). This is somewhat unique to git, where we discuss
> > a lot of related projects on the list. But I wonder if other
> > projects would use subsystems in a similar way (though I guess for
> > the kernel, there are separate subsystems lists, so the "to" or "cc"
> > header becomes the more interesting tag).
> 
> The kernel mostly uses "[PATCH] subsystem: ...".  Occasionally I see
> "[PATCH somegitrepo ...] ..." when it's necessary to explicitly say
> whose git repo the patch needs to go through, but that's pretty rare.

We do both. "foo: blah" is for subsystem "foo" of Git itself, but
all-caps "JGIT PATCH" is "this is not even for Git". I don't know that
the kernel really has an equivalent.

-Peff


Re: .git directory tree as tar-file

2016-09-19 Thread Anatoly Borodin
Hi Martin,

if multiple small loose files are the problem, this could be interesting
for you:

https://git-scm.com/docs/git-gc

This command is run automatically from time to time, and you can
configure it or run manually.

PS To see how bad the situation is, you can use

https://git-scm.com/docs/git-count-objects


-- 
Mit freundlichen Grüßen,
Anatoly Borodin



Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Jacob Keller
On Mon, Sep 19, 2016 at 4:55 PM, Josh Triplett  wrote:
> On Mon, Sep 19, 2016 at 04:46:06PM -0700, Jacob Keller wrote:
>> On Mon, Sep 19, 2016 at 4:40 PM, Josh Triplett  wrote:
>> > On Mon, Sep 19, 2016 at 04:34:35PM -0700, Jeff King wrote:
>> >> As far as your patch goes, I'd be OK with defining:
>> >>
>> >>   --rfc::
>> >>   Pretend as if `--subject-prefix='RFC PATCH'` was given.
>> >>
>>
>> Would:
>>
>> Shorthand for `--subject-prefix='RFC PATCH'`
>>
>> be a better reading? I feel like using "pretend" is a bit weird here.
>
> My patch used "Alias for"; if you prefer "Shorthand for" I'm
> indifferent. :)

Alias seems fine to me.

Thanks,
Jake


Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Josh Triplett
On Mon, Sep 19, 2016 at 04:46:06PM -0700, Jacob Keller wrote:
> On Mon, Sep 19, 2016 at 4:40 PM, Josh Triplett  wrote:
> > On Mon, Sep 19, 2016 at 04:34:35PM -0700, Jeff King wrote:
> >> As far as your patch goes, I'd be OK with defining:
> >>
> >>   --rfc::
> >>   Pretend as if `--subject-prefix='RFC PATCH'` was given.
> >>
> 
> Would:
> 
> Shorthand for `--subject-prefix='RFC PATCH'`
> 
> be a better reading? I feel like using "pretend" is a bit weird here.

My patch used "Alias for"; if you prefer "Shorthand for" I'm
indifferent. :)


Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Jacob Keller
On Mon, Sep 19, 2016 at 4:40 PM, Josh Triplett  wrote:
> On Mon, Sep 19, 2016 at 04:34:35PM -0700, Jeff King wrote:
>> As far as your patch goes, I'd be OK with defining:
>>
>>   --rfc::
>>   Pretend as if `--subject-prefix='RFC PATCH'` was given.
>>

Would:

Shorthand for `--subject-prefix='RFC PATCH'`

be a better reading? I feel like using "pretend" is a bit weird here.

That being said, I'm not super strongly against it, but just find that
it sounds weird for documentation.

Thanks,
Jake


Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Jacob Keller
On Mon, Sep 19, 2016 at 4:34 PM, Jeff King  wrote:
>   - there are a non-trivial number of patches for other projects (JGIT,
> EGIT, StGit, etc). This is somewhat unique to git, where we discuss
> a lot of related projects on the list. But I wonder if other
> projects would use subsystems in a similar way (though I guess for
> the kernel, there are separate subsystems lists, so the "to" or "cc"
> header becomes the more interesting tag).

Working in the net-next community, we often use "[net PATCH]",
"[net-next] [PATCH]", or even just replace "PATCH" with "[net-next]"
and similar (though this is served just fine by --subject-prefix, and
may be an artifact caused because -P doesn't exist).

For the netdev, there are both "net" and "net-next" trees, and so
there is some need to distinguish between these. I prefer "[net
PATCH]" style myself.

I think --rfc is common enough to warrant its own tag, even if we add
"-P tag" just because it encourages its use for whenever you want to
comment about a patch without necessarily wanting it immediately
applied.

I also happen to prefer "RFC PATCH" instead of "PATCH/RFC" but I think
that's mostly preference.

>
> So I dunno what all this means. I do think "--rfc" to standardize on
> _some_ form of "RFC PATCH" or "PATCH/RFC" would serve the most people,
> and would make things more consistent. But at least in Git, people would
> be served by having arbitrary prefixes, too.
>

A general way to do this would be helpful, but i don't think it avoids
the usefulness of --rfc on its own.

I know that some formats are also generated by tools such as stgit
which has its own way to generate emails and doesn't use exactly the
same format as git.

Thanks,
Jake


Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Josh Triplett
On Mon, Sep 19, 2016 at 04:34:35PM -0700, Jeff King wrote:
> On Mon, Sep 19, 2016 at 01:44:08PM -0700, Josh Triplett wrote:
> 
> > On Mon, Sep 19, 2016 at 10:49:17AM -0700, Junio C Hamano wrote:
> > > Andrew Donnellan  writes:
> > > 
> > > > Sounds good to me. Agreed that "RFC" is essentially the only prefix
> > > > other than "PATCH" that I see, at least in the kernel.
> > > 
> > > Around here I think we saw WIP too, and that makes me lean towards
> > > Peff's earlier suggestion to allow an end-user supplied string in
> > > front of PATCH, i.e. "-P RFC" => "--subject-prefix='RFC PATCH'",
> > > even though I understand that those who _ONLY_ care about RFC would
> > > prefer --rfc (5 keystrokes) over "-P RFC" (6 keystrokes).
> > 
> > I do share the concern raised elsewhere in the thread that adding new
> > format-patch short options potentially conflicts with diff/rev-list
> > short options.  If you're not worried about that, I'd be happy to add
> > (and document and test) -P.  However, I'd still advocate adding --rfc as
> > well; it's a common case, and "-P RFC" is actually rather more
> > keystrokes when you count shifting. :)
> > 
> > There might also be some value in steering people towards "RFC" (since a
> > WIP is in a way an RFC).
> 
> Good point. This may be an opportunity to be just slightly opinionated
> and nudge people towards a micro-standard.
> 
> I was curious what we have used over the years on the git list. So here
> are the top hits for:
> 
>   # start with a maildir archive
>   find git/cur -type f |
> 
>   # grab first subject line from each mail
>   xargs grep -i -m1 -h ^subject: |
> 
>   # pull out only bracketed text, ignore "re: [PATCH]"
>   perl -lne '/subject:\s*(\[.*?\])/i and print $1' |
> 
>   # normalize numbers; note that a long patch series will be
>   # over-represented, since it gets one hit per message
>   perl -pe 's/[0-9]+/X/g' |
> 
>   # and then sort by count
>   sort | uniq -c | sort -rn
> 
> The top 5 hits are:
> 
>   26252 [PATCH X/X]
>   18255 [PATCH]
>   17262 [PATCH vX X/X]
>2330 [PATCH vX]
>2297 [PATCHvX X/X]
> 
> which is not surprising (our "-v" uses "PATCH vX", which is why that's
> so much more common). After that it starts to get interesting, but let's
> do two further transformations before the sort to coalesce similar
> cases:
> 
>   # drop versioning entirely
>   perl -pe 's/\s*vX//' |
> 
>   # drop multipart subjects
>   perl -pe 's{\s*X/X}{}' |
> 
> That gives us:
> 
>   67081 [PATCH]
>1286 [PATCH/RFC]
>1169 [RFC/PATCH]
> 863 [JGIT PATCH]
> 832 [ANNOUNCE]
> 797 [RFC]
> 675 [RFC PATCH]
> 537 [StGit PATCH]
> 524 [EGIT PATCH]
> 367 [BUG]
> 169 [StGIT PATCH]
> 158 [GUILT]
> 142 [PATCH RFC]
> 131 [WIP PATCH]
> 129 [PATCH/WIP]
> 115 [TopGit PATCH]
> 115 [PATCH VX]
> ...
> 
> Some of those are obviously uninteresting (like "ANNOUNCE" or "BUG").
> But I see a few interesting patterns:
> 
>   - the "slash" form of RFC/PATCH or PATCH/RFC seems more common than a
> straight prefix (2400 versus about 800)
> 
>   - both RFC/PATCH and PATCH/RFC seems similarly popular (but "RFC
> PATCH" is much more popular than "PATCH RFC")
> 
>   - WIP is a lot less popular; it seems reasonable that it's a synonym
> for RFC and I don't mind pushing people in that direction
> 
>   - there are a non-trivial number of patches for other projects (JGIT,
> EGIT, StGit, etc). This is somewhat unique to git, where we discuss
> a lot of related projects on the list. But I wonder if other
> projects would use subsystems in a similar way (though I guess for
> the kernel, there are separate subsystems lists, so the "to" or "cc"
> header becomes the more interesting tag).

The kernel mostly uses "[PATCH] subsystem: ...".  Occasionally I see
"[PATCH somegitrepo ...] ..." when it's necessary to explicitly say
whose git repo the patch needs to go through, but that's pretty rare.

> As far as your patch goes, I'd be OK with defining:
> 
>   --rfc::
>   Pretend as if `--subject-prefix='RFC PATCH'` was given.
> 
> for now. If we later add:
> 
>   -P ::
>   Pretend as if `--subject-prefix=' PATCH'` was given.
> 
> then `--rfc` naturally becomes:
> 
>   --rfc::
>   Pretend as if `-P RFC` was given.
> 
> in a backwards-compatible way. It doesn't have to all come at once, and
> it sounds like `-P` may not be as useful for the kernel (though I'd be
> interested if somebody wanted to do a similar count; I don't have a copy
> of the kernel archive handy).

Sounds reasonable to me.  I'll send v3 of --rfc with the requested
additional documentation fix, then.


Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Jeff King
On Mon, Sep 19, 2016 at 01:44:08PM -0700, Josh Triplett wrote:

> On Mon, Sep 19, 2016 at 10:49:17AM -0700, Junio C Hamano wrote:
> > Andrew Donnellan  writes:
> > 
> > > Sounds good to me. Agreed that "RFC" is essentially the only prefix
> > > other than "PATCH" that I see, at least in the kernel.
> > 
> > Around here I think we saw WIP too, and that makes me lean towards
> > Peff's earlier suggestion to allow an end-user supplied string in
> > front of PATCH, i.e. "-P RFC" => "--subject-prefix='RFC PATCH'",
> > even though I understand that those who _ONLY_ care about RFC would
> > prefer --rfc (5 keystrokes) over "-P RFC" (6 keystrokes).
> 
> I do share the concern raised elsewhere in the thread that adding new
> format-patch short options potentially conflicts with diff/rev-list
> short options.  If you're not worried about that, I'd be happy to add
> (and document and test) -P.  However, I'd still advocate adding --rfc as
> well; it's a common case, and "-P RFC" is actually rather more
> keystrokes when you count shifting. :)
> 
> There might also be some value in steering people towards "RFC" (since a
> WIP is in a way an RFC).

Good point. This may be an opportunity to be just slightly opinionated
and nudge people towards a micro-standard.

I was curious what we have used over the years on the git list. So here
are the top hits for:

  # start with a maildir archive
  find git/cur -type f |

  # grab first subject line from each mail
  xargs grep -i -m1 -h ^subject: |

  # pull out only bracketed text, ignore "re: [PATCH]"
  perl -lne '/subject:\s*(\[.*?\])/i and print $1' |

  # normalize numbers; note that a long patch series will be
  # over-represented, since it gets one hit per message
  perl -pe 's/[0-9]+/X/g' |

  # and then sort by count
  sort | uniq -c | sort -rn

The top 5 hits are:

  26252 [PATCH X/X]
  18255 [PATCH]
  17262 [PATCH vX X/X]
   2330 [PATCH vX]
   2297 [PATCHvX X/X]

which is not surprising (our "-v" uses "PATCH vX", which is why that's
so much more common). After that it starts to get interesting, but let's
do two further transformations before the sort to coalesce similar
cases:

  # drop versioning entirely
  perl -pe 's/\s*vX//' |

  # drop multipart subjects
  perl -pe 's{\s*X/X}{}' |

That gives us:

  67081 [PATCH]
   1286 [PATCH/RFC]
   1169 [RFC/PATCH]
863 [JGIT PATCH]
832 [ANNOUNCE]
797 [RFC]
675 [RFC PATCH]
537 [StGit PATCH]
524 [EGIT PATCH]
367 [BUG]
169 [StGIT PATCH]
158 [GUILT]
142 [PATCH RFC]
131 [WIP PATCH]
129 [PATCH/WIP]
115 [TopGit PATCH]
115 [PATCH VX]
...

Some of those are obviously uninteresting (like "ANNOUNCE" or "BUG").
But I see a few interesting patterns:

  - the "slash" form of RFC/PATCH or PATCH/RFC seems more common than a
straight prefix (2400 versus about 800)

  - both RFC/PATCH and PATCH/RFC seems similarly popular (but "RFC
PATCH" is much more popular than "PATCH RFC")

  - WIP is a lot less popular; it seems reasonable that it's a synonym
for RFC and I don't mind pushing people in that direction

  - there are a non-trivial number of patches for other projects (JGIT,
EGIT, StGit, etc). This is somewhat unique to git, where we discuss
a lot of related projects on the list. But I wonder if other
projects would use subsystems in a similar way (though I guess for
the kernel, there are separate subsystems lists, so the "to" or "cc"
header becomes the more interesting tag).

So I dunno what all this means. I do think "--rfc" to standardize on
_some_ form of "RFC PATCH" or "PATCH/RFC" would serve the most people,
and would make things more consistent. But at least in Git, people would
be served by having arbitrary prefixes, too.

I don't have a big opinion on ordering or slashes. The slashes make
sense to me as "this is a patch, and it has these attribute tags" (so
we could even start saying "PATCH/maint" or something, I guess). And
"RFC" functions as such a tag. But for subsystems, putting the tag first
is probably best; I don't care about EGIT patches, so I can immediately
ignore them when it's at the beginning.

As far as your patch goes, I'd be OK with defining:

  --rfc::
Pretend as if `--subject-prefix='RFC PATCH'` was given.

for now. If we later add:

  -P ::
Pretend as if `--subject-prefix=' PATCH'` was given.

then `--rfc` naturally becomes:

  --rfc::
Pretend as if `-P RFC` was given.

in a backwards-compatible way. It doesn't have to all come at once, and
it sounds like `-P` may not be as useful for the kernel (though I'd be
interested if somebody wanted to do a similar count; I don't have a copy
of the kernel archive handy).

-Peff

PS There are some amusing ones as you get far down the list, like "DRY
   HUMOR PATCH". Clearly we need "-P" to encourage such things. :)


What's cooking in git.git (Sep 2016, #05; Mon, 19)

2016-09-19 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ah/misc-message-fixes (2016-09-08) 5 commits
  (merged to 'next' on 2016-09-12 at a113aea)
 + unpack-trees: do not capitalize "working"
 + git-merge-octopus: do not capitalize "octopus"
 + git-rebase--interactive: fix English grammar
 + cat-file: put spaces around pipes in usage string
 + am: put spaces around pipe in usage string

 Message cleanup.


* bc/object-id (2016-09-07) 20 commits
  (merged to 'next' on 2016-09-15 at 34c6459)
 + builtin/reset: convert to use struct object_id
 + builtin/commit-tree: convert to struct object_id
 + builtin/am: convert to struct object_id
 + refs: add an update_ref_oid function.
 + sha1_name: convert get_sha1_mb to struct object_id
 + builtin/update-index: convert file to struct object_id
 + notes: convert init_notes to use struct object_id
 + builtin/rm: convert to use struct object_id
 + builtin/blame: convert file to use struct object_id
 + Convert read_mmblob to take struct object_id.
 + notes-merge: convert struct notes_merge_pair to struct object_id
 + builtin/checkout: convert some static functions to struct object_id
 + streaming: make stream_blob_to_fd take struct object_id
 + builtin: convert textconv_object to use struct object_id
 + builtin/cat-file: convert some static functions to struct object_id
 + builtin/cat-file: convert struct expand_data to use struct object_id
 + builtin/log: convert some static functions to use struct object_id
 + builtin/blame: convert struct origin to use struct object_id
 + builtin/apply: convert static functions to struct object_id
 + cache: convert struct cache_entry to use struct object_id

 The "unsigned char sha1[20]" to "struct object_id" conversion
 continues.  Notable changes in this round includes that ce->sha1,
 i.e. the object name recorded in the cache_entry, turns into an
 object_id.

 It had merge conflicts with a few topics in flight (Christian's
 "apply.c split", Dscho's "cat-file --filters" and Jeff Hostetler's
 "status --porcelain-v2").  Extra sets of eyes double-checking for
 mismerges are highly appreciated.


* cc/apply-am (2016-09-07) 41 commits
  (merged to 'next' on 2016-09-12 at 854edde)
 + builtin/am: use apply API in run_apply()
 + apply: learn to use a different index file
 + apply: pass apply state to build_fake_ancestor()
 + apply: refactor `git apply` option parsing
 + apply: change error_routine when silent
 + usage: add get_error_routine() and get_warn_routine()
 + usage: add set_warn_routine()
 + apply: don't print on stdout in verbosity_silent mode
 + apply: make it possible to silently apply
 + apply: use error_errno() where possible
 + apply: make some parsing functions static again
 + apply: move libified code from builtin/apply.c to apply.{c,h}
 + apply: rename and move opt constants to apply.h
 + builtin/apply: rename option parsing functions
 + builtin/apply: make create_one_file() return -1 on error
 + builtin/apply: make try_create_file() return -1 on error
 + builtin/apply: make write_out_results() return -1 on error
 + builtin/apply: make write_out_one_result() return -1 on error
 + builtin/apply: make create_file() return -1 on error
 + builtin/apply: make add_index_file() return -1 on error
 + builtin/apply: make add_conflicted_stages_file() return -1 on error
 + builtin/apply: make remove_file() return -1 on error
 + builtin/apply: make build_fake_ancestor() return -1 on error
 + builtin/apply: change die_on_unsafe_path() to check_unsafe_path()
 + builtin/apply: make gitdiff_*() return -1 on error
 + builtin/apply: make gitdiff_*() return 1 at end of header
 + builtin/apply: make parse_traditional_patch() return -1 on error
 + builtin/apply: make apply_all_patches() return 128 or 1 on error
 + builtin/apply: move check_apply_state() to apply.c
 + builtin/apply: make check_apply_state() return -1 instead of die()ing
 + apply: make init_apply_state() return -1 instead of exit()ing
 + builtin/apply: move init_apply_state() to apply.c
 + builtin/apply: make parse_ignorewhitespace_option() return -1 instead of 
die()ing
 + builtin/apply: make parse_whitespace_option() return -1 instead of die()ing
 + builtin/apply: make parse_single_patch() return -1 on error
 + builtin/apply: make parse_chunk() return a negative integer on error
 + builtin/apply: make find_header() return -128 instead of die()ing
 + builtin/apply: read_patch_file() return -1 instead of die()ing
 + builtin/apply: make apply_patch() return -1 or -128 instead of die()ing
 + apply: move 'struct apply_state' to apply.h
 + 

Re: [PATCH v2] ls-files: add pathspec matching for submodules

2016-09-19 Thread Junio C Hamano
Brandon Williams  writes:

> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index a623ebf..09e4449 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -19,7 +19,7 @@ SYNOPSIS
> - [--output-path-prefix=]
> + [--submodule-prefix=]
> ...
> ---output-path-prefix=::
> +--submodule-prefix=::
> ...
> -static const char *output_path_prefix;
> +static const char *submodule_prefix;
> ...
> - if (output_path_prefix && *output_path_prefix) {
> + if (submodule_prefix && *submodule_prefix) {
>   strbuf_reset(_name);
> - strbuf_addstr(_name, output_path_prefix);
> + strbuf_addstr(_name, submodule_prefix);
> ...
> - argv_array_pushf(, "--output-path-prefix=%s%s/",
> -  output_path_prefix ? output_path_prefix : "",
> + argv_array_pushf(, "--submodule-prefix=%s%s/",
> +  submodule_prefix ? submodule_prefix : "",
>ce->name);

As the previous one that used a wrong (sorry) argument is not even
in 'next' yet, let's pretend that it never happened.  It is OK to
still keep it and this patch as two separate steps, i.e. a topic
with two patches in it.

That means that this patch will become 2/2 of a series, and 1/2 is
rerolled to use submodule_prefix from the get-go, without ever
introducing output_path_prefix variable, so that many of the above
lines we won't have to review in 2/2.

> + /*
> +  * Pass in the original pathspec args.  The submodule will be
> +  * responsible for prepending the 'submodule_prefix' prior to comparing
> +  * against the pathspec for matches.
> +  */

Good.

> + argv_array_push(, "--");
> + for (i = 0; i < pathspec.nr; ++i)
> + argv_array_push(, pathspec.items[i].original);
> +

Please prefer post-increment i++ over pre-increment ++i when writing
a for(;;) loop, unless there is a strong reason not to (familiarlity
in C++ is not a good reason).

> diff --git a/dir.c b/dir.c
> index 0ea235f..1afc3ff 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -63,6 +63,35 @@ int fspathncmp(const char *a, const char *b, size_t count)
>   return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count);
>  }
>  
> +/**
> + * Used to perform prefix matching against a pathspec_item for determining 
> if we
> + * should decend into a submodule.  This function can result in false 
> positives
> + * since we are only trying to match the 'string' to a prefix of the 
> 'pattern'
> + */

Perhaps s/can/is allowed to/.  Implicit but equally important is
that it is not OK to produce false negative.

> +static int prefix_fnmatch(const struct pathspec_item *item,
> +const char *pattern, const char *string,
> +int prefix)
> +{
> + if (prefix > 0) {
> + if (ps_strncmp(item, pattern, string, prefix))
> + return WM_NOMATCH;
> + pattern += prefix;
> + string += prefix;
> + }
> +
> + if (item->flags & PATHSPEC_ONESTAR) {
> + return WM_MATCH;
> + } else if (item->magic & PATHSPEC_GLOB) {
> + return wildmatch(pattern, string,
> +  WM_PATHNAME |
> +  (item->magic & PATHSPEC_ICASE ?
> +   WM_CASEFOLD : 0),
> +  NULL);

Isn't this last one overly tight?  I am wondering about a scenario
where you have a submodule at "sub/" in the superproject, and "sub/"
has a "file" at the top of its working tree.  And you do:

git ls-files --recurse-submodules ':(glob)??b/fi?e'

at the top of the superproject.  The "pattern" would be '??b/fi?e"
while string would be 'sub', and wildmatch() would not like it, but
there is no way for this caller to append anything to 'sub' before
making this call, as it hasn't looked into what paths appear in the
submodule repository (and it should not want to).  And I think we
would want it to recurse to find sub/file.  IOW, this looks like a
false negative we must avoid in this function.  As we cannot afford
to check if anything that matches 'fi?e' is in the index file of the
submodule repository, we shouldn't try to match 'fi?e' portion of
the given pathspec pattern.

Extending the scenario, if I also create a directory "sib/" in the
superproject next to "sub/" and add a "file" in it, the same
pathspec:

git ls-files [--recurse-submodules] ':(glob)??b/fi?e'

does find "sib/file" (with or without the recursion option).

Duy cc'ed because I am not quite sure why it is a good thing
that I need to add an exlicit ':(glob)' in these examples to
illustrate the breakage.  This comes from v1.8.3.2-849-gbd30c2e
("pathspec: support :(glob) syntax", 2013-07-14) and I do not
think we want to change its behaviour. I just want to be
reminded why we did this.  I am guessing that it was because of
an 

Re: [PATCH v2 2/2] mailinfo: unescape quoted-pair in header fields

2016-09-19 Thread Junio C Hamano
Junio C Hamano  writes:

> Kevin Daudt  writes:
>
>> +static void unquote_quoted_string(struct strbuf *line)
>> +{
>> +const char *in = strbuf_detach(line, NULL);
>> +int c, take_next_literally = 0;
>> +int found_error = 0;
> ...
>> +}
>> +}
>
> The additional comment makes it very clear what is going on.
>
> Is it an event unusual enough that is worth warning() about if we
> have either take_next_literally or escape_context set to non-NUL
> upon leaving the loop, I wonder?
>
> Will queue.  Thanks.

It turns out that found_error is not used anywhere and tripped the
-Werror=unused-variable check.  I've removed that line while
queuing.

Thanks.


Re: [PATCH v2 0/4] handle multiline in-body headers

2016-09-19 Thread Junio C Hamano
Jonathan Tan  writes:

> This is an iteration of the patch set after the discussion in
>  .
>
> Changes:
> o largely rewritten to follow Junio's suggested design (refactor of
>   check_header to separate out ">From" and "[PATCH]", and an
>   is_inbody_header function performing an airtight check on whether a
>   line is an in-body header)
> o simpler try_convert_to_utf8 API
> o one file of the expected output of t/t5100/*0015 is modified (instead
>   of the input)
> o t/t5100/*0018--no-inbody-headers test files added
> o example in commit message improved following Peff's suggestion

Overall it looks much nicer, but I still do not understand why we
want to do try-convert-to-utf8.  The _only_ caller of that helper
function is in 4/4 where it tries to convert the line to see if a
line is a scissors line.

It seems to me that the only thing doing so gives us is that you
could later enhance the definition of what a scissors-line looks
like by adding unicode characters [*1*], but I would strongly
suggest not doing that kind of enhancement [*2*].  With the current
definition of what a scissors-line looks like, it can be checked
before you do the UTF-8 conversion, I think, which would mean that
the helper is not strictly necessary.

So...


[Footnotes]

*1* E.g. ✂ encoded in non-UTF8 may have to be converted to UTF-8
first before being recognized as such.

*2* Encouraging people to use non-ASCII that older version of Git
did not take for structural things like "what is a scissors line"
merely for the looks is a bad trade-off; patch e-mails that use the
enhanced definition will break for those with older version of Git,
and the benefit of "prettier scissors" is not really worth it, I
would think.







Re: [PATCH v2 2/2] mailinfo: unescape quoted-pair in header fields

2016-09-19 Thread Junio C Hamano
Kevin Daudt  writes:

> +static void unquote_quoted_string(struct strbuf *line)
> +{
> + const char *in = strbuf_detach(line, NULL);
> + int c, take_next_literally = 0;
> + int found_error = 0;
> +
> + /*
> +  * Stores the character that started the escape mode so that we know 
> what
> +  * character will stop it
> +  */
> + char escape_context = 0;
> +
> + while ((c = *in++) != 0) {
> + if (take_next_literally) {
> + take_next_literally = 0;
> + } else {
> + switch (c) {
> + case '"':
> + if (!escape_context)
> + escape_context = '"';
> + else if (escape_context == '"')
> + escape_context = 0;
> + continue;
> + case '\\':
> + if (escape_context) {
> + take_next_literally = 1;
> + continue;
> + }
> + break;
> + case '(':
> + if (!escape_context)
> + escape_context = '(';
> + break;
> + case ')':
> + if (escape_context == '(')
> + escape_context = 0;
> + break;
> + }
> + }
> +
> + strbuf_addch(line, c);
> + }
> +}

The additional comment makes it very clear what is going on.

Is it an event unusual enough that is worth warning() about if we
have either take_next_literally or escape_context set to non-NUL
upon leaving the loop, I wonder?

Will queue.  Thanks.




Re: [PATCH v2 1/2] t5100-mailinfo: replace common path prefix with variable

2016-09-19 Thread Junio C Hamano
Kevin Daudt  writes:

> Many tests need to store data in a file, and repeat the same pattern to
> refer to that path:
>
> "$TEST_DATA"/t5100/

That obviously is a typo of

"$TEST_DIRECTORY/t5100"

It is a good change, even though I would have chosen a name
that is a bit more descriptive than "$DATA".

>  test_expect_success 'split sample box' \
> - 'git mailsplit -o. "$TEST_DIRECTORY"/t5100/sample.mbox >last &&
> + 'git mailsplit -o. "$DATA"/sample.mbox >last &&

You are just following the pattern, and this instance is not too
bad, but lines like these

> - test_cmp "$TEST_DIRECTORY"/t5100/msg$mo msg$mo &&
> - test_cmp "$TEST_DIRECTORY"/t5100/patch$mo patch$mo &&
> - test_cmp "$TEST_DIRECTORY"/t5100/info$mo info$mo
> + test_cmp "$DATA"/msg$mo msg$mo &&
> + test_cmp "$DATA"/patch$mo patch$mo &&
> + test_cmp "$DATA"/info$mo info$mo

make me wonder why we don't quote the whole thing, i.e.

test_cmp "$TEST_DATA/info$mo" "info$mo"

as leaving $mo part unquoted forces reader to wonder if it is our
deliberate attempt to allow shell $IFS in $mo and have the argument
split when that happens, which can be avoided if we quoted more
explicitly.

Perhaps we'd leave that as a low-hanging fruit for future people.

Thanks.


[PATCH v2 3/4] mailinfo: make is_scissors_line take plain char *

2016-09-19 Thread Jonathan Tan
The is_scissors_line takes a struct strbuf * when a char * would
suffice. Make it take char *.

Signed-off-by: Jonathan Tan 
---
 mailinfo.c | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index aadad09..1f487df 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -581,37 +581,35 @@ static inline int patchbreak(const struct strbuf *line)
return 0;
 }
 
-static int is_scissors_line(const struct strbuf *line)
+static int is_scissors_line(const char *line)
 {
-   size_t i, len = line->len;
+   const char *c;
int scissors = 0, gap = 0;
-   int first_nonblank = -1;
-   int last_nonblank = 0, visible, perforation = 0, in_perforation = 0;
-   const char *buf = line->buf;
+   const char *first_nonblank = NULL, *last_nonblank = NULL;
+   int visible, perforation = 0, in_perforation = 0;
 
-   for (i = 0; i < len; i++) {
-   if (isspace(buf[i])) {
+   for (c = line; *c; c++) {
+   if (isspace(*c)) {
if (in_perforation) {
perforation++;
gap++;
}
continue;
}
-   last_nonblank = i;
-   if (first_nonblank < 0)
-   first_nonblank = i;
-   if (buf[i] == '-') {
+   last_nonblank = c;
+   if (first_nonblank == NULL)
+   first_nonblank = c;
+   if (*c == '-') {
in_perforation = 1;
perforation++;
continue;
}
-   if (i + 1 < len &&
-   (!memcmp(buf + i, ">8", 2) || !memcmp(buf + i, "8<", 2) ||
-!memcmp(buf + i, ">%", 2) || !memcmp(buf + i, "%<", 2))) {
+   if ((!memcmp(c, ">8", 2) || !memcmp(c, "8<", 2) ||
+!memcmp(c, ">%", 2) || !memcmp(c, "%<", 2))) {
in_perforation = 1;
perforation += 2;
scissors += 2;
-   i++;
+   c++;
continue;
}
in_perforation = 0;
@@ -626,7 +624,10 @@ static int is_scissors_line(const struct strbuf *line)
 * than half of the perforation.
 */
 
-   visible = last_nonblank - first_nonblank + 1;
+   if (first_nonblank && last_nonblank)
+   visible = last_nonblank - first_nonblank + 1;
+   else
+   visible = 0;
return (scissors && 8 <= visible &&
visible < perforation * 3 &&
gap * 2 < perforation);
@@ -671,7 +672,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct 
strbuf *line)
if (convert_to_utf8(mi, line, mi->charset.buf))
return 0; /* mi->input_error already set */
 
-   if (mi->use_scissors && is_scissors_line(line)) {
+   if (mi->use_scissors && is_scissors_line(line->buf)) {
int i;
 
strbuf_setlen(>log_message, 0);
-- 
2.10.0.rc2.20.g5b18e70



[PATCH v2 0/4] handle multiline in-body headers

2016-09-19 Thread Jonathan Tan
This is an iteration of the patch set after the discussion in
 .

Changes:
o largely rewritten to follow Junio's suggested design (refactor of
  check_header to separate out ">From" and "[PATCH]", and an
  is_inbody_header function performing an airtight check on whether a
  line is an in-body header)
o simpler try_convert_to_utf8 API
o one file of the expected output of t/t5100/*0015 is modified (instead
  of the input)
o t/t5100/*0018--no-inbody-headers test files added
o example in commit message improved following Peff's suggestion

Jonathan Tan (4):
  mailinfo: separate in-body header processing
  mailinfo: refactor to support utf8 decode attempts
  mailinfo: make is_scissors_line take plain char *
  mailinfo: handle in-body header continuations

 mailinfo.c   | 164 ++-
 mailinfo.h   |   1 +
 t/t4150-am.sh|  23 +
 t/t5100-mailinfo.sh  |   2 +-
 t/t5100/info0018 |   5 ++
 t/t5100/info0018--no-inbody-headers  |   5 ++
 t/t5100/msg0015  |   2 -
 t/t5100/msg0018  |   2 +
 t/t5100/msg0018--no-inbody-headers   |   8 ++
 t/t5100/patch0018|   6 ++
 t/t5100/patch0018--no-inbody-headers |   6 ++
 t/t5100/sample.mbox  |  19 
 12 files changed, 198 insertions(+), 45 deletions(-)
 create mode 100644 t/t5100/info0018
 create mode 100644 t/t5100/info0018--no-inbody-headers
 create mode 100644 t/t5100/msg0018
 create mode 100644 t/t5100/msg0018--no-inbody-headers
 create mode 100644 t/t5100/patch0018
 create mode 100644 t/t5100/patch0018--no-inbody-headers

-- 
2.10.0.rc2.20.g5b18e70



[PATCH v2 2/4] mailinfo: refactor to support utf8 decode attempts

2016-09-19 Thread Jonathan Tan
mailinfo.c currently has a convert_to_utf8 function that overrides its
argument and prints an error message when the decoding fails. Refactor
it, creating another function that does not override anything and does
not print an error message when the decoding fails. This is to be used
by a subsequent patch.

Signed-off-by: Jonathan Tan 
---
 mailinfo.c | 42 +-
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index 0c4738a..aadad09 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -340,23 +340,47 @@ static struct strbuf *decode_b_segment(const struct 
strbuf *b_seg)
return out;
 }
 
+/*
+ * Attempts to convert line into UTF-8.
+ *
+ * This differs from convert_to_utf8 in that no inputs are overridden, and that
+ * conversion non-success is not considered an error case - mi->input_error is
+ * not set, and no error message is printed.
+ *
+ * If the conversion is unnecessary, returns 1 and stores NULL in result.
+ *
+ * If the conversion is successful, returns 1 and stores the converted string
+ * in result.
+ *
+ * If the conversion is unsuccessful, returns 0 and stores NULL in result.
+ */
+static int try_convert_to_utf8(const struct mailinfo *mi, const char *line,
+  const char *charset, char **result)
+{
+   if (!mi->metainfo_charset || !charset || !*charset ||
+   same_encoding(mi->metainfo_charset, charset)) {
+   *result = NULL;
+   return 1;
+   }
+
+   *result = reencode_string(line, mi->metainfo_charset, charset);
+   return !!*result;
+}
+
+/*
+ * Converts line into UTF-8, setting mi->input_error to -1 upon failure.
+ */
 static int convert_to_utf8(struct mailinfo *mi,
   struct strbuf *line, const char *charset)
 {
char *out;
-
-   if (!mi->metainfo_charset || !charset || !*charset)
-   return 0;
-
-   if (same_encoding(mi->metainfo_charset, charset))
-   return 0;
-   out = reencode_string(line->buf, mi->metainfo_charset, charset);
-   if (!out) {
+   if (!try_convert_to_utf8(mi, line->buf, charset, )) {
mi->input_error = -1;
return error("cannot convert from %s to %s",
 charset, mi->metainfo_charset);
}
-   strbuf_attach(line, out, strlen(out), strlen(out));
+   if (out)
+   strbuf_attach(line, out, strlen(out), strlen(out));
return 0;
 }
 
-- 
2.10.0.rc2.20.g5b18e70



[PATCH v2 1/4] mailinfo: separate in-body header processing

2016-09-19 Thread Jonathan Tan
The check_header function contains logic specific to in-body headers,
although it is invoked during both the processing of actual headers and
in-body headers. Separate out the in-body header part into its own
function.

Signed-off-by: Jonathan Tan 
---
 mailinfo.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index e19abe3..0c4738a 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -495,21 +495,6 @@ static int check_header(struct mailinfo *mi,
goto check_header_out;
}
 
-   /* for inbody stuff */
-   if (starts_with(line->buf, ">From") && isspace(line->buf[5])) {
-   ret = is_format_patch_separator(line->buf + 1, line->len - 1);
-   goto check_header_out;
-   }
-   if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
-   for (i = 0; header[i]; i++) {
-   if (!strcmp("Subject", header[i])) {
-   handle_header(_data[i], line);
-   ret = 1;
-   goto check_header_out;
-   }
-   }
-   }
-
 check_header_out:
strbuf_release();
return ret;
@@ -623,6 +608,22 @@ static int is_scissors_line(const struct strbuf *line)
gap * 2 < perforation);
 }
 
+static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
+{
+   if (starts_with(line->buf, ">From") && isspace(line->buf[5]))
+   return is_format_patch_separator(line->buf + 1, line->len - 1);
+   if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
+   int i;
+   for (i = 0; header[i]; i++)
+   if (!strcmp("Subject", header[i])) {
+   handle_header(>s_hdr_data[i], line);
+   return 1;
+   }
+   return 0;
+   }
+   return check_header(mi, line, mi->s_hdr_data, 0);
+}
+
 static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
 {
assert(!mi->filter_stage);
@@ -633,7 +634,7 @@ static int handle_commit_msg(struct mailinfo *mi, struct 
strbuf *line)
}
 
if (mi->use_inbody_headers && mi->header_stage) {
-   mi->header_stage = check_header(mi, line, mi->s_hdr_data, 0);
+   mi->header_stage = check_inbody_header(mi, line);
if (mi->header_stage)
return 0;
} else
-- 
2.10.0.rc2.20.g5b18e70



[PATCH v2 4/4] mailinfo: handle in-body header continuations

2016-09-19 Thread Jonathan Tan
Mailinfo currently handles multi-line headers, but it does not handle
multi-line in-body headers. Teach it to handle such headers, for
example, for this input:

  From: author 
  Date: Fri, 9 Jun 2006 00:44:16 -0700
  Subject: a very long
   broken line

  Subject: another very long
   broken line

interpret the in-body subject to be "another very long broken line"
instead of "another very long".

An existing test (t/t5100/msg0015) has an indented line immediately
after an in-body header - it has been modified to reflect the new
functionality.

Signed-off-by: Jonathan Tan 
---
 mailinfo.c   | 56 +++-
 mailinfo.h   |  1 +
 t/t4150-am.sh| 23 +++
 t/t5100-mailinfo.sh  |  2 +-
 t/t5100/info0018 |  5 
 t/t5100/info0018--no-inbody-headers  |  5 
 t/t5100/msg0015  |  2 --
 t/t5100/msg0018  |  2 ++
 t/t5100/msg0018--no-inbody-headers   |  8 ++
 t/t5100/patch0018|  6 
 t/t5100/patch0018--no-inbody-headers |  6 
 t/t5100/sample.mbox  | 19 
 12 files changed, 131 insertions(+), 4 deletions(-)
 create mode 100644 t/t5100/info0018
 create mode 100644 t/t5100/info0018--no-inbody-headers
 create mode 100644 t/t5100/msg0018
 create mode 100644 t/t5100/msg0018--no-inbody-headers
 create mode 100644 t/t5100/patch0018
 create mode 100644 t/t5100/patch0018--no-inbody-headers

diff --git a/mailinfo.c b/mailinfo.c
index 1f487df..aec3c52 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -524,6 +524,21 @@ static int check_header(struct mailinfo *mi,
return ret;
 }
 
+/*
+ * Returns 1 if the given line or any line beginning with the given line is an
+ * in-body header (that is, check_header will succeed when passed
+ * mi->s_hdr_data).
+ */
+static int is_inbody_header(const struct mailinfo *mi,
+   const struct strbuf *line)
+{
+   int i;
+   for (i = 0; header[i]; i++)
+   if (!mi->s_hdr_data[i] && cmp_header(line, header[i]))
+   return 1;
+   return 0;
+}
+
 static void decode_transfer_encoding(struct mailinfo *mi, struct strbuf *line)
 {
struct strbuf *ret;
@@ -633,8 +648,39 @@ static int is_scissors_line(const char *line)
gap * 2 < perforation);
 }
 
+static void flush_inbody_header_accum(struct mailinfo *mi)
+{
+   if (!mi->inbody_header_accum.len)
+   return;
+   assert(check_header(mi, >inbody_header_accum, mi->s_hdr_data, 0));
+   strbuf_reset(>inbody_header_accum);
+}
+
 static int check_inbody_header(struct mailinfo *mi, const struct strbuf *line)
 {
+   if (mi->inbody_header_accum.len &&
+   (line->buf[0] == ' ' || line->buf[0] == '\t')) {
+   char *utf8 = NULL;
+   int is_scissors = mi->use_scissors && 
+ try_convert_to_utf8(mi, line->buf,
+ mi->charset.buf, ) &&
+ is_scissors_line(utf8 ? utf8 : line->buf);
+   free(utf8);
+   if (is_scissors) {
+   /*
+* This is a scissors line; do not consider this line
+* as a header continuation line.
+*/
+   flush_inbody_header_accum(mi);
+   return 0;
+   }
+   strbuf_strip_suffix(>inbody_header_accum, "\n");
+   strbuf_addbuf(>inbody_header_accum, line);
+   return 1;
+   }
+
+   flush_inbody_header_accum(mi);
+
if (starts_with(line->buf, ">From") && isspace(line->buf[5]))
return is_format_patch_separator(line->buf + 1, line->len - 1);
if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
@@ -646,7 +692,11 @@ static int check_inbody_header(struct mailinfo *mi, const 
struct strbuf *line)
}
return 0;
}
-   return check_header(mi, line, mi->s_hdr_data, 0);
+   if (is_inbody_header(mi, line)) {
+   strbuf_addbuf(>inbody_header_accum, line);
+   return 1;
+   }
+   return 0;
 }
 
 static int handle_commit_msg(struct mailinfo *mi, struct strbuf *line)
@@ -912,6 +962,8 @@ static void handle_body(struct mailinfo *mi, struct strbuf 
*line)
break;
} while (!strbuf_getwholeline(line, mi->input, '\n'));
 
+   flush_inbody_header_accum(mi);
+
 handle_body_out:
strbuf_release();
 }
@@ -1027,6 +1079,7 @@ void setup_mailinfo(struct mailinfo *mi)
strbuf_init(>email, 0);
strbuf_init(>charset, 0);
strbuf_init(>log_message, 0);
+   strbuf_init(>inbody_header_accum, 0);
mi->header_stage = 1;

Re: [PATCH 1/6] i18n: commit: mark message for translation

2016-09-19 Thread Junio C Hamano
Jean-Noël AVILA  writes:

> Maybe 3/6 would have benefited from some code factorization. But that's ok 
> for 
> a few more sentences.

Ah, let me see I understood what you mean.  Adding something like
this

static NORETURN void die_want_option(const char *option_name)
{
die(_("option '%s' requires a value"), option_name);
}

and have many of the callers of die() call it would reduce the
number of strings that translators would need to touch.

I agree, but I think we can do that in a later follow-up.

Thanks.


Re: clarification of `rev-list --no-walk ^`?

2016-09-19 Thread Junio C Hamano
"Philip Oakley"  writes:

>> The "--no-walk" tells the rev-list machinery "I have only positives;
>> do not traverse from them AT ALL but just use these positives".
>> Only under that condition, the order of the positive ends you list
>> on the command line matters.
>
> What does "--do-walk" do ("Overrides a previous --no-walk"), and when
> would it be applied?

Wouldn't 

$ git cherry-pick --no-walk --do-walk A

end up walking the history behind A and reproducing the history
since the root commit?



Re: .git directory tree as tar-file

2016-09-19 Thread Jacob Keller
On Mon, Sep 19, 2016 at 1:31 PM, Martin Bammer  wrote:
> Hi,
>
> it would be nice to have an option to have a .git.tar instead of the .git
> directory tree. This tree quickly gets very big and copy and compare actions
> take very long. I've observed that even in small projects with only a few 
> files
> the .git tree can contain several thousand entries. Especially when projects 
> are
> saved on shares copy and compare actions are getting very slow.
> Is such a feature already planned or is there a chance to have such a feature 
> in
> the near future?
>
> Best regards,
>

It sounds like you're mis-using the git directory and attempting to
share by copying it instead of using any of the protocols designed for
sharing such as local file, ssh, or https.

I don't believe that storing the git via compression is really a good
idea, as many of the operations inside are based on files and we
already try to keep things compressed using various methods already
(pack files)

If you want to actually share a git repository as a bundle, you can
generate such a bundle. See "git help bundle" for more information.

Regards,
Jake

> Martin
>


Re: clarification of `rev-list --no-walk ^`?

2016-09-19 Thread Philip Oakley

From: "Junio C Hamano" 

"Philip Oakley"  writes:


Philip probably has a
confused notion that "rev-list A..B C..D" is somehow a union of set
A..B and C..D?


That wasn't the issue. Though it does beg the question that it's the
same as "rev-list D B ^A ^C" isn't it?


If you think it begs the question, then you haven't understood what
I meant by all of the explanation.  Let me repeat:


Apologies. We appear to be having an British/American usage 
misunderstanding. Locally, the answer to the begged (rhetorically asked) 
question is, as you say, "Yes, they are the same". It was simply confirming 
our common understanding.




"A..B C..D" is exactly a short-hand for "^A B ^C D" which is
the same as ANY permutation like "D B ^A ^C".



regards,
Philip 



Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Josh Triplett
On Mon, Sep 19, 2016 at 10:49:17AM -0700, Junio C Hamano wrote:
> Andrew Donnellan  writes:
> 
> > Sounds good to me. Agreed that "RFC" is essentially the only prefix
> > other than "PATCH" that I see, at least in the kernel.
> 
> Around here I think we saw WIP too, and that makes me lean towards
> Peff's earlier suggestion to allow an end-user supplied string in
> front of PATCH, i.e. "-P RFC" => "--subject-prefix='RFC PATCH'",
> even though I understand that those who _ONLY_ care about RFC would
> prefer --rfc (5 keystrokes) over "-P RFC" (6 keystrokes).

I do share the concern raised elsewhere in the thread that adding new
format-patch short options potentially conflicts with diff/rev-list
short options.  If you're not worried about that, I'd be happy to add
(and document and test) -P.  However, I'd still advocate adding --rfc as
well; it's a common case, and "-P RFC" is actually rather more
keystrokes when you count shifting. :)

There might also be some value in steering people towards "RFC" (since a
WIP is in a way an RFC).

> >> +--rfc::
> >> +  Alias for `--subject-prefix="RFC PATCH"`. Use this when
> >> +  sending an experimental patch for discussion rather than
> >> +  application.
> >
> > Perhaps mention the phrase "Request For Comment" for the benefit of
> > those who aren't familiar ...
> 
> Good point.

I'll add that to the documentation in v3.


Re: [PATCH 1/6] i18n: commit: mark message for translation

2016-09-19 Thread Jean-Noël AVILA
On lundi 19 septembre 2016 10:54:37 CEST Junio C Hamano wrote:
> I am responding to 1/6, as the series lacked a cover letter, but all
> of them looked good.
> 
> Thanks.

Maybe 3/6 would have benefited from some code factorization. But that's ok for 
a few more sentences.

Thanks.



.git directory tree as tar-file

2016-09-19 Thread Martin Bammer
Hi,

it would be nice to have an option to have a .git.tar instead of the .git
directory tree. This tree quickly gets very big and copy and compare actions
take very long. I've observed that even in small projects with only a few files
the .git tree can contain several thousand entries. Especially when projects are
saved on shares copy and compare actions are getting very slow.
Is such a feature already planned or is there a chance to have such a feature in
the near future?

Best regards,

Martin



RE: Switching branches not working in a cloned repo

2016-09-19 Thread Paul Williamson
-Original Message-
From: Jacob Keller [mailto:jacob.kel...@gmail.com] 

>Try a fresh clone with "git checkout --track origin/web_dev" and then a "git 
>status -v" and let us know what happens. I suspect that older clones have the 
>correct branches already setup for tracking, but the new clone >isn't doing it 
>automatically, so when you run git checkout web_dev you're just creating a new 
>local copy.

Hi Jake, this worked!

Your explanation seems correct - I could still list the remote branches, thanks 
to the command that Philip suggested, but the linkage between the two was not 
there by default for some reason. Very strange, I have never seen that happen 
before.

Anyway, this solution is sound and seems to set everything (history etc) back 
to how it should be.

Thanks very much for your help!

Paul


Re: clarification of `rev-list --no-walk ^`?

2016-09-19 Thread Philip Oakley

From: "Junio C Hamano" 

"Philip Oakley"  writes:


Philip probably has a
confused notion


Hi Junio,

Could you clarify a particular point from here..


The "--no-walk" tells the rev-list machinery "I have only positives;
do not traverse from them AT ALL but just use these positives".
Only under that condition, the order of the positive ends you list
on the command line matters.


What does "--do-walk" do ("Overrides a previous --no-walk"), and when would 
it be applied?


--
Philip 



Re: [PATCH 3/2] batch check whether submodule needs pushing into one call

2016-09-19 Thread Heiko Voigt
On Fri, Sep 16, 2016 at 11:13:09AM -0700, Junio C Hamano wrote:
> Heiko Voigt  writes:
> > The most exact solution would be to use all actual remote refs available
> > (not sure if we have them at this point in the process?) another
> > solution would be to still append the --remotes= option as a
> > fallback to reduce the revisions.
> 
> I'd say --remotes= is the least problematic thing to do.

Ok then lets drop my last patch and keep it the way it was. Because if
the remote sha1 differs we probably do not have it locally anyway. The
only case this does not catch is when the user specifies a remote URL.
But that just means we will iterate over all revisions instead of a
reduced set, which makes the check slower but still correct. As one can
see from my measurements that should not be that bad anymore.

Cheers Heiko


Re: [PATCH 2/2] serialize collection of refs that contain submodule changes

2016-09-19 Thread Junio C Hamano
Heiko Voigt  writes:

> I am not sure if I understand you correctly here. With the "ref cache layer"
> you are referring to add_submodule_odb() which is called indirectly from
> submodule_needs_pushing()? Those revs are only used to check whether the hash
> we need on the remote side exists in the local submodule. That should not
> change due to a push. The actual check whether the commit(s) exist on the
> remote side is done using a 'rev-list' in a subprocess later.

I was wondering what would happen in this scenario:

 * You have ON_DEMAND set, which causes "git -C sub push origin" to
   push out what are new, updating the remote tracking branches in
   the submodule, sub/.git/refs/remotes/origin/*.

 * Then you check again.  If you used for-each-ref-in-submodule, the
   updated refs/remotes/origin/* may not have been re-read.

But you check by spawning "rev-list ... --not --remotes" as a
separate process in submodule_needs_pushing(), and that will force
the new process to read the updated state, so it turns out that I
was overly worried without good reason ;-)

It may matter once somebody tries to internalize the external
rev-list call done via start_command() interface to an internal
call, though.  But we are not there yet.


Re: clarification of `rev-list --no-walk ^`?

2016-09-19 Thread Junio C Hamano
"Philip Oakley"  writes:

>> Philip probably has a
>> confused notion that "rev-list A..B C..D" is somehow a union of set
>> A..B and C..D?
>
> That wasn't the issue. Though it does beg the question that it's the
> same as "rev-list D B ^A ^C" isn't it?

If you think it begs the question, then you haven't understood what
I meant by all of the explanation.  Let me repeat:

"A..B C..D" is exactly a short-hand for "^A B ^C D" which is
the same as ANY permutation like "D B ^A ^C".

This is because rev-list machinery NEVER does "ah the command line
asks for two or more sets to be computed and then a union of the
results to be returned".  As I said, it collects positive starting
points (B and D) and negative starting points (A and C), and does a
single traversal "ones that are reachable from positives, excluding
the ones that are reachable from negatives".  That is all it does.

Once that is understood, it should be obvious that it would not have
any effect on the result to permute the order of positives and
negatives.

> I'd say it was the walk - no walk range confusion.

You may _additionally_ have had that confusion ;-).

> i.e. cherry-pick B D F G Q..T;  isn't B D F G R S T, is it?

Of course it isn't, but that is not about any ordering.  It is
because the machinery does not work this way:

  - we have positives B D F G and ... oh wait, we see Q..T, so we
need to traverse ONLY that bit and come up with positives in the
resulting set, which are R S T, and take a union of all of the
above; and then

  - now we process --no-walk to show only the resulting positives.

The "--no-walk" tells the rev-list machinery "I have only positives;
do not traverse from them AT ALL but just use these positives".
Only under that condition, the order of the positive ends you list
on the command line matters.

Having ^Q in the input "B D F G ^Q T" contradicts that "I have only
positives" precondtion and forces the machinery to traverse, i.e. to
find things reachable from B D F G T excluding things reachable from
Q.  The order of the output would be topological (i.e. a parent is
not shown until all its children are), with or without --date-order,
and may not match the order you gave the positive ones from the
command line.

You may want the "exclude things reachable from Q" part to not have
any effect on things related to B, D, F or G, but the machinery does
not work that way.  That is because there is only one single traversal
and no union operation involved.

It is a different matter Git born in a fictional world that takes
A..B C..D as a union of two sets is better than the current one.
There might be cases where such a variant of "a set of commits that
is a union of multiple sets" is handy to use.  But I think we are
talking about what Git in our world does, so I'd not go there in
this discussion.


Re: [PATCH 3/2] batch check whether submodule needs pushing into one call

2016-09-19 Thread Heiko Voigt
On Fri, Sep 16, 2016 at 10:59:37AM -0700, Junio C Hamano wrote:
> Heiko Voigt  writes:
> > +static void check_has_hash(const unsigned char sha1[20], void *data)
> > +{
> > +   int *has_hash = (int *) data;
> > +
> > +   if (!lookup_commit_reference(sha1))
> > +   *has_hash = 0;
> > +}
> > +
> > +static int submodule_has_hashes(const char *path, struct sha1_array 
> > *hashes)
> > +{
> > +   int has_hash = 1;
> > +
> > +   if (add_submodule_odb(path))
> > +   return 0;
> > +
> > +   sha1_array_for_each_unique(hashes, check_has_hash, _hash);
> > +   return has_hash;
> > +}
> > +
> > +static int submodule_needs_pushing(const char *path, struct sha1_array 
> > *hashes)
> > +{
> > +   if (!submodule_has_hashes(path, hashes))
> > return 0;
> 
> I think you meant well, but this optimization is wrong.  A mere
> presence of an object does not mean that the current tip can reach
> that object.  Imagine you pushed commit A earlier to them at the
> tip, then pushed commit A~ to them at the tip, which is the current
> state of the remote of the submodule, and since them they may have
> GC'ed.  They no longer have the commit A.
> 
> For that matter, because you are doing this check by pretending as
> if all the submodule objects are in the object store of the current
> superproject you are working in, and saying "it exists there in the
> submodule repository" when the only thing you know is it exists in
> an object store of either the submodule repository, the superproject
> repository, or any of the other submodule repositories, you really
> cannot tell much from a mere presence of an object.  Not just the
> remote of the submodule repository you are interested in, but the
> submodule repository you are interested in itself, may not have that
> object.
> 
> Drop the previous two helper functions and this short-cut.

This is nothing I added. This came from the original code which I simply
extended to check for all sha1's. The diff is a little bit misleading. This
would be the local diff:

-   if (add_submodule_odb(path) || !lookup_commit_reference(sha1))
+   if (!submodule_has_hashes(path, hashes))
return 0;

I think that this is more a shortcut for: If we can not find the sha1, we do
not need to bother spawning a process for the real check. We default to the
answer no in that case.

It looks like the reason for the sha1 check is to avoid error output from the
spawned 'rev-list' process. The error output might be confusing to the user in
case the sha1 does not exist in the submodule. We should probably die here
since we were not able to check a submodule that has a diff in the revisions we
would push.

After thinking about this further: I think it is actually bad to proceed here,
as the current revisions (just in the superproject) would still be pushed and
the user possibly gets an inconsistent state on the remote side which he tried
to avoid with check/on-demand-push enabled.

So in short this deserves some further love. Will have a look into adding
another patch for this if we agree on my suggestion to die above.

Cheers Heiko


Re: Switching branches not working in a cloned repo

2016-09-19 Thread Jacob Keller
On Mon, Sep 19, 2016 at 6:58 AM, Paul Williamson
 wrote:
> Hi,
>
> We use git extensively on a number of repos. Recently, we have had a problem 
> with one of them. This repo has a 'web_dev' branch. For copies of the repo 
> cloned before a certain (recent but unidentified) time, we could 'git 
> checkout' between master and web_dev and everything would be normal.
>
> However, now if we clone the repo, we can no longer do 'git checkout 
> web_dev'. Git doesn't complain, in fact there is no feedback and we are still 
> in the master branch. Running 'git branch -r' still shows the branch as 
> existing at origin.
>
> If we try 'git branch web_dev' we then see web_dev listed locally and can 
> switch to it BUT on closer inspection we realise that this action has created 
> a new branch off master.
>
> The first time we saw this was under Bash on Windows, so we thought maybe it 
> was a beta problem, but a) other repos work as expected under that 
> environment, and b) under cygwin, pulling the same repo to a new directory 
> alongside an older copy shows that the problem occurs with the new clone, but 
> not the one that that was cloned longer ago.
>
> Also in this situation, there are no local outstanding code changes that 
> might cause problems switching branches. This occurs right from a cleanly 
> cloned repo.
>
> It seems something has gone wrong with this repo, and we don't know what. 
> It's a tough problem to google, and I was not able to search the gmane 
> archives (DNS errors).
>
> Any idea how to investigate?
>

Try a fresh clone with "git checkout --track origin/web_dev" and then
a "git status -v" and let us know what happens. I suspect that older
clones have the correct branches already setup for tracking, but the
new clone isn't doing it automatically, so when you run git checkout
web_dev you're just creating a new local copy.

You can probably fix this by running

git checkout web_dev
git branch --set-upstream-to=origin/web_dev
git reset --hard origin/web_dev

The last command will completely reset your local web_dev to match
what's upstream, and the second command tells git that this branch
tracks a given remote branch.

Hope this helps.

Regards,
Jake

> Thanks,
> Paul
>
>


Re: [PATCH 2/2] serialize collection of refs that contain submodule changes

2016-09-19 Thread Heiko Voigt
Hi,

On Fri, Sep 16, 2016 at 10:47:46AM -0700, Junio C Hamano wrote:
> One thing that makes me worried is how the ref cache layer interacts
> with this.  I see you first call push_unpushed_submodules() when
> ON_DEMAND is set, which would result in pushes in submodule
> repositories, updating their remote tracking branches.  At that
> point, before you make another call to find_unpushed_submodules(),
> is our cached ref layer knows that the remote tracking branches
> are now up to date (otherwise, we would incorrectly judge that these
> submodules need pushing based on stale information)?

I am not sure if I understand you correctly here. With the "ref cache layer"
you are referring to add_submodule_odb() which is called indirectly from
submodule_needs_pushing()? Those revs are only used to check whether the hash
we need on the remote side exists in the local submodule. That should not
change due to a push. The actual check whether the commit(s) exist on the
remote side is done using a 'rev-list' in a subprocess later.


> > diff --git a/transport.c b/transport.c
> > index 94d6dc3..76e1daf 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -903,23 +903,29 @@ int transport_push(struct transport *transport,
> >  
> > if ((flags & TRANSPORT_RECURSE_SUBMODULES_ON_DEMAND) && 
> > !is_bare_repository()) {
> > struct ref *ref = remote_refs;
> > +   struct sha1_array hashes = SHA1_ARRAY_INIT;
> > +
> > for (; ref; ref = ref->next)
> > -   if (!is_null_oid(>new_oid) &&
> > -   !push_unpushed_submodules(ref->new_oid.hash,
> > -   transport->remote->name))
> > -   die ("Failed to push all needed 
> > submodules!");
> > +   if (!is_null_oid(>new_oid))
> > +   sha1_array_append(, 
> > ref->new_oid.hash);
> > +
> > +   if (!push_unpushed_submodules(, 
> > transport->remote->name))
> > +   die ("Failed to push all needed submodules!");
> 
> Do we leak the contents of hashes here?

Good catch, sorry about that. Will clear the hashes array.


> > }
> >  
> > if ((flags & (TRANSPORT_RECURSE_SUBMODULES_ON_DEMAND |
> >   TRANSPORT_RECURSE_SUBMODULES_CHECK)) && 
> > !is_bare_repository()) {
> > struct ref *ref = remote_refs;
> > struct string_list needs_pushing = STRING_LIST_INIT_DUP;
> > +   struct sha1_array hashes = SHA1_ARRAY_INIT;
> >  
> > for (; ref; ref = ref->next)
> > -   if (!is_null_oid(>new_oid) &&
> > -   find_unpushed_submodules(ref->new_oid.hash,
> > -   transport->remote->name, 
> > _pushing))
> > -   
> > die_with_unpushed_submodules(_pushing);
> > +   if (!is_null_oid(>new_oid))
> > +   sha1_array_append(, 
> > ref->new_oid.hash);
> > +
> > +   if (find_unpushed_submodules(, 
> > transport->remote->name,
> > +   _pushing))
> > +   die_with_unpushed_submodules(_pushing);
> 
> Do we leak the contents of hashes here?  I do not think we need to
> worry about needs_pushing leaking, as we will always die if it is
> not empty, but it might be a better code hygiene to clear it as
> well.

As above, will fix.

Cheers Heiko


Re: [PATCH 1/2] serialize collection of changed submodules

2016-09-19 Thread Heiko Voigt
On Fri, Sep 16, 2016 at 10:27:04AM -0700, Junio C Hamano wrote:
> Heiko Voigt  writes:
> 
> > +static struct sha1_array *get_sha1s_from_list(struct string_list 
> > *submodules,
> > +   const char *path)
> > +{
> > +   struct string_list_item *item;
> > +   struct sha1_array *hashes;
> > +
> > +   item = string_list_insert(submodules, path);
> > +   if (item->util)
> > +   return (struct sha1_array *) item->util;
> > +
> > +   hashes = (struct sha1_array *) xmalloc(sizeof(struct sha1_array));
> > +   /* NEEDSWORK: should we add an initializer function for
> > +* sha1_array ? */
> > +   memset(hashes, 0, sizeof(struct sha1_array));
> > +   item->util = hashes;
> 
> 
>   /* NEEDSWORK: should we have SHA1_ARRAY_INIT etc.? */
>   item->util = xcalloc(1, sizeof(struct sha1_array));

Ok will do.


Re: clarification of `rev-list --no-walk ^`?

2016-09-19 Thread Philip Oakley

From: "Junio C Hamano" 

Michael J Gruber  writes:


It can be read that

$ git cherry-pick maint next

would pick two single commits, while

$ git cherry-pick maint next ^master

could implicitly be read as

$ git cherry-pick maint next --do-walk ^master


You can read it as "master..next maint" that does force walking.


Clearly that's not what is intended, which is

$ git cherry-pick --do-walk maint next ^master


I do not see the distinction betwee the above two you seem to be
trying to make.  Care to explain?

but it is open to interpretation as to where in the command line the 
caret

range prefix's --do-walk (to countermand the --no-walk) should applied.


I do not think it can be position dependent.


OK. (background, long story) When I first read the man page, and in trying 
to explain a user confusion between cherry pick commits and am'ing the same 
commits via format-patch (where sometimes the patches had coverlapping 
context issues), I was trying to confirm for myself that 'git cherry-pick B 
C' and 'git cherry-pick C B' should get the same end result, and not be 
mistaken (e.g. user misunderstanding) for a range.


I first spotted the 'git help cherry-pick's line:

 - no traversal is done by default, as if the --no-walk option was 
specified, see git-rev-list(1).


So off I goes to rev-list, and see that the options no-walk/do-walk are said 
(implied) to be position dependent.


--do-walk  - Overrides a previous --no-walk.

Meanwhile the --no-walk option says:

--no-walk - This (option) has no effect if a range is specified.

So at this point I am wondering about the command line ordering and what 
comprises the range if the negative ref is given last, or at least just 
before a --do-walk.


Thus (at this point) it felt like one of those specification rabbit-holes 
that I often see at $dayjob. It was unclear as to the point at which the 
'range' was to be applied in to the command line to get the expected 
examples.



Climbing back out of the rabbit-hole. I now see that after the end of the 
cherry-pick  line I quoted, there is a secondary "Note that 
specifying a range will feed all . arguments to a single revision 
walk", which at the time did not register (a classic human error.. Three 
Mile Island et al.).


Similarly, in the rev-list --no-walk option, its says (mid-para) "This has 
no effect if a range is specified." so in some ways that confirms the cherry 
pick statement, but again a less obvious corollary.


However, there is still a small step missing, which is to confirm that using 
a negative ^ref anywhere(?) makes the whole list of refs into a range (i.e. 
it will look-back along the command line) to cancel any --no-walk options in 
place.


Given that a walk usually requires a range, I'm now having difficulty seeing 
how the --no-walk  --do-walk can be combined anyway.


The bits I felt was missing (in the docs) was to say explicitly somewhere 
that a negative ref defined that we had a range (to link back to those 
walk-no-walk statements), and the extent of rev paramaters it applied to.


And after re-reading, that some of those corollary statements about ranges 
flipping the walk-no-walk condition should be brought forward to be more 
obvious within the primary rev-list (to avoid the typical reader error).




Philip probably has a
confused notion that "rev-list A..B C..D" is somehow a union of set
A..B and C..D?


That wasn't the issue. Though it does beg the question that it's the same as 
"rev-list D B ^A ^C" isn't it?




If the user did want just the single commit at the tip of maint, and 
then

the range master..next, what would be their command line, and also, how
would the man page warn against false expectations?


Yeah, this can show us that all of the have is coming from that
exact confusion I suspected Philip has.  We need to clarify in the
documentation that rev-list set operation does *NOT* have union of
multiple sets to unconfuse the readers.


I'd say it was the walk - no walk range confusion. Inclusion of any range 
definition of any sort (in particular ^rev) causes the expectation that an 
ordered list of single revs can be included, to be broken.

i.e. cherry-pick B D F G Q..T;  isn't B D F G R S T, is it?

I've also have a quick browse of the test scripts and didn't see any tests 
that actually cover the example of `git cherry-pick maint next ^master` 
where both have multiple commits to pick, so couldn't see what the test 
would expect.


--
Philip



[PATCH v2 1/2] t5100-mailinfo: replace common path prefix with variable

2016-09-19 Thread Kevin Daudt
Many tests need to store data in a file, and repeat the same pattern to
refer to that path:

"$TEST_DATA"/t5100/

Create a variable that contains this path, and use that instead.

Signed-off-by: Kevin Daudt 
---
 t/t5100-mailinfo.sh | 56 +++--
 1 file changed, 29 insertions(+), 27 deletions(-)

diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index 1a5a546..27bf3b8 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -7,8 +7,10 @@ test_description='git mailinfo and git mailsplit test'
 
 . ./test-lib.sh
 
+DATA="$TEST_DIRECTORY/t5100"
+
 test_expect_success 'split sample box' \
-   'git mailsplit -o. "$TEST_DIRECTORY"/t5100/sample.mbox >last &&
+   'git mailsplit -o. "$DATA"/sample.mbox >last &&
last=$(cat last) &&
echo total is $last &&
test $(cat last) = 17'
@@ -17,9 +19,9 @@ check_mailinfo () {
mail=$1 opt=$2
mo="$mail$opt"
git mailinfo -u $opt msg$mo patch$mo <$mail >info$mo &&
-   test_cmp "$TEST_DIRECTORY"/t5100/msg$mo msg$mo &&
-   test_cmp "$TEST_DIRECTORY"/t5100/patch$mo patch$mo &&
-   test_cmp "$TEST_DIRECTORY"/t5100/info$mo info$mo
+   test_cmp "$DATA"/msg$mo msg$mo &&
+   test_cmp "$DATA"/patch$mo patch$mo &&
+   test_cmp "$DATA"/info$mo info$mo
 }
 
 
@@ -27,15 +29,15 @@ for mail in 00*
 do
test_expect_success "mailinfo $mail" '
check_mailinfo $mail "" &&
-   if test -f "$TEST_DIRECTORY"/t5100/msg$mail--scissors
+   if test -f "$DATA"/msg$mail--scissors
then
check_mailinfo $mail --scissors
fi &&
-   if test -f "$TEST_DIRECTORY"/t5100/msg$mail--no-inbody-headers
+   if test -f "$DATA"/msg$mail--no-inbody-headers
then
check_mailinfo $mail --no-inbody-headers
fi &&
-   if test -f "$TEST_DIRECTORY"/t5100/msg$mail--message-id
+   if test -f "$DATA"/msg$mail--message-id
then
check_mailinfo $mail --message-id
fi
@@ -45,7 +47,7 @@ done
 
 test_expect_success 'split box with rfc2047 samples' \
'mkdir rfc2047 &&
-   git mailsplit -orfc2047 "$TEST_DIRECTORY"/t5100/rfc2047-samples.mbox \
+   git mailsplit -orfc2047 "$DATA"/rfc2047-samples.mbox \
  >rfc2047/last &&
last=$(cat rfc2047/last) &&
echo total is $last &&
@@ -56,18 +58,18 @@ do
test_expect_success "mailinfo $mail" '
git mailinfo -u $mail-msg $mail-patch <$mail >$mail-info &&
echo msg &&
-   test_cmp "$TEST_DIRECTORY"/t5100/empty $mail-msg &&
+   test_cmp "$DATA"/empty $mail-msg &&
echo patch &&
-   test_cmp "$TEST_DIRECTORY"/t5100/empty $mail-patch &&
+   test_cmp "$DATA"/empty $mail-patch &&
echo info &&
-   test_cmp "$TEST_DIRECTORY"/t5100/rfc2047-info-$(basename $mail) 
$mail-info
+   test_cmp "$DATA"/rfc2047-info-$(basename $mail) $mail-info
'
 done
 
 test_expect_success 'respect NULs' '
 
-   git mailsplit -d3 -o. "$TEST_DIRECTORY"/t5100/nul-plain &&
-   test_cmp "$TEST_DIRECTORY"/t5100/nul-plain 001 &&
+   git mailsplit -d3 -o. "$DATA"/nul-plain &&
+   test_cmp "$DATA"/nul-plain 001 &&
(cat 001 | git mailinfo msg patch) &&
test_line_count = 4 patch
 
@@ -75,52 +77,52 @@ test_expect_success 'respect NULs' '
 
 test_expect_success 'Preserve NULs out of MIME encoded message' '
 
-   git mailsplit -d5 -o. "$TEST_DIRECTORY"/t5100/nul-b64.in &&
-   test_cmp "$TEST_DIRECTORY"/t5100/nul-b64.in 1 &&
+   git mailsplit -d5 -o. "$DATA"/nul-b64.in &&
+   test_cmp "$DATA"/nul-b64.in 1 &&
git mailinfo msg patch <1 &&
-   test_cmp "$TEST_DIRECTORY"/t5100/nul-b64.expect patch
+   test_cmp "$DATA"/nul-b64.expect patch
 
 '
 
 test_expect_success 'mailinfo on from header without name works' '
 
mkdir info-from &&
-   git mailsplit -oinfo-from "$TEST_DIRECTORY"/t5100/info-from.in &&
-   test_cmp "$TEST_DIRECTORY"/t5100/info-from.in info-from/0001 &&
+   git mailsplit -oinfo-from "$DATA"/info-from.in &&
+   test_cmp "$DATA"/info-from.in info-from/0001 &&
git mailinfo info-from/msg info-from/patch \
  info-from/out &&
-   test_cmp "$TEST_DIRECTORY"/t5100/info-from.expect info-from/out
+   test_cmp "$DATA"/info-from.expect info-from/out
 
 '
 
 test_expect_success 'mailinfo finds headers after embedded From line' '
mkdir embed-from &&
-   git mailsplit -oembed-from "$TEST_DIRECTORY"/t5100/embed-from.in &&
-   test_cmp "$TEST_DIRECTORY"/t5100/embed-from.in embed-from/0001 &&
+   git mailsplit -oembed-from "$DATA"/embed-from.in &&
+   test_cmp "$DATA"/embed-from.in embed-from/0001 &&
git mailinfo 

[PATCH v2 0/2] Handle escape characters in From field.

2016-09-19 Thread Kevin Daudt
Changes since v2:
- detach from input parameter to reuse it as an output buffer
- don't return error when encountering another open bracket in a comment
- test escaping in comments

Kevin Daudt (2):
  t5100-mailinfo: replace common path prefix with variable
  mailinfo: unescape quoted-pair in header fields

 mailinfo.c   | 46 +
 t/t5100-mailinfo.sh  | 70 +++-
 t/t5100/comment.expect   |  5 
 t/t5100/comment.in   |  9 ++
 t/t5100/quoted-string.expect |  5 
 t/t5100/quoted-string.in |  9 ++
 6 files changed, 117 insertions(+), 27 deletions(-)
 create mode 100644 t/t5100/comment.expect
 create mode 100644 t/t5100/comment.in
 create mode 100644 t/t5100/quoted-string.expect
 create mode 100644 t/t5100/quoted-string.in

-- 
2.10.0.86.g6ffa4f1.dirty



[PATCH v2] ls-files: add pathspec matching for submodules

2016-09-19 Thread Brandon Williams
Pathspecs can be a bit tricky when trying to apply them to submodules.
The main challenge is that the pathspecs will be with respect to the
super module and not with respect to paths in the submodule.  The
approach this patch takes is to pass in the identical pathspec from the
super module to the submodule in addition to the submodule-prefix, which
is the path from the root of the super module to the submodule, and then
we can compare an entry in the submodule prepended with the
submodule-prefix to the pathspec in order to determine if there is a
match.

This patch also permits the pathspec logic to perform a prefix match against
submodules since a pathspec could refer to a file inside of a submodule.
This can can result in false positive matches since a super module
doesn't know what files could be in the submodule.

Signed-off-by: Brandon Williams 
---
 Documentation/git-ls-files.txt |   4 +-
 builtin/ls-files.c | 144 -
 dir.c  |  67 ++-
 dir.h  |   4 +
 t/t3007-ls-files-recurse-submodules.sh |  66 +--
 5 files changed, 215 insertions(+), 70 deletions(-)

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index a623ebf..09e4449 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -19,7 +19,7 @@ SYNOPSIS
[--exclude-standard]
[--error-unmatch] [--with-tree=]
[--full-name] [--recurse-submodules]
-   [--output-path-prefix=]
+   [--submodule-prefix=]
[--abbrev] [--] [...]
 
 DESCRIPTION
@@ -143,7 +143,7 @@ a space) at the start of each line:
Recursively calls ls-files on each submodule in the repository.
Currently there is only support for the --cached mode.
 
---output-path-prefix=::
+--submodule-prefix=::
Prepend the provided path to the output of each file
 
 --abbrev[=]::
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 687e475..1417f44 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -29,7 +29,7 @@ static int show_valid_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
-static const char *output_path_prefix;
+static const char *submodule_prefix;
 static int recurse_submodules;
 
 static const char *prefix;
@@ -78,9 +78,9 @@ static void write_name(const char *name)
 * churn.
 */
static struct strbuf full_name = STRBUF_INIT;
-   if (output_path_prefix && *output_path_prefix) {
+   if (submodule_prefix && *submodule_prefix) {
strbuf_reset(_name);
-   strbuf_addstr(_name, output_path_prefix);
+   strbuf_addstr(_name, submodule_prefix);
strbuf_addstr(_name, name);
name = full_name.buf;
}
@@ -177,12 +177,34 @@ static void show_gitlink(const struct cache_entry *ce)
 {
struct child_process cp = CHILD_PROCESS_INIT;
int status;
+   int i;
 
argv_array_push(, "ls-files");
argv_array_push(, "--recurse-submodules");
-   argv_array_pushf(, "--output-path-prefix=%s%s/",
-output_path_prefix ? output_path_prefix : "",
+   argv_array_pushf(, "--submodule-prefix=%s%s/",
+submodule_prefix ? submodule_prefix : "",
 ce->name);
+   /* add options */
+   if (show_eol)
+   argv_array_push(, "--eol");
+   if (show_valid_bit)
+   argv_array_push(, "-v");
+   if (show_stage)
+   argv_array_push(, "--stage");
+   if (show_cached)
+   argv_array_push(, "--cached");
+   if (debug_mode)
+   argv_array_push(, "--debug");
+
+   /*
+* Pass in the original pathspec args.  The submodule will be
+* responsible for prepending the 'submodule_prefix' prior to comparing
+* against the pathspec for matches.
+*/
+   argv_array_push(, "--");
+   for (i = 0; i < pathspec.nr; ++i)
+   argv_array_push(, pathspec.items[i].original);
+
cp.git_cmd = 1;
cp.dir = ce->name;
status = run_command();
@@ -192,57 +214,62 @@ static void show_gitlink(const struct cache_entry *ce)
 
 static void show_ce_entry(const char *tag, const struct cache_entry *ce)
 {
+   struct strbuf name = STRBUF_INIT;
int len = max_prefix_len;
+   if (submodule_prefix)
+   strbuf_addstr(, submodule_prefix);
+   strbuf_addstr(, ce->name);
 
if (len >= ce_namelen(ce))
die("git ls-files: internal error - cache entry not superset of 
prefix");
 
-   if (!match_pathspec(, ce->name, ce_namelen(ce),
-   len, ps_matched,
-   S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode)))
-   return;
-   if 

[PATCH v2 2/2] mailinfo: unescape quoted-pair in header fields

2016-09-19 Thread Kevin Daudt
rfc2822 has provisions for quoted strings in structured header fields,
but also allows for escaping these with so-called quoted-pairs.

The only thing git currently does is removing exterior quotes, but
quotes within are left alone.

Remove exterior quotes and remove escape characters so that they don't
show up in the author field.

Signed-off-by: Kevin Daudt 
---
 mailinfo.c   | 46 
 t/t5100-mailinfo.sh  | 14 ++
 t/t5100/comment.expect   |  5 +
 t/t5100/comment.in   |  9 +
 t/t5100/quoted-string.expect |  5 +
 t/t5100/quoted-string.in |  9 +
 6 files changed, 88 insertions(+)
 create mode 100644 t/t5100/comment.expect
 create mode 100644 t/t5100/comment.in
 create mode 100644 t/t5100/quoted-string.expect
 create mode 100644 t/t5100/quoted-string.in

diff --git a/mailinfo.c b/mailinfo.c
index e19abe3..6a7c2f2 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -54,6 +54,50 @@ static void parse_bogus_from(struct mailinfo *mi, const 
struct strbuf *line)
get_sane_name(>name, >name, >email);
 }
 
+static void unquote_quoted_string(struct strbuf *line)
+{
+   const char *in = strbuf_detach(line, NULL);
+   int c, take_next_literally = 0;
+   int found_error = 0;
+
+   /*
+* Stores the character that started the escape mode so that we know 
what
+* character will stop it
+*/
+   char escape_context = 0;
+
+   while ((c = *in++) != 0) {
+   if (take_next_literally) {
+   take_next_literally = 0;
+   } else {
+   switch (c) {
+   case '"':
+   if (!escape_context)
+   escape_context = '"';
+   else if (escape_context == '"')
+   escape_context = 0;
+   continue;
+   case '\\':
+   if (escape_context) {
+   take_next_literally = 1;
+   continue;
+   }
+   break;
+   case '(':
+   if (!escape_context)
+   escape_context = '(';
+   break;
+   case ')':
+   if (escape_context == '(')
+   escape_context = 0;
+   break;
+   }
+   }
+
+   strbuf_addch(line, c);
+   }
+}
+
 static void handle_from(struct mailinfo *mi, const struct strbuf *from)
 {
char *at;
@@ -63,6 +107,8 @@ static void handle_from(struct mailinfo *mi, const struct 
strbuf *from)
strbuf_init(, from->len);
strbuf_addbuf(, from);
 
+   unquote_quoted_string();
+
at = strchr(f.buf, '@');
if (!at) {
parse_bogus_from(mi, from);
diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index 27bf3b8..8c21434 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -144,4 +144,18 @@ test_expect_success 'mailinfo unescapes with --mboxrd' '
test_cmp expect mboxrd/msg
 '
 
+test_expect_success 'mailinfo handles rfc2822 quoted-string' '
+   mkdir quoted-string &&
+   git mailinfo /dev/null /dev/null <"$DATA"/quoted-string.in \
+   >quoted-string/info &&
+   test_cmp "$DATA"/quoted-string.expect quoted-string/info
+'
+
+test_expect_success 'mailinfo handles rfc2822 comment' '
+   mkdir comment &&
+   git mailinfo /dev/null /dev/null <"$DATA"/comment.in \
+   >comment/info &&
+   test_cmp "$DATA"/comment.expect comment/info
+'
+
 test_done
diff --git a/t/t5100/comment.expect b/t/t5100/comment.expect
new file mode 100644
index 000..1197e76
--- /dev/null
+++ b/t/t5100/comment.expect
@@ -0,0 +1,5 @@
+Author: A U Thor (this is a comment (really))
+Email: someb...@example.com
+Subject: testing comments
+Date: Sun, 25 May 2008 00:38:18 -0700
+
diff --git a/t/t5100/comment.in b/t/t5100/comment.in
new file mode 100644
index 000..430ba97
--- /dev/null
+++ b/t/t5100/comment.in
@@ -0,0 +1,9 @@
+From 1234567890123456789012345678901234567890 Mon Sep 17 00:00:00 2001
+From: "A U Thor"  (this is a comment \(really\))
+Date: Sun, 25 May 2008 00:38:18 -0700
+Subject: [PATCH] testing comments
+
+
+
+---
+patch
diff --git a/t/t5100/quoted-string.expect b/t/t5100/quoted-string.expect
new file mode 100644
index 000..cab1bce
--- /dev/null
+++ b/t/t5100/quoted-string.expect
@@ -0,0 +1,5 @@
+Author: Author "The Author" Name
+Email: someb...@example.com
+Subject: testing quoted-pair
+Date: Sun, 25 May 2008 00:38:18 -0700
+
diff --git a/t/t5100/quoted-string.in 

Re: [PATCH] ls-files: add pathspec matching for submodules

2016-09-19 Thread Brandon Williams
I thought as much.  Thanks for the quick explanation :)

On Mon, Sep 19, 2016 at 11:34 AM, Junio C Hamano  wrote:
> Brandon Williams  writes:
>
>> side question if the answer is short:  Any reason as to why all of the
>> pathspec matching code lives inside of dir.c and not pathspec.c?
>
> Hysterical Raisins.
>
> A pathspec used to be just a "const char **" in simpler times, which
> was no more elaborate than "argv + i" (i.e. after parsing options
> and revisions out, the remainders are pathspec elements).  Major
> part of the matching was done inside dir.c and we haven't had chance
> to look at the larger picture to move the code around.


Re: [PATCH] ls-files: add pathspec matching for submodules

2016-09-19 Thread Junio C Hamano
Brandon Williams  writes:

> side question if the answer is short:  Any reason as to why all of the
> pathspec matching code lives inside of dir.c and not pathspec.c?

Hysterical Raisins.

A pathspec used to be just a "const char **" in simpler times, which
was no more elaborate than "argv + i" (i.e. after parsing options
and revisions out, the remainders are pathspec elements).  Major
part of the matching was done inside dir.c and we haven't had chance
to look at the larger picture to move the code around.


Re: [PATCH] ls-files: add pathspec matching for submodules

2016-09-19 Thread Brandon Williams
side question if the answer is short:  Any reason as to why all of the
pathspec matching code lives inside of dir.c and not pathspec.c?

On Mon, Sep 19, 2016 at 11:22 AM, Junio C Hamano  wrote:
> Brandon Williams  writes:
>
>> Yes in that case it wouldn't have passed ps_strncmp()...but we should have 
>> never
>> made it there in the first place due to a piece of logic in 
>> match_pathspec_item:
>
> Ah, OK.
>
> Thanks for clarifying.


Re: [PATCH] ls-files: add pathspec matching for submodules

2016-09-19 Thread Junio C Hamano
Brandon Williams  writes:

> Yes in that case it wouldn't have passed ps_strncmp()...but we should have 
> never
> made it there in the first place due to a piece of logic in 
> match_pathspec_item:

Ah, OK.

Thanks for clarifying.


Re: [PATCH] ls-files: add pathspec matching for submodules

2016-09-19 Thread Brandon Williams
On Mon, Sep 19, 2016 at 11:04 AM, Junio C Hamano  wrote:
> Brandon Williams  writes:
>
>>> Again, what do we have in "name" and "item" at this point?  If we
>>> have a submodule at "sub/" and we are checking a pathspec element
>>> "sub/dir1/*", what is the non-wildcard part of the pathspec and what
>>> is the "string"?  Aren't they "sub/dir1/" and "sub/" respectively,
>>> which would not pass ps_strncmp() and produce a (false) negative?
>>
>> item will be the pathspec_item struct that we are trying to match against.
>
> ... which would mean "sub/dir1/" in the above example (which is
> followed by '*' that is wildcard).
>
>> name will be the file we are trying to match, which should already have the
>> 'prefix' cut off (this is the prefix that is used as an optimization
>> in the common
>> case, which isn't used in the submodule case).
>
> ... which would be "sub/" in the above example, because we disable
> the common-prefix optimization.
>
> So in short, the answer to the last questions in the first quoted
> paragraph are yes, yes, and "no they do not pass ps_strncmp()"?

Yes in that case it wouldn't have passed ps_strncmp()...but we should have never
made it there in the first place due to a piece of logic in match_pathspec_item:

@@ -283,6 +308,24 @@ static int match_pathspec_item(const struct
pathspec_item *item, int prefix,
 item->nowildcard_len - prefix))
return MATCHED_FNMATCH;

+   /* Perform checks to see if "name" is a super set of the pathspec */
+   if (flags & DO_MATCH_SUBMODULE) {
+   int matched = 0;
+
+   /* Check if the name is a literal prefix of the pathspec */
+   if ((item->match[namelen] == '/') &&
+   !ps_strncmp(item, match, name, namelen)) {
+   matched = MATCHED_RECURSIVELY;
+   /* Check if the name wildmatches to the pathspec */
+   } else if (item->nowildcard_len < item->len &&
+  !prefix_fnmatch(item, match, name,
+  item->nowildcard_len - prefix)) {
+   matched = MATCHED_FNMATCH;
+   }
+
+   return matched;
+   }

Perhaps the call structure and code organization could be changes a bit to make
a little more sense.


Re: [PATCH] ls-files: add pathspec matching for submodules

2016-09-19 Thread Junio C Hamano
Brandon Williams  writes:

>  static void show_ce_entry(const char *tag, const struct cache_entry *ce)
>  {
> + struct strbuf name = STRBUF_INIT;
>   int len = max_prefix_len;
> + if (submodule_prefix)
> + strbuf_addstr(, submodule_prefix);
> + strbuf_addstr(, ce->name);

Continuing with the previous review, which concentrated on what
happens in the superproject; let's see what happens in the recursive
invocation in a submodule.

So a recursively spawned "ls-files --submodule-prefix=sub/" finds
a path in the index PATH and forms "sub/PATH" in "name".  From here
on, where we used to match pathspec against ce->name, we would be
matching it against name->buf.

> + if (recurse_submodules && S_ISGITLINK(ce->ce_mode) &&
> + submodule_path_match(, name.buf, ps_matched)) {
>   show_gitlink(ce);

This is primarily what happens in the superproject to decide if the
submodule is worth showing.  When we are in a submodule, we can
descend into subsubmodule (if our ls-files run in the superproject
passed --recurse-submodule down) from here.

> + } else if (match_pathspec(, name.buf, name.len,
> +   len, ps_matched,
> +   S_ISDIR(ce->ce_mode) ||
> +   S_ISGITLINK(ce->ce_mode))) {

This is interesting bit to see what happens in the recursive
invocation.  It uses the usual match_pathspec(), as we want to be
precise and correct, iow, we do not want to use DO_MATCH_SUBMODULE,
aka "it might be worth descending into submodule".

> + if (tag && *tag && show_valid_bit &&
> + (ce->ce_flags & CE_VALID)) {
> +...
> + }
> + write_eolinfo(ce, ce->name);
> + write_name(ce->name);

The prefixing is taken care of by write_name(), so it is correct to
use ce->name here.

> ...  
> + strbuf_release();
>  }

OK, everything I saw so far for the recursive invocation here makes
sense.

Thanks.


RE: Switching branches not working in a cloned repo

2016-09-19 Thread Paul Williamson
-Original Message-
From: Philip Oakley [mailto:philipoak...@iee.org] 

>Have you tried `git ls-remote` ?
>The `branch -r` just lists the local 'rtb's (IIUC).

Nice, I didn't know about that command - I tried it though and it does list the 
remote branches correctly. I checked the commit refs given and it tallies with 
the refs at the gitolite server end.

>It could be someone has accidently pruned or deleted that branch at the remote.

I don't think so, as I say it's a gitolite server, so usually nobody is on it. 
We just push to it and set up post-receive hooks for if updates need to be 
pushed on to a deployment server.

Do you know of ways to check if anything is corrupted within the repo? I ran 
git fsck, but that didn't show up anything.

>What version are you (they) on?

1.9.1 at both ends (my laptop and our gitolite server)

> Gmane had to quit. Try http://public-inbox.org/git (see the help link)

Ah! Thanks. I tried searching but so far only turned up this thread.

> philip 

Thanks for your reply, Philip.

Paul



Re: [PATCH] ls-files: add pathspec matching for submodules

2016-09-19 Thread Junio C Hamano
Brandon Williams  writes:

>> Again, what do we have in "name" and "item" at this point?  If we
>> have a submodule at "sub/" and we are checking a pathspec element
>> "sub/dir1/*", what is the non-wildcard part of the pathspec and what
>> is the "string"?  Aren't they "sub/dir1/" and "sub/" respectively,
>> which would not pass ps_strncmp() and produce a (false) negative?
>
> item will be the pathspec_item struct that we are trying to match against.

... which would mean "sub/dir1/" in the above example (which is
followed by '*' that is wildcard).

> name will be the file we are trying to match, which should already have the
> 'prefix' cut off (this is the prefix that is used as an optimization
> in the common
> case, which isn't used in the submodule case).  

... which would be "sub/" in the above example, because we disable
the common-prefix optimization.

So in short, the answer to the last questions in the first quoted
paragraph are yes, yes, and "no they do not pass ps_strncmp()"?

>> I am starting to have a feeling that the best we can do in this
>> function safely is to see if prefix (i.e. the constant part of the
>> pathspec before the first wildcard) is long enough to cover the
>> "name" and if "name" part [matches or does not match] ...
>> If these two checks cannot decide, we may have to be pessimistic and
>> say "it may match; we don't know until we descend into it".
>> ...
>> So I would think we'd be in the business of counting slashes in the
>> name (called "string" in this function) and the pathspec, while
>> noticing '*' and '**' in the latter, and we may be able to be more
>> precise, but I am not sure how complex the end result would become.
>
> I agree, I'm not too sure how much more complex the logic would need
> to be to handle
> all matters of wildcard characters.  We could initially be more
> lenient on what qualifies as
> a match and then later (or in the near future) revisit the wildmatch
> function (which is complex)
> and see if we can add better matching capabilities more suited for
> submodules while at the
> same time fixing that bug discussed above.

I think it is reasonable to start a function that is meant to never
have false negatives pessimistic and return "might match" from it
when in doubt.

Thanks.


Re: [PATCH 1/6] i18n: commit: mark message for translation

2016-09-19 Thread Junio C Hamano

I am responding to 1/6, as the series lacked a cover letter, but all
of them looked good.

Thanks.


Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Junio C Hamano
Andrew Donnellan  writes:

> Sounds good to me. Agreed that "RFC" is essentially the only prefix
> other than "PATCH" that I see, at least in the kernel.

Around here I think we saw WIP too, and that makes me lean towards
Peff's earlier suggestion to allow an end-user supplied string in
front of PATCH, i.e. "-P RFC" => "--subject-prefix='RFC PATCH'",
even though I understand that those who _ONLY_ care about RFC would
prefer --rfc (5 keystrokes) over "-P RFC" (6 keystrokes).

>> +--rfc::
>> +Alias for `--subject-prefix="RFC PATCH"`. Use this when
>> +sending an experimental patch for discussion rather than
>> +application.
>
> Perhaps mention the phrase "Request For Comment" for the benefit of
> those who aren't familiar ...

Good point.


Re: [Bug] Custom git-dir directory shouldn't be listed as “untracked”

2016-09-19 Thread Junio C Hamano
Nicolas Cuillery  writes:

> When using the default directory ".git", it logically doesn't appear
> in the "git status" command's output. Don't you think it should be the
> same when using a custom dir name ?

Not really.

GIT_DIR= mechanism was never meant to be used to name a
directory that sitsinside your working tree (an exception is if it
is actually ".git" at the top).


Re: [PATCH 11/11] Resumable clone: implement primer logic in git-clone

2016-09-19 Thread Junio C Hamano
Duy Nguyen  writes:

> On Fri, Sep 16, 2016 at 7:12 AM, Kevin Wern  wrote:
>>  builtin/clone.c | 590 
>> +---
>
> Argh.. this is too big for my brain at this hour. It might be easier
> to follow if you separate out some code move (I think I've seen some,
> not sure). I'll try to have another look when I find time. But it's
> great to hear from you again, the pleasant surprise in my inbox today,
> as I thought we lost you ;-) There's hope for resumable clone maybe
> before 2018 again.

I had a similar thought.

What "git clone" (WITHOUT Kevin's update) should have been was to be
as close to

* Parse command line arguments;

* Create a new repository and go into it; this step would
  require us to have parsed the command line for --template,
  , --separate-git-dir, etc.

* Talk to the remote and do get_remote_heads() aka ls-remote
  output;

* Decide what fetch refspec to use, which alternate object store
  to borrow from; this step would require us to have parsed the
  command line for --reference, --mirror, --origin, etc;

* Issue "git fetch" with the refspec determined above; this step
  would require us to have parsed the command line for --depth, etc.

* Run "git checkout -b" to create an initial checkout; this step
  would require us to have parsed the command line for --branch,
  etc.

and the current code Kevin is basing his work on is not quite in
that shape.  This round of the patches may be RFC so it may be OK
but the ready-for-review work may need to do the refactoring to get
it close to the above shape as a preparatory step before doing
anything else.  Once that is done, Kevin's series (other than the
part that acutally does the resumable static file download) will
become a very easily understood "Ah, we know where to prime the well
from, so let's do that first" step inserted immediately before the
"Issue 'git fetch'" step.



Re: [PATCH] ls-files: add pathspec matching for submodules

2016-09-19 Thread Brandon Williams
On Mon, Sep 19, 2016 at 10:00 AM, Junio C Hamano  wrote:
>
> I think you were clear enough.
>
> Don't read everything other people say in their reviews as pointing
> out issues.  Often trying to rephrase what they read in the code in
> their own words is a good way to make sure the reviewers and the
> original author are on the same page.  The above was one of these
> cases.

That makes sense, I'll be sure to remember that!


>
 + if (prefix > 0) {
 + if (ps_strncmp(item, pattern, string, prefix))
 + return WM_NOMATCH;
>>>
>>> This says: when we have a set prefix that must literally match, and
>>> that part does not match what we have, it cannot possibly match.
>>>
>>> Is that correct?  What do we have in "name" and "item" at this
>>> point?  We disable the common-prefix optimization, so we do not have
>>> to worry about a pathspec with two elements "sub/dir1/*" and "sub/dir2/*"
>>> giving you "sub/dir" as the common prefix, when you are wondering if
>>> it is worth descending into "sub/" without knowing what it contains.
>>> Is that what guarantees why this part is correct?
>>
>> I adopted this structure from another part of the code.  The caller
>> uses a field in
>> the pathspec item which indicates the location of the first wildcard 
>> character.
>> So the prefix (everything prior to the wildcard char) must match
>> literally before
>> we drop into a more expensive wildmatch function.
>
> "Another part of the code" is about tree walking, right?  Weren't
> you saying that part of the code may be buggy or something earlier
> (e.g. pathspec "su?/" vs entry "sub")?

I was refering to the logic that is used to do normal pathspec checking:
'match_pathspec' and the functions it calls, in particular
git_fnmatch.  And the bug I
pointed out before, I believe is due to how the wildmatch function
works.  It requires
that the pattern and the string being compared must match exactly (in
length as well).
The "su?/" case would drop into wildmatch funciton and wouldn't match
against any file
in the directory "sub/file1" for example, because it doesn't exactly
match "su?/".  In the
case of "sub" there is logic prior to the wildmatch function call
which would classify it a
match and return before descending into wildmatch.

> Again, what do we have in "name" and "item" at this point?  If we
> have a submodule at "sub/" and we are checking a pathspec element
> "sub/dir1/*", what is the non-wildcard part of the pathspec and what
> is the "string"?  Aren't then "sub/dir1/" and "sub/" respectively,
> which would not pass ps_strncmp() and produce a (false) negative?

item will be the pathspec_item struct that we are trying to match against.
name will be the file we are trying to match, which should already have the
'prefix' cut off (this is the prefix that is used as an optimization
in the common
case, which isn't used in the submodule case).  The 'prefix' in this function's
context is the part of the pattern prior to the first wildcard character. which
we can do a literal comparison on before descending into the wildmatch function.

> I am starting to have a feeling that the best we can do in this
> function safely is to see if prefix (i.e. the constant part of the
> pathspec before the first wildcard) is long enough to cover the
> "name" and if "name" part does not match to the directory boundary,
> e.g. for this combination
>
> pathspec = "a/b/sib/c/*"
> name = "a/b/sub/"
>
> we can say with confidence that it is not worth descending into.
>
> When prefix is long enough and "name" and leading part of the prefix
> matches to the directory boundary, e.g.
>
> pathspec = "a/b/sub/c/*"
> name = "a/b/sub/"
>
> we can say it is worth descending into.
>
> If these two checks cannot decide, we may have to be pessimistic and
> say "it may match; we don't know until we descend into it".  When
> prefix is shorter than name, I am not sure if we can devise a set of
> simple rules, e.g.
>
> pathspec = "a/**/c/*"
> name = "a/b/sub/"
>
> may match with its ** "b/sub" part and worth descending into, so is
>
> pathspec = "a/b/*/c/*"
> name = "a/b/sub/"
>
> but not this one:
>
> pathspec = "a/b/su[c-z]/c/*"
> name = "a/b/sub/"
>
> but this is OK:
>
> pathspec = "a/b/su[a-z]/c/*"
> name = "a/b/sub/"
>
> So I would think we'd be in the business of counting slashes in the
> name (called "string" in this function) and the pathspec, while
> noticing '*' and '**' in the latter, and we may be able to be more
> precise, but I am not sure how complex the end result would become.
>

I agree, I'm not too sure how much more complex the logic would need
to be to handle
all matters of wildcard characters.  We could initially be more
lenient on what qualifies as
a match and then later (or in the near future) revisit the wildmatch
function (which is complex)
and see if we can add better matching 

Re: [PATCH v3 0/8] Better heuristics make prettier diffs

2016-09-19 Thread Junio C Hamano
Junio C Hamano  writes:

> Michael Haggerty  writes:
>
>> On 09/08/2016 01:25 AM, Junio C Hamano wrote:
>>> I'd move it temporarily to t4061 with a separate SQUASH??? at the
>>> tip for now, as I am running out of time today.
>>
>> I didn't realize you were waiting for an ACK. Yes, it's totally OK to
>> rename the test.
>
> I actually wasn't asking for an Ack.
>
> As the issue was in the one that is buried a few commits from the
> tip, and there is a later one that adds more tests to it, I didn't
> find enough energy to rename the new file in a buried commit and
> then adjust the patch later updates it, I was hoping that you'd
> reroll to save me effort, rather than forcing me to do the rebase
> myself ;-).

Now I did, so no need to resend (unless you have changes other than
the renaming of the test script, that is).

Let's move it down to 'next' soonish.

Thanks.


Re: [PATCH v3] checkout: eliminate unnecessary merge for trivial checkout

2016-09-19 Thread Junio C Hamano
Junio C Hamano  writes:

>> "git checkout -b foo" (without -f -m or ) is defined in the
>> manual as being a shortcut for/equivalent to:
>>  
>> (1a) "git branch foo"
>> (1b) "git checkout foo"
>>  
>> However, it has been our experience in our observed use cases and all the
>> existing git tests, that it can be treated as equivalent to:
>>  
>> (2a) "git branch foo"
>> (2b) "git symbolic-ref HEAD refs/heads/foo"
>> ...
>
> I am still not sure if I like the change of what "checkout -b" is
> this late in the game, though.

Having said all that.

I do see the merit of having a shorthand way to invoke your 2 above.
It is just that I am not convinced that it is the best way to
achieve that goal to redefine what "git checkout -b " (no
other parameters) does.


Re: [PATCH] ls-files: add pathspec matching for submodules

2016-09-19 Thread Junio C Hamano
Brandon Williams  writes:

>> OK, so as discussed previously with Heiko and Stefan, the idea is to
>>
>>  - pass the original pathspec as-is,
>>
>>  - when --submodule-prefix is given, a path discovered in a
>>submodule repository is first prefixed with that string before
>>getting checked to see if it matches the original pathspec.
>>
>> And this loop is about relaying the original pathspec.
>
> Exactly.  Perhaps I should have made this more clear either with a
> detailed comment or
> more information in the commit msg.

I think you were clear enough.

Don't read everything other people say in their reviews as pointing
out issues.  Often trying to rephrase what they read in the code in
their own words is a good way to make sure the reviewers and the
original author are on the same page.  The above was one of these
cases.

>>> + if (prefix > 0) {
>>> + if (ps_strncmp(item, pattern, string, prefix))
>>> + return WM_NOMATCH;
>>
>> This says: when we have a set prefix that must literally match, and
>> that part does not match what we have, it cannot possibly match.
>>
>> Is that correct?  What do we have in "name" and "item" at this
>> point?  We disable the common-prefix optimization, so we do not have
>> to worry about a pathspec with two elements "sub/dir1/*" and "sub/dir2/*"
>> giving you "sub/dir" as the common prefix, when you are wondering if
>> it is worth descending into "sub/" without knowing what it contains.
>> Is that what guarantees why this part is correct?
>
> I adopted this structure from another part of the code.  The caller
> uses a field in
> the pathspec item which indicates the location of the first wildcard 
> character.
> So the prefix (everything prior to the wildcard char) must match
> literally before
> we drop into a more expensive wildmatch function.

"Another part of the code" is about tree walking, right?  Weren't
you saying that part of the code may be buggy or something earlier
(e.g. pathspec "su?/" vs entry "sub")?

Again, what do we have in "name" and "item" at this point?  If we
have a submodule at "sub/" and we are checking a pathspec element
"sub/dir1/*", what is the non-wildcard part of the pathspec and what
is the "string"?  Aren't then "sub/dir1/" and "sub/" respectively,
which would not pass ps_strncmp() and produce a (false) negative?

I am starting to have a feeling that the best we can do in this
function safely is to see if prefix (i.e. the constant part of the
pathspec before the first wildcard) is long enough to cover the
"name" and if "name" part does not match to the directory boundary,
e.g. for this combination

pathspec = "a/b/sib/c/*"
name = "a/b/sub/"

we can say with confidence that it is not worth descending into.

When prefix is long enough and "name" and leading part of the prefix
matches to the directory boundary, e.g.

pathspec = "a/b/sub/c/*"
name = "a/b/sub/"

we can say it is worth descending into.

If these two checks cannot decide, we may have to be pessimistic and
say "it may match; we don't know until we descend into it".  When
prefix is shorter than name, I am not sure if we can devise a set of
simple rules, e.g.

pathspec = "a/**/c/*"
name = "a/b/sub/"

may match with its ** "b/sub" part and worth descending into, so is

pathspec = "a/b/*/c/*"
name = "a/b/sub/"

but not this one:

pathspec = "a/b/su[c-z]/c/*"
name = "a/b/sub/"

but this is OK:

pathspec = "a/b/su[a-z]/c/*"
name = "a/b/sub/"

So I would think we'd be in the business of counting slashes in the
name (called "string" in this function) and the pathspec, while
noticing '*' and '**' in the latter, and we may be able to be more
precise, but I am not sure how complex the end result would become.



Re: [PATCH v3] checkout: eliminate unnecessary merge for trivial checkout

2016-09-19 Thread Junio C Hamano
"Ben Peart"  writes:

> Let me see if I can better explain what I’m trying to accomplish with this
> patch.  
>  
> "git checkout -b foo" (without -f -m or ) is defined in the
> manual as being a shortcut for/equivalent to:
>  
> (1a) "git branch foo"
> (1b) "git checkout foo"
>  
> However, it has been our experience in our observed use cases and all the
> existing git tests, that it can be treated as equivalent to:
>  
> (2a) "git branch foo"
> (2b) "git symbolic-ref HEAD refs/heads/foo"
>  
> That is, the common perception (use case) is to just create a new branch
> "foo" (pointing at the current commit) and point HEAD at it WITHOUT making
> any changes to the index or worktree.
>  
> However, the (1b) command has "git reset" connotations in that it should
> examine and manipulate the trees, index, and worktree in the expectation
> that there MIGHT be work to do.
>  
> Since this additional work in (1b) takes minutes on large repos and (2b)
> takes less than a second, my intent was to identify the conditions that this
> additional work will have no affect and thereby avoid it.
>  
> Alternatively, was the "-b" option just created as a shortcut only to avoid
> calling the separate "git branch foo" command and we should not think about
> the common perception and usage?

If you are trying to change the definition of "checkout -b" from 1
to 2 above, that is a completely different issue.  I thought this
was an attempt to optimize for the performance without changing the
behaviour.

So if you did not apologize like this...

> It is correct that this optimization will skip updating the tree to honor
> any changes to the sparse-checkout in the case of creating a new branch.
> Unfortunately, I don't know of any way to detect the changes other than
> actually doing all the work to update the skip work tree bit in the index.

... but insisted that skipping the yucky sparse-checkout adjustment
in this case was an intended behaviour change, I would have
understood (not necessarily agreed, though) what you were trying to
do.

> Beyond this code review process and testing, I don't know how else we make
> sure we're caught all the conditions where we are OK skipping some of the
> steps. Any change has inherent risk - a change in behavior even more so.

At least we made one-step progress today.  I now know that you are
trying to change the behaviour, but I didn't know that last week,
when I was primarily reacting that your claim that this was
performance thing and assuming you meant no change in behaviour, but
there was clearly behaviour change, and it was apparent that the
denseness of the code made it almost impossible to see if there are
unintended changes.

I am still not sure if I like the change of what "checkout -b" is
this late in the game, though.


Re: Why are there multiple ways to get the manual in Git?

2016-09-19 Thread Philip Oakley

From: "Junio C Hamano" 

"Philip Oakley"  writes:


The `git revisions --help` does work ;-)


Not anymore ;-)

I think Ralf Thielow fixed it recently.


hmm, I sort of though it would still work with a valid guide.

I'd only checked with my last GfW version.
--
hey ho
Philip


Re: clarification of `rev-list --no-walk ^`?

2016-09-19 Thread Junio C Hamano
"Philip Oakley"  writes:

> At the moment the cherry-pick man page's example implies that
> --do-walk is applied from the beginning, rather from the point given
> on the command line.
>
> I had a very quick search of the *.c code for the options but didn't
> get any further. Hopefully the user issue/misunderstanding is
> elsewhere... I'll add this to my little list.

I think the confusion is coming from not understanding that revision
specifiers cannot have position-dependent semantics, because there
is no "union of multiple sets".  You said

commits in 'master..next' range and the tip of 'maint'

earlier, and that is a prime specimen of that confusion.  That is
asking "things reachable from next excluding things reachable from
master" computed independently from everything else on the command
line (i.e. that is one set), and "the commit at the tip of 'maint'"
(i.e. that is another set, which consists of a singleton element),
and wanting to take a union of it.  But the revision machinery is
not structured to work that way.  It can only do "reachable from one
enumeration of positive tips, excluding ones reachable from another
enumeration of negative tips".  "no-walk" is a cheap hack that tells
the machinery "stop after collecting that 'one enumeration of
positive tips' and do not walk. Make that enumeration the resulting
set".  Having anything negative in the enumeration of starting
points from the command line automatically turns "no-walk" off, even
for commands that default to "no-walk".

We may need further documentation updates to unconfuse readers.


Re: Switching branches not working in a cloned repo

2016-09-19 Thread Philip Oakley

From: "Paul Williamson" 

Hi,

We use git extensively on a number of repos. Recently, we have had a 
problem with one of them. This repo has a 'web_dev' branch. For copies of 
the repo cloned before a certain (recent but unidentified) time, we could 
'git checkout' between master and web_dev and everything would be normal.


However, now if we clone the repo, we can no longer do 'git checkout 
web_dev'. Git doesn't complain, in fact there is no feedback and we are 
still in the master branch. Running 'git branch -r' still shows the branch 
as existing at origin.


Have you tried `git ls-remote` ?

The `branch -r` just lists the local 'rtb's (IIUC).

It could be someone has accidently pruned or deleted that branch at the 
remote.


What version are you (they) on?



If we try 'git branch web_dev' we then see web_dev listed locally and can 
switch to it BUT on closer inspection we realise that this action has 
created a new branch off master.


The first time we saw this was under Bash on Windows, so we thought maybe 
it was a beta problem, but a) other repos work as expected under that 
environment, and b) under cygwin, pulling the same repo to a new directory 
alongside an older copy shows that the problem occurs with the new clone, 
but not the one that that was cloned longer ago.


Also in this situation, there are no local outstanding code changes that 
might cause problems switching branches. This occurs right from a cleanly 
cloned repo.


It seems something has gone wrong with this repo, and we don't know what. 
It's a tough problem to google, and I was not able to search the gmane 
archives (DNS errors).


Gmane had to quit. Try http://public-inbox.org/git (see the help link)



Any idea how to investigate?

Thanks,
Paul

--
philip 



Re: clarification of `rev-list --no-walk ^`?

2016-09-19 Thread Junio C Hamano
Michael J Gruber  writes:

>> It can be read that
>> 
>> $ git cherry-pick maint next
>> 
>> would pick two single commits, while
>> 
>> $ git cherry-pick maint next ^master
>> 
>> could implicitly be read as
>> 
>> $ git cherry-pick maint next --do-walk ^master

You can read it as "master..next maint" that does force walking.

>> Clearly that's not what is intended, which is
>> 
>> $ git cherry-pick --do-walk maint next ^master

I do not see the distinction betwee the above two you seem to be
trying to make.  Care to explain?

>> but it is open to interpretation as to where in the command line the caret
>> range prefix's --do-walk (to countermand the --no-walk) should applied.

I do not think it can be position dependent.  Philip probably has a
confused notion that "rev-list A..B C..D" is somehow a union of set
A..B and C..D?

>> If the user did want just the single commit at the tip of maint, and then
>> the range master..next, what would be their command line, and also, how
>> would the man page warn against false expectations?

Yeah, this can show us that all of the have is coming from that
exact confusion I suspected Philip has.  We need to clarify in the
documentation that rev-list set operation does *NOT* have union of
multiple sets to unconfuse the readers.



Re: [PATCH v3 0/8] Better heuristics make prettier diffs

2016-09-19 Thread Junio C Hamano
Michael Haggerty  writes:

> On 09/08/2016 01:25 AM, Junio C Hamano wrote:
>> Michael Haggerty  writes:
>> 
>>>   * Add test t4059 as part of this commit, not as part of its
>>> successor.
>> 
>> Which needs to be moved to somewhere else, as another topics that
>> has already been in 'next' uses t4059.
>> 
>> I'd move it temporarily to t4061 with a separate SQUASH??? at the
>> tip for now, as I am running out of time today.
>
> I didn't realize you were waiting for an ACK. Yes, it's totally OK to
> rename the test.

I actually wasn't asking for an Ack.

As the issue was in the one that is buried a few commits from the
tip, and there is a later one that adds more tests to it, I didn't
find enough energy to rename the new file in a buried commit and
then adjust the patch later updates it, I was hoping that you'd
reroll to save me effort, rather than forcing me to do the rebase
myself ;-).



Re: Bug: pager. doesn't work well with editors

2016-09-19 Thread Junio C Hamano
Anatoly Borodin  writes:

>> I think, the pagination should be turned off when the editor is being
>> called.

This is a fun one.  IIRC, we decide to spawn a pager and run our
output via pipe thru it fairly early, even before we figure out
which subcommand is being run (especially if you do "git -p
subcommand"), which by definition is way before we know if that
subcommand wants to let the user edit things with the editor.


Re: Why are there multiple ways to get the manual in Git?

2016-09-19 Thread Junio C Hamano
"Philip Oakley"  writes:

> The `git revisions --help` does work ;-)

Not anymore ;-)

I think Ralf Thielow fixed it recently.



Switching branches not working in a cloned repo

2016-09-19 Thread Paul Williamson
Hi,

We use git extensively on a number of repos. Recently, we have had a problem 
with one of them. This repo has a 'web_dev' branch. For copies of the repo 
cloned before a certain (recent but unidentified) time, we could 'git checkout' 
between master and web_dev and everything would be normal.

However, now if we clone the repo, we can no longer do 'git checkout web_dev'. 
Git doesn't complain, in fact there is no feedback and we are still in the 
master branch. Running 'git branch -r' still shows the branch as existing at 
origin.

If we try 'git branch web_dev' we then see web_dev listed locally and can 
switch to it BUT on closer inspection we realise that this action has created a 
new branch off master.

The first time we saw this was under Bash on Windows, so we thought maybe it 
was a beta problem, but a) other repos work as expected under that environment, 
and b) under cygwin, pulling the same repo to a new directory alongside an 
older copy shows that the problem occurs with the new clone, but not the one 
that that was cloned longer ago.

Also in this situation, there are no local outstanding code changes that might 
cause problems switching branches. This occurs right from a cleanly cloned repo.

It seems something has gone wrong with this repo, and we don't know what. It's 
a tough problem to google, and I was not able to search the gmane archives (DNS 
errors).

Any idea how to investigate?

Thanks,
Paul




Re: clarification of `rev-list --no-walk ^`?

2016-09-19 Thread Philip Oakley

From: "Michael J Gruber" 

Philip Oakley venit, vidit, dixit 19.09.2016 12:56:

A question came up on the Git user list regarding cherry-pick that got me
reading the manual (again), in this case regarding --no-walk ranges.

Essentially my question is: If --no-walk is given to rev-list (e.g. via
charry-pick), and the user includes a caret prefixed rev, when does that
range definition take effect on the command line, especially in light of
the --do-walk option?

In rev-list(1) there are only 8 references to  'range', with only
the --no-walk option saying "This has no effect if a range is specified."
but leaving open the decision as to what does (and does not) comprises 
the

specification of a range on the cli.

The two and three dot notations are fairly obvious ranges from
gitrevisions(7) as they are complete strings, while the caret prefix is 
an

implied range (it needs additional parameters to complete the range, and
there-in lies the issue).

It can be read that

$ git cherry-pick maint next

would pick two single commits, while

$ git cherry-pick maint next ^master

could implicitly be read as

$ git cherry-pick maint next --do-walk ^master

because the ^ caret starts the range that cancels the --no-walk.

Clearly that's not what is intended, which is

$ git cherry-pick --do-walk maint next ^master

but it is open to interpretation as to where in the command line the 
caret

range prefix's --do-walk (to countermand the --no-walk) should applied.

If the user did want just the single commit at the tip of maint, and then
the range master..next, what would be their command line, and also, how
would the man page warn against false expectations?


Maybe:

Every negative rev (rev prefixed with ^, or a range) implies a
`--do-walk` (right at its position on the command line).

And then curb the misleading range sentence in the `--no-walk` 
description.


At the moment the cherry-pick man page's example implies that --do-walk is 
applied from the beginning, rather from the point given on the command line.


I had a very quick search of the *.c code for the options but didn't get any 
further. Hopefully the user issue/misunderstanding is elsewhere... I'll add 
this to my little list.


--
Philip 



[Bug] Custom git-dir directory shouldn't be listed as “untracked”

2016-09-19 Thread Nicolas Cuillery
Hi, I want to create a local repository with a custom dir name instead
of ".git", I used the env vars GIT_DIR and WORK_TREE:
>export GIT_DIR=".customgitdir"
>export GIT_WORK_TREE="."

Then I created a repo in an empty directory:
>$ git init
>Initialized empty Git repository in X/.customgitdir/

Then I ran git status:
>$ git status
>On branch master
>
>Initial commit
>
>Untracked files:
>  (use "git add ..." to include in what will be committed)
>
>.customgitdir/
>
>nothing added to commit but untracked files present (use "git add" to track)

The local repo directory listed as "untracked files" which is a
problem when using "git add ." afterwards.

When using the default directory ".git", it logically doesn't appear
in the "git status" command's output. Don't you think it should be the
same when using a custom dir name ?

Git version 2.6.4 on MacOSX 10.11

Regards,
Nicolas


Re: [PATCH 11/11] Resumable clone: implement primer logic in git-clone

2016-09-19 Thread Duy Nguyen
On Fri, Sep 16, 2016 at 7:12 AM, Kevin Wern  wrote:
>  builtin/clone.c | 590 
> +---

Argh.. this is too big for my brain at this hour. It might be easier
to follow if you separate out some code move (I think I've seen some,
not sure). I'll try to have another look when I find time. But it's
great to hear from you again, the pleasant surprise in my inbox today,
as I thought we lost you ;-) There's hope for resumable clone maybe
before 2018 again.
-- 
Duy


Re: [PATCH 04/11] Resumable clone: add prime-clone to remote-curl

2016-09-19 Thread Duy Nguyen
On Fri, Sep 16, 2016 at 7:12 AM, Kevin Wern  wrote:
> +static void prime_clone(void)
> +{
> +   char *result, *result_full, *line;
> +   size_t result_len;
> +   int err = 0, one_successful = 0;
> +
> +   if (request_service("git-prime-clone", _full, ,
> +   _len, HTTP_ERROR_GENTLE)) {
> +   while (line = packet_read_line_buf_gentle(, 
> _len,
> + NULL)) {
> +   char *space = strchr(line ,' ');
> +
> +   // We will eventually support multiple resources, so
> +   // always parse the whole message
> +   if (err)
> +   continue;
> +   if (!space || strchr(space + 1, ' ')) {
> +   if (options.verbosity > 1)
> +   fprintf(stderr, "prime clone "
> +   "protocol error: got '%s'\n",
> +   line);
> +   printf("error\n");
> +   err = 1;
> +   continue;
> +   }
> +
> +   one_successful = 1;
> +   printf("%s\n", line);

A brief overview for this service in
Documentation/technical/http-protocol.txt (and maybe
Documentation/gitremote-helpers.txt as well) would be great help. It's
a bit hard to follow because at this point I don't know anything about
the server side (and on top of that I was confused between http
send/receive vs transport send/receive, but this is my fault).

> +   }
> +   if (!one_successful && options.verbosity > 1)
> +   fprintf(stderr, "did not get required components for "
> +   "alternate resource\n");
> +   }
> +
> +   printf("\n");
> +   fflush(stdout);
> +   free(result_full);
> +}
-- 
Duy


Re: clarification of `rev-list --no-walk ^`?

2016-09-19 Thread Michael J Gruber
Philip Oakley venit, vidit, dixit 19.09.2016 12:56:
> A question came up on the Git user list regarding cherry-pick that got me
> reading the manual (again), in this case regarding --no-walk ranges.
> 
> Essentially my question is: If --no-walk is given to rev-list (e.g. via
> charry-pick), and the user includes a caret prefixed rev, when does that
> range definition take effect on the command line, especially in light of
> the --do-walk option?
> 
> In rev-list(1) there are only 8 references to  'range', with only
> the --no-walk option saying "This has no effect if a range is specified."
> but leaving open the decision as to what does (and does not) comprises the
> specification of a range on the cli.
> 
> The two and three dot notations are fairly obvious ranges from
> gitrevisions(7) as they are complete strings, while the caret prefix is an 
> implied range (it needs additional parameters to complete the range, and 
> there-in lies the issue).
> 
> It can be read that
> 
> $ git cherry-pick maint next
> 
> would pick two single commits, while
> 
> $ git cherry-pick maint next ^master
> 
> could implicitly be read as
> 
> $ git cherry-pick maint next --do-walk ^master
> 
> because the ^ caret starts the range that cancels the --no-walk.
> 
> Clearly that's not what is intended, which is
> 
> $ git cherry-pick --do-walk maint next ^master
> 
> but it is open to interpretation as to where in the command line the caret
> range prefix's --do-walk (to countermand the --no-walk) should applied.
> 
> If the user did want just the single commit at the tip of maint, and then
> the range master..next, what would be their command line, and also, how
> would the man page warn against false expectations?

Maybe:

Every negative rev (rev prefixed with ^, or a range) implies a
`--do-walk` (right at its position on the command line).

And then curb the misleading range sentence in the `--no-walk` description.

Michael



Re: [PATCH 09/11] path: add resumable marker

2016-09-19 Thread Duy Nguyen
On Fri, Sep 16, 2016 at 7:12 AM, Kevin Wern  wrote:
> Create function to get gitdir file RESUMABLE.

A very good opportunity to explain what this file is or will be (and
whether its content matters or just its existence). Either as commit
message, or even better in Documentation/gitrepository-layout.txt.
-- 
Duy


RE: [PATCH v3] checkout: eliminate unnecessary merge for trivial checkout

2016-09-19 Thread Ben Peart
Let me see if I can better explain what I’m trying to accomplish with this
patch.  
 
"git checkout -b foo" (without -f -m or ) is defined in the
manual as being a shortcut for/equivalent to:
 
    (1a) "git branch foo"
    (1b) "git checkout foo"
 
However, it has been our experience in our observed use cases and all the
existing git tests, that it can be treated as equivalent to:
 
    (2a) "git branch foo"
    (2b) "git symbolic-ref HEAD refs/heads/foo"
 
That is, the common perception (use case) is to just create a new branch
"foo" (pointing at the current commit) and point HEAD at it WITHOUT making
any changes to the index or worktree.
 
However, the (1b) command has "git reset" connotations in that it should
examine and manipulate the trees, index, and worktree in the expectation
that there MIGHT be work to do.
 
Since this additional work in (1b) takes minutes on large repos and (2b)
takes less than a second, my intent was to identify the conditions that this
additional work will have no affect and thereby avoid it.
 
Alternatively, was the "-b" option just created as a shortcut only to avoid
calling the separate "git branch foo" command and we should not think about
the common perception and usage?
 
More comments inline...
 
> -Original Message-
> From: Junio C Hamano [mailto:gits...@pobox.com]
> Sent: Tuesday, September 13, 2016 6:35 PM
> To: Ben Peart 
> Cc: mailto:git@vger.kernel.org; mailto:pclo...@gmail.com; Ben Peart
> 
> Subject: Re: [PATCH v3] checkout: eliminate unnecessary merge for trivial
> checkout
> 
> Ben Peart  writes:
> 
> > +static int needs_working_tree_merge(const struct checkout_opts *opts,
> > +  const struct branch_info *old,
> > +  const struct branch_info *new)
> > +{
> > +...
> > +}
> 
> I do not think I need to repeat the same remarks on the conditions in this
> helper, which hasn't changed since v2.  Many "comments" in the code do not
> explain why skipping is justified, or what they claim to check looks to me
just
> plain wrong.
> 
> For example, there is
> 
>    /*
> * If we're not creating a new branch, by definition we're changing
> * the existing one so need to do the merge
> */
>    if (!opts->new_branch)
>    return 1;
> 
> but "git checkout" (no other argument) hits this condition.  It disables
the
> most trivial optimization opportunity, because we are not "creating".
> 
 
Disabling the optimization for "git checkout" with no argument was
intentional. This command does not create a new branch but instead, performs
a "soft reset" which will update the index and working directory to reflect
changes to the sparse-checkout (for example).  If this was not disabled,
many tests fail as they expect this behavior.  Because "git checkout" does
not actually change the refs, if we skipped the merge/index/working
directory update, this command becomes a no-op.
 
> "By definition, we're changing"?  Really?  Not quite.
> 
 
What I was attempting to communicate is that if we aren't creating a new
branch any changes or updates will happen in the existing branch.  Since
that could only be updating the index and working directory, we don't want
to skip those steps or we've defeated any purpose in running the command. 
 
> If you disable this bogus check, "git checkout" (no other argument) would
be
> allowed to skip the merge_working_tree(), and that in turn reveals another
> case that the helper is not checking when
> unpack_trees() MUST be called.
> 
> Note: namely, when sparse checkout is in effect, switching from
> HEAD to HEAD can nuke existing working tree files outside the
> sparse pattern -- YUCK!  See penultimate test in t1011 for
> an example.
> 
> This yuckiness is not your fault, but needs_working_tree_merge() logic you
> added needs to refrain from skipping unpack_trees() call when sparse thing
> is in effect.  I'd expect "git checkout -b foo"
> instead of "git checkout" (no other argument) would fail to honor the
sparse
> thing and reveal this bug, because the above bogus "!opts->new_branch"
> check will not protect you for that case.
> 
 
It is correct that this optimization will skip updating the tree to honor
any changes to the sparse-checkout in the case of creating a new branch.
Unfortunately, I don't know of any way to detect the changes other than
actually doing all the work to update the skip work tree bit in the index.
If this behavior is required, then this optimization will need to check  if
sparse-checkout is enabled and skip the optimization just in case there have
been changes.
 
> In other words, these random series of "if (...) return 1" are bugs hiding
> other real bugs and we need to reason about which ones are bugs that are
> hiding what other bugs that are not covered by this function.  As Peff
said
> earlier for v1, this is still an unreadable mess.  We need to figure 

Re: git add --intent-to-add silently creates empty commits

2016-09-19 Thread Duy Nguyen
On Fri, Sep 16, 2016 at 12:48 AM, Junio C Hamano  wrote:
> Aviv Eyal  writes:
>
>> Using `git add -N` allows creating of empty commits:
>>
>> git init test && cd test
>> echo text > file
>> git add --intent-to-add file
>> git commit -m 'Empty commit'
>> echo $?# prints 0
>> ...
>> I'd expect `git commit` to error out instead of producing an empty commit.
>>
>> I've seen this with git 2.8.1 and 2.10.0.129.g35f6318
>
> I think I've seen this reported some time ago.
>
> https://public-inbox.org/git/%3ccacsjy8a8-rgpyxysjbalrmia7d3dfqpr4cxasnsalycnmgm...@mail.gmail.com%3E/
>
> I do not offhand recall what happend to the topic after that.

Yeah. I'm a bit behind, no, I'm wy behind my git backlog. This
definitely gets a rise-up, together with the multiworktree bug fix in
git-init.
-- 
Duy


Re: [PATCH 02/11] Resumable clone: add prime-clone endpoints

2016-09-19 Thread Duy Nguyen
On Fri, Sep 16, 2016 at 7:12 AM, Kevin Wern  wrote:
>  static struct daemon_service daemon_service[] = {
> { "upload-archive", "uploadarch", upload_archive, 0, 1 },
> { "upload-pack", "uploadpack", upload_pack, 1, 1 },
> { "receive-pack", "receivepack", receive_pack, 0, 1 },
> +   { "prime-clone", "primeclone", prime_clone, 0, 1 },
>  };

I guess this is why you chose to implement a new command in 01/11,
simpler to be called from http-backend?

> +   // prime-clone does not need --stateless-rpc and
> +   // --advertise-refs options. Maybe it will in the future, but
> +   // until then it seems best to do this instead of adding
> +   // "dummy" options.

Stick to /* .. */

> +   if (strcmp(svc->name, "prime-clone") != 0) {
> +   argv_array_pushl(, "--stateless-rpc",
> +"--advertise-refs", NULL);
> +   }

We also have an exception for select_getanyfile() below. I think it's
time we add a function callback in struct rpc_service to run each
service the way they want. Then prime-clone won't need an exception
(neither does select_anyfile, mostly)
-- 
Duy


[PATCH 6/6] i18n: stash: mark messages for translation

2016-09-19 Thread Vasco Almeida
Signed-off-by: Vasco Almeida 
---
 git-stash.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-stash.sh b/git-stash.sh
index 826af18..90d63f2 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -100,7 +100,7 @@ create_stash () {
u_tree=$(git write-tree) &&
printf 'untracked files on %s\n' "$msg" | git 
commit-tree $u_tree  &&
rm -f "$TMPindex"
-   ) ) || die "Cannot save the untracked files"
+   ) ) || die "$(gettext "Cannot save the untracked files")"
 
untracked_commit_option="-p $u_commit";
else
@@ -248,7 +248,7 @@ save_stash () {
 
if test -n "$patch_mode" && test -n "$untracked"
then
-   die "Can't use --patch and --include-untracked or --all at the same 
time"
+   die "$(gettext "Can't use --patch and --include-untracked or 
--all at the same time")"
fi
 
stash_msg="$*"
@@ -494,7 +494,7 @@ apply_stash () {
GIT_INDEX_FILE="$TMPindex" git-read-tree "$u_tree" &&
GIT_INDEX_FILE="$TMPindex" git checkout-index --all &&
rm -f "$TMPindex" ||
-   die 'Could not restore untracked files from stash'
+   die "$(gettext "Could not restore untracked files from stash")"
fi
 
eval "
-- 
2.7.4



[PATCH 5/6] i18n: notes-merge: mark die messages for translation

2016-09-19 Thread Vasco Almeida
Update test to reflect changes.

Signed-off-by: Vasco Almeida 
---
 notes-merge.c | 8 
 t/t3310-notes-merge-manual-resolve.sh | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/notes-merge.c b/notes-merge.c
index 97fc42f..3bbeb86 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -269,15 +269,15 @@ static void check_notes_merge_worktree(struct 
notes_merge_options *o)
if (file_exists(git_path(NOTES_MERGE_WORKTREE)) &&
!is_empty_dir(git_path(NOTES_MERGE_WORKTREE))) {
if (advice_resolve_conflict)
-   die("You have not concluded your previous "
+   die(_("You have not concluded your previous "
"notes merge (%s exists).\nPlease, use "
"'git notes merge --commit' or 'git notes "
"merge --abort' to commit/abort the "
"previous merge before you start a new "
-   "notes merge.", git_path("NOTES_MERGE_*"));
+   "notes merge."), git_path("NOTES_MERGE_*"));
else
-   die("You have not concluded your notes merge "
-   "(%s exists).", git_path("NOTES_MERGE_*"));
+   die(_("You have not concluded your notes merge "
+   "(%s exists)."), git_path("NOTES_MERGE_*"));
}
 
if (safe_create_leading_directories_const(git_path(
diff --git a/t/t3310-notes-merge-manual-resolve.sh 
b/t/t3310-notes-merge-manual-resolve.sh
index 6967436..baef2d6 100755
--- a/t/t3310-notes-merge-manual-resolve.sh
+++ b/t/t3310-notes-merge-manual-resolve.sh
@@ -225,7 +225,7 @@ test_expect_success 'cannot do merge w/conflicts when 
previous merge is unfinish
test -d .git/NOTES_MERGE_WORKTREE &&
test_must_fail git notes merge z >output 2>&1 &&
# Output should indicate what is wrong
-   grep -q "\\.git/NOTES_MERGE_\\* exists" output
+   test_i18ngrep -q "\\.git/NOTES_MERGE_\\* exists" output
 '
 
 # Setup non-conflicting merge between x and new notes ref w
-- 
2.7.4



[PATCH 4/6] i18n: ident: mark hint for translation

2016-09-19 Thread Vasco Almeida
Mark env_hint for translation.

Signed-off-by: Vasco Almeida 
---
 ident.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/ident.c b/ident.c
index e20a772..92c3cca 100644
--- a/ident.c
+++ b/ident.c
@@ -331,17 +331,17 @@ int split_ident_line(struct ident_split *split, const 
char *line, int len)
 }
 
 static const char *env_hint =
-"\n"
-"*** Please tell me who you are.\n"
-"\n"
-"Run\n"
-"\n"
-"  git config --global user.email \"y...@example.com\"\n"
-"  git config --global user.name \"Your Name\"\n"
-"\n"
-"to set your account\'s default identity.\n"
-"Omit --global to set the identity only in this repository.\n"
-"\n";
+N_("\n"
+   "*** Please tell me who you are.\n"
+   "\n"
+   "Run\n"
+   "\n"
+   "  git config --global user.email \"y...@example.com\"\n"
+   "  git config --global user.name \"Your Name\"\n"
+   "\n"
+   "to set your account\'s default identity.\n"
+   "Omit --global to set the identity only in this repository.\n"
+   "\n");
 
 const char *fmt_ident(const char *name, const char *email,
  const char *date_str, int flag)
@@ -356,13 +356,13 @@ const char *fmt_ident(const char *name, const char *email,
if (!name) {
if (strict && ident_use_config_only
&& !(ident_config_given & IDENT_NAME_GIVEN)) {
-   fputs(env_hint, stderr);
+   fputs(_(env_hint), stderr);
die("no name was given and auto-detection is 
disabled");
}
name = ident_default_name();
using_default = 1;
if (strict && default_name_is_bogus) {
-   fputs(env_hint, stderr);
+   fputs(_(env_hint), stderr);
die("unable to auto-detect name (got '%s')", 
name);
}
}
@@ -370,7 +370,7 @@ const char *fmt_ident(const char *name, const char *email,
struct passwd *pw;
if (strict) {
if (using_default)
-   fputs(env_hint, stderr);
+   fputs(_(env_hint), stderr);
die("empty ident name (for <%s>) not allowed", 
email);
}
pw = xgetpwuid_self(NULL);
@@ -381,12 +381,12 @@ const char *fmt_ident(const char *name, const char *email,
if (!email) {
if (strict && ident_use_config_only
&& !(ident_config_given & IDENT_MAIL_GIVEN)) {
-   fputs(env_hint, stderr);
+   fputs(_(env_hint), stderr);
die("no email was given and auto-detection is 
disabled");
}
email = ident_default_email();
if (strict && default_email_is_bogus) {
-   fputs(env_hint, stderr);
+   fputs(_(env_hint), stderr);
die("unable to auto-detect email address (got '%s')", 
email);
}
}
-- 
2.7.4



[PATCH 3/6] i18n: diff: mark die errors for translation

2016-09-19 Thread Vasco Almeida
Signed-off-by: Vasco Almeida 
---
 diff.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/diff.c b/diff.c
index b4310f8..d82ad79 100644
--- a/diff.c
+++ b/diff.c
@@ -3325,7 +3325,7 @@ void diff_setup_done(struct diff_options *options)
if (options->output_format & DIFF_FORMAT_NO_OUTPUT)
count++;
if (count > 1)
-   die("--name-only, --name-status, --check and -s are mutually 
exclusive");
+   die(_("--name-only, --name-status, --check and -s are mutually 
exclusive"));
 
/*
 * Most of the time we can say "there are changes"
@@ -3521,7 +3521,7 @@ static int stat_opt(struct diff_options *options, const 
char **av)
if (*arg == '=')
width = strtoul(arg + 1, , 10);
else if (!*arg && !av[1])
-   die("Option '--stat-width' requires a value");
+   die(_("Option '--stat-width' requires a 
value"));
else if (!*arg) {
width = strtoul(av[1], , 10);
argcount = 2;
@@ -3530,7 +3530,7 @@ static int stat_opt(struct diff_options *options, const 
char **av)
if (*arg == '=')
name_width = strtoul(arg + 1, , 10);
else if (!*arg && !av[1])
-   die("Option '--stat-name-width' requires a 
value");
+   die(_("Option '--stat-name-width' requires a 
value"));
else if (!*arg) {
name_width = strtoul(av[1], , 10);
argcount = 2;
@@ -3539,7 +3539,7 @@ static int stat_opt(struct diff_options *options, const 
char **av)
if (*arg == '=')
graph_width = strtoul(arg + 1, , 10);
else if (!*arg && !av[1])
-   die("Option '--stat-graph-width' requires a 
value");
+   die(_("Option '--stat-graph-width' requires a 
value"));
else if (!*arg) {
graph_width = strtoul(av[1], , 10);
argcount = 2;
@@ -3548,7 +3548,7 @@ static int stat_opt(struct diff_options *options, const 
char **av)
if (*arg == '=')
count = strtoul(arg + 1, , 10);
else if (!*arg && !av[1])
-   die("Option '--stat-count' requires a value");
+   die(_("Option '--stat-count' requires a 
value"));
else if (!*arg) {
count = strtoul(av[1], , 10);
argcount = 2;
-- 
2.7.4



[PATCH 2/6] i18n: connect: mark die messages for translation

2016-09-19 Thread Vasco Almeida
Mark messages passed to die() in die_initial_contact().

Update test to reflect changes.

Signed-off-by: Vasco Almeida 
---
 connect.c| 8 
 t/t5512-ls-remote.sh | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/connect.c b/connect.c
index 722dc3f..06bff0b 100644
--- a/connect.c
+++ b/connect.c
@@ -46,11 +46,11 @@ int check_ref_type(const struct ref *ref, int flags)
 static void die_initial_contact(int got_at_least_one_head)
 {
if (got_at_least_one_head)
-   die("The remote end hung up upon initial contact");
+   die(_("The remote end hung up upon initial contact"));
else
-   die("Could not read from remote repository.\n\n"
-   "Please make sure you have the correct access rights\n"
-   "and the repository exists.");
+   die(_("Could not read from remote repository.\n\n"
+ "Please make sure you have the correct access rights\n"
+ "and the repository exists."));
 }
 
 static void parse_one_symref_info(struct string_list *symref, const char *val, 
int len)
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index 819b9dd..c23434b 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -99,7 +99,7 @@ test_expect_success 'confuses pattern as remote when no 
remote specified' '
# We could just as easily have used "master"; the "*" emphasizes its
# role as a pattern.
test_must_fail git ls-remote refs*master >actual 2>&1 &&
-   test_cmp exp actual
+   test_i18ncmp exp actual
 '
 
 test_expect_success 'die with non-2 for wrong repository even with 
--exit-code' '
-- 
2.7.4



[PATCH 1/6] i18n: commit: mark message for translation

2016-09-19 Thread Vasco Almeida
Mark message commit_utf8_warn for translation.

Update tests to reflect changes.

Signed-off-by: Vasco Almeida 
---
 commit.c   | 8 
 t/t3900-i18n-commit.sh | 8 
 t/t3901-i18n-patch.sh  | 2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/commit.c b/commit.c
index ba6dee3..8eb1707 100644
--- a/commit.c
+++ b/commit.c
@@ -1511,9 +1511,9 @@ static int verify_utf8(struct strbuf *buf)
 }
 
 static const char commit_utf8_warn[] =
-"Warning: commit message did not conform to UTF-8.\n"
-"You may want to amend it after fixing the message, or set the config\n"
-"variable i18n.commitencoding to the encoding your project uses.\n";
+N_("Warning: commit message did not conform to UTF-8.\n"
+   "You may want to amend it after fixing the message, or set the config\n"
+   "variable i18n.commitencoding to the encoding your project uses.\n");
 
 int commit_tree_extended(const char *msg, size_t msg_len,
 const unsigned char *tree,
@@ -1566,7 +1566,7 @@ int commit_tree_extended(const char *msg, size_t msg_len,
 
/* And check the encoding */
if (encoding_is_utf8 && !verify_utf8())
-   fprintf(stderr, commit_utf8_warn);
+   fprintf(stderr, _(commit_utf8_warn));
 
if (sign_commit && do_sign_commit(, sign_commit))
return -1;
diff --git a/t/t3900-i18n-commit.sh b/t/t3900-i18n-commit.sh
index 4bf1dbe..3b94283 100755
--- a/t/t3900-i18n-commit.sh
+++ b/t/t3900-i18n-commit.sh
@@ -45,7 +45,7 @@ test_expect_success 'UTF-8 invalid characters refused' '
printf "Commit message\n\nInvalid surrogate:\355\240\200\n" \
>"$HOME/invalid" &&
git commit -a -F "$HOME/invalid" 2>"$HOME"/stderr &&
-   grep "did not conform" "$HOME"/stderr
+   test_i18ngrep "did not conform" "$HOME"/stderr
 '
 
 test_expect_success 'UTF-8 overlong sequences rejected' '
@@ -55,7 +55,7 @@ test_expect_success 'UTF-8 overlong sequences rejected' '
printf "\340\202\251ommit message\n\nThis is not a space:\300\240\n" \
>"$HOME/invalid" &&
git commit -a -F "$HOME/invalid" 2>"$HOME"/stderr &&
-   grep "did not conform" "$HOME"/stderr
+   test_i18ngrep "did not conform" "$HOME"/stderr
 '
 
 test_expect_success 'UTF-8 non-characters refused' '
@@ -64,7 +64,7 @@ test_expect_success 'UTF-8 non-characters refused' '
printf "Commit message\n\nNon-character:\364\217\277\276\n" \
>"$HOME/invalid" &&
git commit -a -F "$HOME/invalid" 2>"$HOME"/stderr &&
-   grep "did not conform" "$HOME"/stderr
+   test_i18ngrep "did not conform" "$HOME"/stderr
 '
 
 test_expect_success 'UTF-8 non-characters refused' '
@@ -73,7 +73,7 @@ test_expect_success 'UTF-8 non-characters refused' '
printf "Commit message\n\nNon-character:\357\267\220\n" \
>"$HOME/invalid" &&
git commit -a -F "$HOME/invalid" 2>"$HOME"/stderr &&
-   grep "did not conform" "$HOME"/stderr
+   test_i18ngrep "did not conform" "$HOME"/stderr
 '
 
 for H in ISO8859-1 eucJP ISO-2022-JP
diff --git a/t/t3901-i18n-patch.sh b/t/t3901-i18n-patch.sh
index 509084e..f663d56 100755
--- a/t/t3901-i18n-patch.sh
+++ b/t/t3901-i18n-patch.sh
@@ -295,7 +295,7 @@ test_expect_success 'am --no-utf8 (U/L)' '
 
# commit-tree will warn that the commit message does not contain valid 
UTF-8
# as mailinfo did not convert it
-   grep "did not conform" err &&
+   test_i18ngrep "did not conform" err &&
 
check_encoding 2
 '
-- 
2.7.4



Re: Two bugs in --pretty with %C(auto)

2016-09-19 Thread Duy Nguyen
On Sun, Sep 18, 2016 at 1:25 AM, René Scharfe  wrote:
> Am 17.09.2016 um 14:51 schrieb Anatoly Borodin:
>> Hi All!
>>
>> First bug:
>>
>>   git log -3 --pretty='%C(cyan)%C(auto)%h%C(auto)%d %s'
>>
>> prints %h with the default color (normal yellow), but
>>
>>   git log -3 --pretty='%C(bold cyan)%C(auto)%h%C(auto)%d %s'
>>
>> shows %h with bold yellow, as if only the color was reset, but not
>> the attributes (blink, ul, reverse also work this way). %d and %s are
>> printed with the right color both times.
>>
>> Second bug, maybe related to the first one:
>>
>>   git log -3 --pretty='%C(bold cyan)%h%C(auto)%d %s %an %h %h %s'
>>
>> The first line looks as expected. Well, almost: the '(' of %d is bold
>> yellow.
>>
>> The second line looks like this:
>>
>> * %h, %s, %an with bold cyan;
>> * %h with bold yellow;
>> * %h with normal yellow and %s with normal white (default colors).
>>
>> PS git version 2.9.2
>
> Well, in both cases you could add %Creset before %C(auto) to get what
> you want.
>
> I'm not sure how just how automatic %C(auto) is supposed to be, but you
> expected it do emit the reset for you, right?  Sounds reasonable to me.
> The following patch implements that behavior.
>
> Duy, what do you think?

Even though letting some attributes before %C(auto) through sounds
interesting, I'd say it's a bit unpredictable, especially when the
main usage of %C(auto) is %d which could use plenty of colors. So yes,
your changes look good.
-- 
Duy


clarification of `rev-list --no-walk ^`?

2016-09-19 Thread Philip Oakley

A question came up on the Git user list regarding cherry-pick that got me
reading the manual (again), in this case regarding --no-walk ranges.

Essentially my question is: If --no-walk is given to rev-list (e.g. via
charry-pick), and the user includes a caret prefixed rev, when does that
range definition take effect on the command line, especially in light of
the --do-walk option?

In rev-list(1) there are only 8 references to  'range', with only
the --no-walk option saying "This has no effect if a range is specified."
but leaving open the decision as to what does (and does not) comprises the
specification of a range on the cli.

The two and three dot notations are fairly obvious ranges from
gitrevisions(7) as they are complete strings, while the caret prefix is an 
implied range (it needs additional parameters to complete the range, and 
there-in lies the issue).


It can be read that

$ git cherry-pick maint next

would pick two single commits, while

$ git cherry-pick maint next ^master

could implicitly be read as

$ git cherry-pick maint next --do-walk ^master

because the ^ caret starts the range that cancels the --no-walk.

Clearly that's not what is intended, which is

$ git cherry-pick --do-walk maint next ^master

but it is open to interpretation as to where in the command line the caret
range prefix's --do-walk (to countermand the --no-walk) should applied.

If the user did want just the single commit at the tip of maint, and then
the range master..next, what would be their command line, and also, how
would the man page warn against false expectations?

--

Philip







Re: [PATCH] mailinfo: unescape quoted-pair in header fields

2016-09-19 Thread Kevin Daudt
Thanks for the review

On Fri, Sep 16, 2016 at 03:22:06PM -0700, Jeff King wrote:
> On Fri, Sep 16, 2016 at 11:02:04PM +0200, Kevin Daudt wrote:
> 
> >  mailinfo.c | 54 
> > ++
> >  t/t5100-mailinfo.sh|  6 ++
> >  t/t5100/quoted-pair.expect |  5 +
> >  t/t5100/quoted-pair.in |  9 
> >  4 files changed, 74 insertions(+)
> >  create mode 100644 t/t5100/quoted-pair.expect
> >  create mode 100644 t/t5100/quoted-pair.in
> > 
> > diff --git a/mailinfo.c b/mailinfo.c
> > index e19abe3..04036f3 100644
> > --- a/mailinfo.c
> > +++ b/mailinfo.c
> > @@ -54,15 +54,69 @@ static void parse_bogus_from(struct mailinfo *mi, const 
> > struct strbuf *line)
> > get_sane_name(>name, >name, >email);
> >  }
> >  
> > +static int unquote_quoted_string(struct strbuf *line)
> > +{
> > +   struct strbuf outbuf;
> > +   const char *in = line->buf;
> > +   int c, take_next_literally = 0;
> > +   int found_error = 0;
> > +   char escape_context=0;
> 
> Style: whitespace around "=".
> 
> I had to wonder why we needed both escape_context and
> take_next_literally; shouldn't we just need a single state bit. But
> escape_context is not "escape the next character", it is "we are
> currently in a mode where we should be escaping".
> 
> Could we give it a more descriptive name? I guess it is more than just
> "we are in a mode", but rather "here is the character that will end the
> escaped mode". Maybe a comment would be more appropriate.
> 

Yes, your analysis is right, we need to know what character would end
the 'escape context'. I'll add a comment.

> > +   while ((c = *in++) != 0) {
> > +   if (take_next_literally) {
> > +   take_next_literally = 0;
> > +   } else {
> 
> OK, so that means the previous one was backslash-quoted, and we don't do
> any other cleverness. Good.
> 
> > +   switch (c) {
> > +   case '"':
> > +   if (!escape_context)
> > +   escape_context = '"';
> > +   else if (escape_context == '"')
> > +   escape_context = 0;
> > +   continue;
> 
> And here we open or close the quoted portion, depending. Makes sense.
> 
> > +   case '\\':
> > +   if (escape_context) {
> > +   take_next_literally = 1;
> > +   continue;
> > +   }
> > +   break;
> 
> I didn't look in the RFC. Is:
> 
>   From: my \"name\" 
> 
> really the same as:
> 
>   From: "my \\\"name\\\"" 
> 
> ? That seems weird, but I think it may be that the former is simply
> bogus (you are not supposed to use backslashes outside of the quoted
> section at all).

Correct, the quoted-pair (escape sequence) can only occur in a quoted
string or a comment. Even more so, the display name *needs* to be quoted
when consisting of more then one word according to the RFC.

> 
> > +   case '(':
> > +   if (!escape_context)
> > +   escape_context = '(';
> > +   else if (escape_context == '(')
> > +   found_error = 1;
> > +   break;
> 
> Hmm. Is:
> 
>   From: Name (Comment with (another comment))
> 
> really disallowed? RFC2822 seems to say that "comment" can contain
> "ccontent", which can itself be a comment.

Yes, you are right, it is allowed, I was just looking at the ctext when
adding this, but failed to see that comments can be nested at that time.

> 
> This is obviously getting pretty silly, but if we are going to follow
> the RFC, I think you actually have to do a recursive parse, and keep
> track of an arbitrary depth of context.
> 
> I dunno. This method probably covers most cases in practice, and it's
> easy to reason about.

The problem is, how do you differentiate between nested comments, and
escaped braces within a comment after one run?
> 
> > +   case ')':
> > +   if (escape_context == '(')
> > +   escape_context = 0;
> > +   break;
> > +   }
> > +   }
> > +
> > +   strbuf_addch(, c);
> > +   }
> > +
> > +   strbuf_reset(line);
> > +   strbuf_addbuf(line, );
> > +   strbuf_release();
> 
> I think you can use strbuf_swap() here to avoid copying the line an
> extra time, like:
> 
>   strbuf_swap(line, );
>   strbuf_release();
> 
> Another option would be to just:
> 
>   in = strbuf_detach();
> 
> at the beginning, and then output back into "line".
> 

Thanks, I just looked at what other functions were doing, but this is
much better indeed.

> > +   return found_error;
> 
> What happens when we get here and 

How can I make a best dissertation paper?

2016-09-19 Thread elizabethgillespe
Dissertation writing is not a difficult task. You can make a best
dissertation paper by practicing with the dissertation writing task. The
format of the dissertation writing is same as the essay writing with the
headers dissertation introduction, body and conclusion. You can get best
sample papers and dissertation writing tips from the online  dissertation
writing help   .



--
View this message in context: 
http://git.661346.n2.nabble.com/How-can-I-make-a-best-dissertation-paper-tp7657453.html
Sent from the git mailing list archive at Nabble.com.


Re: [PATCH v2] format-patch: Add --rfc for the common case of [RFC PATCH]

2016-09-19 Thread Andrew Donnellan

On 17/09/16 17:21, Josh Triplett wrote:

This provides a shorter and more convenient alias for
--subject-prefix='RFC PATCH'.

Includes documentation in the format-patch manpage, and a new test
covering --rfc.

Signed-off-by: Josh Triplett 


Sounds good to me. Agreed that "RFC" is essentially the only prefix 
other than "PATCH" that I see, at least in the kernel.


I don't have anything to say about the code, though I did note that 
there's a error message stating that "--subject-prefix and -k are 
mutually exclusive." - I haven't tested the patch, but I imagine this 
message will trigger with --rfc as well and could be slightly confusing.



---
v2:
- Add documentation to the format-patch manpage
- Call subject_prefix_callback rather than reimplementing it
- Update test to move expectations inside

 Documentation/git-format-patch.txt |  8 +++-
 builtin/log.c  |  8 
 t/t4014-format-patch.sh|  9 +
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-format-patch.txt 
b/Documentation/git-format-patch.txt
index 9624c84..b9590a5 100644
--- a/Documentation/git-format-patch.txt
+++ b/Documentation/git-format-patch.txt
@@ -19,7 +19,8 @@ SYNOPSIS
   [--start-number ] [--numbered-files]
   [--in-reply-to=Message-Id] [--suffix=.]
   [--ignore-if-in-upstream]
-  [--subject-prefix=Subject-Prefix] [(--reroll-count|-v) ]
+  [--rfc] [--subject-prefix=Subject-Prefix]
+  [(--reroll-count|-v) ]
   [--to=] [--cc=]
   [--[no-]cover-letter] [--quiet] [--notes[=]]
   []
@@ -172,6 +173,11 @@ will want to ensure that threading is disabled for `git 
send-email`.
allows for useful naming of a patch series, and can be
combined with the `--numbered` option.

+--rfc::
+   Alias for `--subject-prefix="RFC PATCH"`. Use this when
+   sending an experimental patch for discussion rather than
+   application.


Perhaps mention the phrase "Request For Comment" for the benefit of 
those who aren't familiar (which admittedly, among users of 
git-format-patch, are probably rather few, but still).


--
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited