Re: [PATCH v2 1/4] builtin: add git-check-mailmap command

2013-07-12 Thread Eric Sunshine
On Fri, Jul 12, 2013 at 1:47 AM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 For each contact information (either in the form of ``Name
 user@host'' or ...)

 in order to clarify that the two forms of input is what you call
 contact information.

 Is this easier to read?

 For each ``Name $$user@host$$'' or ``$$user@host$$'' from the
 command-line or standard input (when using `--stdin`), print a line
 showing either the canonical name and email address (see Mapping
 Authors below), or the input ``Name $$user@host$$'' or
 ``$$user@host$$'' if there is no mapping for that person.

 I find it easier than your original, but I do not know if you would
 want to repeat the Name... or user@host at the end.  It does not
 seem to add much useful information and is distracting.

Next attempt:

For each ``Name $$user@host$$'' or ``$$user@host$$'' from the
command-line or standard input (when using `--stdin`) look up the
person's canonical name and email address (see Mapping Authors
below). If found, print them; otherwise print the input as-is.

 In check-attr, null_term_line indicates that _input_ lines are
 null-terminated. In check-ignore, null_term_lines is overloaded (and
 perhaps abused) to mean that both _input_ and _output_ lines are
 null-terminated.

 That is unfortunate but it is good that you found the breakage.  As
 we do not have --nul-terminated-input and --nul-terminated-output
 options separtely, -z should apply to both input and output.  What
 b4666852 (check-attr: Add --stdin option, 2008-10-07) did is broken.

I can make git-check-mailmap behave this way, however, other than
git-check-ignore (which is quite new), there doesn't seem to be any
precedence (that I can find) anywhere else in git which ties input and
output null-termination to a single switch. Is it desirable to do so
or should the user have more fine-grained control? (xargs -0 comes
to mind when thinking of a null-termination input switch.)

 Also git check-ignore -h advertises -z as only affecting --stdin,
 which is also wrong.  It does affect both input and output as it should,
 so it should be described as such, I think.

I also noticed this. (It was copied from check-attr.c).
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/4] builtin: add git-check-mailmap command

2013-07-12 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 I find it easier than your original, but I do not know if you would
 want to repeat the Name... or user@host at the end.  It does not
 seem to add much useful information and is distracting.

 Next attempt:

 For each ``Name $$user@host$$'' or ``$$user@host$$'' from the
 command-line or standard input (when using `--stdin`) look up the
 person's canonical name and email address (see Mapping Authors
 below). If found, print them; otherwise print the input as-is.

Nice.

 ... Is it desirable to do so
 or should the user have more fine-grained control? (xargs -0 comes
 to mind when thinking of a null-termination input switch.)

For the purposes of check-attr and check-ignore, a single -z
governing both is sufficient.  I think you already got that from my
4-patch series, but the core reason for that is :

 - when -z is used, the user knows the input paths may need
   protection against LF.

 - our output contains these same paths.

 - which means our output cannot be expressed unambiguously using LF
   as record separator.

For the purpose of check-mailmap, I actually do not see much point
in supporting -z format.  We do not even handle names or addresses
with LF in it.  The mailmap format would not let you express such
records in the first place, no?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/4] builtin: add git-check-mailmap command

2013-07-12 Thread Eric Sunshine
On Fri, Jul 12, 2013 at 2:34 AM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 ... Is it desirable to do so
 or should the user have more fine-grained control? (xargs -0 comes
 to mind when thinking of a null-termination input switch.)

 For the purposes of check-attr and check-ignore, a single -z
 governing both is sufficient.  I think you already got that from my
 4-patch series, but the core reason for that is :

I'm reading it right now.

  - when -z is used, the user knows the input paths may need
protection against LF.

  - our output contains these same paths.

  - which means our output cannot be expressed unambiguously using LF
as record separator.

 For the purpose of check-mailmap, I actually do not see much point
 in supporting -z format.  We do not even handle names or addresses
 with LF in it.  The mailmap format would not let you express such
 records in the first place, no?

You're right. I had this exact argument in mind for why
null-termination was not needed on the input side of check-mailmap,
but for some reason I had a blind spot concerning the output side.
I'll drop this option in the next re-roll.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/4] builtin: add git-check-mailmap command

2013-07-11 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 +DESCRIPTION
 +---
 +
 +For each ``Name $$email@address$$'' or ``$$email@address$$'' provided on
 +the command-line or standard input (when using `--stdin`), prints a line with
 +the canonical contact information for that person according to the 'mailmap'
 +(see Mapping Authors below). If no mapping exists for the person, echoes 
 the
 +provided contact information.

OK.  The last part needed a reading and a half for me to realize the
echoes the provided contact information is the same thing as the
input string is printed as-is, especially because contact
information is not defined anywhere in the previous part of the
DESCRIPTION section, though.  I might be worth starting the
paragraph with:

For each contact information (either in the form of ``Name
user@host'' or ...)

in order to clarify that the two forms of input is what you call
contact information.

 diff --git a/builtin/check-mailmap.c b/builtin/check-mailmap.c
 new file mode 100644
 index 000..c36a61c
 --- /dev/null
 +++ b/builtin/check-mailmap.c
 @@ -0,0 +1,69 @@
 +#include builtin.h
 +#include mailmap.h
 +#include parse-options.h
 +#include string-list.h
 +
 +static int use_stdin;
 +static int null_out;

Is there a reason why the variable name should not match that of the
corresponding variables in check-ignore and check-attr, which you
copied the bulk of the code from?

If there isn't, use null_term_line like they do.

 +static const char * const check_mailmap_usage[] = {
 +N_(git check-mailmap [options] contact...),
 +NULL
 +};

This mis-indentation looks somewhat unfortunate, but they are shared
with other check-blah, and they are meant to contain a rather long
string, so let's keep it like so.

 +static void check_mailmap(struct string_list *mailmap, const char *contact)
 +{
 + const char *name, *mail;
 + size_t namelen, maillen;
 + struct ident_split ident;
 + char term = null_out ? '\0' : '\n';

Is there a reason why the variable name term should not match that
of the corresponding variables in check-ignore and check-attr, which
you copied the bulk of the code from?

If there isn't, use line_termination like they do.

 + if (split_ident_line(ident, contact, strlen(contact)))
 + die(_(unable to parse contact: %s), contact);
 +
 + name = ident.name_begin;
 + namelen = ident.name_end - ident.name_begin;
 + mail = ident.mail_begin;
 + maillen = ident.mail_end - ident.mail_begin;
 +
 + map_user(mailmap, mail, maillen, name, namelen);
 +
 + if (namelen)
 + printf(%.*s , (int)namelen, name);
 + printf(%.*s%c, (int)maillen, mail, term);
 +}
 +
 +int cmd_check_mailmap(int argc, const char **argv, const char *prefix)
 +{
 + int i;
 + struct string_list mailmap = STRING_LIST_INIT_NODUP;
 +
 + git_config(git_default_config, NULL);
 + argc = parse_options(argc, argv, prefix, check_mailmap_options,
 + check_mailmap_usage, 0);

It is a bit distracting that the second line that is not indented
enough.  Either (1) with enough HT and with minimum number of SP,
align argc and check_mailmap_usage, or (2) with minimum number
of HT and no SP, push check_mailmap_usage to the right of opening
parenthesis of (argc.  Our code tend to prefer (1).

 + if (argc == 0  !use_stdin)
 + die(_(no contacts specified));
 +
 + read_mailmap(mailmap, NULL);
 +
 + for (i = 0; i  argc; ++i)
 + check_mailmap(mailmap, argv[i]);
 + maybe_flush_or_die(stdout, stdout);
 +
 + if (use_stdin) {
 + struct strbuf buf = STRBUF_INIT;
 + while (strbuf_getline(buf, stdin, '\n') != EOF) {
 + check_mailmap(mailmap, buf.buf);
 + maybe_flush_or_die(stdout, stdout);
 + }
 + strbuf_release(buf);
 + }
 +
 + clear_mailmap(mailmap);
 + return 0;
 +}
 diff --git a/command-list.txt b/command-list.txt
 index bf83303..08b04e2 100644
 --- a/command-list.txt
 +++ b/command-list.txt
 @@ -13,6 +13,7 @@ git-bundle  mainporcelain
  git-cat-fileplumbinginterrogators
  git-check-attr  purehelpers
  git-check-ignorepurehelpers
 +git-check-mailmap   purehelpers
  git-checkoutmainporcelain common
  git-checkout-index  plumbingmanipulators
  git-check-ref-formatpurehelpers
 diff --git a/contrib/completion/git-completion.bash 
 b/contrib/completion/git-completion.bash
 index ebc40d4..c118550 100644
 --- a/contrib/completion/git-completion.bash
 +++ b/contrib/completion/git-completion.bash
 @@ -648,6 +648,7 @@ __git_list_porcelain_commands ()
   cat-file) : plumbing;;
   check-attr)   : plumbing;;
   check-ignore) : plumbing;;
 + check-mailmap): 

Re: [PATCH v2 1/4] builtin: add git-check-mailmap command

2013-07-11 Thread Eric Sunshine
On Thu, Jul 11, 2013 at 3:04 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 +DESCRIPTION
 +---
 +
 +For each ``Name $$email@address$$'' or ``$$email@address$$'' provided on
 +the command-line or standard input (when using `--stdin`), prints a line 
 with
 +the canonical contact information for that person according to the 'mailmap'
 +(see Mapping Authors below). If no mapping exists for the person, echoes 
 the
 +provided contact information.

 OK.  The last part needed a reading and a half for me to realize the
 echoes the provided contact information is the same thing as the
 input string is printed as-is, especially because contact
 information is not defined anywhere in the previous part of the
 DESCRIPTION section, though.  I might be worth starting the
 paragraph with:

 For each contact information (either in the form of ``Name
 user@host'' or ...)

 in order to clarify that the two forms of input is what you call
 contact information.

Is this easier to read?

For each ``Name $$user@host$$'' or ``$$user@host$$'' from the
command-line or standard input (when using `--stdin`), print a line
showing either the canonical name and email address (see Mapping
Authors below), or the input ``Name $$user@host$$'' or
``$$user@host$$'' if there is no mapping for that person.

 diff --git a/builtin/check-mailmap.c b/builtin/check-mailmap.c
 new file mode 100644
 index 000..c36a61c
 --- /dev/null
 +++ b/builtin/check-mailmap.c
 @@ -0,0 +1,69 @@
 +#include builtin.h
 +#include mailmap.h
 +#include parse-options.h
 +#include string-list.h
 +
 +static int use_stdin;
 +static int null_out;

 Is there a reason why the variable name should not match that of the
 corresponding variables in check-ignore and check-attr, which you
 copied the bulk of the code from?

 If there isn't, use null_term_line like they do.

In check-attr, null_term_line indicates that _input_ lines are
null-terminated. In check-ignore, null_term_lines is overloaded (and
perhaps abused) to mean that both _input_ and _output_ lines are
null-terminated. In check-mailmap, -z affects only _output_ lines. A
reader of the code can easily be misled into thinking that the
variable controls the same function in all three programs, hence
null_out was chosen to make it clear that it applies only to output. A
lesser reason is that, in the future, someone might add an option to
null terminate input lines (distinct from output), and null_in would
be a reasonable name for that variable.

Other than the above reasoning, I don't feel strongly about it and can
revert the name if you prefer.

 +static void check_mailmap(struct string_list *mailmap, const char *contact)
 +{
 + const char *name, *mail;
 + size_t namelen, maillen;
 + struct ident_split ident;
 + char term = null_out ? '\0' : '\n';

 Is there a reason why the variable name term should not match that
 of the corresponding variables in check-ignore and check-attr, which
 you copied the bulk of the code from?

 If there isn't, use line_termination like they do.

No strong justification. As with 'buf' vs. 'buffer',
'line_termination' required more reading effort without conveying any
more information than 'term'.

 + if (split_ident_line(ident, contact, strlen(contact)))
 + die(_(unable to parse contact: %s), contact);
 +
 + name = ident.name_begin;
 + namelen = ident.name_end - ident.name_begin;
 + mail = ident.mail_begin;
 + maillen = ident.mail_end - ident.mail_begin;
 +
 + map_user(mailmap, mail, maillen, name, namelen);
 +
 + if (namelen)
 + printf(%.*s , (int)namelen, name);
 + printf(%.*s%c, (int)maillen, mail, term);
 +}
 +
 +int cmd_check_mailmap(int argc, const char **argv, const char *prefix)
 +{
 + int i;
 + struct string_list mailmap = STRING_LIST_INIT_NODUP;
 +
 + git_config(git_default_config, NULL);
 + argc = parse_options(argc, argv, prefix, check_mailmap_options,
 + check_mailmap_usage, 0);

 It is a bit distracting that the second line that is not indented
 enough.  Either (1) with enough HT and with minimum number of SP,
 align argc and check_mailmap_usage, or (2) with minimum number
 of HT and no SP, push check_mailmap_usage to the right of opening
 parenthesis of (argc.  Our code tend to prefer (1).

Okay.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/4] builtin: add git-check-mailmap command

2013-07-11 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 For each contact information (either in the form of ``Name
 user@host'' or ...)

 in order to clarify that the two forms of input is what you call
 contact information.

 Is this easier to read?

 For each ``Name $$user@host$$'' or ``$$user@host$$'' from the
 command-line or standard input (when using `--stdin`), print a line
 showing either the canonical name and email address (see Mapping
 Authors below), or the input ``Name $$user@host$$'' or
 ``$$user@host$$'' if there is no mapping for that person.

I find it easier than your original, but I do not know if you would
want to repeat the Name... or user@host at the end.  It does not
seem to add much useful information and is distracting.

 If there isn't, use null_term_line like they do.

 In check-attr, null_term_line indicates that _input_ lines are
 null-terminated. In check-ignore, null_term_lines is overloaded (and
 perhaps abused) to mean that both _input_ and _output_ lines are
 null-terminated.

That is unfortunate but it is good that you found the breakage.  As
we do not have --nul-terminated-input and --nul-terminated-output
options separtely, -z should apply to both input and output.  What
b4666852 (check-attr: Add --stdin option, 2008-10-07) did is broken.
What check-ignore does 

We should find a way to fix it.  I have a feeling that silently
fixing it and seeing if anybody screams might be the best course of
action ;-).

Also git check-ignore -h advertises -z as only affecting --stdin,
which is also wrong.  It does affect both input and output as it should,
so it should be described as such, I think.

Thanks for noticing.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html