Re: [PATCH v2 1/4] builtin: add git-check-mailmap command
On Fri, Jul 12, 2013 at 1:47 AM, Junio C Hamano gits...@pobox.com wrote: Eric Sunshine sunsh...@sunshineco.com writes: For each contact information (either in the form of ``Name user@host'' or ...) in order to clarify that the two forms of input is what you call contact information. Is this easier to read? For each ``Name $$user@host$$'' or ``$$user@host$$'' from the command-line or standard input (when using `--stdin`), print a line showing either the canonical name and email address (see Mapping Authors below), or the input ``Name $$user@host$$'' or ``$$user@host$$'' if there is no mapping for that person. I find it easier than your original, but I do not know if you would want to repeat the Name... or user@host at the end. It does not seem to add much useful information and is distracting. Next attempt: For each ``Name $$user@host$$'' or ``$$user@host$$'' from the command-line or standard input (when using `--stdin`) look up the person's canonical name and email address (see Mapping Authors below). If found, print them; otherwise print the input as-is. In check-attr, null_term_line indicates that _input_ lines are null-terminated. In check-ignore, null_term_lines is overloaded (and perhaps abused) to mean that both _input_ and _output_ lines are null-terminated. That is unfortunate but it is good that you found the breakage. As we do not have --nul-terminated-input and --nul-terminated-output options separtely, -z should apply to both input and output. What b4666852 (check-attr: Add --stdin option, 2008-10-07) did is broken. I can make git-check-mailmap behave this way, however, other than git-check-ignore (which is quite new), there doesn't seem to be any precedence (that I can find) anywhere else in git which ties input and output null-termination to a single switch. Is it desirable to do so or should the user have more fine-grained control? (xargs -0 comes to mind when thinking of a null-termination input switch.) Also git check-ignore -h advertises -z as only affecting --stdin, which is also wrong. It does affect both input and output as it should, so it should be described as such, I think. I also noticed this. (It was copied from check-attr.c). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] builtin: add git-check-mailmap command
Eric Sunshine sunsh...@sunshineco.com writes: I find it easier than your original, but I do not know if you would want to repeat the Name... or user@host at the end. It does not seem to add much useful information and is distracting. Next attempt: For each ``Name $$user@host$$'' or ``$$user@host$$'' from the command-line or standard input (when using `--stdin`) look up the person's canonical name and email address (see Mapping Authors below). If found, print them; otherwise print the input as-is. Nice. ... Is it desirable to do so or should the user have more fine-grained control? (xargs -0 comes to mind when thinking of a null-termination input switch.) For the purposes of check-attr and check-ignore, a single -z governing both is sufficient. I think you already got that from my 4-patch series, but the core reason for that is : - when -z is used, the user knows the input paths may need protection against LF. - our output contains these same paths. - which means our output cannot be expressed unambiguously using LF as record separator. For the purpose of check-mailmap, I actually do not see much point in supporting -z format. We do not even handle names or addresses with LF in it. The mailmap format would not let you express such records in the first place, no? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] builtin: add git-check-mailmap command
On Fri, Jul 12, 2013 at 2:34 AM, Junio C Hamano gits...@pobox.com wrote: Eric Sunshine sunsh...@sunshineco.com writes: ... Is it desirable to do so or should the user have more fine-grained control? (xargs -0 comes to mind when thinking of a null-termination input switch.) For the purposes of check-attr and check-ignore, a single -z governing both is sufficient. I think you already got that from my 4-patch series, but the core reason for that is : I'm reading it right now. - when -z is used, the user knows the input paths may need protection against LF. - our output contains these same paths. - which means our output cannot be expressed unambiguously using LF as record separator. For the purpose of check-mailmap, I actually do not see much point in supporting -z format. We do not even handle names or addresses with LF in it. The mailmap format would not let you express such records in the first place, no? You're right. I had this exact argument in mind for why null-termination was not needed on the input side of check-mailmap, but for some reason I had a blind spot concerning the output side. I'll drop this option in the next re-roll. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] builtin: add git-check-mailmap command
Eric Sunshine sunsh...@sunshineco.com writes: +DESCRIPTION +--- + +For each ``Name $$email@address$$'' or ``$$email@address$$'' provided on +the command-line or standard input (when using `--stdin`), prints a line with +the canonical contact information for that person according to the 'mailmap' +(see Mapping Authors below). If no mapping exists for the person, echoes the +provided contact information. OK. The last part needed a reading and a half for me to realize the echoes the provided contact information is the same thing as the input string is printed as-is, especially because contact information is not defined anywhere in the previous part of the DESCRIPTION section, though. I might be worth starting the paragraph with: For each contact information (either in the form of ``Name user@host'' or ...) in order to clarify that the two forms of input is what you call contact information. diff --git a/builtin/check-mailmap.c b/builtin/check-mailmap.c new file mode 100644 index 000..c36a61c --- /dev/null +++ b/builtin/check-mailmap.c @@ -0,0 +1,69 @@ +#include builtin.h +#include mailmap.h +#include parse-options.h +#include string-list.h + +static int use_stdin; +static int null_out; Is there a reason why the variable name should not match that of the corresponding variables in check-ignore and check-attr, which you copied the bulk of the code from? If there isn't, use null_term_line like they do. +static const char * const check_mailmap_usage[] = { +N_(git check-mailmap [options] contact...), +NULL +}; This mis-indentation looks somewhat unfortunate, but they are shared with other check-blah, and they are meant to contain a rather long string, so let's keep it like so. +static void check_mailmap(struct string_list *mailmap, const char *contact) +{ + const char *name, *mail; + size_t namelen, maillen; + struct ident_split ident; + char term = null_out ? '\0' : '\n'; Is there a reason why the variable name term should not match that of the corresponding variables in check-ignore and check-attr, which you copied the bulk of the code from? If there isn't, use line_termination like they do. + if (split_ident_line(ident, contact, strlen(contact))) + die(_(unable to parse contact: %s), contact); + + name = ident.name_begin; + namelen = ident.name_end - ident.name_begin; + mail = ident.mail_begin; + maillen = ident.mail_end - ident.mail_begin; + + map_user(mailmap, mail, maillen, name, namelen); + + if (namelen) + printf(%.*s , (int)namelen, name); + printf(%.*s%c, (int)maillen, mail, term); +} + +int cmd_check_mailmap(int argc, const char **argv, const char *prefix) +{ + int i; + struct string_list mailmap = STRING_LIST_INIT_NODUP; + + git_config(git_default_config, NULL); + argc = parse_options(argc, argv, prefix, check_mailmap_options, + check_mailmap_usage, 0); It is a bit distracting that the second line that is not indented enough. Either (1) with enough HT and with minimum number of SP, align argc and check_mailmap_usage, or (2) with minimum number of HT and no SP, push check_mailmap_usage to the right of opening parenthesis of (argc. Our code tend to prefer (1). + if (argc == 0 !use_stdin) + die(_(no contacts specified)); + + read_mailmap(mailmap, NULL); + + for (i = 0; i argc; ++i) + check_mailmap(mailmap, argv[i]); + maybe_flush_or_die(stdout, stdout); + + if (use_stdin) { + struct strbuf buf = STRBUF_INIT; + while (strbuf_getline(buf, stdin, '\n') != EOF) { + check_mailmap(mailmap, buf.buf); + maybe_flush_or_die(stdout, stdout); + } + strbuf_release(buf); + } + + clear_mailmap(mailmap); + return 0; +} diff --git a/command-list.txt b/command-list.txt index bf83303..08b04e2 100644 --- a/command-list.txt +++ b/command-list.txt @@ -13,6 +13,7 @@ git-bundle mainporcelain git-cat-fileplumbinginterrogators git-check-attr purehelpers git-check-ignorepurehelpers +git-check-mailmap purehelpers git-checkoutmainporcelain common git-checkout-index plumbingmanipulators git-check-ref-formatpurehelpers diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index ebc40d4..c118550 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -648,6 +648,7 @@ __git_list_porcelain_commands () cat-file) : plumbing;; check-attr) : plumbing;; check-ignore) : plumbing;; + check-mailmap):
Re: [PATCH v2 1/4] builtin: add git-check-mailmap command
On Thu, Jul 11, 2013 at 3:04 PM, Junio C Hamano gits...@pobox.com wrote: Eric Sunshine sunsh...@sunshineco.com writes: +DESCRIPTION +--- + +For each ``Name $$email@address$$'' or ``$$email@address$$'' provided on +the command-line or standard input (when using `--stdin`), prints a line with +the canonical contact information for that person according to the 'mailmap' +(see Mapping Authors below). If no mapping exists for the person, echoes the +provided contact information. OK. The last part needed a reading and a half for me to realize the echoes the provided contact information is the same thing as the input string is printed as-is, especially because contact information is not defined anywhere in the previous part of the DESCRIPTION section, though. I might be worth starting the paragraph with: For each contact information (either in the form of ``Name user@host'' or ...) in order to clarify that the two forms of input is what you call contact information. Is this easier to read? For each ``Name $$user@host$$'' or ``$$user@host$$'' from the command-line or standard input (when using `--stdin`), print a line showing either the canonical name and email address (see Mapping Authors below), or the input ``Name $$user@host$$'' or ``$$user@host$$'' if there is no mapping for that person. diff --git a/builtin/check-mailmap.c b/builtin/check-mailmap.c new file mode 100644 index 000..c36a61c --- /dev/null +++ b/builtin/check-mailmap.c @@ -0,0 +1,69 @@ +#include builtin.h +#include mailmap.h +#include parse-options.h +#include string-list.h + +static int use_stdin; +static int null_out; Is there a reason why the variable name should not match that of the corresponding variables in check-ignore and check-attr, which you copied the bulk of the code from? If there isn't, use null_term_line like they do. In check-attr, null_term_line indicates that _input_ lines are null-terminated. In check-ignore, null_term_lines is overloaded (and perhaps abused) to mean that both _input_ and _output_ lines are null-terminated. In check-mailmap, -z affects only _output_ lines. A reader of the code can easily be misled into thinking that the variable controls the same function in all three programs, hence null_out was chosen to make it clear that it applies only to output. A lesser reason is that, in the future, someone might add an option to null terminate input lines (distinct from output), and null_in would be a reasonable name for that variable. Other than the above reasoning, I don't feel strongly about it and can revert the name if you prefer. +static void check_mailmap(struct string_list *mailmap, const char *contact) +{ + const char *name, *mail; + size_t namelen, maillen; + struct ident_split ident; + char term = null_out ? '\0' : '\n'; Is there a reason why the variable name term should not match that of the corresponding variables in check-ignore and check-attr, which you copied the bulk of the code from? If there isn't, use line_termination like they do. No strong justification. As with 'buf' vs. 'buffer', 'line_termination' required more reading effort without conveying any more information than 'term'. + if (split_ident_line(ident, contact, strlen(contact))) + die(_(unable to parse contact: %s), contact); + + name = ident.name_begin; + namelen = ident.name_end - ident.name_begin; + mail = ident.mail_begin; + maillen = ident.mail_end - ident.mail_begin; + + map_user(mailmap, mail, maillen, name, namelen); + + if (namelen) + printf(%.*s , (int)namelen, name); + printf(%.*s%c, (int)maillen, mail, term); +} + +int cmd_check_mailmap(int argc, const char **argv, const char *prefix) +{ + int i; + struct string_list mailmap = STRING_LIST_INIT_NODUP; + + git_config(git_default_config, NULL); + argc = parse_options(argc, argv, prefix, check_mailmap_options, + check_mailmap_usage, 0); It is a bit distracting that the second line that is not indented enough. Either (1) with enough HT and with minimum number of SP, align argc and check_mailmap_usage, or (2) with minimum number of HT and no SP, push check_mailmap_usage to the right of opening parenthesis of (argc. Our code tend to prefer (1). Okay. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] builtin: add git-check-mailmap command
Eric Sunshine sunsh...@sunshineco.com writes: For each contact information (either in the form of ``Name user@host'' or ...) in order to clarify that the two forms of input is what you call contact information. Is this easier to read? For each ``Name $$user@host$$'' or ``$$user@host$$'' from the command-line or standard input (when using `--stdin`), print a line showing either the canonical name and email address (see Mapping Authors below), or the input ``Name $$user@host$$'' or ``$$user@host$$'' if there is no mapping for that person. I find it easier than your original, but I do not know if you would want to repeat the Name... or user@host at the end. It does not seem to add much useful information and is distracting. If there isn't, use null_term_line like they do. In check-attr, null_term_line indicates that _input_ lines are null-terminated. In check-ignore, null_term_lines is overloaded (and perhaps abused) to mean that both _input_ and _output_ lines are null-terminated. That is unfortunate but it is good that you found the breakage. As we do not have --nul-terminated-input and --nul-terminated-output options separtely, -z should apply to both input and output. What b4666852 (check-attr: Add --stdin option, 2008-10-07) did is broken. What check-ignore does We should find a way to fix it. I have a feeling that silently fixing it and seeing if anybody screams might be the best course of action ;-). Also git check-ignore -h advertises -z as only affecting --stdin, which is also wrong. It does affect both input and output as it should, so it should be described as such, I think. Thanks for noticing. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html