Re: [PATCH] pull, fetch: add --set-upstream option

2019-08-20 Thread Matthieu Moy
Junio C Hamano  writes:

>> That wouldn't make the commands really easier to type IMHO, as you would
>> still have to pull at some point, so it's:
>>
>>   git remote add main http://example.com/project-main-repo
>>   git pull --set-upstream main master
>>   
>> Vs
>>
>>   git remote add --set-upstream master main 
>> http://example.com/project-main-repo
>>   git pull
>>
>> The second is a bit shorter (saves the second instance of "master"), but
>> I tend to prefer the first to avoid the overly long "git remote add"
>> command.
>
> I do not particularly care about five extra keystrokes.  The reason
> I prefer the latter more is conceptual clarity of it saying "I use
> 'remote' to set things up, and then use 'pull' to get updated" (as
> opposed to "I use 'remote' to set things half-way up, and then use
> the first 'pull' to finish setting things up and getting updated.
> I should remember that I do not need to give --set-upstream to later
> 'pull' I used to get further updates").

That's a good argument to add a similar feature to "git remote", and
it's a good idea for a microproject in the future actually. I admit I
didn't consider this possibility before this discussion, thanks.

I think I'll still appreciate having the possibility to "pull
--set-upstream" too:

* "git remote add" is ran once for a remote, "git pull --set-upstream"
  can be run several times for several branches.

* In practice, even when "remote add" supports "--set-upstream", I'll
  very likely forget it, and by the time I run "git pull", it'll be too
  late to add --set-upstream to my "remote add" command.

-- 
Matthieu Moy
https://matthieu-moy.fr/


[PATCH v2] pull, fetch: add --set-upstream option

2019-08-19 Thread Matthieu Moy
From: Corentin BOMPARD 

Add the --set-upstream option to git pull/fetch
which lets the user set the upstream configuration
(branch..merge and
branch..remote) for the current branch.

A typical use-case is:

git clone http://example.com/my-public-fork
git remote add main http://example.com/project-main-repo
git pull --set-upstream main master

or, instead of the last line:

git fetch --set-upstream main master
git merge # or git rebase

This is mostly equivalent to cloning project-main-repo (which sets
upsteam) and then "git remote add" my-public-fork, but may feel more
natural for people using a hosting system which allows forking from
the web UI.

This functionality is analog to "git push --set-upstream".

Signed-off-by: Corentin BOMPARD 
Signed-off-by: Nathan BERBEZIER 
Signed-off-by: Pablo CHABANNE 
Signed-off-by: Matthieu Moy 
Patch-edited-by: Matthieu Moy 
---
Only style fixes and slightly improved commit message since v1.
Interdiff below:

  diff --git a/builtin/fetch.c b/builtin/fetch.c
  index 5557ae1c04..54d6b01892 100644
  --- a/builtin/fetch.c
  +++ b/builtin/fetch.c
  @@ -51,7 +51,8 @@ static int fetch_prune_tags_config = -1; /* unspecified */
   static int prune_tags = -1; /* unspecified */
   #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */
   
  -static int all, append, dry_run, force, keep, multiple, update_head_ok, 
verbosity, deepen_relative, set_upstream;
  +static int all, append, dry_run, force, keep, multiple, update_head_ok;
  +static int verbosity, deepen_relative, set_upstream;
   static int progress = -1;
   static int enable_auto_gc = 1;
   static int tags = TAGS_DEFAULT, unshallow, update_shallow, deepen;
  @@ -1377,12 +1378,14 @@ static int do_fetch(struct transport *transport,
struct ref *source_ref = NULL;
   
/*
  -  * We're setting the upstream configuration for the current 
branch. The
  -  * relevent upstream is the fetched branch that is meant to be 
merged with
  -  * the current one, i.e. the one fetched to FETCH_HEAD.
  +  * We're setting the upstream configuration for the
  +  * current branch. The relevent upstream is the
  +  * fetched branch that is meant to be merged with the
  +  * current one, i.e. the one fetched to FETCH_HEAD.
 *
  -  * When there are several such branches, consider the request 
ambiguous and
  -  * err on the safe side by doing nothing and just emit a 
warning.
  +  * When there are several such branches, consider the
  +  * request ambiguous and err on the safe side by doing
  +  * nothing and just emit a warning.
 */
for (rm = ref_map; rm; rm = rm->next) {
if (!rm->peer_ref) {
  @@ -1396,17 +1399,17 @@ static int do_fetch(struct transport *transport,
}
if (source_ref) {
if (!strcmp(source_ref->name, "HEAD") ||
  - starts_with(source_ref->name, "refs/heads/")) {
  - install_branch_config(0, branch->name,
  + starts_with(source_ref->name, "refs/heads/"))
  + install_branch_config(0,
  +   branch->name,
  transport->remote->name,
  source_ref->name);
  - } else if (starts_with(source_ref->name, 
"refs/remotes/")) {
  + else if (starts_with(source_ref->name, "refs/remotes/"))
warning(_("not setting upstream for a remote 
remote-tracking branch"));
  - } else if (starts_with(source_ref->name, "refs/tags/")) 
{
  + else if (starts_with(source_ref->name, "refs/tags/"))
warning(_("not setting upstream for a remote 
tag"));
  - } else {
  + else
warning(_("unknown branch type"));
  - }
} else {
warning(_("no source branch found.\n"
"you need to specify exactly one branch with 
the --set-upstream option."));
  

 Documentation/fetch-options.txt |   7 ++
 builtin/fetch.c |  51 -
 builtin/pull.c  |   6 ++
 t/t5553-set-upstream.sh | 178 
 4 files changed, 241 insertions(+), 1 deletion(-)
 create mode 100755 t/t5553-set-upstream.sh

diff --git a/Documentation/fetch-options.txt b/Documen

Re: [PATCH] pull, fetch: add --set-upstream option

2019-08-19 Thread Matthieu Moy
Pratyush Yadav  writes:

> This is not really a review. Just some minor nitpicks I spotted while 
> reading through.

Thanks for the comments.

>> -static int all, append, dry_run, force, keep, multiple, update_head_ok, 
>> verbosity, deepen_relative;
>> +static int all, append, dry_run, force, keep, multiple, update_head_ok, 
>> verbosity, deepen_relative, set_upstream;
>
> This line is getting pretty long. I think it is a good idea to split it 
> into two.

Indeed, and it was already >80 characters, I've split it.

>> +if (set_upstream) {
>> +struct branch *branch = branch_get("HEAD");
>> +struct ref *rm;
>> +struct ref *source_ref = NULL;
>> +
>> +/*
>> + * We're setting the upstream configuration for the current 
>> branch. The
>> + * relevent upstream is the fetched branch that is meant to be 
>> merged with
>> + * the current one, i.e. the one fetched to FETCH_HEAD.
>> + *
>> + * When there are several such branches, consider the request 
>> ambiguous and
>> + * err on the safe side by doing nothing and just emit a 
>> warning.
>> + */
>
> The comment lines cross the 80 column boundary. The usual convention in 
> this project is to try to keep lines below 80 columns. For strings IMO 
> an exception can be allowed because breaking them up makes it harder to 
> grep for them. But comments are the easiest to format.
>
> Are you using a tab size of 4?

I'm not, but it's possible that the original authors had. Anyway, I've
wrapped it.

>> +for (rm = ref_map; rm; rm = rm->next) {
>> +if (!rm->peer_ref) {
>> +if (source_ref) {
>> +warning(_("multiple branch detected, 
>> incompatible with --set-upstream"));
>> +goto skip;
>> +} else {
>> +source_ref = rm;
>> +}
>> +}
>> +}
>> +if (source_ref) {
>> +if (!strcmp(source_ref->name, "HEAD") || 
>
> This line has a trailing space.

Fixed.

> So this should change to something like:
>
>   install_branch_config(0, branch->name,
> transport->remote->name,
> source_ref->name);

I've added a newline after the comma, I don't like mixing "several
arguments on the same line" and "one argument per line".

>  
> Maybe this discrepancy is because you are using the wrong tab size?
>
>> +} else if (starts_with(source_ref->name, 
>> "refs/remotes/")) {
>> +warning(_("not setting upstream for a remote 
>> remote-tracking branch"));
>> +} else if (starts_with(source_ref->name, "refs/tags/")) 
>> {
>> +warning(_("not setting upstream for a remote 
>> tag"));
>> +} else {
>> +warning(_("unknown branch type"));
>> +}
>
> No need to wrap single line if statements in braces.

Fixed.

>> +#tests for fetch --set-upstream
>
> Add a space after the '#'. Same in other comments below.

Fixed.

Thanks. Version 2 fixing all these follows.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] pull, fetch: add --set-upstream option

2019-08-19 Thread Matthieu Moy
Junio C Hamano  writes:

> Matthieu Moy  writes:
>
>> From: Corentin BOMPARD 
>>
>> Add the --set-upstream option to git pull/fetch
>> which lets the user set the upstream configuration
>> (branch..merge and
>> branch..remote) for the current branch.
>>
>> A typical use-case is:
>>
>> git clone http://example.com/my-public-fork
>> git remote add main http://example.com/project-main-repo
>> git pull --set-upstream main master
>>
>> or, instead of the last line:
>>
>> git fetch --set-upstream main master
>> git merge # or git rebase
>>
>> This functionality is analog to push --set-upstream.
>
> I was writing a one-paragraph summary for this topic, for the
> "What's cooking" report, and here is what I have:
>
>  "git fetch" learned "--set-upstream" option to help those who first
>  clone from a forked repository they intend to push to, add the true
>  upstream via "git remote add" and then "git fetch" from it.
>
> After describing it like so, I cannot shake the feeling that the
> workflow this intends to support feels somewhat backwards and
> suboptimal.
>
>  - Unless you rely on server-side "fork" like GitHub does,

Note that these days, this is not a very restrictive statement ;-).

And when you make a fork on GitHub or GitLab from the web UI, the next
thing you see is the page of your fork with a button "clone or download"
pointing to your local copy's URL. So even though there are arguments to
clone upstream first, it's also quite natural from the UI point of view
to clone the local copy, and add upstream when needed.

>you would first clone from the upstream, and then push to your
>"fork". The flow whose first step is to clone from your "fork", not
>from the true upstream, feels backwards (cloning from upstream then
>adding your fork as a secondary may be more natural, without need for
>the complexity of --set-upstream to pull/fetch/push, no?).

To me, it depends on the involvement in the project. If I plan to send
several contributions to a project, I'd usually clone the upstream and
add my fork. But I also often do:

- Find a project on GitHub/GitLab/...
- Think about a minor contribution I can make
- Fork from the web UI
- clone my fork
- code, commit, push
- make a PR

Only if my PR takes time to get accepted, I'll add upstream as a remote
and pull from there to rebase my PR.

>  - The second step adds the true upstream using "git remote", and at
>that point, in your mind you are quite clear that you want to
>pull from there (and push to your own fork).  Not having the "I
>am adding this new remote; from now on, it is my upstream"

Note that you can also group "remote add" and "pull" by saying just

  git pull --set-upstream http://example.com/project-main-repo master

(I still tend to prefer the "remote add" + "pull" flow to name the
remote, though).

>feature at this step, and instead having to say that with your
>first "git pull", feels backwards.  If this feature were instead
>added to "git remote", then the last step in your example does
>not even have to say "main" (and no need for this new option),
>does it?

There's already "git remote add --track   ", but it
does something different: it does not set the upstream information but
only sets the glob refspec to fetch only one branch from the remote.

We could add a new option like

  git remote --set-upstream   

That would do

  git remote add  
  git branch --set-upstream-to=

That wouldn't make the commands really easier to type IMHO, as you would
still have to pull at some point, so it's:

  git remote add main http://example.com/project-main-repo
  git pull --set-upstream main master
  
Vs

  git remote add --set-upstream master main http://example.com/project-main-repo
  git pull

The second is a bit shorter (saves the second instance of "master"), but
I tend to prefer the first to avoid the overly long "git remote add"
command.

Also, if one has several local branches, one may run just one "git
remote add" and several "git pull --set-upstream".

Note that there are other possible use-cases, like "upstream was using a
flow where 'master' was the main branch, but now commits to 'develop'
branch and only merges to 'master' for releases", where you can just

  git pull --set-upstream origin develop

Actually, since "--set-upstream" means "next time, *pull* from this
branch", it felt weird to have it in "git *push*" and not in "git pull".
Certainly, not having "git pull --set-upstream" it "git pull" wasn't
that much bothering (otherwise, someone would have implemented it long
ago), but I still find it a nice-to-have shortcut.

Cheers,

-- 
Matthieu Moy
https://matthieu-moy.fr/


[PATCH] pull, fetch: add --set-upstream option

2019-08-14 Thread Matthieu Moy
From: Corentin BOMPARD 

Add the --set-upstream option to git pull/fetch
which lets the user set the upstream configuration
(branch..merge and
branch..remote) for the current branch.

A typical use-case is:

git clone http://example.com/my-public-fork
git remote add main http://example.com/project-main-repo
git pull --set-upstream main master

or, instead of the last line:

git fetch --set-upstream main master
git merge # or git rebase

This functionality is analog to push --set-upstream.

Signed-off-by: Corentin BOMPARD 
Signed-off-by: Nathan BERBEZIER 
Signed-off-by: Pablo CHABANNE 
Signed-off-by: Matthieu Moy 
Patch-edited-by: Matthieu Moy 
---
This is a followup on
https://public-inbox.org/git/86zhoil3yw@univ-lyon1.fr/. It's
initially a student project, but students didn't get time to complete
it. Still, I think the feature is interesting, and I finally get time
to fix the remarks made up to now. This now looks good to me, but
obviously needs other pairs of eyes.

Thanks,

 Documentation/fetch-options.txt |   7 ++
 builtin/fetch.c |  48 -
 builtin/pull.c  |   6 ++
 t/t5553-set-upstream.sh | 178 
 4 files changed, 238 insertions(+), 1 deletion(-)
 create mode 100755 t/t5553-set-upstream.sh

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 3c9b4f9e09..99df1f3d4e 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -169,6 +169,13 @@ ifndef::git-pull[]
Disable recursive fetching of submodules (this has the same effect as
using the `--recurse-submodules=no` option).
 
+--set-upstream::
+   If the remote is fetched successfully, pull and add upstream
+   (tracking) reference, used by argument-less
+   linkgit:git-pull[1] and other commands. For more information,
+   see `branch..merge` and `branch..remote` in
+   linkgit:git-config[1].
+
 --submodule-prefix=::
Prepend  to paths printed in informative messages
such as "Fetching submodule foo".  This option is used
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 717dd14e89..5557ae1c04 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -23,6 +23,7 @@
 #include "packfile.h"
 #include "list-objects-filter-options.h"
 #include "commit-reach.h"
+#include "branch.h"
 
 #define FORCED_UPDATES_DELAY_WARNING_IN_MS (10 * 1000)
 
@@ -50,7 +51,7 @@ static int fetch_prune_tags_config = -1; /* unspecified */
 static int prune_tags = -1; /* unspecified */
 #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */
 
-static int all, append, dry_run, force, keep, multiple, update_head_ok, 
verbosity, deepen_relative;
+static int all, append, dry_run, force, keep, multiple, update_head_ok, 
verbosity, deepen_relative, set_upstream;
 static int progress = -1;
 static int enable_auto_gc = 1;
 static int tags = TAGS_DEFAULT, unshallow, update_shallow, deepen;
@@ -123,6 +124,8 @@ static struct option builtin_fetch_options[] = {
OPT__VERBOSITY(&verbosity),
OPT_BOOL(0, "all", &all,
 N_("fetch from all remotes")),
+   OPT_BOOL(0, "set-upstream", &set_upstream,
+N_("set upstream for git pull/fetch")),
OPT_BOOL('a', "append", &append,
 N_("append to .git/FETCH_HEAD instead of overwriting")),
OPT_STRING(0, "upload-pack", &upload_pack, N_("path"),
@@ -1367,6 +1370,49 @@ static int do_fetch(struct transport *transport,
retcode = 1;
goto cleanup;
}
+
+   if (set_upstream) {
+   struct branch *branch = branch_get("HEAD");
+   struct ref *rm;
+   struct ref *source_ref = NULL;
+
+   /*
+* We're setting the upstream configuration for the current 
branch. The
+* relevent upstream is the fetched branch that is meant to be 
merged with
+* the current one, i.e. the one fetched to FETCH_HEAD.
+*
+* When there are several such branches, consider the request 
ambiguous and
+* err on the safe side by doing nothing and just emit a 
warning.
+*/
+   for (rm = ref_map; rm; rm = rm->next) {
+   if (!rm->peer_ref) {
+   if (source_ref) {
+   warning(_("multiple branch detected, 
incompatible with --set-upstream"));
+   goto skip;
+   } else {
+   source_ref = rm;
+   }
+   }
+   }
+   if (source_ref) {
+ 

Re: [PATCH] [WIP/RFC] add git pull and git fetch --set-upstream

2019-04-22 Thread Matthieu Moy
BOMPARD CORENTIN p1603631  writes:

> Add the --set-upstream option to git pull/fetch

Add _a_?

> which lets the user set the upstream configuration
> (branch..merge and
> branch..remote) for the current branch.
>
> For example a typical use-case like

I don't understand this sentence. Perhaps

A typical use-case is:

> git clone http://example.com/my-public-fork
>
> git remote add main http://example.com/project-main-repo
>
> git pull --set-upstream main master

I'd keep the newline before and after the block of commands, but not
between commands.

> +--set-upstream::
> + If the new URL remote is correct, pull and add upstream (tracking) 

"URL remote" seems translated literally from french. You probably meant
"remote URL". I'd write "If the remote is fetched successfully, ...".

> + reference, used by argument-less linkgit:git-push[1] and other commands.

What's your conclusion on the discussion following your previous
submission here? Mine is that git-push is not the best command to
mention here. The setting impacts pull, fetch, push, merge and rebase
(I may have missed others), and to me the main motivation is to impact
pull, so if only one command should be cited, it should be pull.

> +  * When there are several such branches, consider the request 
> ambiguous and
> +  * err on the safe side by doing nothing and just emit a waring.

s/waring/warning/

> + if (!rm->peer_ref) {
> + if (source_ref) {
> + warning(_("multiple branch detected, 
> incompatible with set-upstream"));
> + source_ref = NULL;
> + goto skip;

"source_ref = NULL" is dead code due to the "goto skip" right below. I'd
remove it.

> + if (source_ref) {
> + if (!strcmp(source_ref->name, "HEAD") || 
> + starts_with(source_ref->name, "refs/heads/")) {
> + install_branch_config(0, branch->name,
> +  
> transport->remote->name,
> +  source_ref->name);
> + } else if (starts_with(source_ref->name, 
> "refs/remotes/")) {
> + warning(_("not setting upstream for a remote 
> remote-tracking branch"));
> + } else if (starts_with(source_ref->name, "refs/tags/")) 
> {
> + warning(_("tag upstream not set"));

The second warning seems a bit cryptic to me. Why not take the same as
the first, with s/remote-tracking branch/tag/?

> + warning(_("no source branch found. \n" 

Already noted in previous round: useless trailing whitespace.

> --- /dev/null
> +++ b/t/t5553-set-upstream.sh
> @@ -0,0 +1,137 @@
> +#!/bin/sh
> +
> +test_description='"git fetch/pull --set-upstream" basic tests.
> +
> +'

Don't make $test_description a multi-line string, just close the ' on
the first line.

> +check_config_empty () {

Perhaps name the function check_config_missing instead of ..._empty.

> +test_expect_success 'setup bare parent fetch' '
> + ensure_fresh_upstream &&
> + git remote add upstream parent &&
> + git remote add up parent

I don't think you ever use this "up". Either add a comment explaining
why it's needed, or remove it.

> +test_expect_success 'fetch --set-upstream upstream master sets branch master 
> but not other' '
> + git fetch --set-upstream upstream master &&
> + check_config master upstream refs/heads/master &&
> + check_config_empty other
> +'
> +
> +test_expect_success 'fetch --set-upstream upstream other sets branch other' '
> + git fetch --set-upstream upstream other &&
> + check_config master upstream refs/heads/other &&
> + check_config_empty other
> +'

The first test sets the config for master, and the config is not reset
between tests, so the second may read the config set by "git fetch"
right above, or just a leftover config of the previous test.

You need a "clear_config" in between. Best is to put it at the beginning
of tests so that it's clear that the test does not depend on what has
been executed previously. There are several places where you really need
it. It probably makes sense to use it at the start of every tests for
consistency and future-proof-ness.

> +test_expect_success 'fetch --set-upstream http://nosuchdomain.example.com 
> fails with the bad url' '
> + test_must_fail git fetch --set-upstream http://nosuchdomain.example.com 
> &&
> + check_config master upstream refs/heads/other &&
> + check_config_empty other &&
> + check_config_empty other2
> +'

It would probably make sense to check what happens when running

  git fetch --set-upstream 

i.e. use a URL instead of a named remote.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] [WIP/RFC] add git pull and git fetch --set-upstream

2019-04-19 Thread Matthieu Moy
Junio C Hamano  writes:

> Matthieu Moy  writes:
>
>> -u::
>> --set-upstream::
>>  For every branch that is up to date or successfully pushed, add
>>  upstream (tracking) reference, used by argument-less
>>  linkgit:git-pull[1] and other commands. For more information,
>>  see `branch..merge` in linkgit:git-config[1].
>>
>> Probably the reasoning was to make a symmetry between "git push
>> --set-upstream", which mentions "pull" in the doc, and the new "git pull
>> --set-upstream". However, I do not think there should be such symmetry:
>
> Yeah, if "git push --set-upstream" affects the settings that is used
> by "git pull", then the above description is good.  Does this new
> "git pull --set-upstream" affect the settings used by "git push"?  I
> somehow did not think so.  It records the remote and branch used by
> this particular "git pull" invocation in branch..{remote,merge}
> for use by future uses of "git pull", right?

It also affects push, in the absence of a branch..pushRemote
setting (branch..remote will be used).

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] [WIP/RFC] add git pull and git fetch --set-upstream

2019-04-18 Thread Matthieu Moy
BOMPARD CORENTIN p1603631  writes:

> + warning(_("No source branch found. \n You need to 
> specify excatly "
> + "one branch with the 
> set-upstream option."));

s/excatly/exactly/

Also, this " \n " is weird, the trailing whitespace is useless, and the
leading one on the next line is weird. You can use the \n to split the
string in the source without risking space issues:

warning(_("no source branch found.\n"
  "You need to specify excatly one branch with 
the set-upstream option."));

?

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] [WIP/RFC] add git pull and git fetch --set-upstream

2019-04-18 Thread Matthieu Moy
Junio C Hamano  writes:

>> --- a/Documentation/fetch-options.txt
>> +++ b/Documentation/fetch-options.txt
>> @@ -165,6 +165,11 @@ ifndef::git-pull[]
>>  Disable recursive fetching of submodules (this has the same effect as
>>  using the `--recurse-submodules=no` option).
>>  
>> +--set-upstream::
>> +If the new URL remote is correct, pull and add upstream (tracking) 
>> +reference, used by argument-less linkgit:git-push[1] and other commands.
>
> git-push and other commands?

I think this is taken from the documentation of --set-upstream for push,
which says:

-u::
--set-upstream::
For every branch that is up to date or successfully pushed, add
upstream (tracking) reference, used by argument-less
linkgit:git-pull[1] and other commands. For more information,
see `branch..merge` in linkgit:git-config[1].

Probably the reasoning was to make a symmetry between "git push
--set-upstream", which mentions "pull" in the doc, and the new "git pull
--set-upstream". However, I do not think there should be such symmetry:

Actually, the way I see it, the notion of uptream (i.e.
branch..remote and branch..merge) is primarily about
"pull" and friends, and "push" happens to use it also by default. But
when branch..pushRemote is set, upstream is really about
pulling, and pushing goes to the pushRemote.

>> +/* TODO: remove debug trace */
>
> Perhaps do so before sending it out for the review?

Yes. This is WIP for now, but it's time to get closer to a real patch,
and these debug statements are counter-productive for that.

>> +test_must_be_empty merge.$1
>> +}
>
> If this wanted to say "It is OK for the variable to be missing, and
> it also is OK for the variable to have an empty string as its value;
> all other cases are unacceptable",

Actually, I don't think the "present but empty" case makes sense here,
so just test_must_fail git config "$1" should do the trick.

I agree with all other remarks.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] [WIP/RFC] add git pull and git fetch --set-upstream

2019-04-04 Thread Matthieu Moy
git fetch/pull --set-upstream" basic tests.
> +
> +'
> +. ./test-lib.sh
> +
> +
> +
> +check_config() {
> + (echo $2; echo $3) >expect.$1
> + (git config branch.$1.remote
> +  git config branch.$1.merge) >actual.$1
> + test_cmp expect.$1 actual.$1
> +}
> +
> +check_config_empty() {
> + git config branch.$1.remote >remote.$1
> + test_must_be_empty remote.$1
> + git config branch.$1.merge >merge.$1
> + test_must_be_empty merge.$1
> +}

Broken &&-chain (in both functions, but most importantly in the second,
where the first test_must_be_empty is useless without &&.

> +test_expect_success 'fetch --set-upstream does not set branch other' '

Misleading test name: "set branch" -> "set upstream"? And here it's not
just about "other" but about all branches.

'fetch --set-upstream does not set upstream w/o branch'

?

> + git checkout master &&
> + git fetch --set-upstream upstream &&
> + check_config_empty master &&
> + check_config_empty other
> +'

> +#test_expect_success 'fetch --set-upstream does not set branch other' '
> +#git checkout master &&
> +#git fetch --set-upstream upstream &&
> +#check_config master upstream refs/heads/master &&
> +#check_config_empty other
> +#'

Avoid leaving leftovers like this, even in WIP patches, they distract
the reader.

> +test_expect_success 'fetch --set-upstream upstream master sets branch master 
> but not other' '
> + git fetch --set-upstream upstream master &&
> + check_config master upstream refs/heads/master &&
> + check_config_empty other
> +'
> +
> +

Style: you sometimes leave 2 blank lines, sometimes 1 between tests. Try
to be consistent.

> +test_expect_success 'pull --set-upstream upstream other sets branch other' '

Test title and content say the opposite of each other.

> + git pull --set-upstream upstream other &&
> + check_config master upstream refs/heads/other &&
> + check_config_empty other
> +'

> +test_expect_success 'pull --set-upstream http://nosuchdomain.example.com 
> fails with the bad url' '
> + test_must_fail git pull --set-upstream http://nosuchdomain.example.com
> +'

You should check that it doesn't touch the config. That it fails is not
a surprise regardless of the correctness of your code, but the thing to
check is that it does not touch the config before failing.

> +test_expect_success 'pull --set-upstream upstream with more than one branch 
> does nothing' '

Here also, test title and content say different things. Probably you
need to reset the config and use check_config_empty.

> + git pull --set-upstream upstream master three &&
> + check_config master upstream HEAD &&
> + check_config_empty three
> +'

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH 2/2 v3] doc: format pathnames and URLs as monospace.

2019-03-13 Thread Matthieu Moy
Eric Sunshine  writes:

> In patch 1/2:
>
> * drop the full stop from the first line of the commit message
>
> * s/futur/future/ in the commit message
>
> * s/There are false/& positives/ in the commit message
>
> * s/both, It/both, it/

Nice catches, thanks.

> In patch 2/2, there's a 'man curl' which probably ought to be
> converted to `man curl` (per paragraph updated by patch 1/2), but
> perhaps that's outside the scope of this patch series

Yes, I have no objection in fixing it but I'd rather limit the scope of
the patch and keep other fixes for another time.

Thanks,

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH 2/2 v3] doc: format pathnames and URLs as monospace.

2019-03-12 Thread Matthieu Moy
Corentin BOMPARD  writes:

> Applying CodingGuidelines about monospace on pathnames and URLs.
>
> See Documentation/CodingGuidelines.txt for more information.
>
> Signed-off-by: Corentin BOMPARD 
> Signed-off-by: Nathan BERBEZIER 
> Signed-off-by: Pablo CHABANNE 
> Signed-off-by: Matthieu MOY 
> ---
>  Changes: We listen to Matthieu MOY and Eric SUNSHINE's remarks about
>  our mistakes on the last patch.

This addresses all my previous remarks, so this (patches 1 and 2) is now

Reviewed-by: Matthieu Moy 

I'm Cc-ing Eric in case he has anything more to say.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH v2] doc: format pathnames and URLs as monospace

2019-03-03 Thread Matthieu Moy
"corentin bompard"  wrote:

> Updating the documentation to use monospace on URLs and pathnames because it
> makes more sense

We usually write commit message with an imperative tone, eg. "Update
documentation", not "Updating documentation". Also, the period (.) is
missing at the end of the sentence and the message is not wrapped at
72 characters (your text editor probably can do that for you; with
Emacs it's M-q or M-x auto-fill-mode RET).

But more importantly, "makes more sense" is the question here, not an
answer. The commit message is precisely here to justify why the code
after the patch makes more sense than before, and you can't argue "it
makes more sense because it makes more sense".

Among the arguments:

* It is already an established practice. For example:

  $ git grep "'[^' ]*/[^' ]*'" | wc -l
  204
  $ git grep '`[^` ]*/[^` ]*`' | wc -l
  576
  
  There are false on both sides, but after a cursory look at the
  output of both, I don't think the false positive rate is really
  higher in the second case.

  At least, this shows that the existing documentation uses inconsistent
  formatting, and that it would be good to do something about it.

* It may be debatable whether path names need to be typed in
  monospace (I wouldn't be shocked if they were using the normal
  font), but having them in italics is really unusual.

In addition to doing the actual change, you probably want to add a
mention of the rule followed in Documentation/CodingGuideline.

The patch itself looks good, except the error noted by Eric Sunshine
and:

> --- a/Documentation/git-filter-branch.txt
> +++ b/Documentation/git-filter-branch.txt
> @@ -48,7 +48,7 @@ rewriting published history.)
> 
> Always verify that the rewritten version is correct: The original refs,
> if different from the rewritten ones, will be stored in the namespace
> -'refs/original/'.
> +`refs/original/`.
> 
> Note that since this operation is very I/O expensive, it might
> be a good idea to redirect the temporary directory off-disk with the
[...]
> diff --git a/Documentation/glossary-content.txt 
> b/Documentation/glossary-content.txt
> index 023ca95e7..9848d0d84 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -524,7 +524,7 @@ The most notable example is `HEAD`.
> [[def_remote_tracking_branch]]remote-tracking branch::
>   A <> that is used to follow changes from another
>   <>. It typically looks like
> - 'refs/remotes/foo/bar' (indicating that it tracks a branch named
> + `refs/remotes/foo/bar` (indicating that it tracks a branch named
>   'bar' in a remote named 'foo'), and matches the right-hand-side of
>   a configured fetch <>. A remote-tracking
>   branch should not contain direct modifications or have local
> @@ -654,7 +654,7 @@ The most notable example is `HEAD`.
>   The default <> that is merged into the branch in
>   question (or the branch in question is rebased onto). It is configured
>   via branch..remote and branch..merge. If the upstream branch
> - of 'A' is 'origin/B' sometimes we say "'A' is tracking 'origin/B'".
> + of 'A' is `origin/B` sometimes we say "'A' is tracking `origin/B`".
> 
> [[def_working_tree]]working tree::
>   The tree of actual checked out files.  The working tree normally
[...]
> diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
> index 72daa20e7..92b1d5638 100644
> --- a/Documentation/revisions.txt
> +++ b/Documentation/revisions.txt
> @@ -23,27 +23,27 @@ characters and to avoid word splitting.
>   followed by a dash and a number of commits, followed by a dash, a
>   'g', and an abbreviated object name.
> 
> -'', e.g. 'master', 'heads/master', 'refs/heads/master'::
> +'', e.g. 'master', `heads/master`, `refs/heads/master`::

These are refnames, and you said you excluded them from the patch.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: Students projects: looking for small and medium project ideas

2019-02-26 Thread Matthieu Moy
Fabio Aiuto  writes:

> Hi Matthieu and to all developers,
> I'm Fabio, no more a student and I'm brand new in community
> development. I joined the git mailing-list about two weeks ago and I'm
> looking for some first fix or tasks. I apologize myself in advance for
> my little know of the subject.  Hope to have some useful information to
> start workin'.

My advice would be to "scratch your own itch", i.e. find something you
dislike about Git, and try to improve that. It's hard to find the
motivation (and time) to contribute in a purely un-interested way, but
once you start getting the benefits of your own patches in the way _you_
use Git, it's really rewarding !

Cheers,

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: Do test-path_is_{file,dir,exists} make sense anymore with -x?

2019-02-26 Thread Matthieu Moy
SZEDER Gábor  writes:

> On Tue, Feb 26, 2019 at 05:10:30PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> However. I wonder in general if we've re-visited the utility of these
>> wrappers and maybe other similar wrappers after -x was added.
>
>> But 4 years after this was added in a136f6d8ff ("test-lib.sh: support -x
>> option for shell-tracing", 2014-10-10) we got -x, and then with "-i -v -x":
>
> '-x' tracing doesn't work in all test scripts, unless it is run with a
> Bash version already supporting BASH_XTRACEFD, i.e. v4.1 or later.
> Notably the default Bash shipped in macOS is somewhere around v3.2.

According to http://www.tldp.org/LDP/abs/html/bashver4.html#AEN21183,
bash 4.1 was released on May, 2010. Are you sure macOS is _that_ late?

I also tried with dash, and -x seems to work fine too (I use "works with
dash" as a heuristic for "should word on any shell", but it doesn't
always work).

If -x doesn't work in some setups, it may be a good reason to wait a bit
before trashing test_path_is_*, but if it's clear enough that the vast
majority of platforms get -x, then why not trash these wrappers indeed.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] reflog: specify default pretty format in config

2019-01-30 Thread Matthieu Moy
Roland Hieber  writes:

> The output of git-reflog is currently only customizable by calling
> reflog with --pretty=... or overriding the default "oneline" pretty
> format in the configuration.

Sounds like a good idea to me, but the patch needs a bit more work:

>  Documentation/git-reflog.txt |  2 ++

It's nice to refer to the config variable in git-reflog.txt, but you
should also document it in some Documentation/config/*.txt file,
included from Documentation/config.txt, so that it appears in man
git-config.

>  builtin/log.c| 12 +---
>  2 files changed, 11 insertions(+), 3 deletions(-)

This lacks tests, too (t/*.sh).

> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -667,6 +667,7 @@ int cmd_log_reflog(int argc, const char **argv, const 
> char *prefix)
>  {
>   struct rev_info rev;
>   struct setup_revision_opt opt;
> + int cfg_have_pretty;
>  
>   init_log_defaults();
>   git_config(git_log_config, NULL);
> @@ -676,11 +677,16 @@ int cmd_log_reflog(int argc, const char **argv, const 
> char *prefix)
>   rev.verbose_header = 1;
>   memset(&opt, 0, sizeof(opt));
>   opt.def = "HEAD";
> +
> + cfg_have_pretty = git_config_get_string_const("reflog.pretty", 
> &fmt_pretty);
>   cmd_log_init_defaults(&rev);
> - rev.abbrev_commit = 1;
> - rev.commit_format = CMIT_FMT_ONELINE;
> - rev.use_terminator = 1;
> + if (cfg_have_pretty != 0) {

I'd write just "if (cfg_have_pretty)".

>   rev.always_show_header = 1;
> +
>   cmd_log_init_finish(argc, argv, prefix, &rev, &opt);

Avoid adding unrelated whitespace changes like this one.

Regards,

-- 
Matthieu Moy
https://matthieu-moy.fr/


Students projects: looking for small and medium project ideas

2019-01-14 Thread Matthieu Moy
Hi,

I haven't been active for a while on this list, but for those who don't
know me, I'm a CS teacher and I'm regularly offering my students to
contribute to open-source projects as part of their school projects. A
few nice features like "git rebase -i --exec" or many of the hints in
"git status" were implemented as part of these projects.

I'm starting another instance of such project next week.

Part of the work of students is to choose which feature they want to
work on, but I try to prepare this for them. I'm keeping a list of ideas
here:

  https://git.wiki.kernel.org/index.php/SmallProjectsIdeas

(At some point, I should probably migrate this to git.github.io, since
the wiki only seems half-alive these days).

I'm looking for small to medium size projects (typically, a GSoC project
is far too big in comparison, but we may expect more than just
microprojects).

You may suggest ideas by editting the wiki page, or just by replying to
this email (I'll point my students to the thread). Don't hesitate to
remove entries (or ask me to do so) on the wiki page if you think they
are not relevant anymore.

Thanks in advance,

-- 
Matthieu Moy
https://matthieu-moy.fr/


[PATCH] git-multimail: update to release 1.5.0

2019-01-07 Thread Matthieu Moy
Changes are described in CHANGES.

Contributions-by: Matthieu Moy 
Contributions-by: William Stewart 
Contributions-by: Ville Skyttä 
Contributions-by: Dirk Olmes 
Contributions-by: Björn Kautler 
Contributions-by: Konstantin Ryabitsev 
Contributions-by: Gareth Pye 
Contributions-by: David Lazar 
Signed-off-by: Matthieu Moy 
---

A very long time since I did a git-multimail release, sorry for the
lack of activity on this project. OTOH, there have not been many
pull-requests nor feature requests, which may be a sign that the
project reached some kind of maturity.

Help is welcome if some people would like to see git-multimail evolve
faster.

 contrib/hooks/multimail/CHANGES   |  56 ++
 contrib/hooks/multimail/CONTRIBUTING.rst  |  28 ++-
 contrib/hooks/multimail/README.Git|   4 +-
 .../hooks/multimail/{README => README.rst}|  38 +++-
 contrib/hooks/multimail/doc/gitolite.rst  |   9 +
 contrib/hooks/multimail/git_multimail.py  | 188 ++
 .../hooks/multimail/migrate-mailhook-config   |  13 +-
 contrib/hooks/multimail/post-receive.example  |   2 +-
 8 files changed, 281 insertions(+), 57 deletions(-)
 rename contrib/hooks/multimail/{README => README.rst} (95%)

diff --git a/contrib/hooks/multimail/CHANGES b/contrib/hooks/multimail/CHANGES
index 2076cf972b..35791fd02c 100644
--- a/contrib/hooks/multimail/CHANGES
+++ b/contrib/hooks/multimail/CHANGES
@@ -1,3 +1,59 @@
+Release 1.5.0
+=
+
+Backward-incompatible change
+
+
+The name of classes for environment was misnamed as `*Environement`.
+It is now `*Environment`.
+
+New features
+
+
+* A Thread-Index header is now added to each email sent (except for
+  combined emails where it would not make sense), so that MS Outlook
+  properly groups messages by threads even though they have a
+  different subject line. Unfortunately, even adding this header the
+  threading still seems to be unreliable, but it is unclear whether
+  this is an issue on our side or on MS Outlook's side (see discussion
+  here: https://github.com/git-multimail/git-multimail/pull/194).
+
+* A new variable multimailhook.ExcludeMergeRevisions was added to send
+  notification emails only for non-merge commits.
+
+* For gitolite environment, it is now possible to specify the mail map
+  in a separate file in addition to gitolite.conf, using the variable
+  multimailhook.MailaddressMap.
+
+Internal changes
+
+
+* The testsuite now uses GIT_PRINT_SHA1_ELLIPSIS where needed for
+  compatibility with recent Git versions. Only tests are affected.
+
+* We don't try to install pyflakes in the continuous integration job
+  for old Python versions where it's no longer available.
+
+* Stop using the deprecated cgi.escape in Python 3.
+
+* New flake8 warnings have been fixed.
+
+* Python 3.6 is now tested against on Travis-CI.
+
+* A bunch of lgtm.com warnings have been fixed.
+
+Bug fixes
+-
+
+* SMTPMailer logs in only once now. It used to re-login for each email
+  sent which triggered errors for some SMTP servers.
+
+* migrate-mailhook-config was broken by internal refactoring, it
+  should now work again.
+
+This version was tested with Python 2.6 to 3.7. It was tested with Git
+1.7.10.406.gdc801, 2.15.1 and 2.20.1.98.gecbdaf0.
+
 Release 1.4.0
 =
 
diff --git a/contrib/hooks/multimail/CONTRIBUTING.rst 
b/contrib/hooks/multimail/CONTRIBUTING.rst
index da65570e9b..de20a54287 100644
--- a/contrib/hooks/multimail/CONTRIBUTING.rst
+++ b/contrib/hooks/multimail/CONTRIBUTING.rst
@@ -4,9 +4,8 @@ Contributing
 git-multimail is an open-source project, built by volunteers. We would
 welcome your help!
 
-The current maintainers are Matthieu Moy
- and Michael Haggerty
-.
+The current maintainers are `Matthieu Moy <http://matthieu-moy.fr>`__ and
+`Michael Haggerty <https://github.com/mhagger>`__.
 
 Please note that although a copy of git-multimail is distributed in
 the "contrib" section of the main Git project, development takes place
@@ -33,6 +32,29 @@ mailing list`_.
 Please CC emails regarding git-multimail to the maintainers so that we
 don't overlook them.
 
+Help needed: testers/maintainer for specific environments/OS
+
+
+The current maintainer uses and tests git-multimail on Linux with the
+Generic environment. More testers, or better contributors are needed
+to test git-multimail on other real-life setups:
+
+* Mac OS X, Windows: git-multimail is currently not supported on these
+  platforms. But since we have no external dependencies and try to
+  write code as portable as possible, it is possible that
+  git-multimail already runs there and if not, it is likely that it
+  could be ported easily.
+
+  Patches to improve support for Windows and OS X are welcome.
+  Ideally, there would be a sub-maintainer for each OS who would test
+  at lea

Re: [PATCH] push: change needlessly ambiguous example in error

2018-11-13 Thread Matthieu Moy
"Junio C Hamano"  wrote:

> > Where 'topic' is a tracking branch of 'origin/master' (I use
> > push.default=upstream). I only recently discovered that I could push to
> > 'HEAD" to do the same thing. So one ulterior motive is to make that more
> > prominent.
[...]
> Do we consider the current behaviour useful? Is it documented

Yes, since 1750783 (Documentation: more git push examples, 2009-01-26).

It may be an accident that the doc says "to the same name on the
remote." since it predates the introduction of push.default, but it
does say so and it's the actual behavior.

> already and widely known?

https://stackoverflow.com/questions/14031970/git-push-current-branch-shortcut

458 votes for the answer suggesting it.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] push: change needlessly ambiguous example in error

2018-11-13 Thread Matthieu Moy
Ævar Arnfjörð Bjarmason"  wrote:

> Let's use "git push  HEAD" which always means push the current
> branch name to that remote, instead of "git push 
> " which will do that under "simple", but is not
> guaranteed to do under "upstream".

Probably a good idea indeed.

One potential objection though: I think many people know

  git push origin master

(not that they necessarily understand it, but I've seen many people talking
or asking questions about what "origin master" is)

This "git push origin HEAD" is IMHO less common. It may confuse users.
Or users may learn it and be happy thanks to your message. I don't know.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Matthieu Moy
"Per Lundberg"  wrote:

> On 11/11/18 5:41 PM, Duy Nguyen wrote:
> > On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
> >  wrote:
>
> >> That will lose no data, and in the very rare cases where a checkout of
> >> tracked files would overwrite an ignored pattern, we can just error out
> >> (as we do with the "Ok to overwrite" branch removed) and tell the user
> >> to delete the files to proceed.
> > There's also the other side of the coin. If this refuse to overwrite
> > triggers too often, it can become an annoyance.

I may have missed some cases, but to me the cases when checkout may try
to overwrite an ignored file are essentially:

* Someone "git add"ed a file meant to be ignored by mistake (e.g.
  "git add -f *.o").

* A file that was meant to be kept private (e.g. config.mak.dev) ends
  up being tracked. This may happen when we find a way to make per-developer
  settings the same for everyone.

I both cases I'd want at least to be notified that something is going on,
and in the second I'd probably want to keep my local file around.

> If we feel thrashable is stretching it too far (which I don't think it
> is), we could add a "core.ignore_files_are_trashable" setting that
> brings back the old semantics, for those who have a strong feeling about it.

May I remind an idea I sugested in an old thread: add an intermediate level
where ignored files to be overwritten are renamed (eg. foo -> foo~ like Emacs'
backup files):

https://public-inbox.org/git/vpqd3t9656k@bauges.imag.fr/

One advantage of the "rename" behavior is that it's safer that the current,
but still not very disturbing for people who like the current behavior. This
makes it a good candidate for a default behavior.

This could come in complement with this thread's "precious" concept:

* If you know what you're doing and know that such or such file is precious,
  mark it as such and Git will never overwrite it.

* If you don't know about precious files, just keep the default setting and
  the worse that can happen is to get your file overwritten with a bakup
  of the old version kept around.

This would probably play better with a notion of "precious" files than with
a notion of "trashable" files.

-- 
Matthieu Moy
https://matthieu-moy.fr/


git@vger.kernel.org

2018-08-08 Thread Matthieu Moy
"jrnieder"  wrote:

> (+cc: some folks interested in git-remote-mediawiki)

Thanks.

In case it still matters, an obvious Acked-by: Matthieu Moy 


-- 
Matthieu Moy


Re: [PATCH 1/3] checkout.c: add strict usage of -- before file_path

2018-05-13 Thread Matthieu Moy
"Dannier Castro L"  wrote:

> Currently,  is a complex command able to handle both
> branches and files without any distintion other than their names,
> taking into account that depending on the type (branch or file),
> the functionality is completely different, the easier example:
> 
> $ git checkout   # Switch from current branch to .
> $ git checkout # Restore  from HEAD, discarding
> # changes if it's necessary.
> $ git checkout --  # The same as the last one, only with an
> # useless '--'.

It's not really "useless".

In the first two commands you give, git guesses whether the first
argument is a branch or a file. In the 3rd, the -- indicates that it
must be a file.

> For GIT new users,

Nit: we write Git (for the system) or git (for the command-line tool),
but usually avoid the all-caps GIT.

> The solution consists in detect '--' into command args, allowing
> the discard of changes and considering the following names as
> file paths, otherwise, they are branch names.

This answers the "what?" question, but does not explain why this is a
good change. A good commit message should focus more on the "why?"
question.

While I agree that "git checkout" is a complex command, and would love
to see a simpler syntax at least for the most common operations, I do
not think that this is a good change for several reasons:

* If one considers that this "--" syntax is an issue, then this patch
  is a very partial solution only. Many other commands use the same
  convention (for example "git log " Vs "git log -- "),
  so changing only one makes git less consistent. Also, note that this
  "--" convention is not git-specific. Try "touch --help" and "touch
  -- --help" for example.

* This breaks backward compatibility. People used to "git checkout
  " won't be able to use it anymore. Scripts using it will
  break. Git avoids breaking backward compatibility, and when there's
  a good reason to do so we need a good transition plan. In this case,
  one possible plan would be to 1) issue a warning whenever "git
  checkout " is used for a while, and then 2) actually forbid
  it. But following this path, I don't think step 2) would actually be
  useful.

> @@ -928,6 +931,7 @@ static int parse_branchname_arg(int argc, const char 
> **argv,
>   dash_dash_pos = -1;
>   for (i = 0; i < argc; i++) {
>   if (!strcmp(argv[i], "--")) {
> + opts->discard_changes = 1;
>   dash_dash_pos = i;

Wouldn't "dash_dash_pos != -1" be enough to know whether there's a --?

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults

2018-05-01 Thread Matthieu Moy
"Eckhard Maaß" :

> On Tue, May 01, 2018 at 01:09:06PM +0200, Matthieu Moy wrote:
> > That init_diff_ui_defaults() should indeed have been before
> > git_config() from the beginning. My bad, I'm the one who
> > misplaced it apparently :-(.

> Should I have done this "bug fix" in a separate commit or mention it in
> the commit message?

I'm fine with it as-is. Before your "fix", the config was ignored
because overwritten by init_diff_ui_defaults() after reading the
config, so effect of your change is indeed what the commit message
describes.

I'm often thinking aloud while reviewing, don't take my comments as
objections.

> > This "break_opt = 0" deserves a mention in the commit message IMHO.
> > I'm not 100% sure it's a good change actually.

> Hm, what problems do you see here?

I don't see any "problem", I *think* your change is good, but I can't
fully convince myself that it is without further explanation.

Unlike the other two, this option has no corresponding configuration
variable, so the "let the config" argument doesn't apply. For "git
status", there's actually not even a command-line option. So, this
assignment removed, there's no way in the user-interface to re-enable
the previous behavior. *If* there was a good reason to get "break_opt
= 0", then your patch is breaking it.

Unfortunately, the commit introducing it doesn't help much: f714fb8
(Enable rewrite as well as rename detection in git-status,
2007-12-02) is just a one-liner message with a one-liner patch.

But actually, I never used -B/--break-rewrites, and writting this
message I tried to get a case where -B would make a difference and I'm
not even able to find one. So, as someone who never understood the
real point of -B, I'm not sure I'm qualified to juge on what's the
best default ;-).

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults

2018-05-01 Thread Matthieu Moy
"Eckhard S. Maaß"  wrote:

> Since the very beginning, git status behaved differently for rename
> detection than other rename aware commands like git log or git show as
> it has the use of rename hard coded into it.

My understanding is that the succession of events went stg like:

1) invent the rename detection, but consider it experimental
   hence don't activate it by default;

2) add commands using the rename detection, and since it works
   well, use it by default;

3) activate rename detection by default for diff.

The next logical step is what you patch does indeed.

> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -161,9 +161,9 @@ static void determine_whence(struct wt_status *s)
> static void status_init_config(struct wt_status *s, config_fn_t fn)
> {
>   wt_status_prepare(s);
> + init_diff_ui_defaults();
>   git_config(fn, s);
>   determine_whence(s);
> - init_diff_ui_defaults();
>   s->hints = advice_status_hints; /* must come after git_config() */
> }

That init_diff_ui_defaults() should indeed have been before
git_config() from the beginning. My bad, I'm the one who
misplaced it apparently :-(.

> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -625,9 +625,6 @@ static void wt_status_collect_changes_index(struct 
> wt_status
> *s)
>   rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
>   rev.diffopt.format_callback = wt_status_collect_updated_cb;
>   rev.diffopt.format_callback_data = s;
> - rev.diffopt.detect_rename = DIFF_DETECT_RENAME;
> - rev.diffopt.rename_limit = 200;
> - rev.diffopt.break_opt = 0;

This "break_opt = 0" deserves a mention in the commit message
IMHO. I'm not 100% sure it's a good change actually.

break_opt is normally controlled by "-B/--break-rewrites".
I'm not sure why it was set to 0.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: multimail/p4: issues identified by lgtm.com

2018-03-19 Thread Matthieu Moy
Johannes Schindelin writes:

> Hi team,
>
> while Coverity was down (it still is not reachable for me, but I guess
> that's just because everybody and their dog wants to catch up on a month
> of work delayed by their outage), I tried to find alternatives, and one of
> them is lgtm.com. Their C/C++ support is not generally available yet, but
> there have been a couple of issues identified in our Python scripts:
>
> https://lgtm.com/projects/g/git/git/alerts/?mode=list&rule=py%2Fcall%2Fwrong-arguments%2Cpy%2Fcall%2Fwrong-named-argument%2Cpy%2Fexplicit-call-to-delete%2Cpy%2Finconsistent-equality%2Cpy%2Finheritance%2Fincorrect-overridden-signature%2Cpy%2Fmissing-call-to-init%2Cpy%2Fmultiple-definition%2Cpy%2Fredundant-else%2Cpy%2Funinitialized-local-variable%2Cpy%2Funreachable-statement
>
> From a cursory look, it *does* seem as if there are legitimate issues
> there that want to be fixed ;-) But then, I am far from a Python expert,
> so my impressions could be completely off the mark.

For git-multimail, there are actually a lot of false-positive due to the
way we use multiple-inheritance (when calling super(...).__init__(),
lgtm assumes we're calling the superclass's constructor while we
actually call the "next in the method resolution order list"), but also
probably valid concerns.

I've set up a separate project for git-multimail:

  https://lgtm.com/projects/g/git-multimail/git-multimail/alerts/?mode=list

I'll have a deeper look ASAP.

Thanks!

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH 5/8] perl: update our copy of Mail::Address

2018-02-15 Thread Matthieu Moy
"Jonathan Nieder"  wrote:

> Ævar Arnfjörð Bjarmason wrote:
>
> > Update our copy of Mail::Address from 2.19 (Aug 22, 2017) to 2.20 (Jan
> > 23, 2018). This should be a trivial update[1] but it seems the version
> > Matthieu Moy imported in bd869f67b9 ("send-email: add and use a local
> > copy of Mail::Address", 2018-01-05) doesn't correspond to any 2.19
> > version found on the CPAN. From the comment at the top of the file it
> > looks like some OS version with the POD stripped, and with different
> > indentation.
>
> Were there changes other than the POD stripping?

No.

I should have mentionned it in the commit message, but the one I took was
from:

  http://cpansearch.perl.org/src/MARKOV/MailTools-2.19/lib/Mail/Address.pm

i.e. following the "source" link from:

  http://search.cpan.org/~markov/MailTools-2.19/lib/Mail/Address.pod

The link name suggested it was the actual source code but indeed it's a
pre-processed file with the POD stripped.

It would make sense to indicate explicitly where this file is from in
this commit's message to avoid having the same discussion next time someone
upgrades the package.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: GSoC 2018 Org applications. Deadline = January 23, 2018 at 18:00 (CET)

2018-01-08 Thread Matthieu Moy
"Christian Couder"  wrote:

> Hi,
> 
> On Fri, Jan 5, 2018 at 12:18 PM, Johannes Schindelin
>  wrote:
>> Hi,
>>
>> On Fri, 5 Jan 2018, Matthieu Moy wrote:
>>
>>> If we go for it, we need:
>>>
>>> * Admins
>>>
>>> * Potential mentors
>>
>> Count me in as a potential mentor.
> 
> I am ok to be admin and mentor.

Cool :-)

In case you missed it: there's an iCal/Google calendar link on
the timeline page:

  https://developers.google.com/open-source/gsoc/timeline

If you use an electronic calendar, it's nice to add it to make
sure you never miss a deadline. The URL is the same every year, it's
how I got a reminder that the application was open.

-- 
Matthieu Moy
https://matthieu-moy.fr/


[PATCH v3 3/3] send-email: add test for Linux's get_maintainer.pl

2018-01-08 Thread Matthieu Moy
From: Alex Bennée 

We had a regression that broke Linux's get_maintainer.pl. Using
Mail::Address to parse email addresses fixed it, but let's protect
against future regressions.

Note that we need --cc-cmd to be relative because this option doesn't
accept spaces in script names (probably to allow --cc-cmd="executable
--option"), while --smtp-server needs to be absolute.

Patch-edited-by: Matthieu Moy 
Signed-off-by: Alex Bennée 
Signed-off-by: Matthieu Moy 
---
Change since v2:

* Mention relative Vs absolute path in commit message.

* Remove useless "chmod +x"

* Remove useless double quotes

 t/t9001-send-email.sh | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 4d261c2..a06e5d7 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -172,6 +172,25 @@ test_expect_success $PREREQ 'cc trailer with various 
syntax' '
test_cmp expected-cc commandline1
 '
 
+test_expect_success $PREREQ 'setup fake get_maintainer.pl script for cc 
trailer' "
+   write_script expected-cc-script.sh <<-EOF
+   echo 'One Person  (supporter:THIS (FOO/bar))'
+   echo 'Two Person  (maintainer:THIS THING)'
+   echo 'Third List  (moderated list:THIS THING 
(FOO/bar))'
+   echo ' (moderated list:FOR THING)'
+   echo 'f...@example.com (open list:FOR THING (FOO/bar))'
+   echo 's...@example.com (open list)'
+   EOF
+"
+
+test_expect_success $PREREQ 'cc trailer with get_maintainer.pl output' '
+   clean_fake_sendmail &&
+   git send-email -1 --to=recipi...@example.com \
+   --cc-cmd=./expected-cc-script.sh \
+   --smtp-server="$(pwd)/fake.sendmail" &&
+   test_cmp expected-cc commandline1
+'
+
 test_expect_success $PREREQ 'setup expect' "
 cat >expected-show-all-headers <<\EOF
 0001-Second.patch
-- 
2.7.4



[PATCH v3 2/3] Remove now useless email-address parsing code

2018-01-08 Thread Matthieu Moy
We now use Mail::Address unconditionaly, hence parse_mailboxes is now
dead code. Remove it and its tests.

Signed-off-by: Matthieu Moy 
---
No change since v2.

 perl/Git.pm  | 71 
 t/t9000-addresses.sh | 27 
 t/t9000/test.pl  | 67 -
 3 files changed, 165 deletions(-)
 delete mode 100755 t/t9000-addresses.sh
 delete mode 100755 t/t9000/test.pl

diff --git a/perl/Git.pm b/perl/Git.pm
index ffa09ac..65e6b32 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -880,77 +880,6 @@ sub ident_person {
return "$ident[0] <$ident[1]>";
 }
 
-=item parse_mailboxes
-
-Return an array of mailboxes extracted from a string.
-
-=cut
-
-# Very close to Mail::Address's parser, but we still have minor
-# differences in some cases (see t9000 for examples).
-sub parse_mailboxes {
-   my $re_comment = qr/\((?:[^)]*)\)/;
-   my $re_quote = qr/"(?:[^\"\\]|\\.)*"/;
-   my $re_word = qr/(?:[^]["\s()<>:;@\\,.]|\\.)+/;
-
-   # divide the string in tokens of the above form
-   my $re_token = qr/(?:$re_quote|$re_word|$re_comment|\S)/;
-   my @tokens = map { $_ =~ /\s*($re_token)\s*/g } @_;
-   my $end_of_addr_seen = 0;
-
-   # add a delimiter to simplify treatment for the last mailbox
-   push @tokens, ",";
-
-   my (@addr_list, @phrase, @address, @comment, @buffer) = ();
-   foreach my $token (@tokens) {
-   if ($token =~ /^[,;]$/) {
-   # if buffer still contains undeterminated strings
-   # append it at the end of @address or @phrase
-   if ($end_of_addr_seen) {
-   push @phrase, @buffer;
-   } else {
-   push @address, @buffer;
-   }
-
-   my $str_phrase = join ' ', @phrase;
-   my $str_address = join '', @address;
-   my $str_comment = join ' ', @comment;
-
-   # quote are necessary if phrase contains
-   # special characters
-   if ($str_phrase =~ /[][()<>:;@\\,.\000-\037\177]/) {
-   $str_phrase =~ s/(^|[^\\])"/$1/g;
-   $str_phrase = qq["$str_phrase"];
-   }
-
-   # add "<>" around the address if necessary
-   if ($str_address ne "" && $str_phrase ne "") {
-   $str_address = qq[<$str_address>];
-   }
-
-   my $str_mailbox = "$str_phrase $str_address 
$str_comment";
-   $str_mailbox =~ s/^\s*|\s*$//g;
-   push @addr_list, $str_mailbox if ($str_mailbox);
-
-   @phrase = @address = @comment = @buffer = ();
-   $end_of_addr_seen = 0;
-   } elsif ($token =~ /^\(/) {
-   push @comment, $token;
-   } elsif ($token eq "<") {
-   push @phrase, (splice @address), (splice @buffer);
-   } elsif ($token eq ">") {
-   $end_of_addr_seen = 1;
-   push @address, (splice @buffer);
-   } elsif ($token eq "@" && !$end_of_addr_seen) {
-   push @address, (splice @buffer), "@";
-   } else {
-   push @buffer, $token;
-   }
-   }
-
-   return @addr_list;
-}
-
 =item hash_object ( TYPE, FILENAME )
 
 Compute the SHA1 object id of the given C considering it is
diff --git a/t/t9000-addresses.sh b/t/t9000-addresses.sh
deleted file mode 100755
index a1ebef6..000
--- a/t/t9000-addresses.sh
+++ /dev/null
@@ -1,27 +0,0 @@
-#!/bin/sh
-
-test_description='compare address parsing with and without Mail::Address'
-. ./test-lib.sh
-
-if ! test_have_prereq PERL; then
-   skip_all='skipping perl interface tests, perl not available'
-   test_done
-fi
-
-perl -MTest::More -e 0 2>/dev/null || {
-   skip_all="Perl Test::More unavailable, skipping test"
-   test_done
-}
-
-perl -MMail::Address -e 0 2>/dev/null || {
-   skip_all="Perl Mail::Address unavailable, skipping test"
-   test_done
-}
-
-test_external_has_tap=1
-
-test_external_without_stderr \
-   'Perl address parsing function' \
-   perl "$TEST_DIRECTORY"/t9000/test.pl
-
-test_done
diff --git a/t/t9000/test.pl b/t/t9000/test.pl
deleted file mode 100755
index dfeaa9c..000
--- a/t/t9000/test.pl
+++ /dev/null
@@ -1,67 +0,0 @@
-#!/usr/bin/perl
-use lib (split(/:/, $ENV{GITPERLLIB}));

[PATCH v3 1/3] send-email: add and use a local copy of Mail::Address

2018-01-08 Thread Matthieu Moy
We used to have two versions of the email parsing code. Our
parse_mailboxes (in Git.pm), and Mail::Address which we used if
installed. Unfortunately, both versions have different sets of bugs, and
changing the behavior of git depending on whether Mail::Address is
installed was a bad idea.

A first attempt to solve this was cc90750 (send-email: don't use
Mail::Address, even if available, 2017-08-23), but it turns out our
parse_mailboxes is too buggy for some uses. For example the lack of
nested comments support breaks get_maintainer.pl in the Linux kernel
tree:

  https://public-inbox.org/git/20171116154814.23785-1-alex.ben...@linaro.org/

This patch goes the other way: use Mail::Address anyway, but have a
local copy from CPAN as a fallback, when the system one is not
available.

The duplicated script is small (276 lines of code) and stable in time.
Maintaining the local copy should not be an issue, and will certainly be
less burden than maintaining our own parse_mailboxes.

Another option would be to consider Mail::Address as a hard dependency,
but it's easy enough to save the trouble of extra-dependency to the end
user or packager.

Signed-off-by: Matthieu Moy 
---
No change since v2.

 git-send-email.perl   |   3 +-
 perl/Git/FromCPAN/Mail/Address.pm | 276 ++
 perl/Git/Mail/Address.pm  |  24 
 3 files changed, 302 insertions(+), 1 deletion(-)
 create mode 100644 perl/Git/FromCPAN/Mail/Address.pm
 create mode 100755 perl/Git/Mail/Address.pm

diff --git a/git-send-email.perl b/git-send-email.perl
index edcc6d3..340b5c8 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -30,6 +30,7 @@ use Error qw(:try);
 use Cwd qw(abs_path cwd);
 use Git;
 use Git::I18N;
+use Git::Mail::Address;
 
 Getopt::Long::Configure qw/ pass_through /;
 
@@ -489,7 +490,7 @@ my ($repoauthor, $repocommitter);
 ($repocommitter) = Git::ident_person(@repo, 'committer');
 
 sub parse_address_line {
-   return Git::parse_mailboxes($_[0]);
+   return map { $_->format } Mail::Address->parse($_[0]);
 }
 
 sub split_addrs {
diff --git a/perl/Git/FromCPAN/Mail/Address.pm 
b/perl/Git/FromCPAN/Mail/Address.pm
new file mode 100644
index 000..13b2ff7
--- /dev/null
+++ b/perl/Git/FromCPAN/Mail/Address.pm
@@ -0,0 +1,276 @@
+# Copyrights 1995-2017 by [Mark Overmeer ].
+#  For other contributors see ChangeLog.
+# See the manual pages for details on the licensing terms.
+# Pod stripped from pm file by OODoc 2.02.
+package Mail::Address;
+use vars '$VERSION';
+$VERSION = '2.19';
+
+use strict;
+
+use Carp;
+
+# use locale;   removed in version 1.78, because it causes taint problems
+
+sub Version { our $VERSION }
+
+
+
+# given a comment, attempt to extract a person's name
+sub _extract_name
+{   # This function can be called as method as well
+my $self = @_ && ref $_[0] ? shift : undef;
+
+local $_ = shift
+or return '';
+
+# Using encodings, too hard. See Mail::Message::Field::Full.
+return '' if m/\=\?.*?\?\=/;
+
+# trim whitespace
+s/^\s+//;
+s/\s+$//;
+s/\s+/ /;
+
+# Disregard numeric names (e.g. 123456.1...@compuserve.com)
+return "" if /^[\d ]+$/;
+
+s/^\((.*)\)$/$1/; # remove outermost parenthesis
+s/^"(.*)"$/$1/;   # remove outer quotation marks
+s/\(.*?\)//g; # remove minimal embedded comments
+s/\\//g;  # remove all escapes
+s/^"(.*)"$/$1/;   # remove internal quotation marks
+s/^([^\s]+) ?, ?(.*)$/$2 $1/; # reverse "Last, First M." if applicable
+s/,.*//;
+
+# Change casing only when the name contains only upper or only
+# lower cased characters.
+unless( m/[A-Z]/ && m/[a-z]/ )
+{   # Set the case of the name to first char upper rest lower
+s/\b(\w+)/\L\u$1/igo;  # Upcase first letter on name
+s/\bMc(\w)/Mc\u$1/igo; # Scottish names such as 'McLeod'
+s/\bo'(\w)/O'\u$1/igo; # Irish names such as 'O'Malley, O'Reilly'
+s/\b(x*(ix)?v*(iv)?i*)\b/\U$1/igo; # Roman numerals, eg 'Level III 
Support'
+}
+
+# some cleanup
+s/\[[^\]]*\]//g;
+s/(^[\s'"]+|[\s'"]+$)//g;
+s/\s{2,}/ /g;
+
+$_;
+}
+
+sub _tokenise
+{   local $_ = join ',', @_;
+my (@words,$snippet,$field);
+
+s/\A\s+//;
+s/[\r\n]+/ /g;
+
+while ($_ ne '')
+{   $field = '';
+if(s/^\s*\(/(/ )# (...)
+{   my $depth = 0;
+
+ PAREN: while(s/^(\(([^\(\)\\]|\\.)*)//)
+{   $field .= $1;
+$depth++;
+while(s/^(([^\(\)\\]|\\.)*\)\s*)//)
+{   $field .= $1;
+last PAREN unless --$depth;
+   $field .= $1 if s/^(([^\(\)\\]|\\.)+)//;
+}
+}
+
+carp &quo

Re: [PATCH v2 3/3] send-email: add test for Linux's get_maintainer.pl

2018-01-08 Thread Matthieu Moy
Eric Sunshine  writes:

> On Fri, Jan 5, 2018 at 1:36 PM, Matthieu Moy  wrote:
>> From: Alex Bennée 
>>
>> We had a regression that broke Linux's get_maintainer.pl. Using
>> Mail::Address to parse email addresses fixed it, but let's protect
>> against future regressions.
>>
>> Patch-edited-by: Matthieu Moy 
>> Signed-off-by: Alex Bennée 
>> Signed-off-by: Matthieu Moy 
>> ---
>> diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
>> @@ -172,6 +172,26 @@ test_expect_success $PREREQ 'cc trailer with various 
>> syntax' '
>> +test_expect_success $PREREQ 'setup fake get_maintainer.pl script for cc 
>> trailer' "
>> +   write_script expected-cc-script.sh <<-EOF &&
>> +   echo 'One Person  (supporter:THIS (FOO/bar))'
>> +   echo 'Two Person  (maintainer:THIS THING)'
>> +   echo 'Third List  (moderated list:THIS THING 
>> (FOO/bar))'
>> +   echo ' (moderated list:FOR THING)'
>> +   echo 'f...@example.com (open list:FOR THING (FOO/bar))'
>> +   echo 's...@example.com (open list)'
>> +   EOF
>> +   chmod +x expected-cc-script.sh
>> +"
>> +
>> +test_expect_success $PREREQ 'cc trailer with get_maintainer.pl output' '
>> +   clean_fake_sendmail &&
>> +   git send-email -1 --to=recipi...@example.com \
>> +   --cc-cmd="./expected-cc-script.sh" \
>> +   --smtp-server="$(pwd)/fake.sendmail" &&
>
> Aside from the unnecessary (thus noisy) quotes around the --cc-cmd

Indeed, removed.

> value, my one concern is that someone may come along and want to
> "normalize" it to --cc-cmd="$(pwd)/expected-cc-script.sh" for
> consistency with the following --smtp-server line. This worry is
> compounded by the commit message not explaining why these two lines
> differ (one using "./" and one using "$(pwd)/").

Added a note in the commit message.

> An alternative would be to insert a cleanup/modernization
> patch before this one which changes all the "$(pwd)/" to "./",

For --smtp-server, doing so results in a failing tests. I didn't
investigate on why.

> although you'd still want to explain why that's being done (to wit:
> because --cc-cmd behavior with spaces is not well defined). Or,
> perhaps this isn't an issue and my worry is not justified (after all,
> the test will break if someone changes the "./" to "$(pwd)/").

Also, the existing code is written like this: --cc-cmd is always
relative, --stmp-server is always absolute, including when they're used
in the same command:

test_suppress_self () {
[...]
git send-email --from="$1 <$2>" \
--to=nob...@example.com \
--cc-cmd=./cccmd-sed \
--suppress-cc=self \
--smtp-server="$(pwd)/fake.sendmail" \

Thanks for your careful review,

-- 
Matthieu Moy
https://matthieu-moy.fr/


[PATCH v2 2/3] Remove now useless email-address parsing code

2018-01-05 Thread Matthieu Moy
We now use Mail::Address unconditionaly, hence parse_mailboxes is now
dead code. Remove it and its tests.

Signed-off-by: Matthieu Moy 
---
 perl/Git.pm  | 71 
 t/t9000-addresses.sh | 27 
 t/t9000/test.pl  | 67 -
 3 files changed, 165 deletions(-)
 delete mode 100755 t/t9000-addresses.sh
 delete mode 100755 t/t9000/test.pl

diff --git a/perl/Git.pm b/perl/Git.pm
index ffa09ac..65e6b32 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -880,77 +880,6 @@ sub ident_person {
return "$ident[0] <$ident[1]>";
 }
 
-=item parse_mailboxes
-
-Return an array of mailboxes extracted from a string.
-
-=cut
-
-# Very close to Mail::Address's parser, but we still have minor
-# differences in some cases (see t9000 for examples).
-sub parse_mailboxes {
-   my $re_comment = qr/\((?:[^)]*)\)/;
-   my $re_quote = qr/"(?:[^\"\\]|\\.)*"/;
-   my $re_word = qr/(?:[^]["\s()<>:;@\\,.]|\\.)+/;
-
-   # divide the string in tokens of the above form
-   my $re_token = qr/(?:$re_quote|$re_word|$re_comment|\S)/;
-   my @tokens = map { $_ =~ /\s*($re_token)\s*/g } @_;
-   my $end_of_addr_seen = 0;
-
-   # add a delimiter to simplify treatment for the last mailbox
-   push @tokens, ",";
-
-   my (@addr_list, @phrase, @address, @comment, @buffer) = ();
-   foreach my $token (@tokens) {
-   if ($token =~ /^[,;]$/) {
-   # if buffer still contains undeterminated strings
-   # append it at the end of @address or @phrase
-   if ($end_of_addr_seen) {
-   push @phrase, @buffer;
-   } else {
-   push @address, @buffer;
-   }
-
-   my $str_phrase = join ' ', @phrase;
-   my $str_address = join '', @address;
-   my $str_comment = join ' ', @comment;
-
-   # quote are necessary if phrase contains
-   # special characters
-   if ($str_phrase =~ /[][()<>:;@\\,.\000-\037\177]/) {
-   $str_phrase =~ s/(^|[^\\])"/$1/g;
-   $str_phrase = qq["$str_phrase"];
-   }
-
-   # add "<>" around the address if necessary
-   if ($str_address ne "" && $str_phrase ne "") {
-   $str_address = qq[<$str_address>];
-   }
-
-   my $str_mailbox = "$str_phrase $str_address 
$str_comment";
-   $str_mailbox =~ s/^\s*|\s*$//g;
-   push @addr_list, $str_mailbox if ($str_mailbox);
-
-   @phrase = @address = @comment = @buffer = ();
-   $end_of_addr_seen = 0;
-   } elsif ($token =~ /^\(/) {
-   push @comment, $token;
-   } elsif ($token eq "<") {
-   push @phrase, (splice @address), (splice @buffer);
-   } elsif ($token eq ">") {
-   $end_of_addr_seen = 1;
-   push @address, (splice @buffer);
-   } elsif ($token eq "@" && !$end_of_addr_seen) {
-   push @address, (splice @buffer), "@";
-   } else {
-   push @buffer, $token;
-   }
-   }
-
-   return @addr_list;
-}
-
 =item hash_object ( TYPE, FILENAME )
 
 Compute the SHA1 object id of the given C considering it is
diff --git a/t/t9000-addresses.sh b/t/t9000-addresses.sh
deleted file mode 100755
index a1ebef6..000
--- a/t/t9000-addresses.sh
+++ /dev/null
@@ -1,27 +0,0 @@
-#!/bin/sh
-
-test_description='compare address parsing with and without Mail::Address'
-. ./test-lib.sh
-
-if ! test_have_prereq PERL; then
-   skip_all='skipping perl interface tests, perl not available'
-   test_done
-fi
-
-perl -MTest::More -e 0 2>/dev/null || {
-   skip_all="Perl Test::More unavailable, skipping test"
-   test_done
-}
-
-perl -MMail::Address -e 0 2>/dev/null || {
-   skip_all="Perl Mail::Address unavailable, skipping test"
-   test_done
-}
-
-test_external_has_tap=1
-
-test_external_without_stderr \
-   'Perl address parsing function' \
-   perl "$TEST_DIRECTORY"/t9000/test.pl
-
-test_done
diff --git a/t/t9000/test.pl b/t/t9000/test.pl
deleted file mode 100755
index dfeaa9c..000
--- a/t/t9000/test.pl
+++ /dev/null
@@ -1,67 +0,0 @@
-#!/usr/bin/perl
-use lib (split(/:/, $ENV{GITPERLLIB}));
-
-use

[PATCH v2 3/3] send-email: add test for Linux's get_maintainer.pl

2018-01-05 Thread Matthieu Moy
From: Alex Bennée 

We had a regression that broke Linux's get_maintainer.pl. Using
Mail::Address to parse email addresses fixed it, but let's protect
against future regressions.

Patch-edited-by: Matthieu Moy 
Signed-off-by: Alex Bennée 
Signed-off-by: Matthieu Moy 
---
Change since v1: fixed proposed by Eric Sunshine and pointed out by
Alex Bennée.

Eric pointed out that using --cc-cmd=$(pwd)/expected-cc-script.sh did
not work because $(pwd) had spaces in it, but I already turned it into
./expected-cc-script.sh.

 t/t9001-send-email.sh | 20 
 1 file changed, 20 insertions(+)

diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 4d261c2..d13d8c3 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -172,6 +172,26 @@ test_expect_success $PREREQ 'cc trailer with various 
syntax' '
test_cmp expected-cc commandline1
 '
 
+test_expect_success $PREREQ 'setup fake get_maintainer.pl script for cc 
trailer' "
+   write_script expected-cc-script.sh <<-EOF &&
+   echo 'One Person  (supporter:THIS (FOO/bar))'
+   echo 'Two Person  (maintainer:THIS THING)'
+   echo 'Third List  (moderated list:THIS THING 
(FOO/bar))'
+   echo ' (moderated list:FOR THING)'
+   echo 'f...@example.com (open list:FOR THING (FOO/bar))'
+   echo 's...@example.com (open list)'
+   EOF
+   chmod +x expected-cc-script.sh
+"
+
+test_expect_success $PREREQ 'cc trailer with get_maintainer.pl output' '
+   clean_fake_sendmail &&
+   git send-email -1 --to=recipi...@example.com \
+   --cc-cmd="./expected-cc-script.sh" \
+   --smtp-server="$(pwd)/fake.sendmail" &&
+   test_cmp expected-cc commandline1
+'
+
 test_expect_success $PREREQ 'setup expect' "
 cat >expected-show-all-headers <<\EOF
 0001-Second.patch
-- 
2.7.4



[PATCH v2 1/3] send-email: add and use a local copy of Mail::Address

2018-01-05 Thread Matthieu Moy
We used to have two versions of the email parsing code. Our
parse_mailboxes (in Git.pm), and Mail::Address which we used if
installed. Unfortunately, both versions have different sets of bugs, and
changing the behavior of git depending on whether Mail::Address is
installed was a bad idea.

A first attempt to solve this was cc90750 (send-email: don't use
Mail::Address, even if available, 2017-08-23), but it turns out our
parse_mailboxes is too buggy for some uses. For example the lack of
nested comments support breaks get_maintainer.pl in the Linux kernel
tree:

  https://public-inbox.org/git/20171116154814.23785-1-alex.ben...@linaro.org/

This patch goes the other way: use Mail::Address anyway, but have a
local copy from CPAN as a fallback, when the system one is not
available.

The duplicated script is small (276 lines of code) and stable in time.
Maintaining the local copy should not be an issue, and will certainly be
less burden than maintaining our own parse_mailboxes.

Another option would be to consider Mail::Address as a hard dependency,
but it's easy enough to save the trouble of extra-dependency to the end
user or packager.

Signed-off-by: Matthieu Moy 
---
Change since v1: just reworded the commit message.

 git-send-email.perl   |   3 +-
 perl/Git/FromCPAN/Mail/Address.pm | 276 ++
 perl/Git/Mail/Address.pm  |  24 
 3 files changed, 302 insertions(+), 1 deletion(-)
 create mode 100644 perl/Git/FromCPAN/Mail/Address.pm
 create mode 100755 perl/Git/Mail/Address.pm

diff --git a/git-send-email.perl b/git-send-email.perl
index edcc6d3..340b5c8 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -30,6 +30,7 @@ use Error qw(:try);
 use Cwd qw(abs_path cwd);
 use Git;
 use Git::I18N;
+use Git::Mail::Address;
 
 Getopt::Long::Configure qw/ pass_through /;
 
@@ -489,7 +490,7 @@ my ($repoauthor, $repocommitter);
 ($repocommitter) = Git::ident_person(@repo, 'committer');
 
 sub parse_address_line {
-   return Git::parse_mailboxes($_[0]);
+   return map { $_->format } Mail::Address->parse($_[0]);
 }
 
 sub split_addrs {
diff --git a/perl/Git/FromCPAN/Mail/Address.pm 
b/perl/Git/FromCPAN/Mail/Address.pm
new file mode 100644
index 000..13b2ff7
--- /dev/null
+++ b/perl/Git/FromCPAN/Mail/Address.pm
@@ -0,0 +1,276 @@
+# Copyrights 1995-2017 by [Mark Overmeer ].
+#  For other contributors see ChangeLog.
+# See the manual pages for details on the licensing terms.
+# Pod stripped from pm file by OODoc 2.02.
+package Mail::Address;
+use vars '$VERSION';
+$VERSION = '2.19';
+
+use strict;
+
+use Carp;
+
+# use locale;   removed in version 1.78, because it causes taint problems
+
+sub Version { our $VERSION }
+
+
+
+# given a comment, attempt to extract a person's name
+sub _extract_name
+{   # This function can be called as method as well
+my $self = @_ && ref $_[0] ? shift : undef;
+
+local $_ = shift
+or return '';
+
+# Using encodings, too hard. See Mail::Message::Field::Full.
+return '' if m/\=\?.*?\?\=/;
+
+# trim whitespace
+s/^\s+//;
+s/\s+$//;
+s/\s+/ /;
+
+# Disregard numeric names (e.g. 123456.1...@compuserve.com)
+return "" if /^[\d ]+$/;
+
+s/^\((.*)\)$/$1/; # remove outermost parenthesis
+s/^"(.*)"$/$1/;   # remove outer quotation marks
+s/\(.*?\)//g; # remove minimal embedded comments
+s/\\//g;  # remove all escapes
+s/^"(.*)"$/$1/;   # remove internal quotation marks
+s/^([^\s]+) ?, ?(.*)$/$2 $1/; # reverse "Last, First M." if applicable
+s/,.*//;
+
+# Change casing only when the name contains only upper or only
+# lower cased characters.
+unless( m/[A-Z]/ && m/[a-z]/ )
+{   # Set the case of the name to first char upper rest lower
+s/\b(\w+)/\L\u$1/igo;  # Upcase first letter on name
+s/\bMc(\w)/Mc\u$1/igo; # Scottish names such as 'McLeod'
+s/\bo'(\w)/O'\u$1/igo; # Irish names such as 'O'Malley, O'Reilly'
+s/\b(x*(ix)?v*(iv)?i*)\b/\U$1/igo; # Roman numerals, eg 'Level III 
Support'
+}
+
+# some cleanup
+s/\[[^\]]*\]//g;
+s/(^[\s'"]+|[\s'"]+$)//g;
+s/\s{2,}/ /g;
+
+$_;
+}
+
+sub _tokenise
+{   local $_ = join ',', @_;
+my (@words,$snippet,$field);
+
+s/\A\s+//;
+s/[\r\n]+/ /g;
+
+while ($_ ne '')
+{   $field = '';
+if(s/^\s*\(/(/ )# (...)
+{   my $depth = 0;
+
+ PAREN: while(s/^(\(([^\(\)\\]|\\.)*)//)
+{   $field .= $1;
+$depth++;
+while(s/^(([^\(\)\\]|\\.)*\)\s*)//)
+{   $field .= $1;
+last PAREN unless --$depth;
+   $field .= $1 if s/^(([^\(\)\\]|\\.)+)//;
+}
+  

Re: [PATCH] send-email: add test for Linux's get_maintainer.pl

2018-01-05 Thread Matthieu Moy
Alex Bennée  writes:

> I think you need to apply Eric's suggestions from:
>
>   From: Eric Sunshine 
>   Date: Sat, 18 Nov 2017 21:54:46 -0500
>   Message-ID: 
> 

Indeed. I'm squashing this into the patch:

diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index f126177..d13d8c3 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -173,8 +173,7 @@ test_expect_success $PREREQ 'cc trailer with various 
syntax' '
 '
 
 test_expect_success $PREREQ 'setup fake get_maintainer.pl script for cc 
trailer' "
-   cat >expected-cc-script.sh <<-EOF &&
-   #!/bin/sh
+   write_script expected-cc-script.sh <<-EOF &&
echo 'One Person  (supporter:THIS (FOO/bar))'
echo 'Two Person  (maintainer:THIS THING)'
echo 'Third List  (moderated list:THIS THING 
(FOO/bar))'
@@ -186,7 +185,6 @@ test_expect_success $PREREQ 'setup fake get_maintainer.pl 
script for cc trailer'
 "
 
 test_expect_success $PREREQ 'cc trailer with get_maintainer.pl output' '
-   test_commit cc-trailer-getmaint &&
clean_fake_sendmail &&
git send-email -1 --to=recipi...@example.com \
--cc-cmd="./expected-cc-script.sh" \


-- 
Matthieu Moy
https://matthieu-moy.fr/


[PATCH] send-email: add test for Linux's get_maintainer.pl

2018-01-05 Thread Matthieu Moy
From: Alex Bennée 

We had a regression that broke Linux's get_maintainer.pl. Using
Mail::Address to parse email addresses fixed it, but let's protect
against future regressions.

Patch-edited-by: Matthieu Moy 
Signed-off-by: Alex Bennée 
Signed-off-by: Matthieu Moy 
---
 t/t9001-send-email.sh | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 4d261c2..f126177 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -172,6 +172,28 @@ test_expect_success $PREREQ 'cc trailer with various 
syntax' '
test_cmp expected-cc commandline1
 '
 
+test_expect_success $PREREQ 'setup fake get_maintainer.pl script for cc 
trailer' "
+   cat >expected-cc-script.sh <<-EOF &&
+   #!/bin/sh
+   echo 'One Person  (supporter:THIS (FOO/bar))'
+   echo 'Two Person  (maintainer:THIS THING)'
+   echo 'Third List  (moderated list:THIS THING 
(FOO/bar))'
+   echo ' (moderated list:FOR THING)'
+   echo 'f...@example.com (open list:FOR THING (FOO/bar))'
+   echo 's...@example.com (open list)'
+   EOF
+   chmod +x expected-cc-script.sh
+"
+
+test_expect_success $PREREQ 'cc trailer with get_maintainer.pl output' '
+   test_commit cc-trailer-getmaint &&
+   clean_fake_sendmail &&
+   git send-email -1 --to=recipi...@example.com \
+   --cc-cmd="./expected-cc-script.sh" \
+   --smtp-server="$(pwd)/fake.sendmail" &&
+   test_cmp expected-cc commandline1
+'
+
 test_expect_success $PREREQ 'setup expect' "
 cat >expected-show-all-headers <<\EOF
 0001-Second.patch
-- 
2.7.4



Re: [RFC PATCH 2/2] Remove now useless email-address parsing code

2018-01-05 Thread Matthieu Moy
Alex Bennée  writes:

> Matthieu Moy  writes:
>
>> We now use Mail::Address unconditionaly, hence parse_mailboxes is now
>> dead code. Remove it and its tests.
>>
>> Signed-off-by: Matthieu Moy 
>> ---
>>  perl/Git.pm  | 71 
>> 
>>  t/t9000-addresses.sh | 27 
>>  t/t9000/test.pl  | 67 -
>>  3 files changed, 165 deletions(-)
>>  delete mode 100755 t/t9000-addresses.sh
>>  delete mode 100755 t/t9000/test.pl
>
> Should we add the tests for t9001-send-email.sh to guard against regressions?

Tests in t9001 were only useful with our parse_mailboxes (they were just
comparing parse_mailboxes and Mail::Address), so there's no point
keeping them after we delete parse_mailboxes.

Your added tests from
https://public-inbox.org/git/20171116154814.23785-1-alex.ben...@linaro.org
would make sense OTOH. Not breaking Linux's flow is a nice thing to
do ... Patch doing this follows (I'll resend the whole series with
Eric's nit later).

-- 
Matthieu Moy
https://matthieu-moy.fr/


[RFC PATCH 2/2] Remove now useless email-address parsing code

2018-01-04 Thread Matthieu Moy
We now use Mail::Address unconditionaly, hence parse_mailboxes is now
dead code. Remove it and its tests.

Signed-off-by: Matthieu Moy 
---
 perl/Git.pm  | 71 
 t/t9000-addresses.sh | 27 
 t/t9000/test.pl  | 67 -
 3 files changed, 165 deletions(-)
 delete mode 100755 t/t9000-addresses.sh
 delete mode 100755 t/t9000/test.pl

diff --git a/perl/Git.pm b/perl/Git.pm
index 02a3871..9d60d79 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -880,77 +880,6 @@ sub ident_person {
return "$ident[0] <$ident[1]>";
 }
 
-=item parse_mailboxes
-
-Return an array of mailboxes extracted from a string.
-
-=cut
-
-# Very close to Mail::Address's parser, but we still have minor
-# differences in some cases (see t9000 for examples).
-sub parse_mailboxes {
-   my $re_comment = qr/\((?:[^)]*)\)/;
-   my $re_quote = qr/"(?:[^\"\\]|\\.)*"/;
-   my $re_word = qr/(?:[^]["\s()<>:;@\\,.]|\\.)+/;
-
-   # divide the string in tokens of the above form
-   my $re_token = qr/(?:$re_quote|$re_word|$re_comment|\S)/;
-   my @tokens = map { $_ =~ /\s*($re_token)\s*/g } @_;
-   my $end_of_addr_seen = 0;
-
-   # add a delimiter to simplify treatment for the last mailbox
-   push @tokens, ",";
-
-   my (@addr_list, @phrase, @address, @comment, @buffer) = ();
-   foreach my $token (@tokens) {
-   if ($token =~ /^[,;]$/) {
-   # if buffer still contains undeterminated strings
-   # append it at the end of @address or @phrase
-   if ($end_of_addr_seen) {
-   push @phrase, @buffer;
-   } else {
-   push @address, @buffer;
-   }
-
-   my $str_phrase = join ' ', @phrase;
-   my $str_address = join '', @address;
-   my $str_comment = join ' ', @comment;
-
-   # quote are necessary if phrase contains
-   # special characters
-   if ($str_phrase =~ /[][()<>:;@\\,.\000-\037\177]/) {
-   $str_phrase =~ s/(^|[^\\])"/$1/g;
-   $str_phrase = qq["$str_phrase"];
-   }
-
-   # add "<>" around the address if necessary
-   if ($str_address ne "" && $str_phrase ne "") {
-   $str_address = qq[<$str_address>];
-   }
-
-   my $str_mailbox = "$str_phrase $str_address 
$str_comment";
-   $str_mailbox =~ s/^\s*|\s*$//g;
-   push @addr_list, $str_mailbox if ($str_mailbox);
-
-   @phrase = @address = @comment = @buffer = ();
-   $end_of_addr_seen = 0;
-   } elsif ($token =~ /^\(/) {
-   push @comment, $token;
-   } elsif ($token eq "<") {
-   push @phrase, (splice @address), (splice @buffer);
-   } elsif ($token eq ">") {
-   $end_of_addr_seen = 1;
-   push @address, (splice @buffer);
-   } elsif ($token eq "@" && !$end_of_addr_seen) {
-   push @address, (splice @buffer), "@";
-   } else {
-   push @buffer, $token;
-   }
-   }
-
-   return @addr_list;
-}
-
 =item hash_object ( TYPE, FILENAME )
 
 Compute the SHA1 object id of the given C considering it is
diff --git a/t/t9000-addresses.sh b/t/t9000-addresses.sh
deleted file mode 100755
index a1ebef6..000
--- a/t/t9000-addresses.sh
+++ /dev/null
@@ -1,27 +0,0 @@
-#!/bin/sh
-
-test_description='compare address parsing with and without Mail::Address'
-. ./test-lib.sh
-
-if ! test_have_prereq PERL; then
-   skip_all='skipping perl interface tests, perl not available'
-   test_done
-fi
-
-perl -MTest::More -e 0 2>/dev/null || {
-   skip_all="Perl Test::More unavailable, skipping test"
-   test_done
-}
-
-perl -MMail::Address -e 0 2>/dev/null || {
-   skip_all="Perl Mail::Address unavailable, skipping test"
-   test_done
-}
-
-test_external_has_tap=1
-
-test_external_without_stderr \
-   'Perl address parsing function' \
-   perl "$TEST_DIRECTORY"/t9000/test.pl
-
-test_done
diff --git a/t/t9000/test.pl b/t/t9000/test.pl
deleted file mode 100755
index dfeaa9c..000
--- a/t/t9000/test.pl
+++ /dev/null
@@ -1,67 +0,0 @@
-#!/usr/bin/perl
-use lib (split(/:/, $ENV{GITPERLLIB}));
-
-use

[RFC PATCH 1/2] add a local copy of Mail::Address from CPAN

2018-01-04 Thread Matthieu Moy
We used to have two versions of the email parsing code. Our
parse_mailboxes (in Git.pm), and Mail::Address which we used if
installed. Unfortunately, both versions have different sets of bugs, and
changing the behavior of git depending on whether Mail::Address is
installed was a bad idea.

A first attempt to solve this was cc90750 (send-email: don't use
Mail::Address, even if available, 2017-08-23), but it turns out our
parse_mailboxes is too buggy for some uses. For example the lack of
about nested comments support breaks get_maintainer.pl in the Linux
kernel tree:

  https://public-inbox.org/git/20171116154814.23785-1-alex.ben...@linaro.org/

This patch goes the other way: use Mail::Address anyway, but have a
local copy as a fallback, when the system one is not available.

The duplicated script is small (276 lines of code) and stable in time.
Maintaining the local copy should not be an issue, and will certainly be
less burden than maintaining our own parse_mailboxes.

Another option would be to consider Mail::Address as a hard dependency,
but it's easy enough to save the trouble of extra-dependency to the end
user or packager.

Signed-off-by: Matthieu Moy 
---
I looked at the perl/Git/Error.pm wrapper, and ended up writting a
different, much simpler version. I'm not sure the same approach would
apply to Error.pm, but my straightforward version does the job for
Mail/Address.pm.

I would also be fine with using our local copy unconditionaly.

 git-send-email.perl   |   3 +-
 perl/Git/FromCPAN/Mail/Address.pm | 276 ++
 perl/Git/Mail/Address.pm  |  24 
 3 files changed, 302 insertions(+), 1 deletion(-)
 create mode 100644 perl/Git/FromCPAN/Mail/Address.pm
 create mode 100755 perl/Git/Mail/Address.pm

diff --git a/git-send-email.perl b/git-send-email.perl
index 02747b6..d0dcc6d 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -30,6 +30,7 @@ use Git::Error qw(:try);
 use Cwd qw(abs_path cwd);
 use Git;
 use Git::I18N;
+use Git::Mail::Address;
 
 Getopt::Long::Configure qw/ pass_through /;
 
@@ -489,7 +490,7 @@ my ($repoauthor, $repocommitter);
 ($repocommitter) = Git::ident_person(@repo, 'committer');
 
 sub parse_address_line {
-   return Git::parse_mailboxes($_[0]);
+   return map { $_->format } Mail::Address->parse($_[0]);
 }
 
 sub split_addrs {
diff --git a/perl/Git/FromCPAN/Mail/Address.pm 
b/perl/Git/FromCPAN/Mail/Address.pm
new file mode 100644
index 000..13b2ff7
--- /dev/null
+++ b/perl/Git/FromCPAN/Mail/Address.pm
@@ -0,0 +1,276 @@
+# Copyrights 1995-2017 by [Mark Overmeer ].
+#  For other contributors see ChangeLog.
+# See the manual pages for details on the licensing terms.
+# Pod stripped from pm file by OODoc 2.02.
+package Mail::Address;
+use vars '$VERSION';
+$VERSION = '2.19';
+
+use strict;
+
+use Carp;
+
+# use locale;   removed in version 1.78, because it causes taint problems
+
+sub Version { our $VERSION }
+
+
+
+# given a comment, attempt to extract a person's name
+sub _extract_name
+{   # This function can be called as method as well
+my $self = @_ && ref $_[0] ? shift : undef;
+
+local $_ = shift
+or return '';
+
+# Using encodings, too hard. See Mail::Message::Field::Full.
+return '' if m/\=\?.*?\?\=/;
+
+# trim whitespace
+s/^\s+//;
+s/\s+$//;
+s/\s+/ /;
+
+# Disregard numeric names (e.g. 123456.1...@compuserve.com)
+return "" if /^[\d ]+$/;
+
+s/^\((.*)\)$/$1/; # remove outermost parenthesis
+s/^"(.*)"$/$1/;   # remove outer quotation marks
+s/\(.*?\)//g; # remove minimal embedded comments
+s/\\//g;  # remove all escapes
+s/^"(.*)"$/$1/;   # remove internal quotation marks
+s/^([^\s]+) ?, ?(.*)$/$2 $1/; # reverse "Last, First M." if applicable
+s/,.*//;
+
+# Change casing only when the name contains only upper or only
+# lower cased characters.
+unless( m/[A-Z]/ && m/[a-z]/ )
+{   # Set the case of the name to first char upper rest lower
+s/\b(\w+)/\L\u$1/igo;  # Upcase first letter on name
+s/\bMc(\w)/Mc\u$1/igo; # Scottish names such as 'McLeod'
+s/\bo'(\w)/O'\u$1/igo; # Irish names such as 'O'Malley, O'Reilly'
+s/\b(x*(ix)?v*(iv)?i*)\b/\U$1/igo; # Roman numerals, eg 'Level III 
Support'
+}
+
+# some cleanup
+s/\[[^\]]*\]//g;
+s/(^[\s'"]+|[\s'"]+$)//g;
+s/\s{2,}/ /g;
+
+$_;
+}
+
+sub _tokenise
+{   local $_ = join ',', @_;
+my (@words,$snippet,$field);
+
+s/\A\s+//;
+s/[\r\n]+/ /g;
+
+while ($_ ne '')
+{   $field = '';
+if(s/^\s*\(/(/ )# (...)
+{   my $depth = 0;
+
+ PAREN: while(s/^(\(([^\(\)\\]|\\.)*)//)

Re: [PATCH v2] doc: add triangular workflow

2017-12-15 Thread Matthieu Moy
ALBERTIN TIMOTHEE p1514771  writes:

>>>> +This workflow is commonly used on different platforms like BitBucket,
>>>> +GitHub or GitLab which provide a dedicated mechanism for requesting 
>>>> merges.
>>>
>>> As a user, I find it terribly frustrating when reading documentation
>>> telling me that "there's a dedicated mechanism" for something without
>>> telling me actually how to do it.
>
>>Also I think the description is backwards from end-user's point of
>>view.  It's not that it is common to use the workflow on these
>>hosting sites.  It's more like people use the workflow and adopt use
>>of these hosting sites as one building block of the workflow.
>
> I'm wondering if this sentence is really useful. It's not essential
> and it will take lot of time and space to talk about it properly.
> So, if you agree, we'll erase it.

I think it is useful. My guess is that most people don't know the
wording "triangular workflow", but most people know about GitHub for
example. See e.g.
https://trends.google.com/trends/explore?q=%22triangular%20workflow%22,github,gitlab,bitbucket

So the hope here is that the reader reading this feels "Ah, OK, this is
about how to do pull-requests on GitHub". OTOH, I wouldn't like a Git
official documentation to be biaised towards a single hosting site.

> In the case of a triangular workflow, if the project already exists,
> PUBLISH will already exist too.

No.

If the project already exists, then UPSTREAM exists, and the contributor
will create his or her own PUBLISH. There's generally two ways to do it:

* On platforms supporting this, use the platform's mechanism to create a
  fork (e.g. fork button on GitHub/GitLab's web UI).

* One can create an empty PUBLISH, clone UPSTREAM, and push to PUBLISH.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH v2] doc: add triangular workflow

2017-12-15 Thread Matthieu Moy
ALBERTIN TIMOTHEE p1514771  writes:

>>> +Example with master as :
>>> +===
>>> +* `git config branch.master.remote upstream`
>>> +* `git config branch.master.pushRemote origin`
>>> +===
>
>>origin is the remote you've cloned from. From the text above, I guess
>>you meant it to point to PUBLISH. But all the examples "git clone" you
>>gave are from UPSTREAM.
>
>>You're mixing the case where one "git clone"s from UPSTREAM and "git
>>remode add"s PUBLISH, and the converse. Both are possible, but the
>>"origin" remote will be different depending on which one you chose.
>
> I think I don't really get it. IMHO UPSTREAM is name from the repository
> you pull from and PUBLISH from the repositiry you push to.

In your document, you're suggesting to clone from ORIGIN, and then to
set pushRemote to origin. This means "git push" will push to ORIGIN,
which doesn't work. Actually, if one follows your instructions, upstream
and origin will point to the same remote.

Did you test your own document on a real-life example? If not, you
should do so before anything else. You should notice this kind of issues
before asking for external review.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] send-email: extract email-parsing code into a subroutine

2017-12-15 Thread Matthieu Moy
PAYRE NATHAN p1508475  writes:

>>> +sub parse_header_line {
>>> + my $lines = shift;
>>> + my $parsed_line = shift;
>>> + my $pattern = join "|", qw(To Cc Bcc);
>>
>> Nit: you may want to rename it to something more explicit, like
>> $addr_headers_pat.
>
> I find "$addr_headers_pat" too long that's why I've choose rename it
> into "$addr_pat", in addition to that, because the variable is in the
> subroutine "parse_header_line" it does not require to include
> "headers" in the variable name.

I suggested this name because $addr_pat seems to imply that this matches
an address, while it matches the _name of headers_ containing address.
But that's not terribly important, the meaning is clear by the context
anyway.

All my previous remarks have been taken into account. This is now

Reviewed-by: Matthieu Moy 

Thanks,

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH v2] doc: add triangular workflow

2017-12-14 Thread Matthieu Moy
===
> +* `git clone `
> +* `git remote add **PUBLISH**`

git remote add needs two arguments, you're giving only one.

> +* `git push`

This will push to UPSTREAM, right?

> +Adding **UPSTREAM** remote:
> +
> +===
> +`git remote add upstream `
> +===

In which circumstance shall one write this? If you don't say it, the
reader will probably assume that this is to be done after the commands
you specified right above. But then: it doesn't make sense. You've
just cloned from UPSTREAM, you already have the UPSTREAM remote.

> +For each branch requiring a triangular workflow, set
> +`branch..remote` and `branch..pushRemote` to set up
> +the **UPSTREAM** and **PUBLISH** repositories.

This neither tells me how to set the variables, nor what the effect
will be ("set up"?).

> +Example with master as :
> +===
> +* `git config branch.master.remote upstream`
> +* `git config branch.master.pushRemote origin`
> +===

origin is the remote you've cloned from. From the text above, I guess
you meant it to point to PUBLISH. But all the examples "git clone" you
gave are from UPSTREAM.

You're mixing the case where one "git clone"s from UPSTREAM and "git
remode add"s PUBLISH, and the converse. Both are possible, but the
"origin" remote will be different depending on which one you chose.

> +Making your work available
> +~~
> +
> +The `git push` command sends commits to the **PUBLISH** repository and not to
> +the **UPSTREAM** thanks to the configuration you did earlier with the
> +`git config remote.pushdefault origin` command.

This explanation should be next to the place where you recommend
setting remote.pushdefault.

> +When a contributor pushes something, the `git config push.default
> +current` command can be used to specify that the name of the
> +**PUBLISH** branch is the same as the name of the **LOCAL** one.

I already said it multiple times, but I don't think it's a good idea
to recommend changing push.default. The default, "simple", was
specifically designed to be appropriate for triangular workflow:

  
http://git.661346.n2.nabble.com/PATCH-0-6-push-default-in-the-triangular-world-td7589907.html
  (PATCH 3/6 in particular)

You may disagree with me, but then please explain your motivation (by
replying to my messages and/or by explaining the rationale in the
commit message).

> +=
> +`git rev-parse --abbrev-ref @{push}`
> +=
> +
> +.Display the fetch remote's name:
> +[caption="Recipe: "]
> +
> +===
> +`git rev-parse --abbrev-ref @{upstream}`
> +===

I don't think "rev-parse" is the best example to give.

I use @{upstream} all the time to see what commits I have which aren't
in upstream yet:

  git log @{upstream}..


[ part of text not re-read by lack of time ]

> --- a/Documentation/gitworkflows.txt
> +++ b/Documentation/gitworkflows.txt
> @@ -467,6 +467,7 @@ other options.
>  SEE ALSO
>  
>  linkgit:gittutorial[7],
> +linkgit:git-triangular-workflow.txt[1],
>  linkgit:git-push[1],
>  linkgit:git-pull[1],
>  linkgit:git-merge[1],

I think this deserves more than just a "SEE ALSO" link. The "merge
workflow" part is essentially another name for triangular workflow.
There should be a proper citation of this new triangular workflow doc,
i.e. a link with an explanatory sentence somewhere in the "merge
workflow" part IMHO.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] send-email: extract email-parsing code into a subroutine

2017-12-14 Thread Matthieu Moy
"PAYRE NATHAN p1508475"  wrote:

> - print $c2 $_;
>   }
> +
>   close $c;


Nit: this added newline does not seem necessary to me. Nothing
serious, but this kind of thing tend to distract the reader when
reviewing the patch.

> + foreach my $key (keys %parsed_email) {
> + next if $key == 'body';
> + print $c2 "$key: $parsed_email{$key}";
> + }

I'd add a comment like

# Preserve unknown headers

at the top of the loop to make it clear what we're doing.

On a side note: there's no comment in the code you're adding. This is
not necessarily a bad thing (beautifully written code does not need
comments to be readable), but you may re-read your code with the
question "did I explain everything well-enough?" in mind. The loop
above is a case where IMHO a short and sweet comment helps the reader.

Two potential issues not mentionned in your message but that we
discussed offlist is that 1) this doesn't preserve the order, and 2)
this strips duplicate headers. I believe this is not a problem here,
and trying to solve these points would make the code overkill, but
this would really deserve being mentionned in the commit message.
First, so that people reviewing your patch now can confirm (or not)
that you are taking the right decision by doing this, and also for
people in the future examining your patch (e.g. after a bisect).

> +sub parse_header_line {
> + my $lines = shift;
> + my $parsed_line = shift;
> + my $pattern = join "|", qw(To Cc Bcc);

Nit: you may want to rename it to something more explicit, like
$addr_headers_pat.

None of my nit should block the patch inclusion, but I think the
commit message should be expanded to include a mention of the
"duplicate headers"/"header order" potential issue.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] send-email: extract email-parsing code into a subroutine

2017-12-11 Thread Matthieu Moy
"PAYRE NATHAN p1508475"  wrote:

> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -685,7 +685,7 @@ Lines beginning in "GIT:" will be removed.
>  Consider including an overall diffstat or table of contents
>  for the patch you are writing.
>  
> -Clear the body content if you don't wish to send a summary.
> +Clear the body content if you dont wish to send a summary.

This is not part of your patch. Use "git add -p" to specify
exactly which hunks should go into the patch and don't let this
kind of change end up in the version you send.

> + my %parsed_email;
> + $parsed_email{'body'} = '';
> + while (my $line = <$c>) {
> + next if $line =~ m/^GIT:/;
> + parse_header_line($line, \%parsed_email);
> + if ($line =~ /^$/) {
> + $parsed_email{'body'} = filter_body($c);
>   }
> - print $c2 $_;

I didn't notice this at first, but you're modifying the behavior here:
the old code used to print to $c2 anything that didn't match any of
the if/else if branches.

To keep this behavior, you need to keep all these extra headers in
$parsed_email (you do, in this version) and print them after taking
care of all the known headers (AFAICT, you don't).

>   }
> - close $c;
> - close $c2;

You'll still need $c2, but you don't need $c anymore, so I'd keep the
"close $c" here. OTOH, $c2 is not needed before this point (actually a
bit later), so it would make sense to move the "open" down a little.
This would materialize the "read input, then write output" scheme (as
opposed to "write output while reading input" in the previous code).
It's not a new issue in your patch, but giving variables meaningful
names (i.e. not $c and $c2) would help, too.

> + if ($parsed_email{'mime-version'}) {
> + print $c2 "MIME-Version: $parsed_email{'mime-version'}\n",
> + "Content-Type: 
> $parsed_email{'content-type'};\n",
> + "Content-Transfer-Encoding: 
> $parsed_email{'content-transfer-encoding'}\n";
> + }
> +
> + if ($parsed_email{'content-type'}) {
> + print $c2 "MIME-Version: 1.0\n",
> +  "Content-Type: $parsed_email{'content-type'};\n",
> +  "Content-Transfer-Encoding: 8bit\n";

This "if ($parsed_email{'content-type'})" does not correspond to
anything in the old code, and ...

> + } elsif (file_has_nonascii($compose_filename)) {
> +my $content_type = ($parsed_email{'content-type'} or
> +"text/plain; charset=$compose_encoding");

Here, your're dealing explicitly with $parsed_email{'content-type'} !=
false (you're in the 'else' branch where it can only be false).

I think you just meant to drop the "if
($parsed_email{'content-type'})" part, and plug the "elseif" directly
after the "if ($parsed_email{'mime-version'})". That's what I
suggested in my earlier email.

> +my $content_type =3D ($parsed_email{'content-type'} or
> +"text/plain; charset=3D$compose_encoding");
> +print $c2 "MIME-Version: 1.0\n",
> +  "Content-Type: $content_type\n",
> +  "Content-Transfer-Encoding: 8bit\n";
> +}

This part is indented with spaces, please use tabs.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: SSH port ignored when ssh:// prefix isn't specified

2017-12-10 Thread Matthieu Moy
"mqu...@neosmart.net"  writes:

> Basically, while `git clone ssh://g...@example.com:/path` works, the same
> with the `ssh://` prefix doesn't, and attempts to establish a connection to
> port 22 instead: `git clone g...@example.com:/path` (I'm not sure what it
> will do with the `:` should the connection actually succeed).


I don't see a simple way to distinguish between "I want to connect to
port 22 and access directory /path" and "I want to connect to port
 and access directory path". So Git chose for you the first option
(if you replace 2222 with abcd, it clearly makes sense).

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH v4] send-email: extract email-parsing code into a subroutine

2017-12-07 Thread Matthieu Moy
Not terribly important, but your patch has trailing newlines. "git diff
--staged --check" to see them. More below.

PAYRE NATHAN p1508475  writes:

> the part of code which parses the header a last time to prepare the
> email and send it.

The important point is not that it's the last time the code parses
headers, so I'd drop the "a last time".

> + my %parsed_email;
> + $parsed_email{'body'} = '';
> + while (my $line = <$c>) {
> + next if $line =~ m/^GIT:/;
> + parse_header_line($line, \%parsed_email);
> + if ($line =~ /^\n$/i) {

You don't need the /i (case-Insensitive) here, there are no letters to
match.

> + if ($parsed_email{'mime-version'}) {
> + $need_8bit_cte = 0;

This $need_8bit_cte is a leftover of the old code, which processed the
headers in the order it found them in the message and had to remember
the content of MIME-Version while parsing Content-Type.

I believe you can apply this on top of your patch:

--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -709,7 +709,6 @@ EOT3
open $c, "<", $compose_filename
or die sprintf(__("Failed to open %s: %s"), $compose_filename, 
$!);
 
-   my $need_8bit_cte = file_has_nonascii($compose_filename);
my $in_body = 0;
my $summary_empty = 1;
if (!defined $compose_encoding) {
@@ -740,12 +739,10 @@ EOT3
"\n";
}
if ($parsed_email{'mime-version'}) {
-   $need_8bit_cte = 0;
print $c2 "MIME-Version: $parsed_email{'mime-version'}\n",
"Content-Type: 
$parsed_email{'content-type'};\n",
"Content-Transfer-Encoding: 
$parsed_email{'content-transfer-encoding'}\n";
-   }
-   if ($need_8bit_cte) {
+   } else if (file_has_nonascii($compose_filename)) {
if ($parsed_email{'content-type'}) {
print $c2 "MIME-Version: 1.0\n",
 "Content-Type: 
$parsed_email{'content-type'};",

It reads much better: "If the original message already had a
MIME-Version header, then use that, else see if the file has non-ascii
characters and if so, use MIME-Version: 1.0".

Actually, you can even simplify further by factoring the if/else below:

> + if ($parsed_email{'content-type'}) {
> + print $c2 "MIME-Version: 1.0\n",
> +  "Content-Type: 
> $parsed_email{'content-type'};",

(Suspicious ";", and suspicious absence of "\n" here, I don't think it's
intentional and I'm fixing it below, but correct me if I'm wrong)

> +  "Content-Transfer-Encoding: 8bit\n";
> + } else {

(Broken indentation, this is not aligned with the "if" above)

>   print $c2 "MIME-Version: 1.0\n",
>"Content-Type: text/plain; ",
> -"charset=$compose_encoding\n",
> +  "charset=$compose_encoding\n",
>"Content-Transfer-Encoding: 8bit\n";
>   }

This could become stg like (untested):

} else if (file_has_nonascii($compose_filename)) {
my $content_type = ($parsed_email{'content-type'} or
"text/plain; charset=$compose_encoding");
print $c2 "MIME-Version: 1.0\n",
  "Content-Type: $content_type\n",
  "Content-Transfer-Encoding: 8bit\n";
}

> + open $c2, "<", $compose_filename . ".final"
> + or die sprintf(__("Failed to open %s.final: %s"), 
> $compose_filename, $!);
> + close $c2;

What is this? Cut-and-paste mistake?

> +sub parse_header_line {
> + my $lines = shift;
> + my $parsed_line = shift;
> + my $pattern1 = join "|", qw(To Cc Bcc);
> + my $pattern2 = join "|",
> + qw(From Subject Date In-Reply-To Message-ID MIME-Version 
> + Content-Type Content-Transfer-Encoding References);
> + 
> + foreach (split(/\n/, $lines)) {
> + if (/^($pattern1):\s*(.+)$/i) {
> + $parsed_line->{lc $1} = [ parse_address_line($2) ];
> + } elsif (/^($pattern2):\s*(.+)\s*$/i) {
> + $parsed_line->{lc $1} = $2;
> + }

I don't think you need to list the possibilities in the "else" branch.
Just matching /^([^:]*):\s*(.+)\s*$/i should do the trick.

> + $body = $body . $body_line;

Or just: $body .= $body_line;

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] doc: clarify triangular workflow

2017-12-07 Thread Matthieu Moy
BENSOUSSAN--BOHM DANIEL p1507430
 writes:

>>The document starts with
>
>   >This document attempts to write down and motivate some of the
>   >workflow elements used for `git.git` itself.  Many ideas apply
>   >in general, though the full workflow is rarely required for
>   >smaller projects with fewer people involved.
>
>>and makes me wonder (note: I am not involved in writing any of the
>>existing text in this document) how much material foreign to the
>>actual workflow used for `git.git` should go in here.  Having
>>multiple maintainers at the same time is not a workflow element that
>>we have ever used, for example, so I am not sure about the change in
>>the above paragraph.
>
> We were told to change 'he' into 'they' to be neutral.  However, it's true
> that there's one maintainer at a time so we will remove the 's' from
> "maintainers". 

Not a native speaker, but according to wikipedia
(https://en.wikipedia.org/wiki/Singular_they) it's OK to write
"maintainer [singular, but already neulral] may get merge conflicts when
they [sinugular they] ..."

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH v3] send-email: extract email-parsing code into a subroutine

2017-12-06 Thread Matthieu Moy
PAYRE NATHAN p1508475  writes:

> Without the "print" used for testing. 

But still smoe broken indentation:

>  git-send-email.perl | 90 
> +
>  1 file changed, 63 insertions(+), 27 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index 2208dcc21..a10574a56 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -715,41 +715,63 @@ EOT3
>   if (!defined $compose_encoding) {
>   $compose_encoding = "UTF-8";
>   }
> - while(<$c>) {
> - next if m/^GIT:/;
> - if ($in_body) {
> - $summary_empty = 0 unless (/^\n$/);
> - } elsif (/^\n$/) {
> - $in_body = 1;
> - if ($need_8bit_cte) {
> +
> +my %parsed_email;
> + $parsed_email{'body'} = '';
> +while (my $line = <$c>) {
> + next if $line =~ m/^GIT:/;
> + parse_header_line($line, \%parsed_email);
> + if ($line =~ /^\n$/i) {
> + while (my $body_line = <$c>) {
> +if ($body_line !~ m/^GIT:/) {
> +$parsed_email{'body'} = $parsed_email{'body'} . 
> $body_line;
> +}
> + }
> + }
> + }

This may display properly in your text editor with your setting, but
appears broken at least with tab-width=8. Don't mix tabs and spaces. The
Git coding style is to indent with tabs.

To see what I mean, open the script in Emacs and type M-x
whitespace-mode RET.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] send-email: extract email-parsing code into a subroutine

2017-12-03 Thread Matthieu Moy
Nathan PAYRE  writes:

> I found a mistake in my signed-off-by, please replace
>  by 

I think you want exactly the opposite of this. You're contributing as a
Lyon 1 student, hence your identity is @etu.univ-lyon1.fr. Your Gmail
adress is used only for technical reasons.

OTOH, you are missing the first line From: ... @..univ-lyon1.fr in your
message.

See how you did it:

https://public-inbox.org/git/20171012091727.30759-1-second.pa...@gmail.com/

(The sign-off was wrong in this one, but the From was OK)

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH] git-send-email: fix get_maintainer.pl regression

2017-11-22 Thread Matthieu Moy
Junio C Hamano  writes:

> Was there any reason why Mail::Address was _inadequate_?

I think the main reason was that Mail::Address is not a standard perl
module, and not relying on it avoided one external dependency. AFAIK, we
don't really have a good way to deal with Perl dependencies, so having a
strong requirement on Mail::Address will probably end up in a runtime
error for users who compile Git from source (people using a packaged
version have their package manager to deal with this).

> I know we had trouble with random garbage that are *not* addresses
> people put on the in-body CC: trailer in the past, but I do not recall
> if they are something Mail::Address would give worse result and we
> need our workaround (hence our own substitute), or Mail::Address would
> handle them just fine.

For a long time, we used Mail::Address if it was available and I don't
think we had issues with it.

My point in cc907506 ("send-email: don't use Mail::Address, even if
available", 2017-08-23) was not that Mail::Address was bad, but that
changing our behavior depending on whether it was there or not was
really bad. For example, the issue dealt with in this thread probably
always existed, but it was present only for *some* users.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: future of the mediawiki extension?

2017-11-06 Thread Matthieu Moy
"Junio C Hamano"  wrote:

> I think the plan was to make code drop from time to time at major
> release points of git-multimail, but I do not think we've seen many
> updates recently.

I realize I didn't answer this point. The reason you didn't see any
update recently is just that there haven't been any release and
actually not much development for a while on git-multimail.

I still have a list of "would be nice to have" features, but it seems
users are essentially happy with git-multimail as it is (or at least,
aren't unhappy enough to send patches or discuss interesting issues),
I haven't received any bug report or pull-requests for a long time.

But I still do maintain git-multimail and I will continue updating it
in git.git.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH 2/2] send-email: quote-email quotes the message body

2017-11-01 Thread Matthieu Moy
Junio C Hamano  writes:

> Hmmm.  I have a strong suspicion that people want an option to
> trigger the feature from just 1/2 but not 2/2 some of the time.
> Sure, removing the unwanted lines in the compose editor may be easy,
> but it feels wasteful use of user's time to include the lines of
> text from the original only to have them removed.

So, that could be

  git send-email --in-reply-to=message-id  # message-id is not a file
  => existing behavior

  git send-email --in-reply-to=file
  => populate To:, Cc:, In-Reply-To: and References:

  git send-email --in-reply-to=file --quote
  => in addition to the above, include the quoted message in the body

(perhaps --quote should be --cite, I'm not sure which one looks best for
a native speaker)

This also leaves room for

  git send-email --in-reply-to=message-id --fetch [--quote]
  => download the message body from e.g. public-inbox and do the same as
 for --in-reply-to=file

(which doesn't have to be implemented now, but would be a nice-to-have
in the future)

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH 1/2] quote-email populates the fields

2017-11-01 Thread Matthieu Moy
Junio C Hamano  writes:

> Payre Nathan  writes:
>
>> From: Tom Russello 
>>
>> ---
>
> Missing something here???

To clarify for Nathan, Thimothee and Danial: the cover-letter is an
introduction send before the patch series. It can be needed to explain
the overall approach followed by the series. But in general, it does not
end up in the Git history, i.e. after the review is finished, the
cover-letter is forgotten.

OTOH, the commit messages for each patch is what ends up in the Git
history. This is what people will find later when running e.g. "git
blame", "git bisect" or so. Clearly the future user examining history
expects more than "quote-email populates the fields" (which was a good
reminder during development, but is actually a terrible subject line for
a final version).

A quick advice: if in doubt, prefer writing explanations in commit
message rather than the cover letter. If still in doubt, write the
explanations twice: once quickly in the cover letter and once more
detailed in the commit message.

>  * Do not call this option "quote" anything; you are not quoting,
>just using some info from the given file.  
>
>I wonder if we can simply reuse "--in-reply-to" option for this
>purpose.  If it is a message id and not a file on the filesystem,
>we behave just as before.  Otherwise we try to open it as a file
>and grab the "Message-ID:" header from it and use it.

There's a possible ambiguity since user may in theory want to run
"--in-reply-to=msgid" with a file named msgid and still want the old
behavior. But: a real message-id is typically something rather cryptic
and it is safe to assume that users won't have a file named exactly like
an actual message id containing something which isn't the message in
question.

The main drawback I see in re-using "--in-reply-to" is that typos are
hard to miss. For example, running

  git send-email --in-reply-to=msgi

when the user actually wanted msgid would trigger a different behavior
instead of raising an error (no such file or directory). I think it's
acceptable: in the current form of send-email, we're already not
user-friendly to users when they write a typo in the --in-reply-to
argument (and we can't really detect typos anyway).

>> +if ($quote_email) {
>> +my $error = validate_patch($quote_email);
>> +die "fatal: $quote_email: $error\nwarning: no patches were sent\n"
>> +if $error;
>
> validate_patch() calls sendemail-validate hook that is expecting to
> be fed a patch email you are going to send out that might have
> errors so that it can catch it and save you from embarrassment.  The
> file you are feeding it is *NOT* what you are going to send out, but
> is what you are responding to with your patch.  Even if it had an
> embarassing error as a patch, that is not something you care about
> (and it is something you received, so catching this late won't save
> the sender from embarrassment anyway).

I think the intention was to detect cases when $quote_email is not a
patch at all (and give a proper error message instead of trying to
continue with probably absurd behavior).

But I agree that there's no point in being too strict here, and if that
was the intension then it should be documented with a comment.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH 1/4] remote-mediawiki: add namespace support

2017-10-30 Thread Matthieu Moy
Antoine Beaupré  writes:

> @@ -319,6 +341,10 @@ sub get_mw_pages {
>   $user_defined = 1;
>   get_mw_tracked_categories(\%pages);
>   }
> +if (@tracked_namespaces) {
> +$user_defined = 1;
> +get_mw_tracked_namespaces(\%pages);
> +}
>   if (!$user_defined) {
>   get_mw_all_pages(\%pages);
>   }

Space Vs tabs indent issue (I have tab-width = 8, you probably have 4
and this "if" looks underindented).

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH 0/4] WIP: git-remote-media wiki namespace support

2017-10-30 Thread Matthieu Moy
Antoine Beaupré  writes:

> Obviously, doing unit tests against a full MediaWiki instance isn't
> exactly trivial.

Not trivial, but doable: there is all the infrastructure to do so in t/:
install-wiki.sh to automatically install Mediawiki, and then a testsuite
that interacts with it.

This has been written under the assumption that the developer had a
lighttpd instance running on localhost, but this can probably be adapted
to run on Travis-CI (install lighttpd & Mediawiki in the install: part,
and run the tests afterwards), so that anyone can run the tests by just
submitting a pull-request to Git-Mediawiki.

If you are to work more on Git-Mediawiki, don't underestimate the
usefullness of the testsuite (for example, Git-Mediawiki was developped
against a prehistoric version of Mediawiki, the testsuite can help
ensuring it still works on the lastest version), nor the fun of playing
with install scripts and CI systems ;-).

Cheers,

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [PATCH v2] remote-mediawiki: limit filenames to legal

2017-10-30 Thread Matthieu Moy
Antoine Beaupré  writes:

> @@ -52,7 +53,7 @@ sub smudge_filename {
>   $filename =~ s/ /_/g;
>   # Decode forbidden characters encoded in clean_filename
>   $filename =~ s/_%_([0-9a-fA-F][0-9a-fA-F])/sprintf('%c', hex($1))/ge;
> - return $filename;
> + return substr($filename, 0, NAME_MAX-3);

There's a request to allow a configurable extension (.mediawiki would
help importing in some wikis, see
https://github.com/Git-Mediawiki/Git-Mediawiki/issues/42). You should at
least make this stg like length(".mw") so that the next search&replace
for ".mw" finds this.

Also, note that your solution works for using Git-Mediawiki in a
read-only way, but if you start modifying and pushing such files, you'll
get into trouble. It probably makes sense to issue a warnign in such
case.

Regards,

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: future of the mediawiki extension?

2017-10-30 Thread Matthieu Moy
[ Please Cc: me in these discussions ]

Antoine Beaupré  writes:

> I think, however, it would be good to have a discussion about the future
> of that extension in Git. The extension has a bit of a hybrid presence -
> it is partly in git core (as a contrib, but still) and partly on GitHub
> here:
>
> https://github.com/Git-Mediawiki/Git-Mediawiki/

The initial plan was to use Git's contrib/ subdirectory to get more of
the Git community involved in its development, and avoid making it a
personal toy.

Initially, Git-Mediawiki is both a personnal project and a student's
project (most of the code was written by students as part of an Ensimag
project under my supervision, with strong interaction with the Git
community).

It did work well in the first times of the project, there were very
fruitfull interactions between Git-Mediawiki and Git. Some issues raised
while developing Git-Mediawiki led to improvements in the remote-helper
mechanism in Git. Some code written for Git-Mediawiki ended up in Git
(the Perl layer for the credential helpers at least). I think this would
have been harder if Git-Mediawiki was a separte project.

However, the rest of the plan did not work that well. I thought having
the code in contrib/ would help keeping the project alive if I became
inactive. It's been a while I didn't have enough time-budget to work on
the project, and the git.git review mechanism has actually blocked a lot
of contributors. Patches get posted here and there but no one takes time
for a proper submission here, and when this happens contributors give up
after the first round of review instead of re-rolling.

So, my conclusion is that a simpler submission mechanism (GitHub's
pull-requests) and a less picky code review would help Git-Mediawiki.

>From previous discussions, I think Junio will agree with that: he's
reluctant to keeping too much stuff in contrib/ and usally prefers
external projects.

> It should also be mentioned that this contrib isn't very active: I'm not
> part of the GitHub organization, yet I'm probably the one that's been
> the most active with patches in the last year (and I wasn't very active
> at all).

FYI, I'm no longer using Mediawiki as much as I did, and I don't really
use Git-Mediawiki anymore.

The main blocking point to revive Git-Mediawiki is to find a new
maintainer (https://github.com/Git-Mediawiki/Git-Mediawiki/issues/33). I
believe I just found one ;-).

> Please avoid "mailing list vs GitHub" flamewars and keep to the topic of
> this specific contrib's future. :)

Note that being a separate project doesn't mean there can't be any
interaction with this list. Requests for reviews for separate projects
are usually welcome even though they don't happen often here.

There's also a hybrid solution used by git-multimail: have a copy of the
code in git.git, but do the development separately. I'm not sure it'd be
a good idea for Git-Mediawiki, but I'm mentionning it for completeness.

Regards,

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: What means "git config bla ~/"?

2017-10-03 Thread Matthieu Moy
"Junio C Hamano"  writes:

> Jonathan Nieder  writes:
> 
>>> what's with that "git config bla ~/"? is this some config keyword
>>> or something?
>> ...
>>
>> I agree with you that it is less clear than it could be.  Ideas for
>> clarifying it?
> 
> Yeah, if it were
> 
>   git config section.var ~/some/thing
> 
> that would be far clearer than "bla" that what we see is a mere
> placeholder.

Agreed.

> Another obvious option is to remove the parenthetical comment.
> 
> Or "(but you can give values without quoting and let your shell
> expand it)", without a confusing sample command that proved not to
> help very much.

I prefer your "section.var" proposal above. I think people who really 
understand shell quoting do not need the explanations, and others probably need 
the example.

While we're there, the formatting is also wrong ('' quoting, while we normally 
use `` quoting for shell commands).

Sounds like a nice microproject for my students :-). A patch should follow soon.

-- 
Matthieu Moy
https://matthieu-moy.fr/


[PATCH v2 2/2] send-email: don't use Mail::Address, even if available

2017-08-25 Thread Matthieu Moy
Using Mail::Address made sense when we didn't have a proper parser. We
now have a reasonable address parser, and using Mail::Address
_if available_ causes much more trouble than it gives benefits:

* Developers typically test one version, not both.

* Users may not be aware that installing Mail::Address will change the
  behavior. They may complain about the behavior in one case without
  knowing that Mail::Address is involved.

* Having this optional Mail::Address makes it tempting to anwser "please
  install Mail::Address" to users instead of fixing our own code. We've
  reached the stage where bugs in our parser should be fixed, not worked
  around.

Signed-off-by: Matthieu Moy 
---
 git-send-email.perl | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index dfd646ac5b..0061dbfab9 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -155,7 +155,6 @@ sub format_2822_time {
 }
 
 my $have_email_valid = eval { require Email::Valid; 1 };
-my $have_mail_address = eval { require Mail::Address; 1 };
 my $smtp;
 my $auth;
 my $num_sent = 0;
@@ -490,11 +489,7 @@ my ($repoauthor, $repocommitter);
 ($repocommitter) = Git::ident_person(@repo, 'committer');
 
 sub parse_address_line {
-   if ($have_mail_address) {
-   return map { $_->format } Mail::Address->parse($_[0]);
-   } else {
-   return Git::parse_mailboxes($_[0]);
-   }
+   return Git::parse_mailboxes($_[0]);
 }
 
 sub split_addrs {
-- 
2.14.0.rc0.dirty



[PATCH v2 1/2] send-email: fix garbage removal after address

2017-08-25 Thread Matthieu Moy
This is a followup over 9d33439 (send-email: only allow one address
per body tag, 2017-02-20). The first iteration did allow writting

  Cc:  # garbage

but did so by matching the regex ([^>]*>?), i.e. stop after the first
instance of '>'. However, it did not properly deal with

  Cc: f...@example.com # garbage

Fix this using a new function strip_garbage_one_address, which does
essentially what the old ([^>]*>?) was doing, but dealing with more
corner-cases. Since we've allowed

  Cc: "Foo # Bar" 

in previous versions, it makes sense to continue allowing it (but we
still remove any garbage after it). OTOH, when an address is given
without quoting, we just take the first word and ignore everything
after.

Signed-off-by: Matthieu Moy 
---
Change since v1: removed dead code as suggested by Junio.


 git-send-email.perl   | 22 --
 t/t9001-send-email.sh |  4 
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index fa6526986e..dfd646ac5b 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1089,6 +1089,22 @@ sub sanitize_address {
 
 }
 
+sub strip_garbage_one_address {
+   my ($addr) = @_;
+   chomp $addr;
+   if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
+   # "Foo Bar"  [possibly garbage here]
+   # Foo Bar  [possibly garbage here]
+   #  [possibly garbage here]
+   return $1;
+   }
+   if ($addr =~ /^([^"#,\s]*)/) {
+   # address without quoting: remove anything after the address
+   return $1;
+   }
+   return $addr;
+}
+
 sub sanitize_address_list {
return (map { sanitize_address($_) } @_);
 }
@@ -1590,10 +1606,12 @@ foreach my $t (@files) {
# Now parse the message body
while(<$fh>) {
$message .=  $_;
-   if (/^(Signed-off-by|Cc): ([^>]*>?)/i) {
+   if (/^(Signed-off-by|Cc): (.*)/i) {
chomp;
my ($what, $c) = ($1, $2);
-   chomp $c;
+   # strip garbage for the address we'll use:
+   $c = strip_garbage_one_address($c);
+   # sanitize a bit more to decide whether to suppress the 
address:
my $sc = sanitize_address($c);
if ($sc eq $sender) {
next if ($suppress_cc{'self'});
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index d1e4e8ad19..f30980895c 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -148,6 +148,8 @@ cat >expected-cc <<\EOF
 !t...@example.com!
 !th...@example.com!
 !f...@example.com!
+!f...@example.com!
+!s...@example.com!
 EOF
 "
 
@@ -161,6 +163,8 @@ test_expect_success $PREREQ 'cc trailer with various 
syntax' '
Cc:  # trailing comments are ignored
Cc: ,  one address per line
Cc: "Some # Body"  [  ]
+   Cc: f...@example.com # not@example.com
+   Cc: s...@example.com, not.se...@example.com
EOF
clean_fake_sendmail &&
git send-email -1 --to=recipi...@example.com \
-- 
2.14.0.rc0.dirty



Re: [RFC PATCH 1/2] send-email: fix garbage removal after address

2017-08-25 Thread Matthieu Moy
Junio C Hamano  writes:

> Matthieu Moy  writes:
>
>> +sub strip_garbage_one_address {
>> +my ($addr) = @_;
>> +chomp $addr;
>> +if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
>> +# "Foo Bar"  [possibly garbage here]
>> +# Foo Bar  [possibly garbage here]
>> +return $1;
>> +}
>> +if ($addr =~ /^(<[^>]*>).*/) {
>> +#  [possibly garbage here]
>> +# if garbage contains other addresses, they are ignored.
>> +return $1;
>> +}
>
> Isn't this already covered by the first one,

Oops, indeed. I just removed the second "if" (and added the appropriate
comment to the first):

+   if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
+   # "Foo Bar"  [possibly garbage here]
+   # Foo Bar  [possibly garbage here]
+   #  [possibly garbage here]
+   return $1;
+   }

> By the way, these three regexps smell like they were written
> specifically to cover three cases you care about (perhaps the ones
> in your proposed log message), but what will be our response when
> somebody else comes next time to us and says that their favourite
> formatting of "Cc:" line is not covered by these rules?

Well, actually the last one covers essentially everything. Just stop at
the first space, #, ',' or '"'. The first case is here to allow putting
a name in front of the address, which is something we've already allowed
and sounds reasonable from the user point of view.

OTOH, I didn't bother with real corner-cases like

  Cc: "Foo \"bar\"" 

> So, from that point of view, I, with devil's advocate hat on, wonder
> why we are not saying
>
>   "Cc: s...@k.org # cruft"?  Use "Cc:  # cruft" instead
>   and you'd be fine.
>
> right now, without this patch.

I would agree if the broken case were an exotic one. But a plain adress
is really the simplest use-case I can think of, so it's hard to say
"don't do that" when we should say "sorry, we should obviously have
thought about this use-case".

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: sequencer status

2017-08-24 Thread Matthieu Moy
Christian Couder  writes:

> It is displaying the steps that have already been performed, right?
> I wonder if people might want more about the current step (but maybe
> that belongs to `git status`) or perhaps the not yet performed states
> (and maybe even a way to edit the todo list?)

Note that 'git status' is already doing this, but shows only 2 items of
each:

$ git status
interactive rebase in progress; onto 3772427
Last commands done (2 commands done):
   pick a48812c some commit title
   exec false
Next commands to do (2 remaining commands):
   pick 9d7835d some other commit
   pick c0e0fa8 one more subject
  (use "git rebase --edit-todo" to view and edit)
You are currently editing a commit while rebasing branch 'master' on '3772427'.
...

The idea was that 2 lines of context is often sufficient, and doesn't
eat too much screen space so it makes sense to show it by default.

I think it makes sense to have another command that shows the whole
sequence, but perhaps it could also be just an option for "git status".

Cheers,

-- 
Matthieu Moy
https://matthieu-moy.fr/


[RFC PATCH 2/2] send-email: don't use Mail::Address, even if available

2017-08-23 Thread Matthieu Moy
Using Mail::Address made sense when we didn't have a proper parser. We
now have a reasonable address parser, and using Mail::Address
_if available_ causes much more trouble than it gives benefits:

* Developers typically test one version, not both.

* Users may not be aware that installing Mail::Address will change the
  behavior. They may complain about the behavior in one case without
  knowing that Mail::Address is involved.

* Having this optional Mail::Address makes it tempting to anwser "please
  install Mail::Address" to users instead of fixing our own code. We've
  reached the stage where bugs in our parser should be fixed, not worked
  around.

Signed-off-by: Matthieu Moy 
---
 git-send-email.perl | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 33a69ffe5d..2208dcc213 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -155,7 +155,6 @@ sub format_2822_time {
 }
 
 my $have_email_valid = eval { require Email::Valid; 1 };
-my $have_mail_address = eval { require Mail::Address; 1 };
 my $smtp;
 my $auth;
 my $num_sent = 0;
@@ -490,11 +489,7 @@ my ($repoauthor, $repocommitter);
 ($repocommitter) = Git::ident_person(@repo, 'committer');
 
 sub parse_address_line {
-   if ($have_mail_address) {
-   return map { $_->format } Mail::Address->parse($_[0]);
-   } else {
-   return Git::parse_mailboxes($_[0]);
-   }
+   return Git::parse_mailboxes($_[0]);
 }
 
 sub split_addrs {
-- 
2.14.0.rc0.dirty



[RFC PATCH 1/2] send-email: fix garbage removal after address

2017-08-23 Thread Matthieu Moy
This is a followup over 9d33439 (send-email: only allow one address
per body tag, 2017-02-20). The first iteration did allow writting

  Cc:  # garbage

but did so by matching the regex ([^>]*>?), i.e. stop after the first
instance of '>'. However, it did not properly deal with

  Cc: f...@example.com # garbage

Fix this using a new function strip_garbage_one_address, which does
essentially what the old ([^>]*>?) was doing, but dealing with more
corner-cases. Since we've allowed

  Cc: "Foo # Bar" 

in previous versions, it makes sense to continue allowing it (but we
still remove any garbage after it). OTOH, when an address is given
without quoting, we just take the first word and ignore everything
after.

Signed-off-by: Matthieu Moy 
---
Also available as: https://github.com/git/git/pull/398

 git-send-email.perl   | 26 --
 t/t9001-send-email.sh |  4 
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index fa6526986e..33a69ffe5d 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1089,6 +1089,26 @@ sub sanitize_address {
 
 }
 
+sub strip_garbage_one_address {
+   my ($addr) = @_;
+   chomp $addr;
+   if ($addr =~ /^(("[^"]*"|[^"<]*)? *<[^>]*>).*/) {
+   # "Foo Bar"  [possibly garbage here]
+   # Foo Bar  [possibly garbage here]
+   return $1;
+   }
+   if ($addr =~ /^(<[^>]*>).*/) {
+   #  [possibly garbage here]
+   # if garbage contains other addresses, they are ignored.
+   return $1;
+   }
+   if ($addr =~ /^([^"#,\s]*)/) {
+   # address without quoting: remove anything after the address
+   return $1;
+   }
+   return $addr;
+}
+
 sub sanitize_address_list {
return (map { sanitize_address($_) } @_);
 }
@@ -1590,10 +1610,12 @@ foreach my $t (@files) {
# Now parse the message body
while(<$fh>) {
$message .=  $_;
-   if (/^(Signed-off-by|Cc): ([^>]*>?)/i) {
+   if (/^(Signed-off-by|Cc): (.*)/i) {
chomp;
my ($what, $c) = ($1, $2);
-   chomp $c;
+   # strip garbage for the address we'll use:
+   $c = strip_garbage_one_address($c);
+   # sanitize a bit more to decide whether to suppress the 
address:
my $sc = sanitize_address($c);
if ($sc eq $sender) {
next if ($suppress_cc{'self'});
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index d1e4e8ad19..f30980895c 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -148,6 +148,8 @@ cat >expected-cc <<\EOF
 !t...@example.com!
 !th...@example.com!
 !f...@example.com!
+!f...@example.com!
+!s...@example.com!
 EOF
 "
 
@@ -161,6 +163,8 @@ test_expect_success $PREREQ 'cc trailer with various 
syntax' '
Cc:  # trailing comments are ignored
Cc: ,  one address per line
Cc: "Some # Body"  [  ]
+   Cc: f...@example.com # not@example.com
+   Cc: s...@example.com, not.se...@example.com
EOF
clean_fake_sendmail &&
git send-email -1 --to=recipi...@example.com \
-- 
2.14.0.rc0.dirty



Re: git send-email Cc with cruft not working as expected

2017-08-23 Thread Matthieu Moy
Stefan Beller  writes:

> +cc people from that thread
>
> On Tue, Aug 22, 2017 at 4:30 PM, Jacob Keller  wrote:
>> On Tue, Aug 22, 2017 at 4:18 PM, Stefan Beller  wrote:
>>> On Tue, Aug 22, 2017 at 4:15 PM, Jacob Keller  
>>> wrote:
>>>> Hi,
>>>>
>>>> I recently found an issue with git-send-email where it does not
>>>> properly remove the cruft of an email address when sending using a Cc:
>>>> line.
>>>>
>>>> The specific example is with a commit containing the following Cc line,
>>>>
>>>> Cc: sta...@vger.kernel.org # 4.10+
>>>
>>> Please see and discuss at
>>> https://public-inbox.org/git/20170216174924.GB2625@localhost/
>>
>> I read that thread, and it addressed the problem of
>>
>> Cc:  # 4.10+
>>
>> but did not fix this case without the <> around the email address.

Indeed. It detects garbage as "everything after >".

I feel really sorry that we need so many iterations to get back to a
correct behavior :-(.

>> Additionally I just discovered that the behavior here changes pretty
>> drastically if you have Email::Validate installed, now it splits the
>> address into multiple things:

(I'm assuming you mean Email::Address, there's also Email::Valid but I
don't think it would modify the behavior)

Hmm, I think we reached the point where we should just stop using
Email::Address.

Patch series follows and should address both points.

-- 
Matthieu Moy
http://matthieu-moy.fr


Re: Bug with automated processing of git status results

2017-06-30 Thread Matthieu Moy
Сергей Шестаков  writes:

> I understand that we can turn off core.safecrlf, but it's
> inconvinient.

Note that you can do that without actually changing the config file:

  git -c core.safecrlf=false status ...

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH v4 5/5] stash: implement builtin stash

2017-06-26 Thread Matthieu Moy
Thomas Gummerer  writes:

> After the line
>
> test -n "$seen_non_option" || set "push" "$@"
>
> it's not possible that $# is 0 anymore, so this will never be
> printed.  From a quick look at the history it seems like it wasn't
> possible to trigger that codepath for a while.  If I'm reading things
> correctly 3c2eb80fe3 ("stash: simplify defaulting to "save" and reject
> unknown options", 2009-08-18) seems to have introduced the small
> change in behaviour.

Indeed. That wasn't on purpose, but I seem to have turned this

case $# in
0)
push_stash &&
say "$(gettext "(To restore them type \"git stash apply\")")"
;;

into dead code.

> As I don't think anyone has complained since then, I'd just leave it
> as is, which makes git stash with no options a little less verbose.

I agree it's OK to keep is as-is, but the original logic (give a bit
more advice when "stash push" was DWIM-ed) made sense too, so it can
make sense to re-activate it while porting to C.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: Git credential helper store flushes randomly

2017-05-12 Thread Matthieu Moy
Jeff King  writes:

> The only time it should remove an entry is when Git asks it to. And the
> only time that happens is when Git sees the credential rejected by the
> server (e.g., an HTTP 401 even after we fed the stored credential). I
> don't know why that would happen unless there's some non-determinism on
> the server.

I did see a case like this where the server was broken temporarily and
rejected one login attempt. In this case the credential store deletes
the entry for that user, and when the server is repaired, the store
still has the entry deleted.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: Add configuration options for some commonly used command-line options

2017-03-19 Thread Matthieu Moy
Duy Nguyen  writes:

> On Thu, Feb 19, 2015 at 5:32 PM, Matthieu Moy
>  wrote:
>> +### Add configuration options for some commonly used command-line options
>> +
>> +This includes:
>> +
>> +* git am -3
>> +
>> +* git am -c
>> +
>> +Some people always run the command with these options, and would
>> +prefer to be able to activate them by default in ~/.gitconfig.
>
> I was reading the .md file to add a new microproject and found this.
> Instead of adding new config case by case, should we do something more
> generic?
>
> We could have a new group defaultOptions. (or
> .options) which contains  =  where option
> names are the long name in parse-options? Then we don't have to
> manually add more config options any more (mostly, I'm aware of stuff
> like diff options that do not use parse-options).
>
> If we want to stop the users from shooting themselves in the foot, we
> could extend parse-options to allow/disallow certain options being
> used this way. Hmm?

I think the main problem is indeed "stop the users from shooting
themselves in the foot". Many command-line options change the behavior
completely so allowing users to enable them by default means allowing
the users to change Git in such a way that scripts calling it are
broken.

This also doesn't help when troublshouting an issue as these options are
typically something set once and for all and which you forget about.
This typically leads to discussion in Q&A forums like:

A: Can you run "git foo"?
B: Here's the result: ...
A: I don't understand, I can't reproduce here.

just because B has a CLI option enabled by default.

This is the same reasoning that leads Git to forbid aliasing an existing
command to something else.

OTOH, we already have almost "enable such or such option by default"
with aliases. People who always run "git am" with "-3" can write

[alias]
a3 = am -3

and just run "git a3".

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH] git-status: make porcelain more robust

2017-03-15 Thread Matthieu Moy
Junio C Hamano  writes:

> Michael J Gruber  writes:
>
>> git status provides a porcelain mode for porcelain writers with a
>> supposedly stable (plumbing) interface.
>> 7a76c28ff2 ("status: disable translation when --porcelain is used", 
>> 2014-03-20)
>> made sure that ahead/behind info is not translated (i.e. is stable).
>>
>> Make sure that the remaining two strings (initial commit, detached head)
>> are stable, too.
>
> It seems to me that 7a76c28ff2 already missed these strings, and
> their _() markings survive to this day.  Thanks for spotting and
> fixing.

Yep, sounds all right to me. Thanks,

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [Request for Documentation] Differentiate signed (commits/tags/pushes)

2017-03-07 Thread Matthieu Moy
Stefan Beller  writes:

> What is the difference between signed commits and tags?
> (Not from a technical perspective, but for the end user)

In addition to what the others already said:

If you use GitHub, then in the web interface you get a "Verified" stamp
for each signed commits:
https://help.github.com/articles/signing-commits-using-gpg/

It's not a Git feature but a GitHub one, but given the popularity of
GitHub, this probably led some users to believe that signed commits are
more convenient than signed tags.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH v2] send-email: only allow one address per body tag

2017-02-26 Thread Matthieu Moy
Junio C Hamano  writes:

> Matthieu Moy  writes:
>
>> Johan Hovold  writes:
>>
>>> --- a/git-send-email.perl
>>> +++ b/git-send-email.perl
>>> @@ -1563,7 +1563,7 @@ foreach my $t (@files) {
>>> # Now parse the message body
>>> while(<$fh>) {
>>> $message .=  $_;
>>> -   if (/^(Signed-off-by|Cc): (.*)$/i) {
>>> +   if (/^(Signed-off-by|Cc): ([^>]*>?)/i) {
>>
>> I think this is acceptable, but this doesn't work with trailers like
>>
>> Cc: "Some > Body" 
>>
>> A proper management of this kind of weird address should be doable by
>> reusing the regexp parsing "..." in parse_mailbox:
>>
>>  my $re_quote = qr/"(?:[^\"\\]|\\.)*"/;
>>
>> So the final regex would look like
>>
>> if (/^(Signed-off-by|Cc): (([^>]*|"(?:[^\"\\]|\\.)*")>?)/i) {
>>
>> I don't think that should block the patch inclusion, but it may be worth
>> considering.
>>
>> Anyway, thanks for the patch!
>
> Somehow this fell off the radar.  So your reviewed-by: and then
> we'll cook this in 'next' for a while?

OK.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH v2] send-email: only allow one address per body tag

2017-02-20 Thread Matthieu Moy
Johan Hovold  writes:

> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -1563,7 +1563,7 @@ foreach my $t (@files) {
>   # Now parse the message body
>   while(<$fh>) {
>   $message .=  $_;
> - if (/^(Signed-off-by|Cc): (.*)$/i) {
> + if (/^(Signed-off-by|Cc): ([^>]*>?)/i) {

I think this is acceptable, but this doesn't work with trailers like

Cc: "Some > Body" 

A proper management of this kind of weird address should be doable by
reusing the regexp parsing "..." in parse_mailbox:

my $re_quote = qr/"(?:[^\"\\]|\\.)*"/;

So the final regex would look like

if (/^(Signed-off-by|Cc): (([^>]*|"(?:[^\"\\]|\\.)*")>?)/i) {

I don't think that should block the patch inclusion, but it may be worth
considering.

Anyway, thanks for the patch!

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: body-CC-comment regression

2017-02-17 Thread Matthieu Moy
Junio C Hamano  writes:

> That approach may still constrain what those in the former camp can
> write in the "cruft" part, like they cannot write comma or semicolon
> as part of the "cruft", no?

Right. Indeed, this may be a problem since the use of "#" for stable
seem to include commit message, and they may contain commas.

So, maybe Johan's patch is better indeed.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: body-CC-comment regression

2017-02-17 Thread Matthieu Moy
Junio C Hamano  writes:

> Johan Hovold  writes:
>
>> That's precisely what the patch I posted earlier in the thread did.
>
> That's good.  I didn't see any patch yet 

It's here:

http://public-inbox.org/git/20170217110642.GD2625@localhost/

but as I explained, this removes a feature suported since several major
releases and we have no idea how many users may use the "mupliple emails
in one field". The approach I proposed does not suffer from this.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: body-CC-comment regression

2017-02-17 Thread Matthieu Moy
> On Fri, Feb 17, 2017 at 02:16:42PM +0100, Matthieu Moy wrote:
>> Johan Hovold  writes:
>
>> The "multiple emails per Cc: field" has been there for a while already
>> (b1c8a11c8024 released in 2.6.0, sept 2015), some users may have got
>> used to it. What you are proposing breaks their flow.
>
> Note that that commit never mentions multiple addresses in either
> headers or body-tags -- it's all about being able to specify multiple
> entries on the command line.

Indeed. I'm not the author of the patch, but I was supervising the
students who wrote it and "multiple addresses in Cc:" was not the goal,
but a (IMHO positive) side effect we discovered after the fact.

If I had a time machine, I'd probably go back then and forbid multiple
addresses there, but ...

> There does not seem to be single commit in the kernel where multiple
> address are specified in a CC tag since after git-send-email started
> allowing it, but there are ten commits before (to my surprise), and that
> should be contrasted with at least 4178 commits with trailing comments
> including a # sign.

Hey, there's a life outside the kernel ;-).

>> 1) Stop calling Mail::Address even if available.[...]
>
> Right, that sounds like the right thing to do regardless.
>
>> 2) Modify our in-house parser to discard garbage after the >. [...]
>
> Sounds perfectly fine to me, and seems to work too after quick test.

OK, sounds like the way to go.

Do you want to work on a patch? If not, I should be able to do that
myself. The code changes are straightforward, but we probably want a
proper test for that.

> addresses in a Cc-tag in that it breaks --suppress-cc=self, but I guess
> that can be fixed separately.

OK. If it's unrelated enough, please start a separate thread to explain
the problem (and/or write a patch ;-) ).

Thanks,

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: body-CC-comment regression

2017-02-17 Thread Matthieu Moy
Johan Hovold  writes:

> There is another option, namely to only accept a single address for tags
> in the body. I understand that being able to copy a CC-header to either
> the header section or to the command line could be useful, but I don't
> really see the point in allowing this in the tags in the body (a SoB
> always has one address, and so should a CC-tag).

I mostly agree for the SoB, but why should a Cc tag have only one email?

The "multiple emails per Cc: field" has been there for a while already
(b1c8a11c8024 released in 2.6.0, sept 2015), some users may have got
used to it. What you are proposing breaks their flow.

> And since this is a regression for something that has been working for
> years that was introduced by a new feature, I also think it's reasonable
> to (partially) revert the feature.

I'd find it rather ironic to fix your case by breaking a feature that
has been working for more than a year :-(. What would you answer to a
contributor comming one year from now and proposing to revert your
reversion because it breaks his flow?

All that said, I think another fix would be both satisfactory for
everyone and rather simple:

1) Stop calling Mail::Address even if available. It used to make sense
   to do that when our in-house parser was really poor, but we now have
   something essentially as good as Mail::Address. We test our parser
   against Mail::Address and we do have a few known differences (see
   t9000), but they are really corner-cases and shouldn't matter.

   A good consequence of this is that we stop depending on the way Perl
   is installed to parse emails. Regardless of the current issue, I
   think it is a good thing.

2) Modify our in-house parser to discard garbage after the >. The patch
   should look like (untested):

--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -903,11 +903,11 @@ sub parse_mailboxes {
my (@addr_list, @phrase, @address, @comment, @buffer) = ();
foreach my $token (@tokens) {
if ($token =~ /^[,;]$/) {
-   # if buffer still contains undeterminated strings
-   # append it at the end of @address or @phrase
-   if ($end_of_addr_seen) {
-   push @phrase, @buffer;
-   } else {
+   # if buffer still contains undeterminated
+   # strings append it at the end of @address,
+   # unless we already saw the closing >, in
+   # which case we discard it.
+   if (!$end_of_addr_seen) {
push @address, @buffer;
    }
 
What do you think?

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH 1/4 v4] revision.c: do not update argv with unknown option

2017-02-16 Thread Matthieu Moy
Junio C Hamano  writes:

> Matthieu Moy  writes:
>
>> Siddharth Kannan  writes:
>>
>>> handle_revision_opt() tries to recognize and handle the given argument. If 
>>> an
>>> option was unknown to it, it used to add the option to unkv[(*unkc)++].  
>>> This
>>> increment of unkc causes the variable in the caller to change.
>>>
>>> Teach handle_revision_opt to not update unknown arguments inside unkc 
>>> anymore.
>>> This is now the responsibility of the caller.
>>>
>>> There are two callers of this function:
>>>
>>> 1. setup_revision: Changes have been made so that setup_revision will now
>>> update the unknown option in argv
>>
>> You're writting "Changes have been made", but I did not see any up to
>> this point in the series.
>
> Actually, I think you misread the patch and explanation.
> handle_revision_opt() used to be responsible for stuffing unknown
> ones to unkv[] array passed from the caller even when it returns 0
> (i.e. "I do not know what they are" case, as opposed to "I know what
> they are, I am not handling them here and leaving them in unkv[]"
> case--the latter returns non-zero).  The first hunk makes the
> function stop doing so, and to compensate, the second hunk, which is
> in setup_revisions()

Indeed, I misread the patch. The explanation could be a little bit more
"tired-reviewer-proof" by not using a past tone, perhaps

1. setup_revision, which is changed to ...

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: body-CC-comment regression

2017-02-16 Thread Matthieu Moy
Johan Hovold  writes:

> Hi,
>
> I recently noticed that after an upgrade, git-send-email (2.10.2)
> started aborting when trying to send patches that had a linux-kernel
> stable-tag in its body. For example,
>
>   Cc: # 4.4
>
> was now parsed as
>
>   "sta...@vger.kernel.org#4.4"
>
> which resulted in
>
>   Died at /usr/libexec/git-core/git-send-email line 1332,  line 1.

This has changed in e3fdbcc8e1 (parse_mailboxes: accept extra text after
<...> address, 2016-10-13), released v2.11.0 as you noticed:

> The problem with the resulting fixes that are now in 2.11.1 is that
> git-send-email no longer discards the trailing comment but rather
> shoves it into the name after adding some random white space:
>
>   "# 3 . 3 . x : 1b9508f : sched : Rate-limit newidle" 
> "
>
> This example is based on the example from
> Documentation/process/stable-kernel-rules.rst:
>
>   Cc:  # 3.3.x: 1b9508f: sched: Rate-limit newidle
>
> and this format for stable-tags has been documented at least since 2009
> and 8e9b9362266d ("Doc/stable rules: add new cherry-pick logic"), and
> has been supported by git since 2012 and 831a488b76e0 ("git-send-email:
> remove garbage after email address") I believe.
>
> Can we please revert to the old behaviour of simply discarding such
> comments (from body-CC:s) or at least make it configurable through a
> configuration option?

The problem is that we now accept list of emails instead of just one
email, so it's hard to define what "comments after the email", for
example

Cc:  # , 

Is not accepted as two emails.

So, just stripping whatever comes after # before parsing the list of
emails would change the behavior once more, and possibly break other
user's flow. Dropping the garbage after the email while parsing is
possible, but only when we use our in-house parser (and we currently use
Perl's Mail::Address when available).

So, a proper fix is far from obvious, and unfortunately I won't have
time to work on that, at least not before a while.

OTOH, the current behavior isn't that bad. It accepts the input, and
extracts a valid email out of it. Just the display name is admitedly
suboptimal ...

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH 1/4 v4] revision.c: do not update argv with unknown option

2017-02-16 Thread Matthieu Moy
Siddharth Kannan  writes:

> handle_revision_opt() tries to recognize and handle the given argument. If an
> option was unknown to it, it used to add the option to unkv[(*unkc)++].  This
> increment of unkc causes the variable in the caller to change.
>
> Teach handle_revision_opt to not update unknown arguments inside unkc anymore.
> This is now the responsibility of the caller.
>
> There are two callers of this function:
>
> 1. setup_revision: Changes have been made so that setup_revision will now
> update the unknown option in argv

You're writting "Changes have been made", but I did not see any up to
this point in the series.

We write patch series so that they are bisectable, i.e. each commit
should be correct (compileable, pass tests, consistent
documentation, ...). Here, it seems you are introducing a breakage to
repair it later.

Other that bisectability, this makes review harder: at this point the
reader knows it's broken, guesses that it will be repaired later, but
does not know in which patch.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH 0/4 v4] WIP: allow "-" as a shorthand for "previous branch"

2017-02-16 Thread Matthieu Moy
Siddharth Kannan  writes:

> This is as per our discussion[1]. The patches and commit messages are based on
> Junio's patches that were posted as a reply to
> <20170212184132.12375-1-gits...@pobox.com>.
>
> As per Matthieu's comments, I have updated the tests, but there is still one
> thing that is not working: log -@{yesterday} or log -@{2.days.ago}

Note that I did not request that these things work, just that they seem
to be relevant tests: IMHO it's OK to reject them, but for example we
don't want them to segfault. And having a test is a good hint that you
thought about what could happen and to document it.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH 2/2 v3] sha1_name: teach get_sha1_1 "-" shorthand for "@{-1}"

2017-02-12 Thread Matthieu Moy
Siddharth Kannan  writes:

>  sha1_name.c  |  5 
>  t/t4214-log-shorthand.sh | 73 
> 
>  2 files changed, 78 insertions(+)
>  create mode 100755 t/t4214-log-shorthand.sh
>
> diff --git a/sha1_name.c b/sha1_name.c
> index 73a915f..d774e46 100644
> --- a/sha1_name.c
> +++ b/sha1_name.c
> @@ -947,6 +947,11 @@ static int get_sha1_1(const char *name, int len, 
> unsigned char *sha1, unsigned l
>   if (!ret)
>   return 0;
>  
> + if (!strcmp(name, "-")) {
> + name = "@{-1}";
> + len = 5;
> + }

One drawback of this approach is that further error messages will be
given from the "@{-1}" string that the user never typed.

After you do that, the existing "turn - into @{-1}" pieces of code
become useless and you should remove it (probably in a further patch).

There are at least:

$ git grep -n -A1 'strcmp.*"-"' | grep -B 1 '@\{1\}'
builtin/checkout.c:975: if (!strcmp(arg, "-"))
builtin/checkout.c-976- arg = "@{-1}";
--
builtin/merge.c:1231:   } else if (argc == 1 && !strcmp(argv[0], "-")) {
builtin/merge.c-1232-   argv[0] = "@{-1}";
--
builtin/revert.c:158:   if (!strcmp(argv[1], "-"))
builtin/revert.c-159-   argv[1] = "@{-1}";
--
builtin/worktree.c:344: if (!strcmp(branch, "-"))
builtin/worktree.c-345- branch = "@{-1}";

In the final version, obviously the same "refactoring" (specific
command -> git-wide) should be done for documentation (it should be in
this patch to avoid letting not-up-to-date documentation even for a
single commit).

> diff --git a/t/t4214-log-shorthand.sh b/t/t4214-log-shorthand.sh
> new file mode 100755
> index 000..dec966c
> --- /dev/null
> +++ b/t/t4214-log-shorthand.sh
> @@ -0,0 +1,73 @@
> +#!/bin/sh
> +
> +test_description='log can show previous branch using shorthand - for @{-1}'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup' '
> + echo hello >world &&
> + git add world &&
> + git commit -m initial &&
> + echo "hello second time" >>world &&
> + git add world &&
> + git commit -m second &&
> + echo "hello other file" >>planet &&
> + git add planet &&
> + git commit -m third &&
> + echo "hello yet another file" >>city &&
> + git add city &&
> + git commit -m fourth
> +'

You may use test_commit to save a few lines of code.

> +test_expect_success '"log -" should work' '
> + git checkout -b testing-1 master^ &&
> + git checkout -b testing-2 master~2 &&
> + git checkout master &&
> +
> + git log testing-2 >expect &&
> + git log - >actual &&
> + test_cmp expect actual
> +'

I'd have split this into a "setup branches" and a '"log -" should work'
test (to actually see where "setup branches" happen in the output, and
to allow running the setup step separately if needed). Not terribly
important.

> +test_expect_success 'symmetric revision range should work when one end is 
> left empty' '
> + git checkout testing-2 &&
> + git checkout master &&
> + git log ...@{-1} > expect.first_empty &&
> + git log @{-1}... > expect.last_empty &&
> + git log ...- > actual.first_empty &&
> + git log -... > actual.last_empty &&

Nitpick: we stick the > and the filename (as you did in most places
already).

It may be worth adding tests for more cases like

* Check what happens with suffixes, i.e. -^, -@{yesterday} and -~.

* -..- -> to make sure you handle the presence of two - properly.

* multiple separate arguments to make sure you handle them all, e.g.
  "git log - -", "git log HEAD -", "git log - HEAD".

The last two may be overkill, but the first one is probably important.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: GSoC 2017: application open, deadline = February 9, 2017

2017-02-09 Thread Matthieu Moy
Matthieu Moy  writes:

> Matthieu Moy  writes:
>
>> I created a Git organization and invited you + Peff as admins. I'll
>> start cut-and-pasting to show my good faith ;-).
>
> I created this page based on last year's:
>
> https://git.github.io/SoC-2017-Org-Application/
>
> I filled-in the "org profile". "Org application" is still TODO.

PS: I didn't hear from the libgit2 folks, so I think it's reasonable to
consider that we don't apply for them.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH/RFC] WIP: log: allow "-" as a short-hand for "previous branch"

2017-02-09 Thread Matthieu Moy
Siddharth Kannan  writes:

> Hello Matthieu,
>
> On 8 February 2017 at 20:10, Matthieu Moy  
> wrote:
>> In a previous discussion, I made an analogy with "cd -" (which is the
>> source of inspiration of this shorthand AFAIK): "-" did not magically
>> become "the last visited directory" for all Unix commands, just for
>> "cd". And in this case, I'm happy with it. For example, I never need
>> "mkdir -", and I'm happy I can't "rm -fr -" by mistake.
>>
>> So, it's debatable whether it's a good thing to have all commands
>> support "-". For example, forcing users to explicitly type "git branch
>> -d @{1}" and not providing them with a shortcut might be a good thing.
>
> builtin/branch.c does not call setup_revisions and remains unaffected
> by this patch :)

Right, I forgot this: in some place we need any revspec, but "branch -d"
needs a branch name explicitly.

> [...]
> As you might notice, in this list, most commands are not of the `rm` variety,
> i.e. something that would delete stuff.

OK, I think I'm convinced.

Keep the arguments in mind when polishing the commit message.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: GSoC 2017: application open, deadline = February 9, 2017

2017-02-09 Thread Matthieu Moy
Matthieu Moy  writes:

> I created a Git organization and invited you + Peff as admins. I'll
> start cut-and-pasting to show my good faith ;-).

I created this page based on last year's:

https://git.github.io/SoC-2017-Org-Application/

I filled-in the "org profile". "Org application" is still TODO.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: GSoC 2017: application open, deadline = February 9, 2017

2017-02-09 Thread Matthieu Moy
Christian Couder  writes:

> Well Peff wrote in reply to your email:
>
>> I did co-admin last year and the year before, but I made Matthieu do all
>> the work. :)
>>
>> I do not mind doing the administrative stuff. But the real work is in
>> polishing up the ideas list and microprojects page.
>
> So I thought Peff would be ok to be the admin (do "the administrative
> stuff").

There are several things the admins need to do:

1) "administrative stuff" about money with Conservancy (aka SFC). As I
   understand it, really not much to do since Google and Conservancy
   work directly with each other for most stuff.

2) Filling-in the application, i.e. essentially copy-past from the
   website.
 
3) Then, make sure things that must happen do happen (reviewing
   applications, start online or offline discussions when needed, ...).

Last year Peff did 1) and I did most of 2+3). My understanding of Peff's
reply was "OK to continue doing 1)".

I think you (Christian) could do 2+3). It's much, much less work than
being a mentor. Honnestly I felt like I did nothing and then Peff said I
did all the work :o). I can help, but as I said I'm really short in time
budget and I'd like to spend it more on coding+reviewing.

> I don't think emails in this thread is what really counts.
> I worked on the Idea page starting some months ago, and as I wrote I
> reviewed it again and found it not too bad.

OK, so giving up now seems unfair to you indeed.

I created a Git organization and invited you + Peff as admins. I'll
start cut-and-pasting to show my good faith ;-).

> About the janitoring part, as I previously said I am reluctant to do
> that as I don't know what Dscho would be ok to mentor.
> And I also think it's not absolutely necessary to do it before
> applying as an org.

Right.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: GSoC 2017: application open, deadline = February 9, 2017

2017-02-09 Thread Matthieu Moy
Christian Couder  writes:

> On Wed, Feb 8, 2017 at 3:54 PM, Matthieu Moy
>  wrote:
>> Jeff King  writes:
>>
>>> On Mon, Jan 23, 2017 at 04:02:02PM +0100, Matthieu Moy wrote:
>>>
>>>> * We need to write the application, i.e. essentially polish and update
>>>>   the text here: https://git.github.io/SoC-2016-Org-Application/ and
>>>>   update the list of project ideas and microprojects :
>>>>   https://git.github.io/SoC-2017-Ideas/
>>>>   https://git.github.io/SoC-2016-Microprojects/
>>>
>>> That can be done incrementally by people who care (especially mentors)
>>> over the next week or so, and doesn't require any real admin
>>> coordination. If it happens and the result looks good, then the
>>> application process is pretty straightforward.
>>>
>>> If it doesn't, then we probably ought not to participate in GSoC.
>>
>> OK, it seems the last message did not raise a lot of enthousiasm (unless
>> I missed some off-list discussion at Git-Merge?).
>
> I think having 2 possible mentors or co-mentors still shows some
> enthousiasm even if I agree it's unfortunate there is not more
> enthousiasm.

A non-quoted but yet important part of my initial email was:

| So, as much as possible, I'd like to avoid being an org admin this
| year. It's not a lot of work (much, much less than being a mentor!),
| but if I manage to get some time to work for Git, I'd rather do that
| on coding and reviewing this year.

and for now, no one stepped in to admin.

Other non-negligible sources of work are reviewing microprojects and
applications. Having a few more messages in this thread would have been
a good hint that we had volunteers to do that.

> Someone steps in to do what exactly?

First we need an admin. Then as you said a bit of janitoring work on
the web pages.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH/RFC] WIP: log: allow "-" as a short-hand for "previous branch"

2017-02-08 Thread Matthieu Moy
Siddharth Kannan  writes:

> Making a change in sha1_name.c will touch a lot of commands
> (setup_revisions is called from everywhere in the codebase), so, I am
> still trying to figure out how to do this such that the rest of the
> codepath remains unchanged.

Changing sha1_name.c is the way to go *if* we want all commands to
support this. Just like other ways to name a revision...

> I hope that you do not mind this side-effect, but rather, you intended
> for this to happen, right? More commands will start supporting this
> shorthand, suddenly.  (such as format-patch, whatchanged, diff to name
> a very few).

... but: the initial implementation of this '-' shorthand was
special-casing a single command (IIRC, "git checkout") for which the
shorthand was useful.

In a previous discussion, I made an analogy with "cd -" (which is the
source of inspiration of this shorthand AFAIK): "-" did not magically
become "the last visited directory" for all Unix commands, just for
"cd". And in this case, I'm happy with it. For example, I never need
"mkdir -", and I'm happy I can't "rm -fr -" by mistake.

So, it's debatable whether it's a good thing to have all commands
support "-". For example, forcing users to explicitly type "git branch
-d @{1}" and not providing them with a shortcut might be a good thing.

I don't have strong opinion on this: I tend to favor consistency and
supporting "-" everywhere goes in this direction, but I think the
downsides should be considered too. A large part of the exercice here is
to write a good commit message!

Another issue with this is: - is also a common way to say "use stdin
instead of a file", so before enabling - for "previous branch", we need
to make sure it does not introduce any ambiguity. Git does not seem to
use "- for stdin" much (most commands able to read from stdin have an
explicit --stdin option for that), a quick grep in the docs shows only
"git blame --contents -" which is OK because a revision wouldn't make
sense here anyway.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: GSoC 2017: application open, deadline = February 9, 2017

2017-02-08 Thread Matthieu Moy
Jeff King  writes:

> On Mon, Jan 23, 2017 at 04:02:02PM +0100, Matthieu Moy wrote:
>
>> * We need to write the application, i.e. essentially polish and update
>>   the text here: https://git.github.io/SoC-2016-Org-Application/ and
>>   update the list of project ideas and microprojects :
>>   https://git.github.io/SoC-2017-Ideas/
>>   https://git.github.io/SoC-2016-Microprojects/
>
> That can be done incrementally by people who care (especially mentors)
> over the next week or so, and doesn't require any real admin
> coordination. If it happens and the result looks good, then the
> application process is pretty straightforward.
>
> If it doesn't, then we probably ought not to participate in GSoC.

OK, it seems the last message did not raise a lot of enthousiasm (unless
I missed some off-list discussion at Git-Merge?).

The application deadline is tomorrow. I think it's time to admit that we
won't participate this year, unless someone steps in really soon.

If we don't participate, I'll add a disclaimer at the top of the
SoC-related pages on git.github.io to make sure students don't waste
time preparing an application.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH v2 1/1] status: be prepared for not-yet-started interactive rebase

2017-01-26 Thread Matthieu Moy
Johannes Schindelin  writes:

> Some developers might want to call `git status` in a working
> directory where they just started an interactive rebase, but the
> edit script is still opened in the editor.
>
> Let's show a meaningful message in such cases.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  t/t7512-status-help.sh | 19 +++
>  wt-status.c| 14 ++
>  2 files changed, 29 insertions(+), 4 deletions(-)

The patch looks good to me.

> @@ -1166,8 +1170,10 @@ static void show_rebase_information(struct wt_status 
> *s,
>   struct string_list yet_to_do = STRING_LIST_INIT_DUP;
>  
>   read_rebase_todolist("rebase-merge/done", &have_done);
> - read_rebase_todolist("rebase-merge/git-rebase-todo", 
> &yet_to_do);
> -
> + if (read_rebase_todolist("rebase-merge/git-rebase-todo",
> +  &yet_to_do))
> + status_printf_ln(s, color,
> + _("git-rebase-todo is missing."));

I first was surprised not to see this "git-rebase-todo" in the output of
status, but the testcase tests a missing 'done', not a missing todo, so
it's normal.

Thanks,

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


GSoC 2017: application open, deadline = February 9, 2017

2017-01-23 Thread Matthieu Moy
Hi,

The Google Summer of Code 2017 program is launched
(https://summerofcode.withgoogle.com/).

Last year, Pranit Bauva worked on porting "git bisect" from shell to C,
mentored by Christian and Lars (I didn't follow closely, but essentially
many preparatory steps, cleanups and tests were merged in master, and
patches starting the real port are still queued in pu). The org admins
were Peff and I.

The application deadline is February 9, 2017, i.e. a bit more than 2
weeks from now. So, we should decide quickly whether or not to
participate, and if so work on the application.

On my side, I've been struggling to find some time budget to allocate to
Git since last year and I couldn't even keep up with discussions on this
list :-(. Last summer, I thought that being an org co-admin would help,
but it didn't. So, as much as possible, I'd like to avoid being an org
admin this year. It's not a lot of work (much, much less than being a
mentor!), but if I manage to get some time to work for Git, I'd rather
do that on coding and reviewing this year.

So, a bunch of unanswered questions:

* Does the git.git community want to participate in GSoC this year?
  Actually, I have mixed feelings about this: I do like GSoC, but it
  seems we lack reviewer time more than coder time, and GSoC does not
  make it better. OTOH, a GSoC participant is a potential reviewer in
  the long run ...

* Does the libgit2 community want to participate in GSoC? If so, we
  should clarify the application process. Last year, I wrote (too late):

  > It's essentially too late to change this for this year, but next
  > time we should discuss earlier about how we want to organize this
  > git.git/libgit2 thing. For example, I think it would make little sense
  > to have a git.git microproject and then apply for a libgit2 project
  > since we have very different ways of interacting. And honestly, right
  > now the application is really git.git-centric so I don't think it
  > attracts students towards libgit2. So, if you want to attract more
  > students, we should work on that.

If the answer to one of the above question is yes, then:

* Who's willing to mentor? and to admin?

* We need to write the application, i.e. essentially polish and update
  the text here: https://git.github.io/SoC-2016-Org-Application/ and
  update the list of project ideas and microprojects :
  https://git.github.io/SoC-2017-Ideas/
  https://git.github.io/SoC-2016-Microprojects/

Cheers,

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH] Documentation/bisect: improve on (bad|new) and (good|bad)

2017-01-16 Thread Matthieu Moy
Junio C Hamano  writes:

> Christian Couder  writes:
>
>> The following part of the description:
>>
>> git bisect (bad|new) []
>> git bisect (good|old) [...]
>>
>> may be a bit confusing, as a reader may wonder if instead it should be:
>>
>> git bisect (bad|good) []
>> git bisect (old|new) [...]
>>
>> Of course the difference between "[]" and "[...]" should hint
>> that there is a good reason for the way it is.
>>
>> But we can further clarify and complete the description by adding
>> "" and "" to the "bad|new" and "good|old"
>> alternatives.
>>
>> Signed-off-by: Christian Couder 
>> ---
>>  Documentation/git-bisect.txt | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Thanks.  The patch looks good.

Looks good to me too.

> I think the answer to the question "why do we think we need a single
> bisect/bad?" is "because bisection is about assuming that there is
> only one commit that flips the tree state from 'old' to 'new' and
> finding that single commit".

I wouldn't say it's about "assuming" there's only one commit, but it's
about finding *one* such commit, i.e. it works if there are several such
commits, but won't find them all.

> But what if bad-A and bad-B have more than one merge bases?  We
> won't know which side the badness came from.
>
>   o---o---o---bad-A
>  / \ / 
> -Good---o---o---o   / 
>  \ / \
>   o---o---o---bad-B
>
> Being able to bisect the region of DAG bound by "^Good bad-A bad-B"
> may have value in such a case.  I dunno.

I could help finding several guilty commits, but anyway you can't
guarantee you'll find them all as soon as you use a binary search: if
the history looks like

--- Good --- Bad --- Good --- Good --- Bad --- Good --- Bad

then without examining all commits, you can't tell how many good->bad
switches occured.

But keeping several bad commits wouldn't help keeping the set of
potentially guilty commits small: bad commits appear on the positive
side in "^Good bad-A bad-B", so having more bad commits mean having a
larger DAG to explore (which is a bit counter-intuitive: without
thinking about it I'd have said "more info => less commits to explore").

So, if finding all guilty commits is not possible, I'm not sure how
valuable it is to try to find several of them.

OTOH, keeping several good commits is needed to find a commit for which
all parents are good and the commit is bad, i.e. distinguish

Good
\
 Bad <-- this is the one.
/
Good

and

Good
\
 Bad <-- need to dig further
/
 Bad

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: "Your branch is ahead of 'origin' by X commits"

2016-12-02 Thread Matthieu Moy
"Alfonsogonzalez, Ernesto (GE Digital)" 
writes:

> Hi,
>
> Git status tells me "Your branch is ahead of 'origin' by 108 commits.²,
> but my local and origin/master are pointing to the same commit.
>
> What am I doing wrong?
>
> $ git diff origin/master
> $ git status
> On branch master
> Your branch is ahead of 'origin' by 108 commits.

This line should say "ahead of 'origin/master'" in common setups, where
'origin/master' is the remote-tracking branch configured as upstream for
'master'.

My guess is that you have a badly configured upstream.

What does "git pull -v" say? What's the content of the [branch "master"]
section of .git/config?

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH v7 14/17] ref-filter: allow porcelain to translate messages in the output

2016-11-21 Thread Matthieu Moy
Karthik Nayak  writes:

> cc'in Matthieu since he wrote the patch.
>
> On Sat, Nov 19, 2016 at 4:16 AM, Jakub Narębski  wrote:
>> W dniu 08.11.2016 o 21:12, Karthik Nayak pisze:
>>> From: Karthik Nayak 
>>>
>>> Introduce setup_ref_filter_porcelain_msg() so that the messages used in
>>> the atom %(upstream:track) can be translated if needed. This is needed
>>> as we port branch.c to use ref-filter's printing API's.
>>>
>>> Written-by: Matthieu Moy 
>>> Mentored-by: Christian Couder 
>>> Mentored-by: Matthieu Moy 
>>> Signed-off-by: Karthik Nayak 
>>> ---
>>>  ref-filter.c | 28 
>>>  ref-filter.h |  2 ++
>>>  2 files changed, 26 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/ref-filter.c b/ref-filter.c
>>> index b47b900..944671a 100644
>>> --- a/ref-filter.c
>>> +++ b/ref-filter.c
>>> @@ -15,6 +15,26 @@
>>>  #include "version.h"
>>>  #include "wt-status.h"
>>>
>>> +static struct ref_msg {
>>> + const char *gone;
>>> + const char *ahead;
>>> + const char *behind;
>>> + const char *ahead_behind;
>>> +} msgs = {
>>> + "gone",
>>> + "ahead %d",
>>> + "behind %d",
>>> + "ahead %d, behind %d"
>>> +};
>>> +
>>> +void setup_ref_filter_porcelain_msg(void)
>>> +{
>>> + msgs.gone = _("gone");
>>> + msgs.ahead = _("ahead %d");
>>> + msgs.behind = _("behind %d");
>>> + msgs.ahead_behind = _("ahead %d, behind %d");
>>> +}
>>
>> Do I understand it correctly that this mechanism is here to avoid
>> repeated calls into gettext, as those messages would get repeated
>> over and over; otherwise one would use foo = N_("...") and _(foo),
>> isn't it?

That's not the primary goal. The primary goal is to keep untranslated,
and immutable messages in plumbing commands. We may decide one day that
_("gone") is not the best message for the end user and replace it with,
say, _("vanished"), but the "gone" has to remain the same forever and
regardless of the user's config for scripts using it.

We could have written

  in_porcelain ? _("gone") : "gone"

here and there in the code, but having a single place where we set all
the messages seems simpler. Call setup_ref_filter_porcelain_msg() and
get the porcelain messages, don't call it and keep the plumbing
messages.

Note that it's not the first place in the code where we do this, see
setup_unpack_trees_porcelain in unpack-trees.c for another instance.

Karthik: the commit message could be improved, for example:

Introduce setup_ref_filter_porcelain_msg() so that the messages used in
the atom %(upstream:track) can be translated if needed. By default, keep
the messages untranslated, which is the right behavior for plumbing
commands. This is needed as we port branch.c to use ref-filter's
printing API's.

Why not a comment right below "} msgs = {" saying e.g.:

/* Untranslated plumbing messages: */

>> I wonder if there is some way to avoid duplication here, but I don't
>> see anything easy and safe (e.g. against running setup_*() twice).

I think we do need duplication for the reason above.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [RFC] Add way to make Git credentials accessible from clean/smudge filter

2016-11-10 Thread Matthieu Moy
Lars Schneider  writes:

> I haven't looked at an implemenation approach at all. I wonder if this could
> be OK from a conceptional point of view or if there are obvious security 
> problems that I am missing.

Did you consider just running "git credential" from the filter? It may
not be the perfect solution, but it should work. I already used it to
get credential from a remote-helper (git-remote-mediawiki). When
prompting credentials interactively, it grabs the terminal directly, so
it work even if stdin/stdout are used for the protocol.

Asking the main git process to get the credentials probably has added
value like the ability to prompt once and use the same for several
filter processes.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: [PATCH v15 01/27] bisect--helper: use OPT_CMDMODE instead of OPT_BOOL

2016-10-27 Thread Matthieu Moy
Junio C Hamano  writes:

> Cc'ed those who touched either "git-bisect.sh" or "builtin/bisect-helper.c"
> in our relatively recent past.
>
> Does any of you (and others on the list) have time and inclination
> to review this series?

Unfortunately, I have essentially zero-bandwidth to do that in the near
future :-(.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: Drastic jump in the time required for the test suite

2016-10-21 Thread Matthieu Moy
René Scharfe  writes:

> I get this on WSL with prove -j8:
>
> Files=750, Tests=13657, 906 wallclock secs ( 8.51 usr 17.17 sys + 282.62 cusr 
> 3731.85 csys = 4040.15 CPU)
>
> And this for a run on Debian inside a Hyper-V VM on the same system:
>
> Files=759, Tests=13895, 99 wallclock secs ( 4.81 usr  1.06 sys + 39.70 cusr 
> 25.82 csys = 71.39 CPU)

What about the same without WSL on windows?

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


[PATCH 2/2] Git.pm: add comment pointing to t9000

2016-10-21 Thread Matthieu Moy
parse_mailboxes should probably eventually be completely equivalent to
Mail::Address, and if this happens we can drop the Mail::Address
dependency. Add a comment in the code reminding the current state of the
code, and point to the corresponding failing test to help future
contributors to get it right.

Signed-off-by: Matthieu Moy 
---
 perl/Git.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/perl/Git.pm b/perl/Git.pm
index 42e0895ef7..8bb2b7c7e3 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -870,6 +870,8 @@ Return an array of mailboxes extracted from a string.
 
 =cut
 
+# Very close to Mail::Address's parser, but we still have minor
+# differences in some cases (see t9000 for examples).
 sub parse_mailboxes {
my $re_comment = qr/\((?:[^)]*)\)/;
my $re_quote = qr/"(?:[^\"\\]|\\.)*"/;
-- 
2.10.1.651.gffd0de0



  1   2   3   4   5   6   7   8   9   10   >