[PATCH] Make git log work for git CWD outside of work tree

2017-04-08 Thread Danny Sauer
Make git log's `--use-mailmap` argument works if the GIT_DIR & GIT_WORK_TREE
env vars are set and git is run from outside of work tree.  Without the
NEED_WORK_TREE set on the log subcommand, .mailmap is silently not found.

Signed-off-by: Danny Sauer 
---

Notes:
I'm not entirely sure if this is the best way to fix it, as my git
internals knowledge is pretty weak. But this /seems/ reasonable to me, and
passes all of the current test cases.  If there's a more appropriate way to
make `--use-mailmap` work properly when `git log` is run outside of the
tree, I'd be excited about the opportunity to learn. :)

 git.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git.c b/git.c
index 8ff44f0..e147f01 100644
--- a/git.c
+++ b/git.c
@@ -440,7 +440,7 @@ static struct cmd_struct commands[] = {
{ "init", cmd_init_db },
{ "init-db", cmd_init_db },
{ "interpret-trailers", cmd_interpret_trailers, RUN_SETUP_GENTLY },
-   { "log", cmd_log, RUN_SETUP },
+   { "log", cmd_log, RUN_SETUP | NEED_WORK_TREE },
{ "ls-files", cmd_ls_files, RUN_SETUP | SUPPORT_SUPER_PREFIX },
{ "ls-remote", cmd_ls_remote, RUN_SETUP_GENTLY },
{ "ls-tree", cmd_ls_tree, RUN_SETUP },
-- 
2.7.4



Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Jacob Keller
On Sat, Apr 8, 2017 at 3:13 PM, Jeff King  wrote:
> On Sat, Apr 08, 2017 at 02:54:29PM -0700, Jacob Keller wrote:
>
>> > So the best way to use it is something like:
>> >
>> >   git fetch  ;# update 'master' from remote
>> >   git tag base master;# mark our base point
>> >   git rebase -i master   ;# rewrite some commits
>> >   git push --force-with-lease=master:base master:master
>> [...]
>>
>> What if we added a separate command something like:
>>
>> git create-lease
>>
>> which you're expected to run at the start of a rewind-y operation and
>> it creates a tag (or some other ref like a tag but in a different
>> namespace) which is used by force-with-lease?
>
> So then you replace that "git tag" command above with "git
> create-lease". I think it's an incremental improvement because it would
> probably record which branch you're leasing. But I think the more
> important issue is that the user needs to remember to take the lease in
> the first place. A create-lease command doesn't help that.
>

Well, if we don't mind backwards compatibility breaking, we could
require that uesrs run create-lease, and refuse to accept anything
that isn't a lease ref? That would force the user to have created a
lease which should help? But that IS backwards incompatible...

>> It might be possible to generate these lease tags prior to operations
>> which modify history and then maybe having a way to list them so you
>> can select which one you meant when you try to use force-with-lease..
>
> So yeah, I think that is the more interesting direction. I hadn't
> considered resolving the multiple-operation ambiguity at push time. But
> I guess it would be something like "you did a rebase on sha1 X at time
> T, and then one on Y at time T+N", and you pick which one you're
> expecting. Or maybe it could even cull old leases that are still
> ancestors of your push (so old, already-pushed rebases aren't
> mentioned). I suspect that ends up being equivalent to "clear the leases
> when you push". And I think that may be converging on the "integrate"
> refs that Stefan is talking about elsewhere (or some isomorphism of it).
>
> -Peff

Yea, I agree this sort of direction is nicer.

Thanks,
Jake


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Jeff King
On Sat, Apr 08, 2017 at 02:54:29PM -0700, Jacob Keller wrote:

> > So the best way to use it is something like:
> >
> >   git fetch  ;# update 'master' from remote
> >   git tag base master;# mark our base point
> >   git rebase -i master   ;# rewrite some commits
> >   git push --force-with-lease=master:base master:master
> [...]
> 
> What if we added a separate command something like:
> 
> git create-lease
> 
> which you're expected to run at the start of a rewind-y operation and
> it creates a tag (or some other ref like a tag but in a different
> namespace) which is used by force-with-lease?

So then you replace that "git tag" command above with "git
create-lease". I think it's an incremental improvement because it would
probably record which branch you're leasing. But I think the more
important issue is that the user needs to remember to take the lease in
the first place. A create-lease command doesn't help that.

> It might be possible to generate these lease tags prior to operations
> which modify history and then maybe having a way to list them so you
> can select which one you meant when you try to use force-with-lease..

So yeah, I think that is the more interesting direction. I hadn't
considered resolving the multiple-operation ambiguity at push time. But
I guess it would be something like "you did a rebase on sha1 X at time
T, and then one on Y at time T+N", and you pick which one you're
expecting. Or maybe it could even cull old leases that are still
ancestors of your push (so old, already-pushed rebases aren't
mentioned). I suspect that ends up being equivalent to "clear the leases
when you push". And I think that may be converging on the "integrate"
refs that Stefan is talking about elsewhere (or some isomorphism of it).

-Peff


Re: broken text encoding in commit author name

2017-04-08 Thread Igor Djordjevic
Hi Michał,

On 08/04/2017 22:30, Michał Walenciak wrote:
> Hi
> 
> since git 2.12 I'm having random problems with broken text enciding in author 
> name.
> 
> As an example broken commit:
> Author: Michał Walenciak   2017-04-07 21:38:23
> Committer: Michał Walenciak   2017-04-07 21:38:33
> 
> good commit: 
> Author: Michał Walenciak   2017-04-08 21:21:15
> Committer: Michał Walenciak   2017-04-08 21:21:15
> 
> Broken commits are quite rare and I cannot find any rule when they happen.
> Git 2.12.2 here, Arch linux.

For what it`s worth, looking at the timestamp of the broken commit 
example you`ve shown, could the problem be related to amending commits?

If so, are you amending through the command line, or maybe using some 
tool (like "git-gui")...?

There was a similar discussion recently[1], not sure if it`s related 
to your case.

[1] 
https://public-inbox.org/git/CAEPqvoz8s=CVLABuXx-zOzryrXwr_cb39G2TYJvvF
xryzpc...@mail.gmail.com/

Regards,
Buga


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Jeff King
On Sat, Apr 08, 2017 at 05:03:06PM +0200, Stefan Haller wrote:

> Jeff King  wrote:
> 
> > I think Matt's point is just that the default, to use origin/branch, is
> > unsafe. It's convenient when you don't have extra fetches, but that
> > convenience may not be worth the potential surprise.
> 
> I don't think "surprise" is the right word here. The point of
> --force-with-lease is to provide a guarantee, so if you can't trust the
> guarantee, it makes the feature rather pointless.

You can trust it under a certain set of conditions. If you break those
conditions, it doesn't work. So that's why I said "surprise": most users
aren't going to be aware of those conditions.

But I also used the word "unsafe". That, or "dangerous", is certainly
accurate. And I don't at all disagree that the situation should be
improved.

-Peff


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Jacob Keller
On Sat, Apr 8, 2017 at 2:29 AM, Jeff King  wrote:
> On Sat, Apr 08, 2017 at 09:35:04AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> Is it correct that you'd essentially want something that works like:
>>
>> git push --force-with-lease=master:master origin master:master
>
> I don't think that would do anything useful. It would reject any push
> where the remote "master" is not the same as your own master. And of
> course if they _are_ the same, then the push is a noop.
>
>> I haven't used this feature but I'm surprised it works the way it
>> does, as you point out just having your remote refs updated isn't a
>> strong signal for wanting to clobber whatever that ref points to.
>
> The point of the --force-with-lease feature is that you would mark a
> point in time where you started some rewind-y operation (like a rebase),
> and at the end you would want to make sure nobody had moved the remote
> ref when you push over it (which needs to be a force because of the
> rewind).
>
> So the best way to use it is something like:
>
>   git fetch  ;# update 'master' from remote
>   git tag base master;# mark our base point
>   git rebase -i master   ;# rewrite some commits
>   git push --force-with-lease=master:base master:master
>
> That final operation will fail if somebody else pushed in the meantime.
> But obviously this workflow is a pain, because you have to manually mark
> the start of the unsafe operation with a tag.
>
> If you haven't fetched in the meantime, then origin/master is a good
> approximation of "base". But if you have fetched, then it is worthless.
>
> It would be nice if we could automatically deduce the real value of
> base. I don't think we could do it in a foolproof way, but I wonder if
> we could come close in some common circumstances. For instance, imagine
> that unsafe operations like rebase would note that "master" has an
> upstream of "origin/master", and would record a note saying "we took a
> lease for origin/master at sha1 X".
>
> One trouble with that is that you may perform several unsafe operations.
> For example, imagine it takes you multiple rebases to achieve your final
> state:
>
>   git fetch
>   git rebase -i master
>   git rebase -i master
>   git push --force-with-lease=master
>
> and that --force-with-lease now defaults to whatever lease-marker is
> left by rebase. Which marker should it respect? If the second one, then
> it's unsafe. But if the first one, then how do we deal with stale
> markers?
>
> Perhaps it would be enough to reset the markers whenever the ref is
> pushed. I haven't thought it through well enough to know whether that
> just hits more corner cases.
>
> -Peff

What if we added a separate command something like:

git create-lease

which you're expected to run at the start of a rewind-y operation and
it creates a tag (or some other ref like a tag but in a different
namespace) which is used by force-with-lease?

However, I think using origin/master works fine as long as you don't auto-fetch.

If you're doing it right, you can handle origin/master updates by
checking that your rewind-y stuff is correct for the new origin/master
RIGHT before you push. The tricky part here is that you have to
remember to check after every fetch before you push. This is why I
would always create my own tag or lease reference prior to the use.

It might be possible to generate these lease tags prior to operations
which modify history and then maybe having a way to list them so you
can select which one you meant when you try to use force-with-lease..

Thanks,
Jake


Re: [PATCH] submodule: prevent backslash expantion in submodule names

2017-04-08 Thread Joachim Durchholz

Am 08.04.2017 um 12:59 schrieb Jeff King:

The reason I mentioned escaping earlier is I wondered what would happen
when the submodule starts with a double-quote, or has a newline in the
name.


I have tested newlines within the name, these work fine.

I also tested double and single quotes within the name, but not at 
beginning or end.



So I think your patch is fine there. But it does raise a few concerns.
It looks like git-submodule does not cope well with exotic filenames:

  $ git submodule add /some/repo "$(printf 'sub with\nnewline')"
  Cloning into '/home/peff/tmp/sub with
  newline'...
  done.
  error: invalid key (newline): submodule.sub with
  newline.url
  error: invalid key (newline): submodule.sub with
  newline.path
  Failed to register submodule 'sub with
  newline'


Strange. I'm running essentially the same kind of request, and things 
work fine.
Might be due to me using Python3 instead of bash, or maybe due to 
different versions of git.


If anybody is interested, I can publish my test code on github, it was 
scheduled to land there anyway.



I'm not too worried about that.  It's a nonsense request, and our config
format has no syntactic mechanism to represent that key.


Oh. I've been thinking that the quoted format is exactly for that kind 
of stuff.
Though it might be prone to eol conversion if a submodule name contains 
crlf sequences.


Also, funny behavour. Experience has taught me that funny behaviour, if 
it isn't exploitable today, may combine with some new funny behaviour in 
a future version of the same software. So I'm worried even with that.


This is starting to look like a can of worms to me... one way to "close 
the lid" would be if git

* defined what's a valid submodule name,
* rejected invalid submodule names, and
* documented validity rules in the git-submodule docs.
YMMV, just my 2 cents :-)

Regards,
Jo


broken text encoding in commit author name

2017-04-08 Thread Michał Walenciak
Hi

since git 2.12 I'm having random problems with broken text enciding in author 
name.

As an example broken commit:
Author: Michał Walenciak   2017-04-07 21:38:23
Committer: Michał Walenciak   2017-04-07 21:38:33

good commit: 
Author: Michał Walenciak   2017-04-08 21:21:15
Committer: Michał Walenciak   2017-04-08 21:21:15

Broken commits are quite rare and I cannot find any rule when they happen.
Git 2.12.2 here, Arch linux.

regards

-- 
Michał Walenciak
gmail.com kicer86
http://kicer.sileman.net.pl
gg: 3729519


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Stefan Haller
Ævar Arnfjör? Bjarmason  wrote:

> On Sat, Apr 8, 2017 at 5:03 PM, Stefan Haller  wrote:
>
> > Here's a rough proposal for how I would imagine this to work.
> >
> > For every local branch that has a remote tracking branch, git maintains
> > a new config entry branch.*.integrated, which records the sha1 of the
> > last upstream commit that was integrated into the local branch.
> 
> Can you elaborate on what "integrate" means in this context?
> 
> In some ways the entire point of this feature is that you're trying to
> push over history that you don't want to integrate.
> 
> I.e. you're trying to force push your unrelated X over remote history
> A, but not more recent arbitrary history B based on A which someone
> may have just pushed.
> 
> I'm having a hard time imagining how anything merge/rebase/whatever
> would place in branch.*.integrated wouldn't just be a roundabout way
> of recording the info we now record via the tracking branch, or in
> cases where that's auto-updated for some reason having another
> tracking branch as my "[PATCH] push: document & test
> --force-with-lease with multiple remotes" suggests.

It doesn't matter whether the history you are overwriting is arbitrary,
or whether the new history you are pushing is related or unrelated to
what you are overwriting. What matters is whether you are aware of what
you are overwriting.

I want to record all cases where the local branch is brought up to date
with the tracking branch (or vice versa), i.e. mostly push and pull,
because I know that after pushing or pulling, my local branch is up to
date (in some way) with the tracking branch. If I then rewrite the local
branch, I know it is safe to push it *if* the branch on the remote is
still in the same state as what I recorded for last push or pull.

If the tracking branch is updated by fetch though, then my local branch
is not brought up to date with the remote branch, so I may be
overwriting stuff that appeared on the remote without me being aware of
it.

It may well be that there are better names then "integrate"; suggestions
welcome.

Your suggestion to use a second remote doesn't seem like a satisfactory
solution to me, firstly because it's extra work and complexity for the
user, and second because it doesn't solve the problem of working with
more than one local branch (pulling one branch amounts to a fetch for
the other).


-- 
Stefan Haller
Berlin, Germany
http://www.haller-berlin.de/


Re: Feature request: --format=json

2017-04-08 Thread Ævar Arnfjörð Bjarmason
On Sat, Apr 8, 2017 at 6:07 PM, Fred .Flintstone  wrote:
> $ git log --format=json
> [{
> "commit": "64eabf050e315a4c7a11e0c05ca163be7cf9075e",
> "tree": "b1e977800f40bbf6de906b1fe4f2de4b4b14f0fd",
> "author": "Tux  1490981516 +0200",
> "committer": "Tux  1490981516 +0200",
> "message": "This is a test commit",
> "long_message": "This explains in more details the commit"
> }]
>
> This would make it easy to parse the output.

The git-log command isn't plumbing that's meant for machines, but the
git-for-each-ref command is what you're most likely looking for.

It doesn't have JSON output, but you can make e.g. --format emit
something even more easily parsable, e.g. a version of what you have
with each field delimited by a custom delimiter, and then split on
that.

It does have --perl, --tcl etc. options to make it easy to quote the
fields, however there's no logic to manage the state machine JSON
would need to omit trailing commas, whereas emitting output for
languages like Perl where trailing commas don't matter is much easier.


Feature request: --format=json

2017-04-08 Thread Fred .Flintstone
$ git log --format=json
[{
"commit": "64eabf050e315a4c7a11e0c05ca163be7cf9075e",
"tree": "b1e977800f40bbf6de906b1fe4f2de4b4b14f0fd",
"author": "Tux  1490981516 +0200",
"committer": "Tux  1490981516 +0200",
"message": "This is a test commit",
"long_message": "This explains in more details the commit"
}]

This would make it easy to parse the output.


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Ævar Arnfjörð Bjarmason
On Sat, Apr 8, 2017 at 5:03 PM, Stefan Haller  wrote:
> Matt McCutchen  wrote:
>
>> When I'm rewriting history, "git push --force-with-lease" is a nice
>> safeguard compared to "git push --force", but it still assumes the
>> remote-tracking ref gives the old state the user wants to overwrite.
>> Tools that do an implicit fetch, assuming it to be a safe operation,
>> may break this assumption.  In the worst case, Visual Studio Code does
>> an automatic fetch every 3 minutes by default [1], making
>> --force-with-lease pretty much reduce to --force.
>>
>> For a safer workflow, "git push" would check against a separate "old"
>> ref that isn't updated by "git fetch", but is updated by "git push" the
>> same way the remote-tracking ref is and maybe also by commands that
>> update the local branch to take into account remote changes (I'm not
>> sure what reasonable scenarios there are, if any).
>
> Here's a rough proposal for how I would imagine this to work.
>
> For every local branch that has a remote tracking branch, git maintains
> a new config entry branch.*.integrated, which records the sha1 of the
> last upstream commit that was integrated into the local branch.

Can you elaborate on what "integrate" means in this context?

In some ways the entire point of this feature is that you're trying to
push over history that you don't want to integrate.

I.e. you're trying to force push your unrelated X over remote history
A, but not more recent arbitrary history B based on A which someone
may have just pushed.

I'm having a hard time imagining how anything merge/rebase/whatever
would place in branch.*.integrated wouldn't just be a roundabout way
of recording the info we now record via the tracking branch, or in
cases where that's auto-updated for some reason having another
tracking branch as my "[PATCH] push: document & test
--force-with-lease with multiple remotes" suggests.

But maybe I'm just missing something obvious...


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Stefan Haller
Jeff King  wrote:

> I think Matt's point is just that the default, to use origin/branch, is
> unsafe. It's convenient when you don't have extra fetches, but that
> convenience may not be worth the potential surprise.

I don't think "surprise" is the right word here. The point of
--force-with-lease is to provide a guarantee, so if you can't trust the
guarantee, it makes the feature rather pointless.


-- 
Stefan Haller
Berlin, Germany
http://www.haller-berlin.de/


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Stefan Haller
Matt McCutchen  wrote:

> When I'm rewriting history, "git push --force-with-lease" is a nice
> safeguard compared to "git push --force", but it still assumes the
> remote-tracking ref gives the old state the user wants to overwrite. 
> Tools that do an implicit fetch, assuming it to be a safe operation,
> may break this assumption.  In the worst case, Visual Studio Code does
> an automatic fetch every 3 minutes by default [1], making
> --force-with-lease pretty much reduce to --force.
> 
> For a safer workflow, "git push" would check against a separate "old"
> ref that isn't updated by "git fetch", but is updated by "git push" the
> same way the remote-tracking ref is and maybe also by commands that
> update the local branch to take into account remote changes (I'm not
> sure what reasonable scenarios there are, if any).

Here's a rough proposal for how I would imagine this to work.

For every local branch that has a remote tracking branch, git maintains
a new config entry branch.*.integrated, which records the sha1 of the
last upstream commit that was integrated into the local branch.

When --force-with-lease is used without an argument, it will use the
values of "branch.*.remote:branch.*.integrated" as an argument. If
either doesn't exist, the push fails (this is essential).

Initially the "integrated" entry is created at the same time that
branch.*.merge is, i.e. with commits like "git checkout -b
name-of-remote-branch", or "git branch --set-upstream-to" and the like,
and it will be set to the sha1 that the tip of the remote tracking
branch has at that time.

Then, every command that either integrates the remote tracking branch
into the local branch, or updates the remote tracking branch to the
local branch, will update the value of the "integrated" entry. The most
obvious ones are "git pull" and "git push", or course; others that may
have to be supported are "git rebase @{u}", "git rebase --onto @{u}",
"git reset @{u}", and probably others. The nice thing about these is
that initially they don't have to be supported for the feature to still
be useful. After using one of them, push --force-with-lease will fail,
and the user can then investigate the situation and either use push -f
or manually update branch.*.integrated when they have convinced
themselves that everything is fine.

I find it essential that --force-with-lease might fail erroneously, but
never succeed erroneously, and I think this proposal would guarantee
that.


-- 
Stefan Haller
Berlin, Germany
http://www.haller-berlin.de/


Re: [PATCH 4/6] receive-pack: quarantine objects until pre-receive accepts

2017-04-08 Thread Ævar Arnfjörð Bjarmason
On Sat, Oct 1, 2016 at 11:12 AM, Jeff King  wrote:
> On Fri, Sep 30, 2016 at 03:36:30PM -0400, Jeff King wrote:
>
>> @@ -1639,6 +1666,18 @@ static const char *unpack(int err_fd, struct 
>> shallow_info *si)
>>   argv_array_push(, alt_shallow_file);
>>   }
>>
>> + tmp_objdir = tmp_objdir_create();
>> + if (!tmp_objdir)
>> + return "unable to create temporary object directory";
>> + child.env = tmp_objdir_env(tmp_objdir);
>
> One thing to note here: this new code kicks in all the time. My
> reasoning was that there's basically no time you _wouldn't_ want it to,
> and certainly that was the case for us when I wrote it. But I tried to
> think of user-visible changes. Here's what I came up with:
>
>   - we currently leave the tmp_pack_* for a failed push sitting around
> (e.g., if the client hangs up halfway through, or index-pack rejects
> the pack for some reason). But with this series, it would always be
> cleaned up. That's a very good thing if you're running a git hosting
> site. It might make things harder if you're debugging.
>
> I don't think it's a good reason not to enable this by default, but
> it _could_ be a reason to have a config switch to turn it off
> temporarily (or just leave the "incoming-*" directory in place).
>
>   - the environment that pre-receive pack runs in has access to objects
> that the rest of the repository doesn't. So if you were to do
> something silly in your pre-receive like:
>
>   # reject the push, but log a copy of the objects
>   git update-ref refs/rejected/$(date +%s) $new_sha1
>   exit 1
>
> Then your ref-update would succeed (you have $new_sha1), but the
> objects would be deleted immediately afterwards. I find this a
> somewhat questionable pattern, and I have no idea if anybody else
> has thought of it. But it _does_ work today, and not with this
> series.
>
> I don't think it would be too hard to put a config conditional around
> this tmp_objdir_create(). And then all of the tmp_objdir_env() calls
> would just return NULL, and effectively become noops.

Very late reply, but I have a question about this. Is there anything
you can do on the plumbing level to figure out which area an object is
in (of course that's not mutually exclusive).

The use-case for that is e.g. having a hook that rejects large
binaries, but has an exception for a binary that's been added in the
past, before your change there's no optimal way to find that out,
you'd need to go over the whole history and list all blobs that have
ever been added, with your change it would be trivial if something
could give me "this object is in the quarantine area", but AFAICT
there's no API that'll show me that.

Also, I think this whole thing could really do with some documentation
in githooks(5). E.g. what hooks does it apply for? The test is just
for pre-receive but the patch changes run_update_hook(), does it also
take effect for update? Ditto the caveat about update-ref.


Assalamu`Alaikum.

2017-04-08 Thread Mohammad Ouattara



Dear Sir/Madam.

Assalamu`Alaikum.

I am Dr mohammad ouattara, I have  ($10.6 Million us dollars) to transfer into 
your account,

I will send you more details about this deal and the procedures to follow when 
I receive a positive response from you, 

Have a great day,

Dr mohammad ouattara.


Re: [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout

2017-04-08 Thread René Scharfe

Am 07.04.2017 um 17:53 schrieb g...@jeffhostetler.com:

From: Jeff Hostetler 

Teach traverse_trees_recursive() to not do redundant ODB
lookups when both directories refer to the same OID.

In operations such as read-tree, checkout, and merge when
the differences between the commits are relatively small,
there will likely be many directories that have the same
SHA-1.  In these cases we can avoid hitting the ODB multiple
times for the same SHA-1.

This patch handles n=2 and n=3 cases and simply copies the
data rather than repeating the fill_tree_descriptor().


On the Windows repo (500K trees, 3.1M files, 450MB index),
this reduced the overall time by 0.75 seconds when cycling
between 2 commits with a single file difference.

(avg) before: 22.699
(avg) after:  21.955
===


Using the p0004-read-tree test (posted earlier this week)
with 1M files on Linux:

before:
$ ./p0004-read-tree.sh
0004.5: switch work1 work2 (1003037)   11.99(8.12+3.32)
0004.6: switch commit aliases (1003037)11.95(8.20+3.14)

after:
$ ./p0004-read-tree.sh
0004.5: switch work1 work2 (1003037)   11.02(7.71+2.78)
0004.6: switch commit aliases (1003037)10.95(7.57+2.82)


Signed-off-by: Jeff Hostetler 
---
  unpack-trees.c | 23 +++
  1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3a8ee19..143c5d9 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -531,6 +531,11 @@ static int switch_cache_bottom(struct traverse_info *info)
return ret;
  }
  
+static inline int are_same_oid(struct name_entry *name_j, struct name_entry *name_k)

+{
+   return name_j->oid && name_k->oid && !oidcmp(name_j->oid, name_k->oid);
+}
+
  static int traverse_trees_recursive(int n, unsigned long dirmask,
unsigned long df_conflicts,
struct name_entry *names,
@@ -554,10 +559,20 @@ static int traverse_trees_recursive(int n, unsigned long 
dirmask,
newinfo.df_conflicts |= df_conflicts;
  
  	for (i = 0; i < n; i++, dirmask >>= 1) {

-   const unsigned char *sha1 = NULL;
-   if (dirmask & 1)
-   sha1 = names[i].oid->hash;
-   buf[i] = fill_tree_descriptor(t+i, sha1);
+   if (i > 0 && are_same_oid([i], [i-1])) {
+   /* implicitly borrow buf[i-1] inside tree_desc[i] */
+   memcpy([i], [i-1], sizeof(struct tree_desc));


An assignment would be simpler:

t[i] = t[i - 1];


+   buf[i] = NULL;
+   } else if (i > 1 && are_same_oid([i], [i-2])) {
+   /* implicitly borrow buf[i-2] inside tree_desc[i] */
+   memcpy([i], [i-2], sizeof(struct tree_desc));


Similar case.


+   buf[i] = NULL;


What makes the previous two entries special, or differently put: Why not 
just check *all* previous entries?  MAX_UNPACK_TREES is 8; the number of 
comparisons would just about double in the worst case and stay the same 
for three trees or less.  The order of four trees or more wouldn't 
matter anymore.



+   } else {
+   const unsigned char *sha1 = NULL;
+   if (dirmask & 1)
+   sha1 = names[i].oid->hash;
+   buf[i] = fill_tree_descriptor(t+i, sha1);
+   }
}
  
  	bottom = switch_cache_bottom();




[PATCH 10/12] grep: change the internal PCRE macro names to be PCRE1

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Change the internal USE_LIBPCRE define, & build options flag to use a
naming convention ending in PCRE1, without changing the long-standing
USE_LIBPCRE Makefile flag which enables this code.

This is for preparation for libpcre2 support where having things like
USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely
need to for backwards compatibility with old Makefile arguments would
be confusing.

In some ways it would be better to change everything that now uses
USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying
USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time
burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their
build scripts.

However I'd like to leave the door open to making
USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease,
i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it
the default.

This code and the USE_LIBPCRE Makefile argument was added in commit
63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09). At the time there was
no indication that the PCRE project would release an entirely new &
incompatible API around 3 years later.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile  | 4 ++--
 grep.c| 6 +++---
 grep.h| 2 +-
 t/test-lib.sh | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Makefile b/Makefile
index 23945d87cf..c8a26087e3 100644
--- a/Makefile
+++ b/Makefile
@@ -1084,7 +1084,7 @@ ifdef NO_LIBGEN_H
 endif
 
 ifdef USE_LIBPCRE
-   BASIC_CFLAGS += -DUSE_LIBPCRE
+   BASIC_CFLAGS += -DUSE_LIBPCRE1
ifdef LIBPCREDIR
BASIC_CFLAGS += -I$(LIBPCREDIR)/include
EXTLIBS += -L$(LIBPCREDIR)/$(lib) 
$(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)
@@ -2235,7 +2235,7 @@ GIT-BUILD-OPTIONS: FORCE
@echo TAR=\''$(subst ','\'',$(subst ','\'',$(TAR)))'\' >>$@+
@echo NO_CURL=\''$(subst ','\'',$(subst ','\'',$(NO_CURL)))'\' >>$@+
@echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+
-   @echo USE_LIBPCRE=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' 
>>$@+
+   @echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' 
>>$@+
@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+
@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@+
@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst 
','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
diff --git a/grep.c b/grep.c
index 1575f8f9ed..99b9e9447f 100644
--- a/grep.c
+++ b/grep.c
@@ -325,7 +325,7 @@ static NORETURN void compile_regexp_failed(const struct 
grep_pat *p,
die("%s'%s': %s", where, p->pattern, error);
 }
 
-#ifdef USE_LIBPCRE
+#ifdef USE_LIBPCRE1
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
const char *error;
@@ -377,7 +377,7 @@ static void free_pcre_regexp(struct grep_pat *p)
pcre_free(p->pcre_extra_info);
pcre_free((void *)p->pcre_tables);
 }
-#else /* !USE_LIBPCRE */
+#else /* !USE_LIBPCRE1 */
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
die("cannot use Perl-compatible regexes when not compiled with 
USE_LIBPCRE");
@@ -392,7 +392,7 @@ static int pcrematch(struct grep_pat *p, const char *line, 
const char *eol,
 static void free_pcre_regexp(struct grep_pat *p)
 {
 }
-#endif /* !USE_LIBPCRE */
+#endif /* !USE_LIBPCRE1 */
 
 static int is_fixed(const char *s, size_t len)
 {
diff --git a/grep.h b/grep.h
index 267534ca24..073b0e4c92 100644
--- a/grep.h
+++ b/grep.h
@@ -1,7 +1,7 @@
 #ifndef GREP_H
 #define GREP_H
 #include "color.h"
-#ifdef USE_LIBPCRE
+#ifdef USE_LIBPCRE1
 #include 
 #else
 typedef int pcre;
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 6abf1d1918..e5cfbcc36b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1010,7 +1010,7 @@ esac
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
-test -n "$USE_LIBPCRE" && test_set_prereq PCRE
+test -n "$USE_LIBPCRE1" && test_set_prereq PCRE
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.11.0



[PATCH 12/12] grep: add support for PCRE v2

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Add support for v2 of the PCRE API. This is a new major version of
PCRE that came out in early 2015[1]. The regular expression syntax is
the same, but while similar-ish requires a different codepath to
support it.

Git can now be compiled with any combination of
USE_LIBPCRE=[YesPlease|] & USE_LIBPCRE2=[YesPlease|]. If both are
provided the version of the PCRE library can be selected at runtime
with grep.PatternType, but the default (for now) is v1. This table
shows what the various combinations do depending on what libraries Git
is compiled against:

|--+-+-+--|
| grep.PatternType | v1  | v2  | v1 && v2 |
|--+-+-+--|
| perl | v1  | v2  | v1   |
| pcre | v1  | v2  | v1   |
| pcre1| v1  | ERR | v1   |
| pcre2| ERR | v2  | v2   |
|--+-+-+--|

When Git is only compiled with v2 grep.PatternType=perl, --perl-regexp
& -P will use v2. All tests pass with this new PCRE version. When Git
is compiled with both v1 & v2 most of the tests will only test v1, but
there are some v2-specific tests that will be run.

Originally I started work on this series because my ad-hoc patch to
support v2 ("Very promising results with libpcre2",

[PATCH 06/12] log: add -P as a synonym for --perl-regexp

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Add a short -P option as a synonym for the longer --perl-regexp, for
consistency with the options the corresponding grep invocations
accept.

This was intentionally omitted in commit 727b6fc3ed ("log --grep:
accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified
future use.

Since nothing has come along in over 4 1/2 years that's wanted to use
it, it's more valuable to make it consistent with "grep" than to keep
it open for future use, and to avoid the confusion of -P meaning
different things for grep & log, as is the case with the -G option.

As noted in the aforementioned commit the --basic-regexp option can't
have a corresponding -G argument, as the log command already uses that
for -G.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/rev-list-options.txt | 1 +
 revision.c | 2 +-
 t/t4202-log.sh | 9 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index a02f7324c0..5b7fa49a7f 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -91,6 +91,7 @@ endif::git-rev-list[]
Consider the limiting patterns to be fixed strings (don't interpret
pattern as a regular expression).
 
+-P::
 --perl-regexp::
Consider the limiting patterns to be Perl-compatible regular 
expressions.
Requires libpcre to be compiled in.
diff --git a/revision.c b/revision.c
index 7ff61ff5f7..03a3a012de 100644
--- a/revision.c
+++ b/revision.c
@@ -1995,7 +1995,7 @@ static int handle_revision_opt(struct rev_info *revs, int 
argc, const char **arg
DIFF_OPT_SET(>diffopt, PICKAXE_IGNORE_CASE);
} else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED;
-   } else if (!strcmp(arg, "--perl-regexp")) {
+   } else if (!strcmp(arg, "--perl-regexp") || !strcmp(arg, "-P")) {
revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_PCRE;
} else if (!strcmp(arg, "--all-match")) {
revs->grep_filter.all_match = 1;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 00d0585253..6f7c74e027 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -320,8 +320,17 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
--grep="(1|2)" >actual.fixed.short-arg &&
git log --pretty=tformat:%s -E \
--grep="\|2" >actual.extended.short-arg &&
+   if test_have_prereq PCRE
+   then
+   git log --pretty=tformat:%s -P \
+   --grep="\((?=1)" >actual.perl.short-arg
+   fi &&
test_cmp expect.fixed actual.fixed.short-arg &&
test_cmp expect.extended actual.extended.short-arg &&
+   if test_have_prereq PCRE
+   then
+   test_cmp expect.perl actual.perl.short-arg
+   fi &&
 
git log --pretty=tformat:%s --fixed-strings \
--grep="(1|2)" >actual.fixed.long-arg &&
-- 
2.11.0



[PATCH 04/12] grep: add a test for backreferences in PCRE patterns

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Add a test for backreferences such as (.)\1 in PCRE patterns. This
test ensures that the PCRE_NO_AUTO_CAPTURE option isn't turned
on. Before this change turning it on would break these sort of
patterns, but wouldn't break any tests.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t7810-grep.sh | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 53c2ca05c4..83b0ee53be 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1102,6 +1102,13 @@ test_expect_success LIBPCRE 'grep -P -w pattern' '
test_cmp expected actual
 '
 
+test_expect_success PCRE 'grep -P backreferences work (the PCRE 
NO_AUTO_CAPTURE flag is not set)' '
+   git grep -P -h "(?P.)(?P=one)" hello_world >actual &&
+   test_cmp hello_world actual &&
+   git grep -P -h "(.)\1" hello_world >actual &&
+   test_cmp hello_world actual
+'
+
 test_expect_success 'grep -G invalidpattern properly dies ' '
test_must_fail git grep -G "a["
 '
-- 
2.11.0



[PATCH 02/12] grep: remove redundant regflags assignment under PCRE

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Remove a redundant assignment to the "regflags" variable. This
variable is only used for POSIX regular expression matching, not when
the PCRE library is used.

This redundant assignment was added as a result of copy/paste
programming in commit 84befcd0a4 ("grep: add a grep.patternType
configuration setting", 2012-08-03). That commit modified already
working code in commit cca2c172e0 ("git-grep: do not die upon -F/-P
when grep.extendedRegexp is set.", 2011-05-09) which didn't assign to
regflags when under PCRE.

Revert back to that behavior, more to reduce "wait this is used under
PCRE how?" confusion when reading the code, than to to save ourselves
trivial CPU cycles by removing one assignment.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 grep.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.c b/grep.c
index 56ef0ecbff..8564fe726d 100644
--- a/grep.c
+++ b/grep.c
@@ -197,7 +197,6 @@ static void grep_set_pattern_type_option(enum 
grep_pattern_type pattern_type, st
case GREP_PATTERN_TYPE_PCRE:
opt->fixed = 0;
opt->pcre = 1;
-   opt->regflags &= ~REG_EXTENDED;
break;
}
 }
-- 
2.11.0



[PATCH 05/12] log: add exhaustive tests for pattern style options & config

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Add exhaustive tests for how the different grep.patternType options &
the corresponding command-line options affect git-log.

Before this change it was possible to patch revision.c so that the
--basic-regexp option was synonymous with --extended-regexp, and
--perl-regexp wasn't recognized at all, and still have 100% of the
test suite pass.

This was because the first test being modified here, added in commit
34a4ae55b2 ("log --grep: use the same helper to set -E/-F options as
"git grep"", 2012-10-03), didn't actually check whether we'd enabled
extended regular expressions as distinct from re-toggling non-fixed
string support.

Fix that by changing the pattern to a pattern that'll only match if
--extended-regexp option is provided, but won't match under the
default --basic-regexp option.

Other potential regressions were possible since there were no tests
for the rest of the combinations of grep.patternType configuration
toggles & corresponding git-log command-line options. Add exhaustive
tests for those.

The patterns being passed to fixed/basic/extended/PCRE are carefully
crafted to return the wrong thing if the grep engine were to pick any
other matching method than the one it's told to use.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t4202-log.sh | 67 +-
 1 file changed, 66 insertions(+), 1 deletion(-)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 48b55bfd27..00d0585253 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -261,7 +261,13 @@ test_expect_success 'log --grep -i' '
 
 test_expect_success 'log -F -E --grep= uses ere' '
echo second >expect &&
-   git log -1 --pretty="tformat:%s" -F -E --grep=s.c.nd >actual &&
+   git log -1 --pretty="tformat:%s" -F -E --grep="(s).c.nd" >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep= uses PCRE' '
+   echo second >expect &&
+   git log -1 --pretty="tformat:%s" -F -E --perl-regexp 
--grep="s(?=ec)econd" >actual &&
test_cmp expect actual
 '
 
@@ -279,6 +285,65 @@ test_expect_success 'log with grep.patternType 
configuration and command line' '
test_cmp expect actual
 '
 
+test_expect_success 'log with various grep.patternType configurations & 
command-lines' '
+   git init pattern-type &&
+   (
+   cd pattern-type &&
+   test_commit 1 file A &&
+   test_commit "(1|2)" file B &&
+
+   echo "(1|2)" >expect.fixed &&
+   cp expect.fixed expect.basic &&
+   cp expect.fixed expect.extended &&
+   cp expect.fixed expect.perl &&
+
+   git -c grep.patternType=fixed log --pretty=tformat:%s \
+   --grep="(1|2)" >actual.fixed &&
+   git -c grep.patternType=basic log --pretty=tformat:%s \
+   --grep="(.|.)" >actual.basic &&
+   git -c grep.patternType=extended log --pretty=tformat:%s \
+   --grep="\|2" >actual.extended &&
+   if test_have_prereq LIBPCRE
+   then
+   git -c grep.patternType=perl log --pretty=tformat:%s \
+   --grep="\((?=1)" >actual.perl
+   fi &&
+   test_cmp expect.fixed actual.fixed &&
+   test_cmp expect.basic actual.basic &&
+   test_cmp expect.extended actual.extended &&
+   if test_have_prereq LIBPCRE
+   then
+   test_cmp expect.perl actual.perl
+   fi &&
+
+   git log --pretty=tformat:%s -F \
+   --grep="(1|2)" >actual.fixed.short-arg &&
+   git log --pretty=tformat:%s -E \
+   --grep="\|2" >actual.extended.short-arg &&
+   test_cmp expect.fixed actual.fixed.short-arg &&
+   test_cmp expect.extended actual.extended.short-arg &&
+
+   git log --pretty=tformat:%s --fixed-strings \
+   --grep="(1|2)" >actual.fixed.long-arg &&
+   git log --pretty=tformat:%s --basic-regexp \
+   --grep="(.|.)" >actual.basic.long-arg &&
+   git log --pretty=tformat:%s --extended-regexp \
+   --grep="\|2" >actual.extended.long-arg &&
+   if test_have_prereq LIBPCRE
+   then
+   git log --pretty=tformat:%s --perl-regexp \
+   --grep="\((?=1)" >actual.perl.long-arg
+   fi &&
+   test_cmp expect.fixed actual.fixed.long-arg &&
+   test_cmp expect.basic actual.basic.long-arg &&
+   test_cmp expect.extended actual.extended.long-arg &&
+   if test_have_prereq LIBPCRE
+   then
+   test_cmp expect.perl actual.perl.long-arg
+   fi
+   )
+'
+
 cat > expect <

[PATCH 09/12] test-lib: rename the LIBPCRE prerequisite to PCRE

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Rename the LIBPCRE prerequisite to PCRE. This is for preparation for
libpcre2 support, where having just "LIBPCRE" would be confusing as it
implies v1 of the library.

None of these tests are incompatible between versions 1 & 2 of
libpcre, it's less confusing to give them a more general name to make
it clear that they work on both library versions.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/README|  4 ++--
 t/t4202-log.sh  | 10 +-
 t/t7810-grep.sh | 30 +++---
 t/t7812-grep-icase-non-ascii.sh |  4 ++--
 t/t7813-grep-icase-iso.sh   |  2 +-
 t/test-lib.sh   |  2 +-
 6 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/t/README b/t/README
index ab386c3681..a90cb62583 100644
--- a/t/README
+++ b/t/README
@@ -803,9 +803,9 @@ use these, and "test_set_prereq" for how to define your own.
Test is not run by root user, and an attempt to write to an
unwritable file is expected to fail correctly.
 
- - LIBPCRE
+ - PCRE
 
-   Git was compiled with USE_LIBPCRE=YesPlease. Wrap any tests
+   Git was compiled with support for PCRE. Wrap any tests
that use git-grep --perl-regexp or git-grep -P in these.
 
  - CASE_INSENSITIVE_FS
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 6f7c74e027..93bb8a 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -265,7 +265,7 @@ test_expect_success 'log -F -E --grep= uses ere' '
test_cmp expect actual
 '
 
-test_expect_success LIBPCRE 'log -F -E --perl-regexp --grep= uses PCRE' '
+test_expect_success PCRE 'log -F -E --perl-regexp --grep= uses PCRE' '
echo second >expect &&
git log -1 --pretty="tformat:%s" -F -E --perl-regexp 
--grep="s(?=ec)econd" >actual &&
test_cmp expect actual
@@ -303,7 +303,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
--grep="(.|.)" >actual.basic &&
git -c grep.patternType=extended log --pretty=tformat:%s \
--grep="\|2" >actual.extended &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
git -c grep.patternType=perl log --pretty=tformat:%s \
--grep="\((?=1)" >actual.perl
@@ -311,7 +311,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
test_cmp expect.fixed actual.fixed &&
test_cmp expect.basic actual.basic &&
test_cmp expect.extended actual.extended &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
test_cmp expect.perl actual.perl
fi &&
@@ -338,7 +338,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
--grep="(.|.)" >actual.basic.long-arg &&
git log --pretty=tformat:%s --extended-regexp \
--grep="\|2" >actual.extended.long-arg &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
git log --pretty=tformat:%s --perl-regexp \
--grep="\((?=1)" >actual.perl.long-arg
@@ -346,7 +346,7 @@ test_expect_success 'log with various grep.patternType 
configurations & command-
test_cmp expect.fixed actual.fixed.long-arg &&
test_cmp expect.basic actual.basic.long-arg &&
test_cmp expect.extended actual.extended.long-arg &&
-   if test_have_prereq LIBPCRE
+   if test_have_prereq PCRE
then
test_cmp expect.perl actual.perl.long-arg
fi
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index b50f1dff43..46f528183d 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -275,7 +275,7 @@ do
test_cmp expected actual
'
 
-   test_expect_success LIBPCRE "grep $L with grep.patterntype=perl" '
+   test_expect_success PCRE "grep $L with grep.patterntype=perl" '
echo "${HC}ab:a+b*c" >expected &&
git -c grep.patterntype=perl grep "a\x{2b}b\x{2a}c" $H ab 
>actual &&
test_cmp expected actual
@@ -1053,12 +1053,12 @@ hello.c:int main(int argc, const char **argv)
 hello.c:   printf("Hello world.\n");
 EOF
 
-test_expect_success LIBPCRE 'grep --perl-regexp pattern' '
+test_expect_success PCRE 'grep --perl-regexp pattern' '
git grep --perl-regexp "\p{Ps}.*?\p{Pe}" hello.c >actual &&
test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P pattern' '
+test_expect_success PCRE 'grep -P pattern' '
git grep -P "\p{Ps}.*?\p{Pe}" hello.c >actual &&
test_cmp expected actual
 '
@@ -1070,13 +1070,13 @@ test_expect_success 'grep pattern with 
grep.extendedRegexp=true' '

[PATCH 11/12] grep: change the internal PCRE code & header names to be PCRE1

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Change the internal PCRE variable & function names to have a "1"
suffix. This is for preparation for libpcre2 support, where having
non-versioned names would be confusing.

The earlier "grep: change the internal PCRE macro names to be PCRE1"
change elaborates on the motivations behind this commit.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/grep.c |  4 ++--
 grep.c | 56 
 grep.h | 10 +-
 revision.c |  2 +-
 4 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9478ab5dff..dffb9743b8 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -490,7 +490,7 @@ static void compile_submodule_options(const struct grep_opt 
*opt,
case GREP_PATTERN_TYPE_FIXED:
argv_array_push(_options, "-F");
break;
-   case GREP_PATTERN_TYPE_PCRE:
+   case GREP_PATTERN_TYPE_PCRE1:
argv_array_push(_options, "-P");
break;
case GREP_PATTERN_TYPE_UNSPECIFIED:
@@ -1036,7 +1036,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
GREP_PATTERN_TYPE_FIXED),
OPT_SET_INT('P', "perl-regexp", _type_arg,
N_("use Perl-compatible regular expressions"),
-   GREP_PATTERN_TYPE_PCRE),
+   GREP_PATTERN_TYPE_PCRE1),
OPT_GROUP(""),
OPT_BOOL('n', "line-number", , N_("show line 
numbers")),
OPT_NEGBIT('h', NULL, , N_("don't show 
filenames"), 1),
diff --git a/grep.c b/grep.c
index 99b9e9447f..ac7d6f9bbf 100644
--- a/grep.c
+++ b/grep.c
@@ -63,7 +63,7 @@ static int parse_pattern_type_arg(const char *opt, const char 
*arg)
else if (!strcmp(arg, "perl") ||
 !strcmp(arg, "pcre") ||
 !strcmp(arg, "pcre1"))
-   return GREP_PATTERN_TYPE_PCRE;
+   return GREP_PATTERN_TYPE_PCRE1;
die("bad %s argument: %s", opt, arg);
 }
 
@@ -180,25 +180,25 @@ static void grep_set_pattern_type_option(enum 
grep_pattern_type pattern_type, st
 
case GREP_PATTERN_TYPE_BRE:
opt->fixed = 0;
-   opt->pcre = 0;
+   opt->pcre1 = 0;
opt->regflags &= ~REG_EXTENDED;
break;
 
case GREP_PATTERN_TYPE_ERE:
opt->fixed = 0;
-   opt->pcre = 0;
+   opt->pcre1 = 0;
opt->regflags |= REG_EXTENDED;
break;
 
case GREP_PATTERN_TYPE_FIXED:
opt->fixed = 1;
-   opt->pcre = 0;
+   opt->pcre1 = 0;
opt->regflags &= ~REG_EXTENDED;
break;
 
-   case GREP_PATTERN_TYPE_PCRE:
+   case GREP_PATTERN_TYPE_PCRE1:
opt->fixed = 0;
-   opt->pcre = 1;
+   opt->pcre1 = 1;
break;
}
 }
@@ -326,7 +326,7 @@ static NORETURN void compile_regexp_failed(const struct 
grep_pat *p,
 }
 
 #ifdef USE_LIBPCRE1
-static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
+static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt 
*opt)
 {
const char *error;
int erroffset;
@@ -334,23 +334,23 @@ static void compile_pcre_regexp(struct grep_pat *p, const 
struct grep_opt *opt)
 
if (opt->ignore_case) {
if (has_non_ascii(p->pattern))
-   p->pcre_tables = pcre_maketables();
+   p->pcre1_tables = pcre_maketables();
options |= PCRE_CASELESS;
}
if (is_utf8_locale() && has_non_ascii(p->pattern))
options |= PCRE_UTF8;
 
-   p->pcre_regexp = pcre_compile(p->pattern, options, , ,
- p->pcre_tables);
-   if (!p->pcre_regexp)
+   p->pcre1_regexp = pcre_compile(p->pattern, options, , ,
+ p->pcre1_tables);
+   if (!p->pcre1_regexp)
compile_regexp_failed(p, error);
 
-   p->pcre_extra_info = pcre_study(p->pcre_regexp, 0, );
-   if (!p->pcre_extra_info && error)
+   p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, );
+   if (!p->pcre1_extra_info && error)
die("%s", error);
 }
 
-static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
+static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
regmatch_t *match, int eflags)
 {
int ovector[30], ret, flags = 0;
@@ -358,7 +358,7 @@ static int pcrematch(struct grep_pat *p, const char *line, 
const char *eol,
if (eflags & REG_NOTBOL)
flags |= PCRE_NOTBOL;
 
-   ret = pcre_exec(p->pcre_regexp, p->pcre_extra_info, line, eol - line,
+   ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line,
   

[PATCH 07/12] grep & rev-list doc: stop promising libpcre for --perl-regexp

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Stop promising in our grep & rev-list options documentation that we're
always going to be using libpcre when given the --perl-regexp option.

Instead talk about using "Perl-compatible regular expressions" and
using "the PCRE library". Saying "libpcre" strongly suggests that we
might be talking about libpcre.so, which is always going to be v1.

Not only do does the documentation now align more clearly with future
plans, but it should be more easily readable to the layman.

This is for preparation for libpcre2 support, which in the future may
be powering the --perl-regexp option by default, depending on how Git
is compiled.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/git-grep.txt | 4 ++--
 Documentation/rev-list-options.txt | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 7b52e3fbc4..1c01154dbe 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -161,8 +161,8 @@ OPTIONS
 
 -P::
 --perl-regexp::
-   Use Perl-compatible regexp for patterns. Requires libpcre to be
-   compiled in.
+   Use Perl-compatible regular expressions for patterns. Uses the
+   PCRE library, which Git needs to be compiled against.
 
 -F::
 --fixed-strings::
diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 5b7fa49a7f..565cde636e 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -93,8 +93,9 @@ endif::git-rev-list[]
 
 -P::
 --perl-regexp::
-   Consider the limiting patterns to be Perl-compatible regular 
expressions.
-   Requires libpcre to be compiled in.
+   Consider the limiting patterns to be Perl-compatible regular
+   expressions. Uses the PCRE library, which Git needs to be
+   compiled against.
 
 --remove-empty::
Stop when a given path disappears from the tree.
-- 
2.11.0



[PATCH 08/12] grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Make the pattern types "pcre" & "pcre1" synonyms for long-standing
"perl" grep.patternType.

This change is part of a longer patch series to add pcre2 support to
Git. It's nice to be able to performance test PCRE v1 v.s. v2 without
having to recompile git, and doing that via grep.patternType makes
sense.

However, just adding "pcre2" when we only have "perl" would be
confusing, so start by adding a "pcre" & "pcre1" synonym.

In the future "perl" and "pcre" might be changed to default to "pcre2"
instead of "pcre1", and depending on how Git is compiled the more
specific "pcre1" or "pcre2" pattern types might produce an error.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt |  9 +
 grep.c   |  4 +++-
 t/t7810-grep.sh  | 10 ++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 475e874d51..5ef12d0694 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1624,6 +1624,15 @@ grep.patternType::
'fixed', or 'perl' will enable the `--basic-regexp`, 
`--extended-regexp`,
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching behavior.
++
+The 'pcre' and 'pcre1' values are synonyms for 'perl'. The other
+values starting with 'pcre' are reserved for future use, e.g. if we'd
+like to use 'pcre2' for the PCRE v2 library.
++
+In the future 'perl' and 'pcre' might become synonyms for some other
+implementation or PCRE version, such as 'pcre2', while the more
+specific 'pcre1' & 'pcre2' might throw errors depending on whether git
+is compiled to include those libraries.
 
 grep.extendedRegexp::
If set to true, enable `--extended-regexp` option by default. This
diff --git a/grep.c b/grep.c
index 8564fe726d..1575f8f9ed 100644
--- a/grep.c
+++ b/grep.c
@@ -60,7 +60,9 @@ static int parse_pattern_type_arg(const char *opt, const char 
*arg)
return GREP_PATTERN_TYPE_ERE;
else if (!strcmp(arg, "fixed"))
return GREP_PATTERN_TYPE_FIXED;
-   else if (!strcmp(arg, "perl"))
+   else if (!strcmp(arg, "perl") ||
+!strcmp(arg, "pcre") ||
+!strcmp(arg, "pcre1"))
return GREP_PATTERN_TYPE_PCRE;
die("bad %s argument: %s", opt, arg);
 }
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 83b0ee53be..b50f1dff43 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1522,4 +1522,14 @@ test_expect_success 'grep with thread options' '
test_must_fail git -c grep.threads=1 grep --threads=-1 st.*dio
 '
 
+test_expect_success LIBPCRE "grep with grep.patternType synonyms 
perl/pcre/pcre1" '
+   echo "#include " >expected &&
+   git -c grep.patternType=perl  grep -h --no-line-number "st(?=dio)" 
>actual &&
+   test_cmp expected actual &&
+   git -c grep.patternType=pcre  grep -h --no-line-number "st(?=dio)" 
>actual &&
+   test_cmp expected actual &&
+   git -c grep.patternType=pcre1 grep -h --no-line-number "st(?=dio)" 
>actual &&
+   test_cmp expected actual
+'
+
 test_done
-- 
2.11.0



[PATCH 03/12] Makefile & configure: reword outdated comment about PCRE

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Reword an outdated comment which suggests that only git-grep can use
PCRE.

This comment was added back when PCRE support was initially added in
commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09), and was true
at the time.

It hasn't been telling the full truth since git-log learned to use
PCRE with --grep in commit 727b6fc3ed ("log --grep: accept
--basic-regexp and --perl-regexp", 2012-10-03), and more importantly
is likely to get more inaccurate over time as more use is made of PCRE
in other areas.

Reword it to be more future-proof, and to more clearly explain that
this enables user-initiated runtime behavior.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Makefile |  6 --
 configure.ac | 12 
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/Makefile b/Makefile
index 9b36068ac5..23945d87cf 100644
--- a/Makefile
+++ b/Makefile
@@ -24,8 +24,10 @@ all::
 # Define NO_OPENSSL environment variable if you do not have OpenSSL.
 # This also implies BLK_SHA1.
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as like log, grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
diff --git a/configure.ac b/configure.ac
index 128165529f..d09a204a7e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -250,8 +250,10 @@ AS_HELP_STRING([--with-openssl],[use OpenSSL library 
(default is YES)])
 AS_HELP_STRING([],  [ARG can be prefix for openssl library and 
headers]),
 GIT_PARSE_WITH([openssl]))
 
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as like log, grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
@@ -499,8 +501,10 @@ GIT_CONF_SUBST([NEEDS_SSL_WITH_CRYPTO])
 GIT_CONF_SUBST([NO_OPENSSL])
 
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as like log, grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 
 if test -n "$USE_LIBPCRE"; then
-- 
2.11.0



[PATCH 01/12] grep: add ability to disable threading with --threads=0 or grep.threads=0

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Add the ability to entirely disable threading by having grep.threads=0
in the config or --threads=0 on the command-line.

This was made configurable in commit 89f09dd34e ("grep: add
--threads= option and grep.threads configuration",
2015-12-15). Before that change there was no way to disable threaded
grep other than to recompile Git.

It's very useful for testing & debugging to be able to entirely
disable threading without recompiling with NO_PTHREADS=YesPlease, so
support setting the value to 0 to disable threading.

There was no reason this wasn't the case already other than an
implementation detail in how OPT_INTEGER() works. When it's used
there's no way to tell the difference between an unset value & the
default value. Use OPT_CALLBACK() instead using the same pattern as in
commit b16a991c1b ("cherry-pick: detect bogus arguments to
--mainline", 2017-03-15).

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/git-grep.txt |  4 ++--
 builtin/grep.c | 26 ++
 t/t7810-grep.sh| 10 ++
 3 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 71f32f3508..7b52e3fbc4 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -56,8 +56,8 @@ grep.extendedRegexp::
other than 'default'.
 
 grep.threads::
-   Number of grep worker threads to use.  If unset (or set to 0),
-   8 threads are used by default (for now).
+   Number of grep worker threads to use.  If unset, 8 threads are
+   used by default (for now). Set to 0 to disable threading.
 
 grep.fullName::
If set to true, enable `--full-name` option by default.
diff --git a/builtin/grep.c b/builtin/grep.c
index 65070c52fc..9478ab5dff 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -35,7 +35,7 @@ static int grep_submodule_launch(struct grep_opt *opt,
 const struct grep_source *gs);
 
 #define GREP_NUM_THREADS_DEFAULT 8
-static int num_threads;
+static int num_threads = -1;
 
 #ifndef NO_PTHREADS
 static pthread_t *threads;
@@ -897,6 +897,24 @@ static int context_callback(const struct option *opt, 
const char *arg,
return 0;
 }
 
+static int thread_callback(const struct option *opt,
+  const char *arg, int unset)
+{
+   int *threads = (int*)opt->value;
+   char *end;
+
+   if (unset) {
+   *threads = GREP_NUM_THREADS_DEFAULT;
+   return 0;
+   }
+
+   *threads = strtol(arg, , 10);
+   if (*end || *threads < 0)
+   return opterror(opt, "invalid number of threads specified", 0);
+
+   return 0;
+}
+
 static int file_callback(const struct option *opt, const char *arg, int unset)
 {
struct grep_opt *grep_opt = opt->value;
@@ -1049,8 +1067,8 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
N_("show  context lines before matches")),
OPT_INTEGER('A', "after-context", _context,
N_("show  context lines after matches")),
-   OPT_INTEGER(0, "threads", _threads,
-   N_("use  worker threads")),
+   OPT_CALLBACK(0, "threads", _threads, N_("n"),
+   N_("use  worker threads"), thread_callback),
OPT_NUMBER_CALLBACK(, N_("shortcut for -C NUM"),
context_callback),
OPT_BOOL('p', "show-function", ,
@@ -1222,7 +1240,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
 #ifndef NO_PTHREADS
if (list.nr || cached || show_in_pager)
num_threads = 0;
-   else if (num_threads == 0)
+   else if (num_threads == -1)
num_threads = GREP_NUM_THREADS_DEFAULT;
else if (num_threads < 0)
die(_("invalid number of threads specified (%d)"), num_threads);
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index cee42097b0..53c2ca05c4 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1505,4 +1505,14 @@ test_expect_success 'grep does not report i-t-a and 
assume unchanged with -L' '
test_cmp expected actual
 '
 
+test_expect_success 'grep with thread options' '
+   git -c grep.threads=4 grep st.*dio &&
+   git grep --threads=4 st.*dio &&
+   git -c grep.threads=4 grep --threads=6 st.*dio &&
+   test_must_fail git -c grep.threads=-1 grep st.*dio &&
+   test_must_fail git -c grep.threads=-1 grep --threads=-1 st.*dio &&
+   test_must_fail git -c grep.threads=-1 grep --threads=1 st.*dio &&
+   test_must_fail git -c grep.threads=1 grep --threads=-1 st.*dio
+'
+
 test_done
-- 
2.11.0



[PATCH 00/12] PCREv2 & more

2017-04-08 Thread Ævar Arnfjörð Bjarmason
This adds PCRE v2 support, but as I was adding that I kept noticing
other related problems to fix. It's all bundled up into the same
series because much of it conflicts because it modifies the same test
or other code. Notes on each patch below.

Ævar Arnfjörð Bjarmason (12):
  grep: add ability to disable threading with --threads=0 or
grep.threads=0

This really has nothing to do with the rest except I'm using it to
test non-multithreaded & threaded PCRE more easily.

  grep: remove redundant regflags assignment under PCRE
  Makefile & configure: reword outdated comment about PCRE

Just some trivial cleanups.

  grep: add a test for backreferences in PCRE patterns
  log: add exhaustive tests for pattern style options & config

Yay, more tests!

  log: add -P as a synonym for --perl-regexp

We've had --perl-regexp for years, but not -P like grep, add it.

  grep & rev-list doc: stop promising libpcre for --perl-regexp
  grep: make grep.patternType=[pcre|pcre1] a synonym for "perl"
  test-lib: rename the LIBPCRE prerequisite to PCRE
  grep: change the internal PCRE macro names to be PCRE1
  grep: change the internal PCRE code & header names to be PCRE1
  grep: add support for PCRE v2

These combined add the support for PCRE 2. It's split up for ease of
readability. The last one's still a bit big, and I could e.g. split up
all the Makefile/autoconf stuff into a different patch (which wouldn't
do anything without the code), but I thought on balance doing it this
way made the most sense.

 Documentation/config.txt   |   7 ++
 Documentation/git-grep.txt |   8 +-
 Documentation/rev-list-options.txt |   6 +-
 Makefile   |  28 +-
 builtin/grep.c |  26 +-
 configure.ac   |  61 ++--
 grep.c | 184 ++---
 grep.h |  26 --
 revision.c |   2 +-
 t/README   |  16 +++-
 t/t4202-log.sh |  76 ++-
 t/t7810-grep.sh|  79 +---
 t/t7812-grep-icase-non-ascii.sh|   4 +-
 t/t7813-grep-icase-iso.sh  |  11 ++-
 t/test-lib.sh  |   4 +-
 15 files changed, 456 insertions(+), 82 deletions(-)

-- 
2.11.0



[PATCH] push: document & test --force-with-lease with multiple remotes

2017-04-08 Thread Ævar Arnfjörð Bjarmason
Document & test for cases where there are two remotes pointing to the
same URL, and a background fetch & subsequent `git push
--force-with-lease` shouldn't clobber un-updated references we haven't
fetched.

Some editors like Microsoft's VSC have a feature to auto-fetch in the
background, this bypasses the protections offered by
--force-with-lease as noted in the documentation being added here.

See the 'Tools that do an automatic fetch defeat "git push
--force-with-lease"' (<1491617750.2149.10.ca...@mattmccutchen.net>)
git mailing list thread for more details. Jakub Narębski suggested
this method of adding another remote to bypass this edge case,
document that & add a test for it.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---

On Sat, Apr 8, 2017 at 11:29 AM, Jeff King  wrote:
> On Sat, Apr 08, 2017 at 09:35:04AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> Is it correct that you'd essentially want something that works like:
>>
>> git push --force-with-lease=master:master origin master:master
>
> I don't think that would do anything useful. It would reject any push
> where the remote "master" is not the same as your own master. And of
> course if they _are_ the same, then the push is a noop.
>

Yeah my whole suggestion is obviously dumb & useless. But I liked
Jakub's suggestion to work around this, so here's docs & a test for
that.

According to my eyeballing of the MS VSC code this should work,
i.e. it seems to do a 'fetch' here, not a 'fetch --all':
https://github.com/Microsoft/vscode/blob/master/src/vs/workbench/parts/git/node/git.lib.ts#L505

Of course another way is to just disable autofetching:
https://github.com/Microsoft/vscode/blob/535a3de60023c81d75d0eac22044284f07dbcddf/extensions/git/src/autofetch.ts#L27

But having two remotes allows you to have your cake & eat it too
without all the hassle of tag creation, which I've added to the docs
though for completeness.

 Documentation/git-push.txt | 37 +
 t/t5533-push-cas.sh| 29 +
 2 files changed, 66 insertions(+)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 1624a35888..2f2e9c078b 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -210,6 +210,43 @@ or we do not even have to have such a remote-tracking 
branch when
 this form is used).  If `` is the empty string, then the named ref
 must not already exist.
 +
+This option interacts very badly with anything that implicitly runs
+`git fetch` on the remote to be pushed to in the background. The
+protection it offers over `--force` is ensuring that subsequent
+changes your work wasn't based on aren't clobbered, but this is
+trivially defeated if some background process is updating refs in the
+background. We don't have anything except the remote tracking info to
+go by as a heuristic for refs you're expected to have seen & are
+willing to clobber.
++
+If your editor or some other system is running `git fetch` in the
+background for you a way to mitigate this is to simply set up another
+remote:
++
+   git remote add origin-push $(git config remote.origin.url)
+   git fetch origin-push
++
+Now when the background process runs `git fetch origin` the references
+on `origin-push` won't be updated, and thus commands like:
++
+   git push --force-with-lease origin
++
+Will fail unless you manually run `git fetch origin-push`. This method
+is of course entirely defeated by something that runs `git fetch
+--all`, in that case you'd need to either disable it or do something
+more tedious like:
++
+   git fetch  ;# update 'master' from remote
+   git tag base master;# mark our base point
+   git rebase -i master   ;# rewrite some commits
+   git push --force-with-lease=master:base master:master
++
+I.e. create a `base` tag for versions of the upstream code that you've
+seen and are willing to overwrite, then rewrite history, and finally
+force push changes to `master` if the remote version is still at
+`base`, regardless of what your local `remotes/origin/master` has been
+updated to in the background.
++
 Note that all forms other than `--force-with-lease=:`
 that specifies the expected current value of the ref explicitly are
 still experimental and their semantics may change as we gain experience
diff --git a/t/t5533-push-cas.sh b/t/t5533-push-cas.sh
index a2c9e7439f..d38ecee217 100755
--- a/t/t5533-push-cas.sh
+++ b/t/t5533-push-cas.sh
@@ -229,4 +229,33 @@ test_expect_success 'new branch already exists' '
)
 '
 
+test_expect_success 'background updates of REMOTE can be mitigated with a 
non-updated REMOTE-push' '
+   rm -rf src dst &&
+   git init --bare src.bare &&
+   test_when_finished "rm -rf src.bare" &&
+   git clone --no-local src.bare dst &&
+   test_when_finished "rm -rf dst" &&
+   (
+   cd dst &&
+   test_commit G &&
+   git remote 

Re: [PATCH] submodule: prevent backslash expantion in submodule names

2017-04-08 Thread Jeff King
On Fri, Apr 07, 2017 at 10:23:06AM -0700, Brandon Williams wrote:

> When attempting to add a submodule with backslashes in its name 'git
> submodule' fails in a funny way.  We can see that some of the
> backslashes are expanded resulting in a bogus path:
> 
> git -C main submodule add ../sub\\with\\backslash
> fatal: repository '/tmp/test/sub\witackslash' does not exist
> fatal: clone of '/tmp/test/sub\witackslash' into submodule path
> 
> To solve this, convert calls to 'read' to 'read -r' in git-submodule.sh
> in order to prevent backslash expantion in submodule names.

This looks sane overall, without digging into the individual read calls.

The reason I mentioned escaping earlier is I wondered what would happen
when the submodule starts with a double-quote, or has a newline in the
name. Git's normal quoting would include backslash escape sequences, and
I wondered if we might be relying on any of these "read" calls to
interpret them. But I don't think so, for two reasons.

One, because that quoting also puts double-quotes around the name. So
plain "read" would not be sufficient to de-quote for us anyway.

And two, because these are being fed from "submodule--helper", which
does not seem to quote in the first place.

So I think your patch is fine there. But it does raise a few concerns.
It looks like git-submodule does not cope well with exotic filenames:

  $ git submodule add /some/repo "$(printf 'sub with\nnewline')"
  Cloning into '/home/peff/tmp/sub with
  newline'...
  done.
  error: invalid key (newline): submodule.sub with
  newline.url
  error: invalid key (newline): submodule.sub with
  newline.path
  Failed to register submodule 'sub with
  newline'

I'm not too worried about that. It's a nonsense request, and our config
format has no syntactic mechanism to represent that key. So tough luck.
But what I am more worried about is:

  $ git submodule--helper list
  16 576053ed5ad378490974fabe97e4bd59633d2d1e 0 sub with
  newline

That's obviously nonsense that git-submodule.sh is going to choke on.
But what happens when the filename is:

  foo\n16000  0\t../../escaped

or something. Can a malicious repository provoke git-submodule.sh to
look at or modify files outside the repository?

-Peff


Re: [PATCH v6 0/3] read-cache: speed up add_index_entry

2017-04-08 Thread Jeff King
On Fri, Apr 07, 2017 at 02:27:24PM -0400, Jeff Hostetler wrote:

> > Just thinking about this algorithmically for a moment. You're saving the
> > binary search when the input is given in sorted order. But in other
> > cases you're adding an extra strcmp() before the binary search begins.
> > So it's a tradeoff.
> > 
> > How often is the input sorted?  You save O(log n) strcmps for a "hit"
> > with your patch, and one for a "miss". So it's a net win if we expect at
> > least 1/log(n) of additions to be sorted (I'm talking about individual
> > calls, but it should scale linearly either way over a set of n calls).
> > 
> > I have no clue if that's a reasonable assumption or not.
> 
> I was seeing checkout call merge_working_tree to iterate over the
> source index/trees and call add_index_entry() for each.  For example,
> in a "checkout -b" like operation where both sides are the same, this
> calls keep_entry() which appends the entry to the new index array.
> The append path should always be taken because the iteration is being
> driven from a sorted list.
> 
> I would think calls to add/stage individual files arrive in random
> order, so I'm not suggesting replacing the code -- just checking the
> end first.

Right, what I was wondering is how much this costs in those random-order
cases. We _know_ it speeds up the cases you care about, but I want to
make sure that it is not making some other case worse. How often do the
random-order cases come up, and how much are they slowed?

I suspect in practice that calls here fall into one of two camps:
feeding a small-ish (compared to the total number of entries) set of
paths, or feeding _every_ path. And if you are feeding every path, you
are likely to do so in sorted order, rather than a random jumble. So it
helps in the big cases, and the small cases are presumably small enough
that we don't care much.

At least that seems like a plausible line of reasoning to me. ;)

-Peff


Re: [PATCH v7 2/3] p0004-read-tree: perf test to time read-tree

2017-04-08 Thread Jeff King
On Fri, Apr 07, 2017 at 09:20:46PM +, g...@jeffhostetler.com wrote:

> diff --git a/t/perf/p0004-read-tree.sh b/t/perf/p0004-read-tree.sh
> new file mode 100755
> index 000..a70e969

I think p0004 is taken by your lazy-init-name-hash script already
(which, btw, I think is missing its executable bit).

> +## usage: dir depth width files
> +fill_index() {
> + awk -v arg_dir=$1 -v arg_depth=$2 -v arg_width=$3 -v arg_files=$4 '
> + function make_paths(dir, depth, width, files, f, w) {
> + for (f = 1; f <= files; f++) {
> + print dir "/file" f
> + }
> + if (depth > 0) {
> + for (w = 1; w <= width; w++) {
> + make_paths(dir "/dir" w, depth - 1, 
> width, files)
> + }
> + }
> + }
> + END { make_paths(arg_dir, arg_depth, arg_width, arg_files) }
> +'  + sed "s/^/100644 $EMPTY_BLOB /" |
> + git update-index --index-info
> + return 0
> +}

I saw some discussion earlier on testing synthetic versus real
repositories. The original point of the perf test suite was to find
performance regressions between versions. So periodically you'd run:

  cd t/perf
  ./run v2.10.0 HEAD

and make sure that nothing got inexplicably slower. And for that, using
real repositories is nice, because it's showing real user-impacting
performance changes, not synthetic benchmarks.

In an ideal world, people run it against their own real repositories and
can report back to the list when they see a problem. So running the
whole suite against your monstrous Windows repo would be a nice
benchmark to do periodically (I shudder to think how long it might take
to run, though).

Of course, perf scripts are also a nice way to show off your
improvements. And synthetic repos can often exaggerate the improvement
(which is sometimes good to see changes, but also sometimes bad if
real-world repos don't show the improvement). And they also serve as a
reproduction case for people who don't necessarily have access to the
extreme repo that motivated the test in the first place.

But one side effect of writing the perf test the way you have it here is
that you _can't_ easily run it against a real repo. I think the perf
suite could provide better tools for this. You can already say "run this
test against repo X" with GIT_PERF_REPO. But there's no way to say
"create a synthetic repo with parameters X,Y,Z, and run against that".

IOW, I think rather than having the perf-scripts themselves create the
synthetic repo, we should have a _library_ of synthetic repos that the
test-runners can choose from. I'm imagining a world where your repo
setup goes into t/perf/repos/many-files.sh (which could even take your
depth, width, and files parameters to allow experimenting), and then you
could run "GIT_PERF_REPO=many-files ./p0004-read-tree.sh".

> +br_base=xxx_base_xxx
> +br_work1=xxx_work1_xxx
> +br_work2=xxx_work2_xxx
> +br_work3=xxx_work3_xxx

FWIW, I really dislike the extra level of indirection here. You still
have to consistently say $br_base everywhere. Why not just call the
branch "br_base" in the first place?

My experience has been that debugging tests is much easier when as
little state is carried in the environment as possible. Because it's
quite often reasonable to do:

  ./t1234-whatever.sh -v -i
  cd trash*
  git cmd-that-failed

where you can pick the final line out from the failed test output. When
the command involves $br_base, I have to dig into the script to find out
what's in that variable.

I know that perf tests need less debugging than the regular regression
tests, but I'd hate to see this pattern get applied liberally.

-Peff


Re: [PATCH v7 1/3] read-cache: add strcmp_offset function

2017-04-08 Thread Jeff King
On Fri, Apr 07, 2017 at 09:20:45PM +, g...@jeffhostetler.com wrote:

> diff --git a/t/helper/test-strcmp-offset.c b/t/helper/test-strcmp-offset.c
> new file mode 100644
> index 000..03e1eef
> --- /dev/null
> +++ b/t/helper/test-strcmp-offset.c
> @@ -0,0 +1,64 @@
> +#include "cache.h"
> +
> +struct test_data {
> + const char *s1;
> + const char *s2;
> + size_t expected_first_change; /* or strlen() when equal */
> +};
> +
> +static struct test_data data[] = {
> + { "abc", "abc", 3 },
> + { "abc", "def", 0 },
> +
> + { "abc", "abz", 2 },
> +
> + { "abc", "abcdef", 3 },
> +
> + { "abc\xF0zzz", "abc\xFFzzz", 3 },
> +
> + { NULL, NULL, 0 }
> +};

I've been thinking a bit on the comments on earlier rounds regarding C
tests. I think in the early days, we generally had plumbing that exposed
the C interfaces in a pretty transparent way. I.e., it was reasonable to
unit-test update-index, because you pretty much know how input to it
will map to the code and data structures used.

These days we have more C infrastructure, and it's sometimes hard to
tickle the exact inputs to those modules through plumbing commands. So I
don't really object to adding module-specific helpers that make it easy
to unit test these underlying modules.

I'm not sure how I feel about sticking test data in the helpers, though.
A higher level language like shell is actually pretty good for passing
data around. Passing in the input makes it much easier to prod a helper
with new test cases, see its output, run it in a debugger for a specific
case, etc.

It also integrates better with our test suite. For instance, here:

> + if (r_tst_sign != r_exp_sign) {
> + error("FAIL: '%s' vs '%s', result expect %d, observed %d\n",
> +   sa, sb, r_exp_sign, r_tst_sign);
> + failed = 1;
> + }
> +
> + if (offset != expected_first_change) {
> + error("FAIL: '%s' vs '%s', offset expect %lu, observed %lu\n",
> +   sa, sb, expected_first_change, offset);
> + failed = 1;
> + }
> +
> + return failed;
> +}

You're essentially rebuilding a test harness just for this one function,
and the regular test harness only knows "did anything fail", and nothing
about specific tests.

Perhaps something like:

 t/helper/test-strcmp-offset.c | 66 +++
 t/t0065-strcmp-offset.sh  | 17 +++--
 2 files changed, 26 insertions(+), 57 deletions(-)

diff --git a/t/helper/test-strcmp-offset.c b/t/helper/test-strcmp-offset.c
index 03e1eef8a..1fdf4d137 100644
--- a/t/helper/test-strcmp-offset.c
+++ b/t/helper/test-strcmp-offset.c
@@ -1,64 +1,22 @@
 #include "cache.h"
 
-struct test_data {
-   const char *s1;
-   const char *s2;
-   size_t expected_first_change; /* or strlen() when equal */
-};
-
-static struct test_data data[] = {
-   { "abc", "abc", 3 },
-   { "abc", "def", 0 },
-
-   { "abc", "abz", 2 },
-
-   { "abc", "abcdef", 3 },
-
-   { "abc\xF0zzz", "abc\xFFzzz", 3 },
-
-   { NULL, NULL, 0 }
-};
-
-int try_pair(const char *sa, const char *sb, size_t expected_first_change)
+int cmd_main(int argc, const char **argv)
 {
-   int failed = 0;
-   int r_exp, r_tst, r_exp_sign, r_tst_sign;
+   int result;
size_t offset;
 
+   if (!argv[1] || !argv[2])
+   die("usage: %s  ", argv[0]);
+
+   result = strcmp_offset(argv[1], argv[2], );
+
/*
 * Because differnt CRTs behave differently, only rely on signs
 * of the result values.
 */
-   r_exp = strcmp(sa, sb);
-   r_exp_sign = ((r_exp < 0) ? -1 : ((r_exp == 0) ? 0 : 1));
-
-   r_tst = strcmp_offset(sa, sb, );
-   r_tst_sign = ((r_tst < 0) ? -1 : ((r_tst == 0) ? 0 : 1));
-
-   if (r_tst_sign != r_exp_sign) {
-   error("FAIL: '%s' vs '%s', result expect %d, observed %d\n",
- sa, sb, r_exp_sign, r_tst_sign);
-   failed = 1;
-   }
-
-   if (offset != expected_first_change) {
-   error("FAIL: '%s' vs '%s', offset expect %lu, observed %lu\n",
- sa, sb, expected_first_change, offset);
-   failed = 1;
-   }
-
-   return failed;
-}
-
-int cmd_main(int argc, const char **argv)
-{
-   int failed = 0;
-   int k;
-
-   for (k=0; data[k].s1; k++) {
-   failed += try_pair(data[k].s1, data[k].s2, 
data[k].expected_first_change);
-   failed += try_pair(data[k].s2, data[k].s1, 
data[k].expected_first_change);
-   }
-
-   return failed;
+   result = result < 0 ? -1 :
+result > 0 ? 1 :
+0;
+   printf("%d %"PRIuMAX"\n", result, (uintmax_t)offset);
+   return 0;
 }
diff --git a/t/t0065-strcmp-offset.sh b/t/t0065-strcmp-offset.sh
index 0176c8c92..8c167d24b 100755
--- a/t/t0065-strcmp-offset.sh
+++ b/t/t0065-strcmp-offset.sh
@@ -4,8 +4,19 @@ 

Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Jakub Narębski
W dniu 08.04.2017 o 11:29, Jeff King pisze:
[...]

> Perhaps it would be enough to reset the markers whenever the ref is
> pushed. I haven't thought it through well enough to know whether that
> just hits more corner cases.

I wonder if using a separate remote for pushing (with separate remote-
-tracking branches) would be a good solution for this problem...

-- 
Jakub Narębski

 



Re: [PATCH v7 3/3] read-cache: speed up add_index_entry during checkout

2017-04-08 Thread Jeff King
On Fri, Apr 07, 2017 at 09:20:47PM +, g...@jeffhostetler.com wrote:

> This helps performance on very large repositories.
> 
> 
> Before and after numbers on index with 1M files
> ./p0004-read-tree.sh
> 0004.2: read-tree work1 (1003037)  3.21(2.54+0.62)
> 0004.3: switch base work1 (3038 1003037)   7.49(5.39+1.84)
> 0004.5: switch work1 work2 (1003037)   11.91(8.38+3.00)
> 0004.6: switch commit aliases (1003037)12.22(8.30+3.06)
> 
> ./p0004-read-tree.sh
> 0004.2: read-tree work1 (1003040)  2.40(1.65+0.73)
> 0004.3: switch base work1 (3041 1003040)   6.07(4.12+1.66)
> 0004.5: switch work1 work2 (1003040)   10.23(6.76+2.92)
> 0004.6: switch commit aliases (1003040)10.53(6.97+2.83)
> 

By the way, you may want to try:

  $ cd t/perf
  $ ./run HEAD^ HEAD p0004-read-tree.sh

which gives you the before/after in a nice table, with percentage
changes:

  Test   HEAD^ HEAD 
 
  
---
  0004.2: read-tree work1 (1003065)  2.34(1.90+0.42)   1.91(1.51+0.38) 
-18.4%
  0004.3: switch base work1 (3066 1003065)   5.12(4.14+0.96)   4.45(3.55+0.88) 
-13.1%
  0004.5: switch work1 work2 (1003065)   8.55(6.63+1.87)   7.78(5.76+2.00) 
-9.0% 
  0004.6: switch commit aliases (1003065)8.59(6.75+1.80)   7.64(5.92+1.70) 
-11.1%

The results are stored for each tested version, so you can re-run just a
single test and then re-output the results with "./aggregate.perl HEAD^
HEAD p0004-read-tree.sh".

The "run" script obviously builds each version behind the scenes, so you
probably also want to set GIT_PERF_MAKE_OPTS as appropriate (at the very
least "-j16" makes it more pleasant).

-Peff


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Jeff King
On Sat, Apr 08, 2017 at 01:25:43AM -0700, Jacob Keller wrote:

> On Fri, Apr 7, 2017 at 7:15 PM, Matt McCutchen  wrote:
> > When I'm rewriting history, "git push --force-with-lease" is a nice
> > safeguard compared to "git push --force", but it still assumes the
> > remote-tracking ref gives the old state the user wants to overwrite.
> > Tools that do an implicit fetch, assuming it to be a safe operation,
> > may break this assumption.  In the worst case, Visual Studio Code does
> > an automatic fetch every 3 minutes by default [1], making
> > --force-with-lease pretty much reduce to --force.
> >
> 
> Isn't the point of force-with-lease to actually record a "commit" id,
> and not pass it a branch name, but actually the sha1 you intend the
> remote server to be at? Sure if you happen to pass it a branch or
> remote name it will interpret it for yuou, but you should be able to
> do something like
> 
> current=$(git rev-parse origin/branch)
> 
> git push --force-with-lease=$current
> 
> and this will work regardless of when if if you fetch in between?

That's definitely the _best way to do it (modulo using "branch:$current"
in the final command). I think Matt's point is just that the default, to
use origin/branch, is unsafe. It's convenient when you don't have extra
fetches, but that convenience may not be worth the potential surprise.

-Peff


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Jeff King
On Sat, Apr 08, 2017 at 09:35:04AM +0200, Ævar Arnfjörð Bjarmason wrote:

> Is it correct that you'd essentially want something that works like:
> 
> git push --force-with-lease=master:master origin master:master

I don't think that would do anything useful. It would reject any push
where the remote "master" is not the same as your own master. And of
course if they _are_ the same, then the push is a noop.

> I haven't used this feature but I'm surprised it works the way it
> does, as you point out just having your remote refs updated isn't a
> strong signal for wanting to clobber whatever that ref points to.

The point of the --force-with-lease feature is that you would mark a
point in time where you started some rewind-y operation (like a rebase),
and at the end you would want to make sure nobody had moved the remote
ref when you push over it (which needs to be a force because of the
rewind).

So the best way to use it is something like:

  git fetch  ;# update 'master' from remote
  git tag base master;# mark our base point
  git rebase -i master   ;# rewrite some commits
  git push --force-with-lease=master:base master:master

That final operation will fail if somebody else pushed in the meantime.
But obviously this workflow is a pain, because you have to manually mark
the start of the unsafe operation with a tag.

If you haven't fetched in the meantime, then origin/master is a good
approximation of "base". But if you have fetched, then it is worthless.

It would be nice if we could automatically deduce the real value of
base. I don't think we could do it in a foolproof way, but I wonder if
we could come close in some common circumstances. For instance, imagine
that unsafe operations like rebase would note that "master" has an
upstream of "origin/master", and would record a note saying "we took a
lease for origin/master at sha1 X".

One trouble with that is that you may perform several unsafe operations.
For example, imagine it takes you multiple rebases to achieve your final
state:

  git fetch
  git rebase -i master
  git rebase -i master
  git push --force-with-lease=master

and that --force-with-lease now defaults to whatever lease-marker is
left by rebase. Which marker should it respect? If the second one, then
it's unsafe. But if the first one, then how do we deal with stale
markers?

Perhaps it would be enough to reset the markers whenever the ref is
pushed. I haven't thought it through well enough to know whether that
just hits more corner cases.

-Peff


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Jacob Keller
On Fri, Apr 7, 2017 at 7:15 PM, Matt McCutchen  wrote:
> When I'm rewriting history, "git push --force-with-lease" is a nice
> safeguard compared to "git push --force", but it still assumes the
> remote-tracking ref gives the old state the user wants to overwrite.
> Tools that do an implicit fetch, assuming it to be a safe operation,
> may break this assumption.  In the worst case, Visual Studio Code does
> an automatic fetch every 3 minutes by default [1], making
> --force-with-lease pretty much reduce to --force.
>

Isn't the point of force-with-lease to actually record a "commit" id,
and not pass it a branch name, but actually the sha1 you intend the
remote server to be at? Sure if you happen to pass it a branch or
remote name it will interpret it for yuou, but you should be able to
do something like

current=$(git rev-parse origin/branch)

git push --force-with-lease=$current

and this will work regardless of when if if you fetch in between?

Thanks,
Jake


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Ævar Arnfjörð Bjarmason
On Sat, Apr 8, 2017 at 4:15 AM, Matt McCutchen  wrote:
> When I'm rewriting history, "git push --force-with-lease" is a nice
> safeguard compared to "git push --force", but it still assumes the
> remote-tracking ref gives the old state the user wants to overwrite.
> Tools that do an implicit fetch, assuming it to be a safe operation,
> may break this assumption.  In the worst case, Visual Studio Code does
> an automatic fetch every 3 minutes by default [1], making
> --force-with-lease pretty much reduce to --force.
>
> For a safer workflow, "git push" would check against a separate "old"
> ref that isn't updated by "git fetch", but is updated by "git push" the
> same way the remote-tracking ref is and maybe also by commands that
> update the local branch to take into account remote changes (I'm not
> sure what reasonable scenarios there are, if any).  Has there been any
> work on this?  I can write a wrapper script for the simple case of "git
> push" of a single branch to the same branch on a remote, which is
> pretty much all I use right now, but a native implementation would
> support all of the command-line usage forms, which would be nice.
>
> Thanks for reading!
>
> [1]
> https://github.com/Microsoft/vscode/blob/535a3de60023c81d75d0eac22044284f07dbcddf/extensions/git/src/autofetch.ts

Is it correct that you'd essentially want something that works like:

git push --force-with-lease=master:master origin master:master

As opposed to the current:

git push --force-with-lease=master:origin/master origin master:master

Which is how the default:

git push --force-with-lease

Works now, assuming you're on the master branch with correct tracking info.

I haven't used this feature but I'm surprised it works the way it
does, as you point out just having your remote refs updated isn't a
strong signal for wanting to clobber whatever that ref points to.

Junio implemented this in v1.8.3.2-736-g28f5d17611 & noted in that
commit that the current semantics may not be a sensible default. I
think looking at the local ref instead of the remote tracking ref
would be a better default.


Re: Tools that do an automatic fetch defeat "git push --force-with-lease"

2017-04-08 Thread Stefan Haller
Matt McCutchen  wrote:

> When I'm rewriting history, "git push --force-with-lease" is a nice
> safeguard compared to "git push --force", but it still assumes the
> remote-tracking ref gives the old state the user wants to overwrite. 
> Tools that do an implicit fetch, assuming it to be a safe operation,
> may break this assumption.  In the worst case, Visual Studio Code does
> an automatic fetch every 3 minutes by default [1], making
> --force-with-lease pretty much reduce to --force.

That's a big problem, but even without such tools, I find
--force-with-lease without an argument to be pretty limited in
usefulness.

I like to type "git fetch" myself regularly, just to see what's new
upstream before integrating it; this already breaks it. But even
avoiding "git fetch" doesn't help if you are working on more than one
branch at a time, because doing "git pull" on one branch will do an
implicit "git fetch" on the other.

I like the idea of git maintaining a separate "last integrated" commit
for each branch, I think this could solve it in a nice way. I'm probably
not qualified enough to work on this myself though, but I'm happy to
give input if someone else wants to.


-- 
Stefan Haller
Berlin, Germany
http://www.haller-berlin.de/