Re: Recommendation for consistent update on invoke of "sha1_to_hex()"

2018-01-04 Thread Jeff King
On Fri, Jan 05, 2018 at 11:26:01AM +0800, 牛旭 wrote:

> By mining historical patches, we suggest that invokes of sha1_to_hex()
> should be replaced with that of oid_to_hex(). One example for
> recommendation and corresponding patch are listed as follows. 

Note that these two functions take different types. If you only have a
"const unsigned char *", then you must use sha1_to_hex(). If you have a
"struct object_id", then you should be using oid_to_hex(). If there are
sites which do:

  sha1_to_hex(oid.hash)

those should be converted to use oid_to_hex(). I think there's a
coccinelle rule for this, though, so there shouldn't be any lingering
calls like that.

Of course the ultimate goal is for every function to use oid_to_hex().
But that's much bigger than a single-line change, since groups of
dependent functions need to be converted (try "git log
--author=carlson" to see example patches).

> One example of missed spot:
> 1  void assert_sha1_type(const unsigned char *sha1, enum 
>   object_type expect)
> 2 {
> 3  enum object_type type = sha1_object_info(sha1, NULL);
> 4  if (type < 0)
> 5   die("%s is not a valid object",sha1_to_hex(sha1));
> 6  if (type != expect)
> 7   die("%s is not a valid '%s' object", sha1_to_hex(sha1),
> 8  typename(expect));
> 9 }

So this is an example that doesn't convert easily. The function has only
the bare pointer, so you'd have to change its parameter list (and
therefore all of its callers).

-Peff


Re: Patch recommendation for replace invoke of error() to that of error_errno()

2018-01-04 Thread Jeff King
On Fri, Jan 05, 2018 at 11:24:02AM +0800, 牛旭 wrote:

> Our team researches on consistent update of Git during evolution. And
> we have figured out several spots that may be missed. 
> 
> 
> By mining historical patches, we suggest that invokes of error(...,
> strerror(errno)) should be replaced with that of error_errno(). One
> example for recommendation and corresponding patch are listed as
> follows. 

Yes, historically we've done cleanups like this across time as we touch
the various pieces of code. More conversions are welcome as long as they
don't conflict with any topics that other people are working on (a good
test is to see if your suggested changes merge cleanly with the "pu"
branch).

In more recent times, we've been using the Coccinelle tool to do
automated conversions across the code base. Look at the contents and
history of the contrib/coccinelle directory. This might be a candidate
for that cleanup.

> One example of missed spot:
> 
> 1  int cmd_fetch__tool(int argc, const char **argv, const char 
>   *prefix)
> 2  {
> 
> 31  filename = git_path_fetch_head();
> 32  fp = fopen(filename, "a");
> 33  if (!fp)
> 34  return error("cannot open %s: %s", filename, strerror(errno));
> 
>   }

This one is actually a bit funny. It's in contrib/examples, which is all
historical code. It's not compiled or used as part of Git (and I'd
suspect most of it would not compile at all these days). It's not really
worth modernizing.

> More recommendations and supporting patches are saved in attachments.
> It is so kind of you to reply me about the correctness of our
> suggestions. And thank you for your reading. 

Eek, word documents. We're happy to take patches, but please format them
as plain text in your email (e.g., by using git-send-email). More
details are in Documentation/SubmittingPatches. Thanks.

-Peff


Re: [BUG] v2.16.0-rc0 seg faults when git bisect skip

2018-01-04 Thread Yasushi SHOJI
Hi,

On Fri, Jan 5, 2018 at 11:45 AM, Yasushi SHOJI  wrote:
> So that I can make a test case.

OK, here is the step to reproduce on git.git

$ git bisect start
$ git bisect bad v2.16.0-rc0
$ git bisect good 0433d533f13671f4313c31e34707f0f5281a18e0
$ git bisect good
$ git bisect bad
$ git bisect bad
$ git bisect skip #=> works with error
$ git bisect good #=> Segmentation fault
$ git bisect skip #=> Segmentation fault

The following is with full outputs, just in case you need it.

$ git describe
v2.16.0-rc0

$ git bisect start
$ git bisect bad v2.16.0-rc0

$ git bisect good 0433d533f13671f4313c31e34707f0f5281a18e0
Bisecting: 4 revisions left to test after this (roughly 2 steps)
[c87b653c46c4455561642b14efc8920a0b3e44b9] builtin/describe.c: rename
`oid` to avoid variable shadowing

$ git bisect good
Bisecting: 2 revisions left to test after this (roughly 1 step)
[4dbc59a4cce418ff8428a9d2ecd67c34ca50db56] builtin/describe.c: factor
out describe_commit

$ git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[cdaed0cf023a47cae327671fae11c10d88100ee7] builtin/describe.c: print
debug statements earlier

$ git bisect bad
cdaed0cf023a47cae327671fae11c10d88100ee7 is the first bad commit
commit cdaed0cf023a47cae327671fae11c10d88100ee7
Author: Stefan Beller 
Date:   Wed Nov 15 18:00:37 2017 -0800

builtin/describe.c: print debug statements earlier

When debugging, print the received argument at the start of the
function instead of in the middle. This ensures that the received
argument is printed in all code paths, and also allows a subsequent
refactoring to not need to move the "arg" parameter.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 

:04 04 beb1f2c4d9ee7584e54e0489c7e85e348cbf9fc7
b9882eea0772e3025690d3513cea5a940567668e M builtin

$ git bisect skip
There are only 'skip'ped commits left to test.
The first bad commit could be any of:
cdaed0cf023a47cae327671fae11c10d88100ee7
We cannot bisect more!

$ git bisect good
Segmentation fault

$ git bisect skip
Segmentation fault
-- 
   yashi


Re: Bring together merge and rebase

2018-01-04 Thread Carl Baldwin
On Thu, Jan 04, 2018 at 10:09:19PM -0700, Carl Baldwin wrote:
> This would be very cool. I've wanted to tackle this for a long time. I
> think I even filed an issue with gerrit about this years ago.

Yep, it turned out that it was a duplicate but I described what I did to
work around it.

https://bugs.chromium.org/p/gerrit/issues/detail?id=2375


Re: Bug report: git clone with dest

2018-01-04 Thread Isaac Shabtay
Hello Johannes, Jeff,

I cloned git's codebase, applied the four patches on master, built and
tested on Ubuntu Trusty. (After verifying that master indeed exhibits
this behaviour on Linux as well. Just checking).
Seems to work fine.
I also looked at the code. Most of the patched lines relate to tests,
and the one for clone.c seems reasonable to me. Added tests seem to
have very good coverage too.

I qualify everything I had written above by saying that it's my first
time ever looking at Git's source code.

On 4 January 2018 at 15:20, Johannes Schindelin
 wrote:
> Hi Isaac,
>
> On Wed, 3 Jan 2018, Isaac Shabtay wrote:
>
>> Indeed interesting... this one's for the books...
>> Thanks for the patches. Any idea when these are going to make it to
>> the official Git client builds? (specifically the Windows one)
>
> If you help them getting reviewed, tested, and validated, I could be
> talked into taking them into Git for Windows early so that they'll be in
> Git for Windows v2.16.0. It will require your effort, though.
>
> Ciao,
> Johannes


Re: Bring together merge and rebase

2018-01-04 Thread Carl Baldwin
On Thu, Jan 04, 2018 at 12:19:34PM -0700, Martin Fick wrote:
> On Tuesday, December 26, 2017 12:40:26 AM Jacob Keller 
> wrote:
> > On Mon, Dec 25, 2017 at 10:02 PM, Carl Baldwin 
>  wrote:
> > >> On Mon, Dec 25, 2017 at 5:16 PM, Carl Baldwin 
>  wrote:
> > >> A bit of a tangent here, but a thought I didn't wanna
> > >> lose: In the general case where a patch was rebased
> > >> and the original parent pointer was changed, it is
> > >> actually quite hard to show a diff of what changed
> > >> between versions.
> > 
> > My biggest gripes are that the gerrit web interface
> > doesn't itself do something like this (and jgit does not
> > appear to be able to generate combined diffs at all!)
> 
> I believe it now does, a presentation was given at the 
> Gerrit User summit in London describing this work.  It would 
> indeed be great if git could do this also!

This would be very cool. I've wanted to tackle this for a long time. I
think I even filed an issue with gerrit about this years ago.

Carl


Re: Bring together merge and rebase

2018-01-04 Thread Carl Baldwin
On Thu, Jan 04, 2018 at 01:06:27PM -0700, Martin Fick wrote:
> On Tuesday, December 26, 2017 01:31:55 PM Carl Baldwin 
> wrote:
> ...
> > What I propose is that gerrit and github could end up more
> > robust, featureful, and interoperable if they had this
> > feature to build from.
> 
> I agree (assuming we come up with a well defined feature)
> 
> > With gerrit specifically, adopting this feature would make
> > the "change" concept richer than it is now because it
> > could supersede the change-id in the commit message and
> > allow a change to evolve in a distributed non-linear way
> > with protection against clobbering work.
> 
> We (the Gerrit maintainers) would like changes to be able to 
> evolve non-linearly so that we can eventually support 
> distributed Gerrit reviews, and the amended-commit pointer 
> is one way I have thought to resolve this.

I really think that keeping these references is the key to doing this.

> > I have no intention to disparage either tool. I love them
> > both. They've both made my career better in different
> > ways. I know there is no guarantee that github, gerrit,
> > or any other tool will do anything to adopt this. But,
> > I'm hoping they are reading this thread and that they
> > recognize how this feature can make them a little bit
> > better and jump in and help. I know it is a lot to hope
> > for but I think it could be great if it happened.
> 
> We (the Gerrit maintainers) do recognize it, and I am glad 
> that someone is pushing for solutions in this space.  I am 
> not sure what the right solution is, and how to modify 
> workflows to deal better with this.  I do think that starting 
> by making your local repo track pointers to amended-commits, 
> likely with various git hooks and notes (as also proposed by 
> Johannes Schindelin), would be a good start.   With that in 
> place, then you can attack various specific workflows.

I have started a prototype that I will use to demonstrate this. I hope
to have something in a couple of weeks. I do have a day job also, so it
will be slow going. One idea that I had was to put my own server with
special hooks in it in front of gerrit to illustrate how collaboration
on a gerrit change, or even a chain of them, can be made safe. It would
act as a middle man between my client and the gerrit server. I'd just
have to change remote reference on my client to demonstrate.

> If you want to then attack the Gerrit workflow, it would be 
> good if you could prevent pushing new patchests that are 
> amended versions of patchsets that are out of date.  While 
> it would be great if Gerrit could reject such pushes, I 
> wonder if to start, git could detect and it prevent the push 
> in this situation?  Could a git push hook analyze the ref 
> advertisements and figure this out (all the patchsets are in 
> the advertisement)?  Can a git hook look at the ref 
> advertisement?

I'll think about this. At the least, the hook would have to look at the
server to see if there are new revisions. It would be difficult to close
race conditions that occur because the client will always be using
potentially out of date information even if it just went and pulled down
the latest stuff. I think I still like my middle man idea better as a
short term proof of concept.

Preventing pushing amended/rebased versions of out of date changes is
simple. Follow the "predecessor" references until you hit a known
commit. If that commit is the latest revision of the change then it is
up to date. If that commit not the latest revision, then it is out of
date. Reject it. This is what I plan to illustrate in my middle man
server.

If you traverse the entire graph of predecessors without finding a known
commit, then you have a new change. (In fact, the changeset id in the
commit message in a gerrit change seems unnecessary at this point). It
gets a little more complicated when you think about combining/squashing
changes (resulting in two or more "predecessor" references from a single
commit) or dividing a change into multiple but it works.

The harder part is the push/pull interaction between client and server.
When you go to push your amended update to a patchset, you want git to
send along any other new commits to complete the predecessor graph on
the server side. For example, you might rebase your commit and then
amend it to fix something. Personally, I'd like the rebase and the amend
to both be kept separately.

Similarly, when you've just had a push rejected because you're out of
date, you want to be able to easily pull down the commits you're missing
so that you can merge locally and try to push again.

You also don't want gc to garbage collect the intermediate commits. I
think gerrit uses many references internally in the git repo to "pin"
older revisions in the repository so that they don't appear orphaned. I
think I'm going to have to do something similar in my prototype.

If you think about it, this is all very much like what git already does
with its 

RE

2018-01-04 Thread Mrs Alice Walton
my name is Mrs. Alice Walton, a business woman an America Citizen and the 
heiress to the fortune of Walmart stores, born October 7, 1949. I have a 
mission for you worth $100,000,000.00(Hundred Million United State Dollars) 
which I intend using for CHARITY


Re: [PATCH] git-archive: accept --owner and --group like GNU tar

2018-01-04 Thread suzuki toshiya

Dear Junio,

Could you tell me your thought about the way for me to go?
Do you agree with his suggestion; "--uid etc is not the right
solution, --include-untracked is better and generic" ? Or,
should I work "--uid etc" further?

Regards,
mpsuzuki

Junio C Hamano wrote:

René Scharfe  writes:


I don't know if it's a good idea, but perhaps we don't even need a new
option.  We could change how pathspecs of untracked files are handled:
Instead of aborting we could include them in the archive.  (Sounds like
the simplest possible interface, but may have practical problems.)


One practical problem is that users who do this

$ git archive HEAD Documentation/ | tar tf -

would be expecting (at least) two different things, depending on the
situation they are in.

So at least you'd need an "--include-untracked" option, I guess.





Re: Bring together merge and rebase

2018-01-04 Thread Carl Baldwin
On Thu, Jan 04, 2018 at 12:54:00PM -0700, Martin Fick wrote:
> On Monday, December 25, 2017 06:16:40 PM Carl Baldwin wrote:
> > On Sun, Dec 24, 2017 at 10:52:15PM -0500, Theodore Ts'o 
> wrote:
> > Look at what happens in a rebase type workflow in any of
> > the following scenarios. All of these came up regularly
> > in my time with Gerrit.
> > 
> > 1. Make a quick edit through the web UI then later
> > work on the change again in your local clone. It is easy
> > to forget to pull down the change made through the UI
> > before starting to work on it again. If that happens, the
> > change made through the UI will almost certainly be
> > clobbered.
> > 
> > 2. You or someone else creates a second change that is
> > dependent on yours and works on it while yours is still
> > evolving. If the second change gets rebased with an older
> > copy of the base change and then posted back up for
> > review, newer work in the base change has just been
> > clobbered.
> > 
> > 3. As a reviewer, you decide the best way to explain
> > how you'd like to see something done differently is to
> > make the quick change yourself and push it up. If the
> > author fails to fetch what you pushed before continuing
> > onto something else, it gets clobbered.
> > 
> > 4. You want to collaborate on a single change with
> > someone else in any way and for whatever reason. As soon
> > as that change starts hitting multiple work spaces, there
> > are synchronization issues that currently take careful
> > manual intervention.
> 
> These scenarios seem to come up most for me at Gerrit hack-
> a-thons where we collaborate a lot in short time spans on 
> changes.  We (the Gerrit maintainers) too have wanted and 
> sometimes discussed ways to track the relation of "amended" 
> commits (which is generally what Gerrit patchsets are).  We 
> also concluded that some sort of parent commit pointer was 
> needed, although parent is somewhat the wrong term since 
> that already means something in git.  Rather, maybe some 
> "predecessor" type of term would be better, maybe 
> "antecedent", but "amended-commit" pointer might be best?

I like "replaces" as I have proposed or "supersedes". "predecessor" also
seems pretty good. I may add that to my list of favorites.

Carl


Recommendation for consistent update on invoke of "sha1_to_hex()"

2018-01-04 Thread 牛旭
Our team researches on consistent update of Git during evolution. And we have 
figured out several spots that may be missed. 


By mining historical patches, we suggest that invokes of sha1_to_hex() should 
be replaced with that of oid_to_hex(). One example for recommendation and 
corresponding patch are listed as follows. 

One example of missed spot:
1  void assert_sha1_type(const unsigned char *sha1, enum 
  object_type expect)
2 {
3  enum object_type type = sha1_object_info(sha1, NULL);
4  if (type < 0)
5   die("%s is not a valid object",sha1_to_hex(sha1));
6  if (type != expect)
7   die("%s is not a valid '%s' object", sha1_to_hex(sha1),
8  typename(expect));
9 }

One example of historical patch:
1  * We do not want to call parse_object here, because
2  * inflating blobs and trees could be very expensive.
3  * However, we do need to know the correct type for
4  * later processing, and the revision machinery expects
5  * commits and tags to have been parsed.
6  */
7  - type = sha1_object_info(sha1, NULL);
8  + type = sha1_object_info(oid->hash, NULL);
9  if (type < 0)
10 - die("unable to get object info for %s", sha1_to_hex(sha1));
11 + die("unable to get object info for %s", oid_to_hex(oid));
12
More recommendations and supporting patches are saved in attachments. It is so 
kind of you to reply me about the correctness of our suggestions. And thank you 
for your reading. 
<>
<>


Patch recommendation for replace invoke of error() to that of error_errno()

2018-01-04 Thread 牛旭
Our team researches on consistent update of Git during evolution. And we have 
figured out several spots that may be missed. 


By mining historical patches, we suggest that invokes of error(..., 
strerror(errno)) should be replaced with that of error_errno(). One example for 
recommendation and corresponding patch are listed as follows. 

One example of missed spot:

1  int cmd_fetch__tool(int argc, const char **argv, const char 
  *prefix)
2  {

31  filename = git_path_fetch_head();
32  fp = fopen(filename, "a");
33  if (!fp)
34  return error("cannot open %s: %s", filename, strerror(errno));

  }

One example of historical patch:
1  if (!strcmp(*paths, "-"))
2  in = stdin;
3  else
4  in = fopen(*paths, "r");
5
6  if (!in)
7  -  return error(_("could not open '%s' for reading: %s"),
8  - *paths, strerror(errno));
9  +  return error_errno(_("could not open '%s' for reading"),
10 + *paths);
11
12 mail = mkpath("%s/%0*d", state->dir, state->prec, i + 1);
13
14 out = fopen(mail, "w");
15 if (!out)
16 - return error(_("could not open '%s' for writing: %s"),
17 - mail, strerror(errno));
18 + return error_errno(_("could not open '%s' for writing"),
19 + mail);
20
21 ret = fn(out, in, keep_cr);
22
23 fclose(out);
24 fclose(in);
25
More recommendations and supporting patches are saved in attachments. It is so 
kind of you to reply me about the correctness of our suggestions. And thank you 
for your reading. <>
<>


Re: [BUG] v2.16.0-rc0 seg faults when git bisect skip

2018-01-04 Thread Yasushi SHOJI
Hi,

On Thu, Jan 4, 2018 at 3:26 AM, Martin Ågren  wrote:
> On 3 January 2018 at 15:21, Ævar Arnfjörð Bjarmason  wrote:
>>
>> On Wed, Jan 03 2018, Yasushi SHOJI jotted:
>>
>>> Hi,
>>>
>>> git version 2.16.0.rc0 seg faults on my machine when I
>>> [...]
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0  0x55a73107f900 in best_bisection_sorted (list=0x0, nr=0) at 
>>> bisect.c:232
>>> 232 free_commit_list(p->next);
>>> (gdb) bt
>>> #0  0x55a73107f900 in best_bisection_sorted (list=0x0, nr=0) at 
>>> bisect.c:232
>>> #1  0x55a73107fc0f in do_find_bisection (list=0x0, nr=0,
>>> weights=0x55a731b6ffd0, find_all=1) at bisect.c:361
>>> #2  0x55a73107fcf4 in find_bisection (commit_list=0x7ffe8750d4d0,
>>> reaches=0x7ffe8750d4c4, all=0x7ffe8750d4c0, find_all=1) at
>>> bisect.c:400
>>> #3  0x55a73108128d in bisect_next_all (prefix=0x0, no_checkout=0)
>>> at bisect.c:969
>>> #4  0x55a730fd5238 in cmd_bisect__helper (argc=0,
>>> argv=0x7ffe8750e230, prefix=0x0) at builtin/bisect--helper.c:140
>>> #5  0x55a730fcbc76 in run_builtin (p=0x55a73145c778
>>> , argc=2, argv=0x7ffe8750e230) at git.c:346
>>> #6  0x55a730fcbf40 in handle_builtin (argc=2, argv=0x7ffe8750e230)
>>> at git.c:554
>>> #7  0x55a730fcc0e8 in run_argv (argcp=0x7ffe8750e0ec,
>>> argv=0x7ffe8750e0e0) at git.c:606
>>> #8  0x55a730fcc29b in cmd_main (argc=2, argv=0x7ffe8750e230) at 
>>> git.c:683
>>> #9  0x55a731068d9e in main (argc=3, argv=0x7ffe8750e228) at 
>>> common-main.c:43
>>> (gdb) p p
>>> $1 = (struct commit_list *) 0x0
>>>
>>> As you can see, the code dereferences to the 'next' while 'p' is NULL.
>>>
>>> I'm sure I did 'git bisect good' after git _found_ bad commit.  Then I
>>> typed 'git bisect skip' on the commit 726804874 of guile repository.
>>> If that matters at all.
>>>
>>> I haven't touched guile repo to preserve the current state.
>>
>> I can't reproduce this myself, but looking at the backtrace it seems
>> pretty obvious that 7c117184d7 ("bisect: fix off-by-one error in
>> `best_bisection_sorted()`", 2017-11-05) is the culprit.
>>
>> That changed more careful code added by Christian in 50e62a8e70
>> ("rev-list: implement --bisect-all", 2007-10-22) to free a pointer which
>> as you can see can be NULL.
>>
>> If you can test a patch to see if it works this should fix it:
>>
>> diff --git a/bisect.c b/bisect.c
>> index 0fca17c02b..2f3008b078 100644
>> --- a/bisect.c
>> +++ b/bisect.c
>> @@ -229,8 +229,10 @@ static struct commit_list *best_bisection_sorted(struct 
>> commit_list *list, int n
>> if (i < cnt - 1)
>> p = p->next;
>> }
>> -   free_commit_list(p->next);
>> -   p->next = NULL;
>> +   if (p) {
>> +   free_commit_list(p->next);
>> +   p->next = NULL;
>> +   }
>> strbuf_release();
>> free(array);
>> return list;
>>
>> But given the commit message by Martin maybe there's some deeper bug here.
>
> I haven't tried to reproduce, or tested the patch, but from the looks of
> it, your analysis and fix are both spot on. The special case that yashi
> has hit is that `list` is NULL. The old code handled that very well, the
> code after my patch ... not so well. The loop-sort-loop pattern reduces
> to a no-op, both before and after my patch. But what I failed to realize
> was that `list` could be NULL.

The patch (actually, I've tested the one in pu, 2e9fdc795cb27) avoids
the seg fault for sure, but the question is:

When does the list allowed to contain NULLs?

Since nobody noticed it since 7c117184d7, it must be a rare case, right?

Would you guys elaborate a bit?  I don't have any insight how
best_bisection_sorted() should work and what the list should contain.
So that I can make a test case.

Thanks,
-- 
   yashi


Happy new year!

2018-01-04 Thread Sgt. Britta Lopez
Good Day,
How are u doing today ? Apologies! I am a military woman ,seeking
your kind assistance to move the sum of ($7M USD) to you, as far
as i can be assured that my money will be safe in your care until
i complete my service here in Afghanistan and come over next
month.
This is legitimate, and there is no danger involved.If
interested, reply immediately for detailed information.
 
Reply to this email sgtbrittalope...@gmail.com
Regards ,
Sgt. Britta Lopez


Re: Bring together merge and rebase

2018-01-04 Thread Martin Fick
> On Jan 4, 2018 11:19 AM, "Martin Fick" 
 wrote:
> > On Tuesday, December 26, 2017 12:40:26 AM Jacob Keller
> > 
> > wrote:
> > > On Mon, Dec 25, 2017 at 10:02 PM, Carl Baldwin
> > 
> >  wrote:
> > > >> On Mon, Dec 25, 2017 at 5:16 PM, Carl Baldwin
> > 
> >  wrote:
> > > >> A bit of a tangent here, but a thought I didn't
> > > >> wanna
> > > >> lose: In the general case where a patch was rebased
> > > >> and the original parent pointer was changed, it is
> > > >> actually quite hard to show a diff of what changed
> > > >> between versions.
> > > 
> > > My biggest gripes are that the gerrit web interface
> > > doesn't itself do something like this (and jgit does
> > > not
> > > appear to be able to generate combined diffs at all!)
> > 
> > I believe it now does, a presentation was given at the
> > Gerrit User summit in London describing this work.  It
> > would indeed be great if git could do this also!


On Thursday, January 04, 2018 04:02:40 PM Jacob Keller 
wrote:
> Any chance slides or a recording was posted anywhere? I'm
> quite interested in this topic.

Slides and video + transcript here:

https://gerrit.googlesource.com/summit/2017/+/master/sessions/new-in-2.15.md

Watch the part after the backend improvements,

-Martin

-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation



Re: [PATCH 3/4] clone: factor out dir_exists() helper

2018-01-04 Thread Jeff King
On Thu, Jan 04, 2018 at 06:54:12PM -0500, Jeff King wrote:

> > If we really want to be anal, perhaps a new helper path_exists()
> > that cares only about existence of paths (i.e. the implementation of
> > these two helpers they currently have), together with update to
> > check the st.st_mode for file_exists() and dir_exists(), may help
> > making the API set more rational, but I do not think it is worth it.
> 
> Yep, I also considered that file_exists() probably wants to be
> path_exists() with its current implementation. We'd probably want to
> review all of the callers.
> 
> Anyway, I tried to do the minimal refactoring here, with no change in
> behavior. I'm not opposed to calling this dir_exists() as path_exists()
> and making it globally available (as you note, I don't think we'd want
> to use a true dir_exists() here).

So I actually started down this road just now, but I'm not sure if it's
worth it.  If we were to transition to an endgame with path_exists(),
dir_exists(), and file_exists(), we'd probably want to do something
like:

  1. introduce path_exists(), but leave existing file_exists() callers
 in place

  2. introduce file_exists_as_file(), which checks S_IFREG

  3. audit each file_exists() call to see if it ought to be
 path_exists() or file_exists_as_file() and convert as needed

  4. When there are no more file_exists() calls left, all
 file_exists_as_file() instances can be renamed to file_exists().

But as with any "audit each..." plan, that leaves topics in flight out
of luck. If we want to be kind to those, we'd have to wait a long while
to shake out any file_exists() callers.

At which point is there much value in having path_exists() as a wrapper?
It's a better name, perhaps, but I think my future-proofing against
"file_exists() may become file-specific" was probably overly paranoid. I
don't think we could sanely do that conversion without risking breakage.

Maybe we should just document its behavior and use it here, rather than
introducing this new dir_exists().

-Peff


Re: Rewrite cat-file.c : need help to find a bug

2018-01-04 Thread Christian Couder
On Thu, Jan 4, 2018 at 11:23 PM, Оля Тележная  wrote:
>
> So for now 2 of my last commits fail, and I am tired of searching for the 
> error.
> I was also trying to leave cat_file_info variable and fill in both new
> and old variables and then compare resulting values by printing them
> into file. Everything is OK, but I find it dudpicious that the
> resulting file is too small (fprintf was invoked only 3 times, it was
> here: 
> https://github.com/telezhnaya/git/commit/54a5b5a0167ad634c26e4fd7df234a46286ede0a#diff-2846189963e8aec1bcb559b69b7f20d0R1489)
>
> I have left few comments in github to simplify your understanding what
> I was trying to achieve. Feel free to ask any questions if you find
> the code strange, unclear or suspicious.

Let me try to see how I can debug it.

Running `./t1006-cat-file.sh -v -i` gives:

---
expecting success:
maybe_remove_timestamp "$batch_output" $no_ts >expect &&
maybe_remove_timestamp "$(echo $sha1 | git cat-file --batch)"
$no_ts >actual &&
test_cmp expect actual

Segmentation fault (core dumped)
--- expect  2018-01-04 23:31:20.515114634 +
+++ actual  2018-01-04 23:31:20.635114274 +
@@ -1,2 +0,0 @@
-5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689 blob 11
-Hello World
\ No newline at end of file
not ok 9 - --batch output of blob is correct
#
#   maybe_remove_timestamp "$batch_output" $no_ts >expect &&
#   maybe_remove_timestamp "$(echo $sha1 | git cat-file
--batch)" $no_ts >actual &&
#   test_cmp expect actual
#
---

So there is a segfault probably when running $(echo $sha1 | git
cat-file --batch). Let's try to run that manually.

$ cd trash\ directory.t1006-cat-file/
$  echo 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689 | git cat-file --batch
Segmentation fault (core dumped)

That's it. Now let's use gdb to see where it comes from:

$ echo 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689 > myarg.txt
$ gdb git
GNU gdb (Ubuntu 8.0.1-0ubuntu1) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
...
(gdb)

Let's run the cat-file command inside gdb:

(gdb) run cat-file --batch < myarg.txt
Starting program: /home/ubuntu/bin/git cat-file --batch < myarg.txt
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x556e88e6 in populate_value (ref=0x7fffd430) at ref-filter.c:1496
1496ref->disk_size = *obj_info.disk_sizep;
(gdb)

Let's get a backtrace:

(gdb)  bt
#0  0x556e88e6 in populate_value (ref=0x7fffd430) at
ref-filter.c:1496
#1  0x555783f1 in batch_object_write (
obj_name=0x55a655f0
"5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689", opt=0x7fffd6e0,
data=0x7fffd5e0) at builtin/cat-file.c:291
#2  0x55578660 in batch_one_object (
obj_name=0x55a655f0
"5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689", opt=0x7fffd6e0,
data=0x7fffd5e0) at builtin/cat-file.c:346

Let's see what's the code that makes it segfault:

(gdb) l
1491fflush(stdout);
1492return -1;
1493}
1494ref->type = *obj_info.typep;
1495ref->size = *obj_info.sizep;
1496ref->disk_size = *obj_info.disk_sizep;
1497hashcpy(ref->delta_base_oid.hash,
obj_info.delta_base_sha1);
1498}
1499
1500/* Fill in specials first */

Line 1496 has "ref->disk_size = *obj_info.disk_sizep;" so let's look
at those variables:

(gdb) p *ref
$1 = {objectname = {hash =
"^\034\060\235\256\177E\340\363\233\033\363\254<\331\333\022\347\326\211"},
  flag = 0, kind = 4148386208, symref = 0x7778b9e0
<_IO_2_1_stdin_> "\210 \255\373",
  commit = 0x7fffd510, values = 0x55a66cb0, type = OBJ_BLOB, size = 11,
  disk_size = -7613955248136140544, rest = 0x0, delta_base_oid = {
hash = "-\334qUUU\000\000\360\324\377\377\377\177\000\000\340\325\377\377"},
  start_of_request = 0x55a655f0 "5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689",
  refname = 0x7fffd4a8 ""}
(gdb)  p obj_info
$2 = {typep = 0x55a53df8 , sizep = 0x55a66c30,
disk_sizep = 0x0, delta_base_sha1 = 0x0,
  typename = 0x0, contentp = 0x0, whence = OI_LOOSE, u = {packed =
{pack = 0x0, offset = 0,
  is_delta = 0}}}

Ok we can see that "disk_sizep = 0x0" which means that it segfault
because line 1496 tries to read the value pointed to by disk_sizep
which is NULL.

I hope this will help you.

Best,
Christian.


Re: [PATCH 3/4] clone: factor out dir_exists() helper

2018-01-04 Thread Jeff King
On Thu, Jan 04, 2018 at 03:47:18PM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Two parts of git-clone's setup logic check whether a
> > directory exists, and they both call stat directly with the
> > same scratch "struct stat" buffer. Let's pull that into a
> > helper, which has a few advantages:
> >
> >   - it makes the purpose of the stat calls more obvious
> >
> >   - it makes it clear that we don't care about the
> > information in "buf" remaining valid
> >
> >   - if we later decide to make the check more robust (e.g.,
> > complaining about non-directories), we can do it in one
> > place
> >
> > Note that we could just use file_exists() for this, which
> > has identical code. But we specifically care about
> > directories, so this future-proofs us against that function
> > later getting more picky about seeing actual files.
> 
> It leaves funny taste in my mouth to see that dir_exists() does call
> stat() but does not check st.st_mode to see if it is a directory,
> but for this particular caller, we want dest_exists() to be true
> even when the thing is a non-directory, so that !is_empty_dir(dir)
> call is made on the next line to trigger "exists but not an empty
> dir" error.  After all, what this caller really wants to ask is "is
> something sitting there?" and the answer it expects under normal
> condition is "no, there is nothing there".

Yeah, that was part of the reason I left this as file-local instead of
adding it globally.

> If we really want to be anal, perhaps a new helper path_exists()
> that cares only about existence of paths (i.e. the implementation of
> these two helpers they currently have), together with update to
> check the st.st_mode for file_exists() and dir_exists(), may help
> making the API set more rational, but I do not think it is worth it.

Yep, I also considered that file_exists() probably wants to be
path_exists() with its current implementation. We'd probably want to
review all of the callers.

Anyway, I tried to do the minimal refactoring here, with no change in
behavior. I'm not opposed to calling this dir_exists() as path_exists()
and making it globally available (as you note, I don't think we'd want
to use a true dir_exists() here).

-Peff


Re: [PATCH 4/4] clone: do not clean up directories we didn't create

2018-01-04 Thread Junio C Hamano
Jeff King  writes:

> diff --git a/builtin/clone.c b/builtin/clone.c
> index 04b0d7283f..284651797e 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -502,12 +504,12 @@ static void remove_junk(void)
>  
>   if (junk_git_dir) {
>   strbuf_addstr(, junk_git_dir);
> - remove_dir_recursively(, 0);
> + remove_dir_recursively(, junk_git_dir_flags);
>   strbuf_reset();
>   }
>   if (junk_work_tree) {
>   strbuf_addstr(, junk_work_tree);
> - remove_dir_recursively(, 0);
> + remove_dir_recursively(, junk_work_tree_flags);
>   }
>   strbuf_release();
>  }
> @@ -972,14 +974,24 @@ int cmd_clone(int argc, const char **argv, const char 
> *prefix)
>   if (safe_create_leading_directories_const(work_tree) < 0)
>   die_errno(_("could not create leading directories of 
> '%s'"),
> work_tree);
> - if (!dest_exists && mkdir(work_tree, 0777))
> + if (dest_exists)
> + junk_work_tree_flags |= REMOVE_DIR_KEEP_TOPLEVEL;
> + else if (mkdir(work_tree, 0777))
>   die_errno(_("could not create work tree dir '%s'"),
> work_tree);
>   junk_work_tree = work_tree;
>   set_git_work_tree(work_tree);
>   }
>  
> - junk_git_dir = real_git_dir ? real_git_dir : git_dir;
> + if (real_git_dir) {
> + if (dir_exists(real_git_dir))
> + junk_git_dir_flags |= REMOVE_DIR_KEEP_TOPLEVEL;
> + junk_git_dir = real_git_dir;
> + } else {
> + if (dest_exists)
> + junk_git_dir_flags |= REMOVE_DIR_KEEP_TOPLEVEL;
> + junk_git_dir = git_dir;
> + }
>   if (safe_create_leading_directories_const(git_dir) < 0)
>   die(_("could not create leading directories of '%s'"), git_dir);

The changes all look reasonable.

Thanks.


Re: [PATCH 3/4] clone: factor out dir_exists() helper

2018-01-04 Thread Junio C Hamano
Jeff King  writes:

> Two parts of git-clone's setup logic check whether a
> directory exists, and they both call stat directly with the
> same scratch "struct stat" buffer. Let's pull that into a
> helper, which has a few advantages:
>
>   - it makes the purpose of the stat calls more obvious
>
>   - it makes it clear that we don't care about the
> information in "buf" remaining valid
>
>   - if we later decide to make the check more robust (e.g.,
> complaining about non-directories), we can do it in one
> place
>
> Note that we could just use file_exists() for this, which
> has identical code. But we specifically care about
> directories, so this future-proofs us against that function
> later getting more picky about seeing actual files.

It leaves funny taste in my mouth to see that dir_exists() does call
stat() but does not check st.st_mode to see if it is a directory,
but for this particular caller, we want dest_exists() to be true
even when the thing is a non-directory, so that !is_empty_dir(dir)
call is made on the next line to trigger "exists but not an empty
dir" error.  After all, what this caller really wants to ask is "is
something sitting there?" and the answer it expects under normal
condition is "no, there is nothing there".

If we really want to be anal, perhaps a new helper path_exists()
that cares only about existence of paths (i.e. the implementation of
these two helpers they currently have), together with update to
check the st.st_mode for file_exists() and dir_exists(), may help
making the API set more rational, but I do not think it is worth it.

Thanks.

>
> Signed-off-by: Jeff King 
> ---
>  builtin/clone.c | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 2da71db107..04b0d7283f 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -863,10 +863,15 @@ static void dissociate_from_references(void)
>   free(alternates);
>  }
>  
> +static int dir_exists(const char *path)
> +{
> + struct stat sb;
> + return !stat(path, );
> +}
> +
>  int cmd_clone(int argc, const char **argv, const char *prefix)
>  {
>   int is_bundle = 0, is_local;
> - struct stat buf;
>   const char *repo_name, *repo, *work_tree, *git_dir;
>   char *path, *dir;
>   int dest_exists;
> @@ -938,7 +943,7 @@ int cmd_clone(int argc, const char **argv, const char 
> *prefix)
>   dir = guess_dir_name(repo_name, is_bundle, option_bare);
>   strip_trailing_slashes(dir);
>  
> - dest_exists = !stat(dir, );
> + dest_exists = dir_exists(dir);
>   if (dest_exists && !is_empty_dir(dir))
>   die(_("destination path '%s' already exists and is not "
>   "an empty directory."), dir);
> @@ -949,7 +954,7 @@ int cmd_clone(int argc, const char **argv, const char 
> *prefix)
>   work_tree = NULL;
>   else {
>   work_tree = getenv("GIT_WORK_TREE");
> - if (work_tree && !stat(work_tree, ))
> + if (work_tree && dir_exists(work_tree))
>   die(_("working tree '%s' already exists."), work_tree);
>   }


Re: Bug report: git clone with dest

2018-01-04 Thread Johannes Schindelin
Hi Isaac,

On Wed, 3 Jan 2018, Isaac Shabtay wrote:

> Indeed interesting... this one's for the books...
> Thanks for the patches. Any idea when these are going to make it to
> the official Git client builds? (specifically the Windows one)

If you help them getting reviewed, tested, and validated, I could be
talked into taking them into Git for Windows early so that they'll be in
Git for Windows v2.16.0. It will require your effort, though.

Ciao,
Johannes


Re: [PATCH v3 0/5] Add --no-ahead-behind to status

2018-01-04 Thread Jeff King
On Wed, Jan 03, 2018 at 09:47:28PM +, Jeff Hostetler wrote:

> Config values of true and false control non-porcelain formats
> for compatibility reasons as previously discussed.  In the
> last commit I added a new value of 2 for the config setting
> to allow porcelain formats to inherit the new setting.  I've
> marked this experimental for now or so that we can discuss
> it.

I'm mildly negative on this "level 2" config. If influencing the
porcelain via config creates compatibility headaches, then why would we
allow it here? And if it doesn't, then why do we need to protect against
it? This seems to exist in a funny middle ground that cannot decide
whether it is bad or not.

It's like we're inserting a foot-gun, but putting it just far enough out
of reach that we can blame the user when they shoot themselves with it.

Is there a compelling use case for this? From the previous discussion,
this is the strawman I came up with:

  Scripted callers like Visual Studio don't want to unconditionally pass
  --no-ahead-behind, because it makes sense only for large repositories
  (and small ones would prefer the more exact answer, if we can get it
  cheaply). So we'd like the user to trigger "this is large" on a
  per-repo basis, and accept the consequences of possibly broken
  porcelain callers.

I think we could have the best of both worlds, though, if the existing
config option were coupled with a command-line option to say "yes, I
understand no-ahead-behind, so use it for the porcelain if applicable".
IOW, the user does:

  git config status.aheadbehind false

and VS does:

  git status --ahead-behind=maybe

and together both sides have assented to the "quick" thing.

-Peff


Re: [PATCH 3/6] fsmonitor: Update helper tool, now that flags are filled later

2018-01-04 Thread Johannes Schindelin
Hi Alex,

On Tue, 2 Jan 2018, Alex Vandiver wrote:

> diff --git a/config.c b/config.c
> index e617c2018..7c6ed888e 100644
> --- a/config.c
> +++ b/config.c
> @@ -2174,8 +2174,13 @@ int git_config_get_fsmonitor(void)
>   if (core_fsmonitor && !*core_fsmonitor)
>   core_fsmonitor = NULL;
>  
> - if (core_fsmonitor)
> - return 1;
> +
> + if (core_fsmonitor) {
> + if (!strcasecmp(core_fsmonitor, "keep"))
> + return -1;
> + else
> + return 1;
> + }

It took me a while to reason about this:

- there is no existing code path that can return -1 from
  git_config_get_fsmonitor(),

- the callers in builtin/update-index.c (testing explicitly for 0 and 1)
  do not matter because they only trigger warnings.

- the remaining two callers are in fsmonitor.c:

  - tweak_fsmonitor() (which handles -1 specifically), and

  - inflate_fsmonitor_ewah(), which only tests whether
git_config_get_fsmonitor() returned a non-zero value, but that test is
inside a code block that is only triggered if the index has an
fsmonitor_dirty array, meaning: it already had fsmonitor enabled.
Therefore the test is legitimate.

This would take the next reader as much time, I would wager a bet. So
maybe you can include this information (or at least the information about
inflate_fsmonitor_ewah()) in the commit message?

> diff --git a/t/helper/test-dump-fsmonitor.c b/t/helper/test-dump-fsmonitor.c
> index ad452707e..48c4bab0b 100644
> --- a/t/helper/test-dump-fsmonitor.c
> +++ b/t/helper/test-dump-fsmonitor.c
> @@ -1,12 +1,14 @@
>  #include "cache.h"
> +#include "config.h"
>  
>  int cmd_main(int ac, const char **av)
>  {
>   struct index_state *istate = _index;
>   int i;
>  
> + git_config_push_parameter("core.fsmonitor=keep");

The alternative would be to use an environment variable. We already use
GIT_FSMONITOR_TEST.

However, I wonder why we need this. Do we really update the index anywhere
in the tests, then *toggle* the core.fsmonitor setting, and *then* call
test-dump-fsmonitor?

And if we do, can't we simply avoid it?

Ciao,
Johannes


[PATCHv2 6/6] diff: use HAS_MULTI_BITS instead of counting bits manually

2018-01-04 Thread Stefan Beller
This aligns the style to the previous patch.

Signed-off-by: Stefan Beller 
---
 diff.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/diff.c b/diff.c
index 42858d4c7d..c2ee81a1fa 100644
--- a/diff.c
+++ b/diff.c
@@ -4107,20 +4107,15 @@ void diff_setup(struct diff_options *options)
 
 void diff_setup_done(struct diff_options *options)
 {
-   int count = 0;
+   unsigned check_mask = DIFF_FORMAT_NAME |
+ DIFF_FORMAT_NAME_STATUS |
+ DIFF_FORMAT_CHECKDIFF |
+ DIFF_FORMAT_NO_OUTPUT;
 
if (options->set_default)
options->set_default(options);
 
-   if (options->output_format & DIFF_FORMAT_NAME)
-   count++;
-   if (options->output_format & DIFF_FORMAT_NAME_STATUS)
-   count++;
-   if (options->output_format & DIFF_FORMAT_CHECKDIFF)
-   count++;
-   if (options->output_format & DIFF_FORMAT_NO_OUTPUT)
-   count++;
-   if (count > 1)
+   if (HAS_MULTI_BITS(options->output_format & check_mask))
die(_("--name-only, --name-status, --check and -s are mutually 
exclusive"));
 
if (HAS_MULTI_BITS(options->pickaxe_opts & DIFF_PICKAXE_KINDS_MASK))
-- 
2.16.0.rc0.223.g4a4ac83678-goog



[PATCHv2 2/6] diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit

2018-01-04 Thread Stefan Beller
Currently flags for pickaxing are found in different places. Unify the
flags into the `pickaxe_opts` field, which will contain any pickaxe related
flags.

Signed-off-by: Stefan Beller 
---
 diff.h | 3 ++-
 diffcore-pickaxe.c | 6 +++---
 revision.c | 2 +-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/diff.h b/diff.h
index ea310f76fd..8af1213684 100644
--- a/diff.h
+++ b/diff.h
@@ -91,7 +91,6 @@ struct diff_flags {
unsigned override_submodule_config:1;
unsigned dirstat_by_line:1;
unsigned funccontext:1;
-   unsigned pickaxe_ignore_case:1;
unsigned default_follow_renames:1;
 };
 
@@ -327,6 +326,8 @@ extern void diff_setup_done(struct diff_options *);
 #define DIFF_PICKAXE_KIND_S4 /* traditional plumbing counter */
 #define DIFF_PICKAXE_KIND_G8 /* grep in the patch */
 
+#define DIFF_PICKAXE_IGNORE_CASE   32
+
 extern void diffcore_std(struct diff_options *);
 extern void diffcore_fix_diff_index(struct diff_options *);
 
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 9476bd2108..4b5d88ea46 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -222,11 +222,11 @@ void diffcore_pickaxe(struct diff_options *o)
 
if (opts & (DIFF_PICKAXE_REGEX | DIFF_PICKAXE_KIND_G)) {
int cflags = REG_EXTENDED | REG_NEWLINE;
-   if (o->flags.pickaxe_ignore_case)
+   if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_CASE)
cflags |= REG_ICASE;
regcomp_or_die(, needle, cflags);
regexp = 
-   } else if (o->flags.pickaxe_ignore_case &&
+   } else if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_CASE &&
   has_non_ascii(needle)) {
struct strbuf sb = STRBUF_INIT;
int cflags = REG_NEWLINE | REG_ICASE;
@@ -236,7 +236,7 @@ void diffcore_pickaxe(struct diff_options *o)
strbuf_release();
regexp = 
} else {
-   kws = kwsalloc(o->flags.pickaxe_ignore_case
+   kws = kwsalloc(o->pickaxe_opts & DIFF_PICKAXE_IGNORE_CASE
   ? tolower_trans_tbl : NULL);
kwsincr(kws, needle, strlen(needle));
kwsprep(kws);
diff --git a/revision.c b/revision.c
index e2e691dd5a..ccf1d212ce 100644
--- a/revision.c
+++ b/revision.c
@@ -2076,7 +2076,7 @@ static int handle_revision_opt(struct rev_info *revs, int 
argc, const char **arg
revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_ERE;
} else if (!strcmp(arg, "--regexp-ignore-case") || !strcmp(arg, "-i")) {
revs->grep_filter.ignore_case = 1;
-   revs->diffopt.flags.pickaxe_ignore_case = 1;
+   revs->diffopt.pickaxe_opts |= DIFF_PICKAXE_IGNORE_CASE;
} else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED;
} else if (!strcmp(arg, "--perl-regexp") || !strcmp(arg, "-P")) {
-- 
2.16.0.rc0.223.g4a4ac83678-goog



[PATCHv2 3/6] diff: introduce DIFF_PICKAXE_KINDS_MASK

2018-01-04 Thread Stefan Beller
Currently the check whether to perform pickaxing is done via checking
`diffopt->pickaxe`, which contains the command line argument that we
want to pickaxe for. Soon we'll introduce a new type of pickaxing, that
will not store anything in the `.pickaxe` field, so let's migrate the
check to be dependent on pickaxe_opts.

It is not enough to just replace the check for pickaxe by pickaxe_opts,
because flags might be set, but pickaxing was not requested ('-i').
To cope with that, introduce a mask to check only for the bits indicating
the modes of operation.

Signed-off-by: Stefan Beller 
---
 builtin/log.c  | 4 ++--
 combine-diff.c | 2 +-
 diff.c | 4 ++--
 diff.h | 2 ++
 revision.c | 2 +-
 5 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 6c1fa896ad..bd6f2d1efb 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -180,8 +180,8 @@ static void cmd_log_init_finish(int argc, const char 
**argv, const char *prefix,
if (rev->show_notes)
init_display_notes(>notes_opt);
 
-   if (rev->diffopt.pickaxe || rev->diffopt.filter ||
-   rev->diffopt.flags.follow_renames)
+   if ((rev->diffopt.pickaxe_opts & DIFF_PICKAXE_KINDS_MASK) ||
+   rev->diffopt.filter || rev->diffopt.flags.follow_renames)
rev->always_show_header = 0;
 
if (source)
diff --git a/combine-diff.c b/combine-diff.c
index 2505de119a..bc08c4c5b1 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1438,7 +1438,7 @@ void diff_tree_combined(const struct object_id *oid,
opt->flags.follow_renames   ||
opt->break_opt != -1||
opt->detect_rename  ||
-   opt->pickaxe||
+   (opt->pickaxe_opts & DIFF_PICKAXE_KINDS_MASK)   ||
opt->filter;
 
 
diff --git a/diff.c b/diff.c
index 0763e89263..5508745dc8 100644
--- a/diff.c
+++ b/diff.c
@@ -4173,7 +4173,7 @@ void diff_setup_done(struct diff_options *options)
/*
 * Also pickaxe would not work very well if you do not say recursive
 */
-   if (options->pickaxe)
+   if (options->pickaxe_opts & DIFF_PICKAXE_KINDS_MASK)
options->flags.recursive = 1;
/*
 * When patches are generated, submodules diffed against the work tree
@@ -5777,7 +5777,7 @@ void diffcore_std(struct diff_options *options)
if (options->break_opt != -1)
diffcore_merge_broken();
}
-   if (options->pickaxe)
+   if (options->pickaxe_opts & DIFF_PICKAXE_KINDS_MASK)
diffcore_pickaxe(options);
if (options->orderfile)
diffcore_order(options->orderfile);
diff --git a/diff.h b/diff.h
index 8af1213684..9ec4f824fe 100644
--- a/diff.h
+++ b/diff.h
@@ -326,6 +326,8 @@ extern void diff_setup_done(struct diff_options *);
 #define DIFF_PICKAXE_KIND_S4 /* traditional plumbing counter */
 #define DIFF_PICKAXE_KIND_G8 /* grep in the patch */
 
+#define DIFF_PICKAXE_KINDS_MASK (DIFF_PICKAXE_KIND_S | DIFF_PICKAXE_KIND_G)
+
 #define DIFF_PICKAXE_IGNORE_CASE   32
 
 extern void diffcore_std(struct diff_options *);
diff --git a/revision.c b/revision.c
index ccf1d212ce..5d11ecaf27 100644
--- a/revision.c
+++ b/revision.c
@@ -2407,7 +2407,7 @@ int setup_revisions(int argc, const char **argv, struct 
rev_info *revs, struct s
revs->diff = 1;
 
/* Pickaxe, diff-filter and rename following need diffs */
-   if (revs->diffopt.pickaxe ||
+   if ((revs->diffopt.pickaxe_opts & DIFF_PICKAXE_KINDS_MASK) ||
revs->diffopt.filter ||
revs->diffopt.flags.follow_renames)
revs->diff = 1;
-- 
2.16.0.rc0.223.g4a4ac83678-goog



[PATCHv2 5/6] diff: properly error out when combining multiple pickaxe options

2018-01-04 Thread Stefan Beller
In f506b8e8b5 (git log/diff: add -G that greps in the patch text,
2010-08-23) we were hesitant to check if the user requests both -S and
-G at the same time. Now that the pickaxe family also offers --find-object,
which looks slightly more different than the former two, let's add a check
that those are not used at the same time.

Signed-off-by: Stefan Beller 
---
 diff.c | 3 +++
 diffcore-pickaxe.c | 1 -
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index a872bdcac1..42858d4c7d 100644
--- a/diff.c
+++ b/diff.c
@@ -4123,6 +4123,9 @@ void diff_setup_done(struct diff_options *options)
if (count > 1)
die(_("--name-only, --name-status, --check and -s are mutually 
exclusive"));
 
+   if (HAS_MULTI_BITS(options->pickaxe_opts & DIFF_PICKAXE_KINDS_MASK))
+   die(_("-G, -S and --find-object are mutually exclusive"));
+
/*
 * Most of the time we can say "there are changes"
 * only by checking if there are changed paths, but
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 72bb5a9514..239ce5122b 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -251,7 +251,6 @@ void diffcore_pickaxe(struct diff_options *o)
}
}
 
-   /* Might want to warn when both S and G are on; I don't care... */
pickaxe(_queued_diff, o, regexp, kws,
(opts & DIFF_PICKAXE_KIND_G) ? diff_grep : has_changes);
 
-- 
2.16.0.rc0.223.g4a4ac83678-goog



[PATCHv2 4/6] diffcore: add a pickaxe option to find a specific blob

2018-01-04 Thread Stefan Beller
Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])

One might be tempted to extend git-describe to also work with blobs,
such that `git describe ` gives a description as
':'.  This was implemented at [2]; as seen by the sheer
number of responses (>110), it turns out this is tricky to get right.
The hard part to get right is picking the correct 'commit-ish' as that
could be the commit that (re-)introduced the blob or the blob that
removed the blob; the blob could exist in different branches.

Junio hinted at a different approach of solving this problem, which this
patch implements. Teach the diff machinery another flag for restricting
the information to what is shown. For example:

$ ./git log --oneline --find-object=v2.0.0:Makefile
b2feb64309 Revert the whole "ask curl-config" topic for now
47fbfded53 i18n: only extract comments marked with "TRANSLATORS:"

we observe that the Makefile as shipped with 2.0 was appeared in
v1.9.2-471-g47fbfded53 and in v2.0.0-rc1-5-gb2feb6430b.  The
reason why these commits both occur prior to v2.0.0 are evil
merges that are not found using this new mechanism.

[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
[2] https://public-inbox.org/git/20171028004419.10139-1-sbel...@google.com/

Signed-off-by: Stefan Beller 
---
 Documentation/diff-options.txt | 10 +++
 diff.c | 21 -
 diff.h |  8 -
 diffcore-pickaxe.c | 45 +---
 revision.c |  3 ++
 t/t4064-diff-oidfind.sh| 68 ++
 6 files changed, 135 insertions(+), 20 deletions(-)
 create mode 100755 t/t4064-diff-oidfind.sh

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index dd0dba5b1d..f9cf85db05 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -492,6 +492,15 @@ occurrences of that string did not change).
 See the 'pickaxe' entry in linkgit:gitdiffcore[7] for more
 information.
 
+--find-object=::
+   Look for differences that change the number of occurrences of
+   the specified object. Similar to `-S`, just the argument is different
+   in that it doesn't search for a specific string but for a specific
+   object id.
++
+The object can be a blob or a submodule commit. It implies the `-t` option in
+`git-log` to also find trees.
+
 --pickaxe-all::
When `-S` or `-G` finds a change, show all the changes in that
changeset, not just the files that contain the change
@@ -500,6 +509,7 @@ information.
 --pickaxe-regex::
Treat the  given to `-S` as an extended POSIX regular
expression to match.
+
 endif::git-format-patch[]
 
 -O::
diff --git a/diff.c b/diff.c
index 5508745dc8..a872bdcac1 100644
--- a/diff.c
+++ b/diff.c
@@ -4082,6 +4082,7 @@ void diff_setup(struct diff_options *options)
options->interhunkcontext = diff_interhunk_context_default;
options->ws_error_highlight = ws_error_highlight_default;
options->flags.rename_empty = 1;
+   options->objfind = NULL;
 
/* pathchange left =NULL by default */
options->change = diff_change;
@@ -4487,6 +4488,23 @@ static int parse_ws_error_highlight_opt(struct 
diff_options *opt, const char *ar
return 1;
 }
 
+static int parse_objfind_opt(struct diff_options *opt, const char *arg)
+{
+   struct object_id oid;
+
+   if (get_oid(arg, ))
+   return error("unable to resolve '%s'", arg);
+
+   if (!opt->objfind)
+   opt->objfind = xcalloc(1, sizeof(*opt->objfind));
+
+   opt->pickaxe_opts |= DIFF_PICKAXE_KIND_OBJFIND;
+   opt->flags.recursive = 1;
+   opt->flags.tree_in_recursive = 1;
+   oidset_insert(opt->objfind, );
+   return 1;
+}
+
 int diff_opt_parse(struct diff_options *options,
   const char **av, int ac, const char *prefix)
 {
@@ -4736,7 +4754,8 @@ int diff_opt_parse(struct diff_options *options,
else if ((argcount = short_opt('O', av, ))) {
options->orderfile = prefix_filename(prefix, optarg);
return argcount;
-   }
+   } else if (skip_prefix(arg, "--find-object=", ))
+   return parse_objfind_opt(options, arg);
else if ((argcount = parse_long_opt("diff-filter", av, ))) {
int offending = parse_diff_filter_opt(optarg, options);
if (offending)
diff --git a/diff.h b/diff.h
index 9ec4f824fe..8a56cac2ad 100644
--- a/diff.h
+++ b/diff.h
@@ -7,6 +7,7 @@
 #include "tree-walk.h"
 #include "pathspec.h"
 #include "object.h"
+#include "oidset.h"
 
 struct rev_info;
 struct diff_options;
@@ -173,6 +174,8 @@ struct diff_options {
enum diff_words_type word_diff;
enum diff_submodule_format submodule_format;
 
+   struct oidset 

[PATCHv2 1/6] diff.h: Make pickaxe_opts an unsigned bit field

2018-01-04 Thread Stefan Beller
This variable is used as a bit field[1], and as we are about to add more
fields, indicate its usage as a bit field by making it unsigned.

[1] containing the bits

#define DIFF_PICKAXE_ALL1
#define DIFF_PICKAXE_REGEX  2
#define DIFF_PICKAXE_KIND_S 4
#define DIFF_PICKAXE_KIND_G 8

Signed-off-by: Stefan Beller 
---
 diff.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diff.h b/diff.h
index 0fb18dd735..ea310f76fd 100644
--- a/diff.h
+++ b/diff.h
@@ -146,7 +146,7 @@ struct diff_options {
int skip_stat_unmatch;
int line_termination;
int output_format;
-   int pickaxe_opts;
+   unsigned pickaxe_opts;
int rename_score;
int rename_limit;
int needed_rename_limit;
-- 
2.16.0.rc0.223.g4a4ac83678-goog



[PATCHv2 0/6] pickaxe refactorings and a new mode to find blobs (WAS: diffcore: add a pickaxe option to find a specific blob)

2018-01-04 Thread Stefan Beller
v2:
Thanks Junio and Jacob for review!
* fixed up the last patch to rephrase the error message to contain an 'and'
* use HAS_MULTI_BITS as well
* a bonus patch that uses HAS_MULTI_BITS for the existing code, too.

v1
After some discussion [1], we're convinced that the original approach for
adding in just another pickaxe mode to find blobs was too hacky.

So I went the less hacky way and did some refactoring first (patches 1-3),
Then we'll have the new pickaxe mode to find blobs in patch 4. It grew
slightly larger as it had issues with the setup (we neither have a regex
nor a KWS to init) in this new world, so there are a few more lines in there.

The last patch is just the cherry on the cake, helping to keep users sane by
warning when they try to use different pickaxe modes at the same time.

Thanks,
Stefan


[1] 
https://public-inbox.org/git/cagz79kab0g9zetf6qtc45+zglm3gosywv7e+gkce2ykohb0...@mail.gmail.com/


Stefan Beller (6):
  diff.h: Make pickaxe_opts an unsigned bit field
  diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit
  diff: introduce DIFF_PICKAXE_KINDS_MASK
  diffcore: add a pickaxe option to find a specific blob
  diff: properly error out when combining multiple pickaxe options
  diff: use HAS_MULTI_BITS instead of counting bits manually

 Documentation/diff-options.txt | 10 +++
 builtin/log.c  |  4 +--
 combine-diff.c |  2 +-
 diff.c | 43 ++
 diff.h | 13 ++--
 diffcore-pickaxe.c | 48 -
 revision.c |  7 +++--
 t/t4064-diff-oidfind.sh| 68 ++
 8 files changed, 155 insertions(+), 40 deletions(-)
 create mode 100755 t/t4064-diff-oidfind.sh

-- 
2.16.0.rc0.223.g4a4ac83678-goog



Re: [PATCH 6/6] fsmonitor: Use fsmonitor data in `git diff`

2018-01-04 Thread Johannes Schindelin
Hi Alex,

On Tue, 2 Jan 2018, Alex Vandiver wrote:

> diff --git a/diff-lib.c b/diff-lib.c
> index 8104603a3..13ff00d81 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -95,6 +95,9 @@ int run_diff_files(struct rev_info *revs, unsigned int 
> option)
>  
>   diff_set_mnemonic_prefix(>diffopt, "i/", "w/");
>  
> + if (!(option & DIFF_SKIP_FSMONITOR))
> + refresh_fsmonitor(_index);
> +
>   if (diff_unmerged_stage < 0)
>   diff_unmerged_stage = 2;

I read over this hunk five times, and only now am I able to wrap my head
around this: if we do *not* want to skip the fsmonitor data, we refresh
the fsmonitor data in the index.

That feels a bit like an unneeded double negation. Speaking for myself, I
would prefore `DIFF_IGNORE_FSMONITOR` instead, it would feel less like a
double negation then. But I am not a native speaker, so I might be wrong.

> +   if (ce->ce_flags & CE_FSMONITOR_VALID && !(option & 
> DIFF_SKIP_FSMONITOR))
> +   continue;

Since we do expect this to be called without the DIFF_SKIP_FSMONITOR flag,
I guess it makes sense to order it this way.

I still have troubles to understand why we ignore the fsmonitor data with
`git add`, though... we want to add only modified files, right? I thought
that the fsmonitor data could help performance exactly there (I am
thinking of a certain insanely large code base where a developer might
want to change only one or maybe 3 files out of an entire machine workshop
of files, and with fsmonitor it should be a really fast operation because
it should ignore all but those few files, right?)... Could you maybe try
to help me understand that better?

Thanks,
Johannes


[PATCHv3 3/4] builtin/blame: add option to color metadata fields separately

2018-01-04 Thread Stefan Beller
Unlike the previous commit, this dims colors for each
metadata field individually.

Signed-off-by: Stefan Beller 
---
 builtin/blame.c | 83 -
 t/t8012-blame-colors.sh | 38 ++
 2 files changed, 113 insertions(+), 8 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 7b9c6e8676..60c8dadf4b 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -323,6 +323,7 @@ static const char *format_time(timestamp_t time, const char 
*tz_str,
 #define OUTPUT_SHOW_EMAIL  0400
 #define OUTPUT_LINE_PORCELAIN  01000
 #define OUTPUT_COLOR_LINE  02000
+#define OUTPUT_COLOR_FIELDS04000
 
 static void emit_porcelain_details(struct blame_origin *suspect, int repeat)
 {
@@ -370,7 +371,56 @@ static void emit_porcelain(struct blame_scoreboard *sb, 
struct blame_entry *ent,
putchar('\n');
 }
 
-static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, 
int opt)
+static int had_same_field_previously(int opt, int field,
+ struct blame_entry *ent,
+ struct blame_entry *prev)
+{
+   struct commit_info ci, prev_ci;
+
+   switch (field) {
+   case OUTPUT_SHOW_SCORE:
+   return ent->score == prev->score;
+   case OUTPUT_SHOW_NAME:
+   return prev->suspect &&
+   !strcmp(ent->suspect->path, prev->suspect->path);
+   case OUTPUT_SHOW_NUMBER:
+   return ent->s_lno == prev->s_lno + prev->num_lines - 1;
+
+   case OUTPUT_NO_AUTHOR:
+   get_commit_info(ent->suspect->commit, , 1);
+   get_commit_info(prev->suspect->commit, _ci, 1);
+   return ((opt & OUTPUT_SHOW_EMAIL) &&
+   !strcmp(ci.author_mail.buf, prev_ci.author_mail.buf)) ||
+   !strcmp(ci.author.buf, prev_ci.author.buf);
+   default:
+   BUG("unknown field");
+   }
+   return 0;
+}
+
+static void setup_field_color(int opt, int cnt, int field,
+ struct blame_entry *ent,
+ struct blame_entry *prev,
+ const char **use_color,
+ const char **reset_color)
+{
+   if (!(opt & OUTPUT_COLOR_FIELDS))
+   return;
+
+   if ((cnt > 0 ||
+(prev && had_same_field_previously(opt, field, ent, prev {
+   *use_color = repeated_meta_color;
+   *reset_color = GIT_COLOR_RESET;
+   } else {
+   *use_color = "";
+   *reset_color = "";
+   }
+}
+
+static void emit_other(struct blame_scoreboard *sb,
+  struct blame_entry *ent,
+  struct blame_entry *prev,
+  int opt)
 {
int cnt;
const char *cp;
@@ -414,18 +464,27 @@ static void emit_other(struct blame_scoreboard *sb, 
struct blame_entry *ent, int
   show_raw_time),
   ent->lno + 1 + cnt);
} else {
-   if (opt & OUTPUT_SHOW_SCORE)
+   if (opt & OUTPUT_SHOW_SCORE) {
+   setup_field_color(opt, cnt, OUTPUT_SHOW_SCORE,
+ ent, prev, , );
printf(" %s%*d %02d%s", color,
   max_score_digits, ent->score,
   ent->suspect->refcnt, reset);
-   if (opt & OUTPUT_SHOW_NAME)
+   }
+   if (opt & OUTPUT_SHOW_NAME) {
+   setup_field_color(opt, cnt, OUTPUT_SHOW_NAME,
+  ent, prev, , );
printf(" %s%-*.*s%s", color, longest_file,
  longest_file,
  suspect->path,
  reset);
-   if (opt & OUTPUT_SHOW_NUMBER)
+   }
+   if (opt & OUTPUT_SHOW_NUMBER) {
+   setup_field_color(opt, cnt, OUTPUT_SHOW_NUMBER,
+ ent, prev, , );
printf(" %s%*d%s", color, max_orig_digits,
   ent->s_lno + 1 + cnt, reset);
+   }
if (!(opt & OUTPUT_NO_AUTHOR)) {
const char *name;
int pad;
@@ -434,6 +493,8 @@ static void emit_other(struct blame_scoreboard *sb, struct 
blame_entry *ent, int
else
name = ci.author.buf;
pad = longest_author - 

[PATCHv3 4/4] builtin/blame: highlight recently changed lines

2018-01-04 Thread Stefan Beller
Choose a different color for dates and imitate a 'temperature cool down'
depending upon age.

Originally I had planned to have the temperature cooldown dependent on
the age of the project or file for example, as that might scale better,
but that can be added on top of this commit, e.g. instead of giving a
date, you could imagine giving a percentage that would be the linearly
interpolated between now and the beginning of the file.

Signed-off-by: Stefan Beller 
---
 Documentation/config.txt | 18 
 builtin/blame.c  | 74 +++-
 2 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 45749a574d..aa967ce3ed 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1221,6 +1221,24 @@ color.blame.repeatedMeta::
is repeated meta information per line (such as commit id,
author name, date and timezone). Defaults to dark gray.
 
+color.blame.highlightRecent::
+   This can be used to color the author and date of a blame line.
+   This overrides `color.blame.repeatedMeta` setting, which colors
+   repetitions.
++
+This setting should be set to a comma-separated list of color and date 
settings,
+starting and ending with a color, the dates should be set from oldest to 
newest.
+The metadata will be colored given the colors if the the line was introduced
+before the given timestamp, overwriting older timestamped colors.
++
+Instead of an absolute timestamp relative timestamps work as well, e.g.
+2.weeks.ago is valid to address anything older than 2 weeks.
++
+It defaults to "blue,12 month ago,white,1 month ago,red", which colors
+everything older than one year blue, recent changes between one month and
+one year old are kept white, and lines introduced within the last month are
+colored red.
+
 color.ui::
This variable determines the default value for variables such
as `color.diff` and `color.grep` that control the use of color
diff --git a/builtin/blame.c b/builtin/blame.c
index 60c8dadf4b..3d444e66b3 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -24,6 +24,7 @@
 #include "dir.h"
 #include "progress.h"
 #include "blame.h"
+#include "string-list.h"
 
 static char blame_usage[] = N_("git blame [] [] [] 
[--] ");
 
@@ -324,6 +325,7 @@ static const char *format_time(timestamp_t time, const char 
*tz_str,
 #define OUTPUT_LINE_PORCELAIN  01000
 #define OUTPUT_COLOR_LINE  02000
 #define OUTPUT_COLOR_FIELDS04000
+#define OUTPUT_HEATED_LINES01
 
 static void emit_porcelain_details(struct blame_origin *suspect, int repeat)
 {
@@ -417,6 +419,61 @@ static void setup_field_color(int opt, int cnt, int field,
}
 }
 
+static struct color_field {
+   timestamp_t hop;
+   char col[COLOR_MAXLEN];
+} *colorfield;
+static int colorfield_nr, colorfield_alloc;
+
+static void parse_color_fields(const char *s)
+{
+   struct string_list l = STRING_LIST_INIT_DUP;
+   struct string_list_item *item;
+   enum { EXPECT_DATE, EXPECT_COLOR } next = EXPECT_COLOR;
+
+   /* Ideally this would be stripped and split at the same time? */
+   string_list_split(, s, ',', -1);
+   ALLOC_GROW(colorfield, colorfield_nr + 1, colorfield_alloc);
+
+   for_each_string_list_item(item, ) {
+   switch (next) {
+   case EXPECT_DATE:
+   colorfield[colorfield_nr].hop = 
approxidate(item->string);
+   next = EXPECT_COLOR;
+   colorfield_nr++;
+   ALLOC_GROW(colorfield, colorfield_nr + 1, 
colorfield_alloc);
+   break;
+   case EXPECT_COLOR:
+   if (color_parse(item->string, 
colorfield[colorfield_nr].col))
+   die(_("expecting a color: %s"), item->string);
+   next = EXPECT_DATE;
+   break;
+   }
+   }
+
+   if (next == EXPECT_COLOR)
+   die (_("must end with a color"));
+
+   colorfield[colorfield_nr].hop = TIME_MAX;
+}
+
+static void setup_default_colorfield(void)
+{
+   parse_color_fields("blue,12 month ago,white,1 month ago,red");
+}
+
+static void determine_line_heat(struct blame_entry *ent, const char 
**dest_color)
+{
+   int i = 0;
+   struct commit_info ci;
+   get_commit_info(ent->suspect->commit, , 1);
+
+   while (i < colorfield_nr && ci.author_time > colorfield[i].hop)
+   i++;
+
+   *dest_color = colorfield[i].col;
+}
+
 static void emit_other(struct blame_scoreboard *sb,
   struct blame_entry *ent,
   struct blame_entry *prev,
@@ -424,6 +481,7 @@ static void emit_other(struct blame_scoreboard *sb,
 {
int cnt;
const char *cp;
+   const char *heatcolor = NULL;
struct blame_origin *suspect = ent->suspect;
struct commit_info 

[PATCHv3 1/4] color.h: document and modernize header

2018-01-04 Thread Stefan Beller
Add documentation explaining the functions in color.h.
While at it, mark them extern.

Signed-off-by: Stefan Beller 
---
 color.c |  2 --
 color.h | 58 --
 2 files changed, 44 insertions(+), 16 deletions(-)

diff --git a/color.c b/color.c
index d48dd947c9..c4dc1cb989 100644
--- a/color.c
+++ b/color.c
@@ -399,8 +399,6 @@ static int color_vfprintf(FILE *fp, const char *color, 
const char *fmt,
return r;
 }
 
-
-
 int color_fprintf(FILE *fp, const char *color, const char *fmt, ...)
 {
va_list args;
diff --git a/color.h b/color.h
index fd2b688dfb..2e768a10c6 100644
--- a/color.h
+++ b/color.h
@@ -72,26 +72,56 @@ extern int color_stdout_is_tty;
  * Use the first one if you need only color config; the second is a convenience
  * if you are just going to change to git_default_config, too.
  */
-int git_color_config(const char *var, const char *value, void *cb);
-int git_color_default_config(const char *var, const char *value, void *cb);
+extern int git_color_config(const char *var, const char *value, void *cb);
+extern int git_color_default_config(const char *var, const char *value, void 
*cb);
 
 /*
- * Set the color buffer (which must be COLOR_MAXLEN bytes)
- * to the raw color bytes; this is useful for initializing
- * default color variables.
+ * NEEDSWWORK: document this function or refactor grep.c to stop using this
+ * function.
  */
-void color_set(char *dst, const char *color_bytes);
+extern void color_set(char *dst, const char *color_bytes);
 
-int git_config_colorbool(const char *var, const char *value);
-int want_color(int var);
-int color_parse(const char *value, char *dst);
-int color_parse_mem(const char *value, int len, char *dst);
+/*
+ * Parse a config option, which can be a boolean or one of
+ * "never", "auto", "always". Return a constant of
+ * GIT_COLOR_NEVER for "never" or negative boolean,
+ * GIT_COLOR_ALWAYS for "always" or a positive boolean,
+ * and GIT_COLOR_AUTO for "auto".
+ */
+extern int git_config_colorbool(const char *var, const char *value);
+
+/*
+ * Resolve the constants as returned by git_config_colorbool()
+ * (specifically "auto") to a boolean answer.
+ */
+extern int want_color(int var);
+
+/*
+ * Translate a Git color from 'value' into a string that the terminal can
+ * interpret and store it into 'dst'. The Git color values are of the form
+ * "foreground [background] [attr]" where fore- and background can be a color
+ * name ("red"), a RGB code (#0xFF) or a 256-color-mode from the terminal.
+ */
+extern int color_parse(const char *value, char *dst);
+extern int color_parse_mem(const char *value, int len, char *dst);
+
+/*
+ * Output the formatted string in the specified color (and then reset to normal
+ * color so subsequent output is uncolored). Omits the color encapsulation if
+ * `color` is NULL. The `color_fprintf_ln` prints a new line after resetting
+ * the color. The `color_print_strbuf` prints the given pre-formatted strbuf
+ * instead, up to its first NUL character.
+ */
 __attribute__((format (printf, 3, 4)))
-int color_fprintf(FILE *fp, const char *color, const char *fmt, ...);
+extern int color_fprintf(FILE *fp, const char *color, const char *fmt, ...);
 __attribute__((format (printf, 3, 4)))
-int color_fprintf_ln(FILE *fp, const char *color, const char *fmt, ...);
-void color_print_strbuf(FILE *fp, const char *color, const struct strbuf *sb);
+extern int color_fprintf_ln(FILE *fp, const char *color, const char *fmt, ...);
+extern void color_print_strbuf(FILE *fp, const char *color, const struct 
strbuf *sb);
 
-int color_is_nil(const char *color);
+/*
+ * Check if the given color is GIT_COLOR_NIL that means "no color selected".
+ * The caller needs to replace the color with the actual desired color.
+ */
+extern int color_is_nil(const char *color);
 
 #endif /* COLOR_H */
-- 
2.16.0.rc0.223.g4a4ac83678-goog



[PATCHv3 0/4] blame: (dim rep. metadata lines or fields, decay date coloring)

2018-01-04 Thread Stefan Beller
v3:

Thanks Eric for feedback, I implemented all of the suggestions.
Specifically I dropped the abstractions in patch2 but keep around a similar
abstraction in patch 3 as that still looks like it benefits (the condition
is just growing large).

Thanks,
Stefan

v2:
This is picking up [1], but presenting it in another approach,
as I realized these are orthogonal features:
* dimming repeated lines/fields of information
* giving a quick visual information how old (as a proxy for 'well tested')
  a line of code is.

Both features are configurable.

Changes from sending it out in November:
* better commit messages
* rebased on master

Any feedback welcome.

Thanks,
Stefan

[1] https://public-inbox.org/git/20171110011002.10179-1-sbel...@google.com/

Stefan Beller (4):
  color.h: document and modernize header
  builtin/blame: dim uninteresting metadata
  builtin/blame: add option to color metadata fields separately
  builtin/blame: highlight recently changed lines

 Documentation/config.txt |  23 ++
 builtin/blame.c  | 201 ++-
 color.c  |   2 -
 color.h  |  59 ++
 t/t8012-blame-colors.sh  |  56 +
 5 files changed, 305 insertions(+), 36 deletions(-)
 create mode 100755 t/t8012-blame-colors.sh

-- 
2.16.0.rc0.223.g4a4ac83678-goog



[PATCHv3 2/4] builtin/blame: dim uninteresting metadata

2018-01-04 Thread Stefan Beller
When using git-blame lots of lines contain redundant information, for
example in hunks that consist of multiple lines, the metadata (commit
name, author, date) are repeated. A reader may not be interested in those,
so offer an option to color the information that is repeated from the
previous line differently.

Signed-off-by: Stefan Beller 
---
 Documentation/config.txt |  5 +
 builtin/blame.c  | 50 ++--
 color.h  |  1 +
 t/t8012-blame-colors.sh  | 18 +
 4 files changed, 60 insertions(+), 14 deletions(-)
 create mode 100755 t/t8012-blame-colors.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 64c1dbba94..45749a574d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1216,6 +1216,11 @@ color.status.::
status short-format), or
`unmerged` (files which have unmerged changes).
 
+color.blame.repeatedMeta::
+   Use the customized color for the part of git-blame output that
+   is repeated meta information per line (such as commit id,
+   author name, date and timezone). Defaults to dark gray.
+
 color.ui::
This variable determines the default value for variables such
as `color.diff` and `color.grep` that control the use of color
diff --git a/builtin/blame.c b/builtin/blame.c
index 005f55aaa2..7b9c6e8676 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -7,6 +7,7 @@
 
 #include "cache.h"
 #include "config.h"
+#include "color.h"
 #include "builtin.h"
 #include "commit.h"
 #include "diff.h"
@@ -46,6 +47,7 @@ static int xdl_opts;
 static int abbrev = -1;
 static int no_whole_file_rename;
 static int show_progress;
+static char *repeated_meta_color;
 
 static struct date_mode blame_date_mode = { DATE_ISO8601 };
 static size_t blame_date_width;
@@ -316,10 +318,11 @@ static const char *format_time(timestamp_t time, const 
char *tz_str,
 #define OUTPUT_PORCELAIN   010
 #define OUTPUT_SHOW_NAME   020
 #define OUTPUT_SHOW_NUMBER 040
-#define OUTPUT_SHOW_SCORE  0100
-#define OUTPUT_NO_AUTHOR   0200
+#define OUTPUT_SHOW_SCORE  0100
+#define OUTPUT_NO_AUTHOR   0200
 #define OUTPUT_SHOW_EMAIL  0400
-#define OUTPUT_LINE_PORCELAIN 01000
+#define OUTPUT_LINE_PORCELAIN  01000
+#define OUTPUT_COLOR_LINE  02000
 
 static void emit_porcelain_details(struct blame_origin *suspect, int repeat)
 {
@@ -383,6 +386,7 @@ static void emit_other(struct blame_scoreboard *sb, struct 
blame_entry *ent, int
for (cnt = 0; cnt < ent->num_lines; cnt++) {
char ch;
int length = (opt & OUTPUT_LONG_OBJECT_NAME) ? GIT_SHA1_HEXSZ : 
abbrev;
+   const char *color = "", *reset = "";
 
if (suspect->commit->object.flags & UNINTERESTING) {
if (blank_boundary)
@@ -393,7 +397,12 @@ static void emit_other(struct blame_scoreboard *sb, struct 
blame_entry *ent, int
}
}
 
-   printf("%.*s", length, hex);
+   if ((opt & OUTPUT_COLOR_LINE) && cnt > 0) {
+   color = repeated_meta_color;
+   reset = GIT_COLOR_RESET;
+   }
+
+   printf("%s%.*s%s", color, length, hex, reset);
if (opt & OUTPUT_ANNOTATE_COMPAT) {
const char *name;
if (opt & OUTPUT_SHOW_EMAIL)
@@ -406,16 +415,17 @@ static void emit_other(struct blame_scoreboard *sb, 
struct blame_entry *ent, int
   ent->lno + 1 + cnt);
} else {
if (opt & OUTPUT_SHOW_SCORE)
-   printf(" %*d %02d",
+   printf(" %s%*d %02d%s", color,
   max_score_digits, ent->score,
-  ent->suspect->refcnt);
+  ent->suspect->refcnt, reset);
if (opt & OUTPUT_SHOW_NAME)
-   printf(" %-*.*s", longest_file, longest_file,
-  suspect->path);
+   printf(" %s%-*.*s%s", color, longest_file,
+ longest_file,
+ suspect->path,
+ reset);
if (opt & OUTPUT_SHOW_NUMBER)
-   printf(" %*d", max_orig_digits,
-  ent->s_lno + 1 + cnt);
-
+   printf(" %s%*d%s", color, max_orig_digits,
+  ent->s_lno + 1 + cnt, reset);
if (!(opt & OUTPUT_NO_AUTHOR)) {
const char *name;
int pad;
@@ -424,11 +434,12 @@ static void 

Re: [PATCH 4/6] fsmonitor: Make output of test-dump-fsmonitor more concise

2018-01-04 Thread Johannes Schindelin
Hi Alex,

On Tue, 2 Jan 2018, Alex Vandiver wrote:

> Rather than display one very long line, summarize the contents of that
> line.  The tests do not currently rely on any content except the first
> line ("no fsmonitor" / "fsmonitor last update").

The more interesting part would be the entries with outdated ("invalid")
information. I thought that this information was pretty useful for
debugging. Maybe we could still keep at least that part, or at least
trigger outputting it via a command-line flag?

Ciao,
Johannes


Re: [PATCH 2/6] fsmonitor: Stop inline'ing mark_fsmonitor_valid / _invalid

2018-01-04 Thread Johannes Schindelin
Hi Alex,

On Tue, 2 Jan 2018, Alex Vandiver wrote:

> These were inline'd when they were first introduced, presumably as an
> optimization for cases when they were called in tight loops.  This
> complicates using these functions, as untracked_cache_invalidate_path
> is defined in dir.h.
> 
> Leave the inline'ing up to the compiler's decision, for ease of use.

As a compromise, you could leave the rather simple mark_fsmonitor_valid()
as inlined function. It should be by far the more-called function, anyway.

Ciao,
Johannes


Re: Rewrite cat-file.c : need help to find a bug

2018-01-04 Thread Оля Тележная
2017-12-29 17:04 GMT+03:00 Оля Тележная :
> 2017-12-29 16:22 GMT+03:00 Jeff King :
>> On Fri, Dec 29, 2017 at 01:05:50PM +0300, Оля Тележная wrote:
>>
>>> Hi everyone,
>>> I am trying to reuse formatting logic from ref-filter in cat-file
>>> command. Now cat-file uses its own formatting code.
>>> I am trying to achieve that step-by-step,

Happy to say that my previous bug is fixed, and I am struggling with
the next one.
Now I want to get rid of using expand_data structure (I took it from
cat-file.c to simplify migrating process and now it's time to delete
it).

>>> My code is here.
>>> https://github.com/telezhnaya/git/commits/catfile
>>> All commits that starts from "cat-file: " are successfully passing all 
>>> tests.

So for now 2 of my last commits fail, and I am tired of searching for the error.
I was also trying to leave cat_file_info variable and fill in both new
and old variables and then compare resulting values by printing them
into file. Everything is OK, but I find it dudpicious that the
resulting file is too small (fprintf was invoked only 3 times, it was
here: 
https://github.com/telezhnaya/git/commit/54a5b5a0167ad634c26e4fd7df234a46286ede0a#diff-2846189963e8aec1bcb559b69b7f20d0R1489)

I have left few comments in github to simplify your understanding what
I was trying to achieve. Feel free to ask any questions if you find
the code strange, unclear or suspicious.
Thanks!

Olga


Re: [RFC PATCH 2/2] Remove now useless email-address parsing code

2018-01-04 Thread Alex Bennée

Matthieu Moy  writes:

> We now use Mail::Address unconditionaly, hence parse_mailboxes is now
> dead code. Remove it and its tests.
>
> Signed-off-by: Matthieu Moy 
> ---
>  perl/Git.pm  | 71 
> 
>  t/t9000-addresses.sh | 27 
>  t/t9000/test.pl  | 67 -
>  3 files changed, 165 deletions(-)
>  delete mode 100755 t/t9000-addresses.sh
>  delete mode 100755 t/t9000/test.pl

Should we add the tests for t9001-send-email.sh to guard against regressions?

>
> diff --git a/perl/Git.pm b/perl/Git.pm
> index 02a3871..9d60d79 100644
> --- a/perl/Git.pm
> +++ b/perl/Git.pm
> @@ -880,77 +880,6 @@ sub ident_person {
>   return "$ident[0] <$ident[1]>";
>  }
>
> -=item parse_mailboxes
> -
> -Return an array of mailboxes extracted from a string.
> -
> -=cut
> -
> -# Very close to Mail::Address's parser, but we still have minor
> -# differences in some cases (see t9000 for examples).
> -sub parse_mailboxes {
> - my $re_comment = qr/\((?:[^)]*)\)/;
> - my $re_quote = qr/"(?:[^\"\\]|\\.)*"/;
> - my $re_word = qr/(?:[^]["\s()<>:;@\\,.]|\\.)+/;
> -
> - # divide the string in tokens of the above form
> - my $re_token = qr/(?:$re_quote|$re_word|$re_comment|\S)/;
> - my @tokens = map { $_ =~ /\s*($re_token)\s*/g } @_;
> - my $end_of_addr_seen = 0;
> -
> - # add a delimiter to simplify treatment for the last mailbox
> - push @tokens, ",";
> -
> - my (@addr_list, @phrase, @address, @comment, @buffer) = ();
> - foreach my $token (@tokens) {
> - if ($token =~ /^[,;]$/) {
> - # if buffer still contains undeterminated strings
> - # append it at the end of @address or @phrase
> - if ($end_of_addr_seen) {
> - push @phrase, @buffer;
> - } else {
> - push @address, @buffer;
> - }
> -
> - my $str_phrase = join ' ', @phrase;
> - my $str_address = join '', @address;
> - my $str_comment = join ' ', @comment;
> -
> - # quote are necessary if phrase contains
> - # special characters
> - if ($str_phrase =~ /[][()<>:;@\\,.\000-\037\177]/) {
> - $str_phrase =~ s/(^|[^\\])"/$1/g;
> - $str_phrase = qq["$str_phrase"];
> - }
> -
> - # add "<>" around the address if necessary
> - if ($str_address ne "" && $str_phrase ne "") {
> - $str_address = qq[<$str_address>];
> - }
> -
> - my $str_mailbox = "$str_phrase $str_address 
> $str_comment";
> - $str_mailbox =~ s/^\s*|\s*$//g;
> - push @addr_list, $str_mailbox if ($str_mailbox);
> -
> - @phrase = @address = @comment = @buffer = ();
> - $end_of_addr_seen = 0;
> - } elsif ($token =~ /^\(/) {
> - push @comment, $token;
> - } elsif ($token eq "<") {
> - push @phrase, (splice @address), (splice @buffer);
> - } elsif ($token eq ">") {
> - $end_of_addr_seen = 1;
> - push @address, (splice @buffer);
> - } elsif ($token eq "@" && !$end_of_addr_seen) {
> - push @address, (splice @buffer), "@";
> - } else {
> - push @buffer, $token;
> - }
> - }
> -
> - return @addr_list;
> -}
> -
>  =item hash_object ( TYPE, FILENAME )
>
>  Compute the SHA1 object id of the given C considering it is
> diff --git a/t/t9000-addresses.sh b/t/t9000-addresses.sh
> deleted file mode 100755
> index a1ebef6..000
> --- a/t/t9000-addresses.sh
> +++ /dev/null
> @@ -1,27 +0,0 @@
> -#!/bin/sh
> -
> -test_description='compare address parsing with and without Mail::Address'
> -. ./test-lib.sh
> -
> -if ! test_have_prereq PERL; then
> - skip_all='skipping perl interface tests, perl not available'
> - test_done
> -fi
> -
> -perl -MTest::More -e 0 2>/dev/null || {
> - skip_all="Perl Test::More unavailable, skipping test"
> - test_done
> -}
> -
> -perl -MMail::Address -e 0 2>/dev/null || {
> - skip_all="Perl Mail::Address unavailable, skipping test"
> - test_done
> -}
> -
> -test_external_has_tap=1
> -
> -test_external_without_stderr \
> - 'Perl address parsing function' \
> - perl "$TEST_DIRECTORY"/t9000/test.pl
> -
> -test_done
> diff --git a/t/t9000/test.pl b/t/t9000/test.pl
> deleted file mode 100755
> index dfeaa9c..000
> --- a/t/t9000/test.pl
> +++ /dev/null
> @@ -1,67 +0,0 @@
> -#!/usr/bin/perl
> -use lib (split(/:/, $ENV{GITPERLLIB}));
> -
> -use 

Re: [PATCH 2/4] builtin/blame: dim uninteresting metadata

2018-01-04 Thread Stefan Beller
On Thu, Dec 28, 2017 at 2:29 PM, Eric Sunshine  wrote:
> On Thu, Dec 28, 2017 at 4:03 PM, Stefan Beller  wrote:
>> When using git-blame lots of lines contain redundant information, for
>> example in hunks that consist of multiple lines, the metadata (commit
>> name, author, date) are repeated. A reader may not be interested in those,
>> so offer an option to color the information that is repeated from the
>> previous line differently.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>> diff --git a/builtin/blame.c b/builtin/blame.c
>> @@ -367,6 +370,28 @@ static void emit_porcelain(struct blame_scoreboard *sb, 
>> struct blame_entry *ent,
>> +static inline void colors_unset(const char **use_color, const char 
>> **reset_color)
>> +{
>> +   *use_color = "";
>> +   *reset_color = "";
>> +}
>> +
>> +static inline void colors_set(const char **use_color, const char 
>> **reset_color)
>> +{
>> +   *use_color = repeated_meta_color;
>> +   *reset_color = GIT_COLOR_RESET;
>> +}
>> +
>> +static void setup_line_color(int opt, int cnt,
>> +const char **use_color,
>> +const char **reset_color)
>> +{
>> +   colors_unset(use_color, reset_color);
>> +
>> +   if ((opt & OUTPUT_COLOR_LINE) && cnt > 0)
>> +   colors_set(use_color, reset_color);
>> +}
>
> I'm not convinced that this colors_unset() / colors_set() /
> setup_line_color() abstraction is buying much. With this abstraction,
> I found the code more difficult to reason about than if the colors
> were just set/unset manually in the code which needs the colors. I
> *could* perhaps imagine setup_line_color() existing as a separate
> function since it is slightly more complex than the other two, but as
> it has only a single caller through all patches, even that may not be
> sufficient to warrant its existence.

Have you viewed this patch in context of the following patch?
Originally I was convinced an abstraction was not needed, but
as the next patch shows, a helper per field seems handy.

>
>> @@ -383,6 +408,7 @@ static void emit_other(struct blame_scoreboard *sb, 
>> struct blame_entry *ent, int
>> for (cnt = 0; cnt < ent->num_lines; cnt++) {
>> char ch;
>> int length = (opt & OUTPUT_LONG_OBJECT_NAME) ? 
>> GIT_SHA1_HEXSZ : abbrev;
>> +   const char *col, *rcol;
>
> I can't help but read these as "column" and "[r]column"; the former,
> especially, is just too ingrained to interpret it any other way.
> Perhaps call these "color" and "reset" instead?

will fix.

>
>> @@ -607,6 +636,12 @@ static int git_blame_config(const char *var, const char 
>> *value, void *cb)
>> +   if (!strcmp(var, "color.blame.repeatedmeta")) {
>> +   if (color_parse_mem(value, strlen(value), 
>> repeated_meta_color))
>> +   warning(_("ignore invalid color '%s' in 
>> color.blame.repeatedMeta"),
>> +   value);
>
> Does this need to say "ignore"? If you drop that word, you still have
> a valid warning message.

dropped 'ignore'.

>
>> +   return 0;
>> +   }
>> @@ -681,6 +716,7 @@ int cmd_blame(int argc, const char **argv, const char 
>> *prefix)
>> OPT_BIT('s', NULL, _option, N_("Suppress author name 
>> and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),
>> OPT_BIT('e', "show-email", _option, N_("Show author 
>> email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),
>> OPT_BIT('w', NULL, _opts, N_("Ignore whitespace 
>> differences"), XDF_IGNORE_WHITESPACE),
>> +   OPT_BIT(0, "color-lines", _option, N_("color 
>> redundant metadata from previous line"), OUTPUT_COLOR_LINE),
>
> Not sure what this help message means. Do you mean "color redundant
> ... _differently_ ..."? Or "_dim_ redundant..."?

Originally (in a patch set a couple months back) I had 'dim', but Junio
seems to have an interesting color setup, such that he would not call
it dimming IIRC, hence I think I wanted to say color _differently_. Fixed.

>> diff --git a/t/t8012-blame-colors.sh b/t/t8012-blame-colors.sh
>> @@ -0,0 +1,18 @@
>> +#!/bin/sh
>> +
>> +test_description='colored git blame'
>> +. ./test-lib.sh
>> +
>> +PROG='git blame -c'
>> +. "$TEST_DIRECTORY"/annotate-tests.sh
>> +
>> +test_expect_success 'colored blame colors continuous lines' '
>
> What are "continuous lines"? Did you mean "contiguous"?

Thanks for hinting at the subtle difference!

Thanks for the review!
(I will drop the abstraction and see how it goes)

Stefan


Re: [PATCH v3 2/5] status: add --[no-]ahead-behind to status and commit for V2 format.

2018-01-04 Thread Junio C Hamano
Jeff Hostetler  writes:

> + sti = stat_tracking_info(branch, _ahead, _behind,
> +  , s->ahead_behind_flags);
>   if (base) {
>   base = shorten_unambiguous_ref(base, 0);
>   fprintf(s->fp, "# branch.upstream %s%c", base, eol);
>   free((char *)base);
>  
> - if (ab_info)
> - fprintf(s->fp, "# branch.ab +%d -%d%c", 
> nr_ahead, nr_behind, eol);
> + if (sti >= 0) {
> + if (s->ahead_behind_flags == AHEAD_BEHIND_FULL)
> + fprintf(s->fp, "# branch.ab +%d -%d%c",
> + nr_ahead, nr_behind, eol);
> + else if (s->ahead_behind_flags == 
> AHEAD_BEHIND_QUICK)
> + fprintf(s->fp, "# branch.equal %s%c",
> + sti ? "neq" : "eq", eol);

This is related to what I said in the review of [1/5], but this
arrangement to require the caller to know that _QUICK request
results in incomplete information smells like a bit of maintenance
hassle.

We'd rather allow the caller to tell if it was given incomplete
information from the values returned by stat_tracking_info()
function (notice the plural "values" here; what is returned in
nr_{ahead,behind} also counts).  For example, we can keep the (-1 =>
"no relation", 0 => "identical", 1 => "different") as return values
of the function, but have it clear nr_{ahead,behind} when it only
knows the two are different but not by how many commits.  That way,
this step can instead do:

ab_info = stat_tracking_info(...);
if (base) {
... do the base thing ...
if (ab_info > 0) {
/* different */
if (nr_ahead || nr_behind)
fprintf(... +%d -%d ... nr_ahead, nr_behind, 
...);
else
fprintf(... "neq" ...);
} else if (!ab_info) {
/* same */
fprintf(... +%d -%d ... nr_ahead, nr_behind, ...);
}
...

That would allow us to later add different kinds of _QUICK that do
not give full information without having to update this consumer
with something like

-   else if (s->ahead_behind_flags == AHEAD_BEHIND_QUICK)
+   else if (s->ahead_behind_flags == AHEAD_BEHIND_QUICK ||
+s->ahead_behind_flags == AHEAD_BEHIND_QUICK_DIFFERENTLY)

when it happens.

Two related tangents.  

 - I do not see why "eq" need to exist at all.  Even in _QUICK mode,
   when you determine that two are identical, you know your branch
   is ahead and behind by 0 commit each.

 - A short-format that is easy to parse by machines does not have to
   be cryptic to humans.  s/neq/not-equal/ or something may be an
   improvement.

Thanks.


Previous hunk in different file during add --patch

2018-01-04 Thread Robert Dailey
I keep expecting that pressing K during patch add that it will cross
file boundaries and go to previous hunks in files before the current
one. However, it reports "no hunks" when reaching the top hunk in the
current file. Is there a way to go to the previous file?


Re: [PATCH v3 1/5] stat_tracking_info: return +1 when branches not equal

2018-01-04 Thread Junio C Hamano
Jeff Hostetler  writes:

> - if (stat_tracking_info(branch, _ours,
> -_theirs, NULL)) {
> + if (stat_tracking_info(branch, _ours, _theirs,
> +NULL, AHEAD_BEHIND_FULL) < 0) {
> ...
> - if (stat_tracking_info(branch, _ours,
> -_theirs, NULL))
> + if (stat_tracking_info(branch, _ours, _theirs,
> +NULL, AHEAD_BEHIND_FULL) < 0)

Mental note: any code that reacted to stat_tracking_info() returning
non-zero was reacting to "no useful info in num_{ours,theirs}".
They now have to compare the returned value with "< 0" for the same
purpose.

> ...
>   * Returns -1 if num_ours and num_theirs could not be filled in (e.g., no
> - * upstream defined, or ref does not exist), 0 otherwise.
> + * upstream defined, or ref does not exist).  Returns 0 if the commits are
> + * identical.  Returns 1 if commits are different.
>   */
>  int stat_tracking_info(struct branch *branch, int *num_ours, int *num_theirs,
> -const char **upstream_name)
> +const char **upstream_name, enum ahead_behind_flags abf)
>  {
>   struct object_id oid;
>   struct commit *ours, *theirs;
> @@ -2019,6 +2026,8 @@ int stat_tracking_info(struct branch *branch, int 
> *num_ours, int *num_theirs,
>   *num_theirs = *num_ours = 0;
>   return 0;
>   }
> + if (abf == AHEAD_BEHIND_QUICK)
> + return 1;
> ...
>   argv_array_clear();
> - return 0;
> + return 1;
>  }

When a caller gets +1 from the function, it is not safe to peek into
*num_{ours,theirs} if it passed _QUICK.

> @@ -2064,7 +2073,8 @@ int format_tracking_info(struct branch *branch, struct 
> strbuf *sb)
> - if (stat_tracking_info(branch, , , _base) < 0) {
> + if (stat_tracking_info(branch, , , _base,
> +AHEAD_BEHIND_FULL) < 0) {

Sane conversion to the new return value convention.

> diff --git a/wt-status.c b/wt-status.c
> index 94e5eba..8f7fdc6 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -1791,7 +1791,8 @@ static void wt_shortstatus_print_tracking(struct 
> wt_status *s)
>  
>   color_fprintf(s->fp, branch_color_local, "%s", branch_name);
>  
> - if (stat_tracking_info(branch, _ours, _theirs, ) < 0) {
> + if (stat_tracking_info(branch, _ours, _theirs, ,
> +AHEAD_BEHIND_FULL) < 0) {

Ditto.

> @@ -1928,7 +1929,8 @@ static void wt_porcelain_v2_print_tracking(struct 
> wt_status *s)
>   /* Lookup stats on the upstream tracking branch, if set. */
>   branch = branch_get(branch_name);
>   base = NULL;
> - ab_info = (stat_tracking_info(branch, _ahead, _behind, 
> ) == 0);
> + ab_info = (stat_tracking_info(branch, _ahead, _behind,
> +   , AHEAD_BEHIND_FULL) >= 0);

If a later step plans to (conditionally) allow _QUICK to be passed
here, this conversion is questionable, because ab_info being true
no longer is a sign that nr_{ahead,behind} are valid.

I suspect that moving the "s/ab_info/sti/" bits around here from
step [2/5] to this commit may make the result after this patch more
consistent, but it is not a big deal either way.

>   if (base) {
>   base = shorten_unambiguous_ref(base, 0);
>   fprintf(s->fp, "# branch.upstream %s%c", base, eol);



Re: [ANNOUNCE] Git v2.16.0-rc0

2018-01-04 Thread Paul Smith
On Thu, 2018-01-04 at 20:18 +, Thomas Gummerer wrote:
> On 12/29, Paul Smith wrote:
> > On Thu, 2017-12-28 at 20:30 -0800, Junio C Hamano wrote:
> > >   * The way "git worktree add" determines what branch to create
> > > from where and checkout in the new worktree has been updated a
> > > bit.
> > 
> > Does this include the enhancements published a few weeks ago to
> > allow worktrees to be created directly from remote branches without
> > first checking out the branch locally? I'm really looking forward
> > to that change...
> 
> Yes, this release will include that.  It would be awesome if you
> could test the rc, now is the best time to scream if something about
> it is not as you'd expect :)

OK, I pulled and built this locally, and indeed the straightforward
"git worktree add  " works just as I'd hoped!  Nice!

I'll play with it more and let you know if I hit issues.


[PATCH] rebase -p: fix quoting when calling `git merge`

2018-01-04 Thread Johannes Schindelin
It has been reported that strategy arguments are not passed to `git
merge` correctly when rebasing interactively, preserving merges.

The reason is that the strategy arguments are already quoted, and then
quoted again.

This fixes https://github.com/git-for-windows/git/issues/1321

Original-patch-by: Kim Gybels 
Also-reported-by: Matwey V. Kornilov 
Signed-off-by: Johannes Schindelin 
---
 git-rebase--interactive.sh |  9 ++---
 t/t3418-rebase-continue.sh | 14 ++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index e3f5a0abf3c..085aa068cbb 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -392,9 +392,12 @@ pick_one_preserving_merges () {
new_parents=${new_parents# $first_parent}
merge_args="--no-log --no-ff"
if ! do_with_author output eval \
-   'git merge ${gpg_sign_opt:+"$gpg_sign_opt"} \
-   $allow_rerere_autoupdate $merge_args \
-   $strategy_args -m "$msg_content" $new_parents'
+   git merge ${gpg_sign_opt:+$(git rev-parse \
+   --sq-quote "$gpg_sign_opt")} \
+   $allow_rerere_autoupdate "$merge_args" \
+   "$strategy_args" \
+   -m $(git rev-parse --sq-quote "$msg_content") \
+   "$new_parents"
then
printf "%s\n" "$msg_content" > 
"$GIT_DIR"/MERGE_MSG
die_with_patch $sha1 "$(eval_gettext "Error 
redoing merge \$sha1")"
diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh
index fcfdd197bd3..7c91a85f43a 100755
--- a/t/t3418-rebase-continue.sh
+++ b/t/t3418-rebase-continue.sh
@@ -74,6 +74,20 @@ test_expect_success 'rebase --continue remembers merge 
strategy and options' '
test -f funny.was.run
 '
 
+test_expect_success 'rebase passes merge strategy options correctly' '
+   rm -fr .git/rebase-* &&
+   git reset --hard commit-new-file-F3-on-topic-branch &&
+   test_commit theirs-to-merge &&
+   git reset --hard HEAD^ &&
+   test_commit some-commit &&
+   test_tick &&
+   git merge --no-ff theirs-to-merge &&
+   FAKE_LINES="1 edit 2 3" git rebase -i -f -p -m \
+   -s recursive --strategy-option=theirs HEAD~2 &&
+   test_commit force-change &&
+   git rebase --continue
+'
+
 test_expect_success 'setup rerere database' '
rm -fr .git/rebase-* &&
git reset --hard commit-new-file-F3-on-topic-branch &&

base-commit: 1eaabe34fc6f486367a176207420378f587d3b48
-- 
2.15.1.windows.2.391.g0b42e3c56de

Published-As: https://github.com/dscho/git/releases/tag/rebase-strategy-opts-v1
Fetch-It-Via: git fetch https://github.com/dscho/git rebase-strategy-opts-v1


Re: [PATCH v5 13/34] directory rename detection: tests for handling overwriting untracked files

2018-01-04 Thread Elijah Newren
On Wed, Jan 3, 2018 at 5:52 PM, SZEDER Gábor  wrote:

>> + test $(git rev-parse :0:y/b) = $(git rev-parse O:z/b) &&
>
> There is a test helper for that :)
>
>   test_cmp_rev :0:y/b O:z/b
>
> Note, that this is not only a matter of useful output on failure, but
> also that of correctness and robustness.


Cool, good to know.  Is there any reason test_cmp_rev is not
documented in t/README?

I already changed these yesterday, as part of trying to avoid the use
of plain 'test' as you suggested (I was just waiting another day or so
for more feedback before resubmitting the series).  Since I tended to
have several of these rev-parse comparisons in a single test, I simply
combined them:
git rev-parse >actual
  :0:y/b :1:x/c :2:x/c :3:x/c
git rev-parse >expect
  O:z/b O:x/c A:x/c B:x/c
test_cmp expect actual

That does result in fewer rev-parse invocations, which is probably
good, but the test_cmp_rev does seem slightly more readable.  Hmmm...

> I noticed that this patch series adds several similar
>
>   test $(git hash-object this) = $(git rev-parse that)
>
> conditions.  Well, for that we don't have a test helper
> function.  Similar 'hash-object = rev-parse' comparisons are already
> present in two other test scripts, so perhaps it's worth adding a
> helper function.  Or you could perhaps
>
>   git cat-file -p that >out &&
>   test_cmp this out

Yeah, I switched these over to the same format above; e.g.
   git hash-object this >actual &&
   git rev-parse that > expect &&
   test_cmp expect actual

That does have the advantage of re-using a similar style.

>> + test "very" = "$(cat y/c)" &&
>> +
>> + test "important" = "$(cat y/d)" &&
>
> The 'verbose' helper could make conditions like these more, well,
> verbose about their failure:
>
>   verbose test "very" = "$(cat y/c)" &&

Also good to know.  Is there any reason the "verbose" helper isn't
documented in t/README?

I also switched these over yesterday to write to files and then use
test_cmp, but

>> + test "important" != "$(git rev-parse :3:y/d)" &&
>
> I'm not sure what this condition is supposed to check.

Well that's embarrassing.  That was supposed to be 'cat-file -p', not
rev-parse.  And since I'm already testing that y/d on disk does have
"important" as its contents and that :3:y/d matches O:z/c (i.e. has
contents of "c"), the check was rather unnecessary, and too easy to
false pass.  I think I should just remove it (and two others like it).


Re: Fwd: Unknown option for merge-recursive: -X'diff-algorithm=patience'

2018-01-04 Thread Johannes Schindelin
Hi Matwey,

On Mon, 1 Jan 2018, Matwey V. Kornilov wrote:

> Hello,
> 
> I am running git 2.15.1 and facing the following issue with linux kernel
> tree.
> 
> # git checkout v3.8
> # git branch abc-3.8
> # git checkout v3.9
> # git branch abc-3.9
> # git checkout abc-3.8
> 
> Introduce new commit on top of abc-3.8. Here, I edit README.
> 
> # vim README
> # git commit -a -v
> [abc-3.8 4bf088b5d341] Hello world
> 
> Then I try to rebase abc-3.9 on top of abc-3.8 as the following:
> 
> # git rebase --preserve-merges -s recursive -Xdiff-algorithm=patience
> --onto abc-3.8 v3.8 abc-3.9
> 
> And then I see:
> 
> fatal: Unknown option for merge-recursive: -X'diff-algorithm=patience'
> Error redoing merge e84cf5d0fd53badf3a93c790e280cc92a69ed999
> 
> Attached here is GIT_TRACE=1 output.

Funnily enough, this had been reported on the Git for Windows bugtracker,
and somebody even found the culprit:

https://github.com/git-for-windows/git/issues/1321#issuecomment-337279119

Sadly, I lost track of this. Will prepare a proper patch right now.

Ciao,
Johannes


Re: [RFC PATCH 1/2] add a local copy of Mail::Address from CPAN

2018-01-04 Thread Eric Sunshine
On Thu, Jan 4, 2018 at 1:55 PM, Matthieu Moy  wrote:
> We used to have two versions of the email parsing code. Our
> parse_mailboxes (in Git.pm), and Mail::Address which we used if
> installed. Unfortunately, both versions have different sets of bugs, and
> changing the behavior of git depending on whether Mail::Address is
> installed was a bad idea.
>
> A first attempt to solve this was cc90750 (send-email: don't use
> Mail::Address, even if available, 2017-08-23), but it turns out our
> parse_mailboxes is too buggy for some uses. For example the lack of
> about nested comments support breaks get_maintainer.pl in the Linux

s/about//

> kernel tree:
>
>   https://public-inbox.org/git/20171116154814.23785-1-alex.ben...@linaro.org/
>
> This patch goes the other way: use Mail::Address anyway, but have a
> local copy as a fallback, when the system one is not available.
>
> The duplicated script is small (276 lines of code) and stable in time.
> Maintaining the local copy should not be an issue, and will certainly be
> less burden than maintaining our own parse_mailboxes.
>
> Another option would be to consider Mail::Address as a hard dependency,
> but it's easy enough to save the trouble of extra-dependency to the end
> user or packager.
>
> Signed-off-by: Matthieu Moy 


Re: Test failure for v2.16.0-rc0 on cygwin

2018-01-04 Thread Johannes Schindelin
Hi,

On Tue, 2 Jan 2018, Ramsay Jones wrote:

> On 02/01/18 15:32, Ramsay Jones wrote:
> > On 02/01/18 11:36, Adam Dinwoodie wrote:
> >> On Saturday 30 December 2017 at 02:40 pm +, Adam Dinwoodie wrote:
> >>> On Saturday 30 December 2017 at 02:21 pm +, Ramsay Jones wrote:
> [snip]
> >> I'm not able to reproduce this: t5580 is passing on both my Windows 10
> >> test systems on v2.16.0-rc0.
> [snip]
> > I just tried running the test again by hand, and I can't get it to fail!
> > 
> > Hmm, I have just set off a complete test-suite run, so I won't be able
> > to report on that for a while ... ;-)
> 
> Yep, as expected, the test-suite passes no problem now! ;-)
> 
> > I have an idea: when running the failing tests the other day, I was
> > remotely logged in via ssh (I have cygwin sshd running on my win10
> > box), but today I was logged in directly. The sshd daemon is run by
> > a privileged user, so maybe that could be related ... dunno.
> 
> Also, when logged-in remotely it fails consistently, when logged-in
> directly it passes consistently. :-D

You are most likely hitting cmd.exe at some point there. In cmd.exe, there
are some restrictions that are inherited by spawned processes AFAIU. For
example, the current directory cannot be a UNC path.

You are most likely running the interactive Cygwin session in MinTTY? Then
you do not get those restrictions. If you start Cygwin in a cmd.exe
window, you should see the exact same test failures again.

Ciao,
Dscho


Re: [PATCH 28/40] pack-objects: don't pack objects in external odbs

2018-01-04 Thread Jeff Hostetler



On 1/3/2018 11:33 AM, Christian Couder wrote:

Objects managed by an external ODB should not be put into
pack files. They should be transfered using other mechanism
that can be specific to the external odb.

Signed-off-by: Christian Couder 
---
  builtin/pack-objects.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 6c71552cdf..4ed66c7677 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -28,6 +28,7 @@
  #include "argv-array.h"
  #include "mru.h"
  #include "packfile.h"
+#include "external-odb.h"
  
  static const char *pack_usage[] = {

N_("git pack-objects --stdout [...] [<  | < 
]"),
@@ -1026,6 +1027,9 @@ static int want_object_in_pack(const struct object_id 
*oid,
return want;
}
  
+	if (external_odb_has_object(oid->hash))

+   return 0;
+


I worry about the performance of this in light of my comments
earlier in the patch series about the expense of building the
"have" sets.

Since we've already checked for a loose object and we are about
to walk thru the local packfiles, so if we don't find it in any
of them, then we don't have it locally.  Only then do we need
to worry about external odbs.

If we don't have it locally, does the caller of this function
have sufficient "promisor" infomation infer that the object
should exist on the promisor remote?   Since you're going to
omit it from the packfile anyway, you don't really need to know
if the remote actually has it.



for (entry = packed_git_mru.head; entry; entry = entry->next) {
struct packed_git *p = entry->item;
off_t offset;



Jeff


Re: [ANNOUNCE] Git v2.16.0-rc0

2018-01-04 Thread Johannes Schindelin
Hi,

On Thu, 28 Dec 2017, Junio C Hamano wrote:

> An early preview release Git v2.16.0-rc0 is now available for
> testing at the usual places.

And a corresponding Git for Windows prerelease is also available:

https://github.com/git-for-windows/git/releases/tag/v2.16.0-rc0.windows.1

Ciao,
Johannes


Re: Bug/feature request: Can’t disable fsck warnings during clone

2018-01-04 Thread Johannes Schindelin
Hi Kevin,

On Thu, 28 Dec 2017, Kevin A. Mitchell wrote:

> I’ve set transfer.fsckObjects to true globally, for safety.
> Unfortunately, this messed up my Spacevim install.
> 
> Doing some digging, I found that some of the repos had a warning. I
> can turn the warning off, but that only affects git fsck, not git
> clone. Turning off transfer.fsckObjects also fixes the problem. I’d
> rather have it on for my development work.
> 
> Tested with the “next” branch as well.
> 
> $ git -c transfer.fsckObjects=true -c fsck.zeroPaddedFilemode=ignore
> clone https://github.com/albfan/ag.vim
> Cloning into 'ag.vim'...
> remote: Counting objects: 1879, done.
> error: object 65e1a0027644b6625b32d30ba5ccf1c4d483480a:
> zeroPaddedFilemode: contains zero-padded file modes
> fatal: Error in object
> fatal: index-pack failed
> $ git —version
> git version 2.15.1.501.g29533fb16
> 
> but this works as expected:
> 
> $ git clone https://github.com/albfan/ag.vim
> Cloning into 'ag.vim'...
> remote: Counting objects: 1879, done.
> remote: Total 1879 (delta 0), reused 0 (delta 0), pack-reused 1879
> Receiving objects: 100% (1879/1879), 1.23 MiB | 2.76 MiB/s, done.
> Resolving deltas: 100% (938/938), done.
> $ cd ag.vim
> $ git -c transfer.fsckObjects=true -c fsck.zeroPaddedFilemode=ignore fsck
> Checking object directories: 100% (256/256), done.
> Checking objects: 100% (1879/1879), done.
> $ git -c transfer.fsckObjects=true fsck
> Checking object directories: 100% (256/256), done.
> warning in tree 65e1a0027644b6625b32d30ba5ccf1c4d483480a:
> zeroPaddedFilemode: contains zero-padded file modes
> Checking objects: 100% (1879/1879), done.
> 
> It would be useful to be able to turn off individual warnings during
> cloning. Is there something I’m missing in the config? Or, is this
> something that could be fixed?

Well, you can apparently have your cake and eat it too (see
https://git-scm.com/docs/git-config#git-config-receivefsckltmsg-idgt):

receive.fsck.::
When `receive.fsckObjects` is set to true, errors can be switched
to warnings and vice versa by configuring the `receive.fsck.`
setting where the `` is the fsck message ID and the value
is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes
the error/warning with the message ID, e.g. "missingEmail: invalid
author/committer line - missing email" means that setting
`receive.fsck.missingEmail = ignore` will hide that issue.

In your case, use receive.fsck.zeroPaddedFilemode=ignore=warn (or
=ignore).

Ciao,
Johannes

Re: [ANNOUNCE] Git v2.16.0-rc0

2018-01-04 Thread Thomas Gummerer
On 12/29, Paul Smith wrote:
> On Thu, 2017-12-28 at 20:30 -0800, Junio C Hamano wrote:
> >  * The way "git worktree add" determines what branch to create from
> >where and checkout in the new worktree has been updated a bit.
> 
> Does this include the enhancements published a few weeks ago to allow
> worktrees to be created directly from remote branches without first
> checking out the branch locally? I'm really looking forward to that
> change...

Yes, this release will include that.  It would be awesome if you could
test the rc, now is the best time to scream if something about it is
not as you'd expect :)

> Thanks!


Re: [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX

2018-01-04 Thread Thomas Gummerer
On 12/18, Lars Schneider wrote:
> 
> > On 17 Dec 2017, at 23:51, Thomas Gummerer  wrote:
> > 
> > Split index mode only has a few dedicated tests, but as the index is
> > involved in nearly every git operation, this doesn't quite cover all the
> > ways repositories with split index can break.  To use split index mode
> > throughout the test suite a GIT_TEST_SPLIT_INDEX environment variable
> > can be set, which makes git split the index at random and thus
> > excercises the functionality much more thoroughly.
> > 
> > As this is not turned on by default, it is not executed nearly as often
> > as the test suite is run, so occationally breakages slip through.  Try
> > to counteract that by running the test suite with GIT_TEST_SPLIT_INDEX
> > mode turned on on travis.
> > 
> > To avoid using too many cycles on travis only run split index mode in
> > the linux-gcc and the linux 32-bit gcc targets.
> 
> I am surprised to see the split index mode test for the linux 32-bit
> target. Is it likely that a split index bug appears only on 32-bit? 
> Wouldn't the linux-gcc target be sufficient to save resources/time?

I'm not sure it's particularly likely for a bug to appear only on
32-bit builds.  It also doesn't seem to take too long to run, so I
thought I'd add it just in case, but I'm happy running the tests only
in the 64-bit builds if that's preferred.

> >  The Linux builds were
> > chosen over the Mac OS builds because they tend to be much faster to
> > complete.
> > 
> > The linux gcc build was chosen over the linux clang build because the
> > linux clang build is the fastest build, so it can serve as an early
> > indicator if something is broken and we want to avoid spending the extra
> > cycles of running the test suite twice for that.
> > 
> > Helped-by: Lars Schneider 
> > Helped-by: Junio C Hamano 
> > Signed-off-by: Thomas Gummerer 
> > ---
> > ci/run-linux32-build.sh | 1 +
> > ci/run-tests.sh | 4 
> > 2 files changed, 5 insertions(+)
> > 
> > diff --git a/ci/run-linux32-build.sh b/ci/run-linux32-build.sh
> > index e30fb2cddc..f173c9cf2a 100755
> > --- a/ci/run-linux32-build.sh
> > +++ b/ci/run-linux32-build.sh
> > @@ -27,4 +27,5 @@ linux32 --32bit i386 su -m -l $CI_USER -c '
> > cd /usr/src/git &&
> > make --jobs=2 &&
> > make --quiet test
> > +GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
> > '
> > diff --git a/ci/run-tests.sh b/ci/run-tests.sh
> > index f0c743de94..c7aee5b9ff 100755
> > --- a/ci/run-tests.sh
> > +++ b/ci/run-tests.sh
> > @@ -8,3 +8,7 @@
> > mkdir -p $HOME/travis-cache
> > ln -s $HOME/travis-cache/.prove t/.prove
> > make --quiet test
> > +if test "$jobname" = "linux-gcc"
> > +then
> > +   GIT_TEST_SPLIT_INDEX=YesPlease make --quiet test
> > +fi
> 
> For now I think that looks good. Maybe we could define additional test 
> configurations with an environment variable. That could be an array variable
> defined in the lib-travis.ci "case" statement:
> https://github.com/git/git/blob/1229713f78cd2883798e95f33c19c81b523413fd/ci/lib-travisci.sh#L42-L65

That sounds like a good idea.  I'll try to see if I can come up with
something.

> - Lars


Re: Bring together merge and rebase

2018-01-04 Thread Martin Fick
On Tuesday, December 26, 2017 01:31:55 PM Carl Baldwin 
wrote:
...
> What I propose is that gerrit and github could end up more
> robust, featureful, and interoperable if they had this
> feature to build from.

I agree (assuming we come up with a well defined feature)

> With gerrit specifically, adopting this feature would make
> the "change" concept richer than it is now because it
> could supersede the change-id in the commit message and
> allow a change to evolve in a distributed non-linear way
> with protection against clobbering work.

We (the Gerrit maintainers) would like changes to be able to 
evolve non-linearly so that we can eventually support 
distributed Gerrit reviews, and the amended-commit pointer 
is one way I have thought to resolve this.

> I have no intention to disparage either tool. I love them
> both. They've both made my career better in different
> ways. I know there is no guarantee that github, gerrit,
> or any other tool will do anything to adopt this. But,
> I'm hoping they are reading this thread and that they
> recognize how this feature can make them a little bit
> better and jump in and help. I know it is a lot to hope
> for but I think it could be great if it happened.

We (the Gerrit maintainers) do recognize it, and I am glad 
that someone is pushing for solutions in this space.  I am 
not sure what the right solution is, and how to modify 
workflows to deal better with this.  I do think that starting 
by making your local repo track pointers to amended-commits, 
likely with various git hooks and notes (as also proposed by 
Johannes Schindelin), would be a good start.   With that in 
place, then you can attack various specific workflows.

If you want to then attack the Gerrit workflow, it would be 
good if you could prevent pushing new patchests that are 
amended versions of patchsets that are out of date.  While 
it would be great if Gerrit could reject such pushes, I 
wonder if to start, git could detect and it prevent the push 
in this situation?  Could a git push hook analyze the ref 
advertisements and figure this out (all the patchsets are in 
the advertisement)?  Can a git hook look at the ref 
advertisement?

-Martin


-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation



Re: [PATCH 01/40] Add initial external odb support

2018-01-04 Thread Jeff Hostetler



On 1/3/2018 11:33 AM, Christian Couder wrote:

The external-odb.{c,h} files will contain the functions
that are called by the rest of Git mostly from
"sha1_file.c" to access the objects managed by the
external odbs.

The odb-helper.{c,h} files will contain the functions to
actually implement communication with either the internal
functions or the external scripts or processes that will
manage and provide external git objects.

For now only infrastructure to create helpers from the
config and to manage a cache for the 'have' command is
implemented.

Helped-by: Jeff King 
Signed-off-by: Christian Couder 
---

[...]

diff --git a/odb-helper.h b/odb-helper.h
new file mode 100644
index 00..9395e606ce
--- /dev/null
+++ b/odb-helper.h
@@ -0,0 +1,24 @@
+#ifndef ODB_HELPER_H
+#define ODB_HELPER_H
+
+struct odb_helper {
+   const char *name;
+   const char *dealer;
+
+   struct odb_helper_object {
+   unsigned char sha1[20];


Should this be "struct object_id" ?


+   unsigned long size;
+   enum object_type type;
+   } *have;
+   int have_nr;
+   int have_alloc;
+   int have_valid;
+
+   struct odb_helper *next;
+};
+
+extern struct odb_helper *odb_helper_new(const char *name, int namelen);
+extern int odb_helper_has_object(struct odb_helper *o,
+const unsigned char *sha1);
+
+#endif /* ODB_HELPER_H */


Jeff



Re: [PATCH 15/40] external-odb: add script mode support

2018-01-04 Thread Jeff Hostetler



On 1/3/2018 11:33 AM, Christian Couder wrote:

This adds support for the script command mode where
an helper script or command is called to retrieve or
manage objects.

This implements the 'have' and 'get_git_obj'
instructions for the script mode.

Signed-off-by: Christian Couder 
---
  external-odb.c  |  51 ++-
  external-odb.h  |   1 +
  odb-helper.c| 218 +++-
  odb-helper.h|   4 +
  sha1_file.c |  12 ++-
  t/t0400-external-odb.sh |  44 ++
  6 files changed, 327 insertions(+), 3 deletions(-)
  create mode 100755 t/t0400-external-odb.sh

diff --git a/external-odb.c b/external-odb.c
index 5d0afb9762..81f2aa5fac 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -33,8 +33,14 @@ static int external_odb_config(const char *var, const char 
*value, void *data)
  
  	o = find_or_create_helper(name, namelen);
  
-	if (!strcmp(subkey, "promisorremote"))

+   if (!strcmp(subkey, "promisorremote")) {
+   o->type = ODB_HELPER_GIT_REMOTE;
return git_config_string(>dealer, var, value);
+   }
+   if (!strcmp(subkey, "scriptcommand")) {
+   o->type = ODB_HELPER_SCRIPT_CMD;
+   return git_config_string(>dealer, var, value);
+   }
  
  	return 0;

  }
@@ -77,6 +83,49 @@ int external_odb_has_object(const unsigned char *sha1)
return 0;
  }
  
+int external_odb_get_object(const unsigned char *sha1)

+{
+   struct odb_helper *o;
+   const char *path;
+
+   if (!external_odb_has_object(sha1))
+   return -1;
+
+   path = sha1_file_name_alt(external_odb_root(), sha1);
+   safe_create_leading_directories_const(path);
+   prepare_external_alt_odb();
+
+   for (o = helpers; o; o = o->next) {
+   struct strbuf tmpfile = STRBUF_INIT;
+   int ret;
+   int fd;
+
+   if (!odb_helper_has_object(o, sha1))
+   continue;
+
+   fd = create_object_tmpfile(, path);
+   if (fd < 0) {
+   strbuf_release();
+   return -1;
+   }
+
+   if (odb_helper_get_object(o, sha1, fd) < 0) {
+   close(fd);
+   unlink(tmpfile.buf);
+   strbuf_release();
+   continue;
+   }
+
+   close_sha1_file(fd);
+   ret = finalize_object_file(tmpfile.buf, path);
+   strbuf_release();
+   if (!ret)
+   return 0;
+   }
+
+   return -1;
+}
+
  int external_odb_get_direct(const unsigned char *sha1)
  {
struct odb_helper *o;
diff --git a/external-odb.h b/external-odb.h
index fd6708163e..fb8b94972f 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -4,6 +4,7 @@
  extern int has_external_odb(void);
  extern const char *external_odb_root(void);
  extern int external_odb_has_object(const unsigned char *sha1);
+extern int external_odb_get_object(const unsigned char *sha1);
  extern int external_odb_get_direct(const unsigned char *sha1);
  
  #endif /* EXTERNAL_ODB_H */

diff --git a/odb-helper.c b/odb-helper.c
index 4b70b287af..c1a3443dc7 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -21,13 +21,124 @@ struct odb_helper_cmd {
struct child_process child;
  };
  
+/*

+ * Callers are responsible to ensure that the result of vaddf(fmt, ap)
+ * is properly shell-quoted.
+ */
+static void prepare_helper_command(struct argv_array *argv, const char *cmd,
+  const char *fmt, va_list ap)
+{
+   struct strbuf buf = STRBUF_INIT;
+
+   strbuf_addstr(, cmd);
+   strbuf_addch(, ' ');
+   strbuf_vaddf(, fmt, ap);


I do find this a bit odd that you're putting the cmd, a space,
and the printf results into a single arg in the argv, rather than
directly loading up the argv.

Is there an issue with the whitespace between the cmd and the
printf result being in the same arg -- especially if there are
quoting issues in the fmt string as you mention in the comment
above?  (not sure, just asking)


+
+   argv_array_push(argv, buf.buf);
+   strbuf_release();
+}
+
+__attribute__((format (printf,3,4)))
+static int odb_helper_start(struct odb_helper *o,
+   struct odb_helper_cmd *cmd,
+   const char *fmt, ...)
+{
+   va_list ap;
+
+   memset(cmd, 0, sizeof(*cmd));
+   argv_array_init(>argv);
+
+   if (!o->dealer)
+   return -1;
+
+   va_start(ap, fmt);
+   prepare_helper_command(>argv, o->dealer, fmt, ap);
+   va_end(ap);
+
+   cmd->child.argv = cmd->argv.argv;
+   cmd->child.use_shell = 1;
+   cmd->child.no_stdin = 1;
+   cmd->child.out = -1;
+
+   if (start_command(>child) < 0) {
+   argv_array_clear(>argv);
+   return -1;
+   }
+
+ 

Re: Bring together merge and rebase

2018-01-04 Thread Martin Fick
On Monday, December 25, 2017 06:16:40 PM Carl Baldwin wrote:
> On Sun, Dec 24, 2017 at 10:52:15PM -0500, Theodore Ts'o 
wrote:
> Look at what happens in a rebase type workflow in any of
> the following scenarios. All of these came up regularly
> in my time with Gerrit.
> 
> 1. Make a quick edit through the web UI then later
> work on the change again in your local clone. It is easy
> to forget to pull down the change made through the UI
> before starting to work on it again. If that happens, the
> change made through the UI will almost certainly be
> clobbered.
> 
> 2. You or someone else creates a second change that is
> dependent on yours and works on it while yours is still
> evolving. If the second change gets rebased with an older
> copy of the base change and then posted back up for
> review, newer work in the base change has just been
> clobbered.
> 
> 3. As a reviewer, you decide the best way to explain
> how you'd like to see something done differently is to
> make the quick change yourself and push it up. If the
> author fails to fetch what you pushed before continuing
> onto something else, it gets clobbered.
> 
> 4. You want to collaborate on a single change with
> someone else in any way and for whatever reason. As soon
> as that change starts hitting multiple work spaces, there
> are synchronization issues that currently take careful
> manual intervention.

These scenarios seem to come up most for me at Gerrit hack-
a-thons where we collaborate a lot in short time spans on 
changes.  We (the Gerrit maintainers) too have wanted and 
sometimes discussed ways to track the relation of "amended" 
commits (which is generally what Gerrit patchsets are).  We 
also concluded that some sort of parent commit pointer was 
needed, although parent is somewhat the wrong term since 
that already means something in git.  Rather, maybe some 
"predecessor" type of term would be better, maybe 
"antecedent", but "amended-commit" pointer might be best?

-Martin

-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation



Re: [PATCH 1/4] color.h: document and modernize header

2018-01-04 Thread Stefan Beller
On Thu, Dec 28, 2017 at 2:00 PM, Eric Sunshine  wrote:
> On Thu, Dec 28, 2017 at 4:03 PM, Stefan Beller  wrote:
>> Add documentation explaining the functions in color.h.
>> While at it, mark them extern.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>> diff --git a/color.h b/color.h
>> @@ -72,26 +72,48 @@ extern int color_stdout_is_tty;
>>  /*
>> - * Set the color buffer (which must be COLOR_MAXLEN bytes)
>> - * to the raw color bytes; this is useful for initializing
>> + * Set the color buffer `dst` (which must be COLOR_MAXLEN bytes)
>> + * to the raw color bytes `color_bytes`; this is useful for initializing
>>   * default color variables.
>>   */
>> -void color_set(char *dst, const char *color_bytes);
>> +extern void color_set(char *dst, const char *color_bytes);
>
> I don't see an explanation of what "color bytes" are. From where does
> one obtain such bytes? How is this function used? The function comment
> does not particularly answer these questions.

Right. This description is bad.


It's implementation is just

xsnprintf(dst, COLOR_MAXLEN, "%s", color_bytes);

Apparently this function is only ever used by grep.c which uses
it to fill in struct grep_opt {
...
char color_context[COLOR_MAXLEN];
char color_filename[COLOR_MAXLEN];
char color_function[COLOR_MAXLEN];
char color_lineno[COLOR_MAXLEN];
char color_match_context[COLOR_MAXLEN];
char color_match_selected[COLOR_MAXLEN];
char color_selected[COLOR_MAXLEN];
char color_sep[COLOR_MAXLEN];
...
}

I guess I'll drop the documentation for color_set, and put a NEEDSWORK
comment there as I'd think we don't need to copy around the colors
using snprintf, but either can keep pointers or use xmemdup.

>
>> +/*
>> + * Parses a config option, which can be a boolean or one of
>> + * "never", "auto", "always". Returns the constant for the given setting.
>> + */
>> +extern int git_config_colorbool(const char *var, const char *value);
>
> I suppose that "constant for the given setting" means one of
> GIT_COLOR_NEVER , GIT_COLOR_AUTO, GIT_COLOR_ALWAYS? Perhaps say so
> explicitly?

Maybe I should fix the code as well. Currently it only returns one of
0 (=GIT_COLOR_NEVER), 1 (=GIT_COLOR_ALWAYS) and
GIT_COLOR_AUTO.

> Would it also make sense to say that boolean "true" ("yes", etc.)
> results in GIT_COLOR_ALWAYS and "false" ("no", etc.)" results in
> GIT_COLOR_NEVER?

done.

>
> Finally, for grammatical consistency with other comments:
> s/Parses/Parse
> s/Returns/Return/

done

>
>> +/* Is the output supposed to be colored? Resolve and cache the 'auto' 
>> setting */
>> +extern int want_color(int var);
>
> What is the 'var' argument? How is it interpreted? (...goes and checks
> implementation...) I guess this documentation should explain that the
> caller would pass in the result of git_config_colorbool().
>
> Also, the meaning of "Resolve and cache 'auto' setting" stumped me for
> a while since it's not clear why it's here (or why it's missing the
> full stop), but I eventually realized that it's describing an
> implementation detail, which probably doesn't belong in API
> documentation.

done

>
>> +/*
>> + * Translate the human readable color string from `value` and into
>> + * terminal color codes and store them in `dst`
>> + */
>> +extern int color_parse(const char *value, char *dst);
>> +extern int color_parse_mem(const char *value, int len, char *dst);
>
> What does "human readable" mean in this context? Is it talking about
> color names or RGB(A) tuples or what?
>
> Also, how does the caller know how large to make 'dst'? At minimum,
> you should say something about COLOR_MAXLEN.
>
> Finally, for the 'len' case, what happens if 'dst' is too small? This
> should be documented.
>
> And, the return value of these functions should be discussed.
>
>> +/*
>> + * Print the format string `fmt`, encapsulated by setting and resetting the
>> + * color. Omits the color encapsulation if `color` is NULL.
>
> The "encapsulated by setting and resetting the color" bit is hard to
> grok. Perhaps instead say something along the lines of:
>
> Output the formatted string in the specified color (and
> then reset to normal color so subsequent output is
> uncolored).

sounds good.

>
>> + * The `color_fprintf_ln` prints a new line after resetting the color.
>> + * The `color_print_strbuf` prints the given pre-formatted strbuf instead.
>
> Should the strbuf variation warn that it only outputs content up to
> the first embedded NUL? (Or should that bug/misfeature be fixed?)

added.

>
>> + */
>>  __attribute__((format (printf, 3, 4)))
>> +extern int color_fprintf(FILE *fp, const char *color, const char *fmt, ...);
>>  __attribute__((format (printf, 3, 4)))
>> +extern int color_fprintf_ln(FILE *fp, const char *color, const char *fmt, 
>> ...);
>> +extern void color_print_strbuf(FILE *fp, const char *color, const struct 
>> strbuf *sb);
>>
>> -int 

Re: Bring together merge and rebase

2018-01-04 Thread Martin Fick
On Sunday, December 24, 2017 12:01:38 AM Johannes Schindelin 
wrote:
> Hi Carl,
> 
> On Sat, 23 Dec 2017, Carl Baldwin wrote:
> > I imagine that a "git commit --amend" would also insert
> > a "replaces" reference to the original commit but I
> > failed to mention that in my original post.
> 
> And cherry-pick, too, of course.
> 
> Both of these examples hint at a rather huge urge of some
> users to turn this feature off because the referenced
> commits may very well be throw-away commits in their
> case, making the newly-recorded information completely
> undesired.
> 
> Example: I am working on a topic branch. In the middle, I
> see a typo. I commit a fix, continue to work on the topic
> branch. Later, I cherry-pick that commit to a separate
> topic branch because I really don't think that those two
> topics are related. Now I definitely do not want a
> reference of the cherry-picked commit to the original
> one: the latter will never be pushed to a public
> repository, and gc'ed in a few weeks.
> 
> Of course, that is only my wish, other users in similar
> situations may want that information. Demonstrating that
> you would be better served with an opt-in feature that
> uses notes rather than a baked-in commit header.

I think what you are highlighting is not when to track this, 
but rather when to share this tracking.  In my local repo, I 
would definitely want to know that I cherry-picked this from 
elsewhere, it helps me understand what I have done later 
when I look back at old commits and branches that need to 
potentially be thrown away.  But I agree you may not want to 
share these publicly.

I am not sure what the right formula is, for when to share 
these pointers publicly, but it seems like it might be that 
whenever you push something, it should push along any 
references to amended commits that were publicly available 
already.  I am not sure how to track that, but I suspect it 
is a subset of the union of commits you have fetched, and 
commits you have pushed (i.e. you got it from elsewhere, or 
you created it and already shared it with others)?  Maybe it 
should also include any commits reachable by advertisements 
to places you are pushing to (in case it got shared some 
other way)?

-Martin

-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation



[RFC PATCH 2/2] Remove now useless email-address parsing code

2018-01-04 Thread Matthieu Moy
We now use Mail::Address unconditionaly, hence parse_mailboxes is now
dead code. Remove it and its tests.

Signed-off-by: Matthieu Moy 
---
 perl/Git.pm  | 71 
 t/t9000-addresses.sh | 27 
 t/t9000/test.pl  | 67 -
 3 files changed, 165 deletions(-)
 delete mode 100755 t/t9000-addresses.sh
 delete mode 100755 t/t9000/test.pl

diff --git a/perl/Git.pm b/perl/Git.pm
index 02a3871..9d60d79 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -880,77 +880,6 @@ sub ident_person {
return "$ident[0] <$ident[1]>";
 }
 
-=item parse_mailboxes
-
-Return an array of mailboxes extracted from a string.
-
-=cut
-
-# Very close to Mail::Address's parser, but we still have minor
-# differences in some cases (see t9000 for examples).
-sub parse_mailboxes {
-   my $re_comment = qr/\((?:[^)]*)\)/;
-   my $re_quote = qr/"(?:[^\"\\]|\\.)*"/;
-   my $re_word = qr/(?:[^]["\s()<>:;@\\,.]|\\.)+/;
-
-   # divide the string in tokens of the above form
-   my $re_token = qr/(?:$re_quote|$re_word|$re_comment|\S)/;
-   my @tokens = map { $_ =~ /\s*($re_token)\s*/g } @_;
-   my $end_of_addr_seen = 0;
-
-   # add a delimiter to simplify treatment for the last mailbox
-   push @tokens, ",";
-
-   my (@addr_list, @phrase, @address, @comment, @buffer) = ();
-   foreach my $token (@tokens) {
-   if ($token =~ /^[,;]$/) {
-   # if buffer still contains undeterminated strings
-   # append it at the end of @address or @phrase
-   if ($end_of_addr_seen) {
-   push @phrase, @buffer;
-   } else {
-   push @address, @buffer;
-   }
-
-   my $str_phrase = join ' ', @phrase;
-   my $str_address = join '', @address;
-   my $str_comment = join ' ', @comment;
-
-   # quote are necessary if phrase contains
-   # special characters
-   if ($str_phrase =~ /[][()<>:;@\\,.\000-\037\177]/) {
-   $str_phrase =~ s/(^|[^\\])"/$1/g;
-   $str_phrase = qq["$str_phrase"];
-   }
-
-   # add "<>" around the address if necessary
-   if ($str_address ne "" && $str_phrase ne "") {
-   $str_address = qq[<$str_address>];
-   }
-
-   my $str_mailbox = "$str_phrase $str_address 
$str_comment";
-   $str_mailbox =~ s/^\s*|\s*$//g;
-   push @addr_list, $str_mailbox if ($str_mailbox);
-
-   @phrase = @address = @comment = @buffer = ();
-   $end_of_addr_seen = 0;
-   } elsif ($token =~ /^\(/) {
-   push @comment, $token;
-   } elsif ($token eq "<") {
-   push @phrase, (splice @address), (splice @buffer);
-   } elsif ($token eq ">") {
-   $end_of_addr_seen = 1;
-   push @address, (splice @buffer);
-   } elsif ($token eq "@" && !$end_of_addr_seen) {
-   push @address, (splice @buffer), "@";
-   } else {
-   push @buffer, $token;
-   }
-   }
-
-   return @addr_list;
-}
-
 =item hash_object ( TYPE, FILENAME )
 
 Compute the SHA1 object id of the given C considering it is
diff --git a/t/t9000-addresses.sh b/t/t9000-addresses.sh
deleted file mode 100755
index a1ebef6..000
--- a/t/t9000-addresses.sh
+++ /dev/null
@@ -1,27 +0,0 @@
-#!/bin/sh
-
-test_description='compare address parsing with and without Mail::Address'
-. ./test-lib.sh
-
-if ! test_have_prereq PERL; then
-   skip_all='skipping perl interface tests, perl not available'
-   test_done
-fi
-
-perl -MTest::More -e 0 2>/dev/null || {
-   skip_all="Perl Test::More unavailable, skipping test"
-   test_done
-}
-
-perl -MMail::Address -e 0 2>/dev/null || {
-   skip_all="Perl Mail::Address unavailable, skipping test"
-   test_done
-}
-
-test_external_has_tap=1
-
-test_external_without_stderr \
-   'Perl address parsing function' \
-   perl "$TEST_DIRECTORY"/t9000/test.pl
-
-test_done
diff --git a/t/t9000/test.pl b/t/t9000/test.pl
deleted file mode 100755
index dfeaa9c..000
--- a/t/t9000/test.pl
+++ /dev/null
@@ -1,67 +0,0 @@
-#!/usr/bin/perl
-use lib (split(/:/, $ENV{GITPERLLIB}));
-
-use 5.008;
-use warnings;
-use strict;
-
-use Test::More qw(no_plan);
-use Mail::Address;
-
-BEGIN { use_ok('Git') }
-
-my @success_list = (q[Jane],
-   q[j...@example.com],
-   q[],
-   q[Jane ],
-   q[Jane Doe ],

[PATCH v4 2/7] wildmatch test: use more standard shell style

2018-01-04 Thread Ævar Arnfjörð Bjarmason
Change the wildmatch test to use more standard shell style, usually we
use "if test" not "if [".

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t3070-wildmatch.sh | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index 27fa878f6e..4d589d1f9a 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -5,7 +5,8 @@ test_description='wildmatch tests'
 . ./test-lib.sh
 
 match() {
-   if [ $1 = 1 ]; then
+   if test "$1" = 1
+   then
test_expect_success "wildmatch: match '$3' '$4'" "
test-wildmatch wildmatch '$3' '$4'
"
@@ -17,7 +18,8 @@ match() {
 }
 
 imatch() {
-   if [ $1 = 1 ]; then
+   if test "$1" = 1
+   then
test_expect_success "iwildmatch:match '$2' '$3'" "
test-wildmatch iwildmatch '$2' '$3'
"
@@ -29,7 +31,8 @@ imatch() {
 }
 
 pathmatch() {
-   if [ $1 = 1 ]; then
+   if test "$1" = 1
+   then
test_expect_success "pathmatch: match '$2' '$3'" "
test-wildmatch pathmatch '$2' '$3'
"
-- 
2.15.1.424.g9478a66081



[PATCH v4 3/7] wildmatch test: don't try to vertically align our output

2018-01-04 Thread Ævar Arnfjörð Bjarmason
Don't try to vertically align the test output, which is futile anyway
under the TAP output where we're going to be emitting a number for
each test without aligning the test count.

This makes subsequent changes of mine where I'm not going to be
aligning this output as I add new tests easier.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t3070-wildmatch.sh | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index 4d589d1f9a..19ea64bba9 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -7,11 +7,11 @@ test_description='wildmatch tests'
 match() {
if test "$1" = 1
then
-   test_expect_success "wildmatch: match '$3' '$4'" "
+   test_expect_success "wildmatch: match '$3' '$4'" "
test-wildmatch wildmatch '$3' '$4'
"
else
-   test_expect_success "wildmatch:  no match '$3' '$4'" "
+   test_expect_success "wildmatch: no match '$3' '$4'" "
! test-wildmatch wildmatch '$3' '$4'
"
fi
@@ -20,7 +20,7 @@ match() {
 imatch() {
if test "$1" = 1
then
-   test_expect_success "iwildmatch:match '$2' '$3'" "
+   test_expect_success "iwildmatch: match '$2' '$3'" "
test-wildmatch iwildmatch '$2' '$3'
"
else
@@ -33,11 +33,11 @@ imatch() {
 pathmatch() {
if test "$1" = 1
then
-   test_expect_success "pathmatch: match '$2' '$3'" "
+   test_expect_success "pathmatch: match '$2' '$3'" "
test-wildmatch pathmatch '$2' '$3'
"
else
-   test_expect_success "pathmatch:  no match '$2' '$3'" "
+   test_expect_success "pathmatch: no match '$2' '$3'" "
! test-wildmatch pathmatch '$2' '$3'
"
fi
-- 
2.15.1.424.g9478a66081



[PATCH v4 0/7] increase wildmatch test coverage

2018-01-04 Thread Ævar Arnfjörð Bjarmason
This fixes errors in v3 that caused failures on bash & on Windows.

Ævar Arnfjörð Bjarmason (7):
  wildmatch test: indent with tabs, not spaces
  wildmatch test: use more standard shell style
  wildmatch test: don't try to vertically align our output
  wildmatch test: use a paranoia pattern from nul_match()
  wildmatch test: remove dead fnmatch() test code
  wildmatch test: perform all tests under all wildmatch() modes

No changes

  wildmatch test: create & test files on disk in addition to in-memory

Rephrase the commit message a bit.

Change $10 to ${10} for portability.

Skip tests that would create a file with "\" in the name on Windows
(or rather, where we don't have the BSLASHPSPEC prereq).

 t/helper/test-wildmatch.c |   2 +
 t/t3070-wildmatch.sh  | 777 +++---
 2 files changed, 534 insertions(+), 245 deletions(-)

-- 
2.15.1.424.g9478a66081



[PATCH v4 4/7] wildmatch test: use a paranoia pattern from nul_match()

2018-01-04 Thread Ævar Arnfjörð Bjarmason
Use a pattern from the nul_match() function in t7008-grep-binary.sh to
make sure that we don't just fall through to the "else" if there's an
unknown parameter.

This is something I added in commit 77f6f4406f ("grep: add a test
helper function for less verbose -f \0 tests", 2017-05-20) to grep
tests, which were modeled on these wildmatch tests, and I'm now
porting back to the original wildmatch tests.

I am not using the "say '...'; exit 1" pattern from t-basic.sh
because if I fail I want to run the rest of the tests (unless under
-i), and doing this makes sure we do that and don't exit right away
without fully reporting our errors.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t3070-wildmatch.sh | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index 19ea64bba9..9691d8eda3 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -10,10 +10,13 @@ match() {
test_expect_success "wildmatch: match '$3' '$4'" "
test-wildmatch wildmatch '$3' '$4'
"
-   else
+   elif test "$1" = 0
+   then
test_expect_success "wildmatch: no match '$3' '$4'" "
! test-wildmatch wildmatch '$3' '$4'
"
+   else
+   test_expect_success "PANIC: Test framework error. Unknown 
matches value $1" 'false'
fi
 }
 
@@ -23,10 +26,13 @@ imatch() {
test_expect_success "iwildmatch: match '$2' '$3'" "
test-wildmatch iwildmatch '$2' '$3'
"
-   else
+   elif test "$1" = 0
+   then
test_expect_success "iwildmatch: no match '$2' '$3'" "
! test-wildmatch iwildmatch '$2' '$3'
"
+   else
+   test_expect_success "PANIC: Test framework error. Unknown 
matches value $1" 'false'
fi
 }
 
@@ -36,10 +42,13 @@ pathmatch() {
test_expect_success "pathmatch: match '$2' '$3'" "
test-wildmatch pathmatch '$2' '$3'
"
-   else
+   elif test "$1" = 0
+   then
test_expect_success "pathmatch: no match '$2' '$3'" "
! test-wildmatch pathmatch '$2' '$3'
"
+   else
+   test_expect_success "PANIC: Test framework error. Unknown 
matches value $1" 'false'
fi
 }
 
-- 
2.15.1.424.g9478a66081



[PATCH v4 6/7] wildmatch test: perform all tests under all wildmatch() modes

2018-01-04 Thread Ævar Arnfjörð Bjarmason
Rewrite the wildmatch() test suite so that each test now tests all
combinations of the wildmatch() WM_CASEFOLD and WM_PATHNAME flags.

Before this change some test inputs were not tested on
e.g. WM_PATHNAME. Now the function is stress tested on all possible
inputs, and for each input we declare what the result should be if the
mode is case-insensitive, or pathname matching, or case-sensitive or
not matching pathnames.

Also before this change, nothing was testing case-insensitive
non-pathname matching, so I've added that to test-wildmatch.c and made
use of it.

This yields a rather scary patch, but there are no functional changes
here, just more test coverage. Some now-redundant tests were deleted
as a result of this change, since they were now duplicating an earlier
test.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/helper/test-wildmatch.c |   2 +
 t/t3070-wildmatch.sh  | 478 +++---
 2 files changed, 239 insertions(+), 241 deletions(-)

diff --git a/t/helper/test-wildmatch.c b/t/helper/test-wildmatch.c
index 921d7b3e7e..66d33dfcfd 100644
--- a/t/helper/test-wildmatch.c
+++ b/t/helper/test-wildmatch.c
@@ -16,6 +16,8 @@ int cmd_main(int argc, const char **argv)
return !!wildmatch(argv[3], argv[2], WM_PATHNAME | WM_CASEFOLD);
else if (!strcmp(argv[1], "pathmatch"))
return !!wildmatch(argv[3], argv[2], 0);
+   else if (!strcmp(argv[1], "ipathmatch"))
+   return !!wildmatch(argv[3], argv[2], WM_CASEFOLD);
else
return 1;
 }
diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index 2f8a681c72..06db053ae2 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -4,278 +4,274 @@ test_description='wildmatch tests'
 
 . ./test-lib.sh
 
-match() {
-   if test "$1" = 1
+wildtest() {
+   match_w_glob=$1
+   match_w_globi=$2
+   match_w_pathmatch=$3
+   match_w_pathmatchi=$4
+   text=$5
+   pattern=$6
+
+   if test "$match_w_glob" = 1
then
-   test_expect_success "wildmatch: match '$2' '$3'" "
-   test-wildmatch wildmatch '$2' '$3'
+   test_expect_success "wildmatch: match '$text' '$pattern'" "
+   test-wildmatch wildmatch '$text' '$pattern'
"
-   elif test "$1" = 0
+   elif test "$match_w_glob" = 0
then
-   test_expect_success "wildmatch: no match '$2' '$3'" "
-   ! test-wildmatch wildmatch '$2' '$3'
+   test_expect_success "wildmatch: no match '$text' '$pattern'" "
+   ! test-wildmatch wildmatch '$text' '$pattern'
"
else
-   test_expect_success "PANIC: Test framework error. Unknown 
matches value $1" 'false'
+   test_expect_success "PANIC: Test framework error. Unknown 
matches value $match_w_glob" 'false'
fi
-}
 
-imatch() {
-   if test "$1" = 1
+   if test "$match_w_globi" = 1
then
-   test_expect_success "iwildmatch: match '$2' '$3'" "
-   test-wildmatch iwildmatch '$2' '$3'
+   test_expect_success "iwildmatch: match '$text' '$pattern'" "
+   test-wildmatch iwildmatch '$text' '$pattern'
"
-   elif test "$1" = 0
+   elif test "$match_w_globi" = 0
then
-   test_expect_success "iwildmatch: no match '$2' '$3'" "
-   ! test-wildmatch iwildmatch '$2' '$3'
+   test_expect_success "iwildmatch: no match '$text' '$pattern'" "
+   ! test-wildmatch iwildmatch '$text' '$pattern'
"
else
-   test_expect_success "PANIC: Test framework error. Unknown 
matches value $1" 'false'
+   test_expect_success "PANIC: Test framework error. Unknown 
matches value $match_w_globi" 'false'
fi
-}
 
-pathmatch() {
-   if test "$1" = 1
+   if test "$match_w_pathmatch" = 1
then
-   test_expect_success "pathmatch: match '$2' '$3'" "
-   test-wildmatch pathmatch '$2' '$3'
+   test_expect_success "pathmatch: match '$text' '$pattern'" "
+   test-wildmatch pathmatch '$text' '$pattern'
"
-   elif test "$1" = 0
+   elif test "$match_w_pathmatch" = 0
then
-   test_expect_success "pathmatch: no match '$2' '$3'" "
-   ! test-wildmatch pathmatch '$2' '$3'
+   test_expect_success "pathmatch: no match '$text' '$pattern'" "
+   ! test-wildmatch pathmatch '$text' '$pattern'
"
else
-   test_expect_success "PANIC: Test framework error. Unknown 
matches value $1" 'false'
+   test_expect_success "PANIC: Test framework error. Unknown 
matches value $match_w_pathmatch" 'false'
+   fi
+
+   if test 

[PATCH v4 5/7] wildmatch test: remove dead fnmatch() test code

2018-01-04 Thread Ævar Arnfjörð Bjarmason
Remove the unused fnmatch() test parameter from the wildmatch
test. The code that used to test this was removed in 70a8fc999d ("stop
using fnmatch (either native or compat)", 2014-02-15).

As a --word-diff shows the only change to the body of the tests is the
removal of the second out of four parameters passed to match().

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t3070-wildmatch.sh | 356 +--
 1 file changed, 178 insertions(+), 178 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index 9691d8eda3..2f8a681c72 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -7,13 +7,13 @@ test_description='wildmatch tests'
 match() {
if test "$1" = 1
then
-   test_expect_success "wildmatch: match '$3' '$4'" "
-   test-wildmatch wildmatch '$3' '$4'
+   test_expect_success "wildmatch: match '$2' '$3'" "
+   test-wildmatch wildmatch '$2' '$3'
"
elif test "$1" = 0
then
-   test_expect_success "wildmatch: no match '$3' '$4'" "
-   ! test-wildmatch wildmatch '$3' '$4'
+   test_expect_success "wildmatch: no match '$2' '$3'" "
+   ! test-wildmatch wildmatch '$2' '$3'
"
else
test_expect_success "PANIC: Test framework error. Unknown 
matches value $1" 'false'
@@ -53,176 +53,176 @@ pathmatch() {
 }
 
 # Basic wildmat features
-match 1 1 foo foo
-match 0 0 foo bar
-match 1 1 '' ""
-match 1 1 foo '???'
-match 0 0 foo '??'
-match 1 1 foo '*'
-match 1 1 foo 'f*'
-match 0 0 foo '*f'
-match 1 1 foo '*foo*'
-match 1 1 foobar '*ob*a*r*'
-match 1 1 aaabababab '*ab'
-match 1 1 'foo*' 'foo\*'
-match 0 0 foobar 'foo\*bar'
-match 1 1 'f\oo' 'f\\oo'
-match 1 1 ball '*[al]?'
-match 0 0 ten '[ten]'
-match 0 1 ten '**[!te]'
-match 0 0 ten '**[!ten]'
-match 1 1 ten 't[a-g]n'
-match 0 0 ten 't[!a-g]n'
-match 1 1 ton 't[!a-g]n'
-match 1 1 ton 't[^a-g]n'
-match 1 x 'a]b' 'a[]]b'
-match 1 x a-b 'a[]-]b'
-match 1 x 'a]b' 'a[]-]b'
-match 0 x aab 'a[]-]b'
-match 1 x aab 'a[]a-]b'
-match 1 1 ']' ']'
+match 1 foo foo
+match 0 foo bar
+match 1 '' ""
+match 1 foo '???'
+match 0 foo '??'
+match 1 foo '*'
+match 1 foo 'f*'
+match 0 foo '*f'
+match 1 foo '*foo*'
+match 1 foobar '*ob*a*r*'
+match 1 aaabababab '*ab'
+match 1 'foo*' 'foo\*'
+match 0 foobar 'foo\*bar'
+match 1 'f\oo' 'f\\oo'
+match 1 ball '*[al]?'
+match 0 ten '[ten]'
+match 0 ten '**[!te]'
+match 0 ten '**[!ten]'
+match 1 ten 't[a-g]n'
+match 0 ten 't[!a-g]n'
+match 1 ton 't[!a-g]n'
+match 1 ton 't[^a-g]n'
+match 1 'a]b' 'a[]]b'
+match 1 a-b 'a[]-]b'
+match 1 'a]b' 'a[]-]b'
+match 0 aab 'a[]-]b'
+match 1 aab 'a[]a-]b'
+match 1 ']' ']'
 
 # Extended slash-matching features
-match 0 0 'foo/baz/bar' 'foo*bar'
-match 0 0 'foo/baz/bar' 'foo**bar'
-match 0 1 'foobazbar' 'foo**bar'
-match 1 1 'foo/baz/bar' 'foo/**/bar'
-match 1 0 'foo/baz/bar' 'foo/**/**/bar'
-match 1 0 'foo/b/a/z/bar' 'foo/**/bar'
-match 1 0 'foo/b/a/z/bar' 'foo/**/**/bar'
-match 1 0 'foo/bar' 'foo/**/bar'
-match 1 0 'foo/bar' 'foo/**/**/bar'
-match 0 0 'foo/bar' 'foo?bar'
-match 0 0 'foo/bar' 'foo[/]bar'
-match 0 0 'foo/bar' 'foo[^a-z]bar'
-match 0 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
-match 1 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
-match 1 0 'foo' '**/foo'
-match 1 x 'XXX/foo' '**/foo'
-match 1 0 'bar/baz/foo' '**/foo'
-match 0 0 'bar/baz/foo' '*/foo'
-match 0 0 'foo/bar/baz' '**/bar*'
-match 1 0 'deep/foo/bar/baz' '**/bar/*'
-match 0 0 'deep/foo/bar/baz/' '**/bar/*'
-match 1 0 'deep/foo/bar/baz/' '**/bar/**'
-match 0 0 'deep/foo/bar' '**/bar/*'
-match 1 0 'deep/foo/bar/' '**/bar/**'
-match 0 0 'foo/bar/baz' '**/bar**'
-match 1 0 'foo/bar/baz/x' '*/bar/**'
-match 0 0 'deep/foo/bar/baz/x' '*/bar/**'
-match 1 0 'deep/foo/bar/baz/x' '**/bar/*/*'
+match 0 'foo/baz/bar' 'foo*bar'
+match 0 'foo/baz/bar' 'foo**bar'
+match 0 'foobazbar' 'foo**bar'
+match 1 'foo/baz/bar' 'foo/**/bar'
+match 1 'foo/baz/bar' 'foo/**/**/bar'
+match 1 'foo/b/a/z/bar' 'foo/**/bar'
+match 1 'foo/b/a/z/bar' 'foo/**/**/bar'
+match 1 'foo/bar' 'foo/**/bar'
+match 1 'foo/bar' 'foo/**/**/bar'
+match 0 'foo/bar' 'foo?bar'
+match 0 'foo/bar' 'foo[/]bar'
+match 0 'foo/bar' 'foo[^a-z]bar'
+match 0 'foo/bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
+match 1 'foo-bar' 'f[^eiu][^eiu][^eiu][^eiu][^eiu]r'
+match 1 'foo' '**/foo'
+match 1 'XXX/foo' '**/foo'
+match 1 'bar/baz/foo' '**/foo'
+match 0 'bar/baz/foo' '*/foo'
+match 0 'foo/bar/baz' '**/bar*'
+match 1 'deep/foo/bar/baz' '**/bar/*'
+match 0 'deep/foo/bar/baz/' '**/bar/*'
+match 1 'deep/foo/bar/baz/' '**/bar/**'
+match 0 'deep/foo/bar' '**/bar/*'
+match 1 'deep/foo/bar/' '**/bar/**'
+match 0 'foo/bar/baz' '**/bar**'
+match 1 'foo/bar/baz/x' '*/bar/**'
+match 0 'deep/foo/bar/baz/x' '*/bar/**'
+match 1 'deep/foo/bar/baz/x' '**/bar/*/*'
 
 # Various additional tests
-match 0 0 'acrt' 

[PATCH v4 1/7] wildmatch test: indent with tabs, not spaces

2018-01-04 Thread Ævar Arnfjörð Bjarmason
Replace the 4-width mixed space & tab indentation in this file with
indentation with tabs as we do in most of the rest of our tests.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t3070-wildmatch.sh | 54 ++--
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index 163a14a1c2..27fa878f6e 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -5,39 +5,39 @@ test_description='wildmatch tests'
 . ./test-lib.sh
 
 match() {
-if [ $1 = 1 ]; then
-   test_expect_success "wildmatch: match '$3' '$4'" "
-   test-wildmatch wildmatch '$3' '$4'
-   "
-else
-   test_expect_success "wildmatch:  no match '$3' '$4'" "
-   ! test-wildmatch wildmatch '$3' '$4'
-   "
-fi
+   if [ $1 = 1 ]; then
+   test_expect_success "wildmatch: match '$3' '$4'" "
+   test-wildmatch wildmatch '$3' '$4'
+   "
+   else
+   test_expect_success "wildmatch:  no match '$3' '$4'" "
+   ! test-wildmatch wildmatch '$3' '$4'
+   "
+   fi
 }
 
 imatch() {
-if [ $1 = 1 ]; then
-   test_expect_success "iwildmatch:match '$2' '$3'" "
-   test-wildmatch iwildmatch '$2' '$3'
-   "
-else
-   test_expect_success "iwildmatch: no match '$2' '$3'" "
-   ! test-wildmatch iwildmatch '$2' '$3'
-   "
-fi
+   if [ $1 = 1 ]; then
+   test_expect_success "iwildmatch:match '$2' '$3'" "
+   test-wildmatch iwildmatch '$2' '$3'
+   "
+   else
+   test_expect_success "iwildmatch: no match '$2' '$3'" "
+   ! test-wildmatch iwildmatch '$2' '$3'
+   "
+   fi
 }
 
 pathmatch() {
-if [ $1 = 1 ]; then
-   test_expect_success "pathmatch: match '$2' '$3'" "
-   test-wildmatch pathmatch '$2' '$3'
-   "
-else
-   test_expect_success "pathmatch:  no match '$2' '$3'" "
-   ! test-wildmatch pathmatch '$2' '$3'
-   "
-fi
+   if [ $1 = 1 ]; then
+   test_expect_success "pathmatch: match '$2' '$3'" "
+   test-wildmatch pathmatch '$2' '$3'
+   "
+   else
+   test_expect_success "pathmatch:  no match '$2' '$3'" "
+   ! test-wildmatch pathmatch '$2' '$3'
+   "
+   fi
 }
 
 # Basic wildmat features
-- 
2.15.1.424.g9478a66081



[PATCH v4 7/7] wildmatch test: create & test files on disk in addition to in-memory

2018-01-04 Thread Ævar Arnfjörð Bjarmason
There has never been any full roundtrip testing of what git-ls-files
and other commands that use wildmatch() actually do, rather we've been
satisfied with just testing the underlying C function.

Due to git-ls-files and friends having their own codepaths before they
call wildmatch() there's sometimes differences in the behavior between
the two. Even when we test for those (as with [1]), there was no one
place where you can review how these two modes differ.

Now there is. We now attempt to create a file called $haystack and
match $needle against it for each pair of $needle and $haystack that
we were passing to test-wildmatch.

If we can't create the file we skip the test. This ensures that we can
run this on all platforms and not maintain some infinitely growing
whitelist of e.g. platforms that don't support certain characters in
filenames.

As a result of doing this we can now see the cases where these two
ways of testing wildmatch differ:

 * Creating a file called 'a[]b' and running ls-files 'a[]b' will show
   that file, but wildmatch("a[]b", "a[]b") will not match

 * wildmatch() won't match a file called \ against \, but ls-files
   will.

 * `git --glob-pathspecs ls-files 'foo**'` will match a file
   'foo/bba/arr', but wildmatch won't, however pathmatch will.

   This seems like a bug to me, the two are otherwise equivalent as
   these tests show.

This also reveals the case discussed in [1] above, where '' is now an
error as far as ls-files is concerned, but wildmatch() itself happily
accepts it.

1. 9e4e8a64c2 ("pathspec: die on empty strings as pathspec",
   2017-06-06)

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t3070-wildmatch.sh | 301 +--
 1 file changed, 290 insertions(+), 11 deletions(-)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index 06db053ae2..f606f91acb 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -4,14 +4,113 @@ test_description='wildmatch tests'
 
 . ./test-lib.sh
 
+create_test_file() {
+   file=$1
+
+   case $file in
+   # `touch .` will succeed but obviously not do what we intend
+   # here.
+   ".")
+   return 1
+   ;;
+   # We cannot create a file with an empty filename.
+   "")
+   return 1
+   ;;
+   # The tests that are testing that e.g. foo//bar is matched by
+   # foo/*/bar can't be tested on filesystems since there's no
+   # way we're getting a double slash.
+   *//*)
+   return 1
+   ;;
+   # When testing the difference between foo/bar and foo/bar/ we
+   # can't test the latter.
+   */)
+   return 1
+   ;;
+   # On Windows, \ in paths is silently converted to /, which
+   # would result in the "touch" below working, but the test
+   # itself failing. See 6fd1106aa4 ("t3700: Skip a test with
+   # backslashes in pathspec", 2009-03-13) for prior art and
+   # details.
+   *\\*)
+   if ! test_have_prereq BSLASHPSPEC
+   then
+   return 1
+   fi
+   # NOTE: The ;;& bash extension is not portable, so
+   # this test needs to be at the end of the pattern
+   # list.
+   #
+   # If we want to add more conditional returns we either
+   # need a new case statement, or turn this whole thing
+   # into a series of "if" tests.
+   ;;
+   esac
+
+   # Turn foo/bar/baz into foo/bar to create foo/bar as a
+   # directory structure.
+   dirs=${file%/*}
+
+   # We touch "./$file" instead of "$file" because even an
+   # escaped "touch -- -" means get arguments from stdin.
+   if test "$file" != "$dirs"
+   then
+   mkdir -p -- "$dirs" &&
+   touch -- "./$file" &&
+   return 0
+   else
+   touch -- "./$file" &&
+   return 0
+   fi
+   return 1
+}
+
+wildtest_file_setup() {
+   test_when_finished "
+   rm -rf -- * &&
+   git reset
+   " &&
+   git add -A &&
+   >expect.err
+}
+
+wildtest_stdout_stderr_cmp() {
+   tr -d '\0' actual &&
+   test_cmp expect.err actual.err &&
+   test_cmp expect actual
+}
+
 wildtest() {
-   match_w_glob=$1
-   match_w_globi=$2
-   match_w_pathmatch=$3
-   match_w_pathmatchi=$4
-   text=$5
-   pattern=$6
+   if test "$#" = 6
+   then
+   # When test-wildmatch and git ls-files produce the same
+   # result.
+   match_w_glob=$1
+   match_f_w_glob=$match_w_glob
+   match_w_globi=$2
+   match_f_w_globi=$match_w_globi
+   match_w_pathmatch=$3
+   match_f_w_pathmatch=$match_w_pathmatch
+   match_w_pathmatchi=$4
+   

Re: [PATCH v2 1/5] core.aheadbehind: add new config setting

2018-01-04 Thread Jeff King
On Wed, Dec 27, 2017 at 09:41:30AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I, too, had a funny feeling about calling this "core". But I didn't have
> > a better name, as I'm not sure what other place we have for config
> > options that cross many command boundaries. "diff" and "status" don't
> > seem quite right to me. While you can argue they are subsystems, it
> > seems too easy for users to confuse them with the commands of the same
> > names.
> >
> > Maybe there should be a "ui.*" config hierarchy for these kinds of
> > cross-command interface options?
> 
> I had an impression that ui.* was primarily pretty-printing,
> colouring and things of such nature.

I didn't think we had a "ui.*" so far. We have "color.ui" and
"column.ui", but I think that's it.

At any rate, my intent was to consider this a "ui" issue, in that we are
deciding how the ahead/behind hints should be shown to the user.

> I do not think it is such a
> bad idea to honor a status.frotz variable that affects how (e.g. to
> what degree of detailedness) status on frotz are reported in Git
> subcommands other than 'git status' if they report the same sort of
> information on 'frotz' that 'git status' makes.

Is ahead/behind uniquely attached to git-status? IOW, could this be called
"branch.aheadbehind" and git-status respects it? It seems like putting
it in status introduces a weird asymmetry.

I buy the argument more that "status" here is not "this is a git-status
config option", but "this config section encompasses various things
about the status of a repository reported by many commands". But then
it's kind of funny to have many of the existing options there that
really are specific to git-status.

In can be both of those things, of course, but then it becomes less
clear to the user which config options affect which command.

I dunno. It is probably not _that_ big a deal, and I can live with it
wherever. But Git has a reputation for having inconsistencies and weird
asymmetries in its UI, so I like to give some thought to squashing them
preemptively.

-Peff


Re: Fwd: Unknown option for merge-recursive: -X'diff-algorithm=patience'

2018-01-04 Thread Matwey V. Kornilov
2018-01-04 21:57 GMT+03:00 Junio C Hamano :
> "Matwey V. Kornilov"  writes:
>
>> It seems there is some issue with double escaping:
>> ...
>>> # git rebase --preserve-merges -s recursive -Xdiff-algorithm=patience
>>> --onto abc-3.8 v3.8 abc-3.9
>>>
>>> And then I see:
>>>
>>> fatal: Unknown option for merge-recursive: -X'diff-algorithm=patience'
>
> The string looks like a result of a shell script that quotes any
> end-user controlled string before incorporating into a string to be
> eval-ed, or something like that.  Is this a recent regression?  Does
> a bit older version of Git, like 2.12 or older, behave differently?
>

I've just checked, 2.12.4 has the same behaviour. By the way, can you
reproduce this at all?

-- 
With best regards,
Matwey V. Kornilov


Re: Bring together merge and rebase

2018-01-04 Thread Martin Fick
On Tuesday, December 26, 2017 12:40:26 AM Jacob Keller 
wrote:
> On Mon, Dec 25, 2017 at 10:02 PM, Carl Baldwin 
 wrote:
> >> On Mon, Dec 25, 2017 at 5:16 PM, Carl Baldwin 
 wrote:
> >> A bit of a tangent here, but a thought I didn't wanna
> >> lose: In the general case where a patch was rebased
> >> and the original parent pointer was changed, it is
> >> actually quite hard to show a diff of what changed
> >> between versions.
> 
> My biggest gripes are that the gerrit web interface
> doesn't itself do something like this (and jgit does not
> appear to be able to generate combined diffs at all!)

I believe it now does, a presentation was given at the 
Gerrit User summit in London describing this work.  It would 
indeed be great if git could do this also!

-Martin 



-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation



[RFC PATCH 1/2] add a local copy of Mail::Address from CPAN

2018-01-04 Thread Matthieu Moy
We used to have two versions of the email parsing code. Our
parse_mailboxes (in Git.pm), and Mail::Address which we used if
installed. Unfortunately, both versions have different sets of bugs, and
changing the behavior of git depending on whether Mail::Address is
installed was a bad idea.

A first attempt to solve this was cc90750 (send-email: don't use
Mail::Address, even if available, 2017-08-23), but it turns out our
parse_mailboxes is too buggy for some uses. For example the lack of
about nested comments support breaks get_maintainer.pl in the Linux
kernel tree:

  https://public-inbox.org/git/20171116154814.23785-1-alex.ben...@linaro.org/

This patch goes the other way: use Mail::Address anyway, but have a
local copy as a fallback, when the system one is not available.

The duplicated script is small (276 lines of code) and stable in time.
Maintaining the local copy should not be an issue, and will certainly be
less burden than maintaining our own parse_mailboxes.

Another option would be to consider Mail::Address as a hard dependency,
but it's easy enough to save the trouble of extra-dependency to the end
user or packager.

Signed-off-by: Matthieu Moy 
---
I looked at the perl/Git/Error.pm wrapper, and ended up writting a
different, much simpler version. I'm not sure the same approach would
apply to Error.pm, but my straightforward version does the job for
Mail/Address.pm.

I would also be fine with using our local copy unconditionaly.

 git-send-email.perl   |   3 +-
 perl/Git/FromCPAN/Mail/Address.pm | 276 ++
 perl/Git/Mail/Address.pm  |  24 
 3 files changed, 302 insertions(+), 1 deletion(-)
 create mode 100644 perl/Git/FromCPAN/Mail/Address.pm
 create mode 100755 perl/Git/Mail/Address.pm

diff --git a/git-send-email.perl b/git-send-email.perl
index 02747b6..d0dcc6d 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -30,6 +30,7 @@ use Git::Error qw(:try);
 use Cwd qw(abs_path cwd);
 use Git;
 use Git::I18N;
+use Git::Mail::Address;
 
 Getopt::Long::Configure qw/ pass_through /;
 
@@ -489,7 +490,7 @@ my ($repoauthor, $repocommitter);
 ($repocommitter) = Git::ident_person(@repo, 'committer');
 
 sub parse_address_line {
-   return Git::parse_mailboxes($_[0]);
+   return map { $_->format } Mail::Address->parse($_[0]);
 }
 
 sub split_addrs {
diff --git a/perl/Git/FromCPAN/Mail/Address.pm 
b/perl/Git/FromCPAN/Mail/Address.pm
new file mode 100644
index 000..13b2ff7
--- /dev/null
+++ b/perl/Git/FromCPAN/Mail/Address.pm
@@ -0,0 +1,276 @@
+# Copyrights 1995-2017 by [Mark Overmeer ].
+#  For other contributors see ChangeLog.
+# See the manual pages for details on the licensing terms.
+# Pod stripped from pm file by OODoc 2.02.
+package Mail::Address;
+use vars '$VERSION';
+$VERSION = '2.19';
+
+use strict;
+
+use Carp;
+
+# use locale;   removed in version 1.78, because it causes taint problems
+
+sub Version { our $VERSION }
+
+
+
+# given a comment, attempt to extract a person's name
+sub _extract_name
+{   # This function can be called as method as well
+my $self = @_ && ref $_[0] ? shift : undef;
+
+local $_ = shift
+or return '';
+
+# Using encodings, too hard. See Mail::Message::Field::Full.
+return '' if m/\=\?.*?\?\=/;
+
+# trim whitespace
+s/^\s+//;
+s/\s+$//;
+s/\s+/ /;
+
+# Disregard numeric names (e.g. 123456.1...@compuserve.com)
+return "" if /^[\d ]+$/;
+
+s/^\((.*)\)$/$1/; # remove outermost parenthesis
+s/^"(.*)"$/$1/;   # remove outer quotation marks
+s/\(.*?\)//g; # remove minimal embedded comments
+s/\\//g;  # remove all escapes
+s/^"(.*)"$/$1/;   # remove internal quotation marks
+s/^([^\s]+) ?, ?(.*)$/$2 $1/; # reverse "Last, First M." if applicable
+s/,.*//;
+
+# Change casing only when the name contains only upper or only
+# lower cased characters.
+unless( m/[A-Z]/ && m/[a-z]/ )
+{   # Set the case of the name to first char upper rest lower
+s/\b(\w+)/\L\u$1/igo;  # Upcase first letter on name
+s/\bMc(\w)/Mc\u$1/igo; # Scottish names such as 'McLeod'
+s/\bo'(\w)/O'\u$1/igo; # Irish names such as 'O'Malley, O'Reilly'
+s/\b(x*(ix)?v*(iv)?i*)\b/\U$1/igo; # Roman numerals, eg 'Level III 
Support'
+}
+
+# some cleanup
+s/\[[^\]]*\]//g;
+s/(^[\s'"]+|[\s'"]+$)//g;
+s/\s{2,}/ /g;
+
+$_;
+}
+
+sub _tokenise
+{   local $_ = join ',', @_;
+my (@words,$snippet,$field);
+
+s/\A\s+//;
+s/[\r\n]+/ /g;
+
+while ($_ ne '')
+{   $field = '';
+if(s/^\s*\(/(/ )# (...)
+{   my $depth = 0;
+
+ PAREN: while(s/^(\(([^\(\)\\]|\\.)*)//)
+{   $field .= $1;
+$depth++;
+while(s/^(([^\(\)\\]|\\.)*\)\s*)//)
+{   $field .= $1;
+last PAREN unless --$depth;
+   $field .= $1 if 

Re: [PATCH v2 2/3] prune: fix pruning with multiple worktrees and split index

2018-01-04 Thread Junio C Hamano
Thomas Gummerer  writes:

> [sorry for the late reply.  I was on Christmas holidays until today
> and am still catching up on the mailing list.  It will probably take
> me untill the weekend to send a re-roll]
>
> On 12/18, Brandon Williams wrote:
>> On 12/17, Thomas Gummerer wrote:
>> > be489d02d2 ("revision.c: --indexed-objects add objects from all
>> > worktrees", 2017-08-23) made sure that pruning takes objects from all
>> > worktrees into account.
>> > 
>> > It did that by reading the index of every worktree and adding the
>> > necessary index objects to the set of pending objects.  The index is
>> > read by read_index_from.  As mentioned in the previous commit,
>> > read_index_from depends on the CWD for the location of the split index,
>> 
>> As I mentioned before this doesn't actually depend on the CWD but
>> rather the per-worktree gitdir.
>
> Right, will fix.

Thanks.


Re: Misleading documentation for git-diff-files (diff-filter)

2018-01-04 Thread John Cheng
Thanks for the clarification! I also didn't realize that diff-files -R
will show added files. You learn something new everyday ;)

On Thu, Jan 4, 2018 at 11:09 AM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
>> index 9d1586b956..743af97b06 100644
>> --- a/Documentation/diff-options.txt
>> +++ b/Documentation/diff-options.txt
>> @@ -469,6 +469,12 @@ ifndef::git-format-patch[]
>>  +
>>  Also, these upper-case letters can be downcased to exclude.  E.g.
>>  `--diff-filter=ad` excludes added and deleted paths.
>> ++
>> +Note that not all diffs can feature all types. For instance, diffs
>> +from the index to the working tree can never have Added entries
>> +(because the set of paths included in the diff is limited by what is in
>> +the index).  Similarly, copied and renamed entries cannot appear if
>> +detection for those types is disabled.
>
> Makes sense; thanks.



-- 
---
John L Cheng


Re: Misleading documentation for git-diff-files (diff-filter)

2018-01-04 Thread Junio C Hamano
Jeff King  writes:

> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index 9d1586b956..743af97b06 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -469,6 +469,12 @@ ifndef::git-format-patch[]
>  +
>  Also, these upper-case letters can be downcased to exclude.  E.g.
>  `--diff-filter=ad` excludes added and deleted paths.
> ++
> +Note that not all diffs can feature all types. For instance, diffs
> +from the index to the working tree can never have Added entries
> +(because the set of paths included in the diff is limited by what is in
> +the index).  Similarly, copied and renamed entries cannot appear if
> +detection for those types is disabled.

Makes sense; thanks.


Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-04 Thread Junio C Hamano
Jeff King  writes:

> I get that you may have two different keys to go with two different
> identities on a remote system. But I'm not sure I understand why
> "sending" or "receiving" is the right way to split those up. Wouldn't
> you also sometimes want to fetch from repository X? IOW, wouldn't you
> want to tie identity "A" to repository "X", and "B" to repository "Y?
>
>> So now I just have a GIT_SSH_COMMAND that dispatches to different keys
>> depending on the operation, as noted in the commit message, and I can
>> assure you that without that logic it doesn't work.
>
> You mentioned host aliases later, which is the solution I've seen in the
> wild. And then you can map each remote to a different host alias.

Yup, I do agree that it is exactly the established solution for this
kind of situation.


Re: [PATCH] doc/SubmittingPatches: improve text formatting

2018-01-04 Thread Junio C Hamano
"brian m. carlson"  writes:

> On Tue, Jan 02, 2018 at 10:33:50AM -0500, Todd Zullinger wrote:
>> 049e64aa50 ("Documentation: convert SubmittingPatches to AsciiDoc",
>> 2017-11-12) changed the `git blame` and `git shortlog` examples given in
>> the section on sending your patches.
>> 
>> In order to italicize the `$path` argument the commands are enclosed in
>> plus characters as opposed to backticks.  The difference between the
>> quoting methods is that backtick enclosed text is not subject to further
>> expansion.  This formatting makes reading SubmittingPatches in a git
>> clone a little more difficult.  In addition to the underscores around
>> `$path` the `--` chars in `git shortlog --no-merges` must be replaced
>> with `{litdd}`.
>> 
>> Use backticks to quote these commands.  The italicized `$path` is lost
>> from the html version but the commands can be read (and copied) more
>> easily by users reading the text version.  These readers are more likely
>> to use the commands while submitting patches.  Make it easier for them.
>
> I think this change is fine.  I don't have a strong opinion either way
> and if others think the change makes the plain text more readable, I'm
> all for it.

I too would prefer to move towards plain text that reads easier,
especially if it does not make the resulting typeset version
unreadable; thanks, both.


Re: Fwd: Unknown option for merge-recursive: -X'diff-algorithm=patience'

2018-01-04 Thread Junio C Hamano
"Matwey V. Kornilov"  writes:

> It seems there is some issue with double escaping:
> ...
>> # git rebase --preserve-merges -s recursive -Xdiff-algorithm=patience
>> --onto abc-3.8 v3.8 abc-3.9
>> 
>> And then I see:
>> 
>> fatal: Unknown option for merge-recursive: -X'diff-algorithm=patience'

The string looks like a result of a shell script that quotes any
end-user controlled string before incorporating into a string to be
eval-ed, or something like that.  Is this a recent regression?  Does
a bit older version of Git, like 2.12 or older, behave differently?



Re: [PATCH] perf: amend the grep tests to test grep.threads

2018-01-04 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> Ever since 5b594f457a ("Threaded grep", 2010-01-25) the number of
> threads git-grep uses under PTHREADS has been hardcoded to 8, but
> there's no performance test to check whether this is an optimal
> setting.
>
> Amend the existing tests for the grep engines to support a mode where
> this can be tested, e.g.:
>
> GIT_PERF_GREP_THREADS='1 8 16' GIT_PERF_LARGE_REPO=~/g/linux ./run p782*
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>
> This looks less scary under diff -w.
>
>  t/perf/p7820-grep-engines.sh   | 52 ---
>  t/perf/p7821-grep-engines-fixed.sh | 55 
> ++
>  2 files changed, 86 insertions(+), 21 deletions(-)
>
> diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh
> index 62aba19e76..8b09c5bf32 100755
> --- a/t/perf/p7820-grep-engines.sh
> +++ b/t/perf/p7820-grep-engines.sh
> @@ -12,6 +12,9 @@ e.g. GIT_PERF_7820_GREP_OPTS=' -i'. Some options to try:
> ...
> + if ! test_have_prereq PERF_GREP_ENGINES_THREADS
>   then
> - test_cmp out.basic out.perl
> + test_perf $prereq "$engine grep$GIT_PERF_7820_GREP_OPTS 
> '$pattern'" "
> + git -c grep.patternType=$engine 
> grep$GIT_PERF_7820_GREP_OPTS -- '$pattern' >'out.$engine' || :
> + "
> + else
> + for threads in $GIT_PERF_GREP_THREADS
> + do
> + test_perf PTHREADS,$prereq "$engine 
> grep$GIT_PERF_7820_GREP_OPTS '$pattern' with $threads threads" "

Is it guaranteed that $prereq is not empty at this point?  

Judging by the way the other side uses "test_perf $prereq ..."
without quotes around it, I suspect you do expect it to be empty in
some cases.  It means you expect test_have_prereq is prepared to
skip an empty prerequisite in a prereq list, but I do not recall
writing that helper in such a way, so...

PTHREADS${prereq:+,}$prereq

or something along that line, perhaps?

> diff --git a/t/perf/p7821-grep-engines-fixed.sh 
> b/t/perf/p7821-grep-engines-fixed.sh
> index c7ef1e198f..61e41b82cf 100755
> --- a/t/perf/p7821-grep-engines-fixed.sh
> +++ b/t/perf/p7821-grep-engines-fixed.sh
> @@ -6,6 +6,9 @@ Set GIT_PERF_7821_GREP_OPTS in the environment to pass 
> options to
> ...
>  for pattern in 'int' 'uncommon' 'æ'
>  do
>   for engine in fixed basic extended perl
> @@ -23,19 +31,44 @@ do
> ...
> + if ! test_have_prereq PERF_GREP_ENGINES_THREADS
>   then
> - test_cmp out.fixed out.perl
> + test_perf $prereq "$engine grep$GIT_PERF_7821_GREP_OPTS 
> $pattern" "
> + git -c grep.patternType=$engine 
> grep$GIT_PERF_7821_GREP_OPTS $pattern >'out.$engine' || :
> + "
> + else
> + for threads in $GIT_PERF_GREP_THREADS
> + do
> + test_perf PTHREADS,$prereq "$engine 
> grep$GIT_PERF_7821_GREP_OPTS $pattern with $threads threads" "
> + git -c grep.patternType=$engine -c 
> grep.threads=$threads grep$GIT_PERF_7821_GREP_OPTS $pattern 
> >'out.$engine.$threads' || :
> + "
> + done

Same here, which means these two scripts share somewhat large body
of text and makes me wonder if it is worth refactoring it to ease
future updates to them.

Thanks.


Re: [PATCH v2 2/2] Windows: stop supplying BLK_SHA1=YesPlease by default

2018-01-04 Thread Ævar Arnfjörð Bjarmason

On Thu, Jan 04 2018, Johannes Schindelin jotted:

> Hi,
>
> On Thu, 28 Dec 2017, Ævar Arnfjörð Bjarmason wrote:
>
>> Using BLK_SHA1 in lieu of the OpenSSL routines was done in
>> 9bccfcdbff ("Windows: use BLK_SHA1 again", 2009-10-22), since DC_SHA1
>> is now the default for git in general it makes sense for Windows to
>> use that too, this looks like something that was missed back in
>> e6b07da278 ("Makefile: make DC_SHA1 the default", 2017-03-17).
>>
>> As noted in 2cfc70f0de ("mingw: use OpenSSL's SHA-1 routines",
>> 2017-02-09) OpenSSL has a performance benefit compared to BLK_SHA1 on
>> MinGW, so perhaps that and the Windows default should be changed
>> around again. That's a topic for another series, it seems clear that
>> this specific flag is nobody's explicit intention.
>>
>> Reviewed-by: Jonathan Nieder 
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  config.mak.uname | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/config.mak.uname b/config.mak.uname
>> index 685a80d138..6a862abd35 100644
>> --- a/config.mak.uname
>> +++ b/config.mak.uname
>> @@ -361,7 +361,6 @@ ifeq ($(uname_S),Windows)
>>  NO_REGEX = YesPlease
>>  NO_GETTEXT = YesPlease
>>  NO_PYTHON = YesPlease
>> -BLK_SHA1 = YesPlease
>>  ETAGS_TARGET = ETAGS
>>  NO_INET_PTON = YesPlease
>>  NO_INET_NTOP = YesPlease
>
> This patch is actually identical to 8756c75cd10 (msvc: use OpenSSL's SHA-1
> routines, 2016-10-12) in Git for Windows' master.
>
> I did plan to submit this, but it is part of a bigger effort to get Git to
> build in current versions of Visual Studio again.
>
> Before this work, the MSVC build could only use an ancient OpenSSL version
> from pre-built binaries hosted on repo.or.cz, and those are unlikely to
> get the performance benefits that you seek.
>
> So I would like to ask to skip this patch for now, and take Jeff
> Hostetler's patch as part of the MSVC patches later, once they have been
> matured in Git for Windows?

Sounds good to me. Junio, could you please drop this one and just take
1/2?


Re: [PATCH] git-archive: accept --owner and --group like GNU tar

2018-01-04 Thread Junio C Hamano
René Scharfe  writes:

> I don't know if it's a good idea, but perhaps we don't even need a new
> option.  We could change how pathspecs of untracked files are handled:
> Instead of aborting we could include them in the archive.  (Sounds like
> the simplest possible interface, but may have practical problems.)

One practical problem is that users who do this

$ git archive HEAD Documentation/ | tar tf -

would be expecting (at least) two different things, depending on the
situation they are in.

So at least you'd need an "--include-untracked" option, I guess.


Re: [PATCH 14/40] sha1_file: prepare for external odbs

2018-01-04 Thread Jeff Hostetler



On 1/3/2018 11:33 AM, Christian Couder wrote:

In the following commits we will need some functions that were
internal to sha1_file.c, so let's first make them non static
and declare them in "cache.h". While at it, let's rename
'create_tmpfile()' to 'create_object_tmpfile()' to make its
name less generic.

Let's also split out 'sha1_file_name_alt()' from
'sha1_file_name()' and 'open_sha1_file_alt()' from
'open_sha1_file()', as we will need both of these new
functions too.

[...]

diff --git a/sha1_file.c b/sha1_file.c
index 261baf800f..785e8dda03 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -322,17 +322,22 @@ static void fill_sha1_path(struct strbuf *buf, const 
unsigned char *sha1)
}
  }
  
-const char *sha1_file_name(const unsigned char *sha1)

+const char *sha1_file_name_alt(const char *objdir, const unsigned char *sha1)
  {
static struct strbuf buf = STRBUF_INIT;


While we are refactoring sha1_file_name() and adding
sha1_file_name_alt(), could we also change the API and
pass in the strbuf so we can get rid of the static buffer?
Granted, it is a little off topic, but it will help out
in the long run.

[...]

@@ -1551,7 +1562,7 @@ static inline int directory_size(const char *filename)
   * We want to avoid cross-directory filename renames, because those
   * can have problems on various filesystems (FAT, NFS, Coda).
   */
-static int create_tmpfile(struct strbuf *tmp, const char *filename)
+int create_object_tmpfile(struct strbuf *tmp, const char *filename)
  {
int fd, dirlen = directory_size(filename);
  
@@ -1591,7 +1602,7 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,

static struct strbuf tmp_file = STRBUF_INIT;


Same thing here, since we are renaming the function anyway, could we
add a strbuf arg and get rid of the static one?



const char *filename = sha1_file_name(sha1);
  
-	fd = create_tmpfile(_file, filename);

+   fd = create_object_tmpfile(_file, filename);
if (fd < 0) {
if (errno == EACCES)
return error("insufficient permission for adding an object 
to repository database %s", get_object_directory());



Jeff


Re: [PATCH 10/40] external-odb: implement external_odb_get_direct

2018-01-04 Thread Jeff Hostetler



On 1/3/2018 11:33 AM, Christian Couder wrote:

This is implemented only in the promisor remote mode
for now by calling fetch_object().

Signed-off-by: Christian Couder 
---
  external-odb.c | 15 +++
  external-odb.h |  1 +
  odb-helper.c   | 13 +
  odb-helper.h   |  3 ++-
  4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/external-odb.c b/external-odb.c
index d26e63d8b1..5d0afb9762 100644
--- a/external-odb.c
+++ b/external-odb.c
@@ -76,3 +76,18 @@ int external_odb_has_object(const unsigned char *sha1)
return 1;
return 0;
  }
+
+int external_odb_get_direct(const unsigned char *sha1)
+{
+   struct odb_helper *o;
+
+   external_odb_init();
+
+   for (o = helpers; o; o = o->next) {
+   if (odb_helper_get_direct(o, sha1) < 0)
+   continue;
+   return 0;

> +  }

Would this be simpler said as:
for (o = ...)
if (!odb_helper_get_direct(...))
return 0;



+
+   return -1;
+}
diff --git a/external-odb.h b/external-odb.h
index 9a3c2f01b3..fd6708163e 100644
--- a/external-odb.h
+++ b/external-odb.h
@@ -4,5 +4,6 @@
  extern int has_external_odb(void);
  extern const char *external_odb_root(void);
  extern int external_odb_has_object(const unsigned char *sha1);
+extern int external_odb_get_direct(const unsigned char *sha1);
  
  #endif /* EXTERNAL_ODB_H */

diff --git a/odb-helper.c b/odb-helper.c
index 1404393807..4b70b287af 100644
--- a/odb-helper.c
+++ b/odb-helper.c
@@ -4,6 +4,7 @@
  #include "odb-helper.h"
  #include "run-command.h"
  #include "sha1-lookup.h"
+#include "fetch-object.h"
  
  struct odb_helper *odb_helper_new(const char *name, int namelen)

  {
@@ -52,3 +53,15 @@ int odb_helper_has_object(struct odb_helper *o, const 
unsigned char *sha1)
return !!odb_helper_lookup(o, sha1);
  }
  
+int odb_helper_get_direct(struct odb_helper *o,

+ const unsigned char *sha1)
+{
+   int res = 0;
+   uint64_t start = getnanotime();
+
+   fetch_object(o->dealer, sha1);
+
+   trace_performance_since(start, "odb_helper_get_direct");
+
+   return res;


'res' will always be 0, so the external_odb_get_direct() will
only do the first helper.  i haven't looked at the rest of the
series yet, so maybe you've already addressed this.

Also, I put a TODO comment in the fetch_object() header to
consider returning an error/success, so maybe that could help
here too.


+}
diff --git a/odb-helper.h b/odb-helper.h
index 9395e606ce..f4bc66b0ef 100644
--- a/odb-helper.h
+++ b/odb-helper.h
@@ -20,5 +20,6 @@ struct odb_helper {
  extern struct odb_helper *odb_helper_new(const char *name, int namelen);
  extern int odb_helper_has_object(struct odb_helper *o,
 const unsigned char *sha1);
-
+extern int odb_helper_get_direct(struct odb_helper *o,
+const unsigned char *sha1);
  #endif /* ODB_HELPER_H */



Re: [PATCH 2/2] Windows: stop supplying BLK_SHA1=YesPlease by default

2018-01-04 Thread Johannes Schindelin
Hi,

On Thu, 28 Dec 2017, Ævar Arnfjörð Bjarmason wrote:

> On Wed, Dec 27 2017, Jonathan Nieder jotted:
> 
> > +git-for-windows
> > Ævar Arnfjörð Bjarmason wrote:
> >
> >> Using BLK_SHA1 in lieu of the OpenSSL routines was done in [1], but
> >> since DC_SHA1 is now the default for git in general it makes sense for
> >> Windows to use that too, this looks like something that was missed
> >> back in [2].
> >>
> >> As noted in [3] OpenSSL has a performance benefit compared to BLK_SHA1
> >> on MinGW, so perhaps that and the Windows default should be changed
> >> around again, but that's a topic for another series, it seems clear
> >> that this specific flag is nobody's explicit intention.
> >
> > I have some memory of performance issues on Windows when DC_SHA1 was
> > introduced leading to interest in a mixed configuration with DC_SHA1
> > only being used where it is security sensitive (e.g. for object naming
> > but not for packfile trailers).
> >
> > Did anything come of that?
> 
> This was Johannes Schindelin (CC'd) on-list when the sha1dc discussion
> first came up earlier this year. I.e. it's slower, so we could use
> openssl on trusted data and sha1dc on untrusted data, but nothing came
> of that.

The performance degradation is noticeable, as far as I can see. And no, we
have not yet worked on the hyprid SHA-1-DC solution, as we expect bigger
benefits from trying to avoid unnecessary SHA-1 calculation to begin with
(read: we will try to catch bigger fish first).

All of this may also become moot if we ever get off the ground with
SHA-256 (or whatever we will use then).

Ciao,
Dscho

Re: [PATCH v2 2/2] Windows: stop supplying BLK_SHA1=YesPlease by default

2018-01-04 Thread Johannes Schindelin
Hi,

On Thu, 28 Dec 2017, Ævar Arnfjörð Bjarmason wrote:

> Using BLK_SHA1 in lieu of the OpenSSL routines was done in
> 9bccfcdbff ("Windows: use BLK_SHA1 again", 2009-10-22), since DC_SHA1
> is now the default for git in general it makes sense for Windows to
> use that too, this looks like something that was missed back in
> e6b07da278 ("Makefile: make DC_SHA1 the default", 2017-03-17).
> 
> As noted in 2cfc70f0de ("mingw: use OpenSSL's SHA-1 routines",
> 2017-02-09) OpenSSL has a performance benefit compared to BLK_SHA1 on
> MinGW, so perhaps that and the Windows default should be changed
> around again. That's a topic for another series, it seems clear that
> this specific flag is nobody's explicit intention.
> 
> Reviewed-by: Jonathan Nieder 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  config.mak.uname | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/config.mak.uname b/config.mak.uname
> index 685a80d138..6a862abd35 100644
> --- a/config.mak.uname
> +++ b/config.mak.uname
> @@ -361,7 +361,6 @@ ifeq ($(uname_S),Windows)
>   NO_REGEX = YesPlease
>   NO_GETTEXT = YesPlease
>   NO_PYTHON = YesPlease
> - BLK_SHA1 = YesPlease
>   ETAGS_TARGET = ETAGS
>   NO_INET_PTON = YesPlease
>   NO_INET_NTOP = YesPlease

This patch is actually identical to 8756c75cd10 (msvc: use OpenSSL's SHA-1
routines, 2016-10-12) in Git for Windows' master.

I did plan to submit this, but it is part of a bigger effort to get Git to
build in current versions of Visual Studio again.

Before this work, the MSVC build could only use an ancient OpenSSL version
from pre-built binaries hosted on repo.or.cz, and those are unlikely to
get the performance benefits that you seek.

So I would like to ask to skip this patch for now, and take Jeff
Hostetler's patch as part of the MSVC patches later, once they have been
matured in Git for Windows?

Thanks,
Dscho

[PATCH] utf8.c: mark utf{16,32}_{be,le}_bom symbols static

2018-01-04 Thread Ramsay Jones

Signed-off-by: Ramsay Jones 
---

Hi Lars,

If you need to re-roll your 'ls/checkout-encoding' branch, could you
please squash this into the relevant patch (commit d5318db2d0,
"utf8: add function to detect prohibited UTF-16/32 BOM", 2017-12-31).

[yes, noticed by sparse].

Thanks!

ATB,
Ramsay Jones

 utf8.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/utf8.c b/utf8.c
index 1978d6c42..f033fec1c 100644
--- a/utf8.c
+++ b/utf8.c
@@ -544,10 +544,10 @@ static int has_bom_prefix(const char *data, size_t len,
return (len >= bom_len) && !memcmp(data, bom, bom_len);
 }
 
-const char utf16_be_bom[] = {0xFE, 0xFF};
-const char utf16_le_bom[] = {0xFF, 0xFE};
-const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
-const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
+static const char utf16_be_bom[] = {0xFE, 0xFF};
+static const char utf16_le_bom[] = {0xFF, 0xFE};
+static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
+static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
 
 int has_prohibited_utf_bom(const char *enc, const char *data, size_t len)
 {
-- 
2.15.0


Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-04 Thread Ævar Arnfjörð Bjarmason

On Thu, Jan 04 2018, Jeff King jotted:

> On Thu, Jan 04, 2018 at 11:10:17AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> That's badly explained, sorry, when I say "push" I mean "push and/or
>> pull".
>>
>> I don't know about Github, but on Gitlab when you provision a deploy key
>> and associate it with a repo it must be *globally* rw or ro, there's no
>> way to on a per-repo basis say it should be rw ro.
>>
>> I have a job that's fetching a bunch of repos to review code in them
>> (for auditing purposes). It then commits the results of that review to
>> other git repos.
>>
>> Thus I want to have a ro key to all those reviewed repos, but rw keys to
>> the audit repo itself (and it'll also pull with the rw key).
>
> OK, that part makes sense to me.
>
> But I'm not sure how your patch solves it. When you "git fetch" on the
> audit repo, wouldn't your GIT_SSH_RECEIVE_COMMAND kick in and use the
> wrong key? What am I missing?

I add both the ro and rw key to some projects. Those are a tiny subset
of the overall number.


Re: [PATCH] git-archive: accept --owner and --group like GNU tar

2018-01-04 Thread René Scharfe
Am 04.01.2018 um 03:25 schrieb suzuki toshiya:
> Taking a glance on parse-options.h, I could not find the
> existing class collecting the operands as an array (or
> linked list) from multiple "--xxx=yyy" options. Similar
> things might be the collecting the pathnames to pathspec
> structure. Should I write something with OPTION_CALLBACK?

There is OPT_STRING_LIST; Documentation/technical/api-parse-options.txt
says:

  `OPT_STRING_LIST(short, long,  string_list, arg_str, description)`::
  Introduce an option with string argument.
  The string argument is stored as an element in `string_list`.
  Use of `--no-option` will clear the list of preceding values.

I don't know if it's a good idea, but perhaps we don't even need a new
option.  We could change how pathspecs of untracked files are handled:
Instead of aborting we could include them in the archive.  (Sounds like
the simplest possible interface, but may have practical problems.)

René


Re: Misleading documentation for git-diff-files (diff-filter)

2018-01-04 Thread Jeff King
On Thu, Jan 04, 2018 at 07:53:53AM -0800, John Cheng wrote:

> To be clear, I don't mean to imply that diff-files should include
> files that are not the index. I was trying to say that as a user, the
> documentation gave me a different impression.
> 
> For background, my intent was to have a script to look for local git
> repos that with unstaged changes. After some trial and error, I found
> that git-ls-files gave me what I needed. However, I wanted to point
> out why I initially believed git-diff-files with show "added files".

I took your original mail to be a suggestion that the documentation
could be more clear. :)

So maybe something like the patch below?

It actually would be an interesting feature for a diff from the index to
the working tree to include untracked files as new entries. I don't
think we'd want to do that by default (since the actions a user needs to
take are quite different between a modified file, one marked
intent-to-add, and an untracked one). But I could see it being useful
for a caller to want to consider the whole tree as a single diff.

-- >8 --
Subject: [PATCH] docs/diff-options: clarify scope of diff-filter types

The same document for "--diff-filter" is included by many
programs in the diff family. Because it mentions all
possible types (added, removed, etc), this may imply to the
reader that all types can be generated by a particular
command. But this isn't necessarily the case; "diff-files"
cannot generally produce an "Added" entry, since the diff is
limited to what is already in the index.

Let's make it clear that the list here is the full one, and
does not imply anything about what a particular invocation
may produce.

Note that conditionally including items (e.g., omitting
"Added" in the git-diff-files manpage) isn't the right
solution here for two reasons:

  - The problem isn't diff-files, but doing an index to
working tree diff. "git diff" can do the same diff, but
also has other modes where "Added" does show up.

  - The direction of the diff matters. Doing "diff-files -R"
can get you Added entries (but not Deleted ones).

So it's best just to explain that the set of available types
depends on the specific diff invocation.

Reported-by: John Cheng 
Signed-off-by: Jeff King 
---
 Documentation/diff-options.txt | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 9d1586b956..743af97b06 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -469,6 +469,12 @@ ifndef::git-format-patch[]
 +
 Also, these upper-case letters can be downcased to exclude.  E.g.
 `--diff-filter=ad` excludes added and deleted paths.
++
+Note that not all diffs can feature all types. For instance, diffs
+from the index to the working tree can never have Added entries
+(because the set of paths included in the diff is limited by what is in
+the index).  Similarly, copied and renamed entries cannot appear if
+detection for those types is disabled.
 
 -S::
Look for differences that change the number of occurrences of
-- 
2.16.0.rc0.391.gbd38408415



Re: Misleading documentation for git-diff-files (diff-filter)

2018-01-04 Thread John Cheng
To be clear, I don't mean to imply that diff-files should include
files that are not the index. I was trying to say that as a user, the
documentation gave me a different impression.

For background, my intent was to have a script to look for local git
repos that with unstaged changes. After some trial and error, I found
that git-ls-files gave me what I needed. However, I wanted to point
out why I initially believed git-diff-files with show "added files".
Think of this more as user feedback.

On Wed, Jan 3, 2018 at 3:53 PM, Junio C Hamano  wrote:
> John Cheng  writes:
>
>> I wanted to know if git diff-files shows files that are not in the
>> index but are in the working tree.
>
> At least in the original design of Git, that would fundamentally be
> impossible, as Git _only_ cares about paths that are in the index,
> so a new file won't be in the picture until it is added.  Because a
> change is shown as "A"dded by the diff family of commands only when
> the old side lacks a path that appears in the new side, there is no
> way "diff-files" that compares the index and the working tree would
> see a path that is missing from the old (i.e. the index) side.



-- 
---
John L Cheng


Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-04 Thread Jeff King
On Thu, Jan 04, 2018 at 11:10:17AM +0100, Ævar Arnfjörð Bjarmason wrote:

> That's badly explained, sorry, when I say "push" I mean "push and/or
> pull".
> 
> I don't know about Github, but on Gitlab when you provision a deploy key
> and associate it with a repo it must be *globally* rw or ro, there's no
> way to on a per-repo basis say it should be rw ro.
> 
> I have a job that's fetching a bunch of repos to review code in them
> (for auditing purposes). It then commits the results of that review to
> other git repos.
> 
> Thus I want to have a ro key to all those reviewed repos, but rw keys to
> the audit repo itself (and it'll also pull with the rw key).

OK, that part makes sense to me.

But I'm not sure how your patch solves it. When you "git fetch" on the
audit repo, wouldn't your GIT_SSH_RECEIVE_COMMAND kick in and use the
wrong key? What am I missing?

-Peff


Re: Bring together merge and rebase

2018-01-04 Thread Johannes Schindelin
Hi,

On Sun, 24 Dec 2017, Alexei Lozovsky wrote:

> On Dec 24, 2017, at 01:01, Johannes Schindelin wrote:
> > 
> > Hi Carl,
> > 
> > On Sat, 23 Dec 2017, Carl Baldwin wrote:
> > 
> >> I imagine that a "git commit --amend" would also insert a "replaces"
> >> reference to the original commit but I failed to mention that in my
> >> original post.
> > 
> > And cherry-pick, too, of course.
> 
> Why would it?

Because that's the command you use if you perform an interactive rebase
"manually". Or if you need to split a topic branch into two.

Ciao,
Johannes


Hello

2018-01-04 Thread Mr. X
My name is Heather ten Broeke from Atlanta Georgia i have a project for you 
contact my private email :heatherbrooeke1...@gmail.com for details.


Re: Fwd: Unknown option for merge-recursive: -X'diff-algorithm=patience'

2018-01-04 Thread Matwey V. Kornilov

Hi,

It seems there is some issue with double escaping:

14:57:18.524010 git.c:344   trace: built-in: git 'merge'
'--no-log' '--no-ff' '--strategy=recursive' '-X' ''\''diff-algorithm=p
atience'\''' '-m' 'Merge branch '\''core-rcu-for-linus'\'' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Where did additional quotes around diff-algorithm came from?


01.01.2018 14:28, Matwey V. Kornilov пишет:
> Hello,
> 
> I am running git 2.15.1 and facing the following issue with linux kernel
> tree.
> 
> # git checkout v3.8
> # git branch abc-3.8
> # git checkout v3.9
> # git branch abc-3.9
> # git checkout abc-3.8
> 
> Introduce new commit on top of abc-3.8. Here, I edit README.
> 
> # vim README
> # git commit -a -v
> [abc-3.8 4bf088b5d341] Hello world
> 
> Then I try to rebase abc-3.9 on top of abc-3.8 as the following:
> 
> # git rebase --preserve-merges -s recursive -Xdiff-algorithm=patience
> --onto abc-3.8 v3.8 abc-3.9
> 
> And then I see:
> 
> fatal: Unknown option for merge-recursive: -X'diff-algorithm=patience'
> Error redoing merge e84cf5d0fd53badf3a93c790e280cc92a69ed999
> 
> Attached here is GIT_TRACE=1 output.
> 




Re: [PATCH v2 7/7] wildmatch test: create & test files on disk in addition to in-memory

2018-01-04 Thread Adam Dinwoodie
On Wednesday 03 January 2018 at 08:14 pm +0100, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Jan 03 2018, Adam Dinwoodie jotted:
> 
> > On Wednesday 03 January 2018 at 02:31 pm +0100, Ævar Arnfjörð Bjarmason 
> > wrote:
> >> Does the fixup above in <878tdm8k2d@evledraar.gmail.com> work for
> >> you, i.e. changing $10 in the script to ${10}?
> >
> > This fixes some but not all of the failures: I'm now down from 42 to 24
> > failures.
> >
> > Updated verbose test output is at
> > https://gist.github.com/me-and/04443bcb00e12436f0eacce079b56d02
> 
> Thanks lot, looking through our own commit logs I believe the rest
> should be fixed by this (prior art in 6fd1106aa4), it would be great if
> you could test it, I don't have access to a Windows machine:
> 
> diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
> index f985139b6f..5838fcb77d 100755
> --- a/t/t3070-wildmatch.sh
> +++ b/t/t3070-wildmatch.sh
> @@ -23,6 +23,15 @@ create_test_file() {
> *//*)
> return 1
> ;;
> +   # On Windows, \ in paths is silently converted to /, which
> +   # would result in the "touch" below working, but the test
> +   # itself failing.
> +   *\\*)
> +   if ! test_have_prereq BSLASHPSPEC
> +   then
> +   return 1
> +   fi
> +   ;;
> # When testing the difference between foo/bar and foo/bar/ we
> # can't test the latter.
> */)

Confirmed this fixes all the outstanding test failures.  Thank you!


Re: [RFC/PATCH] connect: add GIT_SSH_{SEND,RECEIVE}{,_COMMAND} env variables

2018-01-04 Thread Ævar Arnfjörð Bjarmason

On Thu, Jan 04 2018, Jeff King jotted:

> On Thu, Jan 04, 2018 at 01:08:28AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> Hopefully this is clearer, and depending on how the rest of the
>> discussion goes I'll submit v2 with something like this in the commit
>> message:
>>
>> SSH keys A and B are known to the remote service, and used to identify
>> two different users.
>>
>> A can only push to repository X, and B can only fetch from repository Y.
>>
>> Thus, if you have a script that does:
>>
>> GIT_SSH_COMMAND="ssh -i A -i B" git ...
>>
>> It'll always fail for pulling from X, and pushing to Y. Supply:
>>
>> GIT_SSH_COMMAND="ssh -i B -i A" git ...
>>
>> And now pulling will work, but pushing won't.
>
> I get that you may have two different keys to go with two different
> identities on a remote system. But I'm not sure I understand why
> "sending" or "receiving" is the right way to split those up. Wouldn't
> you also sometimes want to fetch from repository X? IOW, wouldn't you
> want to tie identity "A" to repository "X", and "B" to repository "Y?

That's badly explained, sorry, when I say "push" I mean "push and/or
pull".

I don't know about Github, but on Gitlab when you provision a deploy key
and associate it with a repo it must be *globally* rw or ro, there's no
way to on a per-repo basis say it should be rw ro.

I have a job that's fetching a bunch of repos to review code in them
(for auditing purposes). It then commits the results of that review to
other git repos.

Thus I want to have a ro key to all those reviewed repos, but rw keys to
the audit repo itself (and it'll also pull with the rw key).

Hence this patch, I thought *maybe* others would be interested in this
since it seems to me to be an easy thing to run into with these ssh-key
based hosting providers, but maybe not.

>> So now I just have a GIT_SSH_COMMAND that dispatches to different keys
>> depending on the operation, as noted in the commit message, and I can
>> assure you that without that logic it doesn't work.
>
> You mentioned host aliases later, which is the solution I've seen in the
> wild. And then you can map each remote to a different host alias.