Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-06 Thread Kirill Smelkov
On Wed, Feb 05, 2014 at 02:58:36PM -0800, Junio C Hamano wrote:
> Kirill Smelkov  writes:
> 
> > On Wed, Feb 05, 2014 at 11:42:41AM -0800, Junio C Hamano wrote:
> >> Kirill Smelkov  writes:
> >> 
> >> > I agree object data should be immutable for good. The only thing I'm 
> >> > talking
> >> > about here is mode, which is parsed from a tree buffer and is stored in
> >> > separate field:
> >> 
> >> Ah, I do not see any problem in that case, then.
> >> 
> >> Thanks.
> >
> > Thanks, that simplifies things for me.

Below is a patch which does it. Please apply, if it is ok.


> Surely.
> 
> Be careful when traversing N-trees in parallel---you may have to
> watch out for the entry ordering rules that sorts the following
> paths in the order shown:
> 
>   a
>   a-b
> a/b
> a_b
> 
> Inside a single tree, you cannot have 'a' and 'a/b' at the same
> time, but one tree may have 'a' (without 'a/b') while another one
> may have 'a/b' (without 'a'), and walking them in parallel has
> historically been one source of funny bugs.

Thanks for the warning. I'm relying here on base_name_compare() and
ordering it imposes on entries and how it differentiates directories and
files, so let's hope this should be ok.

Actual reworking is still in flux though...

Thanks,
Kirill


 8< 
From: Kirill Smelkov 
Date: Thu, 6 Feb 2014 15:36:31 +0400
Subject: [PATCH] Finally switch over tree descriptors to contain a pre-parsed 
entry

This continues 4651ece8 (Switch over tree descriptors to contain a
pre-parsed entry) and moves the only rest computational part

mode = canon_mode(mode)

from tree_entry_extract() to tree entry decode phase - to
decode_tree_entry().

The reason to do it, is that canon_mode() is at least 2 conditional
jumps for regular files, and that could be noticeable should canon_mode()
be invoked several times.

That does not matter for current Git codebase, where typical tree
traversal is

while (t->size) {
sha1 = tree_entry_extract(t, &path, &mode);
...
update_tree_entry(t);
}

i.e. we do t -> sha1,path.mode "extraction" only once per entry. In such
cases, it does not matter performance-wise, where that mode
canonicalization is done - either once in tree_entry_extract(), or once
in decode_tree_entry() called by update_tree_entry() - it is
approximately the same.

But for future code, which could need to work with several tree_desc's
in parallel, it could be handy to operate on tree_desc descriptors, and
do "extracts" only when needed, or at all, access only relevant part of
it through structure fields directly.

And for such situations, having canon_mode() be done once in decode
phase is better - we won't need to pay the performance price of 2 extra
conditional jumps on every t->mode access.

So let's move mode canonicalization to decode_tree_entry(). That was the
final bit. Now after tree entry is decoded, it is fully ready and could
be accessed either directly via field, or through tree_entry_extract()
which this time got really "totally trivial".

Signed-off-by: Kirill Smelkov 
---
 tree-walk.c | 2 +-
 tree-walk.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index 79dba1d..4dc86c7 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -37,7 +37,7 @@ static void decode_tree_entry(struct tree_desc *desc, const 
char *buf, unsigned
 
/* Initialize the descriptor entry */
desc->entry.path = path;
-   desc->entry.mode = mode;
+   desc->entry.mode = canon_mode(mode);
desc->entry.sha1 = (const unsigned char *)(path + len);
 }
 
diff --git a/tree-walk.h b/tree-walk.h
index ae04b64..ae7fb3a 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -16,7 +16,7 @@ struct tree_desc {
 static inline const unsigned char *tree_entry_extract(struct tree_desc *desc, 
const char **pathp, unsigned int *modep)
 {
*pathp = desc->entry.path;
-   *modep = canon_mode(desc->entry.mode);
+   *modep = desc->entry.mode;
return desc->entry.sha1;
 }
 
-- 
1.9.rc1.181.g641f458

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-05 Thread Junio C Hamano
Kirill Smelkov  writes:

> On Wed, Feb 05, 2014 at 11:42:41AM -0800, Junio C Hamano wrote:
>> Kirill Smelkov  writes:
>> 
>> > I agree object data should be immutable for good. The only thing I'm 
>> > talking
>> > about here is mode, which is parsed from a tree buffer and is stored in
>> > separate field:
>> 
>> Ah, I do not see any problem in that case, then.
>> 
>> Thanks.
>
> Thanks, that simplifies things for me.

Surely.

Be careful when traversing N-trees in parallel---you may have to
watch out for the entry ordering rules that sorts the following
paths in the order shown:

a
a-b
a/b
a_b

Inside a single tree, you cannot have 'a' and 'a/b' at the same
time, but one tree may have 'a' (without 'a/b') while another one
may have 'a/b' (without 'a'), and walking them in parallel has
historically been one source of funny bugs.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-05 Thread Kirill Smelkov
On Wed, Feb 05, 2014 at 11:42:41AM -0800, Junio C Hamano wrote:
> Kirill Smelkov  writes:
> 
> > I agree object data should be immutable for good. The only thing I'm talking
> > about here is mode, which is parsed from a tree buffer and is stored in
> > separate field:
> 
> Ah, I do not see any problem in that case, then.
> 
> Thanks.

Thanks, that simplifies things for me.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-05 Thread Junio C Hamano
Kirill Smelkov  writes:

> I agree object data should be immutable for good. The only thing I'm talking
> about here is mode, which is parsed from a tree buffer and is stored in
> separate field:

Ah, I do not see any problem in that case, then.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-05 Thread Kirill Smelkov
On Wed, Feb 05, 2014 at 09:36:55AM -0800, Junio C Hamano wrote:
> Kirill Smelkov  writes:
> 
> > Only, before I clean things up, I'd like to ask - would the following
> > patch be accepted
> >
> >  8< ---
> > diff --git a/tree-walk.c b/tree-walk.c
> > index 79dba1d..4dc86c7 100644
> > --- a/tree-walk.c
> > +++ b/tree-walk.c
> > @@ -37,7 +37,7 @@ static void decode_tree_entry(struct tree_desc *desc, 
> > const char *buf, unsigned
> >  
> > /* Initialize the descriptor entry */
> > desc->entry.path = path;
> > -   desc->entry.mode = mode;
> > +   desc->entry.mode = canon_mode(mode);
> > desc->entry.sha1 = (const unsigned char *)(path + len);
> >  }
> >  
> > diff --git a/tree-walk.h b/tree-walk.h
> > index ae04b64..ae7fb3a 100644
> > --- a/tree-walk.h
> > +++ b/tree-walk.h
> > @@ -16,7 +16,7 @@ struct tree_desc {
> >  static inline const unsigned char *tree_entry_extract(struct tree_desc 
> > *desc, const char **pathp, unsigned int *modep)
> >  {
> > *pathp = desc->entry.path;
> > -   *modep = canon_mode(desc->entry.mode);
> > +   *modep = desc->entry.mode;
> > return desc->entry.sha1;
> >  }
> >  8< ---
> >  
> > ?
> 
> Doesn't desc point into and walks over the data we read from the
> tree object directly?
> 
> We try to keep (tree|commit)->buffer intact and that is done pretty
> deliberately.  While you are walking a tree or parsing a commit,
> somebody else, perhaps called indirectly by a helper function you
> call, may call lookup_object() for the same object, get the copy
> that has already been read and start using it.  This kind of change
> will introduce bugs that are hard to debug unless it is done very
> carefully (e.g. starting from making tree.buffer into a const pointer
> and propagating constness throughout the system), which might not be
> worth the pain.

I agree object data should be immutable for good. The only thing I'm talking
about here is mode, which is parsed from a tree buffer and is stored in
separate field:

 8<  tree-walk.h
struct name_entry {
const unsigned char *sha1;
const char *path;
unsigned int mode;  <---
};

struct tree_desc {
const void *buffer;
struct name_entry entry;
unsigned int size;
};

 8<  tree-walk.c
const char *get_mode(const char *str, unsigned int *modep)
{ ... } /* parses symbolic mode from tree buffer to uint */

void decode_tree_entry(struct tree_desc *desc, const char *buf, 
unsigned long size)
{
const char *path;
unsigned int mode, len;

...
path = get_mode(buf, &mode);

/* Initialize the descriptor entry */
desc->entry.path = path;
desc->entry.mode = mode;<---
desc->entry.sha1 = (const unsigned char *)(path + len);


so we are not talking about modifying object buffers here. All I'm
asking is about canonicalizing _already_ parsed and copied mode on the
fly.

Does that change anything?


Sorry, if I'm maybe misunderstanding something, and thanks,
Kirill
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-05 Thread Junio C Hamano
Kirill Smelkov  writes:

> Only, before I clean things up, I'd like to ask - would the following
> patch be accepted
>
>  8< ---
> diff --git a/tree-walk.c b/tree-walk.c
> index 79dba1d..4dc86c7 100644
> --- a/tree-walk.c
> +++ b/tree-walk.c
> @@ -37,7 +37,7 @@ static void decode_tree_entry(struct tree_desc *desc, const 
> char *buf, unsigned
>  
>   /* Initialize the descriptor entry */
>   desc->entry.path = path;
> - desc->entry.mode = mode;
> + desc->entry.mode = canon_mode(mode);
>   desc->entry.sha1 = (const unsigned char *)(path + len);
>  }
>  
> diff --git a/tree-walk.h b/tree-walk.h
> index ae04b64..ae7fb3a 100644
> --- a/tree-walk.h
> +++ b/tree-walk.h
> @@ -16,7 +16,7 @@ struct tree_desc {
>  static inline const unsigned char *tree_entry_extract(struct tree_desc 
> *desc, const char **pathp, unsigned int *modep)
>  {
>   *pathp = desc->entry.path;
> - *modep = canon_mode(desc->entry.mode);
> + *modep = desc->entry.mode;
>   return desc->entry.sha1;
>  }
>  8< ---
>  
> ?

Doesn't desc point into and walks over the data we read from the
tree object directly?

We try to keep (tree|commit)->buffer intact and that is done pretty
deliberately.  While you are walking a tree or parsing a commit,
somebody else, perhaps called indirectly by a helper function you
call, may call lookup_object() for the same object, get the copy
that has already been read and start using it.  This kind of change
will introduce bugs that are hard to debug unless it is done very
carefully (e.g. starting from making tree.buffer into a const pointer
and propagating constness throughout the system), which might not be
worth the pain.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-05 Thread Kirill Smelkov
On Tue, Feb 04, 2014 at 10:37:24AM -0800, Junio C Hamano wrote:
> Kirill Smelkov  writes:
> 
> >> if we did not want to reinvent the whole tree walking thing, which
> >> would add risks for new bugs and burden to maintain this and the
> >> usual two-tree diff tree walking in sync.
> >
> > Junio, I understand it is not good to duplicate code and increase
> > maintenance risks
> > Instead I propose we switch to the new tree walker completely.
> 
> I am not fundamentally opposed to the idea. We'll see what happens.

Thanks. Locally, with draft patches with generalized tree-walker, for
`git log --raw --no-abbrev --no-renames` I was able to get to as fast as
with old two-tree walker for navy.git and 1% faster for linux.git and
all tests pass.

Only, before I clean things up, I'd like to ask - would the following
patch be accepted

 8< ---
diff --git a/tree-walk.c b/tree-walk.c
index 79dba1d..4dc86c7 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -37,7 +37,7 @@ static void decode_tree_entry(struct tree_desc *desc, const 
char *buf, unsigned
 
/* Initialize the descriptor entry */
desc->entry.path = path;
-   desc->entry.mode = mode;
+   desc->entry.mode = canon_mode(mode);
desc->entry.sha1 = (const unsigned char *)(path + len);
 }
 
diff --git a/tree-walk.h b/tree-walk.h
index ae04b64..ae7fb3a 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -16,7 +16,7 @@ struct tree_desc {
 static inline const unsigned char *tree_entry_extract(struct tree_desc *desc, 
const char **pathp, unsigned int *modep)
 {
*pathp = desc->entry.path;
-   *modep = canon_mode(desc->entry.mode);
+   *modep = desc->entry.mode;
return desc->entry.sha1;
 }
 8< ---
 
?

The reason I ask is because it is more handy to work with tree_desc
structures, compared to splitting it to sha1/path/mode/pathlen, and
if canon_mode() is done only once, clients don't have to pay the
performance price of canon_mode on each tree_desc "extract".

I understand there is fsck, which on invalid tree entry prints the mode,
but maybe somehow fsck should be special, and common interface be tuned
to usual clients?

Thanks,
Kirill
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-04 Thread Junio C Hamano
Kirill Smelkov  writes:

>> if we did not want to reinvent the whole tree walking thing, which
>> would add risks for new bugs and burden to maintain this and the
>> usual two-tree diff tree walking in sync.
>
> Junio, I understand it is not good to duplicate code and increase
> maintenance risks
> Instead I propose we switch to the new tree walker completely.

I am not fundamentally opposed to the idea. We'll see what happens.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-04 Thread Kirill Smelkov
On Mon, Feb 03, 2014 at 03:39:02PM -0800, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
> > Kirill Smelkov  writes:
> >
> >> As was recently shown (c839f1bd "combine-diff: optimize
> >> combine_diff_path sets intersection"), combine-diff runs very slowly. In
> >> that commit we optimized paths sets intersection, but that accounted
> >> only for ~ 25% of the slowness, and as my tracing showed, for linux.git
> >> v3.10..v3.11, for merges a lot of time is spent computing
> >> diff(commit,commit^2) just to only then intersect that huge diff to
> >> almost small set of files from diff(commit,commit^1).
> >>
> >> That's because at present, to compute combine-diff, for first finding
> >> paths, that "every parent touches", we use the following combine-diff
> >> property/definition:
> >>
> >> D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn)  (w.r.t. paths)
> >>
> >> where
> >>
> >> D(A,P1...Pn) is combined diff between commit A, and parents Pi
> >>
> >> and
> >>
> >> D(A,Pi) is usual two-tree diff Pi..A
> >
> > and A ^ B means what???
> 
> Silly me; obviously it is the "set intersection" operator.
> 
> We probably could instead use the "current" set of paths as literal
> pathspec to compute subsequent paths, i.e.
> 
>   D(A,Pi,PS) is two tree diff P1..A limited to paths PS
> 
>   D(A,P1...Pn) = D(A,P1,[]) ^
>  D(A,P2,D(A,P1,[])) ^
>...
>  D(A,Pn,D(A,P1...Pn-1))
> 
> if we did not want to reinvent the whole tree walking thing, which
> would add risks for new bugs and burden to maintain this and the
> usual two-tree diff tree walking in sync.

Junio, I understand it is not good to duplicate code and increase
maintenance risks. On the other hand, I don't quite like the approach
with keeping current paths - it could work and be faster compared to
what we had, but to me it is still suboptimal, because if the first diff
D(A,P1) is huge then oops. Also, to implement it rationally, without
delving into unneeded recursion, we would need to do the diffing without
recursion, intersect results, and then recurse into overlapping subtrees,
which means tree-walker rework anyway.


Instead I propose we switch to the new tree walker completely. With the
attached patch which draftly shows it (diff_tree is gone, the work is
done through diff_tree_combined_paths, and then the result is
"exported" to diff_filepairs) all the tests pass. Also, timings for

git log --raw --no-abbrev --no-renames

( without -c - it would be the same - we are not touching that code, it
  would only add, irrelevant-to-the-change constant time )

are

linux.git   v3.10..v3.11became 0.1% _faster_
navy.gitbecame 1.4% slower


That means, that the new tree-walker is correct and should be ok
performance-wise (I had not optimized it at all, yet).

What would you say?

Thanks,
Kirill

P.S. Thanks for commenting on other patches. I'll try to correct them
tomorrow, as I have no more time today and need to run.


 8< 
From: Kirill Smelkov 
Date: Tue, 4 Feb 2014 20:11:16 +0400
Subject: [PATCH] X Show new tree-walker could be used instead of the old one

All tests pass. Timings for git log --raw --no-abrev --no-renames

linux.git   v3.10..v3.11became 0.1% _faster_
navy.gitbecame 1.4% slower
---
 diff.h  |  2 ++
 line-log.c  | 12 +++--
 revision.c  | 16 ++-
 tree-diff.c | 88 +++--
 4 files changed, 90 insertions(+), 28 deletions(-)

diff --git a/diff.h b/diff.h
index 5496560..0a9845a 100644
--- a/diff.h
+++ b/diff.h
@@ -189,8 +189,10 @@ const char *diff_line_prefix(struct diff_options *);
 
 extern const char mime_boundary_leader[];
 
+#if 0
 extern int diff_tree(struct tree_desc *t1, struct tree_desc *t2,
 const char *base, struct diff_options *opt);
+#endif
 extern int diff_tree_sha1(const unsigned char *old, const unsigned char *new,
  const char *base, struct diff_options *opt);
 extern int diff_root_tree_sha1(const unsigned char *new, const char *base,
diff --git a/line-log.c b/line-log.c
index 717638b..5739bbf 100644
--- a/line-log.c
+++ b/line-log.c
@@ -766,6 +766,7 @@ void line_log_init(struct rev_info *rev, const char 
*prefix, struct string_list
}
 }
 
+#if 0
 static void load_tree_desc(struct tree_desc *desc, void **tree,
   const unsigned char *sha1)
 {
@@ -775,6 +776,7 @@ static void load_tree_desc(struct tree_desc *desc, void 
**tree,
die("Unable to read tree (%s)", sha1_to_hex(sha1));
init_tree_desc(desc, *tree, size);
 }
+#endif
 
 static int count_parents(struct commit *commit)
 {
@@ -843,17 +845,23 @@ static void queue_diffs(struct line_log_data *range,
struct commit *commit, struct commit *parent)
 {
void *tree1 = NULL, *tree2 = NULL;
-   struct tree_desc desc1, desc2;
+// st

Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-03 Thread Junio C Hamano
Kirill Smelkov  writes:

> + parents_sha1 = xmalloc(nparent * sizeof(parents_sha1[0]));
> + for (i = 0; i < nparent; i++)
> + parents_sha1[i] = parents->sha1[i];
> +
> + /* fake list head, so worker can assume it is non-NULL */
> + struct combine_diff_path paths_head;

decl-after-statement; please fix.

> + paths_head.next = NULL;
> +
> + struct strbuf base;

likewise.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-03 Thread Junio C Hamano
Junio C Hamano  writes:

> Kirill Smelkov  writes:
>
>> As was recently shown (c839f1bd "combine-diff: optimize
>> combine_diff_path sets intersection"), combine-diff runs very slowly. In
>> that commit we optimized paths sets intersection, but that accounted
>> only for ~ 25% of the slowness, and as my tracing showed, for linux.git
>> v3.10..v3.11, for merges a lot of time is spent computing
>> diff(commit,commit^2) just to only then intersect that huge diff to
>> almost small set of files from diff(commit,commit^1).
>>
>> That's because at present, to compute combine-diff, for first finding
>> paths, that "every parent touches", we use the following combine-diff
>> property/definition:
>>
>> D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn)  (w.r.t. paths)
>>
>> where
>>
>> D(A,P1...Pn) is combined diff between commit A, and parents Pi
>>
>> and
>>
>> D(A,Pi) is usual two-tree diff Pi..A
>
> and A ^ B means what???

Silly me; obviously it is the "set intersection" operator.

We probably could instead use the "current" set of paths as literal
pathspec to compute subsequent paths, i.e.

D(A,Pi,PS) is two tree diff P1..A limited to paths PS

D(A,P1...Pn) = D(A,P1,[]) ^
   D(A,P2,D(A,P1,[])) ^
   ...
   D(A,Pn,D(A,P1...Pn-1))

if we did not want to reinvent the whole tree walking thing, which
would add risks for new bugs and burden to maintain this and the
usual two-tree diff tree walking in sync.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-03 Thread Junio C Hamano
Kirill Smelkov  writes:

> As was recently shown (c839f1bd "combine-diff: optimize
> combine_diff_path sets intersection"), combine-diff runs very slowly. In
> that commit we optimized paths sets intersection, but that accounted
> only for ~ 25% of the slowness, and as my tracing showed, for linux.git
> v3.10..v3.11, for merges a lot of time is spent computing
> diff(commit,commit^2) just to only then intersect that huge diff to
> almost small set of files from diff(commit,commit^1).
>
> That's because at present, to compute combine-diff, for first finding
> paths, that "every parent touches", we use the following combine-diff
> property/definition:
>
> D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn)  (w.r.t. paths)
>
> where
>
> D(A,P1...Pn) is combined diff between commit A, and parents Pi
>
> and
>
> D(A,Pi) is usual two-tree diff Pi..A

and A ^ B means what???

I do like the approach of walking the tree entries and stop as
shallowly as possible without recursing.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/8] combine-diff: Fast changed-to-all-parents paths scanning

2014-02-03 Thread Kirill Smelkov
As was recently shown (c839f1bd "combine-diff: optimize
combine_diff_path sets intersection"), combine-diff runs very slowly. In
that commit we optimized paths sets intersection, but that accounted
only for ~ 25% of the slowness, and as my tracing showed, for linux.git
v3.10..v3.11, for merges a lot of time is spent computing
diff(commit,commit^2) just to only then intersect that huge diff to
almost small set of files from diff(commit,commit^1).

That's because at present, to compute combine-diff, for first finding
paths, that "every parent touches", we use the following combine-diff
property/definition:

D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn)  (w.r.t. paths)

where

D(A,P1...Pn) is combined diff between commit A, and parents Pi

and

D(A,Pi) is usual two-tree diff Pi..A

So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n
1-parent diffs and intersecting results will be slow.

And usually, for linux.git and other topic-based workflows, that
D(A,P2) is huge, because, if merge-base of A and P2, is several dozens
of merges (from A, via first parent) below, that D(A,P2) will be diffing
sum of merges from several subsystems to 1 subsystem.

The solution is to avoid computing n 1-parent diffs, and to find
changed-to-all-parents paths via scanning A's and all Pi's trees
simultaneously, at each step comparing their entries, and based on that
comparison, populate paths result, and deduce we could *skip*
*recursing* into subdirectories, if at least for 1 parent, sha1 of that
dir tree is the same as in A. That would save us from doing significant
amount of needless work.

Such approach is very similar to what diff_tree() does, only there we
deal with scanning only 2 trees simultaneously, and for n+1 tree, the
logic is a bit more complex:

D(A,X1...Xn) calculation scheme
---

D(A,X1...Xn) = D(A,X1) ^ ... ^ D(A,Xn)   (regarding resulting paths set)

 D(A,Xj) - diff between A..Xj
 D(A,X1...Xn)- combined diff from A to parents X1,...,Xn

We start from all trees, which are sorted, and compare their entries in
lock-step:

  A X1   Xn
  - --
 |a|   |x1| |xn|
 |-|   |--| ... |--|  i = argmin(x1...xn)
 | |   |  | |  |
 |-|   |--| |--|
 |.|   |. | |. |
  . ..
  . ..

at any time there could be 3 cases:

 1)  a < xi;
 2)  a > xi;
 3)  a = xi.

Schematic deduction of what every case means, and what to do, follows:

1)  a < xi  ->  ∀j a ∉ Xj  ->  "+a" ∈ D(A,Xj)  ->  D += "+a";  a↓

2)  a > xi

2.1) ∃j: xj > xi  ->  "-xi" ∉ D(A,Xj)  ->  D += ø;  ∀ xk=xi  xk↓
2.2) ∀j  xj = xi  ->  xj ∉ A  ->  "-xj" ∈ D(A,Xj)  ->  D += "-xi";  ∀j 
xj↓

3)  a = xi

3.1) ∃j: xj > xi  ->  "+a" ∈ D(A,Xj)  ->  only xk=xi remains to 
investigate
3.2) xj = xi  ->  investigate δ(a,xj)
 |
 |
 v

3.1+3.2) looking at δ(a,xk) ∀k: xk=xi - if all != ø  ->

  ⎧δ(a,xk)  - if xk=xi
 ->  D += ⎨
  ⎩"+a" - if xk>xi

in any case a↓  ∀ xk=xi  xk↓

~

For comparison, here is how diff_tree() works:

D(A,B) calculation scheme
-

A B
- -
   |a|   |b|a < b   ->  a ∉ B   ->   D(A,B) +=  +aa↓
   |-|   |-|a > b   ->  b ∉ A   ->   D(A,B) +=  -bb↓
   | |   | |a = b   ->  investigate δ(a,b)a↓ b↓
   |-|   |-|
   |.|   |.|
. .
. .

For now there is a limitation however - for simple scan to work, we have
to know in advance, a user had not asked for finding rename/copies
detection. At present, if he/she has, we fallback to calling old n
1-parent diffs code. I think that restriction, at least in theory,
could be lifted. I propose we have fast code for at least common case,
and move on incrementally.

Another similar limitation, is that if diffcore transforms which
filter-out paths, are active, we also fallback to old code. The reason
here is pure "technical" - all transforms expect to find paths in
diff_filepair's queue, which applies only to 1-parent case. If needed,
The solution here would be to generalize diff_filepair to
combine_diff_path everywhere, and do not differentiate between plain and
combined diffs (all diffs will be combined with diff(A,B) being
combined diff with only 1 parent), but again, let's move on
incrementally.

I consciously placed trees-scanning code in tree-diff.c, foreseeing diff
and combine-diff merge, and because that code is very similar to
diff_tree - to keep similar codes near.

Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log")
and with `-c` ("git log -c") and with `-c --merges` ("git log -c --merges")
before and after the patch are as follows:

linux.git