Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-28 Thread Johannes Sixt
Am 3/27/2014 19:48, schrieb Junio C Hamano:
 From: Kirill Smelkov k...@mns.spb.ru
 Date: Mon, 24 Feb 2014 20:21:46 +0400
 ...
 
 By the way, in general I do not appreciate people lying on the Date:
 with an in-body header in their patches, either in the original or
 in rerolls.

format-patch is not very cooperative in this aspect. When I prepare a
patch series with format-patch, I find myself editing out the Date: line
from all patches it produces again and again. :-(

-- Hannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-28 Thread Johannes Sixt
Am 28.03.2014 18:06, schrieb Junio C Hamano:
 Johannes Sixt j.s...@viscovery.net writes:
 
 Am 3/27/2014 19:48, schrieb Junio C Hamano:
 From: Kirill Smelkov k...@mns.spb.ru
 Date: Mon, 24 Feb 2014 20:21:46 +0400
 ...

 By the way, in general I do not appreciate people lying on the Date:
 with an in-body header in their patches, either in the original or
 in rerolls.

 format-patch is not very cooperative in this aspect. When I prepare a
 patch series with format-patch, I find myself editing out the Date: line
 from all patches it produces again and again. :-(
 
 I am not sure what you mean.  If you are pasting the format-patch
 output into an editor your MUA is using to receive the body of the
 message from you, you would remove all the non-body lines, not just
 Date: but Subject: and From:, no?

Correct. So I should add that my gripe is about when I want to send a
patch series with git-send-email that was prepared with git-format-patch.

-- Hannes

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-28 Thread Johannes Sixt
Am 28.03.2014 19:36, schrieb Junio C Hamano:
 Johannes Sixt j...@kdbg.org writes:
 
 Am 28.03.2014 18:06, schrieb Junio C Hamano:
 Johannes Sixt j.s...@viscovery.net writes:

 Am 3/27/2014 19:48, schrieb Junio C Hamano:
 From: Kirill Smelkov k...@mns.spb.ru
 Date: Mon, 24 Feb 2014 20:21:46 +0400
 ...

 By the way, in general I do not appreciate people lying on the Date:
 with an in-body header in their patches, either in the original or
 in rerolls.

 format-patch is not very cooperative in this aspect. When I prepare a
 patch series with format-patch, I find myself editing out the Date: line
 from all patches it produces again and again. :-(

 I am not sure what you mean.  If you are pasting the format-patch
 output into an editor your MUA is using to receive the body of the
 message from you, you would remove all the non-body lines, not just
 Date: but Subject: and From:, no?

 Correct. So I should add that my gripe is about when I want to send a
 patch series with git-send-email that was prepared with git-format-patch.
 
 Hmph.  Don't you get fresh timestamps for your messages in such a
 case, ignoring whatever is at the beginning of the input files?
 
 My reading of git-send-email is:
 
  * $time = time - scalar $#files prepares the initial timestamp,
so that running two git send-email back to back will give
timestamps to the series sent out by the first invocation that
are older than the ones the second series will get;
 
  * sub send_message calls format_2822_time($time++) to send the
first message with that initial timestamp, incrementing the
timestamps by 1 second intervals (without having to actually wait
1 second in between messages) for each patch.

Ah, nice! I didn't know that. I never dared to leave an old author date
(or any date) in the patches, and assumed that it would be kept and
disrupt the email time line.

Thanks for the hint, and sorry for the noise.

-- Hannes

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-28 Thread Junio C Hamano
Johannes Sixt j...@kdbg.org writes:

 My reading of git-send-email is:
 
  * $time = time - scalar $#files prepares the initial timestamp,
so that running two git send-email back to back will give
timestamps to the series sent out by the first invocation that
are older than the ones the second series will get;

A completely irrelevant tangent, but I was being an idiot here.  The
-scaler #$files is not about two send-email running back to back.
A second invocation that sends out a long series will start its
timestamp #$files in the past, that will overlap with the timestamp
of the last one in the first invocation.  And that is not what the
code attempts to address.  It wants to merely avoid timestamps from
the future.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-27 Thread Kirill Smelkov
On Wed, Mar 26, 2014 at 02:34:24PM -0700, Junio C Hamano wrote:
 Kirill Smelkov k...@navytux.spb.ru writes:
 
  On Tue, Mar 25, 2014 at 10:46:32AM -0700, Junio C Hamano wrote:
  Kirill Smelkov k...@navytux.spb.ru writes:
  
   What are the downsides of __ prefix by the way?
  
  Aren't these names reserved for compiler/runtime implementations?
 
  Yes, but there are precedents when people don't obey it widely and
  in practice everything works :)
 
 I think you are alluding to the practice in the Linux kernel, but
 their requirement is vastly different---their product do not even
 link with libc and they always compile with specific selected
 versions of gcc, no?

Yes, that is correct. Only __ was so visually appealing that there was
a temptation to break the rules, but...


  Let it be something portable anyway -
  how about diff_tree_sha1_low() ?
 
 Sure.
 
 As this is a file-scope static, I do not think the exact naming
 matters that much.  Just FYI, we seem to use ll_ prefix (standing
 for low-level) in some places.

... let's then use this ll_ prefix scheme for consistency.

Corrected patch is below, and I've sent corrections to follow-up
patches as well.

Thanks,
Kirill

(please keep author email)
 8 
From: Kirill Smelkov k...@mns.spb.ru
Date: Mon, 24 Feb 2014 20:21:46 +0400
Subject: [PATCH v3a] tree-diff: rework diff_tree interface to be sha1 based

In the next commit this will allow to reduce intermediate calls, when
recursing into subtrees - at that stage we know only subtree sha1, and
it is natural for tree walker to start from that phase. For now we do

diff_tree
show_path
diff_tree_sha1
diff_tree
...

and the change will allow to reduce it to

diff_tree
show_path
diff_tree

Also, it will allow to omit allocating strbuf for each subtree, and just
reuse the common strbuf via playing with its len.

The above-mentioned improvements go in the next 2 patches.

The downside is that try_to_follow_renames(), if active, we cause
re-reading of 2 initial trees, which was negligible based on my timings,
and which is outweighed cogently by the upsides.

NOTE To keep with the current interface and semantics, I needed to
rename the function from diff_tree() to diff_tree_sha1(). As
diff_tree_sha1() was already used, and the function we are talking here
is its more low-level helper, let's use convention for prefixing
such helpers with ll_. So the final renaming is

diff_tree() - ll_diff_tree_sha1()

Signed-off-by: Kirill Smelkov k...@mns.spb.ru
---

Changes since v3:

 - further rename diff_tree_sha1_low() - ll_diff_tree_sha1() to follow Git
   style for naming low-level helpers.

Changes since v2:

 - renamed __diff_tree_sha1() - diff_tree_sha1_low() as the former
   overlaps with reserved-for-implementation identifiers namespace.

Changes since v1:

 - don't need to touch diff.h, as diff_tree() became static.


 tree-diff.c | 60 
 1 file changed, 28 insertions(+), 32 deletions(-)

diff --git a/tree-diff.c b/tree-diff.c
index f137f39..1d02e43 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -141,12 +141,17 @@ static void skip_uninteresting(struct tree_desc *t, 
struct strbuf *base,
}
 }
 
-static int diff_tree(struct tree_desc *t1, struct tree_desc *t2,
-const char *base_str, struct diff_options *opt)
+static int ll_diff_tree_sha1(const unsigned char *old, const unsigned char 
*new,
+const char *base_str, struct diff_options *opt)
 {
+   struct tree_desc t1, t2;
+   void *t1tree, *t2tree;
struct strbuf base;
int baselen = strlen(base_str);
 
+   t1tree = fill_tree_descriptor(t1, old);
+   t2tree = fill_tree_descriptor(t2, new);
+
/* Enable recursion indefinitely */
opt-pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE);
 
@@ -159,39 +164,41 @@ static int diff_tree(struct tree_desc *t1, struct 
tree_desc *t2,
if (diff_can_quit_early(opt))
break;
if (opt-pathspec.nr) {
-   skip_uninteresting(t1, base, opt);
-   skip_uninteresting(t2, base, opt);
+   skip_uninteresting(t1, base, opt);
+   skip_uninteresting(t2, base, opt);
}
-   if (!t1-size  !t2-size)
+   if (!t1.size  !t2.size)
break;
 
-   cmp = tree_entry_pathcmp(t1, t2);
+   cmp = tree_entry_pathcmp(t1, t2);
 
/* t1 = t2 */
if (cmp == 0) {
if (DIFF_OPT_TST(opt, FIND_COPIES_HARDER) ||
-   hashcmp(t1-entry.sha1, t2-entry.sha1) ||
-   (t1-entry.mode != t2-entry.mode))
-   show_path(base, opt, t1, t2);
+   hashcmp(t1.entry.sha1, 

Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-27 Thread Junio C Hamano
Kirill Smelkov k...@navytux.spb.ru writes:

 (please keep author email)
  8 
 From: Kirill Smelkov k...@mns.spb.ru
 Date: Mon, 24 Feb 2014 20:21:46 +0400
 Subject: [PATCH v3a] tree-diff: rework diff_tree interface to be sha1 based

git am -c will discard everything above the scissors and then
start parsing the in-body headers from there, so the above From:
will be used.

But you have a few entries in .mailmap; do you want to update them
as well?

By the way, in general I do not appreciate people lying on the Date:
with an in-body header in their patches, either in the original or
in rerolls.

Thanks.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-27 Thread Kirill Smelkov
+stefanbeller

On Thu, Mar 27, 2014 at 11:48:11AM -0700, Junio C Hamano wrote:
 Kirill Smelkov k...@navytux.spb.ru writes:
 
  (please keep author email)
   8 
  From: Kirill Smelkov k...@mns.spb.ru
  Date: Mon, 24 Feb 2014 20:21:46 +0400
  Subject: [PATCH v3a] tree-diff: rework diff_tree interface to be sha1 based
 
 git am -c will discard everything above the scissors and then
 start parsing the in-body headers from there, so the above From:
 will be used.

Thanks.

 But you have a few entries in .mailmap; do you want to update them
 as well?

When Stefan Beller was contacting me on emails, if I recall correctly, I
told him all those kirr@... entries are mine, but the one this patch is
authored with indicates that something was done at work, and I'd prefer to
acknowledge that. So maybe

 8 
From: Kirill Smelkov k...@navytux.spb.ru
Date: Thu, 27 Mar 2014 23:32:14 +0400
Subject: [PATCH] .mailmap: Separate Kirill Smelkov personal and work addresses

The address k...@mns.spb.ru indicates that a patch was done at work and
I'd like to acknowledge that.

The address k...@navytux.spb.ru is my personal email and indicates that
a contribution is done completely on my own time and resources.

k...@landau.phys.spbu.ru is old university account which no longer works
(sigh, to much spam because of me on the server) and maps to
k...@navytux.spb.ru which should be considered as primary.

Signed-off-by: Kirill Smelkov k...@navytux.spb.ru
---
 .mailmap | 1 -
 1 file changed, 1 deletion(-)

diff --git a/.mailmap b/.mailmap
index 11057cb..0be5e02 100644
--- a/.mailmap
+++ b/.mailmap
@@ -117,7 +117,6 @@ Keith Cascio ke...@cs.ucla.edu ke...@cs.ucla.edu
 Kent Engstrom k...@lysator.liu.se
 Kevin Leung kevin...@gmail.com
 Kirill Smelkov k...@navytux.spb.ru k...@landau.phys.spbu.ru
-Kirill Smelkov k...@navytux.spb.ru k...@mns.spb.ru
 Knut Franke knut.fra...@gmx.de k.fra...@science-computing.de
 Lars Doelle lars.doelle@on-line ! de
 Lars Doelle lars.doe...@on-line.de
-- 
1.9.rc0.143.g6fd479e
 8 

On the other hand, it is still all me, and the main address (navytux) is
indicated correctly, so I dunno...

 By the way, in general I do not appreciate people lying on the Date:
 with an in-body header in their patches, either in the original or
 in rerolls.
 
 Thanks.

I see. Somehow it is pity that the date of original work is lost via
this approach, as now we are only changing cosmetics etc, and the bulk
of the work was done earlier.

Anyway, we can drop the date, but please keep the email, as it is used
for the acknowledgment.

Thanks,
Kirill
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-26 Thread Kirill Smelkov
On Tue, Mar 25, 2014 at 10:46:32AM -0700, Junio C Hamano wrote:
 Kirill Smelkov k...@navytux.spb.ru writes:
 
  What are the downsides of __ prefix by the way?
 
 Aren't these names reserved for compiler/runtime implementations?

Yes, but there are precedents when people don't obey it widely and
in practice everything works :) Let it be something portable anyway -
how about diff_tree_sha1_low() ?

So corrected patch is below. If such suffixing will be accepted, I will
send follow-up patches corrected similiary.

  ( or please pull them from
git://repo.or.cz/git/kirr.git y6/tree-diff-walk-multitree )

Thanks,
Kirill

 8 
From: Kirill Smelkov k...@mns.spb.ru
Date: Mon, 24 Feb 2014 20:21:46 +0400
Subject: [PATCH v3] tree-diff: rework diff_tree interface to be sha1 based

In the next commit this will allow to reduce intermediate calls, when
recursing into subtrees - at that stage we know only subtree sha1, and
it is natural for tree walker to start from that phase. For now we do

diff_tree
show_path
diff_tree_sha1
diff_tree
...

and the change will allow to reduce it to

diff_tree
show_path
diff_tree

Also, it will allow to omit allocating strbuf for each subtree, and just
reuse the common strbuf via playing with its len.

The above-mentioned improvements go in the next 2 patches.

The downside is that try_to_follow_renames(), if active, we cause
re-reading of 2 initial trees, which was negligible based on my timings,
and which is outweighed cogently by the upsides.

NOTE To keep with the current interface and semantics, I needed to
rename the function from diff_tree() to diff_tree_sha1(). As
diff_tree_sha1() was already used, and the function we are talking here
is its more low-level helper, let's use convention for suffixing
such helpers with _low. So the final renaming is

diff_tree() - diff_tree_sha1_low()

Signed-off-by: Kirill Smelkov k...@mns.spb.ru
---

Changes since v2:

 - renamed __diff_tree_sha1() - diff_tree_sha1_low() as the former
   overlaps with reserved-for-implementation identifiers namespace.


 tree-diff.c | 60 
 1 file changed, 28 insertions(+), 32 deletions(-)

diff --git a/tree-diff.c b/tree-diff.c
index f137f39..0277c5c 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -141,12 +141,17 @@ static void skip_uninteresting(struct tree_desc *t, 
struct strbuf *base,
}
 }
 
-static int diff_tree(struct tree_desc *t1, struct tree_desc *t2,
-const char *base_str, struct diff_options *opt)
+static int diff_tree_sha1_low(const unsigned char *old, const unsigned char 
*new,
+ const char *base_str, struct diff_options *opt)
 {
+   struct tree_desc t1, t2;
+   void *t1tree, *t2tree;
struct strbuf base;
int baselen = strlen(base_str);
 
+   t1tree = fill_tree_descriptor(t1, old);
+   t2tree = fill_tree_descriptor(t2, new);
+
/* Enable recursion indefinitely */
opt-pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE);
 
@@ -159,39 +164,41 @@ static int diff_tree(struct tree_desc *t1, struct 
tree_desc *t2,
if (diff_can_quit_early(opt))
break;
if (opt-pathspec.nr) {
-   skip_uninteresting(t1, base, opt);
-   skip_uninteresting(t2, base, opt);
+   skip_uninteresting(t1, base, opt);
+   skip_uninteresting(t2, base, opt);
}
-   if (!t1-size  !t2-size)
+   if (!t1.size  !t2.size)
break;
 
-   cmp = tree_entry_pathcmp(t1, t2);
+   cmp = tree_entry_pathcmp(t1, t2);
 
/* t1 = t2 */
if (cmp == 0) {
if (DIFF_OPT_TST(opt, FIND_COPIES_HARDER) ||
-   hashcmp(t1-entry.sha1, t2-entry.sha1) ||
-   (t1-entry.mode != t2-entry.mode))
-   show_path(base, opt, t1, t2);
+   hashcmp(t1.entry.sha1, t2.entry.sha1) ||
+   (t1.entry.mode != t2.entry.mode))
+   show_path(base, opt, t1, t2);
 
-   update_tree_entry(t1);
-   update_tree_entry(t2);
+   update_tree_entry(t1);
+   update_tree_entry(t2);
}
 
/* t1  t2 */
else if (cmp  0) {
-   show_path(base, opt, t1, /*t2=*/NULL);
-   update_tree_entry(t1);
+   show_path(base, opt, t1, /*t2=*/NULL);
+   update_tree_entry(t1);
}
 
/* t1  t2 */
else {
-   show_path(base, opt, /*t1=*/NULL, t2);
-   

Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-26 Thread Junio C Hamano
Kirill Smelkov k...@navytux.spb.ru writes:

 On Tue, Mar 25, 2014 at 10:46:32AM -0700, Junio C Hamano wrote:
 Kirill Smelkov k...@navytux.spb.ru writes:
 
  What are the downsides of __ prefix by the way?
 
 Aren't these names reserved for compiler/runtime implementations?

 Yes, but there are precedents when people don't obey it widely and
 in practice everything works :)

I think you are alluding to the practice in the Linux kernel, but
their requirement is vastly different---their product do not even
link with libc and they always compile with specific selected
versions of gcc, no?

 Let it be something portable anyway -
 how about diff_tree_sha1_low() ?

Sure.

As this is a file-scope static, I do not think the exact naming
matters that much.  Just FYI, we seem to use ll_ prefix (standing
for low-level) in some places.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-25 Thread Kirill Smelkov
On Mon, Mar 24, 2014 at 02:36:22PM -0700, Junio C Hamano wrote:
 Kirill Smelkov k...@mns.spb.ru writes:
 
  The downside is that try_to_follow_renames(), if active, we cause
  re-reading of 2 initial trees, which was negligible based on my timings,
 
 That would depend on how often the codepath triggered in your test
 case, but is totally understandable.  It fires only when the path we
 have been following disappears from the parent, and the processing
 of try-to-follow itself is very compute-intensive (it needs to run
 find-copies-harder logic) that will end up reading many subtrees of
 the two initial trees; two more reading of tree objects will be
 dwarfed by the actual processing.

I agree and thanks for the explanation.

  and which is outweighed cogently by the upsides.
 
  Changes since v1:
 
   - don't need to touch diff.h, as diff_tree() became static.
 
 Nice.  I wonder if it is an option to let the function keep its name
 diff_tree() without renaming it to __diff_tree_whatever(), though.

As I see it, in Git for functions operating on trees, there is convention
to accept either `struct tree_desc *` and be named simply, or sha1 and
be named with _sha1 suffix. From this point of view for new diff_tree()
accepting sha1's and staying with its old name would be confusing.

Besides, in the end we'll have two function with high-level wrapper, and
lower-lever worker:

- diff_tree_sha1(), and
- diff_tree_paths().

So it's not about this only particular case.  Both do some simple
preparation, call worker, and perform some cleanup.

So the question is how to name the worker?

In Linux they use __ prefix. We could also use some other prefix or
suffix, e.g. _bh (for bottom-half), _worker, _low, _raw, etc...


To me, personally, the cleanest is __ prefix, but maybe I'm too used
to Linux etc... I'm open to other naming scheme, only it should be
consistent.

What are the downsides of __ prefix by the way?


   tree-diff.c | 60 
  
   1 file changed, 28 insertions(+), 32 deletions(-)
 
  diff --git a/tree-diff.c b/tree-diff.c
  index b99622c..f90acf5 100644
  --- a/tree-diff.c
  +++ b/tree-diff.c
  @@ -137,12 +137,17 @@ static void skip_uninteresting(struct tree_desc *t, 
  struct strbuf *base,
  }
   }
   
  -static int diff_tree(struct tree_desc *t1, struct tree_desc *t2,
  -const char *base_str, struct diff_options *opt)
  +static int __diff_tree_sha1(const unsigned char *old, const unsigned char 
  *new,
  +   const char *base_str, struct diff_options *opt)
   {
  +   struct tree_desc t1, t2;
  +   void *t1tree, *t2tree;
  struct strbuf base;
  int baselen = strlen(base_str);
   
  +   t1tree = fill_tree_descriptor(t1, old);
  +   t2tree = fill_tree_descriptor(t2, new);
  +
  /* Enable recursion indefinitely */
  opt-pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE);
   
  @@ -155,39 +160,41 @@ static int diff_tree(struct tree_desc *t1, struct 
  tree_desc *t2,
  if (diff_can_quit_early(opt))
  break;
  if (opt-pathspec.nr) {
  -   skip_uninteresting(t1, base, opt);
  -   skip_uninteresting(t2, base, opt);
  +   skip_uninteresting(t1, base, opt);
  +   skip_uninteresting(t2, base, opt);
  }
  -   if (!t1-size  !t2-size)
  +   if (!t1.size  !t2.size)
  break;
   
  -   cmp = tree_entry_pathcmp(t1, t2);
  +   cmp = tree_entry_pathcmp(t1, t2);
   
  /* t1 = t2 */
  if (cmp == 0) {
  if (DIFF_OPT_TST(opt, FIND_COPIES_HARDER) ||
  -   hashcmp(t1-entry.sha1, t2-entry.sha1) ||
  -   (t1-entry.mode != t2-entry.mode))
  -   show_path(base, opt, t1, t2);
  +   hashcmp(t1.entry.sha1, t2.entry.sha1) ||
  +   (t1.entry.mode != t2.entry.mode))
  +   show_path(base, opt, t1, t2);
   
  -   update_tree_entry(t1);
  -   update_tree_entry(t2);
  +   update_tree_entry(t1);
  +   update_tree_entry(t2);
  }
   
  /* t1  t2 */
  else if (cmp  0) {
  -   show_path(base, opt, t1, /*t2=*/NULL);
  -   update_tree_entry(t1);
  +   show_path(base, opt, t1, /*t2=*/NULL);
  +   update_tree_entry(t1);
  }
   
  /* t1  t2 */
  else {
  -   show_path(base, opt, /*t1=*/NULL, t2);
  -   update_tree_entry(t2);
  +   show_path(base, opt, /*t1=*/NULL, t2);
  +   update_tree_entry(t2);
  }
  }
   
  strbuf_release(base);
  +   free(t2tree);
  +   free(t1tree);
  return 0;
   }
   
  @@ -202,7 +209,7 @@ 

Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-25 Thread Junio C Hamano
Kirill Smelkov k...@navytux.spb.ru writes:

 What are the downsides of __ prefix by the way?

Aren't these names reserved for compiler/runtime implementations?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based

2014-03-24 Thread Junio C Hamano
Kirill Smelkov k...@mns.spb.ru writes:

 The downside is that try_to_follow_renames(), if active, we cause
 re-reading of 2 initial trees, which was negligible based on my timings,

That would depend on how often the codepath triggered in your test
case, but is totally understandable.  It fires only when the path we
have been following disappears from the parent, and the processing
of try-to-follow itself is very compute-intensive (it needs to run
find-copies-harder logic) that will end up reading many subtrees of
the two initial trees; two more reading of tree objects will be
dwarfed by the actual processing.

 and which is outweighed cogently by the upsides.

 Changes since v1:

  - don't need to touch diff.h, as diff_tree() became static.

Nice.  I wonder if it is an option to let the function keep its name
diff_tree() without renaming it to __diff_tree_whatever(), though.

  tree-diff.c | 60 
  1 file changed, 28 insertions(+), 32 deletions(-)

 diff --git a/tree-diff.c b/tree-diff.c
 index b99622c..f90acf5 100644
 --- a/tree-diff.c
 +++ b/tree-diff.c
 @@ -137,12 +137,17 @@ static void skip_uninteresting(struct tree_desc *t, 
 struct strbuf *base,
   }
  }
  
 -static int diff_tree(struct tree_desc *t1, struct tree_desc *t2,
 -  const char *base_str, struct diff_options *opt)
 +static int __diff_tree_sha1(const unsigned char *old, const unsigned char 
 *new,
 + const char *base_str, struct diff_options *opt)
  {
 + struct tree_desc t1, t2;
 + void *t1tree, *t2tree;
   struct strbuf base;
   int baselen = strlen(base_str);
  
 + t1tree = fill_tree_descriptor(t1, old);
 + t2tree = fill_tree_descriptor(t2, new);
 +
   /* Enable recursion indefinitely */
   opt-pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE);
  
 @@ -155,39 +160,41 @@ static int diff_tree(struct tree_desc *t1, struct 
 tree_desc *t2,
   if (diff_can_quit_early(opt))
   break;
   if (opt-pathspec.nr) {
 - skip_uninteresting(t1, base, opt);
 - skip_uninteresting(t2, base, opt);
 + skip_uninteresting(t1, base, opt);
 + skip_uninteresting(t2, base, opt);
   }
 - if (!t1-size  !t2-size)
 + if (!t1.size  !t2.size)
   break;
  
 - cmp = tree_entry_pathcmp(t1, t2);
 + cmp = tree_entry_pathcmp(t1, t2);
  
   /* t1 = t2 */
   if (cmp == 0) {
   if (DIFF_OPT_TST(opt, FIND_COPIES_HARDER) ||
 - hashcmp(t1-entry.sha1, t2-entry.sha1) ||
 - (t1-entry.mode != t2-entry.mode))
 - show_path(base, opt, t1, t2);
 + hashcmp(t1.entry.sha1, t2.entry.sha1) ||
 + (t1.entry.mode != t2.entry.mode))
 + show_path(base, opt, t1, t2);
  
 - update_tree_entry(t1);
 - update_tree_entry(t2);
 + update_tree_entry(t1);
 + update_tree_entry(t2);
   }
  
   /* t1  t2 */
   else if (cmp  0) {
 - show_path(base, opt, t1, /*t2=*/NULL);
 - update_tree_entry(t1);
 + show_path(base, opt, t1, /*t2=*/NULL);
 + update_tree_entry(t1);
   }
  
   /* t1  t2 */
   else {
 - show_path(base, opt, /*t1=*/NULL, t2);
 - update_tree_entry(t2);
 + show_path(base, opt, /*t1=*/NULL, t2);
 + update_tree_entry(t2);
   }
   }
  
   strbuf_release(base);
 + free(t2tree);
 + free(t1tree);
   return 0;
  }
  
 @@ -202,7 +209,7 @@ static inline int diff_might_be_rename(void)
   !DIFF_FILE_VALID(diff_queued_diff.queue[0]-one);
  }
  
 -static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc 
 *t2, const char *base, struct diff_options *opt)
 +static void try_to_follow_renames(const unsigned char *old, const unsigned 
 char *new, const char *base, struct diff_options *opt)
  {
   struct diff_options diff_opts;
   struct diff_queue_struct *q = diff_queued_diff;
 @@ -240,7 +247,7 @@ static void try_to_follow_renames(struct tree_desc *t1, 
 struct tree_desc *t2, co
   diff_opts.break_opt = opt-break_opt;
   diff_opts.rename_score = opt-rename_score;
   diff_setup_done(diff_opts);
 - diff_tree(t1, t2, base, diff_opts);
 + __diff_tree_sha1(old, new, base, diff_opts);
   diffcore_std(diff_opts);
   free_pathspec(diff_opts.pathspec);
  
 @@ -301,23 +308,12 @@ static void try_to_follow_renames(struct tree_desc *t1, 
 struct tree_desc *t2, co
  
  int diff_tree_sha1(const unsigned char