Re: [PATCH 8/4] match-trees: drop x = x initializations

2013-03-25 Thread Junio C Hamano
René Scharfe rene.scha...@lsrfire.ath.cx writes:

 Am 24.03.2013 05:55, schrieb Junio C Hamano:
 So I like your change for readability, but for GCC 4.4.5 we still
 need the unnecessary initialization.

 Hrm, perhaps we can make it even simpler for the compiler.

And the result is even simpler for human readers, I'd have to say.

 I'm a bit uneasy about this one because we lack proper tests for
 this code and I don't know how to write ones off the bat.

This looks pretty much a straight-forward equivalent rewrite from
your earlier one, which was also an obvious equivalent to the
original, at least to me.  The first four lines in the original were
made into two tree_entry() calls (what a useful helper we haven't
been using!) and that allows us to lose explicit update_tree_entry()
calls.



  match-trees.c | 68 
 ---
  1 file changed, 28 insertions(+), 40 deletions(-)

 diff --git a/match-trees.c b/match-trees.c
 index 26f7ed1..2bb734d 100644
 --- a/match-trees.c
 +++ b/match-trees.c
 @@ -47,6 +47,13 @@ static int score_matches(unsigned mode1, unsigned mode2, 
 const char *path)
   return score;
  }
  
 +static int base_name_entries_compare(const struct name_entry *a,
 +  const struct name_entry *b)
 +{
 + return base_name_compare(a-path, tree_entry_len(a), a-mode,
 +  b-path, tree_entry_len(b), b-mode);
 +}
 +
  /*
   * Inspect two trees, and give a score that tells how similar they are.
   */
 @@ -71,54 +78,35 @@ static int score_trees(const unsigned char *hash1, const 
 unsigned char *hash2)
   if (type != OBJ_TREE)
   die(%s is not a tree, sha1_to_hex(hash2));
   init_tree_desc(two, two_buf, size);
 - while (one.size | two.size) {
 - const unsigned char *elem1 = elem1;
 - const unsigned char *elem2 = elem2;
 - const char *path1 = path1;
 - const char *path2 = path2;
 - unsigned mode1 = mode1;
 - unsigned mode2 = mode2;
 + for (;;) {
 + struct name_entry e1, e2;
 + int got_entry_from_one = tree_entry(one, e1);
 + int got_entry_from_two = tree_entry(two, e2);
   int cmp;
  
 - if (one.size)
 - elem1 = tree_entry_extract(one, path1, mode1);
 - if (two.size)
 - elem2 = tree_entry_extract(two, path2, mode2);
 -
 - if (!one.size) {
 - /* two has more entries */
 - score += score_missing(mode2, path2);
 - update_tree_entry(two);
 - continue;
 - }
 - if (!two.size) {
 + if (got_entry_from_one  got_entry_from_two)
 + cmp = base_name_entries_compare(e1, e2);
 + else if (got_entry_from_one)
   /* two lacks this entry */
 - score += score_missing(mode1, path1);
 - update_tree_entry(one);
 - continue;
 - }
 - cmp = base_name_compare(path1, strlen(path1), mode1,
 - path2, strlen(path2), mode2);
 - if (cmp  0) {
 + cmp = -1;
 + else if (got_entry_from_two)
 + /* two has more entries */
 + cmp = 1;
 + else
 + break;
 +
 + if (cmp  0)
   /* path1 does not appear in two */
 - score += score_missing(mode1, path1);
 - update_tree_entry(one);
 - continue;
 - }
 - else if (cmp  0) {
 + score += score_missing(e1.mode, e1.path);
 + else if (cmp  0)
   /* path2 does not appear in one */
 - score += score_missing(mode2, path2);
 - update_tree_entry(two);
 - continue;
 - }
 - else if (hashcmp(elem1, elem2))
 + score += score_missing(e2.mode, e2.path);
 + else if (hashcmp(e1.sha1, e2.sha1))
   /* they are different */
 - score += score_differs(mode1, mode2, path1);
 + score += score_differs(e1.mode, e2.mode, e1.path);
   else
   /* same subtree or blob */
 - score += score_matches(mode1, mode2, path1);
 - update_tree_entry(one);
 - update_tree_entry(two);
 + score += score_matches(e1.mode, e2.mode, e1.path);
   }
   free(one_buf);
   free(two_buf);
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/4] match-trees: drop x = x initializations

2013-03-24 Thread Jeff King
On Sat, Mar 23, 2013 at 09:55:53PM -0700, Junio C Hamano wrote:

 René Scharfe rene.scha...@lsrfire.ath.cx writes:
 
  Hmm, let's see if we can help the compiler follow the code without
  making it harder for people to understand.  The patch looks a bit
  jumbled, but the resulting code is OK in my biased opinion.
 
 I actually think the result is much better than a mere OK; the
 duplicated at this point we know path1 (or path2) is missing from
 the other side has been bothering me and I was about to suggest a
 similar rewrite before I read your message ;-)
 
 However, the same compiler still thinks {elem,path,mode}1 can be
 used uninitialized (but not {elem,path,mode}2).  The craziness I
 reported in the previous message is also the same.  With this patch
 on top to swap the side we inspect first, the compiler thinks
 {elem,path,mode}2 can be used uninitialized but not the other three
 variables X-.

Yeah, I'd agree that the result is more readable, but it does not
address the original problem. Junio, do you want to drop my patch,
squash the initialization of mode into René's version, and just add a
note to the commit message that we still have to deal with the gcc
warning?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/4] match-trees: drop x = x initializations

2013-03-24 Thread René Scharfe
Am 24.03.2013 05:55, schrieb Junio C Hamano:
 However, the same compiler still thinks {elem,path,mode}1 can be
 used uninitialized (but not {elem,path,mode}2).  The craziness I
 reported in the previous message is also the same.  With this patch
 on top to swap the side we inspect first, the compiler thinks
 {elem,path,mode}2 can be used uninitialized but not the other three
 variables X-.
 
 So I like your change for readability, but for GCC 4.4.5 we still
 need the unnecessary initialization.

Hrm, perhaps we can make it even simpler for the compiler.

-- 8 --
Subject: match-trees: simplify score_trees() using tree_entry()

Convert the loop in score_trees() to tree_entry().  The code becomes
shorter and simpler because the calls to update_tree_entry() are not
needed any more.

Another benefit is that we need less variables to track the current
tree entries; as a side-effect of that the compiler has an easier
job figuring out the control flow and thus can avoid false warnings
about uninitialized variables.

Using struct name_entry also allows the use of tree_entry_len() for
finding the path length instead of strlen(), which may be slightly
more efficient.

Also unify the handling of missing entries in one of the two trees
(i.e. added or removed files): Just set cmp appropriately first, no
matter if we ran off the end of a tree or if we actually have two
entries to compare, and check its value a bit later without
duplicating the handler code.

Signed-off-by: Rene Scharfe rene.scha...@lsrfire.ath.cx
---
I'm a bit uneasy about this one because we lack proper tests for
this code and I don't know how to write ones off the bat.

 match-trees.c | 68 ---
 1 file changed, 28 insertions(+), 40 deletions(-)

diff --git a/match-trees.c b/match-trees.c
index 26f7ed1..2bb734d 100644
--- a/match-trees.c
+++ b/match-trees.c
@@ -47,6 +47,13 @@ static int score_matches(unsigned mode1, unsigned mode2, 
const char *path)
return score;
 }
 
+static int base_name_entries_compare(const struct name_entry *a,
+const struct name_entry *b)
+{
+   return base_name_compare(a-path, tree_entry_len(a), a-mode,
+b-path, tree_entry_len(b), b-mode);
+}
+
 /*
  * Inspect two trees, and give a score that tells how similar they are.
  */
@@ -71,54 +78,35 @@ static int score_trees(const unsigned char *hash1, const 
unsigned char *hash2)
if (type != OBJ_TREE)
die(%s is not a tree, sha1_to_hex(hash2));
init_tree_desc(two, two_buf, size);
-   while (one.size | two.size) {
-   const unsigned char *elem1 = elem1;
-   const unsigned char *elem2 = elem2;
-   const char *path1 = path1;
-   const char *path2 = path2;
-   unsigned mode1 = mode1;
-   unsigned mode2 = mode2;
+   for (;;) {
+   struct name_entry e1, e2;
+   int got_entry_from_one = tree_entry(one, e1);
+   int got_entry_from_two = tree_entry(two, e2);
int cmp;
 
-   if (one.size)
-   elem1 = tree_entry_extract(one, path1, mode1);
-   if (two.size)
-   elem2 = tree_entry_extract(two, path2, mode2);
-
-   if (!one.size) {
-   /* two has more entries */
-   score += score_missing(mode2, path2);
-   update_tree_entry(two);
-   continue;
-   }
-   if (!two.size) {
+   if (got_entry_from_one  got_entry_from_two)
+   cmp = base_name_entries_compare(e1, e2);
+   else if (got_entry_from_one)
/* two lacks this entry */
-   score += score_missing(mode1, path1);
-   update_tree_entry(one);
-   continue;
-   }
-   cmp = base_name_compare(path1, strlen(path1), mode1,
-   path2, strlen(path2), mode2);
-   if (cmp  0) {
+   cmp = -1;
+   else if (got_entry_from_two)
+   /* two has more entries */
+   cmp = 1;
+   else
+   break;
+
+   if (cmp  0)
/* path1 does not appear in two */
-   score += score_missing(mode1, path1);
-   update_tree_entry(one);
-   continue;
-   }
-   else if (cmp  0) {
+   score += score_missing(e1.mode, e1.path);
+   else if (cmp  0)
/* path2 does not appear in one */
-   score += score_missing(mode2, path2);
-   update_tree_entry(two);
-   continue;
-   }
-

Re: [PATCH 8/4] match-trees: drop x = x initializations

2013-03-23 Thread René Scharfe
Am 22.03.2013 17:21, schrieb Jeff King:
 Of the 8 patches, this is the one I find the least satisfying, if only
 because I do not think gcc's failure is because of complicated control
 flow, and rearranging the code would only hurt readability.

Hmm, let's see if we can help the compiler follow the code without
making it harder for people to understand.  The patch looks a bit
jumbled, but the resulting code is OK in my biased opinion.

-- 8 --
There are two ways we can spot missing entries, i.e. added or removed
files: By reaching the end of one of the trees while the other still
has entries, or in the middle of the two lists with base_name_compare().
Missing files are handled the same in either case, but the code is
duplicated.

Unify the handling by just setting cmp appropriately when running off
a tree instead of handling the case on the spot.  If both trees contain
entries, call base_name_compare() as usual.

This make the code slightly shorter, and also helps gcc 4.6 to
understand that none of the variables in the loop are used without
initialization.  Therefore we can remove the trick to initialize them
using themselves, which was used to squelch false warnings.

[Stolen from Jeff King:]
While we're in the area, let's also update the loop
condition to use logical-OR rather than bitwise-OR. They should
be equivalent in this case, and the use of the latter was
probably a typo.

Signed-off-by: Rene Scharfe rene.scha...@lsrfire.ath.cx
---
 match-trees.c | 36 ++--
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/match-trees.c b/match-trees.c
index 26f7ed1..c0c66bb 100644
--- a/match-trees.c
+++ b/match-trees.c
@@ -71,34 +71,26 @@ static int score_trees(const unsigned char *hash1, const 
unsigned char *hash2)
if (type != OBJ_TREE)
die(%s is not a tree, sha1_to_hex(hash2));
init_tree_desc(two, two_buf, size);
-   while (one.size | two.size) {
-   const unsigned char *elem1 = elem1;
-   const unsigned char *elem2 = elem2;
-   const char *path1 = path1;
-   const char *path2 = path2;
-   unsigned mode1 = mode1;
-   unsigned mode2 = mode2;
-   int cmp;
+   while (one.size || two.size) {
+   const unsigned char *elem1, *elem2;
+   const char *path1, *path2;
+   unsigned mode1, mode2;
+   int cmp = 0;
 
if (one.size)
elem1 = tree_entry_extract(one, path1, mode1);
+   else
+   /* two has more entries */
+   cmp = 1;
if (two.size)
elem2 = tree_entry_extract(two, path2, mode2);
-
-   if (!one.size) {
-   /* two has more entries */
-   score += score_missing(mode2, path2);
-   update_tree_entry(two);
-   continue;
-   }
-   if (!two.size) {
+   else
/* two lacks this entry */
-   score += score_missing(mode1, path1);
-   update_tree_entry(one);
-   continue;
-   }
-   cmp = base_name_compare(path1, strlen(path1), mode1,
-   path2, strlen(path2), mode2);
+   cmp = -1;
+
+   if (!cmp)
+   cmp = base_name_compare(path1, strlen(path1), mode1,
+   path2, strlen(path2), mode2);
if (cmp  0) {
/* path1 does not appear in two */
score += score_missing(mode1, path1);
-- 
1.8.2


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/4] match-trees: drop x = x initializations

2013-03-23 Thread Junio C Hamano
René Scharfe rene.scha...@lsrfire.ath.cx writes:

 Hmm, let's see if we can help the compiler follow the code without
 making it harder for people to understand.  The patch looks a bit
 jumbled, but the resulting code is OK in my biased opinion.

I actually think the result is much better than a mere OK; the
duplicated at this point we know path1 (or path2) is missing from
the other side has been bothering me and I was about to suggest a
similar rewrite before I read your message ;-)

However, the same compiler still thinks {elem,path,mode}1 can be
used uninitialized (but not {elem,path,mode}2).  The craziness I
reported in the previous message is also the same.  With this patch
on top to swap the side we inspect first, the compiler thinks
{elem,path,mode}2 can be used uninitialized but not the other three
variables X-.

So I like your change for readability, but for GCC 4.4.5 we still
need the unnecessary initialization.

 match-trees.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/match-trees.c b/match-trees.c
index c0c66bb..9ea2c80 100644
--- a/match-trees.c
+++ b/match-trees.c
@@ -77,16 +77,16 @@ static int score_trees(const unsigned char *hash1, const 
unsigned char *hash2)
unsigned mode1, mode2;
int cmp = 0;
 
-   if (one.size)
-   elem1 = tree_entry_extract(one, path1, mode1);
-   else
-   /* two has more entries */
-   cmp = 1;
if (two.size)
elem2 = tree_entry_extract(two, path2, mode2);
else
/* two lacks this entry */
cmp = -1;
+   if (one.size)
+   elem1 = tree_entry_extract(one, path1, mode1);
+   else
+   /* two has more entries */
+   cmp = 1;
 
if (!cmp)
cmp = base_name_compare(path1, strlen(path1), mode1,


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 8/4] match-trees: drop x = x initializations

2013-03-22 Thread Jeff King
These nonsense assignments are meant to squelch gcc warnings
that the variables might be used uninitialized. However, gcc
gets it mostly right, realizing that we will either
extract tree entries from both sides, or we will hit a
continue statement and go to the top of the loop.

However, while getting this right for the elem and path
variables, it does not do so for the mode variables. Let's
drop the nonsense initialization where modern gcc does not
need them, and just set the modes to 0, along with a
comment. These values should never be used, but it makes
both gcc, as well as any compiler which does not like the x
= x initializations, happy.

While we're in the area, let's also update the loop
condition to use logical-OR rather than bitwise-OR. They should
be equivalent in this case, and the use of the latter was
probably a typo.

Signed-off-by: Jeff King p...@peff.net
---
Of the 8 patches, this is the one I find the least satisfying, if only
because I do not think gcc's failure is because of complicated control
flow, and rearranging the code would only hurt readability. And I'm
quite curious why it complains about mode, but not about the other
variables, which are set in the exact same place (and why it would not
be able to handle such a simple control flow at all).

It makes me wonder if I am missing something, or there is some subtle
bug. But I can't see it. Other eyes appreciated.

 match-trees.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/match-trees.c b/match-trees.c
index 26f7ed1..4360f10 100644
--- a/match-trees.c
+++ b/match-trees.c
@@ -71,13 +71,13 @@ static int score_trees(const unsigned char *hash1, const 
unsigned char *hash2)
if (type != OBJ_TREE)
die(%s is not a tree, sha1_to_hex(hash2));
init_tree_desc(two, two_buf, size);
-   while (one.size | two.size) {
-   const unsigned char *elem1 = elem1;
-   const unsigned char *elem2 = elem2;
-   const char *path1 = path1;
-   const char *path2 = path2;
-   unsigned mode1 = mode1;
-   unsigned mode2 = mode2;
+   while (one.size || two.size) {
+   const unsigned char *elem1;
+   const unsigned char *elem2;
+   const char *path1;
+   const char *path2;
+   unsigned mode1 = 0; /* make gcc happy */
+   unsigned mode2 = 0; /* make gcc happy */
int cmp;
 
if (one.size)
-- 
1.8.2.13.g0f18d3c
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/4] match-trees: drop x = x initializations

2013-03-22 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 Of the 8 patches, this is the one I find the least satisfying, if only
 because I do not think gcc's failure is because of complicated control
 flow, and rearranging the code would only hurt readability. And I'm
 quite curious why it complains about mode, but not about the other
 variables, which are set in the exact same place (and why it would not
 be able to handle such a simple control flow at all).

 It makes me wonder if I am missing something, or there is some subtle
 bug. But I can't see it. Other eyes appreciated.

I obviously am not qualified as other eyes to catch bugs in this
code as this is entirely mine, but I do not see any obvious reason
that would make the compiler to think mode[12] less initialized than
elem[12] or path[12] either.

These three are all updated by the same tree_entry_extract() call,
and whenever we use mode[12] we use path[12], so if it decides path1
is used or assigned, it should be able to tell mode1 is, too.

Unsatisfactory, it surely is...

  match-trees.c | 14 +++---
  1 file changed, 7 insertions(+), 7 deletions(-)

 diff --git a/match-trees.c b/match-trees.c
 index 26f7ed1..4360f10 100644
 --- a/match-trees.c
 +++ b/match-trees.c
 @@ -71,13 +71,13 @@ static int score_trees(const unsigned char *hash1, const 
 unsigned char *hash2)
   if (type != OBJ_TREE)
   die(%s is not a tree, sha1_to_hex(hash2));
   init_tree_desc(two, two_buf, size);
 - while (one.size | two.size) {
 - const unsigned char *elem1 = elem1;
 - const unsigned char *elem2 = elem2;
 - const char *path1 = path1;
 - const char *path2 = path2;
 - unsigned mode1 = mode1;
 - unsigned mode2 = mode2;
 + while (one.size || two.size) {
 + const unsigned char *elem1;
 + const unsigned char *elem2;
 + const char *path1;
 + const char *path2;
 + unsigned mode1 = 0; /* make gcc happy */
 + unsigned mode2 = 0; /* make gcc happy */
   int cmp;
  
   if (one.size)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html