Re: [PATCH] GSoC Miniproject 15. Rewrite fsck.c:fsck_commit()

2014-03-21 Thread Ashwin Jha


 Original Message 
Subject:[PATCH] GSoC Miniproject 15. Rewrite fsck.c:fsck_commit()
Date:   Fri, 21 Mar 2014 07:24:46 +0530
From:   Ashwin Jha ajha@gmail.com
To: git@vger.kernel.org
CC: Ashwin Jha ajha@gmail.com



modified fsck.c:fsck_commit(). Replaced memcmp() with starts_with() function.
starts_with() seems much more relevant than memcmp(). It uses one less argument
and its return value makes more sense.
skip_prefix() is not used as it uses strcmp() internally which seems 
unnecessarily
for current task. The current task can be easily done by providing offsets to 
the
buffer pointer (the way it is implemented currently).

Signed-off-by: Ashwin Jha ajha@gmail.com
---
 fsck.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/fsck.c b/fsck.c
index 64bf279..82e1640 100644
--- a/fsck.c
+++ b/fsck.c
@@ -6,6 +6,7 @@
 #include commit.h
 #include tag.h
 #include fsck.h
+#include strbuf.h
 
 static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)

 {
@@ -290,12 +291,12 @@ static int fsck_commit(struct commit *commit, fsck_error 
error_func)
int parents = 0;
int err;
 
-	if (memcmp(buffer, tree , 5))

+   if (!starts_with(buffer, tree ))
return error_func(commit-object, FSCK_ERROR, invalid format - 
expected 'tree' line);
if (get_sha1_hex(buffer+5, tree_sha1) || buffer[45] != '\n')
return error_func(commit-object, FSCK_ERROR, invalid 'tree' line 
format - bad sha1);
buffer += 46;
-   while (!memcmp(buffer, parent , 7)) {
+   while (starts_with(buffer, parent )) {
if (get_sha1_hex(buffer+7, sha1) || buffer[47] != '\n')
return error_func(commit-object, FSCK_ERROR, invalid 
'parent' line format - bad sha1);
buffer += 48;
@@ -322,15 +323,15 @@ static int fsck_commit(struct commit *commit, fsck_error 
error_func)
if (p || parents)
return error_func(commit-object, FSCK_ERROR, parent 
objects missing);
}
-   if (memcmp(buffer, author , 7))
+   if (!starts_with(buffer, author ))
return error_func(commit-object, FSCK_ERROR, invalid format - 
expected 'author' line);
buffer += 7;
err = fsck_ident(buffer, commit-object, error_func);
if (err)
return err;
-   if (memcmp(buffer, committer , strlen(committer )))
+   if (!starts_with(buffer, committer ))
return error_func(commit-object, FSCK_ERROR, invalid format - 
expected 'committer' line);
-   buffer += strlen(committer );
+   buffer += 10;
err = fsck_ident(buffer, commit-object, error_func);
if (err)
return err;
--
1.7.9.5



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] GSoC Miniproject 15. Rewrite fsck.c:fsck_commit()

2014-03-21 Thread Eric Sunshine
On Fri, Mar 21, 2014 at 5:18 AM, Ashwin Jha ajha@gmail.com wrote:
 On Fri, Mar 21, 2014 at 9:03 AM, Eric Sunshine sunsh...@sunshineco.com
 wrote:
 On Thu, Mar 20, 2014 at 9:54 PM, Ashwin Jha ajha@gmail.com wrote:
  Subject: [PATCH] GSoC Miniproject 15. Rewrite fsck.c:fsck_commit()
  starts_with() seems much more relevant than memcmp(). It uses one less
  argument and its return value makes more sense.

 As a justification, uses one less argument falls flat, and really
 has nothing to do with the decision to make the change. The bit about
 the return value is a slightly better but is still weak.

 You might instead justify the change by pointing out that the name
 starts_with()
 does a better job of conveying the intention of the code, which is to
 check the string for a prefix, than does memcmp().

 Actually, from the line starts_with() seems much more relevant than
 memcmp() my intention was to say that starts_with() does a better job of
 conveying the intention of the code, which is to check the string for a
 prefix, than does memcmp() as mentioned by you.

Good to hear. When you resubmit (if you do), perhaps use that wording
or something similar to justify the change.

  skip_prefix() is not used as it uses strcmp() internally which seems
  unnecessarily
  for current task. The current task can be easily done by providing
  offsets to the
  buffer pointer (the way it is implemented currently).

 Not sure what this means. What is the current task, and what is
 implemented where currently?

 From current task, I meant to say the task of offsetting the buffer pointer
 to get the correct substring as in:
 get_sha1_hex(buffer+5, tree_sha1)

 Please forgive me for this. I should have written this in a better way.

Thanks for the clarification.

  Signed-off-by: Ashwin Jha ajha@gmail.com
  ---
   fsck.c |   11 ++-
   1 file changed, 6 insertions(+), 5 deletions(-)
 
  diff --git a/fsck.c b/fsck.c
  index 64bf279..82e1640 100644
  --- a/fsck.c
  +++ b/fsck.c
  @@ -6,6 +6,7 @@
   #include commit.h
   #include tag.h
   #include fsck.h
  +#include strbuf.h
 
   static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void
  *data)
   {
  @@ -290,12 +291,12 @@ static int fsck_commit(struct commit *commit,
  fsck_error error_func)
  int parents = 0;
  int err;
 
  -   if (memcmp(buffer, tree , 5))
  +   if (!starts_with(buffer, tree ))
  return error_func(commit-object, FSCK_ERROR, invalid
  format - expected 'tree' line);
  if (get_sha1_hex(buffer+5, tree_sha1) || buffer[45] != '\n')

 One of the benefits of starts_with() and skip_prefix() is that they
 allow you to eliminate magic numbers, such as 5 in the memcmp()
 invocation. However, if you look a couple lines below, you see in the
 expression 'buffer+5' that the magic number is still present. In fact,
 the code becomes less clear with your change because the 5 in
 'buffer+5' is much more mysterious without the preceding
 memcmp(foo,bar,5). It is possible to eliminate this magic number,
 but starts_with() is not the answer.

 I considered this point while making the changes. But, I thought that since
 all that is required is a constant offset to the buffer pointer, using
 skip_prefix() will only add to the overhead of function calling.

  return error_func(commit-object, FSCK_ERROR, invalid
  'tree' line format - bad sha1);
  buffer += 46;

 And as you can see here (buffer +=46) will still be a problem even if I
 replace the buffer+5 code.
 I think a more better way would be to define these magic no. as macros.

 But, I guess you are right. The current changes do make it a bit unclear.

I understand your argument: since magic numbers remain elsewhere, then
little is gained by eliminating only a few of them via skip_prefix().
A counterargument might be that even that small gain can be a
maintenance bonus, since it reduces the number of potential places
where errors can be made when modifying the code. (But you are welcome
to counter that argument if you feel strongly about it.)

 To summarize, I can think of two ways:
 1. skip_prefix() can be used, in place of both starts_with() and memcmp().
 The return value of skip_prefix can
 be checked against NULL to determine whether correct format is used or
 not.
 Though, even this change will left some of the magic no (as shown
 above). ;-)
 2. Define macros for all the magic no. (and tags like tree, parent
 etc.). This way the code will be more clear
 and any future changes to these magic no. (or tag names) will be much
 easier to handle.

Perhaps provide an illustration to explain what you mean.

 In my opinion, 2 will be a better option. But, I can understand that I may
 have overlooked some potential flaws in this method.
 Please guide me to the correct approach. :-)

There isn't necessarily one correct approach. Judging from reviewer
responses to submissions by other GSoC hopefuls who tackled this

Re: [PATCH] GSoC Miniproject 15. Rewrite fsck.c:fsck_commit()

2014-03-21 Thread Ashwin Jha


On 03/22/2014 12:11 AM, Eric Sunshine wrote:

On Fri, Mar 21, 2014 at 5:18 AM, Ashwin Jha ajha@gmail.com wrote:

On Fri, Mar 21, 2014 at 9:03 AM, Eric Sunshine sunsh...@sunshineco.com
wrote:

On Thu, Mar 20, 2014 at 9:54 PM, Ashwin Jha ajha@gmail.com wrote:

Subject: [PATCH] GSoC Miniproject 15. Rewrite fsck.c:fsck_commit()
starts_with() seems much more relevant than memcmp(). It uses one less
argument and its return value makes more sense.

As a justification, uses one less argument falls flat, and really
has nothing to do with the decision to make the change. The bit about
the return value is a slightly better but is still weak.

You might instead justify the change by pointing out that the name
starts_with()
does a better job of conveying the intention of the code, which is to
check the string for a prefix, than does memcmp().

Actually, from the line starts_with() seems much more relevant than
memcmp() my intention was to say that starts_with() does a better job of
conveying the intention of the code, which is to check the string for a
prefix, than does memcmp() as mentioned by you.

Good to hear. When you resubmit (if you do), perhaps use that wording
or something similar to justify the change.


skip_prefix() is not used as it uses strcmp() internally which seems
unnecessarily
for current task. The current task can be easily done by providing
offsets to the
buffer pointer (the way it is implemented currently).

Not sure what this means. What is the current task, and what is
implemented where currently?

 From current task, I meant to say the task of offsetting the buffer pointer
to get the correct substring as in:
get_sha1_hex(buffer+5, tree_sha1)

Please forgive me for this. I should have written this in a better way.

Thanks for the clarification.


Signed-off-by: Ashwin Jha ajha@gmail.com
---
  fsck.c |   11 ++-
  1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/fsck.c b/fsck.c
index 64bf279..82e1640 100644
--- a/fsck.c
+++ b/fsck.c
@@ -6,6 +6,7 @@
  #include commit.h
  #include tag.h
  #include fsck.h
+#include strbuf.h

  static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void
*data)
  {
@@ -290,12 +291,12 @@ static int fsck_commit(struct commit *commit,
fsck_error error_func)
 int parents = 0;
 int err;

-   if (memcmp(buffer, tree , 5))
+   if (!starts_with(buffer, tree ))
 return error_func(commit-object, FSCK_ERROR, invalid
format - expected 'tree' line);
 if (get_sha1_hex(buffer+5, tree_sha1) || buffer[45] != '\n')

One of the benefits of starts_with() and skip_prefix() is that they
allow you to eliminate magic numbers, such as 5 in the memcmp()
invocation. However, if you look a couple lines below, you see in the
expression 'buffer+5' that the magic number is still present. In fact,
the code becomes less clear with your change because the 5 in
'buffer+5' is much more mysterious without the preceding
memcmp(foo,bar,5). It is possible to eliminate this magic number,
but starts_with() is not the answer.


I considered this point while making the changes. But, I thought that since
all that is required is a constant offset to the buffer pointer, using
skip_prefix() will only add to the overhead of function calling.

 return error_func(commit-object, FSCK_ERROR, invalid
'tree' line format - bad sha1);
 buffer += 46;

And as you can see here (buffer +=46) will still be a problem even if I
replace the buffer+5 code.
I think a more better way would be to define these magic no. as macros.

But, I guess you are right. The current changes do make it a bit unclear.

I understand your argument: since magic numbers remain elsewhere, then
little is gained by eliminating only a few of them via skip_prefix().
A counterargument might be that even that small gain can be a
maintenance bonus, since it reduces the number of potential places
where errors can be made when modifying the code. (But you are welcome
to counter that argument if you feel strongly about it.)


To summarize, I can think of two ways:
1. skip_prefix() can be used, in place of both starts_with() and memcmp().
The return value of skip_prefix can
 be checked against NULL to determine whether correct format is used or
not.
 Though, even this change will left some of the magic no (as shown
above). ;-)
2. Define macros for all the magic no. (and tags like tree, parent
etc.). This way the code will be more clear
 and any future changes to these magic no. (or tag names) will be much
easier to handle.

Perhaps provide an illustration to explain what you mean.
I think you want some explanation on point 2. What I have suggested here 
is that
all the keywords (like tree, parent) and magic no. (which are 
nothing but suitable

pointer offsets, used to fetch these keywords) be defined as macros.
This will serve two purposes:
1. The code will be more readable in the sense that each magic no. will 
have a meaningful name.


Re: [PATCH] GSoC Miniproject 15. Rewrite fsck.c:fsck_commit()

2014-03-20 Thread Eric Sunshine
Thanks for the submission. Comments below to give you a feel for the
Git review process...

On Thu, Mar 20, 2014 at 9:54 PM, Ashwin Jha ajha@gmail.com wrote:
 Subject: [PATCH] GSoC Miniproject 15. Rewrite fsck.c:fsck_commit()

The subject becomes part of the permanent Git history, but the fact
that this is a GSoC submission won't be meaningful to anyone months or
years from now. You can mention GSoC inside [...], however, as in
[PATCH GSoC], since that gets stripped off the subject automatically
when the patch is applied. Use the commentary section after the ---
line just below your sign-off to explain that this is microproject 15.

The subject itself should concisely summarize the change. Rewrite
fsck.c:fsck_commit() doesn't say much. You might say instead:

Subject: fsk_commit: replace memcmp() with starts_with()

 modified fsck.c:fsck_commit(). Replaced memcmp() with starts_with() function.

Capitalize start of sentence.

Use imperative mood: modify rather than modified; Replace rather
than Replaced.

 starts_with() seems much more relevant than memcmp(). It uses one less 
 argument
 and its return value makes more sense.

As a justification, uses one less argument falls flat, and really
has nothing to do with the decision to make the change. The bit about
the return value is a slightly better but is still weak. You might
instead justify the change by pointing out that the name starts_with()
does a better job of conveying the intention of the code, which is to
check the string for a prefix, than does memcmp().

 skip_prefix() is not used as it uses strcmp() internally which seems 
 unnecessarily
 for current task. The current task can be easily done by providing offsets to 
 the
 buffer pointer (the way it is implemented currently).

Not sure what this means. What is the current task, and what is
implemented where currently?

 Signed-off-by: Ashwin Jha ajha@gmail.com
 ---
  fsck.c |   11 ++-
  1 file changed, 6 insertions(+), 5 deletions(-)

 diff --git a/fsck.c b/fsck.c
 index 64bf279..82e1640 100644
 --- a/fsck.c
 +++ b/fsck.c
 @@ -6,6 +6,7 @@
  #include commit.h
  #include tag.h
  #include fsck.h
 +#include strbuf.h

  static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
  {
 @@ -290,12 +291,12 @@ static int fsck_commit(struct commit *commit, 
 fsck_error error_func)
 int parents = 0;
 int err;

 -   if (memcmp(buffer, tree , 5))
 +   if (!starts_with(buffer, tree ))
 return error_func(commit-object, FSCK_ERROR, invalid 
 format - expected 'tree' line);
 if (get_sha1_hex(buffer+5, tree_sha1) || buffer[45] != '\n')

One of the benefits of starts_with() and skip_prefix() is that they
allow you to eliminate magic numbers, such as 5 in the memcmp()
invocation. However, if you look a couple lines below, you see in the
expression 'buffer+5' that the magic number is still present. In fact,
the code becomes less clear with your change because the 5 in
'buffer+5' is much more mysterious without the preceding
memcmp(foo,bar,5). It is possible to eliminate this magic number,
but starts_with() is not the answer.

 return error_func(commit-object, FSCK_ERROR, invalid 
 'tree' line format - bad sha1);
 buffer += 46;
 -   while (!memcmp(buffer, parent , 7)) {
 +   while (starts_with(buffer, parent )) {
 if (get_sha1_hex(buffer+7, sha1) || buffer[47] != '\n')

Ditto here with magic number 7 in 'buffer+7'.

 return error_func(commit-object, FSCK_ERROR, 
 invalid 'parent' line format - bad sha1);
 buffer += 48;
 @@ -322,15 +323,15 @@ static int fsck_commit(struct commit *commit, 
 fsck_error error_func)
 if (p || parents)
 return error_func(commit-object, FSCK_ERROR, 
 parent objects missing);
 }
 -   if (memcmp(buffer, author , 7))
 +   if (!starts_with(buffer, author ))
 return error_func(commit-object, FSCK_ERROR, invalid 
 format - expected 'author' line);
 buffer += 7;

And again with 7.

 err = fsck_ident(buffer, commit-object, error_func);
 if (err)
 return err;
 -   if (memcmp(buffer, committer , strlen(committer )))
 +   if (!starts_with(buffer, committer ))
 return error_func(commit-object, FSCK_ERROR, invalid 
 format - expected 'committer' line);
 -   buffer += strlen(committer );
 +   buffer += 10;

Again with 10 (newly introduced).

 err = fsck_ident(buffer, commit-object, error_func);
 if (err)
 return err;
 --
 1.7.9.5
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html