Re: [PATCH v8 1/5] Refactor parse_loc

2013-02-28 Thread Thomas Rast
Junio C Hamano  writes:

> Thomas Rast  writes:
>
>> From: Bo Yang 
>>
>> We want to use the same style of -L n,m argument for 'git log -L' as
>> for git-blame.  Refactor the argument parsing of the range arguments
>> from builtin/blame.c to the (new) file that will hold the 'git log -L'
>> logic.
>>
>> To accommodate different data structures in blame and log -L, the file
>> contents are abstracted away; parse_range_arg takes a callback that it
>> uses to get the contents of a line of the (notional) file.
>>
>> The new test is for a case that made me pause during debugging: the
>> 'blame -L with invalid end' test was the only one that noticed an
>> outright failure to parse the end *at all*.  So make a more explicit
>> test for that.
>>
>> Signed-off-by: Bo Yang 
>> Signed-off-by: Thomas Rast 
>> ---
>>  Documentation/blame-options.txt |  19 +--
>>  Documentation/line-range-format.txt |  18 +++
>>  Makefile|   2 +
>>  builtin/blame.c |  99 +++---
>>  line-log.c  | 105 
>> 
>>  line-log.h  |  23 
>
> Was this churn necessary?  
>
> It is strange to move existing functions that will be tweaked to be
> shared by two different codepaths (blame and line-log) to the new
> user.
>
> The only effect this has, as opposed to tweaking the functions in
> place and making them extern, is to make it harder to see the tweaks
> made while moving the lines, and also make it more cumbersome to
> determine the lineage of the code later.
>  
> It would have been understandable if they were moved to a new
> library-ish file (perhaps "line-range.[ch]"); even though that
> approach shares the same downsides, at least it would have a better
> excuse "We will share this, so let's move it to a neutral third
> place to allow us to hide the implementation details from both
> users".  The arrangement this patch series makes does not even have
> that excuse.  The final implementation still stay with one of the
> users; the only difference is that it is away from the original user
> and close to the new user.

Even though I am moving from builtin/blame.c to line-log.c?  I would
otherwise have to call from a rather lib-ish file into a "frontend"
file.  I always figured I wasn't supposed to do that.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 1/5] Refactor parse_loc

2013-02-28 Thread Junio C Hamano
Thomas Rast  writes:

> From: Bo Yang 
>
> We want to use the same style of -L n,m argument for 'git log -L' as
> for git-blame.  Refactor the argument parsing of the range arguments
> from builtin/blame.c to the (new) file that will hold the 'git log -L'
> logic.
>
> To accommodate different data structures in blame and log -L, the file
> contents are abstracted away; parse_range_arg takes a callback that it
> uses to get the contents of a line of the (notional) file.
>
> The new test is for a case that made me pause during debugging: the
> 'blame -L with invalid end' test was the only one that noticed an
> outright failure to parse the end *at all*.  So make a more explicit
> test for that.
>
> Signed-off-by: Bo Yang 
> Signed-off-by: Thomas Rast 
> ---
>  Documentation/blame-options.txt |  19 +--
>  Documentation/line-range-format.txt |  18 +++
>  Makefile|   2 +
>  builtin/blame.c |  99 +++---
>  line-log.c  | 105 
> 
>  line-log.h  |  23 

Was this churn necessary?  

It is strange to move existing functions that will be tweaked to be
shared by two different codepaths (blame and line-log) to the new
user.

The only effect this has, as opposed to tweaking the functions in
place and making them extern, is to make it harder to see the tweaks
made while moving the lines, and also make it more cumbersome to
determine the lineage of the code later.
 
It would have been understandable if they were moved to a new
library-ish file (perhaps "line-range.[ch]"); even though that
approach shares the same downsides, at least it would have a better
excuse "We will share this, so let's move it to a neutral third
place to allow us to hide the implementation details from both
users".  The arrangement this patch series makes does not even have
that excuse.  The final implementation still stay with one of the
users; the only difference is that it is away from the original user
and close to the new user.

> @@ -1927,83 +1933,6 @@ static const char *add_prefix(const char *prefix, 
> const char *path)
>  }
>  
>  /*
> - * Parsing of (comma separated) one item in the -L option
> - */
> -static const char *parse_loc(const char *spec,
> -  struct scoreboard *sb, long lno,
> -  long begin, long *ret)
> -{
> - char *term;
> - const char *line;
> - long num;
> - int reg_error;
> - regex_t regexp;
> - regmatch_t match[1];
> -
> - /* Allow "-L ,+20" to mean starting at 
> -  * for 20 lines, or "-L ,-5" for 5 lines ending at
> -  * .
> -  */
> - if (1 < begin && (spec[0] == '+' || spec[0] == '-')) {

Did you fix a bug here?

This original only process -L begin,+20 and -L begin,-4 for the
value of begin larger than 1 while reading the second part that
comes after the comma, but incoming begin must be 2 or more, because
the caller adds one to the result of parsing what comes before the
comma, and "1 < begin" here, not "begin != -1", is done for that
reason.

I noticed these slight differences only after eyeballing the lines
deleted from here and lines added to the other place, but the
differences would have been unnoticed if reviewers were not careful.

Again, was it really necessary to move these functions to the new
file?

> diff --git a/line-log.c b/line-log.c
> new file mode 100644
> index 000..a24a86b
> --- /dev/null
> +++ b/line-log.c
> @@ -0,0 +1,105 @@
> +#include "git-compat-util.h"
> +#include "line-log.h"
> +
> +/*
> + * Parse one item in the -L option
> + */
> +const char *parse_loc(const char *spec, nth_line_fn_t nth_line,
> + void *data, long lines, long begin, long *ret)
> +{
> + char *term;
> + const char *line;
> + long num;
> + int reg_error;
> + regex_t regexp;
> + regmatch_t match[1];
> +
> + /*
> +  * Allow "-L ,+20" to mean starting at 
> +  * for 20 lines, or "-L ,-5" for 5 lines ending at
> +  * .
> +  */
> + if (begin != -1 && (spec[0] == '+' || spec[0] == '-')) {
> + num = strtol(spec + 1, &term, 10);
> + if (term != spec + 1) {
> + if (spec[0] == '-')
> + num = 0 - num;
> + if (0 < num)
> + *ret = begin + num - 2;
> + else if (!num)
> + *ret = begin;
> + else
> + *ret = begin + num;
> + return term;
> + }
> + return spec;
> + }
> + num = strtol(spec, &term, 10);
> + if (term != spec) {
> + *ret = num;
> + return term;
> + }
> + if (spec[0] != '/')
> + return spec;
> +
> + /* it could be a regexp of form /.../ */
> + for (term = (char *) spec + 1; *term && *term

[PATCH v8 1/5] Refactor parse_loc

2013-02-28 Thread Thomas Rast
From: Bo Yang 

We want to use the same style of -L n,m argument for 'git log -L' as
for git-blame.  Refactor the argument parsing of the range arguments
from builtin/blame.c to the (new) file that will hold the 'git log -L'
logic.

To accommodate different data structures in blame and log -L, the file
contents are abstracted away; parse_range_arg takes a callback that it
uses to get the contents of a line of the (notional) file.

The new test is for a case that made me pause during debugging: the
'blame -L with invalid end' test was the only one that noticed an
outright failure to parse the end *at all*.  So make a more explicit
test for that.

Signed-off-by: Bo Yang 
Signed-off-by: Thomas Rast 
---
 Documentation/blame-options.txt |  19 +--
 Documentation/line-range-format.txt |  18 +++
 Makefile|   2 +
 builtin/blame.c |  99 +++---
 line-log.c  | 105 
 line-log.h  |  23 
 t/t8003-blame-corner-cases.sh   |   6 +++
 7 files changed, 163 insertions(+), 109 deletions(-)
 create mode 100644 Documentation/line-range-format.txt
 create mode 100644 line-log.c
 create mode 100644 line-log.h

diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt
index b0d31df..6998d9f 100644
--- a/Documentation/blame-options.txt
+++ b/Documentation/blame-options.txt
@@ -13,24 +13,7 @@
Annotate only the given line range.   and  can take
one of these forms:
 
-   - number
-+
-If  or  is a number, it specifies an
-absolute line number (lines count from 1).
-+
-
-- /regex/
-+
-This form will use the first line matching the given
-POSIX regex.  If  is a regex, it will search
-starting at the line given by .
-+
-
-- +offset or -offset
-+
-This is only valid for  and will specify a number
-of lines before or after the line given by .
-+
+include::line-range-format.txt[]
 
 -l::
Show long rev (Default: off).
diff --git a/Documentation/line-range-format.txt 
b/Documentation/line-range-format.txt
new file mode 100644
index 000..265bc23
--- /dev/null
+++ b/Documentation/line-range-format.txt
@@ -0,0 +1,18 @@
+- number
++
+If  or  is a number, it specifies an
+absolute line number (lines count from 1).
++
+
+- /regex/
++
+This form will use the first line matching the given
+POSIX regex.  If  is a regex, it will search
+starting at the line given by .
++
+
+- +offset or -offset
++
+This is only valid for  and will specify a number
+of lines before or after the line given by .
++
diff --git a/Makefile b/Makefile
index 26d3332..76831fe 100644
--- a/Makefile
+++ b/Makefile
@@ -683,6 +683,7 @@ LIB_H += help.h
 LIB_H += http.h
 LIB_H += kwset.h
 LIB_H += levenshtein.h
+LIB_H += line-log.h
 LIB_H += list-objects.h
 LIB_H += ll-merge.h
 LIB_H += log-tree.h
@@ -798,6 +799,7 @@ LIB_OBJS += hex.o
 LIB_OBJS += ident.o
 LIB_OBJS += kwset.o
 LIB_OBJS += levenshtein.o
+LIB_OBJS += line-log.o
 LIB_OBJS += list-objects.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
diff --git a/builtin/blame.c b/builtin/blame.c
index 86100e9..c933b15 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -21,6 +21,7 @@
 #include "parse-options.h"
 #include "utf8.h"
 #include "userdiff.h"
+#include "line-log.h"
 
 static char blame_usage[] = N_("git blame [options] [rev-opts] [rev] [--] 
file");
 
@@ -566,11 +567,16 @@ static void dup_entry(struct blame_entry *dst, struct 
blame_entry *src)
dst->score = 0;
 }
 
-static const char *nth_line(struct scoreboard *sb, int lno)
+static const char *nth_line(struct scoreboard *sb, long lno)
 {
return sb->final_buf + sb->lineno[lno];
 }
 
+static const char *nth_line_cb(void *data, long lno)
+{
+   return nth_line((struct scoreboard *)data, lno);
+}
+
 /*
  * It is known that lines between tlno to same came from parent, and e
  * has an overlap with that range.  it also is known that parent's
@@ -1927,83 +1933,6 @@ static const char *add_prefix(const char *prefix, const 
char *path)
 }
 
 /*
- * Parsing of (comma separated) one item in the -L option
- */
-static const char *parse_loc(const char *spec,
-struct scoreboard *sb, long lno,
-long begin, long *ret)
-{
-   char *term;
-   const char *line;
-   long num;
-   int reg_error;
-   regex_t regexp;
-   regmatch_t match[1];
-
-   /* Allow "-L ,+20" to mean starting at 
-* for 20 lines, or "-L ,-5" for 5 lines ending at
-* .
-*/
-   if (1 < begin && (spec[0] == '+' || spec[0] == '-')) {
-   num = strtol(spec + 1, &term, 10);
-   if (term != spec + 1) {
-   if (spec[0] == '-')
-   num = 0 - num;
-   if (0 < num)
-   *ret = begin + num - 2;
-   else if (!num)
-