Re: [PATCHv2] grep: use slash for path delimiter, not colon

2013-08-26 Thread Junio C Hamano
Jeff King writes:

> On Mon, Aug 26, 2013 at 03:28:26PM -0400, Jeff King wrote:
>
>> Changing the object_array API would be hard, but I don't think we need
>> to do it here. Can we simply stop using object_array to pass the list,
>> and instead just have a custom list?
>> 
>> I'll see how painful that is.
>
> Not very, I think. Here's the series.
>
>   [1/2]: grep: stop using object_array
>   [2/2]: grep: use slash for path delimiter, not colon

I agree that if we were to do this, these patches show a reasonable
approach to do so.

I am, however, not yet convinced that its output is necessarily better
X-<.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2] grep: use slash for path delimiter, not colon

2013-08-26 Thread Jeff King
On Mon, Aug 26, 2013 at 03:28:26PM -0400, Jeff King wrote:

> Changing the object_array API would be hard, but I don't think we need
> to do it here. Can we simply stop using object_array to pass the list,
> and instead just have a custom list?
> 
> I'll see how painful that is.

Not very, I think. Here's the series.

  [1/2]: grep: stop using object_array
  [2/2]: grep: use slash for path delimiter, not colon

-Peff


Re: [PATCHv2] grep: use slash for path delimiter, not colon

2013-08-26 Thread Jeff King
On Mon, Aug 26, 2013 at 10:46:12AM -0400, Phil Hord wrote:

> This version is a bit more deterministic and also adds a test.
> 
> It accepts the expense of examining the path argument again to 
> determine if it is a tree-ish + path rather than just a tree (commit).
> The get_sha1 call occurs one extra time for each tree-ish argument,
> so it's not expensive.

I don't like this approach in general because it lacks atomicity. IOW,
the thing you are looking up may change between the two get_sha1 calls.
You're _almost_ good here because you don't actually care what the
second call returns, but only which features it _would_ have used. But
you may see the second call fail because the ref doesn't exist anymore,
or points to a different tree, and you will erroneously use ":" instead
of "/".

I admit this is not that likely, but I'd really rather avoid introducing
such races if we can.
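
To make the concern concrete, here is a toy sketch of the race. It is not git code: `toy_ref`, `toy_resolve` and `delimiter_by_second_lookup` are hypothetical stand-ins for the ref store and for the extra get_sha1 call, and the "lookup" only checks a flag and scans the spec for a colon.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for one ref in the ref store; NOT git's data structure. */
struct toy_ref {
	const char *name;
	int exists;
};

/*
 * Hypothetical lookup: returns 0 if the ref part of "ref[:path]" still
 * resolves, and reports through has_path whether a non-empty path
 * followed the colon.  Returns -1 if the ref is gone.
 */
int toy_resolve(const struct toy_ref *ref, const char *spec, int *has_path)
{
	const char *colon;
	size_t reflen;

	assert(ref != NULL && spec != NULL && has_path != NULL);

	colon = strchr(spec, ':');
	reflen = colon ? (size_t)(colon - spec) : strlen(spec);
	if (!ref->exists || strlen(ref->name) != reflen ||
	    strncmp(ref->name, spec, reflen) != 0)
		return -1;
	*has_path = colon && colon[1] != '\0';
	return 0;
}

/*
 * The "look it up a second time" approach: if the second lookup fails,
 * the code silently falls back to ':' even though the spec had a path.
 */
char delimiter_by_second_lookup(const struct toy_ref *ref, const char *spec)
{
	int has_path = 0;

	if (!toy_resolve(ref, spec, &has_path) && has_path)
		return '/';
	return ':';
}
```

If the ref disappears between the first lookup (the one that found the tree to grep) and the second one, `delimiter_by_second_lookup` picks ':' for "topic:some". A flag recorded during the first lookup would not have this problem.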

> We avoid mucking with the object_array API this way, and also do not
> rely on the object-type to tell us anything about the way the object
> name was spelled.

Changing the object_array API would be hard, but I don't think we need
to do it here. Can we simply stop using object_array to pass the list,
and instead just have a custom list?

I'll see how painful that is.

-Peff


[PATCHv2] grep: use slash for path delimiter, not colon

2013-08-26 Thread Phil Hord
When a commit is grepped and matching filenames are printed, grep-objects
creates the filename by prefixing the original cmdline argument to the
matched path separated by a colon.  Normally this forms a valid blob
reference to the filename, like this:

  git grep -l foo HEAD
  HEAD:some/path/to/foo.txt
      ^

But a tree path may be given to grep instead; in this case the colon is
not a valid delimiter to use since it is placed inside a path.

  git grep -l foo HEAD:some
  HEAD:some:path/to/foo.txt
           ^

The slash path delimiter should be used instead.  Fix git grep to
discern the correct delimiter so it can report valid object names.

  git grep -l foo HEAD:some
  HEAD:some/path/to/foo.txt
           ^

Also, prevent the delimiter being added twice, as happens now in these
examples:

  git grep -l foo HEAD:
  HEAD::some/path/to/foo.txt
       ^
  git grep -l foo HEAD:some/
  HEAD:some/:path/to/foo.txt
            ^

Add a test to confirm correct path forming.
---
This version is a bit more deterministic and also adds a test.

It accepts the expense of examining the path argument again to 
determine if it is a tree-ish + path rather than just a tree (commit).
The get_sha1 call occurs one extra time for each tree-ish argument,
so it's not expensive. We avoid mucking with the object_array API this
way, and also do not rely on the object-type to tell us anything about
the way the object name was spelled.

This one also adds a check to avoid duplicating an extant delimiter.
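
The four cases in the commit message can be summarized in a small standalone sketch. This is an illustration only, not the patch: `build_prefix` is a hypothetical helper, and as a simplifying assumption it decides "tree-ish + path" by looking for a colon followed by at least one character, where the patch asks get_sha1_with_context() whether ctx.path is non-empty.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: write into out the prefix that
 * would be put in front of each matched path, given the command-line
 * spelling of the tree in name.
 */
void build_prefix(const char *name, char *out, size_t outlen)
{
	const char *colon;
	char delimiter = ':';
	size_t len;

	assert(name != NULL && out != NULL && outlen > 0);

	len = strlen(name);
	if (!len) {
		/* no name given: no prefix at all */
		out[0] = '\0';
		return;
	}

	/* "rev:path" spelling: the prefix already ends inside a path */
	colon = strchr(name, ':');
	if (colon && colon[1] != '\0')
		delimiter = '/';

	/* do not repeat a delimiter that is already the last character */
	if (name[len - 1] == delimiter)
		snprintf(out, outlen, "%s", name);
	else
		snprintf(out, outlen, "%s%c", name, delimiter);
}
```

With this sketch, "HEAD" and "HEAD:" both give the prefix "HEAD:", while "HEAD:some" and "HEAD:some/" both give "HEAD:some/", so appending "path/to/foo.txt" always yields a valid object name.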

 builtin/grep.c  |  9 ++++++++-
 t/t7810-grep.sh | 15 +++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 03bc442..6fc418f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -480,8 +480,15 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		len = name ? strlen(name) : 0;
 		strbuf_init(&base, PATH_MAX + len + 1);
 		if (len) {
+			struct object_context ctx;
+			unsigned char sha1[20];
+			char delimiter = ':';
+			if (!get_sha1_with_context(name, 0, sha1, &ctx) &&
+			    ctx.path[0]!=0)
+				delimiter='/';
 			strbuf_add(&base, name, len);
-			strbuf_addch(&base, ':');
+			if (name[len-1] != delimiter)
+				strbuf_addch(&base, delimiter);
 		}
 		init_tree_desc(&tree, data, size);
 		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index f698001..2494bfc 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -886,6 +886,21 @@ test_expect_success 'grep -e -- -- path' '
 '
 
 cat >expected actual &&
+   test_cmp expected actual
+'
+
+cat >expected