On Sat, Oct 27, 2012 at 08:37:24PM +0200, Peter Oberndorfer wrote:

> It seems "git diff-tree -Ganything <tree>" crashes[1] with a null
> pointer dereference
> when run on a commit that adds a file (pdf) with a textconv filter.
> It can be reproduced with vanilla git by having a commit on top that
> adds a file with a textconv filter and executing git diff-tree
> -Ganything HEAD
> But running git log -Ganything still works without a crash.
> This problem seems to exist since the feature was first added in f506b8e8b5.

Thanks for a thorough bug report. I didn't reproduce the crash, but I
can see how it happens (it happens with diff-tree because we will reuse
the working tree file in that instance; it could also happen if you
turned on textconv caching).

> While testing I also noticed the -S and -G act on the original file
> instead of the textconv munged data.
> Is this intentional or caused by accessing the wrong data?

Both, perhaps. :)

-G operates on the munged data; you can see it feed the munged data to
xdiff in diff_grep. But the optimization for handling added and removed
files accidentally fed the wrong pointer. Fixing that is a no-brainer,
since the optimization is inconsistent with the regular code path.

-S, however, predates the invention of textconv and has never used it.
It is a little less clear that textconv is the right thing here, because
it is not about grepping the diff, but about counting occurrences of the
string inside the file content. I tend to think that doing so on the
textconv'd data would be what people generally want, but it is a
behavior change.

> Wild guess: should we really access p->one->data and not mf1.ptr?

Precisely. Thanks for your wild guess; it made finding the bug very
easy. :)

> Is there some more information i should provide?

The patch below should fix it. I added tests, but please try your
real-world test case on it to double-check.

-- >8 --
Subject: [PATCH] diff_grep: use textconv buffers for add/deleted files

If you use "-G" to grep a diff, we will apply a configured
textconv filter to the data before generating the diff.
However, if the diff is an addition or deletion, we do not
bother running the diff at all, and just look for the token
in the added (or removed) content. This works because we
know that the diff must contain every line of content.

However, while we used the textconv-derived buffers in the
regular diff, we accidentally passed the original unmodified
buffers to regexec when checking the added or removed
content. This could lead to an incorrect answer.

Worse, in some cases we might have a textconv buffer but no
original buffer (e.g., if we pulled the textconv data from
cache, or if we reused a working tree file when generating
it). In that case, we could actually feed NULL to regexec
and segfault.

Reported-by: Peter Oberndorfer <kumbay...@arcor.de>
Signed-off-by: Jeff King <p...@peff.net>
 diffcore-pickaxe.c       |  4 ++--
 t/t4030-diff-textconv.sh | 12 ++++++++++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index ed23eb4..a209376 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -104,10 +104,10 @@ static int diff_grep(struct diff_filepair *p, struct 
diff_options *o,
                if (!mf2.ptr)
                        return 0; /* ignore unmerged */
                /* created "two" -- does it have what we are looking for? */
-               hit = !regexec(regexp, p->two->data, 1, &regmatch, 0);
+               hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);
        } else if (!mf2.ptr) {
                /* removed "one" -- did it have what we are looking for? */
-               hit = !regexec(regexp, p->one->data, 1, &regmatch, 0);
+               hit = !regexec(regexp, mf1.ptr, 1, &regmatch, 0);
        } else {
                 * We have both sides; need to run textual diff and see if
diff --git a/t/t4030-diff-textconv.sh b/t/t4030-diff-textconv.sh
index eebb1ee..461d27a 100755
--- a/t/t4030-diff-textconv.sh
+++ b/t/t4030-diff-textconv.sh
@@ -84,6 +84,18 @@ test_expect_success 'status -v produces text' '
        git reset --soft HEAD@{1}
+test_expect_success 'grep-diff (-G) operates on textconv data (add)' '
+       echo one >expect &&
+       git log --root --format=%s -G0 >actual &&
+       test_cmp expect actual
+test_expect_success 'grep-diff (-G) operates on textconv data (modification)' '
+       echo two >expect &&
+       git log --root --format=%s -G1 >actual &&
+       test_cmp expect actual
 cat >expect.stat <<'EOF'
  file | Bin 2 -> 4 bytes
  1 file changed, 0 insertions(+), 0 deletions(-)

To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to