On Mon, Oct 29, 2012 at 02:05:24AM -0400, Jeff King wrote:

> > I have a file of exactly 12288 (0x3000) bytes in the repository.
> > When the file is loaded, the data happens to be placed so that its
> > end falls on a page boundary.
> > Later, diff_grep() calls regexec(), which calls strlen() on the loaded
> > buffer and ends up reading beyond the actual data into the next page,
> > which is not allocated, causing a page fault.
> > Or it could possibly (randomly) match the regex on data that is not
> > actually part of the file...
>
> Yuck. For the most part, we treat blob content (and generally most
> object content) as a sized buffer. However, there are some spots which,
> either through laziness or because a code interface expects a string, we
> pass the value as a string. This works because the object-reading code
> puts an extra NUL at the end of our buffer to handle just such an
> instance. So we might prematurely end if the object contains embedded
> NULs, but we would never read past the end.
>
> The code to read the output of a textconv filter does not do this
> explicitly. I would think it would get it for free by virtue of reading
> into a strbuf, though. I'll try to investigate.

I can't seem to replicate the problem here, even under valgrind. Do you
have a minimal test case?
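
For reference, here is a rough sketch of the "sized buffer plus an extra
NUL" convention I described above, along with a bounded scan that never
looks past the known size. The helper names copy_with_trailing_nul() and
bounded_strlen() are made up for illustration; this is not the actual
object-reading code, just an assumption-laden model of the idea.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not git's real API): copy `size` bytes into a
 * fresh allocation of size + 1 bytes and store a NUL in the extra slot,
 * so that string functions stop at or before the end of the allocation.
 */
static char *copy_with_trailing_nul(const void *data, size_t size)
{
	char *buf;

	assert(data || !size);	/* a NULL source is only OK for empty input */
	buf = malloc(size + 1);
	if (!buf)
		return NULL;
	if (size)
		memcpy(buf, data, size);
	buf[size] = '\0';
	return buf;
}

/*
 * Hypothetical helper: the length a string function would see, computed
 * without reading more than `size` bytes. This is safe even when the
 * buffer has no trailing NUL at all.
 */
static size_t bounded_strlen(const char *buf, size_t size)
{
	const char *nul = memchr(buf, '\0', size);

	return nul ? (size_t)(nul - buf) : size;
}
```

With a buffer built by the first helper, strlen() can stop early on an
embedded NUL but cannot run off the end. A buffer that lacks the extra
byte has no such guarantee, and only the bounded variant is safe there.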

