While the xdiff machinery is quite capable of working with strings given
as pointer and size, Git's add-on functionality simply assumes that we
are operating on NUL-terminated strings, e.g. by running regexec() on the
provided pointer, with no way to pass the size as well.

In general, this assumption is wrong.

It is true that many code paths populate the mmfile_t structure while
silently appending a NUL, e.g. when running textconv on a temporary file
and reading the results back into an strbuf.

The assumption is most definitely wrong, however, when mmap()ing a file.

Practically, we seemed to be lucky that the bytes after mmap()ed memory
were 1) accessible and 2) somehow contained NUL bytes *somewhere*.

In a use case reported by Chris Sidi, it turned out that the mmap()ed
file had the precise size of a memory page, and on Windows the bytes
after memory-mapped pages are in general not valid.
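
To illustrate why that particular file size matters: a mapping covers
whole pages, so when the file does not fill its last page, the rest of
that page is still addressable (and, on POSIX systems, zero-filled),
which is presumably why we were lucky so far. Only a file whose size is
an exact multiple of the page size leaves no such slack. A minimal sketch
of that condition, using a hypothetical helper name that does not exist
in Git:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of Git: returns 1 if a file of
 * `file_size` bytes fills its last page completely, i.e. the first
 * byte past the mapped data lies on a page that was never mapped.
 */
static int ends_on_page_boundary(size_t file_size, size_t page_size)
{
	assert(page_size != 0);
	return file_size != 0 && file_size % page_size == 0;
}
```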

This patch works around that issue, giving us time to discuss the best
way to fix this problem more generally.
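
The idea of the workaround, reduced to a standalone sketch: copy the
bytes into a buffer that is one byte larger and terminate it explicitly.
The helper name below is hypothetical, and it uses plain malloc() where
the patch itself uses Git's xmalloc():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git: return a heap copy of the
 * `len` bytes at `src` with a NUL appended, or NULL if malloc() fails.
 * The caller owns the result and must free() it.
 */
static char *dup_with_nul(const void *src, size_t len)
{
	char *copy;

	assert(src != NULL || len == 0);
	copy = malloc(len + 1);
	if (!copy)
		return NULL;
	if (len)
		memcpy(copy, src, len);
	copy[len] = '\0';	/* the terminator regexec() relies on */
	return copy;
}
```

Once the copy exists, the original mapping can be released, at the cost
of one extra allocation and copy per mmap()ed file.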

Signed-off-by: Johannes Schindelin <johannes.schinde...@gmx.de>
---
 diff.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/diff.c b/diff.c
index 534c12e..32f7f46 100644
--- a/diff.c
+++ b/diff.c
@@ -2826,6 +2826,15 @@ int diff_populate_filespec(struct diff_filespec *s, unsigned int flags)
                        s->data = strbuf_detach(&buf, &size);
                        s->size = size;
                        s->should_free = 1;
+               } else {
+                       /* data must be NUL-terminated, e.g. for regexec() */
+                       char *data = xmalloc(s->size + 1);
+                       memcpy(data, s->data, s->size);
+                       data[s->size] = '\0';
+                       munmap(s->data, s->size);
+                       s->should_munmap = 0;
+                       s->data = data;
+                       s->should_free = 1;
        else {

