Re: [PATCH] builtin/blame: destroy initialized commit_info only

2015-02-09 Thread Eric Sunshine
On Mon, Feb 9, 2015 at 4:28 PM, Eric Sunshine sunsh...@sunshineco.com wrote:
 Since ea02ffa3 (mailmap: simplify map_user() interface, 2013-01-05),
 find_alignment() has been invoking commit_info_destroy() on an
 uninitialized auto 'struct commit_info' (when METAINFO_SHOWN is not
 set). commit_info_destroy() calls strbuf_release() for each of

s/each of/each/

...despite several proof-reads (sigh).

 'commit_info' strbuf member, which randomly invokes free() on whatever
 random stack value happens to be reside in strbuf.buf, thus leading to
 periodic crashes.

 Reported-by: Dilyan Palauzov dilyan.palau...@aegee.org
 Signed-off-by: Eric Sunshine sunsh...@sunshineco.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] builtin/blame: destroy initialized commit_info only

2015-02-09 Thread Eric Sunshine
On Mon, Feb 9, 2015 at 4:33 PM, Eric Sunshine sunsh...@sunshineco.com wrote:
 On Mon, Feb 9, 2015 at 4:28 PM, Eric Sunshine sunsh...@sunshineco.com wrote:
 Since ea02ffa3 (mailmap: simplify map_user() interface, 2013-01-05),
 find_alignment() has been invoking commit_info_destroy() on an
 uninitialized auto 'struct commit_info' (when METAINFO_SHOWN is not
 set). commit_info_destroy() calls strbuf_release() for each of

 s/each of/each/

 ...despite several proof-reads (sigh).

 'commit_info' strbuf member, which randomly invokes free() on whatever
 random stack value happens to be reside in strbuf.buf, thus leading to

s/to be/to/

(sigh again)

 periodic crashes.

 Reported-by: Dilyan Palauzov dilyan.palau...@aegee.org
 Signed-off-by: Eric Sunshine sunsh...@sunshineco.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] builtin/blame: destroy initialized commit_info only

2015-02-09 Thread Jeff King
On Mon, Feb 09, 2015 at 04:28:07PM -0500, Eric Sunshine wrote:

 Since ea02ffa3 (mailmap: simplify map_user() interface, 2013-01-05),
 find_alignment() has been invoking commit_info_destroy() on an
 uninitialized auto 'struct commit_info' (when METAINFO_SHOWN is not
 set). commit_info_destroy() calls strbuf_release() for each of
 'commit_info' strbuf member, which randomly invokes free() on whatever
 random stack value happens to be reside in strbuf.buf, thus leading to
 periodic crashes.
 
 Reported-by: Dilyan Palauzov dilyan.palau...@aegee.org
 Signed-off-by: Eric Sunshine sunsh...@sunshineco.com
 ---
 
 No test accompanying this fix since I don't know how to formulate one.
 
 Discussion: http://thread.gmane.org/gmane.comp.version-control.git/263534

Thanks, good catch. This is almost certainly the source of the reported
bug. I was curious why we didn't find this until now, and what we could
do to detect similar problems. You can stop reading if you're not
interested. :)

We call strbuf_release on each of the uninitialized components. That
will only call free(sb-buf) if sb-alloc is non-zero. And if sb-buf is
0, then that is OK (it is a noop to free NULL). So the bug only shows
itself when the uninitialized memory is such that both are non-zero.

We're in a loop here, where we set the METAINFO_SHOWN to avoid
processing commits from the loop twice. So generally the first time
through the loop (when we really do have uninitialized crap in the
struct), that flag will not be set, and we will fill in the struct. Then
a later iteration of the loop does have the flag set, and does not fill
in the struct. But quite often we'll have whatever was left in the
struct from the previous loop iteration. Which, because strbuf_release()
re-initializes the strbufs, is a valid strbuf which it is a noop to
free.

Of course, that's entirely up to the compiler. It's reasonable to use
the same block for the variable in each iteration of the loop, but it is
free to do whatever it likes. It may even come and go with various
optimization settings.

Obviously if we hit a compiler that feeds the uninitialized values to
free(), this is pretty easy to detect with valgrind or another
heap-checking library. But let's assume we have a compiler that does the
reuse behavior. There's no bug in the generated code, but it's
technically undefined behavior. Can we detect it?

The compiler _could_ know statically that this is a use of uninitialized
memory, except that it only sees:

commit_info_destroy(ci);

on the uninitialized memory. Since we are passing a pointer to a
function it can't see, it has to assume that it's actually initializing
the memory for us (i.e., it looks the same as commit_info_init). So it
doesn't catch it. I don't know of a way to annotate the pointers better,
or if there is a way to get the compiler to see the code in a more
global way.

Valgrind cannot detect this. With the reuse behavior, we never even
call malloc, so there's nothing for it to see there. We are accessing
uninitialized memory, but it's on the stack. Valgrind doesn't even know
about it, or that we are looping here.

Clang's address sanitizer has compiler support, so it does get to see
this memory and could put a canary value in for each loop iteration. But
it doesn't. Instead, you're supposed to use the memory sanitizer to
catch uninitialized memory.

I tried that, but got overwhelmed with false positives. Like valgrind,
it has problems accepting that memory written by zlib is actually
initialized. But in theory, if we went to the work to annotate some
false positives, it should be able to find this problem.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] builtin/blame: destroy initialized commit_info only

2015-02-09 Thread Jeff King
On Mon, Feb 09, 2015 at 06:24:35PM -0500, Jeff King wrote:

 Clang's address sanitizer has compiler support, so it does get to see
 this memory and could put a canary value in for each loop iteration. But
 it doesn't. Instead, you're supposed to use the memory sanitizer to
 catch uninitialized memory.
 
 I tried that, but got overwhelmed with false positives. Like valgrind,
 it has problems accepting that memory written by zlib is actually
 initialized. But in theory, if we went to the work to annotate some
 false positives, it should be able to find this problem.

I got rid of the false positives here, through a combination of
compiling with NO_OPENSSL (since it otherwise doesn't know that
git_SHA1_Final is initializing hashes), and this patch which lets it
assume that the output of zlib (at least for these cases) is always
initialized:

diff --git a/sha1_file.c b/sha1_file.c
index 30995e6..28c8f84 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1682,6 +1682,7 @@ unsigned long get_size_from_delta(struct packed_git *p,
git_zstream stream;
int st;
 
+   memset(delta_head, 0, 20);
memset(stream, 0, sizeof(stream));
stream.next_out = delta_head;
stream.avail_out = sizeof(delta_head);
@@ -1973,6 +1974,7 @@ static void *unpack_compressed_entry(struct packed_git *p,
buffer = xmallocz_gently(size);
if (!buffer)
return NULL;
+   memset(buffer, 0, size);
memset(stream, 0, sizeof(stream));
stream.next_out = buffer;
stream.avail_out = size + 1;


Sadly, though, the test case in question runs to completion. It does not
seem to detect our use of uninitialized memory. :(

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html