Re: Stack read out-of-bounds in parse_sha1_header_extended using git 2.10.0

2016-09-27 Thread Jeff King
On Mon, Sep 26, 2016 at 11:10:54AM -0700, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > I am inclined to say that it has no security implications.  You have
> > to be able to write a bogus loose object in an object store you
> > already have write access to in the first place, in order to cause
> > this ...
> 
> Note that you could social-engineer others to fetch from you and
> feed a small enough update that results in loose objects created in
> their repositories, without you having a direct write access to the
> repository.
> 
> The codepath under discussion in this thread however cannot be used
> as an attack vector via that route, because the "fetch from
> elsewhere" codepath runs verification of the incoming data stream
> before storing the results (either in loose object files, or in a
> packfile) on disk.

I don't think it could be used at all for anything that speaks the git
protocol, because the object header is not present at all in a packfile.
So even if you hit unpack-objects, it would be writing the (correct)
loose object header itself.

But when we grab loose objects _directly_ from a remote, as in dumb-http
fetch, I'd suspect that the code doing the verification calls
unpack_sha1_header() as part of it. So I didn't test, but I'd strongly
suspect that's a viable attack vector.

I'm not sure what the actual attack would look like, though, aside from
locally accessing memory in a read-only way.

-Peff


Re: Stack read out-of-bounds in parse_sha1_header_extended using git 2.10.0

2016-09-26 Thread Gustavo Grieco
Btw, this other test case will trigger a similar issue, but in another line of 
code:

To reproduce: 

$ git init ; mkdir -p .git/objects/b2 ; printf 
'eJwNwoENgDAIBECkDsII5Z8CHagLGPePXu59zjHGRIOZG3OzI/lnRc4KemXDPdYSml6iQ+4ATIZ+nAEK4g=='
 | base64 -d > .git/objects/b2/93584ddd61af21260be75ee9f73e9d53f08cd0

Then:

$ git fsck

notice: HEAD points to an unborn branch (master)
=
==24569==ERROR: AddressSanitizer: stack-buffer-overflow on address 
0x7ffe7645fda0 at pc 0x006fe799 bp 0x7ffe7645fc40 sp 0x7ffe7645fc30
READ of size 1 at 0x7ffe7645fda0 thread T0
#0 0x6fe798 in parse_sha1_header_extended 
/home/g/Work/Code/git-2.10.0/sha1_file.c:1714
...

It will be nice to test the current patch.

- Original Message -
> Junio C Hamano  writes:
> 
> > I am inclined to say that it has no security implications.  You have
> > to be able to write a bogus loose object in an object store you
> > already have write access to in the first place, in order to cause
> > this ...
> 
> Note that you could social-engineer others to fetch from you and
> feed a small enough update that results in loose objects created in
> their repositories, without you having a direct write access to the
> repository.
> 
> The codepath under discussion in this thread however cannot be used
> as an attack vector via that route, because the "fetch from
> elsewhere" codepath runs verification of the incoming data stream
> before storing the results (either in loose object files, or in a
> packfile) on disk.
> 
> 


Re: Stack read out-of-bounds in parse_sha1_header_extended using git 2.10.0

2016-09-26 Thread Junio C Hamano
Junio C Hamano  writes:

> I am inclined to say that it has no security implications.  You have
> to be able to write a bogus loose object in an object store you
> already have write access to in the first place, in order to cause
> this ...

Note that you could social-engineer others to fetch from you and
feed a small enough update that results in loose objects created in
their repositories, without you having a direct write access to the
repository.

The codepath under discussion in this thread however cannot be used
as an attack vector via that route, because the "fetch from
elsewhere" codepath runs verification of the incoming data stream
before storing the results (either in loose object files, or in a
packfile) on disk.



Re: Stack read out-of-bounds in parse_sha1_header_extended using git 2.10.0

2016-09-26 Thread Junio C Hamano
Gustavo Grieco  writes:

> Fair enough. We are testing our tool to try to find
> bugs/vulnerabilities in several git implementations. I will report
> here my results if i can find some other memory issue in this git
> client.

Thanks.  With or without security implications, it is basic codebase
hygiene to identify and correct these issues, and your help is
highly appreciated.


Re: Stack read out-of-bounds in parse_sha1_header_extended using git 2.10.0

2016-09-26 Thread Gustavo Grieco
Fair enough. We are testing our tool to try to find bugs/vulnerabilities in 
several git implementations. I will report here my results if i can find some 
other memory issue in this git client.

- Original Message -
> Gustavo Grieco  writes:
> 
> > Now that the cause of this issue is identified, i would like to
> > know if there is an impact in the security, so i can request a CVE
> > if necessary.
> 
> I am inclined to say that it has no security implications.  You have
> to be able to write a bogus loose object in an object store you
> already have write access to in the first place, in order to cause
> this read-only access that goes beyond what is allocated, so at the
> worst, what you can do is to hurt yourself, and you can already hurt
> yourself in various other ways.
> 


Re: Stack read out-of-bounds in parse_sha1_header_extended using git 2.10.0

2016-09-26 Thread Junio C Hamano
Gustavo Grieco  writes:

> Now that the cause of this issue is identified, i would like to
> know if there is an impact in the security, so i can request a CVE
> if necessary.

I am inclined to say that it has no security implications.  You have
to be able to write a bogus loose object in an object store you
already have write access to in the first place, in order to cause
this read-only access that goes beyond what is allocated, so at the
worst, what you can do is to hurt yourself, and you can already hurt
yourself in various other ways.


Re: Stack read out-of-bounds in parse_sha1_header_extended using git 2.10.0

2016-09-26 Thread Gustavo Grieco
Hello,

Now that the cause of this issue is identified, i would like to know if there 
is an impact in the security, so i can request a CVE if necessary.

Thanks!


Re: Stack read out-of-bounds in parse_sha1_header_extended using git 2.10.0

2016-09-26 Thread Jeff King
On Sun, Sep 25, 2016 at 05:10:31PM -0700, Junio C Hamano wrote:

> Gustavo Grieco  writes:
> 
> > We found a stack read out-of-bounds parsing object files using git 2.10.0. 
> > It was tested on ArchLinux x86_64. To reproduce, first recompile git with 
> > ASAN support and then execute:
> >
> > $ git init ; mkdir -p .git/objects/b2 ; printf 'x' > 
> > .git/objects/b2/93584ddd61af21260be75ee9f73e9d53f08cd0
> 
> Interesting.  If you prepare such a broken loose object file in your
> local repository, I would expect that either unpack_sha1_header() or
> unpack_sha1_header_to_strbuf() that sha1_loose_object_info() calls
> would detect and barf by noticing that an error came from libz while
> it attempts to inflate and would not even call parse_sha1_header.
> 
> But it is nevertheless bad to assume that whatever happens to
> inflate without an error must be formatted correctly to allow
> parsing (i.e. has ' ' and then NUL termination within the first 32
> bytes after inflation), which is exactly what the hdr[32] is saying.

Yeah. I also was surprised that we didn't barf on a zlib failure. But
based on previous debugging of corrupted zlib data, my recollection
is that there are a large number of weird corruptions that zlib will
happily pass back and only later complain about a checksum error. So
presumably "x" is one of those, and it might not hold for other
corruptions (but I didn't try).

> Note that this is totally unteseted and not thought through; I
> briefly thought about what unpack_sha1_header_to_strbuf() does with
> this change (it first lets unpack_sha1_header() to attempt with a
> small buffer but it seems to discard the error code from it before
> seeing if the returned buffer has NUL in it); there may be bad
> interactions with it.

Yeah, that seems wrong. I don't think it would involve an out of bounds
read, but we probably could fail to correctly report zlib corruption.

> diff --git a/sha1_file.c b/sha1_file.c
> index 60ff21f..dfcbd76 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1648,6 +1648,8 @@ unsigned long unpack_object_header_buffer(const 
> unsigned char *buf,
>  
>  int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned 
> long mapsize, void *buffer, unsigned long bufsiz)
>  {
> + int status;
> +
>   /* Get the data stream */
>   memset(stream, 0, sizeof(*stream));
>   stream->next_in = map;
> @@ -1656,7 +1658,15 @@ int unpack_sha1_header(git_zstream *stream, unsigned 
> char *map, unsigned long ma
>   stream->avail_out = bufsiz;
>  
>   git_inflate_init(stream);
> - return git_inflate(stream, 0);
> + status = git_inflate(stream, 0);
> + if (status)
> + return status;
> +
> + /* Make sure we got the terminating NUL for the object header */
> + if (!memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
> + return -1;
> +
> + return 0;

This doesn't look too invasive as an approach, though I would have done
it differently. We're making the assumption that once there is a NUL,
the header-parser won't do anything stupid, which creates a coupling
between those two bits of code. My inclination would have been to just
treat the header as a ptr/len pair, and make sure the parser never reads
past the end.

But I implemented that, and it _is_ rather invasive. And it's not like
coupling unpack_sha1_header() and parse_sha1_header() is all that
terrible; they are meant to be paired.

I haven't read through your follow-up yet; I'll do that before posting
my version.

>  static int unpack_sha1_header_to_strbuf(git_zstream *stream, unsigned char 
> *map,
> @@ -1758,6 +1768,8 @@ static int parse_sha1_header_extended(const char *hdr, 
> struct object_info *oi,
>   char c = *hdr++;
>   if (c == ' ')
>   break;
> + if (!c)
> + die("invalid object header");
>   type_len++;
>   }

We keep reading from hdr after this, though I think those bits would all
bail correctly on seeing NUL.

-Peff


Re: Stack read out-of-bounds in parse_sha1_header_extended using git 2.10.0

2016-09-25 Thread Junio C Hamano
Gustavo Grieco  writes:

> We found a stack read out-of-bounds parsing object files using git 2.10.0. It 
> was tested on ArchLinux x86_64. To reproduce, first recompile git with ASAN 
> support and then execute:
>
> $ git init ; mkdir -p .git/objects/b2 ; printf 'x' > 
> .git/objects/b2/93584ddd61af21260be75ee9f73e9d53f08cd0

Interesting.  If you prepare such a broken loose object file in your
local repository, I would expect that either unpack_sha1_header() or
unpack_sha1_header_to_strbuf() that sha1_loose_object_info() calls
would detect and barf by noticing that an error came from libz while
it attempts to inflate and would not even call parse_sha1_header.

But it is nevertheless bad to assume that whatever happens to
inflate without an error must be formatted correctly to allow
parsing (i.e. has ' ' and then NUL termination within the first 32
bytes after inflation), which is exactly what the hdr[32] is saying.

Perhaps we need something like the following to tighten the
codepath.

Note that this is totally unteseted and not thought through; I
briefly thought about what unpack_sha1_header_to_strbuf() does with
this change (it first lets unpack_sha1_header() to attempt with a
small buffer but it seems to discard the error code from it before
seeing if the returned buffer has NUL in it); there may be bad
interactions with it.


 sha1_file.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index 60ff21f..dfcbd76 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1648,6 +1648,8 @@ unsigned long unpack_object_header_buffer(const unsigned 
char *buf,
 
 int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long 
mapsize, void *buffer, unsigned long bufsiz)
 {
+   int status;
+
/* Get the data stream */
memset(stream, 0, sizeof(*stream));
stream->next_in = map;
@@ -1656,7 +1658,15 @@ int unpack_sha1_header(git_zstream *stream, unsigned 
char *map, unsigned long ma
stream->avail_out = bufsiz;
 
git_inflate_init(stream);
-   return git_inflate(stream, 0);
+   status = git_inflate(stream, 0);
+   if (status)
+   return status;
+
+   /* Make sure we got the terminating NUL for the object header */
+   if (!memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
+   return -1;
+
+   return 0;
 }
 
 static int unpack_sha1_header_to_strbuf(git_zstream *stream, unsigned char 
*map,
@@ -1758,6 +1768,8 @@ static int parse_sha1_header_extended(const char *hdr, 
struct object_info *oi,
char c = *hdr++;
if (c == ' ')
break;
+   if (!c)
+   die("invalid object header");
type_len++;
}
 



Stack read out-of-bounds in parse_sha1_header_extended using git 2.10.0

2016-09-25 Thread Gustavo Grieco
Hi,

We found a stack read out-of-bounds parsing object files using git 2.10.0. It 
was tested on ArchLinux x86_64. To reproduce, first recompile git with ASAN 
support and then execute:

$ git init ; mkdir -p .git/objects/b2 ; printf 'x' > 
.git/objects/b2/93584ddd61af21260be75ee9f73e9d53f08cd0

Finally you can trigger the bug using several commands from git (other commands 
that parses all objects will work too), for instance:

$ git fsck

The ASAN report is here:

==2763==ERROR: AddressSanitizer: stack-buffer-overflow on address 
0x7ffe16e4a690 at pc 0x006fe5dc bp 0x7ffe16e4a530 sp 0x7ffe16e4a520
READ of size 1 at 0x7ffe16e4a690 thread T0
#0 0x6fe5db in parse_sha1_header_extended 
/home/g/Work/Code/git-2.10.0/sha1_file.c:1684
#1 0x702cd4 in sha1_loose_object_info 
/home/g/Work/Code/git-2.10.0/sha1_file.c:2660
#2 0x70332c in sha1_object_info_extended 
/home/g/Work/Code/git-2.10.0/sha1_file.c:2696
#3 0x7038e0 in sha1_object_info 
/home/g/Work/Code/git-2.10.0/sha1_file.c:2745
#4 0x648498 in parse_object /home/g/Work/Code/git-2.10.0/object.c:260
#5 0x48d46d in fsck_sha1 builtin/fsck.c:367
#6 0x48da47 in fsck_loose builtin/fsck.c:493
#7 0x707514 in for_each_file_in_obj_subdir 
/home/g/Work/Code/git-2.10.0/sha1_file.c:3477
#8 0x70775b in for_each_loose_file_in_objdir_buf 
/home/g/Work/Code/git-2.10.0/sha1_file.c:3512
#9 0x707885 in for_each_loose_file_in_objdir 
/home/g/Work/Code/git-2.10.0/sha1_file.c:3532
#10 0x48dc1d in fsck_object_dir builtin/fsck.c:521
#11 0x48e2e6 in cmd_fsck builtin/fsck.c:644
#12 0x407a8f in run_builtin /home/g/Work/Code/git-2.10.0/git.c:352
#13 0x407e35 in handle_builtin /home/g/Work/Code/git-2.10.0/git.c:539
#14 0x408175 in run_argv /home/g/Work/Code/git-2.10.0/git.c:593
#15 0x408458 in cmd_main /home/g/Work/Code/git-2.10.0/git.c:665
#16 0x53fc70 in main /home/g/Work/Code/git-2.10.0/common-main.c:40
#17 0x7f0f46d43290 in __libc_start_main (/usr/lib/libc.so.6+0x20290)
#18 0x405209 in _start (/home/g/Work/Code/git-2.10.0/git+0x405209)

Address 0x7ffe16e4a690 is located in stack of thread T0 at offset 192 in frame
#0 0x702834 in sha1_loose_object_info 
/home/g/Work/Code/git-2.10.0/sha1_file.c:2614

  This frame has 5 object(s):
[32, 40) 'mapsize'
[96, 120) 'hdrbuf'
[160, 192) 'hdr' <== Memory access at offset 192 overflows this variable
[224, 368) 'st'
[416, 576) 'stream'
HINT: this may be a false positive if your program uses some custom stack 
unwind mechanism or swapcontext
  (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow 
/home/g/Work/Code/git-2.10.0/sha1_file.c:1684 in parse_sha1_header_extended
Shadow bytes around the buggy address:
  0x100042dc1480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100042dc1490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100042dc14a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100042dc14b0: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f4
  0x100042dc14c0: f4 f4 f2 f2 f2 f2 00 00 00 f4 f2 f2 f2 f2 00 00
=>0x100042dc14d0: 00 00[f2]f2 f2 f2 00 00 00 00 00 00 00 00 00 00
  0x100042dc14e0: 00 00 00 00 00 00 00 00 f4 f4 f2 f2 f2 f2 00 00
  0x100042dc14f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100042dc1500: 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
  0x100042dc1510: 00 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f2 f2 f2 f2
  0x100042dc1520: 00 00 00 00 00 f4 f4 f4 f3 f3 f3 f3 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:   00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:   fa
  Heap right redzone:  fb
  Freed heap region:   fd
  Stack left redzone:  f1
  Stack mid redzone:   f2
  Stack right redzone: f3
  Stack partial redzone:   f4
  Stack after return:  f5
  Stack use after scope:   f8
  Global redzone:  f9
  Global init order:   f6
  Poisoned by user:f7
  Container overflow:  fc
  Array cookie:ac
  Intra object redzone:bb
  ASan internal:   fe
  Left alloca redzone: ca
  Right alloca redzone:cb


Regards,
Gustavo.