Re: [BUG] parse_object() does not behave as documented

2014-08-08 Thread Samuel Bronson
Ping?

Samuel Bronson naes...@gmail.com writes:

 [Hmm, nobody seems ot have commented on this analysis; maybe reposting
 it with a subject containing [BUG] will help?]

 Samuel Bronson naes...@gmail.com writes:

 The following message is a courtesy copy of an article
 that has been posted to gmane.comp.version-control.git as well.

 Oh, I forgot to provide any analysis of the problem.  Oops.

 It may be just as well, though; I was tired enough that I might have
 botched it in any case.  So, have an analysis:

 While inflate errors are obviously NOT GOOD, and should perhaps be
 instantly fatal for most commands, git fsck is something of a special
 case because it is useful to have *it* report as many corrupt objects as
 possible in one run.

 Unfortunately, this is not currently the case, as shown by the provided
 testcase.

 The output for this testcase is:

 ,
 | checking known breakage:
 | hash1= 
 | hash2=fffe 
 | mkdir -p .git/objects/ff 
 | echo not-zlib $(sha1_file $hash1) 
 | test_when_finished remove_object $hash1 
 | echo not-zlib $(sha1_file $hash2) 
 | test_when_finished remove_object $hash2 
 | 
 | # Return value is not documented
 | test_might_fail git fsck 2out 
 | cat out  echo == 
 | grep $hash1.*corrupt out 
 | grep $hash2.*corrupt out
 | 
 | error: inflate: data stream error (incorrect header check)
 | error: unable to unpack  header
 | error: inflate: data stream error (incorrect header check)
 | fatal: loose object  (stored
 | in .git/objects/ff/ff) is
 | corrupt
 | ==
 | fatal: loose object  (stored
 | in .git/objects/ff/ff) is
 | corrupt
 | not ok 5 - fsck survives inflate errors # TODO known breakage
 `

 If I flip it from expect_failure to expect_success and run the test with
 -i, then go into the trash directory and run gdb ../../git-fsck, I can
 obtain this (thoroughly rehearsed  trimmed) gdb transcript:

 ,
 | % gdb ../../git-fsck
 | GNU gdb (Debian 7.7.1-3) 7.7.1
 ...
 | Reading symbols from ../../git-fsck...done.
 | (gdb) break error
 | Breakpoint 1 at 0x813d24c: file usage.c, line 143.
 | (gdb) break die
 | Breakpoint 2 at 0x813d152: file usage.c, line 94.
 | (gdb) run
 | Starting program: /home/naesten/hacking/git/git-fsck
 | [Thread debugging using libthread_db enabled]
 | Using host libthread_db library
 | /lib/i386-linux-gnu/i686/cmov/libthread_db.so.1.
 | Checking object directories: 100% (256/256), done.
 | 
 | Breakpoint 1, error (err=0x8182f7a inflate: %s (%s)) at usage.c:143
 | 143 {
 | (gdb) bt
 | #0  error (err=0x8182f7a inflate: %s (%s)) at usage.c:143
 | #1  0x081452ff in git_inflate (strm=0xbfffe6b8, flush=0)
 | at zlib.c:144
 | #2  0x08125367 in unpack_sha1_header (stream=optimized out,
 | map=optimized out, mapsize=optimized out,
 | buffer=optimized out, bufsiz=optimized out)
 | at sha1_file.c:1515
 | #3  0x08125546 in sha1_loose_object_info (
 | sha1=sha1@entry=0x82659d4 '\377' repeats 20 times,
 | oi=oi@entry=0xbfffe788) at sha1_file.c:2528
 | #4  0x08126b2d in sha1_object_info_extended (
 | sha1=0x82659d4 '\377' repeats 20 times, oi=0xbfffe788, flags=1)
 | at sha1_file.c:2565
 | #5  0x0812666f in sha1_object_info (
 | sha1=0x82659d4 '\377' repeats 20 times, sizep=0x0)
 | at sha1_file.c:2601
 | #6  0x080f6941 in parse_object (
 | sha1=0x82659d4 '\377' repeats 20 times) at object.c:247
 | #7  0x080758ac in fsck_sha1 (
 | sha1=sha1@entry=0x82659d4 '\377' repeats 20 times)
 | at builtin/fsck.c:333
 ...
 | (gdb) c
 | Continuing.
 | error: inflate: data stream error (incorrect header check)
 | 
 | Breakpoint 1, error (err=0x817c525 unable to unpack %s header)
 | at usage.c:143
 | 143 {
 | (gdb) bt
 | #0  error (err=0x817c525 unable to unpack %s header) at usage.c:143
 | #1  0x08125564 in sha1_loose_object_info (
 | sha1=sha1@entry=0x82659d4 '\377' repeats 20 times,
 | oi=oi@entry=0xbfffe788) at sha1_file.c:2529
 | #2  0x08126b2d in sha1_object_info_extended (
 | sha1=0x82659d4 '\377' repeats 20 times, oi=0xbfffe788, flags=1)
 | at sha1_file.c:2565
 | #3  0x0812666f in sha1_object_info (
 | sha1=0x82659d4 '\377' repeats 20 times, sizep=0x0)
 | at sha1_file.c:2601
 | #4  0x080f6941 in parse_object (
 | sha1=0x82659d4 '\377' repeats 20 times) at object.c:247
 ...
 | (gdb) frame 4
 | #4  0x080f6941 in parse_object (
 | sha1=0x82659d4 '\377' repeats 20 times) at object.c:247
 | warning: Source file is more recent than executable.
 | 247 sha1_object_info(sha1, NULL) == OBJ_BLOB)) { // -- first error
 | (gdb) down
 | #3  0x0812666f in sha1_object_info (
 | sha1=0x82659d4 '\377' repeats 20 

[BUG] parse_object() does not behave as documented

2014-07-29 Thread Samuel Bronson
[Hmm, nobody seems ot have commented on this analysis; maybe reposting
it with a subject containing [BUG] will help?]

Samuel Bronson naes...@gmail.com writes:

 The following message is a courtesy copy of an article
 that has been posted to gmane.comp.version-control.git as well.

 Oh, I forgot to provide any analysis of the problem.  Oops.

 It may be just as well, though; I was tired enough that I might have
 botched it in any case.  So, have an analysis:

 While inflate errors are obviously NOT GOOD, and should perhaps be
 instantly fatal for most commands, git fsck is something of a special
 case because it is useful to have *it* report as many corrupt objects as
 possible in one run.

 Unfortunately, this is not currently the case, as shown by the provided
 testcase.

 The output for this testcase is:

 ,
 | checking known breakage:
 | hash1= 
 | hash2=fffe 
 | mkdir -p .git/objects/ff 
 | echo not-zlib $(sha1_file $hash1) 
 | test_when_finished remove_object $hash1 
 | echo not-zlib $(sha1_file $hash2) 
 | test_when_finished remove_object $hash2 
 | 
 | # Return value is not documented
 | test_might_fail git fsck 2out 
 | cat out  echo == 
 | grep $hash1.*corrupt out 
 | grep $hash2.*corrupt out
 | 
 | error: inflate: data stream error (incorrect header check)
 | error: unable to unpack  header
 | error: inflate: data stream error (incorrect header check)
 | fatal: loose object  (stored
 | in .git/objects/ff/ff) is
 | corrupt
 | ==
 | fatal: loose object  (stored
 | in .git/objects/ff/ff) is
 | corrupt
 | not ok 5 - fsck survives inflate errors # TODO known breakage
 `

 If I flip it from expect_failure to expect_success and run the test with
 -i, then go into the trash directory and run gdb ../../git-fsck, I can
 obtain this (thoroughly rehearsed  trimmed) gdb transcript:

 ,
 | % gdb ../../git-fsck
 | GNU gdb (Debian 7.7.1-3) 7.7.1
 ...
 | Reading symbols from ../../git-fsck...done.
 | (gdb) break error
 | Breakpoint 1 at 0x813d24c: file usage.c, line 143.
 | (gdb) break die
 | Breakpoint 2 at 0x813d152: file usage.c, line 94.
 | (gdb) run
 | Starting program: /home/naesten/hacking/git/git-fsck
 | [Thread debugging using libthread_db enabled]
 | Using host libthread_db library
 | /lib/i386-linux-gnu/i686/cmov/libthread_db.so.1.
 | Checking object directories: 100% (256/256), done.
 | 
 | Breakpoint 1, error (err=0x8182f7a inflate: %s (%s)) at usage.c:143
 | 143 {
 | (gdb) bt
 | #0  error (err=0x8182f7a inflate: %s (%s)) at usage.c:143
 | #1  0x081452ff in git_inflate (strm=0xbfffe6b8, flush=0)
 | at zlib.c:144
 | #2  0x08125367 in unpack_sha1_header (stream=optimized out,
 | map=optimized out, mapsize=optimized out,
 | buffer=optimized out, bufsiz=optimized out)
 | at sha1_file.c:1515
 | #3  0x08125546 in sha1_loose_object_info (
 | sha1=sha1@entry=0x82659d4 '\377' repeats 20 times,
 | oi=oi@entry=0xbfffe788) at sha1_file.c:2528
 | #4  0x08126b2d in sha1_object_info_extended (
 | sha1=0x82659d4 '\377' repeats 20 times, oi=0xbfffe788, flags=1)
 | at sha1_file.c:2565
 | #5  0x0812666f in sha1_object_info (
 | sha1=0x82659d4 '\377' repeats 20 times, sizep=0x0)
 | at sha1_file.c:2601
 | #6  0x080f6941 in parse_object (
 | sha1=0x82659d4 '\377' repeats 20 times) at object.c:247
 | #7  0x080758ac in fsck_sha1 (
 | sha1=sha1@entry=0x82659d4 '\377' repeats 20 times)
 | at builtin/fsck.c:333
 ...
 | (gdb) c
 | Continuing.
 | error: inflate: data stream error (incorrect header check)
 | 
 | Breakpoint 1, error (err=0x817c525 unable to unpack %s header)
 | at usage.c:143
 | 143 {
 | (gdb) bt
 | #0  error (err=0x817c525 unable to unpack %s header) at usage.c:143
 | #1  0x08125564 in sha1_loose_object_info (
 | sha1=sha1@entry=0x82659d4 '\377' repeats 20 times,
 | oi=oi@entry=0xbfffe788) at sha1_file.c:2529
 | #2  0x08126b2d in sha1_object_info_extended (
 | sha1=0x82659d4 '\377' repeats 20 times, oi=0xbfffe788, flags=1)
 | at sha1_file.c:2565
 | #3  0x0812666f in sha1_object_info (
 | sha1=0x82659d4 '\377' repeats 20 times, sizep=0x0)
 | at sha1_file.c:2601
 | #4  0x080f6941 in parse_object (
 | sha1=0x82659d4 '\377' repeats 20 times) at object.c:247
 ...
 | (gdb) frame 4
 | #4  0x080f6941 in parse_object (
 | sha1=0x82659d4 '\377' repeats 20 times) at object.c:247
 | warning: Source file is more recent than executable.
 | 247 sha1_object_info(sha1, NULL) == OBJ_BLOB)) { // -- first error
 | (gdb) down
 | #3  0x0812666f in sha1_object_info (
 | sha1=0x82659d4 '\377' repeats 20 times, sizep=0x0)
 | at sha1_file.c:2601
 | 2601 if