Re: [PATCH] Add failing test: fsck survives inflate errors

2014-08-09 Thread Duy Nguyen
On Mon, Jul 21, 2014 at 3:43 AM, Samuel Bronson naes...@gmail.com wrote:
 So, given that parse_object()'s documentation is:

 --8---cut here---start-8---
 /*
  * Returns the object, having parsed it to find out what it is.
  *
  * Returns NULL if the object is missing or corrupt.
  */
 --8---cut here---end---8---

 it probably should not call read_sha1_file() on a corrupt object.

 Options for fixing this would appear to include:

 1. Saving the result of sha1_object_info(sha1, NULL) to a variable and
returning early if the object is corrupt.  (But what happens if there
is corruption far enough in that it isn't seen when trying to grab
the object header?)

 2. Calling read_object() and giving our own error messages.

 3. Making read_sha1_file_extended only *optionally* die; since it's
calling die() directly.

We've been using die() quite freely (or at least used to) and there
are many more cases that can trigger die() and parse_object() can do
nothing about it. Adding a gentle flag to read_sha1_file_extended
and pass it further down could be the first step. Patches welcome.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Add failing test: fsck survives inflate errors

2014-07-20 Thread Samuel Bronson
The following message is a courtesy copy of an article
that has been posted to gmane.comp.version-control.git as well.

Oh, I forgot to provide any analysis of the problem.  Oops.

It may be just as well, though; I was tired enough that I might have
botched it in any case.  So, have an analysis:

While inflate errors are obviously NOT GOOD, and should perhaps be
instantly fatal for most commands, git fsck is something of a special
case because it is useful to have *it* report as many corrupt objects as
possible in one run.

Unfortunately, this is not currently the case, as shown by the provided
testcase.

The output for this testcase is:

,
| checking known breakage:
| hash1= 
| hash2=fffe 
| mkdir -p .git/objects/ff 
| echo not-zlib $(sha1_file $hash1) 
| test_when_finished remove_object $hash1 
| echo not-zlib $(sha1_file $hash2) 
| test_when_finished remove_object $hash2 
| 
| # Return value is not documented
| test_might_fail git fsck 2out 
| cat out  echo == 
| grep $hash1.*corrupt out 
| grep $hash2.*corrupt out
| 
| error: inflate: data stream error (incorrect header check)
| error: unable to unpack  header
| error: inflate: data stream error (incorrect header check)
| fatal: loose object  (stored in 
.git/objects/ff/ff) is corrupt
| ==
| fatal: loose object  (stored in 
.git/objects/ff/ff) is corrupt
| not ok 5 - fsck survives inflate errors # TODO known breakage
`

If I flip it from expect_failure to expect_success and run the test with
-i, then go into the trash directory and run gdb ../../git-fsck, I can
obtain this (thoroughly rehearsed  trimmed) gdb transcript:

,
| % gdb ../../git-fsck
| GNU gdb (Debian 7.7.1-3) 7.7.1
...
| Reading symbols from ../../git-fsck...done.
| (gdb) break error
| Breakpoint 1 at 0x813d24c: file usage.c, line 143.
| (gdb) break die
| Breakpoint 2 at 0x813d152: file usage.c, line 94.
| (gdb) run
| Starting program: /home/naesten/hacking/git/git-fsck
| [Thread debugging using libthread_db enabled]
| Using host libthread_db library 
/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1.
| Checking object directories: 100% (256/256), done.
| 
| Breakpoint 1, error (err=0x8182f7a inflate: %s (%s)) at usage.c:143
| 143 {
| (gdb) bt
| #0  error (err=0x8182f7a inflate: %s (%s)) at usage.c:143
| #1  0x081452ff in git_inflate (strm=0xbfffe6b8, flush=0)
| at zlib.c:144
| #2  0x08125367 in unpack_sha1_header (stream=optimized out,
| map=optimized out, mapsize=optimized out,
| buffer=optimized out, bufsiz=optimized out)
| at sha1_file.c:1515
| #3  0x08125546 in sha1_loose_object_info (
| sha1=sha1@entry=0x82659d4 '\377' repeats 20 times,
| oi=oi@entry=0xbfffe788) at sha1_file.c:2528
| #4  0x08126b2d in sha1_object_info_extended (
| sha1=0x82659d4 '\377' repeats 20 times, oi=0xbfffe788, flags=1)
| at sha1_file.c:2565
| #5  0x0812666f in sha1_object_info (
| sha1=0x82659d4 '\377' repeats 20 times, sizep=0x0)
| at sha1_file.c:2601
| #6  0x080f6941 in parse_object (
| sha1=0x82659d4 '\377' repeats 20 times) at object.c:247
| #7  0x080758ac in fsck_sha1 (
| sha1=sha1@entry=0x82659d4 '\377' repeats 20 times)
| at builtin/fsck.c:333
...
| (gdb) c
| Continuing.
| error: inflate: data stream error (incorrect header check)
| 
| Breakpoint 1, error (err=0x817c525 unable to unpack %s header)
| at usage.c:143
| 143 {
| (gdb) bt
| #0  error (err=0x817c525 unable to unpack %s header) at usage.c:143
| #1  0x08125564 in sha1_loose_object_info (
| sha1=sha1@entry=0x82659d4 '\377' repeats 20 times,
| oi=oi@entry=0xbfffe788) at sha1_file.c:2529
| #2  0x08126b2d in sha1_object_info_extended (
| sha1=0x82659d4 '\377' repeats 20 times, oi=0xbfffe788, flags=1)
| at sha1_file.c:2565
| #3  0x0812666f in sha1_object_info (
| sha1=0x82659d4 '\377' repeats 20 times, sizep=0x0)
| at sha1_file.c:2601
| #4  0x080f6941 in parse_object (
| sha1=0x82659d4 '\377' repeats 20 times) at object.c:247
...
| (gdb) frame 4
| #4  0x080f6941 in parse_object (
| sha1=0x82659d4 '\377' repeats 20 times) at object.c:247
| warning: Source file is more recent than executable.
| 247  sha1_object_info(sha1, NULL) == OBJ_BLOB)) { // -- 
first error
| (gdb) down
| #3  0x0812666f in sha1_object_info (
| sha1=0x82659d4 '\377' repeats 20 times, sizep=0x0)
| at sha1_file.c:2601
| 2601if (sha1_object_info_extended(sha1, oi, 
LOOKUP_REPLACE_OBJECT)  0)
| (gdb) fin
| Run till exit from #3  0x0812666f in sha1_object_info (
| sha1=0x82659d4 '\377' repeats 20 times, sizep=0x0)
| at sha1_file.c:2601
| error: unable to unpack