Re: [PATCH] Add failing test: "fsck survives inflate errors"

2014-08-08 Thread Duy Nguyen
On Mon, Jul 21, 2014 at 3:43 AM, Samuel Bronson  wrote:
> So, given that parse_object()'s documentation is:
>
> --8<---cut here---start->8---
> /*
>  * Returns the object, having parsed it to find out what it is.
>  *
>  * Returns NULL if the object is missing or corrupt.
>  */
> --8<---cut here---end--->8---
>
> it probably should not call read_sha1_file() on a corrupt object.
>
> Options for fixing this would appear to include:
>
> 1. Saving the result of sha1_object_info(sha1, NULL) to a variable and
>returning early if the object is corrupt.  (But what happens if there
>is corruption far enough in that it isn't seen when trying to grab
>the object header?)
>
> 2. Calling read_object() and giving our own error messages.
>
> 3. Making read_sha1_file_extended only *optionally* die; since it's
>calling die() directly.

We've been using die() quite freely (or at least used to) and there
are many more cases that can trigger die() and parse_object() can do
nothing about it. Adding a "gentle" flag to read_sha1_file_extended
and pass it further down could be the first step. Patches welcome.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Add failing test: "fsck survives inflate errors"

2014-07-20 Thread Samuel Bronson
The following message is a courtesy copy of an article
that has been posted to gmane.comp.version-control.git as well.

Oh, I forgot to provide any analysis of the problem.  Oops.

It may be just as well, though; I was tired enough that I might have
botched it in any case.  So, have an analysis:

While inflate errors are obviously NOT GOOD, and should perhaps be
instantly fatal for most commands, git fsck is something of a special
case because it is useful to have *it* report as many corrupt objects as
possible in one run.

Unfortunately, this is not currently the case, as shown by the provided
testcase.

The output for this testcase is:

,
| checking known breakage:
| hash1= &&
| hash2=fffe &&
| mkdir -p .git/objects/ff &&
| echo not-zlib >$(sha1_file $hash1) &&
| test_when_finished "remove_object $hash1" &&
| echo not-zlib >$(sha1_file $hash2) &&
| test_when_finished "remove_object $hash2" &&
| 
| # Return value is not documented
| test_might_fail git fsck 2>out &&
| cat out && echo == &&
| grep "$hash1.*corrupt" out &&
| grep "$hash2.*corrupt" out
| 
| error: inflate: data stream error (incorrect header check)
| error: unable to unpack  header
| error: inflate: data stream error (incorrect header check)
| fatal: loose object  (stored in 
.git/objects/ff/ff) is corrupt
| ==
| fatal: loose object  (stored in 
.git/objects/ff/ff) is corrupt
| not ok 5 - fsck survives inflate errors # TODO known breakage
`

If I flip it from expect_failure to expect_success and run the test with
-i, then go into the trash directory and run "gdb ../../git-fsck", I can
obtain this (thoroughly rehearsed & trimmed) gdb transcript:

,
| % gdb ../../git-fsck
| GNU gdb (Debian 7.7.1-3) 7.7.1
...
| Reading symbols from ../../git-fsck...done.
| (gdb) break error
| Breakpoint 1 at 0x813d24c: file usage.c, line 143.
| (gdb) break die
| Breakpoint 2 at 0x813d152: file usage.c, line 94.
| (gdb) run
| Starting program: /home/naesten/hacking/git/git-fsck
| [Thread debugging using libthread_db enabled]
| Using host libthread_db library 
"/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1".
| Checking object directories: 100% (256/256), done.
| 
| Breakpoint 1, error (err=0x8182f7a "inflate: %s (%s)") at usage.c:143
| 143 {
| (gdb) bt
| #0  error (err=0x8182f7a "inflate: %s (%s)") at usage.c:143
| #1  0x081452ff in git_inflate (strm=0xbfffe6b8, flush=0)
| at zlib.c:144
| #2  0x08125367 in unpack_sha1_header (stream=,
| map=, mapsize=,
| buffer=, bufsiz=)
| at sha1_file.c:1515
| #3  0x08125546 in sha1_loose_object_info (
| sha1=sha1@entry=0x82659d4 '\377' ,
| oi=oi@entry=0xbfffe788) at sha1_file.c:2528
| #4  0x08126b2d in sha1_object_info_extended (
| sha1=0x82659d4 '\377' , oi=0xbfffe788, flags=1)
| at sha1_file.c:2565
| #5  0x0812666f in sha1_object_info (
| sha1=0x82659d4 '\377' , sizep=0x0)
| at sha1_file.c:2601
| #6  0x080f6941 in parse_object (
| sha1=0x82659d4 '\377' ) at object.c:247
| #7  0x080758ac in fsck_sha1 (
| sha1=sha1@entry=0x82659d4 '\377' )
| at builtin/fsck.c:333
...
| (gdb) c
| Continuing.
| error: inflate: data stream error (incorrect header check)
| 
| Breakpoint 1, error (err=0x817c525 "unable to unpack %s header")
| at usage.c:143
| 143 {
| (gdb) bt
| #0  error (err=0x817c525 "unable to unpack %s header") at usage.c:143
| #1  0x08125564 in sha1_loose_object_info (
| sha1=sha1@entry=0x82659d4 '\377' ,
| oi=oi@entry=0xbfffe788) at sha1_file.c:2529
| #2  0x08126b2d in sha1_object_info_extended (
| sha1=0x82659d4 '\377' , oi=0xbfffe788, flags=1)
| at sha1_file.c:2565
| #3  0x0812666f in sha1_object_info (
| sha1=0x82659d4 '\377' , sizep=0x0)
| at sha1_file.c:2601
| #4  0x080f6941 in parse_object (
| sha1=0x82659d4 '\377' ) at object.c:247
...
| (gdb) frame 4
| #4  0x080f6941 in parse_object (
| sha1=0x82659d4 '\377' ) at object.c:247
| warning: Source file is more recent than executable.
| 247  sha1_object_info(sha1, NULL) == OBJ_BLOB)) { // <-- 
first error
| (gdb) down
| #3  0x0812666f in sha1_object_info (
| sha1=0x82659d4 '\377' , sizep=0x0)
| at sha1_file.c:2601
| 2601if (sha1_object_info_extended(sha1, &oi, 
LOOKUP_REPLACE_OBJECT) < 0)
| (gdb) fin
| Run till exit from #3  0x0812666f in sha1_object_info (
| sha1=0x82659d4 '\377' , sizep=0x0)
| at sha1_file.c:2601
| error: unable to unpack  header
| parse_object (sha1=0x82659d4 '\377' )
| at object.c:246
| 246 (!obj && has_sha1_file(sha1) &&
| Value returned is $1 = -1
| (gdb) c
| Continuing.
| 
|

[PATCH] Add failing test: "fsck survives inflate errors"

2014-07-19 Thread Samuel Bronson
While inflate errors are obviously NOT GOOD, and should perhaps be
fatal for most commands, git fsck is something of a special case
because it is useful to have *it* report as many corrupt objects as
possible in one run.

Signed-off-by: Samuel Bronson 
---
 t/t1450-fsck.sh | 17 +
 1 file changed, 17 insertions(+)

diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 8c739c9..6dcc4b2 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -53,6 +53,23 @@ test_expect_success 'setup: helpers for corruption tests' '
}
 '
 
+# git fsck should be able to detect more than one corrupt object per run
+test_expect_failure 'fsck survives inflate errors' '
+   hash1= &&
+   hash2=fffe &&
+   mkdir -p .git/objects/ff &&
+   echo not-zlib >$(sha1_file $hash1) &&
+   test_when_finished "remove_object $hash1" &&
+   echo not-zlib >$(sha1_file $hash2) &&
+   test_when_finished "remove_object $hash2" &&
+
+   # Return value is not documented
+   test_might_fail git fsck 2>out &&
+   cat out && echo == &&
+   grep "$hash1.*corrupt" out &&
+   grep "$hash2.*corrupt" out
+'
+
 test_expect_success 'object with bad sha1' '
sha=$(echo blob | git hash-object -w --stdin) &&
old=$(echo $sha | sed "s+^..+&/+") &&
-- 
2.0.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html