[PATCH 0/2] recover from failed to apply delta

2013-06-14 Thread Jeff King
On Thu, Jun 13, 2013 at 08:05:21PM -0400, Nicolas Pitre wrote:

  We already handle the case where we were not able to read
  the delta from disk. However, when we find that the delta we
  read does not apply, we simply die.  This case is harder to
  trigger, as corruption in the delta data itself would
  trigger a crc error from zlib.  However, a corruption that
  pointed us at the wrong delta base might cause it.

 That makes sense.
 
 Could you produce a test case to go along with this change?

Yes. I was a little worried I would have trouble doing it without
relying on a lot of pack internals, but the infrastructure you set up in
t5303 makes it relatively easy (and we do not have to make any
assumptions that t5303 does not already make).

Here is a re-roll; the first patch is a small cleanup in t5303 that is
required for the new tests to work.

  [1/2]: t5303: drop count=1 from corruption dd
  [2/2]: unpack_entry: do not die when we fail to apply a delta

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] recover from failed to apply delta

2013-06-14 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Thu, Jun 13, 2013 at 08:05:21PM -0400, Nicolas Pitre wrote:

  We already handle the case where we were not able to read
  the delta from disk. However, when we find that the delta we
  read does not apply, we simply die.  This case is harder to
  trigger, as corruption in the delta data itself would
  trigger a crc error from zlib.  However, a corruption that
  pointed us at the wrong delta base might cause it.

 That makes sense.
 
 Could you produce a test case to go along with this change?

 Yes. I was a little worried I would have trouble doing it without
 relying on a lot of pack internals, but the infrastructure you set up in
 t5303 makes it relatively easy (and we do not have to make any
 assumptions that t5303 does not already make).

 Here is a re-roll; the first patch is a small cleanup in t5303 that is
 required for the new tests to work.

Heh, I was doing the same, but I cheated ;-)

diff --git a/t/t5303-pack-corruption-resilience.sh 
b/t/t5303-pack-corruption-resilience.sh
index 5b1250f..57436db 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -51,7 +51,7 @@ do_corrupt_object() {
 ofs=`git show-index  ${pack}.idx | grep $1 | cut -f1 -d ` 
 ofs=$(($ofs + $2)) 
 chmod +w ${pack}.pack 
-dd of=${pack}.pack count=1 bs=1 conv=notrunc seek=$ofs 
+dd of=${pack}.pack count=${3-1} bs=1 conv=notrunc seek=$ofs 
 test_must_fail git verify-pack ${pack}.pack
 }
 
@@ -276,6 +276,30 @@ test_expect_success \
  git cat-file blob $blob_3  /dev/null'
 
 test_expect_success \
+'corrupt delta-part of a packed object, fall back to loose' \
+'create_new_pack 
+ path=$(echo $blob_3 | sed -e s|^\(..\)|\1/|) 
+ cat .git/objects/$path saved 
+ git prune-packed 
+
+ dd if=${pack}.idx bs=1 count=20 skip=1032 blob1-bin 
+ dd if=${pack}.pack bs=1 count=20 skip=2233 blob3-delta-base-bin 
+
+ # At the beginning of the REF_DELTA representation of $blob_3,
+ # write 20-byte base object name for $blob_1, instead of $blob_2.
+ # The binary representation of object name for $blob_1 is found
+ # at offset 4 + 4 + 256*4 = 1032 for 20 bytes.
+ dd if=${pack}.idx bs=1 count=20 skip=1032 | do_corrupt_object $blob_3 2 
20 
+ test_must_fail git cat-file blob $blob_3 /dev/null 
+
+ # Resurrect the loose object for $blob_3
+ mkdir -p .git/objects/$(echo $path | sed -e s|^\(..\).*|\1|) 
+ cat saved .git/objects/$path 
+
+ git cat-file blob $blob_3 /dev/null
+'
+
+test_expect_success \
 'corrupting header to have too small output buffer fails unpack' \
 'create_new_pack 
  git prune-packed 
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] recover from failed to apply delta

2013-06-14 Thread Jeff King
On Fri, Jun 14, 2013 at 02:51:35PM -0700, Junio C Hamano wrote:

  Here is a re-roll; the first patch is a small cleanup in t5303 that is
  required for the new tests to work.
 
 Heh, I was doing the same, but I cheated ;-)
 
 diff --git a/t/t5303-pack-corruption-resilience.sh 
 b/t/t5303-pack-corruption-resilience.sh
 index 5b1250f..57436db 100755
 --- a/t/t5303-pack-corruption-resilience.sh
 +++ b/t/t5303-pack-corruption-resilience.sh
 @@ -51,7 +51,7 @@ do_corrupt_object() {
  ofs=`git show-index  ${pack}.idx | grep $1 | cut -f1 -d ` 
  ofs=$(($ofs + $2)) 
  chmod +w ${pack}.pack 
 -dd of=${pack}.pack count=1 bs=1 conv=notrunc seek=$ofs 
 +dd of=${pack}.pack count=${3-1} bs=1 conv=notrunc seek=$ofs 
  test_must_fail git verify-pack ${pack}.pack

Yeah, I almost did that, but then I realized that dd will simply read
all of its input, anyway.

  test_expect_success \
 +'corrupt delta-part of a packed object, fall back to loose' \
 +'create_new_pack 
 + path=$(echo $blob_3 | sed -e s|^\(..\)|\1/|) 
 + cat .git/objects/$path saved 
 + git prune-packed 
 +
 + dd if=${pack}.idx bs=1 count=20 skip=1032 blob1-bin 
 + dd if=${pack}.pack bs=1 count=20 skip=2233 blob3-delta-base-bin 
 +
 + # At the beginning of the REF_DELTA representation of $blob_3,
 + # write 20-byte base object name for $blob_1, instead of $blob_2.
 + # The binary representation of object name for $blob_1 is found
 + # at offset 4 + 4 + 256*4 = 1032 for 20 bytes.
 + dd if=${pack}.idx bs=1 count=20 skip=1032 | do_corrupt_object $blob_3 2 
 20 
 + test_must_fail git cat-file blob $blob_3 /dev/null 

I didn't want to bother coming up with the binary version of the
REF_DELTA sha1, so I used OFS_DELTA. :)

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] recover from failed to apply delta

2013-06-14 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 + # At the beginning of the REF_DELTA representation of $blob_3,
 + # write 20-byte base object name for $blob_1, instead of $blob_2.
 + # The binary representation of object name for $blob_1 is found
 + # at offset 4 + 4 + 256*4 = 1032 for 20 bytes.
 + dd if=${pack}.idx bs=1 count=20 skip=1032 | do_corrupt_object $blob_3 
 2 20 
 + test_must_fail git cat-file blob $blob_3 /dev/null 

 I didn't want to bother coming up with the binary version of the
 REF_DELTA sha1, so I used OFS_DELTA. :)

Yeah, I contemplated on doing something like this

printf $(echo $blob_3 | sed -e 's/\(..\)/\\x\1/g')

but of course printf \xAA is not in POSIX and at that point I
punted and instead read from .idx at the known offset ;-)

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html