On Mon, Sep 26, 2016 at 03:32:44PM -0400, David Turner wrote:
> From: Jeff King <[email protected]>
>
> When the tree-walker runs into an error, it just calls
> die(), and the message is always "corrupt tree file".
> However, we are actually covering several cases here; let's
> give the user a hint about what happened.
>
> Let's also avoid using the word "corrupt", which makes it
> seem like the data bit-rotted on disk. Our sha1 check would
> already have found that. These errors are ones of data that
> is malformed in the first place.
>
> Signed-off-by: David Turner <[email protected]>
> Signed-off-by: Jeff King <[email protected]>
Yay. This has been on my "to look at and repost" list for literally 2
years. Thanks for picking it up (see kids, procrastination _does_ pay
off).
> t/t1007-hash-object.sh | 15 +++++++++++++--
> t/t1007/tree-with-empty-filename | Bin 0 -> 28 bytes
> t/t1007/tree-with-malformed-mode | Bin 0 -> 39 bytes
Ooh, and tests. Exciting.
> -test_expect_success 'corrupt tree' '
> +test_expect_success 'too-short tree' '
> echo abc >malformed-tree &&
> - test_must_fail git hash-object -t tree malformed-tree
> + test_must_fail git hash-object -t tree malformed-tree 2>err &&
> + grep "too-short tree object" err
> +'
Should this be test_i18ngrep? Even if the message is not translated now,
it seems like a good proactive measure (and probably it _should_ be
translated).
> +test_expect_success 'malformed mode in tree' '
> + test_must_fail git hash-object -t tree
> ../t1007/tree-with-malformed-mode 2>err &&
> + grep "malformed mode in tree entry for tree" err
> +'
This ".." will break when the test is run with "--root". You should use
"$TEST_DIRECTORY"/t1007/...
instead. And ditto in the second test, of course.
> diff --git a/t/t1007/tree-with-empty-filename
> b/t/t1007/tree-with-empty-filename
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..aeb1ceb20e485eebd0acbb81c974d1c6fedcc1fe
> GIT binary patch
> literal 28
> kcmXpsFfcPQQDAsB_tET47q2;ccWbUIkGgT_Nl)-Z0Hx{;SO5S3
>
> literal 0
> HcmV?d00001
This is rather opaque, of course. :)
I wonder if it would be possible to generate the test vector with
something like:
# any 20 bytes will do
bin_sha1='\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0'
printf "100644 \0$bin_sha1" >tree-with-empty-filename
I know that is longer and possibly more error-prone to run, but I think
it makes the test much easier to read and modify later.
I also wonder if $bin_sha1 should actually be more like:
hex_sha1=$(echo foo | git hash-object --stdin -w)
bin_sha1=$(echo $hex_sha1 | perl -ne 'printf "\\%3o", ord for /./g')
so that it's a real sha1 (or maybe it is in your original, from an
object that happens to be in the repo; it's hard to tell). I wouldn't
expect the code to actually get to the point of looking at the sha1, but
it's perhaps a more realistic test.
I also think it would be nice if hash-object had a "--binary-sha1"
option to avoid the perl grossness. :)
> diff --git a/tree-walk.c b/tree-walk.c
> index ce27842..ba544cf 100644
The code change itself looks brilliant, naturally. :)
-Peff