Re: [PATCH v2] fsck: handle NULL return of lookup_blob() and lookup_tree()

2017-10-06 Thread René Scharfe
Am 06.10.2017 um 04:23 schrieb Junio C Hamano:
> René Scharfe  writes:
>> +blob_bin=$(echo $blob | hex2oct) &&
>> +tree=$(
>> +printf "4 dir\0${blob_bin}100644 file\0${blob_bin}" |
> 
> Wow, that's ... cute.
> 
>> +git hash-object -t tree --stdin -w --literally
> 
> Makes me curious why --literally is here.  Even if we let
> check_tree() called from index_mem() by taking the normal path,
> it wouldn't complain the type mismatch, I suspect.  I guess doing it
> this way is a future-proof against check_tree() getting tightened in
> the future, in which case I think it makes sense.
> 
> And for the same reason, hashing "--literally" like this patch does
> is a better solution than using "git mktree", which would have
> allowed us to avoid the hex2oct and instead feed the tree in a bit
> more human-readable way.

git mktree errors out already, complaining about the object type
mismatch.  But I added "--literally" only accidentally, when I copied
the invocation from a few lines up.  The test works fine without that
flag currently.  The flag captures the intent, however, of knowingly
building a flawed tree object.

René


Re: [PATCH v2] fsck: handle NULL return of lookup_blob() and lookup_tree()

2017-10-05 Thread Junio C Hamano
René Scharfe  writes:

> An error message is already shown by object_as_type(), which is called
> by the lookup functions.  The walk callback functions are expected to
> handle NULL object pointers passed to them, but put_object_name() needs
> a valid object, so avoid calling it without one.

Thanks for getting the details right ;-)

> + blob_bin=$(echo $blob | hex2oct) &&
> + tree=$(
> + printf "4 dir\0${blob_bin}100644 file\0${blob_bin}" |

Wow, that's ... cute.

> + git hash-object -t tree --stdin -w --literally

Makes me curious why --literally is here.  Even if we let
check_tree() called from index_mem() by taking the normal path,
it wouldn't complain the type mismatch, I suspect.  I guess doing it
this way is a future-proof against check_tree() getting tightened in
the future, in which case I think it makes sense.

And for the same reason, hashing "--literally" like this patch does
is a better solution than using "git mktree", which would have
allowed us to avoid the hex2oct and instead feed the tree in a bit
more human-readable way.

Thanks, will queue.

> + ) &&
> + commit=$(git commit-tree $tree) &&
> + git update-ref refs/heads/type_mismatch $commit &&
> + test_must_fail git fsck >out 2>&1 &&
> + test_i18ngrep "is a blob, not a tree" out &&
> + test_i18ngrep ! "dangling blob" out
> +'
> +
>  test_expect_success 'tag pointing to nonexistent' '
>   cat >invalid-tag <<-\EOF &&
>   object 


[PATCH v2] fsck: handle NULL return of lookup_blob() and lookup_tree()

2017-10-05 Thread René Scharfe
lookup_blob() and lookup_tree() can return NULL if they find an object
of an unexpected type.  Accessing the object member is undefined in that
case.  Cast the result to a struct object pointer instead; we can do
that because object is the first member of all object types.  This trick
is already used in other places in the code.

An error message is already shown by object_as_type(), which is called
by the lookup functions.  The walk callback functions are expected to
handle NULL object pointers passed to them, but put_object_name() needs
a valid object, so avoid calling it without one.

Suggested-by: SZEDER Gábor 
Helped-by: Junio C Hamano 
Signed-off-by: Rene Scharfe 
---
Changes from v1:
- Don't abort on encountering an object with a mismatched type.
- Added a test for showing the problem with SANITIZE=address,undefined.

 fsck.c  |  8 
 t/t1450-fsck.sh | 22 ++
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/fsck.c b/fsck.c
index 2ad00fc4d0..032699e9ac 100644
--- a/fsck.c
+++ b/fsck.c
@@ -358,15 +358,15 @@ static int fsck_walk_tree(struct tree *tree, void *data, 
struct fsck_options *op
continue;
 
if (S_ISDIR(entry.mode)) {
-   obj = _tree(entry.oid)->object;
-   if (name)
+   obj = (struct object *)lookup_tree(entry.oid);
+   if (name && obj)
put_object_name(options, obj, "%s%s/", name,
entry.path);
result = options->walk(obj, OBJ_TREE, data, options);
}
else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode)) {
-   obj = _blob(entry.oid)->object;
-   if (name)
+   obj = (struct object *)lookup_blob(entry.oid);
+   if (name && obj)
put_object_name(options, obj, "%s%s", name,
entry.path);
result = options->walk(obj, OBJ_BLOB, data, options);
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 4087150db1..cb4b66e29d 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -222,6 +222,28 @@ test_expect_success 'unparseable tree object' '
test_i18ngrep ! "fatal: empty filename in tree entry" out
 '
 
+hex2oct() {
+   perl -ne 'printf "\\%03o", hex for /../g'
+}
+
+test_expect_success 'tree entry with type mismatch' '
+   test_when_finished "remove_object \$blob" &&
+   test_when_finished "remove_object \$tree" &&
+   test_when_finished "remove_object \$commit" &&
+   test_when_finished "git update-ref -d refs/heads/type_mismatch" &&
+   blob=$(echo blob | git hash-object -w --stdin) &&
+   blob_bin=$(echo $blob | hex2oct) &&
+   tree=$(
+   printf "4 dir\0${blob_bin}100644 file\0${blob_bin}" |
+   git hash-object -t tree --stdin -w --literally
+   ) &&
+   commit=$(git commit-tree $tree) &&
+   git update-ref refs/heads/type_mismatch $commit &&
+   test_must_fail git fsck >out 2>&1 &&
+   test_i18ngrep "is a blob, not a tree" out &&
+   test_i18ngrep ! "dangling blob" out
+'
+
 test_expect_success 'tag pointing to nonexistent' '
cat >invalid-tag <<-\EOF &&
object 
-- 
2.14.2