On Thu, Jan 10, 2019 at 11:55:51PM +0000, brian m. carlson wrote:

> > I think the only reason they are "struct object_id" is because that's
> > what tree_entry_extract() returns. Should we be providing another
> > function to allow more byte-oriented access?
> 
> The reason is we recursively call splice_tree, passing rewrite_here as
> the first argument. That argument is then used for read_object_file,
> which requires a struct object_id * (because it is, logically, an object
> ID).
> 
> I *think* we could fix this by copying an unsigned char *rewrite_here
> into a temporary struct object_id before we recurse, but it's not
> obvious to me if that's safe to do.

I think rewrite_here needs to be a direct pointer into the buffer, since
we plan to modify it.

I think rewrite_with is correct to be an object_id. It's either the oid
passed in from the caller, or the subtree we generated (in which case
it's the result from write_object_file).

So the "most correct" thing is probably something like this:

diff --git a/match-trees.c b/match-trees.c
index 03e81b56e1..129b13a970 100644
--- a/match-trees.c
+++ b/match-trees.c
@@ -179,7 +179,7 @@ static int splice_tree(const struct object_id *oid1, const 
char *prefix,
        char *buf;
        unsigned long sz;
        struct tree_desc desc;
-       struct object_id *rewrite_here;
+       unsigned char *rewrite_here;
        const struct object_id *rewrite_with;
        struct object_id subtree;
        enum object_type type;
@@ -206,9 +206,16 @@ static int splice_tree(const struct object_id *oid1, const 
char *prefix,
                        if (!S_ISDIR(mode))
                                die("entry %s in tree %s is not a tree", name,
                                    oid_to_hex(oid1));
-                       rewrite_here = (struct object_id *)(desc.entry.path +
-                                                           
strlen(desc.entry.path) +
-                                                           1);
+                       /*
+                        * We cast here for two reasons:
+                        *
+                        *   - to flip the "char *" (for the path) to "unsigned
+                        *     char *" (for the hash stored after it)
+                        *
+                        *   - to discard the "const"; this is OK because we
+                        *     know it points into our non-const "buf"
+                        */
+                       rewrite_here = (unsigned char *)desc.entry.path + 
strlen(desc.entry.path) + 1;
                        break;
                }
                update_tree_entry(&desc);
@@ -224,7 +231,7 @@ static int splice_tree(const struct object_id *oid1, const 
char *prefix,
        } else {
                rewrite_with = oid2;
        }
-       hashcpy(rewrite_here->hash, rewrite_with->hash);
+       hashcpy(rewrite_here, rewrite_with->hash);
        status = write_object_file(buf, sz, tree_type, result);
        free(buf);
        return status;

I think if I were trying to write this in a less-subtle way, I'd
probably stop trying to do it in-place, and have a copy loop more like:

  for entry in src_tree
    if match_entry(entry, prefix)
      entry = rewrite_entry(entry) /* either oid2 or subtree */
    push_entry(dst_tree)

We may even have to go that way eventually if we might ever be rewriting
to a tree with a different hash size (i.e., there is a hidden assumption
here that rewrite_here points to exactly the_hash_algo->rawsz bytes of
hash). But I think for now it's not necessary, and it's way outside the
scope of what you're trying to do here.

-Peff

Reply via email to