Duy Nguyen <pclo...@gmail.com> writes:

> Right. I missed this but I think this is a less important test
> because I added a new test to make sure "diff --cached" ("git
> status" to be exact) outputs the right thing when i-t-a entries
> are present.

OK.

>> If on the other hand the tests show the same result from these two
>> "diff --cached" and the result is different from what the test
>> expects, that means your patch changed the world order, i.e. an
>> i-t-a entry used to be treated as if it were adding an empty blob to
>> the index but it is now treated as non-existent, then that is a good
>> thing and the only thing we need to update is what the test expects.
>> I am guessing that instead of expecting dir/bar to be shown, it now
>> should expect no output?
>
> Yes, no output.

Good, as that means your changes did not break things, right?

> But still, if I revert the change in cache-tree.c and force write-tree
> to produce valid cache-tree when i-t-a entries are present, this test
> still passes incorrectly.

This is worrysome.

Doesn't that suggest that the diff-cache optimization is disabled
and cache-tree that is marked as valid (because you "revert" the
fix) is not looked at?  That is the only explanation that makes
sense to me---you write out a tree omitting i-t-a entries in the
index and update the index without your earlier fix eec3e7e4
(cache-tree: invalidate i-t-a paths after generating trees,
2012-12-16), i.e. marking the cache-tree entries valid.  A later
invocation of diff-cache *should* trust that validity and just say
"the tree and the index match" without even digging into the
individual entries, at which point you should see a bogus result.

If that is the case, it would be a performance regression, which may
be already there before your patches or something your patches may
have introduced.

Ah, wait.

I suspect that it all cancels out.

 * You "add -N dir/bar".  You write-tree.  The tree is written as if
   dir/bar is not there.  cache-tree is updated and it is marked as
   valid (because you deliberately broke the earlier fix you made at
   eec3e7e4).

 * You run "diff-cache".  It walks the HEAD and the cache-tree and
   finds that HEAD:dir and the cache-tree entry for "dir" records
   the same tree object name, and the cache-tree entry is marked
   "valid".  Hence you declare "no differences".

But for the updated definition of "diff-cache", there indeed is no
difference.  The HEAD and the index matches, because dir/bar does
not yet exist.

OK, so I think I can see how the result could work without
invalidating the cache-tree entry that contains i-t-a entries.

It might be even the right thing to do in general.  We can view that
invalidation a workaround to achieve the old behaviour of diff-cache,
which used to report that dir/ are different between HEAD and index.

We can even argue that it is the right thing to mark the cache-tree
entry for dir/ as valid, no matter how many i-t-a entries are in
there, as long as writing out the tree for dir/ out of index results
in the same tree as the tree that is stored as HEAD:dir/.  And from
that viewpoint, keeping the earlier fix (invalidation code) when
these patches are applied _is_ introducing a performance bug.

Now, as you mentioned, there _may_ be codepaths that uses the same
definition of "what's in the index" as what diff-cache used to take
before your patches, and they may be broken by removing the
invalidation.  If we find such codepaths, we should treat their old
behaviour as buggy and fix them, instead of reintroducing the
invalidation and keep their current behaviour, as the new world
order is "i-t-a entries in the index does not yet exist."

Thanks for straightening my confusion out.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to