Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
David Turner writes: >> I am not convinced that doing an equivalent of write-tree when you >> switch branches is the right approach in the first place. You will >> eventually write it out as a tree, and having a relatively undamaged >> cache-tree will help you when you do so, but spending the cycles >> necessary to compute a fully populated cache-tree, only to let it >> degrade over time until you are ready to write it out as a tree, >> somehow sounds like asking for a duplicated work upfront. > > As I understand it, the cache-tree extension was originally designed to > speed up writing the tree later. However, as Karsten Blees's work (and > my own tests) have shown, it also speeds up git status. I use git > status a lot while working, and I've talked to a few others who do the > same. So I think it's worth spending extra time when switching branches > to have a good working experience within that branch. You are reading the history of the subsystem correctly. Since b65982b6 (Optimize "diff-index --cached" using cache-tree, 2009-05-20), having an undamaged cache-tree does help with "git status" as well. > In the new version of the patchset (which I'll post shortly), I've added > an option WRITE_TREE_REPAIR, which does all of the work to compute a new > tree object, but only adds it to the cache-tree if it already exists > on-disk. This is a little wasteful for the reason that you note. So if > you would like, I could add a config option to skip it. But I think it > is a good default. OK, sounds good. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
On Tue, 2014-07-01 at 13:15 -0700, Junio C Hamano wrote: > David Turner writes: > > > When git checkout checks out a branch, create or update the > > cache-tree so that subsequent operations are faster. > > > > Signed-off-by: David Turner > > --- > > builtin/checkout.c| 8 > > cache-tree.c | 5 +++-- > > t/t0090-cache-tree.sh | 15 ++- > > 3 files changed, 25 insertions(+), 3 deletions(-) > > > > diff --git a/builtin/checkout.c b/builtin/checkout.c > > index 07cf555..a023a86 100644 > > --- a/builtin/checkout.c > > +++ b/builtin/checkout.c > > @@ -553,6 +553,14 @@ static int merge_working_tree(const struct > > checkout_opts *opts, > > } > > } > > > > + if (!active_cache_tree) > > + active_cache_tree = cache_tree(); > > + > > + if (!cache_tree_fully_valid(active_cache_tree)) > > + cache_tree_update(active_cache_tree, > > + (const struct cache_entry * const > > *)active_cache, > > + active_nr, 0); > > + > > This looks much better than the previous round, but it still does > allow verify_cache() to throw noises against unmerged entries in the > cache, as WRITE_TREE_SILENT flag is not passed down, no? > > $ git checkout master^0 > $ git am $this_message > $ make > $ edit builtin/checkout.c ;# make changes to the above lines > $ ./git checkout -m master^0 > x builtin/checkout.c: unmerged > (972c8a7b28f16f88475268f9a714048c228e69db) > x builtin/checkout.c: unmerged > (f1dc56e55f7b2200412142b10517458ccfda2952) > x builtin/checkout.c: unmerged > (3b9753ba8c19e7dfe6e922f30eb85c83a92a4596) > M builtin/checkout.c > Warning: you are leaving 1 commit behind, not connected to > any of your branches: > > 25fab54 cache-tree: Create/update cache-tree on checkout > > Switched to branch 'master' > > Passing WRITE_TREE_SILENT in the flags parameter will get rid of the > conflict notice output from the above. > > The user is not interested in writing a brand new tree object at all > in this case, so it feels wrong to actually let the call chain go > down to update_one() and create new tree objects. > > Side note. And passing WRITE_TREE_DRY_RUN is not a good > solution either, because a later write_cache_as_tree() will > not create the necessary tree object once you stuff a tree > object name in the cache-tree. > > What we want in this code path is a way to repair a sub cache_tree > if it can be repaired without creating a new tree object and > otherwise leave that part invalid. The existing cache-tree > framework is not prepared to do that kind of thing. It wants to > start from the bottom and percolate things up, computing levels > nearer to the top-level only after it fully created the trees for > deeper levels, because it is meant to be used only when we really > want to write out trees. We may want to reuse update_one() but > > I am not convinced that doing an equivalent of write-tree when you > switch branches is the right approach in the first place. You will > eventually write it out as a tree, and having a relatively undamaged > cache-tree will help you when you do so, but spending the cycles > necessary to compute a fully populated cache-tree, only to let it > degrade over time until you are ready to write it out as a tree, > somehow sounds like asking for a duplicated work upfront. As I understand it, the cache-tree extension was originally designed to speed up writing the tree later. However, as Karsten Blees's work (and my own tests) have shown, it also speeds up git status. I use git status a lot while working, and I've talked to a few others who do the same. So I think it's worth spending extra time when switching branches to have a good working experience within that branch. In the new version of the patchset (which I'll post shortly), I've added an option WRITE_TREE_REPAIR, which does all of the work to compute a new tree object, but only adds it to the cache-tree if it already exists on-disk. This is a little wasteful for the reason that you note. So if you would like, I could add a config option to skip it. But I think it is a good default. Does this seem OK to you, assuming the implementation is good? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
On Tue, 2014-07-01 at 14:03 -0700, Junio C Hamano wrote: > David Turner writes: > > > On Tue, 2014-07-01 at 06:16 +0200, Torsten Bögershausen wrote: > >> diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh > >> index 6c33e28..7c60675 100755 > >> --- a/t/t0090-cache-tree.sh > >> +++ b/t/t0090-cache-tree.sh > >> @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives > >> cache-tree' ' > >>test_shallow_cache_tree > >> ' > >> > >> -test_expect_failure 'checkout gives cache-tree' ' > >> +test_expect_success 'checkout gives cache-tree' ' > >> + git tag current > >>git checkout HEAD^ && > >>test_shallow_cache_tree > >> > >> The && chainis broken here. > >> Does the test now pass, because "git tag" is added ? > > > > The tag does not cause the cache-tree to be created, so git tag does not > > cause the test to pass. > > That does not explain why it is a good idea to declare success of > this test if this new "git tag current" fails here for whatever > reason (e.g. somebody updated "git tag" for a reason that is > completely unrelated to cache-tree and made it segfault without > creating the "current" tag). Indeed; that's why the latest version includes &&. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
David Turner writes: > When git checkout checks out a branch, create or update the > cache-tree so that subsequent operations are faster. > > Signed-off-by: David Turner > --- Could you number your patches e.g. [PATCH v4 1/3] and/or summarize below the three-dash line what got updated since the last round? The readers can guess without when one is sending a reroll every other day or less frequently, but with rerolls more often than that, it gets unwieldy to check which points raised during the review have been addressed. Thanks. > builtin/checkout.c| 8 > cache-tree.c | 5 +++-- > t/t0090-cache-tree.sh | 19 --- > 3 files changed, 27 insertions(+), 5 deletions(-) > > diff --git a/builtin/checkout.c b/builtin/checkout.c > index 07cf555..a023a86 100644 > --- a/builtin/checkout.c > +++ b/builtin/checkout.c > @@ -553,6 +553,14 @@ static int merge_working_tree(const struct checkout_opts > *opts, > } > } > > + if (!active_cache_tree) > + active_cache_tree = cache_tree(); > + > + if (!cache_tree_fully_valid(active_cache_tree)) > + cache_tree_update(active_cache_tree, > + (const struct cache_entry * const > *)active_cache, > + active_nr, 0); > + > if (write_cache(newfd, active_cache, active_nr) || > commit_locked_index(lock_file)) > die(_("unable to write new index file")); > diff --git a/cache-tree.c b/cache-tree.c > index 7fa524a..28d2356 100644 > --- a/cache-tree.c > +++ b/cache-tree.c > @@ -612,9 +612,10 @@ int write_cache_as_tree(unsigned char *sha1, int flags, > const char *prefix) > cache_tree_find(active_cache_tree, prefix); > if (!subtree) > return WRITE_TREE_PREFIX_ERROR; > - hashcpy(sha1, subtree->sha1); > + if (sha1) > + hashcpy(sha1, subtree->sha1); > } > - else > + else if (sha1) > hashcpy(sha1, active_cache_tree->sha1); > > if (0 <= newfd) > diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh > index 6c33e28..98fb1ab 100755 > --- a/t/t0090-cache-tree.sh > +++ b/t/t0090-cache-tree.sh > @@ -44,14 +44,14 @@ test_expect_success 'read-tree HEAD establishes > cache-tree' ' > > test_expect_success 'git-add invalidates cache-tree' ' > test_when_finished "git reset --hard; git read-tree HEAD" && > - echo "I changed this file" > foo && > + echo "I changed this file" >foo && > git add foo && > test_invalid_cache_tree > ' > > test_expect_success 'update-index invalidates cache-tree' ' > test_when_finished "git reset --hard; git read-tree HEAD" && > - echo "I changed this file" > foo && > + echo "I changed this file" >foo && > git update-index --add foo && > test_invalid_cache_tree > ' > @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives > cache-tree' ' > test_shallow_cache_tree > ' > > -test_expect_failure 'checkout gives cache-tree' ' > +test_expect_success 'checkout gives cache-tree' ' > + git tag current && > git checkout HEAD^ && > test_shallow_cache_tree > ' > > +test_expect_success 'checkout -b gives cache-tree' ' > + git checkout current && > + git checkout -b prev HEAD^ && > + test_shallow_cache_tree > +' > + > +test_expect_success 'checkout -B gives cache-tree' ' > + git checkout current && > + git checkout -B prev HEAD^ && > + test_shallow_cache_tree > +' > + > test_done -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
David Turner writes: > On Tue, 2014-07-01 at 06:16 +0200, Torsten Bögershausen wrote: >> diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh >> index 6c33e28..7c60675 100755 >> --- a/t/t0090-cache-tree.sh >> +++ b/t/t0090-cache-tree.sh >> @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives >> cache-tree' ' >> test_shallow_cache_tree >> ' >> >> -test_expect_failure 'checkout gives cache-tree' ' >> +test_expect_success 'checkout gives cache-tree' ' >> +git tag current >> git checkout HEAD^ && >> test_shallow_cache_tree >> >> The && chainis broken here. >> Does the test now pass, because "git tag" is added ? > > The tag does not cause the cache-tree to be created, so git tag does not > cause the test to pass. That does not explain why it is a good idea to declare success of this test if this new "git tag current" fails here for whatever reason (e.g. somebody updated "git tag" for a reason that is completely unrelated to cache-tree and made it segfault without creating the "current" tag). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
David Turner writes: > When git checkout checks out a branch, create or update the > cache-tree so that subsequent operations are faster. > > Signed-off-by: David Turner > --- > builtin/checkout.c| 8 > cache-tree.c | 5 +++-- > t/t0090-cache-tree.sh | 15 ++- > 3 files changed, 25 insertions(+), 3 deletions(-) > > diff --git a/builtin/checkout.c b/builtin/checkout.c > index 07cf555..a023a86 100644 > --- a/builtin/checkout.c > +++ b/builtin/checkout.c > @@ -553,6 +553,14 @@ static int merge_working_tree(const struct checkout_opts > *opts, > } > } > > + if (!active_cache_tree) > + active_cache_tree = cache_tree(); > + > + if (!cache_tree_fully_valid(active_cache_tree)) > + cache_tree_update(active_cache_tree, > + (const struct cache_entry * const > *)active_cache, > + active_nr, 0); > + This looks much better than the previous round, but it still does allow verify_cache() to throw noises against unmerged entries in the cache, as WRITE_TREE_SILENT flag is not passed down, no? $ git checkout master^0 $ git am $this_message $ make $ edit builtin/checkout.c ;# make changes to the above lines $ ./git checkout -m master^0 x builtin/checkout.c: unmerged (972c8a7b28f16f88475268f9a714048c228e69db) x builtin/checkout.c: unmerged (f1dc56e55f7b2200412142b10517458ccfda2952) x builtin/checkout.c: unmerged (3b9753ba8c19e7dfe6e922f30eb85c83a92a4596) M builtin/checkout.c Warning: you are leaving 1 commit behind, not connected to any of your branches: 25fab54 cache-tree: Create/update cache-tree on checkout Switched to branch 'master' Passing WRITE_TREE_SILENT in the flags parameter will get rid of the conflict notice output from the above. The user is not interested in writing a brand new tree object at all in this case, so it feels wrong to actually let the call chain go down to update_one() and create new tree objects. Side note. And passing WRITE_TREE_DRY_RUN is not a good solution either, because a later write_cache_as_tree() will not create the necessary tree object once you stuff a tree object name in the cache-tree. What we want in this code path is a way to repair a sub cache_tree if it can be repaired without creating a new tree object and otherwise leave that part invalid. The existing cache-tree framework is not prepared to do that kind of thing. It wants to start from the bottom and percolate things up, computing levels nearer to the top-level only after it fully created the trees for deeper levels, because it is meant to be used only when we really want to write out trees. We may want to reuse update_one() but I am not convinced that doing an equivalent of write-tree when you switch branches is the right approach in the first place. You will eventually write it out as a tree, and having a relatively undamaged cache-tree will help you when you do so, but spending the cycles necessary to compute a fully populated cache-tree, only to let it degrade over time until you are ready to write it out as a tree, somehow sounds like asking for a duplicated work upfront. By the way, you seem to have touched write_cache_as_tree() in the same patch, but I am not sure what makes the change necessary. I do not see a new caller to it that passes a NULL to its sha1 parameter. > diff --git a/cache-tree.c b/cache-tree.c > index 7fa524a..28d2356 100644 > --- a/cache-tree.c > +++ b/cache-tree.c > @@ -612,9 +612,10 @@ int write_cache_as_tree(unsigned char *sha1, int flags, > const char *prefix) > cache_tree_find(active_cache_tree, prefix); > if (!subtree) > return WRITE_TREE_PREFIX_ERROR; > - hashcpy(sha1, subtree->sha1); > + if (sha1) > + hashcpy(sha1, subtree->sha1); > } > - else > + else if (sha1) > hashcpy(sha1, active_cache_tree->sha1); > > if (0 <= newfd) > diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh > index 6c33e28..7c60675 100755 > --- a/t/t0090-cache-tree.sh > +++ b/t/t0090-cache-tree.sh > @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives > cache-tree' ' > test_shallow_cache_tree > ' > > -test_expect_failure 'checkout gives cache-tree' ' > +test_expect_success 'checkout gives cache-tree' ' > + git tag current > git checkout HEAD^ && > test_shallow_cache_tree > ' > > +test_expect_success 'checkout -b gives cache-tree' ' > + git checkout current && > + git checkout -b prev HEAD^ && > + test_shallow_cache_tree > +' > + > +test_expect_success 'checkout -B gives cache-tree' ' > + git checkout current && > + git checkout -B prev HEAD^ && > + test_shallow_cache_tree > +' > + > test_done -- To unsubscr
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
On Tue, 2014-07-01 at 06:16 +0200, Torsten Bögershausen wrote: > diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh > index 6c33e28..7c60675 100755 > --- a/t/t0090-cache-tree.sh > +++ b/t/t0090-cache-tree.sh > @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives > cache-tree' ' > test_shallow_cache_tree > ' > > -test_expect_failure 'checkout gives cache-tree' ' > +test_expect_success 'checkout gives cache-tree' ' > + git tag current > git checkout HEAD^ && > test_shallow_cache_tree > > The && chainis broken here. > Does the test now pass, because "git tag" is added ? The tag does not cause the cache-tree to be created, so git tag does not cause the test to pass. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 6c33e28..7c60675 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives cache-tree' ' test_shallow_cache_tree ' -test_expect_failure 'checkout gives cache-tree' ' +test_expect_success 'checkout gives cache-tree' ' + git tag current git checkout HEAD^ && test_shallow_cache_tree The && chainis broken here. Does the test now pass, because "git tag" is added ? In this case: does it may make sense the keep the old one as it is an add a new test case like this ? +test_expect_success 'tag and checkout gives cache-tree' ' [] -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
Duy Nguyen writes: > I wonder if we should do this in !opts->force code path only. In the > opts->force code path we could use prime_cache_tree() (like > read-tree), which is supposedly faster... Nobody sane should be constantly running "checkout -f", so even if priming from existing tree objects were faster, it would be adding complexity to the code to optimize for a wrong case. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
David Turner writes: > When git checkout checks out a branch, create or update the > cache-tree so that subsequent operations are faster. > > Signed-off-by: David Turner > --- > builtin/checkout.c| 4 > cache-tree.c | 22 -- > cache-tree.h | 1 + > t/t0090-cache-tree.sh | 15 ++- > 4 files changed, 31 insertions(+), 11 deletions(-) > > diff --git a/builtin/checkout.c b/builtin/checkout.c > index 07cf555..df791e8 100644 > --- a/builtin/checkout.c > +++ b/builtin/checkout.c > @@ -553,6 +553,10 @@ static int merge_working_tree(const struct checkout_opts > *opts, > } > } > > + if (write_cache_as_tree(NULL, WRITE_TREE_DO_NOT_WRITE, "")) { > + warn("Unable to write cache_tree"); > + } > + This is not even warn-worthy event, isn't it? "checkout -m other" would attempt a three-way merge to replay your local changes relative to your current HEAD in the context of "other" branch and will leave conflicts in the index and in the working tree, and it is perfectly a normal thing that you cannot write the index in such a state as a tree object. Perhaps you should check if the index is unmerged before even attempting to compute the cache tree. Also it looks very strange that write-cache-as-tree, whose *PRIMARY* function is to write a tree, receives a "DO NOT WRITE" option here. I understand that you want a new interface into the cache-tree subsystem that only updates the cache-tree but it somehow smells like a sloppy/lazy refactoring that is not done quite right. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
On Sat, Jun 28, 2014 at 7:20 AM, David Turner wrote: > When git checkout checks out a branch, create or update the > cache-tree so that subsequent operations are faster. > > Signed-off-by: David Turner > --- > builtin/checkout.c| 4 > cache-tree.c | 22 -- > cache-tree.h | 1 + > t/t0090-cache-tree.sh | 15 ++- > 4 files changed, 31 insertions(+), 11 deletions(-) > > diff --git a/builtin/checkout.c b/builtin/checkout.c > index 07cf555..df791e8 100644 > --- a/builtin/checkout.c > +++ b/builtin/checkout.c > @@ -553,6 +553,10 @@ static int merge_working_tree(const struct checkout_opts > *opts, > } > } > > + if (write_cache_as_tree(NULL, WRITE_TREE_DO_NOT_WRITE, "")) { > + warn("Unable to write cache_tree"); > + } > + I wonder if we should do this in !opts->force code path only. In the opts->force code path we could use prime_cache_tree() (like read-tree), which is supposedly faster (but may need some tests to be sure). prime_cache_tree() could be made a bit faster by doing it during tree traversal in unpack_trees() so we don't have to unpack any tree objects twice. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html