Re: [PATCH/RFC/BUG] unpack-trees.c: do not use "the_index"
On 06/02, Elijah Newren wrote: > On Fri, Jun 1, 2018 at 10:03 PM, Duy Nguyen wrote: > > On Fri, Jun 1, 2018 at 8:34 PM, Elijah Newren wrote: > >> On Fri, Jun 1, 2018 at 9:11 AM, Nguyễn Thái Ngọc Duy > >> wrote: > >>> This is more of a bug report than an actual fix because I'm not sure > >>> if "o->src_index" is always the correct answer instead of "the_index" > >>> here. But this is very similar to 7db118303a (unpack_trees: fix > >>> breakage when o->src_index != o->dst_index - 2018-04-23) and could > >>> potentially break things again... > > I'm pretty sure your patch is correct. Adding Brandon Williams to the > cc for comment since his patches came up in the analysis below... > > >> Actually, I don't think the patch will break anything in the current > >> code. Currently, all callers of unpack_trees() (even merge recursive > >> which uses different src_index and dst_index now) set src_index = > >> &the_index. So, these changes should continue to work as before (with > >> a minor possible exception of merge-recursive calling into other > >> functions from unpack-trees.c after unpack_trees() has finished..). > >> That's not to say that your patch is bug free, just that I think any > >> bugs shouldn't be triggerable from the current codebase. > > > > Ah.. I thought merge-recursive would do fancier things and used some > > temporary index. Good to know. > > Well, it does does use a temporary index, but for dst_index rather > than src_index. It then does some fancier things, but not until the > call to unpack_trees() is over. In particular, at that point, it > swaps the_index and tmp_index, reversing their roles so that now > tmp_index is the original index and the_index becomes the result after > unpack_trees() is run. That's done because I later want to use the > original index for calling verify_uptodate(). verify_uptodate() is > then used for was_tracked_and_matches() and was_tracked(). > > Anyway, the whole upshot of this is: > * merge-recursive uses src_index == &the_index for the unpack_trees() call. > * merge-recursive uses src_index == o->orig_index for subsequent > calls to verify_uptodate(), but verify_uptodate() doesn't call into > any of the sites you have modified. > > Further: > * Every other existing caller of unpack-trees in the code sets > src_index == &the_index, so this won't break any of them. > * There are only two callers in the codebase that set dst_index to > something other than &the_index -- diff-lib.c (which sets it to NULL) > and merge-recursive (which does the stuff described above). > > So, having done that analysis, I am now pretty convinced your patch > won't break anything. That's one half... > > >> Also, if any of the changes you made are wrong, what was there before > >> was also clearly wrong. So I think we're at least no worse off. > >> > >> But, I agree, it's not easy to verify what the correct thing should be > >> in all cases. I'll try to take a closer look in the next couple days. > > > > Thanks. I will also stare at this code some more in the next couple > > days trying to remember what these functions do. > > Your patch has two divisible parts: > > 1) Your modifications to > * clear_ce_flags_1() > * clear_ce_flags_dir() > * clear_ce_flags() > * mark_new_skip_worktree() > The clear_ce_flags*() functions are only called by each other and by > mark_new_skip_worktree(), which in turn is only called from > unpack_trees(). Also, in all of these, your change ends up only > modifying what index_state is passed to is_excluded_from_list(). > > 2) Your modifications to > * verify_clean_subdirectory() > * check_ok_to_remove() > In this case, the former is only called by the latter, and the latter > ends up only being called (via verify_absent_1()) from verify_absent() > and verify_absent_sparse(). > > I'll address each, in reverse order. > > 2) The stuff that affects verify_absent() > > While verify_absent() is not called from merge-recursive right now, it > was something I wanted to use in the future for very similar reasons > that verify_uptodate() started being called directly from > merge-recursive. In particular, if the rewrite of merge-recursive[A] > I want to do sets index_only when calling unpack_trees(), then does > the whole merge without touching the worktree, then at the end goes to > update the working tree, it will need to do extra checks. > verify_absent() will come in handy as one of those extra checks. For > that case, using the_index (the new index just created with lots of > changes) would be wrong in all the same ways that using the_index > caused massive problems for was_tracked() in merge-recursive (e.g. the > blow up of when Junio merged the original directory rename detection > series into master and subsequently reverted it); we'd instead want > src_index (which was the index that existed when merge was called) > instead. So, with this patch you've fixed some important bugs that I > would have hit later
Re: [PATCH/RFC/BUG] unpack-trees.c: do not use "the_index"
On Fri, Jun 1, 2018 at 10:03 PM, Duy Nguyen wrote: > On Fri, Jun 1, 2018 at 8:34 PM, Elijah Newren wrote: >> On Fri, Jun 1, 2018 at 9:11 AM, Nguyễn Thái Ngọc Duy >> wrote: >>> This is more of a bug report than an actual fix because I'm not sure >>> if "o->src_index" is always the correct answer instead of "the_index" >>> here. But this is very similar to 7db118303a (unpack_trees: fix >>> breakage when o->src_index != o->dst_index - 2018-04-23) and could >>> potentially break things again... I'm pretty sure your patch is correct. Adding Brandon Williams to the cc for comment since his patches came up in the analysis below... >> Actually, I don't think the patch will break anything in the current >> code. Currently, all callers of unpack_trees() (even merge recursive >> which uses different src_index and dst_index now) set src_index = >> &the_index. So, these changes should continue to work as before (with >> a minor possible exception of merge-recursive calling into other >> functions from unpack-trees.c after unpack_trees() has finished..). >> That's not to say that your patch is bug free, just that I think any >> bugs shouldn't be triggerable from the current codebase. > > Ah.. I thought merge-recursive would do fancier things and used some > temporary index. Good to know. Well, it does does use a temporary index, but for dst_index rather than src_index. It then does some fancier things, but not until the call to unpack_trees() is over. In particular, at that point, it swaps the_index and tmp_index, reversing their roles so that now tmp_index is the original index and the_index becomes the result after unpack_trees() is run. That's done because I later want to use the original index for calling verify_uptodate(). verify_uptodate() is then used for was_tracked_and_matches() and was_tracked(). Anyway, the whole upshot of this is: * merge-recursive uses src_index == &the_index for the unpack_trees() call. * merge-recursive uses src_index == o->orig_index for subsequent calls to verify_uptodate(), but verify_uptodate() doesn't call into any of the sites you have modified. Further: * Every other existing caller of unpack-trees in the code sets src_index == &the_index, so this won't break any of them. * There are only two callers in the codebase that set dst_index to something other than &the_index -- diff-lib.c (which sets it to NULL) and merge-recursive (which does the stuff described above). So, having done that analysis, I am now pretty convinced your patch won't break anything. That's one half... >> Also, if any of the changes you made are wrong, what was there before >> was also clearly wrong. So I think we're at least no worse off. >> >> But, I agree, it's not easy to verify what the correct thing should be >> in all cases. I'll try to take a closer look in the next couple days. > > Thanks. I will also stare at this code some more in the next couple > days trying to remember what these functions do. Your patch has two divisible parts: 1) Your modifications to * clear_ce_flags_1() * clear_ce_flags_dir() * clear_ce_flags() * mark_new_skip_worktree() The clear_ce_flags*() functions are only called by each other and by mark_new_skip_worktree(), which in turn is only called from unpack_trees(). Also, in all of these, your change ends up only modifying what index_state is passed to is_excluded_from_list(). 2) Your modifications to * verify_clean_subdirectory() * check_ok_to_remove() In this case, the former is only called by the latter, and the latter ends up only being called (via verify_absent_1()) from verify_absent() and verify_absent_sparse(). I'll address each, in reverse order. 2) The stuff that affects verify_absent() While verify_absent() is not called from merge-recursive right now, it was something I wanted to use in the future for very similar reasons that verify_uptodate() started being called directly from merge-recursive. In particular, if the rewrite of merge-recursive[A] I want to do sets index_only when calling unpack_trees(), then does the whole merge without touching the worktree, then at the end goes to update the working tree, it will need to do extra checks. verify_absent() will come in handy as one of those extra checks. For that case, using the_index (the new index just created with lots of changes) would be wrong in all the same ways that using the_index caused massive problems for was_tracked() in merge-recursive (e.g. the blow up of when Junio merged the original directory rename detection series into master and subsequently reverted it); we'd instead want src_index (which was the index that existed when merge was called) instead. So, with this patch you've fixed some important bugs that I would have hit later. [A] sidenote: see https://public-inbox.org/git/xmqqk1ydkbx0@gitster.mtv.corp.google.com/ for more details 1) mark_new_skip_worktree() ... is_excluded_from_list(). Sadly, I'm not very familiar with the skip_worktree and
Re: [PATCH/RFC/BUG] unpack-trees.c: do not use "the_index"
On Fri, Jun 1, 2018 at 8:34 PM, Elijah Newren wrote: > Hi, > > On Fri, Jun 1, 2018 at 9:11 AM, Nguyễn Thái Ngọc Duy > wrote: >> unpack-trees code works on multiple indexes specified in >> unpack_trees_options. Although they normally all refer to the_index at >> the call site, that is the caller's business. unpack-trees.c should >> not make any assumption about that and should use the correct index >> field in unpack_trees_options. >> >> This patch is actually confusing because sometimes the function >> parameter is also named "the_index" while some other times "the_index" >> is the global variable because the function just does not have a >> parameter of the same name! The only subtle difference is that the >> function parameter is a pointer while the global one is not. >> >> This is more of a bug report than an actual fix because I'm not sure >> if "o->src_index" is always the correct answer instead of "the_index" >> here. But this is very similar to 7db118303a (unpack_trees: fix >> breakage when o->src_index != o->dst_index - 2018-04-23) and could >> potentially break things again... > > Actually, I don't think the patch will break anything in the current > code. Currently, all callers of unpack_trees() (even merge recursive > which uses different src_index and dst_index now) set src_index = > &the_index. So, these changes should continue to work as before (with > a minor possible exception of merge-recursive calling into other > functions from unpack-trees.c after unpack_trees() has finished..). > That's not to say that your patch is bug free, just that I think any > bugs shouldn't be triggerable from the current codebase. Ah.. I thought merge-recursive would do fancier things and used some temporary index. Good to know. > Also, if any of the changes you made are wrong, what was there before > was also clearly wrong. So I think we're at least no worse off. > > But, I agree, it's not easy to verify what the correct thing should be > in all cases. I'll try to take a closer look in the next couple days. Thanks. I will also stare at this code some more in the next couple days trying to remember what these functions do. -- Duy
Re: [PATCH/RFC/BUG] unpack-trees.c: do not use "the_index"
On Fri, Jun 1, 2018 at 8:51 PM, Stefan Beller wrote: > On Fri, Jun 1, 2018 at 11:34 AM, Elijah Newren wrote: > >>> +/* Do not use the_index here, you probably want o->src_index */ >>> +#define the_index the_index_should_not_be_used here >> >> Good call. > > Is the space instead of the underscore between the last two words intentional? Unintentional. I wanted gcc to show this phrase when a developer accidentally uses the_index here (it's basically "the_index_should_not_be_used_here is not declared blah blah"). Two words would cause a different syntax error. -- Duy
Re: [PATCH/RFC/BUG] unpack-trees.c: do not use "the_index"
On Fri, Jun 1, 2018 at 11:34 AM, Elijah Newren wrote: >> +/* Do not use the_index here, you probably want o->src_index */ >> +#define the_index the_index_should_not_be_used here > > Good call. Is the space instead of the underscore between the last two words intentional?
Re: [PATCH/RFC/BUG] unpack-trees.c: do not use "the_index"
Hi, On Fri, Jun 1, 2018 at 9:11 AM, Nguyễn Thái Ngọc Duy wrote: > unpack-trees code works on multiple indexes specified in > unpack_trees_options. Although they normally all refer to the_index at > the call site, that is the caller's business. unpack-trees.c should > not make any assumption about that and should use the correct index > field in unpack_trees_options. > > This patch is actually confusing because sometimes the function > parameter is also named "the_index" while some other times "the_index" > is the global variable because the function just does not have a > parameter of the same name! The only subtle difference is that the > function parameter is a pointer while the global one is not. > > This is more of a bug report than an actual fix because I'm not sure > if "o->src_index" is always the correct answer instead of "the_index" > here. But this is very similar to 7db118303a (unpack_trees: fix > breakage when o->src_index != o->dst_index - 2018-04-23) and could > potentially break things again... Actually, I don't think the patch will break anything in the current code. Currently, all callers of unpack_trees() (even merge recursive which uses different src_index and dst_index now) set src_index = &the_index. So, these changes should continue to work as before (with a minor possible exception of merge-recursive calling into other functions from unpack-trees.c after unpack_trees() has finished..). That's not to say that your patch is bug free, just that I think any bugs shouldn't be triggerable from the current codebase. Also, if any of the changes you made are wrong, what was there before was also clearly wrong. So I think we're at least no worse off. But, I agree, it's not easy to verify what the correct thing should be in all cases. I'll try to take a closer look in the next couple days. > > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > unpack-trees.c | 45 ++--- > 1 file changed, 26 insertions(+), 19 deletions(-) > > diff --git a/unpack-trees.c b/unpack-trees.c > index 3a85a02a77..114496cfc2 100644 > --- a/unpack-trees.c > +++ b/unpack-trees.c > @@ -18,6 +18,9 @@ > #include "fsmonitor.h" > #include "fetch-object.h" > > +/* Do not use the_index here, you probably want o->src_index */ > +#define the_index the_index_should_not_be_used here Good call.