Re: [RFC/PATCH 00/18] Add --index-only option to git merge
On Thu, Apr 7, 2016 at 11:58 PM, Elijah Newren wrote: > Luckily, I figured out that bug. So, that leaves just one case left > that I can't seem to figure out: read_tree_trivial. So much better, > right? Even it's name is sitting there, mocking me. "Ha ha, I'm > read_tree_*trivial* and you can't figure me out." read_tree_trivial > is a jerk. It turns out this wasn't a bug in my index-only handling of the trivial merge case; it was a pre-existing bug in git handling of trivial merges that probably no one else had ever triggered before. I'm submitting a fix with my other miscellaneous merge fixups... -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 00/18] Add --index-only option to git merge
Elijah Newren writes: > On Fri, Apr 8, 2016 at 11:08 AM, Junio C Hamano wrote: >> Elijah Newren writes: >> >> The goal is stated rather vaguely--when you have a working tree and >> perform this "in index" merge, you would obviously update the index >> with the merge result and ... > > Yes, I think you hit the nail on the head with your suggestions > elsewhere to (1) limit this to bare repository only, and/or (2) do the > "git merge --into master en/topic" > > I'll fix this. > ... > The pointer is helpful, but I struggle a bit with the name '--cached'. > The past tense in that word suggests to me that it should be a > read-only operation on the cache (which works for grep), rather than a > write operation (such as for merge or apply). It could also be > misconstrued in merge's case to think that the index is one of the > things being merged (rather than erroring out if the index doesn't > match HEAD). I know apply already uses --cached, but if others don't > mind, I personally would rather not use it here. Is this something > you feel strongly about, or are you okay with --index-only? "cached" is not in past tense, but is used an adjective. You can update what is in the index and that is to update the "cached" information. But I do not think we need to discuss this, because we do not need such an option, as I understand that your updated direction is to do one or both of (1) do the "merge and update HEAD only when no manual conflict resolution is needed" thing in a bare repository, (2) "git merge --into=master en/topic" does the "merge master and en/topic and update master only when no manual conflict resolution is needed and do so without touching HEAD, index or the working tree". Because the current "git merge" refuses to work without the working tree, (1) can be done without adding any new option---if you are run in a bare repository, by definition, you are being asked to do that new thing, and because you will not do the new thing in a repository with a working tree, there is no need for an option to tell the command, "no, don't do the usual thing but do this new thing". And obviously (2) already has a new option --into and no other new option is needed to tell the command that it needs to do that new thing. >> I have a suspicion that this would become moot, as the conclusion >> may become that "git merge" that works only in index does not make >> sense unless you are in a bare repository---in which case, these >> external merge drivers has to be given a temporary working tree >> anyway, and they can keep writing their result out to what they >> consider is the working tree (i.e. via GIT_WORK_TREE or something). >> You read the result of them from that temporary working tree into >> the index before cleaning the temporary working tree. That way, >> you do not have to touch them at all, no? In fact, the temporary >> working tree trick may be applicable even when you are working in a >> repository with a tree working tree. > > I'm a little confused; you seem to be suggesting git-merge-one-file > will always need to have a working tree of some sort, though I don't > see why. I was loosely referring to the working tree, which includes things like having a place you can safely check out temporary files as if they were one of the working tree files, i.e. the arrangement in which "merge-one-file" is designed to operate. Even if they are "tmpname" generated files, you do not want to clobber your $GIT_DIR with these random files when driving these blob level merge drivers in a bare repository. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 00/18] Add --index-only option to git merge
On Fri, Apr 8, 2016 at 6:01 AM, Michael J Gruber wrote: > I haven't looked at your series thoroughly but immediately had to think > of 'tr/remerge-diff' (on 'pu'), see > http://permalink.gmane.org/gmane.comp.version-control.git/256591 > > There, Thomas used index-only merge to reproduce an automatic merge as > the base for a useful "remerge-diff". > > I've been rebasing (and using) that series on 'next' for a while now > without any problems; some reasons kept it from being merged on next, > see the thread. > > So, it would be interesting whether you solve the same problem > differently, or face the same problems ;) Thanks for the link. Looks like a very interesting series, even if we're solving slightly different problems (I don't want conflicts auto-resolved; I want to be able to look at what conflicted and why with commands like 'git ls-files -u' afterward.) But the problem he's trying to solve is interesting to me too. We have one patch from each of our series that does overlap, the index-only modification to merge-recursive, though implemented slightly differently. I think mine's a little clearer, and I have a hunch that he might be able to use the idea in mine to dramatically simplify some of the other stuff he's doing. In particular, merge-recursive already has code to auto resolve conflicts that he could just re-use instead of reimplementing, which I believe would dramatically simplify his patch #8. I didn't read all his code super closely, so I might be missing something important, but I got the feeling that he didn't need different behavior than what merge-recursive already implements for virtual merge bases, and that even he wasn't certain whether he had handled all cases (e.g. not only conflict markers and modify/delete, but also rename/delete, rename/add, rename/rename (both 1to2 and 2to1), D/F conflicts, and perhaps others I'm not thinking of at the moment). We already have well reviewed and tested code for all those cases; it's just a subset of the things triggered by o->call_depth for virtual merges bases. Also, like the index-only stuff triggered by o->call_depth, the auto-resolve-conflict stuff is pretty well separated so it should be easy to add a flag to trigger just this portion without getting all the other stuff that o->call_depth normally does. As far as I could tell, his series stalled out both because of concerns surrounding whether all the automatic-conflict-resolutions were correct, and because it made git log no longer be a read-only operation. Neither of those concerns are applicable to my patchset; I invented totally new problems and concerns instead. :-) I don't have a good solution to the second concern for his patchset, but I think there's probably an easy solution to the first. Once I get my patchset cleaned up, I may take a look at reviving his (if he doesn't beat me to it.) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 00/18] Add --index-only option to git merge
Hi, On Fri, Apr 8, 2016 at 11:08 AM, Junio C Hamano wrote: > Elijah Newren writes: > > The goal is stated rather vaguely--when you have a working tree and > perform this "in index" merge, you would obviously update the index > with the merge result and ... Yes, I think you hit the nail on the head with your suggestions elsewhere to (1) limit this to bare repository only, and/or (2) do the "git merge --into master en/topic" I'll fix this. >> The core fix, to merge-recursive, was actually quite easy. The >> recursive merge logic already had the ability to ignore the working >> directory and operate entirely on the index -- it needed to do this >> when creating a virtual merge base, i.e. when o->call_depth > 0. > > You need to be careful here, though. The creation of a virtual > parent accepts (actually it wants to have) conflicted blobs as if > that is a valid merge result--"git merge but only in index" probably > does not want that behaviour. Yes, precisely. :-) A close look at the merge-recursive code showed that the code to handle the index-only behavior was already nicely separated from the other things triggered by o->call_depth (e.g. the conflicted blobs case you mention above, among others), making it really easy to modify it so we could get the index-only behavior without the other stuff. > I'd prefer these as a separate series if they are truly unrelated > cleanups, so that we can quickly review and have them graduated > without waiting for the remainder. You can still submit the main > series with a comment that says it applies on that separate clean-up > series and the right things should happen ;-) Will do. > When you say "index only", I'd say people would understand that to > be saying "as opposed to using and updating both the index and the > working tree", so I do not think there is no confusion. > > An established convention to spell "index only" found in "apply" and > "grep" is to say "--cached", though (cf. "git help cli"). I'm assuming your double negative was unintentional, i.e. that you do not think there is any confusion. Let me know if I misread. The pointer is helpful, but I struggle a bit with the name '--cached'. The past tense in that word suggests to me that it should be a read-only operation on the cache (which works for grep), rather than a write operation (such as for merge or apply). It could also be misconstrued in merge's case to think that the index is one of the things being merged (rather than erroring out if the index doesn't match HEAD). I know apply already uses --cached, but if others don't mind, I personally would rather not use it here. Is this something you feel strongly about, or are you okay with --index-only? > If you have to assume, assume that there are people who use these > programs in their scripts and workflows, because it is a relatively > new development to make "git merge" directly calling into the > recursive machinery bypassing these commands. So, my question wasn't so much about whether the git-merge-* programs are used, as to whether they should support all the same options that e.g. "git merge -s recursive" does. I'm not sure if having the old merge-* programs support fewer options is considered good ("we want to toss them eventually", or "they are less tested these days so we want to take less risk with modifying them") or bad ("we don't want there to be any difference in ability or behavior"). It should be easy to add the --index-only option to each of these, I'm just unsure whether it matters or if it's wanted. > I have a suspicion that this would become moot, as the conclusion > may become that "git merge" that works only in index does not make > sense unless you are in a bare repository---in which case, these > external merge drivers has to be given a temporary working tree > anyway, and they can keep writing their result out to what they > consider is the working tree (i.e. via GIT_WORK_TREE or something). > You read the result of them from that temporary working tree into > the index before cleaning the temporary working tree. That way, > you do not have to touch them at all, no? In fact, the temporary > working tree trick may be applicable even when you are working in a > repository with a tree working tree. I'm a little confused; you seem to be suggesting git-merge-one-file will always need to have a working tree of some sort, though I don't see why. git-merge-one-file doesn't currently read from the working tree, except to check for untracked files in the way; it instead gets file contents from git unpack-file, which pulls it from the object store. git-merge-one-file currently records the merge resolution, if clean, in both the index and the working copy, so my modifications are about making it able to write to only the index. That's the only modification I think it needs to avoid requiring a working tree at all. (git-merge-one-file does create some temporary files in the current working directory via git un
Re: [RFC/PATCH 00/18] Add --index-only option to git merge
Elijah Newren writes: > This patch series adds an --index-only flag to git merge, the idea > being to allow a merge to be performed entirely in the index without > touching (or even needing) a working tree. The goal is stated rather vaguely--when you have a working tree and perform this "in index" merge, you would obviously update the index with the merge result and write it out as a tree to record the merge, and update HEAD to point at the new commit, but "without touching" would imply that you would not be checking out the result to the working tree. That sounds crazy and probably you didn't mean it that way. One worry immediately comes to mind is what should happen when conflicts occur. A knee-jerk reaction only after reading the above explanation is to error out; there could be other sensible alternatives, but I do not think of any offhand. Another thing that comes to mind is if that should be an option to the end-user facing "git merge" command that can be run in a repository _with_ a working tree. When used in a repository with a working tree, I am not sure what the next step the end user wants to make after the "only in index" merge succeeds. The index has been updated with the merge result at that point and I am guessing that the merge would also have created a new commit and updated the HEAD to point at it. The contents of the working tree for paths that were not changed when the user initiated the merge can be safely checked out, but what should happen to the ones that were dirty wrt the index (I am again guessing that you do not allow a "git merge only in index" if the index already has changes wrt the HEAD)? I can understand if the option only worked in a bare repository, though. > The core fix, to merge-recursive, was actually quite easy. The > recursive merge logic already had the ability to ignore the working > directory and operate entirely on the index -- it needed to do this > when creating a virtual merge base, i.e. when o->call_depth > 0. You need to be careful here, though. The creation of a virtual parent accepts (actually it wants to have) conflicted blobs as if that is a valid merge result--"git merge but only in index" probably does not want that behaviour. > A brief-ish summary of the series: > > * Patches 1 and 2 are unrelated cleanups, which could be submitted > independently. However, submitting them as separate patches would > result in a minor conflict, so I'm just including them together. > > * Patches 3 and 4 add some testcases and a fix for a separate issue > found while testing this series which could be split off and > submitted separately, but fixing this problem and enforcing the > starting state I expected permitted me to reduce the range of > testcases I needed to consider for the --index-only merges. So I > thought it made sense to include in this series. I'd prefer these as a separate series if they are truly unrelated cleanups, so that we can quickly review and have them graduated without waiting for the remainder. You can still submit the main series with a comment that says it applies on that separate clean-up series and the right things should happen ;-) > In particular, I was worried about how git merge behaved when the > index differed from HEAD prior to the merge starting. I discovered > that it wasn't just allowed for fast-forward merges (where behavior > is well-defined) but that it was also (accidentally I'm assuming) > allowed for octopus merges with weird/surprising behavior. I briefly looked at these tests and I agree with your expectation, i.e. we want to allow a merge as long as there is no conflicts with the change already in the index (i.e. the changed paths have the same contents in ORIG_HEAD and HEAD). > Some things I am concerned about: > > * The option name (--index-only) may be slightly misleading since the > index isn't the only thing modified within the git directory, other > normal things under there are still modified (e.g. writing MERGE_* > files, sticking blobs/trees/commits in objects/*, and updating refs > and reflogs). I personally prefer this name and think the confusion > would be minor, but I'm a bit unsure and wanted some other opinions > on this. When you say "index only", I'd say people would understand that to be saying "as opposed to using and updating both the index and the working tree", so I do not think there is no confusion. An established convention to spell "index only" found in "apply" and "grep" is to say "--cached", though (cf. "git help cli"). > * I didn't add this option to the separate git-merge-recursive > executable and make the worktree-optional modification to the git > subcommands merge-recursive, merge-recursive-ours, > merge-recursive-theirs, and merge-subtree in git.c. Should I, or > are these separate binaries and subcommands just present for > historical backward compatibility reasons with people expected to > call e.g. "git mer
Re: [RFC/PATCH 00/18] Add --index-only option to git merge
Elijah Newren venit, vidit, dixit 08.04.2016 08:58: > This patch series adds an --index-only flag to git merge, the idea > being to allow a merge to be performed entirely in the index without > touching (or even needing) a working tree. > > The core fix, to merge-recursive, was actually quite easy. The > recursive merge logic already had the ability to ignore the working > directory and operate entirely on the index -- it needed to do this > when creating a virtual merge base, i.e. when o->call_depth > 0. It's > just that o->call_depth was also used for other purposes, so I just > needed to introduce a new flag to disambiguate and switch all the > necessary index-only-related call sites to use it. It actually seems > to make the code slightly easier to read too, which is a nice bonus. > That was all done in patch 12 of this series. > > Adding all the necessary testcases and switching over the other merge > strategies turned out to be the harder part...and still has a problem, > as I'll mention below. I haven't looked at your series thoroughly but immediately had to think of 'tr/remerge-diff' (on 'pu'), see http://permalink.gmane.org/gmane.comp.version-control.git/256591 There, Thomas used index-only merge to reproduce an automatic merge as the base for a useful "remerge-diff". I've been rebasing (and using) that series on 'next' for a while now without any problems; some reasons kept it from being merged on next, see the thread. So, it would be interesting whether you solve the same problem differently, or face the same problems ;) Michael -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC/PATCH 00/18] Add --index-only option to git merge
This patch series adds an --index-only flag to git merge, the idea being to allow a merge to be performed entirely in the index without touching (or even needing) a working tree. The core fix, to merge-recursive, was actually quite easy. The recursive merge logic already had the ability to ignore the working directory and operate entirely on the index -- it needed to do this when creating a virtual merge base, i.e. when o->call_depth > 0. It's just that o->call_depth was also used for other purposes, so I just needed to introduce a new flag to disambiguate and switch all the necessary index-only-related call sites to use it. It actually seems to make the code slightly easier to read too, which is a nice bonus. That was all done in patch 12 of this series. Adding all the necessary testcases and switching over the other merge strategies turned out to be the harder part...and still has a problem, as I'll mention below. A brief-ish summary of the series: * Patches 1 and 2 are unrelated cleanups, which could be submitted independently. However, submitting them as separate patches would result in a minor conflict, so I'm just including them together. * Patches 3 and 4 add some testcases and a fix for a separate issue found while testing this series which could be split off and submitted separately, but fixing this problem and enforcing the starting state I expected permitted me to reduce the range of testcases I needed to consider for the --index-only merges. So I thought it made sense to include in this series. In particular, I was worried about how git merge behaved when the index differed from HEAD prior to the merge starting. I discovered that it wasn't just allowed for fast-forward merges (where behavior is well-defined) but that it was also (accidentally I'm assuming) allowed for octopus merges with weird/surprising behavior. * Patches 5-10 add testcases for the --index-only option for each of the merge strategies/cases: recursive, fast-forward update, resolve, octopus, ours, subtree. * Patch 11 adds the --index-only option to the code and adds the documentation, while patches 12-18 implement the --index-only option for each of the strategies/cases. Some things I am concerned about: * The option name (--index-only) may be slightly misleading since the index isn't the only thing modified within the git directory, other normal things under there are still modified (e.g. writing MERGE_* files, sticking blobs/trees/commits in objects/*, and updating refs and reflogs). I personally prefer this name and think the confusion would be minor, but I'm a bit unsure and wanted some other opinions on this. * I didn't add this option to the separate git-merge-recursive executable and make the worktree-optional modification to the git subcommands merge-recursive, merge-recursive-ours, merge-recursive-theirs, and merge-subtree in git.c. Should I, or are these separate binaries and subcommands just present for historical backward compatibility reasons with people expected to call e.g. "git merge -s recursive" these days? * The addition of --index-only to the various git-merge*.sh scripts is lacking in flexibility (e.g. has to be passed as the first argument). These scripts seemed to have fairly rigid calling conventions already, suggesting there's not much value in making them flexible, but perhaps that was the wrong conclusion to draw. * Expanding on the last item, git-merge-one-file.sh is of particular concern; it seemed to strongly assume exactly seven arguments and that the position of each mattered. I didn't want to break that, so I added --index-only as an optional 8th argument, even though it seems slightly odd force an option argument to always come after positional ones (and it made the changes to merge_entry in merge-index.c slightly easier to implement). Does that seem okay? * For a long time I had two remaining bugs, one of which was in checkout_fast_forward. I was feeling kind of sheepish about that, because how much simpler could it be than handling a fast-forward merge (with possible uncommited index entries to carry forward)? Getting stuck on a simple case like that would be embarrassing. Luckily, I figured out that bug. So, that leaves just one case left that I can't seem to figure out: read_tree_trivial. So much better, right? Even it's name is sitting there, mocking me. "Ha ha, I'm read_tree_*trivial* and you can't figure me out." read_tree_trivial is a jerk. Elijah Newren (18): Remove duplicate code Avoid checking working copy when creating a virtual merge base Document weird bug in octopus merges via testcases merge-octopus: Abort if index not clean Add testcase for --index-only merges needing the recursive strategy Add testcase for --index-only merges needing an ff update Add testcase for --index-only merges with the resolve strategy Add testcase for --index-o