Re: [RFC/PATCH 00/18] Add --index-only option to git merge

2016-04-09 Thread Elijah Newren
On Thu, Apr 7, 2016 at 11:58 PM, Elijah Newren  wrote:
>   Luckily, I figured out that bug.  So, that leaves just one case left
>   that I can't seem to figure out: read_tree_trivial.  So much better,
>   right?  Even it's name is sitting there, mocking me.  "Ha ha, I'm
>   read_tree_*trivial* and you can't figure me out."  read_tree_trivial
>   is a jerk.

It turns out this wasn't a bug in my index-only handling of the
trivial merge case; it was a pre-existing bug in git handling of
trivial merges that probably no one else had ever triggered before.
I'm submitting a fix with my other miscellaneous merge fixups...
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 00/18] Add --index-only option to git merge

2016-04-08 Thread Junio C Hamano
Elijah Newren  writes:

> On Fri, Apr 8, 2016 at 11:08 AM, Junio C Hamano  wrote:
>> Elijah Newren  writes:
>>
>> The goal is stated rather vaguely--when you have a working tree and
>> perform this "in index" merge, you would obviously update the index
>> with the merge result and ...
>
> Yes, I think you hit the nail on the head with your suggestions
> elsewhere to (1) limit this to bare repository only, and/or (2) do the
> "git merge --into master en/topic"
>
> I'll fix this.
> ...
> The pointer is helpful, but I struggle a bit with the name '--cached'.
> The past tense in that word suggests to me that it should be a
> read-only operation on the cache (which works for grep), rather than a
> write operation (such as for merge or apply).  It could also be
> misconstrued in merge's case to think that the index is one of the
> things being merged (rather than erroring out if the index doesn't
> match HEAD).  I know apply already uses --cached, but if others don't
> mind, I personally would rather not use it here.  Is this something
> you feel strongly about, or are you okay with --index-only?

"cached" is not in past tense, but is used an adjective.  You can
update what is in the index and that is to update the "cached"
information.

But I do not think we need to discuss this, because we do not need
such an option, as I understand that your updated direction is to do
one or both of (1) do the "merge and update HEAD only when no manual
conflict resolution is needed" thing in a bare repository, (2) "git
merge --into=master en/topic" does the "merge master and en/topic
and update master only when no manual conflict resolution is needed
and do so without touching HEAD, index or the working tree".

Because the current "git merge" refuses to work without the working
tree, (1) can be done without adding any new option---if you are run
in a bare repository, by definition, you are being asked to do that
new thing, and because you will not do the new thing in a repository
with a working tree, there is no need for an option to tell the
command, "no, don't do the usual thing but do this new thing".  And
obviously (2) already has a new option --into and no other new option
is needed to tell the command that it needs to do that new thing.

>> I have a suspicion that this would become moot, as the conclusion
>> may become that "git merge" that works only in index does not make
>> sense unless you are in a bare repository---in which case, these
>> external merge drivers has to be given a temporary working tree
>> anyway, and they can keep writing their result out to what they
>> consider is the working tree (i.e. via GIT_WORK_TREE or something).
>> You read the result of them from that temporary working tree into
>> the index before cleaning the temporary working tree.  That way,
>> you do not have to touch them at all, no?  In fact, the temporary
>> working tree trick may be applicable even when you are working in a
>> repository with a tree working tree.
>
> I'm a little confused; you seem to be suggesting git-merge-one-file
> will always need to have a working tree of some sort, though I don't
> see why.

I was loosely referring to the working tree, which includes things
like having a place you can safely check out temporary files as if
they were one of the working tree files, i.e. the arrangement in
which "merge-one-file" is designed to operate.  Even if they are
"tmpname" generated files, you do not want to clobber your $GIT_DIR
with these random files when driving these blob level merge drivers
in a bare repository.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 00/18] Add --index-only option to git merge

2016-04-08 Thread Elijah Newren
On Fri, Apr 8, 2016 at 6:01 AM, Michael J Gruber
 wrote:
> I haven't looked at your series thoroughly but immediately had to think
> of 'tr/remerge-diff' (on 'pu'), see
> http://permalink.gmane.org/gmane.comp.version-control.git/256591
>
> There, Thomas used index-only merge to reproduce an automatic merge as
> the base for a useful "remerge-diff".
>
> I've been rebasing (and using) that series on 'next' for a while now
> without any problems; some reasons kept it from being merged on next,
> see the thread.
>
> So, it would be interesting whether you solve the same problem
> differently, or face the same problems ;)

Thanks for the link.  Looks like a very interesting series, even if
we're solving slightly different problems (I don't want conflicts
auto-resolved; I want to be able to look at what conflicted and why
with commands like 'git ls-files -u' afterward.)

But the problem he's trying to solve is interesting to me too.  We
have one patch from each of our series that does overlap, the
index-only modification to merge-recursive, though implemented
slightly differently.  I think mine's a little clearer, and I have a
hunch that he might be able to use the idea in mine to dramatically
simplify some of the other stuff he's doing.  In particular,
merge-recursive already has code to auto resolve conflicts that he
could just re-use instead of reimplementing, which I believe would
dramatically simplify his patch #8.

I didn't read all his code super closely, so I might be missing
something important, but I got the feeling that he didn't need
different behavior than what merge-recursive already implements for
virtual merge bases, and that even he wasn't certain whether he had
handled all cases (e.g. not only conflict markers and modify/delete,
but also rename/delete, rename/add, rename/rename (both 1to2 and
2to1), D/F conflicts, and perhaps others I'm not thinking of at the
moment).  We already have well reviewed and tested code for all those
cases; it's just a subset of the things triggered by o->call_depth for
virtual merges bases.  Also, like the index-only stuff triggered by
o->call_depth, the auto-resolve-conflict stuff is pretty well
separated so it should be easy to add a flag to trigger just this
portion without getting all the other stuff that o->call_depth
normally does.

As far as I could tell, his series stalled out both because of
concerns surrounding whether all the automatic-conflict-resolutions
were correct, and because it made git log no longer be a read-only
operation.  Neither of those concerns are applicable to my patchset; I
invented totally new problems and concerns instead.  :-)  I don't have
a good solution to the second concern for his patchset, but I think
there's probably an easy solution to the first.  Once I get my
patchset cleaned up, I may take a look at reviving his (if he doesn't
beat me to it.)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 00/18] Add --index-only option to git merge

2016-04-08 Thread Elijah Newren
Hi,

On Fri, Apr 8, 2016 at 11:08 AM, Junio C Hamano  wrote:
> Elijah Newren  writes:
>
> The goal is stated rather vaguely--when you have a working tree and
> perform this "in index" merge, you would obviously update the index
> with the merge result and ...

Yes, I think you hit the nail on the head with your suggestions
elsewhere to (1) limit this to bare repository only, and/or (2) do the
"git merge --into master en/topic"

I'll fix this.

>> The core fix, to merge-recursive, was actually quite easy.  The
>> recursive merge logic already had the ability to ignore the working
>> directory and operate entirely on the index -- it needed to do this
>> when creating a virtual merge base, i.e. when o->call_depth > 0.
>
> You need to be careful here, though.  The creation of a virtual
> parent accepts (actually it wants to have) conflicted blobs as if
> that is a valid merge result--"git merge but only in index" probably
> does not want that behaviour.

Yes, precisely.  :-)

A close look at the merge-recursive code showed that the code to
handle the index-only behavior was already nicely separated from the
other things triggered by o->call_depth (e.g. the conflicted blobs
case you mention above, among others), making it really easy to modify
it so we could get the index-only behavior without the other stuff.

> I'd prefer these as a separate series if they are truly unrelated
> cleanups, so that we can quickly review and have them graduated
> without waiting for the remainder.  You can still submit the main
> series with a comment that says it applies on that separate clean-up
> series and the right things should happen ;-)

Will do.

> When you say "index only", I'd say people would understand that to
> be saying "as opposed to using and updating both the index and the
> working tree", so I do not think there is no confusion.
>
> An established convention to spell "index only" found in "apply" and
> "grep" is to say "--cached", though (cf. "git help cli").

I'm assuming your double negative was unintentional, i.e. that you do
not think there is any confusion.  Let me know if I misread.

The pointer is helpful, but I struggle a bit with the name '--cached'.
The past tense in that word suggests to me that it should be a
read-only operation on the cache (which works for grep), rather than a
write operation (such as for merge or apply).  It could also be
misconstrued in merge's case to think that the index is one of the
things being merged (rather than erroring out if the index doesn't
match HEAD).  I know apply already uses --cached, but if others don't
mind, I personally would rather not use it here.  Is this something
you feel strongly about, or are you okay with --index-only?

> If you have to assume, assume that there are people who use these
> programs in their scripts and workflows, because it is a relatively
> new development to make "git merge" directly calling into the
> recursive machinery bypassing these commands.

So, my question wasn't so much about whether the git-merge-* programs
are used, as to whether they should support all the same options that
e.g. "git merge -s recursive" does.  I'm not sure if having the old
merge-* programs support fewer options is considered good ("we want to
toss them eventually", or "they are less tested these days so we want
to take less risk with modifying them") or bad ("we don't want there
to be any difference in ability or behavior").

It should be easy to add the --index-only option to each of these, I'm
just unsure whether it matters or if it's wanted.

> I have a suspicion that this would become moot, as the conclusion
> may become that "git merge" that works only in index does not make
> sense unless you are in a bare repository---in which case, these
> external merge drivers has to be given a temporary working tree
> anyway, and they can keep writing their result out to what they
> consider is the working tree (i.e. via GIT_WORK_TREE or something).
> You read the result of them from that temporary working tree into
> the index before cleaning the temporary working tree.  That way,
> you do not have to touch them at all, no?  In fact, the temporary
> working tree trick may be applicable even when you are working in a
> repository with a tree working tree.

I'm a little confused; you seem to be suggesting git-merge-one-file
will always need to have a working tree of some sort, though I don't
see why.

git-merge-one-file doesn't currently read from the working tree,
except to check for untracked files in the way; it instead gets file
contents from git unpack-file, which pulls it from the object store.
git-merge-one-file currently records the merge resolution, if clean,
in both the index and the working copy, so my modifications are about
making it able to write to only the index. That's the only
modification I think it needs to avoid requiring a working tree at
all.

(git-merge-one-file does create some temporary files in the current
working directory via git un

Re: [RFC/PATCH 00/18] Add --index-only option to git merge

2016-04-08 Thread Junio C Hamano
Elijah Newren  writes:

> This patch series adds an --index-only flag to git merge, the idea
> being to allow a merge to be performed entirely in the index without
> touching (or even needing) a working tree.

The goal is stated rather vaguely--when you have a working tree and
perform this "in index" merge, you would obviously update the index
with the merge result and write it out as a tree to record the merge,
and update HEAD to point at the new commit, but "without touching"
would imply that you would not be checking out the result to the
working tree.  That sounds crazy and probably you didn't mean it
that way.

One worry immediately comes to mind is what should happen when
conflicts occur.  A knee-jerk reaction only after reading the above
explanation is to error out; there could be other sensible
alternatives, but I do not think of any offhand.

Another thing that comes to mind is if that should be an option to
the end-user facing "git merge" command that can be run in a
repository _with_ a working tree.  When used in a repository with a
working tree, I am not sure what the next step the end user wants to
make after the "only in index" merge succeeds.  The index has been
updated with the merge result at that point and I am guessing that
the merge would also have created a new commit and updated the HEAD
to point at it.  The contents of the working tree for paths that
were not changed when the user initiated the merge can be safely
checked out, but what should happen to the ones that were dirty wrt
the index (I am again guessing that you do not allow a "git merge
only in index" if the index already has changes wrt the HEAD)?

I can understand if the option only worked in a bare repository,
though.

> The core fix, to merge-recursive, was actually quite easy.  The
> recursive merge logic already had the ability to ignore the working
> directory and operate entirely on the index -- it needed to do this
> when creating a virtual merge base, i.e. when o->call_depth > 0.

You need to be careful here, though.  The creation of a virtual
parent accepts (actually it wants to have) conflicted blobs as if
that is a valid merge result--"git merge but only in index" probably
does not want that behaviour.

> A brief-ish summary of the series:
>
> * Patches 1 and 2 are unrelated cleanups, which could be submitted
>   independently.  However, submitting them as separate patches would
>   result in a minor conflict, so I'm just including them together.
>
> * Patches 3 and 4 add some testcases and a fix for a separate issue
>   found while testing this series which could be split off and
>   submitted separately, but fixing this problem and enforcing the
>   starting state I expected permitted me to reduce the range of
>   testcases I needed to consider for the --index-only merges.  So I
>   thought it made sense to include in this series.

I'd prefer these as a separate series if they are truly unrelated
cleanups, so that we can quickly review and have them graduated
without waiting for the remainder.  You can still submit the main
series with a comment that says it applies on that separate clean-up
series and the right things should happen ;-)

>   In particular, I was worried about how git merge behaved when the
>   index differed from HEAD prior to the merge starting.  I discovered
>   that it wasn't just allowed for fast-forward merges (where behavior
>   is well-defined) but that it was also (accidentally I'm assuming)
>   allowed for octopus merges with weird/surprising behavior.

I briefly looked at these tests and I agree with your expectation,
i.e. we want to allow a merge as long as there is no conflicts with
the change already in the index (i.e. the changed paths have the
same contents in ORIG_HEAD and HEAD).

> Some things I am concerned about:
>
> * The option name (--index-only) may be slightly misleading since the
>   index isn't the only thing modified within the git directory, other
>   normal things under there are still modified (e.g. writing MERGE_*
>   files, sticking blobs/trees/commits in objects/*, and updating refs
>   and reflogs).  I personally prefer this name and think the confusion
>   would be minor, but I'm a bit unsure and wanted some other opinions
>   on this.

When you say "index only", I'd say people would understand that to
be saying "as opposed to using and updating both the index and the
working tree", so I do not think there is no confusion.

An established convention to spell "index only" found in "apply" and
"grep" is to say "--cached", though (cf. "git help cli").

> * I didn't add this option to the separate git-merge-recursive
>   executable and make the worktree-optional modification to the git
>   subcommands merge-recursive, merge-recursive-ours,
>   merge-recursive-theirs, and merge-subtree in git.c.  Should I, or
>   are these separate binaries and subcommands just present for
>   historical backward compatibility reasons with people expected to
>   call e.g. "git mer

Re: [RFC/PATCH 00/18] Add --index-only option to git merge

2016-04-08 Thread Michael J Gruber
Elijah Newren venit, vidit, dixit 08.04.2016 08:58:
> This patch series adds an --index-only flag to git merge, the idea
> being to allow a merge to be performed entirely in the index without
> touching (or even needing) a working tree.
> 
> The core fix, to merge-recursive, was actually quite easy.  The
> recursive merge logic already had the ability to ignore the working
> directory and operate entirely on the index -- it needed to do this
> when creating a virtual merge base, i.e. when o->call_depth > 0.  It's
> just that o->call_depth was also used for other purposes, so I just
> needed to introduce a new flag to disambiguate and switch all the
> necessary index-only-related call sites to use it.  It actually seems
> to make the code slightly easier to read too, which is a nice bonus.
> That was all done in patch 12 of this series.
> 
> Adding all the necessary testcases and switching over the other merge
> strategies turned out to be the harder part...and still has a problem,
> as I'll mention below.

I haven't looked at your series thoroughly but immediately had to think
of 'tr/remerge-diff' (on 'pu'), see
http://permalink.gmane.org/gmane.comp.version-control.git/256591

There, Thomas used index-only merge to reproduce an automatic merge as
the base for a useful "remerge-diff".

I've been rebasing (and using) that series on 'next' for a while now
without any problems; some reasons kept it from being merged on next,
see the thread.

So, it would be interesting whether you solve the same problem
differently, or face the same problems ;)

Michael
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC/PATCH 00/18] Add --index-only option to git merge

2016-04-07 Thread Elijah Newren
This patch series adds an --index-only flag to git merge, the idea
being to allow a merge to be performed entirely in the index without
touching (or even needing) a working tree.

The core fix, to merge-recursive, was actually quite easy.  The
recursive merge logic already had the ability to ignore the working
directory and operate entirely on the index -- it needed to do this
when creating a virtual merge base, i.e. when o->call_depth > 0.  It's
just that o->call_depth was also used for other purposes, so I just
needed to introduce a new flag to disambiguate and switch all the
necessary index-only-related call sites to use it.  It actually seems
to make the code slightly easier to read too, which is a nice bonus.
That was all done in patch 12 of this series.

Adding all the necessary testcases and switching over the other merge
strategies turned out to be the harder part...and still has a problem,
as I'll mention below.


A brief-ish summary of the series:

* Patches 1 and 2 are unrelated cleanups, which could be submitted
  independently.  However, submitting them as separate patches would
  result in a minor conflict, so I'm just including them together.

* Patches 3 and 4 add some testcases and a fix for a separate issue
  found while testing this series which could be split off and
  submitted separately, but fixing this problem and enforcing the
  starting state I expected permitted me to reduce the range of
  testcases I needed to consider for the --index-only merges.  So I
  thought it made sense to include in this series.

  In particular, I was worried about how git merge behaved when the
  index differed from HEAD prior to the merge starting.  I discovered
  that it wasn't just allowed for fast-forward merges (where behavior
  is well-defined) but that it was also (accidentally I'm assuming)
  allowed for octopus merges with weird/surprising behavior.

* Patches 5-10 add testcases for the --index-only option for each of
  the merge strategies/cases: recursive, fast-forward update, resolve,
  octopus, ours, subtree.

* Patch 11 adds the --index-only option to the code and adds the
  documentation, while patches 12-18 implement the --index-only option
  for each of the strategies/cases.


Some things I am concerned about:

* The option name (--index-only) may be slightly misleading since the
  index isn't the only thing modified within the git directory, other
  normal things under there are still modified (e.g. writing MERGE_*
  files, sticking blobs/trees/commits in objects/*, and updating refs
  and reflogs).  I personally prefer this name and think the confusion
  would be minor, but I'm a bit unsure and wanted some other opinions
  on this.

* I didn't add this option to the separate git-merge-recursive
  executable and make the worktree-optional modification to the git
  subcommands merge-recursive, merge-recursive-ours,
  merge-recursive-theirs, and merge-subtree in git.c.  Should I, or
  are these separate binaries and subcommands just present for
  historical backward compatibility reasons with people expected to
  call e.g. "git merge -s recursive" these days?

* The addition of --index-only to the various git-merge*.sh scripts is
  lacking in flexibility (e.g. has to be passed as the first
  argument).  These scripts seemed to have fairly rigid calling
  conventions already, suggesting there's not much value in making
  them flexible, but perhaps that was the wrong conclusion to draw.

* Expanding on the last item, git-merge-one-file.sh is of particular
  concern; it seemed to strongly assume exactly seven arguments and
  that the position of each mattered.  I didn't want to break that, so
  I added --index-only as an optional 8th argument, even though it
  seems slightly odd force an option argument to always come after
  positional ones (and it made the changes to merge_entry in
  merge-index.c slightly easier to implement).  Does that seem okay?

* For a long time I had two remaining bugs, one of which was in
  checkout_fast_forward.  I was feeling kind of sheepish about that,
  because how much simpler could it be than handling a fast-forward
  merge (with possible uncommited index entries to carry forward)?
  Getting stuck on a simple case like that would be embarrassing.
  Luckily, I figured out that bug.  So, that leaves just one case left
  that I can't seem to figure out: read_tree_trivial.  So much better,
  right?  Even it's name is sitting there, mocking me.  "Ha ha, I'm
  read_tree_*trivial* and you can't figure me out."  read_tree_trivial
  is a jerk.


Elijah Newren (18):
  Remove duplicate code
  Avoid checking working copy when creating a virtual merge base
  Document weird bug in octopus merges via testcases
  merge-octopus: Abort if index not clean
  Add testcase for --index-only merges needing the recursive strategy
  Add testcase for --index-only merges needing an ff update
  Add testcase for --index-only merges with the resolve strategy
  Add testcase for --index-o