Re: submodules and merging (Was: Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue)

2017-12-04 Thread Jacob Keller
On Mon, Dec 4, 2017 at 11:33 AM, Stefan Beller  wrote:
> On Sat, Nov 25, 2017 at 9:59 PM, Jacob Keller  wrote:
>> On Sat, Nov 25, 2017 at 2:37 PM, Elijah Newren  wrote:
>>> On Wed, Nov 15, 2017 at 9:13 AM, Jacob Keller  
>>> wrote:
 On Tue, Nov 14, 2017 at 10:13 AM, Stefan Beller  wrote:
>>>
> But this line of though might be distracting from your original point,
> which was that we have so much to keep in mind when doing tree
> operations (flags, D/F conflicts, now submodules too). I wonder how
> a sensible refactoring would look like to detangle all these aspects,
> but still keeping Git fast and not overengineered.

 I think given how complex a lot of these code paths are, that an
 attempt to refactor it a bit to detangle some of the mess would be
 well worth the time. I'd suspect it might make handling the more
 complex task of actually resolving conflicts to be easier, so the
 effort to clean up the code here should be worth it.
>>>
>>> I think changing from a 4-way merge to a 3-way merge would make things
>>> much better, as Junio outlined here:
>>>
>>> https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com/
>>>
>>> I don't know of any way to detangle the other aspects, yet.
>
> Jonathan Nieder and me tried some pair programming some time ago[1]
> plumbing the repository object through most of the low level internals, which
> would help in detangling submodule merges as then these merges could
> be done in-core, just as Junio laid out.
>
> [1] https://github.com/stefanbeller/git/tree/object-store-jrn-rebased
>

This sounds promising to me.

Thanks,
Jake

>> I agree, that is absolutely a (big) step in the right direction.
>
>
> I agree as well; A better (abstracted) merge backend would be huge for
> the future of Git.
>
> Thanks,
> Stefan


Re: submodules and merging (Was: Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue)

2017-12-04 Thread Stefan Beller
On Sat, Nov 25, 2017 at 9:59 PM, Jacob Keller  wrote:
> On Sat, Nov 25, 2017 at 2:37 PM, Elijah Newren  wrote:
>> On Wed, Nov 15, 2017 at 9:13 AM, Jacob Keller  wrote:
>>> On Tue, Nov 14, 2017 at 10:13 AM, Stefan Beller  wrote:
>>
 But this line of though might be distracting from your original point,
 which was that we have so much to keep in mind when doing tree
 operations (flags, D/F conflicts, now submodules too). I wonder how
 a sensible refactoring would look like to detangle all these aspects,
 but still keeping Git fast and not overengineered.
>>>
>>> I think given how complex a lot of these code paths are, that an
>>> attempt to refactor it a bit to detangle some of the mess would be
>>> well worth the time. I'd suspect it might make handling the more
>>> complex task of actually resolving conflicts to be easier, so the
>>> effort to clean up the code here should be worth it.
>>
>> I think changing from a 4-way merge to a 3-way merge would make things
>> much better, as Junio outlined here:
>>
>> https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com/
>>
>> I don't know of any way to detangle the other aspects, yet.

Jonathan Nieder and me tried some pair programming some time ago[1]
plumbing the repository object through most of the low level internals, which
would help in detangling submodule merges as then these merges could
be done in-core, just as Junio laid out.

[1] https://github.com/stefanbeller/git/tree/object-store-jrn-rebased

> I agree, that is absolutely a (big) step in the right direction.


I agree as well; A better (abstracted) merge backend would be huge for
the future of Git.

Thanks,
Stefan


Re: submodules and merging (Was: Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue)

2017-11-25 Thread Jacob Keller
On Sat, Nov 25, 2017 at 2:37 PM, Elijah Newren  wrote:
> On Wed, Nov 15, 2017 at 9:13 AM, Jacob Keller  wrote:
>> On Tue, Nov 14, 2017 at 10:13 AM, Stefan Beller  wrote:
>
>>> But this line of though might be distracting from your original point,
>>> which was that we have so much to keep in mind when doing tree
>>> operations (flags, D/F conflicts, now submodules too). I wonder how
>>> a sensible refactoring would look like to detangle all these aspects,
>>> but still keeping Git fast and not overengineered.
>>
>> I think given how complex a lot of these code paths are, that an
>> attempt to refactor it a bit to detangle some of the mess would be
>> well worth the time. I'd suspect it might make handling the more
>> complex task of actually resolving conflicts to be easier, so the
>> effort to clean up the code here should be worth it.
>
> I think changing from a 4-way merge to a 3-way merge would make things
> much better, as Junio outlined here:
>
> https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com/
>
> I don't know of any way to detangle the other aspects, yet.

I agree, that is absolutely a (big) step in the right direction.

Thanks,
Jake


Re: submodules and merging (Was: Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue)

2017-11-25 Thread Elijah Newren
On Wed, Nov 15, 2017 at 9:13 AM, Jacob Keller  wrote:
> On Tue, Nov 14, 2017 at 10:13 AM, Stefan Beller  wrote:

>> But this line of though might be distracting from your original point,
>> which was that we have so much to keep in mind when doing tree
>> operations (flags, D/F conflicts, now submodules too). I wonder how
>> a sensible refactoring would look like to detangle all these aspects,
>> but still keeping Git fast and not overengineered.
>
> I think given how complex a lot of these code paths are, that an
> attempt to refactor it a bit to detangle some of the mess would be
> well worth the time. I'd suspect it might make handling the more
> complex task of actually resolving conflicts to be easier, so the
> effort to clean up the code here should be worth it.

I think changing from a 4-way merge to a 3-way merge would make things
much better, as Junio outlined here:

https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com/

I don't know of any way to detangle the other aspects, yet.


Re: submodules and merging (Was: Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue)

2017-11-15 Thread Jacob Keller
On Tue, Nov 14, 2017 at 10:13 AM, Stefan Beller  wrote:
> Thanks for your reply!
>
> On Tue, Nov 14, 2017 at 9:17 AM, Elijah Newren  wrote:
>> On Mon, Nov 13, 2017 at 3:46 PM, Stefan Beller  wrote:
>>> On Mon, Nov 13, 2017 at 3:39 PM, Elijah Newren  wrote:
 On Mon, Nov 13, 2017 at 2:12 PM, Stefan Beller  wrote:
> I wanted to debug a very similar issue today just after reviewing this
> series, see
> https://public-inbox.org/git/743acc29-85bb-3773-b6a0-68d4a0b8f...@ispras.ru/

 Oh, bleh.  That's not a D/F conflict at all, it's the code assuming
 there's a D/F conflict because the entry it is processing ("sub") is a
 submodule rather than a file, and it panics when it sees "a directory
 in the way" -- a directory that just so happens to be named "sub" and
 which is in fact the desired submodule, meaning that the working
 directory is already good and needs no changes.
>>>
>>> yup, I came to find the same snippet of code to be the offender,
>>> I just haven't figured out how to fix this bug.
>>>
>>> Thanks for taking a look!
>>>
>>> As you have a lot of fresh knowledge in the merge-recursive case
>>> currently, how would we approach the fix here?
>>
>> submodules and merging looks pretty broken, to me.  Backing off from
>> the current bug and just looking at merging with submodules in
>> general, makes me a little uneasy with what I see.  I mean, just look
>> at update_file_flags, when it wants the working directory updated, the
>> code for a submodule is the following:
>>
>> if (update_wd) {
>> 
>>
>> if (S_ISGITLINK(mode)) {
>> /*
>>  * We may later decide to recursively descend into
>>  * the submodule directory and update its index
>>  * and/or work tree, but we do not do that now.
>>  */
>> update_wd = 0;
>> goto update_index;
>> }
>>
>> So, it just doesn't put anything there, so the working directory is
>> made out-of-date immediately.  Users are happy with that?  Maybe it is
>> what makes sense, but it surprised me.
>
> Submodules are traditionally not touched by git commands and we are slowly
> trying to get that changed. Some commands have a --recurse-submodules
> flag now, including checkout, reset; merge is missing this flag as the 
> semantics
> are hard to define sensibly, yet.
>

Yea, figuring out how to represent a conflict with submodules is
pretty challenging, especially in cases where one side has a submodule
and the other side does not.

> Yes, introducing submodules into the mix is a big mess, because
> in the tree it is recorded as if it is a file (only the top level
> entry at path/)
> but on the file system a submodule is represented as a directory
> with contents, so the conflict detection is harder, too.
>
> I had the idea of introducing a command that can "internalize"
> a submodule. This would take the tree recorded in the submodule
> commit and put it where the submodule was mounted,
> The opposite is "externalizing" a submodule, which would turn
> a tree into a submodule. (One of the many question is if it will take
> the whole history or start with a new fresh initial commit).
>
> But this line of though might be distracting from your original point,
> which was that we have so much to keep in mind when doing tree
> operations (flags, D/F conflicts, now submodules too). I wonder how
> a sensible refactoring would look like to detangle all these aspects,
> but still keeping Git fast and not overengineered.

I think given how complex a lot of these code paths are, that an
attempt to refactor it a bit to detangle some of the mess would be
well worth the time. I'd suspect it might make handling the more
complex task of actually resolving conflicts to be easier, so the
effort to clean up the code here should be worth it.

Thanks,
Jake

>
> Thanks,
> Stefan


Re: submodules and merging (Was: Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue)

2017-11-14 Thread Stefan Beller
Thanks for your reply!

On Tue, Nov 14, 2017 at 9:17 AM, Elijah Newren  wrote:
> On Mon, Nov 13, 2017 at 3:46 PM, Stefan Beller  wrote:
>> On Mon, Nov 13, 2017 at 3:39 PM, Elijah Newren  wrote:
>>> On Mon, Nov 13, 2017 at 2:12 PM, Stefan Beller  wrote:
 I wanted to debug a very similar issue today just after reviewing this
 series, see
 https://public-inbox.org/git/743acc29-85bb-3773-b6a0-68d4a0b8f...@ispras.ru/
>>>
>>> Oh, bleh.  That's not a D/F conflict at all, it's the code assuming
>>> there's a D/F conflict because the entry it is processing ("sub") is a
>>> submodule rather than a file, and it panics when it sees "a directory
>>> in the way" -- a directory that just so happens to be named "sub" and
>>> which is in fact the desired submodule, meaning that the working
>>> directory is already good and needs no changes.
>>
>> yup, I came to find the same snippet of code to be the offender,
>> I just haven't figured out how to fix this bug.
>>
>> Thanks for taking a look!
>>
>> As you have a lot of fresh knowledge in the merge-recursive case
>> currently, how would we approach the fix here?
>
> submodules and merging looks pretty broken, to me.  Backing off from
> the current bug and just looking at merging with submodules in
> general, makes me a little uneasy with what I see.  I mean, just look
> at update_file_flags, when it wants the working directory updated, the
> code for a submodule is the following:
>
> if (update_wd) {
> 
>
> if (S_ISGITLINK(mode)) {
> /*
>  * We may later decide to recursively descend into
>  * the submodule directory and update its index
>  * and/or work tree, but we do not do that now.
>  */
> update_wd = 0;
> goto update_index;
> }
>
> So, it just doesn't put anything there, so the working directory is
> made out-of-date immediately.  Users are happy with that?  Maybe it is
> what makes sense, but it surprised me.

Submodules are traditionally not touched by git commands and we are slowly
trying to get that changed. Some commands have a --recurse-submodules
flag now, including checkout, reset; merge is missing this flag as the semantics
are hard to define sensibly, yet.

> From there, we can start stacking on the weird.  For example let's say
> we have this setup:
> O (merge base): path/contents containing "whatever\n"
> A (ours): path is a submodule
> B (theirs): path/contents containing "whatever\nworks\n"
>
> Merging A into B results in
>
> CONFLICT (modify/delete): path/contents deleted in A and modified in
> HEAD. Version HEAD of path/contents left in tree.
> CONFLICT (directory/file): There is a directory with name path in
> HEAD. Adding path as path~A
> Automatic merge failed; fix conflicts and then commit the result.
>
> but path~A is never created, because of the update_file_flags code
> snippet above.  If the user looks at the path directory, it doesn't
> have any submodule info either.  It has just "disappeared", despite
> the claim made in the conflict notice.  Maybe okay, but slightly
> confusing.
>
> Now let's say the user just wants to back out of this merge.  What if
> they run 'git merge --abort'?  Well, they're greeted with:
>
> error: 'path/contents' appears as both a file and as a directory
> error: path/contents: cannot drop to stage #0
> error: Untracked working tree file 'path/contents' would be
> overwritten by merge.
> fatal: Could not reset index file to revision 'HEAD'.
>
> "Would be overwritten by merge"?  We're unmerging, and I agree it
> shouldn't be overwritten, but the fact that it thinks it needs to try
> is worrisome.  And I don't like the fact that it just outright failed.
> Okay, what about 'git reset --hard' at this point:
>
> error: 'path/contents' appears as both a file and as a directory
> error: path/contents: cannot drop to stage #0
> warning: unable to rmdir path: Directory not empty
> HEAD is now at 3e098a0 B
>
> Cannot drop to stage #0?  Oh, boy.  Things must be really messed up.
> Except they're not; it did drop path/contents to stage #0 despite the
> error, and git status is clean again.
>
> Now, let's switch tracks and look at things from the other side.  What
> if the user had been on A and tried to merge B into A?
>
> $ git checkout A
> Switched to branch 'A'
> $ git merge B
> CONFLICT (modify/delete): path/contents deleted in HEAD and modified
> in B. Version B of path/contents left in tree.
> Adding path
> Automatic merge failed; fix conflicts and then commit the result.
>
> path/contents is left in the tree?  But there's a submodule in the
> way!  Did it really put it there...?  Yep:
>
> $ ls -al path/
> total 16
> drwxrwxr-x 3 newren newren 4096 Nov 14 08:40 .
> drwxrwxr-x 4 newren newren 4096 Nov 14 08:38 ..
> -rw-rw-r-- 1 newren newren   15 Nov 14 08:40 contents
> -rw-rw-r-- 1 newren newren0 Nov 14 08:39 foo
> drwxrwxr-x 8