Re: Integrating submodules with no side effects

2016-10-25 Thread Robert Dailey
On Wed, Oct 19, 2016 at 2:51 PM, Robert Dailey  wrote:
> On Wed, Oct 19, 2016 at 2:45 PM, Stefan Beller  wrote:
>> On Wed, Oct 19, 2016 at 12:19 PM, Robert Dailey
>>  wrote:
>>> On Wed, Oct 19, 2016 at 11:23 AM, Stefan Beller  wrote:
>>>> You could try this patch series:
>>>> https://github.com/jlehmann/git-submod-enhancements/tree/git-checkout-recurse-submodules
>>>> (rebased to a newer version; no functional changes:)
>>>> https://github.com/stefanbeller/git/tree/submodule-co
>>>> (I'll rebase that later to origin/master)
>>>>
>>>>>
>>>>> Do you have any info on how I can prevent that error? Ideally I want
>>>>> the integration to go smoothly and transparently, not just for the
>>>>> person doing the actual transition (me) but for everyone else that
>>>>> gets those changes from upstream. They should not even notice that it
>>>>> happened (i.e. no failed commands, awkward behavior, or manual steps).
>>>>
>>>> It depends on how long you want to postpone the transition, but I plan to
>>>> upstream the series referenced above in the near future,
>>>> which would enable your situation to Just Work (tm). ;)
>>>
>>> At first glance, what you've linked to essentially looks like
>>> automated `git submodule update` for every `git checkout`. Am I
>>> misunderstanding?
>>
>> Essentially yes, except with stricter rules than the actual submodule update
>> IIRC.
>>
>>>
>>> If I'm correct, this is not the same as what I'm talking about. The
>>> problem appears to be more internal: When a submodule is removed, the
>>> physical files that were there are not removed by Git.
>>
>> That is also done by that series: submodules ought to be treated as files.
>> If you check out a new version where a file is deleted, the checkout command
>> will actually remove the file for you (and e.g. resolve any directory/file
>> conflicts that may happen in the transition).
>>
>>> It leaves them
>>> there in the working copy as untracked files.
>>
>> That is the current behavior as checkout tries hard to ignore submodules.
>>
>>> The next step Git takes
>>> (again, just from outside observation) is to add those very same files
>>> to the working copy, since they were added to a commit. However, at
>>> this point Git fails because it's trying to create (write) files to
>>> the working copy when an exact file of that name already exists there.
>>> Git will not overwrite untracked files, so at this point it fails.
>>>
>>> What needs to happen, somehow, is Git sees that the files were
>>> actually part of a submodule (which was removed) and remove the
>>> physical files as well, assuming that they were not modified in the
>>> submodule itself. This will ensure that the next step (creating the
>>> files) will succeed since the files no longer block it.
>>
>> Yep.
>
> It's great we're finally on the same page ;-)
>
> However, I don't see how this problem can be solved with your script,
> or solved in general outside of that. Does this mean that Git needs to
> change to treat submodules as it does normal files, per your previous
> assertion, which means submodules should *not* be left behind in the
> working copy as untracked files?

I'll assume (due to the lack of responses) that the only viable
solution here is to integrate the submodule using a different
directory name than the one used by the submodule itself. It's
unfortunate but I'll do it if I have no other option.
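
For reference, a minimal sketch of that fallback: integrating the
submodule under a different directory name. It reuses the porcelain
commands from the script quoted earlier in the thread. The helper name
integrate_as and the names lib, super, foo and foo1 are assumptions made
up for the demo, and the protocol.file.allow setting is only there
because recent Git versions refuse to clone a submodule from a local
path by default.

```shell
#!/usr/bin/env bash
# Sketch of the "different directory name" workaround. integrate_as is a
# hypothetical helper; "lib", "super", "foo" and "foo1" are demo names.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# integrate_as OLD NEW: replace submodule OLD with plain files under NEW.
integrate_as() {
    old=$1 new=$2
    cp -R "$old" "$new"                        # copy the checked-out files
    rm -rf "$new/.git"                         # drop the copied .git entry
    git submodule --quiet deinit -f -- "$old"  # forget it in .git/config
    git rm -q -f -- "$old"                     # remove gitlink and .gitmodules entry
    git add -- "$new"                          # track the copied files directly
}

# Throwaway repositories: "lib" plays the submodule's upstream.
work=$(mktemp -d)
git init -q "$work/lib"
(
    cd "$work/lib"
    echo hello > somefile.txt
    git add somefile.txt
    git commit -q -m 'Initial commit'
)

git init -q "$work/super"
cd "$work/super"
# Allow the local-path clone that "submodule add" performs.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" foo
git commit -q -m 'Add foo submodule'

integrate_as foo foo1
git commit -q -m 'Integrated foo submodule as foo1'
git ls-files
```

Because the tracked files now live under foo1, a collaborator's
populated foo directory no longer shares paths with anything the new
commit has to create.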


Re: Integrating submodules with no side effects

2016-10-19 Thread Robert Dailey
On Wed, Oct 19, 2016 at 2:45 PM, Stefan Beller  wrote:
> On Wed, Oct 19, 2016 at 12:19 PM, Robert Dailey
>  wrote:
>> On Wed, Oct 19, 2016 at 11:23 AM, Stefan Beller  wrote:
>>> You could try this patch series:
>>> https://github.com/jlehmann/git-submod-enhancements/tree/git-checkout-recurse-submodules
>>> (rebased to a newer version; no functional changes:)
>>> https://github.com/stefanbeller/git/tree/submodule-co
>>> (I'll rebase that later to origin/master)
>>>

>>>> Do you have any info on how I can prevent that error? Ideally I want
>>>> the integration to go smoothly and transparently, not just for the
>>>> person doing the actual transition (me) but for everyone else that
>>>> gets those changes from upstream. They should not even notice that it
>>>> happened (i.e. no failed commands, awkward behavior, or manual steps).
>>>
>>> It depends on how long you want to postpone the transition, but I plan to
>>> upstream the series referenced above in the near future,
>>> which would enable your situation to Just Work (tm). ;)
>>
>> At first glance, what you've linked to essentially looks like
>> automated `git submodule update` for every `git checkout`. Am I
>> misunderstanding?
>
> Essentially yes, except with stricter rules than the actual submodule update
> IIRC.
>
>>
>> If I'm correct, this is not the same as what I'm talking about. The
>> problem appears to be more internal: When a submodule is removed, the
>> physical files that were there are not removed by Git.
>
> That is also done by that series: submodules ought to be treated as files.
> If you check out a new version where a file is deleted, the checkout command
> will actually remove the file for you (and e.g. resolve any directory/file
> conflicts that may happen in the transition).
>
>> It leaves them
>> there in the working copy as untracked files.
>
> That is the current behavior as checkout tries hard to ignore submodules.
>
>> The next step Git takes
>> (again, just from outside observation) is to add those very same files
>> to the working copy, since they were added to a commit. However, at
>> this point Git fails because it's trying to create (write) files to
>> the working copy when an exact file of that name already exists there.
>> Git will not overwrite untracked files, so at this point it fails.
>>
>> What needs to happen, somehow, is Git sees that the files were
>> actually part of a submodule (which was removed) and remove the
>> physical files as well, assuming that they were not modified in the
>> submodule itself. This will ensure that the next step (creating the
>> files) will succeed since the files no longer block it.
>
> Yep.

It's great we're finally on the same page ;-)

However, I don't see how this problem can be solved with your script,
or solved in general outside of that. Does this mean that Git needs to
change to treat submodules as it does normal files, per your previous
assertion, which means submodules should *not* be left behind in the
working copy as untracked files?


Re: Integrating submodules with no side effects

2016-10-19 Thread Stefan Beller
On Wed, Oct 19, 2016 at 12:19 PM, Robert Dailey
 wrote:
> On Wed, Oct 19, 2016 at 11:23 AM, Stefan Beller  wrote:
>> You could try this patch series:
>> https://github.com/jlehmann/git-submod-enhancements/tree/git-checkout-recurse-submodules
>> (rebased to a newer version; no functional changes:)
>> https://github.com/stefanbeller/git/tree/submodule-co
>> (I'll rebase that later to origin/master)
>>
>>>
>>> Do you have any info on how I can prevent that error? Ideally I want
>>> the integration to go smoothly and transparently, not just for the
>>> person doing the actual transition (me) but for everyone else that
>>> gets those changes from upstream. They should not even notice that it
>>> happened (i.e. no failed commands, awkward behavior, or manual steps).
>>
>> It depends on how long you want to postpone the transition, but I plan to
>> upstream the series referenced above in the near future,
>> which would enable your situation to Just Work (tm). ;)
>
> At first glance, what you've linked to essentially looks like
> automated `git submodule update` for every `git checkout`. Am I
> misunderstanding?

Essentially yes, except with stricter rules than the actual submodule update
IIRC.

>
> If I'm correct, this is not the same as what I'm talking about. The
> problem appears to be more internal: When a submodule is removed, the
> physical files that were there are not removed by Git.

That is also done by that series: submodules ought to be treated as files.
If you check out a new version where a file is deleted, the checkout command
will actually remove the file for you (and e.g. resolve any directory/file
conflicts that may happen in the transition).

> It leaves them
> there in the working copy as untracked files.

That is the current behavior as checkout tries hard to ignore submodules.

> The next step Git takes
> (again, just from outside observation) is to add those very same files
> to the working copy, since they were added to a commit. However, at
> this point Git fails because it's trying to create (write) files to
> the working copy when an exact file of that name already exists there.
> Git will not overwrite untracked files, so at this point it fails.
>
> What needs to happen, somehow, is Git sees that the files were
> actually part of a submodule (which was removed) and remove the
> physical files as well, assuming that they were not modified in the
> submodule itself. This will ensure that the next step (creating the
> files) will succeed since the files no longer block it.

Yep.


Re: Integrating submodules with no side effects

2016-10-19 Thread Robert Dailey
On Wed, Oct 19, 2016 at 11:23 AM, Stefan Beller  wrote:
> You could try this patch series:
> https://github.com/jlehmann/git-submod-enhancements/tree/git-checkout-recurse-submodules
> (rebased to a newer version; no functional changes:)
> https://github.com/stefanbeller/git/tree/submodule-co
> (I'll rebase that later to origin/master)
>
>>
>> Do you have any info on how I can prevent that error? Ideally I want
>> the integration to go smoothly and transparently, not just for the
>> person doing the actual transition (me) but for everyone else that
>> gets those changes from upstream. They should not even notice that it
>> happened (i.e. no failed commands, awkward behavior, or manual steps).
>
> It depends on how long you want to postpone the transition, but I plan to
> upstream the series referenced above in the near future,
> which would enable your situation to Just Work (tm). ;)

At first glance, what you've linked to essentially looks like
automated `git submodule update` for every `git checkout`. Am I
misunderstanding?

If I'm correct, this is not the same as what I'm talking about. The
problem appears to be more internal: When a submodule is removed, the
physical files that were there are not removed by Git. It leaves them
there in the working copy as untracked files. The next step Git takes
(again, just from outside observation) is to add those very same files
to the working copy, since they were added to a commit. However, at
this point Git fails because it's trying to create (write) files to
the working copy when an exact file of that name already exists there.
Git will not overwrite untracked files, so at this point it fails.

What needs to happen, somehow, is Git sees that the files were
actually part of a submodule (which was removed) and remove the
physical files as well, assuming that they were not modified in the
submodule itself. This will ensure that the next step (creating the
files) will succeed since the files no longer block it.
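
The rule at the end of that explanation (Git will not overwrite
untracked files) can be reproduced without any submodule. The sketch
below is a simplified stand-in rather than the exact submodule case: an
untracked foo/somefile.txt plays the role of the files a populated
submodule leaves behind. The tag name "before" and the branch name
"integrated" are made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch: an untracked file blocks a checkout that would create the same
# path. Tag and branch names are demo assumptions.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q .

# Commit 1: foo/somefile.txt is not tracked yet.
git commit -q --allow-empty -m 'Before integration'
git tag before

# Commit 2: the file is tracked (stands in for the integrated files).
git checkout -q -b integrated
mkdir foo
echo tracked > foo/somefile.txt
git add foo/somefile.txt
git commit -q -m 'Integrated foo'

# Go back, then leave an untracked file at the same path.
git checkout -q before
mkdir -p foo
echo leftover > foo/somefile.txt

# Try to move forward again; Git refuses rather than clobber the file.
if err=$(git checkout integrated 2>&1); then
    result=switched
else
    result=refused
fi
echo "checkout $result"
echo "$err"
```

The checkout is refused and the untracked file keeps its contents,
which is the same refusal that produces the error quoted at the start
of the thread.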


Re: Integrating submodules with no side effects

2016-10-19 Thread Stefan Beller
On Wed, Oct 19, 2016 at 6:27 AM, Robert Dailey  wrote:
> On Tue, Oct 18, 2016 at 4:17 PM, Stefan Beller  wrote:
>> On Tue, Oct 18, 2016 at 12:35 PM, Robert Dailey
>>  wrote:
>>> Hello git experts,
>>>
>>> I have in the past attempted to integrate submodules into my primary
>>> repository using the same directory name. However, this has always
>>> caused headaches when switching between branches from before and after
>>> the integration. It's a bit hard to explain. Basically, if I have a
>>> submodule "foo", and I delete that
>>> submodule and physically add its files under the same directory "foo",
>>> when I do a pull to get this change from another clone, it fails
>>> saying:
>>>
>>> error: The following untracked working tree files would be overwritten
>>> by checkout:
>>> foo/somefile.txt
>>> Please move or remove them before you switch branches.
>>> Aborting
>>> could not detach HEAD
>>>
>>>
>>> Obviously, git can't delete the submodule because the files have also
>>> been added directly. I don't think it is built to handle this
>>> scenario. Here is the series of commands I ran to "integrate" the
>>> submodule (replace the submodule with a directory containing the exact
>>> contents of the submodule itself):
>>>
>>> #!/usr/bin/env bash
>>> mv "$1" "${1}_"
>>> git submodule deinit "$1"
>>
>> This removes the submodule entries from .git/config
>> (and it would remove the contents of that submodule, but they are moved)
>>
>>> git rm "$1"
>>
>> Removing the git link here.
>>
>> So we still have the entries in the .gitmodules file there.
>> Maybe add:
>>
>> name=$(git submodule--helper name "$1")
>> git config -f .gitmodules --remove-section "submodule.$name"
>> git add .gitmodules
>>
>> ? (Could be optional)
>
> Actually I verified that it seems `git rm` is specialized for
> submodules somewhere, because when I run that command on a submodule
> the relevant entries in the .gitmodules file are removed. I did not
> have to do this as a separate step.
>
>>> mv "${1}_" "$1"
>>> git add "$1/**"
>>
>> Moving back into place and adding all files in there.
>>
>>>
>>> The above script is named git-integrate-submodule, I run it like so:
>>>
>>> $ git integrate-submodule foo
>>>
>>> Then I do:
>>>
>>> $ git commit -m 'Integrated foo submodule'
>>>
>>> Is there any way to make this work nicely?
>>
>> I think you can just remove the gitlink from the index and not from the
>> working tree ("git rm --cached $1")
>
> What is the goal of doing it this way? What does this simplify?

You don't have to mv it back and forth with an underscore I would imagine?

>
>>> The only solution I've
>>> found is to obviously rename the directory before adding the physical
>>> files, for example name it foo1. Because they're different, they never
>>> "clash".
>>
>> Also look at the difference between plumbing and porcelain commands[1],
>> as plumbing is more stable than the porcelain, so it will be easier to
>> maintain this script.
>
> Which plumbing commands did you have in mind?

None specifically. I write scripts using porcelain all the time for
personal use.
But if you were planning to publish this seriously then I'd recommend looking at
plumbing commands.

>
>> I think this would be an actually reasonable feature, which Git itself
>> could support via "git submodule [de]integrate", but then we'd also want
>> to see the reverse, i.e. take a sub directory and make it a submodule.
>
> Integrating this as a feature might be fine, but I think the question
> of retaining history makes things much harder.
> Fortunately for me that is not a requirement in this case, so I'm able
> to do things with much less effort.

That reminds me of subtree merging, which could be used for this case.
(see 'git subtree')

>
> However, the primary purpose of my post was to find out how to
> integrate the submodule without the error on next pull by other
> collaborators of my repository. It's a real pain to recover your
> working copy when moving between commits on either side of the
> submodule integration. I did quote the exact error message I got in
> my original post.

You could try this patch series:
https://github.com/jlehmann/git-submod-enhancements/tree/git-checkout-recurse-submodules
(rebased to a newer version; no functional changes:)
https://github.com/stefanbeller/git/tree/submodule-co
(I'll rebase that later to origin/master)

>
> Do you have any info on how I can prevent that error? Ideally I want
> the integration to go smoothly and transparently, not just for the
> person doing the actual transition (me) but for everyone else that
> gets those changes from upstream. They should not even notice that it
> happened (i.e. no failed commands, awkward behavior, or manual steps).

It depends on how long you want to postpone the transition, but I plan to
upstream the series referenced above in the near future,
which would enable your situation to Just Work (tm). ;)

Re: Integrating submodules with no side effects

2016-10-19 Thread Robert Dailey
On Tue, Oct 18, 2016 at 4:17 PM, Stefan Beller  wrote:
> On Tue, Oct 18, 2016 at 12:35 PM, Robert Dailey
>  wrote:
>> Hello git experts,
>>
>> I have in the past attempted to integrate submodules into my primary
>> repository using the same directory name. However, this has always
>> caused headaches when switching between branches from before and after
>> the integration. It's a bit hard to explain. Basically, if I have a
>> submodule "foo", and I delete that
>> submodule and physically add its files under the same directory "foo",
>> when I do a pull to get this change from another clone, it fails
>> saying:
>>
>> error: The following untracked working tree files would be overwritten
>> by checkout:
>> foo/somefile.txt
>> Please move or remove them before you switch branches.
>> Aborting
>> could not detach HEAD
>>
>>
>> Obviously, git can't delete the submodule because the files have also
>> been added directly. I don't think it is built to handle this
>> scenario. Here is the series of commands I ran to "integrate" the
>> submodule (replace the submodule with a directory containing the exact
>> contents of the submodule itself):
>>
>> #!/usr/bin/env bash
>> mv "$1" "${1}_"
>> git submodule deinit "$1"
>
> This removes the submodule entries from .git/config
> (and it would remove the contents of that submodule, but they are moved)
>
>> git rm "$1"
>
> Removing the git link here.
>
> So we still have the entries in the .gitmodules file there.
> Maybe add:
>
> name=$(git submodule--helper name "$1")
> git config -f .gitmodules --remove-section "submodule.$name"
> git add .gitmodules
>
> ? (Could be optional)

Actually I verified that it seems `git rm` is specialized for
submodules somewhere, because when I run that command on a submodule
the relevant entries in the .gitmodules file are removed. I did not
have to do this as a separate step.

>> mv "${1}_" "$1"
>> git add "$1/**"
>
> Moving back into place and adding all files in there.
>
>>
>> The above script is named git-integrate-submodule, I run it like so:
>>
>> $ git integrate-submodule foo
>>
>> Then I do:
>>
>> $ git commit -m 'Integrated foo submodule'
>>
>> Is there any way to make this work nicely?
>
> I think you can just remove the gitlink from the index and not from the
> working tree ("git rm --cached $1")

What is the goal of doing it this way? What does this simplify?

>> The only solution I've
>> found is to obviously rename the directory before adding the physical
>> files, for example name it foo1. Because they're different, they never
>> "clash".
>
> Also look at the difference between plumbing and porcelain commands[1],
> as plumbing is more stable than the porcelain, so it will be easier to
> maintain this script.

Which plumbing commands did you have in mind?

> I think this would be an actually reasonable feature, which Git itself
> could support via "git submodule [de]integrate", but then we'd also want
> to see the reverse, i.e. take a sub directory and make it a submodule.

Integrating this as a feature might be fine, but I think the question
of retaining history makes things much harder.
Fortunately for me that is not a requirement in this case, so I'm able
to do things with much less effort.

However, the primary purpose of my post was to find out how to
integrate the submodule without the error on next pull by other
collaborators of my repository. It's a real pain to recover your
working copy when moving between commits on either side of the
submodule integration. I did quote the exact error message I got in
my original post.

Do you have any info on how I can prevent that error? Ideally I want
the integration to go smoothly and transparently, not just for the
person doing the actual transition (me) but for everyone else that
gets those changes from upstream. They should not even notice that it
happened (i.e. no failed commands, awkward behavior, or manual steps).


Re: Integrating submodules with no side effects

2016-10-18 Thread Stefan Beller
On Tue, Oct 18, 2016 at 12:35 PM, Robert Dailey
 wrote:
> Hello git experts,
>
> I have in the past attempted to integrate submodules into my primary
> repository using the same directory name. However, this has always
> caused headaches when switching between branches from before and after
> the integration. It's a bit hard to explain. Basically, if I have a
> submodule "foo", and I delete that
> submodule and physically add its files under the same directory "foo",
> when I do a pull to get this change from another clone, it fails
> saying:
>
> error: The following untracked working tree files would be overwritten
> by checkout:
> foo/somefile.txt
> Please move or remove them before you switch branches.
> Aborting
> could not detach HEAD
>
>
> Obviously, git can't delete the submodule because the files have also
> been added directly. I don't think it is built to handle this
> scenario. Here is the series of commands I ran to "integrate" the
> submodule (replace the submodule with a directory containing the exact
> contents of the submodule itself):
>
> #!/usr/bin/env bash
> mv "$1" "${1}_"
> git submodule deinit "$1"

This removes the submodule entries from .git/config
(and it would remove the contents of that submodule, but they are moved)

> git rm "$1"

Removing the git link here.

So we still have the entries in the .gitmodules file there.
Maybe add:

name=$(git submodule--helper name "$1")
git config -f .gitmodules --remove-section "submodule.$name"
git add .gitmodules

? (Could be optional)

> mv "${1}_" "$1"
> git add "$1/**"

Moving back into place and adding all files in there.

>
> The above script is named git-integrate-submodule, I run it like so:
>
> $ git integrate-submodule foo
>
> Then I do:
>
> $ git commit -m 'Integrated foo submodule'
>
> Is there any way to make this work nicely?

I think you can just remove the gitlink from the index and not from the working
tree ("git rm --cached $1")

> The only solution I've
> found is to obviously rename the directory before adding the physical
> files, for example name it foo1. Because they're different, they never
> "clash".

Also look at the difference between plumbing and porcelain commands[1],
as plumbing is more stable than the porcelain, so it will be easier to maintain
this script.

I think this would be an actually reasonable feature, which Git itself
could support via "git submodule [de]integrate", but then we'd also want
to see the reverse, i.e. take a sub directory and make it a submodule.

[1] e.g. https://www.kernel.org/pub/software/scm/git/docs/

Thanks,
Stefan
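
As one illustration of the plumbing suggestion above, here is a minimal
sketch that uses "git ls-files --stage" to check whether a path is
recorded in the index as a gitlink (mode 160000), which is how a
submodule is stored. The helper name is_gitlink is hypothetical, not a
Git command, and the demo fakes the gitlink with "git update-index"
instead of cloning a real submodule.

```shell
#!/usr/bin/env bash
# Sketch: detect a gitlink (submodule entry) in the index with plumbing.
# is_gitlink is a hypothetical helper name, not part of Git.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# is_gitlink PATH: succeed if PATH is staged with mode 160000.
is_gitlink() {
    case "$(git ls-files --stage -- "$1")" in
        160000\ *) return 0 ;;
        *)         return 1 ;;
    esac
}

# Demo in a throwaway repository.
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo hello > plain.txt
git add plain.txt
git commit -q -m 'Initial commit'

# Stage a gitlink at "foo" directly in the index. The repository's own
# HEAD commit id is used only as a placeholder target.
git update-index --add --cacheinfo 160000 "$(git rev-parse HEAD)" foo

for path in foo plain.txt; do
    if is_gitlink "$path"; then
        echo "$path: gitlink"
    else
        echo "$path: not a gitlink"
    fi
done
```

A script such as git-integrate-submodule could run a check like this
first and refuse to continue when its argument is not a submodule.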