Re: [PATCH 0/2] Let git submodule add fail when .git/modules/name already exists

2012-09-30 Thread Jens Lehmann
On 30.09.2012 06:47, Junio C Hamano wrote:
> Jens Lehmann jens.lehm...@web.de writes:
>
>> The only long term solution I can think of is to use some kind of UUID for
>> the name, so that the names of newly added submodules won't have a chance
>> to clash anymore. For the short term aborting git submodule add when a
>> submodule of that name already exists in .git/modules of the superproject
>> together with the ability to provide a custom name might at least solve
>> the local clashes.
>
> That assumes that the addition of the submodule for the second time
> is to add a completely different submodule at the same location and
> is done on purpose, but is that a sensible assumption?
>
> If a superproject that is about an embedded appliance used to have a
> submodule "A" bound at its path "kernel", but for some reason stopped
> shipping with "kernel" and then later reintroduced the directory
> "kernel" bound to some submodule "B", my gut feeling is that it is
> just as likely (if not more likely) that "A" and "B" are indeed the same
> submodule (i.e. they share the same history) as that they are totally
> unrelated.
>
> Could it be that it is a user error, combined with the immaturity of
> the "git submodule" tool, which does not yet support "it used to be here,
> but it disappears for a while and then it reappears" in the history
> of the superproject very well, that caused the user to manually add
> a new submodule which in fact is the same submodule at the same
> path?
>
> I think failing with a better error message is a good idea. It
> should suggest either resurrecting the submodule that is stashed
> away in $GIT_DIR/modules/$name if it indeed is the same, or
> giving it a different name (perhaps "kernel" used to point at
> the Linux kernel history, and the user is now replacing it with a
> totally different implementation that really comes from a different
> origin and does not share any history, perhaps BSD).  In such a case,
> the user may want to pick "bsd-kernel" or something as its name, to
> differentiate it.

Good point! I will add a more detailed error message (including
the url of the default remote which is configured for the already
present submodule repo) and teach --force to skip the test and
resurrect that submodule repo.
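
A minimal sketch of such a check in shell, the language of git-submodule.sh. This is not the code from the patch series: the function name check_submodule_name, its arguments, and the message wording are assumptions made for illustration. It refuses a name whose repository already exists under modules/ in the superproject's git directory, reports the URL of that repository's default remote when one is configured, and lets the caller pass the word "force" to reuse the repository.

```shell
#!/bin/sh
# Illustrative only: not the code from the patch series.
# $1: git directory of the superproject
# $2: proposed submodule name
# $3: optional, the word "force" to reuse an existing repository
check_submodule_name () {
	gitdir=$1
	name=$2
	force=${3-}
	if test -d "$gitdir/modules/$name" && test "$force" != force
	then
		# Look up the default remote of the stashed repository, if any.
		url=$(git --git-dir="$gitdir/modules/$name" config remote.origin.url 2>/dev/null) ||
			url="(no origin configured)"
		echo "A git directory for '$name' already exists (origin: $url)." >&2
		echo "Pass force to reuse it, or choose a different name." >&2
		return 1
	fi
	return 0
}
```

A caller could run something like `check_submodule_name .git kernel || exit 1` before cloning anything, so the clash is reported before the work tree is touched.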

>> Using some kind of UUID can easily be added in a subsequent patch,...
>
> I would suggest thinking really long and hard before saying UUID.

Absolutely.

> It is an easy cop-out to ensure uniqueness, but risks allowing two
> people (or one person at two different times) to give two unrelated
> names to a single thing that actually is the same.

I'm not too worried about that (even though it would be good for
the disk footprint). And I couldn't come up with a better way to
solve the problem we currently have when the same name is used
for two different submodule repos.

> A better alternative might be to use the commit object name at the
> root of the history of the submodule, which would catch the simplest
> and most common case of the mistake, I would think.

This won't work well e.g. when one uses a fork of another repo,
that will contain different commits while still having the same
root commit. I was also thinking about hashing the URL, but that
will break when the user reconfigures the URL to somewhere else.
After playing with some ideas I couldn't find a way to let the
submodule's repo provide sufficient uniqueness.
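
The fork objection can be checked with stock git commands. In the sketch below the function name root_commits is an assumption; `git rev-list --max-parents=0 HEAD` lists the parentless commits reachable from HEAD. A fork cloned from a repository keeps reporting the same root as that repository even after the two histories diverge, so the root commit alone cannot tell them apart.

```shell
#!/bin/sh
# Illustrative helper, not part of git-submodule.sh.
# $1: path to a repository; prints the root commit(s) reachable from HEAD.
root_commits () {
	git -C "$1" rev-list --max-parents=0 HEAD
}
```

Comparing `root_commits upstream` with `root_commits fork` (both paths hypothetical) should give identical output, while `git rev-parse HEAD` differs between the two once the fork has commits of its own.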

I'd say for now we go with the detection of name clashes and let
the user choose if he wants to resurrect that submodule repo or
if he wants to choose another name. But if we notice further down
the road that collisions are a problem in real life, we can think
again if UUIDs - or something else - might be a solution.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] Let git submodule add fail when .git/modules/name already exists

2012-09-29 Thread Jens Lehmann
On 26.09.2012 22:56, Jens Lehmann wrote:
> On 26.09.2012 06:18, Jonathan Johnson wrote:
>> To reproduce
>>
>> 1) Add a git submodule in a specific location (we'll say it's at
>>    `./submodule/location`)
>> 2) Go through the normal steps of removing a submodule, as listed here:
>>    https://git.wiki.kernel.org/index.php/GitSubmoduleTutorial
>> 3) Now the submodule is completely removed and there is no reference to it
>>    in .gitmodules or .git/config
>> 4) Re-add a different repository at the same location
>>    (`./submodule/location`)
>>
>> Expected - The new submodule repository will be set up at
>> ./submodule/location and have the new repository as its origin
>>
>> What Actually Happens - The new submodule uses the existing `$gitdir` (old
>> repository) as the actual backing repository to the submodule, but the new
>> repository is reflected in .gitmodules and .git/config.
>>
>> So to recap, the result is that `git remote show origin` in the submodule
>> shows a different origin than is in .gitmodules and .git/config
>>
>> One simple step to remedy this would be to add the deletion of the backing
>> repository from the .git/modules directory, but again, I think an actual
>> command to take care of all of these steps is in order anyways.  Not sure
>> you want to encourage people poking around in the .git directory.
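
The mismatch described in the recap can be detected by comparing the two URLs directly. A rough sketch, assuming a hypothetical helper named submodule_url_matches and a submodule whose remote is called origin: it reads the recorded URL from .gitmodules and the actual one from the submodule's own configuration.

```shell
#!/bin/sh
# Illustrative helper, not part of git.
# $1: top of the superproject work tree
# $2: submodule name as recorded in .gitmodules
# $3: submodule path relative to $1
# Returns 0 if both URLs agree, 1 if they differ, 2 if either is missing.
submodule_url_matches () {
	top=$1
	name=$2
	path=$3
	recorded=$(git config -f "$top/.gitmodules" "submodule.$name.url") || return 2
	actual=$(git -C "$top/$path" config remote.origin.url) || return 2
	test "$recorded" = "$actual"
}
```

For the report above, `submodule_url_matches . submodule/location submodule/location` would be expected to return 1, since the backing repository still points at the old origin.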
 
> Unfortunately just throwing away the old repository under .git/modules,
> whether manually or by a git command, is no real solution here: it would
> make it impossible to go back to a commit which records the old submodule
> and check that out again.
>
> The reason for this issue is that the submodule path is used as its name
> by git submodule add. While we could check this type of conflict locally,
> we can't really avoid it due to the distributed nature of git (somebody
> else could add a different repo under the same path - and thus the same
> name - in another clone of the repo).
>
> The only long term solution I can think of is to use some kind of UUID for
> the name, so that the names of newly added submodules won't have a chance
> to clash anymore. For the short term aborting git submodule add when a
> submodule of that name already exists in .git/modules of the superproject
> together with the ability to provide a custom name might at least solve
> the local clashes.

This two patch series implements the short term solution described above.

Using some kind of UUID can easily be added in a subsequent patch, we just
have to replace 'sm_name=$sm_path' with 'sm_name=$(generate uuid)' in
line 348 of git-submodule.sh. I think it'll be the best solution to just
use a random UUID for that, as doing anything clever (like using the SHA1
of the url to avoid copies of the same remote repo) might lead to subtle
breakages (e.g. because it assumes the url stays unique forever, which it
sometimes won't). But maybe the short term solution is sufficient as most
of the time people won't produce submodule name conflicts (and names
derived from paths are much more readable than UUIDs). Thoughts?
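
To make the two naming options concrete, here is a hedged sketch of what could stand in for the 'generate uuid' placeholder. Both function names are assumptions, the random variant produces 128 random bits as hex rather than a formally structured UUID, and the URL variant assumes the sha1sum utility from GNU coreutils is available.

```shell
#!/bin/sh
# Illustrative helpers, not part of git-submodule.sh.

# Print 32 hex digits read from /dev/urandom.
random_submodule_name () {
	od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
}

# Print the SHA-1 of the URL given as $1.
# The result changes whenever the URL is reconfigured.
url_submodule_name () {
	printf '%s' "$1" | sha1sum | cut -d' ' -f1
}
```

The random name never repeats in practice, which also means two clones adding the same repository get different names. The URL-derived name is stable only for as long as the URL is.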


Jens Lehmann (2):
  Teach git submodule add the --name option
  submodule add: Fail when .git/modules/name already exists

 Documentation/git-submodule.txt |  7
 Documentation/gitmodules.txt    |  4
 git-submodule.sh                | 35
 t/t7400-submodule-basic.sh      | 63
 t/t7406-submodule-update.sh     |  2
 5 files changed, 97 insertions(+), 14 deletions(-)

-- 
1.7.12.1.430.g4fd6dc4




Re: [PATCH 0/2] Let git submodule add fail when .git/modules/name already exists

2012-09-29 Thread Junio C Hamano
Jens Lehmann jens.lehm...@web.de writes:

> The only long term solution I can think of is to use some kind of UUID for
> the name, so that the names of newly added submodules won't have a chance
> to clash anymore. For the short term aborting git submodule add when a
> submodule of that name already exists in .git/modules of the superproject
> together with the ability to provide a custom name might at least solve
> the local clashes.

That assumes that the addition of the submodule for the second time
is to add a completely different submodule at the same location and
is done on purpose, but is that a sensible assumption?

If a superproject that is about an embedded appliance used to have a
submodule "A" bound at its path "kernel", but for some reason stopped
shipping with "kernel" and then later reintroduced the directory
"kernel" bound to some submodule "B", my gut feeling is that it is
just as likely (if not more likely) that "A" and "B" are indeed the same
submodule (i.e. they share the same history) as that they are totally
unrelated.

Could it be that it is a user error, combined with the immaturity of
the "git submodule" tool, which does not yet support "it used to be here,
but it disappears for a while and then it reappears" in the history
of the superproject very well, that caused the user to manually add
a new submodule which in fact is the same submodule at the same
path?

I think failing with a better error message is a good idea. It
should suggest either resurrecting the submodule that is stashed
away in $GIT_DIR/modules/$name if it indeed is the same, or
giving it a different name (perhaps "kernel" used to point at
the Linux kernel history, and the user is now replacing it with a
totally different implementation that really comes from a different
origin and does not share any history, perhaps BSD).  In such a case,
the user may want to pick "bsd-kernel" or something as its name, to
differentiate it.

> Using some kind of UUID can easily be added in a subsequent patch,...

I would suggest thinking really long and hard before saying UUID.
It is an easy cop-out to ensure uniqueness, but risks allowing two
people (or one person at two different times) to give two unrelated
names to a single thing that actually is the same.

A better alternative might be to use the commit object name at the
root of the history of the submodule, which would catch the simplest
and most common case of the mistake, I would think.