Re: [RFC 01/10] submodule: add 'core.submodulesFile' to override the '.gitmodules' path

2018-04-18 Thread Stefan Beller
Hi Antonio,

>>
>> Good point! I wonder if the cleaner solution would be to just
>> tell git to use HEAD:.gitmodules and not check out the file?
>> then you would not need to come up with a namespace for names
>> of the .gitmodules files and scatter them into the worktree as well?
>>
>
> Any solution which:
>
>   1. prevents the gitmodules file from being checked out
>   2. but still tracks it in the git repository
>
> OR
>
>   1. allows placing the gitmodules file under some namespace
>
> would work for vcsh I guess.

I personally would tend to favor supporting your first solution
(prevent .gitmodules from being checked out, load it from the sparse HEAD),
but I do not have strong arguments or feelings about this dimension.
I am fine with a namespaced .gitmodules solution, too.

Both solutions can be implemented by either:

A) adding the code where it is (like your patch), e.g. using

> -   value=$(git config -f .gitmodules submodule."$name"."$option")
> +   gitmodules_file=$(git config core.submodulesfile)
> +   : ${gitmodules_file:=.gitmodules}
> +   value=$(git config -f "$gitmodules_file" submodule."$name"."$option")

B) adding a helper, which is a layer of indirection
to load the relevant configuration.

And when it comes to this dimension, I'd strongly favor B over A.
Having this indirection helper in place makes it easy to add more options
later, as only one place needs to be touched.
(These other options could include the other solution as presented above,
or the idea with the special ref as mentioned in an earlier email)
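
To make option B a bit more concrete, here is a minimal sketch of what such
an indirection layer could look like on the shell side. The function names
resolve_gitmodules_file and gitmodules_config are made up for illustration,
and core.submodulesfile is the setting proposed in this RFC, not an existing
git option.

```shell
#!/bin/sh
# Hypothetical helper layer: the single place that decides where
# submodule configuration is read from.

# Print the gitmodules path to use.
# $1 is the value of the proposed core.submodulesfile (may be empty).
resolve_gitmodules_file () {
	configured=$1
	printf '%s\n' "${configured:-.gitmodules}"
}

# Look up submodule.<name>.<option> in the resolved gitmodules file,
# mirroring the lookup done in get_submodule_config in the patch.
gitmodules_config () {
	name=$1 option=$2
	file=$(resolve_gitmodules_file "$(git config core.submodulesfile)")
	git config -f "$file" submodule."$name"."$option"
}
```

Callers would use gitmodules_config "$name" "$option" everywhere, so that
supporting HEAD:.gitmodules or a special ref later means touching only this
one function.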


>> > Can you give an example from the user point of view of such a
>> > "config-from-gitmodules" command?
>> >
>>
>> git submodule config <name> <option>
>>
>> as an 'alias' for
>>
>>    gitmodules_file=$(git config core.submodulesfile)
>>    : ${gitmodules_file:=.gitmodules}
>>    value=$(git config -f "$gitmodules_file" submodule."$name"."$option")
>>
>> The helper would figure out which config file to load from
>> (.gitmodules in tree, HEAD:.gitmodules, your new proposed gitmodules file,
>> .git/config... or the special ref) and then return the <option> for <name>.
>>
>> So maybe:
>>
>> $ git clone https://gerrit.googlesource.com/gerrit && cd gerrit
>> # ^ My goto-repo with submodules
>>
>> $ git submodule config "plugins/hooks" URL
>> ../plugins/hooks
>>
>>
>
> I may look into such supporting changes once you decide the approach to
> take for the bigger problem.

I think once we have the helper in place you can implement the solution
to the bigger problem as you like?

There are a few pros and cons for namespaced .gitmodules and
non-checked-out sparse HEAD .gitmodules:

How do you modify the .gitmodules config?

In the namespaced solution, you can tell users to edit that
file manually or use "git config -f $new_location" to manipulate
that file.

In the sparse solution editing becomes a little bit trickier, as you
need to edit a file in the index (or HEAD).

If you have the special ref, you could just checkout the
special ref in another worktree and make changes and
commit there.


How do you change the setup?

In case of a sparse gitmodules file, you can just check it out
(make it non-sparse) or vice versa.

In case of a namespaced gitmodules file, you'd change the
config setting and have to move the file to the new location.
As git config is just about configuring, the user is left alone
with moving the file; or would we have a helper for that
("git submodule relocate-gitmodules" or such)?

If you have the special ref, you could just checkout the
special ref in another worktree and make changes and
commit there.
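
For the namespaced case, the two manual steps (move the file, update the
setting) could be wrapped roughly like this. It is only a sketch under some
assumptions: relocate_gitmodules is a hypothetical name, core.submodulesfile
is the setting proposed in this series, and the file is assumed to be tracked
so that "git mv" can move it.

```shell
#!/bin/sh
# Hypothetical helper: move the gitmodules file and update the setting
# proposed in this RFC so that both stay in sync.
relocate_gitmodules () {
	new=$1
	# Current location; empty when the (proposed) setting is unset.
	old=$(git config core.submodulesfile || :)
	: "${old:=.gitmodules}"
	# Move the tracked file first, and record the new path only on success.
	git mv "$old" "$new" &&
	git config core.submodulesfile "$new"
}
```

Something like relocate_gitmodules .gitmodules-vim would then leave the
repository with the file and the setting pointing at the same path.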

I hope this helps instead of confusing more,

Thanks,
Stefan


Re: [RFC 01/10] submodule: add 'core.submodulesFile' to override the '.gitmodules' path

2018-04-18 Thread Antonio Ospite
On Mon, 16 Apr 2018 14:22:35 -0700
Stefan Beller wrote:

> On Mon, Apr 16, 2018 at 9:37 AM, Antonio Ospite wrote:
> > On Thu, 12 Apr 2018 16:50:03 -0700
> > Stefan Beller wrote:
> >
> >> Hi Antonio,
> >>
> >> On Thu, Apr 12, 2018 at 3:20 PM, Antonio Ospite wrote:
> >> > When multiple repositories with detached work-trees take turns using the
> >> > same directory as their work-tree, and more than one of them want to use
> >> > submodules, there will be conflicts about the '.gitmodules' file.
> >>
> >> unlike other files which would not conflict?
> >> There might be file names such as LICENSE, Readme.md etc,
> >> which are common enough that they would produce conflicts as well?
> >> I find this argument on its own rather weak. ("Just delete everything in
> >> the working dir before using it with another repository"). I might be
> >> missing a crucial bit here?
> >>
> >
> > All the vcsh repositories _share_ the same work-tree; they may control
> > it taking turns but, in general, all files are meant to be checked out
> > at all times as the basic use case is: *distinct* sets of config files.
> >
> > Maybe saying that the repositories "take turns" is confusing.
> > It's unnecessary information, so I will omit that part from the
> > commit message.
> 
> So they all have the same workdir; do they track the same set of files,
> or do they track disjoint sets of files, ignoring the other repositories'
> files via the ignore mechanism?
>

To recap,

vcsh[1] sets $HOME as the work-tree of multiple repositories to track
different sets of dotfiles in distinct repositories, while still having
the files directly available in $HOME. Each repository can ignore
untracked files via the ignore mechanism (namely core.excludesFile).

[1] https://github.com/RichiH/vcsh

For all this to work well, the sets of the tracked files would also need
to be disjoint, and usually they "practically" are, once a few
exceptions are taken care of.

Common intersecting items like LICENSE and README can be handled via
sparse-checkout to have "disjoint checkouts" and this solves most of
the problems, but the same mechanism cannot be used for .gitmodules as
it needs to be checked out.

And the problem cannot be worked around the way it is for .gitignore
(using core.excludesFile instead) because .gitmodules is unique and
hardcoded.
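
For reference, the sparse-checkout approach for files like LICENSE and README
can be sketched as follows. The function name is hypothetical, and plain
"git" is used here where a vcsh user would go through vcsh to select the
repository; the info/sparse-checkout file and "git read-tree -mu HEAD" are
the standard mechanism.

```shell
#!/bin/sh
# Hypothetical helper: keep the given paths tracked in the repository but
# not checked out in the (shared) work-tree.
exclude_from_checkout () {
	git config core.sparseCheckout true
	info_dir=$(git rev-parse --git-path info)
	mkdir -p "$info_dir"
	{
		# Include everything...
		echo '/*'
		# ...except the top-level paths passed as arguments.
		for path in "$@"
		do
			echo "!/$path"
		done
	} >"$info_dir/sparse-checkout"
	# Re-read HEAD so that the work-tree reflects the new patterns.
	git read-tree -mu HEAD
}
```

After exclude_from_checkout LICENSE README the files are still part of the
repository but no longer appear in the work-tree, which is exactly what does
not help for .gitmodules.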

> This sounds like an interesting setup. I never thought of that as something
> useful (in either configuration).
>

Give vcsh a try maybe.

[...]
> > However I guess that my point here is that the gitmodules file is
> > something that influences git behavior so it should not be on the user's
> > shoulder to manage conflicts for it, and most importantly it needs to
> > be checked out for git to access it, doesn't it?
> 
> Good point! I wonder if the cleaner solution would be to just
> tell git to use HEAD:.gitmodules and not check out the file?
> then you would not need to come up with a namespace for names
> of the .gitmodules files and scatter them into the worktree as well?
>

Any solution which:

  1. prevents the gitmodules file from being checked out
  2. but still tracks it in the git repository

OR
  
  1. allows placing the gitmodules file under some namespace

would work for vcsh I guess.

> 
> >> > -   value=$(git config -f .gitmodules submodule."$name"."$option")
> >> > +   gitmodules_file=$(git config core.submodulesfile)
> >> > +   : ${gitmodules_file:=.gitmodules}
> >> > +   value=$(git config -f "$gitmodules_file" submodule."$name"."$option")
> >>
> >> I wonder if it would be cheaper to write a special config lookup now, e.g.
> >> in builtin/submodule--helper.c we could have a "config-from-gitmodules"
> >> subcommand that is looking up the modules file and then running the config
> >> on that file.
> >>
> >
> > Can you give an example from the user point of view of such a
> > "config-from-gitmodules" command?
> >
> 
> git submodule config <name> <option>
> 
> as an 'alias' for
> 
>    gitmodules_file=$(git config core.submodulesfile)
>    : ${gitmodules_file:=.gitmodules}
>    value=$(git config -f "$gitmodules_file" submodule."$name"."$option")
> 
> The helper would figure out which config file to load from
> (.gitmodules in tree, HEAD:.gitmodules, your new proposed gitmodules file,
> .git/config... or the special ref) and then return the <option> for <name>.
> 
> So maybe:
> 
> $ git clone https://gerrit.googlesource.com/gerrit && cd gerrit
> # ^ My goto-repo with submodules
> 
> $ git submodule config "plugins/hooks" URL
> ../plugins/hooks
> 
>

I may look into such supporting changes once you decide the approach to
take for the bigger problem.

Thank you,
   Antonio

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?


Re: [RFC 01/10] submodule: add 'core.submodulesFile' to override the '.gitmodules' path

2018-04-16 Thread Stefan Beller
On Mon, Apr 16, 2018 at 9:37 AM, Antonio Ospite wrote:
> On Thu, 12 Apr 2018 16:50:03 -0700
> Stefan Beller wrote:
>
>> Hi Antonio,
>>
>> On Thu, Apr 12, 2018 at 3:20 PM, Antonio Ospite wrote:
>> > When multiple repositories with detached work-trees take turns using the
>> > same directory as their work-tree, and more than one of them want to use
>> > submodules, there will be conflicts about the '.gitmodules' file.
>>
>> unlike other files which would not conflict?
>> There might be file names such as LICENSE, Readme.md etc,
>> which are common enough that they would produce conflicts as well?
>> I find this argument on its own rather weak. ("Just delete everything in
>> the working dir before using it with another repository"). I might be
>> missing a crucial bit here?
>>
>
> All the vcsh repositories _share_ the same work-tree; they may control
> it taking turns but, in general, all files are meant to be checked out
> at all times as the basic use case is: *distinct* sets of config files.
>
> Maybe saying that the repositories "take turns" is confusing.
> It's unnecessary information, so I will omit that part from the
> commit message.

So they all have the same workdir; do they track the same set of files,
or do they track disjoint sets of files, ignoring the other repositories'
files via the ignore mechanism?

This sounds like an interesting setup. I never thought of that as something
useful (in either configuration).

> After your question I've done some research and I've seen other vcsh
> users managing conflicting LICENSE and README files using git
> sparse-checkouts, to have these files in the single repositories but
> not checked out in the shared work-tree:
> https://github.com/RichiH/vcsh/issues/120#issuecomment-42639619
> https://github.com/jwhitley/vcsh-root/commit/30b0d495c2cbe47ae9617ace9c2c14720d961d78
>
> However I guess that my point here is that the gitmodules file is
> something that influences git behavior so it should not be on the user's
> shoulder to manage conflicts for it, and most importantly it needs to
> be checked out for git to access it, doesn't it?

Good point! I wonder if the cleaner solution would be to just
tell git to use HEAD:.gitmodules and not check out the file?
then you would not need to come up with a namespace for names
of the .gitmodules files and scatter them into the worktree as well?
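
For what it is worth, reading a committed .gitmodules without checking it out
is already possible from the command line with "git config --blob". A minimal
sketch, where the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read submodule.<name>.<option> from the .gitmodules
# blob recorded in a commit instead of from the work-tree file.
# $3 is an optional tree-ish and defaults to HEAD.
config_from_committed_gitmodules () {
	name=$1 option=$2 treeish=${3:-HEAD}
	git config --blob "$treeish:.gitmodules" submodule."$name"."$option"
}
```

This only covers lookups from scripts; the in-process readers in
submodule-config.c would still need their own handling.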


>> > -   value=$(git config -f .gitmodules submodule."$name"."$option")
>> > +   gitmodules_file=$(git config core.submodulesfile)
>> > +   : ${gitmodules_file:=.gitmodules}
>> > +   value=$(git config -f "$gitmodules_file" submodule."$name"."$option")
>>
>> I wonder if it would be cheaper to write a special config lookup now, e.g.
>> in builtin/submodule--helper.c we could have a "config-from-gitmodules"
>> subcommand that is looking up the modules file and then running the config
>> on that file.
>>
>
> Can you give an example from the user point of view of such a
> "config-from-gitmodules" command?
>

git submodule config <name> <option>

as an 'alias' for

   gitmodules_file=$(git config core.submodulesfile)
   : ${gitmodules_file:=.gitmodules}
   value=$(git config -f "$gitmodules_file" submodule."$name"."$option")

The helper would figure out which config file to load from
(.gitmodules in tree, HEAD:.gitmodules, your new proposed gitmodules file,
.git/config... or the special ref) and then return the <option> for <name>.

So maybe:

$ git clone https://gerrit.googlesource.com/gerrit && cd gerrit
# ^ My goto-repo with submodules

$ git submodule config "plugins/hooks" URL
../plugins/hooks



> I might look into it, but that can also be a followup change.


>> > diff --git a/submodule.c b/submodule.c
>> > index 9a50168b2..2afbdb644 100644
>> > --- a/submodule.c
>> > +++ b/submodule.c
>> > @@ -36,13 +36,13 @@ static struct oid_array ref_tips_after_fetch;
>> >   */
>> >  int is_gitmodules_unmerged(const struct index_state *istate)
>> >  {
>> > -   int pos = index_name_pos(istate, GITMODULES_FILE, strlen(GITMODULES_FILE));
>> > +   int pos = index_name_pos(istate, submodules_file, strlen(submodules_file));
>>
>> Ah, regarding the cover letter: This clearly assumes the modules
>> file is in the tree. So at least here we would make an exception
>> for files outside the tree to either not check for un-merged-ness or
>> disallow that case entirely.
>>
>
> Sorry I am not sure I follow what you are saying here, keep in mind
> that I am new to git internals.
>
> Do you mean that, even if we ensure (in
> config.c::git_default_core_config) that only paths relative to
> the work-tree are allowed, we still have to check here that the
> constraint is respected? And if so, why?

index_name_pos looks up a position of a file in the index,
which would fail for any file not in the index.

So if we give a path outside the tree, the lookup would fail
and we'd treat it as no .gitmodu

Re: [RFC 01/10] submodule: add 'core.submodulesFile' to override the '.gitmodules' path

2018-04-16 Thread Antonio Ospite
On Thu, 12 Apr 2018 16:50:03 -0700
Stefan Beller wrote:

> Hi Antonio,
> 
> On Thu, Apr 12, 2018 at 3:20 PM, Antonio Ospite wrote:
> > When multiple repositories with detached work-trees take turns using the
> > same directory as their work-tree, and more than one of them want to use
> > submodules, there will be conflicts about the '.gitmodules' file.
> 
> unlike other files which would not conflict?
> There might be file names such as LICENSE, Readme.md etc,
> which are common enough that they would produce conflicts as well?
> I find this argument on its own rather weak. ("Just delete everything in
> the working dir before using it with another repository"). I might be
> missing a crucial bit here?
>

All the vcsh repositories _share_ the same work-tree; they may control
it taking turns but, in general, all files are meant to be checked out
at all times as the basic use case is: *distinct* sets of config files.

Maybe saying that the repositories "take turns" is confusing.
It's unnecessary information, so I will omit that part from the
commit message.

After your question I've done some research and I've seen other vcsh
users managing conflicting LICENSE and README files using git
sparse-checkouts, to have these files in the single repositories but
not checked out in the shared work-tree:
https://github.com/RichiH/vcsh/issues/120#issuecomment-42639619
https://github.com/jwhitley/vcsh-root/commit/30b0d495c2cbe47ae9617ace9c2c14720d961d78

However I guess that my point here is that the gitmodules file is
something that influences git behavior so it should not be on the user's
shoulder to manage conflicts for it, and most importantly it needs to
be checked out for git to access it, doesn't it?

> > git hardcodes this path so it's not possible to override its location on
> > a per-repository basis to allow such repositories to coexist
> > peacefully.
> >
> > Make the path of the "gitmodules file" customizable by exposing
> > a 'core.submodulesFile' configuration setting.
> >
> > The default value will still be '.gitmodules' when 'core.submodulesFile'
> > is not set.
> 
> ok.
> 
> 
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1774,6 +1774,7 @@ extern void prepare_pager_args(struct child_process *, const char *pager);
> >  extern const char *editor_program;
> >  extern const char *askpass_program;
> >  extern const char *excludes_file;
> > +extern const char *submodules_file;
> 
> Could you place this variable in repository.h in struct repository?
> (Some developers currently try to move any global state to that place,
> as that makes working with e.g. nested submodules easier in-process
> and you would not need to spawn processes for submodules)
> 
> Once migrated to the repository struct mentioned above, you'd access
> it via the_repository->submodules_file for the main repository.
>

OK, thanks, I didn't like the global variable either, I was just copying
from excludes_file.

> 
> > diff --git a/git-submodule.sh b/git-submodule.sh
> > index 24914963c..610fd0dc5 100755
> > --- a/git-submodule.sh
> > +++ b/git-submodule.sh
> > @@ -71,7 +71,9 @@ get_submodule_config () {
> > value=$(git config submodule."$name"."$option")
> > if test -z "$value"
> > then
> > -   value=$(git config -f .gitmodules submodule."$name"."$option")
> > +   gitmodules_file=$(git config core.submodulesfile)
> > +   : ${gitmodules_file:=.gitmodules}
> > +   value=$(git config -f "$gitmodules_file" submodule."$name"."$option")
> 
> I wonder if it would be cheaper to write a special config lookup now, e.g.
> in builtin/submodule--helper.c we could have a "config-from-gitmodules"
> subcommand that is looking up the modules file and then running the config
> on that file.
>

Can you give an example from the user point of view of such a
"config-from-gitmodules" command?

I might look into it, but that can also be a followup change. 

> I am surprised how little access of the .gitmodules is left in git-submodule.sh
> (which is partially ported to the builtin/submodule--helper.c)
> 
> > diff --git a/submodule-config.c b/submodule-config.c
> > index 3f2075764..8a3396ade 100644
> > --- a/submodule-config.c
> > +++ b/submodule-config.c
> > @@ -468,7 +468,7 @@ static int gitmodule_oid_from_commit(const struct object_id *treeish_name,
> > return 1;
> > }
> >
> > -   strbuf_addf(rev, "%s:.gitmodules", oid_to_hex(treeish_name));
> > +   strbuf_addf(rev, "%s:%s", oid_to_hex(treeish_name), submodules_file);
> > if (get_oid(rev->buf, gitmodules_oid) >= 0)
> > ret = 1;
> >
> > @@ -583,7 +583,7 @@ void repo_read_gitmodules(struct repository *repo)
> > if (repo_read_index(repo) < 0)
> > return;
> >
> > -   gitmodules = repo_worktree_path(repo, GITMODULES_FILE);
> > +   gitmodules = repo_worktree_path(repo, submodules_file);
> >
> >

Re: [RFC 01/10] submodule: add 'core.submodulesFile' to override the '.gitmodules' path

2018-04-12 Thread Stefan Beller
Hi Antonio,

On Thu, Apr 12, 2018 at 3:20 PM, Antonio Ospite wrote:
> When multiple repositories with detached work-trees take turns using the
> same directory as their work-tree, and more than one of them want to use
> submodules, there will be conflicts about the '.gitmodules' file.

unlike other files which would not conflict?
There might be file names such as LICENSE, Readme.md etc,
which are common enough that they would produce conflicts as well?
I find this argument on its own rather weak. ("Just delete everything in
the working dir before using it with another repository"). I might be
missing a crucial bit here?

> git hardcodes this path so it's not possible to override its location on
> a per-repository basis to allow such repositories to coexist
> peacefully.
>
> Make the path of the "gitmodules file" customizable by exposing
> a 'core.submodulesFile' configuration setting.
>
> The default value will still be '.gitmodules' when 'core.submodulesFile'
> is not set.

ok.


> --- a/cache.h
> +++ b/cache.h
> @@ -1774,6 +1774,7 @@ extern void prepare_pager_args(struct child_process *, const char *pager);
>  extern const char *editor_program;
>  extern const char *askpass_program;
>  extern const char *excludes_file;
> +extern const char *submodules_file;

Could you place this variable in repository.h in struct repository?
(Some developers currently try to move any global state to that place,
as that makes working with e.g. nested submodules easier in-process
and you would not need to spawn processes for submodules)

Once migrated to the repository struct mentioned above, you'd access
it via the_repository->submodules_file for the main repository.


> diff --git a/git-submodule.sh b/git-submodule.sh
> index 24914963c..610fd0dc5 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -71,7 +71,9 @@ get_submodule_config () {
> value=$(git config submodule."$name"."$option")
> if test -z "$value"
> then
> -   value=$(git config -f .gitmodules submodule."$name"."$option")
> +   gitmodules_file=$(git config core.submodulesfile)
> +   : ${gitmodules_file:=.gitmodules}
> +   value=$(git config -f "$gitmodules_file" submodule."$name"."$option")

I wonder if it would be cheaper to write a special config lookup now, e.g.
in builtin/submodule--helper.c we could have a "config-from-gitmodules"
subcommand that is looking up the modules file and then running the config
on that file.

I am surprised how little access of the .gitmodules is left in git-submodule.sh
(which is partially ported to the builtin/submodule--helper.c)

> diff --git a/submodule-config.c b/submodule-config.c
> index 3f2075764..8a3396ade 100644
> --- a/submodule-config.c
> +++ b/submodule-config.c
> @@ -468,7 +468,7 @@ static int gitmodule_oid_from_commit(const struct object_id *treeish_name,
> return 1;
> }
>
> -   strbuf_addf(rev, "%s:.gitmodules", oid_to_hex(treeish_name));
> +   strbuf_addf(rev, "%s:%s", oid_to_hex(treeish_name), submodules_file);
> if (get_oid(rev->buf, gitmodules_oid) >= 0)
> ret = 1;
>
> @@ -583,7 +583,7 @@ void repo_read_gitmodules(struct repository *repo)
> if (repo_read_index(repo) < 0)
> return;
>
> -   gitmodules = repo_worktree_path(repo, GITMODULES_FILE);
> +   gitmodules = repo_worktree_path(repo, submodules_file);
>
> if (!is_gitmodules_unmerged(repo->index))
> git_config_from_file(gitmodules_cb, gitmodules, repo);
> diff --git a/submodule.c b/submodule.c
> index 9a50168b2..2afbdb644 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -36,13 +36,13 @@ static struct oid_array ref_tips_after_fetch;
>   */
>  int is_gitmodules_unmerged(const struct index_state *istate)
>  {
> -   int pos = index_name_pos(istate, GITMODULES_FILE, strlen(GITMODULES_FILE));
> +   int pos = index_name_pos(istate, submodules_file, strlen(submodules_file));

Ah, regarding the cover letter: This clearly assumes the modules
file is in the tree. So at least here we would make an exception
for files outside the tree to either not check for un-merged-ness or
disallow that case entirely.



There are quite a few functions in submodule.c which access the new global. :/
So moving them to the_repository should be fine, but eventually (not
in this series)
all these functions would want to take a repository argument as well,
such that they work on more than the_repository.

Thanks,
Stefan