Re: [PATCH 5/5] submodule: fixup nested submodules after moving the submodule

2018-03-27 Thread Stefan Beller
On Tue, Mar 27, 2018 at 5:07 PM, Jonathan Tan  wrote:

> s/submoduled/submodules

> s/superprojects/superproject's/

> s/and //
>
> s/force/forcing/

All wording fixed.

>> + sub_path = sub_worktree + strlen(super_worktree) + 1;
>> +
>> + if (repo_submodule_init(&subrepo, superproject, sub_path))
>> + return;
>> +
>> + repo_read_index(&subrepo);
>
> From the name of this function and its usage in
> connect_work_tree_and_git_dir(), I expected this function to just
> iterate through all the files in its workdir (which it is doing in the
> "for" loop below) and connect any nested submodules. Why does it need
> access to its superproject? (I would think that repo_init() would be
> sufficient here instead of repo_submodule_init().)

Testing validates your thinking (for now).

If we ever want to have good error reporting (see bmwills hint to
check for index corruption), we may want to have all the repos constructed
as submodules from the_repository, as then the error messages might
be better (e.g. in the future we could display the
"submodule nesting stack").

I'll remove the superproject argument for now.

>> +
>> + strbuf_reset(&sub_wt);
>> + strbuf_addf(&sub_wt, "%s/%s/.git", sub_worktree, sub->path);
>> +
>> + strbuf_reset(&sub_gd);
>> + strbuf_addf(&sub_gd, "%s/modules/%s", sub_gitdir, sub->name);
>> +
>> + strbuf_setlen(&sub_wt, sub_wt.len - strlen("/.git"));
>> +
>> + if (is_submodule_active(&subrepo, ce->name)) {
>> + connect_work_tree_and_git_dir(sub_wt.buf, sub_gd.buf, 
>> 0);
>> + connect_wt_gitdir_in_nested(sub_wt.buf, sub_gd.buf, 
>> &subrepo);
>
> The modifications of sub_wt and sub_gd should probably go here, since
> they are not used unless this "if" block is executed.

Thanks! I also cut out the setlen call by giving the correct format string.
(The code presented here is not very old, I just fired it off as soon as the
test passed)

>
>> +void connect_work_tree_and_git_dir(const char *work_tree_,
>> +const char *git_dir_,
>> +int recurse_into_nested)
>
> How is this function expected to be used? From what I see:
>  - if recurse_into_nested is 0, this works regardless of whether the
>work_tree_ and git_dir_ is directly or indirectly a submodule of
>the_repository
>  - if recurse_into_nested is 1, work_tree_ and git_dir_ must be directly
>a submodule of the_repository (since it is referenced directly)
>
> This seems confusing - is this expected?

In the next revision of the series connect_wt_gitdir_in_nested
will no longer have a third argument for the repo, as we use repo_init.

That eases the handling a bit here.


Re: [PATCH 5/5] submodule: fixup nested submodules after moving the submodule

2018-03-27 Thread Jonathan Tan
On Tue, 27 Mar 2018 14:39:18 -0700
Stefan Beller  wrote:

> connect_work_tree_and_git_dir is used to connect a submodule worktree with
> its git directory and vice versa after events that require a reconnection
> such as moving around the working tree. As submodules can have nested
> submoduled themselves, we'd also want to fix the nested submodules when

s/submoduled/submodules

> asked to. Add an option to recurse into the nested submodules and connect
> them as well.
> 
> As submodules are identified by their name (which determines their git
> directory in relation to their superprojects git directory) internally

s/superprojects/superproject's/

> and by their path in the working tree of the superproject, we need to
> make sure that the mapping of name <-> path is kept intact. We can do
> that in the git-mv command by writing out the gitmodules file and first

s/and //

> and then force a reload of the submodule config machinery.

s/force/forcing/

> 
> Signed-off-by: Stefan Beller 

[snip]

> +static void connect_wt_gitdir_in_nested(const char *sub_worktree,
> + const char *sub_gitdir,
> + struct repository *superproject)
> +{
> + int i;
> + struct repository subrepo;
> + struct strbuf sub_wt = STRBUF_INIT;
> + struct strbuf sub_gd = STRBUF_INIT;
> + const struct submodule *sub;
> + const char *super_worktree,
> +*sub_path; /* path inside the superproject */
> +
> + /* subrepo got moved, so superproject has outdated information */
> + submodule_free(superproject);
> +
> + super_worktree = real_pathdup(superproject->worktree, 1);
> +
> + sub_path = sub_worktree + strlen(super_worktree) + 1;
> +
> + if (repo_submodule_init(&subrepo, superproject, sub_path))
> + return;
> +
> + repo_read_index(&subrepo);

>From the name of this function and its usage in
connect_work_tree_and_git_dir(), I expected this function to just
iterate through all the files in its workdir (which it is doing in the
"for" loop below) and connect any nested submodules. Why does it need
access to its superproject? (I would think that repo_init() would be
sufficient here instead of repo_submodule_init().)

> +
> + for (i = 0; i < subrepo.index->cache_nr; i++) {
> + const struct cache_entry *ce = subrepo.index->cache[i];
> +
> + if (!S_ISGITLINK(ce->ce_mode))
> + continue;
> +
> + while (i + 1 < subrepo.index->cache_nr &&
> +!strcmp(ce->name, subrepo.index->cache[i + 1]->name))
> + /*
> +  * Skip entries with the same name in different stages
> +  * to make sure an entry is returned only once.
> +  */
> + i++;
> +
> + sub = submodule_from_path(&subrepo, &null_oid, ce->name);
> + if (!sub)
> + /* submodule not checked out? */
> + continue;
> +
> + strbuf_reset(&sub_wt);
> + strbuf_addf(&sub_wt, "%s/%s/.git", sub_worktree, sub->path);
> +
> + strbuf_reset(&sub_gd);
> + strbuf_addf(&sub_gd, "%s/modules/%s", sub_gitdir, sub->name);
> +
> + strbuf_setlen(&sub_wt, sub_wt.len - strlen("/.git"));
> +
> + if (is_submodule_active(&subrepo, ce->name)) {
> + connect_work_tree_and_git_dir(sub_wt.buf, sub_gd.buf, 
> 0);
> + connect_wt_gitdir_in_nested(sub_wt.buf, sub_gd.buf, 
> &subrepo);

The modifications of sub_wt and sub_gd should probably go here, since
they are not used unless this "if" block is executed.

> +void connect_work_tree_and_git_dir(const char *work_tree_,
> +const char *git_dir_,
> +int recurse_into_nested)

How is this function expected to be used? From what I see:
 - if recurse_into_nested is 0, this works regardless of whether the
   work_tree_ and git_dir_ is directly or indirectly a submodule of
   the_repository
 - if recurse_into_nested is 1, work_tree_ and git_dir_ must be directly
   a submodule of the_repository (since it is referenced directly)

This seems confusing - is this expected?


Re: [PATCH 5/5] submodule: fixup nested submodules after moving the submodule

2018-03-27 Thread Brandon Williams
On 03/27, Stefan Beller wrote:
> connect_work_tree_and_git_dir is used to connect a submodule worktree with
> its git directory and vice versa after events that require a reconnection
> such as moving around the working tree. As submodules can have nested
> submoduled themselves, we'd also want to fix the nested submodules when
> asked to. Add an option to recurse into the nested submodules and connect
> them as well.
> 
> As submodules are identified by their name (which determines their git
> directory in relation to their superprojects git directory) internally
> and by their path in the working tree of the superproject, we need to
> make sure that the mapping of name <-> path is kept intact. We can do
> that in the git-mv command by writing out the gitmodules file and first
> and then force a reload of the submodule config machinery.
> 
> Signed-off-by: Stefan Beller 
> ---
>  builtin/mv.c|  6 ++--
>  builtin/submodule--helper.c |  3 +-
>  dir.c   | 70 +++--
>  dir.h   | 12 ++-
>  submodule.c |  6 ++--
>  t/t7001-mv.sh   |  2 +-
>  6 files changed, 87 insertions(+), 12 deletions(-)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 6d141f7a53..7a63667d64 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -276,10 +276,12 @@ int cmd_mv(int argc, const char **argv, const char 
> *prefix)
>   die_errno(_("renaming '%s' failed"), src);
>   }
>   if (submodule_gitfile[i]) {
> - if (submodule_gitfile[i] != SUBMODULE_WITH_GITDIR)
> - connect_work_tree_and_git_dir(dst, 
> submodule_gitfile[i]);
>   if (!update_path_in_gitmodules(src, dst))
>   gitmodules_modified = 1;
> + if (submodule_gitfile[i] != SUBMODULE_WITH_GITDIR)
> + connect_work_tree_and_git_dir(dst,
> +   
> submodule_gitfile[i],
> +   1);
>   }
>  
>   if (mode == WORKING_DIRECTORY)
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index a921fbbf56..05fd657f99 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1259,8 +1259,7 @@ static int module_clone(int argc, const char **argv, 
> const char *prefix)
>   strbuf_reset(&sb);
>   }
>  
> - /* Connect module worktree and git dir */
> - connect_work_tree_and_git_dir(path, sm_gitdir);
> + connect_work_tree_and_git_dir(path, sm_gitdir, 0);
>  
>   p = git_pathdup_submodule(path, "config");
>   if (!p)
> diff --git a/dir.c b/dir.c
> index dedbf5d476..313176e291 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -19,6 +19,7 @@
>  #include "varint.h"
>  #include "ewah/ewok.h"
>  #include "fsmonitor.h"
> +#include "submodule-config.h"
>  
>  /*
>   * Tells read_directory_recursive how a file or directory should be treated.
> @@ -3010,8 +3011,67 @@ void untracked_cache_add_to_index(struct index_state 
> *istate,
>   untracked_cache_invalidate_path(istate, path, 1);
>  }
>  
> -/* Update gitfile and core.worktree setting to connect work tree and git dir 
> */
> -void connect_work_tree_and_git_dir(const char *work_tree_, const char 
> *git_dir_)
> +static void connect_wt_gitdir_in_nested(const char *sub_worktree,
> + const char *sub_gitdir,
> + struct repository *superproject)
> +{
> + int i;
> + struct repository subrepo;

You never clear this struct which means it leaks the memory it points
to.

> + struct strbuf sub_wt = STRBUF_INIT;
> + struct strbuf sub_gd = STRBUF_INIT;
> + const struct submodule *sub;
> + const char *super_worktree,
> +*sub_path; /* path inside the superproject */
> +
> + /* subrepo got moved, so superproject has outdated information */
> + submodule_free(superproject);
> +
> + super_worktree = real_pathdup(superproject->worktree, 1);
> +
> + sub_path = sub_worktree + strlen(super_worktree) + 1;
> +
> + if (repo_submodule_init(&subrepo, superproject, sub_path))
> + return;
> +
> + repo_read_index(&subrepo);

You may want to check the return value to see if reading the index was
successful.

> +
> + for (i = 0; i < subrepo.index->cache_nr; i++) {
> + const struct cache_entry *ce = subrepo.index->cache[i];
> +
> + if (!S_ISGITLINK(ce->ce_mode))
> + continue;
> +
> + while (i + 1 < subrepo.index->cache_nr &&
> +!strcmp(ce->name, subrepo.index->cache[i + 1]->name))
> + /*
> +  * Skip entries with the same name in different stages
> +  * to make sure an entry is returned only once.