On Wed, Dec 7, 2016 at 3:03 PM, Junio C Hamano <[email protected]> wrote:
> Stefan Beller <[email protected]> writes:
>
>> @@ -1093,7 +1129,8 @@ static struct cmd_struct commands[] = {
>> {"resolve-relative-url", resolve_relative_url, 0},
>> {"resolve-relative-url-test", resolve_relative_url_test, 0},
>> {"init", module_init, 0},
>> - {"remote-branch", resolve_remote_submodule_branch, 0}
>> + {"remote-branch", resolve_remote_submodule_branch, 0},
>> + {"embed-git-dirs", embed_git_dir, SUPPORT_SUPER_PREFIX}
>> };
>
> If you want to avoid patch noise like this, your 2/5 can add a
> trailing comma after the entry for remote-branch. It is OK to end
> elements in an array literal with trailing comma, even though we
> avoid doing similar in enum definition (which is only allowed in
> newer C standards).
Ok, thanks for pointing out!
>
>> diff --git a/dir.c b/dir.c
>> index bfa8c8a9a5..e023b04407 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -2748,3 +2748,30 @@ void untracked_cache_add_to_index(struct index_state
>> *istate,
>> {
>> untracked_cache_invalidate_path(istate, path);
>> }
>> +
>> +/*
>> + * Migrate the git directory of the given `path` from `old_git_dir` to
>> + * `new_git_dir`. If an error occurs, append it to `err` and return the
>> + * error code.
>> + */
>> +int relocate_gitdir(const char *path, const char *old_git_dir,
>> + const char *new_git_dir, const char *displaypath,
>> + struct strbuf *err)
>> +{
>> + int ret = 0;
>> +
>> + printf("Migrating git directory of '%s' from\n'%s' to\n'%s'\n",
>> + displaypath, old_git_dir, new_git_dir);
>
> By using "strbuf err", it looks like that the calling convention of
> this function wants to cater to callers who want to have tight
> control over their output and an unconditional printing to the
> standard output looks somewhat out of place.
Before sending it out to the list I had a version with another flag
that controlled whether we die on error or just print a warning.
That was not fully true however as the connect_work_tree_and_git_dir
would die nevertheless, we'd only warn for the failed rename().
It turns out we do not need a fully functional non-die()ing version
for the checkout series, so I ripped that out again and the version sent out
just dies in case of failure in relocate_gitdir.
That said, I think the printing of the migration message ought to go
into the caller and to stderr as you note.
>
> Besides, does it belong to the standard output? It looks more like
> a progress-bar eye candy that we send out to the standard error
> stream (and only when we are talking to a tty).
>
>> +/* Embeds a single submodule, non recursively. */
>> +static void submodule_embed_git_dir_for_path(const char *prefix, const char
>> *path)
>> +{
>> + struct worktree **worktrees;
>> + struct strbuf pathbuf = STRBUF_INIT;
>> + struct strbuf errbuf = STRBUF_INIT;
>> + char *old_git_dir = NULL, *real_old_git_dir = NULL, *real_new_git_dir
>> = NULL;
>> + const char *new_git_dir;
>> + const struct submodule *sub;
>> + int code;
>> +
>> + worktrees = get_submodule_worktrees(path, 0);
>> + if (worktrees) {
>> + int i;
>> + for (i = 0; worktrees[i]; i++)
>> + ;
>> + free_worktrees(worktrees);
>> + if (i > 1)
>> + die(_("relocate_gitdir for submodule '%s' with "
>> + "more than one worktree not supported"), path);
>> + }
>
> We may benefit from "does this have secondary worktrees?" boolean
> helper function, perhaps?
ok, I'll add that helper as its own patch to the worktree code earlier
in the series.
>
>> + old_git_dir = xstrfmt("%s/.git", path);
>> + if (read_gitfile(old_git_dir))
>> + /* If it is an actual gitfile, it doesn't need migration. */
>> + return;
>
> Isn't this "no-op return" a valid thing to do, even when the
> submodule has secondary worktrees that borrow from it?
If we have 2 worktrees, all bets are off (in my non-understanding).
The git file here *could* point to git directory living inside the
other working tree, which then would need migration from that other
working tree to some embedded place.
I don't think that is how worktrees work (or have ever worked?)
but I am unfamiliar with worktrees, so I erred on the safe side.
> I am
> wondering if the "ah, we don't do a repository that has secondary
> worktrees" check should come after this one.
Maybe Duy can answer that? (Where are git directories
located in a worktree setup, even historically? Were they
ever inside one of the submodules? The other working tree may
not be a submodule, but a standalone thing as well?)
>
>> + real_old_git_dir = xstrdup(real_path(old_git_dir));
>> +...
>> +}
>
>> +/*
>> + * Migrate the git directory of the submodule given by path from
>> + * having its git directory within the working tree to the git dir nested
>> + * in its superprojects git dir under modules/.
>> + */
>
> I think that this operation removes the git directories that are
> embedded in the working tree of the superproject and storing them
> away to safer place, i.e. unembedding.
Oh right we had that discussion what the embedding actually means.
I asked around for English language help here:
What about "absorbGitDir" ? (absorb as in eat/consume) for the external
UI and internally I can call it relocate_submodule_git_dir_into_superproject
(or embed_git_dir_into_superproject)
>> +
>> +
>
> Lose the extra blank line here?
ok
>> + if (!skip_prefix(real_sub_git_dir, real_common_git_dir, &v))
>
> Yeah, checking for NULL-ness with !skip_prefix() helps ;-)
I think you are referring to the interdiff to the previous patches..
Yes we should just use it as it is here.
>
>> + submodule_embed_git_dir_for_path(prefix, path);
>> +
>> + if (flags & RELOCATE_GITDIR_RECURSE_SUBMODULES) {
>> + struct child_process cp = CHILD_PROCESS_INIT;
>> + struct strbuf sb = STRBUF_INIT;
>> +
>> + if (flags & ~RELOCATE_GITDIR_RECURSE_SUBMODULES)
>> + die("BUG: we don't know how to pass the flags down?");
>> +
>> + if (get_super_prefix())
>> + strbuf_addstr(&sb, get_super_prefix());
>> + strbuf_addstr(&sb, path);
>> + strbuf_addch(&sb, '/');
>> +
>> + cp.dir = path;
>> + cp.git_cmd = 1;
>> + cp.no_stdin = 1;
>> + argv_array_pushl(&cp.args, "--super-prefix", sb.buf,
>> + "submodule--helper",
>> + "embed-git-dirs", NULL);
>> + prepare_submodule_repo_env(&cp.env_array);
>> + if (run_command(&cp))
>> + die(_("could not recurse into submodule '%s'"), path);
>> + strbuf_release(&sb);
>> + }
>
> Hmph. We cannot use run_processes_parallel() thing here? Is its
> API too hard to use to be worth it?
The problem here is that we do not know about the names of
the nested submodules. We could do a "git -C path submodule--helper list"
and then use the run_processes_parallel for
git -C <submodule> embed-git-dir <nested-sub1>
git -C <submodule> embed-git-dir <nested-sub2>
git -C <submodule> embed-git-dir <nested-sub3>
etc. As a first approach I considered
git -C <submodule> embed-git-dir <no-pathspec>
to be fast enough as the functionality is not implemented is only
a filesystem-local rename() (fast regardless of directory size).
So if we want to make that relocate git dir aware of
non-atomic-cross-filesystem moves, we want to revisit the decision
to run it in parallel?
>
>> +test_expect_success 'embedding does not fail for deinitalized submodules' '
>> + test_when_finished "git submodule update --init" &&
>> + git submodule deinit --all &&
>> + git submodule embedgitdirs &&
>> + test -d .git/modules/sub1 &&
>> + ! test -f sub1/.git &&
>
> Does this expect "sub1/.git is not a regular file (we want directory
> instead)"? Or "there is no filesystem entity at sub1/.git"?
This mainly ought to test that the new call doesn't crash or segfault
but accepts this as a valid state.
>
> If the former, write "test -d sub1/.git"; if the latter, you
> probably want "! test -e sub1/.git" instead.
However the -e is the correct thing to do.