Re: [PATCH/POC 3/7] setup.c: add split-repo support to .git files
On Sat, Dec 14, 2013 at 3:43 AM, Jonathan Nieder wrote:
> Problems:
>
>  * What if I move my worktree with "mv"?  Then I still need the
>    corresponding $GIT_SUPER_DIR/repos/ directory, and nobody told
>    the GIT_SUPER_DIR about it.
>
>  * What if my worktree is on removable media (think "network
>    filesystem") and has just been temporarily unmounted instead of
>    deleted?
>
> So maybe it would be nicer to:
>
>  i. When the worktree is on the same filesystem, keep a *hard link* to
>     some file in the worktree (e.g., the .git file).  If the link count
>     goes down, it is safe to remove the $GIT_SUPER_DIR/repos/
>     directory.

Link count goes down to 1 if I move the worktree to a different
filesystem and it's not safe to remove $GIT_SUPER_DIR/repos/ in this
case, I think.

>  ii. When the worktree is on another filesystem, always keep
>      $GIT_SUPER_DIR/repos/ unless the user decides to manually
>      remove it.  Provide documentation or a command to list basic
>      information about $GIT_SUPER_DIR/repos directories (path to
>      worktree, what branch they're on, etc).
>
> (i) doesn't require any futureproofing.  As soon as someone wants it,
> they can implement the check and fall back to (ii) for worktrees
> without the magic hard link.
>
> (ii) would benefit from recording the working tree directory as a
> possibly unreliable, human-friendly reminder.
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
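
The link-count concern in this message can be made concrete with a small C sketch. It is not part of the patch series: the names classify_nlink(), check_kept_link() and the link_verdict enum are hypothetical, and the sketch assumes the super repository keeps a hard link to the worktree's .git file somewhere under repos/. Only POSIX stat(2) is used.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical verdicts; none of these names come from the patch. */
enum link_verdict {
	LINK_UNKNOWN,	/* the kept link itself could not be stat()ed */
	LINK_SHARED,	/* link count >= 2: the worktree file still shares the inode */
	LINK_ALONE	/* link count < 2: worktree deleted OR moved to another filesystem */
};

/* Pure classification of a link count, kept separate so it is easy to test. */
enum link_verdict classify_nlink(nlink_t nlink)
{
	return nlink >= 2 ? LINK_SHARED : LINK_ALONE;
}

/* Look at the hard link kept on the super repository side. */
enum link_verdict check_kept_link(const char *kept_link)
{
	struct stat st;

	assert(kept_link != NULL);
	if (stat(kept_link, &st))
		return LINK_UNKNOWN;
	return classify_nlink(st.st_nlink);
}
```

A caller would presumably treat LINK_ALONE as a hint only: a cross-filesystem "mv" copies the file and unlinks the original, so the count drops to 1 exactly as it does after "rm -rf", and the two cases cannot be told apart from the link count alone.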
Re: [PATCH/POC 3/7] setup.c: add split-repo support to .git files
On Sat, Dec 14, 2013 at 3:43 AM, Jonathan Nieder wrote:
> Junio C Hamano wrote:
>
>>  - Do we want to record where the working tree directory is in
>>    $GIT_SUPER_DIR/repos/ somewhere?  Would it help to have such
>>    a record?
>
> That could be nice for the purpose of garbage collecting them.  I fear
> that for users it is too tempting to remove a worktree with "rm -rf"
> without considering the relationship from the parent repo that might
> be making walking through all reflogs slower or holding on to objects
> no one cares about any more.
>
> I imagine it would work like this:
>
>  1. At worktree creation time, full path to the working tree directory
>     is stored in $GIT_SUPER_DIR/repos/.
>
>  2. "git gc" notices that the worktree is missing and writes a file
>     under $GIT_SUPER_DIR/repos/ with a timestamp, saying so.
>
>  3. If the worktree still hasn't existed for a month, "git gc" deletes
>     the corresponding $GIT_SUPER_DIR/repos/ directory.

I was thinking about doing something like this in "git prune" but
manually. Your idea sounds nicer.

> Problems:
>
>  * What if I move my worktree with "mv"?  Then I still need the
>    corresponding $GIT_SUPER_DIR/repos/ directory, and nobody told
>    the GIT_SUPER_DIR about it.

We could store $GIT_SUPER_DIR as a relative path. That way if you move
it, you break it. When you fix it, hopefully you remember to fix the
link in repos/ too.

Alternatively, the setup code could be taught to verify that
$GIT_SUPER_DIR/repos/id/ actually points to the current worktree. If
not, warn the user or something.

>  * What if my worktree is on removable media (think "network
>    filesystem") and has just been temporarily unmounted instead of
>    deleted?

Or we keep updating a timestamp in repos/ to note the last used time
of this worktree. "gc" or "prune" would warn about unused repos after
a certain amount of time, but not remove them automatically. This
could be iii. in your list below.

> So maybe it would be nicer to:
>
>  i. When the worktree is on the same filesystem, keep a *hard link* to
>     some file in the worktree (e.g., the .git file).  If the link count
>     goes down, it is safe to remove the $GIT_SUPER_DIR/repos/
>     directory.

This can still break if the file is updated by creating a new version
and then renaming it over the old one, although I can't think why
anybody (or anything) would want to do that to the .git file. This
does not work on Windows though.

>  ii. When the worktree is on another filesystem, always keep
>      $GIT_SUPER_DIR/repos/ unless the user decides to manually
>      remove it.  Provide documentation or a command to list basic
>      information about $GIT_SUPER_DIR/repos directories (path to
>      worktree, what branch they're on, etc).

And on Windows, a new partition means a new drive, so it works there
too.

> (i) doesn't require any futureproofing.  As soon as someone wants it,
> they can implement the check and fall back to (ii) for worktrees
> without the magic hard link.
>
> (ii) would benefit from recording the working tree directory as a
> possibly unreliable, human-friendly reminder.
>
>>  - How would this interact with core.worktree in .git/config of that
>>    "super" repository?
>
> Eek.

I'll see if I can ignore core.worktree when $GIT_SUPER_DIR is set. If
not, ban this use case :)
--
Duy
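
The verification idea in this message (having the setup code check that $GIT_SUPER_DIR/repos/id/ still points at the current worktree) might be sketched in C as follows. This is an illustration only: worktree_matches() and trimmed_len() are hypothetical names, and the sketch assumes both paths are already absolute and normalized, which real code would have to arrange first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of a path with trailing slashes ignored (a lone "/" is kept). */
static size_t trimmed_len(const char *path)
{
	size_t n = strlen(path);

	while (n > 1 && path[n - 1] == '/')
		n--;
	return n;
}

/*
 * Hypothetical check: does the worktree path recorded under repos/id/
 * name the directory we are currently running in?  Returns 1 on a
 * match and 0 otherwise.  Both arguments are assumed to be absolute,
 * already-normalized paths.
 */
int worktree_matches(const char *recorded, const char *current)
{
	size_t a, b;

	assert(recorded != NULL && current != NULL);
	a = trimmed_len(recorded);
	b = trimmed_len(current);
	return a == b && !memcmp(recorded, current, a);
}
```

On a mismatch the caller would only warn, as suggested above, since a moved worktree is still a perfectly usable worktree.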
Re: [PATCH/POC 3/7] setup.c: add split-repo support to .git files
Junio C Hamano wrote:

>  - Do we want to record where the working tree directory is in
>    $GIT_SUPER_DIR/repos/ somewhere?  Would it help to have such
>    a record?

That could be nice for the purpose of garbage collecting them.  I fear
that for users it is too tempting to remove a worktree with "rm -rf"
without considering the relationship from the parent repo that might
be making walking through all reflogs slower or holding on to objects
no one cares about any more.

I imagine it would work like this:

 1. At worktree creation time, full path to the working tree directory
    is stored in $GIT_SUPER_DIR/repos/.

 2. "git gc" notices that the worktree is missing and writes a file
    under $GIT_SUPER_DIR/repos/ with a timestamp, saying so.

 3. If the worktree still hasn't existed for a month, "git gc" deletes
    the corresponding $GIT_SUPER_DIR/repos/ directory.

Problems:

 * What if I move my worktree with "mv"?  Then I still need the
   corresponding $GIT_SUPER_DIR/repos/ directory, and nobody told
   the GIT_SUPER_DIR about it.

 * What if my worktree is on removable media (think "network
   filesystem") and has just been temporarily unmounted instead of
   deleted?

So maybe it would be nicer to:

 i. When the worktree is on the same filesystem, keep a *hard link* to
    some file in the worktree (e.g., the .git file).  If the link count
    goes down, it is safe to remove the $GIT_SUPER_DIR/repos/
    directory.

 ii. When the worktree is on another filesystem, always keep
     $GIT_SUPER_DIR/repos/ unless the user decides to manually
     remove it.  Provide documentation or a command to list basic
     information about $GIT_SUPER_DIR/repos directories (path to
     worktree, what branch they're on, etc).

(i) doesn't require any futureproofing.  As soon as someone wants it,
they can implement the check and fall back to (ii) for worktrees
without the magic hard link.

(ii) would benefit from recording the working tree directory as a
possibly unreliable, human-friendly reminder.

>  - How would this interact with core.worktree in .git/config of that
>    "super" repository?

Eek.

Thanks,
Jonathan
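
The three-step "git gc" scheme from this message can be written down as a small decision function. The C sketch below is only an illustration under stated assumptions: decide_prune() and the prune_action enum are hypothetical names, a missing_since value of 0 stands for "no marker file has been written yet", and the one-month grace period is passed in by the caller rather than fixed.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical outcomes of one "git gc" pass over a repos/ entry. */
enum prune_action {
	PRUNE_KEEP,		/* leave the entry alone */
	PRUNE_MARK_MISSING,	/* step 2: record a timestamp saying it is missing */
	PRUNE_DELETE		/* step 3: grace period expired, remove the entry */
};

/*
 * worktree_exists: nonzero if the recorded worktree path is present.
 * missing_since:   timestamp from the marker file, or 0 if none exists.
 * now, grace:      current time and grace period, both in seconds.
 */
enum prune_action decide_prune(int worktree_exists, time_t missing_since,
			       time_t now, time_t grace)
{
	assert(grace >= 0);
	if (worktree_exists)
		return PRUNE_KEEP;	/* caller may also clear a stale marker */
	if (missing_since == 0)
		return PRUNE_MARK_MISSING;
	if (now - missing_since >= grace)
		return PRUNE_DELETE;
	return PRUNE_KEEP;		/* missing, but still within the grace period */
}
```

The sketch inherits both problems listed above: a moved or temporarily unmounted worktree looks exactly like a deleted one, so worktree_exists being false is not proof that the entry is garbage.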
Re: [PATCH/POC 3/7] setup.c: add split-repo support to .git files
Nguyễn Thái Ngọc Duy writes:

> If a .git file contains
>
>  gitsuper: <path>
>  gitdir: <id>
>
> then we set GIT_SUPER_DIR to <path> and GIT_DIR to
> $GIT_SUPER_DIR/repos/<id>.

I initially thought: "what is with that complexity?  isn't it just
the matter of replacing 'gitdir: ' with 'gitsuper: ' stored in the
file .git???"

Until I realized that there is nowhere to keep per-workdir data if
we only had .git as a pointer, and that is why you have that thing.

It would have helped me avoid that confusion if the above
description was followed by:

    The latter, $GIT_SUPER_DIR/repos/<id>, is a directory, underneath
    which per-work-dir items like index, HEAD, logs/HEAD (what
    else?) reside.

or something like that.

And $GIT_SUPER_DIR/repos/*/HEAD, especially when they are detached,
plus $GIT_SUPER_DIR/repos/*/index, will work as the starting point
of object reachability scanning when running repack, fsck, etc.

A few more random thoughts...

 - Reusing "gitdir:" for this purpose is not advisable; use a
   different name.  This is used to identify a workdir, so perhaps
   "gitworkdir: " might be a better name;

 - Do we want to record where the working tree directory is in
   $GIT_SUPER_DIR/repos/ somewhere?  Would it help to have such
   a record?

 - How would this interact with core.worktree in .git/config of that
   "super" repository?
[PATCH/POC 3/7] setup.c: add split-repo support to .git files
If a .git file contains

 gitsuper: <path>
 gitdir: <id>

then we set GIT_SUPER_DIR to <path> and GIT_DIR to
$GIT_SUPER_DIR/repos/<id>.

Signed-off-by: Nguyễn Thái Ngọc Duy
---
 cache.h |  1 +
 setup.c | 40 +---
 2 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/cache.h b/cache.h
index 823582f..f85ee70 100644
--- a/cache.h
+++ b/cache.h
@@ -410,6 +410,7 @@ extern const char *get_git_namespace(void);
 extern const char *strip_namespace(const char *namespaced_ref);
 extern const char *get_git_work_tree(void);
 extern const char *read_gitfile(const char *path);
+extern const char *read_gitfile_super(const char *path, char **super);
 extern const char *resolve_gitdir(const char *suspect);
 extern void set_git_work_tree(const char *tree);
 
diff --git a/setup.c b/setup.c
index 5432a31..84362a6 100644
--- a/setup.c
+++ b/setup.c
@@ -281,16 +281,23 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
 /*
  * Try to read the location of the git directory from the .git file,
  * return path to git directory if found.
+ *
+ * If "gitsuper: " line is found and super is not NULL, *super points
+ * to the absolute path of the given path. The next line contains the
+ * repo id.
  */
-const char *read_gitfile(const char *path)
+const char *read_gitfile_super(const char *path, char **super)
 {
-	char *buf;
+	struct strbuf sb = STRBUF_INIT;
+	char *buf, *to_free;
 	char *dir;
 	const char *slash;
 	struct stat st;
 	int fd;
 	ssize_t len;
 
+	if (super)
+		*super = NULL;
 	if (stat(path, &st))
 		return NULL;
 	if (!S_ISREG(st.st_mode))
@@ -298,12 +305,19 @@ const char *read_gitfile(const char *path)
 	fd = open(path, O_RDONLY);
 	if (fd < 0)
 		die_errno("Error opening '%s'", path);
-	buf = xmalloc(st.st_size + 1);
+	to_free = buf = xmalloc(st.st_size + 1);
 	len = read_in_full(fd, buf, st.st_size);
 	close(fd);
 	if (len != st.st_size)
 		die("Error reading %s", path);
 	buf[len] = '\0';
+	if (super && !prefixcmp(buf, "gitsuper: ")) {
+		char *p = strchr(buf, '\n');
+		*super = buf + strlen("gitsuper: ");
+		*p = '\0';
+		len -= (p + 1) - buf;
+		buf = p + 1;
+	}
 	if (prefixcmp(buf, "gitdir: "))
 		die("Invalid gitfile format: %s", path);
 	while (buf[len - 1] == '\n' || buf[len - 1] == '\r')
@@ -312,6 +326,11 @@ const char *read_gitfile(const char *path)
 		die("No path in gitfile: %s", path);
 	buf[len] = '\0';
 	dir = buf + 8;
+	if (super && *super) {
+		strbuf_addf(&sb, "%s/repos/%s", *super, dir);
+		dir = sb.buf;
+		*super = xstrdup(real_path(*super));
+	}
 
 	if (!is_absolute_path(dir) && (slash = strrchr(path, '/'))) {
 		size_t pathlen = slash+1 - path;
@@ -320,18 +339,25 @@ const char *read_gitfile(const char *path)
 		strncpy(dir, path, pathlen);
 		strncpy(dir + pathlen, buf + 8, len - 8);
 		dir[dirlen] = '\0';
-		free(buf);
-		buf = dir;
+		free(to_free);
+		to_free = buf = dir;
 	}
 
-	if (!is_git_directory(dir))
+	if (!is_git_directory_super(dir, super ? *super : dir))
 		die("Not a git repository: %s", dir);
 	path = real_path(dir);
 
-	free(buf);
+	free(to_free);
+	strbuf_release(&sb);
 	return path;
 }
 
+const char *read_gitfile(const char *path)
+{
+	return read_gitfile_super(path, NULL);
+}
+
+
 static const char *setup_explicit_git_dir(const char *gitdirenv,
 					  char *cwd, int len,
 					  int *nongit_ok)
--
1.8.5.1.77.g42c48fa
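
For readers who want to try the two-line file format outside of git, here is a standalone C sketch of the path resolution the patch describes: a "gitsuper: " line followed by a "gitdir: " line resolves to the super path, then "/repos/", then the id. It is not the patch's code and uses no git internals; resolve_split_gitdir() is a hypothetical name, and the sample paths in the usage note are made up. Unlike the posted hunk, which writes through the result of strchr(buf, '\n') without checking it, the sketch rejects a file that has no newline after the "gitsuper: " line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse the split-repo form of a .git file and
 * write "<super>/repos/<id>" into out.  Returns 0 on success and -1
 * if the content is malformed or the result does not fit in out.
 * A plain "gitdir: " file (no "gitsuper: " line) is rejected here,
 * although the patch itself still accepts that form.
 */
int resolve_split_gitdir(const char *content, char *out, size_t outlen)
{
	static const char super_key[] = "gitsuper: ";
	static const char dir_key[] = "gitdir: ";
	const char *super, *nl, *id;
	size_t super_len, id_len;
	int n;

	assert(content != NULL && out != NULL);
	if (strncmp(content, super_key, strlen(super_key)))
		return -1;
	super = content + strlen(super_key);
	nl = strchr(super, '\n');
	if (!nl)
		return -1;	/* no second line at all */
	super_len = (size_t)(nl - super);
	if (super_len && super[super_len - 1] == '\r')
		super_len--;	/* tolerate CRLF line endings */
	if (strncmp(nl + 1, dir_key, strlen(dir_key)))
		return -1;
	id = nl + 1 + strlen(dir_key);
	id_len = strcspn(id, "\r\n");	/* drop trailing line terminators */
	if (!super_len || !id_len)
		return -1;
	n = snprintf(out, outlen, "%.*s/repos/%.*s",
		     (int)super_len, super, (int)id_len, id);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Given the content "gitsuper: /srv/super\ngitdir: wt1\n", the function fills out with "/srv/super/repos/wt1". The real function additionally canonicalizes the super path with real_path() and verifies the result with is_git_directory_super(), neither of which is modeled here.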