Re: [PATCH/POC 3/7] setup.c: add split-repo support to .git files

2013-12-22 Thread Duy Nguyen
On Sat, Dec 14, 2013 at 3:43 AM, Jonathan Nieder  wrote:
> Problems:
>
>  * What if I move my worktree with "mv"?  Then I still need the
>    corresponding $GIT_SUPER_DIR/repos/ directory, and nobody told
>    the GIT_SUPER_DIR about it.
>
>  * What if my worktree is on removable media (think "network
>    filesystem") and has just been temporarily unmounted instead of
>    deleted?
>
> So maybe it would be nicer to:
>
>   i. When the worktree is on the same filesystem, keep a *hard link* to
>      some file in the worktree (e.g., the .git file).  If the link count
>      goes down, it is safe to remove the $GIT_SUPER_DIR/repos/
>      directory.

The link count also goes down to 1 if I move the worktree to a
different filesystem, and it's not safe to remove
$GIT_SUPER_DIR/repos/ in that case, I think.

>  ii. When the worktree is on another filesystem, always keep
>      $GIT_SUPER_DIR/repos/ unless the user decides to manually
>      remove it.  Provide documentation or a command to list basic
>      information about $GIT_SUPER_DIR/repos directories (path to
>      worktree, what branch they're on, etc).
>
> (i) doesn't require any futureproofing.  As soon as someone wants it,
> they can implement the check and fall back to (ii) for worktrees
> without the magic hard link.
>
> (ii) would benefit from recording the working tree directory as a
> possibly unreliable, human-friendly reminder.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/POC 3/7] setup.c: add split-repo support to .git files

2013-12-13 Thread Duy Nguyen
On Sat, Dec 14, 2013 at 3:43 AM, Jonathan Nieder  wrote:
> Junio C Hamano wrote:
>
>>  - Do we want to record where the working tree directory is in
>>    $GIT_SUPER_DIR/repos/ somewhere?  Would it help to have such
>>    a record?
>
> That could be nice for the purpose of garbage collecting them.  I fear
> that for users it is too tempting to remove a worktree with "rm -rf"
> without considering the relationship from the parent repo that might
> be making walking through all reflogs slower or holding on to objects
> no one cares about any more.
>
> I imagine it would work like this:
>
>  1. At worktree creation time, full path to the working tree directory
>     is stored in $GIT_SUPER_DIR/repos/.
>
>  2. "git gc" notices that the worktree is missing and writes a file
>     under $GIT_SUPER_DIR/repos/ with a timestamp, saying so.
>
>  3. If the worktree still hasn't existed for a month, "git gc" deletes
>     the corresponding $GIT_SUPER_DIR/repos/ directory.

I was thinking about doing something like this in "git prune" but
manually. Your idea sounds nicer.

> Problems:
>
>  * What if I move my worktree with "mv"?  Then I still need the
>    corresponding $GIT_SUPER_DIR/repos/ directory, and nobody told
>    the GIT_SUPER_DIR about it.

We could store $GIT_SUPER_DIR as a relative path. That way if you move
it, you break it. When you fix it, hopefully you remember to fix the
link in repos/ too.

Alternatively, the setup code could be taught to verify that
$GIT_SUPER_DIR/repos/id/ actually points to the current
worktree. If not, warn the user or something.
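
A minimal sketch of that check, assuming repos/id/ records the worktree
location as a path and that comparing resolved paths is good enough.
The function name is hypothetical and none of this is in the patch; it
uses plain POSIX realpath() rather than git's own helpers:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return 1 if the recorded worktree path and the current worktree
 * resolve to the same location, 0 otherwise. A path that cannot be
 * resolved (for example because the worktree was moved away) counts
 * as a mismatch.
 */
int worktree_record_matches(const char *recorded, const char *current)
{
	char *a, *b;
	int same;

	assert(recorded && current);
	a = realpath(recorded, NULL);	/* malloc'ed by realpath, or NULL */
	b = realpath(current, NULL);
	same = a && b && !strcmp(a, b);

	free(a);
	free(b);
	return same;
}
```

On a mismatch the caller would only warn, as suggested above, rather
than die.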

>  * What if my worktree is on removable media (think "network
>    filesystem") and has just been temporarily unmounted instead of
>    deleted?

Or we keep updating a timestamp in repos/ to note the last time this
worktree was used. "gc" or "prune" would warn about repos that have
been unused for a certain amount of time, but not remove them
automatically. This could be iii. in your list below.

> So maybe it would be nicer to:
>
>   i. When the worktree is on the same filesystem, keep a *hard link* to
>      some file in the worktree (e.g., the .git file).  If the link count
>      goes down, it is safe to remove the $GIT_SUPER_DIR/repos/
>      directory.

This can still break if the file is updated by writing a new version
and then renaming it over the old one, although I can't think why
anybody (or anything) would want to do that to a .git file. This does
not work on Windows, though.

>  ii. When the worktree is on another filesystem, always keep
>      $GIT_SUPER_DIR/repos/ unless the user decides to manually
>      remove it.  Provide documentation or a command to list basic
>      information about $GIT_SUPER_DIR/repos directories (path to
>      worktree, what branch they're on, etc).

And on Windows, a new partition means a new drive, so it works there too.

>
> (i) doesn't require any futureproofing.  As soon as someone wants it,
> they can implement the check and fall back to (ii) for worktrees
> without the magic hard link.
>
> (ii) would benefit from recording the working tree directory as a
> possibly unreliable, human-friendly reminder.
>
>>  - How would this interact with core.worktree in .git/config of that
>>    "super" repository?
>
> Eek.

I'll see if I can ignore core.worktree when $GIT_SUPER_DIR is set. If
not, ban this use case :)
-- 
Duy


Re: [PATCH/POC 3/7] setup.c: add split-repo support to .git files

2013-12-13 Thread Jonathan Nieder
Junio C Hamano wrote:

>  - Do we want to record where the working tree directory is in
>    $GIT_SUPER_DIR/repos/ somewhere?  Would it help to have such
>    a record?

That could be nice for the purpose of garbage collecting them.  I fear
that for users it is too tempting to remove a worktree with "rm -rf"
without considering the relationship from the parent repo that might
be making walking through all reflogs slower or holding on to objects
no one cares about any more.

I imagine it would work like this:

 1. At worktree creation time, full path to the working tree directory
    is stored in $GIT_SUPER_DIR/repos/.

 2. "git gc" notices that the worktree is missing and writes a file
    under $GIT_SUPER_DIR/repos/ with a timestamp, saying so.

 3. If the worktree still hasn't existed for a month, "git gc" deletes
    the corresponding $GIT_SUPER_DIR/repos/ directory.
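
The decision in steps 2 and 3 could be sketched roughly as below. All
names are hypothetical, "a month" is approximated as 30 days, and a
missing_since value of 0 stands for "no marker file written yet":

```c
#include <assert.h>
#include <time.h>

enum repo_action { REPO_KEEP, REPO_MARK_MISSING, REPO_DELETE };

/* Assumption: "a month" means 30 days. */
#define MISSING_GRACE_SECONDS (30L * 24 * 60 * 60)

/*
 * Decide what "git gc" should do with one repos/ entry.
 *
 * worktree_exists: nonzero if the recorded worktree path is present
 * missing_since:   timestamp from the marker file, 0 if there is none
 * now:             current time
 */
enum repo_action gc_decide(int worktree_exists, time_t missing_since,
			   time_t now)
{
	if (worktree_exists)
		return REPO_KEEP;		/* caller may drop a stale marker */
	if (!missing_since)
		return REPO_MARK_MISSING;	/* step 2: write the timestamp */
	if (now - missing_since >= MISSING_GRACE_SECONDS)
		return REPO_DELETE;		/* step 3: gone for a month */
	return REPO_KEEP;			/* missing, still in grace period */
}
```

Keeping the decision separate from the filesystem probing makes it easy
to see that nothing is deleted on the first run that notices a missing
worktree.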

Problems:

 * What if I move my worktree with "mv"?  Then I still need the
   corresponding $GIT_SUPER_DIR/repos/ directory, and nobody told
   the GIT_SUPER_DIR about it.

 * What if my worktree is on removable media (think "network
   filesystem") and has just been temporarily unmounted instead of
   deleted?

So maybe it would be nicer to:

  i. When the worktree is on the same filesystem, keep a *hard link* to
     some file in the worktree (e.g., the .git file).  If the link count
     goes down, it is safe to remove the $GIT_SUPER_DIR/repos/
     directory.

 ii. When the worktree is on another filesystem, always keep
     $GIT_SUPER_DIR/repos/ unless the user decides to manually
     remove it.  Provide documentation or a command to list basic
     information about $GIT_SUPER_DIR/repos directories (path to
     worktree, what branch they're on, etc).

(i) doesn't require any futureproofing.  As soon as someone wants it,
they can implement the check and fall back to (ii) for worktrees
without the magic hard link.

(ii) would benefit from recording the working tree directory as a
possibly unreliable, human-friendly reminder.
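
For (i), the check itself would be little more than a stat() on the
copy of the link kept under $GIT_SUPER_DIR/repos/. A rough sketch with
hypothetical names, not taken from any patch:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>

enum link_state { LINK_UNKNOWN, LINK_ALIVE, LINK_ORPHANED };

/* More than one name for the inode means the worktree side still exists. */
enum link_state classify_nlink(nlink_t nlink)
{
	return nlink > 1 ? LINK_ALIVE : LINK_ORPHANED;
}

/*
 * link_path is our side of the hard link, stored under repos/.
 * LINK_UNKNOWN means there is no usable link, so the caller should
 * fall back to the conservative behaviour described in (ii).
 */
enum link_state worktree_link_state(const char *link_path)
{
	struct stat st;

	assert(link_path);
	if (stat(link_path, &st) || !S_ISREG(st.st_mode))
		return LINK_UNKNOWN;
	return classify_nlink(st.st_nlink);
}
```

Note that LINK_ORPHANED only says the other name for the file is gone,
not why it is gone.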

>  - How would this interact with core.worktree in .git/config of that
>    "super" repository?

Eek.

Thanks,
Jonathan


Re: [PATCH/POC 3/7] setup.c: add split-repo support to .git files

2013-12-13 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> If a .git file contains
>
> gitsuper: 
> gitdir: 
>
> then we set GIT_SUPER_DIR to  and GIT_DIR to
> $GIT_SUPER_DIR/repos/.

I initially thought: "what is with that complexity? isn't it just
the matter of replacing 'gitdir: ' with 'gitsuper: '
stored in the file .git???"

Until I realized that there is nowhere to keep per-workdir data if
we only had .git as a pointer, and that is why you have that 
thing.  It would have helped me avoid that confusion if the above
description was followed by:

The latter, $GIT_SUPER_DIR/repos/, is a directory,
underneath which per-work-dir items like index, HEAD, logs/HEAD
(what else?) reside.

or something like that.  And $GIT_SUPER_DIR/repos/*/HEAD, especially
when they are detached, plus $GIT_SUPER_DIR/repos/*/index, will work
as the starting point of object reachability scanning when running
repack, fsck, etc.
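
To make the two-line format concrete, here is a standalone sketch of
parsing such a .git file, assuming the "gitsuper: " line carries the
super repository path and the "gitdir: " line carries the repo id. It
is not the code from the patch; the function name and the sample values
in the usage note are made up for illustration:

```c
#include <assert.h>
#include <string.h>

/*
 * Parse a two-line .git file held in buf, modifying buf in place.
 * On success, *super and *id point into buf and 0 is returned.
 * Returns -1 if either line is missing, misnamed or empty.
 */
int parse_split_gitfile(char *buf, const char **super, const char **id)
{
	static const char super_key[] = "gitsuper: ";
	static const char dir_key[] = "gitdir: ";
	char *nl, *end;

	assert(buf && super && id);
	if (strncmp(buf, super_key, strlen(super_key)))
		return -1;
	nl = strchr(buf, '\n');
	if (!nl)
		return -1;		/* no second line at all */
	*nl = '\0';
	*super = buf + strlen(super_key);

	buf = nl + 1;
	if (strncmp(buf, dir_key, strlen(dir_key)))
		return -1;
	*id = buf + strlen(dir_key);

	/* Strip trailing newlines and carriage returns from the id. */
	end = buf + strlen(buf);
	while (end > *id && (end[-1] == '\n' || end[-1] == '\r'))
		*--end = '\0';

	return (**super && **id) ? 0 : -1;
}
```

Given a buffer holding "gitsuper: /srv/super\ngitdir: wt1\n", this
yields "/srv/super" and "wt1", and the per-workdir directory would then
be /srv/super/repos/wt1.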

A few more random thoughts...

 - Reusing "gitdir:" for this purpose is not advisable; use a
   different name.  This  is used to identify a workdir, so
   perhaps "gitworkdir: " might be a better name;

 - Do we want to record where the working tree directory is in
   $GIT_SUPER_DIR/repos/ somewhere?  Would it help to have such
   a record?

 - How would this interact with core.worktree in .git/config of that
   "super" repository?



[PATCH/POC 3/7] setup.c: add split-repo support to .git files

2013-12-11 Thread Nguyễn Thái Ngọc Duy
If a .git file contains

gitsuper: 
gitdir: 

then we set GIT_SUPER_DIR to  and GIT_DIR to
$GIT_SUPER_DIR/repos/.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 cache.h |  1 +
 setup.c | 40 +++++++++++++++++++++++++++++++++-------
 2 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/cache.h b/cache.h
index 823582f..f85ee70 100644
--- a/cache.h
+++ b/cache.h
@@ -410,6 +410,7 @@ extern const char *get_git_namespace(void);
 extern const char *strip_namespace(const char *namespaced_ref);
 extern const char *get_git_work_tree(void);
 extern const char *read_gitfile(const char *path);
+extern const char *read_gitfile_super(const char *path, char **super);
 extern const char *resolve_gitdir(const char *suspect);
 extern void set_git_work_tree(const char *tree);
 
diff --git a/setup.c b/setup.c
index 5432a31..84362a6 100644
--- a/setup.c
+++ b/setup.c
@@ -281,16 +281,23 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
 /*
  * Try to read the location of the git directory from the .git file,
  * return path to git directory if found.
+ *
+ * If "gitsuper: " line is found and super is not NULL, *super points
+ * to the absolute path of the given path. The next line contains the
+ * repo id.
  */
-const char *read_gitfile(const char *path)
+const char *read_gitfile_super(const char *path, char **super)
 {
-	char *buf;
+	struct strbuf sb = STRBUF_INIT;
+	char *buf, *to_free;
 	char *dir;
 	const char *slash;
 	struct stat st;
 	int fd;
 	ssize_t len;
 
+	if (super)
+		*super = NULL;
 	if (stat(path, &st))
 		return NULL;
 	if (!S_ISREG(st.st_mode))
@@ -298,12 +305,19 @@ const char *read_gitfile(const char *path)
 	fd = open(path, O_RDONLY);
 	if (fd < 0)
 		die_errno("Error opening '%s'", path);
-	buf = xmalloc(st.st_size + 1);
+	to_free = buf = xmalloc(st.st_size + 1);
 	len = read_in_full(fd, buf, st.st_size);
 	close(fd);
 	if (len != st.st_size)
 		die("Error reading %s", path);
 	buf[len] = '\0';
+	if (super && !prefixcmp(buf, "gitsuper: ")) {
+		char *p = strchr(buf, '\n');
+		*super = buf + strlen("gitsuper: ");
+		*p = '\0';
+		len -= (p + 1) - buf;
+		buf = p + 1;
+	}
 	if (prefixcmp(buf, "gitdir: "))
 		die("Invalid gitfile format: %s", path);
 	while (buf[len - 1] == '\n' || buf[len - 1] == '\r')
@@ -312,6 +326,11 @@ const char *read_gitfile(const char *path)
 		die("No path in gitfile: %s", path);
 	buf[len] = '\0';
 	dir = buf + 8;
+	if (super && *super) {
+		strbuf_addf(&sb, "%s/repos/%s", *super, dir);
+		dir = sb.buf;
+		*super = xstrdup(real_path(*super));
+	}
 
 	if (!is_absolute_path(dir) && (slash = strrchr(path, '/'))) {
 		size_t pathlen = slash+1 - path;
@@ -320,18 +339,25 @@ const char *read_gitfile(const char *path)
 		strncpy(dir, path, pathlen);
 		strncpy(dir + pathlen, buf + 8, len - 8);
 		dir[dirlen] = '\0';
-		free(buf);
-		buf = dir;
+		free(to_free);
+		to_free = buf = dir;
 	}
 
-	if (!is_git_directory(dir))
+	if (!is_git_directory_super(dir, super ? *super : dir))
 		die("Not a git repository: %s", dir);
 	path = real_path(dir);
 
-	free(buf);
+	free(to_free);
+	strbuf_release(&sb);
 	return path;
 }
 
+const char *read_gitfile(const char *path)
+{
+	return read_gitfile_super(path, NULL);
+}
+
+
 static const char *setup_explicit_git_dir(const char *gitdirenv,
 					  char *cwd, int len,
 					  int *nongit_ok)
-- 
1.8.5.1.77.g42c48fa
