Re: [PATCH] Alternate object pool mechanism updates.

2005-08-16 Thread Linus Torvalds


On Sun, 14 Aug 2005, Junio C Hamano wrote:

> Linus Torvalds [EMAIL PROTECTED] writes:
> 
> > I think this is great - especially for places like kernel.org, where a lot 
> > of repos end up being related to each other, yet independent.
> 
> Yes.  There is one shortcoming in the current git-clone -s in
> the proposed updates branch.  If the parent repository has
> alternates on its own, that information should be copied to the
> cloned one as well (e.g. Jeff has alternates pointing at you,
> and I clone from Jeff with -s flag --- I should list not just
> Jeff but also you to borrow from in my alternates file).

Btw, looking at the code, it strikes me that using : to separate the 
alternate object directories in the file is rather strange.

Maybe allow a different format for the file? Or at least allow '\n' as an 
alternate separator (but it would be nice to allow comments too).
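
To make the suggestion concrete, here is a minimal sketch of a parser that
accepts either ':' or '\n' as the separator and skips comment lines.  It is
an illustration only, not code from the patch: the names parse_alternates
and struct alt_span are hypothetical, and treating '#' at the start of a
line as the comment marker is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: one alternate directory as a slice of the input. */
struct alt_span {
	const char *start;
	size_t len;
};

/*
 * Hypothetical helper, not git's code.  Splits buf[0..len) into entries
 * separated by ':' or '\n'.  A '#' as the first character of a line
 * starts a comment that runs to the end of that line (an assumed syntax).
 * Empty elements are ignored.  Stores at most max entries in out and
 * returns the total number of entries found.
 */
static size_t parse_alternates(const char *buf, size_t len,
			       struct alt_span *out, size_t max)
{
	const char *cp = buf, *ep = buf + len;
	size_t count = 0;
	int at_line_start = 1;

	while (cp < ep) {
		const char *last;

		if (at_line_start && *cp == '#') {
			/* Skip the comment, stopping at the newline. */
			while (cp < ep && *cp != '\n')
				cp++;
			continue;
		}
		last = cp;
		while (cp < ep && *cp != ':' && *cp != '\n')
			cp++;
		if (cp != last) {
			if (count < max) {
				out[count].start = last;
				out[count].len = (size_t)(cp - last);
			}
			count++;
		}
		if (cp < ep) {
			at_line_start = (*cp == '\n');
			cp++;	/* step over the separator */
		}
	}
	return count;
}
```

Because ':' stays a valid separator, a file written in the existing
colon-separated format would still parse the same way under this sketch.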

Finally, I have to say that that info directory is confusing. Namely,
there's two of them - the git info and the object info directories are
totally different directories - maybe logical, but to me it smells like
info is here a code-name for misc files that don't make sense anywhere
else.

Anyway, I don't think alternates is necessarily sensible as an object 
information. Sure, it's about alternate objects, but the thing is, object 
directories can be shared across many projects, but alternates to me 
makes more sense as a per-project thing.

What this all is leading up to is that I think we'd be better off with a 
totally new git config file, in .git/config, and we'd have all the 
startup configuration there. Including things like alternate object 
directories, perhaps standard preferences for that particular repo, and 
things like the grafts thing.

Wouldn't that be nice?

Linus
-
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Alternate object pool mechanism updates.

2005-08-16 Thread Daniel Barkalow
On Tue, 16 Aug 2005, Linus Torvalds wrote:

> Finally, I have to say that that info directory is confusing. Namely,
> there's two of them - the git info and the object info directories are
> totally different directories - maybe logical, but to me it smells like
> info is here a code-name for misc files that don't make sense anywhere
> else.
>
> What this all is leading up to is that I think we'd be better off with a
> totally new git config file, in .git/config, and we'd have all the
> startup configuration there. Including things like alternate object
> directories, perhaps standard preferences for that particular repo, and
> things like the grafts thing.
>
> Wouldn't that be nice?

I'd originally proposed the .git/info directory because I keep multiple
working trees for the same repository, by having symlinks for .git/objects
and .git/refs, and I could also get other per-repository things to be
shared properly without knowing exactly what they are if they're in a
subdirectory of .git that could be a symlink. This would mean that a
.git/config would be per-working-tree, like .git/index or .git/HEAD, not
per-repository like .git/info/config. Of course, the core didn't have
anything to go in .git/info at the time, so it didn't really get nailed
down.

(I find it convenient to have mainline and my latest work both checked out
for reference while I'm generating a series of commits for a patch set,
and I don't want three different repositories which could be out of sync;
this also keeps the repository safely out of pwd, since I have the actual
repositories as ~/git/{project}.git/)

-Daniel
*This .sig left intentionally blank*


Re: [PATCH] Alternate object pool mechanism updates.

2005-08-16 Thread Junio C Hamano
Linus Torvalds [EMAIL PROTECTED] writes:

> Btw, looking at the code, it strikes me that using : to separate the 
> alternate object directories in the file is rather strange.

Yes, I admit that one was done in a quick and dirty way.  Patches
welcome [*1*] ;-)

> Anyway, I don't think alternates is necessarily sensible as an object 
> information. Sure, it's about alternate objects, but the thing is, object 
> directories can be shared across many projects, but alternates to me 
> makes more sense as a per-project thing.

Well, I have to think about this a bit more, but I have to say
there was some thinking behind the way things are right now.

$GIT_DIR/info describes properties of the repository.  That's
why refs, graft and rev-cache go there.

$GIT_OBJECT_DIRECTORY/info describes the properties of the
object pool (I am inventing words as I speak, but an object pool
is a directory that can be combined with other object pools to
make an object database).  So objects/info/packs talks about the
packs in it, but not about packs it borrows from its alternates.
The alternates file in question talks about what other object
pools you need to consult to obtain the objects it refers to but
it lacks itself.  If two repositories share a particular object
pool as their .git/objects directory, by symlinking .git/objects
or by using the GIT_OBJECT_DIRECTORY environment variable, it does
not matter from which repository you look at this object pool.  The set of
objects it refers to but lacks itself, and from which other
pools these objects can be obtained, do not depend on from which
repository you are looking at it.  While I agree with everything
you said about maybe logical but confusing, I have to disagree
with you about this one.
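
The model described above can be sketched as a lookup that tries the
repository's own pool first and then each pool named in its alternates
file.  This is an illustration, not git's code: loose_object_path and
find_object_pool are hypothetical names, and the sketch only looks at
loose objects (one file per object under a two-hex-digit fan-out
directory); it ignores packs entirely.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: build "<pool>/<first 2 hex digits>/<other 38>"
 * in out.  Returns 0 on success, -1 if hex is not 40 characters long
 * or the result does not fit in outlen bytes.
 */
static int loose_object_path(const char *pool, const char *hex,
			     char *out, size_t outlen)
{
	int n;

	if (strlen(hex) != 40)
		return -1;
	n = snprintf(out, outlen, "%s/%.2s/%s", pool, hex, hex + 2);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Hypothetical helper: pools[0] is the object pool itself and the rest
 * are the pools listed in its alternates file.  Returns the index of
 * the first pool that holds the loose object, or -1 if none does.
 * The answer depends only on the pool list, not on which repository
 * happens to be using the pool.
 */
static int find_object_pool(const char *const *pools, size_t npools,
			    const char *hex)
{
	char path[4096];
	size_t i;

	for (i = 0; i < npools; i++) {
		if (loose_object_path(pools[i], hex, path, sizeof(path)) == 0 &&
		    access(path, F_OK) == 0)
			return (int)i;
	}
	return -1;
}
```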

> What this all is leading up to is that I think we'd be better off with a 
> totally new git config file, in .git/config, and we'd have all the 
> startup configuration there.

I think what _is_ lacking is an easy way to have per repository
configuration that can be shared among opt-in developers.  The
graft file naturally falls into this category, and probably the
Porcelain standard .git/info/exclude file as well.  Although we
ended up doing .git/hooks, it is a per repository thing and
logically it _could_ be moved to .git/info/hooks [*2*].  And
that might also be a nice thing to share among opt-in
developers.

[Footnote]

*1* Sorry I could not resist --- I always wanted to say this.

*2* I do not think we _should_ move it under .git/info, though.



Re: [PATCH] Alternate object pool mechanism updates.

2005-08-16 Thread Junio C Hamano
Linus Torvalds [EMAIL PROTECTED] writes:

> We've got a git prune-packed, it would be good to have a git
> prune-alternate or something equivalent.

If you have the GIT_ALTERNATE_OBJECT_DIRECTORIES environment
variable set, git prune-packed will remove objects from your
repository if they are found in somebody else's pack.  I am not
sure if this is the behaviour we would want.



Re: [PATCH] Alternate object pool mechanism updates.

2005-08-14 Thread Junio C Hamano
Linus Torvalds [EMAIL PROTECTED] writes:

> I think this is great - especially for places like kernel.org, where a lot 
> of repos end up being related to each other, yet independent.

Yes.  There is one shortcoming in the current git-clone -s in
the proposed updates branch.  If the parent repository has
alternates on its own, that information should be copied to the
cloned one as well (e.g. Jeff has alternates pointing at you,
and I clone from Jeff with -s flag --- I should list not just
Jeff but also you to borrow from in my alternates file).
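
A minimal sketch of that fix-up, assuming the colon-separated format and
absolute paths throughout (relative entries in the parent's file would
need resolving against the parent first).  compose_clone_alternates is a
hypothetical name, not something in git-clone, and the paths in the
usage note are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the alternates contents for a repository
 * cloned with -s.  The result lists the parent's object directory
 * first, followed by whatever the parent's own alternates file names.
 * parent_alternates may be NULL or empty when the parent borrows from
 * nobody.  Returns 0 on success, -1 if the result does not fit.
 */
static int compose_clone_alternates(const char *parent_objects,
				    const char *parent_alternates,
				    char *out, size_t outlen)
{
	int n;

	if (parent_alternates && *parent_alternates)
		n = snprintf(out, outlen, "%s:%s",
			     parent_objects, parent_alternates);
	else
		n = snprintf(out, outlen, "%s", parent_objects);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With a parent at /pub/jeff/.git/objects whose alternates file contains
/pub/linus/.git/objects, the clone would end up listing both, Jeff's
directory first.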

> However, exactly for places like kernel.org it would _also_ be nice if
> there was some way to prune objects that have been merged back into the
> parent.

Yes.  Another possibility is to use git-relink which was written
exactly to solve this in a different way.



[PATCH] Alternate object pool mechanism updates.

2005-08-13 Thread Junio C Hamano
It was a mistake to use GIT_ALTERNATE_OBJECT_DIRECTORIES
environment variable to specify what alternate object pools to
look for missing objects when working with an object database.
It is not a property of the process running the git commands,
but a property of the object database that is partial and needs
other object pools to complete the set of objects it lacks.

This patch allows you to have a $GIT_OBJECT_DIRECTORY/info/alt
file whose contents are in exactly the same format as the
environment variable, to let an object database name the alternate
object pools it depends on.

Signed-off-by: Junio C Hamano [EMAIL PROTECTED]
---

 cache.h  |5 +-
 fsck-cache.c |8 ++-
 sha1_file.c  |  146 --
 3 files changed, 88 insertions(+), 71 deletions(-)

8150a422f79cc461316052b52263289b851d4820
diff --git a/cache.h b/cache.h
--- a/cache.h
+++ b/cache.h
@@ -278,9 +278,10 @@ struct checkout {
 extern int checkout_entry(struct cache_entry *ce, struct checkout *state);
 
 extern struct alternate_object_database {
-	char *base;
+	struct alternate_object_database *next;
 	char *name;
-} *alt_odb;
+	char base[0]; /* more */
+} *alt_odb_list;
 extern void prepare_alt_odb(void);
 
 extern struct packed_git {
diff --git a/fsck-cache.c b/fsck-cache.c
--- a/fsck-cache.c
+++ b/fsck-cache.c
@@ -456,13 +456,13 @@ int main(int argc, char **argv)
 	fsck_head_link();
 	fsck_object_dir(get_object_directory());
 	if (check_full) {
-		int j;
+		struct alternate_object_database *alt;
 		struct packed_git *p;
 		prepare_alt_odb();
-		for (j = 0; alt_odb[j].base; j++) {
+		for (alt = alt_odb_list; alt; alt = alt->next) {
 			char namebuf[PATH_MAX];
-			int namelen = alt_odb[j].name - alt_odb[j].base;
-			memcpy(namebuf, alt_odb[j].base, namelen);
+			int namelen = alt->name - alt->base;
+			memcpy(namebuf, alt->base, namelen);
 			namebuf[namelen - 1] = 0;
 			fsck_object_dir(namebuf);
 		}
diff --git a/sha1_file.c b/sha1_file.c
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -222,84 +222,100 @@ char *sha1_pack_index_name(const unsigne
return base;
 }
 
-struct alternate_object_database *alt_odb;
+struct alternate_object_database *alt_odb_list;
+static struct alternate_object_database **alt_odb_tail;
 
 /*
  * Prepare alternate object database registry.
- * alt_odb points at an array of struct alternate_object_database.
- * This array is terminated with an element that has both its base
- * and name set to NULL.  alt_odb[n] comes from n'th non-empty
- * element from colon separated ALTERNATE_DB_ENVIRONMENT environment
- * variable, and its base points at a statically allocated buffer
- * that contains /the/directory/corresponding/to/.git/objects/...,
- * while its name points just after the slash at the end of
- * .git/objects/ in the example above, and has enough space to hold
- * 40-byte hex SHA1, an extra slash for the first level indirection,
- * and the terminating NUL.
- * This function allocates the alt_odb array and all the strings
- * pointed by base fields of the array elements with one xmalloc();
- * the string pool immediately follows the array.
+ *
+ * The variable alt_odb_list points at the list of struct
+ * alternate_object_database.  The elements on this list come from
+ * non-empty elements from colon separated ALTERNATE_DB_ENVIRONMENT
+ * environment variable, and $GIT_OBJECT_DIRECTORY/info/alt file,
+ * whose contents is exactly in the same format as that environment
+ * variable.  Its base points at a statically allocated buffer that
+ * contains /the/directory/corresponding/to/.git/objects/..., while
+ * its name points just after the slash at the end of .git/objects/
+ * in the example above, and has enough space to hold 40-byte hex
+ * SHA1, an extra slash for the first level indirection, and the
+ * terminating NUL.
  */
-void prepare_alt_odb(void)
+static void link_alt_odb_entries(const char *alt, const char *ep)
 {
-	int pass, totlen, i;
 	const char *cp, *last;
-	char *op = NULL;
-	const char *alt = gitenv(ALTERNATE_DB_ENVIRONMENT) ? : "";
+	struct alternate_object_database *ent;
+
+	last = alt;
+	do {
+		for (cp = last; cp < ep && *cp != ':'; cp++)
+			;
+		if (last != cp) {
+			/* 43 = 40-byte + 2 '/' + terminating NUL */
+			int pfxlen = cp - last;
+			int entlen = pfxlen + 43;
+
+			ent = xmalloc(sizeof(*ent) + entlen);
+			*alt_odb_tail = ent;
+			alt_odb_tail = &(ent->next);
+			ent->next = NULL;
+
+			memcpy(ent->base, last, pfxlen);
+  
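
The patch text is cut off at this point in the archive.  As a rough
illustration of the layout its comment describes (base holds the pool
directory, name points just past the trailing slash, and 43 spare bytes
cover the hashed part of the path), here is a standalone sketch.  It is
not the remainder of the patch: struct alt_entry, alt_entry_new and
alt_entry_path are hypothetical names, plain malloc stands in for
xmalloc, and a C99 flexible array member stands in for base[0].

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct alternate_object_database. */
struct alt_entry {
	struct alt_entry *next;
	char *name;	/* points into base[], just past "<prefix>/" */
	char base[];	/* C99 flexible array member */
};

/* Allocate an entry whose base starts with prefix[0..pfxlen) and '/'. */
static struct alt_entry *alt_entry_new(const char *prefix, size_t pfxlen)
{
	/* 43 = 40 hex digits + 2 '/' + terminating NUL, as in the patch */
	struct alt_entry *ent = malloc(sizeof(*ent) + pfxlen + 43);

	if (!ent)
		return NULL;
	ent->next = NULL;
	memcpy(ent->base, prefix, pfxlen);
	ent->base[pfxlen] = '/';
	ent->name = ent->base + pfxlen + 1;
	ent->name[0] = '\0';
	return ent;
}

/*
 * Write "xx/<remaining 38 hex digits>" at ent->name so that ent->base
 * reads as the full path of the loose object.  Returns ent->base, or
 * NULL if hex is not 40 characters long.
 */
static const char *alt_entry_path(struct alt_entry *ent, const char *hex)
{
	if (strlen(hex) != 40)
		return NULL;
	memcpy(ent->name, hex, 2);
	ent->name[2] = '/';
	memcpy(ent->name + 3, hex + 2, 38);
	ent->name[41] = '\0';
	return ent->base;
}
```

Keeping the prefix and the scratch space in one allocation means each
lookup only rewrites the 41 characters after name instead of rebuilding
the whole path.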

Re: [PATCH] Alternate object pool mechanism updates.

2005-08-13 Thread Petr Baudis
Dear diary, on Sat, Aug 13, 2005 at 11:09:13AM CEST, I got a letter
where Junio C Hamano [EMAIL PROTECTED] told me that...
> It was a mistake to use GIT_ALTERNATE_OBJECT_DIRECTORIES
> environment variable to specify what alternate object pools to
> look for missing objects when working with an object database.
> It is not a property of the process running the git commands,
> but a property of the object database that is partial and needs
> other object pools to complete the set of objects it lacks.
> 
> This patch allows you to have $GIT_OBJECT_DIRECTORY/info/alt
> file whose contents is in exactly the same format as the
> environment variable, to let an object database name alternate
> object pools it depends on.
> 
> Signed-off-by: Junio C Hamano [EMAIL PROTECTED]

What about calling it info/alternates (or info/alternate) instead? It
looks better, sounds better, is more namespace-ecological, tab-completes
fine, and you don't type it that often anyway. :-)

-- 
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
If you want the holes in your knowledge showing up try teaching
someone.  -- Alan Cox