Re: Mozilla SHA1 implementation

2005-04-22 Thread Paul Mackerras
Linus Torvalds writes:

> I've just integrated the Mozilla SHA1 library implementation that Adgar
> Toernig sent me into the standard git archive (but I did the integration
> differently).

Here is a new PPC SHA1 patch that integrates better with this...

> Interestingly, the Mozilla SHA1 code is about twice as fast as the openssl
> code on my G5, and judging by the disassembly, it's because it's much
> simpler. I think the openssl people have unrolled all the loops totally,
> which tends to be a disaster on any half-way modern CPU. But hey, it could
> be something as simple as optimization flags too.

Very interesting.  On my G4 powerbook (since I am at LCA), for a
fsck-cache on a linux-2.6 tree, it takes 6.6 seconds with the openssl
SHA1, 10.7 seconds with the Mozilla SHA1, and ~5.8 seconds with my
SHA1.  I'll test it on a G5 tonight, hopefully.

Paul.

diff -urN git.orig/Makefile git/Makefile
--- git.orig/Makefile   2005-04-22 16:23:44.0 +1000
+++ git/Makefile	2005-04-22 16:43:31.0 +1000
@@ -34,9 +34,14 @@
   SHA1_HEADER=mozilla-sha1/sha1.h
   LIB_OBJS += mozilla-sha1/sha1.o
 else
+ifdef PPC_SHA1
+  SHA1_HEADER=ppc/sha1.h
+  LIB_OBJS += ppc/sha1.o ppc/sha1ppc.o
+else
   SHA1_HEADER=openssl/sha.h
   LIBS += -lssl
 endif
+endif
 
 CFLAGS += '-DSHA1_HEADER=$(SHA1_HEADER)'
 
@@ -77,7 +82,7 @@
 write-tree.o: $(LIB_H)
 
 clean:
-	rm -f *.o mozilla-sha1/*.o $(PROG) $(LIB_FILE)
+	rm -f *.o mozilla-sha1/*.o ppc/*.o $(PROG) $(LIB_FILE)
 
 backup: clean
 	cd .. ; tar czvf dircache.tar.gz dir-cache
diff -urN git.orig/ppc/sha1.c git/ppc/sha1.c
--- /dev/null   2005-04-04 12:56:19.0 +1000
+++ git/ppc/sha1.c  2005-04-22 16:29:19.0 +1000
@@ -0,0 +1,72 @@
+/*
+ * SHA-1 implementation.
+ *
+ * Copyright (C) 2005 Paul Mackerras [EMAIL PROTECTED]
+ *
+ * This version assumes we are running on a big-endian machine.
+ * It calls an external sha1_core() to process blocks of 64 bytes.
+ */
+#include <stdio.h>
+#include <string.h>
+#include "sha1.h"
+
+extern void sha1_core(uint32_t *hash, const unsigned char *p,
+		      unsigned int nblocks);
+
+int SHA1_Init(SHA_CTX *c)
+{
+	c->hash[0] = 0x67452301;
+	c->hash[1] = 0xEFCDAB89;
+	c->hash[2] = 0x98BADCFE;
+	c->hash[3] = 0x10325476;
+	c->hash[4] = 0xC3D2E1F0;
+	c->len = 0;
+	c->cnt = 0;
+	return 0;
+}
+
+int SHA1_Update(SHA_CTX *c, const void *ptr, unsigned long n)
+{
+	unsigned long nb;
+	const unsigned char *p = ptr;
+
+	c->len += n << 3;
+	while (n != 0) {
+		if (c->cnt || n < 64) {
+			nb = 64 - c->cnt;
+			if (nb > n)
+				nb = n;
+			memcpy(&c->buf.b[c->cnt], p, nb);
+			if ((c->cnt += nb) == 64) {
+				sha1_core(c->hash, c->buf.b, 1);
+				c->cnt = 0;
+			}
+		} else {
+			nb = n >> 6;
+			sha1_core(c->hash, p, nb);
+			nb <<= 6;
+		}
+		n -= nb;
+		p += nb;
+	}
+	return 0;
+}
+
+int SHA1_Final(unsigned char *hash, SHA_CTX *c)
+{
+	unsigned int cnt = c->cnt;
+
+	c->buf.b[cnt++] = 0x80;
+	if (cnt > 56) {
+		if (cnt < 64)
+			memset(&c->buf.b[cnt], 0, 64 - cnt);
+		sha1_core(c->hash, c->buf.b, 1);
+		cnt = 0;
+	}
+	if (cnt < 56)
+		memset(&c->buf.b[cnt], 0, 56 - cnt);
+	c->buf.l[7] = c->len;
+	sha1_core(c->hash, c->buf.b, 1);
+	memcpy(hash, c->hash, 20);
+	return 0;
+}
diff -urN git.orig/ppc/sha1.h git/ppc/sha1.h
--- /dev/null   2005-04-04 12:56:19.0 +1000
+++ git/ppc/sha1.h  2005-04-22 16:45:28.0 +1000
@@ -0,0 +1,20 @@
+/*
+ * SHA-1 implementation.
+ *
+ * Copyright (C) 2005 Paul Mackerras [EMAIL PROTECTED]
+ */
+#include <stdint.h>
+
+typedef struct sha_context {
+	uint32_t hash[5];
+	uint32_t cnt;
+	uint64_t len;
+	union {
+		unsigned char b[64];
+		uint64_t l[8];
+	} buf;
+} SHA_CTX;
+
+int SHA1_Init(SHA_CTX *c);
+int SHA1_Update(SHA_CTX *c, const void *p, unsigned long n);
+int SHA1_Final(unsigned char *hash, SHA_CTX *c);
diff -urN git.orig/ppc/sha1ppc.S git/ppc/sha1ppc.S
--- /dev/null   2005-04-04 12:56:19.0 +1000
+++ git/ppc/sha1ppc.S   2005-04-22 16:29:19.0 +1000
@@ -0,0 +1,185 @@
+/*
+ * SHA-1 implementation for PowerPC.
+ *
+ * Copyright (C) 2005 Paul Mackerras.
+ */
+#define FS 80
+
+/*
+ * We roll the registers for T, A, B, C, D, E around on each
+ * iteration; T on iteration t is A on iteration t+1, and so on.
+ * We use registers 7 - 12 for this.
+ */
+#define RT(t)	((((t)+5)%6)+7)
+#define RA(t)	((((t)+4)%6)+7)
+#define RB(t)	((((t)+3)%6)+7)
+#define RC(t)	((((t)+2)%6)+7)
+#define RD(t)	((((t)+1)%6)+7)
+#define RE(t)  

Re: [ANNOUNCE] git-pasky-0.6.3 request for testing

2005-04-22 Thread Greg KH
On Fri, Apr 22, 2005 at 05:09:31AM +0200, Petr Baudis wrote:
>   Hello,
>
>   FYI, I've released git-pasky-0.6.3 earlier in the night.

Hm, fun thing to try:
go into a kernel git tree.
rm Makefile
git diff

Watch it as it thinks that every Makefile in the kernel tree is now
gone...

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ANNOUNCE] git-pasky-0.6.3 request for testing

2005-04-22 Thread Barry K. Nathan
With git-pasky 0.6.3, git log is unusable on my Mandrake 10.1 system.
Basically I get a neverending flood of these until I press 'q' to quit
less:

/home/barryn/softbag/git-pasky-0.6.3/gitlog.sh: line 73:  7598 Segmentation faul
t  sed -re '
/ *Signed-off-by.*/Is//'$colsignoff''$c
oldefault'/
s/^//
'
/home/barryn/softbag/git-pasky-0.6.3/gitlog.sh: line 73:  7609 Segmentation faul
t  sed -re '
/ *Signed-off-by.*/Is//'$colsignoff''$c
oldefault'/
s/^//
'
/home/barryn/softbag/git-pasky-0.6.3/gitlog.sh: line 73:  7620 Segmentation faul
t  sed -re '
/ *Signed-off-by.*/Is//'$colsignoff''$c
oldefault'/
s/^//
'

git-pasky-0.6.2 works fine.

I'm not sure if I have time tonight (or tomorrow) to troubleshoot this
further, but I'll see if I can.

-Barry K. Nathan [EMAIL PROTECTED]



Re: Mozilla SHA1 implementation

2005-04-22 Thread Paul Mackerras
Linus Torvalds writes:

> Interestingly, the Mozilla SHA1 code is about twice as fast as the openssl
> code on my G5, and judging by the disassembly, it's because it's much
> simpler. I think the openssl people have unrolled all the loops totally,
> which tends to be a disaster on any half-way modern CPU. But hey, it could
> be something as simple as optimization flags too.

Which gcc version are you using?

I get the opposite result on my 2GHz G5: the Mozilla version does
45MB/s, the openssl version does 135MB/s, and my version does 218MB/s.
The time for a fsck-cache on a linux-2.6 tree (cache hot) is 8.0
seconds for the Mozilla version, 5.2 seconds for the openssl version,
and 4.4 seconds for my version.

Paul.


proposal: delta based git archival

2005-04-22 Thread Michel Lespinasse
I noticed people on this mailing list starting to talk about using blob deltas
for compression, and the basic issue that the resulting files are too small
for efficient filesystem storage. I thought about this a little and decided
I should send out my ideas for discussion.

In my proposal, the current git object storage model (one compressed object
per file) remains the primary storage mechanism; however, there would be
some kind of backup mechanism based on multiple deltas grouped in one file.

For example, suppose you're looking for an object with a hash of
eab75ce51622aa312bb0b03572d43769f420c347

First you'd look at .git/objects/ea/b75ce51622aa312bb0b03572d43769f420c347 -
if the file exists, that's your object.

If the file does not exist, you'd then look for .git/deltas/ea/b,
.git/deltas/ea/b7, .git/deltas/ea/b75, .git/deltas/ea/b75c, ...
up to some maximum search path length. You stop at the first file you can
find.
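
The lookup order described above can be sketched in C. This is only an illustration of the proposal, not code from git: the names delta_candidate_path and find_delta_file, the exists callback, and the hard-coded .git/deltas prefix are all assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write the path of one delta-file candidate for a 40-character hex hash
 * into out.  n is the number of hex digits kept after the two-digit
 * directory name, so n = 1 gives ".git/deltas/ea/b" and n = 2 gives
 * ".git/deltas/ea/b7".  Returns 0 on success, -1 on bad arguments or if
 * out is too small.
 */
int delta_candidate_path(const char *hex, size_t n, char *out, size_t outsz)
{
	int len;

	if (strlen(hex) != 40 || n < 1 || n > 38)
		return -1;
	len = snprintf(out, outsz, ".git/deltas/%.2s/%.*s", hex, (int)n, hex + 2);
	if (len < 0 || (size_t)len >= outsz)
		return -1;
	return 0;
}

/*
 * Try candidates from the shortest prefix to the longest and stop at the
 * first one for which exists() returns nonzero.  Returns the prefix length
 * that matched (with its path left in out), or 0 if nothing was found
 * within max_len digits.
 */
size_t find_delta_file(const char *hex, size_t max_len,
		       int (*exists)(const char *path),
		       char *out, size_t outsz)
{
	size_t n;

	for (n = 1; n <= max_len; n++) {
		if (delta_candidate_path(hex, n, out, outsz) < 0)
			break;
		if (exists(out))
			return n;
	}
	return 0;
}
```

A caller would first check .git/objects/ea/b75c... as usual and only fall back to find_delta_file() when that file is missing, passing a callback built on something like access() or stat().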

Supposing that file is .git/deltas/ea/b7, it would contain a diff
(let's assume unified format for now, though ideally it'd be better to
have something that allows binary file deltas too) of many archived
objects with hashes starting with eab7, compared to a different object
(presumably some direct or indirect ancestor):

diff -u 8f5ba0203e31204c5c052d995a5b4449226bcfb5 eab75ce51622aa312bb0b03572d43769f420c347
--- 8f5ba0203e31204c5c052d995a5b4449226bcfb5
+++ eab75ce51622aa312bb0b03572d43769f420c347
@@ -522,7 +522,7 @@

diff -u 77dc2cb94930017f62b55b9706cbadda8c90f650 eab71c51dbc62797d6c903203de44cc6a734c05c
--- 77dc2cb94930017f62b55b9706cbadda8c90f650
+++ eab71c51dbc62797d6c903203de44cc6a734c05c
@@ -560,13 +563,17 @@
...

Based on this delta file, we'd then look for the object
8f5ba0203e31204c5c052d995a5b4449226bcfb5 (this process could require
recursively rebuilding that object) and try to build
eab75ce51622aa312bb0b03572d43769f420c347 by applying the delta and then
double checking the hash.

To me the strengths of this proposal would be:
* It does not muddy the git object model - it just acts independently of it,
  as a way to rebuild git objects from deltas
* Old objects can be compressed by creating a delta with a close ancestor,
  then erasing the original file storage for that object. The object delta
  can be appended to an existing delta file (which avoids the small-file
  storage issue), or if the delta file gets too big, it can be split off
  into 16 smaller files based on the hashes of the objects this file stores
  deltas for.
* The system is flexible enough to explore different delta
  strategies. For example one could decide to keep one object every 10
  in the database and store the other 9 as deltas based on the immediate
  object ancestor, or any other tradeoff - and the system would still
  work the same (with different performance tradeoffs though).

Does this sound insane ? Too complicated maybe ?

Is there any kind of semi-standard binary-capable multiple-file diff format
that could be used for this application instead of unified diffs ?

-- 
Michel Walken Lespinasse
Bill Gates is a monocle and a Persian cat away from being the villain
in a James Bond movie. -- Dennis Miller


Re: GIT_INDEX_FILE environment variable

2005-04-22 Thread Zach Welch
Howdy,

Linus Torvalds wrote:
> On Thu, 21 Apr 2005, Junio C Hamano wrote:
>> I am thinking about an alternative way of doing the above by
>> some modifications to the git core.  I think the root of this
>> problem is that there is no equivalent to GIT_INDEX_FILE and
>> SHA1_FILE_DIRECTORY that tells the core git where the project
>> top directory (i.e. the root of the working tree that
>> corresponds to what $GIT_INDEX_FILE describes) is.
>
> I'd _really_ prefer to just try to teach people to work from the top
> directory instead.

Would it be okay if that were settable on a per-repository basis? :)
Or do you have a specific subset of operations you want restricted?

>> - A new environment variable GIT_WORKING_TREE points at the
>>   root of the working tree.
[snip]
> I really don't like it that much, but to some degree it obviously is
> exactly what --prefix= does to checkout-cache. It's basically saying
> that all normal file operations have to be prefixed with a magic string.

I'm going to script it one way or the other, but the environment route
allows me to set things up after a fork and before exec in Perl. This
works regardless of what git command I'm running, and should work even
with ithreads. This ease of use would not be the case with the
'--prefix' solution, as scripting the commands would require passing
arguments to those commands that need/support them at a higher level
than is desirable.

At present, I have implemented Yogi to support being able to run
commands from a different working directory than the root of the
repository, and that behavior might be per-repository settable
(someday). If I had my way, I would like to see git support the
following variables:

  GIT_WORKING_DIRECTORY   - defaults to '.'
  GIT_CACHE_DIRECTORY     - defaults to ${GIT_WORKING_DIRECTORY}/.git
  GIT_OBJECT_DIRECTORY    - defaults to ${GIT_CACHE_DIRECTORY}/objects

The reasoning is simple: One object repository can be shared among
numerous working caches, which can be shared among multiple working
directories (e.g. any directories under the project root, but maybe also
import/exports, or other magic...). There are two layers of one to many
relationships between the three classes of directories, and my scripts
want to make use of that flexibility to the hilt.

Also, do you really think git will only ever have the index file, and
not someday possibly other related bits? (You may have said that
elsewhere, but I missed it.) If that's ever the case, the directory
variable is the way to go; scripts can be forward compatible and won't
risk accidentally mingling repository data when their scripts have only
set GIT_INDEX_FILE and not GIT_SOME_OTHER_FILE.

That said, I think GIT_INDEX_FILE would supplement the above scheme
nicely, overriding a default of ${GIT_CACHE_DIRECTORY}/index, because of
use cases you've described.
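
The chain of defaults proposed above can be sketched in C as follows. This is a hypothetical illustration only: git does not provide a resolve_git_dirs() function, the struct and the 256-byte buffers are assumptions, and of the four variables only GIT_INDEX_FILE is named as existing in this thread.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The three directories plus the index file, after defaults are applied. */
struct git_dirs {
	char working[256];
	char cache[256];
	char objects[256];
	char index[256];
};

/* Copy an environment variable, or the fallback if it is unset or empty. */
static int env_or(const char *name, const char *fallback,
		  char *out, size_t outsz)
{
	const char *v = getenv(name);
	size_t len;

	if (!v || !*v)
		v = fallback;
	len = strlen(v);
	if (len >= outsz)
		return -1;
	memcpy(out, v, len + 1);
	return 0;
}

/* Each default is derived from the directory one level above it. */
int resolve_git_dirs(struct git_dirs *d)
{
	char def[300];

	if (env_or("GIT_WORKING_DIRECTORY", ".", d->working, sizeof(d->working)) < 0)
		return -1;
	snprintf(def, sizeof(def), "%s/.git", d->working);
	if (env_or("GIT_CACHE_DIRECTORY", def, d->cache, sizeof(d->cache)) < 0)
		return -1;
	snprintf(def, sizeof(def), "%s/objects", d->cache);
	if (env_or("GIT_OBJECT_DIRECTORY", def, d->objects, sizeof(d->objects)) < 0)
		return -1;
	snprintf(def, sizeof(def), "%s/index", d->cache);
	return env_or("GIT_INDEX_FILE", def, d->index, sizeof(d->index));
}
```

Because every default hangs off the level above, a script that sets only GIT_CACHE_DIRECTORY after fork and before exec moves the object directory and the index along with it, which is the forward-compatibility argument made above.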

Cheers,

Zach


Re: [PATCH] multi item packed files

2005-04-22 Thread Krzysztof Halasa
Linus Torvalds [EMAIL PROTECTED] writes:

> And dammit, if I'm the original author and likely biggest power-user, and
> _I_ can't be bothered to use special filesystems, then who can? Nobody.

If someone is motivated enough, and if the task is quite trivial (as it
seems to be) someone may try it. I can see nothing wrong with it as long
as it doesn't affect other people.

> This is why I absolutely do not believe in arguments like "if your
> filesystem doesn't do tail packing, you shouldn't use it" or "if your
> don't have name hashing enabled in your filesystem it's broken".

Of course. But one may consider using a filesystem with, say, different
settings. Or a special filesystem for this task, such as CNFS used by
news servers (it seems news servers do much the same as git does,
except they also purge old contents, i.e., container files don't grow).

> I'm perfectly willing to optimize for the common case, but that's as far
> as it goes. I do not want to make fundamental design decisions that depend
> on the target filesystem having some particular feature.

The optimization would be (in) the underlying filesystem (i.e., the OS
thing, or possibly a shared preloaded library?), not git itself.
-- 
Krzysztof Halasa


Re: proposal: delta based git archival

2005-04-22 Thread Jaime Medrano
On 4/22/05, Michel Lespinasse [EMAIL PROTECTED] wrote:
> I noticed people on this mailing list start talking about using blob deltas
> for compression, and the basic issue that the resulting files are too small
> for efficient filesystem storage. I thought about this a little and decided
> I should send out my ideas for discussion.
>

I've been thinking of another, simpler approach.

The main benefit of using deltas is reducing the bandwidth use in
pull/push. My idea is leaving the blob storage as it is now and
adding a new kind of object (remote) that acts as a link to an object
in another repository.

That way, when you rsync, you don't have to get all the blobs (which
can be a lot of data), but only the sha1 of the new objects created.
Then a remote object is created for each new object in the local
repository pointing to its location in the external repository.

Once the rsync is done, when git has to access any of the new objects
they can be fetched from the original location, so that only necessary
objects are transferred.

This way, the cost of a sync in terms of bandwidth is nearly zero.

I've been working on this, so if you think it to be a good idea, I can
send a patch when I get it fully working.

Regards,
Jaime Medrano.
http://jmedrano.sl-form.com


Re: [ANNOUNCE] git-pasky-0.6.3 request for testing

2005-04-22 Thread Petr Baudis
Dear diary, on Fri, Apr 22, 2005 at 09:24:37AM CEST, I got a letter
where Barry K. Nathan [EMAIL PROTECTED] told me that...
> On Fri, Apr 22, 2005 at 12:16:26AM -0700, Barry K. Nathan wrote:
> > With git-pasky 0.6.3, git log is unusable on my Mandrake 10.1 system.
> > Basically I get a neverending flood of these until I press 'q' to quit
> > less:
> [snip sed segmentation faults which happen with 0.6.3 but not 0.6.2]
> > I'm not sure if I have time tonight (or tomorrow) to troubleshoot this
> > further, but I'll see if I can.
>
> I had sed-4.1.1-2mdk. I downloaded sed-4.1.4-2mdk (from Mandriva 2005
> Limited Edition) and updated to that, and the problem went away.
>
> FWIW this is the second package I've had to update to the Mandriva 2005
> LE level (the first was mktemp). I don't mind however.

Duh, segfaulting sed! Could you please check which of the sed
invocations actually segfault for you?

Thanks,

-- 
Petr Pasky Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


First web interface and service API draft

2005-04-22 Thread Christian Meder
Hi,

me again after a couple of hours of sleep ;-)

This probably gets a bit longer so if you are not interested in a web
service api or the web interface now is your chance to get off the
train.

I'm probably making a complete git of myself but that's not uncalled
for in this context ;-)

For those that are still with me let me start by reiterating that
I _do_ care for URIs as the primary API for web service
applications _and_ humans. I probably don't have to tell Linux people
anything about the importance of getting the API right ;-)

As it's fairly early in the web service interface cycle I'd like to change
things around a little bit and start by getting the API straight.

The following considerations should be pretty implementation agnostic
and not specific to wit. The interface should be flexible enough to be
used as a kind of web command line.

---
/project

Ok. The URI should start by stating the project name
e.g. /linux-2.6. This does bloat the URI slightly but I don't think
that we want to have one root namespace per git archive in the long
run. Additionally you can always put rewriting or redirecting rules at
the root level for additional convenience when there's an obvious
default project.

Should provide some meta data, stats, etc. if available.

---
/project/blob/blob-sha1
/project/commit/commit-sha1

These are the easy ones: the web interface should be able to spit out
the plain text data of a blob and a commit at these URIs. Users would
probably be scripts and other download tools.
Open questions:
* Blob data should probably be binary ?
* Should it be commit or changeset ? Linus seems to have changed
nomenclature in the README
* If we serve the pristine commit objects we will put the email
addresses in plain sight. If we remove or change the email addresses
it's not the original commit object anymore. Thoughts ?

---
/project/tree/tree-sha1

Tree objects are served in binary form. Primary audience are scripts,
etc. Human beings will probably get a heart attack when they
accidentally visit this URI.

---
/project/blob/blob-sha1.html
/project/commit/commit-sha1.html
/project/tree/tree-sha1.html

An HTML version of blob, commit and tree, fully linked, aimed at human
beings.

---
/project/tree/tree-sha1.tar.bz2
/project/tree/tree-sha1.tar.gz
/project/commit/commit-sha1.tar.bz2
/project/commit/commit-sha1.tar.gz

Tarballs of the specified commits or trees. Note that these can be
individual subtrees too.


---
/project/tree/tree-sha1/diff/ancestor-tree-sha1

Unified plain text recursive diff of the given trees. I guess the
user could specify any two tree ids but the relevance of the results
would vary greatly ;-)
* Possibly a DoS issue
* does something like /project/tree/tree-sha1/diff/ make sense
producing a full diff from scratch ?  

---
/project/tree/tree-sha1/diff/ancestor-tree-sha1/html

Non recursive HTML view of the objects which are contained in the diff
fully linked with the individual HTML views.

---
/project/blob/blob-sha1/diff/ancestor-sha1

Unified plain text diff of the given blobs.
* again /project/blob/blob-sha1/diff/ sensible ?

---
/project/blob/blob-sha1/diff/ancestor-sha1/html

HTML view (probably colorized) of a single blob diff.

---
/project/changelog/time-spec

HTML changelog for the given time-spec. I think valid values for
timespec should be number of days nnnd, number of entries nnn and
the keyword 'all'.

* perhaps additionally number of hours nnnh, number of months
  nnnm, number of years nnny. Combinations shouldn't be allowed
* time ranges are probably overkill
* is a plain text version needed /project/changelog/time-spec/plain?
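
The three time-spec forms listed above (a day count nnnd, an entry count nnn, and the keyword all) are simple enough to parse by hand. Below is a minimal sketch in C, assuming exactly that grammar; the type and function names are made up for illustration, and the optional hour, month and year suffixes are left out.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum changelog_kind { CL_INVALID, CL_ALL, CL_ENTRIES, CL_DAYS };

struct changelog_spec {
	enum changelog_kind kind;
	unsigned long count;	/* meaningful only for CL_ENTRIES and CL_DAYS */
};

/* Parse "all", "<nnn>" (number of entries) or "<nnn>d" (number of days). */
struct changelog_spec parse_time_spec(const char *s)
{
	struct changelog_spec spec = { CL_INVALID, 0 };
	char *end;

	if (strcmp(s, "all") == 0) {
		spec.kind = CL_ALL;
		return spec;
	}
	/* strtoul() accepts signs and leading blanks, so insist on a digit. */
	if (!isdigit((unsigned char)s[0]))
		return spec;
	errno = 0;
	spec.count = strtoul(s, &end, 10);
	if (errno == ERANGE)
		return spec;		/* number too large */
	if (*end == '\0')
		spec.kind = CL_ENTRIES;
	else if (strcmp(end, "d") == 0)
		spec.kind = CL_DAYS;
	return spec;			/* anything else stays CL_INVALID */
}
```

Rejecting combinations such as 7d3 falls out naturally here, since only a bare number or a number followed by a single d is accepted.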

---
/project/changelog/time-spec/search/regexp

HTML changelog for the given time-spec filtered by the regexp.

* again plain version needed ?

---
/project/changelog/time-spec/search/author/regexp
/project/changelog/time-spec/search/committer/regexp
/project/changelog/time-spec/search/signedoffby/regexp

convenience wrappers for generic search restricted to these fields.

---

open questions:
* how to generate and publish additional merge information ?
* how to generate and publish tree and blob history information ? This
is probably expensive with git.
* how to represent branches ? should we code up the branches in the
project id like linux-2.6-mm or whatever ?


Comments ? Ideas ? Other feedback ?




Christian
  
-- 
Christian Meder, email: [EMAIL PROTECTED]

The Way-Seeking Mind of a tenzo is actualized 
by rolling up your sleeves.

(Eihei Dogen Zenji)



Re: First web interface and service API draft

2005-04-22 Thread Jon Seymour
On 4/22/05, Christian Meder [EMAIL PROTECTED] wrote:

> Comments ? Ideas ? Other feedback ?
>

I'd suggest serving XML rather than HTML and using client side XSLT to
transform it into HTML. Client-side XSLT works well in IE 6 and all
versions of Firefox, so there is no question that it is a mature
technology. Provide a fall back via server transformed HTML if need
be, but that is trivial to do once you have the client-side XSLT
stylesheets.

Serving XML is as easy as serving HTML and gives you a much more
flexible outcome.

jon.


Re: First web interface and service API draft

2005-04-22 Thread Petr Baudis
Dear diary, on Fri, Apr 22, 2005 at 12:41:56PM CEST, I got a letter
where Christian Meder [EMAIL PROTECTED] told me that...
> Hi,

Hi,

> /project
>
> Ok. The URI should start by stating the project name
> e.g. /linux-2.6. This does bloat the URI slightly but I don't think
> that we want to have one root namespace per git archive in the long
> run. Additionally you can always put rewriting or redirecting rules at
> the root level for additional convenience when there's an obvious
> default project.
>
> Should provide some meta data, stats, etc. if available.

I don't think this makes much sense. I think you should just apply -p1
to all the directories, and define that there should be some / page
which should contain some metadata regarding the repository you are
accessing (probably branches, tags, and such).

> ---
> /project/blob/blob-sha1
> /project/commit/commit-sha1
>
> These are the easy ones: the web interface should be able to spit out
> the plain text data of a blob and a commit at these URIs. Users would
> be probably scripts and other downloads.
> Open questions:
> * Blob data should be probably binary ?

What do you mean by binary?

> * Should it be commit or changeset ? Linus seems to have changed
> nomenclature in the README

We call it commit everywhere but in the README. :-)

The changeset name is bad anyway. It is a commit of a complete tree
state; the diff against one of its parent commits is the set of changes.

> ---
> /project/tree/tree-sha1
>
> Tree objects are served in binary form. Primary audience are scripts,
> etc. Human beings will probably get a heart attack when they
> accidentally visit this URI.

Binary form is unusable for scripts.

Anything wrong with putting ls-tree output there?


We should also have /gitobj/sha1 for fetching the raw git objects.

> ---
> /project/blob/blob-sha1.html
> /project/commit/commit-sha1.html
> /project/tree/tree-sha1.html
>
> A HTML version of blob, commit and tree fully linked aimed at human
> beings.

How can I imagine an HTML version of blob?


> ---
> /project/tree/tree-sha1/diff/ancestor-tree-sha1/html
>
> Non recursive HTML view of the objects which are contained in the diff
> fully linked with the individual HTML views.

Why not .html?

> ---
> /project/changelog/time-spec

I'd personally prefer /log/, but whatever.

For consistency, I'd stay with the plaintext output by default, .html if
requested.

And I think abusing directories for this is bad. Query string seems much
more appropriate, since this is something that changes dynamically a
lot, not a permanent resource identifier.

OTOH, I'd use

/log/commit

to specify what commit to start at. It just does not make sense
otherwise, you would not know where to start.

I think the commit should follow the same or similar rules as Cogito
id decoding. E.g. to get latest Linus' changelog, you'd do

/log/linus

> ---
> /project/changelog/time-spec/search/regexp
>
> HTML changelog for the given time-spec filtered by the regexp.
>
> * again plain version needed ?
>
> --
> /project/changelog/time-spec/search/author/regexp
> /project/changelog/time-spec/search/committer/regexp
> /project/changelog/time-spec/search/signedoffby/regexp
>
> convenience wrappers for generic search restricted to these fields.

Same here. Just ?author=...&committer=...&signedoffby=... etc. You can
even combine several criteria.
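
To illustrate the query-string form suggested here, this hedged C sketch assembles a /log/ URI from a starting commit and optional search criteria. The function names are invented for the example, and values are copied verbatim: a real implementation would have to percent-encode them.

```c
#include <stdio.h>
#include <string.h>

/*
 * Append one "key=value" pair to the URI already in buf, using *sep as the
 * separator ('?' for the first parameter, '&' afterwards).  A NULL value
 * means the criterion was not requested and is skipped.
 */
static int append_param(char *buf, size_t bufsz, char *sep,
			const char *key, const char *value)
{
	size_t used = strlen(buf);
	int n;

	if (!value)
		return 0;
	n = snprintf(buf + used, bufsz - used, "%c%s=%s", *sep, key, value);
	if (n < 0 || (size_t)n >= bufsz - used)
		return -1;		/* result would not fit */
	*sep = '&';
	return 0;
}

/* Build "/log/<commit>" followed by whichever criteria are non-NULL. */
int build_log_uri(char *buf, size_t bufsz, const char *commit,
		  const char *author, const char *committer,
		  const char *signedoffby)
{
	char sep = '?';
	int n = snprintf(buf, bufsz, "/log/%s", commit);

	if (n < 0 || (size_t)n >= bufsz)
		return -1;
	if (append_param(buf, bufsz, &sep, "author", author) < 0 ||
	    append_param(buf, bufsz, &sep, "committer", committer) < 0 ||
	    append_param(buf, bufsz, &sep, "signedoffby", signedoffby) < 0)
		return -1;
	return 0;
}
```

Combining criteria then just means passing more than one non-NULL argument; the path part stays a stable identifier for the starting commit while the filters live in the query string.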

> --
>
> open questions:
> * how to generate and publish additional merge information ?

I don't understand

> * how to generate and publish tree and blob history information ? This
> is probably expensive with git.

...this either.

> * how to represent branches ? should we code up the branches in the
> project id like linux-2.6-mm or whatever ?

See above.

-- 
Petr Pasky Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


Re: First web interface and service API draft

2005-04-22 Thread Petr Baudis
Dear diary, on Fri, Apr 22, 2005 at 01:34:45PM CEST, I got a letter
where Jon Seymour [EMAIL PROTECTED] told me that...
> On 4/22/05, Christian Meder [EMAIL PROTECTED] wrote:
>
> > Comments ? Ideas ? Other feedback ?
> >
>
> I'd suggest serving XML rather than HTML and using client side XSLT to
> transform it into HTML. Client-side XSLT works well in IE 6 and all
> versions of Firefox, so there is no question that it is a mature
> technology. Provide a fall back via server transformed HTML if need
> be, but that is trivial to do once you have the client-side XSLT
> stylesheets.
>
> Serving XML is as easy as serving HTML and gives you a much more
> flexible outcome.

Why rather than? Why not in addition to?

You just append either .html or .xml, based on what you want.

-- 
Petr Pasky Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
-
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: First web interface and service API draft

2005-04-22 Thread Jon Seymour
On 4/22/05, Petr Baudis [EMAIL PROTECTED] wrote:
> Dear diary, on Fri, Apr 22, 2005 at 01:34:45PM CEST, I got a letter
> where Jon Seymour [EMAIL PROTECTED] told me that...
> > On 4/22/05, Christian Meder [EMAIL PROTECTED] wrote:
> >
> > > Comments ? Ideas ? Other feedback ?
> >
>
> > I'd suggest serving XML rather than HTML and using client side XSLT to
> > transform it into HTML. ...
>
> Why rather than? Why not in addition to?
>
> You just append either .html or .xml, based on what you want.
>

You are right - there is no good reason that an implementation should
not support both.

From the point of view of a specification, though, I think it would be
useful to focus on an XML content model rather than the details of one
particular HTML model - get the XML model right and you can do
whatever you like with the HTML model at any time after that.

jon.


-- 
homepage: http://www.zeta.org.au/~jon/
blog: http://orwelliantremors.blogspot.com/


Re: [PATCH] git-pasky spec file

2005-04-22 Thread Kevin Smith
Chris Wright wrote:
> Here's a simple spec file to do rpm builds.

(snip)

> Creates a package named git, which seems
> fine since Linus' isn't likely to be packaged directly.

Um. Really? I can't imagine why Linus's git wouldn't be packaged
directly. He has strongly indicated that folks who want to build on top
of it should not expect to see libgit any time soon, so git will be an
important independent tool.

But presumably you'll change the name of this package to cogito soon
anyway, as soon as git-pasky itself is renamed.

Kevin


[patch] fixup GECOS handling

2005-04-22 Thread Martin Schlemmer
Hi,

This still applies - any reason for not doing this?


Thanks,



The GECOS is delimited by ',' or ';', so we should only use whatever is
before the first ',' or ';' for the full name, rather than just
stripping those.

Signed-off-by: Martin Schlemmer [EMAIL PROTECTED]

commit-tree.c: ec53a4565ec0033aaf6df2a48d233ccf4823e8b0
--- 1/commit-tree.c
+++ 2/commit-tree.c	2005-04-18 12:22:18.0 +0200
@@ -96,21 +96,6 @@
 		if (!c)
 			break;
 	}
-
-	/*
-	 * Go back, and remove crud from the end: some people
-	 * have commas etc in their gecos field
-	 */
-	dst--;
-	while (--dst >= p) {
-		unsigned char c = *dst;
-		switch (c) {
-		case ',': case ';': case '.':
-			*dst = 0;
-			continue;
-		}
-		break;
-	}
 }
 
 static const char *month_names[] = {
@@ -313,6 +298,11 @@
 	if (!pw)
 		die("You don't exist. Go away!");
 	realgecos = pw->pw_gecos;
+	/* The name is separated from the room no., tel no, etc via [,;] */
+	if (strchr(realgecos, ','))
+		*strchr(realgecos, ',') = 0;
+	else if (strchr(realgecos, ';'))
+		*strchr(realgecos, ';') = 0;
 	len = strlen(pw->pw_name);
 	memcpy(realemail, pw->pw_name, len);
 	realemail[len] = '@';



-- 
Martin Schlemmer



[git pasky] tarball question

2005-04-22 Thread Martin Schlemmer
Hi,

I understand why you have the git-pasky-0.6.x.tar.bz2 tarballs with
the .git database included as well (btw, great stuff renaming it to
something more distributable), but it's going to be a pita for users of
source-based distros like us (Gentoo), as well as our mirrors, if it
gets much bigger. (Already asked r3pek to add it to portage.)

How about ripping the .git directory from the next release, and just
having an un-numbered tarball (like you used to) that has the latest
snapshot of the .git directory for those that want to do git-pasky
development?  Should even make things easier on your side, as you could
just set up a cron job to update it once a day/whatever.


Thanks,

-- 
Martin Schlemmer





Re: git pull on ia64 linux tree

2005-04-22 Thread Linus Torvalds


On Fri, 22 Apr 2005 [EMAIL PROTECTED] wrote:
 
 git log seems to have problems interpreting the dates ... looking at the
 commit entries, the time is right ... but it appears that git log applies
 the timezone correction twice, so the changes I just applied at 14:46 PDT
 look like I made them at quarter to five tomorrow morning (+14 hours from
 when I did).

Looks like you are right.

The seconds are already in UTC format, so I think git log is wrong to 
pass the UTC seconds in to date, and then tell date that it was done in 
the original timezone.

I think it would be nice to use the TZ data to show the thing in the 
timezone of the committer, though. Dunno how to do that, maybe something 
like

TZ=$tz date -d "1970-01-01 + $sec sec"

or whatever. Sadly, it looks like date doesn't understand timezone 
syntax like that - looks like TZ has to be in the long machine-unreadable 
format like US/Pacific etc. Stupid (either TZ or me - maybe I just 
don't know what the right format is).
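
A rough sketch of the "apply the offset exactly once" idea, in Python rather than shell. It assumes the committer's timezone is available as a numeric offset string such as -0700 (an assumption, not something confirmed in this thread); format_commit_time is a hypothetical name:

```python
from datetime import datetime, timedelta, timezone


def format_commit_time(sec: int, tz: str) -> str:
    """Format UTC epoch seconds in a fixed offset given as '+HHMM' or '-HHMM'."""
    sign = -1 if tz[0] == "-" else 1
    hours, minutes = int(tz[1:3]), int(tz[3:5])
    offset = timezone(sign * timedelta(hours=hours, minutes=minutes))
    # The seconds are already UTC, so the offset is applied once, here only.
    return datetime.fromtimestamp(sec, tz=offset).strftime("%Y-%m-%d %H:%M:%S ") + tz
```

With an illustrative timestamp, format_commit_time(1114206360, "-0700") gives "2005-04-22 14:46:00 -0700", i.e. the committer's local wall-clock time rather than a doubly shifted one.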

Linus


Re: [git pasky] tarball question

2005-04-22 Thread Martin Schlemmer
On Sat, 2005-04-23 at 00:42 +0200, Petr Baudis wrote:
 Dear diary, on Fri, Apr 22, 2005 at 04:31:43PM CEST, I got a letter
 where Martin Schlemmer [EMAIL PROTECTED] told me that...
  Hi,
 
 Hi,
 
  I understand why you have the git-pasky-0.6.x.tar.bz2 tarballs with
  the .git database included as well (btw, great stuff renaming it to
  something more distributable), but its going to be a pita for users of
  source based distro's like us (Gentoo), as well as our mirrors if it
  gets much bigger. (Already asked r3pek to add it to portage).
 
 yes; that was actually the plan, it's just that my memory is so
 volatile...
 

Yep, saw before you posted about the change in URL, thanks.

  How about ripping the .git directory from the next release, and just
  have a un-numbered tarball (like you used to) that have the latest
  snapshot of the .git directory for those that want to do git-pasky
  development?  Should even make things easier your side, as you could
  just do a cron to update it one a day/whatever.
 
 Does it actually make sense to keep a tarball with history? Just build
 git-pasky and do git init. (Or rsync it manually.)
 

Well, I did not know about kernel.org hosting it, so I thought it might
help due to your reasons for initially tarballing the whole thing =)


Thanks,

-- 
Martin Schlemmer





RE: [3/5] Add http-pull

2005-04-22 Thread Luck, Tony

But if you download 1000 files of the 1010 you need, and then your network
goes down, you will need to download those 1000 again when it comes back,
because you can't save them unless you have the full history. 

So you could make the temporary object repository persistent between pulls
to avoid reloading them across the wire.  Something like:

get_commit(sha1)
{
	if (sha1 in real_repo) - done
	if (!(sha1 in tmp_repo))
		load sha1 to tmp_repo
	get_tree(sha1->tree)
	for each parent
		get_commit(sha1->parent)
	move sha1 from tmp_repo to real_repo
}

get_tree(sha1)
{
	if (sha1 in real_repo) - done
	if (!(sha1 in tmp_repo))
		load sha1 to tmp_repo
	for_each (sha1->entry) {
	  case blob: if (!(sha1 in real_repo)) load to real_repo
	  case tree: get_tree()
	}
	move sha1 from tmp_repo to real_repo
}

The "load sha1 to xxx_repo" needs to be smarter than my dumb wget
based script ... it must confirm the sha1 of the object being loaded
before installing (even into the tmp_repo).
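
The pseudocode above can be sketched as runnable Python under loud assumptions: plain dicts stand in for the remote, temporary and real repositories, and objects use a toy text encoding (first line is the type, following lines are "kind sha1" references) named by the SHA-1 of their raw bytes. Real git objects are encoded and named differently, and all function names here are hypothetical.

```python
import hashlib


def sha1_of(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def load(sha1, remote, repo):
    """Fetch one object and confirm its name before installing it."""
    data = remote[sha1]
    if sha1_of(data) != sha1:
        raise ValueError("corrupt object " + sha1)
    repo[sha1] = data


def refs(data: bytes):
    """Yield (kind, sha1) pairs from the toy encoding (skips the type line)."""
    for line in data.decode().splitlines()[1:]:
        kind, name = line.split()
        yield kind, name


def get_tree(sha1, remote, tmp, real):
    if sha1 in real:
        return
    if sha1 not in tmp:
        load(sha1, remote, tmp)
    for kind, name in refs(tmp[sha1]):
        if kind == "blob":
            if name not in real:
                load(name, remote, real)  # blobs depend on nothing
        else:
            get_tree(name, remote, tmp, real)
    real[sha1] = tmp.pop(sha1)  # all dependencies are now in real


def get_commit(sha1, remote, tmp, real):
    if sha1 in real:
        return
    if sha1 not in tmp:
        load(sha1, remote, tmp)
    for kind, name in refs(tmp[sha1]):
        if kind == "tree":
            get_tree(name, remote, tmp, real)
        else:  # "parent"
            get_commit(name, remote, tmp, real)
    real[sha1] = tmp.pop(sha1)
```

If a pull is interrupted, whatever already sits in tmp or real is reused on the next call instead of being fetched again. The recursion mirrors the pseudocode; a long history would need an explicit stack instead.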

-Tony



[PATCH] git-pasky: Add .gitrc directory to allow command defaults like with .cvsrc

2005-04-22 Thread Fabian Franz
Hi,

one thing I liked about CVS was its way to configure default parameters for 
commands.

And as I really like the colored log output, I wanted it as default.

While .cvsrc parsing would be quite expensive, using a directory + files 
should be fairly cheap and result just in one additional stat-call.

So I added -c to ~/.gitrc/log and some code to parse this.

Index: git
===
--- 0a9ee5a4d947b998a7ce489242800b39f985/git  (mode:100755 sha1:39969debd59ed51c57973c819cdcc3ca8a7da819)
+++ uncommitted/git  (mode:100755)
@@ -67,6 +67,7 @@
 	exit 1
 fi
 
+[ -e $HOME/.gitrc/$cmd ] && set -- $(cat $HOME/.gitrc/$cmd) "$@"
 
 case $cmd in
 add)	gitadd.sh "$@";;
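
For readers who want the idea outside of shell, here is a small Python sketch of the same lookup. The helper name with_defaults and the rc_dir parameter are hypothetical; only the ~/.gitrc/<command> layout comes from the patch above.

```python
import os
import shlex


def with_defaults(cmd, args, rc_dir=None):
    """Prepend default options stored in <rc_dir>/<cmd> to args."""
    if rc_dir is None:
        rc_dir = os.path.join(os.path.expanduser("~"), ".gitrc")
    path = os.path.join(rc_dir, cmd)
    if not os.path.isfile(path):  # the one extra stat per invocation
        return list(args)
    with open(path) as f:
        # shlex.split honours quoting, unlike an unquoted $(cat ...) in shell
        defaults = shlex.split(f.read())
    return defaults + list(args)
```

With a file ~/.gitrc/log containing -c, with_defaults("log", ["HEAD"]) would return ["-c", "HEAD"], while commands without a defaults file get their arguments back unchanged.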

cu

Fabian

PS: Should the commandline parsing be cleaned up or do you want to do that 
after first release of cogito? And if yes, do you want to use getopts or 
would this be not supported on some systems?

PPS: I'm fairly new to git, how do I create a diff with the signed-by fields 
and with what do I need to sign it?


Re: GIT_INDEX_FILE environment variable

2005-04-22 Thread Petr Baudis
Dear diary, on Sat, Apr 23, 2005 at 12:14:16AM CEST, I got a letter
where Linus Torvalds [EMAIL PROTECTED] told me that...
 (And I personally think that show-diff is really part of the wrapper
 scripts around git. I wrote it originally just because I needed something
 to verify the index file handling, not because it's core like the other
 programs. I do _not_ consider show-diff to be part of the core git code,
 really. Same goes for git-export, btw - for the same reasons. It's not
 fundamental).

Note that Cogito actually hardly uses show-diff anymore.
I'm doing diff-cache now, since that is what matters to me.

-- 
Petr Pasky Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


Re: [3/5] Add http-pull

2005-04-22 Thread Petr Baudis
Dear diary, on Sat, Apr 23, 2005 at 01:00:33AM CEST, I got a letter
where Daniel Barkalow [EMAIL PROTECTED] told me that...
 On Sat, 23 Apr 2005, Petr Baudis wrote:
 
  Dear diary, on Fri, Apr 22, 2005 at 09:46:35PM CEST, I got a letter
  where Daniel Barkalow [EMAIL PROTECTED] told me that...
  
  Huh. Why? You just go back to history until you find a commit you
  already have. If you did it the way as Tony described, if you have that
  commit, you can be sure that you have everything it depends on too.
 
 But if you download 1000 files of the 1010 you need, and then your network
 goes down, you will need to download those 1000 again when it comes back,
 because you can't save them unless you have the full history. 

Why can't I? I think I can do that perfectly fine. The worst thing that
can happen is that fsck-cache will complain a bit.

-- 
Petr Pasky Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


Re: [PATCH] multi item packed files

2005-04-22 Thread Linus Torvalds


On Thu, 21 Apr 2005, Chris Mason wrote:
 
 We can sort by the files before reading them in, but even if we order things 
 perfectly, we're spreading the io out too much across the drive.

No we don't.

It's easy to just copy the repository in a way where this just isn't true:  
you sort the objects by how far they are from the current HEAD, and you
just copy the repository in that order (furthest objects first - commits
last).

That's what I meant by defragmentation - you can actually do this on your 
own, even if your filesystem doesn't support it.

Do it twice a year, and I pretty much guarantee that your performance will
stay pretty constant over time. The one exception is fsck, which doesn't
seek in history order.

And this works exactly because: 
 - we don't do no steenking delta's, and don't have deep chains of data 
   to follow. The longest chain we ever have is just a few deep, and it's 
   trivial to just encourage the filesystem to have recent things together.
 - we have an append-only mentality.

In fact, it works for exactly the same reason that makes us able to drop 
old history if we want to. We essentially drop the history to another 
part of the disk.
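
A minimal sketch of the ordering step, assuming "distance from HEAD" means the number of object links followed from the HEAD commit (that reading, the function name copy_order and the links mapping are all assumptions for illustration). How ties and object types are ordered within one distance is left open here:

```python
from collections import deque


def copy_order(head, links):
    """Return object names sorted furthest-from-head first.

    links maps an object name to the names it refers to (for a commit:
    its tree and parents; for a tree: its entries).
    """
    dist = {head: 0}
    queue = deque([head])
    while queue:  # breadth-first walk gives the shortest distance
        name = queue.popleft()
        for ref in links.get(name, ()):
            if ref not in dist:
                dist[ref] = dist[name] + 1
                queue.append(ref)
    # sorted() is stable, so objects at equal distance keep discovery order
    return sorted(dist, key=lambda name: -dist[name])
```

Copying the objects into a fresh repository in this order puts the oldest data down first and the HEAD commit last, so recently used objects end up written close together.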

Linus


Re: [ANNOUNCE] git-pasky-0.6.3 request for testing

2005-04-22 Thread Linus Torvalds


On Sat, 23 Apr 2005, Petr Baudis wrote:
 
 Just FYI, this is bug in core git's diff-cache;

Nice find. 

Yes, I told you guys I hadn't tested it well ;)

diff-cache does the same "diff trees in lockstep" thing that diff-tree 
does, but it's actually more complex, since the _tree_ part always needs 
to be recursively followed, while the _cache_ part is this linear list 
that is already expanded.

Which just made the whole algorithm very messy.

Once I found out how nasty it was to do that compare, I was actually
planning to re-write the thing using the same approach that "read-tree -m
<tree>" does - ie move the tree information _into_ the in-memory cache, at
which point it should be absolutely trivial to compare the two. But since 
the horrid algorithm seemed to end up working, I never did.

I'm not even going to debug this bug. I'm just going to rewrite diff-cache 
to do what I should have done originally, ie use the power of the 
in-memory cache. That's also automatically going to properly warn about 
unmerged files.

Give me five minutes ;)
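
To illustrate why the comparison becomes trivial once the tree is expanded into the same flat form as the cache, here is a toy Python sketch. Nested dicts stand in for tree objects and a flat dict for the cache; the names flatten and diff_tree_cache are hypothetical, and unmerged entries are not modelled at all.

```python
def flatten(tree, prefix=""):
    """Expand a nested tree {name: sha1 or subtree dict} into {path: sha1}."""
    flat = {}
    for name, entry in tree.items():
        path = prefix + name
        if isinstance(entry, dict):
            flat.update(flatten(entry, path + "/"))  # recurse into subtree
        else:
            flat[path] = entry
    return flat


def diff_tree_cache(tree, cache):
    """Compare a nested tree against an already-flat cache {path: sha1}."""
    old = flatten(tree)
    changes = []
    for path in sorted(old.keys() | cache.keys()):
        if path not in cache:
            changes.append(("deleted", path))
        elif path not in old:
            changes.append(("added", path))
        elif old[path] != cache[path]:
            changes.append(("modified", path))
    return changes
```

Once both sides are flat, the compare is a single pass over the union of path names, with no lockstep recursion.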

Linus


Re: GIT_INDEX_FILE environment variable

2005-04-22 Thread Linus Torvalds


On Fri, 22 Apr 2005, Junio C Hamano wrote:
 
 Almost, with a counter-example.  Please try this yourself:

I agree that what git outputs is always based on the archive base. But 
that's an independent issue from "where is the working directory". That's 
the issue of "how do you want me to print out the results".

To see just how independent that is, think about how git-pasky (and,
indeed, standard show-diff) already prints out the results in a
_different_ base than the working directory _or_ the base. Ie the way we 
already do

--- a/Makefile
+++ b/Makefile
... patch ...

for a patch to Makefile in the top-level directory.

IOW, showing pathnames is different from _using_ them. And if you were 
planning on using the same logic for both, you'd have been making a 
mistake in the first place.

To _use_ pathnames, you use pwd. To _show_ them, you use some other
mechanism. You must not mix up those two issues, or you'd always get
show-diff wrong.

I actually think that showing the pathnames is up to the wrapper scripts. 
Git core really always just works on the canonical format.

(And I personally think that show-diff is really part of the wrapper
scripts around git. I wrote it originally just because I needed something
to verify the index file handling, not because it's core like the other
programs. I do _not_ consider show-diff to be part of the core git code,
really. Same goes for git-export, btw - for the same reasons. It's not
fundamental).

Linus


git remote repositories

2005-04-22 Thread Dan Weber
Hi,

It wasn't that long ago that the pasky git tree was relocated.  This 
required a modification to the .git directory in a local pull.  A DNS 
system could be built to ensure the following:

A) quick easy lookup of archive locations
B) handle changes of repository location
C) add mirror support

So here's the plan...

I do a lot of work in the SIP/VoIP field, and our approach to handling backup 
proxies and routers is to use a DNS SRV record.

Here's how it works for VoIP/SIP:

_{protocol}._{transport}.{name}.hostname.org

A sample lookup:

dig SRV _sip._udp.proxy-dca.broadvoice.com

;; QUESTION SECTION:
;_sip._udp.proxy-dca.broadvoice.com. IN SRV

;; ANSWER SECTION:
_sip._udp.proxy-dca.broadvoice.com. 86400 IN SRV 1 0 5060 proxy.mia.broadvoice.com.
_sip._udp.proxy-dca.broadvoice.com. 86400 IN SRV 0 0 5060 proxy.dca.broadvoice.com.

Now of course we could null out some of those fields and swap sip for git 
and udp for rsync, then replace proxy.foo with rsync://host/path/to/git. 
Since we're using rsync, mirroring is simplified by just rsyncing the 
trees.
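
As a rough sketch of how a client might pick among such records, the Python below parses answer lines in the format shown by dig and orders the targets by SRV priority (lower value is tried first, per RFC 2782). The function names are hypothetical, and weighted selection among records of equal priority is deliberately left out.

```python
def parse_srv(line):
    """Parse one dig answer line into (priority, weight, port, target)."""
    fields = line.split()
    # name, ttl, class, type, priority, weight, port, target
    priority, weight, port = (int(x) for x in fields[4:7])
    return priority, weight, port, fields[7]


def order_targets(lines):
    """Return (target, port) pairs, lowest priority value first."""
    records = sorted((parse_srv(line) for line in lines), key=lambda r: r[0])
    return [(target, port) for _prio, _weight, port, target in records]
```

For the two sample records, proxy.dca.broadvoice.com. (priority 0) would be tried before proxy.mia.broadvoice.com. (priority 1); a git client could treat a primary repository and its mirrors the same way.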

Dan


Re: [3/5] Add http-pull

2005-04-22 Thread Petr Baudis
Dear diary, on Fri, Apr 22, 2005 at 09:46:35PM CEST, I got a letter
where Daniel Barkalow [EMAIL PROTECTED] told me that...
 On Thu, 21 Apr 2005 [EMAIL PROTECTED] wrote:
 
  On Wed, 20 Apr 2005, Brad Roberts wrote:
   How about fetching in the inverse order.  Ie, deepest parents up towards
   current.  With that method the repository is always self consistent, even
   if not yet current.
  
  Daniel Barkalow replied:
   You don't know the deepest parents to fetch until you've read everything
   more recent, since the history you'd have to walk is the history you're
   downloading.
  
  You just need to defer adding tree/commit objects to the repository until
  after you have inserted all objects on which they depend.  That's what my
  wget based version does ... it's very crude, in that it loads all tree
  & commit objects into a temporary repository (.gittmp) ... since you can
  only use cat-file and ls-tree on things if they live in 
  objects/xx/xxx..xxx
  The blobs can go directly into the real repo (but to be really safe you'd
  have to ensure that the whole blob had been pulled from the network before
  inserting it ... it's probably a good move to validate everything that you
  pull from the outside world too).
 
 The problem with this general scheme is that it means that you have to
 start over if something goes wrong, rather than resuming from where you
 left off (and being able to use what you got until then).

Huh. Why? You just go back to history until you find a commit you
already have. If you did it the way as Tony described, if you have that
commit, you can be sure that you have everything it depends on too.
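
A small Python sketch of that walk, under the stated invariant that a commit is only ever stored once everything it depends on is stored. The names missing_commits, parents_of and have are hypothetical; in practice parents_of would have to fetch and parse the commit object.

```python
def missing_commits(head, parents_of, have):
    """Walk back from head, stopping at commits that are already present."""
    todo, missing, seen = [head], [], set()
    while todo:
        commit = todo.pop()
        if commit in have or commit in seen:
            continue  # everything behind a commit we have is assumed present
        seen.add(commit)
        missing.append(commit)
        todo.extend(parents_of(commit))
    return missing
```

The walk never looks past a commit that is already in the repository, so resuming after a failure only revisits the part of history that is still missing.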

-- 
Petr Pasky Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


wit suggestion

2005-04-22 Thread David Greaves
Hi Christian
Can I suggest a 'summary diff' option
It's basically a diff between the tree of the commit and the tree of the 
parent commit

It would show what files have changed rather than the diff of the files 
that have changed. (kinda like diffstat without the  for now)

(or maybe just do a diffstat if it's easier)
Of course you could click through to a per-file diff eventually...
David