Re: [PATCH] enable core.fsyncObjectFiles by default
On 01/18/2018 11:27 AM, Christoph Hellwig wrote:
> [adding Chris to the Cc list - this is about the awful ext3
> data=ordered behavior of syncing the whole file system data and
> metadata on each fsync]
>
> On Wed, Jan 17, 2018 at 03:57:53PM -0800, Linus Torvalds wrote:
> > On Wed, Jan 17, 2018 at 3:52 PM, Theodore Ts'o wrote:
> > > Well, let's be fair; this is something *ext3* got wrong, and it
> > > was the default file system back then.
> >
> > I'm pretty sure reiserfs and btrfs did too..
>
> I'm pretty sure btrfs never did, and reiserfs at least looks like it
> currently doesn't, but I'd have to dig into history to check if it
> ever did.

Christoph has this right: btrfs only fsyncs the one file that you've asked for, and unrelated data/metadata won't be included. We've seen big fsync stalls on ext4 caused by data=ordered, so it's still possible to trigger there, but it's much better than ext3.

I do share Ted's concern about the perf impact of the fsyncs on tons of loose files, but the defaults should be safety first.

-chris
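The safety-first default argued for here amounts to fsyncing each loose object file as it is written, and for full durability the directory entry as well. A minimal sketch of that pattern, assuming illustrative names and paths - this is not git's actual implementation:

```c
/* Sketch: write a new object durably -- fsync the file itself, then
 * fsync its containing directory so the new directory entry survives
 * a crash too. Function name and paths are illustrative. */
#include <fcntl.h>
#include <unistd.h>

static int write_object_durably(const char *dir, const char *path,
                                const void *buf, size_t len)
{
	int dfd, ret;
	/* O_EXCL: object files are write-once, never overwritten */
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0444);
	if (fd < 0)
		return -1;
	if (write(fd, buf, len) != (ssize_t)len || fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	if (close(fd) < 0)
		return -1;
	/* persist the directory entry as well */
	dfd = open(dir, O_RDONLY);
	if (dfd < 0)
		return -1;
	ret = fsync(dfd);
	close(dfd);
	return ret;
}
```

The per-file cost is exactly what the thread is about: on ext3 data=ordered, that one fsync could flush the whole filesystem's dirty data, while btrfs syncs only this file.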
Re: [PATCH] multi item packed files
On Friday 22 April 2005 16:32, Chris Mason wrote:
> If I pack every 64k (uncompressed), the checkout-tree time goes down
> to 3m14s. That's a very big difference considering how stupid my code
> is. .git was only 20% smaller with 64k chunks. I should be able to do
> better...I'll do one more run.

This run also packed tree files together (everything produced by write-tree went into a packed file), but not the commits. I estimate I could save about another 168m by packing the tree files and commits into the same file with the blobs, but this wouldn't make any of the times below faster.

                    git - original (28k commits)   packed
    FS size         2,675,408k                     1,723,820k
    read-tree       24.45s                         18.9s
    checkout-cache  4m30s                          3m5s
    patch time      2h30m                          1h55m

The format for the packed files could be smarter, such that it didn't require decompressing the whole packed file to read one item. I would guess I could get another 20% checkout-cache performance out of it via more tuning, and probably another 10% of space savings.

Of course, none of this is likely to convince you ;) If you decide later on it's worthwhile, I don't think it would be difficult to add then.

-chris
-
To unsubscribe from this list: send the line "unsubscribe git"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
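The "smarter format" mentioned above - reading one item without decompressing the whole packed file - essentially needs a per-item index. A sketch under invented assumptions (a count plus an offset/length table at the front; compression omitted for brevity):

```c
/* Sketch of an indexed pack layout: [int count][table of entries][data].
 * pack_read_item() seeks straight to one member instead of scanning
 * the whole file. The layout is illustrative, not git's pack format. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct pack_entry { unsigned long off, len; };

static int pack_write(const char *path, const char *items[], int n)
{
	FILE *f = fopen(path, "wb");
	struct pack_entry *tab;
	unsigned long off;
	int i;

	if (!f)
		return -1;
	tab = malloc(n * sizeof(*tab));
	/* data starts right after the count and the table */
	off = sizeof(int) + n * sizeof(struct pack_entry);
	for (i = 0; i < n; i++) {
		tab[i].off = off;
		tab[i].len = strlen(items[i]);
		off += tab[i].len;
	}
	fwrite(&n, sizeof(int), 1, f);
	fwrite(tab, sizeof(*tab), n, f);
	for (i = 0; i < n; i++)
		fwrite(items[i], 1, tab[i].len, f);
	free(tab);
	return fclose(f);
}

static long pack_read_item(const char *path, int idx,
			   char *out, unsigned long outlen)
{
	FILE *f = fopen(path, "rb");
	struct pack_entry e;
	long got;
	int n;

	if (!f)
		return -1;
	if (fread(&n, sizeof(int), 1, f) != 1 || idx < 0 || idx >= n) {
		fclose(f);
		return -1;
	}
	fseek(f, sizeof(int) + idx * sizeof(e), SEEK_SET);
	if (fread(&e, sizeof(e), 1, f) != 1 || e.len > outlen) {
		fclose(f);
		return -1;
	}
	fseek(f, e.off, SEEK_SET);	/* seek straight to the one item */
	got = fread(out, 1, e.len, f);
	fclose(f);
	return got;
}
```

With per-item compression added, each table entry would carry the compressed length, and only that span would need inflating.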
Re: [PATCH] multi item packed files
On Thursday 21 April 2005 18:47, Linus Torvalds wrote:
> On Thu, 21 Apr 2005, Chris Mason wrote:
> > Shrug, we shouldn't need help from the kernel for something like
> > this. git as a database hits worst case scenarios for almost every
> > FS.

[ ... ]

We somewhat agree on most of this; I snipped out the parts that aren't worth nitpicking over. git is really fast right now, and I'm all for throwing drive space at things to solve problems. I just don't think we have to throw as much space at it as we are.

> The _seek_ issue is real, but git actually has a very nice
> architecture even there: not only does it cache really really well
> (and you can do a simple "ls-tree $(cat .git/HEAD)" and populate the
> cache from the results), but the low level of indirection in a git
> archive means that it's almost totally prefetchable with near-perfect
> access patterns.

We can sort the files before reading them in, but even if we order things perfectly, we're spreading the io out too much across the drive. It works right now because the git archive is relatively dense. At a few hundred MB, when we order things properly the drive head isn't moving that much. At 3-6GB this hurts more. The data gets farther apart as things age, and drive performance rots away.

I'll never convince you without numbers, which means I'll have to wait for the full load of old history and try it out ;)

-chris
Re: [PATCH] multi item packed files
On Thursday 21 April 2005 15:28, Krzysztof Halasa wrote:
> Linus Torvalds <[EMAIL PROTECTED]> writes:
> > Wrong. You most definitely _can_ lose: you end up having to
> > optimize for one particular filesystem blocking size, and you'll
> > lose on any other filesystem. And you'll lose on the special
> > filesystem of "network traffic", which is byte-granular.
>
> If someone needs better on-disk ratio, (s)he can go with 1 KB
> filesystem or something like that, without all the added complexity
> of packing.
>
> If we want to optimize that further, I would try doing it at the
> underlying filesystem level. For example, loop-mounted one.

Shrug, we shouldn't need help from the kernel for something like this. git as a database hits worst case scenarios for almost every FS. We've got:

1) subdirectories with lots of files
2) wasted space for tiny files
3) files that are likely to be accessed together spread across the whole disk

One compromise for SCM use would be one packed file per commit, with an index that lets us quickly figure out which commit has a particular version of a given file. My hack gets something close to that (broken into 32k chunks for no good reason), and the index to find a given file is just the git directory tree. But my code does hide the fact that we're packing things from most of the git interfaces, so I can almost keep a straight face while claiming to be true to the original git design...almost.

The whole setup is far from perfect, but it is one option for addressing points 2 & 3 above.

-chris
Re: [PATCH] multi item packed files
On Thursday 21 April 2005 11:41, Linus Torvalds wrote:
> On Thu, 21 Apr 2005, Chris Mason wrote:
> > There have been a few threads on making git more space efficient,
> > and eventually someone mentions tiny files and space fragmentation.
> > Now that git object names are decoupled from their compression,
> > it's easier to consider a variety of compression algorithms. I
> > whipped up a really silly "pack files together" compression.
>
> Careful.
>
> This is something that needs history to tell whether it's effective.
> In particular, if one file changes and another one does not, your
> packed archive now ends up being a new blob, so while you "saved
> space" by having just one blob for the object, in reality you didn't
> save any space at all because with the files changing, you just
> guaranteed that the packed blob changes n times more often.

The packed blob lives in git but never makes it into a tree. Let's say that I have a packed blob with files "a, b, c", and another packed blob with files "x, y, z". Someone changes files b and z and then runs update-cache b z. Now we have 2 unchanged packed blobs: "a, b, c", "x, y, z", and one new packed blob: "b_new, z_new". This means that in order for the packing to help, we have to change more than one file at a time. That's why it would be good to have update-cache include the write-tree and commit-tree.

> See? Your "packing in space" ends up also resulting in "packing in
> time", and you didn't actually win anything.
>
> (If you did a good job of packing, you hopefully didn't _lose_
> anything either - you needed 1/n'th the number of objects that took n
> times the space if the packing ended up perfect - but since you
> needed n times more of these objects unless they all change together,
> you end up with exactly the same space usage).
>
> So the argument is: you can't lose with the method, and you _can_
> win. Right?
>
> Wrong. You most definitely _can_ lose: you end up having to optimize
> for one particular filesystem blocking size, and you'll lose on any
> other filesystem. And you'll lose on the special filesystem of
> "network traffic", which is byte-granular.

The patch does have one extra directory entry (for the packed blob), but from a network point of view roughly the same number of bytes should be copied. The hardlinks won't play nice with rsync though; soft links might be better.

Packing isn't just about filesystem block sizes, it's about locality. All the hashing means pretty much every access in git is random. With packing we can at least try to put a single changeset together on disk. Right now it doesn't matter much, but when the git tree is 6GB in two years we'll feel the pain.

> I don't want to pee on people's parades, and I'm all for gathering
> numbers, but the thing is, the current git isn't actually all that
> bad, and I guarantee that it's hard to make it better without using
> delta representation. And the current thing is really really simple.

Grin, if I thought you wanted the patch I might have tried to pretty it up a little. The point is that all the discussions about ways to make git use less space end up stuck in "but wait, that'll make a bunch of tiny files and filesystems aren't good at that". So I believe some kind of packing is a required building block for any kind of delta storage.

-chris
[PATCH] multi item packed files
Hello,

There have been a few threads on making git more space efficient, and eventually someone mentions tiny files and space fragmentation. Now that git object names are decoupled from their compression, it's easier to consider a variety of compression algorithms. I whipped up a really silly "pack files together" compression.

This would maintain the write once semantics but allow a simple mechanism where objects are combined together. Choosing which objects to combine is easy: things put into update-cache together go together. This gives us more space efficiency and no seeks when reading that packed file off disk.

A natural extension to this is to make update-cache --commit-tree, which includes the files produced by write-tree and commit-tree into the same packed file. (I haven't coded this.)

The layout works like this:

1) A new object type "packed" is added.
2) New objects are buffered into a packed object until it gets to around 32k in size. This is completely arbitrary but felt about right.
3) The packed object is written to git storage, and then hard links are made to the packed object from the sha1 filename of each object inside.
4) read_sha1_file is changed to recognize the packed object and search inside.

I did a simple test on the 2.6.11 tree with my 100 patches applied. Without packing, .git is 99MB. With packing it only needs 62MB. Read speeds don't suffer with this; the times to read-tree ; checkout-cache -a -f from a cold cache were the same. I could get the times lower with the patch by caching the uncompressed data, since in theory I should be faster here.

Using this on data you care about would be a really bad idea right now. I'm only posting the patch to get the basic idea across for benchmarking and discussion.
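Steps 2) and 3) above can be sketched as a buffer-then-flush loop; this is a simplified illustration of the scheme, not the patch's code, and the 32k threshold and paths are taken as assumptions:

```c
/* Sketch of the packing scheme: buffer objects up to ~32k, write the
 * packed blob once, then hard-link each member's sha1-style filename
 * to the packed file so read_sha1_file can find it by name. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PACK_LIMIT (32 * 1024)	/* arbitrary, per the email */

static char pack_buf[PACK_LIMIT];
static size_t pack_used;

/* returns 1 when the buffer is full and the caller should flush */
static int pack_add(const void *obj, size_t len)
{
	if (len > PACK_LIMIT - pack_used)
		return 1;		/* flush first, then retry */
	memcpy(pack_buf + pack_used, obj, len);
	pack_used += len;
	return pack_used >= PACK_LIMIT;
}

static int pack_flush(const char *pack_path,
		      const char *member_paths[], int nmembers)
{
	int i;
	size_t n;
	FILE *f = fopen(pack_path, "wb");
	if (!f)
		return -1;
	n = fwrite(pack_buf, 1, pack_used, f);
	fclose(f);
	if (n != pack_used)
		return -1;
	/* every member object name becomes a hard link to the pack */
	for (i = 0; i < nmembers; i++)
		if (link(pack_path, member_paths[i]) < 0)
			return -1;
	pack_used = 0;
	return 0;
}
```

The hard links are what keep the sha1-addressed namespace intact: looking up any member by its own name opens the same packed inode.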
-chris

diff -ur linus.back/cache.h linus/cache.h
--- linus.back/cache.h	2005-04-21 11:05:27.971607944 -0400
+++ linus/cache.h	2005-04-21 09:35:47.173613576 -0400
@@ -109,7 +109,7 @@
 /* Read and unpack a sha1 file into memory, write memory to a sha1 file */
 extern void * map_sha1_file(const unsigned char *sha1, unsigned long *size);
-extern void * unpack_sha1_file(void *map, unsigned long mapsize, char *type, unsigned long *size);
+extern void * unpack_sha1_file(const unsigned char *sha1, void *map, unsigned long mapsize, char *type, unsigned long *size);
 extern void * read_sha1_file(const unsigned char *sha1, char *type, unsigned long *size);
 extern int write_sha1_file(char *buf, unsigned len, unsigned char *return_sha1);
 extern int check_sha1_signature(unsigned char *sha1, void *buf, unsigned long size, const char *type);
@@ -117,6 +117,10 @@
 /* Convert to/from hex/sha1 representation */
 extern int get_sha1_hex(const char *hex, unsigned char *sha1);
 extern char *sha1_to_hex(const unsigned char *sha1);	/* static buffer result! */
+extern int pack_sha1_buffer(void *buf, unsigned long buf_len,
+			    unsigned char *returnsha1, char **dest,
+			    unsigned long *dest_size);
+int write_packed_buffer(void *buf, unsigned long len);
 
 /* General helper functions */
 extern void usage(const char *err);
diff -ur linus.back/cat-file.c linus/cat-file.c
--- linus.back/cat-file.c	2005-04-21 11:05:27.971607944 -0400
+++ linus/cat-file.c	2005-04-21 10:04:29.871723656 -0400
@@ -23,7 +23,7 @@
 		type[size] = '\n';
 		size++;
 	} else if (strcmp(type, argv[1])) {
-		die("cat-file %s: bad tag", argv[2]);
+		die("cat-file %s: bad tag (%s: %s)", argv[2], type, argv[1]);
 	}
 
 	while (size > 0) {
diff -ur linus.back/fsck-cache.c linus/fsck-cache.c
--- linus.back/fsck-cache.c	2005-04-21 11:05:27.974607488 -0400
+++ linus/fsck-cache.c	2005-04-21 09:14:03.139856840 -0400
@@ -85,7 +85,7 @@
 	if (map) {
 		char type[100];
 		unsigned long size;
-		void *buffer = unpack_sha1_file(map, mapsize, type, &size);
+		void *buffer = unpack_sha1_file(sha1, map, mapsize, type, &size);
 		if (!buffer)
 			return -1;
 		if (check_sha1_signature(sha1, buffer, size, type) < 0)
diff -ur linus.back/sha1_file.c linus/sha1_file.c
--- linus.back/sha1_file.c	2005-04-21 11:05:27.978606880 -0400
+++ linus/sha1_file.c	2005-04-21 10:41:51.280977656 -0400
@@ -116,7 +116,8 @@
 	return map;
 }
 
-void * unpack_sha1_file(void *map, unsigned long mapsize, char *type, unsigned long *size)
+void * unpack_sha1_file(const unsigned char *sha1, void *map,
+			unsigned long mapsize, char *type, unsigned long *size)
 {
 	int ret, bytes;
 	z_stream stream;
@@ -134,12 +135,12 @@
 	ret = inflate(&stream, 0);
 	if (sscanf(buffer, "%10s %lu", type, size) != 2)
 		return NULL;
-
 	bytes = strlen(buffer) + 1;
 	buf = malloc(*size);
-	if (!buf)
+	if (!buf) {
+		perror("malloc");
 		return NULL;
-
+	}
 	memcpy(buf, buffer + bytes, stream.total_out - bytes);
 	bytes = stream.total_out - bytes;
 	if (bytes < *size && ret == Z_OK) {
@@ -149,6 +150,36 @@
 			/* nothing */;
 	}
 	inflateEnd(&stream);
+
+	/* we've found a packed object */
+	if (strcmp(type, "packed") == 0) {
+		char *p = buf;
+		if (!sha1

[ ... ]
Re: [PATCH] write-tree performance problems
On Wednesday 20 April 2005 13:52, Linus Torvalds wrote:
> On Wed, 20 Apr 2005, Chris Mason wrote:
> > The patch below with your current tree brings my 100 patch test
> > down to 22 seconds again.
>
> If you ever have a cache_entry bigger than 16384, your code will
> write things out in the wrong order (write the new cache without
> flushing the old buffer).

Whoops

> Finally, if you really want to go fast, you should really try to make
> your writes powers-of-two, ie fill up the buffer entirely rather than
> saying "if I were to overflow, flush it now". It doesn't matter that
> much for some filesystems (especially local and append-only like the
> patterns are here), but it can definitely matter for the stupid ones.

Well, the difference there should be pretty hard to see with any benchmark. But I was being lazy...new patch attached. This one gets the same perf numbers; if this is still wrong then I really need some more coffee.

-chris

--- linus.back/read-cache.c	2005-04-20 10:14:23.26831 -0400
+++ linus/read-cache.c	2005-04-20 14:54:28.554518320 -0400
@@ -232,11 +232,13 @@
 	SHA_CTX c;
 	struct cache_header hdr;
 	int i;
+#define BUFLEN 16384
+	static char buf[BUFLEN];
+	int len = 0;
 
 	hdr.hdr_signature = htonl(CACHE_SIGNATURE);
 	hdr.hdr_version = htonl(1);
 	hdr.hdr_entries = htonl(entries);
-
 	SHA1_Init(&c);
 	SHA1_Update(&c, &hdr, offsetof(struct cache_header, sha1));
 	for (i = 0; i < entries; i++) {
@@ -246,13 +248,37 @@
 	}
 	SHA1_Final(hdr.sha1, &c);
 
-	if (write(newfd, &hdr, sizeof(hdr)) != sizeof(hdr))
-		return -1;
-
+	/* hdr is small right now, but just
+	 * in case someone changes that...
+	 */
+	if (sizeof(hdr) < BUFLEN) {
+		memcpy(buf, &hdr, sizeof(hdr));
+		len += sizeof(hdr);
+	} else {
+		if (write(newfd, &hdr, sizeof(hdr)) != sizeof(hdr))
+			return -1;
+	}
 	for (i = 0; i < entries; i++) {
 		struct cache_entry *ce = cache[i];
 		int size = ce_size(ce);
-		if (write(newfd, ce, size) != size)
+		char *p = (char *)ce;
+		while(size > 0) {
+			int count = size;
+			if (count > BUFLEN - len)
+				count = BUFLEN - len;
+			memcpy(buf + len, p, count);
+			size -= count;
+			len += count;
+			p += count;
+			if (len == BUFLEN) {
+				if (write(newfd, buf, len) != len)
+					return -1;
+				len = 0;
+			}
+		}
+	}
+	if (len) {
+		if (write(newfd, buf, len) != len)
			return -1;
 	}
 	return 0;
Re: [PATCH] write-tree performance problems
On Wednesday 20 April 2005 13:06, Linus Torvalds wrote:
> On Wed, 20 Apr 2005, Chris Mason wrote:
> > At any rate, the time for a single write-tree is pretty consistent.
> > Before it was around .5 seconds, and with this change it goes down
> > to .128s.
>
> Oh, wow.
>
> I bet your SHA1 implementation is done with hand-optimized and
> scheduled x86 MMX code or something, while my poor G5 is probably
> using some slow generic routine. As a result, it only improved by 33%
> for me since the compression was just part of the picture, but with
> your cheap SHA1 the compression costs really dominated, and so it's
> almost four times faster for you.

Aha, I was wondering why your write-tree speeds sounded so bad...this athlon machine is ~2 years old now.

Your comments about costs for writing the index file got me thinking, so I benchmarked how long the update-cache takes if we don't do the sha1 of the index file. There was almost no difference at all; update-cache currently takes about .152 seconds.

The code to write the cache calls write() for every cache entry, writing just a few bytes at a time. I changed it to collect these into a 16k buffer, which brings me down to .044s. This might not help as much on ext2/3, since they are faster than reiser for tiny writes.

The patch below with your current tree brings my 100 patch test down to 22 seconds again.
-chris

--- linus.back/read-cache.c	2005-04-20 10:14:23.26831 -0400
+++ linus/read-cache.c	2005-04-20 13:05:13.200083672 -0400
@@ -232,11 +232,12 @@
 	SHA_CTX c;
 	struct cache_header hdr;
 	int i;
+	char *buf;
+	int len = 0;
 
 	hdr.hdr_signature = htonl(CACHE_SIGNATURE);
 	hdr.hdr_version = htonl(1);
 	hdr.hdr_entries = htonl(entries);
-
 	SHA1_Init(&c);
 	SHA1_Update(&c, &hdr, offsetof(struct cache_header, sha1));
 	for (i = 0; i < entries; i++) {
@@ -246,13 +247,31 @@
 	}
 	SHA1_Final(hdr.sha1, &c);
 
+	buf = malloc(16384);
+	if (!buf) {
+		return -1;
+	}
 	if (write(newfd, &hdr, sizeof(hdr)) != sizeof(hdr))
 		return -1;
 
 	for (i = 0; i < entries; i++) {
 		struct cache_entry *ce = cache[i];
 		int size = ce_size(ce);
-		if (write(newfd, ce, size) != size)
+		if (size > 16384) {
+			if (write(newfd, ce, size) != size)
+				return -1;
+			continue;
+		}
+		if (len + size > 16384) {
+			if (write(newfd, buf, len) != len)
+				return -1;
+			len = 0;
+		}
+		memcpy(buf + len, ce, size);
+		len += size;
+	}
+	if (len) {
+		if (write(newfd, buf, len) != len)
			return -1;
 	}
 	return 0;
Re: [PATCH] write-tree performance problems
On Wednesday 20 April 2005 11:40, Linus Torvalds wrote:
> On Wed, 20 Apr 2005, Chris Mason wrote:
> > Thanks for looking at this. Your new tree is faster, it gets the
> > commit 100 patches time down from 1m5s to 50s.
>
> It really _shouldn't_ be faster. It still does the compression, and
> throws the end result away.

Well, that's a little odd. I had thought about making sure you did this change and forgotten. 1 minute benchmarks are a horrible idea since they run into noise with cache writebacks. I should know better...

At any rate, the time for a single write-tree is pretty consistent. Before it was around .5 seconds, and with this change it goes down to .128s. My patch was .024. The 100 patch time is down to 32s (3 run average). This is close enough that I don't think my patch is worth it if no other part of git can benefit from having trees in the index.

> To actually go faster, it _should_ need this patch. Untested. See if
> it works..

Thanks. This one missed filling in the returnsha1. New patch attached.

-chris

diff -u linus.back/sha1_file.c linus/sha1_file.c
--- linus.back/sha1_file.c	2005-04-20 12:31:00.240181016 -0400
+++ linus/sha1_file.c	2005-04-20 12:13:56.339837528 -0400
@@ -173,12 +173,27 @@
 	z_stream stream;
 	unsigned char sha1[20];
 	SHA_CTX c;
+	char *filename;
+	int fd;
 
 	/* Sha1.. */
 	SHA1_Init(&c);
 	SHA1_Update(&c, buf, len);
 	SHA1_Final(sha1, &c);
 
+	filename = sha1_file_name(sha1);
+	fd = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
+	if (fd < 0) {
+		if (errno != EEXIST)
+			return -1;
+
+		/*
+		 * We might do collision checking here, but we'd need to
+		 * uncompress the old file and check it. Later.
+		 */
+		goto out;
+	}
+
 	/* Set it up */
 	memset(&stream, 0, sizeof(stream));
 	deflateInit(&stream, Z_BEST_COMPRESSION);
@@ -195,8 +210,10 @@
 	deflateEnd(&stream);
 	size = stream.total_out;
 
-	if (write_sha1_buffer(sha1, compressed, size) < 0)
-		return -1;
+	if (write(fd, compressed, size) != size)
+		die("unable to write file");
+	close(fd);
+out:
 	if (returnsha1)
 		memcpy(returnsha1, sha1, 20);
 	return 0;
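The trick in the patch above generalizes: because an object's filename is its content hash, an existing file can be assumed identical, so open(O_CREAT|O_EXCL) doubles as a cheap "already have it" check and lets the writer skip compression entirely. A standalone sketch of the idea (names and paths are illustrative):

```c
/* Sketch: write-once semantics via O_EXCL. If the file already
 * exists, content-addressing means it must hold the same bytes,
 * so we can skip the write (and any compression) entirely. */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* returns 1 if written, 0 if it already existed, -1 on error */
static int write_once(const char *path, const void *buf, size_t len)
{
	ssize_t n;
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0444);
	if (fd < 0)
		return errno == EEXIST ? 0 : -1;
	n = write(fd, buf, len);
	close(fd);
	return n == (ssize_t)len ? 1 : -1;
}
```

As the patch's comment notes, a paranoid version would uncompress the existing file and compare, since EEXIST alone trusts the hash to rule out collisions.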
Re: [PATCH] write-tree performance problems
On Wednesday 20 April 2005 02:43, Linus Torvalds wrote:
> On Tue, 19 Apr 2005, Chris Mason wrote:
> > I'll finish off the patch once you ok the basics below. My current
> > code works like this:
>
> Chris, before you do anything further, let me re-consider.
>
> Assuming that the real cost of write-tree is the compression (and I
> think it is), I really suspect that this ends up being the
> death-knell to my "use the sha1 of the _compressed_ object" approach.

Thanks for looking at this. Your new tree is faster; it gets the commit 100 patches time down from 1m5s to 50s. I've attached my patch from last night, which is mostly a rough guess of the changes we would need; I haven't validated or cleaned things up.

With the basic changes I described before, the 100 patch time only goes down to 40s. Certainly not fast enough to justify the changes. In this case, the bulk of the extra time comes from write-tree writing the index file, so I split write-tree.c up into libwrite-tree.c, and created update-cache --write-tree. This gets our time back down to 21s.

The attached patch is not against your latest revs. After updating I would need to sprinkle a few S_ISDIR checks into diff-cache.c and checkout-cache.c, but the changes should be small.
-chris

Index: Makefile
===
--- dbeacafeb442bcfd39dfdc90c360d47d4215c185/Makefile  (mode:100644 sha1:6a04941a337ec50da06cf4cf52aa58f3b1435776)
+++ 27e71cd40ff1dccfbbd996427833fd7bac714dde/Makefile  (mode:100644 sha1:2ba6d49196e8a2335cfcd77ec0dbe9cda3e402dd)
@@ -29,7 +29,7 @@
 VERSION= VERSION
 
-LIB_OBJS=read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o
+LIB_OBJS=read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o libwrite-tree.o
 LIB_FILE=libgit.a
 LIB_H=cache.h object.h
Index: cache.h
===
--- dbeacafeb442bcfd39dfdc90c360d47d4215c185/cache.h  (mode:100644 sha1:c182ea0c5c1def37d899f9a05f8884ebe17c9d92)
+++ 27e71cd40ff1dccfbbd996427833fd7bac714dde/cache.h  (mode:100644 sha1:0882b713222b71e67c9dab5d58ab6f15c3c49ed6)
@@ -74,7 +74,7 @@
 #define ce_stage(ce) ((CE_STAGEMASK & ntohs((ce)->ce_flags)) >> CE_STAGESHIFT)
 
 #define ce_permissions(mode) (((mode) & 0100) ? 0755 : 0644)
-#define create_ce_mode(mode) htonl(S_IFREG | ce_permissions(mode))
+#define create_ce_mode(mode) htonl((mode & (S_IFREG|S_IFDIR)) | ce_permissions(mode))
 
 #define cache_entry_size(len) ((offsetof(struct cache_entry,name) + (len) + 8) & ~7)
Index: libwrite-tree.c
===
--- /dev/null  (tree:dbeacafeb442bcfd39dfdc90c360d47d4215c185)
+++ 27e71cd40ff1dccfbbd996427833fd7bac714dde/libwrite-tree.c  (mode:100644 sha1:52202930d02b3721f5a388ae1178c5a4d99ec1b4)
@@ -0,0 +1,174 @@
+/*
+ * GIT - The information manager from hell
+ *
+ * Copyright (C) Linus Torvalds, 2005
+ */
+#include "cache.h"
+
+struct new_ce {
+	struct new_ce *next;
+	struct cache_entry ce;
+};
+
+static struct new_ce *add_list = NULL;
+
+static int check_valid_sha1(unsigned char *sha1)
+{
+	char *filename = sha1_file_name(sha1);
+	int ret;
+
+	/* If we were anal, we'd check that the sha1 of the contents actually matches */
+	ret = access(filename, R_OK);
+	if (ret)
+		perror(filename);
+	return ret;
+}
+
+static int prepend_integer(char *buffer, unsigned val, int i)
+{
+	buffer[--i] = '\0';
+	do {
+		buffer[--i] = '0' + (val % 10);
+		val /= 10;
+	} while (val);
+	return i;
+}
+
+#define ORIG_OFFSET (40)	/* Enough space to add the header of "tree <size>\0" */
+
+static int write_tree(struct cache_entry **cachep, int maxentries, const char *base, int baselen, unsigned char *returnsha1)
+{
+	unsigned char subdir_sha1[20];
+	unsigned long size, offset;
+	char *buffer;
+	int i, nr;
+
+	/* Guess at some random initial size */
+	size = 8192;
+	buffer = malloc(size);
+	offset = ORIG_OFFSET;
+
+	nr = 0;
+	do {
+		struct cache_entry *ce = cachep[nr];
+		const char *pathname = ce->name, *filename, *dirname;
+		int pathlen = ce_namelen(ce), entrylen;
+		unsigned char *sha1;
+		unsigned int mode;
+
+		/* Did we hit the end of the directory? Return how many we wrote */
+		if (baselen >= pathlen || memcmp(base, pathname, baselen))
+			break;
+
+		sha1 = ce->sha1;
+		mode = ntohl(ce->ce_mode);
+
+		/* Do we have _further_ subdirectories? */
+		filename = pathname + baselen;
+		dirname = strchr(filename, '/');
+		if (dirname) {
+			int subdir_written;
+			int len = dirname - pathname;
+			unsigned int size = cache_entry_size(len);
+			struct new_ce *new_ce = malloc(size + sizeof(struct new_ce *));
+			struct cache_entry *c = &new_ce->ce;
+			subdir_written = write_tree(cachep + nr, maxentries - nr, pathname, dirname-pathname+1, subdir_sha1);
+			nr += subdir_written

[ ... ]
Re: [PATCH] write-tree performance problems
On Tuesday 19 April 2005 17:23, Linus Torvalds wrote:
> On Tue, 19 Apr 2005, Chris Mason wrote:
> > Regardless, putting it into the index somehow should be fastest,
> > I'll see what I can do.
>
> Start by putting it in at "read-tree" time, and adding the code to
> invalidate all parent directory indexes when somebody changes a file
> in the index (ie "update-cache" for anything but a "--refresh").
>
> That would be needed anyway, since those two are the ones that
> already change the index file.
>
> Once you're sure that you can correctly invalidate the entries (so
> that you could never use a stale tree entry by mistake), the second
> stage would be to update it at "write-tree" time.

This was much easier than I expected, and it seems to be working here. It does slow down write-tree slightly because we have to write out the index file, but I can get around that with the index-file-on-tmpfs change. The original write-tree needs .54 seconds; write-tree with the index speedup gets that down to .024s (same as my first patch) when nothing has changed. When it has to rewrite the index file because something changed, it's .167s.

I'll finish off the patch once you ok the basics below. My current code works like this:

1) read-tree will insert index entries for directories. There is no index entry for the root.
2) update-cache removes index entries for all parents of the file you're updating. So, if you update-cache fs/ext3/inode.c, I remove the index entries for fs and fs/ext3.
3) If write-tree finds a directory in the index, it uses the sha1 in the cache entry and skips all files/dirs under that directory.
4) If write-tree detects a subdir with no directory in the index, it calls write_tree the same way it used to. It then inserts a new cache object with the calculated sha1.
5) Right before exiting, write-tree updates the index if it made any changes.

The downside to this setup is that I've got to change other index users to deal with directory entries that are there sometimes and missing other times. The nice part is that I don't have to "invalidate" the directory entry: if it is present, it is valid.

-chris
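Step 2) of the scheme above needs each parent directory name of a changed path. A small sketch of that computation, with an invented helper name (the real code would then delete the matching index entries):

```c
/* Sketch: for "fs/ext3/inode.c", fill out[] with the parent
 * directories whose cached tree entries must be dropped:
 * "fs", then "fs/ext3". Returns how many were found. */
#include <string.h>

static int parent_dirs(const char *path, char out[][4096], int max)
{
	size_t i, len = strlen(path);
	int n = 0;

	for (i = 0; i < len && n < max; i++) {
		if (path[i] == '/') {
			memcpy(out[n], path, i);	/* prefix up to this '/' */
			out[n][i] = '\0';
			n++;
		}
	}
	return n;
}
```

Walking prefixes shortest-first matches the invariant in the email: once any ancestor entry is removed, write-tree can no longer shortcut that subtree, so a present entry is always valid.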
Re: [PATCH] write-tree performance problems
On Tuesday 19 April 2005 15:03, Linus Torvalds wrote:
> On Tue, 19 Apr 2005, Chris Mason wrote:
> > Very true, you can't replace quilt with git without ruining both of
> > them. But it would be nice to take a quilt tree and turn it into a
> > git tree for merging purposes, or to make use of whatever
> > visualization tools might exist someday.
>
> Fair enough. The thing is, going from quilt->git really is a pretty
> "big decision", since it's the decision that says "I will now really
> commit all this quilt changes forever and ever".
>
> Which is also why I think it's actually ok to take a minute to do 100
> quilt patches. This is not something you do on a whim. It's something
> you'd better think about. It's turning a very fluid environment into
> an unchangeable, final thing.

It's only final when someone pulls from you...for me, all the trees would be temporary.

[ ... subtree tree hashes in the index file ... ]

> I'll think about it. I'd love to speed up write-tree, and keeping
> track of it in the index is a nice little trick, but it's not quite
> high enough up on my worries for me to act on it right now.
>
> But if you want to try to see how nasty it would be to add tree index
> entries to the index file at "write-tree" time automatically, hey...

Makes sense, I'll let the merge development frenzy die down and give it a try one weekend. I might look into making it a special case of the merging index changes, since some of the concepts seem similar. Regardless, putting it into the index somehow should be fastest, I'll see what I can do.

-chris
Re: [PATCH] write-tree performance problems
On Tuesday 19 April 2005 13:36, Linus Torvalds wrote:
> On Tue, 19 Apr 2005, Chris Mason wrote:
> > I did a quick experiment with applying/committing 100 patches from
> > the suse kernel into a kernel git tree, which quilt can do in 2
> > seconds. git needs 1m5s.
>
> Note that I don't think you want to replace quilt with git. The
> approaches are totally different, and git does _not_ obviate the need
> for the quilt kind of "patch testing".
>
> In fact, git has all the same issues that BK had, and for the same
> fundamental reason: if you do distributed work, you have to always
> "append" stuff, and that means that you can never re-order anything
> after the fact.

Very true, you can't replace quilt with git without ruining both of them. But it would be nice to take a quilt tree and turn it into a git tree for merging purposes, or to make use of whatever visualization tools might exist someday.

> What I _would_ like is the ability to re-use an old tree, though.
> What you really want to do is not pass in a set of directory names
> and just trust that they are correct, but just pass in a directory to
> compare with, and if the contents match, you don't need to write out
> a new one.
>
> I'll try to whip up something that does what you want done, but
> doesn't need (or take) any untrusted information from the user in the
> form "trust me, it hasn't changed".

We already have a "trust me, it hasn't changed" via update-cache. If it gets called wrong, the tree won't reflect reality. The patch doesn't change the write-tree default, but it does enable you to give write-tree better information about the parts of the tree you want written back to git.

With that said, I hate the patch too. I didn't see how to compare against the old tree without reading each tree object from the old tree, and that should be slower than what write-tree does now. So I wimped out and made the quick patch that demonstrates the cause of the performance hit.

The "move .git/index to a tmpfs file" change should be easier though, and has a real benefit. How do you feel about s|.git/index|.git/index_dir/index| in the sources? This gives us the flexibility to link it wherever is needed.

-chris
[PATCH] write-tree performance problems
Hello everyone,

I did a quick experiment with applying/commit 100 patches from the suse
kernel into a kernel git tree, which quilt can do in 2 seconds.  git needs
1m5s.

The primary performance problem during each commit is write-tree
recalculating the hash of each directory, even though the contents of most
directories have not changed.

I've attached a very quick and dirty mod of write-tree.c.  It takes an
optional tree id (sha1) and a list of directories; the hash of any
directory not in the list is read from the existing tree object instead of
being recalculated.  You have to pass every subdirectory that contains a
modified file, so if you change fs/super.c and fs/ext3/super.c, you would
call "write-tree sha1 fs fs/ext3".

With this patch, the time to apply 100 commits goes down to 22 seconds.
It could be faster (and easier to use) if the index stored the hash of
trees instead of just blobs, but that would be a larger change.

I was able to get the commit time down to 13 seconds by changing
read-tree.c, update-cache.c and read-cache.c to store/read the index in
tmpfs instead of on the local filesystem.  I haven't attached the patch
for that, but it seems easiest to move .git/index into
.git/index_dir/index, and let the user decide where to put index_dir.

Quick speed summary, apply/commit 100 patches:

    quilt push -a:                 2s
    git (unmodified):              1m5s
    git (tree hash reduction):     22s
    git (tree hash, tmpfs index):  13s

This patch is against pasky's tree from this morning, but also applies to
linus' tree.  It's nasty stuff, but will hopefully get some discussion
started on speeding things up.
-chris

--- a/write-tree.c
+++ b/write-tree.c
@@ -4,6 +4,8 @@
  * Copyright (C) Linus Torvalds, 2005
  */
 #include "cache.h"
+static char **dirs;
+static int num_dirs = 0;
 
 static int check_valid_sha1(unsigned char *sha1)
 {
@@ -27,15 +29,47 @@ static int prepend_integer(char *buffer,
 	return i;
 }
 
+static int find_sha(char *buffer, int size, const char *base, int baselen, char *returnsha1)
+{
+	while (size) {
+		int len = strlen(buffer) + 1;
+		unsigned char *sha1 = buffer + len;
+		char *path = strchr(buffer, ' ') + 1;
+		unsigned int mode;
+
+		if (size < len + 20 || sscanf(buffer, "%o", &mode) != 1)
+			die("corrupt 'tree' file");
+		buffer = sha1 + 20;
+		size -= len + 20;
+		if (strncmp(path, base, baselen) == 0 &&
+		    strlen(path) == baselen) {
+			memcpy(returnsha1, sha1, 20);
+			return 0;
+		}
+	}
+	return -1;
+}
+
 #define ORIG_OFFSET (40)	/* Enough space to add the header of "tree \0" */
 
-static int write_tree(struct cache_entry **cachep, int maxentries, const char *base, int baselen, unsigned char *returnsha1)
+static int write_tree(struct cache_entry **cachep, int maxentries, const char *base, int baselen, unsigned char *returnsha1, char *treesha)
 {
 	unsigned char subdir_sha1[20];
 	unsigned long size, offset;
 	char *buffer;
 	int i, nr;
-
+	char *tree = NULL;
+	unsigned long tree_size;
+	char type[20];
+
+	if (treesha) {
+		tree = read_sha1_file(treesha, type, &tree_size);
+		if (!tree)
+			die("unable to read sha1 file");
+		if (strcmp(type, "tree"))
+			die("expected a tree node");
+	}
 	/* Guess at some random initial size */
 	size = 8192;
 	buffer = malloc(size);
@@ -55,15 +89,60 @@ static int write_tree(struct cache_entry
 		sha1 = ce->sha1;
 		mode = ntohl(ce->ce_mode);
 
 		/* Do we have _further_ subdirectories? */
 		filename = pathname + baselen;
 		dirname = strchr(filename, '/');
 		if (dirname) {
 			int subdir_written;
-
-			subdir_written = write_tree(cachep + nr, maxentries - nr, pathname, dirname-pathname+1, subdir_sha1);
-			nr += subdir_written;
+			int dirlen = dirname - pathname;
+			int dirmatch = 1;
+
+			if (tree && num_dirs > 0) {
+				dirmatch = 0;
+				for (i = 0; i < num_dirs; i++) {
+					int len = strlen(dirs[i]);
+					if (len <= baselen)
+						continue;
+					if (memcmp(dirs[i], pathname, dirlen) == 0 &&
+					    pathname[dirlen] == '/') {
+						dirmatch = 1;
+						break;
+					}
+				}
+				if (!dirmatch && find_sha(tree, tree_size,
+							  filename,
+							  dirname - filename,
+							  subdir_sha1)) {
+					dirmatch = 1;
+				}
+			}
+			if (!dirmatch) {
+				/* eat all the entries in this dir */
+				while (++nr < maxentries) {
+					char *p;
+					ce = cachep[nr];
+					p = strchr(ce->name + baselen, '/');
+					if (!p)
+						break;
+					if (p - ce->name != dirname - pathname)
+						break;
+					if (memcmp(pathname, ce->name, p - ce->name))
+						break;
+				}
+			} else {
+				unsigned char thisdir_sha1[20];
+				char *p = thisdir_sha1;
+
+				if (num_dirs && tree) {
+					if (find_sha(tree, tree_size, filename,
						     dirname - filename, p)) {
+						num_dirs = 0;
+						p = NULL;
+					}
+				} else {
+					p = NULL;
+				}
+				subdir_written = write_tree(cachep + nr, maxentries - nr, pathname, dirname-pathname+1, subdir_sha1, p);
+				nr += subdir_written;
+			}
 
 			/* Now we need to w