On Thursday 21 April 2005 15:28, Krzysztof Halasa wrote:
> Linus Torvalds <[EMAIL PROTECTED]> writes:
> > Wrong. You most definitely _can_ lose: you end up having to optimize for
> > one particular filesystem block size, and you'll lose on any other
> > filesystem. And you'll lose on the special filesystem of "network
> > traffic", which is byte-granular.
>
> If someone needs a better on-disk ratio, (s)he can go with a 1 KB block
> filesystem or something like that, without all the added complexity of
> packing.
>
> If we want to optimize that further, I would try doing it at the
> underlying filesystem level.  For example, a loop-mounted one.

Shrug, we shouldn't need help from the kernel for something like this.  Used 
as a database, git hits worst-case scenarios for almost every FS.

We've got:

1) subdirectories with lots of files
2) wasted space for tiny files (see the quick check after this list)
3) files that are likely to be accessed together spread across the whole disk
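
To put a number on point 2, here's a quick check (my own illustration, not 
part of the patch): compare a file's logical size with what the FS actually 
allocates for it.

/*
 * Compare st_size (bytes of data) with st_blocks (space the FS
 * actually set aside).  A typical tiny git object wastes most of
 * its block.
 */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
	struct stat st;

	if (argc < 2 || stat(argv[1], &st) != 0) {
		perror("stat");
		return 1;
	}
	/* st_blocks is in 512-byte units; a 30-byte file on a 4k FS
	 * typically reports 30 bytes of data, 4096 allocated. */
	printf("%s: %lld bytes data, %lld bytes allocated\n", argv[1],
	       (long long)st.st_size, (long long)st.st_blocks * 512);
	return 0;
}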

One compromise for SCM use would be one packed file per commit, with an index 
that lets us quickly figure out which commit has a particular version of a 
given file.  My hack gets something close to that (broken into 32k chunks for 
no good reason), and the index to find a given file is just the git directory 
tree.
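
If you want the flavor of the layout, here's a toy sketch -- plain C, with 
pack_append() and the file names invented for illustration, nothing like the 
actual patch.  Small objects get appended to one pack file, and the index 
records where each one lives so a reader can seek straight to it:

#include <stdio.h>
#include <string.h>

/* Append one object to the pack; log "<name> <offset> <length>" in
 * the index so lookups become a seek instead of a directory walk. */
static int pack_append(FILE *pack, FILE *idx,
		       const char *name, const void *data, size_t len)
{
	long off = ftell(pack);

	if (off < 0)
		return -1;
	if (fwrite(data, 1, len, pack) != len)
		return -1;
	if (fprintf(idx, "%s %ld %zu\n", name, off, len) < 0)
		return -1;
	return 0;
}

int main(void)
{
	FILE *pack = fopen("objects.pack", "wb");
	FILE *idx = fopen("objects.idx", "w");
	const char *a = "int main(void) { return 0; }\n";
	const char *b = "Hello, pack file.\n";

	if (!pack || !idx) {
		perror("fopen");
		return 1;
	}
	/* Two tiny blobs that would each eat a whole FS block land
	 * back to back in one file, helping points 2 and 3 at once. */
	if (pack_append(pack, idx, "blob-a", a, strlen(a)) ||
	    pack_append(pack, idx, "blob-b", b, strlen(b)))
		return 1;
	fclose(pack);
	fclose(idx);
	return 0;
}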

But my code does hide the packing from most of the git interfaces.  So I can 
almost keep a straight face while claiming to be true to the original git 
design...almost.  The whole setup is far from perfect, but it is one option 
for addressing points 2 and 3 above.

-chris