Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread David Woodhouse
On Wed, 2005-04-20 at 07:59 -0700, Linus Torvalds wrote: > external-parent > comment for this parent > > and the nice thing about that is that now that information allows you to > add external parents at any point. > > Why do it like this? First off, I think that the "

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread Martin Uecker
On Wed, Apr 20, 2005 at 05:57:34PM +0200, Martin Uecker wrote: > On Wed, Apr 20, 2005 at 11:28:20AM -0400, C. Scott Ananian wrote: > > > Yes, I guess this is the detail I was going to abandon. =) > > > > I viewed the fact that the top-level hash was dependent on the exact chunk > > makeup a 'mis

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread Martin Uecker
On Wed, Apr 20, 2005 at 11:28:20AM -0400, C. Scott Ananian wrote: Hi, > A merkle-tree (which I think you initially pointed me at) makes the hash > of the internal nodes be a hash of the chunk's hashes; ie not a straight > content hash. This is roughly what my current implementation does, but

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread C. Scott Ananian
On Wed, 20 Apr 2005, Martin Uecker wrote: You can (and my code demonstrates/will demonstrate) still use a whole-file hash to use chunking. With content prefixes, this takes O(N ln M) time (where N is the file size and M is the number of chunks) to compute all hashes; if subtrees can share the same

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread Martin Uecker
On Wed, Apr 20, 2005 at 10:30:15AM -0400, C. Scott Ananian wrote: Hi, your code looks pretty cool. thank you! > On Wed, 20 Apr 2005, Martin Uecker wrote: > > >The other thing I don't like is the use of a sha1 > >for a complete file. Switching to some kind of hash > >tree would allow to introduc

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread Linus Torvalds
On Thu, 21 Apr 2005, David Woodhouse wrote: > > The reason for doing this is that without it, we can't ever have a full > history actually connected to the current trees. There'd always be a > break at 2.6.12-rc2, at which point you'd have to switch to an entirely > different git repository. Qu

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread C. Scott Ananian
On Wed, 20 Apr 2005, Linus Torvalds wrote: - _keep_ the same compression format, but notice that we already have an object by looking at the uncompressed one. With a chunked file, you can also skip writing certain *subtrees* of the file as soon as you notice it's already present on disk. I can

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread C. Scott Ananian
On Wed, 20 Apr 2005, Martin Uecker wrote: The other thing I don't like is the use of a sha1 for a complete file. Switching to some kind of hash tree would allow to introduce chunks later. This has two advantages: You can (and my code demonstrates/will demonstrate) still use a whole-file hash to us

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread Linus Torvalds
On Wed, 20 Apr 2005, Jon Seymour wrote: > > Am I correct to understand that with this change, all the objects in the > database are still being compressed (so no net performance benefit), but by > doing the SHA1 calculations before compression you are keeping open the > possibility that at so

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread David Woodhouse
On Wed, 2005-04-20 at 02:08 -0700, Linus Torvalds wrote: > I converted my git archives (kernel and git itself) to do the SHA1 > hash _before_ the compression phase. I'm happy to see that -- because I'm going to be asking you to make another change which will also require a simple repository conver

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread Jon Seymour
> The main point is not about trying different compression > techniques but that you don't need to compress at all just > to calculate the hash of some data. (to know if it is > unchanged for example) > Ah, ok, I didn't understand that there were extra compresses being performed for that reason.
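
A minimal sketch of the point made in this exchange, assuming a simplified object store: the SHA1 is computed over the uncompressed bytes, so an object that already exists can be detected without compressing anything. The names `object_id` and `store` and the in-memory `db` dict are hypothetical stand-ins for illustration. This sketch hashes only the raw content and does not reproduce git's actual on-disk object format.

```python
import hashlib
import zlib
from typing import Dict, Tuple


def object_id(data: bytes) -> str:
    """Name an object by the SHA1 of its *uncompressed* content."""
    return hashlib.sha1(data).hexdigest()


def store(db: Dict[str, bytes], data: bytes) -> Tuple[str, bool]:
    """Store data under its hash; return (id, whether anything was written).

    db is a hypothetical in-memory stand-in for the object directory.
    """
    oid = object_id(data)
    if oid in db:
        # Already present: no compression work is needed to find this out.
        return oid, False
    # Compression only happens when a new object is actually written.
    db[oid] = zlib.compress(data)
    return oid, True
```

Because the name no longer depends on the compressed bytes, the compression level or format could in principle change without renaming any object, which is the possibility raised elsewhere in the thread.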

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread Morten Welinder
On 4/20/05, Martin Uecker <[EMAIL PROTECTED]> wrote: > The storage method of the database of a collection of > files in the underlying file system. Because of the > random nature of the hashes this leads to a horrible > amount of seeking for all operations which walk the > logical structure of som

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread Martin Uecker
On Wed, Apr 20, 2005 at 10:11:10PM +1000, Jon Seymour wrote: > On 4/20/05, Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > > > > I converted my git archives (kernel and git itself) to do the SHA1 hash > > _before_ the compression phase. > > > > Linus, > > Am I correct to understand that with

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread Jon Seymour
On 4/20/05, Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > I converted my git archives (kernel and git itself) to do the SHA1 hash > _before_ the compression phase. > Linus, Am I correct to understand that with this change, all the objects in the database are still being compressed (so no n

Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread Ingo Molnar
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > So to convert your old git setup to a new git setup, do the following: > [...] did this for two repositories (git and kernel-git), it works as advertised. Ingo