On Wed, 2005-04-20 at 07:59 -0700, Linus Torvalds wrote:
> external-parent
> comment for this parent
>
> and the nice thing about that is that now that information allows you to
> add external parents at any point.
>
> Why do it like this? First off, I think that the "
On Wed, 20 Apr 2005, Linus Torvalds wrote:
>
> It would be nicer for the cache to make the index file "header" be a
> "footer", and write it out last - that way we'd be able to do the SHA1 as
> we write rather than doing a two-pass thing. That's for another time.
That other time was now.
The
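The single-pass scheme Linus describes can be sketched as follows (illustrative Python only, not git's actual index layout): the SHA-1 is updated incrementally as each entry is written, and the digest lands at the end of the file as a footer instead of in a header, so no second pass is needed.

```python
import hashlib

def write_index(path, entries):
    # Update the SHA-1 incrementally while writing, and emit the digest
    # as a trailing footer -- one pass over the data instead of two.
    h = hashlib.sha1()
    with open(path, "wb") as f:
        for entry in entries:          # entries: pre-serialized byte strings
            h.update(entry)
            f.write(entry)
        f.write(h.digest())            # 20-byte footer written last

def verify_index(path):
    # Re-read the file and check the trailing checksum.
    with open(path, "rb") as f:
        data = f.read()
    body, footer = data[:-20], data[-20:]
    return hashlib.sha1(body).digest() == footer
```

The function names and entry encoding are hypothetical; the point is only that a trailing checksum lets writer and hasher share one pass.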
On Wed, 20 Apr 2005, Chris Mason wrote:
>
> Well, the difference there should be pretty hard to see with any benchmark.
> But I was being lazy...new patch attached. This one gets the same perf
> numbers, if this is still wrong then I really need some more coffee.
I did my preferred version. M
On Wednesday 20 April 2005 13:52, Linus Torvalds wrote:
> On Wed, 20 Apr 2005, Chris Mason wrote:
> > The patch below with your current tree brings my 100 patch test down to
> > 22 seconds again.
>
> If you ever have a cache_entry bigger than 16384, your code will write
> things out in the wrong or
On Wed, 20 Apr 2005 10:06:15 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> I bet your SHA1 implementation is done with hand-optimized and scheduled
> x86 MMX code or something, while my poor G5 is probably using some slow
> generic routine. As a result, it only improved by 33% for me sin
On Wed, 20 Apr 2005, Chris Mason wrote:
>
> The patch below with your current tree brings my 100 patch test down to 22
> seconds again.
If you ever have a cache_entry bigger than 16384, your code will write
things out in the wrong order (write the new cache without flushing the
old buffer).
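The pitfall can be sketched like this (a hypothetical Python stand-in for the C buffering code; `BUF_SIZE` mirrors the 16384 mentioned above): an entry larger than the buffer must force a flush before being written through, or it overtakes the smaller entries still queued.

```python
BUF_SIZE = 16384

class BufferedWriter:
    """Coalesce small writes into one buffer before hitting the file."""
    def __init__(self, f):
        self.f = f
        self.buf = bytearray()

    def write(self, data):
        if len(data) > BUF_SIZE:
            # The bug being pointed out: skipping this flush would write
            # `data` to the file *before* the buffered bytes that precede it.
            self.flush()
            self.f.write(data)
            return
        if len(self.buf) + len(data) > BUF_SIZE:
            self.flush()
        self.buf += data

    def flush(self):
        if self.buf:
            self.f.write(bytes(self.buf))
            self.buf.clear()
```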
On Wednesday 20 April 2005 13:06, Linus Torvalds wrote:
> On Wed, 20 Apr 2005, Chris Mason wrote:
> > At any rate, the time for a single write-tree is pretty consistent.
> > Before it was around .5 seconds, and with this change it goes down to
> > .128s.
>
> Oh, wow.
>
> I bet your SHA1 implementa
On Wed, 20 Apr 2005, Chris Mason wrote:
>
> At any rate, the time for a single write-tree is pretty consistent. Before it
> was around .5 seconds, and with this change it goes down to .128s.
Oh, wow.
I bet your SHA1 implementation is done with hand-optimized and scheduled
x86 MMX code or
On Wed, 20 Apr 2005, Linus Torvalds wrote:
>
> NO! Don't see if this works. For the "sha1 file already exists" file, it
> forgot to return the SHA1 value in "returnsha1", and would thus corrupt
> the trees it wrote.
Proper version with fixes checked in. For me, it brings down the time to
writ
On Wednesday 20 April 2005 11:40, Linus Torvalds wrote:
> On Wed, 20 Apr 2005, Chris Mason wrote:
> > Thanks for looking at this. Your new tree is faster, it gets the commit
> > 100 patches time down from 1m5s to 50s.
>
> It really _shouldn't_ be faster. It still does the compression, and throws
>
On Wed, Apr 20, 2005 at 05:57:34PM +0200, Martin Uecker wrote:
> On Wed, Apr 20, 2005 at 11:28:20AM -0400, C. Scott Ananian wrote:
>
> > Yes, I guess this is the detail I was going to abandon. =)
> >
> > I viewed the fact that the top-level hash was dependent on the exact chunk
> > makeup a 'mis
On Wed, 20 Apr 2005, Linus Torvalds wrote:
>
> To actually go faster, it _should_ need this patch. Untested. See if it
> works..
NO! Don't see if this works. For the "sha1 file already exists" file, it
forgot to return the SHA1 value in "returnsha1", and would thus corrupt
the trees it wrote
On Wed, 20 Apr 2005, C. Scott Ananian wrote:
>
> OK, sure. But how 'bout chunking trees? Are you grown happy with the new
> trees-reference-other-trees paradigm, or is there a deep longing in your
> heart for the simplicity of 'trees-reference-blobs-period'?
I'm pretty sure we do better chu
On 4/20/05, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> It really _shouldn't_ be faster. It still does the compression, and throws
> the end result away.
Am I misunderstanding, or is the problem that doing:
-> compress -> sha1 -> compare with existing hash
is expensive?
What about doing:
-> unc
On Wed, Apr 20, 2005 at 11:28:20AM -0400, C. Scott Ananian wrote:
Hi,
> A merkle-tree (which I think you initially pointed me at) makes the hash
> of the internal nodes be a hash of the chunk's hashes; ie not a straight
> content hash. This is roughly what my current implementation does, but
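A minimal sketch of the Merkle-tree scheme under discussion (illustrative Python; the chunk size and pairwise combining are arbitrary choices here, not anything git adopted): internal nodes hash their children's hashes, so the root is not a straight content hash and depends on the chunk boundaries.

```python
import hashlib

CHUNK = 8192  # arbitrary illustrative chunk size

def merkle_root(data):
    # Leaf level: one SHA-1 per fixed-size chunk of the content.
    level = [hashlib.sha1(data[i:i + CHUNK]).digest()
             for i in range(0, len(data), CHUNK)] or [hashlib.sha1(b"").digest()]
    # Internal nodes hash their children's *hashes*, not raw content,
    # which is why the top-level hash depends on the exact chunk makeup.
    while len(level) > 1:
        level = [hashlib.sha1(b"".join(level[i:i + 2])).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```

For content that fits in one chunk the root equals the plain content hash; for anything larger it diverges, which is the "mis[feature]" being debated above.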
On Wed, 20 Apr 2005, Linus Torvalds wrote:
I was considering using a chunked representation for *all* files (not just
blobs), which would avoid the original 'trees must reference other trees
or they become too large' issue -- and maybe the performance issue you're
referring to, as well?
No. The mos
On Wed, 20 Apr 2005, C. Scott Ananian wrote:
>
> Hmm. Are our index files too large, or is there some other factor?
They _are_ pretty large, but they have to be,
For the kernel, the index file is about 1.6MB. That's
- 17,000+ files and filenames
- stat information for all of them
- the s
On Wed, 20 Apr 2005, Chris Mason wrote:
>
> Thanks for looking at this. Your new tree is faster, it gets the commit 100
> patches time down from 1m5s to 50s.
It really _shouldn't_ be faster. It still does the compression, and throws
the end result away.
To actually go faster, it _should_ nee
On Wed, 20 Apr 2005, Chris Mason wrote:
With the basic changes I described before, the 100 patch time only goes down
to 40s. Certainly not fast enough to justify the changes. In this case, the
bulk of the extra time comes from write-tree writing the index file, so I
split write-tree.c up into li
On Wed, 20 Apr 2005, Martin Uecker wrote:
You can (and my code demonstrates/will demonstrate) still use a whole-file
hash to use chunking. With content prefixes, this takes O(N ln M) time
(where N is the file size and M is the number of chunks) to compute all
hashes; if subtrees can share the same
On Wednesday 20 April 2005 02:43, Linus Torvalds wrote:
> On Tue, 19 Apr 2005, Chris Mason wrote:
> > I'll finish off the patch once you ok the basics below. My current code
> > works like this:
>
> Chris, before you do anything further, let me re-consider.
>
> Assuming that the real cost of write
On Wed, Apr 20, 2005 at 10:30:15AM -0400, C. Scott Ananian wrote:
Hi,
your code looks pretty cool. thank you!
> On Wed, 20 Apr 2005, Martin Uecker wrote:
>
> >The other thing I don't like is the use of a sha1
> >for a complete file. Switching to some kind of hash
> >tree would allow to introduc
On Thu, 21 Apr 2005, David Woodhouse wrote:
>
> The reason for doing this is that without it, we can't ever have a full
> history actually connected to the current trees. There'd always be a
> break at 2.6.12-rc2, at which point you'd have to switch to an entirely
> different git repository.
Qu
On Wed, 20 Apr 2005, Linus Torvalds wrote:
- _keep_ the same compression format, but notice that we already have an
object by looking at the uncompressed one.
With a chunked file, you can also skip writing certain *subtrees* of the
file as soon as you notice it's already present on disk. I can
On Wed, 20 Apr 2005, Martin Uecker wrote:
The other thing I don't like is the use of a sha1
for a complete file. Switching to some kind of hash
tree would allow to introduce chunks later. This has
two advantages:
You can (and my code demonstrates/will demonstrate) still use a whole-file
hash to us
On Wed, 20 Apr 2005, Jon Seymour wrote:
>
> Am I correct to understand that with this change, all the objects in the
> database are still being compressed (so no net performance benefit), but by
> doing the SHA1 calculations before compression you are keeping open the
> possibility that at so
On Wed, 2005-04-20 at 02:08 -0700, Linus Torvalds wrote:
> I converted my git archives (kernel and git itself) to do the SHA1
> hash _before_ the compression phase.
I'm happy to see that -- because I'm going to be asking you to make
another change which will also require a simple repository conver
> The main point is not about trying different compression
> techniques but that you don't need to compress at all just
> to calculate the hash of some data. (to know if it is
> unchanged for example)
>
Ah, ok, I didn't understand that there were extra compresses being
performed for that reason.
On 4/20/05, Martin Uecker <[EMAIL PROTECTED]> wrote:
> The storage method of the database is a collection of
> files in the underlying file system. Because of the
> random nature of the hashes this leads to a horrible
> amount of seeking for all operations which walk the
> logical structure of som
On Wed, Apr 20, 2005 at 10:11:10PM +1000, Jon Seymour wrote:
> On 4/20/05, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> >
> > I converted my git archives (kernel and git itself) to do the SHA1 hash
> > _before_ the compression phase.
> >
>
> Linus,
>
> Am I correct to understand that with
On 4/20/05, Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> I converted my git archives (kernel and git itself) to do the SHA1 hash
> _before_ the compression phase.
>
Linus,
Am I correct to understand that with this change, all the objects in
the database are still being compressed (so no n
* Linus Torvalds <[EMAIL PROTECTED]> wrote:
> So to convert your old git setup to a new git setup, do the following:
> [...]
did this for two repositories (git and kernel-git), it works as
advertised.
Ingo
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of
I converted my git archives (kernel and git itself) to do the SHA1 hash
_before_ the compression phase.
So I'll just have to publicly admit that everybody who complained about
that particular design decision was right. Oh, well.
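The win from hashing before compressing can be sketched as follows (illustrative Python; real git also hashes a type-and-size header in front of the content, which this sketch omits): naming the object by the SHA-1 of the uncompressed bytes means an already-stored object is detected before any compression work is done.

```python
import hashlib, os, zlib

def write_object(objdir, data):
    # Hash the *uncompressed* content to get the object name.
    sha1 = hashlib.sha1(data).hexdigest()
    path = os.path.join(objdir, sha1[:2], sha1[2:])
    if os.path.exists(path):
        return sha1                          # already stored: skip compression
    os.makedirs(os.path.dirname(path), exist_ok=True)
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(zlib.compress(data))         # pay for zlib only on new objects
    os.rename(tmp, path)
    return sha1
```

With hash-after-compress, by contrast, detecting a duplicate requires compressing first, which is exactly the wasted work measured in the write-tree timings above.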
On Wed, 20 Apr 2005, H. Peter Anvin wrote:
> Linus Torvalds wr
Linus Torvalds wrote:
So I'll see if I can turn the current fsck into a "convert into
uncompressed format", and do a nice clean format conversion.
Just let me know what you want to do, and I can trivially change the
conversion scripts I've already written to do what you want.
-hpa
On Tue, 19 Apr 2005, Chris Mason wrote:
>
> I'll finish off the patch once you ok the basics below. My current code
> works like this:
Chris, before you do anything further, let me re-consider.
Assuming that the real cost of write-tree is the compression (and I think
it is), I really susp
On Tue, 19 Apr 2005, Chris Mason wrote:
>
> 5) right before exiting, write-tree updates the index if it made any changes.
This part won't work. It needs to do the proper locking, which means that
it needs to create "index.lock" _before_ it reads the index file, and
write everything to that on
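The locking order Linus describes can be sketched like this (illustrative Python; the real code is C, but the shape is the same): create "index.lock" exclusively *before* reading the index, write the new contents to the lock file, then rename it into place.

```python
import os

def update_index(index_path, new_contents):
    lock_path = index_path + ".lock"
    # O_EXCL: creation fails if another writer already holds the lock.
    # Taking it *before* reading the old index is what makes the whole
    # read-modify-write sequence safe against concurrent writers.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        os.write(fd, new_contents)
    finally:
        os.close(fd)
    os.rename(lock_path, index_path)  # atomically replace the old index
```

A writer that dies before the rename leaves the lock file behind, blocking later writers until it is cleaned up.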
On Tue, Apr 19, 2005 at 04:59:18PM -0700, Linus Torvalds wrote:
>
> However, it definitely wouldn't be useful for _me_. The whole thing that
> I'm after is to allow painless merging of distributed work. If I have to
> merge one patch at a time, I'd much rather see people send me patches
> directly
On Tuesday 19 April 2005 17:23, Linus Torvalds wrote:
> On Tue, 19 Apr 2005, Chris Mason wrote:
> > Regardless, putting it into the index somehow should be fastest, I'll see
> > what I can do.
>
> Start by putting it in at "read-tree" time, and adding the code to
> invalidate all parent directory i
On Tue, 19 Apr 2005, David Lang wrote:
> >
> > If so, he should set up one repository per quilt patch.
>
> a tool to do this automaticaly is what I was trying to suggest (and asking
> if it would be useful)
Heh. It's certainly possible. Especially with the object sharing, you
could create a g
On Tue, 19 Apr 2005, Linus Torvalds wrote:
On Tue, 19 Apr 2005, David Lang wrote:
if you are using quilt for locally developed patches I fully agree with
you, but I was thinking of the case where Andrew is receiving independent
patches from lots of people and storing them in quilt for testing, and
On Tue, 19 Apr 2005, David Lang wrote:
>
> if you are using quilt for locally developed patches I fully agree with
> you, but I was thinking of the case where Andrew is receiving independent
> patches from lots of people and storing them in quilt for testing, and
> then sending them on to yo
On Tue, 19 Apr 2005, Linus Torvalds wrote:
On Tue, 19 Apr 2005, David Lang wrote:
what if you turned the forest of quilt patches into a forest of git trees?
(essentially applying each patch against the baseline separately) would
this make sense or be useful?
It has a certain charm, but the fact is,
On Tue, 19 Apr 2005, Linus Torvalds wrote:
(*) Actually, I think it's the compression that ends up being the most
expensive part.
You're also using the equivalent of '-9', too -- and *that's slow*.
Changing to Z_NORMAL_COMPRESSION would probably help a lot
(but would break all existing repositories
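The trade-off can be seen directly with zlib (illustrative; zlib's actual constant is Z_DEFAULT_COMPRESSION, i.e. level 6, rather than a "Z_NORMAL_COMPRESSION"). Every level decompresses to identical bytes, but the compressed streams themselves differ, which is why changing the level would change object names under a hash-after-compress scheme.

```python
import zlib

# Moderately compressible stand-in for a source file (hypothetical data).
data = b"int main(void) { return 0; }\n" * 2000

best = zlib.compress(data, 9)  # the '-9' behaviour: smallest output, slowest
fast = zlib.compress(data, 1)  # fastest, usually somewhat larger

# All levels round-trip to the same bytes; only speed/size trade off.
# The compressed *streams* differ, so object hashes computed over the
# compressed form would change -- "break all existing repositories".
assert zlib.decompress(best) == data
assert zlib.decompress(fast) == data
assert best != fast
```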
On Tue, 19 Apr 2005, David Lang wrote:
>
> what if you turned the forest of quilt patches into a forest of git trees?
> (essentially applying each patch against the baseline separately) would
> this make sense or be useful?
It has a certain charm, but the fact is, it gets really messy to sort
On Tue, 19 Apr 2005, Linus Torvalds wrote:
On Tue, 19 Apr 2005, Chris Mason wrote:
Very true, you can't replace quilt with git without ruining both of them. But
it would be nice to take a quilt tree and turn it into a git tree for merging
purposes, or to make use of whatever visualization tools m
On Tue, 19 Apr 2005, Chris Mason wrote:
>
> Regardless, putting it into the index somehow should be fastest, I'll see
> what I can do.
Start by putting it in at "read-tree" time, and adding the code to
invalidate all parent directory indexes when somebody changes a file in
the index (ie "up
On Tuesday 19 April 2005 15:03, Linus Torvalds wrote:
> On Tue, 19 Apr 2005, Chris Mason wrote:
> > Very true, you can't replace quilt with git without ruining both of them.
> > But it would be nice to take a quilt tree and turn it into a git tree
> > for merging purposes, or to make use of whatev
On Tue, 19 Apr 2005, Chris Mason wrote:
>
> Very true, you can't replace quilt with git without ruining both of them. But
> it would be nice to take a quilt tree and turn it into a git tree for merging
> purposes, or to make use of whatever visualization tools might exist someday.
>
Fa
On Tue, Apr 19, 2005 at 10:36:06AM -0700, Linus Torvalds wrote:
> In fact, git has all the same issues that BK had, and for the same
> fundamental reason: if you do distributed work, you have to always
> "append" stuff, and that means that you can never re-order anything after
> the fact.
You c
On Tuesday 19 April 2005 13:36, Linus Torvalds wrote:
> On Tue, 19 Apr 2005, Chris Mason wrote:
> > I did a quick experiment with applying/commit 100 patches from the suse
> > kernel into a kernel git tree, which quilt can do in 2 seconds. git
> > needs 1m5s.
>
> Note that I don't think you want t
On Tue, 19 Apr 2005, Chris Mason wrote:
>
> I did a quick experiment with applying/commit 100 patches from the suse
> kernel into a kernel git tree, which quilt can do in 2 seconds. git needs 1m5s.
Note that I don't think you want to replace quilt with git. The approaches
are totally diff