On Tue, 19 Apr 2005, Chris Mason wrote:
I'll finish off the patch once you ok the basics below. My current code
works like this:
Chris, before you do anything further, let me re-consider.
Assuming that the real cost of write-tree is the compression (and I think
it is), I really suspect
Linus Torvalds wrote:
So I'll see if I can turn the current fsck into a convert into
uncompressed format, and do a nice clean format conversion.
Just let me know what you want to do, and I can trivially change the
conversion scripts I've already written to do what you want.
-hpa
* Linus Torvalds [EMAIL PROTECTED] wrote:
So to convert your old git setup to a new git setup, do the following:
[...]
did this for two repositories (git and kernel-git), it works as
advertised.
Ingo
On 4/20/05, Linus Torvalds [EMAIL PROTECTED] wrote:
I converted my git archives (kernel and git itself) to do the SHA1 hash
_before_ the compression phase.
Linus,
Am I correct to understand that with this change, all the objects in
the database are still being compressed (so no net
On Wed, Apr 20, 2005 at 10:11:10PM +1000, Jon Seymour wrote:
On 4/20/05, Linus Torvalds [EMAIL PROTECTED] wrote:
I converted my git archives (kernel and git itself) to do the SHA1 hash
_before_ the compression phase.
Linus,
Am I correct to understand that with this change,
On 4/20/05, Martin Uecker [EMAIL PROTECTED] wrote:
The storage method of the database is a collection of
files in the underlying file system. Because of the
random nature of the hashes this leads to a horrible
amount of seeking for all operations which walk the
logical structure of some tree.
The main point is not about trying different compression
techniques, but that you don't need to compress at all just
to calculate the hash of some data (to know whether it is
unchanged, for example).
Ah, ok, I didn't understand that there were extra compresses being
performed for that reason.
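That point can be sketched in Python (a minimal, hypothetical object store, not git's actual code): because the SHA-1 is computed over the raw bytes, the "do we already have it?" check never touches zlib, and compression runs only on the write path.

```python
import hashlib
import zlib

def object_hash(data: bytes) -> str:
    # Hash the raw bytes directly; no compression is needed just to
    # decide whether this content is already in the database.
    return hashlib.sha1(data).hexdigest()

def store_object(data: bytes, db: dict) -> str:
    sha = object_hash(data)
    if sha in db:
        return sha                      # unchanged: zlib never ran
    db[sha] = zlib.compress(data)       # compress only when actually writing
    return sha
```

With the old hash-after-compression order, the expensive zlib pass would be required even in the no-op "already exists" case.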
On Wed, 2005-04-20 at 02:08 -0700, Linus Torvalds wrote:
I converted my git archives (kernel and git itself) to do the SHA1
hash _before_ the compression phase.
I'm happy to see that -- because I'm going to be asking you to make
another change which will also require a simple repository
On Wed, 20 Apr 2005, Jon Seymour wrote:
Am I correct to understand that with this change, all the objects in the
database are still being compressed (so no net performance benefit), but by
doing the SHA1 calculations before compression you are keeping open the
possibility that at some
On Wed, Apr 20, 2005 at 10:30:15AM -0400, C. Scott Ananian wrote:
Hi,
your code looks pretty cool. thank you!
On Wed, 20 Apr 2005, Martin Uecker wrote:
The other thing I don't like is the use of a sha1
for a complete file. Switching to some kind of hash
tree would allow to introduce
On Wednesday 20 April 2005 02:43, Linus Torvalds wrote:
On Tue, 19 Apr 2005, Chris Mason wrote:
I'll finish off the patch once you ok the basics below. My current code
works like this:
Chris, before you do anything further, let me re-consider.
Assuming that the real cost of write-tree is
On Wed, 20 Apr 2005, Chris Mason wrote:
With the basic changes I described before, the 100 patch time only goes down
to 40s. Certainly not fast enough to justify the changes. In this case, the
bulk of the extra time comes from write-tree writing the index file, so I
split write-tree.c up into
On Wed, 20 Apr 2005, C. Scott Ananian wrote:
Hmm. Are our index files too large, or is there some other factor?
They _are_ pretty large, but they have to be,
For the kernel, the index file is about 1.6MB. That's
- 17,000+ files and filenames
- stat information for all of them
- the
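A quick back-of-envelope check of those figures (the 1.6MB and 17,000+ numbers are from the mail above; the exact stat fields listed in the comment are an assumption):

```python
# The mail quotes a ~1.6MB index file covering 17,000+ files,
# so each entry costs roughly:
index_bytes = 1.6 * 1024 * 1024
entries = 17000
per_entry = index_bytes / entries
print(f"~{per_entry:.0f} bytes per entry")
# Around 100 bytes each: plausible for a pathname plus stat data
# (e.g. mtime, ctime, dev, ino, mode, uid, gid, size) and a 20-byte SHA-1.
```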
On Wed, 20 Apr 2005, Linus Torvalds wrote:
I was considering using a chunked representation for *all* files (not just
blobs), which would avoid the original 'trees must reference other trees
or they become too large' issue -- and maybe the performance issue you're
referring to, as well?
No. The
On Wed, Apr 20, 2005 at 11:28:20AM -0400, C. Scott Ananian wrote:
Hi,
A merkle-tree (which I think you initially pointed me at) makes the hash
of the internal nodes be a hash of the chunk's hashes; ie not a straight
content hash. This is roughly what my current implementation does, but
I
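A minimal sketch of the hash-tree idea under discussion (the chunk size and helper names are made up for illustration): leaf hashes cover fixed-size chunks of the content, and the root is a hash of the concatenated leaf hashes, not a straight hash of the content itself.

```python
import hashlib

CHUNK = 8192  # hypothetical chunk size

def chunk_hashes(data: bytes) -> list[bytes]:
    # Leaf level: one SHA-1 per fixed-size chunk of the content.
    return [hashlib.sha1(data[i:i + CHUNK]).digest()
            for i in range(0, len(data), CHUNK)]

def tree_hash(data: bytes) -> str:
    # Internal node: a hash of the children's hashes,
    # not a straight content hash.
    return hashlib.sha1(b"".join(chunk_hashes(data))).hexdigest()
```

The payoff is locality: changing one chunk changes only that leaf hash and the root, so unchanged chunks can be shared and verified independently.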
On 4/20/05, Linus Torvalds [EMAIL PROTECTED] wrote:
It really _shouldn't_ be faster. It still does the compression, and throws
the end result away.
Am I misunderstanding, or is the problem that doing:
file with unknown status - compress - sha1 - compare with existing hash
is expensive?
What
On Wed, 20 Apr 2005, C. Scott Ananian wrote:
OK, sure. But how 'bout chunking trees? Are you grown happy with the new
trees-reference-other-trees paradigm, or is there a deep longing in your
heart for the simplicity of 'trees-reference-blobs-period'?
I'm pretty sure we do better
On Wed, 20 Apr 2005, Linus Torvalds wrote:
To actually go faster, it _should_ need this patch. Untested. See if it
works..
NO! Don't see if this works. For the sha1 file already exists file, it
forgot to return the SHA1 value in returnsha1, and would thus corrupt
the trees it wrote.
So
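A toy Python illustration of the bug being described (not git's actual C code): a function that reports the object's hash through a caller-supplied buffer must fill that buffer on *every* path, including the "object already exists" early return, or the caller writes whatever stale bytes were in the buffer into the tree.

```python
import hashlib

def write_sha1_file(data: bytes, db: dict, returnsha1: bytearray) -> None:
    sha = hashlib.sha1(data).digest()
    if sha in db:
        # The buggy version returned here WITHOUT filling returnsha1,
        # so the caller embedded a stale/garbage hash in the tree it
        # wrote. The fix: report the hash on this path too.
        returnsha1[:] = sha
        return
    db[sha] = data
    returnsha1[:] = sha
```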
On Wednesday 20 April 2005 11:40, Linus Torvalds wrote:
On Wed, 20 Apr 2005, Chris Mason wrote:
Thanks for looking at this. Your new tree is faster, it gets the commit
100 patches time down from 1m5s to 50s.
It really _shouldn't_ be faster. It still does the compression, and throws
the
On Wed, 20 Apr 2005, Linus Torvalds wrote:
NO! Don't see if this works. For the sha1 file already exists file, it
forgot to return the SHA1 value in returnsha1, and would thus corrupt
the trees it wrote.
Proper version with fixes checked in. For me, it brings down the time to
write a
On Wed, 20 Apr 2005, Chris Mason wrote:
At any rate, the time for a single write-tree is pretty consistent. Before
it was around .5 seconds, and with this change it goes down to .128s.
Oh, wow.
I bet your SHA1 implementation is done with hand-optimized and scheduled
x86 MMX code or
On Wed, 20 Apr 2005 10:06:15 -0700 (PDT)
Linus Torvalds [EMAIL PROTECTED] wrote:
I bet your SHA1 implementation is done with hand-optimized and scheduled
x86 MMX code or something, while my poor G5 is probably using some slow
generic routine. As a result, it only improved by 33% for me since
On Wed, 20 Apr 2005, Chris Mason wrote:
Well, the difference there should be pretty hard to see with any benchmark.
But I was being lazy...new patch attached. This one gets the same perf
numbers, if this is still wrong then I really need some more coffee.
I did my preferred version.
On Wed, 2005-04-20 at 07:59 -0700, Linus Torvalds wrote:
external-parent commit-hash external-parent-ID
comment for this parent
and the nice thing about that is that now that information allows you to
add external parents at any point.
Why do it like this? First
On Tue, Apr 19, 2005 at 10:36:06AM -0700, Linus Torvalds wrote:
In fact, git has all the same issues that BK had, and for the same
fundamental reason: if you do distributed work, you have to always
append stuff, and that means that you can never re-order anything after
the fact.
You can,
On Tue, 19 Apr 2005, Chris Mason wrote:
Very true, you can't replace quilt with git without ruining both of them.
But it would be nice to take a quilt tree and turn it into a git tree for
merging purposes, or to make use of whatever visualization tools might
exist someday.
Fair
On Tue, 19 Apr 2005, Linus Torvalds wrote:
On Tue, 19 Apr 2005, Chris Mason wrote:
Very true, you can't replace quilt with git without ruining both of them.
But it would be nice to take a quilt tree and turn it into a git tree for
merging purposes, or to make use of whatever visualization tools
On Tue, 19 Apr 2005, Linus Torvalds wrote:
(*) Actually, I think it's the compression that ends up being the most
expensive part.
You're also using the equivalent of '-9', too -- and *that's slow*.
Changing to Z_NORMAL_COMPRESSION would probably help a lot
(but would break all existing
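The trade-off can be sketched with Python's zlib bindings (what the mail calls Z_NORMAL_COMPRESSION is presumably zlib's default level; the payload here is made up):

```python
import time
import zlib

# Hypothetical payload; real object data would compress differently.
data = b"some reasonably compressible payload " * 20000

for level in (zlib.Z_DEFAULT_COMPRESSION, 9):  # default vs the '-9' equivalent
    t0 = time.perf_counter()
    out = zlib.compress(data, level)
    dt = time.perf_counter() - t0
    print(f"level {level}: {len(out)} bytes in {dt * 1000:.1f} ms")
```

Note that changing the level changes the compressed byte stream, which is why it "would break all existing" archives back when the SHA1 was computed over the compressed data rather than the raw content.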
On Tue, 19 Apr 2005, Chris Mason wrote:
5) right before exiting, write-tree updates the index if it made any changes.
This part won't work. It needs to do the proper locking, which means that
it needs to create index.lock _before_ it reads the index file, and
write everything to that one
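The locking order being described can be sketched like this (a simplified Python model, not the actual write-tree code): create index.lock with O_EXCL *before* reading the index, write the new index to the lock file, then rename it into place.

```python
import os

def update_index(index_path: str, new_contents: bytes) -> None:
    lock_path = index_path + ".lock"
    # Create index.lock _before_ reading the index: O_EXCL makes the
    # creation fail if someone else already holds the lock.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        if os.path.exists(index_path):
            with open(index_path, "rb") as f:
                f.read()  # read the index under the lock; a real
                          # write-tree would base its update on this
        os.write(fd, new_contents)   # write everything to the lock file
        os.close(fd)
        fd = -1
        os.rename(lock_path, index_path)  # atomically replace the index
    except BaseException:
        if fd != -1:
            os.close(fd)
        os.unlink(lock_path)  # roll back: drop the half-written lock file
        raise
```

Because the rename is atomic, readers see either the old index or the complete new one, never a partial write.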