Re: [RFC] pack-objects: compression level for non-blobs

2013-01-01 Thread Duy Nguyen
On Tue, Jan 1, 2013 at 11:15 AM, Duy Nguyen pclo...@gmail.com wrote: Fix pack-objects to behave the way JGit does, cluster commits first in the pack stream. Now you have a dense space of commits. If I remember right this has a tiny positive improvement for most rev-list operations with very

Re: [RFC] pack-objects: compression level for non-blobs

2013-01-01 Thread Shawn Pearce
On Tue, Jan 1, 2013 at 4:10 AM, Duy Nguyen pclo...@gmail.com wrote: On Tue, Jan 1, 2013 at 11:15 AM, Duy Nguyen pclo...@gmail.com wrote: Fix pack-objects to behave the way JGit does, cluster commits first in the pack stream. Now you have a dense space of commits. If I remember right this has a

Re: [RFC] pack-objects: compression level for non-blobs

2013-01-01 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes: On Tue, Jan 1, 2013 at 11:15 AM, Duy Nguyen pclo...@gmail.com wrote: Fix pack-objects to behave the way JGit does, cluster commits first in the pack stream. Now you have a dense space of commits. If I remember right this has a tiny positive improvement for

Re: [RFC] pack-objects: compression level for non-blobs

2013-01-01 Thread Junio C Hamano
Shawn Pearce spea...@spearce.org writes: How blobs are written is very different, Junio's implementation is strictly better than JGit's[1]. I do not think there can be a single ordering that is strictly better than any other one. The clump all objects in a delta family and write them

Re: [RFC] pack-objects: compression level for non-blobs

2013-01-01 Thread Duy Nguyen
On Wed, Jan 2, 2013 at 12:17 AM, Shawn Pearce spea...@spearce.org wrote: And I was wrong. At least since 1b4bb16 (pack-objects: optimize recency order - 2011-06-30) commits are spread out and can be mixed with trees too. Grouping them back defeats what Junio did in that commit, I think. I

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-31 Thread Shawn Pearce
This thread is pretty interesting. Unfortunately the holidays have kept me busy. But I am excited by the work David and Peff are doing. :-) On Sun, Dec 30, 2012 at 1:31 PM, Jeff King p...@peff.net wrote: On Sun, Dec 30, 2012 at 07:53:48PM +0700, Nguyen Thai Ngoc Duy wrote: $ cd objects/pack

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-31 Thread Duy Nguyen
On Tue, Jan 1, 2013 at 1:06 AM, Shawn Pearce spea...@spearce.org wrote: 3. Dropping the commits file and just using the pack-*.idx as the index. The problem is that it is sparse in the commit space. So just naively storing 40 bytes per entry is going to waste a lot of space.

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-30 Thread Jeff King
On Sat, Dec 29, 2012 at 12:27:47AM -0500, Jeff King wrote: If reachability bitmap is implemented, we'll have per-pack cache infrastructure ready, so less work there for commit cache. True. I don't want to dissuade you from doing any commit cache work. I only wanted to point out that this

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-30 Thread Nguyen Thai Ngoc Duy
On Sun, Dec 30, 2012 at 7:05 PM, Jeff King p...@peff.net wrote: So I was thinking about this, which led to some coding, which led to some benchmarking. I like your way of thinking! May I suggest you take a new year break first, then think about reachability bitmaps ;-) 2013 will be an exciting

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-30 Thread Jeff King
On Sun, Dec 30, 2012 at 07:53:48PM +0700, Nguyen Thai Ngoc Duy wrote: $ cd objects/pack ls pack-a3e262f40d95fc0cc97d92797ff9988551367b75.commits pack-a3e262f40d95fc0cc97d92797ff9988551367b75.idx pack-a3e262f40d95fc0cc97d92797ff9988551367b75.pack

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-29 Thread Jeff King
On Sat, Dec 29, 2012 at 12:27:47AM -0500, Jeff King wrote: I think I tried the partial decompression for commit header and it did not help much (or I misremember it, not so sure). I'll see if I can dig up the reference, as it was something I was going to look at next. I tried the simple

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-29 Thread Jeff King
On Sat, Dec 29, 2012 at 04:05:58AM -0500, Jeff King wrote: On Sat, Dec 29, 2012 at 12:27:47AM -0500, Jeff King wrote: I think I tried the partial decompression for commit header and it did not help much (or I misremember it, not so sure). I'll see if I can dig up the reference, as

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-28 Thread Jeff King
On Mon, Nov 26, 2012 at 05:25:54PM +1100, David Michael Barr wrote: The intent is to allow selective recompression of pack data. For small objects/deltas the overhead of deflate is significant. This may improve read performance for the object graph. I ran some unscientific experiments

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-28 Thread Nguyen Thai Ngoc Duy
On Sat, Dec 29, 2012 at 7:41 AM, Jeff King p...@peff.net wrote: I wonder if we could do even better, though. For a traversal, we only need to look at the commit header. We could potentially do a progressive inflate and stop before getting to the commit message (which is the bulk of the data,
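
A rough illustration of the progressive-inflate idea quoted above, written in Python rather than git's C and not taken from any patch in this thread. It feeds a deflated commit body to zlib in small output chunks and stops at the first blank line, which separates the commit header from the message. The function name `inflate_commit_header` and the 64-byte chunk size are assumptions made for the sketch.

```python
import zlib


def inflate_commit_header(deflated: bytes, chunk: int = 64) -> bytes:
    """Inflate only until the blank line that ends the commit header.

    Hypothetical helper: 'deflated' is assumed to be a zlib stream holding
    a commit body (header lines, blank line, message).
    """
    d = zlib.decompressobj()
    out = bytearray()
    pending = deflated
    while not d.eof:
        # max_length caps how many bytes are produced per call, so we can
        # stop long before the (usually larger) commit message is inflated.
        piece = d.decompress(pending, chunk)
        out += piece
        end = out.find(b"\n\n")
        if end != -1:
            return bytes(out[:end])
        pending = d.unconsumed_tail
        if not piece and not pending:
            break  # input ran out before the stream ended (truncated data)
    return bytes(out)  # no blank line seen: return everything inflated


header = (
    b"tree " + b"0" * 40 + b"\n"
    b"author A U Thor <author@example.com> 0 +0000\n"
    b"committer A U Thor <author@example.com> 0 +0000"
)
commit = header + b"\n\n" + b"a long commit message line\n" * 200
assert inflate_commit_header(zlib.compress(commit)) == header
```

Whether this saves measurable time depends on how much of the inflate cost is setup rather than per-byte work, which is the question the thread goes on to discuss.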

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-28 Thread Jeff King
On Sat, Dec 29, 2012 at 11:34:09AM +0700, Nguyen Thai Ngoc Duy wrote: On Sat, Dec 29, 2012 at 7:41 AM, Jeff King p...@peff.net wrote: I wonder if we could do even better, though. For a traversal, we only need to look at the commit header. We could potentially do a progressive inflate and

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-28 Thread Nguyen Thai Ngoc Duy
On Sat, Dec 29, 2012 at 12:07 PM, Jeff King p...@peff.net wrote: On Sat, Dec 29, 2012 at 11:34:09AM +0700, Nguyen Thai Ngoc Duy wrote: On Sat, Dec 29, 2012 at 7:41 AM, Jeff King p...@peff.net wrote: I wonder if we could do even better, though. For a traversal, we only need to look at the

Re: [RFC] pack-objects: compression level for non-blobs

2012-12-28 Thread Jeff King
On Sat, Dec 29, 2012 at 12:25:04PM +0700, Nguyen Thai Ngoc Duy wrote: But just dropping the compression (or doing partial decompression when we only care about the beginning part) is way less code and complexity. I think I tried the partial decompression for commit header and it did not

Re: [RFC] pack-objects: compression level for non-blobs

2012-11-26 Thread David Michael Barr
Add config pack.graphcompression similar to pack.compression. Applies to non-blob objects and if unspecified falls back to pack.compression. We may identify objects compressed with level 0 by their leading bytes. Use this to force recompression when the source and target levels mismatch.
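
One way to read "identify objects compressed with level 0 by their leading bytes", sketched in Python and not taken from the patch itself: a zlib stream written at level 0 consists of stored (uncompressed) deflate blocks, so the block-type bits in the third byte are 00. The function name `first_block_is_stored` is made up for this illustration, and the check is only a heuristic, because zlib can also emit stored blocks at higher levels when the input does not compress.

```python
import zlib


def first_block_is_stored(stream: bytes) -> bool:
    """Heuristic: True if the first deflate block of a zlib stream is stored.

    Hypothetical helper for illustration; this is not the check used by the
    pack-objects patch.
    """
    if len(stream) < 3:
        return False
    cmf, flg = stream[0], stream[1]
    if cmf & 0x0F != 8:                 # CM field must be 8 (deflate)
        return False
    if ((cmf << 8) | flg) % 31 != 0:    # zlib header check value (RFC 1950)
        return False
    if flg & 0x20:                      # FDICT set: a dictionary id follows,
        return False                    # which this sketch does not handle
    # Deflate packs bits LSB first: bit 0 is BFINAL, bits 1-2 are BTYPE.
    btype = (stream[2] >> 1) & 0x03
    return btype == 0                   # BTYPE 00 means a stored block


assert first_block_is_stored(zlib.compress(b"tree entry data", 0))
assert not first_block_is_stored(zlib.compress(b"a" * 1000, 6))
```

A recompression pass could use a test like this to decide whether an existing object's level already matches the configured target, with a false positive costing only a redundant recompression.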