Re: [PATCH 4/6] introduce a commit metapack

2013-03-18 Thread Jeff King
On Sun, Mar 17, 2013 at 08:21:13PM +0700, Nguyen Thai Ngoc Duy wrote: On Thu, Jan 31, 2013 at 6:06 PM, Duy Nguyen pclo...@gmail.com wrote: On Wed, Jan 30, 2013 at 09:16:29PM +0700, Duy Nguyen wrote: Perhaps we could store abbrev sha-1 instead of full sha-1. Nice space/time trade-off.

Re: [PATCH 4/6] introduce a commit metapack

2013-03-17 Thread Duy Nguyen
On Thu, Jan 31, 2013 at 6:06 PM, Duy Nguyen pclo...@gmail.com wrote: On Wed, Jan 30, 2013 at 09:16:29PM +0700, Duy Nguyen wrote: Perhaps we could store abbrev sha-1 instead of full sha-1. Nice space/time trade-off. Following the on-disk format experiment yesterday, I changed the format to:

Re: [PATCH 4/6] introduce a commit metapack

2013-02-02 Thread Duy Nguyen
On Fri, Feb 1, 2013 at 5:15 PM, Jeff King p...@peff.net wrote: The short-sha1 is a clever idea. Looks like it saves us on the order of 4MB for linux-2.6 (versus the full 20-byte sha1). Not as big as the savings we get from dropping the other 3 sha1's to uint32_t, but still not bad. We could
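The size comparison above can be made concrete with a little arithmetic. What follows is only a sketch: the entry layout (commit key, uint32_t timestamp, then tree and two parents) is inferred from the commit_metapack() signature quoted elsewhere in this thread, and the function names, the abbreviation length, and the commit count are all hypothetical parameters, not values from the patch.

```c
#include <stddef.h>
#include <stdint.h>

enum { SHA1_RAWSZ = 20 };	/* bytes in a binary SHA-1 */

/* Assumed layout: full SHA-1 key, timestamp, and three full SHA-1 references. */
size_t entry_size_full_sha1(void)
{
	return SHA1_RAWSZ		/* commit key */
	     + sizeof(uint32_t)		/* timestamp */
	     + 3 * SHA1_RAWSZ;		/* tree, parent1, parent2 */
}

/*
 * Assumed layout: a key of key_len bytes (20 for a full SHA-1, fewer for
 * a short one), timestamp, and three uint32_t references.
 */
size_t entry_size_indexed(size_t key_len)
{
	return key_len + sizeof(uint32_t) + 3 * sizeof(uint32_t);
}

/* Total table size for nr_commits entries of entry_size bytes each. */
size_t table_size(size_t nr_commits, size_t entry_size)
{
	return nr_commits * entry_size;
}
```

Under these assumptions, turning the three references into uint32_t saves 48 bytes per commit, while shortening the key saves at most 20 minus key_len bytes per commit, which matches the observation that the short key is the smaller of the two savings.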

Re: [PATCH 4/6] introduce a commit metapack

2013-02-02 Thread Junio C Hamano
Jeff King p...@peff.net writes: On Thu, Jan 31, 2013 at 09:03:26AM -0800, Shawn O. Pearce wrote: ... If we are going to change the index to support extension sections and I have to modify JGit to grok this new format, it needs to be index v3 not index v2. If we are making index v3 we should

Re: [PATCH 4/6] introduce a commit metapack

2013-02-01 Thread Jeff King
On Tue, Jan 29, 2013 at 11:17:41PM -0800, Junio C Hamano wrote: True, but it is even less headache if the file is totally separate and optional. Once you start thinking about using an offset to some list of SHA-1, perhaps? A section inside the same file can never go out of sync. Yes,

Re: [PATCH 4/6] introduce a commit metapack

2013-02-01 Thread Jeff King
On Thu, Jan 31, 2013 at 09:03:26AM -0800, Shawn O. Pearce wrote: Of course, it is more convenient to store this kind of things in a separate file while experimenting and improving the mechanism, but I do not think we want to see each packfile in a repository comes with 47 auxiliary files

Re: [PATCH 4/6] introduce a commit metapack

2013-02-01 Thread Jeff King
On Wed, Jan 30, 2013 at 08:56:07PM +0700, Nguyen Thai Ngoc Duy wrote: Another point, but not really important at this stage, I think we have memory leak somewhere (lookup_commit??). It used up to 800 MB RES on linux-2.6.git while generating the cache. We generate (and then leak!) the linked

Re: [PATCH 4/6] introduce a commit metapack

2013-02-01 Thread Jeff King
On Thu, Jan 31, 2013 at 06:06:56PM +0700, Nguyen Thai Ngoc Duy wrote: On Wed, Jan 30, 2013 at 09:16:29PM +0700, Duy Nguyen wrote: Perhaps we could store abbrev sha-1 instead of full sha-1. Nice space/time trade-off. Following the on-disk format experiment yesterday, I changed the format

Re: [PATCH 4/6] introduce a commit metapack

2013-01-31 Thread Duy Nguyen
On Wed, Jan 30, 2013 at 09:16:29PM +0700, Duy Nguyen wrote: Perhaps we could store abbrev sha-1 instead of full sha-1. Nice space/time trade-off. Following the on-disk format experiment yesterday, I changed the format to: - a list of _short_ SHA-1 of cached commits - a list of cache entries,
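A lookup against a list of short SHA-1s might look roughly like the sketch below. It assumes the list is sorted and that every key has the same fixed length; neither detail is confirmed by the truncated message above, and find_short_sha1() is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical lookup: `keys` holds `nr` sorted prefixes of `key_len`
 * bytes each, packed back to back. Returns the position of the prefix
 * matching the start of `sha1`, or -1 if there is none. The position
 * would then select the entry at the same position in the second list.
 */
long find_short_sha1(const unsigned char *keys, size_t nr, size_t key_len,
		     const unsigned char *sha1)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(sha1, keys + mid * key_len, key_len);

		if (!cmp)
			return (long)mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1;
}
```

A prefix match alone does not prove that the full SHA-1 matches, so a caller would still need some way to rule out a different commit that shares the same prefix; how the actual patch handles that is not shown here.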

Re: [PATCH 4/6] introduce a commit metapack

2013-01-31 Thread Shawn Pearce
On Wed, Jan 30, 2013 at 7:56 AM, Junio C Hamano gits...@pobox.com wrote: Jeff King p...@peff.net writes: From this: Then it will be very natural for the extension data that store the commit metainfo to name objects in the pack the .idx file describes by the offset in the SHA-1 table. I

Re: [PATCH 4/6] introduce a commit metapack

2013-01-30 Thread Duy Nguyen
On Tue, Jan 29, 2013 at 04:16:11AM -0500, Jeff King wrote: When we are doing a commit traversal that does not need to look at the commit messages themselves (e.g., rev-list, merge-base, etc), we spend a lot of time accessing, decompressing, and parsing the commit objects just to find the

Re: [PATCH 4/6] introduce a commit metapack

2013-01-30 Thread Duy Nguyen
On Wed, Jan 30, 2013 at 8:56 PM, Duy Nguyen pclo...@gmail.com wrote: However, performance seems to suffer too. Maybe I do more lookups than necessary, I don't know. Yes, I should have stored the position in the sha-1 - offset map instead of the position of the object in .pack file. Even so,
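The difference between the two kinds of reference can be sketched as follows. If an entry stores a position in the sorted SHA-1 table, resolving it is a single array index, whereas a stored .pack offset first has to be mapped back to an object name. The helper names and the 4-byte big-endian encoding of the stored position are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

enum { SHA1_RAWSZ = 20 };	/* bytes in a binary SHA-1 */

/* Decode a 4-byte big-endian integer (assumed on-disk encoding). */
uint32_t get_be32_sketch(const unsigned char *p)
{
	return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

/*
 * Resolve a stored table position to the SHA-1 it names.
 * `sha1_table` holds `nr` binary SHA-1s back to back.
 * Returns NULL when the stored position is out of range.
 */
const unsigned char *sha1_at(const unsigned char *sha1_table, uint32_t nr,
			     const unsigned char *stored_pos)
{
	uint32_t pos = get_be32_sketch(stored_pos);

	if (pos >= nr)
		return NULL;
	return sha1_table + (size_t)pos * SHA1_RAWSZ;
}
```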

Re: [PATCH 4/6] introduce a commit metapack

2013-01-30 Thread Junio C Hamano
Jeff King p...@peff.net writes: From this: Then it will be very natural for the extension data that store the commit metainfo to name objects in the pack the .idx file describes by the offset in the SHA-1 table. I guess your argument is that putting it all in the same file makes it more

Re: [PATCH 4/6] introduce a commit metapack

2013-01-29 Thread Michael Haggerty
On 01/29/2013 10:16 AM, Jeff King wrote: When we are doing a commit traversal that does not need to look at the commit messages themselves (e.g., rev-list, merge-base, etc), we spend a lot of time accessing, decompressing, and parsing the commit objects just to find the parent and timestamp
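To illustrate the parsing work being described, here is a minimal sketch that pulls the committer timestamp out of an already-decompressed commit buffer. It is not git's own parser; committer_timestamp() is a hypothetical helper that relies only on the documented commit header layout (header lines, then a blank line, then the message).

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return the committer timestamp from an uncompressed commit buffer,
 * or 0 if no committer line is found in the header. The header ends at
 * the first blank line, so text in the commit message is never scanned.
 */
unsigned long committer_timestamp(const char *buf)
{
	const char *line = buf;

	while (line && *line && *line != '\n') {
		const char *eol = strchr(line, '\n');

		if (!strncmp(line, "committer ", 10)) {
			/* The timestamp follows the closing '>' of the email. */
			const char *gt = strchr(line, '>');

			if (gt && (!eol || gt < eol))
				return strtoul(gt + 1, NULL, 10);
			return 0;
		}
		line = eol ? eol + 1 : NULL;
	}
	return 0;
}
```

Even this small amount of scanning happens only after the object has been located and inflated, which is the per-commit cost a cache of parents and timestamps would avoid.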

Re: [PATCH 4/6] introduce a commit metapack

2013-01-29 Thread Jeff King
On Tue, Jan 29, 2013 at 11:24:45AM +0100, Michael Haggerty wrote: On 01/29/2013 10:16 AM, Jeff King wrote: When we are doing a commit traversal that does not need to look at the commit messages themselves (e.g., rev-list, merge-base, etc), we spend a lot of time accessing, decompressing,

Re: [PATCH 4/6] introduce a commit metapack

2013-01-29 Thread Junio C Hamano
Jeff King p...@peff.net writes:
+int commit_metapack(unsigned char *sha1,
+                    uint32_t *timestamp,
+                    unsigned char **tree,
+                    unsigned char **parent1,
+                    unsigned char **parent2)
+{
+	struct commit_metapack *p;
+

Re: [PATCH 4/6] introduce a commit metapack

2013-01-29 Thread Duy Nguyen
On Tue, Jan 29, 2013 at 4:16 PM, Jeff King p...@peff.net wrote:
+int commit_metapack(unsigned char *sha1,
+                    uint32_t *timestamp,
+                    unsigned char **tree,
+                    unsigned char **parent1,
+                    unsigned char **parent2)
+{
Nit

Re: [PATCH 4/6] introduce a commit metapack

2013-01-29 Thread Jeff King
On Tue, Jan 29, 2013 at 09:38:10AM -0800, Junio C Hamano wrote:
Jeff King p...@peff.net writes:
+int commit_metapack(unsigned char *sha1,
+                    uint32_t *timestamp,
+                    unsigned char **tree,
+                    unsigned char **parent1,
+                    unsigned char

Re: [PATCH 4/6] introduce a commit metapack

2013-01-29 Thread Jeff King
On Wed, Jan 30, 2013 at 10:36:10AM +0700, Nguyen Thai Ngoc Duy wrote:
On Tue, Jan 29, 2013 at 4:16 PM, Jeff King p...@peff.net wrote:
+int commit_metapack(unsigned char *sha1,
+                    uint32_t *timestamp,
+                    unsigned char **tree,
+                    unsigned