Re: [PATCH] Hold an 'unsigned long' chunk of the sha1 in obj_hash

2013-05-02 Thread Thomas Rast
Jeff King p...@peff.net writes: It _might_ still be advantageous to do your patch on top, but I suspect it will diminish the returns from your patch (since the point of it is to probe less far down the chain on average). No, mine makes it slower again. Apparently the increased size is no

Re: [PATCH] Hold an 'unsigned long' chunk of the sha1 in obj_hash

2013-05-01 Thread Jeff King
On Thu, Apr 25, 2013 at 08:04:01PM +0200, Thomas Rast wrote: And probing lookups happen a lot: some simple instrumentation shows that 'git rev-list --all --objects' on my git.git, * 19.4M objects are accessed in lookup_object and grow_object_hash combined, while * the linear probing

[PATCH] Hold an 'unsigned long' chunk of the sha1 in obj_hash

2013-04-25 Thread Thomas Rast
The existing obj_hash is really straightforward: it holds a struct object * and spills into the subsequent slots (linear probing), which is good enough because it doesn't need to support deletion. However, this arrangement has pretty bad cache locality in the case of collisions. Because the sha1

Re: [PATCH] Hold an 'unsigned long' chunk of the sha1 in obj_hash

2013-04-25 Thread Junio C Hamano
Thomas Rast tr...@inf.ethz.ch writes: So we take a slightly different approach, and trade some memory for better cache locality. Interesting. It feels somewhat bait-and-switch to reveal that the above some turns out to be double later, but the resulting code does not look too bad, and the

Re: [PATCH] Hold an 'unsigned long' chunk of the sha1 in obj_hash

2013-04-25 Thread Thomas Rast
Junio C Hamano gits...@pobox.com writes: Thomas Rast tr...@inf.ethz.ch writes: So we take a slightly different approach, and trade some memory for better cache locality. Interesting. It feels somewhat bait-and-switch to reveal that the above some turns out to be double later, but the

Re: [PATCH] Hold an 'unsigned long' chunk of the sha1 in obj_hash

2013-04-25 Thread Duy Nguyen
On Fri, Apr 26, 2013 at 1:04 AM, Thomas Rast tr...@inf.ethz.ch wrote: So we take a slightly different approach, and trade some memory for better cache locality. Namely, we change the hash table slots to contain struct object *obj; unsigned long sha1prefix; We use this new 'sha1prefix'