Re: [PATCH 0/4] Bloom filter experiment

2018-10-17 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason writes:
>> This is all to say: having a maximum size is good. 512 is big enough
>> to cover _most_ commits, but not so big that we may store _really_ big
>> filters.
>
> Makes sense. 512 is good enough to hardcode initially, but I couldn't
> tell from briefly skimming the

Re: [PATCH 0/4] Bloom filter experiment

2018-10-16 Thread Jonathan Tan
> | Implementation | Queries | Maybe | FP # | FP %  |
> |----------------|---------|-------|------|-------|
> | Szeder         | 66095   | 1142  | 256  | 0.38% |
> | Jonathan       | 66459   | 107   | 89   | 0.16% |
> | Stolee         | 53025   | 492   | 479  | 0.90% |
>
> (Note that we must have

Re: [PATCH 0/4] Bloom filter experiment

2018-10-16 Thread Derrick Stolee
On 10/16/2018 8:57 AM, Ævar Arnfjörð Bjarmason wrote: On Tue, Oct 16 2018, Derrick Stolee wrote: On 10/16/2018 12:45 AM, Junio C Hamano wrote: Derrick Stolee writes: 2. The filters are sized according to the number of changes in each commit, with a minimum of one 64-bit word. ... 6. When

Re: [PATCH 0/4] Bloom filter experiment

2018-10-16 Thread Ævar Arnfjörð Bjarmason
On Tue, Oct 16 2018, Derrick Stolee wrote:

> On 10/16/2018 12:45 AM, Junio C Hamano wrote:
>> Derrick Stolee writes:
>>
>>> 2. The filters are sized according to the number of changes in each
>>> commit, with a minimum of one 64-bit word.
>>> ...
>>> 6. When we compute the Bloom filters, we

Re: [PATCH 0/4] Bloom filter experiment

2018-10-16 Thread Derrick Stolee
On 10/16/2018 12:45 AM, Junio C Hamano wrote: Derrick Stolee writes: 2. The filters are sized according to the number of changes in each commit, with a minimum of one 64-bit word. ... 6. When we compute the Bloom filters, we don't store a filter for commits whose first-parent diff has more

Re: [PATCH 0/4] Bloom filter experiment

2018-10-15 Thread Junio C Hamano
Derrick Stolee writes:
> 2. The filters are sized according to the number of changes in each
> commit, with a minimum of one 64-bit word.
> ...
> 6. When we compute the Bloom filters, we don't store a filter for
> commits whose first-parent diff has more than 512 paths.

Just being curious but
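
The two quoted points (filters sized by the number of changes with a minimum of one 64-bit word, and no filter stored above 512 paths) can be illustrated with a small sketch. This is not the code from the patch series: the bits-per-path ratio, the number of hash functions, the SHA-256 based double hashing, and the names build_filter and maybe_changed are all assumptions made for illustration. Only the 64-bit minimum and the 512-path cutoff come from the thread.

```python
import hashlib

MAX_CHANGED_PATHS = 512  # from the thread: no filter is stored above this
WORD_BITS = 64           # from the thread: minimum size is one 64-bit word
BITS_PER_PATH = 10       # assumption, not taken from the patches
NUM_HASHES = 7           # assumption, not taken from the patches


def _bit_positions(path, num_bits):
    """Derive NUM_HASHES bit positions via double hashing (an assumed scheme)."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "little")
    h2 = int.from_bytes(digest[8:16], "little") | 1  # keep the step non-zero
    return [(h1 + i * h2) % num_bits for i in range(NUM_HASHES)]


def build_filter(changed_paths):
    """Return (num_bits, bits) for a commit, or None if it changes too many paths."""
    paths = set(changed_paths)
    if len(paths) > MAX_CHANGED_PATHS:
        return None  # too large: store nothing for this commit
    # Round up to whole 64-bit words, with at least one word.
    num_words = max(1, -(-len(paths) * BITS_PER_PATH // WORD_BITS))
    num_bits = num_words * WORD_BITS
    bits = 0
    for path in paths:
        for pos in _bit_positions(path, num_bits):
            bits |= 1 << pos
    return num_bits, bits


def maybe_changed(filt, path):
    """False means 'definitely not changed'; True means 'maybe changed'."""
    if filt is None:
        return True  # no filter stored, so the caller must run the real diff
    num_bits, bits = filt
    return all((bits >> pos) & 1 for pos in _bit_positions(path, num_bits))
```

In this sketch a missing filter is treated as "maybe", so a commit over the cutoff simply falls back to the ordinary diff. A path that was added to a filter always answers True; a path that was not added usually answers False, and occasionally True (a false positive).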

Re: [PATCH 0/4] Bloom filter experiment

2018-10-15 Thread Derrick Stolee
On 10/9/2018 3:34 PM, SZEDER Gábor wrote: To keep the ball rolling, here is my proof of concept in a somewhat cleaned-up form, with still plenty of rough edges. Peff, Szeder, and Jonathan, Thanks for giving me the kick in the pants to finally write a proof of concept for my personal take on

Re: [PATCH 0/4] Bloom filter experiment

2018-10-09 Thread Derrick Stolee
On 10/9/2018 3:34 PM, SZEDER Gábor wrote:
To keep the ball rolling, here is my proof of concept in a somewhat cleaned-up form, with still plenty of rough edges. You can play around with it like this:

  $ GIT_USE_POC_BLOOM_FILTER=$((8*1024*1024*8)) git commit-graph write
  Computing commit

[PATCH 0/4] Bloom filter experiment

2018-10-09 Thread SZEDER Gábor
To keep the ball rolling, here is my proof of concept in a somewhat cleaned-up form, with still plenty of rough edges. You can play around with it like this:

  $ GIT_USE_POC_BLOOM_FILTER=$((8*1024*1024*8)) git commit-graph write
  Computing commit graph generation numbers: 100% (52801/52801),