On Sun, Apr 24, 2016 at 18:06:51 -0400, Emilio G. Cota wrote:
> On Sun, Apr 24, 2016 at 12:46:08 -0700, Richard Henderson wrote:
> > On 04/22/2016 04:57 PM, Emilio G. Cota wrote:
> > >On Fri, Apr 22, 2016 at 12:59:52 -0700, Richard Henderson wrote:
> > >>FWIW, so that I could get an idea of how the stats change as we improve the
> > >>hashing, I inserted the attachment 1 patch between patches 5 and 6, and with
> > >>attachment 2 attempting to fix the accounting for patches 9 and 10.
> > >
> > >For qht, I dislike the approach of reporting "avg chain" per-element,
> > >instead of per-bucket. Performance for a bucket whose entries are
> > >all valid is virtually the same as that of a bucket that only
> > >has one valid element; thus, with per-bucket reporting, we'd say that
> > >the chain length is 1 in both cases, i.e. "perfect". With per-element
> > >reporting, we'd report 4 (on a 64-bit host, since that's the value of
> > >QHT_BUCKET_ENTRIES) when the bucket is full, which IMO gives the
> > >wrong idea (users would think they're in trouble, when they're not).
> >
> > But otherwise you have no way of knowing how full the buckets are. The
> > bucket size is just something that you have to keep in mind.
>
> I'll make some changes in v4 that I think will address both your and
> my concerns:
> - Report the number of empty buckets
> - Do not count empty buckets when reporting avg bucket chain length
> - Report average bucket occupancy (in %, so that QHT_BUCKET_ENTRIES
>   does not have to be reported.)
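To make the three numbers listed above concrete, here is a minimal sketch of how they could be computed. It is not the actual qht code: `struct bucket`, `struct ht_stats`, `ht_collect_stats()` and `BUCKET_ENTRIES` are hypothetical names made up for illustration. It assumes a table of head buckets with overflow buckets chained via `next`, ignores locking entirely, and averages occupancy over all chains, empty ones included.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value; stands in for QHT_BUCKET_ENTRIES on a 64-bit host. */
#define BUCKET_ENTRIES 4

/* Hypothetical bucket layout, for illustration only. */
struct bucket {
    const void *entries[BUCKET_ENTRIES]; /* NULL marks a free slot */
    struct bucket *next;                 /* overflow bucket in the same chain */
};

struct ht_stats {
    size_t head_buckets;      /* total number of head buckets */
    size_t used_head_buckets; /* head buckets whose chain holds >= 1 entry */
    double avg_chain_len;     /* buckets per chain, empty chains excluded */
    double avg_occupancy_pct; /* mean per-chain fill in %, empty chains included */
};

static struct ht_stats ht_collect_stats(const struct bucket *heads, size_t n)
{
    struct ht_stats st = { .head_buckets = n };
    size_t chain_buckets = 0;   /* buckets summed over non-empty chains */
    double occupancy_sum = 0.0; /* per-chain fill ratios summed over all chains */

    assert(heads != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        size_t buckets = 0, entries = 0;

        /* Walk the chain that starts at this head bucket. */
        for (const struct bucket *b = &heads[i]; b != NULL; b = b->next) {
            buckets++;
            for (int j = 0; j < BUCKET_ENTRIES; j++) {
                if (b->entries[j] != NULL) {
                    entries++;
                }
            }
        }
        if (entries > 0) {
            /* Only non-empty chains contribute to the chain length. */
            st.used_head_buckets++;
            chain_buckets += buckets;
        }
        /* Every chain, empty or not, contributes to the occupancy. */
        occupancy_sum += (double)entries / (double)(buckets * BUCKET_ENTRIES);
    }
    if (st.used_head_buckets > 0) {
        st.avg_chain_len = (double)chain_buckets / (double)st.used_head_buckets;
    }
    if (n > 0) {
        st.avg_occupancy_pct = 100.0 * occupancy_sum / (double)n;
    }
    return st;
}
```

With this accounting, a head bucket with one valid entry and a head bucket with all entries valid both count as a chain of length 1; only the occupancy figure tells them apart.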
How does the following look?

Example with good hashing, i.e. func5(phys_pc, pc, flags):
TB count            704242/1342156
[...]
TB hash buckets     386484/524288 (73.72% used)
TB hash occupancy   32.57% avg chain occupancy. Histogram: 0-10%▆▁█▁▁▅▁▃▁▁90-100%
TB hash avg chain   1.02 buckets. Histogram: 1█▁▁3

Example with bad hashing, i.e. func5(phys_pc, 0, 0):
TB count            710748/1342156
[...]
TB hash buckets     113569/524288 (21.66% used)
TB hash occupancy   10.24% avg chain occupancy. Histogram: 0-10%█▁▁▁▁▁▁▁▁▁90-100%
TB hash avg chain   2.11 buckets. Histogram: 1▇▁▁▁▁▁▁▁▁▁93

Note that:

- "TB hash avg chain" does _not_ count empty buckets. This gives an idea of
  how many buckets a typical hit goes through.

- "TB hash occupancy" _does_ count empty buckets. It is called
  "avg chain occupancy" and not "avg occupancy" because the counts are
  only valid per-chain due to the seqlock protecting each chain.

Thanks,

		Emilio