On Thu, Apr 28, 2016 at 10:25 PM, Linus Torvalds
wrote:
> It's the hashes that _look_ like they might be good hashes, but
> there's not a lot of analysis behind it, that I would worry about. The
> simple prime modulus _should_ be fine, but at the same time I kind
On Thu, Apr 28, 2016 at 10:25 PM, Linus Torvalds
wrote:
> It's the hashes that _look_ like they might be good hashes, but
> there's not a lot of analysis behind it, that I would worry about. The
> simple prime modulus _should_ be fine, but at the same time I kind of
> suspect we can do better.
On Fri, May 13, 2016 at 8:54 PM, George Spelvin wrote:
>
> There are exactly three architectures which (some models) don't have
> an efficient 32x32->32-bit multiply:
>
> - arch/m58k: MC68000 (and 68010 and 68328) no-mmu
> - arch/h8300: Most (all?) of the H8 processor series
>
On Fri, May 13, 2016 at 8:54 PM, George Spelvin wrote:
>
> There are exactly three architectures which (some models) don't have
> an efficient 32x32->32-bit multiply:
>
> - arch/m58k: MC68000 (and 68010 and 68328) no-mmu
> - arch/h8300: Most (all?) of the H8 processor series
> - arch/microblaze:
On May 1, 2016 at 9:51 AM, Lnus Torvalds wrote:
> On Sun, May 1, 2016 at 2:43 AM, George Spelvin wrote:
>> * If you feel ambitious, add a 32-bit CONFIG_ARCH_HAS_SLOW_MULTIPLIER
>> exception path.
> Let's make that a separate worry, and just fix hash_64() first.
>
> In
On May 1, 2016 at 9:51 AM, Lnus Torvalds wrote:
> On Sun, May 1, 2016 at 2:43 AM, George Spelvin wrote:
>> * If you feel ambitious, add a 32-bit CONFIG_ARCH_HAS_SLOW_MULTIPLIER
>> exception path.
> Let's make that a separate worry, and just fix hash_64() first.
>
> In particular, that means
On Fri, 2016-04-29 at 16:51 -0700, Linus Torvalds wrote:
> On Fri, Apr 29, 2016 at 2:10 PM, Linus Torvalds
> wrote:
> > On Thu, Apr 28, 2016 at 11:32 AM, Linus Torvalds
> > wrote:
> >>
> >> hash_long/ptr really shouldn't care about
On Fri, 2016-04-29 at 16:51 -0700, Linus Torvalds wrote:
> On Fri, Apr 29, 2016 at 2:10 PM, Linus Torvalds
> wrote:
> > On Thu, Apr 28, 2016 at 11:32 AM, Linus Torvalds
> > wrote:
> >>
> >> hash_long/ptr really shouldn't care about the low bits being zero at
> >> all, because it should mix in
On Sun, 1 May 2016, George Spelvin wrote:
> But I noticed a much greater difference.
>
> Wang * PHI% 4093 Wang * PHI% 4093
> 13599486 3494816 5238442 13584166 3460266 5239463
> 13589552 3466764 5237327 13582381 3422304 5276253
> 13569409 3407317 5236239
On Sun, 1 May 2016, George Spelvin wrote:
> But I noticed a much greater difference.
>
> Wang * PHI% 4093 Wang * PHI% 4093
> 13599486 3494816 5238442 13584166 3460266 5239463
> 13589552 3466764 5237327 13582381 3422304 5276253
> 13569409 3407317 5236239
On Sun, May 1, 2016 at 2:43 AM, George Spelvin wrote:
>
> Anyway, my recommendation (I'll write the patches if you like) is:
>
> * Modify the multiplicative constants to be
> #define COLDEN_RATIO_32 0x61C88647
> #define COLDEN_RATIO_64 0x61C8864680B583EB
Interestingly,
On Sun, May 1, 2016 at 2:43 AM, George Spelvin wrote:
>
> Anyway, my recommendation (I'll write the patches if you like) is:
>
> * Modify the multiplicative constants to be
> #define COLDEN_RATIO_32 0x61C88647
> #define COLDEN_RATIO_64 0x61C8864680B583EB
Interestingly, looking at where
> Sorry I did not express myself clear enough.
> hash64 (the single multiply with the adjusted golden ratio) is
> slightly faster than the modulo one which has two mutiplications.
Yes, I figured out that was probably what you were talking about,
and benchmarked it similarly.
But I noticed a
> Sorry I did not express myself clear enough.
> hash64 (the single multiply with the adjusted golden ratio) is
> slightly faster than the modulo one which has two mutiplications.
Yes, I figured out that was probably what you were talking about,
and benchmarked it similarly.
But I noticed a
On Sat, 30 Apr 2016, George Spelvin wrote:
> Thomas Gleixner wrote:
> You say that
> > hash64 is slightly faster as the modulo prime as it does not have the
> > multiplication.
>
> Um... are you sure you benchmarked that right? The hash_64 code you
> used (Thomas Wang's 64->32-bit hash) has a
On Sat, 30 Apr 2016, George Spelvin wrote:
> Thomas Gleixner wrote:
> You say that
> > hash64 is slightly faster as the modulo prime as it does not have the
> > multiplication.
>
> Um... are you sure you benchmarked that right? The hash_64 code you
> used (Thomas Wang's 64->32-bit hash) has a
Thomas Gleixner wrote:
> I'll send a patch to replace hash_64 and hash_32.
Before you do that, could we look for a way to tweak the constants
in the existing hash?
It seems the basic "take the high bits of x * K" algorithm is actually
a decent hash function if K is chosen properly, and has a
Thomas Gleixner wrote:
> I'll send a patch to replace hash_64 and hash_32.
Before you do that, could we look for a way to tweak the constants
in the existing hash?
It seems the basic "take the high bits of x * K" algorithm is actually
a decent hash function if K is chosen properly, and has a
On Sat, 2016-04-30 at 10:12 -0700, Linus Torvalds wrote:
> On Sat, Apr 30, 2016 at 9:45 AM, Eric Dumazet wrote:
> >
> > I use hash_32() in net/sched/sch_fq.c, for all packets sent by Google
> > servers. (Note that I did _not_ use hash_ptr())
> >
> > That's gazillions of
On Sat, 2016-04-30 at 10:12 -0700, Linus Torvalds wrote:
> On Sat, Apr 30, 2016 at 9:45 AM, Eric Dumazet wrote:
> >
> > I use hash_32() in net/sched/sch_fq.c, for all packets sent by Google
> > servers. (Note that I did _not_ use hash_ptr())
> >
> > That's gazillions of packets per second, and
On Sat, Apr 30, 2016 at 9:45 AM, Eric Dumazet wrote:
>
> I use hash_32() in net/sched/sch_fq.c, for all packets sent by Google
> servers. (Note that I did _not_ use hash_ptr())
>
> That's gazillions of packets per second, and the current multiply worked
> just fine in term
On Sat, Apr 30, 2016 at 9:45 AM, Eric Dumazet wrote:
>
> I use hash_32() in net/sched/sch_fq.c, for all packets sent by Google
> servers. (Note that I did _not_ use hash_ptr())
>
> That's gazillions of packets per second, and the current multiply worked
> just fine in term of hash spreading.
So
On Sat, 2016-04-30 at 15:02 +0200, Thomas Gleixner wrote:
> Yes. So I tested those two:
>
> u32 hash_64(u64 key)
> {
>key = ~key + (key << 18);
>key ^= key >> 31;
>key += (key << 2)) + (key << 4);
>key ^= key >> 11;
>key += key << 6;
>key ^= key
On Sat, 2016-04-30 at 15:02 +0200, Thomas Gleixner wrote:
> Yes. So I tested those two:
>
> u32 hash_64(u64 key)
> {
>key = ~key + (key << 18);
>key ^= key >> 31;
>key += (key << 2)) + (key << 4);
>key ^= key >> 11;
>key += key << 6;
>key ^= key
On Fri, 29 Apr 2016, Linus Torvalds wrote:
> Picking a new value almost at random (I say "almost", because I just
> started with that 32-bit multiplicand value that mostly works and
> shifted it up by 32 bits and then randomly added a few more bits to
> avoid long ranges of ones and zeroes), I
On Fri, 29 Apr 2016, Linus Torvalds wrote:
> Picking a new value almost at random (I say "almost", because I just
> started with that 32-bit multiplicand value that mostly works and
> shifted it up by 32 bits and then randomly added a few more bits to
> avoid long ranges of ones and zeroes), I
On Thu, 28 Apr 2016, Linus Torvalds wrote:
> It's the hashes that _look_ like they might be good hashes, but
> there's not a lot of analysis behind it, that I would worry about. The
> simple prime modulus _should_ be fine, but at the same time I kind of
> suspect we can do better. Especially since
On Thu, 28 Apr 2016, Linus Torvalds wrote:
> It's the hashes that _look_ like they might be good hashes, but
> there's not a lot of analysis behind it, that I would worry about. The
> simple prime modulus _should_ be fine, but at the same time I kind of
> suspect we can do better. Especially since
> Not doing it for 64-bit constants makes no sense if it just uses the
> trivial Booth's algorithm version.
AFAICT, gcc 5 *does* optimize 64-bit multiplies by constants.
Does the belief that it doesn't date back to some really old
version?
There's still a threshold where it just punts to the
> Not doing it for 64-bit constants makes no sense if it just uses the
> trivial Booth's algorithm version.
AFAICT, gcc 5 *does* optimize 64-bit multiplies by constants.
Does the belief that it doesn't date back to some really old
version?
There's still a threshold where it just punts to the
On Fri, 2016-04-29 at 16:51 -0700, Linus Torvalds wrote:
> There's presumably a few optimal values from a "spread bits out
> evenly" standpoint, and they won't have anything to do with random
> irrational constants, and will have everything to do with having nice
> bitpatterns.
>
> I'm adding
On Fri, 2016-04-29 at 16:51 -0700, Linus Torvalds wrote:
> There's presumably a few optimal values from a "spread bits out
> evenly" standpoint, and they won't have anything to do with random
> irrational constants, and will have everything to do with having nice
> bitpatterns.
>
> I'm adding
On Fri, Apr 29, 2016 at 5:32 PM, George Spelvin wrote:
>
> Odd is important. If the multiplier is even, the msbit of the input
> doesn't affect the hash result at all.
Fair enough. My test-set was incomplete.
>> Yeah. gcc will actually do the clever stuff for the 32-bit
On Fri, Apr 29, 2016 at 5:32 PM, George Spelvin wrote:
>
> Odd is important. If the multiplier is even, the msbit of the input
> doesn't affect the hash result at all.
Fair enough. My test-set was incomplete.
>> Yeah. gcc will actually do the clever stuff for the 32-bit case, afaik.
>
> It's
> At least for my tests, even that seems to actually be a total
> non-issue. Yes, odd values *might* be better, but as mentioned in my
> crossing email, it doesn't actually seem to matter for any case the
> kernel cares about, since we tend to want to hash down to 10-20 bits
> of data, so the
> At least for my tests, even that seems to actually be a total
> non-issue. Yes, odd values *might* be better, but as mentioned in my
> crossing email, it doesn't actually seem to matter for any case the
> kernel cares about, since we tend to want to hash down to 10-20 bits
> of data, so the
On Fri, Apr 29, 2016 at 4:31 PM, George Spelvin wrote:
>
> After researching it, I think that the "high bits of a multiply" is
> in fact a decent way to do such a hash.
Our emails crossed. Yes. My test harness actually likes the
multiplication more than most of the specialized
On Fri, Apr 29, 2016 at 4:31 PM, George Spelvin wrote:
>
> After researching it, I think that the "high bits of a multiply" is
> in fact a decent way to do such a hash.
Our emails crossed. Yes. My test harness actually likes the
multiplication more than most of the specialized "spread out bits"
On Fri, Apr 29, 2016 at 2:10 PM, Linus Torvalds
wrote:
> On Thu, Apr 28, 2016 at 11:32 AM, Linus Torvalds
> wrote:
>>
>> hash_long/ptr really shouldn't care about the low bits being zero at
>> all, because it should mix in all the
On Fri, Apr 29, 2016 at 2:10 PM, Linus Torvalds
wrote:
> On Thu, Apr 28, 2016 at 11:32 AM, Linus Torvalds
> wrote:
>>
>> hash_long/ptr really shouldn't care about the low bits being zero at
>> all, because it should mix in all the bits (using a prime multiplier
>> and taking the high bits).
>
>
On Fri, Apr 29, 2016 at 9:32 PM, Linus Torvalds
wrote:
wrote:
> For example, that _long_ range of bits set ("7fffc" in the middle)
> is effectively just one bit set with a subtraction. And it's *right*
> in that bit area that is supposed to shuffle bits 14-40 to
On Fri, Apr 29, 2016 at 9:32 PM, Linus Torvalds
wrote:
wrote:
> For example, that _long_ range of bits set ("7fffc" in the middle)
> is effectively just one bit set with a subtraction. And it's *right*
> in that bit area that is supposed to shuffle bits 14-40 to the high bits
> (which is what
On Thu, Apr 28, 2016 at 11:32 AM, Linus Torvalds
wrote:
>
> hash_long/ptr really shouldn't care about the low bits being zero at
> all, because it should mix in all the bits (using a prime multiplier
> and taking the high bits).
Looking at this assertion, it
On Thu, Apr 28, 2016 at 11:32 AM, Linus Torvalds
wrote:
>
> hash_long/ptr really shouldn't care about the low bits being zero at
> all, because it should mix in all the bits (using a prime multiplier
> and taking the high bits).
Looking at this assertion, it doesn't actually pan out very well at
Linus wrote:
> Having looked around at other hashes, I suspect we should look at the
> ones that do five or six shifts, and a mix of add/sub and xor. And
> because they shift the bits around more freely you don't have the
> final shift (that ends up being dependent on the size of the target
>
Linus wrote:
> Having looked around at other hashes, I suspect we should look at the
> ones that do five or six shifts, and a mix of add/sub and xor. And
> because they shift the bits around more freely you don't have the
> final shift (that ends up being dependent on the size of the target
>
On Thu, Apr 28, 2016 at 7:57 PM, George Spelvin wrote:
>
> How many 32-bit platforms nead a multiplier that's easy for GCC to
> evaluate via shifts and adds?
>
> Generlly, by the time you've got a machine grunty enough to
> need 64 bits, a multiplier is quite affordable.
On Thu, Apr 28, 2016 at 7:57 PM, George Spelvin wrote:
>
> How many 32-bit platforms nead a multiplier that's easy for GCC to
> evaluate via shifts and adds?
>
> Generlly, by the time you've got a machine grunty enough to
> need 64 bits, a multiplier is quite affordable.
Probably true.
That
Thomas Gleixner wrote:
> I'm not a hashing wizard and I completely failed to understand why
> hash_long/ptr are so horrible for the various test cases I ran.
It's very simple: the constants chosen are bit-sparse, *particularly*
in the least significant bits, and only 32/64 bits of the product are
Thomas Gleixner wrote:
> I'm not a hashing wizard and I completely failed to understand why
> hash_long/ptr are so horrible for the various test cases I ran.
It's very simple: the constants chosen are bit-sparse, *particularly*
in the least significant bits, and only 32/64 bits of the product are
On Thu, Apr 28, 2016 at 4:26 PM, Thomas Gleixner wrote:
>
> I'll try to dig up some time to analyze the hash_long failure unless someone
> familiar with the problem is beating me to it.
I'm not sure you need to spend time analyzing failure: if you get bad
hashing with
On Thu, Apr 28, 2016 at 4:26 PM, Thomas Gleixner wrote:
>
> I'll try to dig up some time to analyze the hash_long failure unless someone
> familiar with the problem is beating me to it.
I'm not sure you need to spend time analyzing failure: if you get bad
hashing with hash_long(), then we know
On Thu, 28 Apr 2016, Linus Torvalds wrote:
>
> I'd really hate to add *another* ad-hoc hash when the previous ad-hoc
> hash has been shown to be bad.
I completely agree.
I'm not a hashing wizard and I completely failed to understand why
hash_long/ptr are so horrible for the various test cases I
On Thu, 28 Apr 2016, Linus Torvalds wrote:
>
> I'd really hate to add *another* ad-hoc hash when the previous ad-hoc
> hash has been shown to be bad.
I completely agree.
I'm not a hashing wizard and I completely failed to understand why
hash_long/ptr are so horrible for the various test cases I
On Thu, Apr 28, 2016 at 9:42 AM, Thomas Gleixner wrote:
> hash_long/hash_ptr() provide really bad hashing for small hash sizes and for
> cases where the lower 12 bits (Page size aligned) of the hash value are 0.
Hmm.
hash_long/ptr really shouldn't care about the low bits
On Thu, Apr 28, 2016 at 9:42 AM, Thomas Gleixner wrote:
> hash_long/hash_ptr() provide really bad hashing for small hash sizes and for
> cases where the lower 12 bits (Page size aligned) of the hash value are 0.
Hmm.
hash_long/ptr really shouldn't care about the low bits being zero at
all,
hash_long/hash_ptr() provide really bad hashing for small hash sizes and for
cases where the lower 12 bits (Page size aligned) of the hash value are 0.
A simple modulo(prime) based hash function has way better results
across a wide range of input values. The implementation uses invers
hash_long/hash_ptr() provide really bad hashing for small hash sizes and for
cases where the lower 12 bits (Page size aligned) of the hash value are 0.
A simple modulo(prime) based hash function has way better results
across a wide range of input values. The implementation uses invers
58 matches
Mail list logo