Hiya,

Last night I ran memtier_benchmark on my laptop (mid-2015 15" MBP, 2.5GHz 4c i7). The 64-bit variant gave about a 10-15% throughput improvement on both modern and non-modern settings. The 32-bit variant was about equal in performance: the results were within about 3% of each other, and most of that difference was probably just run-to-run noise. I was able to solve the 32/64-bit compile-time problem by adding a wrapper and some compile-time declarations, so I'd say that's about 50% solved for x86-based systems. But yeah, with ARM, it could get interesting.
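To make the wrapper idea concrete, here is a rough sketch of what compile-time selection between a 32-bit and a 64-bit hash could look like. It is not the actual patch. The names demo_hash, demo_hash32, demo_hash64 and DEMO_HASH_IS_64BIT are made up for illustration, and the two hash bodies are plain FNV-1a stand-ins (not xxhash) so the snippet compiles on its own. It also assumes the caller wants a 32-bit result from a (key, length) pair, so the 64-bit result is truncated.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in 32-bit hash: FNV-1a, NOT xxhash. Hypothetical name. */
static uint32_t demo_hash32(const void *key, size_t length) {
    const unsigned char *p = key;
    uint32_t h = 2166136261u;            /* FNV-1a 32-bit offset basis */
    for (size_t i = 0; i < length; i++) {
        h ^= p[i];
        h *= 16777619u;                  /* FNV-1a 32-bit prime */
    }
    return h;
}

/* Stand-in 64-bit hash: FNV-1a, NOT xxhash. Hypothetical name. */
static uint64_t demo_hash64(const void *key, size_t length) {
    const unsigned char *p = key;
    uint64_t h = 14695981039346656037ULL; /* FNV-1a 64-bit offset basis */
    for (size_t i = 0; i < length; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;            /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Decide at compile time which variant the wrapper uses.
 * Assumption: a size_t wider than 32 bits implies a 64-bit target. */
#if SIZE_MAX > 0xFFFFFFFFu
#define DEMO_HASH_IS_64BIT 1
#else
#define DEMO_HASH_IS_64BIT 0
#endif

/* Single entry point: callers never see which variant was picked. */
uint32_t demo_hash(const void *key, size_t length) {
#if DEMO_HASH_IS_64BIT
    /* Keep the low 32 bits of the 64-bit result. */
    return (uint32_t)demo_hash64(key, length);
#else
    return demo_hash32(key, length);
#endif
}
```

A caller would just use demo_hash(key, nkey). An ARM-specific default would be one more branch in the same #if chain, which is the part that still needs testing.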
As a next-ish step, I'm going to attempt to drop in xxh3, but since it's still in active development, it's probably not good for anything more than a tech demo. I'm also happy, if it would help, to go nuts adding a dozen different algos to hash.c (cityhash/farmhash, as you mentioned). In xxhash's implementation, I played with some compile-time flags to make it a bit faster, and I've been toying with the idea of modifying it so no seed logic ever runs, to maybe gain a couple of cycles. I'm also looking for a pure assembly version to squeeze a bit more speed out of the x86 and ARM builds. I should probably get one of my ARM systems running and test the difference...

But hey, thanks for humoring me. Maybe next I'll take a look at the command reading & processing steps and see if there's anything I can do. Or maybe parallelizing rotl... Hm. I'll keep on with trying it out :)

Thanks,
Eamonn

On Sun, Mar 17, 2019 at 2:46 PM dormando <[email protected]> wrote:

> Hey,
>
> What exact test did you do?
>
> Well to be honest I've been wanting to swap in xxhash for a long time, but in my own profiling other things show up higher than murmur so I keep deprioritizing it :)
>
> One big problem with the hash algo is mc keys can be short and are hashed one at a time. xxhash is more optimized for longer data (kilobytes to megabytes). The original author tries to address this with an updated algorithm:
> https://fastcompression.blogspot.com/2019/03/presenting-xxh3.html
>
> xxhash makes significant use of instruction parallelism, such that if a key is 8 bytes or less you could end up waiting for the pipeline more than murmur. Other algos like cityhash/farmhash are better at short keys IIRC. Also xx's 32bit algo is a bit slower on 64bit machines... so if I wanted to use it I was going to test both 32bit and 64bit hashes and then have to do compile time testing to figure out which to use.
> It's also heavily x86 optimized so we might have to default something else for ARM.
>
> Sorry, not debated on the list, just in my own head :) It's not quite as straightforward as just dropping it in. If you're willing to get all the conditions tested go nuts! :)
>
> -Dormando
>
> On Sat, 16 Mar 2019, eamonn.nugent via memcached wrote:
>
> > Hi there,
> >
> > I started using memcached in prod a week or two ago, and am loving it. I wanted to give back, and took a look through the issues board, but most of them looked solved. So, in my usual "it's never fast enough" style, I went and profiled its performance, and had some fun.
> >
> > After seeing that MurmurHash3 was taking a good amount of the execution time, I decided to run a test integrating one of my old favorite hash functions, xxhash. My guess is that Memcached could benefit from using the hash function, as it is faster than MMH3 and has several native variants. I ran some of my own tests, and found roughly equal performance, but with no tuning performed on xxhash. For example, using an assembly (x86/arm/etc) version could likely speed up hashing, along with properly implementing it in memcached. However, I was also running this on a much older Nehalem CPU, so there could be unseen advantages to one or both of the algorithms by running them on a newer CPU. I'm in the process of fighting with my newer systems to get libevent installed properly, so I'll report back with more up-to-date tests later.
> >
> > I did a cursory search, but didn't find any mention of the algo in the mailing list. If this has been discussed, though, apologies for bringing it up again. On the other hand, I would be happy to write a PR to add it, using the `hash_algorithm` CLI arg.
> >
> > Thanks,
> > Eamonn

--
---
You received this message because you are subscribed to the Google Groups "memcached" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
