On Thu, Jun 1, 2017 at 2:25 PM, Andres Freund <and...@anarazel.de> wrote: > Just to clarify: I don't think it's a problem to do so for integers and > most other simple scalar types. There's plenty hash algorithms that are > endianess independent, and the rest is just a bit of care.
Do you have any feeling for which of those endianness-independent hash functions might be a reasonable choice for us? https://github.com/markokr/pghashlib implements various hash functions for PostgreSQL, and claims that, of those implemented, crc32, Jenkins, lookup3be and lookup3le, md5, and siphash24 are endian-independent. An interesting point here is that Jeff Davis asserted in the original post on this thread that our existing hash_any() wasn't portable, but our current hash_any seems to be the Jenkins algorithm -- so I'm confused. Part of the problem seems to be that, according to https://en.wikipedia.org/wiki/Jenkins_hash_function there are 4 of those. I don't know whether the one in pghashlib is the same one we've implemented. Kennel Marshall suggested xxhash as an endian-independent algorithm upthread. Code for that is available under a 2-clause BSD license. PostgreSQL page checksums use an algorithm based on, but not exactly, FNV-1a. See storage/checksum_impl.h. The comments there say this algorithm was chosen with speed in mind. Our version is not endian-independent because it folds in 4-byte integers rather than 1-byte integers, but plain old FNV-1a *is* endian-independent and could be used. We also have an implementation of CRC32C in core - see port/pg_crc32.h and port/pg_crc32c_sb8.c. It's not clear to me whether this is Endian-independent or not, although there is stuff that depends on WORDS_BIGENDIAN, so, uh, maybe? Some other possibly-interesting links: https://research.neustar.biz/2011/12/29/choosing-a-good-hash-function-part-2/ http://greenrobot.org/essentials/features/performant-hash-functions-for-java/comparison-of-hash-functions/ https://www.strchr.com/hash_functions Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers