chaokunyang opened a new issue, #1409:
URL: https://github.com/apache/incubator-fury/issues/1409
Hi @chaokunyang , very nice work on Fury! There's a handful of
things I can think of that might speed up maps in your case.
An open-addressing, linear-probing table like this needs a low enough
collision rate to be as fast as it can get. If your hashCode() results are
known to be fairly-high-quality random numbers, such as for identity hash
codes, there's nothing that the map needs to do to mix the hashCode() at all,
and `place()` can use `item.hashCode() & mask` instead of needing to do the
`long` multiplication and shift. That's only a tiny speedup, but place() may
need to be called often. Counterintuitively, a place where a hash table of
`Class` objects might have trouble is how Classes are hashed -- they use
`Object.hashCode()`, so they use the identity hash code. Identity hash codes
have been a strangely slow area in some of my code, and I try to avoid them,
but that's harder to do with `Class`. [This article was interesting about
System.identityHashCode()](https://varoa.net/jvm/java/openjdk/biased-locking/2017/01/30/hashCode.html);
it looks like having any calls to it interferes with "biased lock
ing." It also seems like the locking can be an issue even in single-threaded
situations. I'm wondering if slowdown in the map could be because the first
call to `hashCode()` on each Class, or other type that uses hashCode() from
Object, has to do thread-related cautionary work, and not actually from the map
code itself.
As an aside, the mixing in `place()` might be able to be simpler because
Fury doesn't target as many platforms as this did in its original form (in
[libGDX](https://libgdx.com)). Some ideas for mixing without needing 64-bit
multiplication might include `final int hash = item.hashCode(); return (hash
>>> shift) ^ (hash & mask);`, or some [reversible bitwise
operation](https://marc-b-reynolds.github.io/math/2017/10/13/XorRotate.html)
followed by masking. That could be `final int hash = item.hashCode(); return
(hash ^ (hash << 5 | hash >>> 27) ^ (hash << 24 | hash >>> 8)) & mask;`, or
anything else that's fast enough and gets suitable mixing for your key type.
Identity hash codes probably only need masking, no mixing needed.
Alternate Map implementations could absolutely do better. You're right that
SwissTable doesn't have a good implementation in Java; maybe one isn't
possible. I think Robin Hood hashing might be something to look into; Cuckoo
hashing is what my code replaced in Kryo and before that in libGDX, and it's
somewhat of a precursor to Robin Hood and Hopscotch hashing. I wouldn't
recommend Cuckoo hashing unless you're certain you've tested all the corner
cases -- the papers that describe it gloss over that it can "fail" without
describing just how *bad* a failure can be... In libGDX, we had a proof of
concept that sent about 40 4-character Strings into a Cuckoo-based map, and
that forced that implementation to attempt to use more table entries than Java
can actually fit in one array (it would have used an array with more than 2 to
the 30 items if it could). There are some possible workarounds for Cuckoo
hashing, but they would also need to be tested very thoroughly for their
worst-case beha
vior. Robin Hood Hashing is what Rust used before it switched to hashbrown,
which I think is a SwissTable variant, though I'm not that familiar with Rust.
Robin Hood Hashing is also in most cases very fast and offers good
expected-case performance, but one operation was a disaster in at least some
versions of Rust; it was very slow to copy one map into another. [This blog
post discusses that issue in great
detail](https://www.tumblr.com/accidentallyquadratic/153545455987/rust-hash-iteration-reinsertion);
it's a good read. I think that bug might not surface if a Robin Hood hash
table didn't change which member of a hash family it uses between instances.
FuryObjectMap doesn't actually uses a hash family currently, so that might be
an interesting avenue.
Lastly, the simplest option: adjusting the default load factor might make a
big difference. One of my several subclasses of ObjectObjectMap in different
places used a specific type of key every time, `Coord`, which is a 2D point on
an int grid. I found that using the [Cantor Pairing
Function](https://en.wikipedia.org/wiki/Pairing_function#Cantor_pairing_function)
as a hashCode() actually worked very well in that case, but only when the load
factor was 0.5f or lower (the default is higher than that). The reason for that
probably has to do with the data I was hashing; it was mostly square grids
where I wanted to track a value with each Coord and do so in an order, and
Cantor won't produce any duplicate results for points in a right isosceles
triangle shape, with the size of the triangle determined by how many bits of
the hashCode() were used. A load factor of 0.5f meant an extra bit that
wouldn't have been used otherwise, and that essentially doubled the side
lengths and made the wh
ole square area covered without any collisions. A lower load factor speeds up
several operations by lowering the likelihood of a collision, but it also slows
down iteration, since empty table slots need to be iterated past more often.
I hope I didn't get too off-track here... I'd try the load factor first,
though it may require some tinkering.
_Originally posted by @tommyettinger in
https://github.com/apache/incubator-fury/issues/1274#issuecomment-1997259888_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]