I put together a simple Ignite application that iterates over all cache
entries using broadcast() and a ScanQuery (I am currently evaluating the
two approaches). The goal is to iterate over all of the cache values local
to the Ignite instance as fast as possible.

The data I am storing in the grid is relatively large: 10k of data for each
cache value, and the keys are just strings. My initial benchmarks are
decent; I am able to iterate over 133k entries/second per Ignite instance.
If I store just the keys and not the large cache values, I can iterate over
the keys at a rate of around 1.8 million entries/second (getting as close
as possible to this performance is my goal).

The compromise I have found is to store the 10k of data off-heap via Java
Unsafe calls and mark the field as transient (avoiding serialization).
This approach is giving me around 1.4 million entries/second, which is
roughly an order of magnitude (about 10x) faster than the 133k when the
large data was serialized.
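
As a sanity check on those figures, the ratios can be worked out with a
few lines of Java. The class and method names are only for illustration,
and treating "10k" as 10 * 1024 bytes is an assumption on my part.

```java
// Throughput figures are the ones quoted above; the value size in bytes is an assumption.
final class ScanThroughput {
    static final double SERIALIZED_PER_SEC = 133_000;   // large values serialized in the cache
    static final double KEYS_ONLY_PER_SEC = 1_800_000;  // keys only, no large values
    static final double OFFHEAP_PER_SEC = 1_400_000;    // values off-heap, field transient
    static final long ASSUMED_VALUE_BYTES = 10 * 1024;  // assumption: "10k" means 10 KiB

    /** How many times faster one entry rate is than another. */
    static double speedup(double fasterPerSec, double slowerPerSec) {
        return fasterPerSec / slowerPerSec;
    }

    /** Value bytes that must be touched per second at a given entry rate. */
    static double bytesPerSecond(double entriesPerSec, long valueBytes) {
        return entriesPerSec * valueBytes;
    }

    public static void main(String[] args) {
        System.out.printf("off-heap vs serialized: %.1fx%n",
                speedup(OFFHEAP_PER_SEC, SERIALIZED_PER_SEC));
        System.out.printf("off-heap as a fraction of keys-only: %.2f%n",
                OFFHEAP_PER_SEC / KEYS_ONLY_PER_SEC);
        System.out.printf("value bytes/second when serialized: %.0f%n",
                bytesPerSecond(SERIALIZED_PER_SEC, ASSUMED_VALUE_BYTES));
    }
}
```

Under that assumption the off-heap variant is about 10.5x faster than the
serialized one, reaches about 78% of the keys-only rate, and the
serialized scan was already moving roughly 1.36 GB of value data per
second.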

I believe the Unsafe approach will work but will break down if Ignite
attempts to rebalance, which in turn will start copying the data around
the cluster without the transient off-heap part. If I go down this road,
are there hooks anywhere to serialize the off-heap data before it is
shipped to another node during a rebalance? Or am I barking up the wrong
tree on this one entirely?
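
To make the concern concrete, here is a minimal sketch in plain Java with
no Ignite APIs involved. A direct ByteBuffer stands in for my Unsafe
calls, and all class and method names are hypothetical. It shows a
transient off-heap payload being lost when the object is copied through
standard Java serialization, and the kind of writeObject/readObject hook I
have in mind. Whether Ignite's marshaller would call such hooks during a
rebalance is an assumption I have not verified.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;
import java.util.Arrays;

final class OffHeapValueSketch {

    /** Payload lives off-heap and is skipped by default serialization. */
    static class TransientOnly implements Serializable {
        private static final long serialVersionUID = 1L;
        final String key;
        transient ByteBuffer payload; // off-heap stand-in; never written to the stream

        TransientOnly(String key, byte[] data) {
            this.key = key;
            this.payload = toDirect(data);
        }
    }

    /** Same layout, plus hooks that copy the payload into and out of the stream. */
    static class WithHooks implements Serializable {
        private static final long serialVersionUID = 1L;
        final String key;
        transient ByteBuffer payload;

        WithHooks(String key, byte[] data) {
            this.key = key;
            this.payload = toDirect(data);
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();          // writes the non-transient fields (key)
            byte[] copy = toHeap(payload);     // pull the bytes back on-heap
            out.writeInt(copy.length);
            out.write(copy);
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            byte[] copy = new byte[in.readInt()];
            in.readFully(copy);
            payload = toDirect(copy);          // re-create the off-heap copy on the receiver
        }
    }

    /** Copies heap bytes into a freshly allocated direct (off-heap) buffer. */
    static ByteBuffer toDirect(byte[] data) {
        ByteBuffer buf = ByteBuffer.allocateDirect(data.length);
        buf.put(data);
        buf.flip();
        return buf;
    }

    /** Copies the remaining bytes of a buffer on-heap without moving its position. */
    static byte[] toHeap(ByteBuffer buf) {
        byte[] copy = new byte[buf.remaining()];
        buf.duplicate().get(copy);
        return copy;
    }

    /** Serializes and deserializes a value, roughly what shipping it to another node implies. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[10 * 1024];
        Arrays.fill(data, (byte) 7);
        TransientOnly lost = roundTrip(new TransientOnly("k1", data));
        WithHooks kept = roundTrip(new WithHooks("k1", data));
        System.out.println("transient only, payload after copy: " + lost.payload);
        System.out.println("with hooks, payload bytes after copy: " + kept.payload.remaining());
    }
}
```

The hook version pays the serialization cost only when the object is
actually copied, which is the behaviour I am hoping to get from Ignite
during a rebalance.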

I have done all of the typical optimizations such as turning off
copyOnRead, reducing backups, setting a large heap, etc.

