We have been working on a distributed data proxy for Cassandra. A data
proxy combines a proxy with a cache and also takes care of data
consistency and invalidation for inserts and updates. In addition, the data
proxy is distributed using consistent hashing and uses gossip between
data proxy nodes to keep the cached data unique (per node) and consistent.
Finally, we have also implemented our data proxy on an FPGA-based
accelerator to achieve lower latency and higher throughput.
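
As an illustration of the consistent-hashing step described above, here is a minimal Python sketch of a hash ring that assigns each key to exactly one proxy node. It is a sketch under stated assumptions, not the actual proxy implementation: the class name HashRing, the method names, the vnodes parameter, the use of SHA-256, and the node names are all hypothetical, and gossip, caching, and invalidation are left out.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # virtual nodes per physical node (hypothetical default)
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place `vnodes` positions for this node on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue  # skip the (extremely unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        """Drop every ring position that belongs to this node."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring position after the key's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = HashRing(["proxy-a", "proxy-b", "proxy-c"])  # hypothetical node names
    print(ring.node_for("user:42"))
```

The property that matters for a per-node cache is that removing a node only remaps the keys that node owned, so cached entries on the remaining nodes stay where they are.
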
We have a blog post with more details about our technology and initial
results.
In brief, the main highlights of our results are a latency reduction of
roughly 9X-10X compared to baseline Cassandra and a throughput increase
of 3X-4X. We would be interested to hear what kind of benchmarking setup
you would like to see us use, as we are now exploring other workloads to
benchmark with our engine.