I have a quick architecture question. Imagine that you are processing stream data at high speed and need to build, update, and access an in-memory data structure where the "model" is stored.
One option is to store this model in a DB, but that would require a huge number of updates (assume we do not want to stand up a large Cassandra ring or similar). What is the preferred way to manage this model in memory so that access is consistent and shared across multiple nodes in the cluster? Is there some sort of shared-memory approach, or do I need to try to manage this model as an RDD? All suggestions are welcome. dma
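To make the question concrete, here is a minimal single-node sketch of the state-update pattern I have in mind: each micro-batch of events is folded into a keyed in-memory model, roughly the per-key semantics that Spark Streaming's `updateStateByKey` provides in a distributed setting. The function and event shapes below are hypothetical placeholders, not real Spark API.

```python
def update_state(state, events):
    # Fold a batch of (key, value) events into the existing model,
    # analogous to a per-key state update in a streaming job.
    for key, value in events:
        state[key] = state.get(key, 0) + value
    return state

# Simulate two micro-batches arriving from the stream.
model = {}
model = update_state(model, [("a", 1), ("b", 2)])
model = update_state(model, [("a", 3)])
print(model)  # {'a': 4, 'b': 2}
```

The hard part is exactly what the question asks: this works on one node, but it is unclear how to get the same consistent, shared view of `model` across a cluster without pushing every update to a DB.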
