I have a quick architecture question. Imagine that you are processing a stream 
of data at high speed and need to build, update, and access some in-memory 
data structure where the "model" is stored.

One option is to store this model in a DB, but that would require a huge number 
of updates (assume we do not want to go to a large Cassandra ring or similar). 
What's the preferred way to manage this model in memory so that it has 
consistent, shared access across multiple nodes in the cluster? Is there some 
sort of shared-memory approach, or do I need to manage this model as an RDD?
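For context on the RDD-style option: Spark Streaming exposes this pattern as updateStateByKey (and later mapWithState), where the "model" lives as keyed state that each micro-batch folds into. Below is a minimal, framework-free Python sketch of that keyed incremental-update pattern; the running-sum update function is just a placeholder assumption, not part of the original question.

```python
# Framework-free sketch of the keyed state-update pattern that Spark
# Streaming's updateStateByKey implements: each micro-batch of
# (key, value) events is folded into a persistent per-key "model".
from collections import defaultdict

def update_state(state, batch):
    """Fold one micro-batch of (key, value) events into the model."""
    for key, value in batch:
        # Placeholder update rule: running sum per key. A real model
        # would apply its own incremental update here.
        state[key] += value
    return state

model = defaultdict(int)
batches = [[("a", 1), ("b", 2)], [("a", 3)]]
for batch in batches:
    model = update_state(model, batch)

print(dict(model))  # {'a': 4, 'b': 2}
```

In actual Spark, this state is partitioned across the cluster and checkpointed, which is what gives the shared, fault-tolerant access the question asks about, at the cost of per-batch (not per-event) update latency.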

All suggestions are welcome.

dma
