Hi, This contains the overview of large state management. Some parts need more description which I am working on but please free to go through it and any feedback is appreciated.
Thanks, Chandni On Tue, Oct 20, 2015 at 8:31 AM, Pramod Immaneni <[email protected]> wrote: > This is a much needed component Chandni. > > The API for the cache will be important as users will be able to plugin > different implementations in future like those based off of popular > distributed in-memory caches. Ehcache is a popular cache mechanism and API > that comes to bind. It comes bundled with a non-distributed implementation > but there are commercial distributed implementations of it as well like > BigMemory. > > Given our needs for fault tolerance we may not be able to adopt the ehcache > API as is but an extension of it might work. We would still provide a > default implementation but going off of a well recognized API will > facilitate development of other implementations in future based off of > popular implementations already available. We will need to investigate if > we can use the API as is or with relatively straightforward extensions > which will be a positive for using it. But if the API turns out to be > significantly deviating from what we need then that would be a negative. > > Also it would be great if we could support an iterator to scan all the > keys, lazy loading as needed, since this need comes up from time to time in > different scenarios such as change data capture calculations. > > Thanks. > > On Mon, Oct 19, 2015 at 9:10 PM, Chandni Singh <[email protected]> > wrote: > > > Hi All, > > > > While working on making the Join operator fault-tolerant, we realized the > > need of a fault-tolerant Cache in Malhar library. > > > > This cache is useful for any operator which is state-full and stores > > key/values for a very long period (more than an hour). > > > > The problem with just having a non-transient HashMap for the cache is > that > > over a period of time this state will become so large that checkpointing > it > > will be very costly and will cause bigger issues. > > > > In order to address this we need to checkpoint the state iteratively, > i.e., > > save the difference in state at every application window. > > > > This brings forward the following broad requirements for the cache: > > 1. The cache needs to have a max size and is backed by a filesystem. > > > > 2. When this threshold is reached, then adding more data to it should > evict > > older entries from memory. > > > > 3. To minimize cache misses, a block of data is loaded in memory. > > > > 4. A block or bucket to which a key belongs is provided by the user > > (operator in this case) as the information about closeness in keys (that > > can potentially reduce future misses) is not known to the cache but to > the > > user. > > > > 5. lazy load the keys in case of operator failure > > > > 6. To offset the cost of loading a block of keys when there is a miss, > > loading can be done asynchronously with a callback that indicates when > the > > key is available. This allows the operator to process other keys which > are > > in memory. > > > > 7. data that is spilled over needs to be purged when it is not needed > > anymore. > > > > > > In past we solved this problem with BucketManager which is not in open > > source now and also there were some limitations with the bucket api - the > > biggest one is that it doesn't allow to save multiple values for a key. > > > > My plan is to create a similar solution as BucketManager in Malhar with > > improved api. > > Also save the data on hdfs in TFile which provides better performance > when > > saving key/values. > > > > Thanks, > > Chandni > > >
