[ 
https://issues.apache.org/jira/browse/STORM-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Kellogg updated STORM-45:
------------------------------
    Component/s: storm-core

> Stateful bolts
> --------------
>
>                 Key: STORM-45
>                 URL: https://issues.apache.org/jira/browse/STORM-45
>             Project: Apache Storm
>          Issue Type: Wish
>          Components: storm-core
>            Reporter: James Xu
>
> https://github.com/nathanmarz/storm/issues/204
> This is an idea to build abstractions for bolts with fault-tolerant state (so 
> if a task dies and gets reassigned to another machine it still has its 
> state). The idea is to use a strategy similar to HBase's: append changes 
> to local state into a file on a DFS, and occasionally compact the file on the 
> DFS. This would be easier to use and more efficient than using an external 
> database like Cassandra.
> This abstraction is actually independent of Storm, so it should be done as a 
> separate project or in storm-contrib. It does require a DFS with append 
> functionality, which more recent versions of HDFS might provide (or MapR 
> might provide it). The interface for a map-like state can look something like 
> this:
> PersistentMap(String dfsDir) {
>     Object get(Object key);
>     void put(Object key, Object value);
> }
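> For illustration only (none of the following is in the issue), a word-count 
> bolt might use such a map like this:
>     PersistentMap counts = new PersistentMap("/dfs/storm-state/word-counts");
>
>     // inside execute(Tuple tuple):
>     String word = tuple.getString(0);
>     Long current = (Long) counts.get(word);
>     counts.put(word, current == null ? 1L : current + 1);
>     collector.ack(tuple);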
> Using transactional topologies, it would probably look more like:
> PersistentMap(String dfsDir) {
>     Object get(Object key);
>     void putAll(Long txid, List pairs);
> }
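> One plausible reading of the txid parameter (an assumption on my part, not 
> spelled out in the issue) is that it makes replayed batches idempotent:
>     // Sketch: the map remembers the highest txid it has applied, so a
>     // replayed batch carrying an already-seen txid becomes a no-op.
>     void putAll(Long txid, List pairs) {
>         if (txid <= lastCommittedTxid) return;  // batch already applied
>         applyPairs(pairs);                      // hypothetical helper
>         lastCommittedTxid = txid;
>     }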
> If you colocate your DFS with your Storm cluster, you should get data 
> locality when writing. Having a pluggable scheduler can help with this as 
> well.
> The first version should just keep all the state in memory, using the DFS 
> just for reliability.
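> A minimal sketch of that first version (my illustration, not from the issue; 
> DfsLog stands in for a hypothetical append-only file on the DFS):
>     import java.io.IOException;
>     import java.util.HashMap;
>     import java.util.Map;
>
>     public class InMemoryPersistentMap {
>         private final Map<Object, Object> state = new HashMap<Object, Object>();
>         private final DfsLog log;   // hypothetical append-only DFS file
>
>         public InMemoryPersistentMap(String dfsDir) throws IOException {
>             this.log = new DfsLog(dfsDir);
>             // Recovery: replay every logged put to rebuild in-memory state.
>             for (DfsLog.Entry e : log.replay()) {
>                 state.put(e.key, e.value);
>             }
>         }
>
>         public Object get(Object key) {
>             return state.get(key);     // reads never touch the DFS
>         }
>
>         public void put(Object key, Object value) throws IOException {
>             log.append(key, value);    // make the change durable first...
>             state.put(key, value);     // ...then visible in memory
>         }
>
>         // Occasionally rewrite the log as a snapshot of the current map so
>         // it does not grow without bound (the HBase-style compaction).
>         public void compact() throws IOException {
>             log.rewriteAsSnapshot(state);
>         }
>     }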
> -----------------------------------------------------------------------------------------------------
> SirWellington: This is a good idea. 
> What currently happens when a bolt task fails? Does the source 
> spout re-emit the tuple?
> Some questions for implementation: 
> What happens while you are re-initializing the bolt with the previous one's 
> state? Will you pause all threads in the topology or will you temporarily 
> increase the timeout for processing a tuple? I am not sure how long it would 
> take to restart a bolt with state, but it could trigger a fail(), causing a 
> replay.
> From the API side:
> Does this mean that the prepare() method will not be called?
> And finally, will this be optional or seamlessly integrated?
> Thank You
> -----------------------------------------------------------------------------------------------------
> nathanmarz: The tuple trees that are made incomplete due to the bolt task 
> failure will time-out and the spout will be able to replay the source tuple 
> for that tree. Tuples that have already successfully completed will not be 
> replayed. So generally you keep any persistent state in a database, 
> oftentimes doing something like waiting to ack tuples until you've done a 
> batch update to the database. Stateful bolts will just be a much more 
> efficient way of keeping a large amount of state at hand in a bolt.
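> A rough sketch of that batch-then-ack pattern (my illustration; MyDb is a 
> stand-in for whatever external store you use):
>     import java.util.ArrayList;
>     import java.util.List;
>     import java.util.Map;
>     import backtype.storm.task.OutputCollector;
>     import backtype.storm.task.TopologyContext;
>     import backtype.storm.topology.OutputFieldsDeclarer;
>     import backtype.storm.topology.base.BaseRichBolt;
>     import backtype.storm.tuple.Tuple;
>
>     public class BatchingDbBolt extends BaseRichBolt {
>         private OutputCollector collector;
>         private MyDb db;   // hypothetical database client
>         private final List<Tuple> pending = new ArrayList<Tuple>();
>
>         public void prepare(Map conf, TopologyContext ctx, OutputCollector collector) {
>             this.collector = collector;
>             this.db = MyDb.connect(conf);   // hypothetical connection setup
>         }
>
>         public void execute(Tuple tuple) {
>             pending.add(tuple);             // hold the ack until the batch commits
>             if (pending.size() >= 100) {
>                 db.batchUpdate(pending);    // hypothetical external database call
>                 for (Tuple t : pending) {
>                     collector.ack(t);       // only ack once the update is durable
>                 }
>                 pending.clear();
>             }
>         }
>
>         public void declareOutputFields(OutputFieldsDeclarer declarer) {
>             // writes only to the database; no stream output
>         }
>     }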
> Good point regarding reinitialization. The first implementation will target 
> amounts of state that can fit into memory, so reinitialization time won't be 
> a concern. Once we look at doing much larger amounts of state we'll revisit 
> this question.
> This is orthogonal to the prepare method being called. The prepare method is 
> called whenever a task starts up in a worker, regardless of whether the task 
> existed before in another worker.
> This will most definitely be optional. Actually, I think this will end up 
> being a storm-contrib submodule.
> -----------------------------------------------------------------------------------------------------
> pereferrera: Is this issue outdated now that 
> https://github.com/stormprocessor/storm-state exists? Should it be closed then?
> Has anybody tested storm-state in a big setup? Is there a corresponding 
> backing State for Trident?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
