Hi,

It sounds like your bolt actually has state (initialized by the messages picked up from topic A). When a bolt is restarted after a failover, Storm does not provide any built-in mechanism to restore the bolt's previous state.
In my opinion, your best option would be to move to Trident, which provides the notion of state; see https://storm.apache.org/documentation/Trident-state.

Alternatively, you can use any external storage (e.g. MongoDB or memcached) to save your state. After processing the messages from topic A, write your state to the external storage. Then read it back in the prepare method: it will be empty if the topology was just started, or hold the previously written data if this is a failover restart. Take a look at MongoBolt for some ideas (https://github.com/stormprocessor/storm-mongo/).

Yair
http://www.alooma.io

On Mon, Oct 20, 2014 at 1:28 PM, Manoj Jaiswal <[email protected]> wrote:
> Hi,
>
> Let me explain my use case in brief:
>
> Kafka spout A picks up messages from Kafka topic topic-A and creates
> queries via Esper in Storm bolt B.
> This is done only once, as soon as the topology is deployed.
> Another Kafka spout C picks up realtime messages from Kafka topic-C, which
> will be processed by the Esper engine in the same bolt B.
>
> The data from spouts A and C are both partitioned by account number so
> that the Esper engine in different worker processes gets the same account
> numbers.
>
> Now the problem:
> If worker threads die due to some issue, or a supervisor node gets
> kicked out of the cluster, we observe that the bolt instance
> may get assigned to a new process/worker.
> But the prepare method of the bolt initializes the Esper query
> configuration, so the Esper query engine in that worker process is
> re-initialized every time. Hence it loses the queries set up by the
> one-time messages from Kafka spout A.
>
> Any suggestions on how to handle this?
>
> -Manoj
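To make the external-storage idea concrete, here is a minimal, hedged sketch of the pattern: persist each query when a topic-A message arrives, and reload all persisted queries in prepare. The `StateStore` interface, `InMemoryStateStore`, and `EsperQueryBolt` names are all hypothetical; the in-memory map stands in for Mongo/memcached, and a real bolt would extend Storm's BaseRichBolt and register each query with the Esper engine instead of keeping a plain list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction over the external storage (Mongo, memcached, ...).
interface StateStore {
    List<String> loadQueries(String boltId);
    void saveQuery(String boltId, String query);
}

// In-memory stand-in for external storage, for illustration only.
// A real implementation would read/write Mongo or memcached here.
class InMemoryStateStore implements StateStore {
    private final Map<String, List<String>> data = new HashMap<>();

    public synchronized List<String> loadQueries(String boltId) {
        return new ArrayList<>(data.getOrDefault(boltId, new ArrayList<>()));
    }

    public synchronized void saveQuery(String boltId, String query) {
        data.computeIfAbsent(boltId, k -> new ArrayList<>()).add(query);
    }
}

// Simplified bolt: prepare() restores state written by a previous
// incarnation of the same task, so a worker restart does not lose
// the queries that were set up from topic A.
class EsperQueryBolt {
    private final StateStore store;
    private final String boltId;
    private final List<String> activeQueries = new ArrayList<>();

    EsperQueryBolt(StateStore store, String boltId) {
        this.store = store;
        this.boltId = boltId;
    }

    // Called once when the task starts (mirrors IBolt.prepare):
    // reload any queries persisted before the crash/reassignment.
    void prepare() {
        activeQueries.addAll(store.loadQueries(boltId));
    }

    // Called for each topic-A message: register the query
    // AND persist it, so a restarted task can recover it.
    void onQueryMessage(String epl) {
        activeQueries.add(epl);
        store.saveQuery(boltId, epl);
    }

    List<String> activeQueries() {
        return activeQueries;
    }
}
```

The key design point is that the storage key (here `boltId`) must identify the logical task, not the worker process, so the same state is found no matter which worker the task lands on after a failover.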
