Hi,

It sounds like your bolt actually has state (initialized by the messages picked up from topic A). When a bolt is restarted after a failover, Storm does not provide any built-in mechanism to restore the bolt's previous state.
In my opinion, your best option would be to move to Trident, which provides the notion of state; see https://storm.apache.org/documentation/Trident-state.

Alternatively, you can use any external storage (e.g. MongoDB or memcached) to save your state. After processing the messages from topic A, write your state to the external storage. Then read it back in the prepare method: it will be empty if the topology was just started, or hold the previously written data if this is a failover restart. Take a look at MongoBolt for some ideas (https://github.com/stormprocessor/storm-mongo/).

Yair
http://www.alooma.io

On Mon, Oct 20, 2014 at 1:28 PM, Manoj Jaiswal <[email protected]> wrote:
> Hi,
>
> Let me explain my use case in brief:
>
> Kafka spout A picks up messages from Kafka topic topic-A and creates
> queries via Esper in Storm bolt B.
> This is done only once, as soon as the topology is deployed.
> Another Kafka spout C picks up realtime messages from Kafka topic-C, which
> will be processed by the Esper engine in the same bolt B.
>
> The data from spouts A and C are both partitioned by account number so
> that the Esper engine in different worker processes gets the same account
> numbers.
>
> Now the problem:
> If worker threads die due to some issue, or a supervisor node gets
> kicked out of the cluster, we observe that the bolt instance
> may get assigned to a new process/worker.
> But the prepare method of the bolt initializes the Esper query
> configuration, so the Esper query engine in that worker process is
> re-initialized every time. Hence it loses the queries set up by the
> one-time messages from Kafka spout A.
>
> Any suggestions on how to handle this?
>
> -Manoj
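To make the external-storage idea concrete, here is a minimal, hedged sketch of the pattern: persist each query when a topic-A message arrives, and reload all persisted queries in prepare. The `StateStore` interface, `InMemoryStateStore`, and `EsperQueryBolt` names are all hypothetical; the in-memory map stands in for Mongo/memcached, and a real bolt would extend Storm's BaseRichBolt and register each query with the Esper engine instead of keeping a plain list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction over the external storage (Mongo, memcached, ...).
interface StateStore {
    List<String> loadQueries(String boltId);
    void saveQuery(String boltId, String query);
}

// In-memory stand-in for external storage, for illustration only.
// A real implementation would read/write Mongo or memcached here.
class InMemoryStateStore implements StateStore {
    private final Map<String, List<String>> data = new HashMap<>();

    public synchronized List<String> loadQueries(String boltId) {
        return new ArrayList<>(data.getOrDefault(boltId, new ArrayList<>()));
    }

    public synchronized void saveQuery(String boltId, String query) {
        data.computeIfAbsent(boltId, k -> new ArrayList<>()).add(query);
    }
}

// Simplified bolt: prepare() restores state written by a previous
// incarnation of the same task, so a worker restart does not lose
// the queries that were set up from topic A.
class EsperQueryBolt {
    private final StateStore store;
    private final String boltId;
    private final List<String> activeQueries = new ArrayList<>();

    EsperQueryBolt(StateStore store, String boltId) {
        this.store = store;
        this.boltId = boltId;
    }

    // Called once when the task starts (mirrors IBolt.prepare):
    // reload any queries persisted before the crash/reassignment.
    void prepare() {
        activeQueries.addAll(store.loadQueries(boltId));
    }

    // Called for each topic-A message: register the query
    // AND persist it, so a restarted task can recover it.
    void onQueryMessage(String epl) {
        activeQueries.add(epl);
        store.saveQuery(boltId, epl);
    }

    List<String> activeQueries() {
        return activeQueries;
    }
}
```

The key design point is that the storage key (here `boltId`) must identify the logical task, not the worker process, so the same state is found no matter which worker the task lands on after a failover.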
