Hi Nima,
Use the reliable message processing
<https://github.com/nathanmarz/storm/wiki/Guaranteeing-message-processing>
mechanism
to ensure that there is no data loss. You would need support for
transactional semantics from the tuple source, so that the spout can commit
or abort a read (kestrel, kafka, rabbitmq, etc. can do this). And yes, you
would need to keep the queue transactions open until the spout receives an
"ack" or "fail" for every tuple.
IMO, this ensures that each tuple is processed "at least once", not
"exactly once", so you need to be prepared to end up with duplicate entries
in your DB, or have a way to detect that a DB write duplicates an earlier
one. This is the case when there is a crash while intermediate data is in
memory: the failed tuple is replayed from the spout, and any writes that
already happened on the first attempt happen again.
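The usual trick is to make the DB write idempotent. A sketch, assuming each
tuple carries a unique message id and your DB supports an upsert (the table,
columns, and connection string below are made up):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

public class IdempotentDbWriterBolt extends BaseRichBolt {

    private OutputCollector collector;
    private Connection db;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        try {
            // Illustrative connection string; substitute your real database.
            this.db = DriverManager.getConnection("jdbc:postgresql://localhost/events");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void execute(Tuple tuple) {
        // PostgreSQL-style upsert keyed on the message id; MySQL's INSERT IGNORE
        // or a SELECT-then-INSERT in a transaction serves the same purpose.
        String sql = "INSERT INTO events (msg_id, body) VALUES (?, ?) "
                   + "ON CONFLICT (msg_id) DO NOTHING";
        try (PreparedStatement ps = db.prepareStatement(sql)) {
            ps.setString(1, tuple.getStringByField("msgId"));
            ps.setString(2, tuple.getStringByField("body"));
            ps.executeUpdate();
            collector.ack(tuple);   // ack only after the write is durable
        } catch (Exception e) {
            collector.fail(tuple);  // failing replays the tuple from the spout
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Terminal bolt: nothing emitted downstream.
    }
}

With the row keyed on the message id, a replayed tuple just re-runs a write
that is a no-op, so at-least-once delivery does not duplicate data in the DB.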
Regards,
Srinath.
On Thu, Jun 5, 2014 at 5:37 AM, Nima Movafaghrad <
[email protected]> wrote:
> Hi everyone,
>
> We are in the process of designing a highly available system with zero
> tolerance for data loss. The plan is for the spouts to read from a queue,
> process the messages through several different specialized bolts, and then
> flush to a DB. How can we guarantee no data loss here? Should we keep the
> queue transactions open until data is committed to the DB? Should we
> persist the state of all the bolts? What happens to the intermediate data
> if the whole cluster fails?
>
> Any suggestions are much appreciated.
>
> Nima
>