Hi,

I've checked with the data team and it is not possible to have the database
provide the "error on duplicate" check strategy for us (there is a link
between the incoming message id and the persisted row, but the actual key
is not derived from it, so inserting a duplicate wouldn't cause a key
conflict).

We're looking into using some kind of cache of already-processed incoming
ids to avoid persisting the same message twice. Coordinating that cache
across all nodes is where it gets tricky. Perhaps we could use ZooKeeper as
the cache? It handles replication and should be easily reachable from every
node.
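To make the idea concrete, here's a rough sketch of the bolt-side logic (the names DedupCache and mark_if_new are my own, not from any ZooKeeper client library). The cache only needs one atomic operation: "record this id unless it is already recorded". With ZooKeeper that would map to creating a znode per message id, where the create fails if the node already exists; below, a plain in-memory set stands in for the ZooKeeper ensemble so the control flow is easy to follow:

```python
class DedupCache:
    """Tracks which message ids have already been persisted.

    In production the backing store would be ZooKeeper: creating
    /dedup/<msg_id> is atomic and fails (NodeExistsException) when
    another node got there first. A plain set stands in here purely
    to illustrate the logic on a single node.
    """

    def __init__(self):
        self._seen = set()

    def mark_if_new(self, msg_id):
        # True  -> this caller should persist the message.
        # False -> some node (possibly this one) already handled it.
        if msg_id in self._seen:
            return False
        self._seen.add(msg_id)
        return True


def process(cache, msg_id, persist):
    # Persist only ids not seen before; duplicates are silently dropped.
    if cache.mark_if_new(msg_id):
        persist(msg_id)
```

One caveat if we go the ZooKeeper route: the check-and-record must be a single atomic create, not a separate exists() followed by create(), or two nodes can race between the two calls and both persist the message.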

Thanks,
Javier


On Tue, Mar 3, 2015 at 1:49 PM, Parth Brahmbhatt <
[email protected]> wrote:

>  I am not really familiar with cassandra but I think they do support
> conditional insert/update. Something like "Insert into my_table (col1)
> values ('val1') if not exists;".
>
>  See if it actually does support conditional insert/update and if you can
> use this feature.
>
>  Thanks
> Parth
>
>   From: Javier Gonzalez <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, March 3, 2015 at 10:43 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Exactly once transactions and storm
>
>  Thanks for your reply. This could work, as the problem domain has a
> unique Id in the incoming stream, but I believe the db will be Cassandra,
> which updates instead of throwing errors when inserting a duplicate key. So
> I can't rely on that.
>
>


-- 
Javier González Nicolini
