Hi, I've checked with the data team and it is not possible to have the database provide the "error on duplicate" strategy: there is a link between the incoming message id and the persisted row, but the table's key is not unique on that id, so inserting a duplicate would not produce a clash.
We're looking into using some kind of cache of processed incoming ids to avoid persisting twice to the database. Coordinating that cache between all the nodes is where it gets tricky. Perhaps using ZooKeeper as the cache? It covers replication and should be easily available to the nodes.

Thanks,
Javier

On Tue, Mar 3, 2015 at 1:49 PM, Parth Brahmbhatt <[email protected]> wrote:

> I am not really familiar with Cassandra, but I think it does support
> conditional insert/update. Something like *INSERT INTO my_table (col1)
> VALUES ('val1') IF NOT EXISTS;*
>
> See if it actually does support conditional insert/update and whether you
> can use this feature.
>
> Thanks,
> Parth
>
> From: Javier Gonzalez <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, March 3, 2015 at 10:43 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Exactly once transactions and storm
>
> Thanks for your reply. This could work, as the problem domain has a
> unique id in the incoming stream, but I believe the db will be Cassandra,
> which updates instead of throwing errors when inserting a duplicate key.
> So I can't rely on that.
>
> --
> Javier González Nicolini
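P.S. The dedup idea above boils down to an atomic "mark this id if it hasn't been seen yet" check. A minimal Java sketch of that check follows; it uses a ConcurrentHashMap as a stand-in for the distributed store (with ZooKeeper, the equivalent atomic step would be creating a znode per message id and treating "node already exists" as a duplicate; with Cassandra, a conditional INSERT ... IF NOT EXISTS). The class and method names are illustrative, not from any existing codebase.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of an "insert if absent" duplicate check for incoming message ids.
// In production the backing store would need to be shared across nodes
// (ZooKeeper znodes or a Cassandra conditional insert); here a local
// ConcurrentHashMap stands in so the sketch is self-contained and runnable.
public class DedupCache {
    private final ConcurrentHashMap<String, Boolean> seen = new ConcurrentHashMap<>();

    // Returns true if this message id was not seen before and is now
    // recorded atomically; returns false for a duplicate. putIfAbsent
    // returns null only for the first caller to insert the key, so
    // concurrent bolts cannot both "win" for the same id.
    public boolean markIfNew(String messageId) {
        return seen.putIfAbsent(messageId, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        DedupCache cache = new DedupCache();
        System.out.println(cache.markIfNew("msg-42")); // first time: true
        System.out.println(cache.markIfNew("msg-42")); // duplicate: false
    }
}
```

One caveat with any such cache: it grows without bound unless ids are expired after the window in which a replay can occur, which is an extra piece of coordination to design for.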
