Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-03 Thread Liam Clarke
And to share my experience of doing something similar - certain messages on our system must not be duplicated, but as they are bounced back to us from third parties, duplication is inevitable. So I deduplicate them using Spark Structured Streaming's flatMapGroupsWithState, keyed on a business
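(A minimal sketch of the pattern Liam describes, in case it helps others: business-key deduplication with Spark Structured Streaming's flatMapGroupsWithState. The broker address, topic name, Message fields and the 24-hour state timeout are illustrative assumptions, not details from his setup.)

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

case class Message(businessKey: String, payload: String)

// Keep one flag per business key; forward a message only the first time its key is seen.
def dedup(key: String, msgs: Iterator[Message], state: GroupState[Boolean]): Iterator[Message] = {
  if (state.hasTimedOut) {
    state.remove()                         // forget keys that have been quiet for the timeout period
    Iterator.empty
  } else if (state.exists) {
    Iterator.empty                         // key already seen: drop the duplicates
  } else {
    state.update(true)                     // mark the key as seen
    state.setTimeoutDuration("24 hours")   // bound state size (hypothetical retention)
    msgs.take(1)                           // forward only the first occurrence
  }
}

val spark = SparkSession.builder.appName("dedup-sketch").getOrCreate()
import spark.implicits._

val messages = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")   // hypothetical broker
  .option("subscribe", "incoming-messages")           // hypothetical topic
  .load()
  .selectExpr("CAST(key AS STRING) AS businessKey", "CAST(value AS STRING) AS payload")
  .as[Message]

val deduped = messages
  .groupByKey(_.businessKey)
  .flatMapGroupsWithState(OutputMode.Append, GroupStateTimeout.ProcessingTimeTimeout)(dedup)

deduped.writeStream
  .outputMode("append")
  .format("console")   // stand-in sink; a real job would write back to Kafka or a store
  .start()
  .awaitTermination()
```

The timeout is the trade-off knob here: without it the deduplication state grows without bound, with it a duplicate arriving after the window would slip through.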

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-03 Thread Hans Jespersen
Ok, what you are describing is different from accidental duplicate message pruning, which is what the idempotent publish feature does. You are describing a situation where multiple independent messages just happen to have the same contents (both key and value). Removing those messages is an

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-03 Thread Dimitry Lvovsky
I've done this using Kafka Streams: specifically, I created a processor and used a key store (a state store, a functionality of Streams) to save/check for keys, forwarding only messages whose keys were not already in the store. Since the key store is in memory and backed by the local filesystem on the node the processor
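(A rough sketch of the approach Dimitry describes, using the 2019-era Kafka Streams Processor API with a persistent key-value state store. The store, topic and application names are invented for illustration.)

```scala
import java.util.Properties

import org.apache.kafka.common.serialization.Serdes
import org.apache.kafka.streams.processor.{Processor, ProcessorContext, ProcessorSupplier}
import org.apache.kafka.streams.state.{KeyValueStore, Stores}
import org.apache.kafka.streams.{KafkaStreams, StreamsConfig, Topology}

val storeName = "seen-keys"   // hypothetical store name

// Forward a record only if its key is not already in the state store.
class DedupProcessor extends Processor[String, String] {
  private var ctx: ProcessorContext = _
  private var seen: KeyValueStore[String, String] = _

  override def init(context: ProcessorContext): Unit = {
    ctx = context
    seen = context.getStateStore(storeName).asInstanceOf[KeyValueStore[String, String]]
  }

  override def process(key: String, value: String): Unit = {
    if (seen.get(key) == null) {   // first occurrence of this key
      seen.put(key, value)         // remember it in the changelog-backed store
      ctx.forward(key, value)      // pass it downstream
    }                              // otherwise silently drop the duplicate
  }

  override def close(): Unit = ()
}

val topology = new Topology()
  .addSource("source", "incoming-messages")
  .addProcessor("dedup",
    new ProcessorSupplier[String, String] {
      override def get(): Processor[String, String] = new DedupProcessor
    },
    "source")
  .addStateStore(
    Stores.keyValueStoreBuilder(
      Stores.persistentKeyValueStore(storeName),   // RocksDB on local disk, backed by a changelog topic
      Serdes.String(), Serdes.String()),
    "dedup")
  .addSink("sink", "deduped-messages", "dedup")

val props = new Properties()
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "dedup-sketch")
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092")   // hypothetical broker
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass)
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass)

new KafkaStreams(topology, props).start()
```

Because the store is backed by a changelog topic, the set of seen keys survives a restart or a rebalance of the processor instance.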

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-03 Thread Vincent Maurin
Hi, the idempotence flag will guarantee that the message is produced exactly once on the topic, i.e. that running your command a single time will produce a single message. It is not a uniqueness enforcement on the message key; there is no such thing in Kafka. In Kafka, a topic containing the "history"
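(To make Vincent's point concrete, a small sketch with the idempotent producer enabled; the broker address, topic and record contents are invented. Idempotence only stops the producer's internal retries of a single send() from being written twice; two separate sends of the same key and value still land as two records.)

```scala
import java.util.Properties

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

val props = new Properties()
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092")   // hypothetical broker
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true")   // suppresses duplicates from retries of the *same* send
props.put(ProducerConfig.ACKS_CONFIG, "all")                  // required by the idempotent producer

val producer = new KafkaProducer[String, String](props)

// Two independent sends with an identical key and value: both records end up on the
// topic, because idempotence does not enforce uniqueness of keys or values.
producer.send(new ProducerRecord[String, String]("events", "user-42", "signup"))
producer.send(new ProducerRecord[String, String]("events", "user-42", "signup"))

producer.flush()
producer.close()
```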

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-02 Thread jim . meyer
On 2019/04/02 22:43:31, jim.me...@concept-solutions.com wrote: > > > On 2019/04/02 22:25:16, jim.me...@concept-solutions.com > wrote: > > > > > > On 2019/04/02 21:59:21, Hans Jespersen wrote: > > > yes. Idempotent publish uses a unique messageID to discard potential > > > duplicate

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-02 Thread jim . meyer
On 2019/04/02 22:25:16, jim.me...@concept-solutions.com wrote: > > > On 2019/04/02 21:59:21, Hans Jespersen wrote: > > yes. Idempotent publish uses a unique messageID to discard potential > > duplicate messages caused by failure conditions when publishing. > > > > -hans > > > > >

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-02 Thread jim . meyer
On 2019/04/02 21:59:21, Hans Jespersen wrote: > yes. Idempotent publish uses a unique messageID to discard potential > duplicate messages caused by failure conditions when publishing. > > -hans > > > On Apr 1, 2019, at 9:49 PM, jim.me...@concept-solutions.com > > wrote: > > > > Does

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-02 Thread Hans Jespersen
Yes. Idempotent publish uses a unique messageID to discard potential duplicate messages caused by failure conditions when publishing. -hans > On Apr 1, 2019, at 9:49 PM, jim.me...@concept-solutions.com > wrote: > > Does Kafka have something that behaves like a unique key so a producer

Something like a unique key to prevent same record from being inserted twice?

2019-04-02 Thread jim . meyer
Does Kafka have something that behaves like a unique key so a producer can’t write the same value to a topic twice?