Let's get a lot more detail about the config settings on all processors, and the versions of the various systems involved, for any cases of suspected dupes.
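When reporting a suspected dupe, it may help to first confirm it from a dump of each topic's payloads before tracing it in provenance. Below is a minimal sketch, assuming the payloads have already been exported (for example, one message per line) and that every payload is expected to be unique. The function name and sample data are hypothetical and not taken from anyone's flow.

```python
from collections import Counter


def find_duplicates(payloads):
    """Return {payload: count} for every payload seen more than once."""
    counts = Counter(payloads)
    return {p: n for p, n in counts.items() if n > 1}


# Hypothetical sample data: payloads read from each topic, in offset order.
normal_topic = ["m1", "m2", "m2", "m3"]
idempotent_topic = ["m1", "m2", "m3", "m3"]

dupes_normal = find_duplicates(normal_topic)
dupes_idempotent = find_duplicates(idempotent_topic)
print(dupes_normal, dupes_idempotent)
```

Any payload this flags can then be looked up in NiFi data provenance to see whether the repeat came from the source, from NiFi, or from the publish step.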

On Sat, Nov 19, 2022 at 11:33 AM Joe Obernberger <[email protected]> wrote:

> Are you by chance using a clustered NiFi? I'm seeing duplicate messages
> if I run the consumer on multiple NiFi nodes, so I've started running the
> consumer only on the parent. This seems to correct the issue, but leads to
> other problems. I'd love a solution.
>
> -Joe
>
> On 11/16/2022 3:50 AM, Aian Cantabrana wrote:
>
> Hi Joe,
>
> Thanks for the reply. The actual flow sends data from the ConsumeAMQP
> processor to two different PublishKafka processors, one with idempotence
> and the other with the default config. Both send the same data, to two
> different topics, and comparing the two topics is how I am checking that
> there are duplicates. It seems to be random: sometimes they appear in the
> "normal" processor's topic and other times in the "idempotence" one. I did
> not find any pattern.
>
> I will upgrade to NiFi 1.18.0 and try again.
>
> In any case, the messages are in JSON format (one JSON per flowfile), but
> since I am sending and storing them in Kafka as plain text I am using the
> *non-record-oriented* Kafka publisher. Is PublishKafkaRecord more
> reliable? Would it be better to use it?
>
> Thanks,
>
> Aian
>
> ------------------------------
> *From: *"Joe Witt" <[email protected]>
> *To: *"users" <[email protected]>
> *Sent: *Tuesday, November 15, 2022 17:31:54
> *Subject: *Re: Exacly once from NiFi to Kafka
>
> Aian,
>
> How can you tell there are duplicates in Kafka, and are you certain that no
> duplicates exist in the source topic?
>
> Given NiFi's data provenance capabilities you should be able to pinpoint
> a given duplicate and figure out whether it happened at the source, in
> NiFi, or otherwise.
>
> Note that much has changed/improved since the 1.12.x line of NiFi, so we
> have more Kafka components and record-oriented mechanisms. But I'm still
> pretty sure that even in your version we should not be duplicating data
> unless the flow is configured such that it would happen.
>
> Thanks
>
> On Tue, Nov 15, 2022 at 9:25 AM Aian Cantabrana <[email protected]>
> wrote:
>
>> Hi,
>>
>> I am having some difficulties trying to get *exactly-once* semantics
>> while ensuring data order from NiFi to Kafka. I have read the Kafka
>> documentation and it should be quite straightforward using an idempotent
>> producer from NiFi and having a Kafka topic with a single partition, but I
>> am still getting some duplicated messages in Kafka.
>>
>> NiFi version: 1.12.1
>> Kafka version: 2.7.0
>>
>> NiFi flow:
>> (Both shown queues with FIFO prioritizer)
>>
>> PublishKafka_2_6 configuration:
>>
>> As I said, the target Kafka topic has just one partition to ensure data
>> order.
>>
>> Incoming flowfiles are small, 60-byte messages.
>>
>> I have been working on this for a while, so any suggestion is really
>> welcome.
>>
>> Thanks in advance,
>>
>> Aian
>>
