Let's get a lot more detail about the config settings on all processors and
the versions of the various systems involved for any cases of suspected dupes.
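
When collecting that detail, it may help to show exactly which payloads repeat
and where. The following is only a minimal sketch, not part of NiFi or Kafka:
it assumes each topic has already been dumped to a list of payloads in offset
order (both topics have a single partition) and that payloads are unique at
the source. The function names and sample data are hypothetical.

```python
from collections import defaultdict


def find_duplicates(messages):
    """Return {payload: [positions]} for payloads seen more than once.

    `messages` is a list of payloads in topic order (hypothetical input,
    e.g. lines dumped from a single-partition topic).
    """
    positions = defaultdict(list)
    for index, payload in enumerate(messages):
        positions[payload].append(index)
    # Keep only the payloads that occur at more than one position.
    return {p: idx for p, idx in positions.items() if len(idx) > 1}


def compare_topics(normal, idempotent):
    """Summarise duplicates found in each of the two topic dumps."""
    return {
        "normal": find_duplicates(normal),
        "idempotent": find_duplicates(idempotent),
    }


if __name__ == "__main__":
    # Made-up sample data standing in for the two topic dumps.
    normal = ["a", "b", "b", "c"]
    idempotent = ["a", "b", "c", "c"]
    print(compare_topics(normal, idempotent))
```

The reported positions can then be matched against NiFi's provenance events
for the same flowfiles to see whether each copy was published once or twice.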

On Sat, Nov 19, 2022 at 11:33 AM Joe Obernberger <
[email protected]> wrote:

> Are you by chance using a clustered NiFi?  I'm seeing duplicate messages
> if I run the consumer on multiple NiFi nodes, so I've started running the
> consumer only on the parent.  This seems to correct the issue, but leads to
> other problems.  I'd love a solution.
>
> -Joe
> On 11/16/2022 3:50 AM, Aian Cantabrana wrote:
>
> Hi Joe,
>
> Thanks for the reply. The actual flow sends data from the ConsumeAMQP
> processor to two different PublishKafka processors, one with idempotence
> enabled and the other with the default config. Each one sends the same data
> to a different topic, and comparing the two topics is how I check for
> duplicates. It seems to be random: sometimes they appear in the "normal"
> processor's topic and other times in the "idempotence" one. I did not find
> any pattern.
>
> I will upgrade to NiFi 1.18.0 and try again.
>
> In any case, the messages are in JSON format (one JSON per flowfile), but
> since I am sending and storing them in Kafka as plain text I am using the
> *non-record-oriented* Kafka publisher. Is PublishKafkaRecord more
> reliable? Would it be better to use it?
>
> Thanks,
>
> Aian
>
> ------------------------------
> *From: *"Joe Witt" <[email protected]> <[email protected]>
> *To: *"users" <[email protected]> <[email protected]>
> *Sent: *Tuesday, November 15, 2022 17:31:54
> *Subject: *Re: Exacly once from NiFi to Kafka
>
> Aian,
> How can you tell there are duplicates in Kafka and are you certain that no
> duplicates exist in the source topic?
>
> Given NiFi's data provenance capabilities you should be able to pinpoint
> a given duplicate and figure out whether it happened at the source, in
> NiFi, or otherwise.
>
> Note that much has changed/improved since the 1.12.x line of NiFi, so we
> have more Kafka components and record-oriented mechanisms.  But I am still
> pretty sure that even in your version we should not be duplicating data
> unless the flow is configured such that it would happen.
>
> Thanks
>
> On Tue, Nov 15, 2022 at 9:25 AM Aian Cantabrana <[email protected]>
> wrote:
>
>> Hi,
>>
>> I am having some difficulties trying to get *exactly-once* semantics
>> while ensuring data order from NiFi to Kafka. I have read the Kafka
>> documentation and it should be quite straightforward using an idempotent
>> producer from NiFi and having a Kafka topic with a single partition, but I
>> am still getting some duplicated messages in Kafka.
>>
>> NiFi version: 1.12.1
>> Kafka version: 2.7.0
>>
>> NiFi flow:
>> (Both queues shown use the FIFO prioritizer)
>>
>> PublishKafka_2_6 configuration:
>>
>> As I said, the target Kafka topic has just one partition to ensure data
>> order.
>>
>> Incoming flowfiles are small 60-byte messages.
>>
>> I have been working on this for a while, so any suggestion is really welcome.
>>
>> Thanks in advance,
>>
>> Aian
>>
>
