I’m just jumping in: we are seeing this issue as well when we restart the 
NiFi process from time to time.

We are aware of the nifi.properties parameter 
“nifi.flowcontroller.graceful.shutdown.period=10 sec”, but to be honest we 
haven’t tried raising it yet. Maybe PutKudu takes more than 10 seconds to 
finish executing; I really don’t know.
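
For reference, raising the period would be a one-line change in nifi.properties. This is only a sketch: the 60 sec value is an arbitrary illustration, not something we have tested, and whether a longer period actually lets PutKudu finish is an assumption.

```properties
# nifi.properties
# Default is 10 sec; 60 sec here is a hypothetical value for illustration only.
nifi.flowcontroller.graceful.shutdown.period=60 sec
```

A longer period would only help with controlled restarts. It would not change anything when the process is killed outright, for example by an out-of-memory condition.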

Cheers Josef




From: Vibhath Ileperuma <[email protected]>
Reply to: "[email protected]" <[email protected]>
Date: Wednesday, 17 March 2021 at 13:49
To: "[email protected]" <[email protected]>
Subject: Re: Data duplication When NIFI is restarted

Hi Pierre,

The NiFi flow I'm implementing may run continuously for a long time (maybe a 
couple of weeks or months). During this period it could be terminated due to 
a memory issue or some other system issue, couldn't it? In such a case, I may 
need to restart NiFi manually and resume the flow from where it stopped.

Thanks & Regards

Vibhath Ileperuma




On Wed, Mar 17, 2021 at 5:51 PM Pierre Villard 
<[email protected]<mailto:[email protected]>> wrote:
Hi Vibhath,

How is NiFi terminated / restarted ?

Thanks,
Pierre

On Wed, Mar 17, 2021 at 15:04, Vibhath Ileperuma 
<[email protected]<mailto:[email protected]>> wrote:
Hi all,

I notice that if the NiFi instance gets terminated while a processor is 
processing a flow file, that processor starts processing the flow file again 
from the beginning when NiFi is restarted.
I'm using the PutKudu processor and the PutParquet processor to write data into 
Kudu and into Parquet files. Due to the above behaviour,

  1.  PutKudu shows primary key violation errors after a restart. I'm using the 
INSERT operation and I can't use the INSERT_IGNORE or UPSERT operations, since 
I need to be notified if incoming data has duplicates.
  2.  Since I need to write the data in a single flow file into multiple Parquet 
files (by specifying the row group size), it is possible for the PutParquet 
processor to generate multiple Parquet files with the same content after a 
restart (data can be duplicated).

I would be grateful if you could suggest a way to overcome this problem.

Thanks & Regards

Vibhath Ileperuma
