Hi Pierre,

The NiFi flow I'm implementing may run continuously for a long time (possibly a couple of weeks or months). During that period, NiFi could be terminated because of a memory issue or some other system issue, couldn't it? In such a case, I would need to restart NiFi manually and resume the flow from where it stopped.

Thanks & Regards
*Vibhath Ileperuma*

On Wed, Mar 17, 2021 at 5:51 PM Pierre Villard <[email protected]> wrote:

> Hi Vibhath,
>
> How is NiFi terminated / restarted?
>
> Thanks,
> Pierre
>
> On Wed, Mar 17, 2021 at 15:04, Vibhath Ileperuma <[email protected]> wrote:
>
>> Hi all,
>>
>> I notice that, if the NiFi instance gets terminated while a processor is
>> processing a flow file, that processor starts to process the flow file
>> again from the beginning when NiFi is restarted.
>> I'm using the PutKudu processor and the PutParquet processor to write
>> data into Kudu and Parquet format. Due to the above behaviour:
>>
>> 1. PutKudu shows primary key violation errors on a restart. I'm using
>>    the INSERT operation and I can't use the INSERT_IGNORE or UPSERT
>>    operations, since I need to be notified if incoming data has
>>    duplicates.
>> 2. Since I need to write the data in a single flow file into multiple
>>    Parquet files (by specifying the row group size), it is possible for
>>    the PutParquet processor to generate multiple Parquet files with the
>>    same content on a restart (data can be duplicated).
>>
>> I would be grateful if you could suggest a way to overcome this problem.
>>
>> Thanks & Regards
>>
>> *Vibhath Ileperuma*
>>
>
