Hi all,

I notice that if the NiFi instance gets terminated while a processor is processing a flow file, that processor starts processing the flow file again from the beginning when NiFi is restarted. I'm using the PutKudu and PutParquet processors to write data into Kudu and Parquet formats. Due to this behaviour,
1. PutKudu shows primary key violation errors after a restart. I'm using the INSERT operation, and I can't use the INSERT_IGNORE or UPSERT operations since I need to be notified if incoming data contains duplicates (a minimal illustration of this error is in the P.S. below).
2. Since I need to write the data in a single flow file into multiple Parquet files (by specifying the row group size), it is possible for the PutParquet processor to generate multiple Parquet files with the same content after a restart (i.e., data can be duplicated).

I would be grateful if you could suggest a way to overcome this problem.

Thanks & Regards
*Vibhath Ileperuma*
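P.S. For context, here is a minimal sketch (using the Kudu Java client directly, not my actual flow) of what I believe happens on the Kudu side when a flow file is replayed. The master address, table name, and column name are placeholders. Re-applying the same INSERT surfaces an "already present" row error, which is exactly the signal I would lose by switching to INSERT_IGNORE or UPSERT:

import java.util.List;
import org.apache.kudu.client.*;

public class DuplicateInsertDemo {
    public static void main(String[] args) throws KuduException {
        // Placeholder master address and table/column names.
        try (KuduClient client =
                 new KuduClient.KuduClientBuilder("kudu-master:7051").build()) {
            KuduTable table = client.openTable("events");
            KuduSession session = client.newSession();
            session.setFlushMode(SessionConfiguration.FlushMode.MANUAL_FLUSH);

            // Apply the same INSERT twice to simulate a replayed flow file.
            for (int attempt = 0; attempt < 2; attempt++) {
                Insert insert = table.newInsert();
                insert.getRow().addString("id", "evt-001"); // primary key column
                session.apply(insert);
                List<OperationResponse> responses = session.flush();
                for (OperationResponse resp : responses) {
                    if (resp.hasRowError()) {
                        // The second attempt fails with "key already present".
                        RowError err = resp.getRowError();
                        System.out.println("alreadyPresent="
                                + err.getErrorStatus().isAlreadyPresent());
                    }
                }
            }
        }
    }
}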
