Hi Pierre,

The NiFi flow I'm implementing may run continuously for a long time
(possibly a couple of weeks or months). During that period it could be
terminated by a memory issue or some other system issue, couldn't it?
In such a case, I would need to restart NiFi manually and resume the flow
from where it stopped.

Thanks & Regards

*Vibhath Ileperuma*




On Wed, Mar 17, 2021 at 5:51 PM Pierre Villard <[email protected]>
wrote:

> Hi Vibhath,
>
> How is NiFi terminated / restarted ?
>
> Thanks,
> Pierre
>
> Le mer. 17 mars 2021 à 15:04, Vibhath Ileperuma <
> [email protected]> a écrit :
>
>> Hi all,
>>
>> I notice that if the NiFi instance is terminated while a processor is
>> processing a flow file, that processor starts processing the flow file
>> again from the beginning when NiFi is restarted.
>> I'm using the PutKudu and PutParquet processors to write data to Kudu
>> and to Parquet files. Due to the above behaviour,
>>
>>    1. PutKudu reports primary key violation errors after a restart. I'm
>>    using the INSERT operation, and I can't use INSERT_IGNORE or UPSERT
>>    because I need to be notified if the incoming data contains duplicates.
>>    2. Since I need to write the data in a single flow file into multiple
>>    Parquet files (by specifying the row group size), the PutParquet
>>    processor can generate multiple Parquet files with the same content
>>    after a restart (the data can be duplicated).
>>
>> I would be grateful if you could suggest a way to overcome this problem.
>>
>> Thanks & Regards
>>
>> *Vibhath Ileperuma*
>>
>
