Hi,

I am currently evaluating NiFi and came across the ListS3 and FetchS3Object
processors, which I am using to retrieve S3 objects as they are added,
transform them, and post them to an external API. Before using this in
production, I need some information on how the state of listed objects is
maintained and whether it is dependable.

1. If I start my dataflow 'now' (2017-01-01 00:00), I see that it retrieves all
objects from 'now' (2017-01-01 00:00) onwards. After that, even if I restart
the dataflow or the NiFi server, it retrieves all the objects that it missed
since 'now'.
It seems this information is persisted, probably on the filesystem. Where
exactly is that? Is there any control over it? I started on 2017-01-01 00:00,
but now I want it to process only objects from 2017-07-01 00:00 onwards.

2. What happens if there is an error halfway through processing a file? How
can I reprocess it if, say, the cause was a genuine bug in a custom processor?

3. I would like to monitor which files were successfully processed. What is
the recommended way to do that?
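
To make the first question concrete, here is a minimal Python sketch of the
behaviour as I understand it from the outside. It is not NiFi's actual code;
the S3Object type, the list_new_objects function, and the sample keys are all
hypothetical names I made up for illustration. The assumption is that the
processor remembers a single "last seen" timestamp and only lists objects
modified after it.

```python
from datetime import datetime
from typing import NamedTuple


class S3Object(NamedTuple):
    """Hypothetical stand-in for an S3 listing entry."""
    key: str
    last_modified: datetime


def list_new_objects(objects, watermark):
    """Return objects modified after `watermark`, plus the advanced watermark.

    Assumption: the persisted state is just one timestamp. This is my guess
    at the observable behaviour, not the real implementation.
    """
    fresh = sorted(
        (o for o in objects if o.last_modified > watermark),
        key=lambda o: o.last_modified,
    )
    # If nothing new was found, the stored timestamp stays where it was.
    new_watermark = fresh[-1].last_modified if fresh else watermark
    return fresh, new_watermark


bucket = [
    S3Object("a.json", datetime(2017, 1, 15)),
    S3Object("b.json", datetime(2017, 6, 30)),
    S3Object("c.json", datetime(2017, 7, 2)),
]

# What I observe today: everything since the original start time is listed.
listed, mark = list_new_objects(bucket, datetime(2017, 1, 1))

# What I want: move the stored timestamp forward so older objects are skipped.
listed_after_reset, mark_after_reset = list_new_objects(
    bucket, datetime(2017, 7, 1)
)
```

In this sketch the second call is the one I am asking about: is there a
supported way to set the persisted timestamp to 2017-07-01 00:00 myself?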

Appreciate your help.



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/In-ListS3-processor-where-does-Nifi-persists-the-state-of-objects-tp14489.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
