Hi,

I am currently evaluating NiFi. I came across the ListS3 and FetchS3Object processors, which I am using to retrieve S3 objects as they are added, transform them, and post them to an external API. Before I can use this in production, I need some information on how ListS3 maintains its state and whether that state is dependable.
1. If I start my dataflow 'now' (2017-01-01 00:00), I see that it retrieves all objects from 'now' (2017-01-01 00:00) onwards. After that, even if I restart the dataflow or the NiFi server, it retrieves all the objects that it missed since 'now'. It seems this information is persisted, probably in the filesystem. Where exactly is that?

2. Is there any control over it? I started on 2017-01-01 00:00, but now I want it to process only objects from 2017-07-01 00:00 onwards.

3. What happens if there is an error halfway through processing a file? How can I reprocess it, say if the cause was a genuine bug in a custom processor?

4. I would like to monitor which files were successfully processed. What is the recommended way to do that?

Appreciate your help.

--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/In-ListS3-processor-where-does-Nifi-persists-the-state-of-objects-tp14489.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
