ddewaele,

> 2. Sometimes, when syncing files to my S3 buckets, I notice that the ListS3
> processor is picking up the same file twice. Is there a way to avoid that ?

Joe's response is correct. If you upload an object to S3 that overwrites an existing key, the modified date changes, and ListS3 emits a flowfile for the "new" object with the same key. Likewise, changes to object metadata, server-side encryption settings, etc. also update the object's modified date.

The List->Fetch strategy works well for a directory used as a queue, but it doesn't always work as well for monitoring an entire S3 bucket over time.

You may be able to achieve finer-grained control using event notifications and an SQS queue, which I wrote about a while back:

https://adamlamar.github.io/2016-01-30-monitoring-an-s3-bucket-in-apache-nifi/

I suspect this will function a bit closer to your expectations, and the latency from object creation to NiFi receiving the event should be much shorter as well.

Hope that helps,
Adam
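
P.S. To make the event-notification idea a bit more concrete, here is a rough Python sketch of what a consumer could do with the message bodies it reads from the queue. It is only an illustration, not what NiFi does internally. It assumes the bucket publishes directly to SQS (no SNS wrapper), and the function names `extract_objects` and `new_objects` are made up for this example. It treats a repeat of the same bucket, key and ETag as a duplicate; note that the ETag is not always a plain content hash (multipart uploads, for example), so this is a heuristic.

```python
import json
from urllib.parse import unquote_plus


def extract_objects(message_body):
    """Return (bucket, key, etag) tuples from one S3 event notification body."""
    event = json.loads(message_body)
    objects = []
    # The test message S3 sends when notifications are first configured
    # has no "Records" field, so default to an empty list.
    for record in event.get("Records", []):
        # Only object-creation events are of interest here.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        s3 = record["s3"]
        # Object keys arrive URL-encoded in the notification.
        key = unquote_plus(s3["object"]["key"])
        objects.append((s3["bucket"]["name"], key, s3["object"].get("eTag")))
    return objects


def new_objects(message_bodies, seen=None):
    """Yield each (bucket, key, etag) once, skipping anything already in `seen`."""
    seen = set() if seen is None else seen
    for body in message_bodies:
        for obj in extract_objects(body):
            if obj not in seen:
                seen.add(obj)
                yield obj
```

Passing the same `seen` set across calls keeps the filter effective between polls. It lives in memory only, so a real flow would need to persist it somewhere if duplicates must be suppressed across restarts.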
