ListS3 does not download anything; it only emits references to the objects residing in S3. Connect ListS3 to a Remote Process Group (RPG) that points back at your own cluster, then connect the corresponding Input Port to FetchS3Object. The RPG spreads the listed references across all nodes, so each node fetches its own share of the files.
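To make the idea concrete, here is a rough Python sketch of the List/Fetch pattern. It is not NiFi code: the function names, bucket name, node names, and the strict round-robin hand-off are all assumptions for illustration (the real RPG decides how to balance between nodes on its own). It only shows why listing on one node and fetching on all of them evens out the load.

```python
from collections import defaultdict
from itertools import cycle


def list_keys(bucket_listing):
    """Primary node only: emit one lightweight reference per object (no content)."""
    # Attribute names here are illustrative.
    return [{"s3.bucket": bucket, "filename": key} for bucket, key in bucket_listing]


def distribute(references, nodes):
    """Stand-in for the RPG: hand references to the nodes in round-robin order."""
    assignment = defaultdict(list)
    for node, ref in zip(cycle(nodes), references):
        assignment[node].append(ref)
    return dict(assignment)


# Hypothetical bucket contents and cluster nodes.
listing = [("my-bucket", f"data/file-{i}.csv") for i in range(7)]
nodes = ["node-1", "node-2", "node-3"]

plan = distribute(list_keys(listing), nodes)
for node in nodes:
    # Each node would run the fetch step only for the references it received.
    print(node, [ref["filename"] for ref in plan[node]])
```

The listing step stays cheap because it only moves references around; the expensive downloads happen after the hand-off, on whichever node received each reference.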
Useful link: https://pierrevillard.com/2017/02/23/listfetch-pattern-and-remote-process-group-in-apache-nifi/

On Mon, 10 Sep 2018 at 10:30 PM, Jean-Sebastien Vachon <[email protected]> wrote:

> Hi all,
>
> I am using a ListS3 processor to process a large number of files stored in
> S3 but this processor only runs on the primary node. Could this be the
> cause of the heavy unbalanced distribution of the load amongst the three
> identical nodes I have?
>
> Is there anyway of distributing the load to all nodes ? or should I simply
> replace ListS3 with something else?
>
> Thanks
