We have been using NiFi for over a year and we just brought up a new cluster. We move around 6TB a day of small to large files. We are having an issue with ListSFTP missing files. I know this can happen if a file with an older date is moved into the directory, because the lister maintains state. However, it also seems to hang when there are 10,000 or more files. I am running NiFi 1.6 on Ubuntu 18. The cluster has plenty of memory, CPU, and disk space. I am also using the distributed cache because we haven't migrated to 1.8 yet.
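To illustrate the state issue mentioned above (a file with an older date being skipped), here is a rough Python sketch of a lister that only remembers the newest modification time it has seen. This is a simplified model under that assumption, not NiFi's actual ListSFTP code; the function name `list_new_files`, the file names, and the timestamps are all hypothetical.

```python
def list_new_files(entries, last_seen_mtime):
    """Return (new_files, updated_state) for a lister that keeps only
    the latest modification time as its state (simplified assumption)."""
    # Only entries strictly newer than the stored timestamp are listed.
    new_files = [name for name, mtime in entries if mtime > last_seen_mtime]
    # The state advances to the newest timestamp seen so far.
    newest = max([mtime for _, mtime in entries] + [last_seen_mtime])
    return new_files, newest


# First pass: both files are newer than the initial state of 0.
entries = [("a.dat", 100), ("b.dat", 200)]
first_pass, state = list_new_files(entries, 0)

# A file with an older timestamp (150) is moved into the directory
# after the state has already advanced to 200, along with a newer file.
entries.append(("late.dat", 150))
entries.append(("c.dat", 300))
second_pass, state = list_new_files(entries, state)

print(first_pass)   # files listed on the first pass
print(second_pass)  # files listed on the second pass; late.dat is absent
```

In this model `late.dat` is never listed, because its timestamp is older than the stored state. The model does not account for the separate case where nothing at all is pulled.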
We have 20 different data flows, each with its own logic. We connect the lister to a remote port that is connected to a remote process group, and the listings are then distributed across the cluster to a FetchSFTP that deletes the files after they are loaded. We move files into the input directory so that we have permission to delete them from the NiFi fetch. We use a find that orders the files to make sure that we don't grab old files. This could still be an issue and cause us to miss a few files, but it doesn't explain why nothing gets pulled when the lister is running and there are files to pull. Any suggestions or ideas would be appreciated.

Dave
