mosermw commented on PR #8914: URL: https://github.com/apache/nifi/pull/8914#issuecomment-2215302556
Thanks for looking at this PR, @exceptionfactory

> @mosermw can you provide some additional context for the reasons behind this proposed change in behavior?

As a use case, let's say I have a server running software that has an anxiety attack if a certain file doesn't exist, and it checks for that file every 5 seconds. I have a requirement to update that file periodically. I use the power of NiFi to generate the file contents and then PutSFTP the file into place.

Currently, my server software sometimes panics because it checks for the file while I am updating it. This is because PutSFTP deletes the file first, transfers the file as ".filename" then renames to "filename". As my file gets larger, it can take more than 5 seconds to transfer. After the change in this PR, the file will only not exist in the short period of time between an SFTP delete then rename.

I asked the question on Slack a while back and got positive feedback that this change would be useful. I should have put the link into the Jira ticket. https://apachenifi.slack.com/archives/C0L9S92JY/p1713799005100369

> Reviewing the code, this change introduces an additional `mlist()` command for FTP and `stat()` command for SFTP, for each file transferred through the corresponding Processors. Those commands require both file access and network communication, which could have an impact on high volume flows. Is there a particular reason for adding those calls prior to calling `delete()`? It seems like that should not be necessary.

You're absolutely right and I can do better. It doesn't hurt to call delete whether the destination file exists or not. I will modify the PR to remove those additional commands and test again.
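To make the ordering difference concrete, here is a minimal sketch of the two sequences described above. It is not the NiFi code: it uses the local filesystem through `java.nio.file` as a stand-in for the SFTP server, and the class name `UploadOrderSketch` and both method names are hypothetical, chosen only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only: local files stand in for the remote SFTP directory.
public class UploadOrderSketch {

    // Order described as the current behavior: delete, transfer as ".filename", rename.
    static void deleteThenTransfer(Path dir, String filename, byte[] content) throws IOException {
        Path target = dir.resolve(filename);
        Path temp = dir.resolve("." + filename);
        Files.deleteIfExists(target); // target is missing from here...
        Files.write(temp, content);   // ...through the whole (possibly slow) transfer...
        Files.move(temp, target);     // ...until the rename completes
    }

    // Order described for the PR: transfer as ".filename", then delete, then rename.
    static void transferThenDelete(Path dir, String filename, byte[] content) throws IOException {
        Path target = dir.resolve(filename);
        Path temp = dir.resolve("." + filename);
        Files.write(temp, content);   // old target stays in place during the transfer
        Files.deleteIfExists(target); // no prior existence check; a missing target is not an error
        Files.move(temp, target);     // target is missing only between the delete and this rename
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("upload-order");
        transferThenDelete(dir, "status.txt", "first".getBytes(StandardCharsets.UTF_8));
        transferThenDelete(dir, "status.txt", "second".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(Files.readAllBytes(dir.resolve("status.txt")), StandardCharsets.UTF_8));
    }
}
```

In this sketch `Files.deleteIfExists` plays the role of the unconditional delete, so no extra lookup is made before deleting. Whether a particular FTP or SFTP client reports an error when deleting a missing file is outside what this local stand-in shows.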
