[ https://issues.apache.org/jira/browse/NIFI-12900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
endzeit updated NIFI-12900:
---------------------------
Description:
The processor {{PutSFTP}} is based on {{PutFileTransfer}}.
Before an actual upload takes place, potential conflicts (e.g. existing file)
are identified and resolved using {{identifyAndResolveConflictFile(...)}}.
As part of this process, information on the target file is retrieved using
{{FileTransfer.getRemoteFileInfo(...)}}.
In the case of {{PutSFTP}}, this is implemented by {{SFTPTransfer}}.
The implementation of {{getRemoteFileInfo}} executes {{ls}} on the target
directory path. When the remote directory contains many files, e.g. more than
10,000, this listing significantly reduces the performance of {{PutSFTP}}.
Instead of listing the directory, file information should be retrieved using
either {{ls}} or {{stat}} on the target file directly.
was:
The processor `PutSFTP` is based on `PutFileTransfer`.
Before an actual upload takes place, potential conflicts (e.g. existing file)
are identified and resolved using `identifyAndResolveConflictFile(...)`.
As part of this process, information on the target file is retrieved using
`FileTransfer.getRemoteFileInfo(...)`.
In case of `PutSFTP` this is implemented by `SFTPTransfer`.
The implementation of `getRemoteFileInfo` executes `ls` on the target directory
path. In case there are a lot of files inside the remote directory, e.g.
>10.000 files, the listing reduces the performance of `PutSFTP` significantly.
Instead of a listing on the directory, file information should be retrieved
using either `ls` or `stat` on the target file directly.
> Avoid unnecessary file listing in PutSFTP
> ------------------------------------------
>
> Key: NIFI-12900
> URL: https://issues.apache.org/jira/browse/NIFI-12900
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: endzeit
> Assignee: endzeit
> Priority: Major
>
> The processor {{PutSFTP}} is based on {{PutFileTransfer}}.
> Before an actual upload takes place, potential conflicts (e.g. existing file)
> are identified and resolved using {{identifyAndResolveConflictFile(...)}}.
> As part of this process, information on the target file is retrieved using
> {{FileTransfer.getRemoteFileInfo(...)}}.
> In the case of {{PutSFTP}}, this is implemented by {{SFTPTransfer}}.
> The implementation of {{getRemoteFileInfo}} executes {{ls}} on the target
> directory path. When the remote directory contains many files, e.g. more than
> 10,000, this listing significantly reduces the performance of {{PutSFTP}}.
> Instead of listing the directory, file information should be retrieved using
> either {{ls}} or {{stat}} on the target file directly.
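
To illustrate the difference described in the issue, here is a minimal sketch
that compares a directory listing with a direct lookup of the target file. It
does not use NiFi or any SFTP library: {{RemoteFs}}, {{FakeRemoteFs}} and
{{RemoteFileLookupSketch}} are hypothetical names introduced only for this
example, and the in-memory fake merely counts how many entries each approach
would have to return to the client. It is not the actual {{SFTPTransfer}}
implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for an SFTP client; not NiFi's or any library's API. */
interface RemoteFs {
    /** Returns the names of all entries directly inside the directory. */
    List<String> ls(String directory);

    /** Returns the size of a single file, or empty if it does not exist. */
    Optional<Long> stat(String path);
}

/** In-memory fake that counts how many entries each call hands back. */
final class FakeRemoteFs implements RemoteFs {
    private final Map<String, Long> files = new LinkedHashMap<>(); // full path -> size
    long entriesReturned = 0;

    void put(String path, long size) {
        files.put(path, size);
    }

    @Override
    public List<String> ls(String directory) {
        String prefix = directory.endsWith("/") ? directory : directory + "/";
        List<String> names = new ArrayList<>();
        for (String path : files.keySet()) {
            // Keep only direct children of the directory.
            if (path.startsWith(prefix) && path.indexOf('/', prefix.length()) < 0) {
                names.add(path.substring(prefix.length()));
            }
        }
        entriesReturned += names.size(); // cost grows with the directory size
        return names;
    }

    @Override
    public Optional<Long> stat(String path) {
        entriesReturned += 1; // one attribute record (or one not-found status)
        return Optional.ofNullable(files.get(path));
    }
}

final class RemoteFileLookupSketch {
    private static String join(String dir, String name) {
        return dir.endsWith("/") ? dir + name : dir + "/" + name;
    }

    /** Approach criticised in the issue: list the directory, then search it. */
    static boolean existsViaListing(RemoteFs fs, String dir, String name) {
        return fs.ls(dir).contains(name);
    }

    /** Approach proposed in the issue: query the target file directly. */
    static boolean existsViaStat(RemoteFs fs, String dir, String name) {
        return fs.stat(join(dir, name)).isPresent();
    }

    public static void main(String[] args) {
        FakeRemoteFs fs = new FakeRemoteFs();
        for (int i = 0; i < 10_000; i++) {
            fs.put("/upload/file-" + i + ".dat", 1L);
        }

        existsViaListing(fs, "/upload", "file-42.dat");
        System.out.println("entries returned by listing: " + fs.entriesReturned);

        fs.entriesReturned = 0;
        existsViaStat(fs, "/upload", "file-42.dat");
        System.out.println("entries returned by stat: " + fs.entriesReturned);
    }
}
```

In this toy model the listing cost scales with the number of files in the
directory, while the direct lookup stays constant. Real-world numbers will
depend on the server, the network and the client library, so this should be
read only as a picture of why the change is proposed.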