[ 
https://issues.apache.org/jira/browse/NIFI-12900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12900:
---------------------------
    Description: 
The processor {{PutSFTP}} is based on {{PutFileTransfer}}.
Before an actual upload takes place, potential conflicts (e.g. existing file) 
are identified and resolved using {{identifyAndResolveConflictFile(...)}}.
As part of this process, information on the target file is retrieved using 
{{FileTransfer.getRemoteFileInfo(...)}}.
In case of {{PutSFTP}} this is implemented by {{SFTPTransfer}}.

The implementation of {{getRemoteFileInfo}} executes {{ls}} on the target 
directory path. When the remote directory contains many files, e.g. more than 
10,000, this listing significantly reduces the performance of {{PutSFTP}}.

Instead of a listing on the directory, file information should be retrieved 
using either {{ls}} or {{stat}} on the target file directly.
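
The difference between the two lookups can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical and for illustration only: {{RemoteFs}}, {{InMemoryRemote}}, {{existsViaListing}} and {{existsViaStat}} are made-up names, not the NiFi {{SFTPTransfer}} code and not the API of any SFTP library. The in-memory fake only counts how many remote entries each approach has to examine.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class RemoteFileInfoSketch {

    /** Hypothetical stand-in for an SFTP client (assumption, not a real API). */
    interface RemoteFs {
        /** Like running ls on a directory: returns the names of all direct children. */
        List<String> list(String directory);

        /** Like running stat on a single path: returns its size if it exists. */
        Optional<Long> statSize(String path);
    }

    /** In-memory fake that counts how many entries each call has to examine. */
    static final class InMemoryRemote implements RemoteFs {
        private final Map<String, Long> sizesByPath = new LinkedHashMap<>();
        long entriesExamined = 0;

        void put(String path, long size) {
            sizesByPath.put(path, size);
        }

        @Override
        public List<String> list(String directory) {
            String prefix = directory.endsWith("/") ? directory : directory + "/";
            List<String> names = new ArrayList<>();
            for (String path : sizesByPath.keySet()) {
                // Keep only direct children of the directory.
                if (path.startsWith(prefix) && path.indexOf('/', prefix.length()) < 0) {
                    names.add(path.substring(prefix.length()));
                    entriesExamined++;
                }
            }
            return names;
        }

        @Override
        public Optional<Long> statSize(String path) {
            // A single lookup, independent of how many siblings exist.
            entriesExamined++;
            return Optional.ofNullable(sizesByPath.get(path));
        }
    }

    /** Conflict check via a directory listing: cost grows with directory size. */
    static boolean existsViaListing(RemoteFs fs, String directory, String fileName) {
        return fs.list(directory).contains(fileName);
    }

    /** Conflict check via a lookup on the target file only. */
    static boolean existsViaStat(RemoteFs fs, String directory, String fileName) {
        return fs.statSize(directory + "/" + fileName).isPresent();
    }

    public static void main(String[] args) {
        InMemoryRemote remote = new InMemoryRemote();
        for (int i = 0; i < 10_000; i++) {
            remote.put("/upload/file-" + i + ".dat", i);
        }

        boolean foundByListing = existsViaListing(remote, "/upload", "file-42.dat");
        System.out.println("listing: found=" + foundByListing
                + ", entries examined=" + remote.entriesExamined);

        remote.entriesExamined = 0;
        boolean foundByStat = existsViaStat(remote, "/upload", "file-42.dat");
        System.out.println("stat:    found=" + foundByStat
                + ", entries examined=" + remote.entriesExamined);
    }
}
```

In this sketch the listing-based check examines every entry in the directory, while the stat-based check examines only the target path. A real change would additionally need to tell "file does not exist" apart from other errors returned by the server.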


  was:
The processor `PutSFTP` is based on `PutFileTransfer`.
Before an actual upload takes place, potential conflicts (e.g. existing file) 
are identified and resolved using `identifyAndResolveConflictFile(...)`.
As part of this process, information on the target file is retrieved using 
`FileTransfer.getRemoteFileInfo(...)`.
In case of `PutSFTP` this is implemented by `SFTPTransfer`.

The implementation of `getRemoteFileInfo` executes `ls` on the target directory 
path. When the remote directory contains many files, e.g. more than 10,000, 
this listing significantly reduces the performance of `PutSFTP`.

Instead of a listing on the directory, file information should be retrieved 
using either `ls` or `stat` on the target file directly.



> Avoid unnecessary file listing in PutSFTP 
> ------------------------------------------
>
>                 Key: NIFI-12900
>                 URL: https://issues.apache.org/jira/browse/NIFI-12900
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: endzeit
>            Assignee: endzeit
>            Priority: Major
>
> The processor {{PutSFTP}} is based on {{PutFileTransfer}}.
> Before an actual upload takes place, potential conflicts (e.g. existing file) 
> are identified and resolved using {{identifyAndResolveConflictFile(...)}}.
> As part of this process, information on the target file is retrieved using 
> {{FileTransfer.getRemoteFileInfo(...)}}.
> In case of {{PutSFTP}} this is implemented by {{SFTPTransfer}}.
> The implementation of {{getRemoteFileInfo}} executes {{ls}} on the target 
> directory path. When the remote directory contains many files, e.g. more than 
> 10,000, this listing significantly reduces the performance of {{PutSFTP}}.
> Instead of a listing on the directory, file information should be retrieved 
> using either {{ls}} or {{stat}} on the target file directly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
