Wow, that sounds promising! Would the same apply to the other Get/Put processors?

On Fri, Sep 8, 2017 at 7:47 PM, Koji Kawamura <[email protected]> wrote:

> Hi,
>
> Just a quick update. I've tested
> commons-net-3.3::org.apache.commons.net.ftp.FTPClient without NiFi code.
> Here is the test code I used.
> https://gist.github.com/ijokarumawak/f5a329e53901bf2be7c19aa531094abd
>
> NiFi doesn't set its BufferSize currently, and default is only 1KB.
> To send 10MB file
>
> # BufferSize = 1KB (default)
> about 8 sec
>
> # BufferSize = 16KB
> about 300 ms
>
> I'm going to create a JIRA to add a processor property to specify
> buffer size.
> Also, will test SFTP.
> Thanks again for highlighting the issue!
>
> Koji
>
> On Fri, Sep 8, 2017 at 8:48 AM, Koji Kawamura <[email protected]> wrote:
> > Hi,
> >
> > Thanks for clarifying that the number of files is not significant.
> > I looked at the PutFTP and FTPTransfer source code, and found that it
> > makes few calls to a FTP server in addition to send a file:
> >
> > 1. Sending a file as a temporal file
> > 2. Update modification time, if 'Last Modified Time' is set
> > 3. chmod if 'Permissions' is set
> > 4. Rename the temporal file
> > https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/util/FTPTransfer.java#L379
> >
> > PutSFTP and SFTPTransfer does followings additionally:
> > 5. chown if 'Remote Owner' is set
> > 6. chgrp if 'Remote Group' is set
> >
> > I wonder if those additional invocations add more latency.
> >
> > Also, it'd be helpful if you can write simple Java code using the
> > underlying (S)FTP client libraries without NiFi layer to investigate
> > if NiFi implementation can be improved, or the performance difference
> > come from library implementation.
> >
> > commons-net-3.3::org.apache.commons.net.ftp.FTPClient for FTP
> > and
> > jsch-0.1.54::com.jcraft.jsch.ChannelSftp for SFTP
> >
> > I will try to do that at my end when I have time, but it'd be very
> > helpful if you can do that since you already have testing environment
> > and base metrics.
> >
> > Thanks!
> > Koji
> >
> > On Thu, Sep 7, 2017 at 6:30 PM, Gino Lisignoli <[email protected]> wrote:
> >> Hi
> >>
> >> I monitor the send rates using collectd and grafana. It doesn't seem
> >> to matter if I send 10,000 10MB files or 100 1GB files, the maximum
> >> throughput rate of nifi PutFTP and PutSFTP remain the same. 300Mbps
> >> and 1Gbs
> >>
> >> As mention above, the weird thing is when I send files though ftp and
> >> sftp (without nifi) then the rates are much better.
> >>
> >> It's really odd the the rates are significantly slower in NIFI.
> >>
> >> On Thu, Sep 7, 2017 at 5:45 PM, Koji Kawamura <[email protected]> wrote:
> >>>
> >>> Hello Gino,
> >>>
> >>> Thanks for sharing your findings on FTP performance.
> >>>
> >>> How did you measure send rate from NiFi to your FTP server?
> >>>
> >>> Sending multiple FlowFiles would provide less throughput compared to
> >>> sending one big FlowFile, as PutFTP and PutSFTP make connection to
> >>> each incoming FlowFile. The overhead of establishing connection each
> >>> time might be the performance difference you see with mput command.
> >>>
> >>> Those processors can decide which FTP servers to use based on incoming
> >>> FlowFiles' attribute when NiFi Expression Language is used.
> >>>
> >>> If that's the case, there are some room for performance improvement by
> >>> keeping underlying FTP(S) client instance so that it can be reused
> >>> among multiple onTrigger() call.
> >>>
> >>> A possible work-around would be using MergeContent beforehand and send
> >>> it as a single file, if your use-case allows that.
> >>>
> >>> Thanks,
> >>> Koji
> >>>
> >>> On Thu, Sep 7, 2017 at 12:15 PM, Gino Lisignoli <[email protected]> wrote:
> >>> > I have this weird issue with PutFTP and PutSFTP transfer rates.
> >>> >
> >>> > What I am seeing is that no matter what files I transfer from One
> >>> > server to another over a single connection the maximum rates I can
> >>> > send are 300Mbps for PutFTP and 1Gbps for PutSFTP.
> >>> >
> >>> > The sending nifi is installed on Centos 7, running on a Dell R730,
> >>> > 190GB Ram, 16 Cores @ 2.4GHz and 4x10Gb nics bonded. The sending
> >>> > nifi has it's content repository on a ramdisk, and the receiving
> >>> > server is receiving to a ramdisk (for testing, to remove disk IO
> >>> > out of the equation).
> >>> >
> >>> > When I do a ftp send manually (without nifi) with mput I get ftp
> >>> > rates of ~8Gbs and sftp rates of 2.2Gbs (Which seems slow anyway).
> >>> >
> >>> > I would have expected transfer rates similar with nifi.
> >>> >
> >>> > Is there any way to work out why these rates are so much slower,
> >>> > but also so consistent? I'm using Nifi-1.30
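
As a rough illustration of the buffer-size effect Koji measured, here is a minimal sketch that uses only plain java.io streams. It is not the linked gist and does not touch commons-net or NiFi. The class name BufferSizeSketch, the CountingSink helper and the copy method are hypothetical names made up for this example. The sketch only counts how many write calls a copy loop issues for a 10MB payload at each buffer size; it does not measure network time. In commons-net the corresponding setting should be FTPClient.setBufferSize(int), which this sketch does not call.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferSizeSketch {

    /** Hypothetical sink: discards data and counts the write calls it receives. */
    static final class CountingSink extends OutputStream {
        long writeCalls = 0;
        long bytes = 0;

        @Override
        public void write(int b) {
            writeCalls++;
            bytes++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            writeCalls++;
            bytes += len;
        }
    }

    /** Copies in to out through a buffer of the given size; returns the number of write calls. */
    static long copy(InputStream in, CountingSink out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.writeCalls;
    }

    public static void main(String[] args) throws IOException {
        // 10MB payload, matching the file size used in the test quoted above.
        byte[] payload = new byte[10 * 1024 * 1024];
        for (int size : new int[] {1024, 16 * 1024}) {
            CountingSink sink = new CountingSink();
            long calls = copy(new ByteArrayInputStream(payload), sink, size);
            System.out.println("bufferSize=" + size + " writeCalls=" + calls);
        }
    }
}
```

With an in-memory source this prints 10240 write calls for the 1KB buffer and 640 for the 16KB buffer. On a real socket each of those calls carries its own overhead, which is one plausible reason the larger buffer was so much faster in the test above.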
