[ https://issues.apache.org/jira/browse/NIFI-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17053370#comment-17053370 ]

Jens M Kofoed commented on NIFI-7222:
-------------------------------------

[~joewitt] Thanks for the nar file. In my setup it works, both for the Fetch 
and the Get processor.

But... another error occurs with PutSFTP. This error also exists in the 
current version 1.11.3. Sorry for not finding that bug earlier.

If I change the "Conflict Resolution" to anything other than NONE, it fails. 
I've tried to add a DEBUG level in the logback settings for the processor, but 
no other messages are printed out.
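The stack trace below points at sshj's Buffer.putString(), which throws a NullPointerException when given a null string; that suggests the directory path handed down to SFTPClient.ls() during conflict resolution is null. As a rough illustration only (a hypothetical helper, not the actual NiFi fix), a guard of this kind would avoid passing a null directory:

```java
// Hypothetical illustration: sshj's Buffer.putString(...) throws a
// NullPointerException on a null string, so any directory path passed to
// SFTPClient.ls(...) must be non-null. This guard (NOT the actual NiFi fix)
// falls back to "." -- the SFTP user's current directory -- when the remote
// path carries no directory component.
public class RemotePathGuard {

    static String directoryOrDefault(String remotePath) {
        if (remotePath == null || remotePath.isEmpty()) {
            return ".";
        }
        int slash = remotePath.lastIndexOf('/');
        // No slash (bare filename), or only a leading one: treat as
        // the current directory for the purposes of this sketch.
        return slash < 1 ? "." : remotePath.substring(0, slash);
    }

    public static void main(String[] args) {
        System.out.println(directoryOrDefault(null));                 // .
        System.out.println(directoryOrDefault("file.log.gz"));        // .
        System.out.println(directoryOrDefault("/volume1/test/f.gz")); // /volume1/test
    }
}
```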

2020-03-06 13:27:49,967 ERROR [Timer-Driven Process Thread-2] o.a.nifi.processors.standard.PutSFTP PutSFTP[id=8b42b8b0-0170-1000-1cfa-f6abe37ba1a1] failed to process session due to java.lang.NullPointerException; Processor Administratively Yielded for 1 sec: java.lang.NullPointerException
java.lang.NullPointerException: null
    at net.schmizz.sshj.common.Buffer.putString(Buffer.java:435)
    at net.schmizz.sshj.sftp.SFTPEngine.openDir(SFTPEngine.java:160)
    at net.schmizz.sshj.sftp.SFTPClient.ls(SFTPClient.java:57)
    at net.schmizz.sshj.sftp.SFTPClient.ls(SFTPClient.java:52)
    at org.apache.nifi.processors.standard.util.SFTPTransfer.getRemoteFileInfo(SFTPTransfer.java:686)
    at org.apache.nifi.processors.standard.PutFileTransfer.identifyAndResolveConflictFile(PutFileTransfer.java:207)
    at org.apache.nifi.processors.standard.PutFileTransfer.onTrigger(PutFileTransfer.java:113)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1176)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

2020-03-06 13:27:49,983 WARN [Timer-Driven Process Thread-2] o.a.n.controller.tasks.ConnectableTask Administratively Yielding PutSFTP[id=8b42b8b0-0170-1000-1cfa-f6abe37ba1a1] due to uncaught Exception: java.lang.NullPointerException
java.lang.NullPointerException: null
    (same stack trace as above)

> FetchSFTP appears to not advise the remote system it is done with a given 
> resource resulting in too many open files
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-7222
>                 URL: https://issues.apache.org/jira/browse/NIFI-7222
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>            Reporter: Joe Witt
>            Assignee: Joe Witt
>            Priority: Major
>             Fix For: 1.12.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hi guys,
>  
> We have an issue with the FetchSFTP processor and the max open file 
> descriptors. In short, it seems that FetchSFTP keeps the files open 
> “forever” on our Synology NAS, so we always reach the NAS's default 
> max-open-files limit of 1024 when we try to fetch 500’000 small 1 MB files 
> (so in fact it’s not possible to read the files, as everything is blocked 
> after 1024 files).
>  
> We found no option to raise the max-open-files limit on the Synology NAS 
> (but that’s not NiFi’s fault 😉). We also have other Linux machines with 
> CentOS, but the behavior there isn’t always exactly the same: sometimes the 
> file descriptors get closed, but sometimes they don't.
>  
> Synology has no lsof command, but this is how I’ve checked it:
> user@nas-01:~$ sudo ls -l /proc/<SSHD SFTP process PID>/fd | wc -l
> 1024
>  
> Any comments on how we can troubleshoot the issue?
>  
> Cheers Josef
> Oh sorry, missed one of the most important parts: we are using an 8-node 
> cluster with NiFi 1.11.3 – so perfectly up to date.
>  
> Cheers Josef
> Hi Joe
>  
> Ok, regarding our setup: we just bought a new, powerful Synology NAS to use 
> as an SFTP server, mainly for NiFi, to replace our current SFTP Linux 
> machine. So the NAS is empty and configured just for this single use case 
> (read/write SFTP from NiFi); nothing else is running there at the moment. 
> The important limit, per SSH/user session, is the ulimit of 1024 max open 
> files:
>  
> root@nas-01:~# ulimit -a
> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 62025
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1024
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 62025
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>  
>  
> On the NiFi side we are using an 8-node cluster, but it doesn’t matter 
> whether I’m using the whole cluster or just one single (primary) node. It’s 
> clearly visible that it’s related to the number of FetchSFTP processors 
> running. So if I distribute the load to 8 nodes, I see 8 SFTP sessions on 
> the NAS and we can fetch 8x1024 files. I also see the file descriptor on 
> the NAS for each file fetched by NiFi (per FetchSFTP processor = PID). In 
> my understanding these files should be fetched and their file descriptors 
> closed after the transfer, but that doesn’t seem to be the case most of the 
> time.
>  
> As soon as I stop the “FetchSFTP” processor, the SFTP session seems to be 
> closed and all FDs are gone. So after a stop/start I can fetch 1024 files 
> again.
>  
> So I tried to troubleshoot a bit further and here is what I’ve done in NiFi 
> and on the NAS:
>  
> (screenshot of the NiFi flow: ListSFTP -> ControlRate -> FetchSFTP)
>  
> So I’ve done a ListSFTP and got 2880 flowfiles; they get load-balanced to 
> one single node (to simplify the test and only get 1 SFTP session on the 
> NAS). In the ControlRate I’m transferring 10 flowfiles every 10 seconds to 
> the FetchSFTP, which correlates directly with the open file descriptors on 
> my NAS, as you can see below. Sometimes, and I don’t know when or why, the 
> SFTP session gets closed and everything starts from scratch (not happened 
> here) without any notice on the NiFi side. As you see, the FDs are growing 
> by +10 every 10 sec, and if I check the path/filename of the open FDs I see 
> that they are the ones I’ve fetched.
>  
> root@nas-01:~# ps aux | grep sftp
> root      1740  0.5  0.0 240848  8584 ?        Ss   15:01   0:00 sshd: ldr@internal-sftp
> root      1753  0.0  0.0  23144  2360 pts/2    S+   15:01   0:00 grep --color=auto sftp
> root     15520  0.0  0.0 241088  9252 ?        Ss   13:38   0:02 sshd: ldr@internal-sftp
> root@nas-01:~#
> root@nas-01:~# ls -l /proc/1740/fd | wc -l    
> 24
> root@nas-01:~# ls -l /proc/1740/fd | wc -l
> 34
> root@nas-01:~# ls -l /proc/1740/fd | wc -l
> 44
> root@nas-01:~# ls -l /proc/1740/fd | wc -l
> 54
> root@nas-01:~# ls -l /proc/1740/fd | wc -l
> 64
>  
> root@p-li-nas-01:~# ls -l /proc/1740/fd | head
> total 0
> lr-x------  1 root root 64 Mar  4 15:01 0 -> pipe:[1086218]
> l-wx------  1 root root 64 Mar  4 15:01 1 -> pipe:[1086219]
> lr-x------+ 1 root root 64 Mar  4 15:01 10 -> /volume1/test/2019-08-31/detail-20190831-0104-92.log.gz
> lr-x------+ 1 root root 64 Mar  4 15:03 100 -> /volume1/test/2019-08-31/detail-20190831-0052-91.log.gz
> lr-x------+ 1 root root 64 Mar  4 15:03 101 -> /volume1/test/2019-08-31/detail-20190831-0340-92.log.gz
> lr-x------+ 1 root root 64 Mar  4 15:03 102 -> /volume1/test/2019-08-31/detail-20190831-0246-91.log.gz
> lr-x------+ 1 root root 64 Mar  4 15:03 103 -> /volume1/test/2019-08-31/detail-20190831-0104-91.log.gz
> lr-x------+ 1 root root 64 Mar  4 15:03 104 -> /volume1/test/2019-08-31/detail-20190831-0150-91.log.gz
> lr-x------+ 1 root root 64 Mar  4 15:03 105 -> /volume1/test/2019-08-31/detail-20190831-0013-91.log.gz
> …
>  
> So to sum up: one FetchSFTP generates one SFTP session on the NAS. The SFTP 
> session holds FDs which, most of the time, don’t get closed. Reproducible 
> with the template above, with FetchSFTP against a CentOS machine or a 
> Synology NAS. The main questions now are why the FDs are not closed, and 
> when the SFTP session should get closed.
>  
> Thanks
> Just checked the open sockets on the NiFi machine where FetchSFTP is 
> running; of course there is just one SSH session if I’m using only a single 
> processor… So the SFTP transfers are tunneled inside the SSH session.
>  
> Open TCP sessions on NiFi
> [user@ nifi-05 ~]$ netstat -vatn | grep x.y.z.232
> tcp        0      0 x.y.z.144:33628     x.y.z.232:22        ESTABLISHED
>  
> Any comments are welcome. It is still unclear where the open FDs on the NAS 
> (SFTP server) are coming from, or how it should work from the NiFi 
> perspective.
>  
> Cheers Josef
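The manual `ls /proc/<PID>/fd | wc -l` checks repeated throughout the report can be scripted. A small sketch, assuming a Linux `/proc` filesystem; the PID argument would be the sshd/sftp PID taken from `ps aux | grep sftp` (1740 in the transcript above), and it defaults to the current shell only so the sketch runs standalone:

```shell
#!/bin/sh
# Sketch: sample a process's open-file-descriptor count a few times,
# to watch for the +10-per-interval growth described above.
# Pass the sshd/sftp PID as the first argument; defaults to this shell.
PID=${1:-$$}
for i in 1 2 3; do
    count=$(ls "/proc/$PID/fd" | wc -l)
    echo "open fds: $count"
    sleep 1
done
```

Diffing successive samples against the ControlRate interval (10 flowfiles per 10 seconds here) makes it easy to confirm the leak rate matches the fetch rate.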



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
