Re: sshd-core: dealock
Hello again, I am still trying to create a minimal example to reproduce the problem. If I am using only 1 NIO worker on server side it is easy to reproduce. So, I assume there must be at least two workers, that there is one that always could call the notifyAll method to wake up all the sleeping ones. Can you give me some hints, how you ensure, that there is always one worker free to notify the ones that call lock.wait? A perhaps more readable (because of the line breaks) version of the thread dump could be found here: https://gist.github.com/maggu2810/4c8df61e95938102726376a044ca9a9e Best regards, Markus
sshd-core: dealock
Hello, I have a project that is using sshd-core (+mina-core 2.0.16) on Client and Server side. I previously used 1.3.0 and switched to 1.4.0-SNAPSHOT. The problem (I will describe shortly) is raised by both versions (not pre 1.3.0 version has been used before). The clients logic is very simple: * create a SshClient * setup the client * connect * authenticate * startRemotePortForwarding The relevant server logic looks like: * create a SshServer * setup the server (e.g. accept all forwarding filter, set default TCP/IP forwarder) * start the server Multiple clients connect to the server and the forwarded ports on the server side are used by other tools to establish a tunneled connection to the clients. FYI: The whole stuff is running in a Karaf instance. This works most of the time... But not always. Sometimes no data seems to be transmitted to the tunnels anymore. The server seems to be run into a deadlock. I created a stack trace (attached at the end) of the server and connected a debugging session at the time no data is transmitted anymore. It shows that all NIO workers resides in the "lock.wait()" statement of the AbstractClientChanne's waitFor method. If I disconnect the respective client by closing the session on the server side (I created a Karaf command line command) then the handling is working again. I know that the best would be to have a minimal example or an unit test, but I cannot reproduce the scenario if e.g. client and server are on the same network. I assume it is very timing dependent. Could someone with more knowledge about the channel and lock handling have a look at? Who is responsible to call lock.notify[All]() is all NIO workers are waiting? Thanks you guys! The deadlock looks like: === "sshd-SshServer[67de991c]-nio2-thread-3" - Thread t@125 java.lang.Thread.State: TIMED_WAITING at java.lang.Object.wait(Native Method) - waiting on <132c6b60> (a java.lang.Object) at org.apache.sshd.client.channel.AbstractClientChannel.waitFor(AbstractClientChannel.java:244) at org.apache.sshd.common.forward.DefaultTcpipForwarder$StaticIoHandler.messageReceived(DefaultTcpipForwarder.java:984) at org.apache.sshd.common.io.nio2.Nio2Session.handleReadCycleCompletion(Nio2Session.java:276) at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:256) at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:253) at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.lambda$completed$0(Nio2CompletionHandler.java:38) at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$$Lambda$45/1071326492.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:37) at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126) at sun.nio.ch.Invoker$2.run(Invoker.java:218) at sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Locked ownable synchronizers: - locked <6d02d7ed> (a java.util.concurrent.ThreadPoolExecutor$Worker) "sshd-SshServer[67de991c]-nio2-thread-2" - Thread t@124 java.lang.Thread.State: TIMED_WAITING at java.lang.Object.wait(Native Method) - waiting on <7e9f4eff> (a java.lang.Object) at org.apache.sshd.client.channel.AbstractClientChannel.waitFor(AbstractClientChannel.java:244) at org.apache.sshd.common.forward.DefaultTcpipForwarder$StaticIoHandler.messageReceived(DefaultTcpipForwarder.java:984) at org.apache.sshd.common.io.nio2.Nio2Session.handleReadCycleCompletion(Nio2Session.java:276) at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:256) at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:253) at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.lambda$completed$0(Nio2CompletionHandler.java:38) at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$$Lambda$45/1071326492.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:37) at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126) at sun.nio.ch.Invoker$2.run(Invoker.java:218) at sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Locked ownable synchronizers: - locked <35fbf3e8> (a java.util.concurrent.ThreadPoolExecutor$Worker) "sshd-SshServer[67de991c]-nio2-thread-1" - Thread t@122 java.lang.Thread.State: TIMED_WAITING at java.lang.Object.wait(Native Method) - waiting on