[ 
https://issues.apache.org/jira/browse/SSHD-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16437233#comment-16437233
 ] 

Bill Kuker commented on SSHD-721:
---------------------------------

I am uncertain of the answers to these. My full project includes keepalives and 
checks for crashed servers and clients and bad network connections, so after a 
certain amount of time stuck in this state the client applications kill their 
SSH clients and reconnect...

Once all clients with port forwards have killed themselves sshd begins 
operating normally.

Very similar to Markus's problem is that one web browser - web server 
connection over one port forward can get sshd into this state where all port 
forwards for all clients (not just the one client with the http port forward) 
halt.

Because at this point sshd is using no cpu, allocating no memory, executing no 
code, I'd be inclined to call it deadlock and a bug rather than a performance 
issue.

Increasing the number of NIO workers eases my pain, but my assumption would be 
that for any # of NIO workers there is a client behavior that can trigger the 
same deadlock.

I can say this problem began when I upgraded from 0.3.0 to 1.2.0 and still 
occurs in 1.6.0. I am working on updating to 1.7.0 but there are some API 
changes so that'll take a little longer. There also seems to be some time based 
component to it. It happens more for some people bouncing from toronto to paris 
to ottawa than it does for people testing it on the same lan. yay.

I guess my bottom line is that if someone from the SSHD team agrees that this 
is probably a bug I'll work real hard to reproduce it with the simplest test 
case I can... And maybe even a fix. I only have SSHD-85 to my name, but the 
dates on that ticket show a certain persistence ;)

> deadlock: all nio workers wait to be woken up
> ---------------------------------------------
>
>                 Key: SSHD-721
>                 URL: https://issues.apache.org/jira/browse/SSHD-721
>             Project: MINA SSHD
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: Markus Rathgeb
>            Priority: Major
>
> I am using sshd-core for a server machine (S) that accepts incoming 
> connections and port forwarding requests.
> There are client machines (C) that run servers that should be accessible by a 
> tunnel to the server.
> On the client machines (C) also an implementation using sshd-core is running 
> that establish the connection to the server (S) and initiate the port 
> forwarding.
> Other clients are using the tunnelled connection to communicate with the 
> servers that are running on the client machines (C).
> Sometimes I realized that no data is transferred anymore (through the 
> tunnels).
> All the worker reside in the waitFor function and no one wakes them up.
> {noformat}
> "sshd-SshServer[67de991c]-nio2-thread-3" - Thread t@125
>    java.lang.Thread.State: TIMED_WAITING
>       at java.lang.Object.wait(Native Method)
>       - waiting on <132c6b60> (a java.lang.Object)
>       at 
> org.apache.sshd.client.channel.AbstractClientChannel.waitFor(AbstractClientChannel.java:244)
>       at 
> org.apache.sshd.common.forward.DefaultTcpipForwarder$StaticIoHandler.messageReceived(DefaultTcpipForwarder.java:984)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session.handleReadCycleCompletion(Nio2Session.java:276)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:256)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:253)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler.lambda$completed$0(Nio2CompletionHandler.java:38)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler$$Lambda$45/1071326492.run(Unknown
>  Source)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:37)
>       at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>       at sun.nio.ch.Invoker$2.run(Invoker.java:218)
>       at 
> sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>       - locked <6d02d7ed> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "sshd-SshServer[67de991c]-nio2-thread-2" - Thread t@124
>    java.lang.Thread.State: TIMED_WAITING
>       at java.lang.Object.wait(Native Method)
>       - waiting on <7e9f4eff> (a java.lang.Object)
>       at 
> org.apache.sshd.client.channel.AbstractClientChannel.waitFor(AbstractClientChannel.java:244)
>       at 
> org.apache.sshd.common.forward.DefaultTcpipForwarder$StaticIoHandler.messageReceived(DefaultTcpipForwarder.java:984)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session.handleReadCycleCompletion(Nio2Session.java:276)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:256)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:253)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler.lambda$completed$0(Nio2CompletionHandler.java:38)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler$$Lambda$45/1071326492.run(Unknown
>  Source)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:37)
>       at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>       at sun.nio.ch.Invoker$2.run(Invoker.java:218)
>       at 
> sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>       - locked <35fbf3e8> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "sshd-SshServer[67de991c]-nio2-thread-1" - Thread t@122
>    java.lang.Thread.State: TIMED_WAITING
>       at java.lang.Object.wait(Native Method)
>       - waiting on <49ce93a9> (a java.lang.Object)
>       at 
> org.apache.sshd.client.channel.AbstractClientChannel.waitFor(AbstractClientChannel.java:244)
>       at 
> org.apache.sshd.common.forward.DefaultTcpipForwarder$StaticIoHandler.messageReceived(DefaultTcpipForwarder.java:984)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session.handleReadCycleCompletion(Nio2Session.java:276)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:256)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:253)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler.lambda$completed$0(Nio2CompletionHandler.java:38)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler$$Lambda$45/1071326492.run(Unknown
>  Source)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:37)
>       at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>       at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
>       at 
> sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)
>       at 
> sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:276)
>       at 
> sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:297)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:304)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:249)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session.startReading(Nio2Session.java:243)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session.startReading(Nio2Session.java:239)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session.startReading(Nio2Session.java:235)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session.startReading(Nio2Session.java:231)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Session.startReading(Nio2Session.java:227)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Acceptor$AcceptCompletionHandler.onCompleted(Nio2Acceptor.java:178)
>       at 
> org.apache.sshd.common.io.nio2.Nio2Acceptor$AcceptCompletionHandler.onCompleted(Nio2Acceptor.java:156)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler.lambda$completed$0(Nio2CompletionHandler.java:38)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler$$Lambda$45/1071326492.run(Unknown
>  Source)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at 
> org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:37)
>       at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>       at sun.nio.ch.Invoker$2.run(Invoker.java:218)
>       at 
> sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>       - locked <7d1c59e6> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "sshd-SshServer[67de991c]-timer-thread-1" - Thread t@105
>    java.lang.Thread.State: TIMED_WAITING
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for <655c080c> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
>       at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
> - None
> {noformat}
> To eliminate some other code that could trigger that error I created a 
> "minimal" example -- a simple test application -- that could be used to 
> demonstrate the hang (for me it is reproducible using that code).
> Please have a look at 
> https://github.com/maggu2810/sshd-deadlock/tree/first-report where you could 
> also find a readme with a short description about the code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to