SantoshNGitHub opened a new issue, #878:
URL: https://github.com/apache/mina-sshd/issues/878
### Version
Apache MINA SSHD 2.11.0, client side, NETCONF over SSH
### Bug description
Threads block indefinitely in DefaultSshFuture.await() during writePacket()
when the peer stops reading or the TCP buffer fills.
Observed in production: threads were stuck for hours.
```
"CentralPool-alarm-rule-translation-pool-565890" #565890 prio=5 os_prio=0 cpu=2241.46ms elapsed=23171.52s tid=0x00007f45583eecf0 nid=0x254b33 in Object.wait() [0x00007f36c7efd000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait([email protected]/Native Method)
    - waiting on <no object reference available>
    at org.apache.sshd.common.future.DefaultSshFuture.await0(DefaultSshFuture.java:80)
    - locked <0x000000048b3c6430> (a java.lang.Object)
    at org.apache.sshd.common.future.AbstractSshFuture.await(AbstractSshFuture.java:58)
    at org.apache.sshd.common.future.WaitableFuture.await(WaitableFuture.java:50)
    at org.apache.sshd.common.session.helpers.KeyExchangeMessageHandler.writeOrEnqueue(KeyExchangeMessageHandler.java:355)
    at org.apache.sshd.common.session.helpers.KeyExchangeMessageHandler.writePacket(KeyExchangeMessageHandler.java:248)
    at org.apache.sshd.common.session.helpers.AbstractSession.writePacket(AbstractSession.java:1027)
    at org.apache.sshd.common.channel.AbstractChannel.writePacket(AbstractChannel.java:812)
    at org.apache.sshd.common.channel.throttle.DefaultChannelStreamWriter.writeData(DefaultChannelStreamWriter.java:46)
    at org.apache.sshd.common.channel.ChannelAsyncOutputStream.writePacket(ChannelAsyncOutputStream.java:347)
    at org.apache.sshd.common.channel.ChannelAsyncOutputStream.doWriteIfPossible(ChannelAsyncOutputStream.java:224)
    at org.apache.sshd.common.channel.ChannelAsyncOutputStream.writeBuffer(ChannelAsyncOutputStream.java:118)
    at org.broadband_forum.obbaa.netconf.client.ssh.SshNetconfClientSession.sendRpcMessage(SshNetconfClientSession.java:87)
    - locked <0x000000044d56bcd0> (a org.broadband_forum.obbaa.netconf.client.ssh.SshNetconfClientSession)
    at org.broadband_forum.obbaa.netconf.client.ssh.SshNetconfClientSession.sendRpcMessage(SshNetconfClientSession.java:70)
    - locked <0x000000044d56bcd0> (a org.broadband_forum.obbaa.netconf.client.ssh.SshNetconfClientSession)
    at org.broadband_forum.obbaa.netconf.api.client.AbstractNetconfClientSession.sendRpcAndGetFuture(AbstractNetconfClientSession.java:145)
    at org.broadband_forum.obbaa.netconf.api.client.AbstractNetconfClientSession.sendRpc(AbstractNetconfClientSession.java:138)
    at org.broadband_forum.obbaa.netconf.api.client.AbstractNetconfClientSession.get(AbstractNetconfClientSession.java:122)
    at com.alcatel.pma.core.configmgmt.DeviceNetconfSessionImpl.get(DeviceNetconfSessionImpl.java:237)
```
```
"CentralPool-nbi-request-pool-572076" #572076 prio=5 os_prio=0 cpu=201.60ms elapsed=16925.79s tid=0x00007f455843fdc0 nid=0x257fe8 in Object.wait() [0x00007f36c6bfd000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait([email protected]/Native Method)
    - waiting on <no object reference available>
    at org.apache.sshd.common.future.DefaultSshFuture.await0(DefaultSshFuture.java:80)
    - locked <0x0000000545cb0828> (a java.lang.Object)
    at org.apache.sshd.common.future.AbstractSshFuture.await(AbstractSshFuture.java:58)
    at org.apache.sshd.common.future.WaitableFuture.await(WaitableFuture.java:50)
    at org.apache.sshd.common.session.helpers.KeyExchangeMessageHandler.writeOrEnqueue(KeyExchangeMessageHandler.java:355)
    at org.apache.sshd.common.session.helpers.KeyExchangeMessageHandler.writePacket(KeyExchangeMessageHandler.java:248)
    at org.apache.sshd.common.session.helpers.AbstractSession.writePacket(AbstractSession.java:1027)
    at org.apache.sshd.common.channel.AbstractChannel.writePacket(AbstractChannel.java:812)
    at org.apache.sshd.common.channel.throttle.DefaultChannelStreamWriter.writeData(DefaultChannelStreamWriter.java:46)
    at org.apache.sshd.common.channel.ChannelAsyncOutputStream.writePacket(ChannelAsyncOutputStream.java:347)
    at org.apache.sshd.common.channel.ChannelAsyncOutputStream.doWriteIfPossible(ChannelAsyncOutputStream.java:224)
    at org.apache.sshd.common.channel.ChannelAsyncOutputStream.writeBuffer(ChannelAsyncOutputStream.java:118)
    at org.broadband_forum.obbaa.netconf.client.ssh.SshNetconfClientSession.sendRpcMessage(SshNetconfClientSession.java:87)
    - locked <0x0000000545cb0b40> (a org.broadband_forum.obbaa.netconf.client.ssh.SshNetconfClientSession)
```
### Actual behavior
SSH writePacket can block indefinitely in 2.11.0.
### Expected behavior
Write operations should respect a configurable timeout and fail fast.
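To make the expected behavior concrete, here is a minimal stdlib-only sketch of a bounded wait. It does not use any MINA SSHD classes: `BoundedWriteDemo` and `awaitWrite` are hypothetical names, and a never-completed `CompletableFuture` merely stands in for a write whose peer has stopped reading.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustration only; not part of MINA SSHD. All names here are hypothetical. */
final class BoundedWriteDemo {

    /**
     * Waits for a pending write, but never longer than the given timeout.
     * Returns true if the write completed in time, false if the wait timed out.
     */
    static boolean awaitWrite(CompletableFuture<?> pendingWrite, long timeout, TimeUnit unit) {
        try {
            pendingWrite.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            // Give up on the write so the caller can fail instead of hanging.
            pendingWrite.cancel(true);
            return false;
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for write", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Write failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // A future that is never completed models a peer that stopped reading.
        CompletableFuture<Void> stalled = new CompletableFuture<>();
        System.out.println("stalled write completed: "
                + awaitWrite(stalled, 200, TimeUnit.MILLISECONDS));

        CompletableFuture<Void> finished = CompletableFuture.completedFuture(null);
        System.out.println("finished write completed: "
                + awaitWrite(finished, 200, TimeUnit.MILLISECONDS));
    }
}
```

With a bound like this, the calling thread gets control back after the timeout and can close the session or report an error rather than staying parked for hours.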
### Other information
We made the following code change to add a configurable timeout:
### AbstractSession.java
```
private final long ssh_timeout_seconds = CoreModuleProperties.SSH_TIMEOUT_SECONDS.getRequired(this);
```
```
@Override
public IoWriteFuture writePacket(Buffer buffer) throws IOException {
    log.info("Triggers KEX Handler write packet with timeout {}", ssh_timeout_seconds);
    return kexHandler.writePacket(buffer, ssh_timeout_seconds, null);
}
```
### CoreModuleProperties.java
```
public static final Property<Long> SSH_TIMEOUT_SECONDS = Property.long_("ssh-timeout-seconds", 300L);
```
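For discussion, the intended semantics of such a setting could be sketched without any MINA SSHD types as follows. The key `ssh-timeout-seconds` and the default of 300 come from the change above; `TimeoutSettingDemo`, `resolveWriteTimeout`, the plain `Map` of settings, and the rejection of non-positive values are assumptions made only for this illustration.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

/** Illustration only; a plain Map stands in for the real property resolver. */
final class TimeoutSettingDemo {

    static final String KEY = "ssh-timeout-seconds"; // key proposed in this issue
    static final long DEFAULT_SECONDS = 300L;        // default proposed in this issue

    /** Returns the configured write timeout, or the default when the key is absent. */
    static Duration resolveWriteTimeout(Map<String, ?> settings) {
        Object raw = settings.get(KEY);
        long seconds = (raw == null)
                ? DEFAULT_SECONDS
                : Long.parseLong(raw.toString().trim());
        if (seconds <= 0) {
            // Assumption: a timeout must be positive so that a wait is always bounded.
            throw new IllegalArgumentException(KEY + " must be positive, got " + seconds);
        }
        return Duration.ofSeconds(seconds);
    }

    public static void main(String[] args) {
        Map<String, Object> settings = new HashMap<>();
        System.out.println(resolveWriteTimeout(settings).getSeconds()); // 300

        settings.put(KEY, "30");
        System.out.println(resolveWriteTimeout(settings).getSeconds()); // 30
    }
}
```

Returning a `Duration` keeps the unit explicit at the call site, which avoids any ambiguity about whether a bare `long` means seconds or milliseconds.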
Before raising a PR, I’d like feedback on whether this approach and this
location for the timeout handling are acceptable.