ianw opened a new issue, #319:
URL: https://github.com/apache/mina-sshd/issues/319

   ### Version
   
   2.8.0
   
   ### Bug description
   
   We received a report from a Gerrit user at review.opendev.org who could not 
pull their repositories via SSH.  Upon investigating, we could see a consistent 
error logged for them:
   
   ```SshChannelNotFoundException: Received SSH_MSG_CHANNEL_WINDOW_ADJUST on 
unassigned channel 0 (last assigned=null)```
   
   and then the connection appears to disconnect.  This user was using 
```openssh 9.1_p1-r3```
   
   This issue seems to have some history with Gerrit.  It seems in [1] there 
was a report of high-latency connections causing a similar-but-different error 
after an upgrade to 1.7.0 or maybe 2.0.0:
   
   ```SshChannelNotFoundException: Received SSH_MSG_CHANNEL_WINDOW_ADJUST on 
unknown channel 0```
   
   If I'm understanding what happened here, there was no particular root cause 
found as to why this error was occurring.  However, what happened was that a 
```ChannelIdTrackingUnknownChannelReferenceHandler``` implementation was added 
to ```sshd-contrib``` and incorporated into gerrit with [2].
   
   If I'm understanding what this does, it basically watches when channels are 
initialized and saves that channel number to a session variable 
```LAST_CHANNEL_ID_KEY```.  It then seems to basically ignore the 
```ChannelIdTrackingUnknownChannelReference``` error *if* that channel has ever 
been opened before (i.e. the channel raising the exception is < 
LAST_CHANNEL_ID_KEY, and assuming channels are opened sequentially).
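
To make sure I have the logic straight, here is a minimal, self-contained sketch of the tracking idea as I understand it. It is a model only, not the `sshd-contrib` class: `ChannelIdTrackingSketch`, `channelClosed()` and `onWindowAdjust()` are hypothetical names, a plain field stands in for the `LAST_CHANNEL_ID_KEY` session attribute, and the inclusive comparison is my assumption, so that a late message for the most recently opened channel is tolerated too.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the channel-id tracking idea; NOT the sshd-contrib class.
final class ChannelIdTrackingSketch {
    // Stand-in for the LAST_CHANNEL_ID_KEY session attribute; null = nothing opened yet.
    private Integer lastAssigned;
    // Channels that are currently open on this session.
    private final Set<Integer> openChannels = new HashSet<>();

    // Called when a channel is initialized: remember its number.
    void channelInitialized(int channelId) {
        openChannels.add(channelId);
        lastAssigned = channelId;
    }

    // Hypothetical helper so the sketch can model a channel that has gone away.
    void channelClosed(int channelId) {
        openChannels.remove(channelId);
    }

    // Models receiving SSH_MSG_CHANNEL_WINDOW_ADJUST for the given channel number.
    String onWindowAdjust(int channelId) {
        if (openChannels.contains(channelId)) {
            return "delivered";
        }
        // Assumption: inclusive comparison, and channel numbers are handed out sequentially.
        if (lastAssigned != null && channelId <= lastAssigned) {
            return "ignored";
        }
        throw new IllegalStateException(
                "Received SSH_MSG_CHANNEL_WINDOW_ADJUST on unassigned channel "
                        + channelId + " (last assigned=" + lastAssigned + ")");
    }

    public static void main(String[] args) {
        ChannelIdTrackingSketch session = new ChannelIdTrackingSketch();
        try {
            session.onWindowAdjust(0); // nothing has been initialized yet
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        session.channelInitialized(0);
        session.channelClosed(0);
        System.out.println(session.onWindowAdjust(0)); // late message for a closed channel
    }
}
```

In this model, the only way to get a message ending in `(last assigned=null)` is for the adjustment to arrive before any channel has been initialized on the session.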
   
   I think the summary at that point might be "we don't really know why we're 
seeing adjustment messages for unassigned channels, but we're ok to ignore that 
if we know the channel was opened at some point"?
   
   Gerrit enabled this by default, but left an undocumented escape-hatch of an 
```enableChannelIdTracking``` flag to turn it off; i.e. go back to the default 
state of raising an error for any messages for unassigned channels [3].
   
   I think what we can see from this current error is that the client has sent 
this window adjustment message when mina seems to think no channel has ever 
been opened -- since the last assigned is null (```on unassigned channel 0 
(last assigned=null)```).
   
   This seems quite weird, and possibly racy?  I confess only a passing 
knowledge of the SSH protocol, but how would the remote end have thought that 
the channel was set up enough to send a window adjustment when the mina side 
appears never to have made the call to the ```channelInitialized()``` 
function here?
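
For reference, my reading of RFC 4254 is that the opener sends `SSH_MSG_CHANNEL_OPEN` with its own channel number, the other side answers with `SSH_MSG_CHANNEL_OPEN_CONFIRMATION` carrying the number it assigned, and later messages such as `SSH_MSG_CHANNEL_WINDOW_ADJUST` are addressed to that assigned number. The sketch below models only that ordering from the server's point of view. `ChannelOpenSketch` and its methods are hypothetical names, not mina-sshd code, and handing out ids starting from 0 is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the RFC 4254 channel-open ordering; NOT mina-sshd code.
final class ChannelOpenSketch {
    // Server-side view: server-assigned channel number -> the client's channel number.
    private final Map<Integer, Integer> serverChannels = new HashMap<>();
    private int nextServerId = 0; // assumption: ids are handed out sequentially from 0

    // Server handles SSH_MSG_CHANNEL_OPEN and returns the number it assigns; that
    // number goes back to the client in SSH_MSG_CHANNEL_OPEN_CONFIRMATION.
    int onChannelOpen(int clientChannelId) {
        int serverId = nextServerId++;
        serverChannels.put(serverId, clientChannelId);
        return serverId;
    }

    // Server handles SSH_MSG_CHANNEL_WINDOW_ADJUST, whose recipient channel is a
    // server-assigned number; returns whether the server knows that number.
    boolean onWindowAdjust(int recipientChannel) {
        return serverChannels.containsKey(recipientChannel);
    }

    public static void main(String[] args) {
        ChannelOpenSketch server = new ChannelOpenSketch();
        System.out.println(server.onWindowAdjust(0));        // before any open
        int assigned = server.onChannelOpen(0);              // client opens its channel 0
        System.out.println(server.onWindowAdjust(assigned)); // after confirmation
    }
}
```

If that reading is right, a well-behaved client should only learn which number to put in a window adjustment from a confirmation the server has already sent, which is why the `last assigned=null` case looks so odd to me.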
   
   I should note this was again debugged to a high-latency, possibly unreliable 
connection.  The user tried both ipv4 and ipv6 and could replicate the issue.  
When they switched to tethering via their phone, the problem did not occur.  But 
it does seem to me that tcp/ip should keep what is coming across the wire 
in-order ...
   
   Once identified by this user, upon inspecting the logs we noticed there were 
more connections exhibiting this behaviour.  It seems to be heavily skewed to a 
few users that seem to have a lot of problems, but then we have this same 
message occurring once or twice for many more users.
   
   I guess the question is -- is this a symptom of a connection that is so out 
of order with packet loss etc. that it cannot be recovered, and this is just 
the first error that it happens to hit; or is this possibly some sort of race, 
where if this race didn't happen, the connection could be completed, even if 
slowly?
   
   [1] https://issues.apache.org/jira/browse/SSHD-942
   [2] 
https://gerrit-review.googlesource.com/c/gerrit/+/238384/9/java/com/google/gerrit/sshd/ChannelIdTrackingUnknownChannelReferenceHandler.java
   [3] 
https://gerrit-review.googlesource.com/c/gerrit/+/238384/9/java/com/google/gerrit/sshd/SshDaemon.java
   
   
   ### Actual behavior
   
   Connection closed unexpectedly
   
   ### Expected behavior
   
   Connection to work
   
   ### Relevant log output
   
   _No response_
   
   ### Other information
   
   _No response_

