tomaswolf opened a new issue, #263:
URL: https://github.com/apache/mina-sshd/issues/263

   SFTP tests are flaky. Turns out that they are running into
   ```
   23:13:59.123 | ERROR | MinaProcessor-25 | o.a.s.c.c.BufferedIoOutputStream | 
org.apache.sshd.common.channel.BufferedIoOutputStream             219 | 
finishWrite(sftp-out@0)[ChannelAsyncOutputStream[ChannelSession[id=0, 
recipient=0]-ServerSessionImpl[testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE
 65,536]@/127.0.0.1:59555]] cmd=SSH_MSG_CHANNEL_DATA] - pending byte counts 
underflow (-64525) after 18070131 bytes
   23:13:59.123 | ERROR | m-42976-thread-1 | o.a.s.s.s.SftpSubsystem          | 
org.apache.sshd.common.util.logging.LoggingUtils                  693 | 
run(ServerSessionImpl[testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE
 65,536]@/127.0.0.1:59555]) SshChannelBufferedOutputException caught in SFTP 
subsystem after 863 buffers: Pending byte counts underflow
   23:23:59.498 | INFO  | )-timer-thread-1 | o.a.s.s.s.ServerSessionImpl      | 
org.apache.sshd.common.session.helpers.SessionHelper             1183 | 
Disconnecting(ServerSessionImpl[testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE
 65,536]@/127.0.0.1:59555]): SSH2_DISCONNECT_PROTOCOL_ERROR - Detected 
IdleTimeout after 600375/600000 ms.
   
   Finished 
org.apache.sshd.sftp.client.SftpTransferTest:testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE
 65,536] in 610192 ms
   ```
   
   Let's ignore the fact that test hangs for 10 minutes until that timeout 
expires; that a different minor problem. The real problem is the "pending byte 
counts underflow", which indicates that the BufferedIoOutputStream somehow 
managed to write more data than was passed to it.
   
   This is a race condition between `BufferedIoOutputStream.startWriting()` and 
`BufferedIoOutputStream.finishWrite()`. The following sequence of events is 
possible when a write that was in progress just finished and the user of the 
stream just initiates a new write:
   
   1. Thread 1: `startWriting` gets the head of the write queue -- this is 
still the future/buffer that was just written
   2. Thread 2: `finishWrite` removes that write from the write queue
   3. Thread 2: `finishWrite` clears the "current" write
   4. Thread 1: `startWriting` sets the "current" write and happily writes the 
future/buffer again.
   
   Combined with the fact that the underlying `ChannelAsyncOutputStream` may in 
some cases copy the buffer and actually write that copy, but doesn't consume 
the original buffer in that case, this leads to the same data being written 
again, and when that second write finishes, this byte count gets subtracted 
again from the number of total bytes to be written. So the count is added once 
but subtracted twice, and eventually this "pending byte counts underflow" is 
hit.
   
   This race condition in `BufferedIoOutputStream` needs to be fixed. 
Optionally `ChannelAsyncOutputStream` should consume the original buffer if it 
makes a copy; though I'm undecided on that: if it did, it might have hidden 
this bug (because then second write of that buffer would have be a zero-byte 
write).
   
   (The problem is not specific to the MINA transport back-end; it also occurs 
with NIO2 from time to time. CI builds are green because the race condition 
appears to be hit infrequently, and on a re-run the test usually succeeds.)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@mina.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@mina.apache.org
For additional commands, e-mail: dev-h...@mina.apache.org

Reply via email to