tomaswolf opened a new issue, #263: URL: https://github.com/apache/mina-sshd/issues/263
SFTP tests are flaky. Turns out that they are running into ``` 23:13:59.123 | ERROR | MinaProcessor-25 | o.a.s.c.c.BufferedIoOutputStream | org.apache.sshd.common.channel.BufferedIoOutputStream 219 | finishWrite(sftp-out@0)[ChannelAsyncOutputStream[ChannelSession[id=0, recipient=0]-ServerSessionImpl[testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE 65,536]@/127.0.0.1:59555]] cmd=SSH_MSG_CHANNEL_DATA] - pending byte counts underflow (-64525) after 18070131 bytes 23:13:59.123 | ERROR | m-42976-thread-1 | o.a.s.s.s.SftpSubsystem | org.apache.sshd.common.util.logging.LoggingUtils 693 | run(ServerSessionImpl[testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE 65,536]@/127.0.0.1:59555]) SshChannelBufferedOutputException caught in SFTP subsystem after 863 buffers: Pending byte counts underflow 23:23:59.498 | INFO | )-timer-thread-1 | o.a.s.s.s.ServerSessionImpl | org.apache.sshd.common.session.helpers.SessionHelper 1183 | Disconnecting(ServerSessionImpl[testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE 65,536]@/127.0.0.1:59555]): SSH2_DISCONNECT_PROTOCOL_ERROR - Detected IdleTimeout after 600375/600000 ms. Finished org.apache.sshd.sftp.client.SftpTransferTest:testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE 65,536] in 610192 ms ``` Let's ignore the fact that test hangs for 10 minutes until that timeout expires; that a different minor problem. The real problem is the "pending byte counts underflow", which indicates that the BufferedIoOutputStream somehow managed to write more data than was passed to it. This is a race condition between `BufferedIoOutputStream.startWriting()` and `BufferedIoOutputStream.finishWrite()`. The following sequence of events is possible when a write that was in progress just finished and the user of the stream just initiates a new write: 1. Thread 1: `startWriting` gets the head of the write queue -- this is still the future/buffer that was just written 2. Thread 2: `finishWrite` removes that write from the write queue 3. Thread 2: `finishWrite` clears the "current" write 4. Thread 1: `startWriting` sets the "current" write and happily writes the future/buffer again. Combined with the fact that the underlying `ChannelAsyncOutputStream` may in some cases copy the buffer and actually write that copy, but doesn't consume the original buffer in that case, this leads to the same data being written again, and when that second write finishes, this byte count gets subtracted again from the number of total bytes to be written. So the count is added once but subtracted twice, and eventually this "pending byte counts underflow" is hit. This race condition in `BufferedIoOutputStream` needs to be fixed. Optionally `ChannelAsyncOutputStream` should consume the original buffer if it makes a copy; though I'm undecided on that: if it did, it might have hidden this bug (because then second write of that buffer would have be a zero-byte write). (The problem is not specific to the MINA transport back-end; it also occurs with NIO2 from time to time. CI builds are green because the race condition appears to be hit infrequently, and on a re-run the test usually succeeds.) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@mina.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@mina.apache.org For additional commands, e-mail: dev-h...@mina.apache.org