[
https://issues.apache.org/jira/browse/GEODE-9825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bill Burcham updated GEODE-9825:
--------------------------------
Attachment: GEODE-9825-demo.patch
> Disparate socket-buffer-size Results in "IOException: Unknown header byte"
> and Hangs
> ------------------------------------------------------------------------------------
>
> Key: GEODE-9825
> URL: https://issues.apache.org/jira/browse/GEODE-9825
> Project: Geode
> Issue Type: Bug
> Components: messaging
> Affects Versions: 1.12.4, 1.15.0
> Reporter: Bill Burcham
> Priority: Major
> Attachments: GEODE-9825-demo.patch
>
>
> GEODE-9141 introduced a bug that causes {{IOException: "Unknown header
> byte..."}} and hangs if members are configured with different
> {{socket-buffer-size}} settings.
> h2. Reproduction
> To reproduce this bug, modify {{P2PMessagingConcurrencyDUnitTest}} so that
> sender and locator and receiver use different configuration parameters. Set
> {{socket-buffer-size}} to 212992 for the sender and 32 * 1024 for the
> receiver. Oh and just skip the call to {{{}securityProperties(){}}}—we want
> to induce the "Unknown header byte" exception—we don't want the TLS framework
> throwing exceptions.
> h2. Analysis
> In {{{}Connection.processInputBuffer(){}}}. When that method has read all the
> messages it can from the current input buffer, it then considers whether the
> buffer needs expansion. If it does then:
> {code:java}
> inputBuffer = inputSharing.expandReadBufferIfNeeded(allocSize); {code}
> Is executed and the method returns. The caller then expects to be able to
> _write_ bytes into {{{}inputBuffer{}}}.
> The problem, it seems, is that
> {{ByteBufferSharingInternalImpl.expandReadBufferIfNeeded()}} does not leave
> the the {{ByteBuffer}} in the proper state. It leaves the buffer ready to be
> _read_ not written.
> The line of code referenced above used to be this method in {{Connection}}
> (which has since been removed):
> {code:java}
> private void compactOrResizeBuffer(int messageLength) {
> final int oldBufferSize = inputBuffer.capacity();
> int allocSize = messageLength + MSG_HEADER_BYTES;
> if (oldBufferSize < allocSize) {
> // need a bigger buffer
> logger.info("Allocating larger network read buffer, new size is {} old
> size was {}.",
> allocSize, oldBufferSize);
> ByteBuffer oldBuffer = inputBuffer;
> inputBuffer = getBufferPool().acquireDirectReceiveBuffer(allocSize);
> if (oldBuffer != null) {
> int oldByteCount = oldBuffer.remaining();
> inputBuffer.put(oldBuffer);
> inputBuffer.position(oldByteCount);
> getBufferPool().releaseReceiveBuffer(oldBuffer);
> }
> } else {
> if (inputBuffer.position() != 0) {
> inputBuffer.compact();
> } else {
> inputBuffer.position(inputBuffer.limit());
> inputBuffer.limit(inputBuffer.capacity());
> }
> }
> } {code}
> Notice how this method leaves {{inputBuffer}} ready to be _written_ to.
> It's not sufficient to simply call {{flip()}} on the inputBuffer before
> returning it (I tried it and it didn't fix the bug). More work is needed.
> h2. Resolution
> When this ticket is complete the bug will be fixed and
> {{P2PMessagingConcurrencyDUnitTest}} will be enhanced to test these
> combinations:
> [security, sender/locator socket-buffer-size, receiver socket-buffer-size]
> [TLS, (default), (default)] this is what the test currently does
> [no TLS, 64 * 1024, 32 * 1024] *new: this illustrates this bug*
> [no TLS, (default), (default)] *new*
> We might want to mix in conserve-sockets true/false in there too while we're
> at it (the test currently holds it at true).
--
This message was sent by Atlassian Jira
(v8.20.1#820001)