Re: RFR: 8351339: WebSocket::sendBinary assume that user supplied buffers are BIG_ENDIAN [v7]

Volkan Yazici Wed, 26 Mar 2025 02:40:50 -0700

On Wed, 26 Mar 2025 08:09:54 GMT, Michael McMahon <micha...@openjdk.org> wrote:


>> should the Endian check be in the genesis of the processing i.e. the 
>> applyMask method ?
>> 
>>         static void applyMask(ByteBuffer src, ByteBuffer dst, int mask) {
>> 
>> // endian check here then other checks are superfluous ?
>> 
>>             if (src.remaining() > dst.remaining()) {
>>                 throw new IllegalArgumentException(dump(src, dst));
>>             }
>>             new Masker().setMask(mask).applyMask(src, dst);
>>         }
>> 
>> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent 
>> (received) in network byte order (Big Endian), then both src and dst should 
>> be ByteOrder.BIG_ENDIAN  rather than just the same byte order ?
>
>> should the Endian check be in the genesis of the processing i.e. the 
>> applyMask method ?
>> 
>> ```
>>     static void applyMask(ByteBuffer src, ByteBuffer dst, int mask) {
>> ```
>> 
>> // endian check here then other checks are superfluous ?
>> 
>> ```
>>         if (src.remaining() > dst.remaining()) {
>>             throw new IllegalArgumentException(dump(src, dst));
>>         }
>>         new Masker().setMask(mask).applyMask(src, dst);
>>     }
>> ```
>> 
>> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent 
>> (received) in network byte order (Big Endian), then both src and dst should 
>> be ByteOrder.BIG_ENDIAN rather than just the same byte order ?
> 
> A stream of data bytes is just a stream of bytes. The big/little endian 
> question only comes into it when you try to interpret bytes as 32 bit (or 64 
> bit) integers, and that's only an implementation artifact, supposedly to 
> optimize performance. All this code relating to byte order would disappear if 
> we removed this "optimized" code path.
> 
> Afaict, the RFC refers to byte order itself only for framing data that is 
> itself 16 bit or 32 bit. The payload length is encoded in network byte order 
> but application data is just a stream of bytes.

I agree with @Michael-Mc-Mahon. I will try to address remarks by @liach and 
@msheppar:

> Because `maskBytes` is big-endian, if `dst` is little-endian (which is not 
> the case at all right now because trusted callers are all using BE dst 
> ByteBuffer), we should use `maskBytes[4 - offset]`, right?
> ...
> the endianness of maskBytes (currently BE) must match that of dst (all 
> internal usages of dst is currently BE)

@liach, AFAICT, no. Mask is received as a `byte[4]`, we pass it to `Masker` as 
an `int` (read as BE, which is the correct order here to preserve the initial 
order in `byte[4]`), and its octets XOR'ed to the input octets independent of 
the endianness. See [the relevant section in the 
RFC](https://datatracker.ietf.org/doc/html/rfc6455#section-5.3):


Octet i of the transformed data ("transformed-octet-i") is the XOR of
octet i of the original data ("original-octet-i") with octet at index
i modulo 4 of the masking key ("masking-key-octet-j"):

  j                   = i MOD 4
  transformed-octet-i = original-octet-i XOR masking-key-octet-j


Consider that the mask is `0x0A0B0C0D` and the input buffer is `{0x1, 0x2, 0x3, 
0x4}`. Independent of the endianness of the input and/or output buffer, the 
output buffer should be `{0x1 ^ 0xA, 0x2 ^ 0xB, 0x3 ^ 0xC, 0x4 ^ 0xD}`. I've 
added further tests for this in 41b40436adc5cd9abb46b0f2f9536734470d9507.

> endian check here then other checks are superfluous ?
> ...
> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent 
> (received) in network byte order (Big Endian), then both src and dst should 
> be ByteOrder.BIG_ENDIAN rather than just the same byte order ?

@msheppar, endianness only matter in the context of vectorized passes. There we 
read `long`, XOR it with the mask matching the `src` order, and write `long`. 
The only constraint is input and output endiannesses must match. Consider the 
following example:


var maskLongBe = 0x0A0B0C0D0A0B0C0DL;
var maskLongLe = 0x0D0C0B0A0D0C0B0AL;
var srcBe = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5, 6, 7, 8});
var srcLe = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5, 6, 7, 
8}).order(ByteOrder.LITTLE_ENDIAN);
var dstBe = ByteBuffer.allocate(8);
var dstLe = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);

dstBe.putLong(0, srcBe.getLong(0) ^ maskLongBe); dstBe.array();
// srcBe->dstBe: RIGHT: byte[8] { 11, 9, 15, 9, 15, 13, 11, 5 }

dstLe.putLong(0, srcBe.getLong(0) ^ maskLongBe); dstLe.array();
// srcBe->dstLe: WRONG: byte[8] { 5, 11, 13, 15, 9, 15, 9, 11 }

dstBe.putLong(0, srcLe.getLong(0) ^ maskLongLe); dstBe.array();
// srcLe->dstBe: WRONG: byte[8] { 5, 11, 13, 15, 9, 15, 9, 11 }

dstLe.putLong(0, srcLe.getLong(0) ^ maskLongLe); dstLe.array();
// srcLe->dstLe: RIGHT: byte[8] { 11, 9, 15, 9, 15, 13, 11, 5 }

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24033#discussion_r2013728091

Re: RFR: 8351339: WebSocket::sendBinary assume that user supplied buffers are BIG_ENDIAN [v7]

Reply via email to