On Wed, 26 Mar 2025 08:09:54 GMT, Michael McMahon <[email protected]> wrote:
>> should the Endian check be in the genesis of the processing i.e. the
>> applyMask method ?
>>
>> static void applyMask(ByteBuffer src, ByteBuffer dst, int mask) {
>>
>> // endian check here then other checks are superfluous ?
>>
>> if (src.remaining() > dst.remaining()) {
>> throw new IllegalArgumentException(dump(src, dst));
>> }
>> new Masker().setMask(mask).applyMask(src, dst);
>> }
>>
>> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent
>> (received) in network byte order (Big Endian), then both src and dst should
>> be ByteOrder.BIG_ENDIAN rather than just the same byte order ?
>
>> should the Endian check be in the genesis of the processing i.e. the
>> applyMask method ?
>>
>> ```
>> static void applyMask(ByteBuffer src, ByteBuffer dst, int mask) {
>> ```
>>
>> // endian check here then other checks are superfluous ?
>>
>> ```
>> if (src.remaining() > dst.remaining()) {
>> throw new IllegalArgumentException(dump(src, dst));
>> }
>> new Masker().setMask(mask).applyMask(src, dst);
>> }
>> ```
>>
>> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent
>> (received) in network byte order (Big Endian), then both src and dst should
>> be ByteOrder.BIG_ENDIAN rather than just the same byte order ?
>
> A stream of data bytes is just a stream of bytes. The big/little endian
> question only comes into it when you try to interpret bytes as 32 bit (or 64
> bit) integers, and that's only an implementation artifact, supposedly to
> optimize performance. All this code relating to byte order would disappear if
> we removed this "optimized" code path.
>
> Afaict, the RFC refers to byte order itself only for framing data that is
> itself 16 bit or 32 bit. The payload length is encoded in network byte order
> but application data is just a stream of bytes.
I agree with @Michael-Mc-Mahon. I will try to address remarks by @liach and
@msheppar:
> Because `maskBytes` is big-endian, if `dst` is little-endian (which is not
> the case at all right now because trusted callers are all using BE dst
> ByteBuffer), we should use `maskBytes[4 - offset]`, right?
> ...
> the endianness of maskBytes (currently BE) must match that of dst (all
> internal usages of dst is currently BE)
@liach, AFAICT, no. Mask is received as a `byte[4]`, we pass it to `Masker` as
an `int` (read as BE, which is the correct order here to preserve the initial
order in `byte[4]`), and its octets XOR'ed to the input octets independent of
the endianness. See [the relevant section in the
RFC](https://datatracker.ietf.org/doc/html/rfc6455#section-5.3):
Octet i of the transformed data ("transformed-octet-i") is the XOR of
octet i of the original data ("original-octet-i") with octet at index
i modulo 4 of the masking key ("masking-key-octet-j"):
j = i MOD 4
transformed-octet-i = original-octet-i XOR masking-key-octet-j
Consider that the mask is `0x0A0B0C0D` and the input buffer is `{0x1, 0x2, 0x3,
0x4}`. Independent of the endianness of the input and/or output buffer, the
output buffer should be `{0x1 ^ 0xA, 0x2 ^ 0xB, 0x3 ^ 0xC, 0x4 ^ 0xD}`. I've
added further tests for this in 41b40436adc5cd9abb46b0f2f9536734470d9507.
> endian check here then other checks are superfluous ?
> ...
> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent
> (received) in network byte order (Big Endian), then both src and dst should
> be ByteOrder.BIG_ENDIAN rather than just the same byte order ?
@msheppar, endianness only matter in the context of vectorized passes. There we
read `long`, XOR it with the mask matching the `src` order, and write `long`.
The only constraint is input and output endiannesses must match. Consider the
following example:
var maskLongBe = 0x0A0B0C0D0A0B0C0DL;
var maskLongLe = 0x0D0C0B0A0D0C0B0AL;
var srcBe = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5, 6, 7, 8});
var srcLe = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5, 6, 7,
8}).order(ByteOrder.LITTLE_ENDIAN);
var dstBe = ByteBuffer.allocate(8);
var dstLe = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
dstBe.putLong(0, srcBe.getLong(0) ^ maskLongBe); dstBe.array();
// srcBe->dstBe: RIGHT: byte[8] { 11, 9, 15, 9, 15, 13, 11, 5 }
dstLe.putLong(0, srcBe.getLong(0) ^ maskLongBe); dstLe.array();
// srcBe->dstLe: WRONG: byte[8] { 5, 11, 13, 15, 9, 15, 9, 11 }
dstBe.putLong(0, srcLe.getLong(0) ^ maskLongLe); dstBe.array();
// srcLe->dstBe: WRONG: byte[8] { 5, 11, 13, 15, 9, 15, 9, 11 }
dstLe.putLong(0, srcLe.getLong(0) ^ maskLongLe); dstLe.array();
// srcLe->dstLe: RIGHT: byte[8] { 11, 9, 15, 9, 15, 13, 11, 5 }
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/24033#discussion_r2013728091