On Wed, 26 Mar 2025 08:09:54 GMT, Michael McMahon <micha...@openjdk.org> wrote:
>> should the Endian check be in the genesis of the processing i.e. the >> applyMask method ? >> >> static void applyMask(ByteBuffer src, ByteBuffer dst, int mask) { >> >> // endian check here then other checks are superfluous ? >> >> if (src.remaining() > dst.remaining()) { >> throw new IllegalArgumentException(dump(src, dst)); >> } >> new Masker().setMask(mask).applyMask(src, dst); >> } >> >> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent >> (received) in network byte order (Big Endian), then both src and dst should >> be ByteOrder.BIG_ENDIAN rather than just the same byte order ? > >> should the Endian check be in the genesis of the processing i.e. the >> applyMask method ? >> >> ``` >> static void applyMask(ByteBuffer src, ByteBuffer dst, int mask) { >> ``` >> >> // endian check here then other checks are superfluous ? >> >> ``` >> if (src.remaining() > dst.remaining()) { >> throw new IllegalArgumentException(dump(src, dst)); >> } >> new Masker().setMask(mask).applyMask(src, dst); >> } >> ``` >> >> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent >> (received) in network byte order (Big Endian), then both src and dst should >> be ByteOrder.BIG_ENDIAN rather than just the same byte order ? > > A stream of data bytes is just a stream of bytes. The big/little endian > question only comes into it when you try to interpret bytes as 32 bit (or 64 > bit) integers, and that's only an implementation artifact, supposedly to > optimize performance. All this code relating to byte order would disappear if > we removed this "optimized" code path. > > Afaict, the RFC refers to byte order itself only for framing data that is > itself 16 bit or 32 bit. The payload length is encoded in network byte order > but application data is just a stream of bytes. I agree with @Michael-Mc-Mahon. I will try to address remarks by @liach and @msheppar: > Because `maskBytes` is big-endian, if `dst` is little-endian (which is not > the case at all right now because trusted callers are all using BE dst > ByteBuffer), we should use `maskBytes[4 - offset]`, right? > ... > the endianness of maskBytes (currently BE) must match that of dst (all > internal usages of dst is currently BE) @liach, AFAICT, no. Mask is received as a `byte[4]`, we pass it to `Masker` as an `int` (read as BE, which is the correct order here to preserve the initial order in `byte[4]`), and its octets XOR'ed to the input octets independent of the endianness. See [the relevant section in the RFC](https://datatracker.ietf.org/doc/html/rfc6455#section-5.3): Octet i of the transformed data ("transformed-octet-i") is the XOR of octet i of the original data ("original-octet-i") with octet at index i modulo 4 of the masking key ("masking-key-octet-j"): j = i MOD 4 transformed-octet-i = original-octet-i XOR masking-key-octet-j Consider that the mask is `0x0A0B0C0D` and the input buffer is `{0x1, 0x2, 0x3, 0x4}`. Independent of the endianness of the input and/or output buffer, the output buffer should be `{0x1 ^ 0xA, 0x2 ^ 0xB, 0x3 ^ 0xC, 0x4 ^ 0xD}`. I've added further tests for this in 41b40436adc5cd9abb46b0f2f9536734470d9507. > endian check here then other checks are superfluous ? > ... > Additionally, if the WebSocket spec (RFC 6455) expects that data is sent > (received) in network byte order (Big Endian), then both src and dst should > be ByteOrder.BIG_ENDIAN rather than just the same byte order ? @msheppar, endianness only matter in the context of vectorized passes. There we read `long`, XOR it with the mask matching the `src` order, and write `long`. The only constraint is input and output endiannesses must match. Consider the following example: var maskLongBe = 0x0A0B0C0D0A0B0C0DL; var maskLongLe = 0x0D0C0B0A0D0C0B0AL; var srcBe = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5, 6, 7, 8}); var srcLe = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5, 6, 7, 8}).order(ByteOrder.LITTLE_ENDIAN); var dstBe = ByteBuffer.allocate(8); var dstLe = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN); dstBe.putLong(0, srcBe.getLong(0) ^ maskLongBe); dstBe.array(); // srcBe->dstBe: RIGHT: byte[8] { 11, 9, 15, 9, 15, 13, 11, 5 } dstLe.putLong(0, srcBe.getLong(0) ^ maskLongBe); dstLe.array(); // srcBe->dstLe: WRONG: byte[8] { 5, 11, 13, 15, 9, 15, 9, 11 } dstBe.putLong(0, srcLe.getLong(0) ^ maskLongLe); dstBe.array(); // srcLe->dstBe: WRONG: byte[8] { 5, 11, 13, 15, 9, 15, 9, 11 } dstLe.putLong(0, srcLe.getLong(0) ^ maskLongLe); dstLe.array(); // srcLe->dstLe: RIGHT: byte[8] { 11, 9, 15, 9, 15, 13, 11, 5 } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24033#discussion_r2013728091