BK202503 opened a new pull request, #22536: URL: https://github.com/apache/kafka/pull/22536
JIRA: [KAFKA-20656](https://issues.apache.org/jira/browse/KAFKA-20656)

### What

Kafka Connect BYTES values may be either `byte[]` or `ByteBuffer`, and `ConnectSchema` recommends `ByteBuffer` because plain arrays do not implement content-based `equals()`/`hashCode()`. `Struct` did not honor that contract:

- `getBytes(...)` called `ByteBuffer.array()`, returning the entire backing array instead of the logical remaining bytes, and throwing `UnsupportedOperationException` for direct buffers.
- `equals` and `hashCode` delegated straight to `Arrays.deepEquals`/`deepHashCode` over the raw `values` array, so two structs holding the same logical BYTES value supplied as `byte[]` vs. `ByteBuffer` were not equal.

### Changes

- `Struct.getBytes(String)` now goes through `Utils.toArray(ByteBuffer)`. That copies the buffer's remaining bytes and works for direct buffers, the same approach `Values.convertToBytes` already uses elsewhere in this module.
- `Struct.equals(Object)` and `Struct.hashCode()` now compare/hash a `normalizedBytesValues()` view of the struct that copies top-level BYTES fields stored as `ByteBuffer` into `byte[]`. The underlying `values` array is not mutated, so `get(String)` callers still see whatever representation was put in. Non-BYTES fields go through `Arrays.deepEquals`/`deepHashCode` unchanged.

The normalization is scoped to top-level BYTES fields, matching the reporter's reproducer. Nested BYTES inside `ARRAY`/`MAP`/`STRUCT` fields keep the previous behavior in this PR.

### Tests

Added three regression tests in `StructTest` that fail against the previous implementation and pass with this change:

- `testGetBytesPreservesByteBufferRemainingBytes`: a sliced `ByteBuffer` returns only the logical bytes.
- `testGetBytesSupportsDirectByteBuffer`: a direct buffer serializes instead of throwing.
- `testEqualsAndHashCodeWithEquivalentByteArrayAndByteBufferValues`: a `byte[]`-valued struct and a `ByteBuffer`-valued struct with the same logical content are `.equals` and share a `hashCode`.

### Validation

```
./gradlew :connect:api:test --tests "org.apache.kafka.connect.data.StructTest.*"
```

All `StructTest` tests pass on JDK 17, including the new three, the existing `testFlatStruct`, and the existing `testEqualsAndHashCodeWithByteArrayValue` (which exercises the unchanged `byte[]`-only equality path).

### Scope

This PR completes the four-ticket `ByteBuffer.array()` cluster for Kafka Connect BYTES values; the other three are split into independent PRs to keep each reviewable:

- KAFKA-20657 (`JsonConverter`): #22533
- KAFKA-20658 (`Cast` SMT): #22534
- KAFKA-20666 (offset backing stores): #22535

### Committer Checklist

- [x] Verified design and implementation
- [x] Verified test coverage and CI build status
- [x] Verified documentation (including upgrade notes) updates (no public API surface change beyond bug-fix behavior; equality semantics now match the documented BYTES contract)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
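
For readers who want to see the underlying `ByteBuffer` behavior in isolation, here is a minimal JDK-only sketch. It is not the code from this PR: `remainingBytes`, `normalizeAll`, and `logicallyEqual` are hypothetical helpers that only approximate what the description attributes to `Utils.toArray(ByteBuffer)` and `normalizedBytesValues()`, and the class name is made up. It also normalizes every `ByteBuffer` element, whereas the PR scopes normalization to top-level BYTES fields by schema.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class BytesNormalizationSketch {

    // Hypothetical helper: copy only the bytes between position and limit.
    // Reading through duplicate() leaves the caller's position untouched and
    // works for heap, sliced, and direct buffers alike.
    static byte[] remainingBytes(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate();
        byte[] copy = new byte[view.remaining()];
        view.get(copy);
        return copy;
    }

    // Hypothetical normalization: ByteBuffer elements become byte[] copies,
    // everything else is passed through. The input array is not mutated.
    static Object[] normalizeAll(Object[] values) {
        Object[] out = new Object[values.length];
        for (int i = 0; i < values.length; i++) {
            Object v = values[i];
            out[i] = (v instanceof ByteBuffer) ? remainingBytes((ByteBuffer) v) : v;
        }
        return out;
    }

    // Content-based comparison over the normalized views.
    static boolean logicallyEqual(Object[] a, Object[] b) {
        return Arrays.deepEquals(normalizeAll(a), normalizeAll(b));
    }

    public static void main(String[] args) {
        byte[] backing = {1, 2, 3, 4, 5};
        // A slice whose logical content is {2, 3, 4} but whose backing array is all five bytes.
        ByteBuffer slice = ByteBuffer.wrap(backing, 1, 3).slice();

        System.out.println(slice.array().length);                    // 5: the whole backing array
        System.out.println(Arrays.toString(remainingBytes(slice)));  // [2, 3, 4]

        // Direct buffers expose no backing array, so array() would throw
        // UnsupportedOperationException; copying the remaining bytes still works.
        ByteBuffer direct = ByteBuffer.allocateDirect(2);
        direct.put((byte) 7).put((byte) 8);
        direct.flip();
        System.out.println(direct.hasArray());                       // false
        System.out.println(Arrays.toString(remainingBytes(direct))); // [7, 8]

        // The same logical value supplied in two representations.
        Object[] asArray = {"id", new byte[] {2, 3, 4}};
        Object[] asBuffer = {"id", slice};
        System.out.println(Arrays.deepEquals(asArray, asBuffer));    // false: raw comparison
        System.out.println(logicallyEqual(asArray, asBuffer));       // true: normalized comparison
    }
}
```

Hashing the normalized view with `Arrays.deepHashCode` keeps `hashCode` consistent with this notion of equality, since `byte[]` elements are hashed by content there.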
