thswlsqls opened a new pull request, #17079:
URL: https://github.com/apache/iceberg/pull/17079
Closes #17076
## Summary
- `RecordConverter.convertUUID()` returned `byte[]` for UUID columns when
the target file format is Parquet, but the Parquet UUID writer
(`ParquetValueWriters.uuids()`) expects a `java.util.UUID` and converts to
bytes itself, so writes threw `ClassCastException: class [B cannot be cast to
class java.util.UUID`.
- Removes the `byte[]` branch so `convertUUID()` always returns `UUID`,
matching ORC (`GenericOrcWriters.uuids()`) and Avro, which already accept
`UUID` directly.
- The `byte[]` conversion matched the writer contract until PR #11904
changed the UUID writer in `ParquetValueWriters` to accept `UUID` directly.
`kafka-connect` was not updated at the time, so this change brings it back in
line with the writer contract.
- Note: open PR #16654 ("Kafka Connect: Precompute UUID-as-bytes flag in
RecordConverter") touches the same method but explicitly preserves the current
`byte[]` behavior, so it does not fix this bug; whichever of the two merges
first, the other will need a rebase.
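
The failure mode described above can be reproduced in isolation. The following is a minimal sketch using only the JDK: `UuidWriterSketch`, `write`, `convertOld`, and `convertNew` are hypothetical names invented for illustration and are not Iceberg APIs. It assumes only what the summary states, namely that the writer casts its input to `java.util.UUID` and encodes the bytes itself.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical stand-in for a value writer whose contract is "pass me a UUID".
// This is NOT Iceberg's ParquetValueWriters; it only mimics the cast-then-encode shape.
class UuidWriterSketch {

  // Encodes a UUID as 16 bytes, most significant bits first (ByteBuffer is big-endian by default).
  static byte[] toBytes(UUID uuid) {
    return ByteBuffer.allocate(16)
        .putLong(uuid.getMostSignificantBits())
        .putLong(uuid.getLeastSignificantBits())
        .array();
  }

  // Mimics a writer that receives an untyped value and casts it to UUID itself.
  static byte[] write(Object value) {
    UUID uuid = (UUID) value; // throws ClassCastException when value is a byte[]
    return toBytes(uuid);
  }

  // Sketch of the old converter behavior: pre-encode the UUID to bytes.
  static Object convertOld(UUID uuid) {
    return toBytes(uuid);
  }

  // Sketch of the fixed converter behavior: hand the UUID through unchanged.
  static Object convertNew(UUID uuid) {
    return uuid;
  }

  public static void main(String[] args) {
    UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");

    // The writer encodes the UUID itself, so passing a UUID works.
    System.out.println("bytes written: " + write(convertNew(id)).length);

    // Passing pre-encoded bytes violates the writer's contract.
    try {
      write(convertOld(id));
    } catch (ClassCastException e) {
      System.out.println("old path failed with ClassCastException");
    }
  }
}
```

Because the writer owns the byte encoding, the converter only needs to pass the `UUID` through, which is why removing the `byte[]` branch is sufficient.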
## Testing done
- Updated `TestRecordConverter#testUUIDConversionWithParquet` to assert that the
field equals the original `UUID`, replacing the previous
`UUIDUtil.convert(UUID_VAL)` `byte[]` expectation.
- `./gradlew :iceberg-kafka-connect:iceberg-kafka-connect:check` passes:
`TestRecordConverter` 59/59, full module 122/122, 0 failures.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]