ahmedabu98 opened a new issue, #26643:
URL: https://github.com/apache/beam/issues/26643
### What happened?
The Row builder fails when a Map contains ByteBuffer values (it doesn't matter
whether as key or value), with the following error:
`class java.nio.HeapByteBuffer cannot be cast to class [B
(java.nio.HeapByteBuffer and [B are in module java.base of loader 'bootstrap')`
Reproduce with:
```
Schema mapBytes =
    Schema.of(
        Schema.Field.of(
            "map_bytes", Schema.FieldType.map(Schema.FieldType.BYTES, Schema.FieldType.STRING)));
Map<ByteBuffer, String> map = new HashMap<>();
map.put(ByteBuffer.wrap("b".getBytes(StandardCharsets.UTF_8)), "c");
// Building the row fails with the ClassCastException above.
Row mapBytesRow = Row.withSchema(mapBytes).withFieldValue("map_bytes", map).build();
```
After some debugging, I found that this happens because we try to cast a
HeapByteBuffer to byte[] here:
https://github.com/apache/beam/blob/fd9c60bbeba20b223a16286f28fc549092a33dfb/sdks/java/core/src/main/java/org/apache/beam/sdk/values/RowUtils.java#L175-L177
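The cast failure itself can be seen without Beam. The following is a minimal plain-Java sketch, not the code in `RowUtils`; the class `ByteBufferCastDemo` and the method `castFails` are hypothetical names used only for illustration. It shows that an `Object` holding a `ByteBuffer` cannot be cast to `byte[]`, while an actual `byte[]` can.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteBufferCastDemo {
  // Hypothetical helper: mimics an unchecked cast of an Object-typed value to byte[]
  // and reports whether that cast throws a ClassCastException.
  static boolean castFails(Object value) {
    try {
      byte[] bytes = (byte[]) value;
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    Object buffer = ByteBuffer.wrap("b".getBytes(StandardCharsets.UTF_8));
    Object array = "b".getBytes(StandardCharsets.UTF_8);
    System.out.println("ByteBuffer cast fails: " + castFails(buffer));
    System.out.println("byte[] cast fails: " + castFails(array));
  }
}
```

A `null` value also passes the cast, since casting `null` to any reference type succeeds.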
This doesn't show up when a ByteBuffer is used directly as a field value (i.e.
not inside a map) because at this point in the code the value is null, and it is
instead retrieved by `overrideOrReturn` here:
https://github.com/apache/beam/blob/fd9c60bbeba20b223a16286f28fc549092a33dfb/sdks/java/core/src/main/java/org/apache/beam/sdk/values/RowUtils.java#L604-L607
However, when `FieldType.MAP` is processed, only the map itself is initially
null and retrieved by `overrideOrReturn` during `processMap()`. The keys and
values it contains are already non-null when they are processed, so a ByteBuffer
key or value is non-null at `processBytes()` and hits the failing cast.
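One way a non-null BYTES value could be handled is to check its runtime type before casting. The sketch below is only an illustration under that assumption; it is not Beam's code and not necessarily how this issue should be fixed. `BytesNormalizer` and `toByteArray` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BytesNormalizer {
  // Hypothetical helper (not part of Beam): accepts either a byte[] or a ByteBuffer
  // and returns a byte[], instead of casting blindly.
  static byte[] toByteArray(Object value) {
    if (value instanceof byte[]) {
      return (byte[]) value;
    }
    if (value instanceof ByteBuffer) {
      // duplicate() shares the content but has its own position, so the
      // caller's buffer is left unchanged by the read below.
      ByteBuffer copy = ((ByteBuffer) value).duplicate();
      byte[] bytes = new byte[copy.remaining()];
      copy.get(bytes);
      return bytes;
    }
    throw new IllegalArgumentException("Unsupported BYTES value: " + value);
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.wrap("b".getBytes(StandardCharsets.UTF_8));
    byte[] fromBuffer = toByteArray(buffer);
    byte[] fromArray = toByteArray("b".getBytes(StandardCharsets.UTF_8));
    System.out.println(Arrays.equals(fromBuffer, fromArray));
  }
}
```

Reading through a duplicate keeps the original buffer's position untouched, which matters if the same ByteBuffer is still used as a map key afterwards.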
### Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
### Issue Components
- [ ] Component: Python SDK
- [X] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner