yihua commented on code in PR #13515:
URL: https://github.com/apache/hudi/pull/13515#discussion_r2361512860
##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/log/block/TestHoodieDeleteBlock.java:
##########
@@ -54,6 +59,20 @@ public class TestHoodieDeleteBlock {
private static Random random = new Random();
+ @Test
+ void validateHoodieDeleteRecordListFieldsAndOrdinals() {
+ // HoodieDeleteBlock uses IndexedRecord instead of HoodieDeleteRecordList as the output of
+ // reading the delete record list, due to class loading issue on the executor side on Spark
Review Comment:
The Hudi bundles, e.g., hudi-spark3.5-bundle, do not package the Avro classes;
we rely on Spark to provide the Avro jar. On the executor side, Spark loads the
Avro jar and the Hudi jar using different class loaders (see the stacktrace).
When deserializing the record, the logic below tries to cast the record from
`HoodieDeleteRecordList` to `org.apache.avro.generic.GenericData$Record`. Java
11+ does not allow this because the two classes are loaded by different class
loaders, even though one is a subclass of the other.
```
DatumReader<HoodieDeleteRecordList> reader = new SpecificDatumReader<>(HoodieDeleteRecordList.class);
BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(data, 0, data.length, null);
List<HoodieDeleteRecord> deleteRecordList = reader.read(null, decoder).getDeleteRecordList();
```
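To illustrate the underlying JVM behavior: a minimal, self-contained sketch (not Hudi or Spark code) of why the cast above fails. `IsolatingLoader` and `Payload` are hypothetical names; `Payload` stands in for `HoodieDeleteRecordList`, and the custom loader simulates Spark's separate executor-side class loaders by defining its own copy of the class instead of delegating.

```java
import java.io.InputStream;

// Hypothetical loader that re-defines "Payload" from its bytecode instead of
// delegating to the parent, so it owns a distinct copy of the class -- the
// same situation as Spark loading the Avro jar and the Hudi jar separately.
class IsolatingLoader extends ClassLoader {
  @Override
  protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    if (!name.equals("Payload")) {
      return super.loadClass(name, resolve); // delegate JDK and other classes
    }
    try (InputStream in = getResourceAsStream("Payload.class")) {
      byte[] bytes = in.readAllBytes();
      return defineClass(name, bytes, 0, bytes.length); // second copy of Payload
    } catch (Exception e) {
      throw new ClassNotFoundException(name, e);
    }
  }
}

class Payload {} // hypothetical stand-in for HoodieDeleteRecordList

public class ClassLoaderCastDemo {
  public static void main(String[] args) throws Exception {
    java.lang.reflect.Constructor<?> ctor =
        new IsolatingLoader().loadClass("Payload").getDeclaredConstructor();
    ctor.setAccessible(true);
    Object fromOtherLoader = ctor.newInstance();

    // Same fully-qualified name, but a different defining loader means a
    // different runtime class:
    System.out.println(fromOtherLoader.getClass() == Payload.class);

    try {
      Payload p = (Payload) fromOtherLoader; // fails just like the Avro cast
      System.out.println("cast succeeded");
    } catch (ClassCastException e) {
      System.out.println("ClassCastException, as in the stacktrace");
    }
  }
}
```

Class identity in the JVM is the pair (fully-qualified name, defining loader), which is why the `checkcast` fails even though the two classes were compiled from identical source.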
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]