cshuo commented on code in PR #18717:
URL: https://github.com/apache/hudi/pull/18717#discussion_r3223513390
##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/ITTestVariantCrossEngineCompatibility.java:
##########
@@ -87,35 +85,32 @@ private void verifyFlinkCanReadSparkVariantTable(String tablePath, String tableT
assertEquals("row1", row.getField(1), "Second column should be name=row1");
assertEquals(1000L, row.getField(3), "Fourth column should be ts=1000");
- // Verify the variant column is readable as a ROW with binary fields
- Row variantRow = (Row) row.getField(2);
- assertNotNull(variantRow, "Variant column should not be null");
-
- byte[] metadataBytes = (byte[]) variantRow.getField(0);
- byte[] valueBytes = (byte[]) variantRow.getField(1);
+ // Verify the variant column is readable as a native Flink Variant.
+ Object variantObject = row.getField(2);
+ assertNotNull(variantObject, "Variant column should not be null");
+ DataTypeAdapterTestUtils.assertAsBinaryVariant(variantObject);
Review Comment:
There seems to be no need to introduce `DataTypeAdapterTestUtils.assertAsBinaryVariant`;
this can be checked with `variantObject instanceof Variant`.
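
A minimal sketch of the suggested check, for illustration only. The nested `Variant` interface and `BinaryVariantStub` class are hypothetical stand-ins so the snippet compiles without Flink on the classpath; in the actual test the check would run against Flink's native variant type.

```java
public class VariantTypeCheck {
  // Hypothetical stand-in for the native Flink Variant type (assumption, not the real class).
  public interface Variant {}

  // Hypothetical concrete variant, standing in for whatever the reader returns.
  public static final class BinaryVariantStub implements Variant {}

  // Returns true only when the field is a non-null Variant instance.
  public static boolean isVariant(Object field) {
    return field instanceof Variant;
  }

  public static void main(String[] args) {
    Object variantObject = new BinaryVariantStub();
    if (!isVariant(variantObject)) {
      throw new AssertionError("Variant column should be a Variant");
    }
    // A raw byte array (the old ROW-of-binary shape) must not pass the check.
    if (isVariant(new byte[0])) {
      throw new AssertionError("byte[] should not be treated as a Variant");
    }
  }
}
```

In a JUnit 5 test the same check could be written as `assertTrue(variantObject instanceof Variant)`, assuming the `Variant` class is visible to the test module.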
##########
hudi-flink-datasource/hudi-flink2.1.x/src/main/java/org/apache/hudi/table/format/cow/ParquetSplitReaderUtil.java:
##########
@@ -675,11 +701,44 @@ private static WritableColumnVector createWritableColumnVector(
}
}
return new HeapRowColumnVector(batchSize, columnVectors);
+ case VARIANT:
+ validateVariantField(physicalType, HoodieSchema.Variant.VARIANT_VALUE_FIELD);
Review Comment:
Should we also add validation that shredded variants are not supported for now,
i.e., that the type does not include `VARIANT_TYPED_VALUE_FIELD`?
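
A rough sketch of what such a guard might look like, under stated assumptions. The field name `"typed_value"` is assumed to match `VARIANT_TYPED_VALUE_FIELD`, and the physical group type is simplified to a list of field names; `validateNotShredded` is a hypothetical helper, not an existing method in the PR.

```java
import java.util.List;

public class UnshreddedVariantCheck {
  // Assumed value of HoodieSchema.Variant.VARIANT_TYPED_VALUE_FIELD (not confirmed by the PR).
  static final String TYPED_VALUE_FIELD = "typed_value";

  // Hypothetical helper: rejects a variant group that carries a shredded typed_value field.
  // The list of names stands in for the fields of the Parquet physical type.
  public static void validateNotShredded(List<String> physicalFieldNames) {
    if (physicalFieldNames.contains(TYPED_VALUE_FIELD)) {
      throw new UnsupportedOperationException(
          "Shredded variant (field '" + TYPED_VALUE_FIELD + "') is not supported now.");
    }
  }

  public static void main(String[] args) {
    // Unshredded layout: passes silently.
    validateNotShredded(List.of("metadata", "value"));

    // Shredded layout: expected to be rejected.
    try {
      validateNotShredded(List.of("metadata", "value", "typed_value"));
      throw new AssertionError("Expected shredded variant to be rejected");
    } catch (UnsupportedOperationException expected) {
      // Rejected as intended.
    }
  }
}
```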
##########
hudi-flink-datasource/hudi-flink2.1.x/src/main/java/org/apache/hudi/table/format/cow/ParquetSplitReaderUtil.java:
##########
@@ -675,11 +701,44 @@ private static WritableColumnVector createWritableColumnVector(
}
}
return new HeapRowColumnVector(batchSize, columnVectors);
+ case VARIANT:
+ validateVariantField(physicalType, HoodieSchema.Variant.VARIANT_VALUE_FIELD);
+ validateVariantField(physicalType, HoodieSchema.Variant.VARIANT_METADATA_FIELD);
+ return new HeapRowColumnVector(
+ batchSize,
+ new HeapBytesVector(batchSize),
+ new HeapBytesVector(batchSize));
default:
throw new UnsupportedOperationException(fieldType + " is not supported now.");
}
}
+ private static ColumnDescriptor getVariantColumnDescriptor(
+ Type physicalType,
+ List<ColumnDescriptor> descriptors,
+ int depth,
+ String fieldName) {
+ validateVariantField(physicalType, fieldName);
Review Comment:
Duplicate validation: the type is already validated when the column vector
is created.