voonhous commented on code in PR #17772:
URL: https://github.com/apache/hudi/pull/17772#discussion_r2658759456
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java:
##########
@@ -848,9 +848,9 @@ private Option<Pair<Function<HoodieRecord, HoodieRecord>, Schema>> composeEvolve
return Option.of(Pair.of((record) -> {
return record.rewriteRecordWithNewSchema(
- dataBlock.getSchema().toAvroSchema(),
+ dataBlock.getSchema(),
this.hoodieTableMetaClient.getTableConfig().getProps(),
- mergedAvroSchema,
+ HoodieSchema.fromAvroSchema(mergedAvroSchema),
Review Comment:
On line 847, `mergedAvroSchema` can be converted to a `HoodieSchema` once, instead of converting it inside the lambda.
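Something along these lines (untested sketch; the trailing arguments are elided as in the hunk above):

```java
// sketch: convert once around line 847 and reuse
HoodieSchema mergedSchema = HoodieSchema.fromAvroSchema(mergedAvroSchema);
return Option.of(Pair.of((record) -> {
  return record.rewriteRecordWithNewSchema(
      dataBlock.getSchema(),
      this.hoodieTableMetaClient.getTableConfig().getProps(),
      mergedSchema,
      ...);
}, ...));
```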
##########
hudi-hadoop-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java:
##########
@@ -430,7 +430,7 @@ public ByteArrayOutputStream serializeRecordsToLogBlock(HoodieStorage storage,
//TODO boundary to revisit in follow up to use HoodieSchema directly
Review Comment:
We can remove this comment now.
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java:
##########
@@ -634,7 +634,7 @@ private void processDataBlock(HoodieDataBlock dataBlock, Option<KeySpec> keySpec
try (ClosableIterator<HoodieRecord> recordIterator = recordsIteratorSchemaPair.getLeft()) {
while (recordIterator.hasNext()) {
HoodieRecord completedRecord = recordIterator.next()
-        .wrapIntoHoodieRecordPayloadWithParams(recordsIteratorSchemaPair.getRight(),
+        .wrapIntoHoodieRecordPayloadWithParams(HoodieSchema.fromAvroSchema(recordsIteratorSchemaPair.getRight()),
Review Comment:
Update `getRecordsIterator` to return a `HoodieSchema` pair, so that we do not need to invoke `HoodieSchema#fromAvroSchema` here.
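Roughly (untested sketch; assuming the method currently returns a `Pair` with an Avro `Schema` on the right):

```java
// sketch: have getRecordsIterator build the HoodieSchema once
private Pair<ClosableIterator<HoodieRecord>, HoodieSchema> getRecordsIterator(
    HoodieDataBlock dataBlock, Option<KeySpec> keySpec) {
  ...
}

// the call site above then needs no conversion:
.wrapIntoHoodieRecordPayloadWithParams(recordsIteratorSchemaPair.getRight(), ...)
```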
##########
hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java:
##########
@@ -726,7 +726,7 @@ private static List<HoodieRecord> getExpectedHoodieRecordsWithOrderingValue(List
return expectedHoodieRecords.stream().map(rec -> {
List<String> orderingFields = metaClient.getTableConfig().getOrderingFields();
HoodieAvroIndexedRecord avroRecord = ((HoodieAvroIndexedRecord) rec);
-  Comparable orderingValue = OrderingValues.create(orderingFields, field -> (Comparable) avroRecord.getColumnValueAsJava(avroSchema, field, new TypedProperties()));
+  Comparable orderingValue = OrderingValues.create(orderingFields, field -> (Comparable) avroRecord.getColumnValueAsJava(HoodieSchema.fromAvroSchema(avroSchema), field, new TypedProperties()));
Review Comment:
The static signature of `#getExpectedHoodieRecordsWithOrderingValue` can be changed to accept a `HoodieSchema`, and `validateOutputFromFileGroupReader` can be updated accordingly, so that `#toAvroSchema` / `#getAvroSchema` do not need to be used.
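For example (untested sketch; the remaining parameters of the helper are elided):

```java
// sketch: accept HoodieSchema in the helper's signature
private static List<HoodieRecord> getExpectedHoodieRecordsWithOrderingValue(
    List<HoodieRecord> expectedHoodieRecords, HoodieSchema schema, ...) {
  ...
  // inside the lambda, no conversion is needed:
  avroRecord.getColumnValueAsJava(schema, field, new TypedProperties());
  ...
}
```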
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/execution/CopyOnWriteInsertHandler.java:
##########
@@ -82,7 +82,7 @@ public void consume(HoodieInsertValueGenResult<HoodieRecord> genResult) {
String partitionPath = record.getPartitionPath();
// just skip the ignored record,do not make partitions on fs
try {
-      if (record.shouldIgnore(genResult.schema, config.getProps())) {
+      if (record.shouldIgnore(HoodieSchema.fromAvroSchema(genResult.schema), config.getProps())) {
Review Comment:
Can `genResult.schema` return a `HoodieSchema` instead?
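For example (untested sketch; assuming the `schema` field on `HoodieInsertValueGenResult` can be retyped):

```java
// sketch: HoodieInsertValueGenResult carries a HoodieSchema
public final HoodieSchema schema; // was: Schema schema

// consume() then needs no conversion:
if (record.shouldIgnore(genResult.schema, config.getProps())) {
```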
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieAvroDataBlock.java:
##########
@@ -119,7 +119,7 @@ protected ByteArrayOutputStream serializeRecords(List<HoodieRecord> records, Hoo
try {
// Encode the record into bytes
// Spark Record not support write avro log
-        ByteArrayOutputStream data = s.getAvroBytes(schema, props);
+        ByteArrayOutputStream data = s.getAvroBytes(HoodieSchema.fromAvroSchema(schema), props);
Review Comment:
Let's use `HoodieSchemaCache#intern` on line 107 and use `HoodieSchema#parse` directly, so that the `HoodieSchema#fromAvroSchema` call is no longer needed here.
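Something like (untested sketch; the variable name `schemaString` and the exact `parse`/`intern` shapes are guesses):

```java
// sketch for line 107: parse and intern a HoodieSchema up front
HoodieSchema schema = HoodieSchemaCache.intern(HoodieSchema.parse(schemaString));
...
// serializeRecords then passes it straight through:
ByteArrayOutputStream data = s.getAvroBytes(schema, props);
```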
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/HoodieMergeHelper.java:
##########
@@ -208,9 +208,9 @@ private Option<Function<HoodieRecord, HoodieRecord>> composeSchemaEvolutionTrans
Map<String, String> renameCols = InternalSchemaUtils.collectRenameCols(writeInternalSchema, querySchema);
return Option.of(record -> {
return record.rewriteRecordWithNewSchema(
- recordSchema.toAvroSchema(),
+ recordSchema,
writeConfig.getProps(),
- newWriterSchema, renameCols);
+ HoodieSchema.fromAvroSchema(newWriterSchema), renameCols);
Review Comment:
`newWriterSchema` does not need to be wrapped with `HoodieSchema.fromAvroSchema`; just remove the `getAvroSchema` call on line 203.
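That is (untested sketch; the right-hand side of line 203 is elided):

```java
// sketch: drop the getAvroSchema() call on line 203 so newWriterSchema stays a HoodieSchema
HoodieSchema newWriterSchema = ...; // was: ... .getAvroSchema()

return Option.of(record -> record.rewriteRecordWithNewSchema(
    recordSchema, writeConfig.getProps(), newWriterSchema, renameCols));
```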
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]