wombatu-kun commented on code in PR #18375:
URL: https://github.com/apache/hudi/pull/18375#discussion_r3224862408
##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/UpdateProcessor.java:
##########
@@ -135,19 +139,59 @@ protected BufferedRecord<T> handleNonDeletes(BufferedRecord<T> previousRecord, B
if (previousRecord == null) {
// special case for payloads when there is no previous record
HoodieSchema recordSchema = readerContext.getRecordContext().decodeAvroSchema(mergedRecord.getSchemaId());
- GenericRecord record = readerContext.getRecordContext().convertToAvroRecord(mergedRecord.getRecord(), recordSchema);
- HoodieAvroRecord hoodieRecord = new HoodieAvroRecord<>(null, HoodieRecordUtils.loadPayload(payloadClass, record, mergedRecord.getOrderingValue()));
- try {
- if (hoodieRecord.shouldIgnore(recordSchema, properties)) {
- return null;
- } else {
- HoodieSchema readerSchema = readerContext.getSchemaHandler().getRequestedSchema();
- // If the record schema is different from the reader schema, rewrite the record using the payload methods to ensure consistency with legacy writer paths
- hoodieRecord.rewriteRecordWithNewSchema(recordSchema, properties, readerSchema).toIndexedRecord(readerSchema, properties)
- .ifPresent(rewrittenRecord -> mergedRecord.replaceRecord(readerContext.getRecordContext().convertAvroRecord(rewrittenRecord.getData())));
+ GenericRecord originalAvro = mergedRecord.getOriginalAvroRecord();
+ Schema recordAvroSchema = recordSchema.toAvroSchema();
+
+ // When the merged record carries an originalAvroRecord (populated by extractDataFromRecord
+ // for ExpressionPayload in the COW write path via ExtractedData), the record is already in
+ // write-schema format with correctly evaluated expressions. Convert directly and skip the
+ // payload path.
+ //
+ // NOTE: this branch bypasses shouldIgnore. That is safe today because the only payload that
+ // populates originalAvroRecord is ExpressionPayload, which never returns shouldIgnore=true.
+ // If a future payload starts producing an originalAvroRecord, it must add a shouldIgnore
+ // check here.
+ if (originalAvro != null) {
Review Comment:
Keeping the inline form. The three comment blocks document load-bearing
contracts that read most clearly at the dispatch site. Splitting into helpers
would either move the `shouldIgnore`-bypass NOTE away from the place where a
future originalAvroRecord-producing payload would be wired in, or fragment the
contracts across methods.
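
For readers without the full diff, the shape of the dispatch being discussed
can be sketched roughly as below. This is a simplified, standalone
illustration and not the code in the PR: `DispatchSketch`, `Payload`,
`FixedPayload`, `Merged` and `handleNoPrevious` are hypothetical stand-ins,
and plain strings take the place of Avro records and schemas. The only point
is that the fast path returns before `shouldIgnore` is ever consulted, which
is why the NOTE sits next to that branch.

```java
import java.util.Optional;

public class DispatchSketch {
    // Hypothetical stand-in for a record payload; not Hudi's real interface.
    interface Payload {
        boolean shouldIgnore();
        String rewrite(String readerSchema);
    }

    // Hypothetical payload whose shouldIgnore answer is fixed at construction.
    static final class FixedPayload implements Payload {
        private final boolean ignore;
        FixedPayload(boolean ignore) { this.ignore = ignore; }
        @Override public boolean shouldIgnore() { return ignore; }
        @Override public String rewrite(String readerSchema) { return "rewritten:" + readerSchema; }
    }

    // Hypothetical stand-in for the merged record; originalAvro is null
    // unless the write path already extracted a write-schema record.
    static final class Merged {
        final String originalAvro;
        final Payload payload;
        Merged(String originalAvro, Payload payload) {
            this.originalAvro = originalAvro;
            this.payload = payload;
        }
    }

    static Optional<String> handleNoPrevious(Merged merged, String readerSchema) {
        // Fast path: a pre-extracted record is used directly.
        // NOTE: this branch bypasses shouldIgnore, so any payload that
        // populates originalAvro would need its own check added here.
        if (merged.originalAvro != null) {
            return Optional.of(merged.originalAvro);
        }
        // Payload path: honour shouldIgnore, then rewrite to the reader schema.
        if (merged.payload.shouldIgnore()) {
            return Optional.empty();
        }
        return Optional.of(merged.payload.rewrite(readerSchema));
    }

    public static void main(String[] args) {
        Payload ignoring = new FixedPayload(true);
        // Fast path wins even though the payload would ask to be ignored.
        System.out.println(handleNoPrevious(new Merged("orig", ignoring), "reader"));
        // Without a pre-extracted record, shouldIgnore is respected.
        System.out.println(handleNoPrevious(new Merged(null, ignoring), "reader"));
    }
}
```

In this sketch the contract and the branch it constrains stay in one method,
which is the property the inline form is meant to preserve.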