nsivabalan commented on code in PR #13860:
URL: https://github.com/apache/hudi/pull/13860#discussion_r2348968097
##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieAvroIndexedRecord.java:
##########
@@ -185,9 +188,17 @@ public HoodieRecord joinWith(HoodieRecord other, Schema targetSchema) {
   }
   @Override
-  public HoodieRecord prependMetaFields(Schema recordSchema, Schema targetSchema, MetadataValues metadataValues, Properties props) {
+  public HoodieRecord prependMetaFields(Schema recordSchema, Schema targetSchema, MetadataValues metadataValues, Properties props,
+                                        boolean hasOperationMetaField) {
     decodeRecord(recordSchema);
-    GenericRecord newAvroRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(data, targetSchema);
+    List<String> metaFields = hasOperationMetaField ? HoodieRecord.HOODIE_META_COLUMNS_WITH_OPERATION_LIST : HoodieRecord.HOODIE_META_COLUMNS;
Review Comment:
This is per-record processing. Why do we want to do that when we can instead add an additional argument and pass in the `hasOperationMetaField` value we are interested in? With the additional argument, we compute `hasOperationMetaField` just once, either in the driver or at the task level (write handle), rather than for every record.
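To illustrate the point of the comment above, here is a minimal, self-contained sketch of the pattern being suggested: the operation-field flag is evaluated once (at handle/driver level) and the resulting meta-field list is reused on the per-record path. The class and method names here are simplified stand-ins, not Hudi's actual classes; the meta column names mirror Hudi's standard meta fields but are hard-coded for the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Simplified sketch (hypothetical class, not Hudi code) of hoisting an
// invariant out of the per-record path: the operation-field check is
// computed once when the write handle is constructed, then the result
// is passed to (or captured by) every per-record call.
public class MetaFieldsSketch {
  // Stand-ins for HoodieRecord.HOODIE_META_COLUMNS / ..._WITH_OPERATION_LIST.
  static final List<String> HOODIE_META_COLUMNS =
      Arrays.asList("_hoodie_commit_time", "_hoodie_commit_seqno",
                    "_hoodie_record_key", "_hoodie_partition_path",
                    "_hoodie_file_name");
  static final List<String> HOODIE_META_COLUMNS_WITH_OPERATION =
      Arrays.asList("_hoodie_commit_time", "_hoodie_commit_seqno",
                    "_hoodie_record_key", "_hoodie_partition_path",
                    "_hoodie_file_name", "_hoodie_operation");

  // Called once per handle, not once per record.
  static List<String> metaFieldsFor(boolean hasOperationMetaField) {
    return hasOperationMetaField
        ? HOODIE_META_COLUMNS_WITH_OPERATION
        : HOODIE_META_COLUMNS;
  }

  public static void main(String[] args) {
    // Handle-level: evaluate the config-driven flag a single time.
    boolean hasOperationMetaField = true;
    List<String> metaFields = metaFieldsFor(hasOperationMetaField);
    // Per-record path: reuse the precomputed list for every record.
    System.out.println(metaFields.size());
  }
}
```

The same list reference is then shared by every `prependMetaFields` call, so nothing flag-related is recomputed per record.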
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##########
@@ -116,7 +117,8 @@ protected HoodieWriteHandle(HoodieWriteConfig config, String instantTime, String
     this.partitionPath = partitionPath;
     this.fileId = fileId;
     this.writeSchema = AvroSchemaCache.intern(overriddenSchema.orElseGet(() -> getWriteSchema(config)));
-    this.writeSchemaWithMetaFields = AvroSchemaCache.intern(HoodieAvroUtils.addMetadataFields(writeSchema, config.allowOperationMetadataField()));
+    this.hasOperationMetaField = config.allowOperationMetadataField();
+    this.writeSchemaWithMetaFields = AvroSchemaCache.intern(HoodieAvroUtils.addMetadataFields(writeSchema, hasOperationMetaField));
Review Comment:
OK, moved L 120 to 122 as below:
```
this.hasOperationMetaField = writeSchemaWithMetaFields.getField(HoodieRecord.OPERATION_METADATA_FIELD) != null;
```
##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieAvroIndexedRecord.java:
##########
@@ -185,9 +188,17 @@ public HoodieRecord joinWith(HoodieRecord other, Schema targetSchema) {
   }
   @Override
-  public HoodieRecord prependMetaFields(Schema recordSchema, Schema targetSchema, MetadataValues metadataValues, Properties props) {
+  public HoodieRecord prependMetaFields(Schema recordSchema, Schema targetSchema, MetadataValues metadataValues, Properties props,
+                                        boolean hasOperationMetaField) {
     decodeRecord(recordSchema);
-    GenericRecord newAvroRecord = HoodieAvroUtils.rewriteRecordWithNewSchema(data, targetSchema);
+    List<String> metaFields = hasOperationMetaField ? HoodieRecord.HOODIE_META_COLUMNS_WITH_OPERATION_LIST : HoodieRecord.HOODIE_META_COLUMNS;
+    /*GenericRecord newAvroRecord = HoodieAvroUtils.rewriteRecordWithPrependedMetaFieldsOptimized(data, targetSchema, recordSchema.getFields().size(),
+        targetSchema.getFields().size() - recordSchema.getFields().size());*/
+
+    GenericRecord newAvroRecord = new JoinedGenericRecord((GenericRecord) data, new HashMap<>(metaFields.size()),
Review Comment:
Makes sense, will fix it.
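For context on why the diff above swaps the full `rewriteRecordWithNewSchema` call for a `JoinedGenericRecord`: the joined record presents the meta fields followed by the original record as a single view, instead of copying every field into a new record. The following is a minimal sketch of that idea, assuming simplified stand-in types (`Rec`, `ArrayRec`, `JoinedRec`); it is not Hudi's actual `JoinedGenericRecord`, which works over Avro `GenericRecord` and a target schema.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch (hypothetical types) of prepending meta fields as a
// view over the original record rather than rewriting all of its fields.
public class JoinedRecordSketch {
  interface Rec {
    Object get(int i);
    int size();
  }

  // Plain positional record backing the original data.
  static class ArrayRec implements Rec {
    final Object[] vals;
    ArrayRec(Object... vals) { this.vals = vals; }
    public Object get(int i) { return vals[i]; }
    public int size() { return vals.length; }
  }

  // Presents metaValues at positions 0..metaCount-1 and delegates the
  // rest to the wrapped record; no field is copied.
  static class JoinedRec implements Rec {
    final Map<Integer, Object> metaValues;
    final int metaCount;
    final Rec data;
    JoinedRec(Map<Integer, Object> metaValues, int metaCount, Rec data) {
      this.metaValues = metaValues;
      this.metaCount = metaCount;
      this.data = data;
    }
    public Object get(int i) {
      return i < metaCount ? metaValues.get(i) : data.get(i - metaCount);
    }
    public int size() { return metaCount + data.size(); }
  }

  public static void main(String[] args) {
    Rec data = new ArrayRec("key1", 42);
    Map<Integer, Object> meta = new HashMap<>();
    meta.put(0, "commitTime-001"); // one prepended meta field
    Rec joined = new JoinedRec(meta, 1, data);
    // Reads pass through: meta field first, then the original fields.
    System.out.println(joined.get(0) + "," + joined.get(1) + "," + joined.get(2));
  }
}
```

The trade-off is that reads go through one extra indirection, but record construction becomes O(number of meta fields) instead of O(total fields), which matters on the per-record write path.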
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]