empcl commented on code in PR #12949:
URL: https://github.com/apache/hudi/pull/12949#discussion_r2035220428
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##########
@@ -95,8 +96,8 @@ protected HoodieWriteHandle(HoodieWriteConfig config, String instantTime, String
super(config, Option.of(instantTime), hoodieTable);
this.partitionPath = partitionPath;
this.fileId = fileId;
- this.writeSchema = overriddenSchema.orElseGet(() -> getWriteSchema(config));
- this.writeSchemaWithMetaFields = HoodieAvroUtils.addMetadataFields(writeSchema, config.allowOperationMetadataField());
+ this.writeSchema = AvroSchemaCache.intern(overriddenSchema.orElseGet(() -> getWriteSchema(config)));
+ this.writeSchemaWithMetaFields = AvroSchemaCache.intern(HoodieAvroUtils.addMetadataFields(writeSchema, config.allowOperationMetadataField()));
Review Comment:
The PR description motivates this change by saying that a lot of time is
spent on unnecessary Avro schema comparisons, specifically in
`HoodieInternalRowUtils#getCachedSchema`. However, I noticed that this PR does
not modify the `getCachedSchema` method. Could you explain how this PR
achieves the speedup, or where the interned (cached) schema value takes
effect? Thank you.
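
To make the question concrete, here is my rough understanding of why interning might help, written as a standalone sketch. It is not Hudi's `AvroSchemaCache`: `FakeSchema`, `InternSketch`, and this `intern` method are hypothetical stand-ins. I am assuming that the real cache returns one canonical instance per distinct schema and that schema equality has an identity fast path, so a later comparison in a lookup such as `getCachedSchema` could return early instead of walking every field. Please correct me if the mechanism is different.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class InternSketch {

  /** Hypothetical stand-in for a schema whose equals() is a deep comparison. */
  static final class FakeSchema {
    // Counts how often the expensive field-by-field path runs.
    // Plain static int: fine for a single-threaded illustration only.
    static int deepComparisons = 0;

    final String name;
    final List<String> fields;

    FakeSchema(String name, List<String> fields) {
      this.name = name;
      this.fields = List.copyOf(fields);
    }

    @Override
    public boolean equals(Object o) {
      if (o == this) {
        return true; // identity fast path: no field walk needed
      }
      if (!(o instanceof FakeSchema)) {
        return false;
      }
      deepComparisons++; // expensive path
      FakeSchema other = (FakeSchema) o;
      return name.equals(other.name) && fields.equals(other.fields);
    }

    @Override
    public int hashCode() {
      return Objects.hash(name, fields);
    }
  }

  // Maps each distinct schema to its canonical instance.
  private static final ConcurrentMap<FakeSchema, FakeSchema> CACHE =
      new ConcurrentHashMap<>();

  /** Returns the canonical instance that is structurally equal to {@code s}. */
  static FakeSchema intern(FakeSchema s) {
    FakeSchema existing = CACHE.putIfAbsent(s, s);
    return existing != null ? existing : s;
  }

  public static void main(String[] args) {
    // Two separately constructed but structurally equal schemas.
    FakeSchema a = intern(new FakeSchema("rec", List.of("id", "ts")));
    FakeSchema b = intern(new FakeSchema("rec", List.of("id", "ts")));

    System.out.println("same instance after intern: " + (a == b));

    int before = FakeSchema.deepComparisons;
    boolean equal = a.equals(b); // takes the identity fast path
    System.out.println("equal: " + equal
        + ", deep comparisons during equals: "
        + (FakeSchema.deepComparisons - before));
  }
}
```

If this is the intended effect, the deep comparison is paid once when a schema is interned, and later equality checks on the interned instances are reference checks. That would explain a speedup without any change to `getCachedSchema` itself, but I would like confirmation.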