danny0405 commented on code in PR #17672:
URL: https://github.com/apache/hudi/pull/17672#discussion_r2641880358
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java:
##########
@@ -410,10 +417,9 @@ public Option<IndexedRecord> getInsertValue(Schema schema,
Properties properties
return Option.empty();
}
- // TODO: feature(schema): HoodieSchema change, we removed caching in a few
areas, during the migration of Avro.Schema -> HoodieSchema.
- // The schema objects might have been the same reference (due to caching),
but now after converting from HoodieSchema to Avro Schema using .toAvroSchema(),
- // it creates a new Schema object that's not the same reference as
HOODIE_METADATA_SCHEMA
- if (schema == null || HOODIE_METADATA_SCHEMA.equals(schema)) {
+ // TODO: feature(schema): Swap this over to HOODIE_METADATA_SCHEMA after
HoodieRecordPayload implementations are using HoodieSchema
+ // Uses cached Avro schema reference for O(1) equality check.
+ if (schema == null || schema == HOODIE_METADATA_AVRO_SCHEMA) {
Review Comment:
nice catch, is this the only place we got this comparison regression, did we
check all the usages of `AvroSchemaCache.intern` and
`HoodieSchemaCache.intern`, do we have enough test coverage to ensure the
performance.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]