voonhous commented on code in PR #18967:
URL: https://github.com/apache/hudi/pull/18967#discussion_r3393444477


##########
hudi-common/src/main/java/org/apache/hudi/avro/AvroRecordContext.java:
##########
@@ -70,8 +71,16 @@ public AvroRecordContext() {
   public static Object getFieldValueFromIndexedRecord(
       IndexedRecord record,
       String fieldName) {
-    HoodieSchema currentSchema = HoodieSchema.fromAvroSchema(record.getSchema());
+    // Interning returns the canonical wrapper for this schema, whose lazily built field list and
+    // field map survive across calls, so the per-record cost is a cache hit instead of an
+    // O(schema width) wrapper rebuild.
+    HoodieSchema currentSchema = HoodieSchemaCache.intern(HoodieSchema.fromAvroSchema(record.getSchema()));
     IndexedRecord currentRecord = record;
+    if (fieldName.indexOf('.') < 0) {

Review Comment:
   You are right -- `String.split` already fast-paths the two-character `\\.` 
pattern (no regex compilation), so with interning in place this branch only 
saved one small array allocation per call, which is second order next to the 
wrapper allocation and cache lookup. Removed it; the method now only adds the 
interning relative to master.
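
For reference, a minimal sketch of why the removed branch was redundant. It assumes OpenJDK's `String.split`, which handles a two-character pattern made of a backslash plus a non-alphanumeric character without compiling a regex. The class and method names (`SplitFastPathSketch`, `fieldPath`) are hypothetical and not part of Hudi. The sketch only shows the observable behavior, not the fast path itself:

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the Hudi codebase.
public class SplitFastPathSketch {

    // Splits a possibly nested field name on '.', the same call shape the
    // review comment refers to. No explicit indexOf('.') pre-check is made.
    static String[] fieldPath(String fieldName) {
        return fieldName.split("\\.");
    }

    public static void main(String[] args) {
        // A top-level field yields a single-element array.
        System.out.println(Arrays.toString(fieldPath("ts")));
        // A nested field yields one element per path segment.
        System.out.println(Arrays.toString(fieldPath("meta.source.id")));
    }
}
```

For a name without a dot, `split` returns a one-element array holding the original string, so the dotless case needs no separate branch for correctness. The only extra cost is that small array.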



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]