alexeykudinkin commented on code in PR #7021:
URL: https://github.com/apache/hudi/pull/7021#discussion_r1003672081


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieAvroRecord.java:
##########
@@ -189,14 +193,37 @@ public Option<Map<String, String>> getMetadata() {
 
   @Override
   public Option<HoodieAvroIndexedRecord> toIndexedRecord(Schema recordSchema, Properties props) throws IOException {
-    Option<IndexedRecord> avroData = getData().getInsertValue(recordSchema, props);
+    Option<IndexedRecord> avroData = getCachedDeserializedRecord(recordSchema, props);
     if (avroData.isPresent()) {
       return Option.of(new HoodieAvroIndexedRecord(avroData.get()));
     } else {
       return Option.empty();
     }
   }
 
+  private Option<IndexedRecord> getCachedDeserializedRecord(Schema recordSchema, Properties props) throws IOException {
+    // Check schema identical
+    if (this.cachedDeserializedRecord != null && this.cachedDeserializedRecord.isPresent()
+        && !compareSchema(cachedDeserializedRecord.get().getSchema(), recordSchema)) {
+      this.cachedDeserializedRecord = null;
+    }
+    if (this.cachedDeserializedRecord == null) {
+      this.cachedDeserializedRecord = this.data.getInsertValue(recordSchema, props);
+    }
+    return this.cachedDeserializedRecord;
+  }
+
+  private static Boolean compareSchema(Schema left, Schema right) {
+    if (left == null || right == null) {
+      return false;
+    }
+    Pair<Schema, Schema> schemaPair = Pair.of(left, right);
+    if (!SCHEMA_COMPARE_MAP.containsKey(schemaPair)) {

Review Comment:
   I don't think this setup makes sense: looking up `Pair(leftSchema, rightSchema)` in the map means hashing and comparing both schemas, which is definitely more expensive than just comparing the 2 schemas directly.
   
   Instead, let's just keep a `Pair<Schema, GenericRecord>` cached and compare its schema against the requested one whenever we retrieve the record.
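   A minimal sketch of that suggestion, assuming a single-entry cache is enough. `SchemaKeyedCache`, its generic parameters, and the `decoder` function are hypothetical names used only for illustration; they stand in for Avro's `Schema`, the deserialized record, and `getInsertValue` so the example stays dependency-free. It also keeps the schema and record in two fields rather than a `Pair`, and ignores the checked `IOException` the real method declares.

```java
import java.util.Objects;
import java.util.function.Function;

/**
 * Hypothetical single-entry cache: remembers the last schema together with
 * the record that was deserialized using it (illustration only, not Hudi code).
 */
final class SchemaKeyedCache<S, R> {
  private S cachedSchema;    // schema the cached record was decoded with
  private R cachedRecord;    // record decoded with cachedSchema
  private boolean populated; // distinguishes "nothing cached yet" from a cached null

  /**
   * Returns the cached record when the requested schema equals the cached one;
   * otherwise runs the decoder once and replaces the cached entry.
   */
  R get(S schema, Function<S, R> decoder) {
    if (!populated || !Objects.equals(cachedSchema, schema)) {
      cachedRecord = decoder.apply(schema);
      cachedSchema = schema;
      populated = true;
    }
    return cachedRecord;
  }
}
```

   With this shape, each retrieval costs one schema comparison, and no shared map of schema pairs is needed.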


