xiarixiaoyao commented on code in PR #10727:
URL: https://github.com/apache/hudi/pull/10727#discussion_r1524366492


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/HoodieMergeHelper.java:
##########
@@ -202,7 +202,9 @@ private Option<Function<HoodieRecord, HoodieRecord>> composeSchemaEvolutionTrans
       Schema newWriterSchema = AvroInternalSchemaConverter.convert(mergedSchema, writerSchema.getFullName());
       Schema writeSchemaFromFile = AvroInternalSchemaConverter.convert(writeInternalSchema, newWriterSchema.getFullName());
       boolean needToReWriteRecord = sameCols.size() != colNamesFromWriteSchema.size()
-          || SchemaCompatibility.checkReaderWriterCompatibility(newWriterSchema, writeSchemaFromFile).getType() == org.apache.avro.SchemaCompatibility.SchemaCompatibilityType.COMPATIBLE;
+          && SchemaCompatibility.checkReaderWriterCompatibility(newWriterSchema, writeSchemaFromFile).getType()
+          == org.apache.avro.SchemaCompatibility.SchemaCompatibilityType.COMPATIBLE;
+

Review Comment:
   @danny0405
   This raises an additional question.
   Today, when we read a MOR table, we pass the full schema when reading the
   Avro logs. Even if the query touches only one column, if the table has 100
   rows of avro logs, reading them with the full schema and generating the
   BitCatstMap consumes a lot of memory and performs poorly.
   Our Avro version has now been upgraded to 1.10.x, so we can pass the pruned
   schema directly when reading logs. Reading logs and generating the
   bitcastmaps this way is much faster and uses much less memory.
   Sorry, I cannot paste screenshots of the test for company information
   security reasons.
   
   
   Presto reading Hudi logs:
   
   Passing the full schema, we see the following log:
   Total size in bytes of MemoryBasedMap in ExternalSpillableMap => 712,956,000
   final query time: 35672ms
   
   Passing the pruned schema:
   Total size in bytes of MemoryBasedMap in ExternalSpillableMap => 45,500,000
   final query time: 13373ms
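
   The effect described above can be illustrated with a minimal plain-Java
   sketch. It is an assumption-laden stand-in, not Hudi or Avro code: records
   are ordinary maps, and `PrunedSchemaSketch`, `wideRecord`, `project`, and
   `bufferedValues` are hypothetical names. With Avro itself, the equivalent
   step would be giving the datum reader a reader schema that differs from the
   writer schema, for example through the two-argument
   `GenericDatumReader(Schema writer, Schema reader)` constructor.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: plain maps stand in for Avro records,
// and none of these names are Hudi or Avro APIs.
public class PrunedSchemaSketch {

    // Builds one wide log record with numCols columns named col0..col(numCols-1).
    public static Map<String, Object> wideRecord(int key, int numCols) {
        Map<String, Object> rec = new LinkedHashMap<>();
        for (int c = 0; c < numCols; c++) {
            rec.put("col" + c, "value-" + key + "-" + c);
        }
        return rec;
    }

    // Keeps only the requested columns, mimicking what a pruned reader
    // schema achieves when a record is decoded.
    public static Map<String, Object> project(Map<String, Object> rec, Set<String> wanted) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String name : wanted) {
            if (rec.containsKey(name)) {
                out.put(name, rec.get(name));
            }
        }
        return out;
    }

    // Counts the field values held by the buffered records,
    // a crude proxy for the memory the record map would need.
    public static long bufferedValues(List<Map<String, Object>> buffer) {
        long total = 0;
        for (Map<String, Object> rec : buffer) {
            total += rec.size();
        }
        return total;
    }

    public static void main(String[] args) {
        Set<String> queried = Set.of("col0", "col7"); // columns the query needs
        List<Map<String, Object>> fullBuffer = new ArrayList<>();
        List<Map<String, Object>> prunedBuffer = new ArrayList<>();

        for (int key = 0; key < 1000; key++) {
            Map<String, Object> rec = wideRecord(key, 100);
            fullBuffer.add(rec);                     // full schema: every column is kept
            prunedBuffer.add(project(rec, queried)); // pruned schema: queried columns only
        }

        System.out.println("full schema values buffered:   " + bufferedValues(fullBuffer));
        System.out.println("pruned schema values buffered: " + bufferedValues(prunedBuffer));
    }
}
```

   The sketch only counts buffered values, so it does not reproduce the byte
   sizes or timings reported above. It shows why the in-memory map shrinks
   when unqueried columns are never materialized.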
   


