pratyakshsharma commented on a change in pull request #1520: [HUDI-797] Small performance improvement for rewriting records.
URL: https://github.com/apache/incubator-hudi/pull/1520#discussion_r410170533
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
 ##########
 @@ -231,6 +254,36 @@ private static GenericRecord rewrite(GenericRecord record, LinkedHashSet<Field>
     return allFields;
   }
 
+  /*
+   * Given a avro record with a given schema, rewrites it into the new schema while setting fields only from the old
+   * schema.
+   *
+   * NOTE: This function is only suitable if newSchema has fields with the same position as record's schema.
+   */
+  public static GenericRecord rewriteHoodieRecord(GenericRecord record, Schema newSchema) {
+    return rewriteHoodieRecord(record, record.getSchema(), newSchema);
+  }
+
+  /**
+   * Given a avro record with a given schema, rewrites it into the new schema while setting fields only from the old
+   * schema.
+   *
+   * This function has better performance than rewrite() even though it provides the same functionality.
+   *
+   * NOTE: This function is only suitable if newSchema has fields with the same position as schemaWithFields.
+   */
+  public static GenericRecord rewriteHoodieRecord(GenericRecord record, Schema schemaWithFields, Schema newSchema) {
+    GenericRecord newRecord = new GenericData.Record(newSchema);
+    for (Schema.Field f : schemaWithFields.getFields()) {
+      newRecord.put(f.pos(), record.get(f.pos()));
 
 Review comment:
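   Please handle default values here. As written, the loop only sets fields that exist in schemaWithFields, so any additional field in newSchema is never set and stays null instead of taking its schema default.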
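
   To illustrate the concern, here is a minimal, library-free sketch of one way the loop could fill defaults. It is not the PR's code: `Field` and `DefaultAwareRewrite` are hypothetical stand-ins for Avro's `Schema.Field` and the rewrite method, and a `null` default is assumed to mean "no default declared".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Avro's Schema.Field: a name, a position, and an optional default.
final class Field {
    final String name;
    final int pos;
    final Object defaultValue; // assumption: null means "no default declared"

    Field(String name, int pos, Object defaultValue) {
        this.name = name;
        this.pos = pos;
        this.defaultValue = defaultValue;
    }
}

final class DefaultAwareRewrite {
    // Copies values by position (as the loop under review does), then fills defaults.
    static Object[] rewrite(Object[] oldValues, List<Field> oldFields, List<Field> newFields) {
        Object[] newValues = new Object[newFields.size()];

        // Positional copy: only valid when both schemas agree on field positions.
        for (Field f : oldFields) {
            newValues[f.pos] = oldValues[f.pos];
        }

        // Extra pass: any field still unset takes its declared default, if it has one.
        for (Field f : newFields) {
            if (newValues[f.pos] == null && f.defaultValue != null) {
                newValues[f.pos] = f.defaultValue;
            }
        }
        return newValues;
    }

    public static void main(String[] args) {
        List<Field> oldFields = Arrays.asList(
            new Field("id", 0, null),
            new Field("name", 1, null));
        List<Field> newFields = Arrays.asList(
            new Field("id", 0, null),
            new Field("name", 1, null),
            new Field("active", 2, Boolean.TRUE));

        Object[] out = rewrite(new Object[] {42L, "hudi"}, oldFields, newFields);
        System.out.println(Arrays.toString(out)); // prints [42, hudi, true]
    }
}
```

   Note that this sketch also replaces an explicitly null old value with the default, and the second pass adds per-record work, which matters given that the PR is about performance. In real Avro code the default would presumably come from `Schema.Field.defaultVal()`, but that mapping is an assumption here.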

