danny0405 commented on code in PR #13742:
URL: https://github.com/apache/hudi/pull/13742#discussion_r2300157763


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecordMerger.java:
##########
@@ -63,8 +63,15 @@ public interface HoodieRecordMerger extends Serializable {
    * It'd be associative operation: f(a, f(b, c)) = f(f(a, b), c) (which we can translate as having 3 versions A, B, C
    * of the single record, both orders of operations applications have to yield the same result)
    * This method takes only full records for merging.
+   *
+   * @param older     Older record in terms of commit time ordering.
+   * @param oldSchema The schema of the older record.
+   * @param newer     Newer record in terms of commit time ordering.
+   * @param newSchema The schema of the newer record.
+   * @param props     The additional properties for the merging operation.
+   * @return The merged record and schema. The record is expected to be non-null. If the record represents a deletion, the operation must be set as {@link HoodieOperation#DELETE}.
    */
-  Option<Pair<HoodieRecord, Schema>> merge(HoodieRecord older, Schema oldSchema, HoodieRecord newer, Schema newSchema, TypedProperties props) throws IOException;
+  Pair<HoodieRecord, Schema> merge(HoodieRecord older, Schema oldSchema, HoodieRecord newer, Schema newSchema, TypedProperties props) throws IOException;

Review Comment:
   If the merger just took two buffered records, things would be much easier:
   
   ```java
   BufferedRecord merge(BufferedRecord older, BufferedRecord newer, RecordContext context);
   ```
   
   The benefits of doing this:
   
   - `BufferedRecord` is the core abstraction for merging, so it is easier to evolve in the future;
   - it avoids the unnecessary hoodie record -> buffered record -> hoodie record conversion;
   - it gives clearer implementation and semantics for delete handling in mergers.
   
   We would just need to rename `BufferedRecord` to something more general once it is exposed as a user-facing API.
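
   To make the suggestion concrete, here is a minimal sketch of how a merger with that shape might look. Everything in it is an assumption for illustration: the `BufferedRecord` and `RecordContext` classes below are simplified stand-ins with made-up fields, not the actual Hudi classes, and `RecordMerger` and `ORDERING_VALUE_MERGER` are hypothetical names.

   ```java
   import java.util.Objects;

   // Hypothetical sketch: these types only mirror the names used in the comment above.
   // They are NOT Hudi's actual BufferedRecord or RecordContext classes.
   class MergeSketch {

     /** Simplified stand-in for a buffered record; all fields are assumptions. */
     static final class BufferedRecord {
       final String recordKey;
       final long orderingValue;
       final String data;      // null when the record is a delete
       final boolean isDelete; // the delete marker travels with the record itself

       BufferedRecord(String recordKey, long orderingValue, String data, boolean isDelete) {
         this.recordKey = Objects.requireNonNull(recordKey);
         this.orderingValue = orderingValue;
         this.data = data;
         this.isDelete = isDelete;
       }
     }

     /** Placeholder for engine-specific helpers; intentionally empty in this sketch. */
     static final class RecordContext { }

     /** The merger shape proposed in the comment: two buffered records in, one out. */
     interface RecordMerger {
       BufferedRecord merge(BufferedRecord older, BufferedRecord newer, RecordContext context);
     }

     /** Example merger: keep the higher ordering value; ties go to the newer record. */
     static final RecordMerger ORDERING_VALUE_MERGER =
         (older, newer, context) -> newer.orderingValue >= older.orderingValue ? newer : older;

     public static void main(String[] args) {
       RecordContext context = new RecordContext();
       BufferedRecord insert = new BufferedRecord("key1", 1L, "v1", false);
       BufferedRecord delete = new BufferedRecord("key1", 2L, null, true);

       // The result is never null; a delete is simply a record flagged as such.
       BufferedRecord merged = ORDERING_VALUE_MERGER.merge(insert, delete, context);
       System.out.println(merged.isDelete); // prints: true
     }
   }
   ```

   In this sketch the merger always returns a non-null record and a delete is just a record with its flag set, so no `Option` wrapper and no hoodie record round trip is needed.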


