codope commented on code in PR #7038:
URL: https://github.com/apache/hudi/pull/7038#discussion_r1012548734


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java:
##########
@@ -499,7 +499,7 @@ private Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchFromSourc
       return new HoodieAvroRecord<>(keyGenerator.getKey(record), payload);
     });
 
-    return Pair.of(schemaProvider, Pair.of(checkpointStr, records));
+    return new ReadResult(schemaProvider, checkpointStr, records, false);

Review Comment:
   Could we drop the `avroRDDOptional.get().isEmpty()` check at L484 by pulling 
the `DataSourceUtils.dropDuplicates` call up into this method? We could then 
check `isEmpty` here at L502, when instantiating `ReadResult`, instead of 
passing `false`. That would still be just one `isEmpty` call.
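
   A rough sketch of the shape I have in mind, using plain Java collections 
in place of `JavaRDD`. Everything here is hypothetical and simplified: the 
`ReadResult` below omits the schema provider, the `dropDuplicates` helper is 
only a stand-in for `DataSourceUtils.dropDuplicates`, and the `filterDupes` 
flag and parameter names are assumptions made for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative only: not the actual DeltaSync code.
final class ReadResultSketch {

  // Hypothetical, simplified stand-in for the ReadResult class added in this PR.
  static final class ReadResult {
    final String checkpointStr;
    final List<String> records;
    final boolean isEmpty;

    ReadResult(String checkpointStr, List<String> records, boolean isEmpty) {
      this.checkpointStr = checkpointStr;
      this.records = records;
      this.isEmpty = isEmpty;
    }
  }

  // Stand-in for DataSourceUtils.dropDuplicates: drops keys already in the table.
  static List<String> dropDuplicates(List<String> incoming, Set<String> existingKeys) {
    return incoming.stream()
        .filter(key -> !existingKeys.contains(key))
        .collect(Collectors.toList());
  }

  // Dedupe first, then evaluate emptiness exactly once while building the result.
  static ReadResult fetchFromSource(List<String> incoming, Set<String> existingKeys,
                                    boolean filterDupes, String checkpointStr) {
    List<String> records = filterDupes ? dropDuplicates(incoming, existingKeys) : incoming;
    return new ReadResult(checkpointStr, records, records.isEmpty());
  }
}
```

   With this ordering, a batch that becomes empty only after deduplication is 
also reported as empty, and the caller reads the flag from `ReadResult` 
instead of calling `isEmpty` again.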



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java:
##########
@@ -962,4 +964,34 @@ private Set<String> getPartitionColumns(KeyGenerator keyGenerator, TypedProperti
     String partitionColumns = SparkKeyGenUtils.getPartitionColumns(keyGenerator, props);
     return Arrays.stream(partitionColumns.split(",")).collect(Collectors.toSet());
   }
+
+  class ReadResult {

Review Comment:
   +1 for abstracting out this model.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

