xushiyan commented on code in PR #7720:
URL: https://github.com/apache/hudi/pull/7720#discussion_r1083231849


##########
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java:
##########
@@ -607,8 +608,14 @@ public List<HoodieRecord> generateInsertsForPartition(String instantTime, Intege
   }
 
   public Stream<HoodieRecord> generateInsertsStream(String commitTime, Integer n, boolean isFlattened, String schemaStr, boolean containsAllPartitions) {
+    AtomicInteger partitionIndex = new AtomicInteger(0);
     return generateInsertsStream(commitTime, n, isFlattened, schemaStr, containsAllPartitions,
-        () -> partitionPaths[rand.nextInt(partitionPaths.length)],
+        () -> {
+          // round robin to ensure we generate inserts for all partition paths
+          String partitionToUse = partitionPaths[partitionIndex.get()];
+          partitionIndex.set((partitionIndex.get() + 1) % partitionPaths.length);
+          return partitionToUse;

Review Comment:
   This is minor: 1) From a UT perspective, 100 records shouldn't behave any 
differently from 10 records. If 10 works, why bump to 100? 2) We should ensure 
num records > num partitions without assuming there are always 3 partitions, 
so an argument check would help prevent misuse that leads to unexpected 
results.
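
   To illustrate point 2, here is a minimal sketch of the kind of argument 
check I have in mind, combined with the round-robin supplier from the diff. 
The class `RoundRobinPartitionSketch` and its methods are hypothetical names 
for illustration only and are not part of `HoodieTestDataGenerator`; the 
sketch hands out plain partition path strings instead of `HoodieRecord`s.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not the actual HoodieTestDataGenerator code.
class RoundRobinPartitionSketch {

  // Cycles through the partition paths in order, wrapping around at the end.
  static Supplier<String> roundRobin(String[] partitionPaths) {
    AtomicInteger index = new AtomicInteger(0);
    return () -> partitionPaths[index.getAndUpdate(i -> (i + 1) % partitionPaths.length)];
  }

  // Returns the partition chosen for each of numRecords records.
  static List<String> assignPartitions(int numRecords, String[] partitionPaths) {
    if (partitionPaths.length == 0) {
      throw new IllegalArgumentException("At least one partition path is required");
    }
    // Mirrors the review comment's wording (num records > num partitions).
    // Note that numRecords >= partitionPaths.length would already touch every partition.
    if (numRecords <= partitionPaths.length) {
      throw new IllegalArgumentException("numRecords (" + numRecords
          + ") must be greater than the number of partitions (" + partitionPaths.length + ")");
    }
    return Stream.generate(roundRobin(partitionPaths))
        .limit(numRecords)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    String[] paths = {"2016/03/15", "2015/03/16", "2015/03/17"};
    // Four records over three partitions: every partition is used at least once.
    System.out.println(assignPartitions(4, paths));
  }
}
```

   With a check like this, the caller does not need to know how many 
partitions the generator holds; a request that is too small fails fast 
instead of silently skipping partitions.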


