[GitHub] [hudi] nsivabalan commented on a diff in pull request #7720: [HUDI-5592] Fixing some of the flaky tests in CI

via GitHub Fri, 20 Jan 2023 16:28:44 -0800


nsivabalan commented on code in PR #7720:
URL: https://github.com/apache/hudi/pull/7720#discussion_r1083161581



##########
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java:
##########
@@ -607,8 +608,14 @@ public List<HoodieRecord> generateInsertsForPartition(String instantTime, Intege
   }
 
   public Stream<HoodieRecord> generateInsertsStream(String commitTime, Integer n, boolean isFlattened, String schemaStr, boolean containsAllPartitions) {
+    AtomicInteger partitionIndex = new AtomicInteger(0);
     return generateInsertsStream(commitTime, n, isFlattened, schemaStr, containsAllPartitions,
-        () -> partitionPaths[rand.nextInt(partitionPaths.length)],
+        () -> {
+          // round robin to ensure we generate inserts for all partition paths
+          String partitionToUse = partitionPaths[partitionIndex.get()];
+          partitionIndex.set((partitionIndex.get() + 1) % partitionPaths.length);
+          return partitionToUse;

Review Comment:
   Yes, but I don't expect any savings from going from 100 records down to 10. Most of our tests also use 100 records, so I kept it at 100 for parity.
   Yes, we have only 3 partitions in TestDataGenerator, so a minimum of 3 records is required to ensure we generate inserts for all 3 partitions.
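
   To illustrate the minimum-of-3 point, here is a minimal standalone sketch of round-robin partition selection. It is not the code in this PR: the class name, the `roundRobin` and `take` helpers, and the partition names are hypothetical, and it uses `AtomicInteger.getAndUpdate` in place of the separate `get()`/`set()` calls shown in the diff.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class RoundRobinPartitionSketch {

  // Hypothetical helper (not a Hudi API): returns a supplier that cycles
  // through the given partition paths in order, wrapping around at the end.
  static Supplier<String> roundRobin(String[] partitionPaths) {
    AtomicInteger partitionIndex = new AtomicInteger(0);
    // getAndUpdate returns the current index and advances it in one atomic step.
    return () -> partitionPaths[
        partitionIndex.getAndUpdate(i -> (i + 1) % partitionPaths.length)];
  }

  // Hypothetical helper: draws n partition paths from the supplier.
  static List<String> take(Supplier<String> supplier, int n) {
    List<String> result = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      result.add(supplier.get());
    }
    return result;
  }

  public static void main(String[] args) {
    // Placeholder partition names; only the count (3) comes from the review comment.
    String[] partitionPaths = {"partition-a", "partition-b", "partition-c"};

    // Two records cannot cover three partitions.
    List<String> two = take(roundRobin(partitionPaths), 2);
    System.out.println("distinct partitions with 2 records: " + new HashSet<>(two).size());

    // Three records are the minimum that covers every partition.
    List<String> three = take(roundRobin(partitionPaths), 3);
    System.out.println("distinct partitions with 3 records: " + new HashSet<>(three).size());
  }
}
```

   With n records and k partitions, round robin puts the first min(n, k) records into distinct partitions, so any n of at least 3 covers all three partitions, whereas random selection gives no such guarantee for a small n.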



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
