JingsongLi commented on code in PR #529:
URL: https://github.com/apache/flink-table-store/pull/529#discussion_r1105689147
##########
flink-table-store-connector/src/test/java/org/apache/flink/table/store/connector/source/ContinuousFileSplitEnumeratorTest.java:
##########
@@ -105,6 +105,57 @@ public void testSplitAllocationIsOrdered() throws Exception {
}
}
+ @Test
+ public void testSplitAllocationIsFair() throws Exception {
+ final TestingSplitEnumeratorContext<FileStoreSourceSplit> context =
+ new TestingSplitEnumeratorContext<>(1);
+ context.registerReader(0, "test-host");
+
+ List<FileStoreSourceSplit> initialSplits = new ArrayList<>();
+ for (int i = 1; i <= 2; i++) {
+ initialSplits.add(createSnapshotSplit(i, 0, Collections.emptyList()));
+ initialSplits.add(createSnapshotSplit(i, 1, Collections.emptyList()));
+ }
+
+ List<FileStoreSourceSplit> expectedSplits = new ArrayList<>(initialSplits);
+
+ final ContinuousFileSplitEnumerator enumerator =
+ new Builder()
+ .setSplitEnumeratorContext(context)
+ .setInitialSplits(initialSplits)
+ .setDiscoveryInterval(3)
+ .build();
+
+ // each time a split is allocated from bucket-0 and bucket-1
+ enumerator.handleSplitRequest(0, "test-host");
+ Map<Integer, SplitAssignmentState<FileStoreSourceSplit>> assignments =
+ context.getSplitAssignments();
+ // Only subtask-0 is allocated.
+ Assertions.assertThat(assignments.size()).isEqualTo(1);
Review Comment:
You can import `Assertions.assertThat`.
And `assertThat` is very powerful. For this case, we can use:
`assertThat(assignments).hasSize(1)`.
I think you can change all assertions in this class.
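For readers unfamiliar with the AssertJ style the reviewer suggests, here is a minimal pure-JDK sketch of what `assertThat(assignments).hasSize(1)` checks (AssertJ is a test-only dependency, so this sketch uses plain collections; the map contents are made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class SizeAssertionSketch {
    public static void main(String[] args) {
        Map<Integer, String> assignments = new HashMap<>();
        assignments.put(0, "split-for-subtask-0");

        // AssertJ's assertThat(assignments).hasSize(1) reduces to this check,
        // but produces a much more descriptive failure message out of the box.
        if (assignments.size() != 1) {
            throw new AssertionError("expected map of size 1 but was " + assignments.size());
        }
        System.out.println("size check passed");
    }
}
```

The benefit over `assertThat(assignments.size()).isEqualTo(1)` is that on failure AssertJ prints the whole map contents, not just the two integers.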
##########
flink-table-store-connector/src/main/java/org/apache/flink/table/store/connector/source/ContinuousFileSplitEnumerator.java:
##########
@@ -163,22 +164,28 @@ private void processDiscoveredSplits(
}
private void assignSplits() {
+ Map<Integer, List<FileStoreSourceSplit>> toAssignSplits = new HashMap<>();
bucketSplits.forEach(
(bucket, splits) -> {
if (splits.size() > 0) {
// To ensure the order of consumption, the data of the same bucket is given
// to a task to be consumed.
int task = bucket % context.currentParallelism();
- if (readersAwaitingSplit.remove(task)) {
+ if (readersAwaitingSplit.contains(task)) {
// if the reader that requested another split has failed in the
// meantime, remove
// it from the list of waiting readers
if (!context.registeredReaders().containsKey(task)) {
+ readersAwaitingSplit.remove(task);
return;
}
- context.assignSplit(splits.poll(), task);
+ toAssignSplits
+ .computeIfAbsent(task, i -> new ArrayList<>())
+ .add(splits.poll());
}
}
});
+ toAssignSplits.forEach((task, splits) -> readersAwaitingSplit.remove(task));
Review Comment:
Minor: `toAssignSplits.keySet().forEach(readersAwaitingSplit::remove)`.
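The suggested one-liner can be sketched in isolation; the set and map below are made-up stand-ins for `readersAwaitingSplit` and `toAssignSplits` in the enumerator:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RemoveAssignedReadersSketch {
    public static void main(String[] args) {
        // readers 0, 1 and 2 are currently waiting for a split
        Set<Integer> readersAwaitingSplit = new HashSet<>(Arrays.asList(0, 1, 2));

        // readers 0 and 2 were just handed splits in this assignment round
        Map<Integer, List<String>> toAssignSplits = new HashMap<>();
        toAssignSplits.put(0, Collections.singletonList("split-a"));
        toAssignSplits.put(2, Collections.singletonList("split-b"));

        // the reviewer's suggestion: drop every task that received splits
        // from the waiting set in one pass, via a method reference
        toAssignSplits.keySet().forEach(readersAwaitingSplit::remove);

        System.out.println(readersAwaitingSplit); // only reader 1 still waits
    }
}
```

This is equivalent to the `forEach((task, splits) -> ...)` in the diff but avoids binding the unused `splits` value.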
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]