fapaul commented on a change in pull request #18790:
URL: https://github.com/apache/flink/pull/18790#discussion_r809110464
##########
File path:
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/SourceOperator.java
##########
@@ -423,6 +422,16 @@ private DataInputStatus emitNextNotReading(DataOutput<OUT> output) throws Except
}
}
+ private void initializeMainOutput(DataOutput<OUT> output) {
+ currentMainOutput = eventTimeLogic.createMainOutput(output, this::onWatermarkEmitted);
+ initializeLatencyMarkerEmitter(output);
+ lastInvokedOutput = output;
+ outputPendingSplits.forEach(
+ split -> currentMainOutput.createOutputForSplit(split.splitId()));
Review comment:
I wonder whether it is a problem that all the outputs are now initialized
up front but never released.
Previously, the outputs were initialized one by one here [1] and released
when moving to the next split [2].
[1]
https://github.com/apache/flink/blob/106280e10a96d729943985986198b942446197d9/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/SourceReaderBase.java#L327
[2]
https://github.com/apache/flink/blob/106280e10a96d729943985986198b942446197d9/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/SourceReaderBase.java#L195
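To make the concern concrete, here is a minimal sketch of the two lifecycles. It is a toy model only: `SplitOutputRegistry`, its methods, and the split IDs are hypothetical names made up for illustration, not Flink classes or the code behind the links above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for per-split output bookkeeping; not a Flink class. */
final class SplitOutputRegistry {
    private final Map<String, StringBuilder> outputs = new HashMap<>();

    /** Creates the output for a split on first use and reuses it afterwards. */
    StringBuilder getOrCreate(String splitId) {
        return outputs.computeIfAbsent(splitId, id -> new StringBuilder());
    }

    /** Drops the output once the reader has moved past the split. */
    void release(String splitId) {
        outputs.remove(splitId);
    }

    /** Number of outputs that are currently held. */
    int activeCount() {
        return outputs.size();
    }
}

final class SplitOutputLifecycleDemo {
    public static void main(String[] args) {
        List<String> splits = Arrays.asList("split-0", "split-1", "split-2");

        // Eager: every pending split gets an output up front, none is released.
        SplitOutputRegistry eager = new SplitOutputRegistry();
        splits.forEach(eager::getOrCreate);
        System.out.println(eager.activeCount()); // prints 3

        // Lazy: create on first use, release when moving to the next split.
        SplitOutputRegistry lazy = new SplitOutputRegistry();
        lazy.getOrCreate("split-0");
        lazy.release("split-0");
        lazy.getOrCreate("split-1");
        System.out.println(lazy.activeCount()); // prints 1
    }
}
```

In the eager variant the number of held outputs only grows with the number of assigned splits, which is the behavior I would like to have confirmed as intended.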
##########
File path:
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/SourceOperator.java
##########
@@ -516,7 +525,14 @@ public void handleOperatorEvent(OperatorEvent event) {
checkWatermarkAlignment();
} else if (event instanceof AddSplitEvent) {
try {
- sourceReader.addSplits(((AddSplitEvent<SplitT>) event).splits(splitSerializer));
+ List<SplitT> newSplits = ((AddSplitEvent<SplitT>) event).splits(splitSerializer);
Review comment:
Is `handleOperatorEvent` executed by the mailbox thread, like the other
methods?
##########
File path:
flink-connectors/flink-connector-base/src/test/java/org/apache/flink/connector/base/source/reader/SourceReaderBaseTest.java
##########
@@ -239,6 +257,77 @@ void testPollNextReturnMoreAvailableWhenAllSplitFetcherCloseWithLeftoverElementI
.isEqualTo(InputStatus.MORE_AVAILABLE);
}
+ @Test
+ void testPerSplitWatermark() throws Exception {
+ MockSplitReader mockSplitReader =
+ MockSplitReader.newBuilder()
+ .setNumRecordsPerSplitPerFetch(3)
+ .setBlockingFetch(true)
+ .build();
+
+ MockSourceReader reader =
+ new MockSourceReader(
+ new FutureCompletingBlockingQueue<>(),
+ () -> mockSplitReader,
+ new Configuration(),
+ new TestingReaderContext());
+
+ SourceOperator<Integer, MockSourceSplit> sourceOperator =
+ createTestOperator(
+ reader,
+ WatermarkStrategy.forGenerator(
+ (context) -> new OnEventWatermarkGenerator()),
+ true);
+
+ MockSourceSplit splitA = new MockSourceSplit(0, 0, 3);
+ splitA.addRecord(100);
+ splitA.addRecord(200);
+ splitA.addRecord(300);
+
+ MockSourceSplit splitB = new MockSourceSplit(1, 0, 3);
+ splitB.addRecord(150);
+ splitB.addRecord(250);
+ splitB.addRecord(350);
+
+ AddSplitEvent<MockSourceSplit> addSplitsEvent =
+ new AddSplitEvent<>(Arrays.asList(splitA, splitB), new MockSourceSplitSerializer());
+ sourceOperator.handleOperatorEvent(addSplitsEvent);
+ WatermarkCollectingDataOutput output = new WatermarkCollectingDataOutput();
+
+ // First 3 records from split A should not generate any watermarks
+ CommonTestUtils.waitUtil(
Review comment:
Why do you need this loop? Can't you call `emitNext` the correct
number of times?
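What I have in mind is roughly the following. This is a toy sketch, not the real test: `ToySource` and `emitExactly` are hypothetical names, and it assumes every `emitNext` call synchronously emits at most one record, which may not hold for the actual operator if records arrive from a fetcher thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

/** Hypothetical synchronous source: each emitNext call emits at most one record. */
final class ToySource {
    private final Queue<Integer> pending;

    ToySource(List<Integer> records) {
        this.pending = new ArrayDeque<>(records);
    }

    /** Moves one record into the sink; returns false if nothing was left to emit. */
    boolean emitNext(List<Integer> sink) {
        Integer next = pending.poll();
        if (next == null) {
            return false;
        }
        sink.add(next);
        return true;
    }
}

final class FixedCallCountDemo {
    /** Calls emitNext exactly {@code calls} times and returns what was emitted. */
    static List<Integer> emitExactly(ToySource source, int calls) {
        List<Integer> sink = new ArrayList<>();
        for (int i = 0; i < calls; i++) {
            source.emitNext(sink);
        }
        return sink;
    }

    public static void main(String[] args) {
        ToySource source = new ToySource(Arrays.asList(100, 200, 300, 150, 250, 350));
        // Three calls, three records, no waiting on a condition.
        System.out.println(emitExactly(source, 3)); // prints [100, 200, 300]
    }
}
```

If the number of calls needed per record is not deterministic in the real setup, that would explain the loop, and a short comment in the test saying so would help.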