stevenzwu edited a comment on pull request #13574: URL: https://github.com/apache/flink/pull/13574#issuecomment-718047126
> adding a failJob() method to the SplitEnumeratorContext so the SplitEnumerator implementations can decide by themselves what to do in each method. And any exception thrown from the method invocation will just result in the job failure.

@becketqin I was going to ask for this feature, although it still depends on our design choice. Let me paste an email question regarding `StaticFileSplitEnumerator` that I shared with @StephanEwen offline.

Right now, for the static mode, `FileSource` discovers the splits in the `createEnumerator` step. We are trying to evaluate whether we should follow the same pattern for the Iceberg source. Here are the pros and cons as I understand them:

* Pro: the job fails fast if split enumeration fails, which may be required for batch jobs. Is this the reason for the decision?
* Con: job creation/submission can be slow, since split enumeration can take dozens of seconds or longer for large table scans.

Alternatively, a `Static*Enumerator` could initiate split discovery in `SplitEnumerator.start()`. This is how the Kafka source is implemented: depending on the `partitionDiscoveryIntervalMs` config, `KafkaSourceEnumerator` calls `context.callAsync` once or periodically. Then the pros and cons are reversed:

* Pro: job submission/creation is fast. We can add retries internally to handle enumeration failures.
* Con: the job may get stuck in a failure loop if split discovery keeps failing. Once the SplitEnumerator starts, there is no way to fail fast (which might be a critical issue for batch jobs).

If we go with the latter approach, `SplitEnumeratorContext.failJob()` would be very useful: we could fail the batch/bounded job after the initial enumeration has failed once, or a few times after retries.

If we are going to add `SplitEnumeratorContext.failJob()`, it should be done in [PR 13784](https://github.com/apache/flink/pull/13784), right?
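To make the second option concrete, here is a minimal sketch of a bounded enumerator that starts discovery in `start()`, retries internally, and fails the job once the retries are exhausted. It does not depend on Flink: `EnumeratorContext`, `RetryingStaticEnumerator`, `InlineContext`, and `FailJobSketch` are hypothetical names, the `callAsync` shape (a callable plus a result/error handler) is only modeled on the call mentioned above, and `failJob(Throwable)` is an assumed signature for the proposed method, not an existing API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for the two context methods discussed; not Flink's interface. */
interface EnumeratorContext {
    <T> void callAsync(Callable<T> callable, BiConsumer<T, Throwable> handler);

    /** Assumed shape of the proposed failJob() method. */
    void failJob(Throwable cause);
}

/** Bounded enumerator that discovers splits in start() and gives up after maxAttempts. */
final class RetryingStaticEnumerator {
    private final EnumeratorContext context;
    private final Callable<List<String>> discovery;
    private final int maxAttempts;
    private int attempts = 0;
    private List<String> splits = new ArrayList<>();

    RetryingStaticEnumerator(
            EnumeratorContext context, Callable<List<String>> discovery, int maxAttempts) {
        this.context = context;
        this.discovery = discovery;
        this.maxAttempts = maxAttempts;
    }

    /** Job creation stays fast: discovery only begins here, not in the constructor. */
    void start() {
        discover();
    }

    private void discover() {
        attempts++;
        context.callAsync(discovery, (result, error) -> {
            if (error == null) {
                splits = result;            // discovery succeeded
            } else if (attempts < maxAttempts) {
                discover();                 // internal retry
            } else {
                context.failJob(error);     // retries exhausted: fail fast
            }
        });
    }

    List<String> splits() {
        return splits;
    }

    int attempts() {
        return attempts;
    }
}

/** Test double: runs the callable inline on the caller's thread and records failJob(). */
final class InlineContext implements EnumeratorContext {
    Throwable failure;

    @Override
    public <T> void callAsync(Callable<T> callable, BiConsumer<T, Throwable> handler) {
        T result = null;
        Throwable error = null;
        try {
            result = callable.call();
        } catch (Exception e) {
            error = e;
        }
        handler.accept(result, error);
    }

    @Override
    public void failJob(Throwable cause) {
        failure = cause;
    }
}

final class FailJobSketch {
    public static void main(String[] args) {
        InlineContext context = new InlineContext();
        RetryingStaticEnumerator enumerator = new RetryingStaticEnumerator(
                context, () -> { throw new IOException("catalog unavailable"); }, 3);
        enumerator.start();
        System.out.println("attempts=" + enumerator.attempts()
                + ", jobFailed=" + (context.failure != null));
    }
}
```

Without something like `failJob()`, the final branch has no good option: the handler can only keep retrying or silently leave the bounded job with no splits.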
