[GitHub] [flink] XComp commented on pull request #15898: [FLINK-21439][core] WIP: Adds Exception History for AdaptiveScheduler

GitBox Tue, 03 Aug 2021 05:17:07 -0700


XComp commented on pull request #15898:
URL: https://github.com/apache/flink/pull/15898#issuecomment-891799420

> Hi @zentol I just got back from a vacation so I'm looking at this again.
> I'm not sure what we want specification wise with concurrent failure support. I
> can imagine all sorts of things failing concurrently. It seems like maybe
> concurrent failures would be best tested in a more elaborate integration test
> to have a very clear expectation of correct behavior? Perhaps that could be a
> follow up ticket to make this current minimal exception handling landable
> quickly after I address the current feedback?

Hi @bytesandwich, I'm back from vacation, so I can answer your questions
now.
Testing concurrent failures should be possible as part of the
`AdaptiveSchedulerTest`. Similar to what you've done in
[AdaptiveSchedulerTest:929](https://github.com/apache/flink/blob/521e19eecadf39a226c5b5be4ed5348485656eab/flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptive/AdaptiveSchedulerTest.java#L929)
with one `updateTaskExecutionState` call, you should be able to do the same
with two calls. The first call makes the `AdaptiveScheduler` switch into the
`Restarting` state. Calling `updateTaskExecutionState` a second time would not
record the second exception right now, because only the first failure is
handled. Implementing the exception handling in the `Restarting` state class
as well should solve the issue.

Analogously, the same has to be done for cases where the scheduler switches
into the `Failing` state instead of `Restarting`. Does that make sense to you?
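
To illustrate the idea, here is a rough, self-contained sketch. It does not use the actual Flink classes: `ToyScheduler`, the `State` interface, and `reportTaskFailure` (standing in for `updateTaskExecutionState`) are all hypothetical names made up for this example. The point is only that `Restarting` and `Failing` also record failures that arrive after the first one.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; none of these types are Flink's real classes.
class ConcurrentFailureSketch {

    /** Hypothetical stand-in for a scheduler state. */
    interface State {
        State onTaskFailure(Throwable cause, List<Throwable> history);
    }

    /** First failure: record it and leave the executing state. */
    static final class Executing implements State {
        private final boolean restartable;

        Executing(boolean restartable) {
            this.restartable = restartable;
        }

        @Override
        public State onTaskFailure(Throwable cause, List<Throwable> history) {
            history.add(cause);
            return restartable ? new Restarting() : new Failing();
        }
    }

    /** A failure arriving while restarting is recorded instead of dropped. */
    static final class Restarting implements State {
        @Override
        public State onTaskFailure(Throwable cause, List<Throwable> history) {
            history.add(cause);
            return this;
        }
    }

    /** Same handling for the non-recoverable path. */
    static final class Failing implements State {
        @Override
        public State onTaskFailure(Throwable cause, List<Throwable> history) {
            history.add(cause);
            return this;
        }
    }

    /** Hypothetical scheduler that delegates failure handling to its current state. */
    static final class ToyScheduler {
        private final List<Throwable> history = new ArrayList<>();
        private State state;

        ToyScheduler(boolean restartable) {
            this.state = new Executing(restartable);
        }

        // Stand-in for two consecutive updateTaskExecutionState calls in a test.
        void reportTaskFailure(Throwable cause) {
            state = state.onTaskFailure(cause, history);
        }

        List<Throwable> history() {
            return history;
        }

        String stateName() {
            return state.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        ToyScheduler scheduler = new ToyScheduler(true);
        scheduler.reportTaskFailure(new RuntimeException("first failure"));
        scheduler.reportTaskFailure(new RuntimeException("second failure"));
        System.out.println(
                scheduler.stateName() + ": " + scheduler.history().size() + " failures recorded");
    }
}
```

A test in `AdaptiveSchedulerTest` would follow the same shape: trigger two failures in a row and assert that both end up in the exception history, once for the restartable case and once for the failing case.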

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

