StephanEwen commented on pull request #11554:
URL: https://github.com/apache/flink/pull/11554#issuecomment-621785104
Thank you for addressing the changes. Looks good to me with two remarks:
(1) Can we wrap the `ByteArrayOutputStream` into a `DataOutputStream`
instead of using the `SerdeUtils`? It looks both simpler and more efficient and
is also what other parts of the Flink code typically do.
(2) About the "Fatal Error Handler": Virtually all of the really critical
issues in Flink have fallen into the following category: An exception happens
that we try to handle "fine grained" but it actually leaves something in an
inconsistent state that is not recognized. Had we simply handled it "coarse
grained" (fail process, rely on recovery) it would have been fine.
That is why I think the default should be to escalate. If we feel that we
are completely sure we can handle this more fine-grained, then it is fine.
This issue is somewhat connected with the "closing" of the coordinator.
How about the following approach:
- We add a "process killing" uncaught exception handler, as the safety net.
- We catch the exceptions in the runnables ourselves and cause a global
job failure upon exception, unless the `SourceCoordinator` is closed already
- The global job failure will close() the current coordinator re-create
the coordinator from latest checkpoint (we need to still implement this)
- closing the source coordinator gets the following changes
- we call `shutdownNow()` on the executor to make sure all queued tasks
do not get executed
- we make sure the `SourceCoordinator` does not forward events from the
Enumerator any more, so that a long-running task that is still being executed
in the enumerator cannot affect anything any more
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]