Github user foxish commented on the issue:
https://github.com/apache/spark/pull/21067
> After a short/configurable delay the driver pod state changed to Unknown
> and the Job controller initiated a new spark driver.
This is dangerous behavior. The old Spark driver can still be perfectly
functional and running within the cluster even though its state is marked
Unknown. It could also still be making progress with its own executors.
Network connection with the K8s master is not a prerequisite for pods to
continue running.
On Thu, Jul 12, 2018, 7:57 AM Lucas Kacher <[email protected]> wrote:
> @baluchicken <https://github.com/baluchicken>, did that test involve
> using checkpointing in a shared location?