Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/6103
  
    That all depends on why the failure happens in the first place. It seems to 
happen if the receiver of a channel starts much faster than the sender. The 
longest part of the deployment is library distribution, which happens only 
once. After one failure / recovery, the library should be cached, and the next 
attempt to start the task should be very fast.

