Hi.
We have seen the same behaviour on YARN. It turned out that the default
value of the following setting was not optimal for us:
yarn.maximum-failed-containers: The maximum number of failed containers the
ApplicationMaster accepts until it fails the YARN session. Default: The number
of initially requested TaskManagers (-n).
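As a rough sketch, raising the limit in flink-conf.yaml could look like this.
The value 10 is only an illustrative assumption, not a recommendation; pick
something that fits how many container failures your cluster should tolerate.

```yaml
# flink-conf.yaml
# Illustrative value only (assumption): allow up to 10 failed containers
# before the ApplicationMaster fails the YARN session.
yarn.maximum-failed-containers: 10
```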
So check what this option is set to on your system.
The next step is to investigate why the TaskManager was killed.
Best regards
Lasse Nedergaard
> On 20 Aug 2018, at 16:34, Dominik Wosiński wrote:
>
> Hey,
> Can you please provide a little more information about your setup, and maybe
> logs showing when the crash occurs?
> Best Regards,
> Dominik
>
> 2018-08-20 16:23 GMT+02:00 Siew Wai Yow :
>> Hi,
>>
>> When one of the task managers is killed, the whole cluster dies. Is this
>> expected? We are using Flink 1.4. Thank you.
>>
>> Regards,
>> Yow
>