Re: Cluster die when one of the TM killed

Lasse Nedergaard Mon, 20 Aug 2018 08:16:58 -0700

Hi. 
We have seen the same behaviour on YARN. It turned out that the default 
setting for the following option was not optimal for us:
yarn.maximum-failed-containers: The maximum number of failed containers the 
ApplicationMaster accepts until it fails the YARN session. Default: The number 
of initially requested TaskManagers (-n).
So try to look up how this option is configured on your system. 
The next step is to investigate why the task manager was killed.
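
As a minimal sketch, the option could be set in flink-conf.yaml as shown 
here. The value 100 is only an illustrative assumption, not a recommendation; 
pick a limit that fits how many container failures your session should 
tolerate.

```yaml
# flink-conf.yaml
# Number of failed containers the ApplicationMaster tolerates before it
# fails the YARN session. The value below is an arbitrary example.
yarn.maximum-failed-containers: 100
```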



Med venlig hilsen / Best regards
Lasse Nedergaard


> On 20 Aug 2018, at 16:34, Dominik Wosiński <[email protected]> wrote:
> 
> Hey, 
> Can you please provide a little more information about your setup and maybe 
> logs showing when the crash occurs? 
> Best Regards,
> Dominik
> 
> 2018-08-20 16:23 GMT+02:00 Siew Wai Yow <[email protected]>:
>> Hi,
>> 
>> When one of the task managers is killed, the whole cluster dies. Is this 
>> expected? We are using Flink 1.4. Thank you.
>> 
>> Regards,
>> Yow
>
