Re: Taskmanagers are quarantined

2017-12-07 Thread T Obi
this helps to overcome the GC pauses.
>
> Cheers,
> Till
>
> On Wed, Nov 29, 2017 at 12:41 PM, T Obi <t@geniee.co.jp> wrote:
>>
>> Warnings of Datanode appeared not in all cases of timeout. They seem
>> to be raised just by timeout while snapshotting.
>

Re: Taskmanagers are quarantined

2017-11-29 Thread T Obi
time. I will try running a few taskmanagers on each machine, with the memory divided among them. I will also tune the JVM memory parameters to reduce the frequency of "Full GC (Metadata GC Threshold)".

Best,
Tetsuya

2017-11-28 16:30 GMT+09:00 T Obi <t@geniee.co.jp>:
> Hello Chesnay,
>

Re: Taskmanagers are quarantined

2017-11-27 Thread T Obi
e you using?
>
> From the stack-trace it appears that multiple hdfs nodes are being
> corrupted.
> The taskmanagers time out since the connection to zookeeper breaks down,
> at which point it no longer knows who the leading jobmanager is and
> subsequently shuts down.
>

Taskmanagers are quarantined

2017-11-26 Thread T Obi
Hello all,

We run jobs on a standalone cluster with Flink 1.3.2 and we're facing a problem. Suddenly a connection between a taskmanager and the jobmanager times out and the taskmanager is "quarantined" by the jobmanager. Once a taskmanager is quarantined, of course jobs are restarted, but the