Hi,
The problem apparently occurs when I Ctrl-C the driver during
compute-heavy tasks. The slaves keep running long after the
driver has been stopped, and they are marked as dead.
Guillaume
Hi, sorry for the poor information I initially gave :)
The cluster is standalone, on premise, 4 nodes.
There was no problem before with 0.8.0 (more precisely, it happened
once or twice, not several times a day).
No exceptions, just these warnings on the master (t1.exensa.loc had
disappeared, while its process continued working):
WARN Master: Got heartbeat from unregistered worker worker-20140108170426-t1.exensa.loc-38178
Guillaume
Hi,
Can you give a little more detail about the problem? More than
a few hints would be great! I would like to know exactly what
you did and how you ended up with those stuck executors. This
can be due to the network too. Are you on EC2? In that case,
the EC2 network is often unpredictable.
--
Guillaume PITEL, Président
+33(0)6 25 48 86 80
eXenSa S.A.S.
41, rue Périer - 92120 Montrouge - FRANCE
Tel +33(0)1 84 16 36 77 / Fax +33(0)9 72 28 37 05