Re: Dying workers since migration to 0.8.1

Guillaume Pitel Fri, 17 Jan 2014 06:46:34 -0800

Hi,

The problem appears to be triggered when I Ctrl-C the driver during compute-heavy tasks. The slaves keep running for a long time after the driver has been stopped, and they are then marked as dead.

Guillaume

Hi, sorry for the poor information I initially gave :)

Cluster is standalone on premise, 4 nodes.

No problems before with 0.8.0 (more precisely, it happened once or twice, not several times a day).

No exceptions, just these warnings on the master (t1.exensa.loc had disappeared, while its process continued working):

WARN Master: Got heartbeat from unregistered worker worker-20140108170426-t1.exensa.loc-38178

Guillaume

Hi,

Could you give a little more detail about the problem, beyond a few hints? That would be great! I would like to know exactly what you did and how you ended up with those stuck executors. This could also be due to the network. Are you on EC2? In that case, the EC2 network is often unpredictable.

--

Guillaume PITEL, Président
+33(0)6 25 48 86 80

eXenSa S.A.S.
41, rue Périer - 92120 Montrouge - FRANCE
Tel +33(0)1 84 16 36 77 / Fax +33(0)9 72 28 37 05
