Hi,

I was wondering how often a Worker pings the Master to check on the
Master's liveness. Or is it the Master (resource manager) that pings the
Workers to check on their liveness and respawns any that are dead? Or is
it both?

Some info:
Standalone cluster
1 Master - 8 cores, 12 GB
32 workers - each 8 cores and 8 GB

My main problem - Here's what happened:

Master M - running with 32 workers
Workers 1 and 2 died at 03:55:00 AM - so the cluster is now down to 30 workers

Worker 1' came up at 03:55:12.000 AM - it connected to M
Worker 2' came up at 03:55:16.000 AM - it connected to M

Master M *dies* at 03:56:00 AM
New master NM comes up at 03:56:30 AM
Worker 1' and 2' - *DO NOT* connect to NM
Remaining 30 workers connect to NM.

So NM now has 30 workers.

I was wondering why those two won't connect to the new master NM even
though master M is definitely dead.

PS: I have a load balancer (LB) set up in front of the Master, which means
that whenever a new master comes up, the LB starts pointing to the new one.

Thanks,
KP
