Greetings,

I've tried various storm.yaml and ZooKeeper parameter changes to find out
what causes random workers in our cluster to die, but have not had any
luck. Sometimes I get a clean test run, though most of the time a worker
just stops for no apparent reason.

The supervisor.log file says it's shutting down the worker, but doesn't
indicate why. Sometimes the state is "timed-out", and other times it is
"disallowed". When I do see the disallowed state, I see a new worker
process spawned.

Are there any specific heartbeat or timeout settings, or combinations of
parameters, that I should pay particular attention to?
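
For reference, the heartbeat and timeout settings in question are of the
following kind. This is only a sketch of a storm.yaml fragment: the key
names and values are what I believe to be the stock entries in Storm's
defaults.yaml for the 0.9.x line, not the tuned values from our cluster,
so please check them against your own defaults.yaml.

```yaml
# Illustrative only: assumed stock defaults, not our cluster's settings.

# ZooKeeper client timeouts (milliseconds)
storm.zookeeper.session.timeout: 20000
storm.zookeeper.connection.timeout: 15000

# How often each component writes a heartbeat (seconds)
worker.heartbeat.frequency.secs: 1
task.heartbeat.frequency.secs: 3
supervisor.heartbeat.frequency.secs: 5

# How long before a missing heartbeat is treated as a failure (seconds)
supervisor.worker.timeout.secs: 30
supervisor.worker.start.timeout.secs: 120
nimbus.task.timeout.secs: 30
nimbus.supervisor.timeout.secs: 60
```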

*Cluster details:*
Storm - 0.9.2-incubating
Zookeeper - 3.4.5
Ubuntu 12.04


Thanks,
Dennis
