Hi,
  We are trying to use Helix to manage our clusters (300+ nodes) and have
run into a problem. Please help!

  Let me describe it.

  Our cluster is made up of servers and clients. The servers are partitioned
into groups (partitions in Helix), and the clients are partitioned
accordingly. We are now trying to add fault tolerance like this:

  1) If one server node fails, trigger a state transition on the server side
that does things like write a log entry, raise an alarm, and restart the
server process;

  2) Then trigger a state transition on all client nodes belonging to this
partition that does things like kick out the failed server and release the
failed server's resources on the client.
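
  To make the intended flow concrete, here is a rough plain-Java sketch of
the behaviour we are after. It does not use any Helix API: the class
FailoverSketch and all of its methods are hypothetical names made up for
illustration, and the real side effects (logging, alarm, restart, kick,
release) are stood in for by strings appended to a list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration only; nothing here is a Helix class or call.
class FailoverSketch {
    // partition name -> servers currently considered alive
    final Map<String, Set<String>> liveServers = new HashMap<>();
    // partition name -> clients attached to that partition
    final Map<String, List<String>> clients = new HashMap<>();
    // ordered record of actions taken, standing in for real side effects
    final List<String> actions = new ArrayList<>();

    void addServer(String partition, String server) {
        liveServers.computeIfAbsent(partition, p -> new HashSet<>()).add(server);
    }

    void addClient(String partition, String client) {
        clients.computeIfAbsent(partition, p -> new ArrayList<>()).add(client);
    }

    // Called when a server in the given partition is detected as failed.
    void onServerFailure(String partition, String server) {
        Set<String> live = liveServers.get(partition);
        if (live == null || !live.remove(server)) {
            return; // unknown server, or this failure was already handled
        }
        // Step 1: server-side reaction.
        actions.add("log:" + server);
        actions.add("alarm:" + server);
        actions.add("restart:" + server);
        // Step 2: every client of the same partition drops the failed server.
        List<String> affected =
                clients.getOrDefault(partition, Collections.<String>emptyList());
        for (String client : affected) {
            actions.add(client + ":kick:" + server);
            actions.add(client + ":release:" + server);
        }
    }

    public static void main(String[] args) {
        FailoverSketch sketch = new FailoverSketch();
        sketch.addServer("partition_0", "server_1");
        sketch.addClient("partition_0", "client_1");
        sketch.addClient("partition_1", "client_2"); // other partition, untouched
        sketch.onServerFailure("partition_0", "server_1");
        for (String action : sketch.actions) {
            System.out.println(action);
        }
    }
}
```

  The part we do not know how to express in Helix is the ordering: the
server-side handling first, then the client-side handling limited to the
clients of the affected partition.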

  Could someone please tell me how to implement this using Helix? Thanks!

  It would be even better if you could show me some code samples.

  Thanks!
