> I am having difficulty implementing a "no parallel execution" guarantee --
> if a worker (or the connection to it) goes down, I need to recognize this in
> the Coordinator, "pause" all jobs the given worker was running, and (after some
> timeout or user action) resubmit the jobs to another worker. The timeout (or
> user action) is required to allow the worker (if it is alive) to detect the
> network error, stop its jobs, and start the cycle again (try to register
> itself with the Coordinator, etc.). It is important that once a connection is
> deemed broken, it is never reused (or the worker may not notice the problem);
> the worker is treated as dead until it re-registers itself (after a job purge
> or restart).

gRPC doesn't have this sort of guarantee built in.

The interesting part here smells like a variation on distributed locking.
You may want to look at something like ZooKeeper.

You could use gRPC messages to do things like communicate the lock names.
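
To make the bookkeeping concrete, here is a minimal sketch of the Coordinator side in Python. It is not a gRPC or ZooKeeper API; every name in it (Coordinator, Session, mark_broken, resubmit, grace_seconds) is hypothetical. It assumes each registration gets a fresh session id, that a broken session is never revived, and that a live worker stops its jobs within the grace period once it notices the network error. Without that last assumption the timeout alone does not rule out parallel execution.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Session:
    worker: str
    session_id: int
    jobs: Set[str] = field(default_factory=set)
    broken_at: Optional[float] = None  # set once, never cleared


class Coordinator:
    """Hypothetical bookkeeping only; transport (gRPC or other) is out of scope."""

    def __init__(self, grace_seconds: float) -> None:
        self.grace_seconds = grace_seconds
        self._ids = itertools.count(1)
        self.sessions: Dict[int, Session] = {}
        self.paused: Dict[str, int] = {}  # job -> session id that last ran it

    def register(self, worker: str) -> int:
        # Every (re-)registration gets a new id; old ids are never reused.
        session = Session(worker, next(self._ids))
        self.sessions[session.session_id] = session
        return session.session_id

    def assign(self, session_id: int, job: str) -> None:
        session = self.sessions[session_id]
        if session.broken_at is not None:
            raise RuntimeError("session is broken; worker must re-register")
        session.jobs.add(job)

    def mark_broken(self, session_id: int, now: float) -> None:
        # Called when the connection is deemed dead. Pauses the session's jobs.
        session = self.sessions[session_id]
        if session.broken_at is None:
            session.broken_at = now
            for job in session.jobs:
                self.paused[job] = session_id
            session.jobs.clear()

    def reassignable(self, now: float) -> List[str]:
        # Paused jobs whose old session has been broken for the full grace period.
        return sorted(
            job
            for job, sid in self.paused.items()
            if now - self.sessions[sid].broken_at >= self.grace_seconds
        )

    def resubmit(self, job: str, new_session_id: int, now: float) -> None:
        if job not in self.reassignable(now):
            raise RuntimeError("job is not paused or grace period has not elapsed")
        self.assign(new_session_id, job)
        del self.paused[job]
```

The caller passes the current time explicitly (for example from time.monotonic()), which keeps the sketch easy to test. A "user action" override could be modeled as a second path into resubmit that skips the grace check. The worker would also include its session id in each message, so the Coordinator can reject anything arriving from a session it has already marked broken.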

Christopher Warrington
