On Thu, Aug 8, 2013 at 12:37 PM, Jose A. Lopes <[email protected]> wrote:

> > That would not work if we want LuxiD to be able to be restarted while jobs
> > are running (might be useful for easier upgrades). We would like to persist
> > information about jobs and the job queue to disk, and then obviously the
> > parent/child relationship is gone. But maybe we could implement the proper
> > way for normal operation and check only on startup using the PID/creation
> > time/cmdline in /proc approach.
>
> Why not reverse the direction of parent/child pings?
>
> For example, why not have a UNIX socket in the master and the job
> processes must ping on the master socket every now and then. This way
> we just have one socket instead of having one per process. If the
> master dies, the job processes know because the UNIX socket gets
> closed. But they just keep trying until the socket comes back.
>
> Moreover, perhaps we don't have to persist any job queue information,
> because when the master comes back up, it will simply collect the
> pings from the job processes that are still running.
>
> If you like this idea, we can even remove the UNIX socket from the
> picture and simply add a LUXI ping request, used only by the job
> processes, to communicate with the master.
>
> What do you think?
>
> Jose
>

I don't think this would work. What happens if a job terminates while LuxiD
is not available? After coming back, LuxiD will have no idea which jobs are
still running and which are not, because all it can do is wait for their
pings. A job that finished while LuxiD was down will never send another
ping or, even more importantly, the message saying that everything went
well and that it is releasing its locks.

Having multiple sockets is easier. LuxiD only needs to keep a persistent
list of where all the sockets are located, and then just has to try
connecting to each of them to find out whether the jobs are still alive and
working.
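
As a rough sketch of that startup check (this is not actual Ganeti code;
the function names and the idea of one socket path per job are assumptions
made only for illustration), LuxiD could walk its persisted list and try to
connect to each socket:

```python
import socket


def job_is_alive(sock_path, timeout=1.0):
    """Return True if some process accepts connections on sock_path.

    Hypothetical helper: assumes each job listens on its own UNIX socket.
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(sock_path)
        return True
    except OSError:
        # Covers a missing socket file, a stale file left behind by a
        # dead job (connection refused), and a connect timeout.
        return False
    finally:
        s.close()


def probe_jobs(sock_paths):
    """Map every persisted socket path to whether its job still answers."""
    return {path: job_is_alive(path) for path in sock_paths}
```

A successful connect only shows that something is still listening on the
socket. Telling "alive" apart from "alive and making progress" would need
a small request/response exchange on top of this.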

Thanks,
Michele

-- 
Google Germany GmbH
Dienerstr. 12
80331 München

Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Graham Law, Christine Elizabeth Flores
