On Thu, Aug 08, 2013 at 10:45:14AM +0200, Guido Trotter wrote:
> On Thu, Aug 8, 2013 at 10:33 AM, Michele Tartara <[email protected]> wrote:
> > On Thu, Aug 8, 2013 at 9:01 AM, Guido Trotter <[email protected]> wrote:
> >>
> >> On Wed, Aug 7, 2013 at 9:34 PM, Iustin Pop <[email protected]> wrote:
> >> > On Wed, Aug 07, 2013 at 04:03:38PM +0200, Michele Tartara wrote:
> >> >> On Wed, Aug 7, 2013 at 1:56 PM, Guido Trotter <[email protected]> wrote:
> >> >>
> >> >> > On Wed, Aug 7, 2013 at 9:36 AM, Thomas Thrainer <[email protected]> wrote:
> >> >> > > On Tue, Aug 6, 2013 at 5:56 PM, Michele Tartara <[email protected]> wrote:
> >> >> > >> +``Configuration management daemon (ConfDW)``
> >> >> > >> + It will run on the master node and it will be responsible for the management
> >> >> > >> + of the authoritative copy of the cluster configuration (that is, it will be
> >> >> > >> + the daemon actually modifying the ``config.data`` file). All the requests of
> >> >> > >> + configuration changes will have to pass through this daemon. Having a single
> >> >> > >> + point of configuration management will also allow Ganeti to get rid of
> >> >> > >> + possible race conditions due to concurrent modifications of the configuration.
> >> >> > >> + When the configuration is updated, it will have to push the received changes
> >> >> > >> + to the ConfDR daemons, to keep them up to date.
> >> >> > >> + This daemon will also be the one responsible for managing the locks, granting
> >> >> > >> + them to the jobs requesting them, and taking care of freeing them up if the
> >> >> > >> + jobs holding them crash or are terminated before releasing them.
> >> >> > >
> >> >> > >
> >> >> > > How?
> >> >> > >
> >> >> >
> >> >> > To be detailed. (in this or a separate design, to keep just the split
> >> >> > simpler).
> >> >> > (I believe it should be detailed, but as long as we don't think it's
> >> >> > impossible we can defer the detailing and point from here to a second
> >> >> > design: of course we should have that design too, before
> >> >> > implementing).
> >> >> >
> >> >>
> >> >> I guess checking for the existence of a process with the PID of the
> >> >> lock holder should be enough.
> >> >> I know PIDs are not ensured to be unique, but I think they are unique
> >> >> enough for this not to be a problem.
> >> >> And if we really think this is going to be a problem, we can also check
> >> >> the actual program command line via /proc.
> >> >
> >> > This is still not the best way (I think).
> >> >
> >> > The way this is usually done in Unix is that the forking process "knows"
> >> > its children and receives termination signals (SIGCHLD) when they exit;
> >> > that way, it knows precisely which children are still running and which
> >> > have died.
> >> >
> >> > So if you keep a simple mapping between child PID and job ID, it should
> >> > be fine.
> >> >
> >> > Note that I don't know how well Haskell deals with SIGCHLD and whether
> >> > it's still easily usable or if it's completely hidden by some RunProcess
> >> > abstraction…
> >> >
> >>
> >> Note that if jobs run in separate processes we need to make sure the
> >> way we handle them survives a restart of the job daemon, since they
> >> can be pretty long-run themselves.
> >> As such the SIGCHLD option won't work without further changes. But I
> >> also don't believe in tracking pids and "checking": I think a system
> >> of communication (via filesystem sockets, probably) should be in place
> >> for this to be resilient.
> >> What do you think?
> >>
> >
> > I agree with SIGCHLD not being really usable because we want persistence
> > across LuxiD reboots.
> > But what are you suggesting exactly with "system of communication"?
> > Something like a unique (job-id based maybe?) unix socket for the
> > communication between each LuxiD and each specific job? So that as long as
> > that socket exists we know that either the job replies to pings on that or
> > it is dead?
>
> That's an option, even without pings: if the socket gets closed the
> process is dead.
> Pings would be needed just in case of loops, livelocks and worse
> situations, but we don't handle them now.

+1 to the jobs each having a socket; I don't think the method that Jose
proposes (jobs pinging LuxiD back) is better, I think a simple mapping
inside LuxiD jobid→unix socket is very nice.

> The point is how to handle the reconnection when luxid restarts and
> what do the jobs do in the meantime when they detect a dead luxid.

… fun questions :) I guess you'll want to move to a system where jobs
write the result to a file on disk (agreed before job starts), and jobs
just exit when finishing, even if luxid is dead?

Anyway, I was interested in the overall design, so thanks again for it
and I'll stay out of the discussion :)

iustin
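
For illustration only, here is a rough Python sketch of the "jobid→unix socket" mapping discussed above. It is not Ganeti code: the `JobWatcher` class and its method names are made up for this example, and it assumes each job keeps one connected stream socket open for its whole lifetime. It relies on the kernel closing a process's descriptors when the process exits, so the daemon side sees end-of-file on that job's socket without any pings. It does not address the open question of reconnecting after a luxid restart.

```python
import selectors
import socket


class JobWatcher:
    """Hypothetical daemon-side table: one connected socket per job ID."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()
        self._sockets = {}  # job_id -> daemon-side socket

    def register(self, job_id, sock):
        """Start watching the daemon-side socket of a job."""
        sock.setblocking(False)
        self._sockets[job_id] = sock
        self._selector.register(sock, selectors.EVENT_READ, data=job_id)

    def running_jobs(self):
        """Job IDs whose socket is still open, in ascending order."""
        return sorted(self._sockets)

    def poll_dead(self, timeout=0.1):
        """Return IDs of jobs whose end of the socket has been closed."""
        dead = []
        for key, _events in self._selector.select(timeout):
            job_id = key.data
            try:
                chunk = key.fileobj.recv(4096)
            except BlockingIOError:
                continue  # spurious wake-up, nothing to read
            except ConnectionResetError:
                chunk = b""  # treat a reset like a close
            if chunk == b"":
                # EOF: the peer closed its end, e.g. because the job exited.
                self._selector.unregister(key.fileobj)
                key.fileobj.close()
                del self._sockets[job_id]
                dead.append(job_id)
            # Any non-empty data (such as a ping) is ignored in this sketch.
        return sorted(dead)
```

In a test, `socket.socketpair()` can stand in for a job: register one end with `register()`, close the other end, and `poll_dead()` should then report that job ID, at which point its locks could be released.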
