On Thu, Aug 08, 2013 at 10:45:14AM +0200, Guido Trotter wrote:
> On Thu, Aug 8, 2013 at 10:33 AM, Michele Tartara <[email protected]> wrote:
> > On Thu, Aug 8, 2013 at 9:01 AM, Guido Trotter <[email protected]> wrote:
> >>
> >> On Wed, Aug 7, 2013 at 9:34 PM, Iustin Pop <[email protected]> wrote:
> >> > On Wed, Aug 07, 2013 at 04:03:38PM +0200, Michele Tartara wrote:
> >> >> On Wed, Aug 7, 2013 at 1:56 PM, Guido Trotter <[email protected]> wrote:
> >> >>
> >> >> > On Wed, Aug 7, 2013 at 9:36 AM, Thomas Thrainer <[email protected]> wrote:
> >> >> > > On Tue, Aug 6, 2013 at 5:56 PM, Michele Tartara <[email protected]> wrote:
> >> >> > >> +``Configuration management daemon (ConfDW)``
> >> >> > >> +  It will run on the master node and it will be responsible for the management
> >> >> > >> +  of the authoritative copy of the cluster configuration (that is, it will be
> >> >> > >> +  the daemon actually modifying the ``config.data`` file). All the requests of
> >> >> > >> +  configuration changes will have to pass through this daemon. Having a single
> >> >> > >> +  point of configuration management will also allow Ganeti to get rid of
> >> >> > >> +  possible race conditions due to concurrent modifications of the configuration.
> >> >> > >> +  When the configuration is updated, it will have to push the received changes
> >> >> > >> +  to the ConfDR daemons, to keep them up to date.
> >> >> > >> +  This daemon will also be the one responsible for managing the locks, granting
> >> >> > >> +  them to the jobs requesting them, and taking care of freeing them up if the
> >> >> > >> +  jobs holding them crash or are terminated before releasing them.
> >> >> > >
> >> >> > >
> >> >> > > How?
> >> >> > >
> >> >> >
> >> >> > To be detailed. (in this or a separate design, to keep just the split
> >> >> > simpler).
> >> >> > (I believe it should be detailed, but as long as we don't think it's
> >> >> > impossible we can defer the detailing and point from here to a second
> >> >> > design: of course we should have that design too, before
> >> >> > implementing).
> >> >> >
> >> >>
> >> >> I guess checking for the existence of a process with the PID of the
> >> >> lock holder should be enough.
> >> >> I know PIDs are not guaranteed to be unique, but I think they are unique
> >> >> enough for this not to be a problem.
> >> >> And if we really think this is going to be a problem, we can also check
> >> >> the actual program command line via /proc.
> >> >
> >> > This is still not the best way (I think).
> >> >
> >> > The way this is usually done in Unix is that the forking process "knows"
> >> > its children and receives termination signals (SIGCHLD) when they exit;
> >> > that way, it knows precisely which children are still running and which
> >> > have died.
> >> >
> >> > So if you keep a simple mapping between child PID and job ID, it should
> >> > be fine.
> >> >
> >> > Note that I don't know how well Haskell deals with SIGCHLD and whether
> >> > it's still easily usable or if it's completely hidden by some RunProcess
> >> > abstraction…
> >> >
> >>
> >> Note that if jobs run in separate processes we need to make sure the
> >> way we handle them survives a restart of the job daemon, since they
> >> can be pretty long-running themselves.
> >> As such the SIGCHLD option won't work without further changes. But I
> >> also don't believe in tracking pids and "checking": I think a system
> >> of communication (via filesystem sockets, probably) should be in place
> >> for this to be resilient.
> >> What do you think?
> >>
> >
> > I agree with SIGCHLD not being really usable because we want persistence
> > across LuxiD reboots.
> > But what are you suggesting exactly with "system of communication"?
> > Something like a unique (job-id based maybe?) unix socket for the
> > communication between each LuxiD and each specific job? So that as long as
> > that socket exists we know that either the job replies to pings on that or
> > it is dead?
> 
> That's an option, even without pings: if the socket gets closed the
> process is dead.
> Pings would be needed just in case of loops, livelocks and worse
> situations, but we don't handle them now.

+1 to the jobs each having a socket; I don't think the method that Jose
proposes (jobs pinging LuxiD back) is better, I think a simple mapping
inside LuxiD jobid→unix socket is very nice.
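
As a rough illustration of the jobid→unix socket mapping, here is a minimal
Python sketch (Python is used only for illustration). The `JobTracker` class
and its method names are hypothetical and not part of Ganeti; the sketch
assumes that a job holds its end of the socket open for its whole lifetime, so
that EOF on the daemon's end means the job process is gone. `socketpair()`
stands in for a real per-job filesystem socket.

```python
import selectors
import socket


class JobTracker:
    """Hypothetical daemon-side registry: job ID -> connected socket."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()
        self._sockets = {}  # job_id -> socket

    def register(self, job_id, sock):
        """Start watching the socket that belongs to job_id."""
        sock.setblocking(False)
        self._sockets[job_id] = sock
        self._selector.register(sock, selectors.EVENT_READ, data=job_id)

    def reap_dead_jobs(self, timeout=0.1):
        """Return the IDs of jobs whose socket was closed by the peer."""
        dead = []
        for key, _events in self._selector.select(timeout):
            job_id = key.data
            try:
                chunk = key.fileobj.recv(4096)
            except BlockingIOError:
                continue
            except ConnectionResetError:
                chunk = b""
            # An empty read means EOF: the job closed its end or died.
            # Any real payload is ignored in this sketch.
            if chunk == b"":
                self._selector.unregister(key.fileobj)
                key.fileobj.close()
                del self._sockets[job_id]
                dead.append(job_id)
        return dead

    def live_jobs(self):
        """Return the IDs of jobs still considered alive."""
        return sorted(self._sockets)


# Simulate two jobs; socketpair() replaces the per-job unix socket here.
tracker = JobTracker()
daemon_end_1, job_end_1 = socket.socketpair()
daemon_end_2, job_end_2 = socket.socketpair()
tracker.register(101, daemon_end_1)
tracker.register(102, daemon_end_2)

job_end_1.close()  # job 101 "dies"; the kernel closes its socket
dead = tracker.reap_dead_jobs()
```

The kernel closes the socket when the process exits for any reason, so no
pings are needed for plain crashes; this is where the daemon could free the
locks held by the jobs listed in `dead`. Hung or livelocked jobs would still
need something extra.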

> The point is how to handle the reconnection when luxid restarts and
> what the jobs do in the meantime when they detect a dead luxid.

… fun questions :) I guess you'll want to move to a system where jobs
write the result to a file on disk (agreed before job starts), and jobs
just exit when finishing, even if luxid is dead?
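
A minimal Python sketch of that idea, under assumptions of my own: the result
path is agreed before the job starts, the result is serialised as JSON, and
the job publishes it with an atomic rename so that a restarted daemon never
sees a half-written file. The function names and the file layout are
hypothetical, not an existing Ganeti interface.

```python
import json
import os
import tempfile


def write_job_result(result_path, result):
    """Job side: atomically publish the result at the agreed path."""
    directory = os.path.dirname(result_path) or "."
    # Write to a temporary file in the same directory, then rename it.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".result-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(result, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, result_path)  # atomic on POSIX filesystems
    except BaseException:
        os.unlink(tmp_path)
        raise


def read_job_result(result_path):
    """Daemon side: return the result, or None if none was published yet."""
    try:
        with open(result_path) as result_file:
            return json.load(result_file)
    except FileNotFoundError:
        return None


# The daemon can be down while the job finishes; it picks up the result
# from disk whenever it comes back.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "job-101.result")
before = read_job_result(path)
write_job_result(path, {"job_id": 101, "status": "success"})
after = read_job_result(path)
```

With this layout the job never has to wait for the daemon: it writes the file
and exits, and a missing file only tells the daemon that the job has not
finished (or died without finishing).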

Anyway, I was interested in the overall design, so thanks again for it
and I'll stay out of the discussion :)

iustin
