Re: [PATCH stable-2.8] Add daemon split design doc

Guido Trotter Fri, 09 Aug 2013 04:12:40 -0700

On Thu, Aug 8, 2013 at 7:31 PM, Iustin Pop <[email protected]> wrote:
> On Thu, Aug 08, 2013 at 10:45:14AM +0200, Guido Trotter wrote:
>> On Thu, Aug 8, 2013 at 10:33 AM, Michele Tartara <[email protected]> wrote:
>> > On Thu, Aug 8, 2013 at 9:01 AM, Guido Trotter <[email protected]> wrote:
>> >>
>> >> On Wed, Aug 7, 2013 at 9:34 PM, Iustin Pop <[email protected]> wrote:
>> >> > On Wed, Aug 07, 2013 at 04:03:38PM +0200, Michele Tartara wrote:
>> >> >> On Wed, Aug 7, 2013 at 1:56 PM, Guido Trotter <[email protected]>
>> >> >> wrote:
>> >> >>
>> >> >> > On Wed, Aug 7, 2013 at 9:36 AM, Thomas Thrainer <[email protected]>
>> >> >> > wrote:
>> >> >> > > On Tue, Aug 6, 2013 at 5:56 PM, Michele Tartara
>> >> >> > > <[email protected]>
>> >> >> > > wrote:
>> >> >> > >> +``Configuration management daemon (ConfDW)``
>> >> >> > >> +  It will run on the master node and it will be responsible for the management
>> >> >> > >> +  of the authoritative copy of the cluster configuration (that is, it will be
>> >> >> > >> +  the daemon actually modifying the ``config.data`` file). All the requests of
>> >> >> > >> +  configuration changes will have to pass through this daemon. Having a single
>> >> >> > >> +  point of configuration management will also allow Ganeti to get rid of
>> >> >> > >> +  possible race conditions due to concurrent modifications of the configuration.
>> >> >> > >> +  When the configuration is updated, it will have to push the received changes
>> >> >> > >> +  to the ConfDR daemons, to keep them up to date.
>> >> >> > >> +  This daemon will also be the one responsible for managing the locks, granting
>> >> >> > >> +  them to the jobs requesting them, and taking care of freeing them up if the
>> >> >> > >> +  jobs holding them crash or are terminated before releasing them.
>> >> >> > >
>> >> >> > >
>> >> >> > > How?
>> >> >> > >
>> >> >> >
>> >> >> > To be detailed. (in this or a separate design, to keep just the split
>> >> >> > simpler).
>> >> >> > (I believe it should be detailed, but as long as we don't think it's
>> >> >> > impossible we can defer the detailing and point from here to a second
>> >> >> > design: of course we should have that design too, before
>> >> >> > implementing).
>> >> >> >
>> >> >>
>> >> >> I guess checking for the existence of a process with the PID of the
>> >> >> lock holder should be enough.
>> >> >> I know PIDs are not guaranteed to be unique, but I think they are
>> >> >> unique enough for this not to be a problem.
>> >> >> And if we really think this is going to be a problem, we can also
>> >> >> check the actual program command line via /proc.
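
For reference, a minimal sketch of the PID check described above, assuming
Linux and Python. The function names and the `expected_arg` parameter are
hypothetical, not part of the proposed design.

```python
import os


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permissions
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pid_cmdline(pid):
    """Return the argv of a process as read from /proc (Linux only), or None."""
    try:
        with open("/proc/%d/cmdline" % pid, "rb") as f:
            raw = f.read()
    except OSError:
        return None
    # Arguments are separated by NUL bytes.
    return [arg.decode(errors="replace") for arg in raw.split(b"\0") if arg]


def lock_holder_alive(pid, expected_arg):
    """Hypothetical check: the PID exists and its command line mentions expected_arg."""
    if not pid_alive(pid):
        return False
    cmdline = pid_cmdline(pid)
    return cmdline is not None and any(expected_arg in arg for arg in cmdline)
```

The command-line comparison only narrows the window for PID reuse; it does
not close it.
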
>> >> >
>> >> > This is still not the best way (I think).
>> >> >
>> >> > The way this is usually done in Unix is that the forking process "knows"
>> >> > its children and receives termination signals (SIGCHLD) when they exit;
>> >> > that way, it knows precisely which children are still running and which
>> >> > have died.
>> >> >
>> >> > So if you keep a simple mapping between child PID and job ID, it should
>> >> > be fine.
>> >> >
>> >> > Note that I don't know how well Haskell deals with SIGCHLD and whether
>> >> > it's still easily usable or if it's completely hidden by some RunProcess
>> >> > abstraction…
>> >> >
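
A rough sketch of the PID-to-job mapping with SIGCHLD, written in Python
rather than Haskell just to show the shape. All names here are hypothetical.
Exit statuses are stored by PID so that a child exiting before its job ID is
recorded is not lost.

```python
import os
import signal

job_of_pid = {}   # child PID -> job ID, filled in by the parent after fork
exit_status = {}  # child PID -> exit code (negative signal number if killed)


def reap_children(signum=None, frame=None):
    """Collect every child that has exited so far, without blocking."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children at all
        if pid == 0:
            return  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            exit_status[pid] = os.WEXITSTATUS(status)
        else:
            exit_status[pid] = -os.WTERMSIG(status)


def install_handler():
    """Run reap_children whenever the kernel delivers SIGCHLD (main thread only)."""
    signal.signal(signal.SIGCHLD, reap_children)


def start_job(job_id, argv):
    """Fork and exec one job process and remember which job the PID belongs to."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    job_of_pid[pid] = job_id
    return pid


def finished_jobs():
    """Map job ID -> exit code for every job whose process has been reaped."""
    return {job_of_pid[pid]: code
            for pid, code in exit_status.items() if pid in job_of_pid}
```

Both tables live only in the parent's memory, so they are lost if the parent
itself restarts.
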
>> >>
>> >> Note that if jobs run in separate processes we need to make sure the
>> >> way we handle them survives a restart of the job daemon, since they
>> >> can be pretty long-running themselves.
>> >> As such the SIGCHLD option won't work without further changes. But I
>> >> also don't believe in tracking pids and "checking": I think a system
>> >> of communication (via filesystem sockets, probably) should be in place
>> >> for this to be resilient.
>> >> What do you think?
>> >>
>> >
>> > I agree with SIGCHLD not being really usable because we want persistence
>> > across LuxiD reboots.
>> > But what are you suggesting exactly with "system of communication"?
>> > Something like a unique (job-id based maybe?) unix socket for the
>> > communication between each LuxiD and each specific job? So that as long as
>> > that socket exists we know that either the job replies to pings on that or
>> > it is dead?
>>
>> That's an option, even without pings: if the socket gets closed the
>> process is dead.
>> Pings would be needed just in case of loops, livelocks and worse
>> situations, but we don't handle them now.
>
> +1 to the jobs each having a socket; I don't think the method that Jose
> proposes (jobs pinging LuxiD back) is better, I think a simple mapping
> inside LuxiD jobid→unix socket is very nice.
>
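
A minimal sketch of the jobid→socket mapping, assuming Python and one
connected Unix stream socket per job. The class and method names are
hypothetical. End-of-file on a job's socket is taken to mean that the job
process is gone, with no pings involved.

```python
import selectors
import socket


class JobWatcher:
    """Tracks one connected socket per job; EOF means the job process is gone."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.conn_of_job = {}  # job ID -> connected socket

    def register(self, job_id, conn):
        conn.setblocking(False)
        self.conn_of_job[job_id] = conn
        self.selector.register(conn, selectors.EVENT_READ, data=job_id)

    def dead_jobs(self, timeout=0):
        """Return the IDs of jobs whose socket was closed by the peer."""
        dead = []
        for key, _events in self.selector.select(timeout):
            job_id, conn = key.data, key.fileobj
            try:
                chunk = conn.recv(4096)  # any payload is ignored in this sketch
            except BlockingIOError:
                continue
            except ConnectionResetError:
                chunk = b""
            if chunk == b"":  # EOF: the other end closed, or its process exited
                self.selector.unregister(conn)
                conn.close()
                del self.conn_of_job[job_id]
                dead.append(job_id)
        return dead
```

For a quick local check, `socket.socketpair()` can stand in for a job
connecting to a per-job socket path: register one end, close the other, and
`dead_jobs()` reports the job.
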
>> The point is how to handle the reconnection when luxid restarts and
>> what the jobs do in the meantime when they detect a dead luxid.
>
> … fun questions :) I guess you'll want to move to a system where jobs
> write the result to a file on disk (agreed before job starts), and jobs
> just exit when finishing, even if luxid is dead?
>
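
A small sketch of the result-file idea, assuming Python, a JSON payload and a
path agreed before the job starts (all three are assumptions for
illustration). Writing to a temporary file and renaming it means a reader
never sees a half-written result.

```python
import json
import os
import tempfile


def write_result(path, result):
    """Write the job result so that readers see either nothing or the full file."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".result-")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(result, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the data is on disk before the rename
        os.replace(tmp, path)  # atomic on POSIX when both paths share a filesystem
    except BaseException:
        os.unlink(tmp)
        raise


def read_result(path):
    """Return the stored result, or None if the job has not written one yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return None
```

With this, a job can exit right after `write_result` even if luxid is down,
and a restarted luxid can pick the result up later with `read_result`.
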
> Anyway, I was interested in the overall design, so thanks again for it
> and I'll stay out of the discussion :)
>


We welcome feedback on the details as well. I think we can at this
point commit this design (when Michele sends the promised changes) and
then write new ones for the detailed parts of IPC.

Thanks!

Guido
