Re: [OpenAFS-devel] Re: pthreading the bosserver

Derrick Brashear Thu, 08 Aug 2013 13:51:49 -0700

On Thu, Aug 8, 2013 at 4:30 PM, Andrew Deason <adea...@sinenomine.net> wrote:


> On Thu, 8 Aug 2013 15:40:08 -0400 (EDT)
> Benjamin Kaduk <ka...@mit.edu> wrote:
>
> > However, the bosserver is currently using LWP for parallelism, and
> > GSSAPI libraries which are compatible with LWP are hard to come by;
> > the obvious solution is to convert the bosserver to pthreads.
>
> Just mentioning... you can have a pthread process with lwp emulation
> that implements all of the lwp primitives in terms of pthreads. This is
> like having just a giant anti-preemption lock around everything, and I
> thought this already existed somewhere. I don't think that helps with
> signalling, but it can help with locking issues.
>
Code to do this is in Arla; I donated it a jillion years ago, and it's
probably not drop-in for OpenAFS at this point.
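
For anyone who has not seen the "giant anti-preemption lock" idea, here is
a minimal sketch of it. This is not the Arla code and not an OpenAFS API;
every name in it (emu_big_lock, emu_yield, emu_worker, emu_run) is made up
for illustration. Each pthread holds one global mutex while it runs and
gives it up only at an explicit yield, so only one emulated LWP executes
at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* All names here are hypothetical; this is not the Arla or OpenAFS API. */

#define EMU_MAX_THREADS 16

/* The giant anti-preemption lock: whoever holds it is the running "LWP". */
static pthread_mutex_t emu_big_lock = PTHREAD_MUTEX_INITIALIZER;
static int emu_counter;   /* shared state, touched only with the lock held */

/* Cooperative yield: the only point where a peer thread may take over. */
static void emu_yield(void)
{
    pthread_mutex_unlock(&emu_big_lock);
    sched_yield();
    pthread_mutex_lock(&emu_big_lock);
}

static void *emu_worker(void *arg)
{
    int i;

    (void)arg;
    pthread_mutex_lock(&emu_big_lock);      /* become the running "LWP" */
    for (i = 0; i < 1000; i++) {
        /* Plain read-modify-write, the way LWP-era code is written.
         * It is safe because no peer runs until we yield. */
        int tmp = emu_counter;
        emu_counter = tmp + 1;
        if (i % 100 == 99)
            emu_yield();
    }
    pthread_mutex_unlock(&emu_big_lock);
    return NULL;
}

/* Run nthreads emulated LWPs to completion and return the final counter. */
int emu_run(int nthreads)
{
    pthread_t tids[EMU_MAX_THREADS];
    int i, rc, result;

    if (nthreads > EMU_MAX_THREADS)
        nthreads = EMU_MAX_THREADS;

    pthread_mutex_lock(&emu_big_lock);
    emu_counter = 0;
    pthread_mutex_unlock(&emu_big_lock);

    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&tids[i], NULL, emu_worker, NULL);
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++) {
        rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }

    pthread_mutex_lock(&emu_big_lock);
    result = emu_counter;
    pthread_mutex_unlock(&emu_big_lock);
    return result;
}
```

Calling emu_run(4) from a main() linked with -pthread should never lose
an increment, because the counter is only modified with the lock held.
As noted above, this gives consistency rather than parallelism, and it
does nothing for signal handling.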


> Not that that's a good general pthread-ifying solution, but since
> bosserver doesn't need to run very fast, consistency seems more
> important than actual parallelism.
>
> > First off: do we need to keep an LWP version of the bosserver around
> > as well as a pthreaded one?  I don't think so, and I believe Simon
> > agrees, but it would be good to get consensus.
>
> It does not need to stay very long; lwp fileserver has already been
> removed. If you're asking if you can get rid of the LWP bosserver at the
> same time as introducing a pthreaded bosserver, I think that depends on
> how sure you are that it functions correctly. I would vote for a tbozo
> directory, but if the changes are not complex and you're very
> confident, it may not be necessary. But I think it's easier to implement
> a tbozo, and then remove bozo (and move tbozo into it) when it's just as
> good.
>
> > Second, how strong of an integrity guarantee do we need for the bos
> > config?  My understanding is that configuration changes (adding or
> > removing or en/disabling bnodes) are rare events, and it is highly
> > unlikely that multiple administrator connections changing things will
> > be made concurrently.
>
> We can assume they are infrequent, but we must assume that they will
> happen. That is, there needs to be locking, but it doesn't need to be
> very granular. That is, it can be slow, but it cannot cause something to
> break or behave weirdly.
>
> > If this is true, then we can rely on time-domain "locking" for
> > synchronization and eliminate some aspects of code-level locking.  For
> > example, a per-bnode lock acquired before writing any bnode state
> > would not be needed, and a single global lock would be sufficient.
>
> I don't really see how one of these is offering integrity but the other
> isn't, but... A single lock is fine, if I understand this correctly.
> You've never been able to do certain bozo things in parallel, but I
> haven't heard complaining about it. In any case, rxgk is more important
> than improving that.
>
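
To make the single-lock option concrete, here is a rough sketch, assuming
one process-wide mutex held for the full duration of every config change.
The names (sketch_bnode, sketch_config_lock, sketch_bnode_create and so
on) are invented for this example and are not the existing bozo data
structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the real bnode bookkeeping. */
#define SKETCH_MAX_BNODES 32
#define SKETCH_NAME_LEN   32

struct sketch_bnode {
    char name[SKETCH_NAME_LEN];
    int enabled;
};

/* One coarse lock covers every read and write of the config below. */
static pthread_mutex_t sketch_config_lock = PTHREAD_MUTEX_INITIALIZER;
static struct sketch_bnode sketch_bnodes[SKETCH_MAX_BNODES];
static int sketch_nbnodes;

/* Add a bnode; returns 0 on success, -1 on duplicate, bad name, or full table. */
int sketch_bnode_create(const char *name)
{
    int i, rc = -1;

    assert(name != NULL);
    if (strlen(name) >= SKETCH_NAME_LEN)
        return -1;

    pthread_mutex_lock(&sketch_config_lock);
    for (i = 0; i < sketch_nbnodes; i++) {
        if (strcmp(sketch_bnodes[i].name, name) == 0)
            goto out;                       /* already configured */
    }
    if (sketch_nbnodes < SKETCH_MAX_BNODES) {
        struct sketch_bnode *b = &sketch_bnodes[sketch_nbnodes++];
        snprintf(b->name, sizeof(b->name), "%s", name);
        b->enabled = 1;
        rc = 0;
    }
out:
    pthread_mutex_unlock(&sketch_config_lock);
    return rc;
}

/* Enable or disable a bnode; returns 0 on success, -1 if it does not exist. */
int sketch_bnode_set_enabled(const char *name, int enabled)
{
    int i, rc = -1;

    assert(name != NULL);
    pthread_mutex_lock(&sketch_config_lock);
    for (i = 0; i < sketch_nbnodes; i++) {
        if (strcmp(sketch_bnodes[i].name, name) == 0) {
            sketch_bnodes[i].enabled = enabled;
            rc = 0;
            break;
        }
    }
    pthread_mutex_unlock(&sketch_config_lock);
    return rc;
}

/* Number of configured bnodes, read under the same lock. */
int sketch_bnode_count(void)
{
    int n;

    pthread_mutex_lock(&sketch_config_lock);
    n = sketch_nbnodes;
    pthread_mutex_unlock(&sketch_config_lock);
    return n;
}
```

Two administrators running changes at once would simply serialize on
sketch_config_lock. That is slow in the sense discussed above, but no
change can observe a half-updated table.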
> > Relatedly, is it okay to assume that shutdown/restart/etc. will not be
> > issued concurrently with config changes?  A "fully correct"
> > implementation would seem to need to only shutdown/restart the bnodes
> > which were configured when the command was issued, and ignore any new
> > nodes created since then.  Because the implementation of
> > shutdown/restart must drop locks, making this guarantee seems to
> > require additional synchronization effort, whether via a temporary
> > queue to store the bnodes being acted upon, or a higher-level lock.
>
> Are you talking about a 'bos create' racing with a 'bos restart -all'? I
> would think you'd block out all modifications during a restart. While
> the ordering may not matter for 'bos restart -all', it may matter for
> 'bos restart -bosserver', just so it doesn't leave behind a running
> process and then re-exec itself or something.
>
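
One way to block out modifications during a restart, even though the
restart itself has to drop the low-level lock, is a flag guarded by a
mutex and condition variable. This is only a sketch under that
assumption. The gate_* names are hypothetical, and gate_nmods stands in
for whatever real config state a 'bos create' would touch.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical higher-level "restart gate"; not existing bozo code. */
static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cv = PTHREAD_COND_INITIALIZER;
static int gate_restart_active;   /* protected by gate_lock */
static int gate_nmods;            /* stand-in for real config changes */

/* Mark a restart as in progress; waits for any earlier restart to end. */
void gate_begin_restart(void)
{
    pthread_mutex_lock(&gate_lock);
    while (gate_restart_active)
        pthread_cond_wait(&gate_cv, &gate_lock);
    gate_restart_active = 1;
    pthread_mutex_unlock(&gate_lock);
    /* The caller can now stop and start processes without holding
     * gate_lock; modifiers stay out because the flag is set. */
}

/* Clear the flag and wake everyone who was waiting on the gate. */
void gate_end_restart(void)
{
    pthread_mutex_lock(&gate_lock);
    assert(gate_restart_active);
    gate_restart_active = 0;
    pthread_cond_broadcast(&gate_cv);
    pthread_mutex_unlock(&gate_lock);
}

/* Non-blocking config change: 0 if applied, -1 if a restart is running. */
int gate_try_modify(void)
{
    int rc = -1;

    pthread_mutex_lock(&gate_lock);
    if (!gate_restart_active) {
        gate_nmods++;
        rc = 0;
    }
    pthread_mutex_unlock(&gate_lock);
    return rc;
}

/* Blocking config change: waits until no restart is in progress. */
void gate_modify(void)
{
    pthread_mutex_lock(&gate_lock);
    while (gate_restart_active)
        pthread_cond_wait(&gate_cv, &gate_lock);
    gate_nmods++;
    pthread_mutex_unlock(&gate_lock);
}

/* Number of config changes applied so far. */
int gate_mod_count(void)
{
    int n;

    pthread_mutex_lock(&gate_lock);
    n = gate_nmods;
    pthread_mutex_unlock(&gate_lock);
    return n;
}
```

With this shape, a create that arrives mid-restart either fails fast
(gate_try_modify) or waits (gate_modify), so the restart only ever sees
the bnodes that existed when it began. Whether the RPC should block or
return an error is a separate policy question.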
> > I haven't been able to convince myself that the additional complexity
> > of the extra watcher threads is necessary, but if someone else could
> > convince me, that would be good.
>
> My opinion is that we should explicitly drop LINUX24 support on servers
> (or at least tbozo, if we eventually provide both tbozo and bozo).  I
> have never heard of demand for LINUX24 servers, and it's easy to migrate
> off of them. The thing I have heard demand for and is not easy to
> migrate off of is LINUX24 clients, which we could still keep.
>
> I mean, regardless of what solution we end up with, how much testing is
> anyone really going to do for bozo on LINUX24? We're just going to end
> up with something that theoretically works but we're not very confident
> has solved various possible race conditions or whatnot. If we want to
> keep LINUX24 for this, we should at least put a big warning on it that
> mentions something involving the relevant issues.
>
>
That would be my theory. At this point a lot of stuff is inadequately
tested on 2.4.


> That doesn't deal with any signalling specifics, but keep in mind our
> current bozo signal handling is not always great, and does not
> necessarily need to be fixed at the same time. I've always seen bozo
> misidentify core dumps, which I thought was due to this, but I've never
> really cared.
>
> --
> Andrew Deason
> adea...@sinenomine.net
>
> _______________________________________________
> OpenAFS-devel mailing list
> OpenAFS-devel@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-devel
>
>


-- 
Derrick
