> Quoth Cathy Thomas on Fri, Nov 17, 2006 at 12:58:50PM -0800:
> > My service (ipmievd) will not run if the local host does not have
> > a BMC device available. ipmievd itself will return a 1 and give an
> > error message that "no BMC is available".
> ...
> 
> You currently have three options:
> 
>   1. The start method detects that the BMC is unavailable (either by
>      itself or through the daemon) and reports a fatal error to SMF.
>      The service ends up in the maintenance state.
> 
>   2. The start method detects that the BMC is unavailable and disables
>      the service.  The service ends up in the disabled state.
> 
>   3. Even though the daemon detects that there is no BMC device, it
>      stays running, presumably doing nothing.  The start method reports
>      success and the service enters and stays "online".
> 
> Before I recommend one, can you explain what your daemon does?  In
> particular, why the BMC device is required and why anyone would want to
> enable or disable it.

ipmievd listens for IPMI events over the BMC device, and then can syslog
some of the results (e.g. if an IPMI event is posted saying that some
h/w failed).  Aside: this is an opensource thing; the actual Solaris
functionality here should be done using FMA, but that's another story ...

In terms of the daemon, I think the only sensible behavior is (2). (1) doesn't
make sense because there's nothing broken -- i.e. nothing for someone to
repair.  (3) might make sense for some daemons, but not this one, because
fundamentally if there is no BMC it means the platform has none -- this isn't
a DR'able resource, and thus you're just wasting CPU + memory.  Plus, since
ipmievd is open source code, it likely just fails or exits if the BMC is missing.
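
For illustration, here is a rough sketch of what a start method doing (2)
might look like.  It is not the actual ipmievd method: the daemon path and
the plan_start() helper are made up for the example.  The parts that are
real are the SMF_FMRI environment variable that SMF sets for methods, the
exit codes from /lib/svc/share/smf_include.sh, and svcadm disable -t.

```python
import os
import subprocess

# Exit codes as defined in /lib/svc/share/smf_include.sh.
SMF_EXIT_OK = 0
SMF_EXIT_ERR_FATAL = 95

BMC_DEVICE = "/dev/bmc"             # device path mentioned in this thread
DAEMON_CMD = ["/usr/sbin/ipmievd"]  # hypothetical path; real arguments omitted


def plan_start(device_path, fmri):
    """Return the command the start method should run (hypothetical helper)."""
    if os.path.exists(device_path):
        # BMC is present: start the daemon as usual.
        return list(DAEMON_CMD)
    # No BMC on this platform: nothing is broken, so temporarily disable
    # the service instead of reporting a failure to SMF.
    return ["svcadm", "disable", "-t", fmri]


def main():
    fmri = os.environ["SMF_FMRI"]  # set by SMF when it runs a method
    try:
        subprocess.check_call(plan_start(BMC_DEVICE, fmri))
    except (OSError, subprocess.CalledProcessError):
        # The chosen command could not be run or returned non-zero.
        return SMF_EXIT_ERR_FATAL
    return SMF_EXIT_OK
```

The method script would finish with sys.exit(main()).  Note that the method
still exits 0 after disabling itself, which is exactly the awkward part
discussed next.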

That said, I'd really like to see the SMF RFE implemented to provide a
supported exit status for this.  In particular, a service calling disable -t
on itself in these situations is confusing.  Ideally what we want is the
effect of disable or disable -t, but synchronously as part of the method
exiting, and also with some appropriate auxiliary status such that when I
do a svcs -x on it I don't get the same result as I would if I had typed
svcadm disable by hand.  In other words, we don't want to report these things
as "disabled by administrator", we want separate reporting to indicate that
"service disabled itself because it has nothing to do" or somesuch.

One final comment here: although the above mechanism could be used for ipmievd
and definitely is needed for a few other things, it's also worth noting that
ipmievd's needs could be expressed in a different way: namely as a device
dependency.  That is, another potential missing mechanism in SMF which we had
discussed long ago was a dependency on a /dev path, which is really what
ipmievd is trying to express.  This is even more conducive to the kind of 
reporting I mention above, because svcs -x would then be able to produce
a message of the form "disabled because /dev/bmc isn't present on your system".

So while the ipmievd team can work around the absence of either feature now
by doing (2), it would be nice to see one or both of these capabilities added.

-Mike

-- 
Mike Shapiro, Solaris Kernel Development. blogs.sun.com/mws/
