Hi,
First, please let me apologize for not having the 1.8.10
patch-and-outstanding-issues list completed yet. I had a series of
crises to deal with after I posted my intent to get started on this,
and so was delayed. However, all that is behind me now (knock on wood ;-)
and I'm back to working on the preliminary 1.8.10 full-time. I
expect to have the list of outstanding patches/issues posted to this
mailing list soon for discussion.
I also mentioned some time back that I wanted to add support to
ipmitool for turning off a running watchdog timer. As you may recall,
during the watchdog discussion several months back, we decided that
full watchdog "set" command support is too dangerous, and we
subsequently had some patches that allowed that functionality pulled
from the cvs tree. However, using the "set" command to turn off a
running watchdog timer would be very useful. (Of course, any of the
watchdog set commands can still be sent via the raw command interface.)
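
For anyone who wants to try the raw route, here is a minimal Python sketch that only builds the argument lists for "ipmitool raw" and sends nothing. The network function and command numbers (App 0x06; Reset 0x22, Set 0x24, Get 0x25) come from the IPMI specification. The helper names and the particular "off" payload are my own assumptions for illustration, not what the attached patch does.

```python
# Sketch only: builds argv lists for `ipmitool raw`; nothing is executed.
NETFN_APP = 0x06            # IPMI Application network function
CMD_RESET_WATCHDOG = 0x22   # Reset Watchdog Timer
CMD_SET_WATCHDOG = 0x24     # Set Watchdog Timer
CMD_GET_WATCHDOG = 0x25     # Get Watchdog Timer


def raw_argv(cmd, data=()):
    """Return the argv for `ipmitool raw <netfn> <cmd> [data...]` (hypothetical helper)."""
    return ["ipmitool", "raw"] + [f"0x{b:02x}" for b in (NETFN_APP, cmd, *data)]


def set_watchdog_off_data(timer_use=0x04, countdown_s=300):
    """Request bytes for a Set Watchdog Timer that stops the timer.

    Assumptions: timer use 0x04 (SMS/OS) with the "don't stop" bit (bit 6)
    left clear so the BMC stops the timer, timeout action 0 (no action),
    and a countdown expressed in 100 ms units, LSB first.
    """
    ticks = countdown_s * 10            # 100 ms units
    return [
        timer_use & 0x07,               # byte 1: timer use, bit 6 clear
        0x00,                           # byte 2: no pre-timeout, no action
        0x00,                           # byte 3: pre-timeout interval (s)
        0x00,                           # byte 4: expiration flags to clear
        ticks & 0xFF,                   # byte 5: countdown LSB
        (ticks >> 8) & 0xFF,            # byte 6: countdown MSB
    ]


if __name__ == "__main__":
    print(" ".join(raw_argv(CMD_GET_WATCHDOG)))
    print(" ".join(raw_argv(CMD_SET_WATCHDOG, set_watchdog_off_data())))
```

Please check the byte layout against your BMC's documentation before sending anything like this to real hardware.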
While browsing around the cvs tree, I noticed that a (second) set of
watchdog commands is present under the mc command, having been added
since 1.8.9 was released. This seemed to me to be an ok place to have
some watchdog support, so I've gone ahead and made some modifications
to the existing watchdog commands there. I've expanded the "get" command
results displayed, made "reset" safe (see below), and changed "set" to
do one job only -- turn off a running timer. I've also added man page
support for this new functionality. I am attaching a patch for review
and comments. Any and all (constructive ;-) comments are much
appreciated.
For folks concerned about even having a reset command at all (valid
concerns imo), I want to add a few notes on why I believe the "reset"
I've implemented here is safe. I ran some tests of this reset command
in various scenarios and saw the behavior outlined below. I only ran
these tests on one system and one (2.6.18) kernel, so more extensive
testing should be done to be sure it works ok on other systems before
we roll to
1.8.10. (And I intend to do more testing myself, too.) If issues are
found that we aren't able to fully address, we can ifdef or pull the
reset code.
First, reset run in-band:
1) if the ipmi watchdog driver is not yet started, "reset" starts the
timer but the action is "No action", so when the countdown ends, it
does nothing.
2) if the ipmi watchdog driver is started with start_now=0, it does the
same as #1 -- timer started but action is "No action".
3) if the ipmi watchdog driver is started with start_now=1, "reset"
resets the timer back to 5 minutes (a semi-arbitrary time I chose so it
would be similar to the default time setting on many BMCs). Other
values set by the watchdog driver are unchanged. Note: this is where I
see the most value for the reset command -- it gives more time if/when
occasionally needed.
4) After using "ipmitool mc watchdog off" to turn off a running timer,
the action is set to "No action". A reset command will reset the timer,
but since it retains the former action setting of "No action", nothing
happens when the countdown ends.
5) After sending '-V' to the ipmi watchdog driver for a graceful
shutdown: The watchdog driver doesn't appear to shut off the timer, but
sets the action to "No action" -- the countdown ends and nothing
happens (similar to #1, #2, #4 above). A subsequent "reset" command
will reset the timer, but there will be "No action" when the countdown
ends.
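
The common thread in the five cases above is that "reset" restarts the countdown and leaves the configured action alone. A toy Python model of that invariant is sketched below. It is not the patch code; the class and function names are hypothetical, and "Hard Reset" simply stands in for whatever action the driver may have configured.

```python
from dataclasses import dataclass

RESET_SECONDS = 5 * 60  # the 5-minute value described in case 3


@dataclass
class Watchdog:
    """Toy stand-in for the BMC watchdog state (hypothetical model)."""
    action: str = "No action"
    running: bool = False
    countdown_s: int = 0


def reset(wd: Watchdog) -> Watchdog:
    """Restart the countdown; deliberately never touches wd.action."""
    wd.running = True
    wd.countdown_s = RESET_SECONDS
    return wd


def on_expiry(wd: Watchdog) -> str:
    """What happens when the countdown reaches zero."""
    return wd.action


if __name__ == "__main__":
    # Cases 1, 2, 4, 5: action is "No action", so expiry is harmless.
    idle = reset(Watchdog())
    print(idle.countdown_s, on_expiry(idle))

    # Case 3: driver started with start_now=1 and some real action set.
    armed = reset(Watchdog(action="Hard Reset", running=True, countdown_s=10))
    print(armed.countdown_s, on_expiry(armed))
```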
Second, reset run out-of-band:
1) if the ipmi watchdog driver is not present/started, the behavior of
reset is "No action" so the timer starts and then counts down, but
nothing happens when the countdown completes.
2) if the ipmi watchdog driver is present and start_now=1 is set,
"reset" will restart the countdown but will not change anything else
set by the ipmi watchdog driver (e.g., if the driver's action is to reset
the system, the reset will still be triggered after the new countdown
period ends).
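
To check which of these situations a given box is in before and after a reset, one option is to decode the data bytes returned by the Get Watchdog Timer command (for example, the hex that "ipmitool raw 0x06 0x25" prints). The Python sketch below follows my reading of the IPMI response layout: timer use and running bit in the first byte, timeout action in the second, then the initial and present countdowns in 100 ms units, LSB first. The function name and the sample bytes are made up for illustration.

```python
# Timeout action codes (low 3 bits of the timer actions byte).
ACTIONS = {0: "No action", 1: "Hard Reset", 2: "Power Down", 3: "Power Cycle"}


def decode_get_watchdog(hex_bytes: str) -> dict:
    """Decode the 8 data bytes of a Get Watchdog Timer response.

    Expects space-separated hex without the completion code, which is an
    assumption about how the bytes were captured.
    """
    data = [int(tok, 16) for tok in hex_bytes.split()]
    if len(data) < 8:
        raise ValueError("expected at least 8 data bytes")
    return {
        "running": bool(data[0] & 0x40),            # bit 6: timer started
        "timer_use": data[0] & 0x07,                # e.g. 4 = SMS/OS
        "action": ACTIONS.get(data[1] & 0x07, "Reserved"),
        "initial_s": (data[4] | (data[5] << 8)) / 10,
        "present_s": (data[6] | (data[7] << 8)) / 10,
    }


if __name__ == "__main__":
    # Made-up sample: running, SMS/OS use, hard reset, 300 s initial.
    print(decode_get_watchdog("44 01 00 00 b8 0b a0 0a"))
```

If "action" still shows the driver's setting after a reset and only "present_s" has jumped back up, that matches case 2 above.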
--
Again, all comments, concerns, and feedback are much appreciated.
Thanks,
Carol Hebert
<ipmitool_watchdog.patch>
_______________________________________________
Ipmitool-devel mailing list
Ipmitool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ipmitool-devel