Hi,

This is a patch for a problem we encountered trying to use the IPMI 
watchdog on our Dell PowerEdge servers.

We would like to use the watchdog to detect a freeze, lockup, or similar 
failure and reboot the machine within a fixed time limit. Our test case 
is triggering a crash with echo c > /proc/sysrq-trigger.

Unfortunately, while this works on other hardware with different 
watchdog implementations, the IPMI driver overrides the timeout on panic 
with a fixed 255 seconds, which is far too long for our needs. Making 
this timeout configurable would allow the watchdog timer to behave as 
expected on a production system.
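
To make the difference concrete, here is a minimal userspace sketch of the 
panic path before and after the patch. It is not driver code: the struct and 
the model_* helpers are hypothetical names made up for illustration, and only 
the three assignments shown in the patch are modelled.

```c
#include <assert.h>

/* Userspace model of the relevant driver state (all values in seconds). */
struct wdog_model {
	int timeout;             /* configured watchdog timeout */
	int pretimeout;          /* configured pre-timeout */
	int panic_wdt_timeout;   /* new parameter from this patch */
	int panic_event_handled; /* set once the panic path has run */
};

/* Hypothetical helper: panic path before the patch (hard-coded 255 s). */
void model_panic_before(struct wdog_model *w)
{
	if (w->panic_event_handled)
		return;
	w->panic_event_handled = 1;
	w->timeout = 255;
	w->pretimeout = 0;
}

/* Hypothetical helper: panic path after the patch (configurable). */
void model_panic_after(struct wdog_model *w)
{
	if (w->panic_event_handled)
		return;
	w->panic_event_handled = 1;
	w->timeout = w->panic_wdt_timeout;
	w->pretimeout = 0;
}

/* Usage: an admin sets timeout=30, pretimeout=10, panic_wdt_timeout=30. */
void model_example(void)
{
	struct wdog_model before = { 30, 10, 30, 0 };
	struct wdog_model after = { 30, 10, 30, 0 };

	model_panic_before(&before);
	model_panic_after(&after);

	assert(before.timeout == 255); /* old behaviour ignores the setting */
	assert(after.timeout == 30);   /* new behaviour honours it */
	assert(after.pretimeout == 0); /* pre-timeout is disabled on panic */
}
```

With the default of 255 the two helpers behave identically, so existing 
setups that rely on the long timeout for kdump should be unaffected.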

From cb4d4d3ef3f4faff14631da5857ee5df875abae0 Mon Sep 17 00:00:00 2001
From: Jean-Yves Faye <[email protected]>
Date: Tue, 29 Sep 2015 11:39:19 +0200
Subject: [PATCH] ipmi watchdog: add panic_wdt_timeout parameter

In order to allow panic actions to be processed, the ipmi watchdog
driver sets a new timeout value on panic. The 255s timeout
was designed to allow kdump and other actions on panic, as discussed in
http://lkml.iu.edu/hypermail/linux/kernel/0711.3/0258.html

This is counter-intuitive for an end user who sets the watchdog timeout
to something like 30s and who expects the BMC to reset the system
within 30s of a panic.

This commit allows the user to configure the timeout on panic through a
new panic_wdt_timeout module parameter, which defaults to 255s.

Signed-off-by: Jean-Yves Faye <[email protected]>
---
  Documentation/IPMI.txt            | 7 +++++--
  drivers/char/ipmi/ipmi_watchdog.c | 8 +++++++-
  2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/Documentation/IPMI.txt b/Documentation/IPMI.txt
index 31d1d65..c0d8788 100644
--- a/Documentation/IPMI.txt
+++ b/Documentation/IPMI.txt
@@ -587,7 +587,7 @@ used to control it:

    modprobe ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type>
        preaction=<preaction type> preop=<preop type> start_now=x
-      nowayout=x ifnum_to_use=n
+      nowayout=x ifnum_to_use=n panic_wdt_timeout=<t>

  ifnum_to_use specifies which interface the watchdog timer should use.
  The default is -1, which means to pick the first one registered.
@@ -597,7 +597,9 @@ is the amount of seconds before the reset that the pre-timeout panic will
  occur (if pretimeout is zero, then pretimeout will not be enabled).  Note
  that the pretimeout is the time before the final timeout.  So if the
  timeout is 50 seconds and the pretimeout is 10 seconds, then the pretimeout
-will occur in 40 second (10 seconds before the timeout).
+will occur in 40 second (10 seconds before the timeout). The panic_wdt_timeout
+is the value of timeout which is set on kernel panic, in order to let actions
+such as kdump occur during panic.

  The action may be "reset", "power_cycle", or "power_off", and
  specifies what to do when the timer times out, and defaults to
@@ -634,6 +636,7 @@ for configuring the watchdog:
        ipmi_watchdog.preop=<preop type>
        ipmi_watchdog.start_now=x
        ipmi_watchdog.nowayout=x
+       ipmi_watchdog.panic_wdt_timeout=<t>

  The options are the same as the module parameter options.

diff --git a/drivers/char/ipmi/ipmi_watchdog.c b/drivers/char/ipmi/ipmi_watchdog.c
index 0ac3bd1..096f0ce 100644
--- a/drivers/char/ipmi/ipmi_watchdog.c
+++ b/drivers/char/ipmi/ipmi_watchdog.c
@@ -153,6 +153,9 @@ static int timeout = 10;
  /* The pre-timeout is disabled by default. */
  static int pretimeout;

+/* Default timeout to set on panic */
+static int panic_wdt_timeout = 255;
+
  /* Default action is to reset the board on a timeout. */
  static unsigned char action_val = WDOG_TIMEOUT_RESET;

@@ -293,6 +296,9 @@ MODULE_PARM_DESC(timeout, "Timeout value in seconds.");
  module_param(pretimeout, timeout, 0644);
  MODULE_PARM_DESC(pretimeout, "Pretimeout value in seconds.");

+module_param(panic_wdt_timeout, timeout, 0644);
+MODULE_PARM_DESC(panic_wdt_timeout, "Timeout value on kernel panic in seconds.");
+
  module_param_cb(action, &param_ops_str, action_op, 0644);
  MODULE_PARM_DESC(action, "Timeout action. One of: "
                 "reset, none, power_cycle, power_off.");
@@ -1189,7 +1195,7 @@ static int wdog_panic_handler(struct notifier_block *this,
                /* Make sure we do this only once. */
                panic_event_handled = 1;

-               timeout = 255;
+               timeout = panic_wdt_timeout;
                pretimeout = 0;
                panic_halt_ipmi_set_timeout();
        }
-- 
1.8.3.2


------------------------------------------------------------------------------
_______________________________________________
Openipmi-developer mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openipmi-developer
