> Date: Tue, 3 Dec 2024 00:10:23 +0100 > From: Manuel Bouyer <bou...@antioche.eu.org> > > On Mon, Dec 02, 2024 at 10:57:40PM -0000, Michael van Elst wrote: > > e...@math.uni-bonn.de (Edgar =?iso-8859-1?B?RnXf?=) writes: > > > > >With a -10 kernel, I experience a >45s delay during boot at > > >[ 4.7185738] wsmouse0 at ums0 mux 0 > > >[ 51.1886454] ipmi0: version 32.0 interface KCS iobase 0xca8/0x8 spacing 4 > > >How can I track down what causes this delay? > > > > The delay comes from the slow "KCS" (keyboard controller) interface. > > > > This is done asynchronously, but netbsd-9 will still wait for all > > autoconfig operations before continuing the boot. In netbsd-10 > > it no longer waits for ipmi to finish attachment (but may do so > > at shutdown if that occurs very early). > > I think it still does, at last I'm seeing it on my servers. > > If I remember properly this was reintroduced so that the watchdog is > available early.
This is PR kern/56568: ipmi.c 1.7 causes large boot delays <https://gnats.netbsd.org/56568>. The code is the same in netbsd-9, netbsd-10, and HEAD; it changed between 9.2 and 9.3. There is currently a conflict on machines with slow ipmi(4) initialization where: (a) If the driver waits for the device to initialize, there's an annoying delay before the system reaches userland. (b) If the driver doesn't wait for device to initialize, it boots too fast for the watchdog to work -- PR kern/56539: wdogctl starts early, but ipmi takes its time <https://gnats.netbsd.org/56539>. I was the one who put in the delay, but it was for a different reason. On PR 56568 I suggested an alternative, which is that: - ipmi_attach or ipmi_thread can publish the watchdog immediately before the device has initialized so it doesn't hold up boot, but - ipmi_watchdog_setmode must wait with cv_wait_sig until the device has initialized before it actually configures the watchdog. That way, you reach userland quickly but wdogctl(8) still works -- with the caveat that (by necessity) it waits until the device has initialized. However, nobody has acted on this suggestion yet. Here's what happened: 1. The ipmi(4) driver shipped in 9.0 had an ipmi_detach function that wouldn't wait for the thread to complete -- and wouldn't even wait for the thread to finish initializing, e.g., sc->sc_wdog before it called sysmon_wdog_unregister to tear it down. ipmi(4) _did_ use a short-lived flag DVF_ATTACH_INPROGRESS _as if_ that would prevent ipmi_detach from running before the thread has finished initializing, except that's not what DVF_ATTACH_INPROGRESS ever did (it only changed the timing of the `WARNING: power management not supported' message, and it's gone now). 2. I fixed these races by using config_pending_incr in ipmi_attach and config_pending_decr in ipmi_thread where it used to set and clear DVF_ATTACH_INPROGRESS. This caused boot to hang indefinitely because the place where ipmi_thread cleared DVF_ATTACH_INPROGRESS was only after its near-infinite loop to process commands, interrupted only when ipmi_detach asks it to stop. So I fixed the indefinite hang at boot by moving config_pending_decr a little earlier in ipmi_thread so that it happens between initialization and the near-infinite loop to process commands. 3. hauke@ noticed that on some machines, ipmi(4) booted _too fast_ in 9.0 for wdogctl(8) to run at boot time: PR kern/56539: wdogctl starts early, but ipmi takes its time https://gnats.netbsd.org/56539 On an amd64 server (Supermicro H8DCL-iF board, AMD 4226 cpu), I enabled the IPMI watchdog. Unfortunately, while the wdogctl rc.d script runs very early during the boot (BEFORE: fsck_root), IPMI takes its time: [ 1.0470319] ipmi0 at mainbus0 [...] [ 7.6848901] init: copying out path `/sbin/init' 11 Tue Dec 7 17:11:51 CET 2021 Starting watchdog timer. wdogctl: WDOGIOC_SMODE: No such process /etc/rc.d/wdogctl exited with code 1 [...] [ 13.0963472] ipmi0: ID 32.1 IPMI 2.0 Available He also noticed that my change had the side effect of holding boot up until ipmi_thread had initialized enough for wdogctl(8) to work, so he pulled that up to netbsd-9: https://releng.netbsd.org/cgi-bin/req-9.cgi?show=1392 4. mlelstv@ noticed that the config_pending_incr/decr changes caused a long delay in boot on some machines: PR kern/56568: ipmi.c 1.7 causes large boot delays https://gnats.netbsd.org/56568 ipmi(4) talks to a separate system (the BMC) when attaching. The communication may be done very slowly, often it is passed through the keyboard controller and can take several ten seconds. The driver used to set the DV_ATTACH_INPROGRESS flag to signal that the attachment hasn't completed when ipmi_attach returns. The flag isn't used anymore by autoconf. Changing this to config_pending_incr/descr has a different effect, it makes the boot process wait for the attachment to finish. This is neither wanted nor necessary, you only have to wait for it before you can detach the driver again. On this system (HP ML110 G4) the attachment takes 40 seconds. The logic is currently the same in netbsd-9, netbsd-10, and HEAD. Here's how it works: 1. ipmi_attach starts ipmi_thread and arranges, with config_pending_incr, to have boot held up until the thread has acknowledged its startup: 2135 static void 2136 ipmi_attach(device_t parent, device_t self, void *aux) 2137 { ... 2154 if (kthread_create(PRI_NONE, KTHREAD_MUSTJOIN, NULL, ipmi_thread, self, 2155 &sc->sc_kthread, "%s", device_xname(self)) != 0) { 2156 aprint_error_dev(self, "unable to create thread, disabled\n"); 2157 } else 2158 config_pending_incr(self); 2159 } https://nxr.netbsd.org/xref/src/sys/dev/ipmi.c?r=1.10#2135 2. ipmi_thread maps the registers and talks to the device with ipmi_get_device_id and get_sdr -- this requires the device to read and respond to commands from the OS, which may involve delays (not sure how long): 1982 static void 1983 ipmi_thread(void *cookie) 1984 { ... 1998 /* Map registers */ 1999 ipmi_map_regs(sc, ia); 2000 2001 memset(&id, 0, sizeof(id)); 2002 if (ipmi_get_device_id(sc, &id)) 2003 aprint_error_dev(self, "Failed to re-query device ID\n"); 2004 2005 /* Scan SDRs, add sensors to list */ 2006 for (rec = 0; rec != 0xFFFF;) 2007 if (get_sdr(sc, rec, &rec)) 2008 break; https://nxr.netbsd.org/xref/src/sys/dev/ipmi.c?r=1.10#1982 3. ipmi_thread prints the message Edgar quoted: 2063 aprint_verbose_dev(self, "version %d.%d interface %s %sbase " 2064 "0x%" PRIx64 "/%#x spacing %d\n", 2065 ia->iaa_if_rev >> 4, ia->iaa_if_rev & 0xF, sc->sc_if->name, 2066 ia->iaa_if_iotype == 'i' ? "io" : "mem", 2067 (uint64_t)ia->iaa_if_iobase, 2068 ia->iaa_if_iospacing * sc->sc_if->nregs, ia->iaa_if_iospacing); https://nxr.netbsd.org/xref/src/sys/dev/ipmi.c?r=1.10#2063 4. ipmi_thread calls config_pending_decr to allow boot to proceed: 2112 config_pending_decr(self); https://nxr.netbsd.org/xref/src/sys/dev/ipmi.c?r=1.10#2112 5. ipmi_thread goes into a loop until impi_detach tells it to stop: 2114 mutex_enter(&sc->sc_poll_mtx); 2115 while (sc->sc_thread_running) { ... 2131 mutex_exit(&sc->sc_poll_mtx); 2132 kthread_exit(0); 2133 } https://nxr.netbsd.org/xref/src/sys/dev/ipmi.c?r=1.10#2114 2161 static int 2162 ipmi_detach(device_t self, int flags) 2163 { 2164 struct ipmi_sensor *i; 2165 int rc; 2166 struct ipmi_softc *sc = device_private(self); 2167 2168 mutex_enter(&sc->sc_poll_mtx); 2169 sc->sc_thread_running = false; 2170 cv_signal(&sc->sc_poll_cv); 2171 mutex_exit(&sc->sc_poll_mtx); 2172 if (sc->sc_kthread) 2173 (void)kthread_join(sc->sc_kthread); https://nxr.netbsd.org/xref/src/sys/dev/ipmi.c?r=1.10#2161