At 07:55 AM 26/06/2006, Marc G. Fournier wrote:

For the server that I'm fighting with right now, where Dmitry pointed out that it looks like a deadlock issue ... I have dumpdev/savecore enabled, is there some way of forcing it to panic when I know I actually have the deadlock, so that it will dump a core?

DDB is a difficult option, since a keyboard isn't always attached to the server when it boots ...

These are ugly quick hacks, but they might work for you. If the network still continues to function, you might be able to hack up a quick script to force a panic. Hack up some kld (e.g. ichwd) with something like:

# diff -u /usr/src/sys/dev/ichwd/ichwd.c.orig /usr/src/sys/dev/ichwd/ichwd.c
--- /usr/src/sys/dev/ichwd/ichwd.c.orig Mon Jun 26 09:50:33 2006
+++ /usr/src/sys/dev/ichwd/ichwd.c      Mon Jun 26 09:51:04 2006
@@ -225,6 +225,7 @@
        device_t ich = NULL;
        device_t dev;

+       panic("I played panicky idiot no 3 on the Poseidon Adventure");
        /* look for an ICH LPC interface bridge */
        for (id = ichwd_devices; id->desc != NULL; ++id)
                if ((ich = pci_find_device(id->vendor, id->device)) != NULL)


Then run a script something like the one below. Set target to an IP that you control and that is always up. When you think your box has deadlocked, add a firewall rule on the target machine to block ICMP echoes from the problem machine. Once more than max_tries consecutive pings have failed, the script loads the hacked module, which panics the box. You might need to fiddle with max_tries to make it more aggressive. If the target machine is on the local LAN you can make it a nice low value like 2 or 3. Ideally, you would want to make a kld that does the test for you, or you could perhaps hack up the software watchdog to call a panic for you. I don't know if that works or not, as I have only used hardware watchdogs.


#!/bin/sh
timeout=5
no_resp_sleep=10
max_tries=25
normal_sleep=300
con_cnt=0
no_resp=0   # consecutive failed pings; initialised so the first test never sees an empty value
target=1.1.1.1


while true; do
    strings /boot/kernel/ichwd.ko > /dev/null # try and make sure these binaries are cached
    strings /sbin/kldload > /dev/null # try and make sure these binaries are cached
    if /sbin/ping -c1 -t$timeout $target > /dev/null 2>&1; then
        no_resp=0
    else
        no_resp=$(($no_resp + 1))
    fi
    if [ $no_resp -gt $max_tries ]; then
            /sbin/kldload ichwd
    fi
    if [ $no_resp -gt 0 ]; then
        sleep $no_resp_sleep
    else
        sleep $normal_sleep
        if [ $con_cnt -lt 25 ]; then
            con_cnt=$(($con_cnt + 1))
        fi
    fi
done &
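
Before trusting the threshold on a live box, the counting logic can be exercised without pinging anything or loading the module. What follows is only a minimal dry-run sketch, not part of the script above: probe and trigger are hypothetical stand-ins for /sbin/ping and /sbin/kldload ichwd, and max_tries=3 and the five-iteration cap are arbitrary test values.

```shell
#!/bin/sh
# Dry-run sketch of the failure counter (assumed test harness, not the real script).
max_tries=3
no_resp=0

# Hypothetical stand-in for "/sbin/ping -c1 -t$timeout $target"; always fails here.
probe() { return 1; }

# Hypothetical stand-in for "/sbin/kldload ichwd"; only reports what would happen.
trigger() { echo "would run: /sbin/kldload ichwd"; }

i=0
while [ "$i" -lt 5 ]; do
    if probe; then
        no_resp=0                    # any successful ping resets the counter
    else
        no_resp=$((no_resp + 1))     # count consecutive failures
    fi
    if [ "$no_resp" -gt "$max_tries" ]; then
        trigger
        break
    fi
    i=$((i + 1))
done
echo "stopped after $no_resp consecutive failures"
```

Because the test is -gt rather than -ge, the module is loaded on failure number max_tries + 1, so with max_tries=25 and no_resp_sleep=10 the real script waits roughly 26 failed pings (each up to timeout seconds, plus the sleeps) before it forces the panic.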


---Mike
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
