reassign 416098 linux-image-2.6.18-4-amd64
retitle 416098 MegaRAID SAS adapter locks up
quit
* Steinar H. Gunderson
I'm unsure if this is a munin bug or a kernel bug; I'm filing against
munin, but it should perhaps be reassigned.
I installed munin on a Dell PowerEdge 2950:
Setting up munin-node (1.2.5-1) ...
Initializing plugins..
At this point, ssh froze. I connected through the remote admin console,
which was filled with messages like:
megasas: [65] waiting for 140 commands to complete
megasas: [70] waiting for 140 commands to complete
megasas: [75] waiting for 140 commands to complete
megasas: [80] waiting for 140 commands to complete
I googled this error message and it seems a lot of people have seen
lockups with this SAS adapter (without there being any mention of
Munin). Unless you can reproduce this I agree with Stephen that the
kernel (or hardware) is the prime suspect, and that it happened
during a Munin installation was just random chance. Therefore I'm
reassigning the bug.
To repeat the step the init script did at the time of the crash, you
can run the following command:
munin-node-configure --shell --debug | sh -x
That does the same, but with more debug information.
The server never recovered, and I had to (remote) reset it. In other
words, some plugin in munin-node kills the megasas driver, which in turn
kills the entire server.
Well, I'm not aware of any plugins that could possibly do this. The
only plugins I know that are run with elevated privileges and that have
something to do with block devices are smart_ and hddtemp_smartctl, but
they both rely on the helper program "smartctl", so the bug would have
had to be in that program anyway.
* Stephen Gran
Disclaimer: I am not a munin maintainer.
Would you like to be? :-) I really need a comaintainer or for
someone to take over the package completely - way too little free time
these days.
Regards
--
Tore Anderson