Today one of my systems crashed due to SCSI disk errors.
When it came back up it crashed again, so I had to go to the site (a 2 hour drive!) and take a look at it. It seems that when the system tries to run e2fsck on the raid set, the system crashes with the message:

Jul 14 13:36:30 galactica kernel: VFS: grow_buffers: size = 16384
Jul 14 13:36:31 galactica last message repeated 665 times
Jul 14 13:36:31 galactica kernel: 384
Jul 14 13:36:31 galactica kernel: VFS: grow_buffers: size = 16384
Jul 14 13:36:31 galactica last message repeated 814 times
Jul 14 13:36:31 galactica kernel: 384
Jul 14 13:36:31 galactica kernel: VFS: grow_buffers: size = 16384
Jul 14 13:36:31 galactica last message repeated 468 times

It goes on and on and on (the people on site left it in that state for a good hour and a half until I got there).

This did not happen with 2.2.15. This is also the first time I have run e2fsck on the raid drive since upgrading to 2.2.16. As a result I have been forced to mount the raid array unclean just so I could get the system up and running and the rest of the staff at the site could go home (it's co-located). I didn't think to reboot to 2.2.15 and run e2fsck on it before I left; I was in a real rush.

System configuration:
Asus P2L97-DS (this board does not have scsi despite the "DS" model code)
Dual P2-233
128MB PC100 SDRAM x 3 (384MB total)
Matrox G100 AGP Video
Adaptec AHA 2940UW PCI
3Com 3C905C w/drivers from www.3com.com
Generic ATAPI 32x CDROM

Root device: IBM DDRS-34560D 4.3GB
Raid system: IBM-DNES-30917OW 9.1GB (x2) Software raid mode 1
Additional Storage: QUANTUM VIKING 4.5WSE 4.5GB (<- the cause of the initial crash)

Linux Distribution: Debian GNU/Linux 2.1r4
Installed: Mid-March
Installed with kernel: 2.2.14+ow1 (www.openwall.com/linux)
Currently running: 2.2.16+ow1
Raid version: 0.36.4

This happened again and again, and I finally traced it down to this by disabling automounting of /dev/md0 in /etc/fstab, after which the system booted fine.
ckraid was run multiple times with no crash (it ran every time the system crashed due to other reasons). Once I tried to run e2fsck on /dev/md0, the errors (above) flooded the screen until I hit ctrl-alt-del.

I'm not a kernel hacker, but I was curious whether this is a known issue? I do read Kernel Traffic every week, but I haven't seen anything specifically related to raid mentioned (that I can remember).

I think what I will do is download the data off the raid set and reformat it so it can be "clean" again, or reboot to 2.2.15 and run e2fsck on it.

I can try to provide more info if needed, let me know. I just don't want to have to drive up there again if I can avoid it :)

help!

nate

:::
http://www.aphroland.org/
http://www.linuxpowered.net/
[EMAIL PROTECTED]
3:24pm up 1:30, 1 user, load average: 0.00, 0.01, 0.00