Jörg Saßmannshausen <[email protected]> wrote:

$ smartctl -i /dev/sda -d megaraid,X

Right.

The issues have been resolved. If anybody is still curious, this is what happened.

The disappearing files/directories were the result of a script that was run as root which moved /boot and /bin to an obscure subdirectory belonging to that user.

The disk errors were a red herring. The system had a Seagate USB disk plugged into it which I was not aware of. (It was less than obvious because of the rat's nest of cables behind it.) This disk's partition table was marked bootable, even though there was nothing on the disk which would have supported a boot. This was the disk that was showing up as /dev/sdb.

When CentOS booted normally it was automatically mounting this disk, which is why there was no mention of it in /etc/fstab. However, nothing was using the disk. It looks like at 30-minute intervals the OS "pinged" the device to see if it was still there, and the enclosure/disk did not fully support whatever command was being used for this operation, resulting in the sense error messages in the log files. When the rescue DVD was used it saw this device, created /dev/sda for it (yes, the device names were exchanged between the two environments) and didn't mount it.
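Since the sdX names swapped between the installed system and the rescue DVD, it can help to ask the kernel which block devices sit behind a USB bus rather than trusting the letters. What follows is only a minimal sketch, not what was actually done here. It assumes a Linux sysfs layout in which each /sys/block entry is a symlink whose resolved path contains a usbN component for USB-attached disks. The function name list_usb_disks and its optional directory argument are made up for illustration.

```shell
# List block devices whose resolved sysfs path runs through a USB bus.
# Assumption: /sys/block/<dev> is a symlink into /sys/devices/... and
# USB-attached disks have a ".../usbN/..." component in that path.
# The optional first argument (hypothetical) overrides the directory to scan.
list_usb_disks() {
    sys_block="${1:-/sys/block}"
    for dev in "$sys_block"/*; do
        # Skip the literal pattern if the directory is empty or missing.
        [ -e "$dev" ] || continue
        target=$(readlink -f "$dev")
        case "$target" in
            */usb[0-9]*/*) echo "/dev/$(basename "$dev")" ;;
        esac
    done
}

list_usb_disks
```

Run in both environments, this should name the same physical USB disk regardless of whether it came up as /dev/sda or /dev/sdb.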

Long SMART tests have now been run on each of the internal disks using smartctl commands like the one above, and all the disks are fine. megacli also comes up clean. The USB disk is no longer plugged in, which solved the issue of sense error messages going to /var/log/messages.
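For anyone repeating the long tests, the loop below is a rough sketch of how they might be scripted, not the exact commands that were run. It assumes the MegaRAID device IDs are simply 0 through N-1, which is not guaranteed (the controller reports the real IDs), and that the logical drive is /dev/sda as in the quoted command. The function name smart_long_tests is hypothetical. By default it only prints the commands; pass "run" as the second argument to execute them.

```shell
# Start a long SMART self-test on each disk behind a MegaRAID controller.
# Assumptions: device IDs are 0..N-1 and the logical drive is /dev/sda.
# Usage: smart_long_tests <number_of_disks> [run]
# Without "run" the commands are only printed (dry run).
smart_long_tests() {
    ndisks="$1"
    mode="${2:-dry}"
    id=0
    while [ "$id" -lt "$ndisks" ]; do
        if [ "$mode" = "run" ]; then
            smartctl -t long /dev/sda -d "megaraid,$id"
        else
            echo "smartctl -t long /dev/sda -d megaraid,$id"
        fi
        id=$((id + 1))
    done
}

# Dry run for two disks; prints the two commands without touching hardware.
smart_long_tests 2
```

Once the tests have had time to finish, the self-test log for each disk can be read with smartctl -l selftest /dev/sda -d megaraid,X.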

Thanks for all of the suggestions,

David Mathog
[email protected]
Manager, Sequence Analysis Facility, Biology Division, Caltech
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf