Jörg Saßmannshausen <[email protected]> wrote:
$ smartctl -i /dev/sda -d megaraid,X
Right.
The issues have been resolved. If anybody is still curious, this is
what happened.
The disappearing files/directories were the result of a script that was
run as root which moved /boot and /bin to an obscure subdirectory
belonging to that user.
The disk errors were a red herring. The system had a Seagate USB disk
plugged into it which I was not aware of. (It was not obvious
because of the rat's nest of cables behind it.) This disk's partition
table was marked bootable - even though there was nothing on that disk
which would have supported a boot. This was the disk that was showing
up as /dev/sdb. When CentOS booted normally it was automatically
mounting this disk, which is why there was no mention of it in
/etc/fstab. However, nothing was using this disk. It looks like at
30-minute intervals the OS "pinged" the device to see if it was still
there, and the enclosure/disk did not fully support whatever command was
being used for this operation, resulting in the sense error messages in
the log files. When the rescue DVD was used it saw this device, created
/dev/sda for it (yes, device names were exchanged in the two
environments) and didn't mount it.
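For anyone who wants to check for a stray device like this, one way to
tell a USB-attached disk from an internal one is to look at where its
sysfs entry points. This is only a minimal sketch, not something that
was run on this system: the function names is_usb_disk and
list_usb_disks and the SYSFS_BLOCK override are made up for
illustration, and it assumes a Linux sysfs layout in which a USB disk
resolves to a path containing a usbN component.

```shell
#!/bin/sh
# Hypothetical helpers, not from the original post.
# SYSFS_BLOCK may be overridden (e.g. for testing); defaults to real sysfs.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

# is_usb_disk NAME: succeeds if NAME (e.g. sdb) resolves to a sysfs path
# that passes through a usbN bus directory.
is_usb_disk() {
    target=$(readlink -f "$SYSFS_BLOCK/$1") || return 1
    case "$target" in
        */usb[0-9]*/*) return 0 ;;
        *) return 1 ;;
    esac
}

# list_usb_disks: print the name of every sd* disk that looks USB attached.
list_usb_disks() {
    for path in "$SYSFS_BLOCK"/sd*; do
        [ -e "$path" ] || continue   # no sd* entries at all
        name=${path##*/}
        if is_usb_disk "$name"; then
            echo "$name"
        fi
    done
}
```

Running list_usb_disks in both the normal boot and the rescue
environment would show which sdX name the USB disk ended up with in
each, since the names are not guaranteed to match.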
Long SMART tests have now been run on each of the internal disks using
smartctl commands like the one above, and all the disks are fine.
megacli also comes up clean. The USB disk is no longer plugged in,
which solved the issue of sense error messages going to
/var/log/messages.
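In case it is useful to someone else, here is a rough sketch of how
the long tests could be started on every disk behind the controller.
It is not the exact procedure used here. The function names, the
device node and the list of MegaRAID device IDs are assumptions (the
IDs would have to be looked up first, for example with megacli), and
the functions only print the smartctl commands so they can be reviewed
before anything is run.

```shell
#!/bin/sh
# Hypothetical helpers, not from the original post. They print (and do
# not run) smartctl commands for disks behind a MegaRAID controller.

# smart_long_cmds DEVICE ID...
# One command per physical disk ID to start a long (extended) self-test.
smart_long_cmds() {
    dev=$1
    shift
    for id in "$@"; do
        echo "smartctl -t long -d megaraid,$id $dev"
    done
}

# smart_result_cmds DEVICE ID...
# Commands to read the self-test log and overall health once tests finish.
smart_result_cmds() {
    dev=$1
    shift
    for id in "$@"; do
        echo "smartctl -l selftest -d megaraid,$id $dev"
        echo "smartctl -H -d megaraid,$id $dev"
    done
}
```

For example, "smart_long_cmds /dev/sda 0 1 2 3" prints four commands,
which can be piped to sh as root once they look right. The long tests
run in the background on the drives, so the smart_result_cmds output
is only worth running after they have had time to finish.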
Thanks for all of the suggestions,
David Mathog
[email protected]
Manager, Sequence Analysis Facility, Biology Division, Caltech
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf