Good idea. Unfortunately, there is no way to check the SMART status of disks behind a Smart Array controller. I tried this before without success, and found information on the web that this controller model is not SMART capable. However, HP provides a tool to check the array. An HP technician has already replaced both controllers in the MSA 500, one controller in one of the machines, and the enclosure, all without success. The disks do not show any defect.
Additionally, an HP Linux expert (who didn't know Gentoo...) gave a hint about possible timing problems with SMP kernels (I use an SMP kernel to take advantage of the Hyper-Threading CPU). I will also try disabling the HPET timer in the kernel, again to rule out timing problems.

Thanks for your suggestions!

Kind regards,

Matthias Witschel
Infraserv GmbH & Co Höchst KG
SC-IT, SAP-Basis
Industriepark Höchst, Gebäude D710
65926 Frankfurt am Main
Tel: 069 - 305 84235
Fax: 069 - 305 23549

-----Original Message-----
From: Renzo Rosales [mailto:[EMAIL PROTECTED]
Sent: Thursday, February 10, 2005 15:51
To: [EMAIL PROTECTED]
Subject: Re: [gentoo-server] Problems with HP MSA 500 Storage

Perhaps one of the drives is bad and the controller is unable to read from or write to the disk, which is why it's timing out. Do you have a utility to check the array, or a SMART program to check the health of the drives?

On Thu, 10 Feb 2005 10:18:28 +0100, Witschel, Matthias, Infraserv-Hoechst/DE <[EMAIL PROTECTED]> wrote:
> Hello everybody!
>
> Short info on the architecture in use:
>
> I have a setup of two HP DL380 with a Smart Array 5i controller for the internal
> disks (RAID 1 for the root disks, one machine with an additional RAID 5 for a local
> database). Both machines are attached to an HP MSA 500 storage device via a
> Smart Array 532 controller. The machines form a high-availability cluster for
> an Oracle database. The kernel in use is 2.6.10-gentoo-r6. This setup is
> suggested by HP for use in HA clusters. The device has a single LUN, which is
> used as an LVM2 device via device-mapper. The FS is ext2.
>
> For a few days now, write access to the MSA 500 has been stalling. Afterwards, every
> access to that device stalls too. The machine refuses to sync and will not
> reboot without pressing the power button.
>
> Please help!
> _Any_ hint is welcome, including input on working environments with a similar
> setup or similar problems.
> Any hint on possible problems with the kernel in use or the host bus adapters?
>
> Kind regards,
>
> Matthias Witschel
>
> PS: Here are the relevant messages from /var/log/messages:
> (earlier tests included EXT2 error messages ahead of the timeout; this didn't
> happen after running fsck on the device)
>
> Feb 9 16:56:16 telkas1 cciss: cmd f7d80000 timedout
> Feb 9 16:56:16 telkas1 Buffer I/O error on device dm-6, logical block 60998
> Feb 9 16:56:16 telkas1 lost page write due to I/O error on dm-6
> Feb 9 16:56:16 telkas1 cciss: cmd f7d80248 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d80490 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d806d8 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d80920 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d80b68 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d80db0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d80ff8 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d81240 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d81488 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d816d0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d81918 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d81b60 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d81da8 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d81ff0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d82238 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d82480 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d826c8 timedout
> Feb 9 16:56:16 telkas1 Buffer I/O error on device dm-6, logical block 62023
> Feb 9 16:56:16 telkas1 lost page write due to I/O error on dm-6
> Feb 9 16:56:16 telkas1 cciss: cmd f7d82910 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d82b58 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d82da0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d82fe8 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d83230 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d83478 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d836c0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d83908 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d83b50 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d83d98 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d83fe0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d84228 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d84470 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d846b8 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d84900 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d84b48 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d84d90 timedout
> Feb 9 16:56:16 telkas1 Buffer I/O error on device dm-6, logical block 63048
> Feb 9 16:56:16 telkas1 lost page write due to I/O error on dm-6
> Feb 9 16:56:16 telkas1 cciss: cmd f7d84fd8 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d85220 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d85468 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d856b0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d858f8 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d85b40 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d85d88 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d85fd0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d86218 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d86460 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d866a8 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d868f0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d86b38 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d86d80 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d86fc8 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d87210 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d87458 timedout
> Feb 9 16:56:16 telkas1 Buffer I/O error on device dm-6, logical block 64073
> Feb 9 16:56:16 telkas1 lost page write due to I/O error on dm-6
> Feb 9 16:56:16 telkas1 cciss: cmd f7d876a0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d878e8 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d87b30 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d87d78 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d87fc0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d88208 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d88450 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d88698 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d888e0 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d88b28 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d88d70 timedout
> Feb 9 16:56:16 telkas1 cciss: cmd f7d88fb8 timedout
>
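
Since the quoted log is mostly the same message repeated, it may help to condense it before comparing runs. The following is only a minimal sketch in Python, assuming the message wording stays exactly as in the excerpt quoted above ("cciss: cmd ... timedout" and "Buffer I/O error on device ..., logical block ..."). The function name summarize_cciss_log is made up for illustration and is not part of any HP or kernel tool.

```python
import re

# Patterns modeled on the kernel messages quoted above (assumed format).
TIMEOUT_RE = re.compile(r"cciss: cmd ([0-9a-f]+) timedout")
IO_ERROR_RE = re.compile(r"Buffer I/O error on device (\S+), logical block (\d+)")


def summarize_cciss_log(lines):
    """Return (timeout_count, {device: [logical blocks with I/O errors]}).

    summarize_cciss_log is a hypothetical helper name used for this sketch.
    """
    timeouts = 0
    blocks_by_device = {}
    for line in lines:
        # Quoted mail may carry several log entries on one physical line,
        # so every match on the line is counted, not just the first.
        timeouts += sum(1 for _ in TIMEOUT_RE.finditer(line))
        for match in IO_ERROR_RE.finditer(line):
            device = match.group(1)
            block = int(match.group(2))
            blocks_by_device.setdefault(device, []).append(block)
    return timeouts, blocks_by_device
```

The function accepts any iterable of strings, so an open file handle on /var/log/messages can be passed in directly. It only counts and groups messages; it does not say anything about why the commands time out.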
