Re: [CentOS] SUMMARY : Repair Filesystem prompt , after inode has illegal blocks ; qla2xxx message on reboot

Ross Walker Tue, 15 Sep 2009 06:26:09 -0700

On Sep 14, 2009, at 10:25 PM, "McCulloch, Alan" <alan.mccull...@agresearch.co.nz> wrote:

hi All,

thanks for the responses.

After being dropped into the

# Filesystem repair

prompt,

(on account of “inode 27344909 has illegal blocks”)
following a warm reboot (via “reboot”), after finding the (SAN) filesystem in read-only
mode yesterday morning (possibly because of an HBA fault on the SAN), I ran

fsck -r /data
(Linux version 2.6.18-92.1.18.el5, Red Hat 4.1.2-42, ext3 filesystem)
This took a couple of hours or so, prompting me for various changes,
all of which I accepted. This appeared to complete OK, but then the
system would not boot, with the following error from the qla2xxx driver.
.
.
qla2xxx 0000:05:0d.0: Mailbox command timeout occurred. Scheduling ISP abort.
qla2xxx 0000:05:0d.0: Mailbox command timeout occurred. Scheduling ISP abort.
.
etc
However, after powering down the system and cold-booting, the system was able to boot up and mount the repaired filesystem without any obvious damage, but with abnormal, not to mention scary-looking, boot messages and ongoing warnings from multipath.
This morning (as I sort of expected) the filesystem had dropped back down to read-only mode, but meanwhile the source of our woes was identified: a fibre port on the SAN controller which was degraded but not completely failed, so that there had been no clean failover to the twin controller, and therefore a degraded virtual device was presented to the O/S, with consequences for the filesystem.
After that port and controller were quarantined, this time around I did a cold power-off reboot of the server, and this time there was a more normal-looking boot and the filesystem came up normally without any repair being requested.
(My hypothesis is that in this situation, i.e. the ext3 filesystem has put itself in read-only mode, a warm boot (via reboot) does not cleanly remount the filesystem and apply the journal quite like a cold power-off reboot does. I think it is likely that the lengthy session of me answering “yes” to fsck’s interactive repair, the first time around, simply applied all of the fixes that would automatically have been done from the journal, had I cold-rebooted in the first place. However, that is only a hunch. But I will be making sure to do cold power-off reboots in general, in future.)
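
For anyone wanting to spot the read-only state early rather than stumbling on it, the following is only a minimal sketch. It assumes the kernel's mount table in /proc/mounts shows the "ro" option once ext3 has remounted itself read-only after an error; the function name list_ro_ext3 is made up for illustration and is not part of any tool mentioned above.

```shell
# List the mount points of ext3 filesystems that are currently read-only.
# The mounts table path is a parameter (defaulting to /proc/mounts) so the
# function can also be tried against a sample file.
list_ro_ext3() {
    mounts_file="${1:-/proc/mounts}"
    # Fields per line: device mountpoint fstype options dump pass
    awk '$3 == "ext3" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "$mounts_file"
}
```

Calling list_ro_ext3 with no argument reads /proc/mounts and prints one mount point per line; empty output means no ext3 filesystem is mounted read-only.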
Another lesson is that a sophisticated system of twin SAN controllers with failover does not protect against a situation where a device is degrading rather than failing completely.
Thanks again for the responses, and sorry if my questions were a bit basic, but I have been dropped in a little out of my depth with this system.

I always prefer round-robin mpath versus fail-over if possible, as a degraded or failed path simply is not used; then there is the twice-the-bandwidth factor when both paths are working, which is nice.


-Ross

_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
