>>> Dejan Muhamedagic <[email protected]> wrote on 21.02.2011 at 17:43 in message <20110221164331.GA3603@squib>:
> Hi,
>
> On Fri, Feb 18, 2011 at 11:56:49AM -0500, Tony Nelson wrote:
> > Hi All,
> >
> > I have a small cluster configured like this:
> >
> > [-------------- config -----------------]
> > root@ihdb2:~# crm configure show
> > node $id="3888bf0f-3e06-4ad8-a2c2-297451128d3d" ihdb1
> > node $id="a1f70384-6684-47e6-ba00-ed082dee7a56" ihdb2
> > primitive bacula-fd lsb:bacula-fd.local \
> >         meta target-role="Started"
> > primitive dbip ocf:heartbeat:IPaddr2 \
> >         params ip="192.168.44.22" nic="eth0" \
> >         op start interval="0" timeout="120s" \
> >         op monitor interval="30s" timeout="20s"
> > primitive fs0 ocf:heartbeat:Filesystem \
> >         params fstype="ext3" directory="/var/lib/postgresql" device="/dev/vg01/postgresql" options="noatime" \
> >         op start interval="0" timeout="60s" \
> >         op stop interval="0" timeout="60s" \
> >         meta target-role="Started"
> > primitive iscsi ocf:heartbeat:iscsi \
> >         params portal="192.168.43.28" target="iqn.2001-05.com.equallogic:0-8a0906-a6bb3d802-25aca117e304cae3-ihdb" \
> >         op start interval="0" timeout="120s" \
> >         op monitor interval="30s" timeout="30s" \
> >         op stop interval="0" timeout="120s" \
> >         meta target-role="Started"
> > primitive psql lsb:postgresql-8.4 \
> >         meta target-role="Started"
> > group psql-group iscsi fs0 dbip bacula-fd psql \
> >         meta target-role="Started"
> > property $id="cib-bootstrap-options" \
> >         dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
> >         cluster-infrastructure="Heartbeat" \
> >         stonith-enabled="false" \
> >         last-lrm-refresh="1291165836" \
> >         no-quorum-policy="ignore"
> > rsc_defaults $id="rsc-options" \
> >         resource-stickiness="100"
> > [ -------------- end config --------------]
> >
> > This morning the postgres server started logging errors because of
> > corrupted data files.
> >
> > I stopped all of the services except for the iscsi one and manually mounted
> > the filesystem. The system said something like "Warning: mounting a
> > filesystem with errors". Sorry I don't have the exact messages.
> >
> > I unmounted the filesystem, did a fsck manually then restarted the
> > services.
> >
> > Is there any way to have heartbeat fsck the filesystem like a normal mount
> > from fstab would? Did I miss a step?
>
> No. ext3 is a filesystem with a journal, so it is considered
> that it can recover without fsck. Otherwise, there's a parameter
> called run_fsck, check the meta data: crm ra info Filesystem.
>
> BTW, it is very unusual (and suspicious) that the filesystem
> starts having errors just like that, while the system's running.
> You should find what caused the corruption.

On HP-UX with Serviceguard and VxFS (a journaled filesystem), the
filesystem is checked every time before it is mounted: if it is clean,
nothing is done; if not, either the journal is replayed or a full
structural consistency check is run (if severe corruption was detected).

Remember: a node could also go down because of a memory failure (which
might corrupt the filesystem).

So I think checking a filesystem before mount is a good thing.

Regards,
Ulrich
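
For reference, the run_fsck parameter mentioned above could be added to
the fs0 primitive from Tony's configuration roughly as sketched here. This
is only a sketch under assumptions: it presumes the installed Filesystem
resource agent actually offers run_fsck (not every version does) and that
"force" is an accepted value; both should be confirmed with
"crm ra info Filesystem" before use. Everything except the run_fsck line
is copied from the configuration quoted above.

```
# Assumption: the installed ocf:heartbeat:Filesystem agent supports
# run_fsck and accepts "force"; verify with: crm ra info Filesystem
primitive fs0 ocf:heartbeat:Filesystem \
        params fstype="ext3" directory="/var/lib/postgresql" \
                device="/dev/vg01/postgresql" options="noatime" \
                run_fsck="force" \
        op start interval="0" timeout="60s" \
        op stop interval="0" timeout="60s" \
        meta target-role="Started"
```

If a check does run during start, it may take longer than the 60s start
timeout shown here on a large or damaged filesystem, so that timeout would
probably need to be raised as well.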
