[Linux-HA] Antw: Re: fsck filesystem?

Ulrich Windl Mon, 14 Mar 2011 01:40:21 -0700

>>> Dejan Muhamedagic <[email protected]> wrote on 21.02.2011 at 17:43 in
message <20110221164331.GA3603@squib>:
> Hi,
> 
> On Fri, Feb 18, 2011 at 11:56:49AM -0500, Tony Nelson wrote:
> > Hi All,
> > 
> > I have a small cluster configured like this:
> > 
> > [-------------- config -----------------]
> > root@ihdb2:~# crm configure show
> > node $id="3888bf0f-3e06-4ad8-a2c2-297451128d3d" ihdb1
> > node $id="a1f70384-6684-47e6-ba00-ed082dee7a56" ihdb2
> > primitive bacula-fd lsb:bacula-fd.local \
> >     meta target-role="Started"
> > primitive dbip ocf:heartbeat:IPaddr2 \
> >     params ip="192.168.44.22" nic="eth0" \
> >     op start interval="0" timeout="120s" \
> >     op monitor interval="30s" timeout="20s"
> > primitive fs0 ocf:heartbeat:Filesystem \
> >     params fstype="ext3" directory="/var/lib/postgresql" device="/dev/vg01/postgresql" options="noatime" \
> >     op start interval="0" timeout="60s" \
> >     op stop interval="0" timeout="60s" \
> >     meta target-role="Started"
> > primitive iscsi ocf:heartbeat:iscsi \
> >     params portal="192.168.43.28" target="iqn.2001-05.com.equallogic:0-8a0906-a6bb3d802-25aca117e304cae3-ihdb" \
> >     op start interval="0" timeout="120s" \
> >     op monitor interval="30s" timeout="30s" \
> >     op stop interval="0" timeout="120s" \
> >     meta target-role="Started"
> > primitive psql lsb:postgresql-8.4 \
> >     meta target-role="Started"
> > group psql-group iscsi fs0 dbip bacula-fd psql \
> >     meta target-role="Started"
> > property $id="cib-bootstrap-options" \
> >     dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
> >     cluster-infrastructure="Heartbeat" \
> >     stonith-enabled="false" \
> >     last-lrm-refresh="1291165836" \
> >     no-quorum-policy="ignore"
> > rsc_defaults $id="rsc-options" \
> >     resource-stickiness="100"
> > [ -------------- end config --------------]
> > 
> > This morning the postgres server started logging errors because of
> > corrupted data files.
> > 
> > I stopped all of the services except for the iscsi one and manually
> > mounted the filesystem.  The system said something like "Warning:
> > mounting a filesystem with errors".  Sorry I don't have the exact
> > messages.
> > 
> > I unmounted the filesystem, did a fsck manually then restarted the
> > services.
> > 
> > Is there any way to have heartbeat fsck the filesystem like a normal
> > mount from fstab would?  Did I miss a step?
> 
> No. ext3 is a filesystem with a journal, so it is assumed
> that it can recover without fsck. Otherwise, there's a parameter
> called run_fsck; check the meta data: crm ra info Filesystem.
> 
> BTW, it is very unusual (and suspicious) that the filesystem
> starts having errors just like that, while the system's running.
> You should find what caused the corruption.


On HP-UX with Serviceguard and VxFS (a journaled filesystem) the filesystem is 
checked every time before it is mounted: if it's clean, nothing is done; if not, 
either the journal is replayed or a full structural consistency check is run 
(if severe corruption was detected).
Remember: a node could also go down because of a memory failure (which might 
corrupt the filesystem).

So I think checking a filesystem before mount is a good thing.
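
For reference, the run_fsck parameter Dejan mentions could be set on the fs0
primitive from the configuration above roughly like this. This is only a
sketch: it assumes the installed resource-agents version already provides
run_fsck (crm ra info Filesystem will show whether it does), and the value
"force" and the longer start timeout are illustrative choices, not something
tested on this cluster.

```
# Assumption: the installed Filesystem RA supports run_fsck.
# run_fsck="force" and the 600s start timeout are illustrative values only.
primitive fs0 ocf:heartbeat:Filesystem \
    params fstype="ext3" directory="/var/lib/postgresql" device="/dev/vg01/postgresql" options="noatime" run_fsck="force" \
    op start interval="0" timeout="600s" \
    op stop interval="0" timeout="60s" \
    meta target-role="Started"
```

With "force" the check would run on every start, so the start timeout has to
cover a full fsck of the volume; the original 60s is probably too short for
that.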

Regards,
Ulrich


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
