Hello Dejan,

On 02/21/2011 05:43 PM, Dejan Muhamedagic wrote:
>
> No. ext3 is a filesystem with a journal, so it is considered
> that it can recover without fsck. Otherwise, there's a parameter
> called run_fsck, check the meta data: crm ra info Filesystem.
>

No, not if it writes "Warning: mounting a filesystem with errors". In that case extX has recorded an error either in its superblock or in the journal. We had a long discussion about that on the ext4 list back in October, and in the end upstream e2fsprogs accepted a patch for e2fsck that allows playing back the journal only. After journal playback, a possible error will always be recorded in the superblock, and from there a script can read it using dumpe2fs.

The Filesystem agent should be rewritten to refuse to mount if the superblock has an error. Using the new e2fsck option "-E journal_only" is a bit more tricky, as only the most recent e2fsprogs/e2fsck version has it.

http://kerneltrap.org/mailarchive/linux-ext4/2010/10/22/6885813
http://git.kernel.org/?p=fs/ext2/e2fsprogs.git;a=commit;h=71873b17307993c08b38b97c9551bed231e6048c

Below is what I added to the DDN lustre_server agent:

> # check if the superblock knows about filesystem errors
> # return 0 if not, 1 if errors have been recorded
> check_sb_fs_errors()
> {
>     with_error=`dumpe2fs -h $DEVICE 2>/dev/null | grep "Filesystem state:" | grep "error"`
>     if [ -n "$with_error" ]; then
>         ocf_log err "$DEVICE : $with_error (run e2fsck)"
>         return 1
>     fi
>     return 0
> }

(As I left DDN at the end of November, and as the "e2fsck -E journal_only" option had not yet been accepted upstream at that time, that part is not implemented yet in that RA.)

> BTW, it is very unusual (and suspicious) that the filesystem
> starts having errors just like that, while the system's running.
> You should find what caused the corruption.

Well, extX even records an error in the journal and subsequently in the superblock if an IO error comes up.
Unfortunately, there does not seem to be a single expensive RAID unit out there that does not produce errors. Although I have to admit that FC and IB HBAs and the fabric also play their part in that issue. And of course, no filesystem is free of bugs, which is why extX still suggests frequent fscks.

Cheers,
Bernd
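
A minimal sketch of how the two steps described in this mail (journal-only replay, then the superblock check) might be combined before mounting. The function names sb_state_ok and replay_and_check are hypothetical and not part of any existing agent. It assumes an e2fsprogs recent enough to know "-E journal_only", and that combining it with "-p" to avoid interactive prompts is acceptable. The e2fsck exit status is deliberately ignored here, since the superblock state is inspected afterwards anyway.

```shell
#!/bin/sh
# Sketch only: sb_state_ok and replay_and_check are hypothetical names.

# Read "dumpe2fs -h" output on stdin.
# Return 0 if the "Filesystem state:" line mentions no error, 1 otherwise.
sb_state_ok()
{
    state=$(grep '^Filesystem state:' | sed 's/^Filesystem state:[[:space:]]*//')
    case "$state" in
        *error*) return 1 ;;
    esac
    return 0
}

# Replay the journal only, then refuse (return 1) if the superblock
# has an error recorded. $1 is the block device.
replay_and_check()
{
    dev=$1

    # Assumption: e2fsck supports "-E journal_only"; older versions do not.
    # The exit status is ignored in this sketch.
    e2fsck -p -E journal_only "$dev" >/dev/null 2>&1 || true

    # If the superblock cannot be read at all, do not mount either.
    hdr=$(dumpe2fs -h "$dev" 2>/dev/null) || return 1

    if ! printf '%s\n' "$hdr" | sb_state_ok; then
        echo "$dev: superblock reports errors (run e2fsck)" >&2
        return 1
    fi
    return 0
}
```

An agent's start action could then call something like "replay_and_check $DEVICE || return 1" before attempting the mount, where the exact return code would depend on the agent's conventions.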
