On 3/31/2016 10:37 PM, Junxiao Bi wrote:
> On 04/01/2016 11:20 AM, Jay Vasa wrote:
>> On 3/31/2016 6:36 PM, Herbert van den Bergh wrote:
>>> It seems to me that the reason fsck -fn is reporting errors is because
>>> it isn't replaying the journal:
>>>
>>> ** Skipping journal replay because -n was given. There may be spurious
>>> errors that journal replay would fix. **
>>> ** Skipping slot recovery because -n was given. **
>>>
>>> So there are outstanding changes in the journal that need to be made
>>> to the fs, but fsck -fn skips them.  Then later it runs into the
>>> inconsistencies that would have been cleared if the journal was replayed.
>>>
>>> fsck -fy does replay the journal, so it doesn't see the
>>> inconsistencies that were fixed by it.
>>>
>>> When you do the fsck -fn AFTER fsck -fy, does it still say now that it
>>> is skipping journal replay?  If so, I wonder why.  If not, does it
>>> still report the exact same inode / cluster numbers as the previous
>>> time you ran it?  If fsck -fy had to make any changes (including
>>> replaying the journal), run it again, and repeat until it doesn't make
>>> any changes to the filesystem.  This is just to make sure it isn't
>>> leaving some inconsistency unfixed.  So please do:
>>>
>>> umount (on ALL nodes)
>>> fsck -fy
>>> fsck -fy (if the previous fsck made ANY changes including replaying
>>> the journal)
>>> fsck -fn (check if it mentions skipping the journal replay)
>>>
>>> If you still see any errors reported by fsck -fn, are they exactly the
>>> same ones as you've sent earlier?
>>>
>> This is exactly what I did the first time I ran it. I really don't
>> want to have another downtime doing exactly this again.
> So the "corrupted" ocfs2 volume is online now. Does it work well? If
> ocfs2 is really corrupted, I think it will soon fall into a read-only fs
> or panic. If it works well, then maybe fsck.ocfs2 -fn is reporting the
> corruption wrongly.
>
> Thanks,
> Junxiao.
Yes, the "corrupted" ocfs2 volume is working just fine. It has not gone
read-only and has not panicked. I am worried, though, that it will go
read-only at some point in the future. Lately I have been minimizing the
load on it, because I am worried about this happening and there seems to
be no way to fix it.

Thanks,
Jay

>> As you can see, this is exactly what I ran:
>> % umount /dev/drbd2 -- the umount stalled so I rebooted it
>> % fsck -fy /dev/drbd2
>> -- this fixed the journal replay
>> % fsck -fy /dev/drbd2
>> -- this did nothing
>> % fsck -fn /dev/drbd2
>> -- this showed the errors all over again. Yes exactly the same errors.
>>
>> Look at the bottom of this message as that is exactly what I ran, and
>> yes everything was unmounted. This is the only reason why I brought up
>> this issue.
>>
>> If you really want me to do this again, I can, but I don't like bringing
>> down the filesystem for another 6 hours for this. I have already run
>> fsck on this about 20 times.
>>
>> Thanks,
>> Jay
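
For reference, the sequence discussed above (unmount on ALL nodes, repeat
fsck -fy until a pass makes no changes, then run fsck -fn once) could be
scripted roughly as below. This is only a sketch under assumptions: the
names fsck_until_clean, FSCK and MAX_PASSES are hypothetical and were made
up for this example, and it assumes the generic fsck(8) convention that
exit status 0 means "no errors, nothing changed". Whether fsck.ocfs2
reports a journal replay through its exit status is an assumption here, so
the output of each pass should still be read by hand.

```shell
#!/bin/sh
# Sketch only. Run this only after the volume is unmounted on ALL nodes.
# FSCK and MAX_PASSES are hypothetical knobs introduced for this example.
FSCK=${FSCK:-fsck}
MAX_PASSES=${MAX_PASSES:-5}

fsck_until_clean() {
    dev=$1
    pass=1
    while [ "$pass" -le "$MAX_PASSES" ]; do
        rc=0
        "$FSCK" -fy "$dev" || rc=$?
        # Assumption: exit status 0 means this pass found nothing to fix.
        if [ "$rc" -eq 0 ]; then
            break
        fi
        echo "pass $pass: fsck -fy exited with $rc, running it again" >&2
        pass=$((pass + 1))
    done
    if [ "$pass" -gt "$MAX_PASSES" ]; then
        echo "fsck -fy was still not clean after $MAX_PASSES passes" >&2
        return 1
    fi
    # Final read-only pass; check its output for the
    # "Skipping journal replay" message.
    "$FSCK" -fn "$dev"
}

# Example call (device name taken from this thread):
#   fsck_until_clean /dev/drbd2
```

The cap on the number of passes is there so that a volume which never comes
back clean does not loop forever; the value 5 is arbitrary.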

