We were running Lustre 2.5.3.90 with changelogs enabled earlier this summer.
We ran into a catalog corruption issue (LU-6556), so we deregistered our
changelog users, moved the CONFIGS/changelog_{catalog,users} files out of the
way, and carried on until we had an opportunity to upgrade.  We did not remove
anything from /O/1/d* at that time (though we probably should have).


We've observed that mounting our MDT can take anywhere from several to many
minutes.  I can see with iostat that the MDT is very busy with reads while it
is being mounted.  I suspect that the stale files in /O/1/d* are the reason
(there are lots of them), as they are processed by the OSP sync at MDT
startup.  I looked at the /O/1/d* directories with debugfs: there are
thousands of files, and their timestamps are consistent with the period when
we were using changelogs.  I dumped a few randomly selected ones and checked
with llog_reader that the records they contain are of type CHANGELOG_REC
(type=10660000).
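
A minimal sketch of how that per-file check could be scripted rather than done
by eye.  It assumes only that the llog_reader text output for a file carries
one token of the form type=<hex> per record, as in the value quoted above; the
function names are hypothetical, and the exact llog_reader output layout has
not been verified here, so treat this as a starting point rather than a tested
tool.

```python
import re
from collections import Counter

# Record type value quoted in the message above for CHANGELOG_REC.
CHANGELOG_REC = "10660000"

# Assumption: each record in the dump carries a token like type=10660000.
TYPE_TOKEN = re.compile(r"\btype=([0-9a-fA-F]+)")


def tally_record_types(dump_text):
    """Count how often each type=<hex> token appears in a text dump."""
    return Counter(t.lower() for t in TYPE_TOKEN.findall(dump_text))


def is_changelog_only(dump_text):
    """True if the dump has at least one record and all are CHANGELOG_REC."""
    types = tally_record_types(dump_text)
    return bool(types) and set(types) == {CHANGELOG_REC}
```

Feeding each file's dump through is_changelog_only() would separate the files
that hold nothing but changelog records from the handful that need a closer
look; an empty or unparseable dump is deliberately reported as not
changelog-only.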


At the least, I think we should remove the files in /O/1/d* that contain
CHANGELOG_REC entries.  Can I just delete every file in /O/1/d*, or do I need
to be careful and only remove the files containing CHANGELOG_REC entries?


The reason I ask is that I do see a handful of files that are not 
changelog-related in these directories - their timestamps are newer and their 
record type as reported by llog_reader is not CHANGELOG_REC or CHANGELOG_USER.  
There are only a small number of such files, though.


Thanks,

Craig Prescott

University of Florida Research Computing
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org