Hello,
According to the error messages, it looks like there is a corrupted plain LLOG 
file among the ChangeLogs of MDT0. Unfortunately, neither e2fsck nor lfsck can 
help to recover from this.
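
As a side note on reading those messages: kernel-side code reports failures as 
negated errno values, so the rc=-22 in the llog_changelog_cancel() line is 
EINVAL (invalid argument), which fits a record whose length field is 
nonsensical. A minimal Python sketch of that decoding, using the second syslog 
line from the report quoted below, joined onto one line:

```python
import errno
import os
import re

# Second syslog line from the report, joined onto one line.
line = ("LustreError: 46400:0:(mdd_device.c:261:llog_changelog_cancel()) "
        "tools-MDD0000: cancel idx 645 of catalog 0x7:10 rc=-22")

# Extract the return code and map its absolute value to an errno name.
match = re.search(r"rc=(-?\d+)", line)
rc = int(match.group(1))
name = errno.errorcode.get(-rc, "unknown")

print(rc, name, os.strerror(-rc))
```
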
I think that to clear this situation you need to stop/unmount this MDT, 
re-mount it as ldiskfs, and move both the changelog_users and changelog_catalog 
files to some alternate place/name (do not remove them!). Then unmount ldiskfs, 
restart/mount your MDT, re-run an RBH full scan, and re-register a ChangeLog 
user.
The only side effect of doing so may be the volume of orphaned plain LLOGs that 
will keep consuming space on the MDT. You should be able to identify them by 
running the llog_reader tool over the saved/renamed old catalog file: it will 
list the references to all of the remaining plain LLOGs, allowing you to find 
and remove them during a new ldiskfs mount session.
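
The steps above can be sketched as a dry-run plan. This is only an 
illustration, not a tested procedure: the device path, the mount points and the 
".corrupt" suffix are hypothetical placeholders, and it assumes the two 
changelog files sit at the top level of the ldiskfs mount. The Python below 
only builds and prints the commands, it does not run anything. The RBH full 
scan is left out on purpose (it would go after the MDT is mounted again), 
since the exact Robinhood invocation depends on your setup.

```python
import shlex


def build_recovery_plan(mdt_device, lustre_mount, ldiskfs_mount, mdt_name,
                        suffix=".corrupt"):
    """Return the commands for the procedure, in order, without running them.

    All arguments are site-specific; the values used below are placeholders.
    """
    catalog = f"{ldiskfs_mount}/changelog_catalog"
    users = f"{ldiskfs_mount}/changelog_users"
    steps = [
        # Stop the MDT, then mount its backing device as plain ldiskfs.
        ["umount", lustre_mount],
        ["mount", "-t", "ldiskfs", mdt_device, ldiskfs_mount],
        # Rename the two files instead of removing them.
        ["mv", catalog, catalog + suffix],
        ["mv", users, users + suffix],
        # List what the saved catalog references, to locate orphans later.
        ["llog_reader", catalog + suffix],
        # Back to normal service.
        ["umount", ldiskfs_mount],
        ["mount", "-t", "lustre", mdt_device, lustre_mount],
        # Register a fresh ChangeLog user on the MDT.
        ["lctl", "--device", mdt_name, "changelog_register"],
    ]
    return [shlex.join(step) for step in steps]


# Placeholder values: replace with the real device and mount points.
plan = build_recovery_plan("/dev/mdt0", "/mnt/mdt", "/mnt/mdt_ldiskfs",
                           "tools-MDT0000")
for number, command in enumerate(plan, start=1):
    print(f"{number}. {command}")
```

Reviewing the printed list before typing anything on the MDS is the point of 
keeping this a dry run.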

Bruno.

On Jun 1, 2017, at 4:09 PM, Gibbins, Faye wrote:

Hi,

We have 4 file systems on our Lustre cluster. All have changelog users 
registered for Robinhood to use.

We have discovered that a changelog user for one of the file systems is not 
catching up to its index. Manual runs of Robinhood fail to read any more 
records, even though according to mdd/tools-MDT0000/changelog_users there are 
records to read!

Over time the changelog had filled and the file system had become sluggish. 
Wiping the Robinhood MySQL database and reinitializing Robinhood with a full 
scan didn’t fix the issue and, as I said above, the three other changelogs from 
different file systems (on the same MGS) are OK when used from the same 
Robinhood instance.

What makes me think this is a Lustre problem (we are using 2.8 on ext4) is 
this (repeated) error we are getting in syslog:

[Wed May 31 14:06:59 2017] Lustre: 46400:0:(llog.c:530:llog_process_thread()) invalid length -420090294 in llog record for index 372672342/61708
[Wed May 31 14:06:59 2017] LustreError: 46400:0:(mdd_device.c:261:llog_changelog_cancel()) tools-MDD0000: cancel idx 645 of catalog 0x7:10 rc=-22

Deregistering the user from the change log and starting with a new one has not 
changed the behaviour and we still can’t use this new user to track changes to 
the file system.

Can anyone offer any advice on how to resolve this issue in the changelog?
If not, can anyone confirm whether taking the file system down for an 
e2fsck/lfsck will fix issues with the changelog? I’d settle for being able to 
clear the whole log and start afresh, if that’s possible.

Yours
Faye Gibbins
Snr SysAdmin, Unix Lead Architect
Software Systems and Cloud Services
Cirrus Logic | cirrus.com<http://www.cirrus.com/>  | +44 (0) 131 272 7398

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
