Hi,

We have 4 file systems on our Lustre cluster. All have changelog users
registered for Robinhood to use.

We have discovered that the changelog user for one of the file systems is not
catching up to the current index. Manual runs of Robinhood fail to read any
more records, even though according to mdd/tools-MDT0000/changelog_users there
are records to read!
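
For anyone wanting to reproduce the check: the backlog is the difference
between the "current index" line and each user's index in changelog_users.
Below is a minimal Python sketch. It assumes the file has the layout we see on
2.8 (a "current index:" line followed by ID/index rows). The SAMPLE text and
its numbers are made up for illustration and are not our real values.

```python
def changelog_backlog(text):
    """Return {user_id: records_behind} parsed from changelog_users text."""
    current = None
    user_index = {}
    for raw in text.splitlines():
        fields = raw.split()
        if raw.strip().startswith("current index:"):
            # Head of the changelog on this MDT.
            current = int(raw.split(":", 1)[1])
        elif len(fields) >= 2 and fields[0].startswith("cl") and fields[1].isdigit():
            # One row per registered user: id (cl1, cl2, ...) then its index.
            user_index[fields[0]] = int(fields[1])
    if current is None:
        raise ValueError("no 'current index:' line found")
    return {uid: current - idx for uid, idx in user_index.items()}


# Hypothetical sample, NOT taken from our system.
SAMPLE = """current index: 1000
ID    index
cl1   400
cl2   1000
"""

if __name__ == "__main__":
    for uid, behind in changelog_backlog(SAMPLE).items():
        print(uid, "is", behind, "records behind")
```

A user whose backlog keeps growing while Robinhood is running is the symptom
described above.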

Over time the changelog filled up and the file system became sluggish.
Wiping the Robinhood MySQL database and reinitializing Robinhood with a full
scan didn't fix the issue and, as I said above, the three other changelogs
from different file systems (on the same MGS) are OK when used from the same
Robinhood instance.

What makes me think this is a Lustre problem (we are using 2.8 on ext4) is
this repeated error we are getting in syslog:

[Wed May 31 14:06:59 2017] Lustre: 46400:0:(llog.c:530:llog_process_thread()) 
invalid length -420090294 in llog record for index 372672342/61708
[Wed May 31 14:06:59 2017] LustreError: 
46400:0:(mdd_device.c:261:llog_changelog_cancel()) tools-MDD0000: cancel idx 
645 of catalog 0x7:10 rc=-22
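
To make the two messages easier to compare across occurrences, here is a rough
Python sketch that pulls out the numeric fields. The regular expressions are
only assumptions based on the two lines quoted above, and the field names in
the returned dict are invented for this example. The rc value is looked up in
the standard errno table (22 is EINVAL on Linux).

```python
import errno
import re

# The two messages quoted above, rejoined onto single lines.
LOG = (
    "Lustre: 46400:0:(llog.c:530:llog_process_thread()) "
    "invalid length -420090294 in llog record for index 372672342/61708\n"
    "LustreError: 46400:0:(mdd_device.c:261:llog_changelog_cancel()) "
    "tools-MDD0000: cancel idx 645 of catalog 0x7:10 rc=-22\n"
)


def summarize(log):
    """Extract the record length, index pair and return code from the log."""
    info = {}
    m = re.search(r"invalid length (-?\d+) in llog record for index (\d+)/(\d+)", log)
    if m:
        length = int(m.group(1))
        info["length"] = length
        # Same value reinterpreted as an unsigned 32-bit pattern.
        info["length_u32"] = length & 0xFFFFFFFF
        info["index"] = (int(m.group(2)), int(m.group(3)))
    m = re.search(r"rc=(-?\d+)", log)
    if m:
        rc = int(m.group(1))
        info["rc"] = rc
        # Kernel return codes are negated errno values.
        info["errno"] = errno.errorcode.get(abs(rc), "unknown")
    return info


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(key, value)
```

A negative record length together with rc=-22 (EINVAL) is what makes us
suspect a damaged llog record rather than a Robinhood problem.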

Deregistering the user from the changelog and registering a new one has not
changed the behaviour: we still can't use the new user to track changes to
the file system.

Can anyone offer any advice on how to resolve this issue in the changelog?
If not, can anyone confirm whether taking the file system down for an
e2fsck/lfsck will fix issues with the changelog? I'd settle for being able to
clear the whole log and start afresh, if that's possible.

Yours
Faye Gibbins
Snr SysAdmin, Unix Lead Architect
Software Systems and Cloud Services
Cirrus Logic | cirrus.com<http://www.cirrus.com/>  | +44 (0) 131 272 7398

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
