Hi,

Just a quick note to add to this.

We have already had a situation where an MDS crashed because Robinhood did not process the changelogs fast enough.

This happened to us on an old Lustre installation, where the MDS did not drop any changelogs.

To monitor the changelog indexes and determine the number of pending changelogs (the gap), we are using the Prometheus Lustre exporter (https://github.com/GSI-HPC/lustre_exporter/tree/master).

If you do not want to run a Prometheus monitoring system, you could also use a script we initially used to determine the gaps (https://github.com/GSI-HPC/lustre-scripts/blob/master/bin/lustre_collect_changelog_indexes.py). The script might need an update to how it accesses procfs. In any case, I would recommend using the lctl command to get the required information instead (see the Lustre exporter LCTL source).
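
For illustration, here is a minimal sketch of how the gap can be computed from the lctl output. It is not the code from the exporter or from the script above. It assumes that `lctl get_param mdd.*.changelog_users` prints a "current index" line followed by one line per registered reader, as in the SAMPLE text below; the function names and the sample values are made up for the example, and the exact output layout may differ between Lustre versions.

```python
import re
import subprocess

# Made-up sample in the layout this sketch ASSUMES for
# `lctl get_param mdd.*.changelog_users`; check it against your Lustre version.
SAMPLE = """\
mdd.lustre-MDT0000.changelog_users=
current index: 1500
ID    index (idle seconds)
cl1   1200 (35)
cl2   1500 (2)
"""


def changelog_gaps(text):
    """Return {(mdt, reader_id): pending_records} parsed from lctl-style text."""
    gaps = {}
    mdt = None
    current = None
    for line in text.splitlines():
        line = line.strip()
        # Parameter header, e.g. "mdd.lustre-MDT0000.changelog_users=..."
        header = re.match(r"mdd\.(\S+?)\.changelog_users=(.*)", line)
        if header:
            mdt = header.group(1)
            current = None
            # Some versions put "current index" on the same line as the header.
            line = header.group(2).strip()
        index = re.match(r"current index:\s*(\d+)", line)
        if index:
            current = int(index.group(1))
            continue
        # Reader line, e.g. "cl1   1200 (35)"; the idle time is optional.
        reader = re.match(r"(cl\d+\S*)\s+(\d+)", line)
        if reader and current is not None:
            gaps[(mdt, reader.group(1))] = current - int(reader.group(2))
    return gaps


def read_changelog_users():
    """Hypothetical helper: fetch the text from lctl on an MDS (needs root)."""
    result = subprocess.run(
        ["lctl", "get_param", "mdd.*.changelog_users"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for (mdt, reader), gap in sorted(changelog_gaps(SAMPLE).items()):
        print(f"{mdt} {reader}: {gap} pending changelog records")
```

On an MDS you would pass the result of read_changelog_users() instead of SAMPLE and raise an alert when a gap keeps growing.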

Best
Gabriele



On 27.07.23 19:57, thomasleibovici wrote:

No problem shutting it down for a short time. As long as you have registered a changelog reader, Lustre will retain the logs until Robinhood reads and acknowledges the records. It may become problematic if this situation lasts for a very long time: changelogs will accumulate on the MDT until it runs out of space. In that case it may no longer be possible to create files on the MDT, or the MDS may decide to drop the log records. So it is worth monitoring the number of pending log records on the MDT. The information should be available under /proc or /sys (I don't have the exact path in mind right now).

Regards
Thomas

-------- Original message --------
From: Russell Jones <arjone...@gmail.com>
Date: 27/07/2023 18:04 (GMT+01:00)
To: robinhood-support@lists.sourceforge.net
Subject: [robinhood-support] Historical changelog reading?

Hi all,

Quick question on how Robinhood handles being offline: if changelogs are enabled on the MDTs and the Robinhood service is offline for a short period of time, is Robinhood able to pick up where it left off when it is turned back on and read historical changelogs from the MDTs (is this a thing?)? Or would I need to do another filesystem scan to have it pick up what it missed?


_______________________________________________
robinhood-support mailing list
robinhood-support@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/robinhood-support