Hi all, this morning one of our MDTs went 'unhealthy':
> Jul 26 10:15:13 lxmds20 kernel: LustreError: 9510:0:(service.c:3285:ptlrpc_svcpt_health_check()) mdt: unhealthy - request has been waiting 1017s
...

However, somewhat later,

> lxmds20:~# cat /sys/fs/lustre/health_check
> healthy

and all Lustre operations seem to be working fine, too.

It used to be that if an MDT went unhealthy, the whole Lustre file system was left in a kind of 'undefined' state, and you had to reboot and run a file system check. This is now Lustre 2.10.6: can it heal itself reliably, or should we still take some action?

Regards,
Thomas

--
Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291
GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de
