[lustre-discuss] Appropriate Umount Ordering

Ellis Wilson via lustre-discuss Thu, 17 Feb 2022 08:08:09 -0800

Hi all,

Two (hopefully) simple questions this time around.  This is for Lustre 2.14.0, 
and my cluster is set up with no failover for MDTs or OSTs.  OBD timeouts have 
not been altered from the defaults.


Question 1:
        
I read on the Lustre Wiki that the appropriate order in which to umount the 
various components of a Lustre filesystem is:
1. Clients
2. MDT(s)
3. OSTs
4. MGS
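
To make the ordering concrete, here is a minimal dry-run sketch that sorts a 
set of mounts into the order listed above and prints the umount commands 
without running them.  The `plan_umounts` helper, the kind labels, and the 
mount points are all hypothetical and only for illustration.

```python
# Unmount order from the list above: clients, MDT(s), OSTs, MGS.
UMOUNT_ORDER = ["client", "mdt", "ost", "mgs"]


def plan_umounts(mounts):
    """Return umount commands ordered by component kind.

    `mounts` is a list of (kind, mountpoint) pairs. The kinds and mount
    points used below are hypothetical examples, not a real layout.
    """
    rank = {kind: i for i, kind in enumerate(UMOUNT_ORDER)}
    # sorted() is stable, so mounts of the same kind keep their input order.
    ordered = sorted(mounts, key=lambda m: rank[m[0]])
    return [f"umount {mountpoint}" for _, mountpoint in ordered]


if __name__ == "__main__":
    example = [
        ("ost", "/mnt/ost0"),
        ("mgs", "/mnt/mgs"),
        ("client", "/mnt/lustrefs"),
        ("mdt", "/mnt/mdt0"),
        ("ost", "/mnt/ost1"),
    ]
    # Dry run: print the commands instead of executing them.
    for cmd in plan_umounts(example):
        print(cmd)
```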

However, if I do it this way, each OST umount always hangs for 4 minutes and 
25 seconds (04:25) before completing.  Dmesg reports:
[88944.272233] Lustre: 30178:0:(client.c:2282:ptlrpc_expire_one_request()) @@@ 
Request sent has timed out for slow reply: [sent 1645111309/real 1645111309]  
req@00000000cc9c1aeb x1724931853622016/t0(0) 
o39->[email protected]@tcp:12/10 lens 224/224 e 0 to 1 dl 
1645111574 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:''
[88944.275884] Lustre: Failing over lustrefs-OST0000
[88944.429622] Lustre: server umount lustrefs-OST0000 complete

For reference, if I swap the OSTs and the MDT (i.e., umount the MDT after the 
OSTs), then all of the OST umounts are fast, but the MDT takes a whopping 
8 minutes and 50 seconds to umount.

Why does the canonical shutdown ordering stall for so long (and for such a 
specific duration) for me?
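
As a sanity check on how specific these numbers are: in the dmesg line above, 
the deadline (`dl`) falls 265 seconds after the `sent` time, which is exactly 
04:25, and the 8:50 MDT delay is exactly twice that.  The sketch below only 
redoes that arithmetic from the values in the log.  The regexes and helper 
names are just for illustration, and it does not explain why the wait happens.

```python
import re

# Excerpt of the dmesg line quoted above (only the sent time and deadline).
LOG = "[sent 1645111309/real 1645111309] ... dl 1645111574 ref 2"


def timeout_seconds(line):
    """Seconds between the 'sent' time and the 'dl' deadline in the line."""
    sent = int(re.search(r"sent (\d+)/", line).group(1))
    deadline = int(re.search(r"\bdl (\d+)", line).group(1))
    return deadline - sent


def mmss(seconds):
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


ost_delay = timeout_seconds(LOG)
print(ost_delay, mmss(ost_delay))  # 265 04:25
print(mmss(2 * ost_delay))         # 08:50, the observed MDT delay
```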

Question 2:

In all cases (OSTs or MDTs) of umount, whether they are fast or not, I see 
messages like the following in dmesg:
[88944.275884] Lustre: Failing over lustrefs-OST0000
or
[78406.007678] Lustre: Failing over lustrefs-MDT0000

There is no failover configured in my setup.  The MGS is up the entire time in 
all cases.  What is Lustre doing here?  How do I explicitly disable this 
failover attempt, since it seems to be at best misleading and at worst directly 
related to the lengthy delays?  FWIW, I have tried umount with '-f' to make 
the MDT go into failout rather than failover, to no avail.

Thanks in advance for any help folks can offer on this,

ellis
