There is a paragraph on page 194 of the Lustre 1.6 Operations Manual (v17):

While running a client and an OST on the same machine, the following
failures can occur:

   --- If the client contains a dirty file system in memory and memory
       pressure, a kernel thread flushes dirty pages to the file system,
       and it writes to a local OST. To complete the write, the OST needs
       to do an allocation. Then the blocking of allocation occurs while
       waiting for the above kernel thread to complete the write process
       and free up some memory. This is a deadlock condition.

   --- If the node with both a client and OST crashes, then the OST waits
       for the mounted client on that node to recover. However, since the
       client is now in crashed state, the OST considers it to be a new
       client and blocks it from mounting until the recovery completes.

As a result, running OST and client on same machine can cause a double
failure and prevent a complete recovery.
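
To make the first failure mode concrete, here is a minimal sketch of the
circular wait it describes, written as a toy wait-for graph in Python. The
node names (flush_thread, ost_write, memory_allocation) and the find_cycle
helper are my own illustrative labels, not Lustre identifiers, and the model
says nothing about how often the condition is hit in practice.

```python
# Toy wait-for graph: an edge A -> B means "A cannot proceed until B does".
# All names here are hypothetical labels for illustration, not Lustre code.

def find_cycle(wait_for):
    """Return a list of nodes forming a cycle, or None if there is none."""
    visited = set()

    def visit(node, path):
        if node in path:
            # We came back to a node on the current chain: circular wait.
            return path[path.index(node):] + [node]
        if node in visited:
            return None
        visited.add(node)
        for nxt in wait_for.get(node, []):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# Client and OST on the same machine: they share one pool of memory.
same_node = {
    "flush_thread": ["ost_write"],          # flusher waits for the local OST write
    "ost_write": ["memory_allocation"],     # OST must allocate to finish the write
    "memory_allocation": ["flush_thread"],  # allocation waits for the flusher to free pages
}

# Client and OST on different machines: the OST allocates from its own memory.
separate_nodes = {
    "flush_thread": ["ost_write"],
    "ost_write": ["memory_allocation"],
    "memory_allocation": [],
}

print(find_cycle(same_node))
print(find_cycle(separate_nodes))
```

In this sketch the same-machine graph contains a cycle and the
separate-machine graph does not, which is the difference the manual is
pointing at.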


I want to know how frequently this double failure occurs. This is very
important for me.


-- 
Best Regards,
Ruyue Ma