On Sep 06, 2007 21:15 +0800, Ruyue Ma wrote:
> While running a client and an OST on the same machine, the following
> failures can occur:
> --- If the client holds dirty file system pages in memory and the node
> comes under memory pressure, a kernel thread flushes the dirty pages to
> the file system, which writes to the local OST. To complete the write,
> the OST needs to allocate memory. That allocation then blocks while
> waiting for the same kernel thread to complete the write and free up
> some memory. This is a deadlock condition.

This depends on load. We do a lot of simple testing with client-on-OST,
and it works, but it does very occasionally hang if the application is
dirtying a lot of data.

> ---- If the node with both a client and an OST crashes, then the OST
> waits for the mounted client on that node to recover. However, since the
> client has now crashed, the OST considers it to be a new client and
> blocks it from mounting until the recovery completes.
>
> As a result, running an OST and a client on the same machine can cause a
> double failure and prevent a complete recovery.

This will prevent recovery every time a client/OST node crashes. If you
don't care about recovery (e.g. the application was also running on that
client, so no recovery is possible in any case) then you can live with
this.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
