Hello!

On May 3, 2010, at 11:49 AM, Thomas Roth wrote:
> We found a user job submission script that probably caused all this by
> starting
> - several hundred (900) jobs simultaneously
> - all of them opening one and the same file for batch system errors and
> one and the same file for its output.

You should probably keep an eye on developments in bug 20373, which should
help avert this kind of problem for the use case you describe.
The existing "good" patch there should help somewhat, and the other patch
under development will help some more once it's completed.
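
Independent of those patches, giving each job its own output and error file on
the submission side should reduce the pile-up on a single file. A minimal
sketch in Python, assuming the submission script can pass a per-job ID when it
builds the file names; `per_job_paths`, the `logs` directory and the `myjob`
name are all hypothetical and not taken from the user's script or from any
batch system's API:

```python
import os


def per_job_paths(log_dir, job_name, job_id):
    """Return (stdout_path, stderr_path) that are unique to one job.

    Hypothetical helper: the naming scheme is an assumption, chosen only
    so that no two jobs share an output or error file.
    """
    base = os.path.join(log_dir, f"{job_name}.{job_id}")
    return base + ".out", base + ".err"


if __name__ == "__main__":
    # 900 jobs, as in the report; each one gets its own pair of files.
    paths = [per_job_paths("logs", "myjob", i) for i in range(900)]
    print(paths[0])
    # Count distinct file names across all jobs (two per job).
    print(len({p for pair in paths for p in pair}))
```

Most batch systems can also expand a job ID inside the output file name
themselves, which would make a helper like this unnecessary.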

> Still I'd like to learn more about "operation X on unconnected MDS", on
> the net I only found my own question from two years ago.

This means the MDS got a request X from a client that it believes is no longer
connected to it (because the client was evicted, I guess).

Bye,
    Oleg