Hello!
On Fri, Dec 08, 2006 at 11:59:41AM -0700, Martin Pokorny wrote:
> I'm trying to determine whether multiple processes on multiple nodes can
> simultaneously mmap a common file on a lustre file system, write to it,
> and produce a coherent result (I'm using OpenMPI to spawn the processes
> and provide synchronization barriers). In my tests, each process is
> writing a 10,000 byte segment of the file, but is memory mapping the
Are you writing with the write(2) system call, or storing into the mapping?
What is the striping pattern?
> whole file. What I'm seeing is that if I use 40 processes or less, the
> file is (usually) produced correctly. However, when I try my test with
> 50 or 100 processes, I rarely get a good result; in fact, the tests seem
> to hang. What I've found is that, when the test fails, there are
> processes remaining on the lustre client nodes that are using up all the
> CPU, but never seem to finish. I have no trouble interrupting the
> running processes in this case.
Can you obtain traces for these processes (use sysrq-t and sysrq-p)?
Also, I wonder if you can retry your testing with Lustre 1.4.8 and see if it
behaves any differently.
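
To make the write(2)-versus-mapping question concrete, here is a rough
single-node sketch of the two write paths. It is not your test: it assumes
plain fork() in place of OpenMPI, a temporary local file in place of a
Lustre file, and the helper names (write_via_mapping, write_via_syscall,
run) are made up for illustration. Only the 10,000 byte segment size and the
"map the whole file, write one segment" pattern come from your description.

```python
import mmap
import os
import tempfile

SEGMENT = 10_000  # bytes written per process, as in the test described


def fill_byte(rank):
    # Non-zero marker byte, so a missing write is distinguishable from a hole.
    return (rank % 255) + 1


def write_via_mapping(path, rank, nprocs):
    """Map the whole file, but store only into this rank's segment."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), SEGMENT * nprocs) as m:
            start = rank * SEGMENT
            m[start:start + SEGMENT] = bytes([fill_byte(rank)]) * SEGMENT
            m.flush()  # msync(2): push the dirty pages back to the file


def write_via_syscall(path, rank, nprocs):
    """Write this rank's segment with pwrite(2); no mapping involved."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, bytes([fill_byte(rank)]) * SEGMENT, rank * SEGMENT)
    finally:
        os.close(fd)


def run(writer, nprocs):
    """Fork nprocs writers on one file; return True if every segment is intact."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, SEGMENT * nprocs)  # size the file before anyone maps it
    finally:
        os.close(fd)
    try:
        pids = []
        for rank in range(nprocs):
            pid = os.fork()
            if pid == 0:  # child: write one segment, then exit without cleanup
                status = 1
                try:
                    writer(path, rank, nprocs)
                    status = 0
                finally:
                    os._exit(status)
            pids.append(pid)
        for pid in pids:
            os.waitpid(pid, 0)  # stand-in for the MPI barrier
        with open(path, "rb") as f:
            data = f.read()
    finally:
        os.unlink(path)
    return all(
        data[r * SEGMENT:(r + 1) * SEGMENT] == bytes([fill_byte(r)]) * SEGMENT
        for r in range(nprocs)
    )
```

On a local file system both run(write_via_mapping, n) and
run(write_via_syscall, n) should return True. Knowing which of the two paths
your test takes would help narrow down where the processes are spinning.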
Thanks.
Bye,
Oleg
_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss