Hi,

I am evaluating Lustre and have run into a problem that I hope someone
can shed some light on.

What I'm running:
Lustre v1.6beta5
Linux 2.6.12.6 (SMP)
2 OSS
1 MGS/MDT
4 clients
Ethernet network

I'm trying to determine whether multiple processes on multiple nodes can
simultaneously mmap a common file on a Lustre file system, write to it,
and produce a coherent result. (I'm using OpenMPI to spawn the processes
and to provide synchronization barriers.) In my tests, each process
memory-maps the whole file but writes only a 10,000-byte segment of it.
With 40 or fewer processes, the file is (usually) produced correctly.
With 50 or 100 processes, however, I rarely get a good result; in fact,
the tests seem to hang. When a test fails, processes are left on the
Lustre client nodes that use up all the CPU but never seem to finish.
I have no trouble interrupting these processes.
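
The access pattern described above can be sketched roughly as follows.
This is a minimal single-node illustration, not the actual test program:
it assumes a POSIX system and uses Python's multiprocessing (with the
"fork" start method) as a stand-in for OpenMPI process spawning and
barriers. The names SEG_SIZE, write_segment and run_test are
hypothetical, and the one-byte-per-rank fill pattern is an assumption
made only so the result can be checked.

```python
import mmap
import multiprocessing as mp
import os
import tempfile

SEG_SIZE = 10_000  # bytes written by each process, as in the description


def write_segment(path, rank, nprocs, barrier):
    """Map the whole file, but write only this rank's segment."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), nprocs * SEG_SIZE) as m:
            barrier.wait(timeout=60)  # stand-in for an MPI barrier
            start = rank * SEG_SIZE
            # Assumed fill pattern: every byte of the segment is the rank.
            m[start:start + SEG_SIZE] = bytes([rank % 256]) * SEG_SIZE
            m.flush()  # push the dirty pages back to the file
            barrier.wait(timeout=60)


def run_test(path, nprocs):
    """Return True if every segment holds the expected pattern."""
    ctx = mp.get_context("fork")  # assumption: POSIX host with fork()
    with open(path, "wb") as f:
        f.truncate(nprocs * SEG_SIZE)  # pre-size the file before mapping
    barrier = ctx.Barrier(nprocs)
    procs = [
        ctx.Process(target=write_segment, args=(path, r, nprocs, barrier))
        for r in range(nprocs)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    if any(p.exitcode != 0 for p in procs):
        return False
    with open(path, "rb") as f:
        data = f.read()
    return all(
        data[r * SEG_SIZE:(r + 1) * SEG_SIZE] == bytes([r % 256]) * SEG_SIZE
        for r in range(nprocs)
    )


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "shared.dat")
    print("coherent:", run_test(demo_path, 4))
```

Pointing the path at a file on the Lustre mount would exercise only a
single client; reproducing the multi-node case still needs a launcher
such as OpenMPI.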

While I'm not entirely sure of the result I should expect in these
tests, I certainly would expect the test to finish. Does anyone have any
comments or ideas?

-- 
Martin

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
