Hi John,
Is the MPI application writing to a single file from multiple ranks? Any
idea what the application was doing when the lockup happened?
Mark
On 08/26/2012 09:03 AM, John Wright wrote:
Hi All,
We're running ceph 0.48 on a small three-node test cluster. We've had good
stability with I/O using dd and iozone, especially after upgrading to 0.48.
However, we're running into a repeatable lockup of the Linux ceph client
(3.3.5-2.fc16.x86_64) when running an MPI program that does simple I/O on a ceph
mount. The MPI program runs processes on two nodes, and it is on the remote
node that the ceph client locks up. The client becomes unresponsive immediately,
and any attempt to access the mounted volume produces a process
with status 'D'. I can see no indication in the server logs that the server is
ever contacted. Regular serial processes run fine on the volume, and MPI runs on
the nodes work fine when not using the ceph volume.
So, any suggestions on where to look? Does anyone have experience testing
parallel programs on ceph?
thanks,
-john
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html