Thanks Rob. I'm glad you were able to verify this. For me it happens only with 2 PVFS2 servers. I did some debugging on it a while ago, but only came up with a few clues:
1) It is the MPI_File_delete in hpio where it is hanging. Basically, on the PVFS2 server, I think the reads never get fully pushed out of the request scheduler, so the delete waits for the read to finish before it can run, hence the hang.

2) I've tried to simplify the problem, but believe it or not, this is pretty much the simplest I can get it. It could be a problem with the datatype I/O code, but since it works with 1 and 3 servers, I'm not entirely convinced of that.

3) Honestly, I'm not entirely sure where the problem is, but if I had to guess, it's a flow problem where the read flow isn't being marked complete (and thus not flushed from the request scheduler).

4) It appears that one server will actually delete the file, but the other won't. So only one of the reads is incomplete.

Hope that helps some. I'll keep working on it myself. =)

Extracting the pvfs2-only parts of the code is a bit difficult... I was hoping to avoid that if possible. If we can't make any progress on it, I guess I'll try to do that. I appreciate the quick response.

Avery

On Thu, 2006-02-23 at 14:55 -0600, Robert Latham wrote:
> On Thu, Feb 23, 2006 at 12:38:26PM -0600, Avery Ching wrote:
> > By the way, is the datatype branch going to make it to ROMIO at some
> > point? The major bug I've been trying to fix is using the datatype I/O
> > branch of the PVFS2 ROMIO driver using 2 pvfs2 servers.
> >
> > mpiexec -n 2 ./hpio-debug -o 11 -t 10 -m 1 -n 10 -c 4096 -p 128 -d
> > pvfs2:/mnt/pvfs2
> >
> > It works fine with posix, list I/O, and collective I/O, just not
> > datatype I/O. Could be something with my ROMIO driver or down inside
> > PVFS2. Hard to say. =) Basically, writes are fine, but reads will
> > hang....as if they never truly complete.
> >
>
> hey avery
> Well i tried with your dtype code that you sent us a while back and
> that command worked great with 2 clients and 3 pvfs2 servers. I'm
> seeing the two server problem now. I'll try to narrow down the
> problem. If you can extract the pvfs2-only parts of your code,
> that'll make it a lot easier to debug, but if not, that's ok.
>
> ==rob
> _______________________________________________
> Pvfs2-developers mailing list
> [email protected]
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
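
To make the guess in point 3 concrete, here is a toy model of it. This is not PVFS2 code: `ToyScheduler`, `simulate`, and the handle name "datafile" are all made up for illustration, and the only assumption is that a server runs operations on the same handle strictly in arrival order. If one server never marks its read flow complete, the delete queued behind it never becomes runnable on that server, while the other server deletes normally.

```python
from collections import deque


class ToyScheduler:
    """Hypothetical per-server scheduler: ops on one handle run in arrival order."""

    def __init__(self, name):
        self.name = name
        self.queues = {}  # handle -> deque of pending op names

    def post(self, handle, op):
        """Queue an operation behind whatever is already pending on the handle."""
        self.queues.setdefault(handle, deque()).append(op)

    def complete(self, handle, op):
        """Release op only if it is at the head of the handle's queue."""
        q = self.queues.get(handle)
        if q and q[0] == op:
            q.popleft()
            return True
        return False

    def runnable(self, handle):
        """Return the op currently allowed to run on the handle, or None."""
        q = self.queues.get(handle)
        return q[0] if q else None


def simulate(read_completes):
    """read_completes maps server name -> whether its read flow is marked complete."""
    result = {}
    for name, done in read_completes.items():
        sched = ToyScheduler(name)
        sched.post("datafile", "read")    # the read issued by the client
        sched.post("datafile", "delete")  # the later MPI_File_delete
        if done:
            sched.complete("datafile", "read")
        head = sched.runnable("datafile")
        result[name] = "delete runs" if head == "delete" else "delete blocked behind read"
    return result


if __name__ == "__main__":
    # Assumed scenario: server0's read flow completes, server1's never does.
    for server, outcome in simulate({"server0": True, "server1": False}).items():
        print(server, "->", outcome)
```

In this model only the server with the stuck read blocks the delete, which would match the observation in point 4 if the guess is right. It says nothing about why the flow would fail to complete with exactly 2 servers.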
