There's a new io.tar.gz there now that fixes a bunch of warnings.
What I'm finding is that the failures are really hard to reproduce, but
on the PPC hardware they seem most likely to happen when using 16 32 MB
buffers (NUMBUF 16, BUFFER_SIZE 32*1024*1024) with 4 CPUs (./testio
/pvfs/da13-test/testio 4), when the files don't exist beforehand, the
printfs in testio that print out 'WRITEFILE' and 'READFILE' are NOT
commented out, and there's a delay added to the printf output by going
over my DSL link.
I managed to somehow cause one of the flows to cancel, which then did
this to the server:
[E 17:05:12.921338] handle_io_error: flow proto error cleanup started on
0x2aab29a46bf0: Connection timed out
pvfs2-server:
../src/io/flow/flowproto-bmi-trove/flowproto-multiqueue.c:2008:
handle_io_error: Assertion `0' failed.
Troy Benjegerdes wrote:
The app does a sequential write, using a maximum of NUMBUF (currently
16) buffers for async writes, posts them one at a time with
pvfs2_aio_flush, and then checks the buffer before re-using it with
pvfs2_aio_check.
Then it does 'rewindfile' and immediately starts the read. So if the
file is small enough to fit in NUMBUF*BUFFER_SIZE, I could issue a
read for the end of the file before the write for that data has retired.
But I'm using a big enough file that the call to PVFS_sys_wait() for
the write is guaranteed to have completed. Unless of course I have
some weird logic error in my read-ahead in get_next_full_readbuf.
A very simple test harness with which I just reproduced the problem
(along with the IO shim) is at:
http://www.scl.ameslab.gov/~troy/pvfs/corruption/io.tar.gz
So *if* I have enough printfs that the printing takes longer than the
IO (probably including the RTT to push it over my DSL link from home),
I get this:
R 65010 READFILE 65010
R 65011 READFILE 65011
R 65012 READFILE 65012
get_next_full_readbuf: enter userFilePos 1073741824, read_ahead_offset
1107296256
-- buffer 0x100de290 curbuf 2 buffer_ahead_offset 15 size 0
read_ahead_offset 1107296256
get_next_full_readbuf at end of file, no more bufs to fill
pvfs2_aio_check id 0 buf 2 aio_req 0x10104e78 b->offset 1073741824
b->size 33554432
ERROR pvfs2_aio_check called to PVFS_sys_wait for op_id 321 got error 0
ERROR pvfs2_aio_check: buf 2 offset 1073741824 b->size (33554432) !=
total_completed (29360128)
run 'gdb -p 6888' to debug
run 'gdb -p 6888' to debug
If the file already exists, and is big enough, it works just fine.
So there is some race there that is notoriously timing-sensitive, and
thus has been really erratic for us to reproduce. I suspect it will
only happen on BMI layers that are fully hardware-asynchronous as well
(i.e., !tcp).
Phil Carns wrote:
Could you break down what the app is doing at a little bit higher
level in this time frame? (i.e., how many writes it is posting, how
many reads it is posting, which are concurrent, and when it calls wait
for each.)
From what I can tell, it looks like there are 30 total isys_io's
posted; the first 15 are writes (triggered by pvfs2_aio_flush) and
the last 15 are reads (triggered by pvfs2_aio_fill). It doesn't look
like there are any waits in between the two, though, if I am reading
it right.
Are you calling wait() (or some other variant) between the writes and
the reads? The pvfs2 system interface doesn't order any of the
operations, so it might just be that some of your reads happen to be
hitting the server before your writes have put the data there.
This is different from standard POSIX AIO; I think their API
automatically orders every I/O operation, at least at a file
descriptor level. The PVFS system interface doesn't do anything like
that to prevent I/O operations from getting out of order once they
are posted.
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers