The RTS_DONE error messages indicate a problem that is specific to running on InfiniBand. This will be fixed in the upcoming 2.8.8 release.
Thanks,
Elaine

On Wed, Oct 23, 2013 at 2:05 PM, xihuang sun <[email protected]> wrote:
> Another error came out when I ran:
> mpiexec -machinefile mpd.hosts -np 10 /home/IOR/src/C/IOR -a MPIIO -N 10 -b 1g -d 5 -t 256k -o /mnt/orangefs/file1 -g -w -W -r -s 1 -vv
>
> I got:
>
> IOR-2.10.3: MPI Coordinated Test of Parallel I/O
>
> Run began: Thu Oct 24 04:06:47 2013
> Command line used: /home/IOR/src/C/IOR -a MPIIO -N 10 -b 1g -d 5 -t 256k -o /mnt/orangefs/file1 -g -w -W -r -s 1 -vv
> Machine: Linux node1 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64
> Using synchronized MPI timer
> Start time skew across all tasks: 0.00 sec
> Path: /mnt/orangefs
> FS: 1.1 TiB   Used FS: 15.8%   Inodes: 8796093022208.0 Mi   Used Inodes: 0.0%
> Participating tasks: 10
> task 0 on node1
> task 1 on node2
> task 2 on node3
> task 3 on node4
> task 4 on node5
> task 5 on node6
> task 6 on node7
> task 7 on node8
> task 8 on node9
> task 9 on node10
>
> Summary:
>     api                = MPIIO (version=2, subversion=2)
>     test filename      = /mnt/orangefs/file1
>     access             = single-shared-file, independent
>     pattern            = segmented (1 segment)
>     ordering in a file = sequential offsets
>     ordering inter file= no tasks offsets
>     clients            = 10 (1 per node)
>     repetitions        = 1
>     xfersize           = 262144 bytes
>     blocksize          = 1 GiB
>     aggregate filesize = 10 GiB
>
> Using Time Stamp 1382558807 (0x52682c57) for Data Signature
> delaying 5 seconds . . .
> Commencing write performance test.
> Thu Oct 24 04:06:52 2013
>
> access    bw(MiB/s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
> ------    ---------  ----------  ---------  --------  --------  --------  --------  ----
> write     28.83      1048576     256.00     0.170848  355.01    0.001721  355.18    0     XXCEL
>
> Verifying contents of the file(s) just written.
> Thu Oct 24 04:12:47 2013
>
> [E 04:12:49.364807] Warning: encourage_recv_incoming: mop_id 629fe60 in RTS_DONE message not found.
> [E 04:12:50.350602] Warning: encourage_recv_incoming: mop_id 2aaaae91aaf0 in RTS_DONE message not found.
> [E 04:12:50.613899] Warning: encourage_recv_incoming: mop_id 2aaaac009a30 in RTS_DONE message not found.
> [E 04:12:51.175232] Warning: encourage_recv_incoming: mop_id 6f07940 in RTS_DONE message not found.
>
> I think maybe I am on the wrong track.
>
> Thanks for the help.
>
>
> 2013/10/24 xihuang sun <[email protected]>
>
>> Hi,
>> I'm using orangefs 2.8.7 over InfiniBand and testing it with IOR; some errors are below.
>> Here is the command:
>> mpiexec -machinefile mpd.hosts -np 4 /home/IOR/src/C/IOR -a MPIIO -N 4 -b 1g -d 5 -t 2m -o /mnt/orangefs/file1 -c -g -w -W -r -s 1 -vv
>>
>> It SOMETIMES produces the errors below:
>>
>> [E 02:52:21.394237] fp_multiqueue_cancel: flow proto cancel called on 0x12bff388
>> [E 02:52:21.394326] fp_multiqueue_cancel: I/O error occurred
>> [E 02:52:21.394333] handle_io_error: flow proto error cleanup started on 0x12bff388: Operation cancelled (possibly due to timeout)
>> [E 02:52:21.394341] handle_io_error: flow proto 0x12bff388 canceled 1 operations, will clean up.
>> [E 02:52:21.394349] mem_to_bmi_callback_fn: I/O error occurred
>> [E 02:52:21.394353] handle_io_error: flow proto 0x12bff388 error cleanup finished: Operation cancelled (possibly due to timeout)
>> [E 02:52:21.394358] io_datafile_complete_operations: flow failed, retrying from msgpair
>>
>> and SOMETIMES not all of the data gets written to /mnt/orangefs (for example, the 4 GB case from the command above writes only 3.2 GB to /mnt/orangefs and the program gets stuck, with no error message shown).
>>
>> I don't know whether these two situations are connected, or whether my command is wrong.
>>
>> I tried to solve it by using pvfs2-set-sync to set -D and -M to 1, but the write rate dropped very low:
>>
>> Max Write: 17.61 MiB/sec (18.47 MB/sec)
>> Max Read:  3403.92 MiB/sec (3569.27 MB/sec)
>>
>> So my question is: what is wrong?
>>
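For reference, the access pattern the 10-task run reports above (single shared file, independent I/O, one segment, sequential 256 KiB transfers, 1 GiB per task) corresponds roughly to the minimal MPI-IO sketch below. This is only an illustration of that pattern, not IOR's actual code; error handling is omitted, and the sizes and file name are simply taken from the run above.

    /* Minimal sketch of the pattern IOR reports above: one shared file,
     * independent I/O, one segment, fixed-size sequential transfers.
     * Not IOR's implementation; sizes and file name are placeholders. */
    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    #define XFER  (256 * 1024)             /* -t 256k */
    #define BLOCK (1024L * 1024 * 1024)    /* -b 1g, per task */

    int main(int argc, char **argv)
    {
        int rank;
        MPI_File fh;
        char *buf = malloc(XFER);

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        memset(buf, rank, XFER);

        /* every task opens the same shared file */
        MPI_File_open(MPI_COMM_WORLD, "/mnt/orangefs/file1",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* segmented layout: task i owns bytes [i*BLOCK, (i+1)*BLOCK) */
        MPI_Offset base = (MPI_Offset)rank * BLOCK;
        for (MPI_Offset off = 0; off < BLOCK; off += XFER)
            MPI_File_write_at(fh, base + off, buf, XFER, MPI_BYTE,
                              MPI_STATUS_IGNORE);  /* independent write */

        MPI_File_close(&fh);
        free(buf);
        MPI_Finalize();
        return 0;
    }

Note that the 4-task command in the earlier message adds -c, which tells IOR to use collective MPI-IO calls rather than the independent writes sketched here.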
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
