Hi,

I'm using OrangeFS 2.8.7 over InfiniBand and testing it with IOR, and I am running into some errors. Here is the command:

mpiexec -machinefile mpd.hosts -np 4 /home/IOR/src/C/IOR -a MPIIO -N 4 -b 1g -d 5 -t 2m -o /mnt/orangefs/file1 -c -g -w -W -r -s 1 -vv
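For reference, here is a rough sketch of how much data this command should write. It assumes IOR's usual convention that the aggregate file size is tasks x segments x block size, and that the k/m/g suffixes are powers of 1024. The helper names (parse_size, expected_volume) are made up for illustration and are not part of IOR.

```python
# Sketch: expected IOR data volume for the command above.
# Assumptions: aggregate size = tasks * segments * block size,
# and size suffixes k/m/g are powers of 1024.

UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(text):
    """Convert an IOR-style size such as '1g' or '2m' to bytes."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def expected_volume(num_tasks, block, transfer, segments=1):
    """Return (aggregate bytes, transfers per block) for one IOR run."""
    block_bytes = parse_size(block)
    transfer_bytes = parse_size(transfer)
    if block_bytes % transfer_bytes != 0:
        raise ValueError("block size must be a multiple of transfer size")
    return num_tasks * segments * block_bytes, block_bytes // transfer_bytes


# Values taken from the command: -N 4 -b 1g -t 2m -s 1
total, per_block = expected_volume(num_tasks=4, block="1g", transfer="2m", segments=1)
print(total / 1024**3, "GiB total;", per_block, "transfers per block")
```

Under these assumptions the run should produce a 4 GiB file, written as 512 transfers of 2 MiB per task.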
It SOMETIMES gets the error below:

[E 02:52:21.394237] fp_multiqueue_cancel: flow proto cancel called on 0x12bff388
[E 02:52:21.394326] fp_multiqueue_cancel: I/O error occurred
[E 02:52:21.394333] handle_io_error: flow proto error cleanup started on 0x12bff388: Operation cancelled (possibly due to timeout)
[E 02:52:21.394341] handle_io_error: flow proto 0x12bff388 canceled 1 operations, will clean up.
[E 02:52:21.394349] mem_to_bmi_callback_fn: I/O error occurred
[E 02:52:21.394353] handle_io_error: flow proto 0x12bff388 error cleanup finished: Operation cancelled (possibly due to timeout)
[E 02:52:21.394358] io_datafile_complete_operations: flow failed, retrying from msgpair

and SOMETIMES not all of the data gets written to /mnt/orangefs (for instance, in the 4GB case from the example command, only 3.2GB is written to /mnt/orangefs and the program then hangs, with no error message shown). I don't know whether these two situations are connected. Or is my command wrong?

I tried to solve it by using pvfs2-set-sync to set -D and -M to 1, and the write throughput drops to a very low rate:

Max Write: 17.61 MiB/sec (18.47 MB/sec)
Max Read: 3403.92 MiB/sec (3569.27 MB/sec)

So my question is: what's wrong with it?
_______________________________________________ Pvfs2-users mailing list [email protected] http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
