[EMAIL PROTECTED] wrote on Wed, 21 Mar 2007 16:39 -0700:
> Thanks to the folks who helped me out yesterday, I now have a nice little
> 2.3T PVFS2 (2.6.2) file system. I have 16 nodes that are all acting as I/O
> servers and clients; one of those boxes is also the metadata server. It all
> runs over Topspin IB, and I am using all the default settings in my config
> file.
>
> That being said, I wanted to test the bandwidth, so I compiled the POSIX
> version of IOR against the Topspin mpich libraries.
>
> My run looks like this.
>
> IOR-2.9.4: MPI Coordinated Test of Parallel I/O
>
> Run began: Wed Mar 21 16:06:04 2007
> Command line used: /home/tim/IOR -i 8 -b 1024m -o /mnt/pvfs2/ior/ior_16g
> Machine: Linux compute-0-15.local
>
> Summary:
> api = POSIX
> test filename = /mnt/pvfs2/ior/ior_16g
> access = single-shared-file
> clients = 16 (1 per node)
> repetitions = 8
> xfersize = 262144 bytes
> blocksize = 1 GiB
> aggregate filesize = 16 GiB
>
> access  bw(MiB/s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  iter
> ------  ---------  ----------  ---------  --------  --------  --------  ----
> write   613.70     1048576     256.00     0.177541  26.43     7.24      0
> read    1141.20    1048576     256.00     0.019199  14.34     0.329994  0
> write   589.05     1048576     256.00     0.154706  27.74     7.06      1
> read    1032.93    1048576     256.00     0.019723  15.84     0.417178  1
> write   550.66     1048576     256.00     0.991332  29.58     8.43      2
> read    1005.48    1048576     256.00     0.021340  16.28     0.448091  2
> write   555.06     1048576     256.00     0.232900  29.48     8.57      3
> read    1006.24    1048576     256.00     0.018788  16.27     0.263041  3
> WARNING: Expected aggregate file size = 17179869184.
> WARNING: Stat() of aggregate file size = 13958643712.
> WARNING: Using actual aggregate bytes moved = 17179869184.
> write   438.87     1048576     256.00     0.238877  37.23     15.80     4
> ** error **
> ERROR in aiori-POSIX.c (line 245): hit EOF prematurely.
> ERROR: Success
> ** exiting **
> ** error **
> ERROR in aiori-POSIX.c (line 245): hit EOF prematurely.
>
>
> I would say that the performance is quite good until I hit those errors.
> There is nothing interesting in the client or server logs. Is there
> something in my IOR setup that might be stressing things a bit too hard?
Cleaning my mailbox today. Didn't think you'd hear another reply on
this one, did you? :)
I can reproduce this over IB, using both POSIX and MPI-IO. With MPI-IO
you will see the error messages explicitly, something like:
[E 18:30:04.916174] fp_multiqueue_cancel: flow proto cancel called on 0x6d0100
[E 18:30:04.917227] handle_io_error: flow proto error cleanup started on 0x6d0100, error_code: -1610613121
[E 18:30:04.917247] handle_io_error: flow proto 0x6d0100 canceled 1 operations, will clean up.
[E 18:30:04.917264] handle_io_error: flow proto 0x6d0100 error cleanup finished, error_code: -1610613121
With POSIX, on the other hand, the error messages are generated by the
pvfs2-client-core kernel helper and end up in a file somewhere.
In the MPI-IO case, the run completes with no errors, but the times
are pretty lousy. It is related to timeouts in the flow protocol:
each client allots a certain amount of time to get a response back
from a server, and if it doesn't get one, it cancels the operation
and tries again. It could be that in the POSIX case we don't have
all the error conditions handled properly.
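Roughly, the client-side logic looks like this. This is a minimal
sketch with made-up names and a simulated failure, not the real flow
protocol code; the constants just echo the fs.conf options listed
further down:

    /* Sketch of the timeout/cancel/retry idea described above.
     * All names and the simulated timeout are hypothetical; this
     * is NOT the actual PVFS2 flow protocol code. */
    #include <stdio.h>
    #include <unistd.h>

    enum {
        CLIENT_JOB_FLOW_TIMEOUT_SECS = 300, /* cf. ClientJobFlowTimeoutSecs */
        CLIENT_RETRY_LIMIT = 5,             /* cf. ClientRetryLimit */
        CLIENT_RETRY_DELAY_SECS = 2         /* cf. ClientRetryDelayMilliSecs */
    };

    /* Stand-in for posting an I/O flow and waiting up to timeout_secs
     * for the server's reply.  To keep the sketch runnable it pretends
     * the first two attempts time out (a busy server) and the third
     * succeeds. */
    static int post_flow_and_wait(int timeout_secs, int attempt)
    {
        (void)timeout_secs;
        return attempt < 2 ? -1 : 0;
    }

    static int submit_io(void)
    {
        int attempt;
        for (attempt = 0; attempt <= CLIENT_RETRY_LIMIT; attempt++) {
            if (post_flow_and_wait(CLIENT_JOB_FLOW_TIMEOUT_SECS, attempt) == 0)
                return 0;   /* server answered within the window */
            /* No response in time: cancel the flow, back off, retry. */
            fprintf(stderr, "attempt %d: flow timed out, retrying\n", attempt);
            sleep(CLIENT_RETRY_DELAY_SECS);
        }
        return -1;          /* retries exhausted; caller sees an I/O error */
    }

    int main(void)
    {
        return submit_io() == 0 ? 0 : 1;
    }

When the servers are also busy being clients, those windows expire
more often, which is exactly what you're seeing.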
One thing I'll suggest is not running clients on the same nodes as
the servers. With the same run, but with 14 clients against 14 servers
(including 1 metadata server), no timeouts occur. If you insist on
setting things up like this, there are six values in fs.conf that you
can adjust to increase the timeouts (an example snippet follows the
list):
ServerJobBMITimeoutSecs 30
ServerJobFlowTimeoutSecs 30
ClientJobBMITimeoutSecs 300
ClientJobFlowTimeoutSecs 300
ClientRetryLimit 5
ClientRetryDelayMilliSecs 2000
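For example, something like this in the <Defaults> section of fs.conf,
which is where the generated config keeps these options. The numbers
here are just guesses to start from, not recommendations:

    <Defaults>
        # ... other defaults unchanged ...
        ServerJobBMITimeoutSecs 120
        ServerJobFlowTimeoutSecs 120
        ClientJobBMITimeoutSecs 600
        ClientJobFlowTimeoutSecs 600
        ClientRetryLimit 10
        ClientRetryDelayMilliSecs 2000
    </Defaults>

You'll most likely need to restart the servers (and remount on the
clients) for new values to take effect.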
Play around with those numbers and you should be able to get IOR to
run to completion. The downside is that if a server dies, you won't
see an error message at the client for a possibly long time.
-- Pete