> we have a testbed running with Lustre 1.8.3 and a RTT of ~4ms (10GE
> network cards everywhere) for a ping between client and servers. If I
> have read the iozone source code correctly, iozone reports bandwidth in
> KB/s and includes the time for the open() call, but not for close(). If
To include the close time you need to add the -c flag. The -e flag
should include fsync/fflush in the time.

> I write a single 64 byte file (with one I/O request), iozone tells me
> something like '295548', which means ~295 MB/s. Dividing the file size
> by the bandwidth, I get the time that was needed to create the file and
> write the 64 bytes (with a single request). In this case, the time is
> about 0.2 microseconds, which is way below the RTT.

That seems oddly fast for such a small file over any latency, since you
shouldn't even be able to lock the file in that time.

> That means for a Lustre file system, if I create a file and write 64
> bytes, the client sends two(?) RPCs to the server and does not wait for
> the completion. Is this correct? But it will wait for the completion of
> both RPCs when the file is closed?

You can see what Lustre is doing if the client isn't doing any other
activity and you enable some tracing. "sysctl -w lnet.debug='+rpctrace
vfstrace'" should allow you to see the VFS ops ll_file_open,
ll_file_writev/ll_file_aio_write, and ll_file_release, along with any
RPCs generated by them. You should see an RPC for the file open (opcode
101, requesting the lock), and AFAIK the client waits for the reply
before it actually attempts to write any data, so that alone should
bring your time up to at least 4 ms for one RTT. The initial write
should request a lock from the first stripe, followed by an OST write
RPC (opcode 4), followed by a file close (opcode 35).

I ran a test over 4 ms latency so you can see what I'm referring to. I
thought there was a patch in Lustre a few months back that forced a
flush before a file close, but this trace is from a 1.8.5 client, so I'm
guessing that isn't how it works, because between closing the file and
the end of the trace I had to run "sync" for the OST write to show up.
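The arithmetic behind the "0.2 microseconds" figure is easy to sanity-check. A minimal sketch in plain Python, using the 295548 KB/s number reported above (iozone's KB is 1024 bytes):

```python
# Back out the implied wall-clock time from iozone's reported bandwidth.
# iozone reports KB/s, where 1 KB = 1024 bytes.
file_size_bytes = 64      # the 64-byte file from the test above
bandwidth_kb_s = 295548   # iozone's reported bandwidth

seconds = file_size_bytes / (bandwidth_kb_s * 1024.0)
microseconds = seconds * 1e6
print(f"implied time: {microseconds:.2f} us")  # ~0.21 us, far below a 4 ms RTT
```

So the reported bandwidth implies the whole open/write completed in roughly a twenty-thousandth of one round trip, which is why the number looks suspicious.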
00000100:00100000:3:1299677923.962151:0:31223:0:(client.c:2087:ptlrpc_queue_wait()) Sending RPC pname:cluuid:pid:xid:nid:opc bash:052a94ea-8d57-ac3e-829e-890e0bb65639:31223:x1362774702706288:192.168.66.2@tcp8:101
00000100:00100000:3:1299677923.966512:0:31223:0:(client.c:2192:ptlrpc_queue_wait()) Completed RPC pname:cluuid:pid:xid:nid:opc bash:052a94ea-8d57-ac3e-829e-890e0bb65639:31223:x1362774702706288:192.168.66.2@tcp8:101
00000080:00200000:3:1299677923.966562:0:31223:0:(file.c:419:ll_file_open()) VFS Op:inode=219/2880174574(ffff8801e1a68550), flags 101101
00000100:00100000:3:1299677923.966615:0:31223:0:(client.c:2087:ptlrpc_queue_wait()) Sending RPC pname:cluuid:pid:xid:nid:opc bash:052a94ea-8d57-ac3e-829e-890e0bb65639:31223:x1362774702706292:192.168.66.3@tcp8:101
00000100:00100000:3:1299677923.970994:0:31223:0:(client.c:2192:ptlrpc_queue_wait()) Completed RPC pname:cluuid:pid:xid:nid:opc bash:052a94ea-8d57-ac3e-829e-890e0bb65639:31223:x1362774702706292:192.168.66.3@tcp8:101
00000080:00200000:3:1299677923.971058:0:31223:0:(file.c:256:ll_file_release()) VFS Op:inode=219/2880174574(ffff8801e1a68550)
00000100:00100000:3:1299677923.971078:0:31223:0:(client.c:2087:ptlrpc_queue_wait()) Sending RPC pname:cluuid:pid:xid:nid:opc bash:052a94ea-8d57-ac3e-829e-890e0bb65639:31223:x1362774702706296:192.168.66.2@tcp8:35
00000100:00100000:3:1299677923.975447:0:31223:0:(client.c:2192:ptlrpc_queue_wait()) Completed RPC pname:cluuid:pid:xid:nid:opc bash:052a94ea-8d57-ac3e-829e-890e0bb65639:31223:x1362774702706296:192.168.66.2@tcp8:35
00000100:00100000:0:1299677927.817739:0:2894:0:(client.c:1177:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:052a94ea-8d57-ac3e-829e-890e0bb65639:2894:x1362774702706304:192.168.66.3@tcp8:4
00000100:00100000:0:1299677927.826642:0:2894:0:(client.c:1444:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:052a94ea-8d57-ac3e-829e-890e0bb65639:2894:x1362774702706304:192.168.66.3@tcp8:4

> The numbers look different
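If it helps, the Sending/Completed pairs in a trace like the one above can be matched up by xid to get per-RPC elapsed times. A rough sketch (assuming the log-line layout shown above, where the xid, nid, and opcode are the last colon-separated fields; the function and variable names here are mine, not Lustre's):

```python
import re

# Pair each "Sending RPC" with its "Completed RPC" by transfer id (xid)
# and report the elapsed time; for a synchronous RPC this is roughly one
# RTT plus server processing time.
LINE_RE = re.compile(
    r':(?P<ts>\d+\.\d+):'                        # timestamp field
    r'.*(?P<dir>Sending|Completed) RPC '
    r'.*:(?P<xid>x\d+):[^:]+:(?P<opc>\d+)\s*$'   # xid, nid, opcode
)

def rpc_latencies(lines):
    """Return (xid, opcode, elapsed_ms) for every completed RPC."""
    sent, done = {}, []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # VFS-op lines etc. don't match
        xid, ts = m.group('xid'), float(m.group('ts'))
        if m.group('dir') == 'Sending':
            sent[xid] = ts
        elif xid in sent:
            done.append((xid, m.group('opc'), (ts - sent.pop(xid)) * 1000.0))
    return done
```

Run over the trace above, the first open/lock RPC (opcode 101) comes out at a bit over 4 ms, matching the configured RTT.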
> when I disable the client side cache by
> setting max_dirty_mb to 0.

Without any grant I think all RPCs have to be synchronous, so you'll see
a huge performance hit over latency.

Jeremy

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
