As previously mentioned, I was investigating performance issues with
9pfs. Raw file read/write performance of 9pfs is actually quite good,
provided that the client picked a reasonably high msize (maximum message
size). For that reason I would recommend logging a warning on the 9p
server side if a client attaches with an msize small enough to cause
performance issues.

However, there are other aspects where 9pfs currently performs
suboptimally. In particular, readdir handling of 9pfs is extremely slow:
a simple readdir request of a guest typically blocks for several hundred
milliseconds or even several seconds, no matter how powerful the
underlying hardware is. The reason for this performance issue is
latency. Currently 9pfs dispatches a T_readdir request back and forth
between the main I/O thread and a background I/O thread numerous times;
in fact it hops between threads multiple times for every single
directory entry during T_readdir request handling, which adds up to huge
latencies for a single T_readdir request.

This patch series aims to address this severe performance issue of 9pfs
T_readdir request handling. The actual performance fix is patch 10. I
also provided a convenient benchmark for comparing the performance
improvements by using the 9pfs "synth" driver (see patch 8 for
instructions on how to run the benchmark), so no guest OS installation
is required to perform this benchmark A/B comparison. With patch 10 I
achieved a performance improvement by a factor of 40 on my test machine.

NOTE: As outlined by patch 7 there seems to be an outstanding issue
(both with the current, unoptimized readdir code and with the new,
optimized readdir code) causing a transport error with split readdir
requests. This issue only occurs if patch 7 is applied. I haven't
investigated the cause of this issue yet; it looks like a memory issue
though. I am not sure whether it is a problem with the actual 9p server
or rather "just" with the test environment. Apart from that issue, split
readdir itself seems to work well with the new, performance optimized
readdir code.

v3->v4:

  * Rebased to master (SHA-1 43d1455c).

  * Adjusted commit log message [patch 2], [patch 3], [patch 8].

  * Fixed: use the Rreaddir header size of 11 (instead of P9_IOHDRSZ)
    for limiting the 'count' parameter of Treaddir [patch 3], [patch 5].
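
For illustration, the 'count' limit from the last changelog item can be
sketched as follows. The value 11 is the size of the Rreaddir header,
size[4] + type[1] + tag[2] + count[4]. The macro and function names are
hypothetical and not taken from the patches, and clamping (rather than
rejecting the request) is an assumption made for this sketch only:

```c
#include <assert.h>
#include <stdint.h>

/* Rreaddir header: size[4] + type[1] + tag[2] + count[4] = 11 bytes. */
#define RREADDIR_HDR_SIZE 11u /* hypothetical name, for illustration */

/*
 * Hypothetical helper: limit the 'count' value sent with Treaddir so
 * that header plus payload of the Rreaddir reply fit into msize.
 * Assumption: an oversized count is clamped rather than rejected.
 */
static uint32_t limit_readdir_count(uint32_t msize, uint32_t count)
{
    /* Holds trivially once msize >= 4096 is enforced (patch 2). */
    assert(msize >= RREADDIR_HDR_SIZE);

    uint32_t max_count = msize - RREADDIR_HDR_SIZE;

    return count > max_count ? max_count : count;
}
```

With msize = 4096, a client requesting count = 8192 would be limited to
4085 bytes of directory entry payload per reply.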

Christian Schoenebeck (11):
  tests/virtio-9p: add terminating null in v9fs_string_read()
  9pfs: require msize >= 4096
  9pfs: validate count sent by client with T_readdir
  hw/9pfs/9p-synth: added directory for readdir test
  tests/virtio-9p: added readdir test
  tests/virtio-9p: added splitted readdir test
  tests/virtio-9p: failing splitted readdir test
  9pfs: readdir benchmark
  hw/9pfs/9p-synth: avoid n-square issue in synth_readdir()
  9pfs: T_readdir latency optimization
  hw/9pfs/9p.c: benchmark time on T_readdir request

 hw/9pfs/9p-synth.c           |  48 +++++-
 hw/9pfs/9p-synth.h           |   5 +
 hw/9pfs/9p.c                 | 163 ++++++++++++--------
 hw/9pfs/9p.h                 |  34 ++++
 hw/9pfs/codir.c              | 183 ++++++++++++++++++++--
 hw/9pfs/coth.h               |   3 +
 tests/qtest/virtio-9p-test.c | 290 ++++++++++++++++++++++++++++++++++-
 7 files changed, 643 insertions(+), 83 deletions(-)

--
2.20.1