On Tue, 21 Jan 2020 01:23:55 +0100
Christian Schoenebeck <qemu_...@crudebyte.com> wrote:
> This patch is not intended to be merged. It just provides a

Well I like the idea of having such a benchmark available. It could
probably be merged after a few changes...

> temporary benchmark foundation for convenient A/B comparison
> of the subsequent 9p readdir optimization patches:
>
> * hw/9pfs/9p-synth: increase the number of simulated files for
>   the readdir test to 2000 files.
>

... the 9p-synth backend could maybe always create this number of
files ?

> * tests/virtio-9p: measure the wall time that elapsed between
>   sending the T_readdir request and the arrival of the R_readdir
>   response, and print out that measured duration, as well as the
>   number of directory entries received and the number of bytes of
>   the response message.
>

... maybe we should make the printing optional and off by default so
that it doesn't pollute CI logs.
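Untested sketch of what I have in mind, e.g. gating the output with an
environment variable (the QTEST_V9FS_BENCH name below is made up for
illustration, it doesn't exist anywhere):

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Benchmark output is opt-in, so plain CI runs stay quiet. */
    static bool benchmark_output_enabled(void)
    {
        const char *v = getenv("QTEST_V9FS_BENCH");
        return v && *v && *v != '0';
    }

and then wrapping each printf in fs_readdir(), e.g.:

    if (benchmark_output_enabled()) {
        printf("Time client spent waiting for reply from server: %fs\n",
               waitforreply - start);
    }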
> * tests/virtio-9p: increased msize to 256kiB to allow
>   retrieving all 2000 files (simulated by the 9pfs synth driver)
>   with only one T_readdir request.
>

... always use 256k for both the basic test and a revisited split
test ?

> Running this benchmark is fairly quick & simple and does not
> require any guest OS installation or other prerequisites:
>
> cd build
> make && make tests/qtest/qos-test
> export QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
> tests/qtest/qos-test -p $(tests/qtest/qos-test -l | grep readdir/basic)
>
> Since this benchmark uses the 9pfs synth driver, the host
> machine's I/O hardware (SSDs/HDDs) is not relevant for the
> benchmark result, because the synth backend's readdir
> implementation returns immediately (without any blocking I/O
> that would occur with a real-life fs driver) and just returns
> already prepared, simulated directory entries directly from RAM.
> So this benchmark focuses on the efficiency of the 9pfs
> controller code (or top half) for readdir request handling.
>
> Signed-off-by: Christian Schoenebeck <qemu_...@crudebyte.com>
> ---
>  hw/9pfs/9p-synth.h           |  2 +-
>  tests/qtest/virtio-9p-test.c | 37 +++++++++++++++++++++++++++++++++++-
>  2 files changed, 37 insertions(+), 2 deletions(-)
>
> diff --git a/hw/9pfs/9p-synth.h b/hw/9pfs/9p-synth.h
> index 036d7e4a5b..7d6cedcdac 100644
> --- a/hw/9pfs/9p-synth.h
> +++ b/hw/9pfs/9p-synth.h
> @@ -58,7 +58,7 @@ int qemu_v9fs_synth_add_file(V9fsSynthNode *parent, int mode,
>  /* for READDIR test */
>  #define QTEST_V9FS_SYNTH_READDIR_DIR "ReadDirDir"
>  #define QTEST_V9FS_SYNTH_READDIR_FILE "ReadDirFile%d"
> -#define QTEST_V9FS_SYNTH_READDIR_NFILES 100
> +#define QTEST_V9FS_SYNTH_READDIR_NFILES 2000
>
>  /* Any write to the "FLUSH" file is handled one byte at a time by the
>   * backend. If the byte is zero, the backend returns success (ie, 1),
> diff --git a/tests/qtest/virtio-9p-test.c b/tests/qtest/virtio-9p-test.c
> index e47b286340..d71b37aa6c 100644
> --- a/tests/qtest/virtio-9p-test.c
> +++ b/tests/qtest/virtio-9p-test.c
> @@ -15,6 +15,18 @@
>  #include "libqos/virtio-9p.h"
>  #include "libqos/qgraph.h"
>
> +/*
> + * to benchmark the real time (not CPU time) that elapsed between start of
> + * a request and arrival of its response
> + */
> +static double wall_time(void)
> +{
> +    struct timeval t;
> +    struct timezone tz;
> +    gettimeofday(&t, &tz);
> +    return t.tv_sec + t.tv_usec * 0.000001;
> +}
> +
>  #define QVIRTIO_9P_TIMEOUT_US (10 * 1000 * 1000)
>  static QGuestAllocator *alloc;
>
> @@ -36,7 +48,7 @@ static void pci_config(void *obj, void *data, QGuestAllocator *t_alloc)
>      g_free(tag);
>  }
>
> -#define P9_MAX_SIZE 4096 /* Max size of a T-message or R-message */
> +#define P9_MAX_SIZE (256 * 1024) /* Max size of a T-message or R-message */
>
>  typedef struct {
>      QTestState *qts;
> @@ -600,12 +612,35 @@ static void fs_readdir(void *obj, void *data, QGuestAllocator *t_alloc)
>      v9fs_req_wait_for_reply(req, NULL);
>      v9fs_rlopen(req, &qid, NULL);
>
> +    const double start = wall_time();
> +
>      /*
>       * submit count = msize - 11, because 11 is the header size of Rreaddir
>       */
>      req = v9fs_treaddir(v9p, 1, 0, P9_MAX_SIZE - 11, 0);
> +    const double treaddir = wall_time();
>      v9fs_req_wait_for_reply(req, NULL);
> +    const double waitforreply = wall_time();
>      v9fs_rreaddir(req, &count, &nentries, &entries);
> +    const double end = wall_time();
> +
> +    printf("\nTime client spent on sending T_readdir: %fs\n\n",
> +           treaddir - start);
> +
> +    printf("Time client spent waiting for reply from server: %fs "
> +           "[MOST IMPORTANT]\n", waitforreply - start);
> +    printf("(This is the most important value, because it reflects the time\n"
> +           "the 9p server required to process and return the result of the\n"
> +           "T_readdir request.)\n\n");
> +
> +    printf("Total client time: %fs\n", end - start);
> +    printf("(NOTE: this time is not relevant; this huge time comes from\n"
> +           "inefficient qtest_memread() calls. So you can discard this\n"
> +           "value as a problem of this test client implementation while\n"
> +           "processing the received server T_readdir reply.)\n\n");
> +
> +    printf("Details of response message data: R_readdir nentries=%d "
> +           "rbytes=%d\n", nentries, count);
>
>      /*
>       * Assuming msize (P9_MAX_SIZE) is large enough so we can retrieve all
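Also, for anyone wondering where the magic 11 in "P9_MAX_SIZE - 11"
comes from: every 9p message starts with size[4] type[1] tag[2], and
R_readdir adds count[4] before the entry data, so the fixed header is
4 + 1 + 2 + 4 == 11 bytes. A named constant would make that
self-documenting; minimal sketch (the P9_RREADDIR_HDR_SIZE name is
made up, just for illustration):

    /* Fixed R_readdir part: size[4] + type[1] + tag[2] + count[4] */
    #define P9_RREADDIR_HDR_SIZE (4 + 1 + 2 + 4) /* == 11 bytes */

    req = v9fs_treaddir(v9p, 1, 0, P9_MAX_SIZE - P9_RREADDIR_HDR_SIZE, 0);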