Hi all,

This project is about using the shadow doorbell buffer feature introduced in NVMe 1.3 to enable QEMU-side polling for the emulated NVMe device, with the goal of reaching performance comparable to virtio-dataplane.
**Why not virtio?** Many industrial and academic researchers use QEMU NVMe as a performance platform for research and product prototyping. The NVMe interface offers a richer feature set than the virtio interface. If we can make QEMU NVMe performance competitive with virtio, it will benefit many communities.

**Doable?** NVMe spec 1.3 introduces a shadow doorbell buffer, which is aimed at optimizing virtual NVMe controllers. QEMU can use this feature to reduce or even eliminate the VM-exits triggered by doorbell writes. I remember some discussion about this back in 2015, but it doesn't seem to have been completed.

For this project, I think we can proceed in three steps:

(1) Add shadow doorbell buffer support to the QEMU NVMe emulation. This will reduce the number of VM-exits.

(2) Replace the timers currently used by QEMU NVMe with a separate polling thread, so that the VM-exits caused by doorbell writes can be eliminated completely.

(3) Going further, adapt the architecture to use one polling thread per NVMe queue pair, which could improve performance further. (Step 3 can be left for next year if the workload is too much for 3 months.)

I already have an initial implementation of steps (1) and (2) and would like to keep working on it to push it upstream. More information is in this paper (Section 3.1 and Figure 2, left): http://ucare.cs.uchicago.edu/pdf/fast18-femu.pdf

Comments are welcome. Thanks.

Best,
Huaicheng
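
To make the difference between steps (1) and (2) more concrete, here is a toy Python model of the shadow doorbell idea. It is only a sketch under simplifying assumptions, not the QEMU patch and not the exact spec behavior: all names (`SubmissionQueue`, `guest_submit`, `device_poll`, `run`) are made up for illustration, indices are treated as free-running 16-bit counters instead of wrapping at the queue depth, and the device is assumed to get a chance to run once per `burst` guest submissions. The event-index comparison is modeled on the check the Linux NVMe driver performs before deciding whether a real doorbell write is needed.

```python
from dataclasses import dataclass

MASK = 0xFFFF  # doorbell values are compared as 16-bit quantities


def need_mmio(event_idx: int, new_idx: int, old_idx: int) -> bool:
    """True if moving the tail from old_idx to new_idx stepped past event_idx."""
    return ((new_idx - event_idx - 1) & MASK) < ((new_idx - old_idx) & MASK)


@dataclass
class SubmissionQueue:
    shadow_tail: int = 0  # guest-written tail in the shadow doorbell buffer
    event_idx: int = 0    # device-written hint in the EventIdx buffer
    head: int = 0         # how far the (emulated) device has consumed
    mmio_writes: int = 0  # real doorbell writes; each one would cost a VM-exit


def guest_submit(sq: SubmissionQueue) -> None:
    """Guest side: publish the new tail in memory, trap only if asked to."""
    old = sq.shadow_tail
    new = (old + 1) & MASK
    sq.shadow_tail = new  # plain memory write, no VM-exit
    if need_mmio(sq.event_idx, new, old):
        sq.mmio_writes += 1  # stands in for the MMIO doorbell write


def device_poll(sq: SubmissionQueue, rearm: bool) -> int:
    """Device side: consume everything up to the shadow tail."""
    completed = (sq.shadow_tail - sq.head) & MASK
    sq.head = sq.shadow_tail
    if rearm:
        # Not polling continuously: ask for a doorbell on the next submission.
        sq.event_idx = sq.head
    return completed


def run(total: int, burst: int, polling: bool) -> SubmissionQueue:
    """Submit `total` commands; the device runs once every `burst` submissions."""
    sq = SubmissionQueue()
    for i in range(total):
        guest_submit(sq)
        if (i + 1) % burst == 0:
            device_poll(sq, rearm=not polling)
    return sq


if __name__ == "__main__":
    print("step 1 (shadow doorbell only):", run(1000, 8, polling=False).mmio_writes)
    print("step 2 (dedicated polling):  ", run(1000, 8, polling=True).mmio_writes)
```

In this model, 1000 submissions with a burst of 8 cost 125 doorbell writes when the device re-arms the event index after each pass (roughly step 1), and a single one when a polling thread never re-arms it (roughly step 2), versus 1000 with plain doorbells. The real numbers will of course depend on guest behavior and on how the polling thread is scheduled.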