> The driver will create the cq with an allocated vector, but it's not activated
> until after the driver wires it up to a handler. I think that's what you're
> observing with the incomplete MSIx table entry on creation.

Agreed. I dug through pci_request_irq()'s call chain and found that it ends up calling pci_write_msi_msg().

Now, to implement irqfd support, we need to register the (complete) MSI message with KVM so that KVM can send the interrupt directly when we signal the irqfd. My previous implementation delayed each CQ's MSI message registration until its first nvme_post_cqes(). I'm not sure whether this is a good choice. What do you think about this approach?

BTW, since we skip QEMU's MSI-X emulation with irqfd, we need to record the mask status of each interrupt vector ourselves. QEMU provides msix_set_vector_notifiers() to call handlers on each mask and unmask event, but this function works on a per-device basis. I guess it would be best to call msix_set_vector_notifiers() after all CQs are created, but I don't think qemu-nvme can tell when the host has finished creating CQs. Where do you think is the best place to register the mask/unmask callbacks? Is it OK to do it at, say, the first nvme_post_cqes() of the whole device?

Thanks,
Jinhao Fan
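
To make the two questions above concrete, here is a rough, self-contained sketch of the lazy-registration idea. It is not QEMU code: every `Sketch*` type and `sketch_*` function is a hypothetical stand-in, and the comments only mark where the real KVM registration and msix_set_vector_notifiers() calls would presumably go.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_CQS 4

/* Hypothetical stand-ins; none of these are QEMU types or functions. */
typedef struct {
    bool route_registered;      /* MSI message handed to KVM yet? */
} SketchCQ;

typedef struct {
    SketchCQ cq[SKETCH_MAX_CQS];
    bool notifiers_registered;  /* device-wide mask/unmask callbacks set? */
    int route_calls;            /* counters, only for illustration */
    int notifier_calls;
} SketchCtrl;

/* Real code would read the (by now complete) MSI-X table entry and
 * register the MSI message with KVM here. */
static void sketch_register_route(SketchCtrl *n, SketchCQ *cq)
{
    cq->route_registered = true;
    n->route_calls++;
}

/* Real code would call msix_set_vector_notifiers() here, once per device. */
static void sketch_register_notifiers(SketchCtrl *n)
{
    n->notifiers_registered = true;
    n->notifier_calls++;
}

/* Stand-in for nvme_post_cqes(): both registrations happen lazily. */
static void sketch_post_cqes(SketchCtrl *n, int qid)
{
    assert(qid >= 0 && qid < SKETCH_MAX_CQS);
    SketchCQ *cq = &n->cq[qid];

    if (!n->notifiers_registered) {
        sketch_register_notifiers(n);   /* first post on the whole device */
    }
    if (!cq->route_registered) {
        sketch_register_route(n, cq);   /* first post on this CQ */
    }
    /* ... post completions and signal the irqfd ... */
}
```

In this model the per-device callback registration runs exactly once and the per-CQ registration runs once per queue, regardless of the order in which the host creates or uses its CQs.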