Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
On Thu, Sep 27, 2018 at 6:07 AM Pankaj Gupta wrote:
[..]
> > We are plugging a VIRTIO based flush callback for the virtio_pmem driver. If the
> > pmem driver (pmem_make_request) has to queue requests, we have to plug
> > "blk_mq_ops" callbacks for the corresponding VIRTIO vqs. AFAICU there is no
> > existing multiqueue code merged for the pmem driver yet, though I could see
> > patches by Dave upstream.
>
> I thought about this and, with the current infrastructure, "make_request"
> releases the spinlock and sleeps the current thread/task. All other threads are
> free to call 'make_request'/flush and similarly wait by releasing the lock.

Which lock are you referring to?

> This actually works like a queue of threads waiting for notifications from the
> host.
>
> The current pmem code does not have multiqueue support and I am not sure if the
> core pmem code needs it. Adding multiqueue support just for virtio-pmem and not
> for pmem in the same driver would be confusing or require a lot of tweaking.

Why does the pmem driver need to be converted to multiqueue support?

> Could you please give your suggestions on this.

I was expecting that flush requests that cannot be completed synchronously be
placed on a queue and have bio_endio() called at a future time, i.e. use
bio_chain() to manage the async portion of the flush request. This causes the
guest block layer to just assume the bio was queued and will be completed at
some point in the future.
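Dan's queue-and-complete-later suggestion can be sketched as a small userspace model (plain C, NOT kernel code; the `endio` callback stands in for bio_endio() on a chained bio, and all names are illustrative):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of the async completion scheme: submission only
 * enqueues the request and returns; a later "host ack" pass invokes
 * each request's completion callback (the bio_endio() analogue). */
struct flush_req {
	void (*endio)(struct flush_req *req); /* bio_endio() stand-in */
	bool completed;
	struct flush_req *next;
};

static struct flush_req *pending;         /* queued, not yet acked */
static pthread_mutex_t pending_lock = PTHREAD_MUTEX_INITIALIZER;

/* make_request analogue: queue and return immediately, never block */
static void submit_flush(struct flush_req *req)
{
	pthread_mutex_lock(&pending_lock);
	req->next = pending;
	pending = req;
	pthread_mutex_unlock(&pending_lock);
}

/* host-ack analogue: complete everything queued so far */
static void host_ack_all(void)
{
	pthread_mutex_lock(&pending_lock);
	struct flush_req *req = pending;
	pending = NULL;
	pthread_mutex_unlock(&pending_lock);
	while (req) {
		struct flush_req *next = req->next;
		req->endio(req);          /* "bio_endio()" */
		req = next;
	}
}

/* sample completion callback used by the demo */
static void mark_done(struct flush_req *req)
{
	req->completed = true;
}
```

The point of the pattern is that the submitter never sleeps: the block layer treats the bio as queued, and completion arrives whenever the host responds.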
Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
Hello Dan,

> > > + /* The request submission function */
> > > +static int virtio_pmem_flush(struct nd_region *nd_region)
> > > +{
> > > +	int err;
[...]
> > > +	init_waitqueue_head(&req->host_acked);
> > > +	init_waitqueue_head(&req->wq_buf);
> > > +
> > > +	spin_lock_irqsave(&vpmem->pmem_lock, flags);
> > > +	sg_init_one(&sg, req->name, strlen(req->name));
> > > +	sgs[0] = &sg;
> > > +	sg_init_one(&ret, &req->ret, sizeof(req->ret));
> > > +	sgs[1] = &ret;
[...]
> > > +	spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> > > +	/* When host has read buffer, this completes via host_ack */
> > > +	wait_event(req->host_acked, req->done);
> >
> > Hmm, this seems awkward if this is called from pmem_make_request. If
> > we need to wait for completion that should be managed by the guest
> > block layer. I.e. make_request should just queue request and then
> > trigger bio_endio() when the response comes back.
>
> We are plugging a VIRTIO based flush callback for the virtio_pmem driver. If the
> pmem driver (pmem_make_request) has to queue requests, we have to plug
> "blk_mq_ops" callbacks for the corresponding VIRTIO vqs. AFAICU there is no
> existing multiqueue code merged for the pmem driver yet, though I could see
> patches by Dave upstream.

I thought about this and, with the current infrastructure, "make_request"
releases the spinlock and sleeps the current thread/task. All other threads are
free to call 'make_request'/flush and similarly wait by releasing the lock. This
actually works like a queue of threads waiting for notifications from the host.

The current pmem code does not have multiqueue support and I am not sure if the
core pmem code needs it. Adding multiqueue support just for virtio-pmem and not
for pmem in the same driver would be confusing or require a lot of tweaking.

Could you please give your suggestions on this.

Thanks,
Pankaj
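For readers following the thread: the synchronous wait described in this message can be sketched in userspace C, with pthreads standing in for Linux wait queues (all names are illustrative, not the driver's):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Sketch of the synchronous scheme: the flushing thread submits its
 * request and then sleeps until the host-ack path sets req->done --
 * the counterpart of wait_event(req->host_acked, req->done). */
struct pmem_req {
	bool done;
	pthread_mutex_t lock;
	pthread_cond_t host_acked;
};

/* models the host processing the flush and acking the guest */
static void *host_side(void *arg)
{
	struct pmem_req *req = arg;
	pthread_mutex_lock(&req->lock);
	req->done = true;
	pthread_cond_signal(&req->host_acked); /* wake_up() analogue */
	pthread_mutex_unlock(&req->lock);
	return NULL;
}

/* models the flush call: kick the host, then block until acked */
static int virtio_flush_sync(struct pmem_req *req)
{
	pthread_t host;

	req->done = false;
	pthread_mutex_init(&req->lock, NULL);
	pthread_cond_init(&req->host_acked, NULL);

	if (pthread_create(&host, NULL, host_side, req) != 0)
		return -1;                 /* could not "kick" the host */

	pthread_mutex_lock(&req->lock);
	while (!req->done)                 /* wait_event() analogue */
		pthread_cond_wait(&req->host_acked, &req->lock);
	pthread_mutex_unlock(&req->lock);

	pthread_join(host, NULL);
	return 0;
}
```

Each caller blocks independently on its own request, which is why concurrent flushers naturally form the "queue of threads" described above.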
Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
Hi Dan, Thanks for the review. Please find my reply inline. > > This patch adds virtio-pmem driver for KVM guest. > > > > Guest reads the persistent memory range information from > > Qemu over VIRTIO and registers it on nvdimm_bus. It also > > creates a nd_region object with the persistent memory > > range information so that existing 'nvdimm/pmem' driver > > can reserve this into system memory map. This way > > 'virtio-pmem' driver uses existing functionality of pmem > > driver to register persistent memory compatible for DAX > > capable filesystems. > > > > This also provides function to perform guest flush over > > VIRTIO from 'pmem' driver when userspace performs flush > > on DAX memory range. > > > > Signed-off-by: Pankaj Gupta > > --- > > drivers/virtio/Kconfig | 9 ++ > > drivers/virtio/Makefile | 1 + > > drivers/virtio/virtio_pmem.c | 255 > > +++ > > include/uapi/linux/virtio_ids.h | 1 + > > include/uapi/linux/virtio_pmem.h | 40 ++ > > 5 files changed, 306 insertions(+) > > create mode 100644 drivers/virtio/virtio_pmem.c > > create mode 100644 include/uapi/linux/virtio_pmem.h > > > > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig > > index 3589764..a331e23 100644 > > --- a/drivers/virtio/Kconfig > > +++ b/drivers/virtio/Kconfig > > @@ -42,6 +42,15 @@ config VIRTIO_PCI_LEGACY > > > > If unsure, say Y. > > > > +config VIRTIO_PMEM > > + tristate "Support for virtio pmem driver" > > + depends on VIRTIO > > + help > > + This driver provides support for virtio based flushing interface > > + for persistent memory range. > > + > > + If unsure, say M. 
> > +
> > config VIRTIO_BALLOON
> > 	tristate "Virtio balloon driver"
> > 	depends on VIRTIO
> > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> > index 3a2b5c5..cbe91c6 100644
> > --- a/drivers/virtio/Makefile
> > +++ b/drivers/virtio/Makefile
> > @@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
> > virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
> > obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
> > obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
> > +obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o
> > diff --git a/drivers/virtio/virtio_pmem.c b/drivers/virtio/virtio_pmem.c
> > new file mode 100644
> > index 000..c22cc87
> > --- /dev/null
> > +++ b/drivers/virtio/virtio_pmem.c
> > @@ -0,0 +1,255 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * virtio_pmem.c: Virtio pmem Driver
> > + *
> > + * Discovers persistent memory range information
> > + * from host and provides a virtio based flushing
> > + * interface.
> > + */
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
>
> I think we need to split this driver into 2 files, drivers/virtio/pmem.c
> would discover and register the virtual pmem device with the libnvdimm
> core, and drivers/nvdimm/virtio.c would house virtio_pmem_flush().

OK. I will split the driver into two files as suggested.

> > +
> > +struct virtio_pmem_request {
> > +	/* Host return status corresponding to flush request */
> > +	int ret;
> > +
> > +	/* command name*/
> > +	char name[16];
> > +
> > +	/* Wait queue to process deferred work after ack from host */
> > +	wait_queue_head_t host_acked;
> > +	bool done;
> > +
> > +	/* Wait queue to process deferred work after virt queue buffer avail */
> > +	wait_queue_head_t wq_buf;
>
> Why does this need wait_queues per request? Shouldn't this be per-device?

This is used to make flush-calling threads wait when the virtio queue is full.
The wait_queue in the request struct binds the waitqueue to the request. When the
host acknowledges the guest, the first waiting request is selected and the
corresponding thread is woken up. Alternatively, we can use
"add_wait_queue_exclusive" with a device-level wait_queue. That would wake up
only one exclusively-waiting process, and would avoid keeping an additional list
for tracking.

> > +	bool wq_buf_avail;
> > +	struct list_head list;
> > +};
> > +
> > +struct virtio_pmem {
> > +	struct virtio_device *vdev;
> > +
> > +	/* Virtio pmem request queue */
> > +	struct virtqueue *req_vq;
> > +
> > +	/* nvdimm bus registers virtio pmem device */
> > +	struct nvdimm_bus *nvdimm_bus;
> > +	struct nvdimm_bus_descriptor nd_desc;
> > +
> > +	/* List to store deferred work if virtqueue is full */
> > +	struct list_head req_list;
> > +
> > +	/* Synchronize virtqueue data */
> > +	spinlock_t pmem_lock;
> > +
> > +	/* Memory region information */
> > +	uint64_t start;
> > +	uint64_t size;
> > +};
> > +
> > +static struct virtio_device_id id_table[] = {
> > +	{ VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> > +	{ 0 },
> > +};
> > +
> > + /* The interrupt handler */
> > +static voi
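The wake-one bookkeeping described in this message — park flushers when the virtqueue is full, release only the first parked request on each host ack — can be modeled deterministically in plain C (illustrative names; the flag stands in for the wake_up(&req->wq_buf) call):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal single-threaded model of the deferred-work list: requests
 * parked because the virtqueue was full sit on req_list in FIFO order;
 * each host ack releases exactly the first waiter. */
struct vreq {
	bool wq_buf_avail;            /* set when this request may retry */
	struct vreq *next;
};

struct vpmem_dev {
	struct vreq *req_list;        /* FIFO head */
	struct vreq *req_list_tail;
};

/* park a request because no virtqueue buffer was available */
static void defer_request(struct vpmem_dev *dev, struct vreq *req)
{
	req->wq_buf_avail = false;
	req->next = NULL;
	if (dev->req_list_tail)
		dev->req_list_tail->next = req;
	else
		dev->req_list = req;
	dev->req_list_tail = req;
}

/* host returned a buffer: release the first parked request only */
static struct vreq *ack_releases_one(struct vpmem_dev *dev)
{
	struct vreq *req = dev->req_list;
	if (!req)
		return NULL;
	dev->req_list = req->next;    /* list_del() of the first entry */
	if (!dev->req_list)
		dev->req_list_tail = NULL;
	req->wq_buf_avail = true;     /* wake_up(&req->wq_buf) analogue */
	return req;
}
```

An exclusive per-device waitqueue would give the same wake-one behavior without this explicit list, at the cost of losing the per-request handle.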
Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
On Fri, Aug 31, 2018 at 6:32 AM Pankaj Gupta wrote: > > This patch adds virtio-pmem driver for KVM guest. > > Guest reads the persistent memory range information from > Qemu over VIRTIO and registers it on nvdimm_bus. It also > creates a nd_region object with the persistent memory > range information so that existing 'nvdimm/pmem' driver > can reserve this into system memory map. This way > 'virtio-pmem' driver uses existing functionality of pmem > driver to register persistent memory compatible for DAX > capable filesystems. > > This also provides function to perform guest flush over > VIRTIO from 'pmem' driver when userspace performs flush > on DAX memory range. > > Signed-off-by: Pankaj Gupta > --- > drivers/virtio/Kconfig | 9 ++ > drivers/virtio/Makefile | 1 + > drivers/virtio/virtio_pmem.c | 255 > +++ > include/uapi/linux/virtio_ids.h | 1 + > include/uapi/linux/virtio_pmem.h | 40 ++ > 5 files changed, 306 insertions(+) > create mode 100644 drivers/virtio/virtio_pmem.c > create mode 100644 include/uapi/linux/virtio_pmem.h > > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig > index 3589764..a331e23 100644 > --- a/drivers/virtio/Kconfig > +++ b/drivers/virtio/Kconfig > @@ -42,6 +42,15 @@ config VIRTIO_PCI_LEGACY > > If unsure, say Y. > > +config VIRTIO_PMEM > + tristate "Support for virtio pmem driver" > + depends on VIRTIO > + help > + This driver provides support for virtio based flushing interface > + for persistent memory range. > + > + If unsure, say M. 
> + > config VIRTIO_BALLOON > tristate "Virtio balloon driver" > depends on VIRTIO > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile > index 3a2b5c5..cbe91c6 100644 > --- a/drivers/virtio/Makefile > +++ b/drivers/virtio/Makefile > @@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o > virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o > obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o > obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o > +obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o > diff --git a/drivers/virtio/virtio_pmem.c b/drivers/virtio/virtio_pmem.c > new file mode 100644 > index 000..c22cc87 > --- /dev/null > +++ b/drivers/virtio/virtio_pmem.c > @@ -0,0 +1,255 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * virtio_pmem.c: Virtio pmem Driver > + * > + * Discovers persistent memory range information > + * from host and provides a virtio based flushing > + * interface. > + */ > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include I think we need to split this driver into 2 files, drivers/virtio/pmem.c would discover and register the virtual pmem device with the libnvdimm core, and drivers/nvdimm/virtio.c would house virtio_pmem_flush(). > + > +struct virtio_pmem_request { > + /* Host return status corresponding to flush request */ > + int ret; > + > + /* command name*/ > + char name[16]; > + > + /* Wait queue to process deferred work after ack from host */ > + wait_queue_head_t host_acked; > + bool done; > + > + /* Wait queue to process deferred work after virt queue buffer avail > */ > + wait_queue_head_t wq_buf; Why does this need wait_queue's per request? shouldn't this be per-device? 
> + bool wq_buf_avail; > + struct list_head list; > +}; > + > +struct virtio_pmem { > + struct virtio_device *vdev; > + > + /* Virtio pmem request queue */ > + struct virtqueue *req_vq; > + > + /* nvdimm bus registers virtio pmem device */ > + struct nvdimm_bus *nvdimm_bus; > + struct nvdimm_bus_descriptor nd_desc; > + > + /* List to store deferred work if virtqueue is full */ > + struct list_head req_list; > + > + /* Synchronize virtqueue data */ > + spinlock_t pmem_lock; > + > + /* Memory region information */ > + uint64_t start; > + uint64_t size; > +}; > + > +static struct virtio_device_id id_table[] = { > + { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID }, > + { 0 }, > +}; > + > + /* The interrupt handler */ > +static void host_ack(struct virtqueue *vq) > +{ > + unsigned int len; > + unsigned long flags; > + struct virtio_pmem_request *req, *req_buf; > + struct virtio_pmem *vpmem = vq->vdev->priv; > + > + spin_lock_irqsave(&vpmem->pmem_lock, flags); > + while ((req = virtqueue_get_buf(vq, &len)) != NULL) { > + req->done = true; > + wake_up(&req->host_acked); > + > + if (!list_empty(&vpmem->req_list)) { > + req_buf = list_first_entry(&vpmem->req_list, > + struct virtio_pmem_request, list); > + list_del(&vpmem->req_list); > + req_buf->wq_buf_avail = true; > +
Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
> > > Hi Luiz, > > > > Thanks for the review. > > > > > > > > > This patch adds virtio-pmem driver for KVM guest. > > > > > > > > Guest reads the persistent memory range information from > > > > Qemu over VIRTIO and registers it on nvdimm_bus. It also > > > > creates a nd_region object with the persistent memory > > > > range information so that existing 'nvdimm/pmem' driver > > > > can reserve this into system memory map. This way > > > > 'virtio-pmem' driver uses existing functionality of pmem > > > > driver to register persistent memory compatible for DAX > > > > capable filesystems. > > > > > > > > This also provides function to perform guest flush over > > > > VIRTIO from 'pmem' driver when userspace performs flush > > > > on DAX memory range. > > > > > > > > Signed-off-by: Pankaj Gupta > > > > --- > > > > drivers/virtio/Kconfig | 9 ++ > > > > drivers/virtio/Makefile | 1 + > > > > drivers/virtio/virtio_pmem.c | 255 > > > > +++ > > > > include/uapi/linux/virtio_ids.h | 1 + > > > > include/uapi/linux/virtio_pmem.h | 40 ++ > > > > 5 files changed, 306 insertions(+) > > > > create mode 100644 drivers/virtio/virtio_pmem.c > > > > create mode 100644 include/uapi/linux/virtio_pmem.h > > > > > > > > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig > > > > index 3589764..a331e23 100644 > > > > --- a/drivers/virtio/Kconfig > > > > +++ b/drivers/virtio/Kconfig > > > > @@ -42,6 +42,15 @@ config VIRTIO_PCI_LEGACY > > > > > > > > If unsure, say Y. > > > > > > > > +config VIRTIO_PMEM > > > > + tristate "Support for virtio pmem driver" > > > > + depends on VIRTIO > > > > + help > > > > + This driver provides support for virtio based flushing interface > > > > + for persistent memory range. > > > > + > > > > + If unsure, say M. 
> > > > + > > > > config VIRTIO_BALLOON > > > > tristate "Virtio balloon driver" > > > > depends on VIRTIO > > > > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile > > > > index 3a2b5c5..cbe91c6 100644 > > > > --- a/drivers/virtio/Makefile > > > > +++ b/drivers/virtio/Makefile > > > > @@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o > > > > virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o > > > > obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o > > > > obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o > > > > +obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o > > > > diff --git a/drivers/virtio/virtio_pmem.c > > > > b/drivers/virtio/virtio_pmem.c > > > > new file mode 100644 > > > > index 000..c22cc87 > > > > --- /dev/null > > > > +++ b/drivers/virtio/virtio_pmem.c > > > > @@ -0,0 +1,255 @@ > > > > +// SPDX-License-Identifier: GPL-2.0 > > > > +/* > > > > + * virtio_pmem.c: Virtio pmem Driver > > > > + * > > > > + * Discovers persistent memory range information > > > > + * from host and provides a virtio based flushing > > > > + * interface. 
> > > > + */ > > > > +#include > > > > +#include > > > > +#include > > > > +#include > > > > +#include > > > > +#include > > > > +#include > > > > +#include > > > > + > > > > +struct virtio_pmem_request { > > > > + /* Host return status corresponding to flush request */ > > > > + int ret; > > > > + > > > > + /* command name*/ > > > > + char name[16]; > > > > + > > > > + /* Wait queue to process deferred work after ack from host */ > > > > + wait_queue_head_t host_acked; > > > > + bool done; > > > > + > > > > + /* Wait queue to process deferred work after virt queue buffer > > > > avail > > > > */ > > > > + wait_queue_head_t wq_buf; > > > > + bool wq_buf_avail; > > > > + struct list_head list; > > > > +}; > > > > + > > > > +struct virtio_pmem { > > > > + struct virtio_device *vdev; > > > > + > > > > + /* Virtio pmem request queue */ > > > > + struct virtqueue *req_vq; > > > > + > > > > + /* nvdimm bus registers virtio pmem device */ > > > > + struct nvdimm_bus *nvdimm_bus; > > > > + struct nvdimm_bus_descriptor nd_desc; > > > > + > > > > + /* List to store deferred work if virtqueue is full */ > > > > + struct list_head req_list; > > > > + > > > > + /* Synchronize virtqueue data */ > > > > + spinlock_t pmem_lock; > > > > + > > > > + /* Memory region information */ > > > > + uint64_t start; > > > > + uint64_t size; > > > > +}; > > > > + > > > > +static struct virtio_device_id id_table[] = { > > > > + { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID }, > > > > + { 0 }, > > > > +}; > > > > + > > > > + /* The interrupt handler */ > > > > +static void host_ack(struct virtqueue *vq) > > > > +{ > > > > + unsigned int len; > > > > + unsigned long flags; > > > > + struct virtio_pmem_request *req, *req_buf; > > > > + struct virtio_pmem *vpmem = vq->vdev->priv; > > > > + > > > > +
Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
On Thu, 13 Sep 2018 02:58:21 -0400 (EDT) Pankaj Gupta wrote: > Hi Luiz, > > Thanks for the review. > > > > > > This patch adds virtio-pmem driver for KVM guest. > > > > > > Guest reads the persistent memory range information from > > > Qemu over VIRTIO and registers it on nvdimm_bus. It also > > > creates a nd_region object with the persistent memory > > > range information so that existing 'nvdimm/pmem' driver > > > can reserve this into system memory map. This way > > > 'virtio-pmem' driver uses existing functionality of pmem > > > driver to register persistent memory compatible for DAX > > > capable filesystems. > > > > > > This also provides function to perform guest flush over > > > VIRTIO from 'pmem' driver when userspace performs flush > > > on DAX memory range. > > > > > > Signed-off-by: Pankaj Gupta > > > --- > > > drivers/virtio/Kconfig | 9 ++ > > > drivers/virtio/Makefile | 1 + > > > drivers/virtio/virtio_pmem.c | 255 > > > +++ > > > include/uapi/linux/virtio_ids.h | 1 + > > > include/uapi/linux/virtio_pmem.h | 40 ++ > > > 5 files changed, 306 insertions(+) > > > create mode 100644 drivers/virtio/virtio_pmem.c > > > create mode 100644 include/uapi/linux/virtio_pmem.h > > > > > > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig > > > index 3589764..a331e23 100644 > > > --- a/drivers/virtio/Kconfig > > > +++ b/drivers/virtio/Kconfig > > > @@ -42,6 +42,15 @@ config VIRTIO_PCI_LEGACY > > > > > > If unsure, say Y. > > > > > > +config VIRTIO_PMEM > > > + tristate "Support for virtio pmem driver" > > > + depends on VIRTIO > > > + help > > > + This driver provides support for virtio based flushing interface > > > + for persistent memory range. > > > + > > > + If unsure, say M. 
> > > + > > > config VIRTIO_BALLOON > > > tristate "Virtio balloon driver" > > > depends on VIRTIO > > > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile > > > index 3a2b5c5..cbe91c6 100644 > > > --- a/drivers/virtio/Makefile > > > +++ b/drivers/virtio/Makefile > > > @@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o > > > virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o > > > obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o > > > obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o > > > +obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o > > > diff --git a/drivers/virtio/virtio_pmem.c b/drivers/virtio/virtio_pmem.c > > > new file mode 100644 > > > index 000..c22cc87 > > > --- /dev/null > > > +++ b/drivers/virtio/virtio_pmem.c > > > @@ -0,0 +1,255 @@ > > > +// SPDX-License-Identifier: GPL-2.0 > > > +/* > > > + * virtio_pmem.c: Virtio pmem Driver > > > + * > > > + * Discovers persistent memory range information > > > + * from host and provides a virtio based flushing > > > + * interface. 
> > > + */ > > > +#include > > > +#include > > > +#include > > > +#include > > > +#include > > > +#include > > > +#include > > > +#include > > > + > > > +struct virtio_pmem_request { > > > + /* Host return status corresponding to flush request */ > > > + int ret; > > > + > > > + /* command name*/ > > > + char name[16]; > > > + > > > + /* Wait queue to process deferred work after ack from host */ > > > + wait_queue_head_t host_acked; > > > + bool done; > > > + > > > + /* Wait queue to process deferred work after virt queue buffer avail */ > > > + wait_queue_head_t wq_buf; > > > + bool wq_buf_avail; > > > + struct list_head list; > > > +}; > > > + > > > +struct virtio_pmem { > > > + struct virtio_device *vdev; > > > + > > > + /* Virtio pmem request queue */ > > > + struct virtqueue *req_vq; > > > + > > > + /* nvdimm bus registers virtio pmem device */ > > > + struct nvdimm_bus *nvdimm_bus; > > > + struct nvdimm_bus_descriptor nd_desc; > > > + > > > + /* List to store deferred work if virtqueue is full */ > > > + struct list_head req_list; > > > + > > > + /* Synchronize virtqueue data */ > > > + spinlock_t pmem_lock; > > > + > > > + /* Memory region information */ > > > + uint64_t start; > > > + uint64_t size; > > > +}; > > > + > > > +static struct virtio_device_id id_table[] = { > > > + { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID }, > > > + { 0 }, > > > +}; > > > + > > > + /* The interrupt handler */ > > > +static void host_ack(struct virtqueue *vq) > > > +{ > > > + unsigned int len; > > > + unsigned long flags; > > > + struct virtio_pmem_request *req, *req_buf; > > > + struct virtio_pmem *vpmem = vq->vdev->priv; > > > + > > > + spin_lock_irqsave(&vpmem->pmem_lock, flags); > > > + while ((req = virtqueue_get_buf(vq, &len)) != NULL) { > > > + req->done = true; > > > + wake_up(&req->host_acked); > > > + > > > + if (!list_empty(&vpmem->req_list)) { > > > + req_buf = list_first_entry(&vpmem->req_list, > > > + struct virtio_pmem_request, list); > > > + list_del(&vpmem->req_list); 
> > > + req_buf->wq
Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
Hi Luiz, Thanks for the review. > > > This patch adds virtio-pmem driver for KVM guest. > > > > Guest reads the persistent memory range information from > > Qemu over VIRTIO and registers it on nvdimm_bus. It also > > creates a nd_region object with the persistent memory > > range information so that existing 'nvdimm/pmem' driver > > can reserve this into system memory map. This way > > 'virtio-pmem' driver uses existing functionality of pmem > > driver to register persistent memory compatible for DAX > > capable filesystems. > > > > This also provides function to perform guest flush over > > VIRTIO from 'pmem' driver when userspace performs flush > > on DAX memory range. > > > > Signed-off-by: Pankaj Gupta > > --- > > drivers/virtio/Kconfig | 9 ++ > > drivers/virtio/Makefile | 1 + > > drivers/virtio/virtio_pmem.c | 255 > > +++ > > include/uapi/linux/virtio_ids.h | 1 + > > include/uapi/linux/virtio_pmem.h | 40 ++ > > 5 files changed, 306 insertions(+) > > create mode 100644 drivers/virtio/virtio_pmem.c > > create mode 100644 include/uapi/linux/virtio_pmem.h > > > > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig > > index 3589764..a331e23 100644 > > --- a/drivers/virtio/Kconfig > > +++ b/drivers/virtio/Kconfig > > @@ -42,6 +42,15 @@ config VIRTIO_PCI_LEGACY > > > > If unsure, say Y. > > > > +config VIRTIO_PMEM > > + tristate "Support for virtio pmem driver" > > + depends on VIRTIO > > + help > > + This driver provides support for virtio based flushing interface > > + for persistent memory range. > > + > > + If unsure, say M. 
> > + > > config VIRTIO_BALLOON > > tristate "Virtio balloon driver" > > depends on VIRTIO > > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile > > index 3a2b5c5..cbe91c6 100644 > > --- a/drivers/virtio/Makefile > > +++ b/drivers/virtio/Makefile > > @@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o > > virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o > > obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o > > obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o > > +obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o > > diff --git a/drivers/virtio/virtio_pmem.c b/drivers/virtio/virtio_pmem.c > > new file mode 100644 > > index 000..c22cc87 > > --- /dev/null > > +++ b/drivers/virtio/virtio_pmem.c > > @@ -0,0 +1,255 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > +/* > > + * virtio_pmem.c: Virtio pmem Driver > > + * > > + * Discovers persistent memory range information > > + * from host and provides a virtio based flushing > > + * interface. > > + */ > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > + > > +struct virtio_pmem_request { > > + /* Host return status corresponding to flush request */ > > + int ret; > > + > > + /* command name*/ > > + char name[16]; > > + > > + /* Wait queue to process deferred work after ack from host */ > > + wait_queue_head_t host_acked; > > + bool done; > > + > > + /* Wait queue to process deferred work after virt queue buffer avail */ > > + wait_queue_head_t wq_buf; > > + bool wq_buf_avail; > > + struct list_head list; > > +}; > > + > > +struct virtio_pmem { > > + struct virtio_device *vdev; > > + > > + /* Virtio pmem request queue */ > > + struct virtqueue *req_vq; > > + > > + /* nvdimm bus registers virtio pmem device */ > > + struct nvdimm_bus *nvdimm_bus; > > + struct nvdimm_bus_descriptor nd_desc; > > + > > + /* List to store deferred work if virtqueue is full */ > > + struct list_head req_list; > > + > > + /* Synchronize virtqueue data 
*/ > > + spinlock_t pmem_lock; > > + > > + /* Memory region information */ > > + uint64_t start; > > + uint64_t size; > > +}; > > + > > +static struct virtio_device_id id_table[] = { > > + { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID }, > > + { 0 }, > > +}; > > + > > + /* The interrupt handler */ > > +static void host_ack(struct virtqueue *vq) > > +{ > > + unsigned int len; > > + unsigned long flags; > > + struct virtio_pmem_request *req, *req_buf; > > + struct virtio_pmem *vpmem = vq->vdev->priv; > > + > > + spin_lock_irqsave(&vpmem->pmem_lock, flags); > > + while ((req = virtqueue_get_buf(vq, &len)) != NULL) { > > + req->done = true; > > + wake_up(&req->host_acked); > > + > > + if (!list_empty(&vpmem->req_list)) { > > + req_buf = list_first_entry(&vpmem->req_list, > > + struct virtio_pmem_request, list); > > + list_del(&vpmem->req_list); > > + req_buf->wq_buf_avail = true; > > + wake_up(&req_buf->wq_buf); > > + } > > + } > > + spin_unlock_irqrestore(&vpmem->pmem_lock, flags); > > +} > > + /* Initialize virt queue */ > > +static int init_vq(struct virtio_pmem *vpmem) > > +{ >
Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
On Fri, 31 Aug 2018 19:00:18 +0530 Pankaj Gupta wrote: > This patch adds virtio-pmem driver for KVM guest. > > Guest reads the persistent memory range information from > Qemu over VIRTIO and registers it on nvdimm_bus. It also > creates a nd_region object with the persistent memory > range information so that existing 'nvdimm/pmem' driver > can reserve this into system memory map. This way > 'virtio-pmem' driver uses existing functionality of pmem > driver to register persistent memory compatible for DAX > capable filesystems. > > This also provides function to perform guest flush over > VIRTIO from 'pmem' driver when userspace performs flush > on DAX memory range. > > Signed-off-by: Pankaj Gupta > --- > drivers/virtio/Kconfig | 9 ++ > drivers/virtio/Makefile | 1 + > drivers/virtio/virtio_pmem.c | 255 > +++ > include/uapi/linux/virtio_ids.h | 1 + > include/uapi/linux/virtio_pmem.h | 40 ++ > 5 files changed, 306 insertions(+) > create mode 100644 drivers/virtio/virtio_pmem.c > create mode 100644 include/uapi/linux/virtio_pmem.h > > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig > index 3589764..a331e23 100644 > --- a/drivers/virtio/Kconfig > +++ b/drivers/virtio/Kconfig > @@ -42,6 +42,15 @@ config VIRTIO_PCI_LEGACY > > If unsure, say Y. > > +config VIRTIO_PMEM > + tristate "Support for virtio pmem driver" > + depends on VIRTIO > + help > + This driver provides support for virtio based flushing interface > + for persistent memory range. > + > + If unsure, say M. 
> + > config VIRTIO_BALLOON > tristate "Virtio balloon driver" > depends on VIRTIO > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile > index 3a2b5c5..cbe91c6 100644 > --- a/drivers/virtio/Makefile > +++ b/drivers/virtio/Makefile > @@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o > virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o > obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o > obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o > +obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o > diff --git a/drivers/virtio/virtio_pmem.c b/drivers/virtio/virtio_pmem.c > new file mode 100644 > index 000..c22cc87 > --- /dev/null > +++ b/drivers/virtio/virtio_pmem.c > @@ -0,0 +1,255 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * virtio_pmem.c: Virtio pmem Driver > + * > + * Discovers persistent memory range information > + * from host and provides a virtio based flushing > + * interface. > + */ > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +struct virtio_pmem_request { > + /* Host return status corresponding to flush request */ > + int ret; > + > + /* command name*/ > + char name[16]; > + > + /* Wait queue to process deferred work after ack from host */ > + wait_queue_head_t host_acked; > + bool done; > + > + /* Wait queue to process deferred work after virt queue buffer avail */ > + wait_queue_head_t wq_buf; > + bool wq_buf_avail; > + struct list_head list; > +}; > + > +struct virtio_pmem { > + struct virtio_device *vdev; > + > + /* Virtio pmem request queue */ > + struct virtqueue *req_vq; > + > + /* nvdimm bus registers virtio pmem device */ > + struct nvdimm_bus *nvdimm_bus; > + struct nvdimm_bus_descriptor nd_desc; > + > + /* List to store deferred work if virtqueue is full */ > + struct list_head req_list; > + > + /* Synchronize virtqueue data */ > + spinlock_t pmem_lock; > + > + /* Memory region information */ > + uint64_t start; > + uint64_t size; > +}; > + > +static 
struct virtio_device_id id_table[] = { > + { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID }, > + { 0 }, > +}; > + > + /* The interrupt handler */ > +static void host_ack(struct virtqueue *vq) > +{ > + unsigned int len; > + unsigned long flags; > + struct virtio_pmem_request *req, *req_buf; > + struct virtio_pmem *vpmem = vq->vdev->priv; > + > + spin_lock_irqsave(&vpmem->pmem_lock, flags); > + while ((req = virtqueue_get_buf(vq, &len)) != NULL) { > + req->done = true; > + wake_up(&req->host_acked); > + > + if (!list_empty(&vpmem->req_list)) { > + req_buf = list_first_entry(&vpmem->req_list, > + struct virtio_pmem_request, list); > + list_del(&vpmem->req_list); > + req_buf->wq_buf_avail = true; > + wake_up(&req_buf->wq_buf); > + } > + } > + spin_unlock_irqrestore(&vpmem->pmem_lock, flags); > +} > + /* Initialize virt queue */ > +static int init_vq(struct virtio_pmem *vpmem) > +{ > + struct virtqueue *vq; > + > + /* single vq */ > + vpmem->req_vq = vq = virtio_find_single_vq(vpmem->vdev, > + host_ack, "flush_queue
Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
Hi Pankaj,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linux-nvdimm/libnvdimm-for-next]
[also build test ERROR on v4.19-rc2 next-20180905]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Pankaj-Gupta/kvm-fake-DAX-device/20180903-160032
base:   https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: i386-allyesconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

   drivers/virtio/virtio_pmem.o: In function `virtio_pmem_remove':
>> virtio_pmem.c:(.text+0x299): undefined reference to `nvdimm_bus_unregister'
   drivers/virtio/virtio_pmem.o: In function `virtio_pmem_probe':
>> virtio_pmem.c:(.text+0x5e3): undefined reference to `nvdimm_bus_register'
>> virtio_pmem.c:(.text+0x62a): undefined reference to `nvdimm_pmem_region_create'
   virtio_pmem.c:(.text+0x63b): undefined reference to `nvdimm_bus_unregister'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
Hello, Thanks for the report. > Hi Pankaj, > > Thank you for the patch! Yet something to improve: > > [auto build test ERROR on linux-nvdimm/libnvdimm-for-next] > [also build test ERROR on v4.19-rc2 next-20180903] > [if your patch is applied to the wrong git tree, please drop us a note to > help improve the system] > > url: > https://github.com/0day-ci/linux/commits/Pankaj-Gupta/kvm-fake-DAX-device/20180903-160032 > base: https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git > libnvdimm-for-next > config: i386-randconfig-a3-201835 (attached as .config) > compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4 > reproduce: > # save the attached .config to linux build tree > make ARCH=i386 > :: branch date: 21 hours ago > :: commit date: 21 hours ago > > All errors (new ones prefixed by >>): > >drivers/virtio/virtio_pmem.o: In function `virtio_pmem_remove': > >> drivers/virtio/virtio_pmem.c:220: undefined reference to > >> `nvdimm_bus_unregister' >drivers/virtio/virtio_pmem.o: In function `virtio_pmem_probe': > >> drivers/virtio/virtio_pmem.c:186: undefined reference to > >> `nvdimm_bus_register' > >> drivers/virtio/virtio_pmem.c:198: undefined reference to > >> `nvdimm_pmem_region_create' >drivers/virtio/virtio_pmem.c:207: undefined reference to >`nvdimm_bus_unregister' It looks like the dependent configuration option 'LIBNVDIMM' is not enabled, so the nvdimm bus symbols are unresolved at link time. I will add the dependency to the Kconfig entry for virtio_pmem in v2. 
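For reference, a minimal sketch of the intended v2 fix — the surrounding option text is taken from the posted patch; only the added `depends on LIBNVDIMM` line is new, and its exact placement in v2 is an assumption:

```kconfig
config VIRTIO_PMEM
	tristate "Support for virtio pmem driver"
	depends on VIRTIO
	depends on LIBNVDIMM
	help
	  This driver provides support for virtio based flushing interface
	  for persistent memory range.

	  If unsure, say M.
```

With this dependency in place, Kconfig cannot select VIRTIO_PMEM without LIBNVDIMM, so nvdimm_bus_register() and friends are always available to the linker.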
Thanks, Pankaj > > # > https://github.com/0day-ci/linux/commit/acce2633da18b0ad58d0cc9243a85b03020ca099 > git remote add linux-review https://github.com/0day-ci/linux > git remote update linux-review > git checkout acce2633da18b0ad58d0cc9243a85b03020ca099 > vim +220 drivers/virtio/virtio_pmem.c > > acce2633 Pankaj Gupta 2018-08-31 147 > acce2633 Pankaj Gupta 2018-08-31 148 static int virtio_pmem_probe(struct > virtio_device *vdev) > acce2633 Pankaj Gupta 2018-08-31 149 { > acce2633 Pankaj Gupta 2018-08-31 150 int err = 0; > acce2633 Pankaj Gupta 2018-08-31 151 struct resource res; > acce2633 Pankaj Gupta 2018-08-31 152 struct virtio_pmem *vpmem; > acce2633 Pankaj Gupta 2018-08-31 153 struct nvdimm_bus *nvdimm_bus; > acce2633 Pankaj Gupta 2018-08-31 154 struct nd_region_desc ndr_desc; > acce2633 Pankaj Gupta 2018-08-31 155 int nid = > dev_to_node(&vdev->dev); > acce2633 Pankaj Gupta 2018-08-31 156 struct nd_region *nd_region; > acce2633 Pankaj Gupta 2018-08-31 157 > acce2633 Pankaj Gupta 2018-08-31 158 if (!vdev->config->get) { > acce2633 Pankaj Gupta 2018-08-31 159 dev_err(&vdev->dev, "%s > failure: > config disabled\n", > acce2633 Pankaj Gupta 2018-08-31 160 __func__); > acce2633 Pankaj Gupta 2018-08-31 161 return -EINVAL; > acce2633 Pankaj Gupta 2018-08-31 162 } > acce2633 Pankaj Gupta 2018-08-31 163 > acce2633 Pankaj Gupta 2018-08-31 164 vdev->priv = vpmem = > devm_kzalloc(&vdev->dev, sizeof(*vpmem), > acce2633 Pankaj Gupta 2018-08-31 165 GFP_KERNEL); > acce2633 Pankaj Gupta 2018-08-31 166 if (!vpmem) { > acce2633 Pankaj Gupta 2018-08-31 167 err = -ENOMEM; > acce2633 Pankaj Gupta 2018-08-31 168 goto out_err; > acce2633 Pankaj Gupta 2018-08-31 169 } > acce2633 Pankaj Gupta 2018-08-31 170 > acce2633 Pankaj Gupta 2018-08-31 171 vpmem->vdev = vdev; > acce2633 Pankaj Gupta 2018-08-31 172 err = init_vq(vpmem); > acce2633 Pankaj Gupta 2018-08-31 173 if (err) > acce2633 Pankaj Gupta 2018-08-31 174 goto out_err; > acce2633 Pankaj Gupta 2018-08-31 175 > acce2633 Pankaj 
Gupta 2018-08-31 176 virtio_cread(vpmem->vdev, struct > virtio_pmem_config, > acce2633 Pankaj Gupta 2018-08-31 177 start, > &vpmem->start); > acce2633 Pankaj Gupta 2018-08-31 178 virtio_cread(vpmem->vdev, struct > virtio_pmem_config, > acce2633 Pankaj Gupta 2018-08-31 179 size, > &vpmem->size); > acce2633 Pankaj Gupta 2018-08-31 180 > acce2633 Pankaj Gupta 2018-08-31 181 res.start = vpmem->start; > acce2633 Pankaj Gupta 2018-08-31 182 res.end = vpmem->start + > vpmem->size-1; > acce2633 Pankaj Gupta 2018-08-31 183 vpmem->nd_desc.provider_name = > "virtio-pmem"; > acce2633 Pankaj Gupta 2018-08-31 184 vpmem->nd_desc.module = > THIS_MODULE; > acce2633 Pankaj Gupta 2018-08-31 185 > acce2633 Pankaj Gupta 2018-08-31 @186 vpmem->nvdimm_bus = nvdimm_bus = > nvdimm_bus_register(&vdev->dev, > acce2633 Pankaj Gupta 2018-08-31 187 > &vpmem->nd_desc); > acce2633 Pankaj Gupta 2018-08-31 188 if (!nvdimm_bus) > acce2633 Pankaj Gupta 2018-08-31 189 goto out_vq; > acce2633 Pankaj Gupta 2018-08-31 190 > acce2633 Pankaj Gupta 2018-08-31 191 d
Re: [Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
Hi Pankaj, Thank you for the patch! Yet something to improve: [auto build test ERROR on linux-nvdimm/libnvdimm-for-next] [also build test ERROR on v4.19-rc2 next-20180903] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Pankaj-Gupta/kvm-fake-DAX-device/20180903-160032 base: https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next config: i386-randconfig-a3-201835 (attached as .config) compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4 reproduce: # save the attached .config to linux build tree make ARCH=i386 :: branch date: 21 hours ago :: commit date: 21 hours ago All errors (new ones prefixed by >>): drivers/virtio/virtio_pmem.o: In function `virtio_pmem_remove': >> drivers/virtio/virtio_pmem.c:220: undefined reference to >> `nvdimm_bus_unregister' drivers/virtio/virtio_pmem.o: In function `virtio_pmem_probe': >> drivers/virtio/virtio_pmem.c:186: undefined reference to >> `nvdimm_bus_register' >> drivers/virtio/virtio_pmem.c:198: undefined reference to >> `nvdimm_pmem_region_create' drivers/virtio/virtio_pmem.c:207: undefined reference to `nvdimm_bus_unregister' # https://github.com/0day-ci/linux/commit/acce2633da18b0ad58d0cc9243a85b03020ca099 git remote add linux-review https://github.com/0day-ci/linux git remote update linux-review git checkout acce2633da18b0ad58d0cc9243a85b03020ca099 vim +220 drivers/virtio/virtio_pmem.c acce2633 Pankaj Gupta 2018-08-31 147 acce2633 Pankaj Gupta 2018-08-31 148 static int virtio_pmem_probe(struct virtio_device *vdev) acce2633 Pankaj Gupta 2018-08-31 149 { acce2633 Pankaj Gupta 2018-08-31 150 int err = 0; acce2633 Pankaj Gupta 2018-08-31 151 struct resource res; acce2633 Pankaj Gupta 2018-08-31 152 struct virtio_pmem *vpmem; acce2633 Pankaj Gupta 2018-08-31 153 struct nvdimm_bus *nvdimm_bus; acce2633 Pankaj Gupta 2018-08-31 154 struct nd_region_desc ndr_desc; acce2633 Pankaj Gupta 2018-08-31 155 int nid = 
dev_to_node(&vdev->dev); acce2633 Pankaj Gupta 2018-08-31 156 struct nd_region *nd_region; acce2633 Pankaj Gupta 2018-08-31 157 acce2633 Pankaj Gupta 2018-08-31 158 if (!vdev->config->get) { acce2633 Pankaj Gupta 2018-08-31 159 dev_err(&vdev->dev, "%s failure: config disabled\n", acce2633 Pankaj Gupta 2018-08-31 160 __func__); acce2633 Pankaj Gupta 2018-08-31 161 return -EINVAL; acce2633 Pankaj Gupta 2018-08-31 162 } acce2633 Pankaj Gupta 2018-08-31 163 acce2633 Pankaj Gupta 2018-08-31 164 vdev->priv = vpmem = devm_kzalloc(&vdev->dev, sizeof(*vpmem), acce2633 Pankaj Gupta 2018-08-31 165 GFP_KERNEL); acce2633 Pankaj Gupta 2018-08-31 166 if (!vpmem) { acce2633 Pankaj Gupta 2018-08-31 167 err = -ENOMEM; acce2633 Pankaj Gupta 2018-08-31 168 goto out_err; acce2633 Pankaj Gupta 2018-08-31 169 } acce2633 Pankaj Gupta 2018-08-31 170 acce2633 Pankaj Gupta 2018-08-31 171 vpmem->vdev = vdev; acce2633 Pankaj Gupta 2018-08-31 172 err = init_vq(vpmem); acce2633 Pankaj Gupta 2018-08-31 173 if (err) acce2633 Pankaj Gupta 2018-08-31 174 goto out_err; acce2633 Pankaj Gupta 2018-08-31 175 acce2633 Pankaj Gupta 2018-08-31 176 virtio_cread(vpmem->vdev, struct virtio_pmem_config, acce2633 Pankaj Gupta 2018-08-31 177 start, &vpmem->start); acce2633 Pankaj Gupta 2018-08-31 178 virtio_cread(vpmem->vdev, struct virtio_pmem_config, acce2633 Pankaj Gupta 2018-08-31 179 size, &vpmem->size); acce2633 Pankaj Gupta 2018-08-31 180 acce2633 Pankaj Gupta 2018-08-31 181 res.start = vpmem->start; acce2633 Pankaj Gupta 2018-08-31 182 res.end = vpmem->start + vpmem->size-1; acce2633 Pankaj Gupta 2018-08-31 183 vpmem->nd_desc.provider_name = "virtio-pmem"; acce2633 Pankaj Gupta 2018-08-31 184 vpmem->nd_desc.module = THIS_MODULE; acce2633 Pankaj Gupta 2018-08-31 185 acce2633 Pankaj Gupta 2018-08-31 @186 vpmem->nvdimm_bus = nvdimm_bus = nvdimm_bus_register(&vdev->dev, acce2633 Pankaj Gupta 2018-08-31 187 &vpmem->nd_desc); acce2633 Pankaj Gupta 2018-08-31 188 if (!nvdimm_bus) acce2633 Pankaj Gupta 
2018-08-31 189 goto out_vq; acce2633 Pankaj Gupta 2018-08-31 190 acce2633 Pankaj Gupta 2018-08-31 191 dev_set_drvdata(&vdev->dev, nvdimm_bus); acce2633 Pankaj Gupta 2018-08-31 192 memset(&ndr_desc, 0, sizeof(ndr_desc)); acce2633 Pankaj Gupta 2018-08-31 193 acce2633 Pankaj Gupta 2018-08-31 194 ndr_desc.res = &res; acce2633 Pankaj Gupta 2018-08-31 195 ndr_desc.numa_node = nid; acce2633 Pankaj Gupta 2018-08-31 196 ndr_desc.flush = virtio_pmem_flush; acce2633 Pankaj Gupta 2018-08-31 197 set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags); acce2633 Pankaj Gupta 2018-08-31 @198 nd_region = nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc); acce2633 Pankaj G
[Qemu-devel] [PATCH 3/3] virtio-pmem: Add virtio pmem driver
This patch adds virtio-pmem driver for KVM guest. Guest reads the persistent memory range information from Qemu over VIRTIO and registers it on nvdimm_bus. It also creates a nd_region object with the persistent memory range information so that existing 'nvdimm/pmem' driver can reserve this into system memory map. This way 'virtio-pmem' driver uses existing functionality of pmem driver to register persistent memory compatible for DAX capable filesystems. This also provides function to perform guest flush over VIRTIO from 'pmem' driver when userspace performs flush on DAX memory range. Signed-off-by: Pankaj Gupta --- drivers/virtio/Kconfig | 9 ++ drivers/virtio/Makefile | 1 + drivers/virtio/virtio_pmem.c | 255 +++ include/uapi/linux/virtio_ids.h | 1 + include/uapi/linux/virtio_pmem.h | 40 ++ 5 files changed, 306 insertions(+) create mode 100644 drivers/virtio/virtio_pmem.c create mode 100644 include/uapi/linux/virtio_pmem.h diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig index 3589764..a331e23 100644 --- a/drivers/virtio/Kconfig +++ b/drivers/virtio/Kconfig @@ -42,6 +42,15 @@ config VIRTIO_PCI_LEGACY If unsure, say Y. +config VIRTIO_PMEM + tristate "Support for virtio pmem driver" + depends on VIRTIO + help + This driver provides support for virtio based flushing interface + for persistent memory range. + + If unsure, say M. 
+ config VIRTIO_BALLOON tristate "Virtio balloon driver" depends on VIRTIO diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile index 3a2b5c5..cbe91c6 100644 --- a/drivers/virtio/Makefile +++ b/drivers/virtio/Makefile @@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o +obj-$(CONFIG_VIRTIO_PMEM) += virtio_pmem.o diff --git a/drivers/virtio/virtio_pmem.c b/drivers/virtio/virtio_pmem.c new file mode 100644 index 000..c22cc87 --- /dev/null +++ b/drivers/virtio/virtio_pmem.c @@ -0,0 +1,255 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * virtio_pmem.c: Virtio pmem Driver + * + * Discovers persistent memory range information + * from host and provides a virtio based flushing + * interface. + */ +#include +#include +#include +#include +#include +#include +#include +#include + +struct virtio_pmem_request { + /* Host return status corresponding to flush request */ + int ret; + + /* command name*/ + char name[16]; + + /* Wait queue to process deferred work after ack from host */ + wait_queue_head_t host_acked; + bool done; + + /* Wait queue to process deferred work after virt queue buffer avail */ + wait_queue_head_t wq_buf; + bool wq_buf_avail; + struct list_head list; +}; + +struct virtio_pmem { + struct virtio_device *vdev; + + /* Virtio pmem request queue */ + struct virtqueue *req_vq; + + /* nvdimm bus registers virtio pmem device */ + struct nvdimm_bus *nvdimm_bus; + struct nvdimm_bus_descriptor nd_desc; + + /* List to store deferred work if virtqueue is full */ + struct list_head req_list; + + /* Synchronize virtqueue data */ + spinlock_t pmem_lock; + + /* Memory region information */ + uint64_t start; + uint64_t size; +}; + +static struct virtio_device_id id_table[] = { + { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID }, + { 0 }, +}; + + /* The interrupt handler */ +static void host_ack(struct 
virtqueue *vq) +{ + unsigned int len; + unsigned long flags; + struct virtio_pmem_request *req, *req_buf; + struct virtio_pmem *vpmem = vq->vdev->priv; + + spin_lock_irqsave(&vpmem->pmem_lock, flags); + while ((req = virtqueue_get_buf(vq, &len)) != NULL) { + req->done = true; + wake_up(&req->host_acked); + + if (!list_empty(&vpmem->req_list)) { + req_buf = list_first_entry(&vpmem->req_list, + struct virtio_pmem_request, list); + list_del(&vpmem->req_list); + req_buf->wq_buf_avail = true; + wake_up(&req_buf->wq_buf); + } + } + spin_unlock_irqrestore(&vpmem->pmem_lock, flags); +} + /* Initialize virt queue */ +static int init_vq(struct virtio_pmem *vpmem) +{ + struct virtqueue *vq; + + /* single vq */ + vpmem->req_vq = vq = virtio_find_single_vq(vpmem->vdev, + host_ack, "flush_queue"); + if (IS_ERR(vq)) + return PTR_ERR(vq); + + spin_lock_init(&vpmem->pmem_lock); + INIT_LIST_HEAD(&vpmem->req_list); + + return 0; +}; + + /* The request submission function */ +static int virtio_pmem_flush(struc