Re: [RFC PATCH 1/2] hw/nvme: add mi device
On Tue, Jul 13, 2021 at 10:37:23AM +0100, Stefan Hajnoczi wrote:
> On Tue, Jul 13, 2021 at 06:30:28AM +0100, Christoph Hellwig wrote:
> > On Mon, Jul 12, 2021 at 12:03:27PM +0100, Stefan Hajnoczi wrote:
> > > Why did you decide to implement -device nvme-mi as a device on
> > > TYPE_NVME_BUS? If the NVMe spec somehow requires this then I'm surprised
> > > that there's no NVMe bus interface (callbacks). It seems like this could
> > > just as easily be a property of an NVMe controller -device
> > > nvme,mi=on|off or -device nvme-subsys,mi=on|off? I'm probably just not
> > > familiar enough with MI and NVMe architecture...
> >
> > I'm too far away from qemu these days to understand what TYPE_NVME_BUS
> > is. But NVMe-MI has three possible transports:
> >
> > 1) out of band through smbus. This seems something that could be
> >    trivially modelled in qemu
> > 2) out of band over MCTP / PCIe VDM.
> > 3) in band using NVMe admin commands that pass through MI commands
>
> Thanks for explaining!
>
> Common NVMe-MI code can be shared by -device nvme-mi-smbus, in-band NVMe
> MI commands (part of -device nvme), a vsock transport, etc. This patch
> has nvme_mi_admin_command() as the entry point to common MI code, so not
> much needs to be done to achieve this.
>
> My question about why -device nvme-mi was because this "device" doesn't
> implement any bus interface (callbacks). The bus effectively just serves
> as an owner of this device. The guest does not access the device via the
> bus. So I'm not sure a -device is appropriate; it's an unusual device.
>
> If the device is kept, please name it -device nvme-mi-vsock so it's
> clear this is the NVMe-MI vsock transport. I think the device could be
> dropped and instead a -device nvme,mi-vsock=on|off property could be
> added to enable the MI vsock transport on a specific NVMe controller.
> This raises the question of whether the port number should be
> configurable so multiple vsock Management Endpoints can coexist.
>
> I don't have time to explore the architectural model, but here's the
> link in case anyone wants to think through all the options for NVMe MI
> Management Endpoints and how QEMU should model them:
>
> "1.4 NVM Subsystem Architectural Model"
> https://nvmexpress.org/wp-content/uploads/NVM-Express-Management-Interface-1.2-2021.06.02-Ratified.pdf
>
> Stefan

Thanks Stefan for the suggestion.
Re: [RFC PATCH 1/2] hw/nvme: add mi device
On Mon, Jul 12, 2021 at 12:03:27PM +0100, Stefan Hajnoczi wrote:
> On Fri, Jul 09, 2021 at 07:25:45PM +0530, Padmakar Kalghatgi wrote:
> > The enclosed patch contains the implementation of certain commands of
> > the nvme-mi specification. The MI commands are useful to
> > manage/configure/monitor the device. Even though the MI commands can
> > be sent via the in-band NVMe-MI send/receive commands, the idea here
> > is to emulate the sideband interface for MI.
> >
> > Since the nvme-mi specification deals in communicating to the nvme
> > subsystem via a sideband interface, in this qemu implementation,
> > virtual-vsock is used for making the sideband communication; the
> > guest VM needs to make the connection to the specific cid of the
> > vsock of the qemu host.
> >
> > One needs to specify the following command in the launch to specify
> > the nvme-mi device, cid and to setup the vsock:
> >    -device nvme-mi,bus=
> >    -device vhost-vsock-pci, guest-cid=
> >
> > The following commands are tested with nvme-cli by hooking to the cid
> > of the vsock as shown above and using the socket send/receive
> > commands to issue the commands and get the response.
> >
> > We are planning to push the changes for nvme-cli as well to test the
> > MI functionality.
>
> Is the purpose of this feature (-device nvme-mi) testing MI with QEMU's
> NVMe implementation?
>
> My understanding is that instead of inventing an out-of-band interface
> in the form of a new paravirtualized device, you decided to use vsock
> to send MI commands from the guest to QEMU?
>
> > As the connection can be established by the guest VM at any point,
> > we have created a thread which is looking for a connection request.
> > Please suggest if there is a native/better way to handle this.
>
> QEMU has an event-driven architecture and uses threads sparingly. When
> it uses threads it uses qemu_create_thread() instead of
> pthread_create(), but I suggest using qemu_set_fd_handler() or a
> coroutine with QIOChannel to integrate into the QEMU event loop
> instead.
>
> I didn't see any thread synchronization, so I'm not sure if accessing
> NVMe state from the MI thread is safe. Changing the code to use QEMU's
> event loop can solve that problem since there's no separate thread.

vsock mimics the sideband communication, hence we used it. However, we
are working on the smbus/i2c implementation for nvme-mi in
qemu/nvme-cli; we will send the patch in a few days. To communicate with
nvme-mi over smbus/i2c, the nvme-mi device needs to inherit from the i2c
class, which has callbacks for sending and receiving messages. This
approach would get rid of the threads.

> > This module makes use of the NvmeCtrl structure of the nvme module,
> > to fetch relevant information of the nvme device which is used in
> > some of the mi commands. Even though certain commands might require
> > modification to the nvme module, we have currently refrained from
> > making changes to the nvme module.
>
> Why did you decide to implement -device nvme-mi as a device on
> TYPE_NVME_BUS? If the NVMe spec somehow requires this then I'm
> surprised that there's no NVMe bus interface (callbacks). It seems like
> this could just as easily be a property of an NVMe controller -device
> nvme,mi=on|off or -device nvme-subsys,mi=on|off? I'm probably just not
> familiar enough with MI and NVMe architecture...
>
> Stefan

Since nvme communication happens over pcie and nvme-mi happens over
smbus/i2c, nvme-mi cannot be a property of nvme; rather, it should be a
separate device which will be on the smbus/i2c.
Re: [RFC PATCH 1/2] hw/nvme: add mi device
On Fri, Jul 09, 2021 at 08:58:42AM -0700, Keith Busch wrote:
> On Fri, Jul 09, 2021 at 07:25:45PM +0530, Padmakar Kalghatgi wrote:
> > The following commands are tested with nvme-cli by hooking to the cid
> > of the vsock as shown above and using the socket send/receive
> > commands to issue the commands and get the response.
>
> Why sockets? Shouldn't mi targets use smbus for that?

vsock mimics the sideband communication, hence we used it. However, we
are working on the smbus/i2c implementation for nvme-mi in
qemu/nvme-cli; we will send the patch in a few days.
Re: [RFC PATCH 1/2] hw/nvme: add mi device
On Tue, Jul 13, 2021 at 06:30:28AM +0100, Christoph Hellwig wrote:
> On Mon, Jul 12, 2021 at 12:03:27PM +0100, Stefan Hajnoczi wrote:
> > Why did you decide to implement -device nvme-mi as a device on
> > TYPE_NVME_BUS? If the NVMe spec somehow requires this then I'm
> > surprised that there's no NVMe bus interface (callbacks). It seems
> > like this could just as easily be a property of an NVMe controller
> > -device nvme,mi=on|off or -device nvme-subsys,mi=on|off? I'm probably
> > just not familiar enough with MI and NVMe architecture...
>
> I'm too far away from qemu these days to understand what TYPE_NVME_BUS
> is. But NVMe-MI has three possible transports:
>
> 1) out of band through smbus. This seems something that could be
>    trivially modelled in qemu
> 2) out of band over MCTP / PCIe VDM.
> 3) in band using NVMe admin commands that pass through MI commands

Thanks for explaining!

Common NVMe-MI code can be shared by -device nvme-mi-smbus, in-band NVMe
MI commands (part of -device nvme), a vsock transport, etc. This patch
has nvme_mi_admin_command() as the entry point to common MI code, so not
much needs to be done to achieve this.

My question about why -device nvme-mi was because this "device" doesn't
implement any bus interface (callbacks). The bus effectively just serves
as an owner of this device. The guest does not access the device via the
bus. So I'm not sure a -device is appropriate; it's an unusual device.

If the device is kept, please name it -device nvme-mi-vsock so it's
clear this is the NVMe-MI vsock transport. I think the device could be
dropped and instead a -device nvme,mi-vsock=on|off property could be
added to enable the MI vsock transport on a specific NVMe controller.
This raises the question of whether the port number should be
configurable so multiple vsock Management Endpoints can coexist.

I don't have time to explore the architectural model, but here's the
link in case anyone wants to think through all the options for NVMe MI
Management Endpoints and how QEMU should model them:

"1.4 NVM Subsystem Architectural Model"
https://nvmexpress.org/wp-content/uploads/NVM-Express-Management-Interface-1.2-2021.06.02-Ratified.pdf

Stefan
Re: [RFC PATCH 1/2] hw/nvme: add mi device
On Mon, Jul 12, 2021 at 12:03:27PM +0100, Stefan Hajnoczi wrote:
> Why did you decide to implement -device nvme-mi as a device on
> TYPE_NVME_BUS? If the NVMe spec somehow requires this then I'm surprised
> that there's no NVMe bus interface (callbacks). It seems like this could
> just as easily be a property of an NVMe controller -device
> nvme,mi=on|off or -device nvme-subsys,mi=on|off? I'm probably just not
> familiar enough with MI and NVMe architecture...

I'm too far away from qemu these days to understand what TYPE_NVME_BUS
is. But NVMe-MI has three possible transports:

1) out of band through smbus. This seems something that could be
   trivially modelled in qemu
2) out of band over MCTP / PCIe VDM.
3) in band using NVMe admin commands that pass through MI commands
Re: [RFC PATCH 1/2] hw/nvme: add mi device
On Fri, Jul 09, 2021 at 07:25:45PM +0530, Padmakar Kalghatgi wrote:
> The enclosed patch contains the implementation of certain commands of
> the nvme-mi specification. The MI commands are useful to
> manage/configure/monitor the device. Even though the MI commands can be
> sent via the in-band NVMe-MI send/receive commands, the idea here is to
> emulate the sideband interface for MI.
>
> Since the nvme-mi specification deals in communicating to the nvme
> subsystem via a sideband interface, in this qemu implementation,
> virtual-vsock is used for making the sideband communication; the guest
> VM needs to make the connection to the specific cid of the vsock of the
> qemu host.
>
> One needs to specify the following command in the launch to specify the
> nvme-mi device, cid and to setup the vsock:
>    -device nvme-mi,bus=
>    -device vhost-vsock-pci, guest-cid=
>
> The following commands are tested with nvme-cli by hooking to the cid
> of the vsock as shown above and using the socket send/receive commands
> to issue the commands and get the response.
>
> We are planning to push the changes for nvme-cli as well to test the MI
> functionality.

Is the purpose of this feature (-device nvme-mi) testing MI with QEMU's
NVMe implementation?

My understanding is that instead of inventing an out-of-band interface
in the form of a new paravirtualized device, you decided to use vsock to
send MI commands from the guest to QEMU?

> As the connection can be established by the guest VM at any point, we
> have created a thread which is looking for a connection request. Please
> suggest if there is a native/better way to handle this.

QEMU has an event-driven architecture and uses threads sparingly. When
it uses threads it uses qemu_create_thread() instead of
pthread_create(), but I suggest using qemu_set_fd_handler() or a
coroutine with QIOChannel to integrate into the QEMU event loop instead.

I didn't see any thread synchronization, so I'm not sure if accessing
NVMe state from the MI thread is safe. Changing the code to use QEMU's
event loop can solve that problem since there's no separate thread.

> This module makes use of the NvmeCtrl structure of the nvme module, to
> fetch relevant information of the nvme device which is used in some of
> the mi commands. Even though certain commands might require
> modification to the nvme module, we have currently refrained from
> making changes to the nvme module.

Why did you decide to implement -device nvme-mi as a device on
TYPE_NVME_BUS? If the NVMe spec somehow requires this then I'm surprised
that there's no NVMe bus interface (callbacks). It seems like this could
just as easily be a property of an NVMe controller -device
nvme,mi=on|off or -device nvme-subsys,mi=on|off? I'm probably just not
familiar enough with MI and NVMe architecture...

Stefan
Re: [RFC PATCH 1/2] hw/nvme: add mi device
On Fri, Jul 09, 2021 at 07:25:45PM +0530, Padmakar Kalghatgi wrote:
> The following commands are tested with nvme-cli by hooking to the cid
> of the vsock as shown above and using the socket send/receive commands
> to issue the commands and get the response.

Why sockets? Shouldn't mi targets use smbus for that?
[RFC PATCH 1/2] hw/nvme: add mi device
The enclosed patch contains the implementation of certain commands of
the nvme-mi specification. The MI commands are useful to
manage/configure/monitor the device. Even though the MI commands can be
sent via the in-band NVMe-MI send/receive commands, the idea here is to
emulate the sideband interface for MI.

Since the nvme-mi specification deals in communicating to the nvme
subsystem via a sideband interface, in this qemu implementation,
virtual-vsock is used for making the sideband communication; the guest
VM needs to make the connection to the specific cid of the vsock of the
qemu host.

One needs to specify the following command in the launch to specify the
nvme-mi device, cid and to setup the vsock:
   -device nvme-mi,bus=
   -device vhost-vsock-pci, guest-cid=

The following commands are tested with nvme-cli by hooking to the cid of
the vsock as shown above and using the socket send/receive commands to
issue the commands and get the response.

We are planning to push the changes for nvme-cli as well to test the MI
functionality.

As the connection can be established by the guest VM at any point, we
have created a thread which is looking for a connection request. Please
suggest if there is a native/better way to handle this.

This module makes use of the NvmeCtrl structure of the nvme module, to
fetch relevant information of the nvme device which is used in some of
the mi commands. Even though certain commands might require modification
to the nvme module, we have currently refrained from making changes to
the nvme module.

Since this is our first foray into implementing a new module in qemu, we
will be happy to receive comments and suggestions to improve the current
implementation.
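The option lines above elide the actual values. Purely as an illustration of how the pieces fit together, a full launch line might look like the following; every ID and the guest CID are hypothetical placeholders chosen for this sketch, not values from the patch:

```shell
# All IDs and the guest CID below are hypothetical placeholders.
qemu-system-x86_64 \
    -device nvme,id=nvme0,serial=deadbeef \
    -device nvme-mi,bus=nvme0 \
    -device vhost-vsock-pci,guest-cid=4
```

The guest would then connect to the host's vsock listener using that CID to exchange MI requests and responses.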
cmd-name                            cmd-type
--------------------------------------------
read nvme-mi ds                     nvme-mi
configuration set                   nvme-mi
configuration get                   nvme-mi
vpd read                            nvme-mi
vpd write                           nvme-mi
controller Health Status Poll       nvme-mi
nvme subsystem health status poll   nvme-mi
identify                            nvme-admin
get log page                        nvme-admin
get features                        nvme-admin

Signed-off-by: Padmakar Kalghatgi
---
 hw/nvme/meson.build |   2 +-
 hw/nvme/nvme-mi.c   | 676 ++++++++++++++++++++++++++++++++++++++++++++
 hw/nvme/nvme-mi.h   | 288 ++++++++++++++++++
 3 files changed, 965 insertions(+), 1 deletion(-)
 create mode 100644 hw/nvme/nvme-mi.c
 create mode 100644 hw/nvme/nvme-mi.h

diff --git a/hw/nvme/meson.build b/hw/nvme/meson.build
index 3cf4004..8dac50e 100644
--- a/hw/nvme/meson.build
+++ b/hw/nvme/meson.build
@@ -1 +1 @@
-softmmu_ss.add(when: 'CONFIG_NVME_PCI', if_true: files('ctrl.c', 'dif.c', 'ns.c', 'subsys.c'))
+softmmu_ss.add(when: 'CONFIG_NVME_PCI', if_true: files('ctrl.c', 'dif.c', 'ns.c', 'subsys.c', 'nvme-mi.c'))
diff --git a/hw/nvme/nvme-mi.c b/hw/nvme/nvme-mi.c
new file mode 100644
index 000..5e93417
--- /dev/null
+++ b/hw/nvme/nvme-mi.c
@@ -0,0 +1,676 @@
+/*
+ * QEMU NVMe-MI Controller
+ *
+ * Copyright (c) 2021, Samsung Electronics co Ltd.
+ *
+ * Written by Padmakar Kalghatgi
+ *            Arun Kumar Agasar
+ *            Saurav Kumar
+ *
+ * This code is licensed under the GNU GPL v2 or later.
+ */
+
+/**
+ * Reference Specs: http://www.nvmexpress.org,
+ *
+ *
+ * Usage
+ * -----
+ * The nvme-mi device has to be used along with nvme device only
+ *
+ * Add options:
+ *    -device nvme-mi,bus=
+ *    -device vhost-vsock-pci, guest-cid=
+ *
+ * the cid is used to connect to the vsock
+ */
+
+#include "qemu/osdep.h"
+#include "hw/qdev-core.h"
+#include "hw/block/block.h"
+#include "hw/pci/msix.h"
+#include "nvme.h"
+#include "nvme-mi.h"
+#include "qemu/crc32c.h"
+
+#define NVME_TEMPERATURE 0x143
+#define NVME_TEMPERATURE_WARNING 0x157
+#define NVME_TEMPERATURE_CRITICAL 0x175
+
+static void nvme_mi_send_resp(NvmeMiCtrl *ctrl_mi, uint8_t *resp, uint32_t size)
+{
+    uint32_t crc_value = crc32c(0xffffffff, resp, size);
+    size += 4;
+    uint32_t retries = 5;
+    uint32_t offset = 0;
+    uint32_t som = 1;
+    uint32_t eom = 0;
+    uint32_t pktseq = 0;
+    uint32_t mtus = ctrl_mi->mctp_unit_size;
+    while (size > 0) {
+        uint32_t sizesent = size > mtus ? mtus : size;
+        size -= sizesent;
+        eom = size > 0 ? 0 : 1;
+        g_autofree uint8_t *buf = (uint8_t *)g_malloc(sizesent + 8);
+        buf[2] = sizesent + 5;
+        buf[7] = (som << 7) | (eom << 6) | (pktseq << 5);
+        som = 0;
+        memcpy(buf + 8, resp + offset, sizesent);
+        offset += sizesent;
+        if (size <= 0) {
+            memcpy(buf + 4 + offset, &crc_value, sizeof(crc_value));
+        }
+        retries = 5;
+        while (retries > 0) {
+            int32_t nsend = send(ctrl_mi->sock_desc, buf, sizesent + 8, 0);
+            if (nsend < 0) {
+                retries--;