Re: [PATCH 0/9] More virtio hardening
On Tue, Oct 12, 2021 at 10:43:57AM +0800, Jason Wang wrote: > On Mon, Oct 11, 2021 at 8:36 PM Michael S. Tsirkin wrote: > > > > On Mon, Oct 11, 2021 at 03:36:51PM +0800, Jason Wang wrote: > > > On Tue, Oct 5, 2021 at 3:42 PM Michael S. Tsirkin wrote: > > > > > > > > On Mon, Sep 13, 2021 at 01:53:44PM +0800, Jason Wang wrote: > > > > > Hi All: > > > > > > > > > > This series tries to do more hardening for virtio. > > > > > > > > > > patch 1 validates the num_queues for the virtio-blk device. > > > > > patch 2-4 validates max_nr_ports for the virtio-console device. > > > > > patch 5-7 harden virtio-pci interrupts to make sure no unexpected > > > > > interrupt handler is triggered. If this makes sense we can do similar > > > > > things in other transport drivers. > > > > > patch 8-9 validate used ring length. > > > > > > > > > > Smoke tested on blk/net with packed=on/off and iommu_platform=on/off. > > > > > > > > > > Please review. > > > > > > > > > > Thanks > > > > > > > > So I poked at console at least, and I think I see > > > > an issue: if the interrupt handler queues a work/bh, > > > > then it can still run while reset is in progress. > > > > > > Looks like a bug which is unrelated to the hardening? > > > > Won't preventing use after free be relevant? > > Oh right. > > > I frankly don't know what hardening means then. > > > E.g. the driver > > > should sync with work/bh before reset. > > > > No, there's no way to fix it ATM without extra locks and state which I > > think we should strive to avoid or make it generic, not per-driver, > > since sync before reset is useless, new interrupts will just arrive and > > queue more work. And a sync after reset is too late since the driver will > > try to add buffers. > > Can we do something like > > 1) disable interrupt > 2) sync bh > > Or I guess this is somehow what you meant in the following steps. So that would mean a new API to disable vq interrupts. Reset will re-enable. E.g. virtqueue_cancel_cb_before_reset()?
Then drivers can sync, then reset. This means maintaining more state though, which I don't like. An alternative is something like this:

static void (*virtio_flush_device)(struct virtio_device *dev);

void virtio_reset_device(struct virtio_device *dev, virtio_flush_device flush)
{
	might_sleep();
	if (flush) {
		dev->config->disable_interrupts(dev);
		flush(dev);
		dev->config->reset(dev);
		dev->config->enable_interrupts(dev);
	} else {
		dev->config->reset(dev);
	}
}

I have patches wrapping all reset calls in virtio_reset_device (without the flush parameter, but that's easy to tweak).

> > Maybe we can break device. Two issues with that
> > - drivers might not be ready to handle add_buf failures
> > - restore needs to unbreak then and we don't have a way to do that yet
> >
> > So .. careful reading of all device drivers and hoping we don't mess
> > things up even more ... here we come.
>
> Yes.

The biggest issue with this trick is drivers not handling add_buf errors; adding a failure path here risks creating memory leaks. OTOH with e.g. bounce buffers maybe it's possible for add_buf to fail anyway?

> > > > I sent a patch to fix it for console removal specifically,
> > > > but I suspect it's not enough e.g. freeze is still broken.
> > > > And note this has been reported without any TDX things -
> > > > it's not a malicious device issue, can be triggered just
> > > > by module unload.
> > > >
> > > > I am vaguely thinking about new APIs to disable/enable callbacks.
> > > > An alternative:
> > > >
> > > > 1. adding new remove_nocb/freeze_nocb calls
> > > > 2. disabling/enabling interrupts automatically around these
> > > > 3. gradually moving devices to using these
> > > > 4. once/if all device move, removing the old callbacks
> > > >
> > > > the advantage here is that we'll be sure calls are always
> > > > paired correctly.
> > > I'm not sure I get the idea, but my feeling is that it doesn't
> > > conflict with the interrupt hardening here (or at least the same
> > > method is required, e.g. NO_AUTO_EN).
> > >
> > > Thanks
> >
> > Right. It's not that it conflicts, it's that I was hoping that
> > since you are working on hardening you can take up fixing that.
> > Let me know whether you have the time. Thanks!
>
> I can do that.
>
> Thanks

> > > > > Jason Wang (9):
> > > > >   virtio-blk: validate num_queues during probe
> > > > >   virtio: add doc for validate() method
> > > > >   virtio-console: switch to use .validate()
> > > > >   virtio_console: validate max_nr_ports before trying to use it
> > > > >   virtio_config: introduce a new ready method
> > > > >   virtio_pci: harden MSI-X interrupts
> > > > >   virtio-pci: harden INTX interrupts
> > > > >   virtio_ring: fix typos in vring_desc_extra
> > > > >   virtio_ring: validate used buffer length
Re: [PATCH v5 12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range()
On Mon, Oct 11, 2021 at 03:09:09PM -0400, Michael S. Tsirkin wrote:
> The reason we have trouble is that it's not clear what the API means
> outside the realm of TDX. If we really, truly want an API that says
> "ioremap and it's a hardened driver" then I guess
> ioremap_hardened_driver is what you want.

Yes. And why would we ioremap the BIOS anyway? It is not I/O memory in any of the senses we generally use ioremap for.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [RFC PATCH v4 00/20] vDPA shadow virtqueue
On Tue, Oct 12, 2021 at 11:59 AM Jason Wang wrote: > > > On 2021/10/1 3:05 PM, Eugenio Pérez wrote: > > This series enables shadow virtqueue (SVQ) for vhost-vdpa devices. This > > is intended as a new method of tracking the memory the devices touch > > during a migration process: instead of relying on the vhost device's dirty > > logging capability, SVQ intercepts the VQ dataplane, forwarding the > > descriptors between VM and device. This way qemu is the effective > > writer of the guest's memory, like in qemu's virtio device operation. > > > > When SVQ is enabled qemu offers a new vring to the device to read > > and write into, and also intercepts kicks and calls between the device > > and the guest. Used buffer relaying would cause dirty memory to be > > tracked, but at this RFC SVQ is not enabled on migration automatically. > > > > It is based on the ideas of DPDK SW assisted LM, in the series of > > DPDK's https://patchwork.dpdk.org/cover/48370/ . However, this does > > not map the shadow vq in the guest's VA, but in qemu's. > > > > For qemu to use shadow virtqueues the guest virtio driver must not use > > features like event_idx or indirect descriptors. These limitations will > > be addressed in later series, but they are left out for simplicity at > > the moment. > > > > SVQ needs to be enabled with the QMP command: > > > > { "execute": "x-vhost-enable-shadow-vq", > >"arguments": { "name": "dev0", "enable": true } } > > > > This series includes some patches, to be deleted in the final version, that > > help with its testing. The first two of the series freely implement > > the feature to stop the device and be able to retrieve its status. It's > > intended to be used with the vp_vdpa driver in a nested environment. This > > driver also needs modifications to forward the new status bit. > > > > Patches 2-8 prepare the SVQ and QMP command to support guest-to-host > > notifications forwarding.
If the SVQ is enabled with these patches > > applied and the device supports it, that part can be tested in > > isolation (for example, with networking), hopping through SVQ. > > > > Same thing is true with patches 9-13, but with device-to-guest > > notifications. > > > > The rest of the patches implement the actual buffer forwarding. > > > > Comments are welcome. > > > Hi Eugenio: > > > It would be helpful to have a public git repo for us to ease the review. > > Thanks > Btw, we also need to measure the performance impact of the shadow virtqueue. Thanks
Re: [RFC PATCH v4 00/20] vDPA shadow virtqueue
On 2021/10/1 3:05 PM, Eugenio Pérez wrote:

This series enables shadow virtqueue (SVQ) for vhost-vdpa devices. This is intended as a new method of tracking the memory the devices touch during a migration process: instead of relying on the vhost device's dirty logging capability, SVQ intercepts the VQ dataplane, forwarding the descriptors between VM and device. This way qemu is the effective writer of the guest's memory, like in qemu's virtio device operation.

When SVQ is enabled qemu offers a new vring to the device to read and write into, and also intercepts kicks and calls between the device and the guest. Used buffer relaying would cause dirty memory to be tracked, but at this RFC SVQ is not enabled on migration automatically.

It is based on the ideas of DPDK SW assisted LM, in the series of DPDK's https://patchwork.dpdk.org/cover/48370/ . However, this does not map the shadow vq in the guest's VA, but in qemu's.

For qemu to use shadow virtqueues the guest virtio driver must not use features like event_idx or indirect descriptors. These limitations will be addressed in later series, but they are left out for simplicity at the moment.

SVQ needs to be enabled with the QMP command:

{ "execute": "x-vhost-enable-shadow-vq",
  "arguments": { "name": "dev0", "enable": true } }

This series includes some patches, to be deleted in the final version, that help with its testing. The first two of the series freely implement the feature to stop the device and be able to retrieve its status. It's intended to be used with the vp_vdpa driver in a nested environment. This driver also needs modifications to forward the new status bit.

Patches 2-8 prepare the SVQ and QMP command to support guest-to-host notifications forwarding. If the SVQ is enabled with these patches applied and the device supports it, that part can be tested in isolation (for example, with networking), hopping through SVQ.

Same thing is true with patches 9-13, but with device-to-guest notifications.
The rest of the patches implement the actual buffer forwarding.

Comments are welcome.

Hi Eugenio:

It would be helpful to have a public git repo for us to ease the review.

Thanks

TODO:
* Event, indirect, packed, and other features of virtio - waiting for confirmation of the big picture.
* Use the already available iova tree to track mappings.
* Separate buffer forwarding into its own AIO context, so we can throw more threads at that task and don't need to stop the main event loop.
* Unmap iommu memory. Now the tree can only grow from SVQ enable, but it should be fine as long as not a lot of memory is added to the guest.
* Rebase on top of latest qemu (and, hopefully, on top of multiqueue vdpa).
* Some assertions need to become appropriate error handling paths.
* Proper documentation.

Changes from v3 RFC:
* Move everything to the vhost-vdpa backend. A big change; this allowed some cleanup, but more code has been added in other places.
* More use of glib utilities, especially to manage memory.
v3 link: https://lists.nongnu.org/archive/html/qemu-devel/2021-05/msg06032.html

Changes from v2 RFC:
* Adding vhost-vdpa devices support
* Fixed some memory leaks pointed out by different comments
v2 link: https://lists.nongnu.org/archive/html/qemu-devel/2021-03/msg05600.html

Changes from v1 RFC:
* Use QMP instead of migration to start SVQ mode.
* Only accepting IOMMU devices, closer behavior with target devices (vDPA)
* Fix invalid masking/unmasking of vhost call fd.
* Use of proper methods for synchronization.
* No need to modify VirtIO device code; all of the changes are contained in vhost code.
* Delete superfluous code.
* An intermediate RFC was sent with only the notifications forwarding changes.
It can be seen in https://patchew.org/QEMU/20210129205415.876290-1-epere...@redhat.com/

v1 link: https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg05372.html

Eugenio Pérez (20):
  virtio: Add VIRTIO_F_QUEUE_STATE
  virtio-net: Honor VIRTIO_CONFIG_S_DEVICE_STOPPED
  virtio: Add virtio_queue_is_host_notifier_enabled
  vhost: Make vhost_virtqueue_{start,stop} public
  vhost: Add x-vhost-enable-shadow-vq qmp
  vhost: Add VhostShadowVirtqueue
  vdpa: Register vdpa devices in a list
  vhost: Route guest->host notification through shadow virtqueue
  Add vhost_svq_get_svq_call_notifier
  Add vhost_svq_set_guest_call_notifier
  vdpa: Save call_fd in vhost-vdpa
  vhost-vdpa: Take into account SVQ in vhost_vdpa_set_vring_call
  vhost: Route host->guest notification through shadow virtqueue
  virtio: Add vhost_shadow_vq_get_vring_addr
  vdpa: Save host and guest features
  vhost: Add vhost_svq_valid_device_features to shadow vq
  vhost: Shadow virtqueue buffers forwarding
  vhost: Add VhostIOVATree
  vhost: Use a tree to store memory mappings
  vdpa: Add custom IOTLB translations to SVQ
Re: [PATCH v2 3/3] vdpa: Check for iova range at mappings changes
On Tue, Oct 5, 2021 at 9:49 PM Eugenio Pérez wrote:
>
> Check vdpa device range before updating memory regions so we don't add
> any outside of it, and report the invalid change if any.
>
> Signed-off-by: Eugenio Pérez
> ---
>  include/hw/virtio/vhost-vdpa.h |  2 +
>  hw/virtio/vhost-vdpa.c         | 68 ++
>  hw/virtio/trace-events         |  1 +
>  3 files changed, 55 insertions(+), 16 deletions(-)
>
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index a8963da2d9..c288cf7ecb 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -13,6 +13,7 @@
>  #define HW_VIRTIO_VHOST_VDPA_H
>
>  #include "hw/virtio/virtio.h"
> +#include "standard-headers/linux/vhost_types.h"
>
>  typedef struct VhostVDPAHostNotifier {
>      MemoryRegion mr;
> @@ -24,6 +25,7 @@ typedef struct vhost_vdpa {
>      uint32_t msg_type;
>      bool iotlb_batch_begin_sent;
>      MemoryListener listener;
> +    struct vhost_vdpa_iova_range iova_range;
>      struct vhost_dev *dev;
>      VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
>  } VhostVDPA;
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index be7c63b4ba..6654287050 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -37,20 +37,34 @@ static Int128 vhost_vdpa_section_end(const MemoryRegionSection *section)
>      return llend;
>  }
>
> -static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section)
> -{
> -    return (!memory_region_is_ram(section->mr) &&
> -            !memory_region_is_iommu(section->mr)) ||
> -           memory_region_is_protected(section->mr) ||
> -           /* vhost-vDPA doesn't allow MMIO to be mapped */
> -           memory_region_is_ram_device(section->mr) ||
> -           /*
> -            * Sizing an enabled 64-bit BAR can cause spurious mappings to
> -            * addresses in the upper part of the 64-bit address space. These
> -            * are never accessed by the CPU and beyond the address width of
> -            * some IOMMU hardware. TODO: VDPA should tell us the IOMMU width.
> -            */
> -           section->offset_within_address_space & (1ULL << 63);
> +static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
> +                                                uint64_t iova_min,
> +                                                uint64_t iova_max)
> +{
> +    Int128 llend;
> +
> +    if ((!memory_region_is_ram(section->mr) &&
> +         !memory_region_is_iommu(section->mr)) ||
> +        memory_region_is_protected(section->mr) ||
> +        /* vhost-vDPA doesn't allow MMIO to be mapped */
> +        memory_region_is_ram_device(section->mr)) {
> +        return true;
> +    }
> +
> +    if (section->offset_within_address_space < iova_min) {
> +        error_report("RAM section out of device range (min=%lu, addr=%lu)",
> +                     iova_min, section->offset_within_address_space);
> +        return true;
> +    }
> +
> +    llend = vhost_vdpa_section_end(section);
> +    if (int128_gt(llend, int128_make64(iova_max))) {
> +        error_report("RAM section out of device range (max=%lu, end addr=%lu)",
> +                     iova_max, int128_get64(llend));
> +        return true;
> +    }
> +
> +    return false;
>  }
>
>  static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> @@ -162,7 +176,8 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>      void *vaddr;
>      int ret;
>
> -    if (vhost_vdpa_listener_skipped_section(section)) {
> +    if (vhost_vdpa_listener_skipped_section(section, v->iova_range.first,
> +                                            v->iova_range.last)) {
>          return;
>      }
>
> @@ -220,7 +235,8 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
>      Int128 llend, llsize;
>      int ret;
>
> -    if (vhost_vdpa_listener_skipped_section(section)) {
> +    if (vhost_vdpa_listener_skipped_section(section, v->iova_range.first,
> +                                            v->iova_range.last)) {
>          return;
>      }
>
> @@ -288,9 +304,24 @@ static void vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
>      vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
>  }
>
> +static int vhost_vdpa_get_iova_range(struct vhost_vdpa *v)
> +{
> +    int ret;
> +
> +    ret = vhost_vdpa_call(v->dev, VHOST_VDPA_GET_IOVA_RANGE, &v->iova_range);
> +    if (ret != 0) {
> +        return ret;
> +    }

I think we need a fallback for the kernel that does not support
VHOST_VDPA_GET_IOVA_RANGE?

Thanks

> +
> +    trace_vhost_vdpa_get_iova_range(v->dev, v->iova_range.first,
> +                                    v->iova_range.last);
> +    return ret;
> +}
> +
>  static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
>  {
>      struct vhost_vdpa *v;
> +    int r;
>      assert(dev->vhost_ops->backend_type ==
Re: [PATCH v2 2/3] vdpa: Add vhost_vdpa_section_end
On Tue, Oct 5, 2021 at 9:49 PM Eugenio Pérez wrote:
>
> Abstract this operation, that will be reused when validating the region
> against the iova range that the device supports.
>
> Signed-off-by: Eugenio Pérez

Acked-by: Jason Wang

> ---
>  hw/virtio/vhost-vdpa.c | 22 +++---
>  1 file changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index ea1aa71ad8..be7c63b4ba 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -24,6 +24,19 @@
>  #include "trace.h"
>  #include "qemu-common.h"
>
> +/*
> + * Return one past the end of the end of section. Be careful with uint64_t
> + * conversions!
> + */
> +static Int128 vhost_vdpa_section_end(const MemoryRegionSection *section)
> +{
> +    Int128 llend = int128_make64(section->offset_within_address_space);
> +    llend = int128_add(llend, section->size);
> +    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
> +
> +    return llend;
> +}
> +
>  static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section)
>  {
>      return (!memory_region_is_ram(section->mr) &&
> @@ -160,10 +173,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
>      }
>
>      iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> -    llend = int128_make64(section->offset_within_address_space);
> -    llend = int128_add(llend, section->size);
> -    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
> -
> +    llend = vhost_vdpa_section_end(section);
>      if (int128_ge(int128_make64(iova), llend)) {
>          return;
>      }
> @@ -221,9 +231,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
>      }
>
>      iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> -    llend = int128_make64(section->offset_within_address_space);
> -    llend = int128_add(llend, section->size);
> -    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
> +    llend = vhost_vdpa_section_end(section);
>
>      trace_vhost_vdpa_listener_region_del(v, iova, int128_get64(llend));
>
> --
> 2.27.0
>
Re: [PATCH v2 1/3] vdpa: Skip protected ram IOMMU mappings
On Tue, Oct 5, 2021 at 9:49 PM Eugenio Pérez wrote:
>
> Following the logic of commit 56918a126ae ("memory: Add RAM_PROTECTED
> flag to skip IOMMU mappings") with VFIO, skip memory sections
> inaccessible via normal mechanisms, including DMA.
>
> Signed-off-by: Eugenio Pérez

Acked-by: Jason Wang

> ---
>  hw/virtio/vhost-vdpa.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 47d7a5a23d..ea1aa71ad8 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -28,6 +28,7 @@ static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section)
>  {
>      return (!memory_region_is_ram(section->mr) &&
>              !memory_region_is_iommu(section->mr)) ||
> +           memory_region_is_protected(section->mr) ||
>             /* vhost-vDPA doesn't allow MMIO to be mapped */
>             memory_region_is_ram_device(section->mr) ||
>             /*
> --
> 2.27.0
>
Re: [PATCH 0/9] More virtio hardening
On Mon, Oct 11, 2021 at 8:36 PM Michael S. Tsirkin wrote: > > On Mon, Oct 11, 2021 at 03:36:51PM +0800, Jason Wang wrote: > > On Tue, Oct 5, 2021 at 3:42 PM Michael S. Tsirkin wrote: > > > > > > On Mon, Sep 13, 2021 at 01:53:44PM +0800, Jason Wang wrote: > > > > Hi All: > > > > > > > > This series tries to do more hardening for virtio. > > > > > > > > patch 1 validates the num_queues for the virtio-blk device. > > > > patch 2-4 validates max_nr_ports for the virtio-console device. > > > > patch 5-7 harden virtio-pci interrupts to make sure no unexpected > > > > interrupt handler is triggered. If this makes sense we can do similar > > > > things in other transport drivers. > > > > patch 8-9 validate used ring length. > > > > > > > > Smoke tested on blk/net with packed=on/off and iommu_platform=on/off. > > > > > > > > Please review. > > > > > > > > Thanks > > > > > > So I poked at console at least, and I think I see > > > an issue: if the interrupt handler queues a work/bh, > > > then it can still run while reset is in progress. > > > > Looks like a bug which is unrelated to the hardening? > > Won't preventing use after free be relevant? Oh right. > I frankly don't know what hardening means then. > > E.g. the driver > > should sync with work/bh before reset. > > No, there's no way to fix it ATM without extra locks and state which I > think we should strive to avoid or make it generic, not per-driver, > since sync before reset is useless, new interrupts will just arrive and > queue more work. And a sync after reset is too late since the driver will > try to add buffers. Can we do something like 1) disable interrupt 2) sync bh Or I guess this is somehow what you meant in the following steps. > > Maybe we can break device. Two issues with that > - drivers might not be ready to handle add_buf failures > - restore needs to unbreak then and we don't have a way to do that yet > > So .. careful reading of all device drivers and hoping we don't mess > things up even more ... here we come. Yes.
> > > > > > > I sent a patch to fix it for console removal specifically, > > > but I suspect it's not enough e.g. freeze is still broken. > > > And note this has been reported without any TDX things - > > > it's not a malicious device issue, can be triggered just > > > by module unload. > > > > > > I am vaguely thinking about new APIs to disable/enable callbacks. > > > An alternative: > > > > > > 1. adding new remove_nocb/freeze_nocb calls > > > 2. disabling/enabling interrupts automatically around these > > > 3. gradually moving devices to using these > > > 4. once/if all device move, removing the old callbacks > > > > > > the advantage here is that we'll be sure calls are always > > > paired correctly. > > > > I'm not sure I get the idea, but my feeling is that it doesn't > > conflict with the interrupt hardening here (or at least the same > > method is required, e.g. NO_AUTO_EN). > > > > Thanks > > Right. It's not that it conflicts, it's that I was hoping that > > since you are working on hardening you can take up fixing that. > > Let me know whether you have the time. Thanks! I can do that.
Thanks

> > > > Jason Wang (9):
> > > >   virtio-blk: validate num_queues during probe
> > > >   virtio: add doc for validate() method
> > > >   virtio-console: switch to use .validate()
> > > >   virtio_console: validate max_nr_ports before trying to use it
> > > >   virtio_config: introduce a new ready method
> > > >   virtio_pci: harden MSI-X interrupts
> > > >   virtio-pci: harden INTX interrupts
> > > >   virtio_ring: fix typos in vring_desc_extra
> > > >   virtio_ring: validate used buffer length
> > > >
> > > >  drivers/block/virtio_blk.c         |  3 +-
> > > >  drivers/char/virtio_console.c      | 51 +-
> > > >  drivers/virtio/virtio_pci_common.c | 43 +
> > > >  drivers/virtio/virtio_pci_common.h |  7 ++--
> > > >  drivers/virtio/virtio_pci_legacy.c |  5 +--
> > > >  drivers/virtio/virtio_pci_modern.c |  6 ++--
> > > >  drivers/virtio/virtio_ring.c       | 27 ++--
> > > >  include/linux/virtio.h             |  1 +
> > > >  include/linux/virtio_config.h      |  6
> > > >  9 files changed, 118 insertions(+), 31 deletions(-)
> > > >
> > > > --
> > > > 2.25.1
Re: [PATCH v5 12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range()
On Mon, Oct 11, 2021 at 10:23:00AM -0700, Andi Kleen wrote:
>
> On 10/11/2021 12:58 AM, Christoph Hellwig wrote:
> > Just as last time: This does not make any sense. ioremap is shared
> > by definition.
>
> It's not necessarily shared with the host for confidential computing: for
> example BIOS mappings definitely should not be shared, but they're using
> ioremap today.

That just needs to be fixed.

> But if you have a better term please propose something. I tried to clarify
> it with "shared_host", but I don't know a better term.
>
> -Andi

The reason we have trouble is that it's not clear what the API means outside the realm of TDX. If we really, truly want an API that says "ioremap and it's a hardened driver" then I guess ioremap_hardened_driver is what you want.

--
MST
Re: [PATCH v5 16/16] x86/tdx: Add cmdline option to force use of ioremap_host_shared
On Mon, Oct 11, 2021 at 10:35:18AM -0700, Andi Kleen wrote:
>
> > Presumably bios code is in arch/x86 and drivers/acpi, right?
> > Up to 200 calls the majority of which is likely private ...
>
> Yes.
>
> > I don't have better ideas but the current setup will just
> > result in people making their guests vulnerable whenever they
> > want to allow device pass-through.
>
> Yes that's true. For current TDX our target is virtual devices only. But if
> pass-through usage will be really wide spread we may need to revisit.
>
> -Andi

I mean ... it's already wide spread. If we support it with TDX it will be used with TDX. If we don't then I guess it won't; exposing this kind of limitation in a userspace-visible way isn't great though. I guess it boils down to the fact that ioremap_host_shared is just not a great interface: users simply have no idea whether a given driver uses ioremap.

--
MST
Re: [PATCH v5 12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range()
On Mon, Oct 11, 2021 at 10:32:23AM -0700, Andi Kleen wrote: > > > Because it does not end with I/O operations, that's a trivial example. > > module unloading is famous for being racy: I just re-read that part of > > virtio drivers and sure enough we have bugs there, this is after > > they have presumably been audited, so a TDX guest is better off > > just disabling hot-unplug completely, and hotplug isn't far behind. > > These all shouldn't matter for a confidential guest. The only way it can be > attacked is through IO, everything else is protected by hardware. > > > Also it would all require doing something at the guest level, which we > assume is not malicious. > > > > Malicious filesystems can exploit many linux systems unless > > you take pains to limit what is mounted and how. > > That's expected to be handled by authenticated dmcrypt and similar. > Hardening at this level has been done for many years. It's possible to do it like this, sure. But that's not the only configuration, userspace needs to be smart about setting things up. Which is my point really. > > > Networking devices tend to get into the default namespaces and can > > do more or less whatever CAP_NET_ADMIN can. > > Etc. > > > Networking should be already hardened, otherwise you would have much worse > problems today. Same thing. NFS is pretty common, you are saying don't do it then. Fair enough but again, arbitrary configs just aren't going to be secure. > > > > hange in your subsystem here. > > Well I commented on the API patch, not the virtio patch. > > If it's a way for a driver to say "I am hardened > > and audited" then I guess it should at least say so. > > > This is handled by the central allow list. We intentionally didn't want each > driver to declare itself, but have a central list where changes will get > more scrutiny than random driver code. Makes sense. 
Additionally, distros can tweak that to their heart's content, selecting the functionality/security balance that makes sense for them. > But then there are the additional opt-ins for the low level firewall. These > are in the API. I don't see how it could be done at the driver level, unless > you want to pass in a struct device everywhere? I am just saying don't do it then. Don't build drivers that the distro does not want to support into the kernel. And don't load them when they are modules. > > > > How about creating a defconfig that makes sense for TDX then? > > > TDX can be used in many different ways, I don't think a defconfig is > > > practical. > > > > > > In theory you could do some Kconfig dependency (at the pain point of > > > having > > > separate kernel binaries), but why not just do it at run time then if you > > > maintain the list anyways. That's much easier and saner for everyone. In > > > the > > > past we usually always ended up with runtime mechanisms for similar things > > > anyways. > > > > > > Also it turns out that the filter mechanisms are needed for some arch > > > drivers which are not even configurable, so alone it's probably not > > > enough, > > > > I guess they aren't really needed though right, or you won't try to > > filter them? > > We're addressing most of them with the device filter for platform drivers. > But since we cannot stop them doing ioremap IO in their init code they also > need the low level firewall. > > Some others that cannot be addressed have explicit disables. > > > > So make them configurable? > > Why not just fix the runtime? It's much saner for everyone. Proposing to do > things at build time sounds like we're in Linux 0.99 days. > > -Andi

Um. Tweaking driver code is not just build time, it's development time. At least with kconfig you don't need to patch your kernel.

--
MST
Re: [PATCH v5 16/16] x86/tdx: Add cmdline option to force use of ioremap_host_shared
Presumably bios code is in arch/x86 and drivers/acpi, right? Up to 200 calls the majority of which is likely private ...

Yes.

I don't have better ideas but the current setup will just result in people making their guests vulnerable whenever they want to allow device pass-through.

Yes that's true. For current TDX our target is virtual devices only. But if pass-through usage will be really wide spread we may need to revisit.

-Andi
Re: [PATCH v5 12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range()
Because it does not end with I/O operations, that's a trivial example. Module unloading is famous for being racy: I just re-read that part of the virtio drivers and sure enough we have bugs there; this is after they have presumably been audited, so a TDX guest is better off just disabling hot-unplug completely, and hotplug isn't far behind.

These all shouldn't matter for a confidential guest. The only way it can be attacked is through IO, everything else is protected by hardware. Also it would all require doing something at the guest level, which we assume is not malicious.

Malicious filesystems can exploit many linux systems unless you take pains to limit what is mounted and how.

That's expected to be handled by authenticated dmcrypt and similar. Hardening at this level has been done for many years.

Networking devices tend to get into the default namespaces and can do more or less whatever CAP_NET_ADMIN can. Etc.

Networking should be already hardened, otherwise you would have much worse problems today.

change in your subsystem here. Well I commented on the API patch, not the virtio patch. If it's a way for a driver to say "I am hardened and audited" then I guess it should at least say so.

This is handled by the central allow list. We intentionally didn't want each driver to declare itself, but have a central list where changes will get more scrutiny than random driver code. But then there are the additional opt-ins for the low level firewall. These are in the API. I don't see how it could be done at the driver level, unless you want to pass in a struct device everywhere?

How about creating a defconfig that makes sense for TDX then?

TDX can be used in many different ways, I don't think a defconfig is practical. In theory you could do some Kconfig dependency (at the pain point of having separate kernel binaries), but why not just do it at run time then if you maintain the list anyways. That's much easier and saner for everyone.
In the past we usually always ended up with runtime mechanism for similar things anyways. Also it turns out that the filter mechanisms are needed for some arch drivers which are not even configurable, so alone it's probably not enough, I guess they aren't really needed though right, or you won't try to filter them? We're addressing most of them with the device filter for platform drivers. But since we cannot stop them doing ioremap IO in their init code they also need the low level firewall. Some others that cannot be addressed have explicit disables. So make them configurable? Why not just fix the runtime? It's much saner for everyone. Proposing to do things at build time sounds like we're in Linux 0.99 days. -Andi ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v5 12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range()
On 10/11/2021 12:58 AM, Christoph Hellwig wrote:
> Just as last time: This does not make any sense. ioremap is shared by
> definition.

It's not necessarily shared with the host for confidential computing: for example BIOS mappings definitely should not be shared, but they're using ioremap today. But if you have a better term please propose something. I tried to clarify it with "shared_host", but I don't know a better term.

-Andi
Re: [PATCH 0/9] More virtio hardening
On Mon, Oct 11, 2021 at 03:36:51PM +0800, Jason Wang wrote:
> On Tue, Oct 5, 2021 at 3:42 PM Michael S. Tsirkin wrote:
> >
> > On Mon, Sep 13, 2021 at 01:53:44PM +0800, Jason Wang wrote:
> > > Hi All:
> > >
> > > This series tries to do more hardening for virtio.
> > >
> > > patch 1 validates the num_queues for the virtio-blk device.
> > > patch 2-4 validates max_nr_ports for the virtio-console device.
> > > patch 5-7 harden virtio-pci interrupts to make sure no unexpected
> > > interrupt handler is triggered. If this makes sense we can do similar
> > > things in other transport drivers.
> > > patch 8-9 validate used ring length.
> > >
> > > Smoke test on blk/net with packed=on/off and iommu_platform=on/off.
> > >
> > > Please review.
> > >
> > > Thanks
> >
> > So I poked at console at least, and I think I see
> > an issue: if the interrupt handler queues a work/bh,
> > then it can still run while reset is in progress.
>
> Looks like a bug which is unrelated to the hardening?

Won't preventing use after free be relevant? I frankly don't know what hardening means then.

> E.g. the driver
> should sync with work/bh before reset.

No, there's no way to fix it ATM without extra locks and state which I think we should strive to avoid or make it generic, not per-driver, since sync before reset is useless: new interrupts will just arrive and queue more work. And a sync after reset is too late since the driver will try to add buffers.

Maybe we can break the device. Two issues with that
- drivers might not be ready to handle add_buf failures
- restore needs to unbreak then and we don't have a way to do that yet

So .. careful reading of all device drivers and hoping we don't mess things up even more ... here we come.

> >
> > I sent a patch to fix it for console removal specifically,
> > but I suspect it's not enough e.g. freeze is still broken.
> > And note this has been reported without any TDX things -
> > it's not a malicious device issue, can be triggered just
> > by module unload.
> >
> > I am vaguely thinking about new APIs to disable/enable callbacks.
> > An alternative:
> >
> > 1. adding new remove_nocb/freeze_nocb calls
> > 2. disabling/enabling interrupts automatically around these
> > 3. gradually moving devices to using these
> > 4. once/if all devices move, removing the old callbacks
> >
> > the advantage here is that we'll be sure calls are always
> > paired correctly.
>
> I'm not sure I get the idea, but my feeling is that it doesn't
> conflict with the interrupt hardening here (or at least the same
> method is required e.g. NO_AUTO_EN).
>
> Thanks

Right. It's not that it conflicts, it's that I was hoping that since you are working on hardening you can take up fixing that. Let me know whether you have the time. Thanks!

> > >
> > > Jason Wang (9):
> > >   virtio-blk: validate num_queues during probe
> > >   virtio: add doc for validate() method
> > >   virtio-console: switch to use .validate()
> > >   virtio_console: validate max_nr_ports before trying to use it
> > >   virtio_config: introduce a new ready method
> > >   virtio_pci: harden MSI-X interrupts
> > >   virtio-pci: harden INTX interrupts
> > >   virtio_ring: fix typos in vring_desc_extra
> > >   virtio_ring: validate used buffer length
> > >
> > >  drivers/block/virtio_blk.c         |  3 +-
> > >  drivers/char/virtio_console.c      | 51 +-
> > >  drivers/virtio/virtio_pci_common.c | 43 +
> > >  drivers/virtio/virtio_pci_common.h |  7 ++--
> > >  drivers/virtio/virtio_pci_legacy.c |  5 +--
> > >  drivers/virtio/virtio_pci_modern.c |  6 ++--
> > >  drivers/virtio/virtio_ring.c       | 27 ++--
> > >  include/linux/virtio.h             |  1 +
> > >  include/linux/virtio_config.h      |  6
> > >  9 files changed, 118 insertions(+), 31 deletions(-)
> > >
> > > --
> > > 2.25.1
Re: [PATCH v5 16/16] x86/tdx: Add cmdline option to force use of ioremap_host_shared
On Sun, Oct 10, 2021 at 07:39:55PM -0700, Andi Kleen wrote:
> >
> > The connection is quite unfortunate IMHO.
> > Can't there be an option
> > that unbreaks drivers *without* opening up security holes by
> > making BIOS shared?
>
> That would require new low level APIs that distinguish both cases, and a
> tree sweep.
>
> -Andi

Presumably BIOS code is in arch/x86 and drivers/acpi, right? Up to 200 calls, the majority of which are likely private ... I don't have better ideas but the current setup will just result in people making their guests vulnerable whenever they want to allow device pass-through.

-- MST
Re: [PATCH v5 12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range()
On Sun, Oct 10, 2021 at 03:22:39PM -0700, Andi Kleen wrote:
> > > To which Andi replied
> > > One problem with removing the ioremap opt-in is that
> > > it's still possible for drivers to get at devices without going through
> > > probe.
> > >
> > > To which Greg replied:
> > > https://lore.kernel.org/all/yvxbnj431yiww...@kroah.com/
> > > If there are in-kernel PCI drivers that do not do this, they need to be
> > > fixed today.
> > >
> > > Can you guys resolve the differences here?
>
> I addressed this in my other mail, but we may need more discussion.

Hopefully Greg will reply to that one.

> > And once they are resolved, mention this in the commit log so
> > I don't get to re-read the series just to find out nothing
> > changed in this respect?
> >
> > I frankly do not believe we are anywhere near being able to harden
> > an arbitrary kernel config against attack.
>
> Why not? Device filter and the opt-ins together are a fairly strong
> mechanism.

Because it does not end with I/O operations, that's a trivial example. Module unloading is famous for being racy: I just re-read that part of virtio drivers and sure enough we have bugs there, this is after they have presumably been audited, so a TDX guest is better off just disabling hot-unplug completely, and hotplug isn't far behind. Malicious filesystems can exploit many Linux systems unless you take pains to limit what is mounted and how. Networking devices tend to get into the default namespaces and can do more or less whatever CAP_NET_ADMIN can. Etc.

I am not saying this makes the effort worthless, I am saying userspace better know very well what it's doing, and the kernel better be configured in a very specific way.

> And it's not that they're a lot of code or super complicated either.
>
> You're essentially objecting to a single line change in your subsystem here.

Well I commented on the API patch, not the virtio patch. If it's a way for a driver to say "I am hardened and audited" then I guess it should at least say so.
It has nothing to do with host or sharing, that's an implementation detail, and it obscures the actual limitations of the approach, in that eventually in an ideal world all drivers would be secure and use this API. Yes, if that's the API that PCI gains then virtio will use it.

> > How about creating a defconfig that makes sense for TDX then?
>
> TDX can be used in many different ways, I don't think a defconfig is
> practical.
>
> In theory you could do some Kconfig dependency (at the pain point of having
> separate kernel binaries), but why not just do it at run time then if you
> maintain the list anyways. That's much easier and saner for everyone. In the
> past we usually always ended up with runtime mechanism for similar things
> anyways.
>
> Also it turns out that the filter mechanisms are needed for some arch
> drivers which are not even configurable, so alone it's probably not enough,

I guess they aren't really needed though right, or you won't try to filter them? So make them configurable?

> > Anyone deviating from that better know what they are doing,
> > this API tweaking is just putting policy into the kernel ...
>
> Hardening drivers is kernel policy. It cannot be done anywhere else.
>
> -Andi

To clarify, the policy is which drivers to load into the kernel.

-- MST
Re: [PATCH v5] virtio-blk: Add validation for block size in config space
On Tue, Oct 05, 2021 at 06:42:43AM -0400, Michael S. Tsirkin wrote:
> Stefan also pointed out this duplicates the logic from
>
> 	if (blksize < 512 || blksize > PAGE_SIZE || !is_power_of_2(blksize))
> 		return -EINVAL;
>
> and a bunch of other places.
>
> Would it be acceptable for the blk layer to validate the input
> instead of having each driver do its own thing?
> Maybe inside blk_queue_logical_block_size?

I'm pretty sure we went down that path before. Let's just add a helper just for that check for now as part of this series. Actually validating it in blk_queue_logical_block_size seems like a good idea, but returning errors from that has a long tail.
Re: [PATCH] drm/virtio: fix the missed drm_gem_object_put() in virtio_gpu_user_framebuffer_create()
On Sat, Oct 09, 2021 at 05:09:20PM +0800, Jing Xiangfeng wrote:
> virtio_gpu_user_framebuffer_create() misses a call to drm_gem_object_put()
> in an error path. Add the missed function call to fix it.

Pushed to drm-misc-next.

thanks,
  Gerd
Re: [PATCH RFC v1 03/11] iommu/virtio: Handle incoming page faults
Hi Vivek,

On Mon, Oct 11, 2021 at 01:41:15PM +0530, Vivek Gautam wrote:
> > > +	list_for_each_entry(ep, &viommu->endpoints, list) {
> > > +		if (ep->eid == endpoint) {
> > > +			vdev = ep->vdev;
>
> I have a question here though -
> Is the endpoint ID unique across all the endpoints available per 'viommu_dev' or
> per 'viommu_domain'?
> If it is per 'viommu_domain' then the above list is also incorrect.
> As you pointed to in the patch [1] -
> [PATCH RFC v1 02/11] iommu/virtio: Maintain a list of endpoints served
> by viommu_dev
> I am planning to add the endpoint ID into a static global xarray in
> viommu_probe_device() as below:
>
> 	vdev_for_each_id(i, eid, vdev) {
> 		ret = xa_insert(&viommu_ep_ids, eid, vdev, GFP_KERNEL);
> 		if (ret)
> 			goto err_free_dev;
> 	}
>
> and replace the above list traversal as below:
>
> 	xa_lock_irqsave(&viommu_ep_ids, flags);
> 	xa_for_each(&viommu_ep_ids, eid, vdev) {
> 		if (eid == endpoint) {
> 			ret = iommu_report_device_fault(vdev->dev, &fault_evt);
> 			if (ret)
> 				dev_err(vdev->dev, "Couldn't handle page request\n");
> 		}
> 	}
> 	xa_unlock_irqrestore(&viommu_ep_ids, flags);
>
> But using a global xarray would also be incorrect if the endpoint IDs are
> global across 'viommu_domain'.
>
> I need to find the correct 'viommu_endpoint' to call iommu_report_device_fault()
> with the correct device.

The endpoint IDs are only unique across viommu_dev, so a global xarray wouldn't work but one in viommu_dev would. In vdomain it doesn't work either because we can't get to the domain from the fault handler without first finding the endpoint.

Thanks,
Jean
Re: [PATCH v3 1/1] virtio: write back F_VERSION_1 before validate
On Mon, Oct 11 2021, Halil Pasic wrote: > The virtio specification virtio-v1.1-cs01 states: "Transitional devices > MUST detect Legacy drivers by detecting that VIRTIO_F_VERSION_1 has not > been acknowledged by the driver." This is exactly what QEMU as of 6.1 > has done relying solely on VIRTIO_F_VERSION_1 for detecting that. > > However, the specification also says: "... the driver MAY read (but MUST > NOT write) the device-specific configuration fields to check that it can > support the device ..." before setting FEATURES_OK. > > In that case, any transitional device relying solely on > VIRTIO_F_VERSION_1 for detecting legacy drivers will return data in > legacy format. In particular, this implies that it is in big endian > format for big endian guests. This naturally confuses the driver which > expects little endian in the modern mode. > > It is probably a good idea to amend the spec to clarify that > VIRTIO_F_VERSION_1 can only be relied on after the feature negotiation > is complete. Before validate callback existed, config space was only > read after FEATURES_OK. However, we already have two regressions, so > let's address this here as well. > > The regressions affect the VIRTIO_NET_F_MTU feature of virtio-net and > the VIRTIO_BLK_F_BLK_SIZE feature of virtio-blk for BE guests when > virtio 1.0 is used on both sides. The latter renders virtio-blk unusable > with DASD backing, because things simply don't work with the default. > See Fixes tags for relevant commits. > > For QEMU, we can work around the issue by writing out the feature bits > with VIRTIO_F_VERSION_1 bit set. We (ab)use the finalize_features > config op for this. This isn't enough to address all vhost devices since > these do not get the features until FEATURES_OK, however it looks like > the affected devices actually never handled the endianness for legacy > mode correctly, so at least that's not a regression. > > No devices except virtio net and virtio blk seem to be affected. 
>
> Long term the right thing to do is to fix the hypervisors.
>
> Cc: #v4.11
> Signed-off-by: Halil Pasic
> Fixes: 82e89ea077b9 ("virtio-blk: Add validation for block size in config
> space")
> Fixes: fe36cbe0671e ("virtio_net: clear MTU when out of range")
> Reported-by: mark...@us.ibm.com
> Reviewed-by: Cornelia Huck
> ---
>
> @Connie: I made some more commit message changes to accommodate Michael's
> requests. I just assumed these will work for you as well and kept your
> r-b. Please shout at me if it needs to be dropped :)

No need to shout, still looks good to me :)

> ---
>  drivers/virtio/virtio.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 0a5b54034d4b..236081afe9a2 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -239,6 +239,17 @@ static int virtio_dev_probe(struct device *_d)
>  		driver_features_legacy = driver_features;
>  	}
>
> +	/*
> +	 * Some devices detect legacy solely via F_VERSION_1. Write
> +	 * F_VERSION_1 to force LE config space accesses before FEATURES_OK for
> +	 * these when needed.
> +	 */
> +	if (drv->validate && !virtio_legacy_is_little_endian()
> +			  && device_features & BIT_ULL(VIRTIO_F_VERSION_1)) {
> +		dev->features = BIT_ULL(VIRTIO_F_VERSION_1);
> +		dev->config->finalize_features(dev);
> +	}
> +
>  	if (device_features & (1ULL << VIRTIO_F_VERSION_1))
>  		dev->features = driver_features & device_features;
>  	else
>
> base-commit: 60a9483534ed0d99090a2ee1d4bb0b8179195f51
> --
> 2.25.1
Re: [PATCH v5 12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range()
Just as last time: This does not make any sense. ioremap is shared by definition.
Re: [PATCH 0/9] More virtio hardening
On Tue, Oct 5, 2021 at 3:42 PM Michael S. Tsirkin wrote:
>
> On Mon, Sep 13, 2021 at 01:53:44PM +0800, Jason Wang wrote:
> > Hi All:
> >
> > This series tries to do more hardening for virtio.
> >
> > patch 1 validates the num_queues for the virtio-blk device.
> > patch 2-4 validates max_nr_ports for the virtio-console device.
> > patch 5-7 harden virtio-pci interrupts to make sure no unexpected
> > interrupt handler is triggered. If this makes sense we can do similar
> > things in other transport drivers.
> > patch 8-9 validate used ring length.
> >
> > Smoke test on blk/net with packed=on/off and iommu_platform=on/off.
> >
> > Please review.
> >
> > Thanks
>
> So I poked at console at least, and I think I see
> an issue: if the interrupt handler queues a work/bh,
> then it can still run while reset is in progress.

Looks like a bug which is unrelated to the hardening? E.g. the driver should sync with work/bh before reset.

> I sent a patch to fix it for console removal specifically,
> but I suspect it's not enough e.g. freeze is still broken.
> And note this has been reported without any TDX things -
> it's not a malicious device issue, can be triggered just
> by module unload.
>
> I am vaguely thinking about new APIs to disable/enable callbacks.
> An alternative:
>
> 1. adding new remove_nocb/freeze_nocb calls
> 2. disabling/enabling interrupts automatically around these
> 3. gradually moving devices to using these
> 4. once/if all devices move, removing the old callbacks
>
> the advantage here is that we'll be sure calls are always
> paired correctly.

I'm not sure I get the idea, but my feeling is that it doesn't conflict with the interrupt hardening here (or at least the same method is required e.g. NO_AUTO_EN).
Thanks

> >
> > Jason Wang (9):
> >   virtio-blk: validate num_queues during probe
> >   virtio: add doc for validate() method
> >   virtio-console: switch to use .validate()
> >   virtio_console: validate max_nr_ports before trying to use it
> >   virtio_config: introduce a new ready method
> >   virtio_pci: harden MSI-X interrupts
> >   virtio-pci: harden INTX interrupts
> >   virtio_ring: fix typos in vring_desc_extra
> >   virtio_ring: validate used buffer length
> >
> >  drivers/block/virtio_blk.c         |  3 +-
> >  drivers/char/virtio_console.c      | 51 +-
> >  drivers/virtio/virtio_pci_common.c | 43 +
> >  drivers/virtio/virtio_pci_common.h |  7 ++--
> >  drivers/virtio/virtio_pci_legacy.c |  5 +--
> >  drivers/virtio/virtio_pci_modern.c |  6 ++--
> >  drivers/virtio/virtio_ring.c       | 27 ++--
> >  include/linux/virtio.h             |  1 +
> >  include/linux/virtio_config.h      |  6
> >  9 files changed, 118 insertions(+), 31 deletions(-)
> >
> > --
> > 2.25.1