Re: [PATCH] vduse: make vduse_class constant

2023-10-07 Thread Greg Kroah-Hartman
On Sun, Oct 08, 2023 at 02:41:22AM -0400, Michael S. Tsirkin wrote:
> On Sun, Oct 08, 2023 at 08:40:05AM +0200, Greg Kroah-Hartman wrote:
> > On Sun, Oct 08, 2023 at 02:20:52AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Oct 06, 2023 at 04:30:44PM +0200, Greg Kroah-Hartman wrote:
> > > > Now that the driver core allows for struct class to be in read-only
> > > > memory, we should declare all 'class' structures at build time,
> > > > placing them into read-only memory, instead of having them be
> > > > dynamically allocated at runtime.
> > > > 
> > > > Cc: "Michael S. Tsirkin" 
> > > > Cc: Jason Wang 
> > > > Cc: Xuan Zhuo 
> > > > Cc: Xie Yongji 
> > > > Signed-off-by: Greg Kroah-Hartman 
> > > 
> > > Acked-by: Michael S. Tsirkin 
> > > 
> > > Greg should I merge it or do you intend to merge all these patches?
> > 
> > "all"?  There's loads of them for all sorts of subsystems, so feel free
> > to take it through your subsystem tree if you want.  I usually scoop up
> > the ones that no one picks after a release and take them through my
> > tree, to pick up the stragglers.
> > 
> > So it's your call, whatever is easier for you is fine for me.
> > 
> > thanks,
> > 
> > greg k-h
> 
> To clarify which commit does this depend on?

The 6.4 kernel release :)
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vduse: make vduse_class constant

2023-10-07 Thread Greg Kroah-Hartman
On Sun, Oct 08, 2023 at 02:20:52AM -0400, Michael S. Tsirkin wrote:
> On Fri, Oct 06, 2023 at 04:30:44PM +0200, Greg Kroah-Hartman wrote:
> > Now that the driver core allows for struct class to be in read-only
> > memory, we should declare all 'class' structures at build time,
> > placing them into read-only memory, instead of having them be
> > dynamically allocated at runtime.
> > 
> > Cc: "Michael S. Tsirkin" 
> > Cc: Jason Wang 
> > Cc: Xuan Zhuo 
> > Cc: Xie Yongji 
> > Signed-off-by: Greg Kroah-Hartman 
> 
> Acked-by: Michael S. Tsirkin 
> 
> Greg should I merge it or do you intend to merge all these patches?

"all"?  There's loads of them for all sorts of subsystems, so feel free
to take it through your subsystem tree if you want.  I usually scoop up
the ones that no one picks after a release and take them through my
tree, to pick up the stragglers.

So it's your call, whatever is easier for you is fine for me.

thanks,

greg k-h


[PATCH] vduse: make vduse_class constant

2023-10-06 Thread Greg Kroah-Hartman
Now that the driver core allows for struct class to be in read-only
memory, we should declare all 'class' structures at build time,
placing them into read-only memory, instead of having them be
dynamically allocated at runtime.

Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Xuan Zhuo 
Cc: Xie Yongji 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 40 --
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index df7869537ef1..0ddd4b8abecb 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -134,7 +134,6 @@ static DEFINE_MUTEX(vduse_lock);
 static DEFINE_IDR(vduse_idr);
 
 static dev_t vduse_major;
-static struct class *vduse_class;
 static struct cdev vduse_ctrl_cdev;
 static struct cdev vduse_cdev;
 static struct workqueue_struct *vduse_irq_wq;
@@ -1528,6 +1527,16 @@ static const struct kobj_type vq_type = {
.default_groups = vq_groups,
 };
 
+static char *vduse_devnode(const struct device *dev, umode_t *mode)
+{
+   return kasprintf(GFP_KERNEL, "vduse/%s", dev_name(dev));
+}
+
+static const struct class vduse_class = {
+   .name = "vduse",
+   .devnode = vduse_devnode,
+};
+
 static void vduse_dev_deinit_vqs(struct vduse_dev *dev)
 {
int i;
@@ -1638,7 +1647,7 @@ static int vduse_destroy_dev(char *name)
mutex_unlock(&dev->lock);
 
vduse_dev_reset(dev);
-   device_destroy(vduse_class, MKDEV(MAJOR(vduse_major), dev->minor));
+   device_destroy(&vduse_class, MKDEV(MAJOR(vduse_major), dev->minor));
idr_remove(&vduse_idr, dev->minor);
kvfree(dev->config);
vduse_dev_deinit_vqs(dev);
@@ -1805,7 +1814,7 @@ static int vduse_create_dev(struct vduse_dev_config *config,
 
dev->minor = ret;
dev->msg_timeout = VDUSE_MSG_DEFAULT_TIMEOUT;
-   dev->dev = device_create_with_groups(vduse_class, NULL,
+   dev->dev = device_create_with_groups(&vduse_class, NULL,
MKDEV(MAJOR(vduse_major), dev->minor),
dev, vduse_dev_groups, "%s", config->name);
if (IS_ERR(dev->dev)) {
@@ -1821,7 +1830,7 @@ static int vduse_create_dev(struct vduse_dev_config *config,
 
return 0;
 err_vqs:
-   device_destroy(vduse_class, MKDEV(MAJOR(vduse_major), dev->minor));
+   device_destroy(&vduse_class, MKDEV(MAJOR(vduse_major), dev->minor));
 err_dev:
idr_remove(&vduse_idr, dev->minor);
 err_idr:
@@ -1934,11 +1943,6 @@ static const struct file_operations vduse_ctrl_fops = {
.llseek = noop_llseek,
 };
 
-static char *vduse_devnode(const struct device *dev, umode_t *mode)
-{
-   return kasprintf(GFP_KERNEL, "vduse/%s", dev_name(dev));
-}
-
 struct vduse_mgmt_dev {
struct vdpa_mgmt_dev mgmt_dev;
struct device dev;
@@ -2082,11 +2086,9 @@ static int vduse_init(void)
int ret;
struct device *dev;
 
-   vduse_class = class_create("vduse");
-   if (IS_ERR(vduse_class))
-   return PTR_ERR(vduse_class);
-
-   vduse_class->devnode = vduse_devnode;
+   ret = class_register(&vduse_class);
+   if (ret)
+   return ret;
 
ret = alloc_chrdev_region(&vduse_major, 0, VDUSE_DEV_MAX, "vduse");
if (ret)
@@ -2099,7 +2101,7 @@ static int vduse_init(void)
if (ret)
goto err_ctrl_cdev;
 
-   dev = device_create(vduse_class, NULL, vduse_major, NULL, "control");
+   dev = device_create(&vduse_class, NULL, vduse_major, NULL, "control");
if (IS_ERR(dev)) {
ret = PTR_ERR(dev);
goto err_device;
@@ -2141,13 +2143,13 @@ static int vduse_init(void)
 err_wq:
cdev_del(&vduse_cdev);
 err_cdev:
-   device_destroy(vduse_class, vduse_major);
+   device_destroy(&vduse_class, vduse_major);
 err_device:
cdev_del(&vduse_ctrl_cdev);
 err_ctrl_cdev:
unregister_chrdev_region(vduse_major, VDUSE_DEV_MAX);
 err_chardev_region:
-   class_destroy(vduse_class);
+   class_unregister(&vduse_class);
return ret;
 }
 module_init(vduse_init);
@@ -2159,10 +2161,10 @@ static void vduse_exit(void)
destroy_workqueue(vduse_irq_bound_wq);
destroy_workqueue(vduse_irq_wq);
cdev_del(&vduse_cdev);
-   device_destroy(vduse_class, vduse_major);
+   device_destroy(&vduse_class, vduse_major);
cdev_del(&vduse_ctrl_cdev);
unregister_chrdev_region(vduse_major, VDUSE_DEV_MAX);
-   class_destroy(vduse_class);
+   class_unregister(&vduse_class);
 }
 module_exit(vduse_exit);
 
-- 
2.42.0
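
The pattern in this patch (a descriptor fully initialized at build time, then
handed to a register call by address) can be sketched in plain userspace C.
This is only an illustration under stated assumptions: struct demo_class, the
demo_ functions and the fixed-size registry are hypothetical stand-ins, not the
kernel's struct class, class_register() or class_unregister().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct class. */
struct demo_class {
	const char *name;
	/* Returns a malloc()ed node path; the caller frees it. */
	char *(*devnode)(const char *dev_name);
};

#define DEMO_MAX_CLASSES 8

/* The registry stores pointers to const, so descriptors can be const objects. */
static const struct demo_class *demo_registry[DEMO_MAX_CLASSES];

static char *demo_vduse_devnode(const char *dev_name)
{
	size_t len = strlen("vduse/") + strlen(dev_name) + 1;
	char *path = malloc(len);

	if (path)
		snprintf(path, len, "vduse/%s", dev_name);
	return path;
}

/* Fully initialized at build time; no field is patched in at runtime. */
static const struct demo_class demo_vduse_class = {
	.name = "vduse",
	.devnode = demo_vduse_devnode,
};

/* Returns 0 on success, -EEXIST for a duplicate name, -ENOSPC when full. */
static int demo_class_register(const struct demo_class *cls)
{
	int i, free_slot = -1;

	assert(cls && cls->name);
	for (i = 0; i < DEMO_MAX_CLASSES; i++) {
		if (!demo_registry[i]) {
			if (free_slot < 0)
				free_slot = i;
		} else if (!strcmp(demo_registry[i]->name, cls->name)) {
			return -EEXIST;
		}
	}
	if (free_slot < 0)
		return -ENOSPC;
	demo_registry[free_slot] = cls;
	return 0;
}

/* Drops the registry's reference; the descriptor itself is never freed. */
static void demo_class_unregister(const struct demo_class *cls)
{
	int i;

	for (i = 0; i < DEMO_MAX_CLASSES; i++)
		if (demo_registry[i] == cls)
			demo_registry[i] = NULL;
}
```

Because the registry never writes through the pointer it is given, the
descriptor can be declared const, which is what lets the toolchain place it in
a read-only section. The old style needed a writable object only because the
devnode callback was assigned after allocation.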



[PATCH 7/9] virtio_console: make port class a static const structure

2023-06-20 Thread Greg Kroah-Hartman
From: Ivan Orlov 

Now that the driver core allows for struct class to be in read-only
memory, remove the class field of the ports_driver_data structure and
create the port_class static class structure declared at build time,
which places it into read-only memory, instead of having it be
dynamically allocated at load time.

Cc: Amit Shah 
Cc: Arnd Bergmann 
Cc: virtualization@lists.linux-foundation.org
Suggested-by: Greg Kroah-Hartman 
Signed-off-by: Ivan Orlov 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/char/virtio_console.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index b65c809a4e97..1f8da0a71ce9 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -40,9 +40,6 @@
  * across multiple devices and multiple ports per device.
  */
 struct ports_driver_data {
-   /* Used for registering chardevs */
-   struct class *class;
-
/* Used for exporting per-port information to debugfs */
struct dentry *debugfs_dir;
 
@@ -55,6 +52,10 @@ struct ports_driver_data {
 
 static struct ports_driver_data pdrvdata;
 
+static const struct class port_class = {
+   .name = "virtio-ports",
+};
+
 static DEFINE_SPINLOCK(pdrvdata_lock);
 static DECLARE_COMPLETION(early_console_added);
 
@@ -1399,7 +1400,7 @@ static int add_port(struct ports_device *portdev, u32 id)
"Error %d adding cdev for port %u\n", err, id);
goto free_cdev;
}
-   port->dev = device_create(pdrvdata.class, &port->portdev->vdev->dev,
+   port->dev = device_create(&port_class, &port->portdev->vdev->dev,
  devt, port, "vport%up%u",
  port->portdev->vdev->index, id);
if (IS_ERR(port->dev)) {
@@ -1465,7 +1466,7 @@ static int add_port(struct ports_device *portdev, u32 id)
 
 free_inbufs:
 free_device:
-   device_destroy(pdrvdata.class, port->dev->devt);
+   device_destroy(&port_class, port->dev->devt);
 free_cdev:
cdev_del(port->cdev);
 free_port:
@@ -1540,7 +1541,7 @@ static void unplug_port(struct port *port)
port->portdev = NULL;
 
sysfs_remove_group(&port->dev->kobj, &port_attribute_group);
-   device_destroy(pdrvdata.class, port->dev->devt);
+   device_destroy(&port_class, port->dev->devt);
cdev_del(port->cdev);
 
debugfs_remove(port->debugfs_file);
@@ -2244,12 +2245,9 @@ static int __init virtio_console_init(void)
 {
int err;
 
-   pdrvdata.class = class_create("virtio-ports");
-   if (IS_ERR(pdrvdata.class)) {
-   err = PTR_ERR(pdrvdata.class);
-   pr_err("Error %d creating virtio-ports class\n", err);
+   err = class_register(&port_class);
+   if (err)
return err;
-   }
 
pdrvdata.debugfs_dir = debugfs_create_dir("virtio-ports", NULL);
INIT_LIST_HEAD(&pdrvdata.consoles);
@@ -2271,7 +2269,7 @@ static int __init virtio_console_init(void)
unregister_virtio_driver(&virtio_console);
 free:
debugfs_remove_recursive(pdrvdata.debugfs_dir);
-   class_destroy(pdrvdata.class);
+   class_unregister(&port_class);
return err;
 }
 
@@ -2282,7 +2280,7 @@ static void __exit virtio_console_fini(void)
unregister_virtio_driver(&virtio_console);
unregister_virtio_driver(&virtio_rproc_serial);
 
-   class_destroy(pdrvdata.class);
+   class_unregister(&port_class);
debugfs_remove_recursive(pdrvdata.debugfs_dir);
 }
 module_init(virtio_console_init);
-- 
2.41.0



Re: [RFC PATCH 1/2] mm: multigen-LRU: working set reporting

2023-05-10 Thread Greg Kroah-Hartman
On Wed, May 10, 2023 at 02:54:18AM +0800, Yuanchu Xie wrote:
> From: talumbau 

Please fix the name here.

> 
> A single patch to be broken up into multiple patches.

What does this mean?

> - Add working set reporting structure.
> - Add per-node and per-memcg interfaces for working set reporting.
> - Implement working set backend for MGLRU.

Please break it up to be reviewable, otherwise no one will review it.

thanks,

greg k-h


Re: [PATCH v2 2/2] tools/virtio: fix build caused by virtio_ring changes

2023-04-10 Thread Greg Kroah-Hartman
On Mon, Apr 10, 2023 at 08:00:33AM -0400, Michael S. Tsirkin wrote:
> On Mon, Apr 10, 2023 at 08:28:45PM +0900, Shunsuke Mie wrote:
> > Fix the build dependency for virtio_test. The virtio_ring that is used from
> > the test requires container_of_const(). Change to use container_of.h kernel
> > header directly and adapt related codes.
> > 
> > Signed-off-by: Shunsuke Mie 
> 
> This is only for next right? That's where container_of_const
> things are I think ...

container_of_const() is in 6.2.


> 
> > ---
> >  tools/include/linux/types.h   |  1 -
> >  tools/virtio/linux/compiler.h |  2 ++
> >  tools/virtio/linux/kernel.h   |  5 +
> >  tools/virtio/linux/module.h   |  1 -
> >  tools/virtio/linux/uaccess.h  | 11 ++-
> >  5 files changed, 5 insertions(+), 15 deletions(-)
> > 
> > diff --git a/tools/include/linux/types.h b/tools/include/linux/types.h
> > index 051fdeaf2670..f1896b70a8e5 100644
> > --- a/tools/include/linux/types.h
> > +++ b/tools/include/linux/types.h
> > @@ -49,7 +49,6 @@ typedef __s8  s8;
> >  #endif
> >  
> >  #define __force
> > -#define __user

Why is this needed?

> >  #define __must_check
> >  #define __cold
> >  
> > diff --git a/tools/virtio/linux/compiler.h b/tools/virtio/linux/compiler.h
> > index 2c51bccb97bb..1f3a15b954b9 100644
> > --- a/tools/virtio/linux/compiler.h
> > +++ b/tools/virtio/linux/compiler.h
> > @@ -2,6 +2,8 @@
> >  #ifndef LINUX_COMPILER_H
> >  #define LINUX_COMPILER_H
> >  
> > +#include "../../../include/linux/compiler_types.h"

While I understand your need to not want to duplicate code, what in the
world is this doing?  Why not use the in-kernel compiler.h instead?  Why
are you copying loads of .h files into tools/virtio/?  What is this for
and why not just use the real files so you don't have to even attempt to
try to keep things in sync (hint, they will always be out of sync.)

> > +
> >  #define WRITE_ONCE(var, val) \
> > (*((volatile typeof(val) *)(&(var))) = (val))
> >  
> > diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h
> > index 8b877167933d..6702008f7f5c 100644
> > --- a/tools/virtio/linux/kernel.h
> > +++ b/tools/virtio/linux/kernel.h
> > @@ -10,6 +10,7 @@
> >  #include 
> >  
> >  #include 
> > +#include "../../../include/linux/container_of.h"

Either do this for all .h files, or not, don't pick and choose.



> >  #include 
> >  #include 
> >  #include 
> > @@ -107,10 +108,6 @@ static inline void free_page(unsigned long addr)
> > free((void *)addr);
> >  }
> >  
> > -#define container_of(ptr, type, member) ({ \
> > -   const typeof( ((type *)0)->member ) *__mptr = (ptr);\
> > -   (type *)( (char *)__mptr - offsetof(type,member) );})
> > -
> >  # ifndef likely
> >  #  define likely(x)(__builtin_expect(!!(x), 1))
> >  # endif
> > diff --git a/tools/virtio/linux/module.h b/tools/virtio/linux/module.h
> > index 9dfa96fea2b2..5cf39167d47a 100644
> > --- a/tools/virtio/linux/module.h
> > +++ b/tools/virtio/linux/module.h
> > @@ -4,4 +4,3 @@
> >  #define MODULE_LICENSE(__MODULE_LICENSE_value) \
> > static __attribute__((unused)) const char *__MODULE_LICENSE_name = \
> > __MODULE_LICENSE_value
> > -

This change has nothing to do with what you said was happening in this
patch :(

Please be more careful.

> > diff --git a/tools/virtio/linux/uaccess.h b/tools/virtio/linux/uaccess.h
> > index 991dfb263998..cde2c344b260 100644
> > --- a/tools/virtio/linux/uaccess.h
> > +++ b/tools/virtio/linux/uaccess.h
> > @@ -6,15 +6,10 @@
> >  
> >  extern void *__user_addr_min, *__user_addr_max;
> >  
> > -static inline void __chk_user_ptr(const volatile void *p, size_t size)
> > -{
> > -   assert(p >= __user_addr_min && p + size <= __user_addr_max);
> > -}
> > -

What does this function have to do with container_of()?


> >  #define put_user(x, ptr)   \
> >  ({ \
> > typeof(ptr) __pu_ptr = (ptr);   \
> > -   __chk_user_ptr(__pu_ptr, sizeof(*__pu_ptr));\
> > +   __chk_user_ptr(__pu_ptr);   \

Why are you trying to duplicate in-kernel .h files?

This all doesn't look ok, sorry.

greg k-h
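
For readers unfamiliar with the helper at the center of this thread, here is a
rough userspace sketch of what a const-preserving container_of does. The demo_
macros and structs are hypothetical and written for illustration; they are not
the in-kernel definitions, and they assume a compiler with C11 _Generic and the
GNU __typeof__ extension (gcc or clang).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace equivalents; the demo_ names are not kernel API. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Keeps the result const when the member pointer is const. */
#define demo_container_of_const(ptr, type, member)			\
	_Generic((ptr),							\
		const __typeof__(*(ptr)) *:				\
			((const type *)demo_container_of(ptr, type, member)), \
		default:						\
			((type *)demo_container_of(ptr, type, member)))

struct demo_node {
	struct demo_node *next;
};

struct demo_item {
	int id;
	struct demo_node link;
};

/* Read-only lookup: constness of the node carries over to the item. */
static int demo_item_id(const struct demo_node *node)
{
	const struct demo_item *item;

	assert(node);
	item = demo_container_of_const(node, struct demo_item, link);
	return item->id;
}
```

The point of the const variant is that a caller holding a const member pointer
cannot silently obtain a writable pointer to the enclosing structure, which the
plain macro allows because its cast discards the qualifier.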


[PATCH 31/36] vhost-vdpa: vhost_vdpa_alloc_domain() should be using a const struct bus_type *

2023-03-13 Thread Greg Kroah-Hartman
The function, vhost_vdpa_alloc_domain(), has a pointer to a struct
bus_type, but it should be constant as the function it passes it to
expects it to be const, and the vhost code does not modify it in any
way.

Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: k...@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: net...@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman 
---
Note, this is a patch that is a preparatory cleanup as part of a larger
series of patches that is working on resolving some old driver core
design mistakes.  It will build and apply cleanly on top of 6.3-rc2 on
its own, but I'd prefer if I could take it through my driver-core tree
so that the driver core changes can be taken through there for 6.4-rc1.

 drivers/vhost/vdpa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index dc12dbd5b43b..08c7cb3399fc 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -1140,7 +1140,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
struct device *dma_dev = vdpa_get_dma_dev(vdpa);
-   struct bus_type *bus;
+   const struct bus_type *bus;
int ret;
 
/* Device want to do DMA by itself */
-- 
2.39.2



Re: [PATCH v1 2/6] virtio console: Harden port adding

2023-01-27 Thread Greg Kroah-Hartman
On Fri, Jan 27, 2023 at 04:17:46PM +0200, Alexander Shishkin wrote:
> Greg Kroah-Hartman  writes:
> 
> > On Fri, Jan 27, 2023 at 02:47:55PM +0200, Alexander Shishkin wrote:
> >> "Michael S. Tsirkin"  writes:
> >> 
> >> > On Fri, Jan 27, 2023 at 01:55:43PM +0200, Alexander Shishkin wrote:
> >> >> We can have shared pages between the host and guest without bounce
> >> >> buffers in between, so they can be both looking directly at the same
> >> >> page.
> >> >> 
> >> >> Regards,
> >> >
> >> > How does this configuration work? What else is in this page?
> >> 
> >> So, for example in TDX, you have certain pages as "shared", as in
> >> between guest and hypervisor. You can have virtio ring(s) in such
> >> pages. It's likely that there'd be a swiotlb buffer there instead, but
> >> sharing pages between host virtio and guest virtio drivers is possible.
> >
> > If it is shared, then what does this mean?  Do we then need to copy
> > everything out of that buffer first before doing anything with it
> > because the data could change later on?  Or do we not trust anything in
> > it at all and we throw it away?  Or something else (trust for a short
> > while and then we don't?)
> 
> The first one, we need a consistent view of the metadata (the ckpt in
> this case), so we take a snapshot of it. Then, we validate it (because
> we don't trust it) to be correct. If it is not, we discard it, otherwise
> we act on it. Since this is a ring, we just move on to the next record
> if there is one.

So you do an additional extra copy of everything, making the bounce
buffer useless?  :)

> Meanwhile, in the shared page, it can change from correct to incorrect,
> but it won't affect us because we have this consistent view at the
> moment the snapshot was taken.

Wonderful, copy everything out then, the whole page, don't do it
piecemeal field by field.  And then justify it to everyone whose
throughput you just tanked...

good luck!

greg k-h


Re: [PATCH v1 2/6] virtio console: Harden port adding

2023-01-27 Thread Greg Kroah-Hartman
On Fri, Jan 27, 2023 at 02:47:55PM +0200, Alexander Shishkin wrote:
> "Michael S. Tsirkin"  writes:
> 
> > On Fri, Jan 27, 2023 at 01:55:43PM +0200, Alexander Shishkin wrote:
> >> "Michael S. Tsirkin"  writes:
> >> 
> >> > On Thu, Jan 19, 2023 at 10:13:18PM +0200, Alexander Shishkin wrote:
> >> >> When handling control messages, instead of peeking at the device memory
> >> >> to obtain bits of the control structure,
> >> >
> >> > Except the message makes it seem that we are getting data from
> >> > device memory, when we do nothing of the kind.
> >> 
> >> We can be, see below.
> >> 
> >> >> take a snapshot of it once and
> >> >> use it instead, to prevent it from changing under us. This avoids races
> >> >> between port id validation and control event decoding, which can lead
> >> >> to, for example, a NULL dereference in port removal of a nonexistent
> >> >> port.
> >> >> 
> >> >> The control structure is small enough (8 bytes) that it can be cached
> >> >> directly on the stack.
> >> >
> >> > I still have no real idea why we want a copy here.
> >> > If device can poke anywhere at memory then it can crash kernel anyway.
> >> > If there's a bounce buffer or an iommu or some other protection
> >> > in place, then this memory can no longer change by the time
> >> > we look at it.
> >> 
> >> We can have shared pages between the host and guest without bounce
> >> buffers in between, so they can be both looking directly at the same
> >> page.
> >> 
> >> Regards,
> >
> > How does this configuration work? What else is in this page?
> 
> So, for example in TDX, you have certain pages as "shared", as in
> between guest and hypervisor. You can have virtio ring(s) in such
> pages. It's likely that there'd be a swiotlb buffer there instead, but
> sharing pages between host virtio and guest virtio drivers is possible.

If it is shared, then what does this mean?  Do we then need to copy
everything out of that buffer first before doing anything with it
because the data could change later on?  Or do we not trust anything in
it at all and we throw it away?  Or something else (trust for a short
while and then we don't?)

Please be specific as to what you want to see happen here, and why.

thanks,

greg k-h


Re: [PATCH v1 2/6] virtio console: Harden port adding

2023-01-19 Thread Greg Kroah-Hartman
On Thu, Jan 19, 2023 at 10:13:18PM +0200, Alexander Shishkin wrote:
> Greg Kroah-Hartman  writes:
> 
> > Then you need to copy it out once, and then only deal with the local
> > copy.  Otherwise you have an incomplete snapshot.
> 
> Ok, would you be partial to something like this:
> 
> From 1bc9bb84004154376c2a0cf643d53257da6d1cd7 Mon Sep 17 00:00:00 2001
> From: Alexander Shishkin 
> Date: Thu, 19 Jan 2023 21:59:02 +0200
> Subject: [PATCH] virtio console: Keep a local copy of the control structure
> 
> When handling control messages, instead of peeking at the device memory
> to obtain bits of the control structure, take a snapshot of it once and
> use it instead, to prevent it from changing under us. This avoids races
> between port id validation and control event decoding, which can lead
> to, for example, a NULL dereference in port removal of a nonexistent
> port.
> 
> The control structure is small enough (8 bytes) that it can be cached
> directly on the stack.
> 
> Signed-off-by: Alexander Shishkin 
> Cc: Greg Kroah-Hartman 
> Cc: Arnd Bergmann 
> Cc: Amit Shah 
> ---
>  drivers/char/virtio_console.c | 29 +++--
>  1 file changed, 15 insertions(+), 14 deletions(-)

Yes, this looks much better, thanks!

Reviewed-by: Greg Kroah-Hartman 


Re: [PATCH v1 1/6] virtio console: Harden multiport against invalid host input

2023-01-19 Thread Greg Kroah-Hartman
On Thu, Jan 19, 2023 at 08:52:02PM +0200, Alexander Shishkin wrote:
> Greg Kroah-Hartman  writes:
> 
> > On Thu, Jan 19, 2023 at 03:57:16PM +0200, Alexander Shishkin wrote:
> >> From: Andi Kleen 
> >> 
> >> --- a/drivers/char/virtio_console.c
> >> +++ b/drivers/char/virtio_console.c
> >> @@ -1843,6 +1843,9 @@ static int init_vqs(struct ports_device *portdev)
> >>int err;
> >>  
> >>nr_ports = portdev->max_nr_ports;
> >> +  if (use_multiport(portdev) && nr_ports < 1)
> >> +  return -EINVAL;
> >> +
> >>nr_queues = use_multiport(portdev) ? (nr_ports + 1) * 2 : 2;
> >>  
> >>vqs = kmalloc_array(nr_queues, sizeof(struct virtqueue *), GFP_KERNEL);
> >> -- 
> >> 2.39.0
> >> 
> >
> > Why did I only get a small subset of these patches?
> 
> I did what get_maintainer told me. Would you like to be CC'd on the
> whole thing?

If you only cc: me on a portion of the series, I guess you only want me
to apply a portion of it?  If so, why is it a longer series?

> > And why is the whole thread not on lore.kernel.org?
> 
> That is a mystery, some wires got crossed between my smtp and vger. I
> bounced the series to lkml just now and at least some of it seems to
> have landed on lore.
> 
> > And the term "hardening" is marketing fluff.   Just say, "properly parse
> > input" or something like that, as what you are doing is fixing
> > assumptions about the data here, not causing anything to be more (or
> > less) secure.
> >
> > But, this still feels wrong.  Why is this happening here, in init_vqs()
> > and not in the calling function that already did a bunch of validation
> > of the ports and the like?  Are those checks not enough?  if not, fix it
> > there, don't spread it out all over the place...
> 
> Good point! And there happens to already be 28962ec595d70 that takes
> care of exactly this case. I totally missed it.

So this series is not needed?  Or just this one?

greg k-h
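
The arithmetic behind the check quoted above can be written out as a small
standalone function. This is a sketch only: demo_nr_queues() is a hypothetical
helper, and it assumes (from the nr_queues expression in the patch) one in/out
queue pair per port plus one control pair when multiport is in use.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: number of virtqueues for a console device.
 * Returns the queue count, or -EINVAL when multiport is set with no ports.
 */
static int demo_nr_queues(int multiport, unsigned int nr_ports)
{
	/* Caller contract: the flag is a plain boolean. */
	assert(multiport == 0 || multiport == 1);

	if (!multiport)
		return 2;	/* a single in/out pair */
	if (nr_ports < 1)
		return -EINVAL;	/* multiport with zero ports is rejected */
	return (int)((nr_ports + 1) * 2);
}
```

Without the nr_ports check, the multiport case with zero ports yields a count
of 2, which leaves no room for the control pair that the multiport setup code
expects. Doing the validation in one place, before any sizing, is the approach
the review asks for.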


Re: [PATCH v1 2/6] virtio console: Harden port adding

2023-01-19 Thread Greg Kroah-Hartman
On Thu, Jan 19, 2023 at 07:48:35PM +0200, Alexander Shishkin wrote:
> Greg Kroah-Hartman  writes:
> 
> > On Thu, Jan 19, 2023 at 03:57:17PM +0200, Alexander Shishkin wrote:
> >> From: Andi Kleen 
> >> 
> >> The ADD_PORT operation reads and sanity checks the port id multiple
> >> times from the untrusted host. This is not safe because a malicious
> >> host could change it between reads.
> >> 
> >> Read the port id only once and cache it for subsequent uses.
> >> 
> >> Signed-off-by: Andi Kleen 
> >> Signed-off-by: Alexander Shishkin 
> >> Cc: Amit Shah 
> >> Cc: Arnd Bergmann 
> >> Cc: Greg Kroah-Hartman 
> >> ---
> >>  drivers/char/virtio_console.c | 10 ++
> >>  1 file changed, 6 insertions(+), 4 deletions(-)
> >> 
> >> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> >> index f4fd5fe7cd3a..6599c2956ba4 100644
> >> --- a/drivers/char/virtio_console.c
> >> +++ b/drivers/char/virtio_console.c
> >> @@ -1563,10 +1563,13 @@ static void handle_control_message(struct virtio_device *vdev,
> >>struct port *port;
> >>size_t name_size;
> >>int err;
> >> +  unsigned id;
> >>  
> >>cpkt = (struct virtio_console_control *)(buf->buf + buf->offset);
> >>  
> >> -  port = find_port_by_id(portdev, virtio32_to_cpu(vdev, cpkt->id));
> >> +  /* Make sure the host cannot change id under us */
> >> +  id = virtio32_to_cpu(vdev, READ_ONCE(cpkt->id));
> >
> > Why READ_ONCE()?
> >
> > And how can it change under us?  Is the message still under control of
> > the "host"?  If so, that feels wrong as this is all in kernel memory,
> > not userspace memory right?
> >
> > If you are dealing with memory from a different process that you do not
> > trust, then you need to copy EVERYTHING at once.  Don't piece-meal copy
> > bits and bobs in all different places please.  Do it once and then parse
> > the local structure properly.
> 
> This is the device memory or the VM host memory, not userspace or
> another process. And it can change under us willy-nilly.

Then you need to copy it out once, and then only deal with the local
copy.  Otherwise you have an incomplete snapshot.

> The thing is, we only need to cache two things to correctly process the
> request. Copying everything, on the other hand, would involve the entire
> buffer, not just the *cpkt, but also stuff that follows, which also
> differs between different event types. And we also don't care if the
> rest of it changes under us.

That feels broken if you do not "trust" that other side.  And what
prevents the buffer from changing after you validated the other part?

For virtio, I thought you always implied that you did trust the other
side, when has that changed?  Where was that new security model for the
kernel discussed?

Are you sure this is even viable?  What is the threat model you are
attempting to add to the driver here?

> > Otherwise this is going to be impossible to actually maintain over
> > time...
> 
> An 'id' can't possibly be worse to maintain than multiple instances of
> 'virtio32_to_cpu(vdev, cpkt->id)' sprinkled around the code.

Again, copy what you want out and then act on that.  If it can change
under you, and you do not trust it, then you have to work only on a
snapshot that you have verified.

thanks,

greg k-h
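
The "copy it out once, then only deal with the local copy" advice can be
sketched in userspace C as follows. Everything here is an assumption made for
illustration: struct demo_ctrl is a hypothetical 8-byte record loosely modeled
on the control structure discussed in this thread, the event values are made
up, and endianness conversion is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte control record; not the real virtio layout. */
struct demo_ctrl {
	uint32_t id;
	uint16_t event;
	uint16_t value;
};

enum { DEMO_PORT_ADD = 1, DEMO_PORT_REMOVE = 2 };

/* Copy the whole record out of shared memory exactly once. */
static struct demo_ctrl demo_snapshot(const volatile void *shared)
{
	const volatile unsigned char *src = shared;
	struct demo_ctrl local;
	unsigned char *dst = (unsigned char *)&local;
	size_t i;

	for (i = 0; i < sizeof(local); i++)
		dst[i] = src[i];
	return local;
}

/* Validate the local copy; return the event to act on, or -EINVAL. */
static int demo_handle_ctrl(const volatile void *shared, uint32_t nr_ports)
{
	struct demo_ctrl c;

	assert(shared);
	c = demo_snapshot(shared);

	if (c.id >= nr_ports)
		return -EINVAL;
	if (c.event != DEMO_PORT_ADD && c.event != DEMO_PORT_REMOVE)
		return -EINVAL;
	/* Only 'c' is used from here on; later writes to 'shared' cannot matter. */
	return c.event;
}
```

Validation and dispatch both read the same local copy, so there is no window
in which the other side can swap the id or the event between the check and its
use. Reading each field separately, even with READ_ONCE(), would still leave
the fields mutually inconsistent.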


Re: [PATCH v1 4/6] virtio console: Harden control message handling

2023-01-19 Thread Greg Kroah-Hartman
On Thu, Jan 19, 2023 at 03:57:19PM +0200, Alexander Shishkin wrote:
> In handle_control_message(), we look at the ->event field twice, which
> gives a malicious VMM a window in which to switch it from PORT_ADD to
> PORT_REMOVE, triggering a null dereference further down the line:

How does the other VMM have full control over the full message here?
Shouldn't this all have been copied into our local memory if we are
going to be poking around in it?  Like I mentioned in my other review,
copy it all once and then parse it.  Don't try to mess with individual
fields one at a time otherwise that way lies madness...

thanks,

greg k-h
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v1 2/6] virtio console: Harden port adding

2023-01-19 Thread Greg Kroah-Hartman
On Thu, Jan 19, 2023 at 03:57:17PM +0200, Alexander Shishkin wrote:
> From: Andi Kleen 
> 
> The ADD_PORT operation reads and sanity checks the port id multiple
> times from the untrusted host. This is not safe because a malicious
> host could change it between reads.
> 
> Read the port id only once and cache it for subsequent uses.
> 
> Signed-off-by: Andi Kleen 
> Signed-off-by: Alexander Shishkin 
> Cc: Amit Shah 
> Cc: Arnd Bergmann 
> Cc: Greg Kroah-Hartman 
> ---
>  drivers/char/virtio_console.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> index f4fd5fe7cd3a..6599c2956ba4 100644
> --- a/drivers/char/virtio_console.c
> +++ b/drivers/char/virtio_console.c
> @@ -1563,10 +1563,13 @@ static void handle_control_message(struct virtio_device *vdev,
>   struct port *port;
>   size_t name_size;
>   int err;
> + unsigned id;
>  
>   cpkt = (struct virtio_console_control *)(buf->buf + buf->offset);
>  
> - port = find_port_by_id(portdev, virtio32_to_cpu(vdev, cpkt->id));
> + /* Make sure the host cannot change id under us */
> + id = virtio32_to_cpu(vdev, READ_ONCE(cpkt->id));

Why READ_ONCE()?

And how can it change under us?  Is the message still under control of
the "host"?  If so, that feels wrong as this is all in kernel memory,
not userspace memory right?

If you are dealing with memory from a different process that you do not
trust, then you need to copy EVERYTHING at once.  Don't piece-meal copy
bits and bobs in all different places please.  Do it once and then parse
the local structure properly.

Otherwise this is going to be impossible to actually maintain over
time...

thanks,

greg k-h


Re: [PATCH v1 1/6] virtio console: Harden multiport against invalid host input

2023-01-19 Thread Greg Kroah-Hartman
On Thu, Jan 19, 2023 at 03:57:16PM +0200, Alexander Shishkin wrote:
> From: Andi Kleen 
> 
> It's possible for the host to set the multiport flag, but pass in
> 0 multiports, which results in:
> 
> BUG: KASAN: slab-out-of-bounds in init_vqs+0x244/0x6c0 drivers/char/virtio_console.c:1878
> Write of size 8 at addr 888001cc24a0 by task swapper/1
> 
> CPU: 0 PID: 1 Comm: swapper Not tainted 5.15.0-rc1-140273-gaab0bb9fbaa1-dirty #588
> Call Trace:
>  init_vqs+0x244/0x6c0 drivers/char/virtio_console.c:1878
>  virtcons_probe+0x1a3/0x5b0 drivers/char/virtio_console.c:2042
>  virtio_dev_probe+0x2b9/0x500 drivers/virtio/virtio.c:263
>  call_driver_probe drivers/base/dd.c:515
>  really_probe+0x1c9/0x5b0 drivers/base/dd.c:601
>  really_probe_debug drivers/base/dd.c:694
>  __driver_probe_device+0x10d/0x1f0 drivers/base/dd.c:754
>  driver_probe_device+0x68/0x150 drivers/base/dd.c:786
>  __driver_attach+0xca/0x200 drivers/base/dd.c:1145
>  bus_for_each_dev+0x108/0x190 drivers/base/bus.c:301
>  driver_attach+0x30/0x40 drivers/base/dd.c:1162
>  bus_add_driver+0x325/0x3c0 drivers/base/bus.c:618
>  driver_register+0xf3/0x1d0 drivers/base/driver.c:171
> ...
> 
> Add a suitable sanity check.
> 
> Signed-off-by: Andi Kleen 
> Signed-off-by: Alexander Shishkin 
> Cc: Amit Shah 
> Cc: Arnd Bergmann 
> Cc: Greg Kroah-Hartman 
> ---
>  drivers/char/virtio_console.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> index 6a821118d553..f4fd5fe7cd3a 100644
> --- a/drivers/char/virtio_console.c
> +++ b/drivers/char/virtio_console.c
> @@ -1843,6 +1843,9 @@ static int init_vqs(struct ports_device *portdev)
>   int err;
>  
>   nr_ports = portdev->max_nr_ports;
> + if (use_multiport(portdev) && nr_ports < 1)
> + return -EINVAL;
> +
>   nr_queues = use_multiport(portdev) ? (nr_ports + 1) * 2 : 2;
>  
>   vqs = kmalloc_array(nr_queues, sizeof(struct virtqueue *), GFP_KERNEL);
> -- 
> 2.39.0
> 

Why did I only get a small subset of these patches?

And why is the whole thread not on lore.kernel.org?

And the term "hardening" is marketing fluff.   Just say, "properly parse
input" or something like that, as what you are doing is fixing
assumptions about the data here, not causing anything to be more (or
less) secure.

But, this still feels wrong.  Why is this happening here, in init_vqs()
and not in the calling function that already did a bunch of validation
of the ports and the like?  Are those checks not enough?  If not, fix it
there, don't spread it out all over the place...

thanks,

greg k-h
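
One way to read the "fix it there" suggestion above is to validate the host-supplied port count in a single helper before any queue arithmetic happens. The sketch below is a user-space illustration under that assumption; compute_nr_queues() is a hypothetical name, and the wraparound guard is an addition that does not appear in the quoted patch. The arithmetic mirrors the quoted hunk: (nr_ports + 1) * 2 with multiport, otherwise 2.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: check the port count once and derive the queue
 * count from it.  Returns 0 on success or -EINVAL on invalid input.
 */
static int compute_nr_queues(bool multiport, uint32_t max_nr_ports,
			     uint32_t *nr_queues)
{
	if (!multiport) {
		*nr_queues = 2;
		return 0;
	}

	/* Multiport with zero ports is the case from the KASAN report. */
	if (max_nr_ports < 1)
		return -EINVAL;

	/* Assumed extra guard: keep (nr_ports + 1) * 2 from wrapping. */
	if (max_nr_ports > (UINT32_MAX / 2) - 1)
		return -EINVAL;

	*nr_queues = (max_nr_ports + 1) * 2;
	return 0;
}
```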


[PATCH v2 11/16] virtio: move dev_to_virtio() to use container_of_const()

2023-01-11 Thread Greg Kroah-Hartman
The driver core is changing to pass some pointers as const, so move
dev_to_virtio() to use container_of_const() to handle this change.

dev_to_virtio() now properly keeps the const-ness of the pointer passed
into it, whereas before it could be lost.

Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/virtio.h | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index dcab9c7e8784..2b472514c49b 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -127,10 +127,7 @@ struct virtio_device {
void *priv;
 };
 
-static inline struct virtio_device *dev_to_virtio(struct device *_dev)
-{
-   return container_of(_dev, struct virtio_device, dev);
-}
+#define dev_to_virtio(_dev)	container_of_const(_dev, struct virtio_device, dev)
 
 void virtio_add_status(struct virtio_device *dev, unsigned int status);
 int register_virtio_device(struct virtio_device *dev);
-- 
2.39.0
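
As an illustration of what "keeps the const-ness" means, here is a small user-space sketch of a const-preserving container_of built on C11 _Generic. It is a simplified stand-in, not the kernel's definition: the struct layouts and the virtio_index() helper are invented for the example, and the kernel macro's type checking is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures (assumed layout). */
struct device { int id; };
struct virtio_device { int index; struct device dev; };

/* Plain container_of: always yields a non-const pointer. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Const-preserving variant: a pointer-to-const input selects the branch
 * that returns a pointer-to-const; anything else takes the default.
 */
#define container_of_const(ptr, type, member)				\
	_Generic(ptr,							\
		const __typeof__(*(ptr)) *:				\
			((const type *)container_of(ptr, type, member)),\
		default: ((type *)container_of(ptr, type, member)))

#define dev_to_virtio(_dev) \
	container_of_const(_dev, struct virtio_device, dev)

/* A reader that only holds a const device can still reach the parent. */
static int virtio_index(const struct device *dev)
{
	return dev_to_virtio(dev)->index;
}
```

With this shape, writing through the result of dev_to_virtio() on a const device is rejected by the compiler, which the old inline function could not express.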



[PATCH 6.1 0981/1146] blk-mq: avoid double ->queue_rq() because of early timeout

2022-12-28 Thread Greg Kroah-Hartman
From: David Jeffery 

[ Upstream commit 82c229476b8f6afd7e09bc4dc77d89dc19ff7688 ]

David Jeffery found a double ->queue_rq() issue. So far it can be
triggered in the VM use case: because of long vmexit latency, preempt
latency of the vCPU pthread, or a long page fault in the vCPU pthread, a
block IO request can time out before it is queued to hardware but after
blk_mq_start_request() has been called during ->queue_rq(). The timeout
handler may then handle it by requeueing, which causes a double
->queue_rq() and a kernel panic.

So far, it has been the driver's responsibility to cover the race between
timeout and completion, so in theory this should be solved in the driver,
given that the driver has enough knowledge.

But this is really a common problem: many drivers could have a similar
issue, it could be hard to fix all affected drivers, and it isn't even
easy for a driver to handle the race. So David suggests this patch, which
drains in-progress ->queue_rq() calls to solve the issue.
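
The expiry check in the diff below compares each deadline against a single time snapshot taken at the start of the scan, rather than re-reading the clock per request. A minimal user-space sketch of that comparison follows; the macros are simplified forms of the kernel's wraparound-safe helpers (without type checking), and scan() is a hypothetical driver loop that, unlike the real first pass, does not stop at the first expired request.

```c
#include <assert.h>
#include <stdbool.h>

/* Wraparound-safe comparisons for a free-running tick counter. */
#define time_after(a, b)	((long)((b) - (a)) < 0)
#define time_after_eq(a, b)	((long)((a) - (b)) >= 0)

struct expired_data {
	bool has_timedout_rq;
	unsigned long next;		/* earliest pending deadline, 0 = none */
	unsigned long timeout_start;	/* snapshot taken once per scan */
};

/* True if 'deadline' had passed at snapshot time; else track the minimum. */
static bool req_expired(unsigned long deadline, struct expired_data *expired)
{
	if (time_after_eq(expired->timeout_start, deadline))
		return true;

	if (expired->next == 0 || time_after(expired->next, deadline))
		expired->next = deadline;
	return false;
}

/* Hypothetical detection pass over a plain array of deadlines. */
static int scan(const unsigned long *deadlines, int n,
		struct expired_data *expired)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		if (req_expired(deadlines[i], expired))
			count++;
	expired->has_timedout_rq = count > 0;
	return count;
}
```

In the patch, the requests found this way are only handled in a second pass, after waiting for in-flight submissions to finish.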

Cc: Stefan Hajnoczi 
Cc: Keith Busch 
Cc: virtualization@lists.linux-foundation.org
Cc: Bart Van Assche 
Signed-off-by: David Jeffery 
Signed-off-by: Ming Lei 
Reviewed-by: Bart Van Assche 
Link: https://lore.kernel.org/r/20221026051957.358818-1-ming@redhat.com
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 block/blk-mq.c | 56 +++---
 1 file changed, 44 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4f1c259138e8..a23026099284 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1529,7 +1529,13 @@ static void blk_mq_rq_timed_out(struct request *req)
blk_add_timer(req);
 }
 
-static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
+struct blk_expired_data {
+   bool has_timedout_rq;
+   unsigned long next;
+   unsigned long timeout_start;
+};
+
+static bool blk_mq_req_expired(struct request *rq, struct blk_expired_data *expired)
 {
unsigned long deadline;
 
@@ -1539,13 +1545,13 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
return false;
 
deadline = READ_ONCE(rq->deadline);
-   if (time_after_eq(jiffies, deadline))
+   if (time_after_eq(expired->timeout_start, deadline))
return true;
 
-   if (*next == 0)
-   *next = deadline;
-   else if (time_after(*next, deadline))
-   *next = deadline;
+   if (expired->next == 0)
+   expired->next = deadline;
+   else if (time_after(expired->next, deadline))
+   expired->next = deadline;
return false;
 }
 
@@ -1561,7 +1567,7 @@ void blk_mq_put_rq_ref(struct request *rq)
 
 static bool blk_mq_check_expired(struct request *rq, void *priv)
 {
-   unsigned long *next = priv;
+   struct blk_expired_data *expired = priv;
 
/*
 * blk_mq_queue_tag_busy_iter() has locked the request, so it cannot
@@ -1570,7 +1576,18 @@ static bool blk_mq_check_expired(struct request *rq, void *priv)
 * it was completed and reallocated as a new request after returning
 * from blk_mq_check_expired().
 */
-   if (blk_mq_req_expired(rq, next))
+   if (blk_mq_req_expired(rq, expired)) {
+   expired->has_timedout_rq = true;
+   return false;
+   }
+   return true;
+}
+
+static bool blk_mq_handle_expired(struct request *rq, void *priv)
+{
+   struct blk_expired_data *expired = priv;
+
+   if (blk_mq_req_expired(rq, expired))
blk_mq_rq_timed_out(rq);
return true;
 }
@@ -1579,7 +1596,9 @@ static void blk_mq_timeout_work(struct work_struct *work)
 {
struct request_queue *q =
container_of(work, struct request_queue, timeout_work);
-   unsigned long next = 0;
+   struct blk_expired_data expired = {
+   .timeout_start = jiffies,
+   };
struct blk_mq_hw_ctx *hctx;
unsigned long i;
 
@@ -1599,10 +1618,23 @@ static void blk_mq_timeout_work(struct work_struct *work)
if (!percpu_ref_tryget(&q->q_usage_counter))
return;
 
-   blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &next);
+   /* check if there is any timed-out request */
+   blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &expired);
+   if (expired.has_timedout_rq) {
+   /*
+* Before walking tags, we must ensure any submit started
+* before the current time has finished. Since the submit
+* uses srcu or rcu, wait for a synchronization point to
+* ensure all running submits have finished
+*/
+   blk_mq_wait_quiesce_done(q);
+
+   expired.next = 0;
+   blk_mq_queue_tag_busy_iter(q, blk_mq_handle_expired, &expired);
+   }
 
-   if (next != 0) {
-   mod_timer(&q->timeout, next);
+   if (expired.next != 0) {
+   mod_timer(&q->timeout, expi

[PATCH 6.0 0919/1073] blk-mq: avoid double ->queue_rq() because of early timeout

2022-12-28 Thread Greg Kroah-Hartman
From: David Jeffery 

[ Upstream commit 82c229476b8f6afd7e09bc4dc77d89dc19ff7688 ]

David Jeffery found a double ->queue_rq() issue. So far it can be
triggered in the VM use case: because of long vmexit latency, preempt
latency of the vCPU pthread, or a long page fault in the vCPU pthread, a
block IO request can time out before it is queued to hardware but after
blk_mq_start_request() has been called during ->queue_rq(). The timeout
handler may then handle it by requeueing, which causes a double
->queue_rq() and a kernel panic.

So far, it has been the driver's responsibility to cover the race between
timeout and completion, so in theory this should be solved in the driver,
given that the driver has enough knowledge.

But this is really a common problem: many drivers could have a similar
issue, it could be hard to fix all affected drivers, and it isn't even
easy for a driver to handle the race. So David suggests this patch, which
drains in-progress ->queue_rq() calls to solve the issue.

Cc: Stefan Hajnoczi 
Cc: Keith Busch 
Cc: virtualization@lists.linux-foundation.org
Cc: Bart Van Assche 
Signed-off-by: David Jeffery 
Signed-off-by: Ming Lei 
Reviewed-by: Bart Van Assche 
Link: https://lore.kernel.org/r/20221026051957.358818-1-ming@redhat.com
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 block/blk-mq.c | 56 +++---
 1 file changed, 44 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 88975170cc32..05e33c51702d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1442,7 +1442,13 @@ static void blk_mq_rq_timed_out(struct request *req)
blk_add_timer(req);
 }
 
-static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
+struct blk_expired_data {
+   bool has_timedout_rq;
+   unsigned long next;
+   unsigned long timeout_start;
+};
+
+static bool blk_mq_req_expired(struct request *rq, struct blk_expired_data *expired)
 {
unsigned long deadline;
 
@@ -1452,13 +1458,13 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
return false;
 
deadline = READ_ONCE(rq->deadline);
-   if (time_after_eq(jiffies, deadline))
+   if (time_after_eq(expired->timeout_start, deadline))
return true;
 
-   if (*next == 0)
-   *next = deadline;
-   else if (time_after(*next, deadline))
-   *next = deadline;
+   if (expired->next == 0)
+   expired->next = deadline;
+   else if (time_after(expired->next, deadline))
+   expired->next = deadline;
return false;
 }
 
@@ -1472,7 +1478,7 @@ void blk_mq_put_rq_ref(struct request *rq)
 
 static bool blk_mq_check_expired(struct request *rq, void *priv)
 {
-   unsigned long *next = priv;
+   struct blk_expired_data *expired = priv;
 
/*
 * blk_mq_queue_tag_busy_iter() has locked the request, so it cannot
@@ -1481,7 +1487,18 @@ static bool blk_mq_check_expired(struct request *rq, void *priv)
 * it was completed and reallocated as a new request after returning
 * from blk_mq_check_expired().
 */
-   if (blk_mq_req_expired(rq, next))
+   if (blk_mq_req_expired(rq, expired)) {
+   expired->has_timedout_rq = true;
+   return false;
+   }
+   return true;
+}
+
+static bool blk_mq_handle_expired(struct request *rq, void *priv)
+{
+   struct blk_expired_data *expired = priv;
+
+   if (blk_mq_req_expired(rq, expired))
blk_mq_rq_timed_out(rq);
return true;
 }
@@ -1490,7 +1507,9 @@ static void blk_mq_timeout_work(struct work_struct *work)
 {
struct request_queue *q =
container_of(work, struct request_queue, timeout_work);
-   unsigned long next = 0;
+   struct blk_expired_data expired = {
+   .timeout_start = jiffies,
+   };
struct blk_mq_hw_ctx *hctx;
unsigned long i;
 
@@ -1510,10 +1529,23 @@ static void blk_mq_timeout_work(struct work_struct *work)
if (!percpu_ref_tryget(&q->q_usage_counter))
return;
 
-   blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &next);
+   /* check if there is any timed-out request */
+   blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &expired);
+   if (expired.has_timedout_rq) {
+   /*
+* Before walking tags, we must ensure any submit started
+* before the current time has finished. Since the submit
+* uses srcu or rcu, wait for a synchronization point to
+* ensure all running submits have finished
+*/
+   blk_mq_wait_quiesce_done(q);
+
+   expired.next = 0;
+   blk_mq_queue_tag_busy_iter(q, blk_mq_handle_expired, &expired);
+   }
 
-   if (next != 0) {
-   mod_timer(&q->timeout, next);
+   if (expired.next != 0) {
+   mod_timer(&q->timeout, expi

[PATCH 2/5] driver core: make struct class.devnode() take a const *

2022-11-23 Thread Greg Kroah-Hartman
The devnode() in struct class should not be modifying the device that is
passed into it, so mark it as a const * and propagate the function
signature changes out into all relevant subsystems that use this
callback.
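
To make the effect of the signature change concrete, here is a small user-space sketch of a devnode-style callback that takes a const device pointer. The struct, the typedef name devnode_fn, and the use of malloc() plus snprintf() in place of kasprintf() are assumptions for illustration; only the "cpu/%u/cpuid" format string comes from the diff below.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy device: only the field the callback reads (assumed layout). */
struct device { unsigned int minor; };

/* Callback type: the device is read-only from the callback's point of view. */
typedef char *(*devnode_fn)(const struct device *dev, unsigned int *mode);

/* Build the node name without modifying the device; caller frees it. */
static char *cpuid_devnode(const struct device *dev, unsigned int *mode)
{
	int len;
	char *name;

	(void)mode;	/* unused in this sketch */

	/* An assignment such as dev->minor = 0 would not compile here. */
	len = snprintf(NULL, 0, "cpu/%u/cpuid", dev->minor);
	if (len < 0)
		return NULL;

	name = malloc((size_t)len + 1);
	if (name)
		snprintf(name, (size_t)len + 1, "cpu/%u/cpuid", dev->minor);
	return name;
}
```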

Cc: Fenghua Yu 
Cc: Reinette Chatre 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
Cc: FUJITA Tomonori 
Cc: Jens Axboe 
Cc: Justin Sanders 
Cc: Arnd Bergmann 
Cc: Sumit Semwal 
Cc: Benjamin Gaignard 
Cc: Liam Mark 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: John Stultz 
Cc: "Christian König" 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Leon Romanovsky 
Cc: Dennis Dalessandro 
Cc: Dmitry Torokhov 
Cc: Mauro Carvalho Chehab 
Cc: Sean Young 
Cc: Frank Haverkamp 
Cc: Jiri Slaby 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Alex Williamson 
Cc: Cornelia Huck 
Cc: Kees Cook 
Cc: Anton Vorontsov 
Cc: Colin Cross 
Cc: Tony Luck 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: Hans Verkuil 
Cc: Christophe JAILLET 
Cc: Xie Yongji 
Cc: Gautam Dawar 
Cc: Dan Carpenter 
Cc: Eli Cohen 
Cc: Parav Pandit 
Cc: Maxime Coquelin 
Cc: alsa-de...@alsa-project.org
Cc: dri-de...@lists.freedesktop.org
Cc: k...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: linux-bl...@vger.kernel.org
Cc: linux-in...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: linux-r...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Greg Kroah-Hartman 
---
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c  | 4 ++--
 arch/x86/kernel/cpuid.c| 2 +-
 arch/x86/kernel/msr.c  | 2 +-
 block/bsg.c| 2 +-
 drivers/block/aoe/aoechr.c | 2 +-
 drivers/char/mem.c | 2 +-
 drivers/char/misc.c| 4 ++--
 drivers/dma-buf/dma-heap.c | 2 +-
 drivers/gpu/drm/drm_sysfs.c| 2 +-
 drivers/infiniband/core/user_mad.c | 2 +-
 drivers/infiniband/core/uverbs_main.c  | 2 +-
 drivers/infiniband/hw/hfi1/device.c| 4 ++--
 drivers/input/input.c  | 2 +-
 drivers/media/dvb-core/dvbdev.c| 4 ++--
 drivers/media/pci/ddbridge/ddbridge-core.c | 4 ++--
 drivers/media/rc/rc-main.c | 2 +-
 drivers/misc/genwqe/card_base.c| 2 +-
 drivers/tty/tty_io.c   | 2 +-
 drivers/usb/core/file.c| 2 +-
 drivers/vdpa/vdpa_user/vduse_dev.c | 2 +-
 drivers/vfio/vfio_main.c   | 2 +-
 fs/pstore/pmsg.c   | 2 +-
 include/linux/device/class.h   | 2 +-
 sound/sound_core.c | 2 +-
 24 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index d961ae3ed96e..4e4231a58f38 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -1560,9 +1560,9 @@ static const struct file_operations pseudo_lock_dev_fops = {
.mmap = pseudo_lock_dev_mmap,
 };
 
-static char *pseudo_lock_devnode(struct device *dev, umode_t *mode)
+static char *pseudo_lock_devnode(const struct device *dev, umode_t *mode)
 {
-   struct rdtgroup *rdtgrp;
+   const struct rdtgroup *rdtgrp;
 
rdtgrp = dev_get_drvdata(dev);
if (mode)
diff --git a/arch/x86/kernel/cpuid.c b/arch/x86/kernel/cpuid.c
index 6f7b8cc1bc9f..621ba9c0f17a 100644
--- a/arch/x86/kernel/cpuid.c
+++ b/arch/x86/kernel/cpuid.c
@@ -139,7 +139,7 @@ static int cpuid_device_destroy(unsigned int cpu)
return 0;
 }
 
-static char *cpuid_devnode(struct device *dev, umode_t *mode)
+static char *cpuid_devnode(const struct device *dev, umode_t *mode)
 {
return kasprintf(GFP_KERNEL, "cpu/%u/cpuid", MINOR(dev->devt));
 }
diff --git a/arch/x86/kernel/msr.c b/arch/x86/kernel/msr.c
index ed8ac6bcbafb..708751311786 100644
--- a/arch/x86/kernel/msr.c
+++ b/arch/x86/kernel/msr.c
@@ -250,7 +250,7 @@ static int msr_device_destroy(unsigned int cpu)
return 0;
 }
 
-static char *msr_devnode(struct device *dev, umode_t *mode)
+static char *msr_devnode(const struct device *dev, umode_t *mode)
 {
return kasprintf(GFP_KERNEL, "cpu/%u/msr", MINOR(dev->devt));
 }
diff --git a/block/bsg.c b/block/bsg.c
index 2ab1351eb082..08046bd9207d 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -232,7 +232,7 @@ struct bsg_device *bsg_register_queue(struct request_queue *q,
 }
 EXPORT_SYMBOL_GPL(bsg_register_queue);
 
-static char *bsg_devnode(struct device *dev, umode_t *mode)
+static char *bsg_devnode(const struct device *dev, umode_t *mode)
 {
return kasprintf(GFP_KERNEL, "bsg/%s", dev_name(dev));
 }
diff --g

Re: [PATCH] virtio_console: Use an atomic to allocate virtual console numbers

2022-11-14 Thread Greg Kroah-Hartman
On Mon, Nov 14, 2022 at 05:03:40PM +0100, Cédric Le Goater wrote:
> On 11/14/22 09:57, Greg Kroah-Hartman wrote:
> > On Mon, Nov 14, 2022 at 09:07:52AM +0100, Cédric Le Goater wrote:
> > > When a virtio console port is initialized, it is registered as an hvc
> > > console using a virtual console number. If a KVM guest is started with
> > > multiple virtio console devices, the same vtermno (or virtual console
> > > number) can be used to allocate different hvc consoles, which leads to
> > > various communication problems later on.
> > > 
> > > This is also reported in debugfs :
> > > 
> > >   # grep vtermno /sys/kernel/debug/virtio-ports/*
> > >   /sys/kernel/debug/virtio-ports/vport1p1:console_vtermno: 1
> > >   /sys/kernel/debug/virtio-ports/vport2p1:console_vtermno: 1
> > >   /sys/kernel/debug/virtio-ports/vport3p1:console_vtermno: 2
> > >   /sys/kernel/debug/virtio-ports/vport4p1:console_vtermno: 3
> > > 
> > > Fix the issue with an atomic variable and start the first console
> > > number at 1 as it is today.
> > > 
> > > Signed-off-by: Cédric Le Goater 
> > > ---
> > >   drivers/char/virtio_console.c | 8 ++++----
> > >   1 file changed, 4 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> > > index 9fa3c76a267f..253574f41e57 100644
> > > --- a/drivers/char/virtio_console.c
> > > +++ b/drivers/char/virtio_console.c
> > > @@ -58,12 +58,13 @@ struct ports_driver_data {
> > >* We also just assume the first console being initialised was
> > >* the first one that got used as the initial console.
> > >*/
> > > - unsigned int next_vtermno;
> > > + atomic_t next_vtermno;
> > >   /* All the console devices handled by this driver */
> > >   struct list_head consoles;
> > >   };
> > > -static struct ports_driver_data pdrvdata = { .next_vtermno = 1};
> > > +
> > > +static struct ports_driver_data pdrvdata = { .next_vtermno = ATOMIC_INIT(0) };
> > >   static DEFINE_SPINLOCK(pdrvdata_lock);
> > >   static DECLARE_COMPLETION(early_console_added);
> > > @@ -1244,7 +1245,7 @@ static int init_port_console(struct port *port)
> > >* pointers.  The final argument is the output buffer size: we
> > >* can do any size, so we put PAGE_SIZE here.
> > >*/
> > > - port->cons.vtermno = pdrvdata.next_vtermno;
> > > + port->cons.vtermno = atomic_inc_return(&pdrvdata.next_vtermno);
> > 
> > Why not use a normal ida/idr structure here?
> 
> yes that works.
> 
> > And why is this never decremented?
> 
> The driver would then need to track the id allocation ...

That's what an ida/idr does.

> > and finally, why not use the value that created the "vportN" number
> > instead?
> 
> yes. we could also encode the tuple (vdev->index, port) using a bitmask,

No need for that, you already have a unique number in the name above,
why not use that?

> possibly using 'max_nr_ports' to reduce the port width.

Why is that an issue?  Maybe I am confused as to what this magic
"vtermno" is here.  Who uses it and why is the vportN number not
sufficient?

> VIRTCONS_MAX_PORTS
> seems a bit big for this device and QEMU sets the #ports to 31.
> 
> An ida might be simpler. One drawback is that an id can be reused for a
> different device/port tuple in case of an (unlikely) unplug/plug sequence.

What's wrong with that?  We do not have persistent device names from
within the kernel.

thanks,

greg k-h
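
For readers unfamiliar with the ida idea referred to above (hand out the lowest free id, release it when the device goes away, reuse it later), here is a minimal user-space sketch. It is not the kernel IDA API, which provides ida_alloc() and ida_free(); the id_pool type, the 64-id bound, and the function names are assumptions, and there is no locking.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_IDS 64	/* assumed bound for the sketch */

/* One bit per id; a set bit means the id is in use. */
struct id_pool { uint64_t used; };

/* Return the lowest free id, or -1 when the pool is exhausted. */
static int id_alloc(struct id_pool *pool)
{
	for (int id = 0; id < MAX_IDS; id++) {
		uint64_t bit = UINT64_C(1) << id;

		if (!(pool->used & bit)) {
			pool->used |= bit;
			return id;
		}
	}
	return -1;
}

/* Release an id so that a later allocation can reuse it. */
static void id_free(struct id_pool *pool, int id)
{
	if (id >= 0 && id < MAX_IDS)
		pool->used &= ~(UINT64_C(1) << id);
}
```

Reuse after an unplug/plug sequence is exactly the behavior discussed above: a freed id is handed out again to the next caller.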


Re: [PATCH] virtio_console: Use an atomic to allocate virtual console numbers

2022-11-14 Thread Greg Kroah-Hartman
On Mon, Nov 14, 2022 at 09:07:52AM +0100, Cédric Le Goater wrote:
> When a virtio console port is initialized, it is registered as an hvc
> console using a virtual console number. If a KVM guest is started with
> multiple virtio console devices, the same vtermno (or virtual console
> number) can be used to allocate different hvc consoles, which leads to
> various communication problems later on.
> 
> This is also reported in debugfs :
> 
>   # grep vtermno /sys/kernel/debug/virtio-ports/*
>   /sys/kernel/debug/virtio-ports/vport1p1:console_vtermno: 1
>   /sys/kernel/debug/virtio-ports/vport2p1:console_vtermno: 1
>   /sys/kernel/debug/virtio-ports/vport3p1:console_vtermno: 2
>   /sys/kernel/debug/virtio-ports/vport4p1:console_vtermno: 3
> 
> Fix the issue with an atomic variable and start the first console
> number at 1 as it is today.
> 
> Signed-off-by: Cédric Le Goater 
> ---
>  drivers/char/virtio_console.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> index 9fa3c76a267f..253574f41e57 100644
> --- a/drivers/char/virtio_console.c
> +++ b/drivers/char/virtio_console.c
> @@ -58,12 +58,13 @@ struct ports_driver_data {
>* We also just assume the first console being initialised was
>* the first one that got used as the initial console.
>*/
> - unsigned int next_vtermno;
> + atomic_t next_vtermno;
>  
>   /* All the console devices handled by this driver */
>   struct list_head consoles;
>  };
> -static struct ports_driver_data pdrvdata = { .next_vtermno = 1};
> +
> +static struct ports_driver_data pdrvdata = { .next_vtermno = ATOMIC_INIT(0) };
>  
>  static DEFINE_SPINLOCK(pdrvdata_lock);
>  static DECLARE_COMPLETION(early_console_added);
> @@ -1244,7 +1245,7 @@ static int init_port_console(struct port *port)
>* pointers.  The final argument is the output buffer size: we
>* can do any size, so we put PAGE_SIZE here.
>*/
> - port->cons.vtermno = pdrvdata.next_vtermno;
> + port->cons.vtermno = atomic_inc_return(&pdrvdata.next_vtermno);

Why not use a normal ida/idr structure here?

And why is this never decremented?

And finally, why not use the value that created the "vportN" number
instead?

thanks,

greg k-h
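
For reference, the counter scheme in the quoted patch (start at 0, increment atomically, use the new value so the first console still gets 1) looks roughly like this in user-space C11. This is a sketch with <stdatomic.h>, not kernel code; alloc_vtermno() is a hypothetical name, and as noted above nothing here ever hands a number back.

```c
#include <assert.h>
#include <stdatomic.h>

/* Starts at 0, matching ATOMIC_INIT(0) in the quoted patch. */
static atomic_uint next_vtermno = 0;

/*
 * Increment and return the new value, similar in spirit to the kernel's
 * atomic_inc_return(): concurrent callers never see the same number.
 */
static unsigned int alloc_vtermno(void)
{
	return atomic_fetch_add(&next_vtermno, 1) + 1;
}
```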


Re: [PATCH v2 1/4] Make place for common balloon code

2022-08-16 Thread Greg Kroah-Hartman
On Tue, Aug 16, 2022 at 02:47:22PM +0300, Alexander Atanasov wrote:
> Hello,
> 
> On 16.08.22 12:49, Greg Kroah-Hartman wrote:
> > On Tue, Aug 16, 2022 at 12:41:14PM +0300, Alexander Atanasov wrote:
> 
> > >   rename include/linux/{balloon_compaction.h => balloon_common.h} (99%)
> > 
> > Why rename the .h file?  It still handles the "balloon compaction"
> > logic.
> 
> The file contains code that is common to balloon drivers;
> compaction is only part of it, and the series adds more code to it,
> since it was suggested to use it for such common code.
> I find "common" a better name for it, hence the rename.
> I can easily drop the rename in the next iteration if you suggest so.

"balloon_common.h" is very vague, you should only need one balloon.h
file in the include/linux/ directory, right, so of course it is "common"
:)

thanks,

greg "naming is hard" k-h
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 1/4] Make place for common balloon code

2022-08-16 Thread Greg Kroah-Hartman
On Tue, Aug 16, 2022 at 12:41:14PM +0300, Alexander Atanasov wrote:
> The file already contains code that is common among balloon
> drivers, so rename it to reflect its contents:
> mm/balloon_compaction.c -> mm/balloon_common.c
> 
> Signed-off-by: Alexander Atanasov 
> ---
>  MAINTAINERS  | 4 ++--
>  arch/powerpc/platforms/pseries/cmm.c | 2 +-
>  drivers/misc/vmw_balloon.c   | 2 +-
>  drivers/virtio/virtio_balloon.c  | 2 +-
>  include/linux/{balloon_compaction.h => balloon_common.h} | 2 +-
>  mm/Makefile  | 2 +-
>  mm/{balloon_compaction.c => balloon_common.c}| 4 ++--
>  mm/migrate.c | 2 +-
>  mm/vmscan.c  | 2 +-
>  9 files changed, 11 insertions(+), 11 deletions(-)
>  rename include/linux/{balloon_compaction.h => balloon_common.h} (99%)

Why rename the .h file?  It still handles the "balloon compaction"
logic.

thanks,

greg k-h


[PATCH 5.17 054/225] virtio_net: fix wrong buf address calculation when using xdp

2022-05-04 Thread Greg Kroah-Hartman
ead+0x138/0x190
 [   41.517198]  ksys_read+0x87/0xc0
 [   41.535336]  do_syscall_64+0x3b/0x90
 [   41.551637]  entry_SYSCALL_64_after_hwframe+0x44/0xae
 [   41.568050] RIP: 0033:0x48765b
 [   41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
 [   41.632818] RSP: 002b:00c000a2f5b8 EFLAGS: 0212 ORIG_RAX:
 [   41.664588] RAX: ffda RBX: 00c62000 RCX: 0048765b
 [   41.681205] RDX: 5e54 RSI: 00c000e66000 RDI: 0016
 [   41.697164] RBP: 00c000a2f608 R08: 0001 R09: 01b4
 [   41.713034] R10: 00b6 R11: 0212 R12: 00e9
 [   41.728755] R13: 0001 R14: 00c000a92000 R15:
 [   41.744254]
 [   41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net

 and

 [   33.524802] BUG: Bad page state in process systemd-network  pfn:11e60
 [   33.528617] page e05dc0147b00 e05dc04e7a00 8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data 8ae9851ec10c data_meta 8ae9851ec108 data_end 8ae9851ec14e
 [   33.529764] page:3792b5ba refcount:0 mapcount:-512 mapping: index:0x0 pfn:0x11e60
 [   33.532463] flags: 0xfc000(node=0|zone=1|lastcpupid=0x1f)
 [   33.532468] raw: 000fc000  dead0122
 [   33.532470] raw:   fdff
 [   33.532471] page dumped because: nonzero mapcount
 [   33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
 [   33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
 [   33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
 [   33.532484] Call Trace:
 [   33.532496]  
 [   33.532500]  dump_stack_lvl+0x45/0x5a
 [   33.532506]  bad_page.cold+0x63/0x94
 [   33.532510]  free_pcp_prepare+0x290/0x420
 [   33.532515]  free_unref_page+0x1b/0x100
 [   33.532518]  skb_release_data+0x13f/0x1c0
 [   33.532524]  kfree_skb_reason+0x3e/0xc0
 [   33.532527]  ip6_mc_input+0x23c/0x2b0
 [   33.532531]  ip6_sublist_rcv_finish+0x83/0x90
 [   33.532534]  ip6_sublist_rcv+0x22b/0x2b0

[3] XDP program to reproduce(xdp_pass.c):
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>

 SEC("xdp_pass")
 int xdp_pkt_pass(struct xdp_md *ctx)
 {
  bpf_xdp_adjust_head(ctx, -(int)32);
  return XDP_PASS;
 }

 char _license[] SEC("license") = "GPL";

 compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
 load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass

CC: sta...@vger.kernel.org
CC: Jason Wang 
CC: Xuan Zhuo 
CC: Daniel Borkmann 
CC: "Michael S. Tsirkin" 
CC: virtualization@lists.linux-foundation.org
Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
Signed-off-by: Nikolay Aleksandrov 
Reviewed-by: Xuan Zhuo 
Acked-by: Daniel Borkmann 
Acked-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
Link: https://lore.kernel.org/r/20220425103703.3067292-1-ra...@blackwall.org
Signed-off-by: Paolo Abeni 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/virtio_net.c |   20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -978,6 +978,24 @@ static struct sk_buff *receive_mergeable
 * xdp.data_meta were adjusted
 */
len = xdp.data_end - xdp.data + vi->hdr_len + metasize;
+
+   /* recalculate headroom if xdp.data or xdp_data_meta
+* were adjusted, note that offset should always point
+* to the start of the reserved bytes for virtio_net
+* header which are followed by xdp.data, that means
+* that offset is equal to the headroom (when buf is
+* starting at the beginning of the page, otherwise
+* there is a base offset inside the page) but it's used
+* with a different starting point (buf start) than
+* xdp.data (buf start + vnet hdr size). If xdp.data or
+* data_meta were adjusted by the xdp prog then the
+* headroom size has changed and so has the offset, we
+* can use data_hard_start, which points at buf start +
+* vnet hdr size, to calculate the new headroom and use
+* it later to compute buf start in page_to_skb()
+*/
+   headroom = xdp.data - xdp.data_hard_start - metasize;
+
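
The headroom recalculation added in the hunk above is plain pointer arithmetic, which the following user-space sketch reproduces. The xdp_ptrs struct and recalc_headroom() are hypothetical stand-ins, and the assumption that metasize equals data minus data_meta is the sketch's own; the sizes used when calling it are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the XDP buffer pointers. */
struct xdp_ptrs {
	unsigned char *data_hard_start;	/* buf start + vnet hdr size */
	unsigned char *data_meta;	/* start of metadata, <= data */
	unsigned char *data;		/* start of packet data */
};

/*
 * Same expression as in the hunk: the distance from data_hard_start to
 * data, minus the metadata that sits directly in front of data.
 */
static ptrdiff_t recalc_headroom(const struct xdp_ptrs *xdp)
{
	ptrdiff_t metasize = xdp->data - xdp->data_meta;

	return xdp->data - xdp->data_hard_start - metasize;
}
```

Moving data down by 32 bytes, as the reproducer's bpf_xdp_adjust_head(ctx, -(int)32) call does, shrinks the result by 32, which is why a headroom value computed before the program ran no longer locates the buffer start.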
   

[PATCH 5.15 044/177] virtio_net: fix wrong buf address calculation when using xdp

2022-05-04 Thread Greg Kroah-Hartman
ead+0x138/0x190
 [   41.517198]  ksys_read+0x87/0xc0
 [   41.535336]  do_syscall_64+0x3b/0x90
 [   41.551637]  entry_SYSCALL_64_after_hwframe+0x44/0xae
 [   41.568050] RIP: 0033:0x48765b
 [   41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
 [   41.632818] RSP: 002b:00c000a2f5b8 EFLAGS: 0212 ORIG_RAX:
 [   41.664588] RAX: ffda RBX: 00c62000 RCX: 0048765b
 [   41.681205] RDX: 5e54 RSI: 00c000e66000 RDI: 0016
 [   41.697164] RBP: 00c000a2f608 R08: 0001 R09: 01b4
 [   41.713034] R10: 00b6 R11: 0212 R12: 00e9
 [   41.728755] R13: 0001 R14: 00c000a92000 R15:
 [   41.744254]
 [   41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net

 and

 [   33.524802] BUG: Bad page state in process systemd-network  pfn:11e60
 [   33.528617] page e05dc0147b00 e05dc04e7a00 8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data 8ae9851ec10c data_meta 8ae9851ec108 data_end 8ae9851ec14e
 [   33.529764] page:3792b5ba refcount:0 mapcount:-512 mapping: index:0x0 pfn:0x11e60
 [   33.532463] flags: 0xfc000(node=0|zone=1|lastcpupid=0x1f)
 [   33.532468] raw: 000fc000  dead0122
 [   33.532470] raw:   fdff
 [   33.532471] page dumped because: nonzero mapcount
 [   33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
 [   33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
 [   33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
 [   33.532484] Call Trace:
 [   33.532496]  
 [   33.532500]  dump_stack_lvl+0x45/0x5a
 [   33.532506]  bad_page.cold+0x63/0x94
 [   33.532510]  free_pcp_prepare+0x290/0x420
 [   33.532515]  free_unref_page+0x1b/0x100
 [   33.532518]  skb_release_data+0x13f/0x1c0
 [   33.532524]  kfree_skb_reason+0x3e/0xc0
 [   33.532527]  ip6_mc_input+0x23c/0x2b0
 [   33.532531]  ip6_sublist_rcv_finish+0x83/0x90
 [   33.532534]  ip6_sublist_rcv+0x22b/0x2b0

[3] XDP program to reproduce(xdp_pass.c):
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>

 SEC("xdp_pass")
 int xdp_pkt_pass(struct xdp_md *ctx)
 {
  bpf_xdp_adjust_head(ctx, -(int)32);
  return XDP_PASS;
 }

 char _license[] SEC("license") = "GPL";

 compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
 load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass

CC: sta...@vger.kernel.org
CC: Jason Wang 
CC: Xuan Zhuo 
CC: Daniel Borkmann 
CC: "Michael S. Tsirkin" 
CC: virtualization@lists.linux-foundation.org
Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
Signed-off-by: Nikolay Aleksandrov 
Reviewed-by: Xuan Zhuo 
Acked-by: Daniel Borkmann 
Acked-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
Link: https://lore.kernel.org/r/20220425103703.3067292-1-ra...@blackwall.org
Signed-off-by: Paolo Abeni 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/virtio_net.c |   20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -965,6 +965,24 @@ static struct sk_buff *receive_mergeable
 * xdp.data_meta were adjusted
 */
len = xdp.data_end - xdp.data + vi->hdr_len + metasize;
+
+   /* recalculate headroom if xdp.data or xdp_data_meta
+* were adjusted, note that offset should always point
+* to the start of the reserved bytes for virtio_net
+* header which are followed by xdp.data, that means
+* that offset is equal to the headroom (when buf is
+* starting at the beginning of the page, otherwise
+* there is a base offset inside the page) but it's used
+* with a different starting point (buf start) than
+* xdp.data (buf start + vnet hdr size). If xdp.data or
+* data_meta were adjusted by the xdp prog then the
+* headroom size has changed and so has the offset, we
+* can use data_hard_start, which points at buf start +
+* vnet hdr size, to calculate the new headroom and use
+* it later to compute buf start in page_to_skb()
+*/
+   headroom = xdp.data - xdp.data_hard_start - metasize;
+
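
The headroom recalculation in the hunk above can be modelled in user space. This is only a sketch under assumptions: struct xdp_buff_model, recalc_headroom() and model_adjust_head() are hypothetical stand-ins for the three struct xdp_buff pointers the comment talks about, not the driver's code, and the model assumes that metadata stays directly in front of the packet data when the head is moved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the pointers kept in struct xdp_buff. */
struct xdp_buff_model {
	unsigned char *data_hard_start; /* buf start + vnet hdr size */
	unsigned char *data_meta;       /* start of metadata, == data if none */
	unsigned char *data;            /* start of packet data */
};

/* Same expression as the patch: headroom left in front of the metadata. */
static ptrdiff_t recalc_headroom(const struct xdp_buff_model *xdp)
{
	ptrdiff_t metasize;

	assert(xdp->data_meta <= xdp->data);
	metasize = xdp->data - xdp->data_meta;
	return xdp->data - xdp->data_hard_start - metasize;
}

/*
 * Model of an XDP program moving the packet head by delta bytes
 * (negative delta grows the packet toward the buffer start).
 */
static void model_adjust_head(struct xdp_buff_model *xdp, int delta)
{
	xdp->data += delta;
	xdp->data_meta += delta;
}
```

With an example headroom of 256 bytes, a call modelled on the reproducer's bpf_xdp_adjust_head(ctx, -32) leaves 224 bytes, so a buffer start computed from the old headroom would be off by 32 bytes.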
   

Re: [PATCH v2] drm/cirrus: fix a NULL vs IS_ERR() checks

2022-04-25 Thread Greg Kroah-Hartman
On Mon, Apr 25, 2022 at 10:10:43PM +0800, Shile Zhang wrote:
> The function drm_gem_shmem_vmap can returns error pointers as well,
> which could cause following kernel crash:
> 
> BUG: unable to handle page fault for address: fffc
> PGD 1426a12067 P4D 1426a12067 PUD 1426a14067 PMD 0
> Oops:  [#1] SMP NOPTI
> CPU: 12 PID: 3598532 Comm: stress-ng Kdump: loaded Not tainted 5.10.50.x86_64 #1
> ...
> RIP: 0010:memcpy_toio+0x23/0x50
> Code: 00 00 00 00 0f 1f 00 0f 1f 44 00 00 48 85 d2 74 28 40 f6 c7 01 75 2b 48 83 fa 01 76 06 40 f6 c7 02 75 17 48 89 d1 48 c1 e9 02  a5 f6 c2 02 74 02 66 a5 f6 c2 01 74 01 a4 c3 66 a5 48 83 ea 02
> RSP: 0018:afbf8a203c68 EFLAGS: 00010216
> RAX:  RBX: fffc RCX: 0200
> RDX: 0800 RSI: fffc RDI: afbf8200
> RBP: afbf8200 R08: 0002 R09: 
> R10: 02b5 R11:  R12: 0800
> R13: 8a6801099300 R14: 0001 R15: 0300
> FS:  7f4a6bc5f740() GS:8a864190() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: fffc CR3: 0016d3874001 CR4: 003606e0
> Call Trace:
>  drm_fb_memcpy_dstclip+0x5e/0x80 [drm_kms_helper]
>  cirrus_fb_blit_rect.isra.0+0xb7/0xe0 [cirrus]
>  cirrus_pipe_update+0x9f/0xa8 [cirrus]
>  drm_atomic_helper_commit_planes+0xb8/0x220 [drm_kms_helper]
>  drm_atomic_helper_commit_tail+0x42/0x80 [drm_kms_helper]
>  commit_tail+0xce/0x130 [drm_kms_helper]
>  drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper]
>  drm_client_modeset_commit_atomic+0x1c4/0x200 [drm]
>  drm_client_modeset_commit_locked+0x53/0x80 [drm]
>  drm_client_modeset_commit+0x24/0x40 [drm]
>  drm_fbdev_client_restore+0x48/0x85 [drm_kms_helper]
>  drm_client_dev_restore+0x64/0xb0 [drm]
>  drm_release+0xf2/0x110 [drm]
>  __fput+0x96/0x240
>  task_work_run+0x5c/0x90
>  exit_to_user_mode_loop+0xce/0xd0
>  exit_to_user_mode_prepare+0x6a/0x70
>  syscall_exit_to_user_mode+0x12/0x40
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f4a6bd82c2b
> 
> Fixes: ab3e023b1b4c9 ("drm/cirrus: rewrite and modernize driver.")
> 
> Signed-off-by: Shile Zhang 

No blank line between those please.

And you need to really really really document why this can not use a
commit that is currently upstream.  And what commit upstream did solve
this and how.  Otherwise we can not take this change, sorry.

greg k-h
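
The class of bug discussed in this thread (a NULL check applied to a function that reports failure with an error pointer) can be illustrated in user space. This is a sketch under assumptions: model_err_ptr() and model_is_err() are simplified stand-ins written for this example in the spirit of the kernel's ERR_PTR()/IS_ERR() helpers, and model_vmap() is a hypothetical function, not drm_gem_shmem_vmap().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's error-pointer helpers. */
#define MODEL_MAX_ERRNO 4095

static void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static int model_is_err(const void *ptr)
{
	/* Error pointers occupy the last MODEL_MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Hypothetical mapping function: on failure it returns an error pointer, not NULL. */
static void *model_vmap(int fail)
{
	static char mapping[16];

	return fail ? model_err_ptr(-ENOMEM) : (void *)mapping;
}

/* Buggy caller: a NULL-only check lets the error pointer through. */
static int null_check_accepts(const void *vmap)
{
	return vmap != NULL;
}

/* Fixed caller: reject error pointers as well. */
static int is_err_check_accepts(const void *vmap)
{
	return vmap != NULL && !model_is_err(vmap);
}
```

An error pointer is a non-NULL value, so the NULL-only check accepts it and a later copy would dereference an address near the top of the address space.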


Re: [PATCH 5.10.y] drm/cirrus: fix a NULL vs IS_ERR() checks

2022-04-25 Thread Greg Kroah-Hartman
On Sun, Apr 24, 2022 at 11:27:17AM +0800, Shile Zhang wrote:
> Hi David and Daniel,
> 
> Sorry but could you please help to check this issue?
> The function 'drm_gem_shmem_vmap' can return error pointers, which will
> crash the kernel because 'cirrus_fb_blit_rect' only checks the pointer
> against NULL.
> 
> Since the related code has been refactored in mainline, this issue only
> happens in the stable 5.10.y branch.
> 
> @Greg
> I think it is probably not realistic to backport the related refactoring
> from mainline directly, so I just give this bugfix patch only for 5.10.y
> branch.

I'm sorry, but I do not have "this bugfix" in my queue anymore,
considering it is so old.  Please rebase and resubmit.

thanks,

greg k-h


Re: [PATCH v7 00/12] Fix broken usage of driver_override (and kfree of static memory)

2022-04-22 Thread Greg Kroah-Hartman
On Wed, Apr 20, 2022 at 11:20:06AM +0200, Krzysztof Kozlowski wrote:
> On 19/04/2022 13:34, Krzysztof Kozlowski wrote:
> 
> Hi Greg, Rafael,
> 
> The patchset was for some time on the lists, got some reviews, some
> changes/feedback which I hope I applied/responded.
> 
> Entire set depends on the driver core changes, so maybe you could pick
> up everything via drivers core tree?

Ok, will do, thanks.

greg k-h


[PATCH 4.9 185/218] virtio_console: eliminate anonymous module_init & module_exit

2022-04-18 Thread Greg Kroah-Hartman
From: Randy Dunlap 

[ Upstream commit fefb8a2a941338d871e2d83fbd65fbfa068857bd ]

Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.

Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names.

Example 1: (System.map)
 832fc78c t init
 832fc79e t init
 832fc8f8 t init

Example 2: (initcall_debug log)
 calling  init+0x0/0x12 @ 1
 initcall init+0x0/0x12 returned 0 after 15 usecs
 calling  init+0x0/0x60 @ 1
 initcall init+0x0/0x60 returned 0 after 2 usecs
 calling  init+0x0/0x9a @ 1
 initcall init+0x0/0x9a returned 0 after 74 usecs

Signed-off-by: Randy Dunlap 
Reviewed-by: Amit Shah 
Cc: virtualization@lists.linux-foundation.org
Cc: Arnd Bergmann 
Link: https://lore.kernel.org/r/20220316192010.19001-3-rdun...@infradead.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/char/virtio_console.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index a6b6dc204c1f..ba4c546db756 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -2284,7 +2284,7 @@ static struct virtio_driver virtio_rproc_serial = {
.remove =   virtcons_remove,
 };
 
-static int __init init(void)
+static int __init virtio_console_init(void)
 {
int err;
 
@@ -2321,7 +2321,7 @@ static int __init init(void)
return err;
 }
 
-static void __exit fini(void)
+static void __exit virtio_console_fini(void)
 {
reclaim_dma_bufs();
 
@@ -2331,8 +2331,8 @@ static void __exit fini(void)
class_destroy(pdrvdata.class);
debugfs_remove_recursive(pdrvdata.debugfs_dir);
 }
-module_init(init);
-module_exit(fini);
+module_init(virtio_console_init);
+module_exit(virtio_console_fini);
 
 MODULE_DESCRIPTION("Virtio console driver");
 MODULE_LICENSE("GPL");
-- 
2.35.1





[PATCH 4.14 233/284] virtio_console: eliminate anonymous module_init & module_exit

2022-04-18 Thread Greg Kroah-Hartman
From: Randy Dunlap 

[ Upstream commit fefb8a2a941338d871e2d83fbd65fbfa068857bd ]

Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.

Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names.

Example 1: (System.map)
 832fc78c t init
 832fc79e t init
 832fc8f8 t init

Example 2: (initcall_debug log)
 calling  init+0x0/0x12 @ 1
 initcall init+0x0/0x12 returned 0 after 15 usecs
 calling  init+0x0/0x60 @ 1
 initcall init+0x0/0x60 returned 0 after 2 usecs
 calling  init+0x0/0x9a @ 1
 initcall init+0x0/0x9a returned 0 after 74 usecs

Signed-off-by: Randy Dunlap 
Reviewed-by: Amit Shah 
Cc: virtualization@lists.linux-foundation.org
Cc: Arnd Bergmann 
Link: https://lore.kernel.org/r/20220316192010.19001-3-rdun...@infradead.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/char/virtio_console.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 2140d401523f..fa103e7a43b7 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -2281,7 +2281,7 @@ static struct virtio_driver virtio_rproc_serial = {
.remove =   virtcons_remove,
 };
 
-static int __init init(void)
+static int __init virtio_console_init(void)
 {
int err;
 
@@ -2318,7 +2318,7 @@ static int __init init(void)
return err;
 }
 
-static void __exit fini(void)
+static void __exit virtio_console_fini(void)
 {
reclaim_dma_bufs();
 
@@ -2328,8 +2328,8 @@ static void __exit fini(void)
class_destroy(pdrvdata.class);
debugfs_remove_recursive(pdrvdata.debugfs_dir);
 }
-module_init(init);
-module_exit(fini);
+module_init(virtio_console_init);
+module_exit(virtio_console_fini);
 
 MODULE_DESCRIPTION("Virtio console driver");
 MODULE_LICENSE("GPL");
-- 
2.35.1





[PATCH 5.4 417/475] virtio_console: eliminate anonymous module_init & module_exit

2022-04-14 Thread Greg Kroah-Hartman
From: Randy Dunlap 

[ Upstream commit fefb8a2a941338d871e2d83fbd65fbfa068857bd ]

Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.

Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names.

Example 1: (System.map)
 832fc78c t init
 832fc79e t init
 832fc8f8 t init

Example 2: (initcall_debug log)
 calling  init+0x0/0x12 @ 1
 initcall init+0x0/0x12 returned 0 after 15 usecs
 calling  init+0x0/0x60 @ 1
 initcall init+0x0/0x60 returned 0 after 2 usecs
 calling  init+0x0/0x9a @ 1
 initcall init+0x0/0x9a returned 0 after 74 usecs

Signed-off-by: Randy Dunlap 
Reviewed-by: Amit Shah 
Cc: virtualization@lists.linux-foundation.org
Cc: Arnd Bergmann 
Link: https://lore.kernel.org/r/20220316192010.19001-3-rdun...@infradead.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/char/virtio_console.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 2660a0c5483a..c736adef9d3c 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -2241,7 +2241,7 @@ static struct virtio_driver virtio_rproc_serial = {
.remove =   virtcons_remove,
 };
 
-static int __init init(void)
+static int __init virtio_console_init(void)
 {
int err;
 
@@ -2278,7 +2278,7 @@ static int __init init(void)
return err;
 }
 
-static void __exit fini(void)
+static void __exit virtio_console_fini(void)
 {
reclaim_dma_bufs();
 
@@ -2288,8 +2288,8 @@ static void __exit fini(void)
class_destroy(pdrvdata.class);
debugfs_remove_recursive(pdrvdata.debugfs_dir);
 }
-module_init(init);
-module_exit(fini);
+module_init(virtio_console_init);
+module_exit(virtio_console_fini);
 
 MODULE_DESCRIPTION("Virtio console driver");
 MODULE_LICENSE("GPL");
-- 
2.35.1





[PATCH 4.19 297/338] virtio_console: eliminate anonymous module_init & module_exit

2022-04-14 Thread Greg Kroah-Hartman
From: Randy Dunlap 

[ Upstream commit fefb8a2a941338d871e2d83fbd65fbfa068857bd ]

Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.

Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names.

Example 1: (System.map)
 832fc78c t init
 832fc79e t init
 832fc8f8 t init

Example 2: (initcall_debug log)
 calling  init+0x0/0x12 @ 1
 initcall init+0x0/0x12 returned 0 after 15 usecs
 calling  init+0x0/0x60 @ 1
 initcall init+0x0/0x60 returned 0 after 2 usecs
 calling  init+0x0/0x9a @ 1
 initcall init+0x0/0x9a returned 0 after 74 usecs

Signed-off-by: Randy Dunlap 
Reviewed-by: Amit Shah 
Cc: virtualization@lists.linux-foundation.org
Cc: Arnd Bergmann 
Link: https://lore.kernel.org/r/20220316192010.19001-3-rdun...@infradead.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/char/virtio_console.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index ac0b84afabe7..d3937d690400 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -2265,7 +2265,7 @@ static struct virtio_driver virtio_rproc_serial = {
.remove =   virtcons_remove,
 };
 
-static int __init init(void)
+static int __init virtio_console_init(void)
 {
int err;
 
@@ -2302,7 +2302,7 @@ static int __init init(void)
return err;
 }
 
-static void __exit fini(void)
+static void __exit virtio_console_fini(void)
 {
reclaim_dma_bufs();
 
@@ -2312,8 +2312,8 @@ static void __exit fini(void)
class_destroy(pdrvdata.class);
debugfs_remove_recursive(pdrvdata.debugfs_dir);
 }
-module_init(init);
-module_exit(fini);
+module_init(virtio_console_init);
+module_exit(virtio_console_fini);
 
 MODULE_DESCRIPTION("Virtio console driver");
 MODULE_LICENSE("GPL");
-- 
2.35.1





[PATCH 5.17 184/343] virtio_console: eliminate anonymous module_init & module_exit

2022-04-12 Thread Greg Kroah-Hartman
From: Randy Dunlap 

[ Upstream commit fefb8a2a941338d871e2d83fbd65fbfa068857bd ]

Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.

Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names.

Example 1: (System.map)
 832fc78c t init
 832fc79e t init
 832fc8f8 t init

Example 2: (initcall_debug log)
 calling  init+0x0/0x12 @ 1
 initcall init+0x0/0x12 returned 0 after 15 usecs
 calling  init+0x0/0x60 @ 1
 initcall init+0x0/0x60 returned 0 after 2 usecs
 calling  init+0x0/0x9a @ 1
 initcall init+0x0/0x9a returned 0 after 74 usecs

Signed-off-by: Randy Dunlap 
Reviewed-by: Amit Shah 
Cc: virtualization@lists.linux-foundation.org
Cc: Arnd Bergmann 
Link: https://lore.kernel.org/r/20220316192010.19001-3-rdun...@infradead.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/char/virtio_console.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index e3c430539a17..9fa3c76a267f 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -2245,7 +2245,7 @@ static struct virtio_driver virtio_rproc_serial = {
.remove =   virtcons_remove,
 };
 
-static int __init init(void)
+static int __init virtio_console_init(void)
 {
int err;
 
@@ -2280,7 +2280,7 @@ static int __init init(void)
return err;
 }
 
-static void __exit fini(void)
+static void __exit virtio_console_fini(void)
 {
reclaim_dma_bufs();
 
@@ -2290,8 +2290,8 @@ static void __exit fini(void)
class_destroy(pdrvdata.class);
debugfs_remove_recursive(pdrvdata.debugfs_dir);
 }
-module_init(init);
-module_exit(fini);
+module_init(virtio_console_init);
+module_exit(virtio_console_fini);
 
 MODULE_DESCRIPTION("Virtio console driver");
 MODULE_LICENSE("GPL");
-- 
2.35.1





[PATCH 5.16 140/285] virtio_console: eliminate anonymous module_init & module_exit

2022-04-12 Thread Greg Kroah-Hartman
From: Randy Dunlap 

[ Upstream commit fefb8a2a941338d871e2d83fbd65fbfa068857bd ]

Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.

Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names.

Example 1: (System.map)
 832fc78c t init
 832fc79e t init
 832fc8f8 t init

Example 2: (initcall_debug log)
 calling  init+0x0/0x12 @ 1
 initcall init+0x0/0x12 returned 0 after 15 usecs
 calling  init+0x0/0x60 @ 1
 initcall init+0x0/0x60 returned 0 after 2 usecs
 calling  init+0x0/0x9a @ 1
 initcall init+0x0/0x9a returned 0 after 74 usecs

Signed-off-by: Randy Dunlap 
Reviewed-by: Amit Shah 
Cc: virtualization@lists.linux-foundation.org
Cc: Arnd Bergmann 
Link: https://lore.kernel.org/r/20220316192010.19001-3-rdun...@infradead.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/char/virtio_console.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index f864b17be7e3..35025f283bf6 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -2245,7 +2245,7 @@ static struct virtio_driver virtio_rproc_serial = {
.remove =   virtcons_remove,
 };
 
-static int __init init(void)
+static int __init virtio_console_init(void)
 {
int err;
 
@@ -2280,7 +2280,7 @@ static int __init init(void)
return err;
 }
 
-static void __exit fini(void)
+static void __exit virtio_console_fini(void)
 {
reclaim_dma_bufs();
 
@@ -2290,8 +2290,8 @@ static void __exit fini(void)
class_destroy(pdrvdata.class);
debugfs_remove_recursive(pdrvdata.debugfs_dir);
 }
-module_init(init);
-module_exit(fini);
+module_init(virtio_console_init);
+module_exit(virtio_console_fini);
 
 MODULE_DESCRIPTION("Virtio console driver");
 MODULE_LICENSE("GPL");
-- 
2.35.1





[PATCH 5.15 135/277] virtio_console: eliminate anonymous module_init & module_exit

2022-04-11 Thread Greg Kroah-Hartman
From: Randy Dunlap 

[ Upstream commit fefb8a2a941338d871e2d83fbd65fbfa068857bd ]

Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.

Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names.

Example 1: (System.map)
 832fc78c t init
 832fc79e t init
 832fc8f8 t init

Example 2: (initcall_debug log)
 calling  init+0x0/0x12 @ 1
 initcall init+0x0/0x12 returned 0 after 15 usecs
 calling  init+0x0/0x60 @ 1
 initcall init+0x0/0x60 returned 0 after 2 usecs
 calling  init+0x0/0x9a @ 1
 initcall init+0x0/0x9a returned 0 after 74 usecs

Signed-off-by: Randy Dunlap 
Reviewed-by: Amit Shah 
Cc: virtualization@lists.linux-foundation.org
Cc: Arnd Bergmann 
Link: https://lore.kernel.org/r/20220316192010.19001-3-rdun...@infradead.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/char/virtio_console.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 3adf04766e98..77bc993d7513 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -2236,7 +2236,7 @@ static struct virtio_driver virtio_rproc_serial = {
.remove =   virtcons_remove,
 };
 
-static int __init init(void)
+static int __init virtio_console_init(void)
 {
int err;
 
@@ -2271,7 +2271,7 @@ static int __init init(void)
return err;
 }
 
-static void __exit fini(void)
+static void __exit virtio_console_fini(void)
 {
reclaim_dma_bufs();
 
@@ -2281,8 +2281,8 @@ static void __exit fini(void)
class_destroy(pdrvdata.class);
debugfs_remove_recursive(pdrvdata.debugfs_dir);
 }
-module_init(init);
-module_exit(fini);
+module_init(virtio_console_init);
+module_exit(virtio_console_fini);
 
 MODULE_DESCRIPTION("Virtio console driver");
 MODULE_LICENSE("GPL");
-- 
2.35.1





[PATCH 5.10 089/171] virtio_console: eliminate anonymous module_init & module_exit

2022-04-11 Thread Greg Kroah-Hartman
From: Randy Dunlap 

[ Upstream commit fefb8a2a941338d871e2d83fbd65fbfa068857bd ]

Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.

Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names.

Example 1: (System.map)
 832fc78c t init
 832fc79e t init
 832fc8f8 t init

Example 2: (initcall_debug log)
 calling  init+0x0/0x12 @ 1
 initcall init+0x0/0x12 returned 0 after 15 usecs
 calling  init+0x0/0x60 @ 1
 initcall init+0x0/0x60 returned 0 after 2 usecs
 calling  init+0x0/0x9a @ 1
 initcall init+0x0/0x9a returned 0 after 74 usecs

Signed-off-by: Randy Dunlap 
Reviewed-by: Amit Shah 
Cc: virtualization@lists.linux-foundation.org
Cc: Arnd Bergmann 
Link: https://lore.kernel.org/r/20220316192010.19001-3-rdun...@infradead.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/char/virtio_console.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 3dd4deb60adb..6d361420ffe8 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -2239,7 +2239,7 @@ static struct virtio_driver virtio_rproc_serial = {
.remove =   virtcons_remove,
 };
 
-static int __init init(void)
+static int __init virtio_console_init(void)
 {
int err;
 
@@ -2276,7 +2276,7 @@ static int __init init(void)
return err;
 }
 
-static void __exit fini(void)
+static void __exit virtio_console_fini(void)
 {
reclaim_dma_bufs();
 
@@ -2286,8 +2286,8 @@ static void __exit fini(void)
class_destroy(pdrvdata.class);
debugfs_remove_recursive(pdrvdata.debugfs_dir);
 }
-module_init(init);
-module_exit(fini);
+module_init(virtio_console_init);
+module_exit(virtio_console_fini);
 
 MODULE_DESCRIPTION("Virtio console driver");
 MODULE_LICENSE("GPL");
-- 
2.35.1





Re: [PATCH v2 4/6] virtio: Initialize authorized attribute for confidential guest

2021-10-05 Thread Greg Kroah-Hartman
On Tue, Oct 05, 2021 at 03:33:29PM -0700, Dan Williams wrote:
> On Sun, Oct 3, 2021 at 10:16 PM Mika Westerberg
>  wrote:
> >
> > Hi,
> >
> > On Fri, Oct 01, 2021 at 12:57:18PM -0700, Dan Williams wrote:
> > > > > Ah, so are you saying that it would be sufficient for USB if the
> > > > > generic authorized implementation did something like:
> > > > >
> > > > > dev->authorized = 1;
> > > > > device_attach(dev);
> > > > >
> > > > > ...for the authorize case, and:
> > > > >
> > > > > dev->authorized = 0;
> > > > > device_release_driver(dev);
> > > > >
> > > > > ...for the deauthorize case?
> > > >
> > > > Yes, I think so.  But I haven't tried making this change to test and
> > > > see what really happens.
> > >
> > > Sounds like a useful path for this effort to explore. Especially as
> > > Greg seems to want the proposed "has_probe_authorization" flag in the
> > > bus_type to disappear and make this all generic. It just seems that
> > > Thunderbolt would need deeper surgery to move what it does in the
> > > authorization toggle path into the probe and remove paths.
> > >
> > > Mika, do you see a path for Thunderbolt to align its authorization
> > > paths behind bus ->probe() ->remove() events similar to what USB might
> > > be able to support for a generic authorization path?
> >
> > In Thunderbolt "authorization" actually means whether there is a PCIe
> > tunnel to the device or not. There is no driver bind/unbind happening
> > when authorization toggles (well on Thunderbolt bus, there can be on PCI
> > bus after the tunnel is established) so I'm not entirely sure how we
> > could use the bus ->probe() or ->remove for that to be honest.
> 
> Greg, per your comment:
> 
> "... which was to move the way that busses are allowed to authorize
> the devices they wish to control into a generic way instead of being
> bus-specific logic."
> 
> We have USB and TB that have already diverged on the ABI here. The USB
> behavior is more in line with the "probe authorization" concept, while
> TB is about tunnel establishment and not cleanly tied to probe
> authorization. So while I see a path to a common authorization
> implementation for USB and other buses (per the insight from Alan), TB
> needs to retain the ability to record the authorization state as an
> enum rather than a bool, and emit a uevent on authorization status
> change.
> 
> So how about something like the following that moves the attribute
> into the core, but still calls back to TB and USB to perform their
> legacy authorization work. This new authorized attribute only shows up
> when devices default to not authorized, i.e. when userspace owns the
> allow list past critical-boot built-in drivers, or if the bus (USB /
> TB) implements ->authorize().

At quick glance, this looks better, but it would be good to see someone
test it :)

thanks,

greg k-h
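
The generic authorize/deauthorize toggle quoted earlier in this thread (set the flag and attach on authorize, clear it and release the driver on deauthorize) can be sketched as a toy user-space model. All names below are hypothetical and exist only for this example: model_attach() and model_release_driver() stand in for device_attach() and device_release_driver(), and the model ignores the Thunderbolt case, where authorization is not tied to driver binding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a device; none of these names are kernel APIs. */
struct model_device {
	bool authorized;
	bool bound; /* true while a driver is attached */
};

/* Stand-in for device_attach(): binding only succeeds when authorized. */
static void model_attach(struct model_device *dev)
{
	if (dev->authorized)
		dev->bound = true;
}

/* Stand-in for device_release_driver(). */
static void model_release_driver(struct model_device *dev)
{
	dev->bound = false;
}

/* Generic authorize/deauthorize toggle following the quoted proposal. */
static void model_set_authorized(struct model_device *dev, bool val)
{
	assert(dev != NULL);
	dev->authorized = val;
	if (val)
		model_attach(dev);
	else
		model_release_driver(dev);
}
```

In this model a device that starts out unauthorized cannot bind until model_set_authorized(dev, true) is called, which matches the "devices default to not authorized" case described in the quoted text.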


Re: [PATCH v1 2/6] mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE

2021-10-05 Thread Greg Kroah-Hartman
On Wed, Sep 29, 2021 at 04:35:56PM +0200, David Hildenbrand wrote:
> CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM, so there is no need for
> CONFIG_MEMORY_HOTPLUG_SPARSE anymore; adjust all instances to use
> CONFIG_MEMORY_HOTPLUG and remove CONFIG_MEMORY_HOTPLUG_SPARSE.
> 
> Signed-off-by: David Hildenbrand 

Acked-by: Greg Kroah-Hartman 


Re: [PATCH] virtio_console: break out of buf poll on remove

2021-10-05 Thread Greg Kroah-Hartman
On Tue, Oct 05, 2021 at 03:04:07AM -0400, Michael S. Tsirkin wrote:
> A common pattern for device reset is currently:
> vdev->config->reset(vdev);
> .. cleanup ..
> 
> reset prevents new interrupts from arriving and waits for interrupt
> handlers to finish.
> 
> However if - as is common - the handler queues a work request which is
> flushed during the cleanup stage, we have code adding buffers / trying
> to get buffers while device is reset. Not good.
> 
> This was reproduced by running
>   modprobe virtio_console
>   modprobe -r virtio_console
> in a loop.

That's a pathological case that is not "in the field" except by people
who want to abuse the system as root.  And they can do much worse things
than that.

> Fixing this comprehensively needs some thought, and new APIs.
> Let's at least handle the specific case of virtio_console
> removal that was reported in the field.

Let's fix this correctly, don't just hack it up now.

> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1786239
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/char/virtio_console.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> index 7eaf303a7a86..c852ce0b4d56 100644
> --- a/drivers/char/virtio_console.c
> +++ b/drivers/char/virtio_console.c
> @@ -1956,6 +1956,12 @@ static void virtcons_remove(struct virtio_device *vdev)
>   list_del(&portdev->list);
>   spin_unlock_irq(&pdrvdata_lock);
>  
> + /* Device is going away, exit any polling for buffers */
> + virtio_break_device(vdev);
> + if (use_multiport(portdev))
> + flush_work(&portdev->control_work);
> + else
> + flush_work(&portdev->config_work);
>   /* Disable interrupts for vqs */

newline before comment?

thanks,

greg k-h
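
The idea behind the quoted hunk (mark the device broken so that code polling for buffers stops waiting) can be sketched as a toy user-space model. This is an illustration under assumptions, not the driver's code: struct model_vq and the model_* functions are hypothetical, model_break() stands in for virtio_break_device(), and the max_polls bound exists only so the model always terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy virtqueue: a buffer arrives after a fixed number of polls, or never. */
struct model_vq {
	bool broken;
	unsigned long polls_until_buf; /* 0 means a buffer never arrives */
	unsigned long polls;
};

/* One polling attempt; returns true once a buffer is available. */
static bool model_get_buf(struct model_vq *vq)
{
	vq->polls++;
	return vq->polls_until_buf && vq->polls >= vq->polls_until_buf;
}

/* Stand-in for virtio_break_device(): mark the queue unusable. */
static void model_break(struct model_vq *vq)
{
	vq->broken = true;
}

/*
 * Poll for a completed buffer, giving up when the queue is broken or
 * after max_polls attempts. Returns true if a buffer was obtained.
 */
static bool model_poll_for_buf(struct model_vq *vq, unsigned long max_polls)
{
	assert(vq != NULL);
	while (!vq->broken && vq->polls < max_polls) {
		if (model_get_buf(vq))
			return true;
	}
	return false;
}
```

Once model_break() has been called, a poller returns immediately instead of waiting for a buffer that a reset device will never deliver.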


Re: [PATCH v2 4/6] virtio: Initialize authorized attribute for confidential guest

2021-10-02 Thread Greg Kroah-Hartman
On Sat, Oct 02, 2021 at 02:40:55PM -0400, Michael S. Tsirkin wrote:
> On Sat, Oct 02, 2021 at 07:20:22AM -0700, Andi Kleen wrote:
> > 
> > On 10/2/2021 4:14 AM, Greg Kroah-Hartman wrote:
> > > On Sat, Oct 02, 2021 at 07:04:28AM -0400, Michael S. Tsirkin wrote:
> > > > On Fri, Oct 01, 2021 at 08:49:28AM -0700, Andi Kleen wrote:
> > > > > >Do you have a list of specific drivers and kernel options that you
> > > > > > feel you now "trust"?
> > > > > For TDX it's currently only virtio net/block/console
> > > > > 
> > > > > But we expect this list to grow slightly over time, but not at a high rate
> > > > > (so hopefully <10)
> > > > Well there are already >10 virtio drivers and I think it's reasonable
> > > > that all of these will be used with encrypted guests. The list will
> > > > grow.
> > > What is keeping "all" drivers from being on this list?
> > 
> > It would be too much work to harden them all, and it would be pointless
> > because all these drivers are never legitimately needed in a virtualized
> > environment which only virtualize a very small number of devices.
> > 
> > >   How exactly are
> > > you determining what should, and should not, be allowed?
> > 
> > Everything that has had reasonable effort at hardening can be added. But if
> > someone proposes to add a driver that should trigger additional scrutiny in
> > code review. We should also request them to do some fuzzing.
> 
> Looks like out of tree modules get a free pass then.

That's not good.  As we already know if a module is in or out of the
tree, you all should be banning all out-of-tree modules if you care
about these things.  That should be very easy to do if you care.

> > How would user space know what drivers have been hardened? This is really
> > something that the kernel needs to determine. I don't think we can outsource
> > it to anyone else.
> 
> IIUC userspace is the distro. It can also do more than a binary on/off,
> e.g. it can decide "only virtio", "no out of tree drivers".
> A distro can also ship configs with a specific features
> enabled/disabled. E.g. I can see where some GPU drivers will be
> included by some distros since they are so useful, and excluded
> by others since they are so big and hard to audit.
> I don't see how the kernel can reasonably make a stand here.
> Is "some audit and some fuzzing" a good policy? How much is enough?

Agreed, that is why the policy for this should be in userspace.

> Well if userspace sets the policy then I'm not sure we also want
> a kernel one ... but if yes I'd like it to be in a central
> place so whoever is building the kernel can tweak it easily
> and rebuild, without poking at individual drivers.

And here I thought the requirement was that no one could rebuild their
kernel as it was provided by someone else.

Again, these requirements seem contradicting, but as no one has actually
pointed me at the real list of them, who knows what they are?

thanks,

greg k-h
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 4/6] virtio: Initialize authorized attribute for confidential guest

2021-10-02 Thread Greg Kroah-Hartman
On Sat, Oct 02, 2021 at 07:20:22AM -0700, Andi Kleen wrote:
> 
> On 10/2/2021 4:14 AM, Greg Kroah-Hartman wrote:
> > On Sat, Oct 02, 2021 at 07:04:28AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Oct 01, 2021 at 08:49:28AM -0700, Andi Kleen wrote:
> > > > >Do you have a list of specific drivers and kernel options that you
> > > > > feel you now "trust"?
> > > > For TDX it's currently only virtio net/block/console
> > > > 
> > > > > But we expect this list to grow slightly over time, but not at a high rate
> > > > > (so hopefully <10)
> > > Well there are already >10 virtio drivers and I think it's reasonable
> > > that all of these will be used with encrypted guests. The list will
> > > grow.
> > What is keeping "all" drivers from being on this list?
> 
> It would be too much work to harden them all, and it would be pointless
> because all these drivers are never legitimately needed in a virtualized
> environment which only virtualize a very small number of devices.

Why would you not want to properly review and fix up all kernel drivers?
That feels like you are being lazy.

What exactly are you meaning by "harden"?  Why isn't it automated?  Who
is doing this work?  Where is it being done?

Come on, you have a small number of virtio drivers, to somehow say that
they are now divided up into trusted/untrusted feels very very odd.

Just do the real work here, everyone will benefit, including yourself.

> >   How exactly are
> > you determining what should, and should not, be allowed?
> 
> Everything that has had reasonable effort at hardening can be added. But if
> someone proposes to add a driver that should trigger additional scrutiny in
> code review. We should also request them to do some fuzzing.

You can provide that fuzzing right now, why isn't syzbot running on
these interfaces today?

And again, what _exactly_ do you all mean by "hardening" that has
happened?  Where is that documented and who did that work?

> > And why not just put all of that into userspace and have it pick and
> > choose?  That should be the end-goal here, you don't want to encode
> > policy like this in the kernel, right?
> 
> How would user space know what drivers have been hardened? This is really
> something that the kernel needs to determine. I don't think we can outsource
> it to anyone else.

It would "know" just as well as you know today in the kernel.  There is
no difference here.

Just do the real work here, and "harden" all of the virtio drivers
please.  What prevents that?

> Also BTW of course user space can still override it, but really the defaults
> should be a kernel policy.
> 
> There's also the additional problem that one of the goals of confidential
> guest is to just move existing guest virtual images into them without much
> changes. So it's better for such a case if as much as possible of the policy
> is in the kernel. But that's more a secondary consideration, the first point
> is really the important part.

Where exactly are all of these "goals" and "requirements" documented and
who is defining them and forcing them on us without actually doing any
of the work involved?

thanks,

greg k-h


Re: [PATCH v2 4/6] virtio: Initialize authorized attribute for confidential guest

2021-10-02 Thread Greg Kroah-Hartman
On Sat, Oct 02, 2021 at 07:04:28AM -0400, Michael S. Tsirkin wrote:
> On Fri, Oct 01, 2021 at 08:49:28AM -0700, Andi Kleen wrote:
> > >   Do you have a list of specific drivers and kernel options that you
> > > feel you now "trust"?
> > 
> > For TDX it's currently only virtio net/block/console
> > 
> > But we expect this list to grow slightly over time, but not at a high rate
> > (so hopefully <10)
> 
> Well there are already >10 virtio drivers and I think it's reasonable
> that all of these will be used with encrypted guests. The list will
> grow.

What is keeping "all" drivers from being on this list?  How exactly are
you determining what should, and should not, be allowed?  How can
drivers move on, or off, of it over time?

And why not just put all of that into userspace and have it pick and
choose?  That should be the end-goal here, you don't want to encode
policy like this in the kernel, right?

thanks,

greg k-h


Re: [PATCH v2 4/6] virtio: Initialize authorized attribute for confidential guest

2021-10-01 Thread Greg Kroah-Hartman
On Thu, Sep 30, 2021 at 12:04:05PM -0700, Kuppuswamy, Sathyanarayanan wrote:
> 
> 
> On 9/30/21 8:23 AM, Greg Kroah-Hartman wrote:
> > On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> > > 
> > > 
> > > On 9/30/21 6:36 AM, Dan Williams wrote:
> > > > > And in particular, not all virtio drivers are hardened -
> > > > > I think at this point blk and scsi drivers have been hardened - so
> > > > > treating them all the same looks wrong.
> > > > My understanding was that they have been audited, Sathya?
> > > 
> > > Yes, AFAIK, it has been audited. Andi also submitted some patches
> > > related to it. Andi, can you confirm.
> > 
> > What is the official definition of "audited"?
> 
> 
> In our case (Confidential Computing platform), the host is an un-trusted
> entity. So any interaction with host from the drivers will have to be
> protected against the possible attack from the host. For example, if we
> are accessing a memory based on index value received from host, we have
> to make sure it does not lead to out of bound access or when sharing the
> memory with the host, we need to make sure only the required region is
> shared with the host and the memory is un-shared after use properly.

You have not defined the term "audited" here at all in any way that can
be reviewed or verified by anyone from what I can tell.

You have only described a new model that you wish the kernel to run in,
one in which it does not trust the hardware at all.  That is explicitly
NOT what the kernel has been designed for so far, and if you wish to
change that, lots of things need to be done outside of simply running
some fuzzers on a few random drivers.

For one example, how do you ensure that the memory you are reading from
hasn't been modified by the host between writing to it the last time you
did?  Do you have a list of specific drivers and kernel options that you
feel you now "trust"?  If so, how long does that trust last for?  Until
someone else modifies that code?  What about modifications to functions
that your "audited" code touches?  Who is doing this auditing?  How do
you know the auditing has been done correctly?  Who has reviewed and
audited the tools that are doing the auditing?  Where is the
specification that has been agreed on how the auditing must be done?
And so on...
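
To make the "modified by the host" concern concrete, here is a minimal
user-space sketch in C.  It is not code from any virtio driver; the names
shared_desc, private_table, RING_SIZE and lookup_slot() are hypothetical.
It combines the index check described in the quoted text with a single
fetch from shared memory, so the value that was validated is the value
that gets used:

```c
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical table size */

/* Hypothetical descriptor that lives in memory shared with the host. */
struct shared_desc {
	uint32_t index;	/* host-controlled slot number */
};

/* Guest-private table; the host cannot write to it. */
static int private_table[RING_SIZE];

/*
 * Read the host-controlled index from shared memory exactly once,
 * validate the private copy, and use only that copy afterwards.
 * Returns 0 on success, -1 if the index is out of range.
 */
static int lookup_slot(const volatile struct shared_desc *shared, int *out)
{
	/* Single fetch: later host writes cannot change what was checked. */
	uint32_t index = shared->index;

	if (index >= RING_SIZE)
		return -1;

	*out = private_table[index];
	return 0;
}
```

Reading shared->index a second time after the check would reopen the
window in which the host can swap in a different value, which is the kind
of issue a review checklist for "audited" code would have to spell out.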

I feel like there are a lot of different things all being mixed up here
into one "oh we want this to happen!" type of thread.  Please let's just
stick to the one request that I had here, which was to move the way that
busses are allowed to authorize the devices they wish to control into a
generic way instead of being bus-specific logic.

Any requests outside of that type of functionality are just that,
outside the scope of this patchset and should get their own patch series
and discussion.

thanks,

greg k-h


Re: [PATCH v2 2/6] driver core: Add common support to skip probe for un-authorized devices

2021-09-30 Thread Greg Kroah-Hartman
On Thu, Sep 30, 2021 at 12:15:16PM -0700, Andi Kleen wrote:
> 
> On 9/30/2021 10:23 AM, Greg Kroah-Hartman wrote:
> > On Thu, Sep 30, 2021 at 10:17:09AM -0700, Andi Kleen wrote:
> > > The legacy drivers could be fixed, but nobody really wants to touch them
> > > anymore and they're impossible to test.
> > Pointers to them?
> 
> For example if you look over old SCSI drivers in drivers/scsi/*.c there is a
> substantial number that has a module init longer than just registering a
> driver. As a single example look at drivers/scsi/BusLogic.c

Great, send patches to fix them up instead of adding new infrastructure
to the kernel.  It is better to remove code than add it.  You can rip
the ISA code out of that driver and then you will not have the issue
anymore.

Or again, just add that module to the deny list and never load it from
userspace.

> There were also quite a few platform drivers like this.

Of course, platform drivers are horrible abusers of this.  Just like the
recent one submitted by Intel that would bind to any machine it was
loaded on and did not actually do any hardware detection assuming that
it owned the platform:

https://lore.kernel.org/r/20210924213157.3584061-2-david.e@linux.intel.com

So yes, some drivers are horrible, it is our job to catch that and fix
it.  If you don't want to load those drivers on your system, we have
userspace solutions for that (you can have allow/deny lists there.)

> > > The drivers that probe something that is not enumerated in a standard way
> > > have no choice, it cannot be implemented in a different way.
> > PCI devices are not enumerated in a standard way???
> 
> The pci devices are enumerated in a standard way, but typically the driver
> also needs something else outside PCI that needs some other probing
> mechanism.

Like what?  What PCI drivers need outside connections to control the
hardware?

> > > So instead we're using a "firewall" that prevents these drivers from doing
> > > bad things by not allowing ioremap access unless opted in, and also do some
> > > filtering on the IO ports. The device filter is still the primary mechanism,
> > > the ioremap filtering is just belts and suspenders for those odd cases.
> > That's horrible, don't try to protect the kernel from itself.  Just fix
> > the drivers.
> 
> I thought we had already established this last time we discussed it.
> 
> That's completely impractical. We cannot harden thousands of drivers,
> especially since it would be all wasted work since nobody will ever need
> them in virtual guests. Even if we could harden them how would such a work
> be maintained long term? Using a firewall and filtering mechanism is much
> saner for everyone.

I agree, you can not "harden" anything here.  That is why I asked you to
use the existing process that explicitly moves the model to userspace
where a user can say "do I want this device to be controlled by the
kernel or not" which then allows you to pick and choose what drivers you
want to have in your system.

You need to trust devices, and not worry about trusting drivers as you
yourself admit :)

The kernel's trust model is that once we bind to them, we trust almost
all device types almost explicitly.  If you wish to change that model,
that's great, but it is a much larger discussion than this tiny patchset
would require.

thanks,

greg k-h


Re: [PATCH v2 2/6] driver core: Add common support to skip probe for un-authorized devices

2021-09-30 Thread Greg Kroah-Hartman
On Thu, Sep 30, 2021 at 10:17:09AM -0700, Andi Kleen wrote:
> 
> > > The "it" that I referred to is the claim that no driver should be
> > > touching hardware in their module init call. Andi seems to think
> > > such drivers are worth working around with a special remap API.
> > Andi is wrong.
> 
> While overall it's a small percentage of the total, there are still quite a
> few drivers that do touch hardware in init functions. Sometimes for good
> reasons -- they need to do some extra probing to discover something that is
> not enumerated -- sometimes just because it's very old legacy code that
> predates the modern driver model.

Are any of them in the kernel today?

PCI drivers should not be messing with this, we have had well over a
decade to fix that up.

> The legacy drivers could be fixed, but nobody really wants to touch them
> anymore and they're impossible to test.

Pointers to them?

> The drivers that probe something that is not enumerated in a standard way
> have no choice, it cannot be implemented in a different way.

PCI devices are not enumerated in a standard way???

> So instead we're using a "firewall" that prevents these drivers from doing
> bad things by not allowing ioremap access unless opted in, and also do some
> filtering on the IO ports. The device filter is still the primary mechanism,
> the ioremap filtering is just belts and suspenders for those odd cases.

That's horrible, don't try to protect the kernel from itself.  Just fix
the drivers.

If you point me at them, I will be glad to have a look and throw some
interns on them.

But really, you all could have fixed them up by now if Intel really
cared about it :(

> If you want we can send an exact list, we did some analysis using a patched
> smatch tool.

Please do.

thanks,

greg k-h


Re: [PATCH v2 4/6] virtio: Initialize authorized attribute for confidential guest

2021-09-30 Thread Greg Kroah-Hartman
On Thu, Sep 30, 2021 at 08:18:18AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> 
> 
> On 9/30/21 6:36 AM, Dan Williams wrote:
> > > And in particular, not all virtio drivers are hardened -
> > > I think at this point blk and scsi drivers have been hardened - so
> > > treating them all the same looks wrong.
> > My understanding was that they have been audited, Sathya?
> 
> Yes, AFAIK, it has been audited. Andi also submitted some patches
> related to it. Andi, can you confirm.

What is the official definition of "audited"?

thanks,

greg k-h


Re: [PATCH v2 2/6] driver core: Add common support to skip probe for un-authorized devices

2021-09-30 Thread Greg Kroah-Hartman
On Thu, Sep 30, 2021 at 11:00:07AM -0400, Michael S. Tsirkin wrote:
> On Thu, Sep 30, 2021 at 04:49:23PM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Sep 30, 2021 at 10:38:42AM -0400, Michael S. Tsirkin wrote:
> > > On Thu, Sep 30, 2021 at 03:52:52PM +0200, Greg Kroah-Hartman wrote:
> > > > On Thu, Sep 30, 2021 at 06:59:36AM -0400, Michael S. Tsirkin wrote:
> > > > > On Wed, Sep 29, 2021 at 06:05:07PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > > > > While the common case for device-authorization is to skip probe of
> > > > > > unauthorized devices, some buses may still want to emit a message on
> > > > > > probe failure (Thunderbolt), or base probe failures on the
> > > > > > authorization status of a related device like a parent (USB). So add
> > > > > > an option (has_probe_authorization) in struct bus_type for the bus
> > > > > > driver to own probe authorization policy.
> > > > > > 
> > > > > > Reviewed-by: Dan Williams 
> > > > > > Signed-off-by: Kuppuswamy Sathyanarayanan 
> > > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > So what e.g. the PCI patch
> > > > > https://lore.kernel.org/all/CACK8Z6E8pjVeC934oFgr=vb3pulx_gyt2nkzaogdrqj9tks...@mail.gmail.com/
> > > > > actually proposes is a list of
> > > > > allowed drivers, not devices. Doing it at the device level
> > > > > has disadvantages, for example some devices might have a legacy
> > > > > unsafe driver, or an out of tree driver. It also does not
> > > > > address drivers that poke at hardware during init.
> > > > 
> > > > Doing it at a device level is the only sane way to do this.
> > > > 
> > > > A user needs to say "this device is allowed to be controlled by this
> > > > driver".  This is the trust model that USB has had for over a decade and
> > > > what thunderbolt also has.
> > > > 
> > > > > Accordingly, I think the right thing to do is to skip
> > > > > driver init for disallowed drivers, not skip probe
> > > > > for specific devices.
> > > > 
> > > > What do you mean by "driver init"?  module_init()?
> > > > 
> > > > No driver should be touching hardware in their module init call.  They
> > > > should only be touching it in the probe callback as that is the only
> > > > time they are ever allowed to talk to hardware.  Specifically the device
> > > > that has been handed to them.
> > > > 
> > > > If there are in-kernel PCI drivers that do not do this, they need to be
> > > > fixed today.
> > > > 
> > > > We don't care about out-of-tree drivers for obvious reasons that we have
> > > > no control over them.
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > 
> > > Well talk to Andi about it pls :)
> > > https://lore.kernel.org/r/ad1e41d1-3f4e-8982-16ea-18a3b2c04019%40linux.intel.com
> > 
> > As Alan said, the minute you allow any driver to get into your kernel,
> > it can do anything it wants to.
> > 
> > So just don't allow drivers to be added to your kernel if you care about
> > these things.  The system owner has that mechanism today.
> > 
> > thanks,
> > 
> > greg k-h
> 
> The "it" that I referred to is the claim that no driver should be
> touching hardware in their module init call. Andi seems to think
> such drivers are worth working around with a special remap API.

Andi is wrong.


Re: [PATCH v2 2/6] driver core: Add common support to skip probe for un-authorized devices

2021-09-30 Thread Greg Kroah-Hartman
On Thu, Sep 30, 2021 at 10:38:42AM -0400, Michael S. Tsirkin wrote:
> On Thu, Sep 30, 2021 at 03:52:52PM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Sep 30, 2021 at 06:59:36AM -0400, Michael S. Tsirkin wrote:
> > > On Wed, Sep 29, 2021 at 06:05:07PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > > While the common case for device-authorization is to skip probe of
> > > > unauthorized devices, some buses may still want to emit a message on
> > > > probe failure (Thunderbolt), or base probe failures on the
> > > > authorization status of a related device like a parent (USB). So add
> > > > an option (has_probe_authorization) in struct bus_type for the bus
> > > > driver to own probe authorization policy.
> > > > 
> > > > Reviewed-by: Dan Williams 
> > > > Signed-off-by: Kuppuswamy Sathyanarayanan 
> > > > 
> > > 
> > > 
> > > 
> > > So what e.g. the PCI patch
> > > https://lore.kernel.org/all/CACK8Z6E8pjVeC934oFgr=vb3pulx_gyt2nkzaogdrqj9tks...@mail.gmail.com/
> > > actually proposes is a list of
> > > allowed drivers, not devices. Doing it at the device level
> > > has disadvantages, for example some devices might have a legacy
> > > unsafe driver, or an out of tree driver. It also does not
> > > address drivers that poke at hardware during init.
> > 
> > Doing it at a device level is the only sane way to do this.
> > 
> > A user needs to say "this device is allowed to be controlled by this
> > driver".  This is the trust model that USB has had for over a decade and
> > what thunderbolt also has.
> > 
> > > Accordingly, I think the right thing to do is to skip
> > > driver init for disallowed drivers, not skip probe
> > > for specific devices.
> > 
> > What do you mean by "driver init"?  module_init()?
> > 
> > No driver should be touching hardware in their module init call.  They
> > should only be touching it in the probe callback as that is the only
> > time they are ever allowed to talk to hardware.  Specifically the device
> > that has been handed to them.
> > 
> > If there are in-kernel PCI drivers that do not do this, they need to be
> > fixed today.
> > 
> > We don't care about out-of-tree drivers for obvious reasons that we have
> > no control over them.
> > 
> > thanks,
> > 
> > greg k-h
> 
> Well talk to Andi about it pls :)
> https://lore.kernel.org/r/ad1e41d1-3f4e-8982-16ea-18a3b2c04019%40linux.intel.com

As Alan said, the minute you allow any driver to get into your kernel,
it can do anything it wants to.

So just don't allow drivers to be added to your kernel if you care about
these things.  The system owner has that mechanism today.

thanks,

greg k-h


Re: [PATCH v2 2/6] driver core: Add common support to skip probe for un-authorized devices

2021-09-30 Thread Greg Kroah-Hartman
On Thu, Sep 30, 2021 at 06:59:36AM -0400, Michael S. Tsirkin wrote:
> On Wed, Sep 29, 2021 at 06:05:07PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > While the common case for device-authorization is to skip probe of
> > unauthorized devices, some buses may still want to emit a message on
> > probe failure (Thunderbolt), or base probe failures on the
> > authorization status of a related device like a parent (USB). So add
> > an option (has_probe_authorization) in struct bus_type for the bus
> > driver to own probe authorization policy.
> > 
> > Reviewed-by: Dan Williams 
> > Signed-off-by: Kuppuswamy Sathyanarayanan 
> > 
> 
> 
> 
> So what e.g. the PCI patch
> https://lore.kernel.org/all/CACK8Z6E8pjVeC934oFgr=vb3pulx_gyt2nkzaogdrqj9tks...@mail.gmail.com/
> actually proposes is a list of
> allowed drivers, not devices. Doing it at the device level
> has disadvantages, for example some devices might have a legacy
> unsafe driver, or an out of tree driver. It also does not
> address drivers that poke at hardware during init.

Doing it at a device level is the only sane way to do this.

A user needs to say "this device is allowed to be controlled by this
driver".  This is the trust model that USB has had for over a decade and
what thunderbolt also has.

> Accordingly, I think the right thing to do is to skip
> driver init for disallowed drivers, not skip probe
> for specific devices.

What do you mean by "driver init"?  module_init()?

No driver should be touching hardware in their module init call.  They
should only be touching it in the probe callback as that is the only
time they are ever allowed to talk to hardware.  Specifically the device
that has been handed to them.

If there are in-kernel PCI drivers that do not do this, they need to be
fixed today.

We don't care about out-of-tree drivers for obvious reasons that we have
no control over them.
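
The split between module init and probe, together with a per-device
"authorized" flag, can be sketched as a small user-space model in C.
This is only an illustration under stated assumptions: struct device and
struct driver here are simplified stand-ins, and demo_driver_register(),
demo_probe() and bus_try_bind() are hypothetical names, not the driver
core API or the code from this patch series.

```c
#include <stdbool.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct device {
	const char *name;
	bool authorized;	/* policy decision, e.g. made by user space */
	int hw_accesses;	/* counts simulated register accesses */
};

struct driver {
	const char *name;
	int (*probe)(struct device *dev);
	bool registered;
};

/* "Module init": registers the driver and touches no hardware. */
static void demo_driver_register(struct driver *drv)
{
	drv->registered = true;
}

/* Probe: the only place the driver talks to the device handed to it. */
static int demo_probe(struct device *dev)
{
	dev->hw_accesses++;
	return 0;
}

/*
 * Core-side binding: probe is skipped entirely for unauthorized devices.
 * Returns true if the driver was bound to the device.
 */
static bool bus_try_bind(struct driver *drv, struct device *dev)
{
	if (!drv->registered || !dev->authorized)
		return false;
	return drv->probe(dev) == 0;
}
```

In this model an unauthorized device is never touched, because the only
hardware access sits behind probe and the core decides whether probe runs.
A driver that pokes at hardware during init would bypass that check, which
is why such drivers are treated as bugs to fix.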

thanks,

greg k-h


Re: [PATCH v2 4/6] virtio: Initialize authorized attribute for confidential guest

2021-09-30 Thread Greg Kroah-Hartman
On Thu, Sep 30, 2021 at 06:36:18AM -0700, Dan Williams wrote:
> On Thu, Sep 30, 2021 at 4:03 AM Michael S. Tsirkin  wrote:
> >
> > On Wed, Sep 29, 2021 at 06:05:09PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > Confidential guest platforms like TDX have a requirement to allow
> > > only trusted devices. By default the confidential-guest core will
> > > arrange for all devices to default to unauthorized (via
> > > dev_default_authorization) in device_initialize(). Since virtio
> > > driver is already hardened against the attack from the un-trusted host,
> > > override the confidential computing default unauthorized state
> > >
> > > Reviewed-by: Dan Williams 
> > > Signed-off-by: Kuppuswamy Sathyanarayanan 
> > > 
> >
> > Architecturally this all looks backwards. IIUC nothing about virtio
> > makes it authorized or trusted. The driver is hardened,
> > true, but this should be set at the driver not the device level.
> 
> That was my initial reaction to this proposal as well, and I ended
> up leading Sathya astray from what Greg wanted. Greg rightly points
> out that the "authorized" attribute from USB and Thunderbolt already
> exists [1] [2]. So the choice is find an awkward way to mix driver
> trust with existing bus-local "authorized" mechanisms, or promote the
> authorized capability to the driver-core. This patch set implements
> the latter to keep the momentum on the already shipping design scheme
> to not add to the driver-core maintenance burden.
> 
> [1]: https://lore.kernel.org/all/yquaj78y8j1um...@kroah.com/
> [2]: https://lore.kernel.org/all/yqzf%2futgrjfbz...@kroah.com/
> 
> > And in particular, not all virtio drivers are hardened -
> > I think at this point blk and scsi drivers have been hardened - so
> > treating them all the same looks wrong.
> 
> My understanding was that they have been audited, Sathya?

Please define "audited" and "trusted" here.

thanks,

greg k-h


[PATCH 5.10 054/122] mm/memory_hotplug: use "unsigned long" for PFN in zone_for_pfn_range()

2021-09-20 Thread Greg Kroah-Hartman
From: David Hildenbrand 

commit 7cf209ba8a86410939a24cb1aeb279479a7e0ca6 upstream.

Patch series "mm/memory_hotplug: preparatory patches for new online policy and memory"

These are all cleanups and one fix previously sent as part of [1]:
[PATCH v1 00/12] mm/memory_hotplug: "auto-movable" online policy and memory
groups.

These patches make sense even without the other series, therefore I pulled
them out to make the other series easier to digest.

[1] https://lkml.kernel.org/r/20210607195430.48228-1-da...@redhat.com

This patch (of 4):

Checkpatch complained on a follow-up patch that we are using "unsigned"
here, which defaults to "unsigned int" and checkpatch is correct.

As we will search for a fitting zone using the wrong pfn, we might end
up onlining memory to one of the special kernel zones, such as ZONE_DMA,
which can end badly as the onlined memory does not satisfy properties of
these zones.

Use "unsigned long" instead, just as we do in other places when handling
PFNs.  This can bite us once we have physical addresses in the range of
multiple TB.
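
The truncation described above can be reproduced in a few lines of
user-space C.  This is a sketch, not kernel code: fixed-width types stand
in for "unsigned" (32 bits on Linux) and "unsigned long" (64 bits on
64-bit Linux), PAGE_SHIFT is assumed to be 12 (4 KiB pages), and both
function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* 4 KiB pages, assumed for illustration */

/* Before the fix: the PFN is passed through a 32-bit "unsigned". */
static uint64_t pfn_seen_as_unsigned(uint64_t pfn)
{
	uint32_t start_pfn = (uint32_t)pfn;	/* high bits silently dropped */

	return start_pfn;
}

/* After the fix: a 64-bit "unsigned long" keeps the full PFN. */
static uint64_t pfn_seen_as_unsigned_long(uint64_t pfn)
{
	return pfn;
}
```

With 4 KiB pages, a physical address of 16 TiB corresponds to PFN 2^32,
which becomes 0 when squeezed into 32 bits.  The zone lookup would then
run against the very start of physical memory, which is how memory could
end up in a zone such as ZONE_DMA.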

Link: https://lkml.kernel.org/r/20210712124052.26491-2-da...@redhat.com
Fixes: e5e689302633 ("mm, memory_hotplug: display allowed zones in the preferred ordering")
Signed-off-by: David Hildenbrand 
Reviewed-by: Pankaj Gupta 
Reviewed-by: Muchun Song 
Reviewed-by: Oscar Salvador 
Cc: David Hildenbrand 
Cc: Vitaly Kuznetsov 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Pankaj Gupta 
Cc: Wei Yang 
Cc: Michal Hocko 
Cc: Dan Williams 
Cc: Anshuman Khandual 
Cc: Dave Hansen 
Cc: Vlastimil Babka 
Cc: Mike Rapoport 
Cc: "Rafael J. Wysocki" 
Cc: Len Brown 
Cc: Pavel Tatashin 
Cc: Heiko Carstens 
Cc: Michael Ellerman 
Cc: Catalin Marinas 
Cc: virtualization@lists.linux-foundation.org
Cc: Andy Lutomirski 
Cc: "Aneesh Kumar K.V" 
Cc: Anton Blanchard 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Christophe Leroy 
Cc: Dave Jiang 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Jia He 
Cc: Joe Perches 
Cc: Kefeng Wang 
Cc: Laurent Dufour 
Cc: Michel Lespinasse 
Cc: Nathan Lynch 
Cc: Nicholas Piggin 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Pierre Morel 
Cc: "Rafael J. Wysocki" 
Cc: Rich Felker 
Cc: Scott Cheloha 
Cc: Sergei Trofimovich 
Cc: Thiago Jung Bauermann 
Cc: Thomas Gleixner 
Cc: Vasily Gorbik 
Cc: Vishal Verma 
Cc: Will Deacon 
Cc: Yoshinori Sato 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: David Hildenbrand 
Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/memory_hotplug.h |    4 ++--
 mm/memory_hotplug.c            |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -359,8 +359,8 @@ extern void sparse_remove_section(struct
 		unsigned long map_offset, struct vmem_altmap *altmap);
 extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
 					  unsigned long pnum);
-extern struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-		unsigned long nr_pages);
+extern struct zone *zone_for_pfn_range(int online_type, int nid,
+		unsigned long start_pfn, unsigned long nr_pages);
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
 #endif /* __LINUX_MEMORY_HOTPLUG_H */
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -765,8 +765,8 @@ static inline struct zone *default_zone_
 	return movable_node_enabled ? movable_zone : kernel_zone;
 }
 
-struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-		unsigned long nr_pages)
+struct zone *zone_for_pfn_range(int online_type, int nid,
+		unsigned long start_pfn, unsigned long nr_pages)
 {
 	if (online_type == MMOP_ONLINE_KERNEL)
 		return default_kernel_zone_for_pfn(nid, start_pfn, nr_pages);




[PATCH 5.4 232/260] mm/memory_hotplug: use "unsigned long" for PFN in zone_for_pfn_range()

2021-09-20 Thread Greg Kroah-Hartman
From: David Hildenbrand 

commit 7cf209ba8a86410939a24cb1aeb279479a7e0ca6 upstream.

Patch series "mm/memory_hotplug: preparatory patches for new online policy and memory"

These are all cleanups and one fix previously sent as part of [1]:
[PATCH v1 00/12] mm/memory_hotplug: "auto-movable" online policy and memory
groups.

These patches make sense even without the other series, therefore I pulled
them out to make the other series easier to digest.

[1] https://lkml.kernel.org/r/20210607195430.48228-1-da...@redhat.com

This patch (of 4):

Checkpatch complained on a follow-up patch that we are using "unsigned"
here, which defaults to "unsigned int" and checkpatch is correct.

As we will search for a fitting zone using the wrong pfn, we might end
up onlining memory to one of the special kernel zones, such as ZONE_DMA,
which can end badly as the onlined memory does not satisfy properties of
these zones.

Use "unsigned long" instead, just as we do in other places when handling
PFNs.  This can bite us once we have physical addresses in the range of
multiple TB.

Link: https://lkml.kernel.org/r/20210712124052.26491-2-da...@redhat.com
Fixes: e5e689302633 ("mm, memory_hotplug: display allowed zones in the preferred ordering")
Signed-off-by: David Hildenbrand 
Reviewed-by: Pankaj Gupta 
Reviewed-by: Muchun Song 
Reviewed-by: Oscar Salvador 
Cc: David Hildenbrand 
Cc: Vitaly Kuznetsov 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Pankaj Gupta 
Cc: Wei Yang 
Cc: Michal Hocko 
Cc: Dan Williams 
Cc: Anshuman Khandual 
Cc: Dave Hansen 
Cc: Vlastimil Babka 
Cc: Mike Rapoport 
Cc: "Rafael J. Wysocki" 
Cc: Len Brown 
Cc: Pavel Tatashin 
Cc: Heiko Carstens 
Cc: Michael Ellerman 
Cc: Catalin Marinas 
Cc: virtualization@lists.linux-foundation.org
Cc: Andy Lutomirski 
Cc: "Aneesh Kumar K.V" 
Cc: Anton Blanchard 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Christophe Leroy 
Cc: Dave Jiang 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Jia He 
Cc: Joe Perches 
Cc: Kefeng Wang 
Cc: Laurent Dufour 
Cc: Michel Lespinasse 
Cc: Nathan Lynch 
Cc: Nicholas Piggin 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Pierre Morel 
Cc: "Rafael J. Wysocki" 
Cc: Rich Felker 
Cc: Scott Cheloha 
Cc: Sergei Trofimovich 
Cc: Thiago Jung Bauermann 
Cc: Thomas Gleixner 
Cc: Vasily Gorbik 
Cc: Vishal Verma 
Cc: Will Deacon 
Cc: Yoshinori Sato 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: David Hildenbrand 
Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/memory_hotplug.h |    4 ++--
 mm/memory_hotplug.c            |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -358,6 +358,6 @@ extern struct page *sparse_decode_mem_ma
  unsigned long pnum);
 extern bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages,
int online_type);
-extern struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-   unsigned long nr_pages);
+extern struct zone *zone_for_pfn_range(int online_type, int nid,
+   unsigned long start_pfn, unsigned long nr_pages);
 #endif /* __LINUX_MEMORY_HOTPLUG_H */
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -775,8 +775,8 @@ static inline struct zone *default_zone_
return movable_node_enabled ? movable_zone : kernel_zone;
 }
 
-struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-   unsigned long nr_pages)
+struct zone *zone_for_pfn_range(int online_type, int nid,
+   unsigned long start_pfn, unsigned long nr_pages)
 {
if (online_type == MMOP_ONLINE_KERNEL)
return default_kernel_zone_for_pfn(nid, start_pfn, nr_pages);




[PATCH 4.19 273/293] mm/memory_hotplug: use "unsigned long" for PFN in zone_for_pfn_range()

2021-09-20 Thread Greg Kroah-Hartman
From: David Hildenbrand 

commit 7cf209ba8a86410939a24cb1aeb279479a7e0ca6 upstream.

Patch series "mm/memory_hotplug: preparatory patches for new online policy and memory"

These are all cleanups and one fix previously sent as part of [1]:
[PATCH v1 00/12] mm/memory_hotplug: "auto-movable" online policy and memory
groups.

These patches make sense even without the other series, therefore I pulled
them out to make the other series easier to digest.

[1] https://lkml.kernel.org/r/20210607195430.48228-1-da...@redhat.com

This patch (of 4):

Checkpatch complained on a follow-up patch that we are using "unsigned"
here, which defaults to "unsigned int" and checkpatch is correct.

As we will search for a fitting zone using the wrong pfn, we might end
up onlining memory to one of the special kernel zones, such as ZONE_DMA,
which can end badly as the onlined memory does not satisfy properties of
these zones.

Use "unsigned long" instead, just as we do in other places when handling
PFNs.  This can bite us once we have physical addresses in the range of
multiple TB.

Link: https://lkml.kernel.org/r/20210712124052.26491-2-da...@redhat.com
Fixes: e5e689302633 ("mm, memory_hotplug: display allowed zones in the preferred ordering")
Signed-off-by: David Hildenbrand 
Reviewed-by: Pankaj Gupta 
Reviewed-by: Muchun Song 
Reviewed-by: Oscar Salvador 
Cc: David Hildenbrand 
Cc: Vitaly Kuznetsov 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Pankaj Gupta 
Cc: Wei Yang 
Cc: Michal Hocko 
Cc: Dan Williams 
Cc: Anshuman Khandual 
Cc: Dave Hansen 
Cc: Vlastimil Babka 
Cc: Mike Rapoport 
Cc: "Rafael J. Wysocki" 
Cc: Len Brown 
Cc: Pavel Tatashin 
Cc: Heiko Carstens 
Cc: Michael Ellerman 
Cc: Catalin Marinas 
Cc: virtualization@lists.linux-foundation.org
Cc: Andy Lutomirski 
Cc: "Aneesh Kumar K.V" 
Cc: Anton Blanchard 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Christophe Leroy 
Cc: Dave Jiang 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Jia He 
Cc: Joe Perches 
Cc: Kefeng Wang 
Cc: Laurent Dufour 
Cc: Michel Lespinasse 
Cc: Nathan Lynch 
Cc: Nicholas Piggin 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Pierre Morel 
Cc: "Rafael J. Wysocki" 
Cc: Rich Felker 
Cc: Scott Cheloha 
Cc: Sergei Trofimovich 
Cc: Thiago Jung Bauermann 
Cc: Thomas Gleixner 
Cc: Vasily Gorbik 
Cc: Vishal Verma 
Cc: Will Deacon 
Cc: Yoshinori Sato 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: David Hildenbrand 
Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/memory_hotplug.h |    4 ++--
 mm/memory_hotplug.c            |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -344,6 +344,6 @@ extern struct page *sparse_decode_mem_ma
  unsigned long pnum);
 extern bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages,
int online_type);
-extern struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-   unsigned long nr_pages);
+extern struct zone *zone_for_pfn_range(int online_type, int nid,
+   unsigned long start_pfn, unsigned long nr_pages);
 #endif /* __LINUX_MEMORY_HOTPLUG_H */
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -783,8 +783,8 @@ static inline struct zone *default_zone_
return movable_node_enabled ? movable_zone : kernel_zone;
 }
 
-struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-   unsigned long nr_pages)
+struct zone *zone_for_pfn_range(int online_type, int nid,
+   unsigned long start_pfn, unsigned long nr_pages)
 {
if (online_type == MMOP_ONLINE_KERNEL)
return default_kernel_zone_for_pfn(nid, start_pfn, nr_pages);




[PATCH 4.14 205/217] mm/memory_hotplug: use "unsigned long" for PFN in zone_for_pfn_range()

2021-09-20 Thread Greg Kroah-Hartman
From: David Hildenbrand 

commit 7cf209ba8a86410939a24cb1aeb279479a7e0ca6 upstream.

Patch series "mm/memory_hotplug: preparatory patches for new online policy and memory"

These are all cleanups and one fix previously sent as part of [1]:
[PATCH v1 00/12] mm/memory_hotplug: "auto-movable" online policy and memory
groups.

These patches make sense even without the other series, therefore I pulled
them out to make the other series easier to digest.

[1] https://lkml.kernel.org/r/20210607195430.48228-1-da...@redhat.com

This patch (of 4):

Checkpatch complained on a follow-up patch that we are using "unsigned"
here, which defaults to "unsigned int" and checkpatch is correct.

As we will search for a fitting zone using the wrong pfn, we might end
up onlining memory to one of the special kernel zones, such as ZONE_DMA,
which can end badly as the onlined memory does not satisfy properties of
these zones.

Use "unsigned long" instead, just as we do in other places when handling
PFNs.  This can bite us once we have physical addresses in the range of
multiple TB.

Link: https://lkml.kernel.org/r/20210712124052.26491-2-da...@redhat.com
Fixes: e5e689302633 ("mm, memory_hotplug: display allowed zones in the preferred ordering")
Signed-off-by: David Hildenbrand 
Reviewed-by: Pankaj Gupta 
Reviewed-by: Muchun Song 
Reviewed-by: Oscar Salvador 
Cc: David Hildenbrand 
Cc: Vitaly Kuznetsov 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Pankaj Gupta 
Cc: Wei Yang 
Cc: Michal Hocko 
Cc: Dan Williams 
Cc: Anshuman Khandual 
Cc: Dave Hansen 
Cc: Vlastimil Babka 
Cc: Mike Rapoport 
Cc: "Rafael J. Wysocki" 
Cc: Len Brown 
Cc: Pavel Tatashin 
Cc: Heiko Carstens 
Cc: Michael Ellerman 
Cc: Catalin Marinas 
Cc: virtualization@lists.linux-foundation.org
Cc: Andy Lutomirski 
Cc: "Aneesh Kumar K.V" 
Cc: Anton Blanchard 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Christophe Leroy 
Cc: Dave Jiang 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Jia He 
Cc: Joe Perches 
Cc: Kefeng Wang 
Cc: Laurent Dufour 
Cc: Michel Lespinasse 
Cc: Nathan Lynch 
Cc: Nicholas Piggin 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Pierre Morel 
Cc: "Rafael J. Wysocki" 
Cc: Rich Felker 
Cc: Scott Cheloha 
Cc: Sergei Trofimovich 
Cc: Thiago Jung Bauermann 
Cc: Thomas Gleixner 
Cc: Vasily Gorbik 
Cc: Vishal Verma 
Cc: Will Deacon 
Cc: Yoshinori Sato 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: David Hildenbrand 
Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/memory_hotplug.h |    4 ++--
 mm/memory_hotplug.c            |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -332,6 +332,6 @@ extern struct page *sparse_decode_mem_ma
  unsigned long pnum);
 extern bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages,
int online_type);
-extern struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-   unsigned long nr_pages);
+extern struct zone *zone_for_pfn_range(int online_type, int nid,
+   unsigned long start_pfn, unsigned long nr_pages);
 #endif /* __LINUX_MEMORY_HOTPLUG_H */
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -842,8 +842,8 @@ static inline struct zone *default_zone_
return movable_node_enabled ? movable_zone : kernel_zone;
 }
 
-struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-   unsigned long nr_pages)
+struct zone *zone_for_pfn_range(int online_type, int nid,
+   unsigned long start_pfn, unsigned long nr_pages)
 {
if (online_type == MMOP_ONLINE_KERNEL)
return default_kernel_zone_for_pfn(nid, start_pfn, nr_pages);




[PATCH 5.14 404/432] mm/memory_hotplug: use "unsigned long" for PFN in zone_for_pfn_range()

2021-09-16 Thread Greg Kroah-Hartman
From: David Hildenbrand 

commit 7cf209ba8a86410939a24cb1aeb279479a7e0ca6 upstream.

Patch series "mm/memory_hotplug: preparatory patches for new online policy and memory"

These are all cleanups and one fix previously sent as part of [1]:
[PATCH v1 00/12] mm/memory_hotplug: "auto-movable" online policy and memory
groups.

These patches make sense even without the other series, therefore I pulled
them out to make the other series easier to digest.

[1] https://lkml.kernel.org/r/20210607195430.48228-1-da...@redhat.com

This patch (of 4):

Checkpatch complained on a follow-up patch that we are using "unsigned"
here, which defaults to "unsigned int" and checkpatch is correct.

As we will search for a fitting zone using the wrong pfn, we might end
up onlining memory to one of the special kernel zones, such as ZONE_DMA,
which can end badly as the onlined memory does not satisfy properties of
these zones.

Use "unsigned long" instead, just as we do in other places when handling
PFNs.  This can bite us once we have physical addresses in the range of
multiple TB.

Link: https://lkml.kernel.org/r/20210712124052.26491-2-da...@redhat.com
Fixes: e5e689302633 ("mm, memory_hotplug: display allowed zones in the preferred ordering")
Signed-off-by: David Hildenbrand 
Reviewed-by: Pankaj Gupta 
Reviewed-by: Muchun Song 
Reviewed-by: Oscar Salvador 
Cc: David Hildenbrand 
Cc: Vitaly Kuznetsov 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Pankaj Gupta 
Cc: Wei Yang 
Cc: Michal Hocko 
Cc: Dan Williams 
Cc: Anshuman Khandual 
Cc: Dave Hansen 
Cc: Vlastimil Babka 
Cc: Mike Rapoport 
Cc: "Rafael J. Wysocki" 
Cc: Len Brown 
Cc: Pavel Tatashin 
Cc: Heiko Carstens 
Cc: Michael Ellerman 
Cc: Catalin Marinas 
Cc: virtualization@lists.linux-foundation.org
Cc: Andy Lutomirski 
Cc: "Aneesh Kumar K.V" 
Cc: Anton Blanchard 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Christophe Leroy 
Cc: Dave Jiang 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Jia He 
Cc: Joe Perches 
Cc: Kefeng Wang 
Cc: Laurent Dufour 
Cc: Michel Lespinasse 
Cc: Nathan Lynch 
Cc: Nicholas Piggin 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Pierre Morel 
Cc: "Rafael J. Wysocki" 
Cc: Rich Felker 
Cc: Scott Cheloha 
Cc: Sergei Trofimovich 
Cc: Thiago Jung Bauermann 
Cc: Thomas Gleixner 
Cc: Vasily Gorbik 
Cc: Vishal Verma 
Cc: Will Deacon 
Cc: Yoshinori Sato 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/memory_hotplug.h |    4 ++--
 mm/memory_hotplug.c            |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -339,8 +339,8 @@ extern void sparse_remove_section(struct
unsigned long map_offset, struct vmem_altmap *altmap);
 extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
  unsigned long pnum);
-extern struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-   unsigned long nr_pages);
+extern struct zone *zone_for_pfn_range(int online_type, int nid,
+   unsigned long start_pfn, unsigned long nr_pages);
 extern int arch_create_linear_mapping(int nid, u64 start, u64 size,
  struct mhp_params *params);
 void arch_remove_linear_mapping(u64 start, u64 size);
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -708,8 +708,8 @@ static inline struct zone *default_zone_
return movable_node_enabled ? movable_zone : kernel_zone;
 }
 
-struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-   unsigned long nr_pages)
+struct zone *zone_for_pfn_range(int online_type, int nid,
+   unsigned long start_pfn, unsigned long nr_pages)
 {
if (online_type == MMOP_ONLINE_KERNEL)
return default_kernel_zone_for_pfn(nid, start_pfn, nr_pages);




[PATCH 5.13 358/380] mm/memory_hotplug: use "unsigned long" for PFN in zone_for_pfn_range()

2021-09-16 Thread Greg Kroah-Hartman
From: David Hildenbrand 

commit 7cf209ba8a86410939a24cb1aeb279479a7e0ca6 upstream.

Patch series "mm/memory_hotplug: preparatory patches for new online policy and memory"

These are all cleanups and one fix previously sent as part of [1]:
[PATCH v1 00/12] mm/memory_hotplug: "auto-movable" online policy and memory
groups.

These patches make sense even without the other series, therefore I pulled
them out to make the other series easier to digest.

[1] https://lkml.kernel.org/r/20210607195430.48228-1-da...@redhat.com

This patch (of 4):

Checkpatch complained on a follow-up patch that we are using "unsigned"
here, which defaults to "unsigned int" and checkpatch is correct.

As we will search for a fitting zone using the wrong pfn, we might end
up onlining memory to one of the special kernel zones, such as ZONE_DMA,
which can end badly as the onlined memory does not satisfy properties of
these zones.

Use "unsigned long" instead, just as we do in other places when handling
PFNs.  This can bite us once we have physical addresses in the range of
multiple TB.

Link: https://lkml.kernel.org/r/20210712124052.26491-2-da...@redhat.com
Fixes: e5e689302633 ("mm, memory_hotplug: display allowed zones in the preferred ordering")
Signed-off-by: David Hildenbrand 
Reviewed-by: Pankaj Gupta 
Reviewed-by: Muchun Song 
Reviewed-by: Oscar Salvador 
Cc: David Hildenbrand 
Cc: Vitaly Kuznetsov 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Pankaj Gupta 
Cc: Wei Yang 
Cc: Michal Hocko 
Cc: Dan Williams 
Cc: Anshuman Khandual 
Cc: Dave Hansen 
Cc: Vlastimil Babka 
Cc: Mike Rapoport 
Cc: "Rafael J. Wysocki" 
Cc: Len Brown 
Cc: Pavel Tatashin 
Cc: Heiko Carstens 
Cc: Michael Ellerman 
Cc: Catalin Marinas 
Cc: virtualization@lists.linux-foundation.org
Cc: Andy Lutomirski 
Cc: "Aneesh Kumar K.V" 
Cc: Anton Blanchard 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Christophe Leroy 
Cc: Dave Jiang 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Jia He 
Cc: Joe Perches 
Cc: Kefeng Wang 
Cc: Laurent Dufour 
Cc: Michel Lespinasse 
Cc: Nathan Lynch 
Cc: Nicholas Piggin 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Pierre Morel 
Cc: "Rafael J. Wysocki" 
Cc: Rich Felker 
Cc: Scott Cheloha 
Cc: Sergei Trofimovich 
Cc: Thiago Jung Bauermann 
Cc: Thomas Gleixner 
Cc: Vasily Gorbik 
Cc: Vishal Verma 
Cc: Will Deacon 
Cc: Yoshinori Sato 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/memory_hotplug.h |    4 ++--
 mm/memory_hotplug.c            |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -366,8 +366,8 @@ extern void sparse_remove_section(struct
unsigned long map_offset, struct vmem_altmap *altmap);
 extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
  unsigned long pnum);
-extern struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-   unsigned long nr_pages);
+extern struct zone *zone_for_pfn_range(int online_type, int nid,
+   unsigned long start_pfn, unsigned long nr_pages);
 extern int arch_create_linear_mapping(int nid, u64 start, u64 size,
  struct mhp_params *params);
 void arch_remove_linear_mapping(u64 start, u64 size);
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -834,8 +834,8 @@ static inline struct zone *default_zone_
return movable_node_enabled ? movable_zone : kernel_zone;
 }
 
-struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
-   unsigned long nr_pages)
+struct zone *zone_for_pfn_range(int online_type, int nid,
+   unsigned long start_pfn, unsigned long nr_pages)
 {
if (online_type == MMOP_ONLINE_KERNEL)
return default_kernel_zone_for_pfn(nid, start_pfn, nr_pages);




[PATCH 4.19 01/30] virtio_net: Do not pull payload in skb->head

2021-08-02 Thread Greg Kroah-Hartman
From: Eric Dumazet 

commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also causes packet consumers to incur
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy:
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

The fix is to pull only the Ethernet header in page_to_skb().

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128 bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo 
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo 
Signed-off-by: Xuan Zhuo 
Signed-off-by: Eric Dumazet 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Matthieu Baerts 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/virtio_net.c   |   10 +++---
 include/linux/virtio_net.h |   14 +-
 2 files changed, 16 insertions(+), 8 deletions(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -413,9 +413,13 @@ static struct sk_buff *page_to_skb(struc
offset += hdr_padded_len;
p += hdr_padded_len;
 
-   copy = len;
-   if (copy > skb_tailroom(skb))
-   copy = skb_tailroom(skb);
+   /* Copy all frame if it fits skb->head, otherwise
+* we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+*/
+   if (len <= skb_tailroom(skb))
+   copy = len;
+   else
+   copy = ETH_HLEN + metasize;
skb_put_data(skb, p, copy);
 
if (metasize) {
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(
skb_reset_mac_header(skb);
 
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-   u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-   u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+   u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+   if (!pskb_may_pull(skb, needed))
+   return -EINVAL;
 
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
 
p_off = skb_transport_offset(skb) + thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ retry:
}
 
p_off = keys.control.thoff + thlen;
-   if (p_off > skb_headlen(skb) ||
+   if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
 
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}




[PATCH 4.14 14/38] virtio_net: Do not pull payload in skb->head

2021-08-02 Thread Greg Kroah-Hartman
From: Eric Dumazet 

commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also causes packet consumers to incur
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy:
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

The fix is to pull only the Ethernet header in page_to_skb().

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128 bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo 
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo 
Signed-off-by: Xuan Zhuo 
Signed-off-by: Eric Dumazet 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Matthieu Baerts 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/virtio_net.c   |   10 +++---
 include/linux/virtio_net.h |   14 +-
 2 files changed, 16 insertions(+), 8 deletions(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -339,9 +339,13 @@ static struct sk_buff *page_to_skb(struc
offset += hdr_padded_len;
p += hdr_padded_len;
 
-   copy = len;
-   if (copy > skb_tailroom(skb))
-   copy = skb_tailroom(skb);
+   /* Copy all frame if it fits skb->head, otherwise
+* we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+*/
+   if (len <= skb_tailroom(skb))
+   copy = len;
+   else
+   copy = ETH_HLEN;
skb_put_data(skb, p, copy);
 
len -= copy;
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(
skb_reset_mac_header(skb);
 
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-   u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-   u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+   u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+   if (!pskb_may_pull(skb, needed))
+   return -EINVAL;
 
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
 
p_off = skb_transport_offset(skb) + thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -100,14 +104,14 @@ retry:
}
 
p_off = keys.control.thoff + thlen;
-   if (p_off > skb_headlen(skb) ||
+   if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
 
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}




Re: [PATCH v2 3/9] drivers/base/memory: introduce "memory groups" to logically group memory blocks

2021-07-28 Thread Greg Kroah-Hartman
On Fri, Jul 23, 2021 at 02:52:04PM +0200, David Hildenbrand wrote:
> In our "auto-movable" memory onlining policy, we want to make decisions
> across memory blocks of a single memory device. Examples of memory devices
> include ACPI memory devices (in the simplest case a single DIMM) and
> virtio-mem. For now, we don't have a connection between a single memory
> block device and the real memory device. Each memory device consists of
> 1..X memory block devices.
> 
> Let's logically group memory blocks belonging to the same memory device
> in "memory groups". Memory groups can span multiple physical ranges and a
> memory group itself does not contain any information regarding physical
> ranges, only properties (e.g., "max_pages") necessary for improved memory
> onlining.
> 
> Introduce two memory group types:
> 
> 1) Static memory group: E.g., a single ACPI memory device, consisting of
>1..X memory resources. A memory group consists of 1..Y memory blocks.
>The whole group is added/removed in one go. If any part cannot get
>offlined, the whole group cannot be removed.
> 
> 2) Dynamic memory group: E.g., a single virtio-mem device. Memory is
>dynamically added/removed in a fixed granularity, called a "unit",
>consisting of 1..X memory blocks. A unit is added/removed in one go.
>If any part of a unit cannot get offlined, the whole unit cannot be
>removed.
> 
> In case of 1) we usually want either all memory managed by ZONE_MOVABLE
> or none. In case of 2) we usually want to have as many units as possible
> managed by ZONE_MOVABLE. We want a single unit to be of the same type.
> 
> For now, memory groups are an internal concept that is not exposed to
> user space; we might want to change that in the future, though.
> 
> add_memory() users can specify a mgid instead of a nid when passing
> the MHP_NID_IS_MGID flag.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  drivers/base/memory.c  | 102 +++--
>  include/linux/memory.h |  46 ++-
>  include/linux/memory_hotplug.h |   6 +-
>  mm/memory_hotplug.c|  11 +++-
>  4 files changed, 158 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 86ec2dc82fc2..42109e7fb0b5 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -82,6 +82,11 @@ static struct bus_type memory_subsys = {
>   */
>  static DEFINE_XARRAY(memory_blocks);
>  
> +/*
> + * Memory groups, indexed by memory group identification (mgid).
> + */
> +static DEFINE_XARRAY_FLAGS(memory_groups, XA_FLAGS_ALLOC);
> +
>  static BLOCKING_NOTIFIER_HEAD(memory_chain);
>  
>  int register_memory_notifier(struct notifier_block *nb)
> @@ -634,7 +639,8 @@ int register_memory(struct memory_block *memory)
>  }
>  
>  static int init_memory_block(unsigned long block_id, unsigned long state,
> -  unsigned long nr_vmemmap_pages)
> +  unsigned long nr_vmemmap_pages,
> +  struct memory_group *group)
>  {
>   struct memory_block *mem;
>   int ret = 0;
> @@ -653,6 +659,11 @@ static int init_memory_block(unsigned long block_id, unsigned long state,
>   mem->nid = NUMA_NO_NODE;
>   mem->nr_vmemmap_pages = nr_vmemmap_pages;
>  
> + if (group) {
> + mem->group = group;
> + refcount_inc(&group->refcount);
> + }
> +
>   ret = register_memory(mem);
>  
>   return ret;
> @@ -671,7 +682,7 @@ static int add_memory_block(unsigned long base_section_nr)
>   if (section_count == 0)
>   return 0;
>   return init_memory_block(memory_block_id(base_section_nr),
> -  MEM_ONLINE, 0);
> +  MEM_ONLINE, 0,  NULL);
>  }
>  
>  static void unregister_memory(struct memory_block *memory)
> @@ -681,6 +692,11 @@ static void unregister_memory(struct memory_block *memory)
>  
>   WARN_ON(xa_erase(&memory_blocks, memory->dev.id) == NULL);
>  
> + if (memory->group) {
> + refcount_dec(&memory->group->refcount);
> + memory->group = NULL;

Who freed the memory for the group?

> + }
> +
>   /* drop the ref. we got via find_memory_block() */
>   put_device(&memory->dev);
>   device_unregister(&memory->dev);
> @@ -694,7 +710,8 @@ static void unregister_memory(struct memory_block *memory)
>   * Called under device_hotplug_lock.
>   */
>  int create_memory_block_devices(unsigned long start, unsigned long size,
> - unsigned long vmemmap_pages)
> + unsigned long vmemmap_pages,
> + struct memory_group *group)
>  {
>   const unsigned long start_block_id = pfn_to_block_id(PFN_DOWN(start));
>   unsigned long end_block_id = pfn_to_block_id(PFN_DOWN(start + size));
> @@ -707,7 +724,8 @@ int create_memory_block_devices(unsigned long start, unsigned long size,
>   r

Re: [PATCH v4 0/5] bus: Make remove callback return void

2021-07-22 Thread Greg Kroah-Hartman
On Wed, Jul 21, 2021 at 12:09:41PM +0200, Greg Kroah-Hartman wrote:
> On Tue, Jul 13, 2021 at 09:35:17PM +0200, Uwe Kleine-König wrote:
> > Hello,
> > 
> > this is v4 of the final patch set for my effort to make struct
> > bus_type::remove return void.
> > 
> > The first four patches contain cleanups that make some of these
> > callbacks (more obviously) always return 0. They are acked by the
> > respective maintainers. Bjorn Helgaas explicitly asked to include the
> > pci patch (#1) into this series, so Greg taking this is fine. I assume
> > the s390 people are fine with Greg taking patches #2 to #4, too, they
> > didn't explicitly say so though.
> > 
> > The last patch actually changes the prototype and so touches quite some
> > drivers and has the potential to conflict with future developments, so I
> > consider it beneficial to put these patches into next soon. I expect
> > that it will be Greg who takes the complete series, he already confirmed
> > via irc (for v2) to look into this series.
> > 
> > The only change compared to v3 is in the fourth patch where I modified a
> > few more drivers to fix build failures. Some of them were found by build
> > bots (thanks!), some of them I found myself using a regular expression
> > search. The newly modified files are:
> > 
> >  arch/sparc/kernel/vio.c
> >  drivers/nubus/bus.c
> >  drivers/sh/superhyway/superhyway.c
> >  drivers/vlynq/vlynq.c
> >  drivers/zorro/zorro-driver.c
> >  sound/ac97/bus.c
> > 
> > Best regards
> > Uwe
> 
> Now queued up.  I can go make a git tag that people can pull from after
> 0-day is finished testing this to verify all is good, if others need it.

Ok, here's a tag that any other subsystem can pull from if they want
these changes in their tree before 5.15-rc1 is out.  I might pull it
into my char-misc-next tree as well just to keep that tree sane as it
seems to pick up new busses on a regular basis...

thanks,

greg k-h

---


The following changes since commit 2734d6c1b1a089fb593ef6a23d4b70903526fe0c:

  Linux 5.14-rc2 (2021-07-18 14:13:49 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git tags/bus_remove_return_void-5.15

for you to fetch changes up to fc7a6209d5710618eb4f72a77cd81b8d694ecf89:

  bus: Make remove callback return void (2021-07-21 11:53:42 +0200)

--------
Bus: Make remove callback return void tag

Tag for other trees/branches to pull from in order to have a stable
place to build off of if they want to add new busses for 5.15.

Signed-off-by: Greg Kroah-Hartman 


Uwe Kleine-König (5):
  PCI: endpoint: Make struct pci_epf_driver::remove return void
  s390/cio: Make struct css_driver::remove return void
  s390/ccwgroup: Drop if with an always false condition
  s390/scm: Make struct scm_driver::remove return void
  bus: Make remove callback return void

 arch/arm/common/locomo.c  | 3 +--
 arch/arm/common/sa.c  | 4 +---
 arch/arm/mach-rpc/ecard.c | 4 +---
 arch/mips/sgi-ip22/ip22-gio.c | 3 +--
 arch/parisc/kernel/drivers.c  | 5 ++---
 arch/powerpc/platforms/ps3/system-bus.c   | 3 +--
 arch/powerpc/platforms/pseries/ibmebus.c  | 3 +--
 arch/powerpc/platforms/pseries/vio.c  | 3 +--
 arch/s390/include/asm/eadm.h  | 2 +-
 arch/sparc/kernel/vio.c   | 4 +---
 drivers/acpi/bus.c| 3 +--
 drivers/amba/bus.c| 4 +---
 drivers/base/auxiliary.c  | 4 +---
 drivers/base/isa.c| 4 +---
 drivers/base/platform.c   | 4 +---
 drivers/bcma/main.c   | 6 ++
 drivers/bus/sunxi-rsb.c   | 4 +---
 drivers/cxl/core.c| 3 +--
 drivers/dax/bus.c | 4 +---
 drivers/dma/idxd/sysfs.c  | 4 +---
 drivers/firewire/core-device.c| 4 +---
 drivers/firmware/arm_scmi/bus.c   | 4 +---
 drivers/firmware/google/coreboot_table.c  | 4 +---
 drivers/fpga/dfl.c| 4 +---
 drivers/hid/hid-core.c| 4 +---
 drivers/hid/intel-ish-hid/ishtp/bus.c | 4 +---
 drivers/hv/vmbus_drv.c| 5 +
 drivers/hwtracing/intel_th/core.c | 4 +---
 drivers/i2c/i2c-core-base.c   | 5 +
 drivers/i3c/master.c  | 4 +---
 drivers/input/gameport/gameport.c | 3 +--
 drivers/input/serio/serio.c   | 3 +--
 drivers/ipack/ipack.c | 4 +---
 drivers/macint

Re: [PATCH v4 0/5] bus: Make remove callback return void

2021-07-21 Thread Greg Kroah-Hartman
On Tue, Jul 13, 2021 at 09:35:17PM +0200, Uwe Kleine-König wrote:
> Hello,
> 
> this is v4 of the final patch set for my effort to make struct
> bus_type::remove return void.
> 
> The first four patches contain cleanups that make some of these
> callbacks (more obviously) always return 0. They are acked by the
> respective maintainers. Bjorn Helgaas explicitly asked to include the
> pci patch (#1) into this series, so Greg taking this is fine. I assume
> the s390 people are fine with Greg taking patches #2 to #4, too, they
> didn't explicitly said so though.
> 
> The last patch actually changes the prototype and so touches quite some
> drivers and has the potential to conflict with future developments, so I
> consider it beneficial to put these patches into next soon. I expect
> that it will be Greg who takes the complete series, he already confirmed
> via irc (for v2) to look into this series.
> 
> The only change compared to v3 is in the fourth patch where I modified a
> few more drivers to fix build failures. Some of them were found by build
> bots (thanks!), some of them I found myself using a regular expression
> search. The newly modified files are:
> 
>  arch/sparc/kernel/vio.c
>  drivers/nubus/bus.c
>  drivers/sh/superhyway/superhyway.c
>  drivers/vlynq/vlynq.c
>  drivers/zorro/zorro-driver.c
>  sound/ac97/bus.c
> 
> Best regards
> Uwe

Now queued up.  I can go make a git tag that people can pull from after
0-day is finished testing this to verify all is good, if others need it.

thanks,

greg k-h
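
For readers skimming the archive, here is a minimal user-space sketch of what "make the remove callback return void" means for a bus. It is not kernel code: struct toy_bus, struct toy_device, toy_bind() and toy_unbind() are hypothetical names invented for illustration. The point it tries to show is that, once remove() returns void, the unbind path has no return value to inspect and no failure branch to get wrong.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device: only tracks whether a driver is bound. */
struct toy_device {
	int bound;
	const char *name;
};

/*
 * Hypothetical stand-in for a bus type after the series:
 * probe() can still fail, remove() cannot report anything.
 */
struct toy_bus {
	const char *name;
	int (*probe)(struct toy_device *dev);
	void (*remove)(struct toy_device *dev);
};

static int toy_probe(struct toy_device *dev)
{
	dev->bound = 1;
	return 0;
}

static void toy_remove(struct toy_device *dev)
{
	dev->bound = 0;
}

static const struct toy_bus toy_bus_type = {
	.name = "toy",
	.probe = toy_probe,
	.remove = toy_remove,
};

/* Model of a bind path: the probe result is propagated to the caller. */
static int toy_bind(const struct toy_bus *bus, struct toy_device *dev)
{
	assert(bus != NULL && dev != NULL);
	return bus->probe ? bus->probe(dev) : 0;
}

/* Model of an unbind path: nothing to check after remove(). */
static void toy_unbind(const struct toy_bus *bus, struct toy_device *dev)
{
	assert(bus != NULL && dev != NULL);
	if (bus->remove)
		bus->remove(dev);
}
```

Calling toy_bind(&toy_bus_type, &dev) followed by toy_unbind(&toy_bus_type, &dev) leaves dev.bound at 0; a driver that used to end its remove callback with "return 0;" simply drops that line.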


[PATCH 5.12 121/296] {net, vdpa}/mlx5: Configure interface MAC into mpfs L2 table

2021-05-31 Thread Greg Kroah-Hartman
From: Eli Cohen 

commit 7c9f131f366ab414691907fa0407124ea2b2f3bc upstream.

net/mlx5: Expose MPFS configuration API

MPFS is the multi physical function switch that bridges traffic between
the physical port and any physical functions associated with it. The
driver is required to add or remove MAC entries to properly forward
incoming traffic to the correct physical function.

We export the API to control MPFS so that other drivers, such as
mlx5_vdpa are able to add MAC addresses of their network interfaces.

The MAC address of the vdpa interface must be configured into the MPFS L2
address. Failing to do so could cause, in some NIC configurations, failure
to forward packets to the vdpa network device instance.

Fix this by adding calls to update the MPFS table.

CC: 
CC: 
CC: 
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen 
Signed-off-by: Saeed Mahameed 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_fs.c|1 +
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  |1 +
 drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.c |3 +++
 drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.h |5 +
 drivers/vdpa/mlx5/net/mlx5_vnet.c  |   19 ++-
 include/linux/mlx5/mpfs.h  |   18 ++
 6 files changed, 42 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/mlx5/mpfs.h

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "en.h"
 #include "lib/mpfs.h"
 
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "esw/acl/lgcy.h"
 #include "mlx5_core.h"
 #include "lib/eq.h"
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "mlx5_core.h"
 #include "lib/mpfs.h"
@@ -175,6 +176,7 @@ out:
mutex_unlock(&mpfs->lock);
return err;
 }
+EXPORT_SYMBOL(mlx5_mpfs_add_mac);
 
 int mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac)
 {
@@ -206,3 +208,4 @@ unlock:
mutex_unlock(&mpfs->lock);
return err;
 }
+EXPORT_SYMBOL(mlx5_mpfs_del_mac);
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.h
@@ -84,12 +84,9 @@ struct l2addr_node {
 #ifdef CONFIG_MLX5_MPFS
 int  mlx5_mpfs_init(struct mlx5_core_dev *dev);
 void mlx5_mpfs_cleanup(struct mlx5_core_dev *dev);
-int  mlx5_mpfs_add_mac(struct mlx5_core_dev *dev, u8 *mac);
-int  mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac);
 #else /* #ifndef CONFIG_MLX5_MPFS */
 static inline int  mlx5_mpfs_init(struct mlx5_core_dev *dev) { return 0; }
 static inline void mlx5_mpfs_cleanup(struct mlx5_core_dev *dev) {}
-static inline int  mlx5_mpfs_add_mac(struct mlx5_core_dev *dev, u8 *mac) { return 0; }
-static inline int  mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac) { return 0; }
 #endif
+
 #endif
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "mlx5_vdpa.h"
 
 MODULE_AUTHOR("Eli Cohen ");
@@ -1854,11 +1855,16 @@ static int mlx5_vdpa_set_map(struct vdpa
 static void mlx5_vdpa_free(struct vdpa_device *vdev)
 {
struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+   struct mlx5_core_dev *pfmdev;
struct mlx5_vdpa_net *ndev;
 
ndev = to_mlx5_vdpa_ndev(mvdev);
 
free_resources(ndev);
+   if (!is_zero_ether_addr(ndev->config.mac)) {
+   pfmdev = pci_get_drvdata(pci_physfn(mvdev->mdev->pdev));
+   mlx5_mpfs_del_mac(pfmdev, ndev->config.mac);
+   }
mlx5_vdpa_free_resources(&ndev->mvdev);
mutex_destroy(&ndev->reslock);
 }
@@ -1980,6 +1986,7 @@ static int mlx5v_probe(struct auxiliary_
struct mlx5_adev *madev = container_of(adev, struct mlx5_adev, adev);
struct mlx5_core_dev *mdev = madev->mdev;
struct virtio_net_config *config;
+   struct mlx5_core_dev *pfmdev;
struct mlx5_vdpa_dev *mvdev;
struct mlx5_vdpa_net *ndev;
u32 max_vqs;
@@ -2008,10 +2015,17 @@ static int mlx5v_probe(struct auxiliary_
if (err)
goto err_mtu;
 
+   if (!is_zero_ether_addr(config->mac)) {
+   pfmdev = pci_get_drvdata(pci_physfn(mdev->pdev));
+   err = mlx5_mpfs_add_mac(pfmdev, config->mac);
+   if (err)
+   goto err_mtu;
+   }
+
mvdev->vdev.dma_dev = mdev->device;
err
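
The guard this patch adds around mlx5_mpfs_add_mac() and mlx5_mpfs_del_mac() can be modeled in user space as below. This is only a sketch under stated assumptions: struct l2_table, l2_add_mac(), l2_del_mac(), l2_count(), dev_register_mac() and dev_unregister_mac() are hypothetical names, the table size is arbitrary, and the -1 error value does not reflect the real driver's error codes. It shows the pattern from the diff: an all-zero MAC is never programmed into the L2 table, and teardown removes only what setup could have added.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6		/* bytes in an Ethernet MAC address */
#define L2_TABLE_SIZE 4		/* arbitrary size for this toy table */

/* Hypothetical L2 table: a few MAC slots plus a "used" flag per slot. */
struct l2_table {
	uint8_t mac[L2_TABLE_SIZE][ETH_ALEN];
	int used[L2_TABLE_SIZE];
};

/* User-space equivalent of an "is this MAC all zeroes" check. */
static int mac_is_zero(const uint8_t *mac)
{
	static const uint8_t zero[ETH_ALEN];

	return memcmp(mac, zero, ETH_ALEN) == 0;
}

/* Returns 0 on success, -1 if the MAC is already present or the table is full. */
static int l2_add_mac(struct l2_table *t, const uint8_t *mac)
{
	int i, free_slot = -1;

	assert(t != NULL && mac != NULL);
	for (i = 0; i < L2_TABLE_SIZE; i++) {
		if (t->used[i] && memcmp(t->mac[i], mac, ETH_ALEN) == 0)
			return -1;
		if (!t->used[i] && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -1;
	memcpy(t->mac[free_slot], mac, ETH_ALEN);
	t->used[free_slot] = 1;
	return 0;
}

/* Returns 0 if the MAC was found and removed, -1 otherwise. */
static int l2_del_mac(struct l2_table *t, const uint8_t *mac)
{
	int i;

	assert(t != NULL && mac != NULL);
	for (i = 0; i < L2_TABLE_SIZE; i++) {
		if (t->used[i] && memcmp(t->mac[i], mac, ETH_ALEN) == 0) {
			t->used[i] = 0;
			return 0;
		}
	}
	return -1;
}

/* Number of occupied slots. */
static int l2_count(const struct l2_table *t)
{
	int i, n = 0;

	for (i = 0; i < L2_TABLE_SIZE; i++)
		n += t->used[i] ? 1 : 0;
	return n;
}

/* Setup-time model: skip an unset MAC, otherwise program it. */
static int dev_register_mac(struct l2_table *t, const uint8_t *mac)
{
	if (mac_is_zero(mac))
		return 0;
	return l2_add_mac(t, mac);
}

/* Teardown-time model: mirror the same guard before deleting. */
static void dev_unregister_mac(struct l2_table *t, const uint8_t *mac)
{
	if (!mac_is_zero(mac))
		l2_del_mac(t, mac);
}
```

Using the same zero-address test on both paths keeps add and delete symmetric, so a device that never had a MAC configured does not trigger a delete for an entry that was never added.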

[PATCH 5.10 100/252] {net, vdpa}/mlx5: Configure interface MAC into mpfs L2 table

2021-05-31 Thread Greg Kroah-Hartman
From: Eli Cohen 

commit 7c9f131f366ab414691907fa0407124ea2b2f3bc upstream.

net/mlx5: Expose MPFS configuration API

MPFS is the multi physical function switch that bridges traffic between
the physical port and any physical functions associated with it. The
driver is required to add or remove MAC entries to properly forward
incoming traffic to the correct physical function.

We export the API to control MPFS so that other drivers, such as
mlx5_vdpa are able to add MAC addresses of their network interfaces.

The MAC address of the vdpa interface must be configured into the MPFS L2
address. Failing to do so could cause, in some NIC configurations, failure
to forward packets to the vdpa network device instance.

Fix this by adding calls to update the MPFS table.

CC: 
CC: 
CC: 
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen 
Signed-off-by: Saeed Mahameed 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_fs.c|1 +
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  |1 +
 drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.c |3 +++
 drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.h |5 +
 drivers/vdpa/mlx5/net/mlx5_vnet.c  |   19 ++-
 include/linux/mlx5/mpfs.h  |   18 ++
 6 files changed, 42 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/mlx5/mpfs.h

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "en.h"
 #include "lib/mpfs.h"
 
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "esw/acl/lgcy.h"
 #include "mlx5_core.h"
 #include "lib/eq.h"
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "mlx5_core.h"
 #include "lib/mpfs.h"
@@ -175,6 +176,7 @@ out:
mutex_unlock(&mpfs->lock);
return err;
 }
+EXPORT_SYMBOL(mlx5_mpfs_add_mac);
 
 int mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac)
 {
@@ -206,3 +208,4 @@ unlock:
mutex_unlock(&mpfs->lock);
return err;
 }
+EXPORT_SYMBOL(mlx5_mpfs_del_mac);
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.h
@@ -84,12 +84,9 @@ struct l2addr_node {
 #ifdef CONFIG_MLX5_MPFS
 int  mlx5_mpfs_init(struct mlx5_core_dev *dev);
 void mlx5_mpfs_cleanup(struct mlx5_core_dev *dev);
-int  mlx5_mpfs_add_mac(struct mlx5_core_dev *dev, u8 *mac);
-int  mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac);
 #else /* #ifndef CONFIG_MLX5_MPFS */
 static inline int  mlx5_mpfs_init(struct mlx5_core_dev *dev) { return 0; }
 static inline void mlx5_mpfs_cleanup(struct mlx5_core_dev *dev) {}
-static inline int  mlx5_mpfs_add_mac(struct mlx5_core_dev *dev, u8 *mac) { return 0; }
-static inline int  mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac) { return 0; }
 #endif
+
 #endif
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "mlx5_vnet.h"
 #include "mlx5_vdpa_ifc.h"
 #include "mlx5_vdpa.h"
@@ -1839,11 +1840,16 @@ static int mlx5_vdpa_set_map(struct vdpa
 static void mlx5_vdpa_free(struct vdpa_device *vdev)
 {
struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+   struct mlx5_core_dev *pfmdev;
struct mlx5_vdpa_net *ndev;
 
ndev = to_mlx5_vdpa_ndev(mvdev);
 
free_resources(ndev);
+   if (!is_zero_ether_addr(ndev->config.mac)) {
+   pfmdev = pci_get_drvdata(pci_physfn(mvdev->mdev->pdev));
+   mlx5_mpfs_del_mac(pfmdev, ndev->config.mac);
+   }
mlx5_vdpa_free_resources(&ndev->mvdev);
mutex_destroy(&ndev->reslock);
 }
@@ -1962,6 +1968,7 @@ static void init_mvqs(struct mlx5_vdpa_n
 void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
 {
struct virtio_net_config *config;
+   struct mlx5_core_dev *pfmdev;
struct mlx5_vdpa_dev *mvdev;
struct mlx5_vdpa_net *ndev;
u32 max_vqs;
@@ -1990,10 +1997,17 @@ void *mlx5_vdpa_add_dev(struct mlx5_core
if (err)
goto err_mtu;
 
+   if (!is_zero_ether_addr(config->mac)) {
+   pfmdev = pci_get_drvdata(pci_physfn(mdev->pdev));
+   err = mlx5_mpfs_add_mac(pfmdev, config->mac);
+   if (err)
+   goto err_mtu;
+   }
+
mvdev->vdev.dma_dev = mdev->device;
err = mlx5_vdpa_alloc_resources(&ndev->

[PATCH 5.4 08/37] virtio_net: Do not pull payload in skb->head

2021-05-20 Thread Greg Kroah-Hartman
From: Eric Dumazet 

[ Upstream commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db ]

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought  a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

Fix is to pull only the ethernet header in page_to_skb()

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo 
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo 
Signed-off-by: Xuan Zhuo 
Signed-off-by: Eric Dumazet 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/virtio_net.c   | 10 +++---
 include/linux/virtio_net.h | 14 +-
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b67460864b3c..d8ee001d8e8e 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
offset += hdr_padded_len;
p += hdr_padded_len;
 
-   copy = len;
-   if (copy > skb_tailroom(skb))
-   copy = skb_tailroom(skb);
+   /* Copy all frame if it fits skb->head, otherwise
+* we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+*/
+   if (len <= skb_tailroom(skb))
+   copy = len;
+   else
+   copy = ETH_HLEN + metasize;
skb_put_data(skb, p, copy);
 
if (metasize) {
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 98775d7fa696..b465f8f3e554 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
skb_reset_mac_header(skb);
 
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-   u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-   u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+   u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+   if (!pskb_may_pull(skb, needed))
+   return -EINVAL;
 
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
 
p_off = skb_transport_offset(skb) + thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
}
 
p_off = keys.control.thoff + thlen;
-   if (p_off > skb_headlen(skb) ||
+   if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
 
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}
-- 
2.30.2
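
The copy-length change in the page_to_skb() hunk above can be sketched in user space as follows. This is a model, not the driver: head_copy_len() and head_copy_len_old() are hypothetical helpers, the tailroom parameter stands in for skb_tailroom(skb), and the assertion that the header fits in the linear area is an assumption of the sketch. ETH_HLEN is defined locally with the usual 14-byte Ethernet header length.

```c
#include <assert.h>
#include <stddef.h>

/* Ethernet header length in bytes. */
#define ETH_HLEN 14

/*
 * Copy length after the patch.
 * len:      bytes of frame data available in the page
 * tailroom: free bytes in the linear area (models skb_tailroom())
 * metasize: metadata bytes that are copied together with the header
 */
static size_t head_copy_len(size_t len, size_t tailroom, size_t metasize)
{
	/* Small frame: the whole thing fits, so copy it all. */
	if (len <= tailroom)
		return len;

	/* Sketch assumption: the linear area can hold metadata + header. */
	assert(ETH_HLEN + metasize <= tailroom);

	/* Large frame: copy only metadata and the Ethernet header. */
	return ETH_HLEN + metasize;
}

/* Copy length before the patch: fill the linear area as far as possible. */
static size_t head_copy_len_old(size_t len, size_t tailroom)
{
	return len > tailroom ? tailroom : len;
}
```

With a 128-byte linear area and a 1500-byte frame, the old rule copies 128 bytes (headers plus some payload) while the new rule copies 14, leaving the remaining headers to be pulled later by virtio_net_hdr_to_skb() and GRO, as the commit message describes.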





[PATCH 5.10 08/47] virtio_net: Do not pull payload in skb->head

2021-05-20 Thread Greg Kroah-Hartman
From: Eric Dumazet 

[ Upstream commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db ]

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought  a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

Fix is to pull only the ethernet header in page_to_skb()

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo 
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo 
Signed-off-by: Xuan Zhuo 
Signed-off-by: Eric Dumazet 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/virtio_net.c   | 10 +++---
 include/linux/virtio_net.h | 14 +-
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 038ce4e5e84b..286f836a53bf 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
offset += hdr_padded_len;
p += hdr_padded_len;
 
-   copy = len;
-   if (copy > skb_tailroom(skb))
-   copy = skb_tailroom(skb);
+   /* Copy all frame if it fits skb->head, otherwise
+* we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+*/
+   if (len <= skb_tailroom(skb))
+   copy = len;
+   else
+   copy = ETH_HLEN + metasize;
skb_put_data(skb, p, copy);
 
if (metasize) {
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 98775d7fa696..b465f8f3e554 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
skb_reset_mac_header(skb);
 
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-   u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-   u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+   u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+   if (!pskb_may_pull(skb, needed))
+   return -EINVAL;
 
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
 
p_off = skb_transport_offset(skb) + thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
}
 
p_off = keys.control.thoff + thlen;
-   if (p_off > skb_headlen(skb) ||
+   if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
 
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}
-- 
2.30.2
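
A worked example of the "needed" value computed in the virtio_net_hdr_to_skb() hunk above may help. The sketch below is plain user-space C; csum_bytes_needed() and csum_bytes_needed_u16() are hypothetical helpers, CSUM_FIELD_LEN stands in for sizeof(__sum16), and the bound on thlen is an assumption of the sketch. The 16-bit variant is only an illustration of what narrowing the result would lose, not a claim about why the patch widened the types.

```c
#include <assert.h>
#include <stdint.h>

/* Size of a 16-bit checksum field; stands in for sizeof(__sum16). */
#define CSUM_FIELD_LEN 2u

/*
 * Bytes that must be reachable in the linear area so that both the
 * transport header (thlen bytes from csum_start) and the checksum
 * field (CSUM_FIELD_LEN bytes at csum_start + csum_offset) are covered.
 */
static uint32_t csum_bytes_needed(uint16_t csum_start, uint16_t csum_offset,
				  uint32_t thlen)
{
	uint32_t start = csum_start;
	uint32_t off = csum_offset;
	uint32_t tail = off + CSUM_FIELD_LEN;

	/* Sketch assumption: header lengths are small, so no 32-bit wrap. */
	assert(thlen <= UINT16_MAX);

	if (thlen > tail)
		tail = thlen;
	return start + tail;
}

/* Same value truncated to 16 bits, to show what narrowing would lose. */
static uint16_t csum_bytes_needed_u16(uint16_t csum_start,
				      uint16_t csum_offset, uint16_t thlen)
{
	return (uint16_t)csum_bytes_needed(csum_start, csum_offset, thlen);
}
```

For TCP over IPv4 over Ethernet without options, csum_start is 14 + 20 = 34, csum_offset is 16 and thlen is 20, so needed = 34 + max(20, 18) = 54 bytes. With both header fields at 65535 the 32-bit result is 131072, which would truncate to 0 if stored in 16 bits.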





Re: [PATCH 5.10 055/188] virtio_net: Do not pull payload in skb->head

2021-04-12 Thread Greg Kroah-Hartman
On Mon, Apr 12, 2021 at 05:11:40AM -0400, Michael S. Tsirkin wrote:
> On Mon, Apr 12, 2021 at 10:39:29AM +0200, Greg Kroah-Hartman wrote:
> > From: Eric Dumazet 
> > 
> > commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.
> > 
> > Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
> > under-estimation for tiny skbs") brought  a ~10% performance drop.
> > 
> > The reason for the performance drop was that GRO was forced
> > to chain sk_buff (using skb_shinfo(skb)->frag_list), which
> > uses more memory but also cause packet consumers to go over
> > a lot of overhead handling all the tiny skbs.
> > 
> > It turns out that virtio_net page_to_skb() has a wrong strategy :
> > It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
> > copies 128 bytes from the page, before feeding the packet to GRO stack.
> > 
> > This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
> > under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
> > meaning we were not packing MSS with 100% efficiency.
> > 
> > Fix is to pull only the ethernet header in page_to_skb()
> > 
> > Then, we change virtio_net_hdr_to_skb() to pull the missing
> > headers, instead of assuming they were already pulled by callers.
> > 
> > This fixes the performance regression, but could also allow virtio_net
> > to accept packets with more than 128bytes of headers.
> > 
> > Many thanks to Xuan Zhuo for his report, and his tests/help.
> > 
> > Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny 
> > skbs")
> > Reported-by: Xuan Zhuo 
> > Link: https://www.spinics.net/lists/netdev/msg731397.html
> > Co-Developed-by: Xuan Zhuo 
> > Signed-off-by: Xuan Zhuo 
> > Signed-off-by: Eric Dumazet 
> > Cc: "Michael S. Tsirkin" 
> > Cc: Jason Wang 
> > Cc: virtualization@lists.linux-foundation.org
> > Acked-by: Jason Wang 
> > Signed-off-by: David S. Miller 
> > Signed-off-by: Greg Kroah-Hartman 
> 
> 
> 
> Note that an issue related to this patch was recently reported.
> It's quite possible that the root cause is a bug elsewhere
> in the kernel, but it probably makes sense to defer the backport
> until we know more ...

Thanks, I'll go drop it from all 4 queues.  If you all find out that all
is good, and it should be added back, please let us at stable@vger know
about it.

thanks,

greg k-h


[PATCH 5.11 062/210] virtio_net: Do not pull payload in skb->head

2021-04-12 Thread Greg Kroah-Hartman
From: Eric Dumazet 

commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought  a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

Fix is to pull only the ethernet header in page_to_skb()

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo 
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo 
Signed-off-by: Xuan Zhuo 
Signed-off-by: Eric Dumazet 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/virtio_net.c   |   10 +++---
 include/linux/virtio_net.h |   14 +-
 2 files changed, 16 insertions(+), 8 deletions(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struc
offset += hdr_padded_len;
p += hdr_padded_len;
 
-   copy = len;
-   if (copy > skb_tailroom(skb))
-   copy = skb_tailroom(skb);
+   /* Copy all frame if it fits skb->head, otherwise
+* we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+*/
+   if (len <= skb_tailroom(skb))
+   copy = len;
+   else
+   copy = ETH_HLEN + metasize;
skb_put_data(skb, p, copy);
 
if (metasize) {
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(
skb_reset_mac_header(skb);
 
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-   u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-   u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+   u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+   if (!pskb_may_pull(skb, needed))
+   return -EINVAL;
 
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
 
p_off = skb_transport_offset(skb) + thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ retry:
}
 
p_off = keys.control.thoff + thlen;
-   if (p_off > skb_headlen(skb) ||
+   if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
 
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}




[PATCH 5.10 055/188] virtio_net: Do not pull payload in skb->head

2021-04-12 Thread Greg Kroah-Hartman
From: Eric Dumazet 

commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db upstream.

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought  a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

Fix is to pull only the ethernet header in page_to_skb()

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo 
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo 
Signed-off-by: Xuan Zhuo 
Signed-off-by: Eric Dumazet 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/virtio_net.c   |   10 +++---
 include/linux/virtio_net.h |   14 +-
 2 files changed, 16 insertions(+), 8 deletions(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struc
offset += hdr_padded_len;
p += hdr_padded_len;
 
-   copy = len;
-   if (copy > skb_tailroom(skb))
-   copy = skb_tailroom(skb);
+   /* Copy all frame if it fits skb->head, otherwise
+* we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+*/
+   if (len <= skb_tailroom(skb))
+   copy = len;
+   else
+   copy = ETH_HLEN + metasize;
skb_put_data(skb, p, copy);
 
if (metasize) {
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(
skb_reset_mac_header(skb);
 
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-   u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-   u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+   u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+   if (!pskb_may_pull(skb, needed))
+   return -EINVAL;
 
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
 
p_off = skb_transport_offset(skb) + thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ retry:
}
 
p_off = keys.control.thoff + thlen;
-   if (p_off > skb_headlen(skb) ||
+   if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
 
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}




[PATCH 5.4 042/111] virtio_net: Do not pull payload in skb->head

2021-04-12 Thread Greg Kroah-Hartman
From: Eric Dumazet 

[ Upstream commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db ]

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also causes packet consumers to incur
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy:
it allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to the GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

The fix is to pull only the ethernet header in page_to_skb().

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128 bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo 
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo 
Signed-off-by: Xuan Zhuo 
Signed-off-by: Eric Dumazet 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/virtio_net.c   | 10 +++---
 include/linux/virtio_net.h | 14 +-
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b67460864b3c..d8ee001d8e8e 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
offset += hdr_padded_len;
p += hdr_padded_len;
 
-   copy = len;
-   if (copy > skb_tailroom(skb))
-   copy = skb_tailroom(skb);
+   /* Copy all frame if it fits skb->head, otherwise
+* we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+*/
+   if (len <= skb_tailroom(skb))
+   copy = len;
+   else
+   copy = ETH_HLEN + metasize;
skb_put_data(skb, p, copy);
 
if (metasize) {
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 98775d7fa696..b465f8f3e554 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
skb_reset_mac_header(skb);
 
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-   u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-   u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+   u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+   if (!pskb_may_pull(skb, needed))
+   return -EINVAL;
 
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
 
p_off = skb_transport_offset(skb) + thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
}
 
p_off = keys.control.thoff + thlen;
-   if (p_off > skb_headlen(skb) ||
+   if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
 
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}
-- 
2.30.2
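
A note on the arithmetic in the virtio_net.h hunk above: csum_start and
csum_offset are 16-bit fields, so a sum such as
start + off + sizeof(__sum16) can exceed 65535. The sketch below is a
userspace illustration, not kernel code. needed_u32() and needed_u16() are
hypothetical helper names, and the 2-byte size of __sum16 is modelled with
uint16_t. It shows how a 16-bit result would wrap where the 32-bit one does
not. Whether this was the motivation for the u16 to u32 change is an
assumption; the commit message does not say.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the patched computation, done in 32-bit math.
 * needed = start + max(thlen, off + sizeof(__sum16)) */
uint32_t needed_u32(uint16_t csum_start, uint16_t csum_offset, uint32_t thlen)
{
	uint32_t start = csum_start;
	uint32_t off = csum_offset;
	uint32_t tail = off + (uint32_t)sizeof(uint16_t); /* 2-byte checksum */

	assert(tail >= off); /* cannot wrap: off is at most 65535 */
	return start + (thlen > tail ? thlen : tail);
}

/* Hypothetical helper: the same formula truncated to 16 bits at every
 * step, to show where a narrow type would wrap around. */
uint16_t needed_u16(uint16_t csum_start, uint16_t csum_offset, uint16_t thlen)
{
	uint16_t tail = (uint16_t)(csum_offset + sizeof(uint16_t));
	uint16_t m = thlen > tail ? thlen : tail;

	return (uint16_t)(csum_start + m);
}
```

With csum_start = 65530, csum_offset = 16 and thlen = 20, the 32-bit helper
yields 65550 while the 16-bit one wraps to 14, which would make a length
check pass for the wrong reason.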





[PATCH 4.19 28/66] virtio_net: Do not pull payload in skb->head

2021-04-12 Thread Greg Kroah-Hartman
From: Eric Dumazet 

[ Upstream commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db ]

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also causes packet consumers to incur
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy:
it allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to the GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

The fix is to pull only the ethernet header in page_to_skb().

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128 bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo 
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo 
Signed-off-by: Xuan Zhuo 
Signed-off-by: Eric Dumazet 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/virtio_net.c   | 10 +++---
 include/linux/virtio_net.h | 14 +-
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0b1c6a8906b9..06ddf009f833 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -413,9 +413,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
offset += hdr_padded_len;
p += hdr_padded_len;
 
-   copy = len;
-   if (copy > skb_tailroom(skb))
-   copy = skb_tailroom(skb);
+   /* Copy all frame if it fits skb->head, otherwise
+* we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+*/
+   if (len <= skb_tailroom(skb))
+   copy = len;
+   else
+   copy = ETH_HLEN + metasize;
skb_put_data(skb, p, copy);
 
if (metasize) {
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index a1829139ff4a..8f48264f5dab 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
skb_reset_mac_header(skb);
 
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-   u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-   u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+   u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+   u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+   if (!pskb_may_pull(skb, needed))
+   return -EINVAL;
 
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
 
p_off = skb_transport_offset(skb) + thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
}
 
p_off = keys.control.thoff + thlen;
-   if (p_off > skb_headlen(skb) ||
+   if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
 
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
-   if (p_off > skb_headlen(skb))
+   if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}
-- 
2.30.2





[PATCH] virtio_console: remove pointless check for debugfs_create_dir()

2021-02-16 Thread Greg Kroah-Hartman
It is impossible for debugfs_create_dir() to return NULL, so checking
for it gives people a false sense that they actually are doing something
if an error occurs.  As there is no need to ever change kernel logic if
debugfs is working "properly" or not, there is no need to check the
return value of debugfs calls, so remove the checks here as they will
never be triggered and are wrong.

Cc: Amit Shah 
Cc: Arnd Bergmann 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/char/virtio_console.c | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 1836cc56e357..59dfd9c421a1 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1456,18 +1456,15 @@ static int add_port(struct ports_device *portdev, u32 id)
 */
send_control_msg(port, VIRTIO_CONSOLE_PORT_READY, 1);
 
-   if (pdrvdata.debugfs_dir) {
-   /*
-* Finally, create the debugfs file that we can use to
-* inspect a port's state at any time
-*/
-   snprintf(debugfs_name, sizeof(debugfs_name), "vport%up%u",
-port->portdev->vdev->index, id);
-   port->debugfs_file = debugfs_create_file(debugfs_name, 0444,
-pdrvdata.debugfs_dir,
-port,
-&port_debugfs_fops);
-   }
+   /*
+* Finally, create the debugfs file that we can use to
+* inspect a port's state at any time
+*/
+   snprintf(debugfs_name, sizeof(debugfs_name), "vport%up%u",
+port->portdev->vdev->index, id);
+   port->debugfs_file = debugfs_create_file(debugfs_name, 0444,
+pdrvdata.debugfs_dir,
+port, &port_debugfs_fops);
return 0;
 
 free_inbufs:
@@ -2244,8 +2241,6 @@ static int __init init(void)
}
 
pdrvdata.debugfs_dir = debugfs_create_dir("virtio-ports", NULL);
-   if (!pdrvdata.debugfs_dir)
-   pr_warn("Error creating debugfs dir for virtio-ports\n");
INIT_LIST_HEAD(&pdrvdata.consoles);
INIT_LIST_HEAD(&pdrvdata.portdevs);
 
-- 
2.30.1



[PATCH 5.9 032/105] vdpa: mlx5: fix vdpa/vhost dependencies

2020-12-14 Thread Greg Kroah-Hartman
From: Randy Dunlap 

[ Upstream commit 98701a2a861fa87a5055cf2809758e8725e8b146 ]

drivers/vdpa/mlx5/ uses vhost_iotlb*() interfaces, so select
VHOST_IOTLB to make them be built.

However, if VHOST_IOTLB is the only VHOST symbol that is
set/enabled, the object file still won't be built because
drivers/Makefile won't descend into drivers/vhost/ to build it,
so make drivers/Makefile build the needed binary whenever
VHOST_IOTLB is set, like it does for VHOST_RING.

Fixes these build errors:
ERROR: modpost: "vhost_iotlb_itree_next" [drivers/vdpa/mlx5/mlx5_vdpa.ko] undefined!
ERROR: modpost: "vhost_iotlb_itree_first" [drivers/vdpa/mlx5/mlx5_vdpa.ko] undefined!

Fixes: 29064bfdabd5 ("vdpa/mlx5: Add support library for mlx5 VDPA implementation")
Fixes: aff90770e54c ("vdpa/mlx5: Fix dependency on MLX5_CORE")
Reported-by: kernel test robot 
Signed-off-by: Randy Dunlap 
Cc: Eli Cohen 
Cc: Parav Pandit 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Cc: Saeed Mahameed 
Cc: Leon Romanovsky 
Cc: net...@vger.kernel.org
Link: https://lore.kernel.org/r/20201128213905.27409-1-rdun...@infradead.org
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
Signed-off-by: Sasha Levin 
---
 drivers/Makefile | 1 +
 drivers/vdpa/Kconfig | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/Makefile b/drivers/Makefile
index c0cd1b9075e3d..5762280377186 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -145,6 +145,7 @@ obj-$(CONFIG_OF)+= of/
 obj-$(CONFIG_SSB)  += ssb/
 obj-$(CONFIG_BCMA) += bcma/
 obj-$(CONFIG_VHOST_RING)   += vhost/
+obj-$(CONFIG_VHOST_IOTLB)  += vhost/
 obj-$(CONFIG_VHOST)+= vhost/
 obj-$(CONFIG_VLYNQ)+= vlynq/
 obj-$(CONFIG_GREYBUS)  += greybus/
diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 358f6048dd3ce..6caf539091e55 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -32,6 +32,7 @@ config IFCVF
 
 config MLX5_VDPA
bool
+   select VHOST_IOTLB
help
  Support library for Mellanox VDPA drivers. Provides code that is
  common for all types of VDPA drivers. The following drivers are planned:
-- 
2.27.0





Re: [PATCH v2 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends

2020-09-09 Thread Greg Kroah-Hartman
On Tue, Sep 08, 2020 at 10:10:08PM +0200, David Hildenbrand wrote:
> We soon want to pass flags, e.g., to mark added System RAM resources
> mergeable. Prepare for that.

What are these random "flags", and how do we know what should be passed
to them?

Why not make this an enumerated type so that we know it all works
properly, like the GFP_* flags are?  Passing around a random unsigned
long feels very odd/broken...

thanks,

greg k-h


Re: [PATCH v2 2/7] kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED

2020-09-09 Thread Greg Kroah-Hartman
On Tue, Sep 08, 2020 at 10:10:07PM +0200, David Hildenbrand wrote:
> IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is
> always set to 0 by hardware. This is far from beautiful (and confusing),
> and the bit only applies to SYSRAM. So let's move it out of the
> bus-specific (PnP) defined bits.
> 
> We'll add another SYSRAM specific bit soon. If we ever need more bits for
> other purposes, we can steal some from "desc", or reshuffle/regroup what we
> have.
> 
> Cc: Andrew Morton 
> Cc: Michal Hocko 
> Cc: Dan Williams 
> Cc: Jason Gunthorpe 
> Cc: Kees Cook 
> Cc: Ard Biesheuvel 
> Cc: Pankaj Gupta 
> Cc: Baoquan He 
> Cc: Wei Yang 
> Cc: Eric Biederman 
> Cc: Thomas Gleixner 
> Cc: Greg Kroah-Hartman 
> Cc: ke...@lists.infradead.org
> Signed-off-by: David Hildenbrand 
> ---
>  include/linux/ioport.h | 4 +++-
>  kernel/kexec_file.c| 2 +-
>  mm/memory_hotplug.c| 4 ++--
>  3 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> index 52a91f5fa1a36..d7620d7c941a0 100644
> --- a/include/linux/ioport.h
> +++ b/include/linux/ioport.h
> @@ -58,6 +58,9 @@ struct resource {
>  #define IORESOURCE_EXT_TYPE_BITS 0x0100  /* Resource extended types */
>  #define IORESOURCE_SYSRAM0x0100  /* System RAM (modifier) */
>  
> +/* IORESOURCE_SYSRAM specific bits. */
> +#define IORESOURCE_SYSRAM_DRIVER_MANAGED 0x0200 /* Always detected via a driver. */
> +

Can't you use BIT() here?

thanks,

greg k-h
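
For readers unfamiliar with the BIT() suggestion above, here is a minimal
userspace sketch of the idea: name a flag by its bit position rather than
by a hand-written hex mask. The macro below is a stand-in that mirrors the
kernel's one-bit shift on an unsigned long. EXAMPLE_FLAG_A, EXAMPLE_FLAG_B
and example_flag_is_set() are hypothetical names for illustration only, and
the bit positions are not taken from the patch under review.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's BIT() helper. */
#define BIT(nr) (1UL << (nr))

/* Hypothetical flags: the bit position is visible at a glance. */
#define EXAMPLE_FLAG_A BIT(24)
#define EXAMPLE_FLAG_B BIT(25)

/* Returns 1 if the single-bit flag is set in flags, 0 otherwise. */
int example_flag_is_set(unsigned long flags, unsigned long flag)
{
	assert(flag != 0 && (flag & (flag - 1)) == 0); /* exactly one bit */
	return (flags & flag) != 0;
}
```

Writing BIT(25) instead of a literal avoids miscounting zeros in a long hex
constant, which is the usual argument for the macro.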


[PATCH 4.19 064/267] crypto: virtio: Fix use-after-free in virtio_crypto_skcipher_finalize_req()

2020-06-19 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

[ Upstream commit 8c855f0720ff006d75d0a2512c7f6c4f60ff60ee ]

The system will crash when users insmod crypto/tcrypt.ko with mode=155
( testing "authenc(hmac(sha1),cbc(aes))" ). It is caused by reuse of the
memory of the request structure.

In crypto_authenc_init_tfm(), the reqsize is set to:
  [PART 1] sizeof(authenc_request_ctx) +
  [PART 2] ictx->reqoff +
  [PART 3] MAX(ahash part, skcipher part)
and 'PART 3' is used by both ahash and skcipher in turn.

When the virtio_crypto driver finishes an skcipher req, it calls the
->complete callback (in crypto_finalize_skcipher_request) and then frees
its resources, whose pointers are recorded in the 'skcipher part'.

However, ->complete is 'crypto_authenc_encrypt_done' in this case;
it uses the 'ahash part' of the request and changes its content,
so the virtio_crypto driver gets the wrong pointer after ->complete
finishes and mistakenly frees someone else's memory. The system then
crashes when that memory is used again.

The resources which need to be cleaned up are not used any more, but the
pointers to these resources may be changed in the function
"crypto_finalize_skcipher_request". Thus release the specific resources
before calling this function.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Reported-by: LABBE Corentin 
Cc: Gonglei 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20200123101000.GB24255@Red
Acked-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-3-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index 38432721069f..9348060cc32f 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -594,10 +594,11 @@ static void virtio_crypto_ablkcipher_finalize_req(
scatterwalk_map_and_copy(req->info, req->dst,
 req->nbytes - AES_BLOCK_SIZE,
 AES_BLOCK_SIZE, 0);
-   crypto_finalize_ablkcipher_request(vc_sym_req->base.dataq->engine,
-  req, err);
kzfree(vc_sym_req->iv);
virtcrypto_clear_request(&vc_sym_req->base);
+
+   crypto_finalize_ablkcipher_request(vc_sym_req->base.dataq->engine,
+  req, err);
 }
 
 static struct virtio_crypto_algo virtio_crypto_algs[] = { {
-- 
2.25.1
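
The ordering issue described in the commit message above can be modelled in
userspace. This is a sketch under stated assumptions, not the driver's code:
union shared_ctx stands in for 'PART 3' of the request, and
overwrite_on_complete() and finalize_request() are hypothetical names. The
point is only that the driver's pointer lives in memory the completion
callback is allowed to overwrite, so it must be freed before the callback
runs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the region shared by the ahash and skcipher parts. */
union shared_ctx {
	struct { unsigned char *iv; } skcipher;      /* driver's view */
	struct { unsigned char digest[32]; } ahash;  /* callback's view */
};

struct request {
	union shared_ctx ctx;
	void (*complete)(struct request *req);
};

/* Callback that reuses the shared region, clobbering the iv pointer. */
void overwrite_on_complete(struct request *req)
{
	memset(req->ctx.ahash.digest, 0xAA, sizeof(req->ctx.ahash.digest));
}

/* Fixed ordering: release driver resources first, then call ->complete. */
void finalize_request(struct request *req)
{
	assert(req != NULL && req->complete != NULL);
	free(req->ctx.skcipher.iv);
	req->ctx.skcipher.iv = NULL;
	req->complete(req);
}
```

Swapping the two steps in finalize_request() would pass free() a pointer
made of 0xAA bytes, which is the kind of crash the patch removes.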





[PATCH 4.19 065/267] crypto: virtio: Fix src/dst scatterlist calculation in __virtio_crypto_skcipher_do_req()

2020-06-19 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

[ Upstream commit b02989f37fc5e865c9070907e4493b3a21e2 ]

The system will crash when the users insmod crypto/tcrypt.ko with mode=38
( testing "cts(cbc(aes))" ).

Usually the next entry of one sg will be @sg@ + 1, but if this sg element
is part of a chained scatterlist, it could jump to the start of a new
scatterlist array. Fix it by using sg_next() when calculating the src/dst
scatterlists.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Reported-by: LABBE Corentin 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20200123101000.GB24255@Red
Signed-off-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-2-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index 9348060cc32f..e9a8485c4929 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -367,13 +367,18 @@ __virtio_crypto_ablkcipher_do_req(struct virtio_crypto_sym_request *vc_sym_req,
int err;
unsigned long flags;
struct scatterlist outhdr, iv_sg, status_sg, **sgs;
-   int i;
u64 dst_len;
unsigned int num_out = 0, num_in = 0;
int sg_total;
uint8_t *iv;
+   struct scatterlist *sg;
 
src_nents = sg_nents_for_len(req->src, req->nbytes);
+   if (src_nents < 0) {
+   pr_err("Invalid number of src SG.\n");
+   return src_nents;
+   }
+
dst_nents = sg_nents(req->dst);
 
pr_debug("virtio_crypto: Number of sgs (src_nents: %d, dst_nents: %d)\n",
@@ -459,12 +464,12 @@ __virtio_crypto_ablkcipher_do_req(struct virtio_crypto_sym_request *vc_sym_req,
vc_sym_req->iv = iv;
 
/* Source data */
-   for (i = 0; i < src_nents; i++)
-   sgs[num_out++] = &req->src[i];
+   for (sg = req->src; src_nents; sg = sg_next(sg), src_nents--)
+   sgs[num_out++] = sg;
 
/* Destination data */
-   for (i = 0; i < dst_nents; i++)
-   sgs[num_out + num_in++] = &req->dst[i];
+   for (sg = req->dst; sg; sg = sg_next(sg))
+   sgs[num_out + num_in++] = sg;
 
/* Status */
sg_init_one(&status_sg, &vc_req->status, sizeof(vc_req->status));
-- 
2.25.1
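
Why indexing with &req->src[i] breaks on chained lists can be shown with a
simplified userspace model. The struct below is not the kernel's struct
scatterlist; sg_entry, model_sg_next() and collect_ids() are hypothetical
names, and the chain link is modelled as a plain pointer. The walker follows
the same three steps as sg_next(): stop at the last entry, advance by one,
and follow a chain link if the new slot is one.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of a scatterlist slot. */
struct sg_entry {
	int id;                  /* payload tag for the demo */
	int is_last;             /* marks the final data entry */
	struct sg_entry *chain;  /* non-NULL: link slot to the next array */
};

/* Model of sg_next(): never assume the next entry is simply sg + 1. */
struct sg_entry *model_sg_next(struct sg_entry *sg)
{
	assert(sg != NULL);
	if (sg->is_last)
		return NULL;
	sg++;
	if (sg->chain)
		sg = sg->chain;
	return sg;
}

/* Walk the list and record up to max entry ids; returns the count. */
size_t collect_ids(struct sg_entry *sg, int *out, size_t max)
{
	size_t n = 0;

	for (; sg && n < max; sg = model_sg_next(sg))
		out[n++] = sg->id;
	return n;
}
```

With two data entries followed by a link slot in the first array, plain
indexing would treat the link slot as data, while the walker jumps to the
second array.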





[PATCH 4.19 066/267] crypto: virtio: Fix dest length calculation in __virtio_crypto_skcipher_do_req()

2020-06-19 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

[ Upstream commit d90ca42012db2863a9a30b564a2ace6016594bda ]

The src/dst length is not aligned with AES_BLOCK_SIZE (which is 16) in some
testcases in tcrypt.ko.

For example, the src/dst length of one of cts(cbc(aes))'s testcases is 17; the
crypto_virtio driver will set @src_data_len=16 but @dst_data_len=17 in this
case and get a wrong result at the end.

  SRC: pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp (17 bytes)
  EXP: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc pp (17 bytes)
  DST: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 00 (pollute the last bytes)
  (pp: plaintext  cc:ciphertext)

Fix this issue by limiting the length of the dest buffer.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Cc: Gonglei 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-4-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index e9a8485c4929..ab4700e4b409 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -424,6 +424,7 @@ __virtio_crypto_ablkcipher_do_req(struct virtio_crypto_sym_request *vc_sym_req,
goto free;
}
 
+   dst_len = min_t(unsigned int, req->nbytes, dst_len);
pr_debug("virtio_crypto: src_len: %u, dst_len: %llu\n",
req->nbytes, dst_len);
 
-- 
2.25.1
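
The one-line fix above clamps the destination length to the request's byte
count. A minimal userspace sketch of that clamp follows. clamp_dst_len() is
a hypothetical name, and the cast mirrors min_t(unsigned int, ...), which
compares both operands as unsigned int. The sample values in the note below
are illustrative and not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: min_t(unsigned int, nbytes, dst_len). */
uint64_t clamp_dst_len(unsigned int nbytes, uint64_t dst_len)
{
	unsigned int d = (unsigned int)dst_len; /* min_t casts both sides */

	assert(d == dst_len); /* the sketch assumes dst_len fits in 32 bits */
	return nbytes < d ? nbytes : d;
}
```

For example, clamp_dst_len(16, 17) gives 16, so the device is never told
that the destination is longer than the data being processed.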





[PATCH 4.14 049/190] crypto: virtio: Fix use-after-free in virtio_crypto_skcipher_finalize_req()

2020-06-19 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

[ Upstream commit 8c855f0720ff006d75d0a2512c7f6c4f60ff60ee ]

The system will crash when users insmod crypto/tcrypt.ko with mode=155
( testing "authenc(hmac(sha1),cbc(aes))" ). It is caused by reuse of the
memory of the request structure.

In crypto_authenc_init_tfm(), the reqsize is set to:
  [PART 1] sizeof(authenc_request_ctx) +
  [PART 2] ictx->reqoff +
  [PART 3] MAX(ahash part, skcipher part)
and 'PART 3' is used by both ahash and skcipher in turn.

When the virtio_crypto driver finishes an skcipher req, it calls the
->complete callback (in crypto_finalize_skcipher_request) and then frees
its resources, whose pointers are recorded in the 'skcipher part'.

However, ->complete is 'crypto_authenc_encrypt_done' in this case;
it uses the 'ahash part' of the request and changes its content,
so the virtio_crypto driver gets the wrong pointer after ->complete
finishes and mistakenly frees someone else's memory. The system then
crashes when that memory is used again.

The resources which need to be cleaned up are not used any more, but the
pointers to these resources may be changed in the function
"crypto_finalize_skcipher_request". Thus release the specific resources
before calling this function.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Reported-by: LABBE Corentin 
Cc: Gonglei 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20200123101000.GB24255@Red
Acked-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-3-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index e2231a1a05a1..772d2b3137c6 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -569,10 +569,11 @@ static void virtio_crypto_ablkcipher_finalize_req(
struct ablkcipher_request *req,
int err)
 {
-   crypto_finalize_cipher_request(vc_sym_req->base.dataq->engine,
-   req, err);
kzfree(vc_sym_req->iv);
virtcrypto_clear_request(&vc_sym_req->base);
+
+   crypto_finalize_cipher_request(vc_sym_req->base.dataq->engine,
+  req, err);
 }
 
 static struct crypto_alg virtio_crypto_algs[] = { {
-- 
2.25.1





[PATCH 4.14 050/190] crypto: virtio: Fix src/dst scatterlist calculation in __virtio_crypto_skcipher_do_req()

2020-06-19 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

[ Upstream commit b02989f37fc5e865c9070907e4493b3a21e2 ]

The system will crash when the users insmod crypto/tcrypt.ko with mode=38
( testing "cts(cbc(aes))" ).

Usually the next entry of one sg will be @sg@ + 1, but if this sg element
is part of a chained scatterlist, it could jump to the start of a new
scatterlist array. Fix it by using sg_next() when calculating the src/dst
scatterlists.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Reported-by: LABBE Corentin 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20200123101000.GB24255@Red
Signed-off-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-2-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index 772d2b3137c6..fee78ec46bae 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -354,13 +354,18 @@ __virtio_crypto_ablkcipher_do_req(struct virtio_crypto_sym_request *vc_sym_req,
int err;
unsigned long flags;
struct scatterlist outhdr, iv_sg, status_sg, **sgs;
-   int i;
u64 dst_len;
unsigned int num_out = 0, num_in = 0;
int sg_total;
uint8_t *iv;
+   struct scatterlist *sg;
 
src_nents = sg_nents_for_len(req->src, req->nbytes);
+   if (src_nents < 0) {
+   pr_err("Invalid number of src SG.\n");
+   return src_nents;
+   }
+
dst_nents = sg_nents(req->dst);
 
pr_debug("virtio_crypto: Number of sgs (src_nents: %d, dst_nents: %d)\n",
@@ -441,12 +446,12 @@ __virtio_crypto_ablkcipher_do_req(struct virtio_crypto_sym_request *vc_sym_req,
vc_sym_req->iv = iv;
 
/* Source data */
-   for (i = 0; i < src_nents; i++)
-   sgs[num_out++] = &req->src[i];
+   for (sg = req->src; src_nents; sg = sg_next(sg), src_nents--)
+   sgs[num_out++] = sg;
 
/* Destination data */
-   for (i = 0; i < dst_nents; i++)
-   sgs[num_out + num_in++] = &req->dst[i];
+   for (sg = req->dst; sg; sg = sg_next(sg))
+   sgs[num_out + num_in++] = sg;
 
/* Status */
sg_init_one(&status_sg, &vc_req->status, sizeof(vc_req->status));
-- 
2.25.1





[PATCH 4.14 051/190] crypto: virtio: Fix dest length calculation in __virtio_crypto_skcipher_do_req()

2020-06-19 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

[ Upstream commit d90ca42012db2863a9a30b564a2ace6016594bda ]

The src/dst length is not aligned with AES_BLOCK_SIZE (which is 16) in some
testcases in tcrypt.ko.

For example, the src/dst length of one of cts(cbc(aes))'s testcases is 17; the
crypto_virtio driver will set @src_data_len=16 but @dst_data_len=17 in this
case and get a wrong result at the end.

  SRC: pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp (17 bytes)
  EXP: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc pp (17 bytes)
  DST: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 00 (pollute the last bytes)
  (pp: plaintext  cc:ciphertext)

Fix this issue by limiting the length of the dest buffer.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Cc: Gonglei 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-4-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index fee78ec46bae..e6b889ce395e 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -411,6 +411,7 @@ __virtio_crypto_ablkcipher_do_req(struct virtio_crypto_sym_request *vc_sym_req,
goto free;
}
 
+   dst_len = min_t(unsigned int, req->nbytes, dst_len);
pr_debug("virtio_crypto: src_len: %u, dst_len: %llu\n",
req->nbytes, dst_len);
 
-- 
2.25.1





[PATCH 5.6 100/161] crypto: virtio: Fix src/dst scatterlist calculation in __virtio_crypto_skcipher_do_req()

2020-06-16 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

commit b02989f37fc5e865c9070907e4493b3a21e2 upstream.

The system will crash when the users insmod crypto/tcrypt.ko with mode=38
( testing "cts(cbc(aes))" ).

Usually the next entry of one sg will be @sg@ + 1, but if this sg element
is part of a chained scatterlist, it could jump to the start of a new
scatterlist array. Fix it by using sg_next() when calculating the src/dst
scatterlists.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Reported-by: LABBE Corentin 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20200123101000.GB24255@Red
Signed-off-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-2-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/crypto/virtio/virtio_crypto_algs.c |   15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -350,13 +350,18 @@ __virtio_crypto_skcipher_do_req(struct v
int err;
unsigned long flags;
struct scatterlist outhdr, iv_sg, status_sg, **sgs;
-   int i;
u64 dst_len;
unsigned int num_out = 0, num_in = 0;
int sg_total;
uint8_t *iv;
+   struct scatterlist *sg;
 
src_nents = sg_nents_for_len(req->src, req->cryptlen);
+   if (src_nents < 0) {
+   pr_err("Invalid number of src SG.\n");
+   return src_nents;
+   }
+
dst_nents = sg_nents(req->dst);
 
pr_debug("virtio_crypto: Number of sgs (src_nents: %d, dst_nents: %d)\n",
@@ -443,12 +448,12 @@ __virtio_crypto_skcipher_do_req(struct v
vc_sym_req->iv = iv;
 
/* Source data */
-   for (i = 0; i < src_nents; i++)
-   sgs[num_out++] = &req->src[i];
+   for (sg = req->src; src_nents; sg = sg_next(sg), src_nents--)
+   sgs[num_out++] = sg;
 
/* Destination data */
-   for (i = 0; i < dst_nents; i++)
-   sgs[num_out + num_in++] = &req->dst[i];
+   for (sg = req->dst; sg; sg = sg_next(sg))
+   sgs[num_out + num_in++] = sg;
 
/* Status */
sg_init_one(&status_sg, &vc_req->status, sizeof(vc_req->status));
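
A note on why plain array indexing breaks here. The following is only a userspace sketch, assuming a toy entry type (struct toy_sg, toy_sg_next() and the other toy_* names are hypothetical and are not the kernel's struct scatterlist or its API). It models the rule sg_next() follows: stop at the entry marked last, otherwise step to the next slot and follow a chain link if that slot is one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a scatterlist entry; NOT the kernel layout. */
struct toy_sg {
	int id;               /* payload marker for the demo */
	int is_last;          /* set on the final data entry */
	struct toy_sg *chain; /* non-NULL: this slot links to another array */
};

/* Model of sg_next(): step forward, following a chain link if present. */
static struct toy_sg *toy_sg_next(struct toy_sg *sg)
{
	if (sg->is_last)
		return NULL;
	sg++;
	if (sg->chain)
		sg = sg->chain;
	return sg;
}

/* Walk the list from head and record each data entry's id. */
static size_t toy_collect(struct toy_sg *head, int *out, size_t cap)
{
	size_t n = 0;

	assert(out != NULL);
	for (struct toy_sg *sg = head; sg && n < cap; sg = toy_sg_next(sg))
		out[n++] = sg->id;
	return n;
}

/* Two arrays: {1, 2, link} chained to {3, 4 (last)}. */
static size_t toy_demo(int *out, size_t cap)
{
	static struct toy_sg first[3];
	static struct toy_sg second[2];

	second[0] = (struct toy_sg){ .id = 3 };
	second[1] = (struct toy_sg){ .id = 4, .is_last = 1 };
	first[0] = (struct toy_sg){ .id = 1 };
	first[1] = (struct toy_sg){ .id = 2 };
	first[2] = (struct toy_sg){ .id = -1, .chain = second };
	return toy_collect(first, out, cap);
}
```

In this model, indexing first[2] directly would hand back the link slot rather than a data entry, which is the kind of mistake the old &req->src[i] loop could make on a chained list.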




[PATCH 5.6 099/161] crypto: virtio: Fix use-after-free in virtio_crypto_skcipher_finalize_req()

2020-06-16 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

commit 8c855f0720ff006d75d0a2512c7f6c4f60ff60ee upstream.

The system will crash when users insmod crypto/tcrypt.ko with mode=155
(testing "authenc(hmac(sha1),cbc(aes))"). The crash is caused by reuse of
the request structure's memory.

In crypto_authenc_init_tfm(), the reqsize is set to:
  [PART 1] sizeof(authenc_request_ctx) +
  [PART 2] ictx->reqoff +
  [PART 3] MAX(ahash part, skcipher part)
and 'PART 3' is used by both ahash and skcipher in turn.

When the virtio_crypto driver finishes a skcipher request, it calls the
->complete callback (in crypto_finalize_skcipher_request) and then frees
its resources, whose pointers are recorded in the 'skcipher part'.

However, ->complete is 'crypto_authenc_encrypt_done' in this case. It
uses the 'ahash part' of the request and changes its content, so the
virtio_crypto driver reads a wrong pointer after ->complete finishes and
mistakenly frees memory that belongs to someone else. The system then
crashes when that memory is used again.

The resources that need to be cleaned up are no longer used at that
point, but the pointers to them may be changed inside
"crypto_finalize_skcipher_request". Thus release these resources before
calling this function.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Reported-by: LABBE Corentin 
Cc: Gonglei 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20200123101000.GB24255@Red
Acked-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-3-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/crypto/virtio/virtio_crypto_algs.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -578,10 +578,11 @@ static void virtio_crypto_skcipher_final
scatterwalk_map_and_copy(req->iv, req->dst,
 req->cryptlen - AES_BLOCK_SIZE,
 AES_BLOCK_SIZE, 0);
-   crypto_finalize_skcipher_request(vc_sym_req->base.dataq->engine,
-  req, err);
kzfree(vc_sym_req->iv);
virtcrypto_clear_request(&vc_sym_req->base);
+
+   crypto_finalize_skcipher_request(vc_sym_req->base.dataq->engine,
+  req, err);
 }
 
 static struct virtio_crypto_algo virtio_crypto_algs[] = { {
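
To make the ordering argument concrete, here is a small userspace sketch. It assumes a toy request whose context area is a union shared by two users; all toy_* names are hypothetical and nothing here is the crypto API. The completion callback scribbles over the shared area, so the owner releases its buffer first and only then invokes the callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Shared context area, analogous to 'PART 3' in the commit message. */
union toy_part3 {
	struct { void *iv; } skcipher;
	struct { unsigned char digest[sizeof(void *)]; } ahash;
};

struct toy_req {
	union toy_part3 ctx;
	void (*complete)(struct toy_req *req);
};

static uintptr_t toy_last_released; /* records what was actually freed */

static void toy_release(void *p)
{
	toy_last_released = (uintptr_t)p;
	free(p);
}

/* Completion handler that reuses the shared area for its own data. */
static void toy_authenc_done(struct toy_req *req)
{
	memset(req->ctx.ahash.digest, 0xAA, sizeof(req->ctx.ahash.digest));
}

/* Fixed ordering: release our resources, then hand the request back. */
static void toy_finalize(struct toy_req *req)
{
	assert(req != NULL && req->complete != NULL);
	toy_release(req->ctx.skcipher.iv);
	req->ctx.skcipher.iv = NULL;
	req->complete(req);
}

/* Returns 1 if the pointer that was freed is the one we allocated. */
static int toy_demo(void)
{
	struct toy_req req = { .complete = toy_authenc_done };
	void *iv = malloc(16);
	uintptr_t expected = (uintptr_t)iv;

	if (!iv)
		return -1;
	req.ctx.skcipher.iv = iv;
	toy_finalize(&req);
	return toy_last_released == expected;
}
```

With the two steps in the opposite order, toy_release() would be handed whatever bytes the callback left in the union, which mirrors the crash described above.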




[PATCH 5.6 098/161] crypto: virtio: Fix dest length calculation in __virtio_crypto_skcipher_do_req()

2020-06-16 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

commit d90ca42012db2863a9a30b564a2ace6016594bda upstream.

The src/dst length is not aligned with AES_BLOCK_SIZE (which is 16) in some
test cases in tcrypt.ko.

For example, the src/dst length of one of the cts(cbc(aes)) test cases is 17.
The virtio_crypto driver will set @src_data_len=16 but @dst_data_len=17 in
this case and get a wrong result at the end.

  SRC: pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp (17 bytes)
  EXP: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc pp (17 bytes)
  DST: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 00 (pollutes the last byte)
  (pp: plaintext  cc: ciphertext)

Fix this issue by limiting the length of the dest buffer.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Cc: Gonglei 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-4-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/crypto/virtio/virtio_crypto_algs.c |1 +
 1 file changed, 1 insertion(+)

--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -402,6 +402,7 @@ __virtio_crypto_skcipher_do_req(struct v
goto free;
}
 
+   dst_len = min_t(unsigned int, req->cryptlen, dst_len);
pr_debug("virtio_crypto: src_len: %u, dst_len: %llu\n",
req->cryptlen, dst_len);
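
As a rough illustration of the one-line fix, the clamp can be sketched in userspace as below. toy_clamp_dst_len() is a hypothetical helper, not a kernel function; it assumes the same semantics as min_t(unsigned int, a, b), which casts both operands to unsigned int before comparing.

```c
#include <assert.h>
#include <stdint.h>

/* The narrowing cast below only makes sense if unsigned int fits in 64 bits. */
static_assert(sizeof(unsigned int) <= sizeof(uint64_t),
	      "unsigned int must not be wider than uint64_t");

/* Limit the destination length to the number of bytes actually processed. */
static uint64_t toy_clamp_dst_len(unsigned int cryptlen, uint64_t dst_len)
{
	unsigned int d = (unsigned int)dst_len; /* what min_t's cast would do */

	return cryptlen < d ? cryptlen : d;
}
```

For the case in the commit message, toy_clamp_dst_len(16, 17) yields 16, so the 17th destination byte is left alone.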
 




[PATCH 5.7 095/163] crypto: virtio: Fix src/dst scatterlist calculation in __virtio_crypto_skcipher_do_req()

2020-06-16 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

commit b02989f37fc5e865c9070907e4493b3a21e2 upstream.

The system will crash when users insmod crypto/tcrypt.ko with mode=38
(testing "cts(cbc(aes))").

Usually the next entry after one sg is sg + 1, but if this sg element
is part of a chained scatterlist, the next entry could be the start of a
new scatterlist array. Fix it by using sg_next() when calculating the
src/dst scatterlists.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Reported-by: LABBE Corentin 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20200123101000.GB24255@Red
Signed-off-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-2-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/crypto/virtio/virtio_crypto_algs.c |   15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -350,13 +350,18 @@ __virtio_crypto_skcipher_do_req(struct v
int err;
unsigned long flags;
struct scatterlist outhdr, iv_sg, status_sg, **sgs;
-   int i;
u64 dst_len;
unsigned int num_out = 0, num_in = 0;
int sg_total;
uint8_t *iv;
+   struct scatterlist *sg;
 
src_nents = sg_nents_for_len(req->src, req->cryptlen);
+   if (src_nents < 0) {
+   pr_err("Invalid number of src SG.\n");
+   return src_nents;
+   }
+
dst_nents = sg_nents(req->dst);
 
pr_debug("virtio_crypto: Number of sgs (src_nents: %d, dst_nents: %d)\n",
@@ -443,12 +448,12 @@ __virtio_crypto_skcipher_do_req(struct v
vc_sym_req->iv = iv;
 
/* Source data */
-   for (i = 0; i < src_nents; i++)
-   sgs[num_out++] = &req->src[i];
+   for (sg = req->src; src_nents; sg = sg_next(sg), src_nents--)
+   sgs[num_out++] = sg;
 
/* Destination data */
-   for (i = 0; i < dst_nents; i++)
-   sgs[num_out + num_in++] = &req->dst[i];
+   for (sg = req->dst; sg; sg = sg_next(sg))
+   sgs[num_out + num_in++] = sg;
 
/* Status */
sg_init_one(&status_sg, &vc_req->status, sizeof(vc_req->status));




[PATCH 5.7 094/163] crypto: virtio: Fix use-after-free in virtio_crypto_skcipher_finalize_req()

2020-06-16 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

commit 8c855f0720ff006d75d0a2512c7f6c4f60ff60ee upstream.

The system will crash when users insmod crypto/tcrypt.ko with mode=155
(testing "authenc(hmac(sha1),cbc(aes))"). The crash is caused by reuse of
the request structure's memory.

In crypto_authenc_init_tfm(), the reqsize is set to:
  [PART 1] sizeof(authenc_request_ctx) +
  [PART 2] ictx->reqoff +
  [PART 3] MAX(ahash part, skcipher part)
and 'PART 3' is used by both ahash and skcipher in turn.

When the virtio_crypto driver finishes a skcipher request, it calls the
->complete callback (in crypto_finalize_skcipher_request) and then frees
its resources, whose pointers are recorded in the 'skcipher part'.

However, ->complete is 'crypto_authenc_encrypt_done' in this case. It
uses the 'ahash part' of the request and changes its content, so the
virtio_crypto driver reads a wrong pointer after ->complete finishes and
mistakenly frees memory that belongs to someone else. The system then
crashes when that memory is used again.

The resources that need to be cleaned up are no longer used at that
point, but the pointers to them may be changed inside
"crypto_finalize_skcipher_request". Thus release these resources before
calling this function.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Reported-by: LABBE Corentin 
Cc: Gonglei 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20200123101000.GB24255@Red
Acked-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-3-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/crypto/virtio/virtio_crypto_algs.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -578,10 +578,11 @@ static void virtio_crypto_skcipher_final
scatterwalk_map_and_copy(req->iv, req->dst,
 req->cryptlen - AES_BLOCK_SIZE,
 AES_BLOCK_SIZE, 0);
-   crypto_finalize_skcipher_request(vc_sym_req->base.dataq->engine,
-  req, err);
kzfree(vc_sym_req->iv);
virtcrypto_clear_request(&vc_sym_req->base);
+
+   crypto_finalize_skcipher_request(vc_sym_req->base.dataq->engine,
+  req, err);
 }
 
 static struct virtio_crypto_algo virtio_crypto_algs[] = { {




[PATCH 5.7 093/163] crypto: virtio: Fix dest length calculation in __virtio_crypto_skcipher_do_req()

2020-06-16 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

commit d90ca42012db2863a9a30b564a2ace6016594bda upstream.

The src/dst length is not aligned with AES_BLOCK_SIZE (which is 16) in some
test cases in tcrypt.ko.

For example, the src/dst length of one of the cts(cbc(aes)) test cases is 17.
The virtio_crypto driver will set @src_data_len=16 but @dst_data_len=17 in
this case and get a wrong result at the end.

  SRC: pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp (17 bytes)
  EXP: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc pp (17 bytes)
  DST: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 00 (pollutes the last byte)
  (pp: plaintext  cc: ciphertext)

Fix this issue by limiting the length of the dest buffer.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Cc: Gonglei 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-4-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/crypto/virtio/virtio_crypto_algs.c |1 +
 1 file changed, 1 insertion(+)

--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -402,6 +402,7 @@ __virtio_crypto_skcipher_do_req(struct v
goto free;
}
 
+   dst_len = min_t(unsigned int, req->cryptlen, dst_len);
pr_debug("virtio_crypto: src_len: %u, dst_len: %llu\n",
req->cryptlen, dst_len);
 




[PATCH 5.4 087/134] crypto: virtio: Fix src/dst scatterlist calculation in __virtio_crypto_skcipher_do_req()

2020-06-16 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

[ Upstream commit b02989f37fc5e865c9070907e4493b3a21e2 ]

The system will crash when users insmod crypto/tcrypt.ko with mode=38
(testing "cts(cbc(aes))").

Usually the next entry after one sg is sg + 1, but if this sg element
is part of a chained scatterlist, the next entry could be the start of a
new scatterlist array. Fix it by using sg_next() when calculating the
src/dst scatterlists.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Reported-by: LABBE Corentin 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20200123101000.GB24255@Red
Signed-off-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-2-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index fea55b5da8b5..3b37d0150814 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -353,13 +353,18 @@ __virtio_crypto_ablkcipher_do_req(struct virtio_crypto_sym_request *vc_sym_req,
int err;
unsigned long flags;
struct scatterlist outhdr, iv_sg, status_sg, **sgs;
-   int i;
u64 dst_len;
unsigned int num_out = 0, num_in = 0;
int sg_total;
uint8_t *iv;
+   struct scatterlist *sg;
 
src_nents = sg_nents_for_len(req->src, req->nbytes);
+   if (src_nents < 0) {
+   pr_err("Invalid number of src SG.\n");
+   return src_nents;
+   }
+
dst_nents = sg_nents(req->dst);
 
pr_debug("virtio_crypto: Number of sgs (src_nents: %d, dst_nents: %d)\n",
@@ -445,12 +450,12 @@ __virtio_crypto_ablkcipher_do_req(struct virtio_crypto_sym_request *vc_sym_req,
vc_sym_req->iv = iv;
 
/* Source data */
-   for (i = 0; i < src_nents; i++)
-   sgs[num_out++] = &req->src[i];
+   for (sg = req->src; src_nents; sg = sg_next(sg), src_nents--)
+   sgs[num_out++] = sg;
 
/* Destination data */
-   for (i = 0; i < dst_nents; i++)
-   sgs[num_out + num_in++] = &req->dst[i];
+   for (sg = req->dst; sg; sg = sg_next(sg))
+   sgs[num_out + num_in++] = sg;
 
/* Status */
sg_init_one(&status_sg, &vc_req->status, sizeof(vc_req->status));
-- 
2.25.1





[PATCH 5.4 088/134] crypto: virtio: Fix dest length calculation in __virtio_crypto_skcipher_do_req()

2020-06-16 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

[ Upstream commit d90ca42012db2863a9a30b564a2ace6016594bda ]

The src/dst length is not aligned with AES_BLOCK_SIZE (which is 16) in some
test cases in tcrypt.ko.

For example, the src/dst length of one of the cts(cbc(aes)) test cases is 17.
The virtio_crypto driver will set @src_data_len=16 but @dst_data_len=17 in
this case and get a wrong result at the end.

  SRC: pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp (17 bytes)
  EXP: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc pp (17 bytes)
  DST: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 00 (pollutes the last byte)
  (pp: plaintext  cc: ciphertext)

Fix this issue by limiting the length of the dest buffer.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Cc: Gonglei 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-4-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index 3b37d0150814..ac420b201dd8 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -410,6 +410,7 @@ __virtio_crypto_ablkcipher_do_req(struct virtio_crypto_sym_request *vc_sym_req,
goto free;
}
 
+   dst_len = min_t(unsigned int, req->nbytes, dst_len);
pr_debug("virtio_crypto: src_len: %u, dst_len: %llu\n",
req->nbytes, dst_len);
 
-- 
2.25.1





[PATCH 5.4 086/134] crypto: virtio: Fix use-after-free in virtio_crypto_skcipher_finalize_req()

2020-06-16 Thread Greg Kroah-Hartman
From: Longpeng(Mike) 

[ Upstream commit 8c855f0720ff006d75d0a2512c7f6c4f60ff60ee ]

The system will crash when users insmod crypto/tcrypt.ko with mode=155
(testing "authenc(hmac(sha1),cbc(aes))"). The crash is caused by reuse of
the request structure's memory.

In crypto_authenc_init_tfm(), the reqsize is set to:
  [PART 1] sizeof(authenc_request_ctx) +
  [PART 2] ictx->reqoff +
  [PART 3] MAX(ahash part, skcipher part)
and 'PART 3' is used by both ahash and skcipher in turn.

When the virtio_crypto driver finishes a skcipher request, it calls the
->complete callback (in crypto_finalize_skcipher_request) and then frees
its resources, whose pointers are recorded in the 'skcipher part'.

However, ->complete is 'crypto_authenc_encrypt_done' in this case. It
uses the 'ahash part' of the request and changes its content, so the
virtio_crypto driver reads a wrong pointer after ->complete finishes and
mistakenly frees memory that belongs to someone else. The system then
crashes when that memory is used again.

The resources that need to be cleaned up are no longer used at that
point, but the pointers to them may be changed inside
"crypto_finalize_skcipher_request". Thus release these resources before
calling this function.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Reported-by: LABBE Corentin 
Cc: Gonglei 
Cc: Herbert Xu 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: "David S. Miller" 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org
Link: https://lore.kernel.org/r/20200123101000.GB24255@Red
Acked-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Link: https://lore.kernel.org/r/20200602070501.2023-3-longpe...@huawei.com
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index 82b316b2f537..fea55b5da8b5 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -580,10 +580,11 @@ static void virtio_crypto_ablkcipher_finalize_req(
scatterwalk_map_and_copy(req->info, req->dst,
 req->nbytes - AES_BLOCK_SIZE,
 AES_BLOCK_SIZE, 0);
-   crypto_finalize_ablkcipher_request(vc_sym_req->base.dataq->engine,
-  req, err);
kzfree(vc_sym_req->iv);
virtcrypto_clear_request(&vc_sym_req->base);
+
+   crypto_finalize_ablkcipher_request(vc_sym_req->base.dataq->engine,
+  req, err);
 }
 
 static struct virtio_crypto_algo virtio_crypto_algs[] = { {
-- 
2.25.1





[PATCH 4.4 67/86] x86/paravirt: Remove the unused irq_enable_sysexit pv op

2020-05-18 Thread Greg Kroah-Hartman
From: Boris Ostrovsky 

commit 88c15ec90ff16880efab92b519436ee17b198477 upstream.

As a result of commit "x86/xen: Avoid fast syscall path for Xen PV
guests", the irq_enable_sysexit pv op is no longer called by Xen PV
guests, and since they were the only ones who used it we can safely
remove it.

Signed-off-by: Boris Ostrovsky 
Reviewed-by: Borislav Petkov 
Acked-by: Andy Lutomirski 
Cc: Andrew Morton 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: david.vra...@citrix.com
Cc: konrad.w...@oracle.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-de...@lists.xenproject.org
Link: http://lkml.kernel.org/r/1447970147-1733-3-git-send-email-boris.ostrov...@oracle.com
Signed-off-by: Ingo Molnar 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/x86/entry/entry_32.S |8 ++--
 arch/x86/include/asm/paravirt.h   |7 ---
 arch/x86/include/asm/paravirt_types.h |9 -
 arch/x86/kernel/asm-offsets.c |3 ---
 arch/x86/kernel/paravirt.c|7 ---
 arch/x86/kernel/paravirt_patch_32.c   |2 --
 arch/x86/kernel/paravirt_patch_64.c   |1 -
 arch/x86/xen/enlighten.c  |3 ---
 arch/x86/xen/xen-asm_32.S |   14 --
 arch/x86/xen/xen-ops.h|3 ---
 10 files changed, 2 insertions(+), 55 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -331,7 +331,8 @@ sysenter_past_esp:
 * Return back to the vDSO, which will pop ecx and edx.
 * Don't bother with DS and ES (they already contain __USER_DS).
 */
-   ENABLE_INTERRUPTS_SYSEXIT
+   sti
+   sysexit
 
 .pushsection .fixup, "ax"
 2: movl$0, PT_FS(%esp)
@@ -554,11 +555,6 @@ ENTRY(native_iret)
iret
_ASM_EXTABLE(native_iret, iret_exc)
 END(native_iret)
-
-ENTRY(native_irq_enable_sysexit)
-   sti
-   sysexit
-END(native_irq_enable_sysexit)
 #endif
 
 ENTRY(overflow)
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -938,13 +938,6 @@ extern void default_banner(void);
push %ecx; push %edx;   \
call PARA_INDIRECT(pv_cpu_ops+PV_CPU_read_cr0); \
pop %edx; pop %ecx
-
-#define ENABLE_INTERRUPTS_SYSEXIT  \
-   PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit),\
- CLBR_NONE,\
- jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_irq_enable_sysexit))
-
-
 #else  /* !CONFIG_X86_32 */
 
 /*
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -162,15 +162,6 @@ struct pv_cpu_ops {
 
u64 (*read_pmc)(int counter);
 
-#ifdef CONFIG_X86_32
-   /*
-* Atomically enable interrupts and return to userspace.  This
-* is only used in 32-bit kernels.  64-bit kernels use
-* usergs_sysret32 instead.
-*/
-   void (*irq_enable_sysexit)(void);
-#endif
-
/*
 * Switch to usermode gs and return to 64-bit usermode using
 * sysret.  Only used in 64-bit kernels to return to 64-bit
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -65,9 +65,6 @@ void common(void) {
OFFSET(PV_IRQ_irq_disable, pv_irq_ops, irq_disable);
OFFSET(PV_IRQ_irq_enable, pv_irq_ops, irq_enable);
OFFSET(PV_CPU_iret, pv_cpu_ops, iret);
-#ifdef CONFIG_X86_32
-   OFFSET(PV_CPU_irq_enable_sysexit, pv_cpu_ops, irq_enable_sysexit);
-#endif
OFFSET(PV_CPU_read_cr0, pv_cpu_ops, read_cr0);
OFFSET(PV_MMU_read_cr2, pv_mmu_ops, read_cr2);
 #endif
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -168,9 +168,6 @@ unsigned paravirt_patch_default(u8 type,
ret = paravirt_patch_ident_64(insnbuf, len);
 
else if (type == PARAVIRT_PATCH(pv_cpu_ops.iret) ||
-#ifdef CONFIG_X86_32
-type == PARAVIRT_PATCH(pv_cpu_ops.irq_enable_sysexit) ||
-#endif
 type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret32) ||
 type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret64))
/* If operation requires a jmp, then jmp */
@@ -226,7 +223,6 @@ static u64 native_steal_clock(int cpu)
 
 /* These are in entry.S */
 extern void native_iret(void);
-extern void native_irq_enable_sysexit(void);
 extern void native_usergs_sysret32(void);
 extern void native_usergs_sysret64(void);
 
@@ -385,9 +381,6 @@ __visible struct pv_cpu_ops pv_cpu_ops =
 
.load_sp0 = native_load_sp0,
 
-#if defined(CONFIG_X86_32)
-   .irq_enable_sysexit = native_irq_enable_sysexit,
-#endif
 #ifdef CONFIG_X86_64
 #ifdef CONFIG_IA32_EMULATION
.usergs_sysret32 = native_usergs_sysret32,
--- a/arch/x86/kernel/paravirt_patch_32.c
+++ b/arch/x86/kernel/paravirt_patch_32.c
@@ -5,7 +5,6 @@ DEF_NATIVE(pv_

Re: [PATCH v2 1/2] virtio: stop using legacy struct vring

2020-04-06 Thread Greg Kroah-Hartman
On Mon, Apr 06, 2020 at 11:35:23AM -0400, Michael S. Tsirkin wrote:
> struct vring (in the uapi directory) and supporting APIs are kept
> around to avoid breaking old userspace builds.
> It's not actually part of the UAPI - it was kept in the UAPI
> header by mistake, and using it in kernel isn't necessary
> and prevents us from making changes safely.
> In particular, the APIs actually assume the legacy layout.
> 
> Add struct vring_s (identical ATM) and supporting
> legacy APIs and switch everyone to use that.

How are we going to know that "struct vring_s" is what we need/want to
use?  What does "_s" mean?

"struct vring_kernel"?

naming is hard...

greg k-h


[PATCH 4.19 42/84] crypto: virtio - implement missing support for output IVs

2020-01-16 Thread Greg Kroah-Hartman
From: Ard Biesheuvel 

commit 500e6807ce93b1fdc7d5b827c5cc167cc35630db upstream.

In order to allow for CBC to be chained, which is something that the
CTS template relies upon, implementations of CBC need to pass the
IV to be used for subsequent invocations via the IV buffer. This was
not implemented yet for virtio-crypto so implement it now.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Gonglei 
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Ard Biesheuvel 
Signed-off-by: Herbert Xu 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/crypto/virtio/virtio_crypto_algs.c |9 +
 1 file changed, 9 insertions(+)

--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -449,6 +449,11 @@ __virtio_crypto_ablkcipher_do_req(struct
goto free;
}
memcpy(iv, req->info, ivsize);
+   if (!vc_sym_req->encrypt)
+   scatterwalk_map_and_copy(req->info, req->src,
+req->nbytes - AES_BLOCK_SIZE,
+AES_BLOCK_SIZE, 0);
+
sg_init_one(&iv_sg, iv, ivsize);
sgs[num_out++] = &iv_sg;
vc_sym_req->iv = iv;
@@ -585,6 +590,10 @@ static void virtio_crypto_ablkcipher_fin
struct ablkcipher_request *req,
int err)
 {
+   if (vc_sym_req->encrypt)
+   scatterwalk_map_and_copy(req->info, req->dst,
+req->nbytes - AES_BLOCK_SIZE,
+AES_BLOCK_SIZE, 0);
crypto_finalize_ablkcipher_request(vc_sym_req->base.dataq->engine,
   req, err);
kzfree(vc_sym_req->iv);
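
The output-IV rule in this patch can be sketched in userspace as follows. This is only an illustration under stated assumptions: toy_output_iv() and TOY_BLOCK are hypothetical names, and the memset() merely stands in for an in-place decryption that overwrites the ciphertext. The point is that for CBC the IV for the next call is the last ciphertext block, so for decryption it has to be saved from the source before the operation runs, while for encryption it is read from the destination afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK 16 /* assumed block size, matching AES */

/* Copy the last ciphertext block into iv_out for the next chained call. */
static void toy_output_iv(unsigned char iv_out[TOY_BLOCK],
			  const unsigned char *ciphertext, size_t len)
{
	assert(len >= TOY_BLOCK && len % TOY_BLOCK == 0);
	memcpy(iv_out, ciphertext + len - TOY_BLOCK, TOY_BLOCK);
}

/* Decrypt-in-place case: the IV must be saved before the data changes. */
static int toy_demo_decrypt_in_place(void)
{
	unsigned char buf[2 * TOY_BLOCK];
	unsigned char iv[TOY_BLOCK];

	for (size_t i = 0; i < sizeof(buf); i++)
		buf[i] = (unsigned char)i;      /* pretend ciphertext */

	toy_output_iv(iv, buf, sizeof(buf));    /* save first */
	memset(buf, 0, sizeof(buf));            /* stand-in for decryption */

	/* iv still holds bytes 16..31 of the original ciphertext */
	return iv[0] == 16 && iv[TOY_BLOCK - 1] == 31;
}
```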




[PATCH 5.4 101/203] crypto: virtio - implement missing support for output IVs

2020-01-16 Thread Greg Kroah-Hartman
From: Ard Biesheuvel 

commit 500e6807ce93b1fdc7d5b827c5cc167cc35630db upstream.

In order to allow for CBC to be chained, which is something that the
CTS template relies upon, implementations of CBC need to pass the
IV to be used for subsequent invocations via the IV buffer. This was
not implemented yet for virtio-crypto so implement it now.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Gonglei 
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Ard Biesheuvel 
Signed-off-by: Herbert Xu 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/crypto/virtio/virtio_crypto_algs.c |9 +
 1 file changed, 9 insertions(+)

--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -435,6 +435,11 @@ __virtio_crypto_ablkcipher_do_req(struct
goto free;
}
memcpy(iv, req->info, ivsize);
+   if (!vc_sym_req->encrypt)
+   scatterwalk_map_and_copy(req->info, req->src,
+req->nbytes - AES_BLOCK_SIZE,
+AES_BLOCK_SIZE, 0);
+
sg_init_one(&iv_sg, iv, ivsize);
sgs[num_out++] = &iv_sg;
vc_sym_req->iv = iv;
@@ -571,6 +576,10 @@ static void virtio_crypto_ablkcipher_fin
struct ablkcipher_request *req,
int err)
 {
+   if (vc_sym_req->encrypt)
+   scatterwalk_map_and_copy(req->info, req->dst,
+req->nbytes - AES_BLOCK_SIZE,
+AES_BLOCK_SIZE, 0);
crypto_finalize_ablkcipher_request(vc_sym_req->base.dataq->engine,
   req, err);
kzfree(vc_sym_req->iv);




[PATCH 5.4 315/434] crypto: virtio - deal with unsupported input sizes

2019-12-29 Thread Greg Kroah-Hartman
From: Ard Biesheuvel 

[ Upstream commit 19c5da7d4a2662e85ea67d2d81df57e038fde3ab ]

Return -EINVAL for input sizes that are not a multiple of the AES
block size, since they are not supported by our CBC chaining mode.

While at it, remove the pr_err() that reports unsupported key sizes
being used: we shouldn't spam the kernel log with that.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Gonglei 
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Ard Biesheuvel 
Signed-off-by: Herbert Xu 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index 42d19205166b..673fb29fda53 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -105,8 +105,6 @@ virtio_crypto_alg_validate_key(int key_len, uint32_t *alg)
*alg = VIRTIO_CRYPTO_CIPHER_AES_CBC;
break;
default:
-   pr_err("virtio_crypto: Unsupported key length: %d\n",
-   key_len);
return -EINVAL;
}
return 0;
@@ -484,6 +482,11 @@ static int virtio_crypto_ablkcipher_encrypt(struct ablkcipher_request *req)
/* Use the first data virtqueue as default */
struct data_queue *data_vq = &vcrypto->data_vq[0];
 
+   if (!req->nbytes)
+   return 0;
+   if (req->nbytes % AES_BLOCK_SIZE)
+   return -EINVAL;
+
vc_req->dataq = data_vq;
vc_req->alg_cb = virtio_crypto_dataq_sym_callback;
vc_sym_req->ablkcipher_ctx = ctx;
@@ -504,6 +507,11 @@ static int virtio_crypto_ablkcipher_decrypt(struct ablkcipher_request *req)
/* Use the first data virtqueue as default */
struct data_queue *data_vq = &vcrypto->data_vq[0];
 
+   if (!req->nbytes)
+   return 0;
+   if (req->nbytes % AES_BLOCK_SIZE)
+   return -EINVAL;
+
vc_req->dataq = data_vq;
vc_req->alg_cb = virtio_crypto_dataq_sym_callback;
vc_sym_req->ablkcipher_ctx = ctx;
-- 
2.20.1
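
The length check added to both entry points can be sketched as a small standalone helper. toy_check_len() and its return convention (1 means "submit to the device") are assumptions made for illustration only; the patch itself returns 0 or -EINVAL directly from the encrypt/decrypt functions.

```c
#include <assert.h>
#include <errno.h>

#define TOY_AES_BLOCK_SIZE 16

static_assert(TOY_AES_BLOCK_SIZE == 16, "AES block size is 16 bytes");

/*
 * Returns 0 when there is nothing to do, -EINVAL when the length is not
 * a whole number of blocks, and 1 when the request should be submitted.
 */
static int toy_check_len(unsigned int nbytes)
{
	if (!nbytes)
		return 0;
	if (nbytes % TOY_AES_BLOCK_SIZE)
		return -EINVAL;
	return 1;
}
```

An empty request is treated as success rather than an error, which matches the early "return 0" in the hunks above.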





[PATCH 4.19 167/219] crypto: virtio - deal with unsupported input sizes

2019-12-29 Thread Greg Kroah-Hartman
From: Ard Biesheuvel 

[ Upstream commit 19c5da7d4a2662e85ea67d2d81df57e038fde3ab ]

Return -EINVAL for input sizes that are not a multiple of the AES
block size, since they are not supported by our CBC chaining mode.

While at it, remove the pr_err() that reports unsupported key sizes
being used: we shouldn't spam the kernel log with that.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Gonglei 
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Ard Biesheuvel 
Signed-off-by: Herbert Xu 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index 2c573d1aaa64..523b712770ac 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -117,8 +117,6 @@ virtio_crypto_alg_validate_key(int key_len, uint32_t *alg)
*alg = VIRTIO_CRYPTO_CIPHER_AES_CBC;
break;
default:
-   pr_err("virtio_crypto: Unsupported key length: %d\n",
-   key_len);
return -EINVAL;
}
return 0;
@@ -498,6 +496,11 @@ static int virtio_crypto_ablkcipher_encrypt(struct ablkcipher_request *req)
/* Use the first data virtqueue as default */
struct data_queue *data_vq = &vcrypto->data_vq[0];
 
+   if (!req->nbytes)
+   return 0;
+   if (req->nbytes % AES_BLOCK_SIZE)
+   return -EINVAL;
+
vc_req->dataq = data_vq;
vc_req->alg_cb = virtio_crypto_dataq_sym_callback;
vc_sym_req->ablkcipher_ctx = ctx;
@@ -518,6 +521,11 @@ static int virtio_crypto_ablkcipher_decrypt(struct 
ablkcipher_request *req)
/* Use the first data virtqueue as default */
struct data_queue *data_vq = &vcrypto->data_vq[0];
 
+   if (!req->nbytes)
+   return 0;
+   if (req->nbytes % AES_BLOCK_SIZE)
+   return -EINVAL;
+
vc_req->dataq = data_vq;
vc_req->alg_cb = virtio_crypto_dataq_sym_callback;
vc_sym_req->ablkcipher_ctx = ctx;
-- 
2.20.1
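
The length handling that the hunks above add to the encrypt and decrypt paths can be sketched outside the kernel as follows. This is a minimal user-space illustration under stated assumptions, not the driver code: cbc_len_check() and its submit out-parameter are hypothetical names, and AES_BLOCK_SIZE is defined locally as 16 bytes.

```c
#include <errno.h>

/* AES block size in bytes; defined locally because this is not kernel code. */
#define AES_BLOCK_SIZE 16

/*
 * Hypothetical helper mirroring the two checks added by the patch.
 * Returns 0 on success or -EINVAL for an unsupported length.
 * *submit is set to 1 only when the request should be passed on
 * to the device; it stays 0 for empty or rejected requests.
 */
static int cbc_len_check(unsigned int nbytes, int *submit)
{
	*submit = 0;

	/* Empty request: nothing to do, report success right away. */
	if (!nbytes)
		return 0;

	/* Lengths that are not a whole number of blocks are rejected. */
	if (nbytes % AES_BLOCK_SIZE)
		return -EINVAL;

	*submit = 1;
	return 0;
}
```

With this sketch, a 32-byte request is accepted and submitted, a 17-byte request fails with -EINVAL, and a zero-length request succeeds without being submitted.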





[PATCH 4.14 121/161] crypto: virtio - deal with unsupported input sizes

2019-12-29 Thread Greg Kroah-Hartman
From: Ard Biesheuvel 

[ Upstream commit 19c5da7d4a2662e85ea67d2d81df57e038fde3ab ]

Return -EINVAL for input sizes that are not a multiple of the AES
block size, since they are not supported by our CBC chaining mode.

While at it, remove the pr_err() that reports unsupported key sizes
being used: we shouldn't spam the kernel log with that.

Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver")
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Gonglei 
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Ard Biesheuvel 
Signed-off-by: Herbert Xu 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index 5035b0dc1e40..e2231a1a05a1 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -110,8 +110,6 @@ virtio_crypto_alg_validate_key(int key_len, uint32_t *alg)
        *alg = VIRTIO_CRYPTO_CIPHER_AES_CBC;
        break;
    default:
-       pr_err("virtio_crypto: Unsupported key length: %d\n",
-           key_len);
        return -EINVAL;
    }
    return 0;
@@ -485,6 +483,11 @@ static int virtio_crypto_ablkcipher_encrypt(struct ablkcipher_request *req)
    /* Use the first data virtqueue as default */
    struct data_queue *data_vq = &vcrypto->data_vq[0];
 
+   if (!req->nbytes)
+       return 0;
+   if (req->nbytes % AES_BLOCK_SIZE)
+       return -EINVAL;
+
    vc_req->dataq = data_vq;
    vc_req->alg_cb = virtio_crypto_dataq_sym_callback;
    vc_sym_req->ablkcipher_ctx = ctx;
@@ -505,6 +508,11 @@ static int virtio_crypto_ablkcipher_decrypt(struct ablkcipher_request *req)
    /* Use the first data virtqueue as default */
    struct data_queue *data_vq = &vcrypto->data_vq[0];
 
+   if (!req->nbytes)
+       return 0;
+   if (req->nbytes % AES_BLOCK_SIZE)
+       return -EINVAL;
+
    vc_req->dataq = data_vq;
    vc_req->alg_cb = virtio_crypto_dataq_sym_callback;
    vc_sym_req->ablkcipher_ctx = ctx;
-- 
2.20.1





[PATCH 4.9 115/199] virtio-balloon: fix managed page counts when migrating pages between zones

2019-12-19 Thread Greg Kroah-Hartman
 Pages 32768
  [  190.830303] Offlined Pages 32768
  [  190.833071] Built 1 zonelists, mobility grouping on.  Total pages: -36920272750453009

In another instance (older kernel), I was no longer able to start any
process:
  [root@vm ~]# [  214.348068] Offlined Pages 32768
  [  215.973009] Offlined Pages 32768
  cat /proc/meminfo
  -bash: fork: Cannot allocate memory
  [root@vm ~]# cat /proc/meminfo
  -bash: fork: Cannot allocate memory

Fix it by properly adjusting the managed page count when migrating if
the zone changed. The managed page count of the zones now looks after
unplug of the DIMM (and after deflating the balloon) just like before
inflating the balloon (and plugging+onlining the DIMM).

We'll temporarily modify the totalram page count. If this ever becomes a
problem, we can fine tune by providing helpers that don't touch
the totalram pages (e.g., adjust_zone_managed_page_count()).

Please note that fixing up the managed page count is only necessary when
we adjusted the managed page count when inflating - only if we
don't have VIRTIO_BALLOON_F_DEFLATE_ON_OOM. With that feature, the
managed page count is not touched when inflating/deflating.

Reported-by: Yumei Huang 
Fixes: 3dcc0571cd64 ("mm: correctly update zone->managed_pages")
Cc:  # v3.11+
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Jiang Liu 
Cc: Andrew Morton 
Cc: Igor Mammedov 
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: David Hildenbrand 
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/virtio/virtio_balloon.c |   11 +++
 1 file changed, 11 insertions(+)

--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -482,6 +482,17 @@ static int virtballoon_migratepage(struc
 
    get_page(newpage); /* balloon reference */
 
+   /*
+    * When we migrate a page to a different zone and adjusted the
+    * managed page count when inflating, we have to fixup the count of
+    * both involved zones.
+    */
+   if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM) &&
+       page_zone(page) != page_zone(newpage)) {
+       adjust_managed_page_count(page, 1);
+       adjust_managed_page_count(newpage, -1);
+   }
+
    /* balloon's page migration 1st step  -- inflate "newpage" */
    spin_lock_irqsave(&vb_dev_info->pages_lock, flags);
    balloon_page_insert(vb_dev_info, newpage);
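
The accounting that the hunk above fixes up can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct zone_model, model_inflate() and model_migrate() are hypothetical names invented for the illustration, and each zone is reduced to a single managed page counter.

```c
#include <stdbool.h>

/* Toy stand-in for a memory zone; only the managed page count is modelled. */
struct zone_model {
	long managed_pages;
};

/*
 * Inflating takes one page out of the zone's managed count, but only
 * when the deflate-on-OOM feature is not in use (assumption: this
 * mirrors the behaviour described in the commit message).
 */
static void model_inflate(struct zone_model *zone, bool deflate_on_oom)
{
	if (!deflate_on_oom)
		zone->managed_pages -= 1;
}

/*
 * Migrating a balloon page: if the count was adjusted when inflating
 * and the new page lives in a different zone, give the page back to
 * the old zone and charge it to the new one.
 */
static void model_migrate(struct zone_model *old_zone,
			  struct zone_model *new_zone,
			  bool deflate_on_oom)
{
	if (!deflate_on_oom && old_zone != new_zone) {
		old_zone->managed_pages += 1;
		new_zone->managed_pages -= 1;
	}
}
```

In this model, inflating one page from a zone with 100 managed pages leaves 99; migrating that page to a second zone restores the first zone to 100 and lowers the second by one, so the sum over both zones stays the same. Without the fixup, the first zone would stay one page short after the balloon page had moved away.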




[PATCH 4.4 098/162] virtio-balloon: fix managed page counts when migrating pages between zones

2019-12-19 Thread Greg Kroah-Hartman
 Pages 32768
  [  190.830303] Offlined Pages 32768
  [  190.833071] Built 1 zonelists, mobility grouping on.  Total pages: -36920272750453009

In another instance (older kernel), I was no longer able to start any
process:
  [root@vm ~]# [  214.348068] Offlined Pages 32768
  [  215.973009] Offlined Pages 32768
  cat /proc/meminfo
  -bash: fork: Cannot allocate memory
  [root@vm ~]# cat /proc/meminfo
  -bash: fork: Cannot allocate memory

Fix it by properly adjusting the managed page count when migrating if
the zone changed. The managed page count of the zones now looks after
unplug of the DIMM (and after deflating the balloon) just like before
inflating the balloon (and plugging+onlining the DIMM).

We'll temporarily modify the totalram page count. If this ever becomes a
problem, we can fine tune by providing helpers that don't touch
the totalram pages (e.g., adjust_zone_managed_page_count()).

Please note that fixing up the managed page count is only necessary when
we adjusted the managed page count when inflating - only if we
don't have VIRTIO_BALLOON_F_DEFLATE_ON_OOM. With that feature, the
managed page count is not touched when inflating/deflating.

Reported-by: Yumei Huang 
Fixes: 3dcc0571cd64 ("mm: correctly update zone->managed_pages")
Cc:  # v3.11+
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Jiang Liu 
Cc: Andrew Morton 
Cc: Igor Mammedov 
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: David Hildenbrand 
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/virtio/virtio_balloon.c |   11 +++
 1 file changed, 11 insertions(+)

--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -468,6 +468,17 @@ static int virtballoon_migratepage(struc
 
    get_page(newpage); /* balloon reference */
 
+   /*
+    * When we migrate a page to a different zone and adjusted the
+    * managed page count when inflating, we have to fixup the count of
+    * both involved zones.
+    */
+   if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM) &&
+       page_zone(page) != page_zone(newpage)) {
+       adjust_managed_page_count(page, 1);
+       adjust_managed_page_count(newpage, -1);
+   }
+
    /* balloon's page migration 1st step  -- inflate "newpage" */
    spin_lock_irqsave(&vb_dev_info->pages_lock, flags);
    balloon_page_insert(vb_dev_info, newpage);




[PATCH 5.4 058/177] virtio-balloon: fix managed page counts when migrating pages between zones

2019-12-16 Thread Greg Kroah-Hartman
 Pages 32768
  [  190.830303] Offlined Pages 32768
  [  190.833071] Built 1 zonelists, mobility grouping on.  Total pages: -36920272750453009

In another instance (older kernel), I was no longer able to start any
process:
  [root@vm ~]# [  214.348068] Offlined Pages 32768
  [  215.973009] Offlined Pages 32768
  cat /proc/meminfo
  -bash: fork: Cannot allocate memory
  [root@vm ~]# cat /proc/meminfo
  -bash: fork: Cannot allocate memory

Fix it by properly adjusting the managed page count when migrating if
the zone changed. The managed page count of the zones now looks after
unplug of the DIMM (and after deflating the balloon) just like before
inflating the balloon (and plugging+onlining the DIMM).

We'll temporarily modify the totalram page count. If this ever becomes a
problem, we can fine tune by providing helpers that don't touch
the totalram pages (e.g., adjust_zone_managed_page_count()).

Please note that fixing up the managed page count is only necessary when
we adjusted the managed page count when inflating - only if we
don't have VIRTIO_BALLOON_F_DEFLATE_ON_OOM. With that feature, the
managed page count is not touched when inflating/deflating.

Reported-by: Yumei Huang 
Fixes: 3dcc0571cd64 ("mm: correctly update zone->managed_pages")
Cc:  # v3.11+
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Jiang Liu 
Cc: Andrew Morton 
Cc: Igor Mammedov 
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: David Hildenbrand 
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/virtio/virtio_balloon.c |   11 +++
 1 file changed, 11 insertions(+)

--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -721,6 +721,17 @@ static int virtballoon_migratepage(struc
 
    get_page(newpage); /* balloon reference */
 
+   /*
+    * When we migrate a page to a different zone and adjusted the
+    * managed page count when inflating, we have to fixup the count of
+    * both involved zones.
+    */
+   if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM) &&
+       page_zone(page) != page_zone(newpage)) {
+       adjust_managed_page_count(page, 1);
+       adjust_managed_page_count(newpage, -1);
+   }
+
    /* balloon's page migration 1st step  -- inflate "newpage" */
    spin_lock_irqsave(&vb_dev_info->pages_lock, flags);
    balloon_page_insert(vb_dev_info, newpage);




[PATCH 5.3 047/180] virtio-balloon: fix managed page counts when migrating pages between zones

2019-12-16 Thread Greg Kroah-Hartman
 Pages 32768
  [  190.830303] Offlined Pages 32768
  [  190.833071] Built 1 zonelists, mobility grouping on.  Total pages: -36920272750453009

In another instance (older kernel), I was no longer able to start any
process:
  [root@vm ~]# [  214.348068] Offlined Pages 32768
  [  215.973009] Offlined Pages 32768
  cat /proc/meminfo
  -bash: fork: Cannot allocate memory
  [root@vm ~]# cat /proc/meminfo
  -bash: fork: Cannot allocate memory

Fix it by properly adjusting the managed page count when migrating if
the zone changed. The managed page count of the zones now looks after
unplug of the DIMM (and after deflating the balloon) just like before
inflating the balloon (and plugging+onlining the DIMM).

We'll temporarily modify the totalram page count. If this ever becomes a
problem, we can fine tune by providing helpers that don't touch
the totalram pages (e.g., adjust_zone_managed_page_count()).

Please note that fixing up the managed page count is only necessary when
we adjusted the managed page count when inflating - only if we
don't have VIRTIO_BALLOON_F_DEFLATE_ON_OOM. With that feature, the
managed page count is not touched when inflating/deflating.

Reported-by: Yumei Huang 
Fixes: 3dcc0571cd64 ("mm: correctly update zone->managed_pages")
Cc:  # v3.11+
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Jiang Liu 
Cc: Andrew Morton 
Cc: Igor Mammedov 
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: David Hildenbrand 
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/virtio/virtio_balloon.c |   11 +++
 1 file changed, 11 insertions(+)

--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -721,6 +721,17 @@ static int virtballoon_migratepage(struc
 
    get_page(newpage); /* balloon reference */
 
+   /*
+    * When we migrate a page to a different zone and adjusted the
+    * managed page count when inflating, we have to fixup the count of
+    * both involved zones.
+    */
+   if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM) &&
+       page_zone(page) != page_zone(newpage)) {
+       adjust_managed_page_count(page, 1);
+       adjust_managed_page_count(newpage, -1);
+   }
+
    /* balloon's page migration 1st step  -- inflate "newpage" */
    spin_lock_irqsave(&vb_dev_info->pages_lock, flags);
    balloon_page_insert(vb_dev_info, newpage);



