Re: [PATCH v2 net 2/3] virtio_net: fix adding vids on big-endian

2018-04-19 Thread Cornelia Huck
On Thu, 19 Apr 2018 08:30:49 +0300
"Michael S. Tsirkin"  wrote:

> Programming vids (adding or removing them) still passes
> guest-endian values in the DMA buffer. That's wrong
> if guest is big-endian and when virtio 1 is enabled.
> 
> Note: this is on top of a previous patch:
>   virtio_net: split out ctrl buffer
> 
> Fixes: 9465a7a6f ("virtio_net: enable v1.0 support")
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/net/virtio_net.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Ouch. Have you seen any bug reports for that?


Re: [PATCH v3] net: iucv: Free memory obtained by kzalloc

2018-03-01 Thread Cornelia Huck
On Thu,  1 Mar 2018 17:01:57 +0530
Arvind Yadav <arvind.yadav...@gmail.com> wrote:

> Free memory by calling put_device(), if afiucv_iucv_init is not
> successful.
> 
> Signed-off-by: Arvind Yadav <arvind.yadav...@gmail.com>
> ---
> changes in v2:
>  Calling put_device() before kfree().
> changes in v3:
>  No need to call kfree(). So removed kfree().
> 
>  net/iucv/af_iucv.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
> index 1e8cc7b..9e2643a 100644
> --- a/net/iucv/af_iucv.c
> +++ b/net/iucv/af_iucv.c
> @@ -2433,9 +2433,11 @@ static int afiucv_iucv_init(void)
>   af_iucv_dev->driver = _iucv_driver;
>   err = device_register(af_iucv_dev);
>   if (err)
> - goto out_driver;
> + goto out_iucv_dev;
>   return 0;
>  
> +out_iucv_dev:
> + put_device(af_iucv_dev);
>  out_driver:
>   driver_unregister(_iucv_driver);
>  out_iucv:

Reviewed-by: Cornelia Huck <coh...@redhat.com>


Re: [PATCH] net: iucv: Free memory obtained by kzalloc

2018-02-28 Thread Cornelia Huck
On Wed, 28 Feb 2018 17:39:53 +0530
Arvind Yadav <arvind.yadav...@gmail.com> wrote:

> On Wednesday 28 February 2018 05:26 PM, Cornelia Huck wrote:
> > On Wed, 28 Feb 2018 17:14:55 +0530
> > Arvind Yadav <arvind.yadav...@gmail.com> wrote:
> >  
> >> On Wednesday 28 February 2018 04:00 PM, Cornelia Huck wrote:  
> >>> On Wed, 28 Feb 2018 15:24:16 +0530
> >>> Arvind Yadav <arvind.yadav...@gmail.com> wrote:
> >>> 
> >>>> Free memory, if afiucv_iucv_init is not successful and
> >>>> removing a IUCV driver.
> >>>>
> >>>> Signed-off-by: Arvind Yadav <arvind.yadav...@gmail.com>
> >>>> ---
> >>>>net/iucv/af_iucv.c | 5 -
> >>>>1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
> >>>> index 1e8cc7b..eb0995a 100644
> >>>> --- a/net/iucv/af_iucv.c
> >>>> +++ b/net/iucv/af_iucv.c
> >>>> @@ -2433,9 +2433,11 @@ static int afiucv_iucv_init(void)
> >>>>  af_iucv_dev->driver = _iucv_driver;
> >>>>  err = device_register(af_iucv_dev);
> >>>>  if (err)
> >>>> -goto out_driver;
> >>>> +goto out_iucv_dev;
> >>>>  return 0;
> >>>>
> >>>> +out_iucv_dev:
> >>>> +kfree(af_iucv_dev);
> >>>>out_driver:
> >>>>  driver_unregister(_iucv_driver);
> >>>>out_iucv:
> >>>> @@ -2496,6 +2498,7 @@ static void __exit afiucv_exit(void)
> >>>>{
> >>>>  if (pr_iucv) {
> >>>>  device_unregister(af_iucv_dev);
> >>>> +kfree(af_iucv_dev);
> >>>>  driver_unregister(_iucv_driver);
> >>>>  pr_iucv->iucv_unregister(_iucv_handler, 0);
> >>>>  symbol_put(iucv_if);  
> >>> No, you must not use kfree() after you called device_register() (even
> >>> if it was not successful!) -- see the comment for device_register().  
> >> Yes, Your are right. First we need to call put_device() then kfree().
> >> I will send updated patch.  
> > No, that's not correct, either. device_register() will give up any
> > reference it obtained, and the caller did not obtain any additional
> > reference, so a put_device() would be wrong. A kfree() on a refcounted
> > structure is wrong as well.  
> If you will see the comment for device_register() (drivers/base/core.c)
> there is mentioned that
> 'NOTE: _Never_ directly free @dev after calling this function, even
> if it returned an error! Always use put_device() to give up the
> reference initialized in this function instead.'
> But as per you comment. we should not use.

You don't need to do a put_device() after your device_unregister(),
that's all managed by the driver core.


Re: [PATCH] net: iucv: Free memory obtained by kzalloc

2018-02-28 Thread Cornelia Huck
On Wed, 28 Feb 2018 17:14:55 +0530
Arvind Yadav <arvind.yadav...@gmail.com> wrote:

> On Wednesday 28 February 2018 04:00 PM, Cornelia Huck wrote:
> > On Wed, 28 Feb 2018 15:24:16 +0530
> > Arvind Yadav <arvind.yadav...@gmail.com> wrote:
> >  
> >> Free memory, if afiucv_iucv_init is not successful and
> >> removing a IUCV driver.
> >>
> >> Signed-off-by: Arvind Yadav <arvind.yadav...@gmail.com>
> >> ---
> >>   net/iucv/af_iucv.c | 5 -
> >>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
> >> index 1e8cc7b..eb0995a 100644
> >> --- a/net/iucv/af_iucv.c
> >> +++ b/net/iucv/af_iucv.c
> >> @@ -2433,9 +2433,11 @@ static int afiucv_iucv_init(void)
> >>af_iucv_dev->driver = _iucv_driver;
> >>err = device_register(af_iucv_dev);
> >>if (err)
> >> -  goto out_driver;
> >> +  goto out_iucv_dev;
> >>return 0;
> >>   
> >> +out_iucv_dev:
> >> +  kfree(af_iucv_dev);
> >>   out_driver:
> >>driver_unregister(_iucv_driver);
> >>   out_iucv:
> >> @@ -2496,6 +2498,7 @@ static void __exit afiucv_exit(void)
> >>   {
> >>if (pr_iucv) {
> >>device_unregister(af_iucv_dev);
> >> +  kfree(af_iucv_dev);
> >>driver_unregister(_iucv_driver);
> >>pr_iucv->iucv_unregister(_iucv_handler, 0);
> >>symbol_put(iucv_if);  
> > No, you must not use kfree() after you called device_register() (even
> > if it was not successful!) -- see the comment for device_register().  
> Yes, Your are right. First we need to call put_device() then kfree().
> I will send updated patch.

No, that's not correct, either. device_register() will give up any
reference it obtained, and the caller did not obtain any additional
reference, so a put_device() would be wrong. A kfree() on a refcounted
structure is wrong as well.


Re: [PATCH] net: iucv: Free memory obtained by kzalloc

2018-02-28 Thread Cornelia Huck
On Wed, 28 Feb 2018 15:24:16 +0530
Arvind Yadav  wrote:

> Free memory, if afiucv_iucv_init is not successful and
> removing a IUCV driver.
> 
> Signed-off-by: Arvind Yadav 
> ---
>  net/iucv/af_iucv.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
> index 1e8cc7b..eb0995a 100644
> --- a/net/iucv/af_iucv.c
> +++ b/net/iucv/af_iucv.c
> @@ -2433,9 +2433,11 @@ static int afiucv_iucv_init(void)
>   af_iucv_dev->driver = _iucv_driver;
>   err = device_register(af_iucv_dev);
>   if (err)
> - goto out_driver;
> + goto out_iucv_dev;
>   return 0;
>  
> +out_iucv_dev:
> + kfree(af_iucv_dev);
>  out_driver:
>   driver_unregister(_iucv_driver);
>  out_iucv:
> @@ -2496,6 +2498,7 @@ static void __exit afiucv_exit(void)
>  {
>   if (pr_iucv) {
>   device_unregister(af_iucv_dev);
> + kfree(af_iucv_dev);
>   driver_unregister(_iucv_driver);
>   pr_iucv->iucv_unregister(_iucv_handler, 0);
>   symbol_put(iucv_if);

No, you must not use kfree() after you called device_register() (even
if it was not successful!) -- see the comment for device_register().


Re: [PATCH 4/6] virtio_net: allow specifying context for rx

2017-03-30 Thread Cornelia Huck
On Thu, 30 Mar 2017 17:31:37 +0300
"Michael S. Tsirkin" <m...@redhat.com> wrote:

> On Thu, Mar 30, 2017 at 09:26:51AM +0200, Cornelia Huck wrote:
> > On Wed, 29 Mar 2017 23:48:54 +0300
> > "Michael S. Tsirkin" <m...@redhat.com> wrote:
> > 
> > > With mergeable buffers we never use s/g for rx,
> > > so allow specifying context in that case.
> > > 
> > > Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> > > ---
> > >  drivers/net/virtio_net.c | 15 ++-
> > >  1 file changed, 14 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 6802169..340f737 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -2044,6 +2044,7 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
> > >   int ret = -ENOMEM;
> > >   int i, total_vqs;
> > >   const char **names;
> > > + const bool *ctx;
> > >  
> > >   /* We expect 1 RX virtqueue followed by 1 TX virtqueue, followed by
> > >* possible N-1 RX/TX queue pairs used in multiqueue mode, followed by
> > > @@ -2062,6 +2063,13 @@ static int virtnet_find_vqs(struct virtnet_info 
> > > *vi)
> > >   names = kmalloc(total_vqs * sizeof(*names), GFP_KERNEL);
> > >   if (!names)
> > >   goto err_names;
> > > + if (vi->mergeable_rx_bufs) {
> > > + ctx = kzalloc(total_vqs * sizeof(*ctx), GFP_KERNEL);
> > > + if (!ctx)
> > > + goto err_ctx;
> > > + } else {
> > > + ctx = NULL;
> > > + }
> > >  
> > >   /* Parameters for control virtqueue, if any */
> > >   if (vi->has_cvq) {
> > > @@ -2077,9 +2085,12 @@ static int virtnet_find_vqs(struct virtnet_info 
> > > *vi)
> > >   sprintf(vi->sq[i].name, "output.%d", i);
> > >   names[rxq2vq(i)] = vi->rq[i].name;
> > >   names[txq2vq(i)] = vi->sq[i].name;
> > > + if (ctx)
> > > + ctx[rxq2vq(i)] = true;
> > >   }
> > >  
> > > - ret = virtio_find_vqs(vi->vdev, total_vqs, vqs, callbacks, names, NULL);
> > > + ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,
> > > +  names, ctx, NULL);
> > 
> > virtio_find_vqs_ctx()? (Needs to be exported, obviously.)
> 
> I guess I can do that but there's a single user ATM.
> Do you think it's worth doing right now, or wait until
> it gets more users?

Given that you introduce it in patch 2, I'd just make it an exported
function from the start. (It also looks nicer IMO.)



Re: [PATCH 1/2] virtio: allow drivers to validate features

2017-03-30 Thread Cornelia Huck
On Wed, 29 Mar 2017 20:14:44 +0300
"Michael S. Tsirkin"  wrote:

> Some drivers can't support all features in all configurations.  At the
> moment we blindly set FEATURES_OK and later FAILED.  Support this better
> by adding a callback drivers can use to do some early checks.

Looks reasonable. Do we need to document that the driver must not do
anything beyond dealing with features and reading the config space that
early?

> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/virtio/virtio.c | 6 ++
>  include/linux/virtio.h  | 1 +
>  2 files changed, 7 insertions(+)
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 400d70b..48230a5 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -232,6 +232,12 @@ static int virtio_dev_probe(struct device *_d)
>   if (device_features & (1ULL << i))
>   __virtio_set_bit(dev, i);
>  
> + if (drv->validate) {
> + err = drv->validate(dev);
> + if (err)
> + goto err;
> + }
> +
>   err = virtio_finalize_features(dev);
>   if (err)
>   goto err;
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index 193fea9..ed04753 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -176,6 +176,7 @@ struct virtio_driver {
>   unsigned int feature_table_size;
>   const unsigned int *feature_table_legacy;
>   unsigned int feature_table_size_legacy;
> + int (*validate)(struct virtio_device *dev);
>   int (*probe)(struct virtio_device *dev);
>   void (*scan)(struct virtio_device *dev);
>   void (*remove)(struct virtio_device *dev);

Would be good to add some doc; but other members are undocumented here
already...



Re: [PATCH 4/6] virtio_net: allow specifying context for rx

2017-03-30 Thread Cornelia Huck
On Wed, 29 Mar 2017 23:48:54 +0300
"Michael S. Tsirkin"  wrote:

> With mergeable buffers we never use s/g for rx,
> so allow specifying context in that case.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/net/virtio_net.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 6802169..340f737 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2044,6 +2044,7 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>   int ret = -ENOMEM;
>   int i, total_vqs;
>   const char **names;
> + const bool *ctx;
>  
>   /* We expect 1 RX virtqueue followed by 1 TX virtqueue, followed by
>* possible N-1 RX/TX queue pairs used in multiqueue mode, followed by
> @@ -2062,6 +2063,13 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>   names = kmalloc(total_vqs * sizeof(*names), GFP_KERNEL);
>   if (!names)
>   goto err_names;
> + if (vi->mergeable_rx_bufs) {
> + ctx = kzalloc(total_vqs * sizeof(*ctx), GFP_KERNEL);
> + if (!ctx)
> + goto err_ctx;
> + } else {
> + ctx = NULL;
> + }
>  
>   /* Parameters for control virtqueue, if any */
>   if (vi->has_cvq) {
> @@ -2077,9 +2085,12 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>   sprintf(vi->sq[i].name, "output.%d", i);
>   names[rxq2vq(i)] = vi->rq[i].name;
>   names[txq2vq(i)] = vi->sq[i].name;
> + if (ctx)
> + ctx[rxq2vq(i)] = true;
>   }
>  
> - ret = virtio_find_vqs(vi->vdev, total_vqs, vqs, callbacks, names, NULL);
> + ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,
> +  names, ctx, NULL);

virtio_find_vqs_ctx()? (Needs to be exported, obviously.)

>   if (ret)
>   goto err_find;
>  
> @@ -2101,6 +2112,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>   return 0;
>  
>  err_find:
> + kfree(ctx);
> +err_ctx:
>   kfree(names);
>  err_names:
>   kfree(callbacks);



Re: [PATCH 1/6] virtio: wrap find_vqs

2017-03-30 Thread Cornelia Huck
On Wed, 29 Mar 2017 23:48:44 +0300
"Michael S. Tsirkin"  wrote:

> We are going to add more parameters to find_vqs, let's wrap the call so
> we don't need to tweak all drivers every time.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/block/virtio_blk.c | 3 +--
>  drivers/char/virtio_console.c  | 6 +++---
>  drivers/crypto/virtio/virtio_crypto_core.c | 3 +--
>  drivers/gpu/drm/virtio/virtgpu_kms.c   | 3 +--
>  drivers/net/caif/caif_virtio.c | 3 +--
>  drivers/net/virtio_net.c   | 3 +--
>  drivers/rpmsg/virtio_rpmsg_bus.c   | 2 +-
>  drivers/scsi/virtio_scsi.c | 3 +--
>  drivers/virtio/virtio_balloon.c| 3 +--
>  drivers/virtio/virtio_input.c  | 3 +--
>  include/linux/virtio_config.h  | 9 +
>  net/vmw_vsock/virtio_transport.c   | 6 +++---
>  12 files changed, 24 insertions(+), 23 deletions(-)

Regardless whether that context thing is the right thing to do, this
looks like a sensible cleanup.



Re: [PATCH] linux/types.h: enable endian checks for all sparse builds

2016-12-08 Thread Cornelia Huck
On Thu, 8 Dec 2016 04:29:39 +0200
"Michael S. Tsirkin"  wrote:

> By now, linux is mostly endian-clean. Enabling endian-ness
> checks for everyone produces about 200 new sparse warnings for me -
> less than 10% over the 2000 sparse warnings already there.

Out of curiousity: Where do most of those warnings show up?

> 
> Not a big deal, OTOH enabling this helps people notice
> they are introducing new bugs.
> 
> So let's just drop __CHECK_ENDIAN__. Follow-up patches
> can drop distinction between __bitwise and __bitwise__.
> 
> Cc: Linus Torvalds 
> Suggested-by: Christoph Hellwig 
> Signed-off-by: Michael S. Tsirkin 
> ---
> 
> Linus, could you ack this for upstream? If yes I'll
> merge through my tree as a replacement for enabling
> this just for virtio.
> 
>  include/uapi/linux/types.h | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/include/uapi/linux/types.h b/include/uapi/linux/types.h
> index acf0979..41e5914 100644
> --- a/include/uapi/linux/types.h
> +++ b/include/uapi/linux/types.h
> @@ -23,11 +23,7 @@
>  #else
>  #define __bitwise__
>  #endif
> -#ifdef __CHECK_ENDIAN__
>  #define __bitwise __bitwise__
> -#else
> -#define __bitwise
> -#endif
> 
>  typedef __u16 __bitwise __le16;
>  typedef __u16 __bitwise __be16;

FWIW, I like this better than just enabling it for the virtio code.



Re: [PATCH 3/3] tools/virtio: use {READ,WRITE}_ONCE() in uaccess.h

2016-11-24 Thread Cornelia Huck
On Thu, 24 Nov 2016 10:25:14 +
Mark Rutland <mark.rutl...@arm.com> wrote:

> As a step towards killing off ACCESS_ONCE, use {READ,WRITE}_ONCE() for the
> virtio tools uaccess primitives, pulling these in from .
> 
> With this done, we can kill off the now-unused ACCESS_ONCE() definition.
> 
> Signed-off-by: Mark Rutland <mark.rutl...@arm.com>
> Cc: Jason Wang <jasow...@redhat.com>
> Cc: Michael S. Tsirkin <m...@redhat.com>
> Cc: linux-ker...@vger.kernel.org
> Cc: virtualizat...@lists.linux-foundation.org
> ---
>  tools/virtio/linux/uaccess.h | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>



Re: [PATCH 2/3] vringh: kill off ACCESS_ONCE()

2016-11-24 Thread Cornelia Huck
On Thu, 24 Nov 2016 10:25:13 +
Mark Rutland <mark.rutl...@arm.com> wrote:

> Despite living under drivers/ vringh.c is also used as part of the userspace
> virtio tools. Before we can kill off the ACCESS_ONCE()definition in the tools,
> we must convert vringh.c to use {READ,WRITE}_ONCE().
> 
> This patch does so, along with the required include of  for
> the relevant definitions. The userspace tools provide their own definitions in
> their own .
> 
> Signed-off-by: Mark Rutland <mark.rutl...@arm.com>
> Cc: Jason Wang <jasow...@redhat.com>
> Cc: Michael S. Tsirkin <m...@redhat.com>
> Cc: k...@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Cc: virtualizat...@lists.linux-foundation.org
> ---
>  drivers/vhost/vringh.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>



Re: [PATCH 1/3] tools/virtio: fix READ_ONCE()

2016-11-24 Thread Cornelia Huck
On Thu, 24 Nov 2016 10:25:12 +
Mark Rutland <mark.rutl...@arm.com> wrote:

> The virtio tools implementation of READ_ONCE() has a single parameter called
> 'var', but erroneously refers to 'val' for its cast, and thus won't work 
> unless
> there's a variable of the correct type that happens to be called 'var'.
> 
> Fix this with s/var/val/, making READ_ONCE() work as expected regardless.
> 
> Fixes: a7c490333df3cff5 ("tools/virtio: use virt_xxx barriers")
> Signed-off-by: Mark Rutland <mark.rutl...@arm.com>
> Cc: Jason Wang <jasow...@redhat.com>
> Cc: Michael S. Tsirkin <m...@redhat.com>
> Cc: linux-ker...@vger.kernel.org
> Cc: virtualizat...@lists.linux-foundation.org
> ---
>  tools/virtio/linux/compiler.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>



Re: [PATCH] virtio-net: drop legacy features in virtio 1 mode

2016-11-04 Thread Cornelia Huck
On Fri, 4 Nov 2016 12:55:36 +0200
"Michael S. Tsirkin" <m...@redhat.com> wrote:

> Virtio 1.0 spec says VIRTIO_F_ANY_LAYOUT and VIRTIO_NET_F_GSO are
> legacy-only feature bits. Do not negotiate them in virtio 1 mode.  Note
> this is a spec violation so we need to backport it to stable/downstream
> kernels.
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> ---
>  drivers/net/virtio_net.c | 30 --
>  1 file changed, 20 insertions(+), 10 deletions(-)

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>



Re: [REGRESSION] 362899b ("macvtap: switch to use skb array") causes oops during teardown

2016-08-11 Thread Cornelia Huck
On Thu, 11 Aug 2016 15:49:12 +0800
Jason Wang <jasow...@redhat.com> wrote:

> This looks like a use-after-free. Could you pls try the following patch 
> to see it if fixes your issue?
> 
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index a38c0da..070e329 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -275,7 +275,6 @@ static void macvtap_put_queue(struct macvtap_queue *q)
>  rtnl_unlock();
> 
>  synchronize_rcu();
> -   skb_array_cleanup(>skb_array);
>  sock_put(>sk);
>   }
> 
> @@ -533,10 +532,8 @@ static void macvtap_sock_write_space(struct sock *sk)
>   static void macvtap_sock_destruct(struct sock *sk)
>   {
>  struct macvtap_queue *q = container_of(sk, struct 
> macvtap_queue, sk);
> -   struct sk_buff *skb;
> 
> -   while ((skb = skb_array_consume(>skb_array)) != NULL)
> -   kfree_skb(skb);
> +   skb_array_cleanup(>skb_array);
>   }
> 
>   static int macvtap_open(struct inode *inode, struct file *file)

Yes, that change fixes things for me.

Tested-by: Cornelia Huck <cornelia.h...@de.ibm.com>

Thanks for the quick reply!



[REGRESSION] 362899b ("macvtap: switch to use skb array") causes oops during teardown

2016-08-10 Thread Cornelia Huck
I'm hitting the following oops during shutdown (halt command in guest)
of a libvirt-managed qemu guest 100% of the time:

[  108.920486] Unable to handle kernel pointer dereference in virtual kernel 
address space
[  108.920492] Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803
[  108.920495] Fault in home space mode while using kernel ASCE.
[  108.920504] AS:00e20007 R3:0024 
[  108.920588] Oops: 0038 ilc:2 [#1] PREEMPT SMP 
[  108.920592] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack 
nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp iptable_filter ip_tables 
x_tables bridge stp llc ghash_s390 prng ecb aes_s390 des_s390 des_generic 
sha512_s390 sha256_s390 sha1_s390 sha_common lockd grace vhost_net tun vhost 
macvtap macvlan kvm sunrpc dm_multipath dm_mod autofs4
[  108.920628] CPU: 8 PID: 2648 Comm: qemu-system-s39 Not tainted 
4.8.0-rc1-00031-gd3d312e #25
[  108.920630] Hardware name: IBM  2964 NC9  704
  (LPAR)
[  108.920634] Krnl PSW : 0704e0018000 0064a3e8 
(kfree_skb+0x38/0x288)
[  108.920640]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 
RI:0 EA:3
Krnl GPRS: 9fb04a5f 0007a7a37d48 6b6b6b6b6b6b6b6b 03ff80721af0
[  108.920645]0078d626 0007a7a34000  
0008
[  108.920647]0007b0f04210 0007bde65718 03ff80721afa 
6b6b6b6b6b6b6b6b
[  108.920648]0007bde65698 00807180 03ff80721afa 
0007a7a37d08
[  108.920656] Krnl Code: 0064a3da: b90400aelgr 
%r10,%r14
   0064a3de: b90400b2   lgr %r11,%r2
  #0064a3e2: ec280121007c   cgij%r2,0,8,64a624
  >0064a3e8: 581020e4   l   %r1,228(%r2)
   0064a3ec: ec160005017e   cij %r1,1,6,64a3f6
   0064a3f2: a7f4000b   brc 15,64a408
   0064a3f6: a718   lhi %r1,-1
   0064a3fa: eb1120e400f8   laa %r1,%r1,228(%r2)
[  108.920673] Call Trace:
[  108.920675] ([<0007a7a37d28>] 0x7a7a37d28)
[  108.920679] ([<03ff80721afa>] macvtap_sock_destruct+0x92/0xa8 [macvtap])
[  108.920681] ([<00647026>] __sk_destruct+0x3e/0x1f0)
[  108.920684] ([<03ff80723ed8>] macvtap_release+0x150/0x1b0 [macvtap])
[  108.920688] ([<0031bd72>] __fput+0x132/0x230)
[  108.920691] ([<0015f7aa>] task_work_run+0xb2/0xe8)
[  108.920695] ([<0078e494>] system_call+0xdc/0x270)
[  108.920697] INFO: lockdep is turned off.
[  108.920698] Last Breaking-Event-Address:
[  108.920700]  [<03ff80721100>] 0x3ff80721100
[  108.920703]  Kernel panic - not syncing: Fatal exception: panic_on_oops

s390 host with a qeth device as the sole networking interface, one
network interface in the guest, using vhost (I can try to figure out
what libvirt is doing, if needed).

If I revert 362899b ("macvtap: switch to use skb array") and its
companion patch 0d7eacb ("macvtap: correctly free skb during socket
destruction"), starting a guest via libvirt and halting it again from
the guest works 100% of the time again.

I'm willing to collect debug info if you tell me what you need.



Re: [PATCH 2/2] vhost: disentangle vring endianness stuff from the core code

2016-02-10 Thread Cornelia Huck
On Wed, 10 Feb 2016 14:08:43 +0100
Greg Kurz  wrote:

> But you are right, there is a bug: we should rollback if vhost_init_used()
> fails. Something like below:
> 
>  err_used:
> vq->private_data = oldsock;
> vhost_net_enable_vq(n, vq);
> +   vhost_adjust_vring_endian(vq);

Shouldn't we switch back before we reenable? Or have I lost myself in
this maze here again?

> if (ubufs)
> vhost_net_ubuf_put_wait_and_free(ubufs);
>  err_ubufs:



Re: [PATCH 1/2] vhost: helpers to enable/disable vring endianness

2016-02-10 Thread Cornelia Huck
On Wed, 10 Feb 2016 13:11:34 +0100
Greg Kurz  wrote:

> On Wed, 10 Feb 2016 13:21:22 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Wed, Jan 13, 2016 at 06:09:41PM +0100, Greg Kurz wrote:

> > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > index ad2146a9ab2d..e02e06755ab7 100644
> > > --- a/drivers/vhost/vhost.c
> > > +++ b/drivers/vhost/vhost.c
> > > @@ -43,11 +43,16 @@ enum {
> > >  #define vhost_avail_event(vq) ((__virtio16 __user 
> > > *)>used->ring[vq->num])
> > >  
> > >  #ifdef CONFIG_VHOST_CROSS_ENDIAN_LEGACY
> > > -static void vhost_vq_reset_user_be(struct vhost_virtqueue *vq)
> > > +static void vhost_disable_user_be(struct vhost_virtqueue *vq)
> > >  {
> > >   vq->user_be = !virtio_legacy_is_little_endian();
> > >  }
> > >
> > 
> > Hmm this doesn't look like an improvement to me.
> > What does it mean to disable big endian? Make it little endian?
> 
> The default behavior for the device is native endian.
> 
> The SET_VRING_ENDIAN ioctl is used to make the device big endian
> on little endian hosts, hence "enabling" cross-endian mode...
> 
> > Existing reset seems to make sense.
> > 
> 
> ... and we "disable" cross-endian mode on reset.
> 
> > > +static void vhost_enable_user_be(struct vhost_virtqueue *vq, bool 
> > > user_be)
> > > +{
> > > + vq->user_be = user_be;
> > > +}
> > > +  
> > 
> > And this is maybe "init_user_be"?
> > 
> 
> Anyway I don't mind changing the names to reset/init_user_be if you think it
> is clearer.

FWIW, I find the enable/disable terminology less confusing.



Re: [PATCH 1/2] vhost: helpers to enable/disable vring endianness

2016-01-21 Thread Cornelia Huck
On Wed, 13 Jan 2016 18:09:41 +0100
Greg Kurz <gk...@linux.vnet.ibm.com> wrote:

> The default use case for vhost is when the host and the vring have the
> same endianness (default native endianness). But there are cases where
> they differ and vhost should byteswap when accessing the vring:
> - the host is big endian and the vring comes from a virtio 1.0 device
>   which is always little endian
> - the architecture is bi-endian and the vring comes from a legacy virtio
>   device with a different endianness than the endianness of the host (aka
>   legacy cross-endian)
> 
> These cases are handled by the vq->is_le and the optional vq->user_be,
> with the following logic:
> - if none of the fields is enabled, vhost access the vring without byteswap
> - if the vring is virtio 1.0 and the host is big endian, vq->is_le is
>   enabled to enforce little endian access to the vring
> - if the vring is legacy cross-endian, userspace enables vq->user_be
>   to inform vhost about the vring endianness. This endianness is then
>   enforced for vring accesses through vq->is_le again
> 
> The logic is unclear in the current code.
> 
> This patch introduces helpers with explicit enable and disable semantics,
> for better clarity.
> 
> No behaviour change.
> 
> Signed-off-by: Greg Kurz <gk...@linux.vnet.ibm.com>
> ---
>  drivers/vhost/vhost.c |   28 +++-
>  1 file changed, 19 insertions(+), 9 deletions(-)

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>



Re: [PATCH 2/2] vhost: disentangle vring endianness stuff from the core code

2016-01-21 Thread Cornelia Huck
On Wed, 13 Jan 2016 18:09:47 +0100
Greg Kurz <gk...@linux.vnet.ibm.com> wrote:

> The way vring endianness is being handled currently obfuscates
> the code in vhost_init_used().
> 
> This patch tries to fix that by doing the following:
> - move the the code that adjusts endianness to a dedicated helper
> - export this helper so that backends explicitely call it
> 
> No behaviour change.
> 
> Signed-off-by: Greg Kurz <gk...@linux.vnet.ibm.com>
> ---
>  drivers/vhost/net.c   |3 +++
>  drivers/vhost/scsi.c  |3 +++
>  drivers/vhost/test.c  |2 ++
>  drivers/vhost/vhost.c |   16 +++-
>  drivers/vhost/vhost.h |1 +
>  5 files changed, 20 insertions(+), 5 deletions(-)

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>



Re: [PATCH] vhost: move is_le setup to the backend

2015-11-12 Thread Cornelia Huck
On Fri, 30 Oct 2015 12:42:35 +0100
Greg Kurz <gk...@linux.vnet.ibm.com> wrote:

> The vq->is_le field is used to fix endianness when accessing the vring via
> the cpu_to_vhost16() and vhost16_to_cpu() helpers in the following cases:
> 
> 1) host is big endian and device is modern virtio
> 
> 2) host has cross-endian support and device is legacy virtio with a different
>endianness than the host
> 
> Both cases rely on the VHOST_SET_FEATURES ioctl, but 2) also needs the
> VHOST_SET_VRING_ENDIAN ioctl to be called by userspace. Since vq->is_le
> is only needed when the backend is active, it was decided to set it at
> backend start.
> 
> This is currently done in vhost_init_used()->vhost_init_is_le() but it
> obfuscates the core vhost code. This patch moves the is_le setup to a
> dedicated function that is called from the backend code.
> 
> Note vhost_net is the only backend that can pass vq->private_data == NULL to
> vhost_init_used(), hence the "if (sock)" branch.
> 
> No behaviour change.
> 
> Signed-off-by: Greg Kurz <gk...@linux.vnet.ibm.com>
> ---
>  drivers/vhost/net.c   |6 ++
>  drivers/vhost/scsi.c  |3 +++
>  drivers/vhost/test.c  |2 ++
>  drivers/vhost/vhost.c |   12 +++-----
>  drivers/vhost/vhost.h |1 +
>  5 files changed, 19 insertions(+), 5 deletions(-)

Makes sense.

Reviewed-by: Cornelia Huck <cornelia.h...@de.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: Add Kconfig option to signal cross-endian guests

2015-07-09 Thread Cornelia Huck
On Thu,  9 Jul 2015 09:49:05 +0200
Thomas Huth th...@redhat.com wrote:

 The option for supporting cross-endianness legacy guests in

s/cross-endianness/cross-endian/ ?

 the vhost and tun code should only be available on systems
 that support cross-endian guests.
 
 Signed-off-by: Thomas Huth th...@redhat.com
 ---
  arch/arm/kvm/Kconfig | 1 +
  arch/arm64/kvm/Kconfig   | 1 +
  arch/powerpc/kvm/Kconfig | 1 +
  drivers/net/Kconfig  | 1 +
  drivers/vhost/Kconfig| 1 +
  virt/kvm/Kconfig | 3 +++
  6 files changed, 8 insertions(+)

Acked-by: Cornelia Huck cornelia.h...@de.ibm.com

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.22-git: known regressions with patches

2007-07-20 Thread Cornelia Huck
On Thu, 19 Jul 2007 18:35:04 +0200,
Michal Piotrowski [EMAIL PROTECTED] wrote:

 FS
 
 Subject : AFS compile broken
 References  : http://lkml.org/lkml/2007/7/19/218
 Last known good : ?
 Submitter   : Meelis Roos [EMAIL PROTECTED]
 Caused-By   : ?
 Handled-By  : Cornelia Huck [EMAIL PROTECTED]
 Patch   : http://lkml.org/lkml/2007/7/19/223
 Status  : patch available
 

Multiple people posted this fix, and Andrew's already made it to git
(commit 275afcac9953ece0828972edeab9684cfe1a5ef3).
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] try parent numa_node at first before using default

2007-07-12 Thread Cornelia Huck
On Thu, 12 Jul 2007 07:47:52 +0200,
Stefan Richter [EMAIL PROTECTED] wrote:

 The patch does nothing for all subsystems which do
 
   device_initialize(dev);
   dev-parent = pd;
   device_add(dev);
 
 Let's avoid to add infrastructure which does nothing, or only does
 something by accident.

I agree. dev-parent now is only expected to be set in device_add(),
not in device_initialize(), so the sequence above is perfectly fine.
(Why should you need it, anyway? parent information becomes necessary
only when something is added to the tree.)
 
 The alternatives are:
 
   - Change all subsystems to set dev-parent before device_initialize().
 *Document* that the device_initialize() API has this requirement.
 This is counter-intuitive, amounts to some work across the kernel,
 and could be gotten wrong again in future code because it's a
 counter-intuitive API.

Yes. We shouldn't do that.
 
   - Move your code from device_initialize() to device_add().  One minor
 drawback is that node-specific allocations based on the device's
 numa_node would not be optimized before device_add(), but there is
 probably no need for this.  Driver probes come after device_add().

I'd expect most allocations to be done when probing, so this shouldn't
hurt much.
 
   - Let subsystems explicitly call set_dev_node() on their own.
 
 
 Also keep in mind that either device_move() should update the numa_node,
 or the subsystems which call device_move() should explicitly update it
 on their own.   (Unless they know that their devices will always stay at
 the same NUMA node even when switching parents.)

I'd trust the subsystems to know best whether something regarding NUMA
changed.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] try parent numa_node at first before using default

2007-07-12 Thread Cornelia Huck
On Thu, 12 Jul 2007 13:30:28 +0200,
Stefan Richter [EMAIL PROTECTED] wrote:

 So, since figuring the correct DMA device out is done by drivers
 themselves, they usually can figure out the correct NUMA node as well.
 The only precondition is that each DMA device has the correct NUMA node
 set.  This is the job of subsystems like PCI, but it may be reasonable
 to have simple and broadly applicable helpers for this in the driver core.
 
 So, _is_ the dev-numa_node = dev-parent.numa_node assumption, if
 implemented in device_add() and device_move(), as simple and as broadly
 applicable as we want driver core's API to be?  Perhaps yes.

If making the general assumption that the device will inherit the
numa_node through its parent(s) from the responsible DMA device is
reasonable, then device_move() should certainly update the numa_node
from the new parent. (For special cases, the caller could still call
set_dev_node() itself.)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] try parent numa_node at first before using default

2007-07-12 Thread Cornelia Huck
On Thu, 12 Jul 2007 10:59:53 -0700,
Yinghai Lu [EMAIL PROTECTED] wrote:

 @@ -1285,8 +1290,11 @@ int device_move(struct device *dev, struct device 
 *new_parent)
   dev-parent = new_parent;
   if (old_parent)
   klist_remove(dev-knode_parent);
 - if (new_parent)
 + if (new_parent) {
   klist_add_tail(dev-knode_parent, new_parent-klist_children);
 + set_dev_node(dev, dev_to_node(new_parent));
 + }
 +
   if (!dev-class)
   goto out_put;
   error = device_move_class_links(dev, old_parent, new_parent);

You're not correctly undoing the changes if the last function fails.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch driver tree] qeth: Remove usage of subsys.rwsem

2007-04-17 Thread Cornelia Huck
Hi,

the current driver tree contains the removal of subsys.rwsem.
Unfortunately, this breaks qeth. However, it should be no problem to
fix the walking of the devices for /proc/qeth:

No need to take subsys.rwsem during walking the devices,
driver_find_devices() should already suffice.

Signed-off-by: Cornelia Huck [EMAIL PROTECTED]

---
 drivers/s390/net/qeth_proc.c |2 --
 1 files changed, 2 deletions(-)

--- linux.orig/drivers/s390/net/qeth_proc.c
+++ linux/drivers/s390/net/qeth_proc.c
@@ -37,7 +37,6 @@ qeth_procfile_seq_start(struct seq_file 
struct device *dev = NULL;
loff_t nr = 0;
 
-   down_read(qeth_ccwgroup_driver.driver.bus-subsys.rwsem);
if (*offset == 0)
return SEQ_START_TOKEN;
while (1) {
@@ -53,7 +52,6 @@ qeth_procfile_seq_start(struct seq_file 
 static void
 qeth_procfile_seq_stop(struct seq_file *s, void* it)
 {
-   up_read(qeth_ccwgroup_driver.driver.bus-subsys.rwsem);
 }
 
 static void *
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html