Re: [PATCH vhost v2 1/2] virtio_pci: fix the common map size and add check for common size

2023-10-08 Thread Xuan Zhuo
On Sun, 8 Oct 2023 06:42:37 -0400, "Michael S. Tsirkin"  wrote:
> On Sun, Oct 08, 2023 at 05:38:41PM +0800, Xuan Zhuo wrote:
> > Now, the function vp_modern_map_capability() takes the size parameter,
> > which corresponds to the size of virtio_pci_common_cfg. As a result,
> > this indicates the size of memory area to map.
> >
> > However, if the _F_RING_RESET is negotiated, the extra items will be
> > used. Therefore, we need to use the size of virtio_pci_modern_common_cfg
> > to map more space.
> >
> > Meanwhile, this patch removes the feature(_F_RING_ERSET and
>
> typo
>
> > _F_NOTIFICATION_DATA) when the common cfg size does not match
> > the corresponding feature.
> >
> > Fixes: 0b50cece0b78 ("virtio_pci: introduce helper to get/set queue reset")
> > Signed-off-by: Xuan Zhuo 
> > ---
> >  drivers/virtio/virtio_pci_modern.c | 20 +++-
> >  drivers/virtio/virtio_pci_modern_dev.c |  4 ++--
> >  include/linux/virtio_pci_modern.h  |  1 +
> >  3 files changed, 22 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_pci_modern.c 
> > b/drivers/virtio/virtio_pci_modern.c
> > index d6bb68ba84e5..c0b9d2363ddb 100644
> > --- a/drivers/virtio/virtio_pci_modern.c
> > +++ b/drivers/virtio/virtio_pci_modern.c
> > @@ -22,8 +22,26 @@
> >  static u64 vp_get_features(struct virtio_device *vdev)
> >  {
> > struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> > +   u64 features = vp_modern_get_features(_dev->mdev);
> > +
> > +#define check_feature(feature, field)  
> > \
> > +   do {
> > \
> > +   if (features & BIT_ULL(feature)) {  
> > \
> > +   u32 offset = offsetofend(struct 
> > virtio_pci_modern_common_cfg, field);   \
> > +   if (unlikely(vp_dev->mdev.common_len < offset)) 
> > \
> > +   features &= ~BIT_ULL(feature);  
> > \
> > +   }   
> > \
> > +   } while (0)
> > +
> > +   /* For buggy devices, if the common len does not match the feature, we
> > +* remove the feature.
>
> I don't like doing this in vp_get_features. userspace won't be able
> to see the actual device features at all.
> Also, we should print an info message at least.
>
> I am still debating with myself whether clearing feature bits
> or just failing finalize_features (and thus, probe) is best.

For me, I think failing probe is best.

Then the developer of the device can find that firstly.
And I think we should print an info message when we detect
this error.

If we clear the feature bits, the developer of the device may
ignore this error.

>
>
> > +*/
> > +   check_feature(VIRTIO_F_NOTIFICATION_DATA, queue_notify_data);
> > +   check_feature(VIRTIO_F_RING_RESET, queue_reset);
> > +
> > +#undef check_feature
>
> this macro's too scary. just pass offset and feature bit as
> parameters to an inline function.

I should pass the features as a parameter.

Thanks.



>
> >
> > -   return vp_modern_get_features(_dev->mdev);
> > +   return features;
> >  }
> >
> >  static void vp_transport_features(struct virtio_device *vdev, u64 features)
> > diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
> > b/drivers/virtio/virtio_pci_modern_dev.c
> > index aad7d9296e77..33f319da1558 100644
> > --- a/drivers/virtio/virtio_pci_modern_dev.c
> > +++ b/drivers/virtio/virtio_pci_modern_dev.c
> > @@ -291,8 +291,8 @@ int vp_modern_probe(struct virtio_pci_modern_device 
> > *mdev)
> > err = -EINVAL;
> > mdev->common = vp_modern_map_capability(mdev, common,
> >   sizeof(struct virtio_pci_common_cfg), 4,
> > - 0, sizeof(struct virtio_pci_common_cfg),
> > - NULL, NULL);
> > + 0, sizeof(struct 
> > virtio_pci_modern_common_cfg),
> > + >common_len, NULL);
> > if (!mdev->common)
> > goto err_map_common;
> > mdev->isr = vp_modern_map_capability(mdev, isr, sizeof(u8), 1,
> > diff --git a/include/linux/virtio_pci_modern.h 
> > b/include/linux/virtio_pci_modern.h
> > index 067ac1d789bc..edf62bae0474 100644
> > --- a/include/linux/virtio_pci_modern.h
> > +++ b/include/linux/virtio_pci_modern.h
> > @@ -28,6 +28,7 @@ struct virtio_pci_modern_device {
> > /* So we can sanity-check accesses. */
> > size_t notify_len;
> > size_t device_len;
> > +   size_t common_len;
> >
> > /* Capability for when we need to map notifications per-vq. */
> > int notify_map_cap;
> > --
> > 2.32.0.3.g01195cf9f
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org

Re: [RFC PATCH 5/7] tun: Introduce virtio-net hashing feature

2023-10-08 Thread Willem de Bruijn
On Sun, Oct 8, 2023 at 10:04 PM Akihiko Odaki  wrote:
>
> On 2023/10/09 4:07, Willem de Bruijn wrote:
> > On Sun, Oct 8, 2023 at 7:22 AM Akihiko Odaki  
> > wrote:
> >>
> >> virtio-net have two usage of hashes: one is RSS and another is hash
> >> reporting. Conventionally the hash calculation was done by the VMM.
> >> However, computing the hash after the queue was chosen defeats the
> >> purpose of RSS.
> >>
> >> Another approach is to use eBPF steering program. This approach has
> >> another downside: it cannot report the calculated hash due to the
> >> restrictive nature of eBPF.
> >>
> >> Introduce the code to compute hashes to the kernel in order to overcome
> >> thse challenges. An alternative solution is to extend the eBPF steering
> >> program so that it will be able to report to the userspace, but it makes
> >> little sense to allow to implement different hashing algorithms with
> >> eBPF since the hash value reported by virtio-net is strictly defined by
> >> the specification.
> >>
> >> The hash value already stored in sk_buff is not used and computed
> >> independently since it may have been computed in a way not conformant
> >> with the specification.
> >>
> >> Signed-off-by: Akihiko Odaki 
> >> ---
> >
> >> +static const struct tun_vnet_hash_cap tun_vnet_hash_cap = {
> >> +   .max_indirection_table_length =
> >> +   TUN_VNET_HASH_MAX_INDIRECTION_TABLE_LENGTH,
> >> +
> >> +   .types = VIRTIO_NET_SUPPORTED_HASH_TYPES
> >> +};
> >
> > No need to have explicit capabilities exchange like this? Tun either
> > supports all or none.
>
> tun does not support VIRTIO_NET_RSS_HASH_TYPE_IP_EX,
> VIRTIO_NET_RSS_HASH_TYPE_TCP_EX, and VIRTIO_NET_RSS_HASH_TYPE_UDP_EX.
>
> It is because the flow dissector does not support IPv6 extensions. The
> specification is also vague, and does not tell how many TLVs should be
> consumed at most when interpreting destination option header so I chose
> to avoid adding code for these hash types to the flow dissector. I doubt
> anyone will complain about it since nobody complains for Linux.
>
> I'm also adding this so that we can extend it later.
> max_indirection_table_length may grow for systems with 128+ CPUs, or
> types may have other bits for new protocols in the future.
>
> >
> >>  case TUNSETSTEERINGEBPF:
> >> -   ret = tun_set_ebpf(tun, >steering_prog, argp);
> >> +   bpf_ret = tun_set_ebpf(tun, >steering_prog, argp);
> >> +   if (IS_ERR(bpf_ret))
> >> +   ret = PTR_ERR(bpf_ret);
> >> +   else if (bpf_ret)
> >> +   tun->vnet_hash.flags &= ~TUN_VNET_HASH_RSS;
> >
> > Don't make one feature disable another.
> >
> > TUNSETSTEERINGEBPF and TUNSETVNETHASH are mutually exclusive
> > functions. If one is enabled the other call should fail, with EBUSY
> > for instance.
> >
> >> +   case TUNSETVNETHASH:
> >> +   len = sizeof(vnet_hash);
> >> +   if (copy_from_user(_hash, argp, len)) {
> >> +   ret = -EFAULT;
> >> +   break;
> >> +   }
> >> +
> >> +   if (((vnet_hash.flags & TUN_VNET_HASH_REPORT) &&
> >> +(tun->vnet_hdr_sz < sizeof(struct 
> >> virtio_net_hdr_v1_hash) ||
> >> + !tun_is_little_endian(tun))) ||
> >> +vnet_hash.indirection_table_mask >=
> >> +TUN_VNET_HASH_MAX_INDIRECTION_TABLE_LENGTH) {
> >> +   ret = -EINVAL;
> >> +   break;
> >> +   }
> >> +
> >> +   argp = (u8 __user *)argp + len;
> >> +   len = (vnet_hash.indirection_table_mask + 1) * 2;
> >> +   if (copy_from_user(vnet_hash_indirection_table, argp, 
> >> len)) {
> >> +   ret = -EFAULT;
> >> +   break;
> >> +   }
> >> +
> >> +   argp = (u8 __user *)argp + len;
> >> +   len = virtio_net_hash_key_length(vnet_hash.types);
> >> +
> >> +   if (copy_from_user(vnet_hash_key, argp, len)) {
> >> +   ret = -EFAULT;
> >> +   break;
> >> +   }
> >
> > Probably easier and less error-prone to define a fixed size control
> > struct with the max indirection table size.
>
> I made its size variable because the indirection table and key may grow
> in the future as I wrote above.
>
> >
> > Btw: please trim the CC: list considerably on future patches.
>
> I'll do so in the next version with the TUNSETSTEERINGEBPF change you
> proposed.

To be clear: please don't just resubmit with that one change.

The skb and cb issues are quite fundamental issues that need to be resolved.

I'd like to understand why adjusting the existing BPF feature for this
exact purpose cannot be amended to return the key it produced.

As you point out, the C flow dissector is insufficient. The BPF flow
dissector does not have this problem. The same 

Re: [RFC PATCH 5/7] tun: Introduce virtio-net hashing feature

2023-10-08 Thread Willem de Bruijn
On Sun, Oct 8, 2023 at 7:22 AM Akihiko Odaki  wrote:
>
> virtio-net have two usage of hashes: one is RSS and another is hash
> reporting. Conventionally the hash calculation was done by the VMM.
> However, computing the hash after the queue was chosen defeats the
> purpose of RSS.
>
> Another approach is to use eBPF steering program. This approach has
> another downside: it cannot report the calculated hash due to the
> restrictive nature of eBPF.
>
> Introduce the code to compute hashes to the kernel in order to overcome
> thse challenges. An alternative solution is to extend the eBPF steering
> program so that it will be able to report to the userspace, but it makes
> little sense to allow to implement different hashing algorithms with
> eBPF since the hash value reported by virtio-net is strictly defined by
> the specification.
>
> The hash value already stored in sk_buff is not used and computed
> independently since it may have been computed in a way not conformant
> with the specification.
>
> Signed-off-by: Akihiko Odaki 
> ---

> +static const struct tun_vnet_hash_cap tun_vnet_hash_cap = {
> +   .max_indirection_table_length =
> +   TUN_VNET_HASH_MAX_INDIRECTION_TABLE_LENGTH,
> +
> +   .types = VIRTIO_NET_SUPPORTED_HASH_TYPES
> +};

No need to have explicit capabilities exchange like this? Tun either
supports all or none.

> case TUNSETSTEERINGEBPF:
> -   ret = tun_set_ebpf(tun, >steering_prog, argp);
> +   bpf_ret = tun_set_ebpf(tun, >steering_prog, argp);
> +   if (IS_ERR(bpf_ret))
> +   ret = PTR_ERR(bpf_ret);
> +   else if (bpf_ret)
> +   tun->vnet_hash.flags &= ~TUN_VNET_HASH_RSS;

Don't make one feature disable another.

TUNSETSTEERINGEBPF and TUNSETVNETHASH are mutually exclusive
functions. If one is enabled the other call should fail, with EBUSY
for instance.

> +   case TUNSETVNETHASH:
> +   len = sizeof(vnet_hash);
> +   if (copy_from_user(_hash, argp, len)) {
> +   ret = -EFAULT;
> +   break;
> +   }
> +
> +   if (((vnet_hash.flags & TUN_VNET_HASH_REPORT) &&
> +(tun->vnet_hdr_sz < sizeof(struct 
> virtio_net_hdr_v1_hash) ||
> + !tun_is_little_endian(tun))) ||
> +vnet_hash.indirection_table_mask >=
> +TUN_VNET_HASH_MAX_INDIRECTION_TABLE_LENGTH) {
> +   ret = -EINVAL;
> +   break;
> +   }
> +
> +   argp = (u8 __user *)argp + len;
> +   len = (vnet_hash.indirection_table_mask + 1) * 2;
> +   if (copy_from_user(vnet_hash_indirection_table, argp, len)) {
> +   ret = -EFAULT;
> +   break;
> +   }
> +
> +   argp = (u8 __user *)argp + len;
> +   len = virtio_net_hash_key_length(vnet_hash.types);
> +
> +   if (copy_from_user(vnet_hash_key, argp, len)) {
> +   ret = -EFAULT;
> +   break;
> +   }

Probably easier and less error-prone to define a fixed size control
struct with the max indirection table size.

Btw: please trim the CC: list considerably on future patches.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH 1/7] net: skbuff: Add tun_vnet_hash flag

2023-10-08 Thread Willem de Bruijn
On Sun, Oct 8, 2023 at 7:22 AM Akihiko Odaki  wrote:
>
> tun_vnet_hash can use this flag to indicate it stored virtio-net hash
> cache to cb.
>
> Signed-off-by: Akihiko Odaki 
> ---
>  include/linux/skbuff.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 4174c4b82d13..e638f157c13c 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -837,6 +837,7 @@ typedef unsigned char *sk_buff_data_t;
>   * @truesize: Buffer size
>   * @users: User count - see {datagram,tcp}.c
>   * @extensions: allocated extensions, valid if active_extensions is 
> nonzero
> + * @tun_vnet_hash: tun stored virtio-net hash cache to cb
>   */
>
>  struct sk_buff {
> @@ -989,6 +990,7 @@ struct sk_buff {
>  #if IS_ENABLED(CONFIG_IP_SCTP)
> __u8csum_not_inet:1;
>  #endif
> +   __u8tun_vnet_hash:1;

sk_buff space is very limited.

No need to extend it, especially for code that stays within a single
subsystem (tun).

To a lesser extent the same point applies to the qdisc_skb_cb.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH 0/7] tun: Introduce virtio-net hashing feature

2023-10-08 Thread Willem de Bruijn
On Sun, Oct 8, 2023 at 7:21 AM Akihiko Odaki  wrote:
>
> virtio-net have two usage of hashes: one is RSS and another is hash
> reporting. Conventionally the hash calculation was done by the VMM.
> However, computing the hash after the queue was chosen defeats the
> purpose of RSS.
>
> Another approach is to use eBPF steering program. This approach has
> another downside: it cannot report the calculated hash due to the
> restrictive nature of eBPF.
>
> Introduce the code to compute hashes to the kernel in order to overcome
> thse challenges.
>
> An alternative solution is to extend the eBPF steering program so that it
> will be able to report to the userspace, but it makes little sense to
> allow to implement different hashing algorithms with eBPF since the hash
> value reported by virtio-net is strictly defined by the specification.

But using the existing BPF steering may have the benefit of requiring
a lot less new code.

There is ample precedence for BPF programs that work this way. The
flow dissector comes to mind.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 09/16] vdpa/mlx5: Allow creation/deletion of any given mr struct

2023-10-08 Thread Dragos Tatulea via Virtualization
On Sun, 2023-10-08 at 12:25 +0800, Jason Wang wrote:
> On Tue, Sep 26, 2023 at 3:21 PM Dragos Tatulea  wrote:
> > 
> > On Tue, 2023-09-26 at 12:44 +0800, Jason Wang wrote:
> > > On Tue, Sep 12, 2023 at 9:02 PM Dragos Tatulea 
> > > wrote:
> > > > 
> > > > This patch adapts the mr creation/deletion code to be able to work with
> > > > any given mr struct pointer. All the APIs are adapted to take an extra
> > > > parameter for the mr.
> > > > 
> > > > mlx5_vdpa_create/delete_mr doesn't need a ASID parameter anymore. The
> > > > check is done in the caller instead (mlx5_set_map).
> > > > 
> > > > This change is needed for a followup patch which will introduce an
> > > > additional mr for the vq descriptor data.
> > > > 
> > > > Signed-off-by: Dragos Tatulea 
> > > > ---
> > > 
> > > Thinking of this decoupling I think I have a question.
> > > 
> > > We advertise 2 address spaces and 2 groups. So we actually don't know
> > > for example which address spaces will be used by dvq.
> > > 
> > > And actually we allow the user space to do something like
> > > 
> > > set_group_asid(dvq_group, 0)
> > > set_map(0)
> > > set_group_asid(dvq_group, 1)
> > > set_map(1)
> > > 
> > > I wonder if the decoupling like this patch can work and why.
> > > 
> > This scenario could indeed work. Especially if you look at the 13'th patch
> > [0]
> > where hw support is added. Are you wondering if this should work at all or
> > if it
> > should be blocked?
> 
> It would be great if it can work with the following patches. But at
> least for this patch, it seems not:
> 
> For example, what happens if we switch back to group 0 for dvq?
> 
> set_group_asid(dvq_group, 0)
> set_map(0)
> set_group_asid(dvq_group, 1)
> set_map(1)
> // here we destroy the mr created for asid 0
> set_group_asid(dvq_group, 0)
> 
If by destroy you mean .reset, I think it works: During .reset the mapping in
ASID 0 is reset back to the DMA/pysical map (mlx5_vdpa_create_dma_mr). Am I
missing something?

> Btw, if this is a new issue, I haven't checked whether or not it
> exists before this series (if yes, we can fix on top).

> > 
> > > It looks to me the most easy way is to let each AS be backed by an MR.
> > > Then we don't even need to care about the dvq, cvq.
> > That's what this patch series dowes.
> 
> Good to know this, I will review the series.
> 
I was planning to spin a v3 with Eugenio's suggestions. Should I wait for your
feedback before doing that?

Thanks,
Dragos
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH vhost v2 1/2] virtio_pci: fix the common map size and add check for common size

2023-10-08 Thread Michael S. Tsirkin
On Sun, Oct 08, 2023 at 05:38:41PM +0800, Xuan Zhuo wrote:
> Now, the function vp_modern_map_capability() takes the size parameter,
> which corresponds to the size of virtio_pci_common_cfg. As a result,
> this indicates the size of memory area to map.
> 
> However, if the _F_RING_RESET is negotiated, the extra items will be
> used. Therefore, we need to use the size of virtio_pci_modern_common_cfg
> to map more space.
> 
> Meanwhile, this patch removes the feature(_F_RING_ERSET and

typo

> _F_NOTIFICATION_DATA) when the common cfg size does not match
> the corresponding feature.
> 
> Fixes: 0b50cece0b78 ("virtio_pci: introduce helper to get/set queue reset")
> Signed-off-by: Xuan Zhuo 
> ---
>  drivers/virtio/virtio_pci_modern.c | 20 +++-
>  drivers/virtio/virtio_pci_modern_dev.c |  4 ++--
>  include/linux/virtio_pci_modern.h  |  1 +
>  3 files changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_pci_modern.c 
> b/drivers/virtio/virtio_pci_modern.c
> index d6bb68ba84e5..c0b9d2363ddb 100644
> --- a/drivers/virtio/virtio_pci_modern.c
> +++ b/drivers/virtio/virtio_pci_modern.c
> @@ -22,8 +22,26 @@
>  static u64 vp_get_features(struct virtio_device *vdev)
>  {
>   struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> + u64 features = vp_modern_get_features(_dev->mdev);
> +
> +#define check_feature(feature, field)
> \
> + do {
> \
> + if (features & BIT_ULL(feature)) {  
> \
> + u32 offset = offsetofend(struct 
> virtio_pci_modern_common_cfg, field);   \
> + if (unlikely(vp_dev->mdev.common_len < offset)) 
> \
> + features &= ~BIT_ULL(feature);  
> \
> + }   
> \
> + } while (0)
> +
> + /* For buggy devices, if the common len does not match the feature, we
> +  * remove the feature.

I don't like doing this in vp_get_features. userspace won't be able
to see the actual device features at all.
Also, we should print an info message at least.

I am still debating with myself whether clearing feature bits
or just failing finalize_features (and thus, probe) is best.


> +  */
> + check_feature(VIRTIO_F_NOTIFICATION_DATA, queue_notify_data);
> + check_feature(VIRTIO_F_RING_RESET, queue_reset);
> +
> +#undef check_feature

this macro's too scary. just pass offset and feature bit as
parameters to an inline function.

>  
> - return vp_modern_get_features(_dev->mdev);
> + return features;
>  }
>  
>  static void vp_transport_features(struct virtio_device *vdev, u64 features)
> diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
> b/drivers/virtio/virtio_pci_modern_dev.c
> index aad7d9296e77..33f319da1558 100644
> --- a/drivers/virtio/virtio_pci_modern_dev.c
> +++ b/drivers/virtio/virtio_pci_modern_dev.c
> @@ -291,8 +291,8 @@ int vp_modern_probe(struct virtio_pci_modern_device *mdev)
>   err = -EINVAL;
>   mdev->common = vp_modern_map_capability(mdev, common,
> sizeof(struct virtio_pci_common_cfg), 4,
> -   0, sizeof(struct virtio_pci_common_cfg),
> -   NULL, NULL);
> +   0, sizeof(struct 
> virtio_pci_modern_common_cfg),
> +   >common_len, NULL);
>   if (!mdev->common)
>   goto err_map_common;
>   mdev->isr = vp_modern_map_capability(mdev, isr, sizeof(u8), 1,
> diff --git a/include/linux/virtio_pci_modern.h 
> b/include/linux/virtio_pci_modern.h
> index 067ac1d789bc..edf62bae0474 100644
> --- a/include/linux/virtio_pci_modern.h
> +++ b/include/linux/virtio_pci_modern.h
> @@ -28,6 +28,7 @@ struct virtio_pci_modern_device {
>   /* So we can sanity-check accesses. */
>   size_t notify_len;
>   size_t device_len;
> + size_t common_len;
>  
>   /* Capability for when we need to map notifications per-vq. */
>   int notify_map_cap;
> -- 
> 2.32.0.3.g01195cf9f

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH vhost v2 2/2] virtio_pci: add build offset check for the new common cfg items

2023-10-08 Thread Xuan Zhuo
Add checks to the check_offsets(void) for queue_notify_data and
queue_reset.

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
---
 drivers/virtio/virtio_pci_modern_dev.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
b/drivers/virtio/virtio_pci_modern_dev.c
index 33f319da1558..e2a1fe7bb66c 100644
--- a/drivers/virtio/virtio_pci_modern_dev.c
+++ b/drivers/virtio/virtio_pci_modern_dev.c
@@ -203,6 +203,10 @@ static inline void check_offsets(void)
 offsetof(struct virtio_pci_common_cfg, queue_used_lo));
BUILD_BUG_ON(VIRTIO_PCI_COMMON_Q_USEDHI !=
 offsetof(struct virtio_pci_common_cfg, queue_used_hi));
+   BUILD_BUG_ON(VIRTIO_PCI_COMMON_Q_NDATA !=
+offsetof(struct virtio_pci_modern_common_cfg, 
queue_notify_data));
+   BUILD_BUG_ON(VIRTIO_PCI_COMMON_Q_RESET !=
+offsetof(struct virtio_pci_modern_common_cfg, 
queue_reset));
 }
 
 /*
-- 
2.32.0.3.g01195cf9f

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH vhost v2 1/2] virtio_pci: fix the common map size and add check for common size

2023-10-08 Thread Xuan Zhuo
Now, the function vp_modern_map_capability() takes the size parameter,
which corresponds to the size of virtio_pci_common_cfg. As a result,
this indicates the size of memory area to map.

However, if the _F_RING_RESET is negotiated, the extra items will be
used. Therefore, we need to use the size of virtio_pci_modern_common_cfg
to map more space.

Meanwhile, this patch removes the feature(_F_RING_ERSET and
_F_NOTIFICATION_DATA) when the common cfg size does not match
the corresponding feature.

Fixes: 0b50cece0b78 ("virtio_pci: introduce helper to get/set queue reset")
Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_pci_modern.c | 20 +++-
 drivers/virtio/virtio_pci_modern_dev.c |  4 ++--
 include/linux/virtio_pci_modern.h  |  1 +
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_pci_modern.c 
b/drivers/virtio/virtio_pci_modern.c
index d6bb68ba84e5..c0b9d2363ddb 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -22,8 +22,26 @@
 static u64 vp_get_features(struct virtio_device *vdev)
 {
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+   u64 features = vp_modern_get_features(_dev->mdev);
+
+#define check_feature(feature, field)  
\
+   do {
\
+   if (features & BIT_ULL(feature)) {  
\
+   u32 offset = offsetofend(struct 
virtio_pci_modern_common_cfg, field);   \
+   if (unlikely(vp_dev->mdev.common_len < offset)) 
\
+   features &= ~BIT_ULL(feature);  
\
+   }   
\
+   } while (0)
+
+   /* For buggy devices, if the common len does not match the feature, we
+* remove the feature.
+*/
+   check_feature(VIRTIO_F_NOTIFICATION_DATA, queue_notify_data);
+   check_feature(VIRTIO_F_RING_RESET, queue_reset);
+
+#undef check_feature
 
-   return vp_modern_get_features(_dev->mdev);
+   return features;
 }
 
 static void vp_transport_features(struct virtio_device *vdev, u64 features)
diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
b/drivers/virtio/virtio_pci_modern_dev.c
index aad7d9296e77..33f319da1558 100644
--- a/drivers/virtio/virtio_pci_modern_dev.c
+++ b/drivers/virtio/virtio_pci_modern_dev.c
@@ -291,8 +291,8 @@ int vp_modern_probe(struct virtio_pci_modern_device *mdev)
err = -EINVAL;
mdev->common = vp_modern_map_capability(mdev, common,
  sizeof(struct virtio_pci_common_cfg), 4,
- 0, sizeof(struct virtio_pci_common_cfg),
- NULL, NULL);
+ 0, sizeof(struct 
virtio_pci_modern_common_cfg),
+ >common_len, NULL);
if (!mdev->common)
goto err_map_common;
mdev->isr = vp_modern_map_capability(mdev, isr, sizeof(u8), 1,
diff --git a/include/linux/virtio_pci_modern.h 
b/include/linux/virtio_pci_modern.h
index 067ac1d789bc..edf62bae0474 100644
--- a/include/linux/virtio_pci_modern.h
+++ b/include/linux/virtio_pci_modern.h
@@ -28,6 +28,7 @@ struct virtio_pci_modern_device {
/* So we can sanity-check accesses. */
size_t notify_len;
size_t device_len;
+   size_t common_len;
 
/* Capability for when we need to map notifications per-vq. */
int notify_map_cap;
-- 
2.32.0.3.g01195cf9f

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH vhost v2 0/2] strictly check the acccess to the common cfg

2023-10-08 Thread Xuan Zhuo
1. fix the length of the pci_iomap_range() to the common cfg
2. add size check for the vq-reset
3. add build size check to the new common cfg items


Xuan Zhuo (2):
  virtio_pci: fix the common map size and add check for common size
  virtio_pci: add build offset check for the new common cfg items

 drivers/virtio/virtio_pci_modern.c | 20 +++-
 drivers/virtio/virtio_pci_modern_dev.c |  8 ++--
 include/linux/virtio_pci_modern.h  |  1 +
 3 files changed, 26 insertions(+), 3 deletions(-)

-- 
2.32.0.3.g01195cf9f

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH vhost v1 1/2] virtio_pci: fix the common map size and add check for vq-reset

2023-10-08 Thread Michael S. Tsirkin
On Sun, Oct 08, 2023 at 03:45:34PM +0800, Xuan Zhuo wrote:
> Now, the function vp_modern_map_capability() takes the size parameter,
> which corresponds to the size of virtio_pci_common_cfg. As a result,
> this indicates the size of memory area to map.
> 
> However, if the _F_RING_RESET is negotiated, the extra items will be
> used. Therefore, we need to use the size of virtio_pci_modre_common_cfg

typo

> to map more space.
> 
> Meanwhile, this patch checks the common cfg size when _F_RING_ERSET is
> negotiated.
> 

Fixes: tag please?

> Signed-off-by: Xuan Zhuo 
> ---
>  drivers/virtio/virtio_pci_modern_dev.c | 27 --
>  1 file changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
> b/drivers/virtio/virtio_pci_modern_dev.c
> index aad7d9296e77..45d41e6c7799 100644
> --- a/drivers/virtio/virtio_pci_modern_dev.c
> +++ b/drivers/virtio/virtio_pci_modern_dev.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-or-later
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -142,6 +143,22 @@ static inline int virtio_pci_find_capability(struct 
> pci_dev *dev, u8 cfg_type,
>   return 0;
>  }
>  
> +static inline int check_common_size(struct virtio_pci_modern_device *mdev,
> +  size_t common_len)
> +{
> + u64 features;
> +
> + features = vp_modern_get_features(mdev);

Better to combine this assignment into definition.
Or even better just drop the variable.

But more importantly this is called before DRIVER is set, right?
Not good then, this is a spec violation to read feature bits
before setting DRIVER:

The driver MUST follow this sequence to initialize a device:

\begin{enumerate}
\item Reset the device.

\item Set the ACKNOWLEDGE status bit: the guest OS has noticed the device.

\item Set the DRIVER status bit: the guest OS knows how to drive the device.

\item\label{itm:General Initialization And Device Operation /
Device Initialization / Read feature bits} Read device feature bits, and write 
the subset of feature bits
   understood by the OS and driver to the device.  During this step the
   driver MAY read (but MUST NOT write) the device-specific configuration 
fields to check that it can support the device before accepting it.


I guess we can add common_len alongside notify_len and device_len?

I think would also prefer to just clear VIRTIO_F_RING_RESET and not
fail probe.




> +
> + if (features & BIT_ULL(VIRTIO_F_RING_RESET)) {
> + if (unlikely(common_len < offsetofend(struct 
> virtio_pci_modern_common_cfg,
> +   queue_reset)))
> + return -ENOENT;

Why ENOENT?


Also as long as you are doing this, please give the same
treatment to VIRTIO_F_NOTIFICATION_DATA.

> + }
> +
> + return 0;
> +}
> +
>  /* This is part of the ABI.  Don't screw with it. */
>  static inline void check_offsets(void)
>  {
> @@ -218,6 +235,7 @@ int vp_modern_probe(struct virtio_pci_modern_device *mdev)
>   int err, common, isr, notify, device;
>   u32 notify_length;
>   u32 notify_offset;
> + size_t common_len;
>   int devid;
>  
>   check_offsets();
> @@ -291,10 +309,14 @@ int vp_modern_probe(struct virtio_pci_modern_device 
> *mdev)
>   err = -EINVAL;
>   mdev->common = vp_modern_map_capability(mdev, common,
> sizeof(struct virtio_pci_common_cfg), 4,
> -   0, sizeof(struct virtio_pci_common_cfg),
> -   NULL, NULL);
> +   0, sizeof(struct 
> virtio_pci_modern_common_cfg),
> +   _len, NULL);
>   if (!mdev->common)
>   goto err_map_common;
> +
> + if (check_common_size(mdev, common_len))
> + goto err_common_size;
> +
>   mdev->isr = vp_modern_map_capability(mdev, isr, sizeof(u8), 1,
>0, 1,
>NULL, NULL);
> @@ -353,6 +375,7 @@ int vp_modern_probe(struct virtio_pci_modern_device *mdev)
>  err_map_notify:
>   pci_iounmap(pci_dev, mdev->isr);
>  err_map_isr:
> +err_common_size:
>   pci_iounmap(pci_dev, mdev->common);
>  err_map_common:
>   pci_release_selected_regions(pci_dev, mdev->modern_bars);
> -- 
> 2.32.0.3.g01195cf9f

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH vhost v1 2/2] virtio_pci: add build offset check for the new common cfg items

2023-10-08 Thread Xuan Zhuo
Add checks to the check_offsets(void) for queue_notify_data and
queue_reset.

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
---
 drivers/virtio/virtio_pci_modern_dev.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
b/drivers/virtio/virtio_pci_modern_dev.c
index 45d41e6c7799..3a49bc963a71 100644
--- a/drivers/virtio/virtio_pci_modern_dev.c
+++ b/drivers/virtio/virtio_pci_modern_dev.c
@@ -220,6 +220,10 @@ static inline void check_offsets(void)
 offsetof(struct virtio_pci_common_cfg, queue_used_lo));
BUILD_BUG_ON(VIRTIO_PCI_COMMON_Q_USEDHI !=
 offsetof(struct virtio_pci_common_cfg, queue_used_hi));
+   BUILD_BUG_ON(VIRTIO_PCI_COMMON_Q_NDATA !=
+offsetof(struct virtio_pci_modern_common_cfg, 
queue_notify_data));
+   BUILD_BUG_ON(VIRTIO_PCI_COMMON_Q_RESET !=
+offsetof(struct virtio_pci_modern_common_cfg, 
queue_reset));
 }
 
 /*
-- 
2.32.0.3.g01195cf9f

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH vhost v1 0/2] strictly check the acccess to the common cfg

2023-10-08 Thread Xuan Zhuo
1. fix the length of the pci_iomap_range() to the common cfg
2. add size check for the vq-reset
3. add build size check to the new common cfg items

Xuan Zhuo (2):
  virtio_pci: fix the common map size and add check for vq-reset
  virtio_pci: add build offset check for the new common cfg items

 drivers/virtio/virtio_pci_modern_dev.c | 31 --
 1 file changed, 29 insertions(+), 2 deletions(-)

--
2.32.0.3.g01195cf9f

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH vhost v1 1/2] virtio_pci: fix the common map size and add check for vq-reset

2023-10-08 Thread Xuan Zhuo
Now, the function vp_modern_map_capability() takes the size parameter,
which corresponds to the size of virtio_pci_common_cfg. As a result,
this indicates the size of memory area to map.

However, if the _F_RING_RESET is negotiated, the extra items will be
used. Therefore, we need to use the size of virtio_pci_modre_common_cfg
to map more space.

Meanwhile, this patch checks the common cfg size when _F_RING_ERSET is
negotiated.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_pci_modern_dev.c | 27 --
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
b/drivers/virtio/virtio_pci_modern_dev.c
index aad7d9296e77..45d41e6c7799 100644
--- a/drivers/virtio/virtio_pci_modern_dev.c
+++ b/drivers/virtio/virtio_pci_modern_dev.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -142,6 +143,22 @@ static inline int virtio_pci_find_capability(struct 
pci_dev *dev, u8 cfg_type,
return 0;
 }
 
+static inline int check_common_size(struct virtio_pci_modern_device *mdev,
+size_t common_len)
+{
+   u64 features;
+
+   features = vp_modern_get_features(mdev);
+
+   if (features & BIT_ULL(VIRTIO_F_RING_RESET)) {
+   if (unlikely(common_len < offsetofend(struct 
virtio_pci_modern_common_cfg,
+ queue_reset)))
+   return -ENOENT;
+   }
+
+   return 0;
+}
+
 /* This is part of the ABI.  Don't screw with it. */
 static inline void check_offsets(void)
 {
@@ -218,6 +235,7 @@ int vp_modern_probe(struct virtio_pci_modern_device *mdev)
int err, common, isr, notify, device;
u32 notify_length;
u32 notify_offset;
+   size_t common_len;
int devid;
 
check_offsets();
@@ -291,10 +309,14 @@ int vp_modern_probe(struct virtio_pci_modern_device *mdev)
err = -EINVAL;
mdev->common = vp_modern_map_capability(mdev, common,
  sizeof(struct virtio_pci_common_cfg), 4,
- 0, sizeof(struct virtio_pci_common_cfg),
- NULL, NULL);
+ 0, sizeof(struct 
virtio_pci_modern_common_cfg),
+ _len, NULL);
if (!mdev->common)
goto err_map_common;
+
+   if (check_common_size(mdev, common_len))
+   goto err_common_size;
+
mdev->isr = vp_modern_map_capability(mdev, isr, sizeof(u8), 1,
 0, 1,
 NULL, NULL);
@@ -353,6 +375,7 @@ int vp_modern_probe(struct virtio_pci_modern_device *mdev)
 err_map_notify:
pci_iounmap(pci_dev, mdev->isr);
 err_map_isr:
+err_common_size:
pci_iounmap(pci_dev, mdev->common);
 err_map_common:
pci_release_selected_regions(pci_dev, mdev->modern_bars);
-- 
2.32.0.3.g01195cf9f

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vduse: make vduse_class constant

2023-10-08 Thread Michael S. Tsirkin
On Sun, Oct 08, 2023 at 08:46:45AM +0200, Greg Kroah-Hartman wrote:
> On Sun, Oct 08, 2023 at 02:41:22AM -0400, Michael S. Tsirkin wrote:
> > On Sun, Oct 08, 2023 at 08:40:05AM +0200, Greg Kroah-Hartman wrote:
> > > On Sun, Oct 08, 2023 at 02:20:52AM -0400, Michael S. Tsirkin wrote:
> > > > On Fri, Oct 06, 2023 at 04:30:44PM +0200, Greg Kroah-Hartman wrote:
> > > > > Now that the driver core allows for struct class to be in read-only
> > > > > memory, we should make all 'class' structures declared at build time
> > > > > placing them into read-only memory, instead of having to be 
> > > > > dynamically
> > > > > allocated at runtime.
> > > > > 
> > > > > Cc: "Michael S. Tsirkin" 
> > > > > Cc: Jason Wang 
> > > > > Cc: Xuan Zhuo 
> > > > > Cc: Xie Yongji 
> > > > > Signed-off-by: Greg Kroah-Hartman 
> > > > 
> > > > Acked-by: Michael S. Tsirkin 
> > > > 
> > > > Greg should I merge it or do you intend to merge all these patches?
> > > 
> > > "all"?  There's loads of them for all sorts of subsystems, so feel free
> > > to take it through your subsystem tree if you want.  I usually scoop up
> > > the ones that no one picks after a release and take them through my
> > > tree, to pick up the stragglers.
> > > 
> > > So it's your call, whatever is easier for you is fine for me.
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > 
> > To clarify which commit does this depend on?
> 
> The 6.4 kernel release :)

I'll pick this up then. Thanks!

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vduse: make vduse_class constant

2023-10-08 Thread Greg Kroah-Hartman
On Sun, Oct 08, 2023 at 02:41:22AM -0400, Michael S. Tsirkin wrote:
> On Sun, Oct 08, 2023 at 08:40:05AM +0200, Greg Kroah-Hartman wrote:
> > On Sun, Oct 08, 2023 at 02:20:52AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Oct 06, 2023 at 04:30:44PM +0200, Greg Kroah-Hartman wrote:
> > > > Now that the driver core allows for struct class to be in read-only
> > > > memory, we should make all 'class' structures declared at build time
> > > > placing them into read-only memory, instead of having to be dynamically
> > > > allocated at runtime.
> > > > 
> > > > Cc: "Michael S. Tsirkin" 
> > > > Cc: Jason Wang 
> > > > Cc: Xuan Zhuo 
> > > > Cc: Xie Yongji 
> > > > Signed-off-by: Greg Kroah-Hartman 
> > > 
> > > Acked-by: Michael S. Tsirkin 
> > > 
> > > Greg should I merge it or do you intend to merge all these patches?
> > 
> > "all"?  There's loads of them for all sorts of subsystems, so feel free
> > to take it through your subsystem tree if you want.  I usually scoop up
> > the ones that no one picks after a release and take them through my
> > tree, to pick up the stragglers.
> > 
> > So it's your call, whatever is easier for you is fine for me.
> > 
> > thanks,
> > 
> > greg k-h
> 
> To clarify which commit does this depend on?

The 6.4 kernel release :)
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vduse: make vduse_class constant

2023-10-08 Thread Michael S. Tsirkin
On Sun, Oct 08, 2023 at 08:40:05AM +0200, Greg Kroah-Hartman wrote:
> On Sun, Oct 08, 2023 at 02:20:52AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Oct 06, 2023 at 04:30:44PM +0200, Greg Kroah-Hartman wrote:
> > > Now that the driver core allows for struct class to be in read-only
> > > memory, we should make all 'class' structures declared at build time
> > > placing them into read-only memory, instead of having to be dynamically
> > > allocated at runtime.
> > > 
> > > Cc: "Michael S. Tsirkin" 
> > > Cc: Jason Wang 
> > > Cc: Xuan Zhuo 
> > > Cc: Xie Yongji 
> > > Signed-off-by: Greg Kroah-Hartman 
> > 
> > Acked-by: Michael S. Tsirkin 
> > 
> > Greg should I merge it or do you intend to merge all these patches?
> 
> "all"?  There's loads of them for all sorts of subsystems, so feel free
> to take it through your subsystem tree if you want.  I usually scoop up
> the ones that no one picks after a release and take them through my
> tree, to pick up the stragglers.
> 
> So it's your call, whatever is easier for you is fine for me.
> 
> thanks,
> 
> greg k-h

To clarify which commit does this depend on?

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vduse: make vduse_class constant

2023-10-08 Thread Greg Kroah-Hartman
On Sun, Oct 08, 2023 at 02:20:52AM -0400, Michael S. Tsirkin wrote:
> On Fri, Oct 06, 2023 at 04:30:44PM +0200, Greg Kroah-Hartman wrote:
> > Now that the driver core allows for struct class to be in read-only
> > memory, we should make all 'class' structures declared at build time
> > placing them into read-only memory, instead of having to be dynamically
> > allocated at runtime.
> > 
> > Cc: "Michael S. Tsirkin" 
> > Cc: Jason Wang 
> > Cc: Xuan Zhuo 
> > Cc: Xie Yongji 
> > Signed-off-by: Greg Kroah-Hartman 
> 
> Acked-by: Michael S. Tsirkin 
> 
> Greg should I merge it or do you intend to merge all these patches?

"all"?  There's loads of them for all sorts of subsystems, so feel free
to take it through your subsystem tree if you want.  I usually scoop up
the ones that no one picks after a release and take them through my
tree, to pick up the stragglers.

So it's your call, whatever is easier for you is fine for me.

thanks,

greg k-h
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vduse: make vduse_class constant

2023-10-08 Thread Michael S. Tsirkin
On Fri, Oct 06, 2023 at 04:30:44PM +0200, Greg Kroah-Hartman wrote:
> Now that the driver core allows for struct class to be in read-only
> memory, we should make all 'class' structures declared at build time
> placing them into read-only memory, instead of having to be dynamically
> allocated at runtime.
> 
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Cc: Xuan Zhuo 
> Cc: Xie Yongji 
> Signed-off-by: Greg Kroah-Hartman 

Acked-by: Michael S. Tsirkin 

Greg should I merge it or do you intend to merge all these patches?

> ---
>  drivers/vdpa/vdpa_user/vduse_dev.c | 40 --
>  1 file changed, 21 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
> b/drivers/vdpa/vdpa_user/vduse_dev.c
> index df7869537ef1..0ddd4b8abecb 100644
> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> @@ -134,7 +134,6 @@ static DEFINE_MUTEX(vduse_lock);
>  static DEFINE_IDR(vduse_idr);
>  
>  static dev_t vduse_major;
> -static struct class *vduse_class;
>  static struct cdev vduse_ctrl_cdev;
>  static struct cdev vduse_cdev;
>  static struct workqueue_struct *vduse_irq_wq;
> @@ -1528,6 +1527,16 @@ static const struct kobj_type vq_type = {
>   .default_groups = vq_groups,
>  };
>  
> +static char *vduse_devnode(const struct device *dev, umode_t *mode)
> +{
> + return kasprintf(GFP_KERNEL, "vduse/%s", dev_name(dev));
> +}
> +
> +static const struct class vduse_class = {
> + .name = "vduse",
> + .devnode = vduse_devnode,
> +};
> +
>  static void vduse_dev_deinit_vqs(struct vduse_dev *dev)
>  {
>   int i;
> @@ -1638,7 +1647,7 @@ static int vduse_destroy_dev(char *name)
>   mutex_unlock(>lock);
>  
>   vduse_dev_reset(dev);
> - device_destroy(vduse_class, MKDEV(MAJOR(vduse_major), dev->minor));
> + device_destroy(_class, MKDEV(MAJOR(vduse_major), dev->minor));
>   idr_remove(_idr, dev->minor);
>   kvfree(dev->config);
>   vduse_dev_deinit_vqs(dev);
> @@ -1805,7 +1814,7 @@ static int vduse_create_dev(struct vduse_dev_config 
> *config,
>  
>   dev->minor = ret;
>   dev->msg_timeout = VDUSE_MSG_DEFAULT_TIMEOUT;
> - dev->dev = device_create_with_groups(vduse_class, NULL,
> + dev->dev = device_create_with_groups(_class, NULL,
>   MKDEV(MAJOR(vduse_major), dev->minor),
>   dev, vduse_dev_groups, "%s", config->name);
>   if (IS_ERR(dev->dev)) {
> @@ -1821,7 +1830,7 @@ static int vduse_create_dev(struct vduse_dev_config 
> *config,
>  
>   return 0;
>  err_vqs:
> - device_destroy(vduse_class, MKDEV(MAJOR(vduse_major), dev->minor));
> + device_destroy(_class, MKDEV(MAJOR(vduse_major), dev->minor));
>  err_dev:
>   idr_remove(_idr, dev->minor);
>  err_idr:
> @@ -1934,11 +1943,6 @@ static const struct file_operations vduse_ctrl_fops = {
>   .llseek = noop_llseek,
>  };
>  
> -static char *vduse_devnode(const struct device *dev, umode_t *mode)
> -{
> - return kasprintf(GFP_KERNEL, "vduse/%s", dev_name(dev));
> -}
> -
>  struct vduse_mgmt_dev {
>   struct vdpa_mgmt_dev mgmt_dev;
>   struct device dev;
> @@ -2082,11 +2086,9 @@ static int vduse_init(void)
>   int ret;
>   struct device *dev;
>  
> - vduse_class = class_create("vduse");
> - if (IS_ERR(vduse_class))
> - return PTR_ERR(vduse_class);
> -
> - vduse_class->devnode = vduse_devnode;
> + ret = class_register(_class);
> + if (ret)
> + return ret;
>  
>   ret = alloc_chrdev_region(_major, 0, VDUSE_DEV_MAX, "vduse");
>   if (ret)
> @@ -2099,7 +2101,7 @@ static int vduse_init(void)
>   if (ret)
>   goto err_ctrl_cdev;
>  
> - dev = device_create(vduse_class, NULL, vduse_major, NULL, "control");
> + dev = device_create(_class, NULL, vduse_major, NULL, "control");
>   if (IS_ERR(dev)) {
>   ret = PTR_ERR(dev);
>   goto err_device;
> @@ -2141,13 +2143,13 @@ static int vduse_init(void)
>  err_wq:
>   cdev_del(_cdev);
>  err_cdev:
> - device_destroy(vduse_class, vduse_major);
> + device_destroy(_class, vduse_major);
>  err_device:
>   cdev_del(_ctrl_cdev);
>  err_ctrl_cdev:
>   unregister_chrdev_region(vduse_major, VDUSE_DEV_MAX);
>  err_chardev_region:
> - class_destroy(vduse_class);
> + class_unregister(_class);
>   return ret;
>  }
>  module_init(vduse_init);
> @@ -2159,10 +2161,10 @@ static void vduse_exit(void)
>   destroy_workqueue(vduse_irq_bound_wq);
>   destroy_workqueue(vduse_irq_wq);
>   cdev_del(_cdev);
> - device_destroy(vduse_class, vduse_major);
> + device_destroy(_class, vduse_major);
>   cdev_del(_ctrl_cdev);
>   unregister_chrdev_region(vduse_major, VDUSE_DEV_MAX);
> - class_destroy(vduse_class);
> + class_unregister(_class);
>  }
>  module_exit(vduse_exit);
>  
> -- 
> 2.42.0

___
Virtualization mailing