[GIT PULL] mlx5: last minute fixup

2022-05-18 Thread Michael S. Tsirkin
The following changes since commit 42226c989789d8da4af1de0c31070c96726d990c:

  Linux 5.18-rc7 (2022-05-15 18:08:58 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to acde3929492bcb9ceb0df1270230c422b1013798:

  vdpa/mlx5: Use consistent RQT size (2022-05-18 12:31:31 -0400)


mlx5: last minute fixup

The patch has been on the list for a while but as it was posted as part of a
thread it was missed.

Signed-off-by: Michael S. Tsirkin 


Eli Cohen (1):
  vdpa/mlx5: Use consistent RQT size

 drivers/vdpa/mlx5/net/mlx5_vnet.c | 61 ++-
 1 file changed, 21 insertions(+), 40 deletions(-)



Re: [PATCH] vhost_net: fix double fget()

2022-05-17 Thread Michael S. Tsirkin
On Tue, May 17, 2022 at 10:00:03PM +, Al Viro wrote:
> On Mon, May 16, 2022 at 04:44:19AM -0400, Michael S. Tsirkin wrote:
> > > Signed-off-by: Al Viro 
> > > Signed-off-by: Jason Wang 
> > 
> > Acked-by: Michael S. Tsirkin 
> > 
> > and this is stable material I guess.
> 
> It is, except that commit message ought to be cleaned up.  Something
> along the lines of
> 
> 
> Fix double fget() in vhost_net_set_backend()
> 
> Descriptor table is a shared resource; two fget() on the same descriptor
> may return different struct file references.  get_tap_ptr_ring() is
> called after we'd found (and pinned) the socket we'll be using and it
> tries to find the private tun/tap data structures associated with it.
> Redoing the lookup by the same file descriptor we'd used to get the
> socket is racy - we need to use the same struct file.
> 
> Thanks to Jason for spotting a braino in the original variant of patch -
> I'd missed the use of fd == -1 for disabling backend, and in that case
> we can end up with sock == NULL and sock != oldsock.
> 
> 
> Does the above sound sane for commit message?  And which tree would you
> prefer it to go through?  I can take it in vfs.git#fixes, or you could
> take it into your tree...

Acked-by: Michael S. Tsirkin 
for the new message and merging through your tree.

-- 
MST



Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio

2022-05-16 Thread Michael S. Tsirkin
On Mon, May 16, 2022 at 01:20:06PM +0200, Halil Pasic wrote:
> On Thu, 12 May 2022 11:31:08 +0800
> Jason Wang  wrote:
> 
> > > > It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> > > > do the synchronization.
> > > >
> > > > And we probably need to keep the
> > > > read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> > > > virtio_ccw_int_handler() to be called from process context (e.g from
> > > > the io_subchannel_quiesce()).
> > > >  
> > >
> > > Sounds correct.  
> > 
> > As Cornelia and Vineeth pointed out, in all the paths vring_interrupt
> > is called with irqs disabled.
> > 
> > So I will use spin_lock()/spin_unlock() in the next version.
> 
> Can we add some sort of assertion that, if the kernel is built with
> the corresponding debug features, will make sure this assumption holds
> (and warn if it does not)? That assertion would also document the fact.

Lockdep will do this automatically if you get it wrong, just like it
did here.

> If an assertion is not possible, I think we should at least place a
> strategic comment that documents our assumption.

That can't hurt.
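
For what it's worth, a minimal (untested) sketch of what such an assertion
could look like; the exact placement inside vring_interrupt() and the switch
to plain spin_lock() are assumptions based on the discussion above:

    /* drivers/virtio/virtio_ring.c -- sketch only, not a patch */
    irqreturn_t vring_interrupt(int irq, void *_vq)
    {
            struct vring_virtqueue *vq = to_vvq(_vq);

            /*
             * Document and check the assumption: every caller runs with
             * interrupts disabled, which is what makes plain
             * spin_lock()/spin_unlock() sufficient.  With lockdep
             * (CONFIG_PROVE_LOCKING) enabled this warns if the assumption
             * is ever violated; otherwise it compiles away.
             */
            lockdep_assert_irqs_disabled();

            /* ... rest of vring_interrupt() unchanged ... */
    }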

> Regards,
> Halil
> 
> > 
> > Thanks



Re: [PATCH v2 2/3] vdpa: Add a device object for vdpa management device

2022-05-16 Thread Michael S. Tsirkin
On Mon, May 16, 2022 at 06:31:18PM +0800, Yongji Xie wrote:
> On Mon, May 16, 2022 at 5:54 PM Michael S. Tsirkin  wrote:
> >
> > On Mon, May 16, 2022 at 05:31:27PM +0800, Yongji Xie wrote:
> > > On Mon, May 16, 2022 at 5:14 PM Jason Wang  wrote:
> > > >
> > > >
> > > > On 2022/5/16 14:03, Xie Yongji wrote:
> > > > > Introduce a device object for vdpa management device to control
> > > > > its lifecycle. And the device name will be used to match
> > > > > VDPA_ATTR_MGMTDEV_DEV_NAME field of netlink message rather than
> > > > > using parent device name.
> > > > >
> > > > > With this patch applied, drivers should use vdpa_mgmtdev_alloc()
> > > > > or _vdpa_mgmtdev_alloc() to allocate a vDPA management device
> > > > > before calling vdpa_mgmtdev_register(). And some buggy empty
> > > > > release function can also be removed from the driver codes.
> > > > >
> > > > > Signed-off-by: Xie Yongji 
> > > > > ---
> > > > >   drivers/vdpa/ifcvf/ifcvf_main.c  | 11 ++--
> > > > >   drivers/vdpa/mlx5/net/mlx5_vnet.c| 11 ++--
> > > > >   drivers/vdpa/vdpa.c  | 92 
> > > > > 
> > > > >   drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 39 
> > > > >   drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 46 +-
> > > > >   drivers/vdpa/vdpa_user/vduse_dev.c   | 38 
> > > > >   include/linux/vdpa.h | 38 +++-
> > > > >   7 files changed, 168 insertions(+), 107 deletions(-)
> > > > >
> > > > > diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c 
> > > > > b/drivers/vdpa/ifcvf/ifcvf_main.c
> > > > > index 4366320fb68d..d4087c37cfdf 100644
> > > > > --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> > > > > +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> > > > > @@ -821,10 +821,11 @@ static int ifcvf_probe(struct pci_dev *pdev, 
> > > > > const struct pci_device_id *id)
> > > > >   u32 dev_type;
> > > > >   int ret;
> > > > >
> > > > > - ifcvf_mgmt_dev = kzalloc(sizeof(struct ifcvf_vdpa_mgmt_dev), 
> > > > > GFP_KERNEL);
> > > > > - if (!ifcvf_mgmt_dev) {
> > > > > + ifcvf_mgmt_dev = vdpa_mgmtdev_alloc(struct ifcvf_vdpa_mgmt_dev,
> > > > > + mdev, dev_name(dev), dev);
> > > >
> > > >
> > > > Just wonder if it's better to make vDPA device a child of the mgmt
> > > > device instead of the PCI device?
> > > >
> > > > (Currently we use PCI device as the parent of the vDPA device, or at
> > > > least we can do this for the simulator which doesn't have a parent?)
> > > >
> > >
> > > Makes sense. I think we can do it for all vDPA drivers. Make sure the
> > > parent of the vDPA device is the vDPA management device.
> > >
> > > Thanks,
> > > Yongji
> >
> >
> > that's an ABI change though isn't it? parent is exposed in sysfs,
> > right?
> >
> 
> Hmm...yes. So it looks like we can't change it, right?
> 
> Thanks,
> Yongji

Afraid so. A way to find the pci device already exists I think, right?

-- 
MST


Re: [PATCH v2 2/3] vdpa: Add a device object for vdpa management device

2022-05-16 Thread Michael S. Tsirkin
On Mon, May 16, 2022 at 05:31:27PM +0800, Yongji Xie wrote:
> On Mon, May 16, 2022 at 5:14 PM Jason Wang  wrote:
> >
> >
> > On 2022/5/16 14:03, Xie Yongji wrote:
> > > Introduce a device object for vdpa management device to control
> > > its lifecycle. And the device name will be used to match
> > > VDPA_ATTR_MGMTDEV_DEV_NAME field of netlink message rather than
> > > using parent device name.
> > >
> > > With this patch applied, drivers should use vdpa_mgmtdev_alloc()
> > > or _vdpa_mgmtdev_alloc() to allocate a vDPA management device
> > > before calling vdpa_mgmtdev_register(). And some buggy empty
> > > release function can also be removed from the driver codes.
> > >
> > > Signed-off-by: Xie Yongji 
> > > ---
> > >   drivers/vdpa/ifcvf/ifcvf_main.c  | 11 ++--
> > >   drivers/vdpa/mlx5/net/mlx5_vnet.c| 11 ++--
> > >   drivers/vdpa/vdpa.c  | 92 
> > >   drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 39 
> > >   drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 46 +-
> > >   drivers/vdpa/vdpa_user/vduse_dev.c   | 38 
> > >   include/linux/vdpa.h | 38 +++-
> > >   7 files changed, 168 insertions(+), 107 deletions(-)
> > >
> > > diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c 
> > > b/drivers/vdpa/ifcvf/ifcvf_main.c
> > > index 4366320fb68d..d4087c37cfdf 100644
> > > --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> > > +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> > > @@ -821,10 +821,11 @@ static int ifcvf_probe(struct pci_dev *pdev, const 
> > > struct pci_device_id *id)
> > >   u32 dev_type;
> > >   int ret;
> > >
> > > - ifcvf_mgmt_dev = kzalloc(sizeof(struct ifcvf_vdpa_mgmt_dev), 
> > > GFP_KERNEL);
> > > - if (!ifcvf_mgmt_dev) {
> > > + ifcvf_mgmt_dev = vdpa_mgmtdev_alloc(struct ifcvf_vdpa_mgmt_dev,
> > > + mdev, dev_name(dev), dev);
> >
> >
> > Just wonder if it's better to make vDPA device a child of the mgmt
> > device instead of the PCI device?
> >
> > (Currently we use PCI device as the parent of the vDPA device, or at
> > least we can do this for the simulator which doesn't have a parent?)
> >
> 
> Makes sense. I think we can do it for all vDPA drivers. Make sure the
> parent of the vDPA device is the vDPA management device.
> 
> Thanks,
> Yongji


that's an ABI change though isn't it? parent is exposed in sysfs,
right?


Re: [PATCH] vdpa/mlx5: Use consistent RQT size

2022-05-16 Thread Michael S. Tsirkin
On Mon, May 16, 2022 at 11:47:35AM +0300, Eli Cohen wrote:
> The current code evaluates RQT size based on the configured number of
> virtqueues. This can raise an issue in the following scenario:
> 
> Assume MQ was negotiated.
> 1. mlx5_vdpa_set_map() gets called.
> 2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower
>than the configured max VQs.
> 3. A second set_map gets called, but now a smaller number of VQs is used
>to evaluate the size of the RQT.
> 4. handle_ctrl_mq() is called with a value larger than what the RQT can
>hold. This will emit errors and the driver state is compromised.
> 
> To fix this, we use a new field in struct mlx5_vdpa_net to hold the
> required number of entries in the RQT. This value is evaluated in
> mlx5_vdpa_set_driver_features() where we have the negotiated features
> all set up.
> 
> In addition to that, we take into consideration the max capability of RQT
> entries early when the device is added so we don't need to consider
> it when creating the RQT.
> 
> Last, we remove the use of mlx5_vdpa_max_qps() which just returns the
> max_vqs / 2, and make the code clearer.
> 
> Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support")
> Acked-by: Jason Wang 
> Signed-off-by: Eli Cohen 

I picked this up. Thanks!

> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 61 +++
>  1 file changed, 21 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 79001301b383..e0de44000d92 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -161,6 +161,7 @@ struct mlx5_vdpa_net {
>   struct mlx5_flow_handle *rx_rule_mcast;
>   bool setup;
>   u32 cur_num_vqs;
> + u32 rqt_size;
>   struct notifier_block nb;
>   struct vdpa_callback config_cb;
>   struct mlx5_vdpa_wq_ent cvq_ent;
> @@ -204,17 +205,12 @@ static __virtio16 cpu_to_mlx5vdpa16(struct 
> mlx5_vdpa_dev *mvdev, u16 val)
>   return __cpu_to_virtio16(mlx5_vdpa_is_little_endian(mvdev), val);
>  }
>  
> -static inline u32 mlx5_vdpa_max_qps(int max_vqs)
> -{
> - return max_vqs / 2;
> -}
> -
>  static u16 ctrl_vq_idx(struct mlx5_vdpa_dev *mvdev)
>  {
>   if (!(mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
>   return 2;
>  
> - return 2 * mlx5_vdpa_max_qps(mvdev->max_vqs);
> + return mvdev->max_vqs;
>  }
>  
>  static bool is_ctrl_vq_idx(struct mlx5_vdpa_dev *mvdev, u16 idx)
> @@ -1236,25 +1232,13 @@ static void teardown_vq(struct mlx5_vdpa_net *ndev, 
> struct mlx5_vdpa_virtqueue *
>  static int create_rqt(struct mlx5_vdpa_net *ndev)
>  {
>   __be32 *list;
> - int max_rqt;
>   void *rqtc;
>   int inlen;
>   void *in;
>   int i, j;
>   int err;
> - int num;
> -
> - if (!(ndev->mvdev.actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
> - num = 1;
> - else
> - num = ndev->cur_num_vqs / 2;
>  
> - max_rqt = min_t(int, roundup_pow_of_two(num),
> - 1 << MLX5_CAP_GEN(ndev->mvdev.mdev, log_max_rqt_size));
> - if (max_rqt < 1)
> - return -EOPNOTSUPP;
> -
> - inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + max_rqt * 
> MLX5_ST_SZ_BYTES(rq_num);
> + inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + ndev->rqt_size * 
> MLX5_ST_SZ_BYTES(rq_num);
>   in = kzalloc(inlen, GFP_KERNEL);
>   if (!in)
>   return -ENOMEM;
> @@ -1263,12 +1247,12 @@ static int create_rqt(struct mlx5_vdpa_net *ndev)
>   rqtc = MLX5_ADDR_OF(create_rqt_in, in, rqt_context);
>  
>   MLX5_SET(rqtc, rqtc, list_q_type, MLX5_RQTC_LIST_Q_TYPE_VIRTIO_NET_Q);
> - MLX5_SET(rqtc, rqtc, rqt_max_size, max_rqt);
> + MLX5_SET(rqtc, rqtc, rqt_max_size, ndev->rqt_size);
>   list = MLX5_ADDR_OF(rqtc, rqtc, rq_num[0]);
> - for (i = 0, j = 0; i < max_rqt; i++, j += 2)
> - list[i] = cpu_to_be32(ndev->vqs[j % (2 * num)].virtq_id);
> + for (i = 0, j = 0; i < ndev->rqt_size; i++, j += 2)
> + list[i] = cpu_to_be32(ndev->vqs[j % 
> ndev->cur_num_vqs].virtq_id);
>  
> - MLX5_SET(rqtc, rqtc, rqt_actual_size, max_rqt);
> + MLX5_SET(rqtc, rqtc, rqt_actual_size, ndev->rqt_size);
>   err = mlx5_vdpa_create_rqt(&ndev->mvdev, in, inlen, &ndev->res.rqtn);
>   kfree(in);
>   if (err)
> @@ -1282,19 +1266,13 @@ static int create_rqt(struct mlx5_vdpa_net *ndev)
>  static int modify_rqt(struct mlx5_vdpa_net *ndev, int num)
>  {
>   __be32 *list;
> - int max_rqt;
>   void *rqtc;
>   int inlen;
>   void *in;
>   int i, j;
>   int err;
>  
> - max_rqt = min_t(int, roundup_pow_of_two(ndev->cur_num_vqs / 2),
> - 1 << MLX5_CAP_GEN(ndev->mvdev.mdev, log_max_rqt_size));
> - if (max_rqt < 1)
> - return -EOPNOTSUPP;
> -
> - inlen = MLX5_ST_SZ_BYTES(modify_rqt_in) + max_rqt * 
> MLX5_ST_SZ_BYTES(rq_num);
> + inlen = MLX5_ST_SZ_BYTES(modify_rqt_in) + 

Re: [GIT PULL] virtio: last minute fixup

2022-05-16 Thread Michael S. Tsirkin
On Thu, May 12, 2022 at 10:10:34AM -0700, Linus Torvalds wrote:
> On Thu, May 12, 2022 at 6:30 AM Michael Ellerman  wrote:
> >
> > Links to other random places don't serve that function.
> 
> What "function"?
> 
> This is my argument. Those Link: things need to have a *reason*.
> 
> Saying "they are a change ID" is not a reason. That's just a random
> word-salad. You need to have an active reason that you can explain,
> not just say "look, I want to add a message ID to every commit".

So I want to go to my inbox and compare the patch as received with what
is in my tree.  What did I change? And I tweak both the patch content
and the subject when applying, so these are not good indicators.  Is this
at all convincing?

-- 
MST



Re: [PATCH] vhost_net: fix double fget()

2022-05-16 Thread Michael S. Tsirkin
On Mon, May 16, 2022 at 04:42:13PM +0800, Jason Wang wrote:
> From: Al Viro 
> 
> Here's another piece of code assuming that repeated fget() will yield the
> same opened file: in vhost_net_set_backend() we have
> 
> sock = get_socket(fd);
> if (IS_ERR(sock)) {
> r = PTR_ERR(sock);
> goto err_vq;
> }
> 
> /* start polling new socket */
> oldsock = vhost_vq_get_backend(vq);
> if (sock != oldsock) {
> ...
> vhost_vq_set_backend(vq, sock);
> ...
> if (index == VHOST_NET_VQ_RX)
> nvq->rx_ring = get_tap_ptr_ring(fd);
> 
> with
> static struct socket *get_socket(int fd)
> {
> struct socket *sock;
> 
> /* special case to disable backend */
> if (fd == -1)
> return NULL;
> sock = get_raw_socket(fd);
> if (!IS_ERR(sock))
> return sock;
> sock = get_tap_socket(fd);
> if (!IS_ERR(sock))
> return sock;
> return ERR_PTR(-ENOTSOCK);
> }
> and
> static struct ptr_ring *get_tap_ptr_ring(int fd)
> {
> struct ptr_ring *ring;
> struct file *file = fget(fd);
> 
> if (!file)
> return NULL;
> ring = tun_get_tx_ring(file);
> if (!IS_ERR(ring))
> goto out;
> ring = tap_get_ptr_ring(file);
> if (!IS_ERR(ring))
> goto out;
> ring = NULL;
> out:
> fput(file);
> return ring;
> }
> 
> Again, there is no promise that fd will resolve to the same thing for
> lookups in get_socket() and in get_tap_ptr_ring().  I'm not familiar
> enough with the guts of drivers/vhost to tell how easy it is to turn
> into attack, but it looks like trouble.  If nothing else, the pointer
> returned by tun_get_tx_ring() is not guaranteed to be pinned down by
> anything - the reference to sock will _usually_ suffice, but that
> doesn't help any if we get a different socket on that second fget().
> 
> One possible way to fix it would be the patch below; objections?
> 
> Signed-off-by: Al Viro 
> Signed-off-by: Jason Wang 

Acked-by: Michael S. Tsirkin 

and this is stable material I guess.

> ---
>  drivers/vhost/net.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 28ef323882fb..0bd7d91de792 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -1449,13 +1449,9 @@ static struct socket *get_raw_socket(int fd)
>   return ERR_PTR(r);
>  }
>  
> -static struct ptr_ring *get_tap_ptr_ring(int fd)
> +static struct ptr_ring *get_tap_ptr_ring(struct file *file)
>  {
>   struct ptr_ring *ring;
> - struct file *file = fget(fd);
> -
> - if (!file)
> - return NULL;
>   ring = tun_get_tx_ring(file);
>   if (!IS_ERR(ring))
>   goto out;
> @@ -1464,7 +1460,6 @@ static struct ptr_ring *get_tap_ptr_ring(int fd)
>   goto out;
>   ring = NULL;
>  out:
> - fput(file);
>   return ring;
>  }
>  
> @@ -1551,8 +1546,12 @@ static long vhost_net_set_backend(struct vhost_net *n, 
> unsigned index, int fd)
>   r = vhost_net_enable_vq(n, vq);
>   if (r)
>   goto err_used;
> - if (index == VHOST_NET_VQ_RX)
> - nvq->rx_ring = get_tap_ptr_ring(fd);
> + if (index == VHOST_NET_VQ_RX) {
> + if (sock)
> + nvq->rx_ring = get_tap_ptr_ring(sock->file);
> + else
> + nvq->rx_ring = NULL;
> + }
>  
>   oldubufs = nvq->ubufs;
>   nvq->ubufs = ubufs;
> -- 
> 2.25.1



Re: [PATCH v2 1/3] vduse: Pass management device pointer to vduse_dev_init_vdpa()

2022-05-16 Thread Michael S. Tsirkin
On Mon, May 16, 2022 at 02:03:40PM +0800, Xie Yongji wrote:
> Pass management device pointer from vdpa_dev_add() to
> vduse_dev_init_vdpa() rather than using the static
> variable directly.
> 
> No functional change.
> 
> Signed-off-by: Xie Yongji 

Could you pls add a cover letter explaining what is the patchset
trying to achieve? I think I can guess but I'd rather not guess ...


> ---
>  drivers/vdpa/vdpa_user/vduse_dev.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
> b/drivers/vdpa/vdpa_user/vduse_dev.c
> index 160e40d03084..d3bf55a58cd2 100644
> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> @@ -1484,9 +1484,8 @@ static struct device vduse_mgmtdev = {
>   .release = vduse_mgmtdev_release,
>  };
>  
> -static struct vdpa_mgmt_dev mgmt_dev;
> -
> -static int vduse_dev_init_vdpa(struct vduse_dev *dev, const char *name)
> +static int vduse_dev_init_vdpa(struct vduse_dev *dev,
> +struct vdpa_mgmt_dev *mdev, const char *name)
>  {
>   struct vduse_vdpa *vdev;
>   int ret;
> @@ -1509,7 +1508,7 @@ static int vduse_dev_init_vdpa(struct vduse_dev *dev, 
> const char *name)
>   }
>   set_dma_ops(&vdev->vdpa.dev, &vduse_dev_dma_ops);
>   vdev->vdpa.dma_dev = &vdev->vdpa.dev;
> - vdev->vdpa.mdev = &mgmt_dev;
> + vdev->vdpa.mdev = mdev;
>  
>   return 0;
>  }
> @@ -1526,7 +1525,7 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, 
> const char *name,
>   mutex_unlock(&vduse_lock);
>   return -EINVAL;
>   }
> - ret = vduse_dev_init_vdpa(dev, name);
> + ret = vduse_dev_init_vdpa(dev, mdev, name);
>   mutex_unlock(&vduse_lock);
>   if (ret)
>   return ret;
> -- 
> 2.20.1



Re: [PATCH v2 2/3] vdpa: Add a device object for vdpa management device

2022-05-16 Thread Michael S. Tsirkin
On Mon, May 16, 2022 at 02:03:41PM +0800, Xie Yongji wrote:
> Introduce a device object for vdpa management device to control
> its lifecycle. And the device name will be used to match
> VDPA_ATTR_MGMTDEV_DEV_NAME field of netlink message rather than
> using parent device name.
> 
> With this patch applied, drivers should use vdpa_mgmtdev_alloc()
> or _vdpa_mgmtdev_alloc() to allocate a vDPA management device
> before calling vdpa_mgmtdev_register(). And some buggy empty
> release function can also be removed from the driver codes.
> 
> Signed-off-by: Xie Yongji 
> ---
>  drivers/vdpa/ifcvf/ifcvf_main.c  | 11 ++--
>  drivers/vdpa/mlx5/net/mlx5_vnet.c| 11 ++--
>  drivers/vdpa/vdpa.c  | 92 
>  drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 39 
>  drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 46 +-
>  drivers/vdpa/vdpa_user/vduse_dev.c   | 38 
>  include/linux/vdpa.h | 38 +++-
>  7 files changed, 168 insertions(+), 107 deletions(-)
> 
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index 4366320fb68d..d4087c37cfdf 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -821,10 +821,11 @@ static int ifcvf_probe(struct pci_dev *pdev, const 
> struct pci_device_id *id)
>   u32 dev_type;
>   int ret;
>  
> - ifcvf_mgmt_dev = kzalloc(sizeof(struct ifcvf_vdpa_mgmt_dev), 
> GFP_KERNEL);
> - if (!ifcvf_mgmt_dev) {
> + ifcvf_mgmt_dev = vdpa_mgmtdev_alloc(struct ifcvf_vdpa_mgmt_dev,
> + mdev, dev_name(dev), dev);
> + if (IS_ERR(ifcvf_mgmt_dev)) {
>   IFCVF_ERR(pdev, "Failed to alloc memory for the vDPA management 
> device\n");
> - return -ENOMEM;
> + return PTR_ERR(ifcvf_mgmt_dev);
>   }
>  
>   dev_type = get_dev_type(pdev);
> @@ -842,7 +843,6 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct 
> pci_device_id *id)
>   }
>  
>   ifcvf_mgmt_dev->mdev.ops = &ifcvf_vdpa_mgmt_dev_ops;
> - ifcvf_mgmt_dev->mdev.device = dev;
>   ifcvf_mgmt_dev->pdev = pdev;
>  
>   ret = pcim_enable_device(pdev);
> @@ -883,7 +883,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct 
> pci_device_id *id)
>   return 0;
>  
>  err:
> - kfree(ifcvf_mgmt_dev);
> + put_device(&ifcvf_mgmt_dev->mdev.device);
>   return ret;
>  }
>  
> @@ -893,7 +893,6 @@ static void ifcvf_remove(struct pci_dev *pdev)
>  
>   ifcvf_mgmt_dev = pci_get_drvdata(pdev);
>   vdpa_mgmtdev_unregister(&ifcvf_mgmt_dev->mdev);
> - kfree(ifcvf_mgmt_dev);
>  }
>  
>  static struct pci_device_id ifcvf_pci_ids[] = {
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 79001301b383..3a88609dcf13 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -2828,12 +2828,12 @@ static int mlx5v_probe(struct auxiliary_device *adev,
>   struct mlx5_vdpa_mgmtdev *mgtdev;
>   int err;
>  
> - mgtdev = kzalloc(sizeof(*mgtdev), GFP_KERNEL);
> - if (!mgtdev)
> - return -ENOMEM;
> + mgtdev = vdpa_mgmtdev_alloc(struct mlx5_vdpa_mgmtdev, mgtdev,
> + dev_name(mdev->device), mdev->device);
> + if (IS_ERR(mgtdev))
> + return PTR_ERR(mgtdev);
>  
>   mgtdev->mgtdev.ops = &mdev_ops;
> - mgtdev->mgtdev.device = mdev->device;
>   mgtdev->mgtdev.id_table = id_table;
>   mgtdev->mgtdev.config_attr_mask = 
> BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MACADDR) |
> 
> BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MAX_VQP) |
> @@ -2852,7 +2852,7 @@ static int mlx5v_probe(struct auxiliary_device *adev,
>   return 0;
>  
>  reg_err:
> - kfree(mgtdev);
> + put_device(&mgtdev->mgtdev.device);
>   return err;
>  }
>  
> @@ -2862,7 +2862,6 @@ static void mlx5v_remove(struct auxiliary_device *adev)
>  
>   mgtdev = auxiliary_get_drvdata(adev);
>   vdpa_mgmtdev_unregister(&mgtdev->mgtdev);
> - kfree(mgtdev);
>  }
>  
>  static const struct auxiliary_device_id mlx5v_id_table[] = {
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 2b75c00b1005..3d3f98dd2bb6 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -337,21 +337,80 @@ void vdpa_unregister_driver(struct vdpa_driver *drv)
>  }
>  EXPORT_SYMBOL_GPL(vdpa_unregister_driver);
>  
> +static inline struct vdpa_mgmt_dev *to_vdpa_mgmt_dev(struct device *dev)
> +{
> + return container_of(dev, struct vdpa_mgmt_dev, device);
> +}
> +
> +static void vdpa_mgmtdev_release(struct device *dev)
> +{
> + kfree(to_vdpa_mgmt_dev(dev));
> +}
> +
>  /**
> - * vdpa_mgmtdev_register - register a vdpa management device
> + * __vdpa_mgmtdev_alloc - allocate and initilaize a vDPA management device

initialize


> + * @name: name of the vdpa management device
> + * @parent: the parent device; optional
> + * @size: size of the data 

Re: [PATCH RESEND V3 3/3] vdpa/mlx5: Use consistent RQT size

2022-05-16 Thread Michael S. Tsirkin
On Mon, May 16, 2022 at 08:17:18AM +, Eli Cohen wrote:
> Hi Michael,
> 
> When are you going to pull this fix?
> It fixes a real problem and was reviewed and acked.

Do I understand it correctly that this is a stand-alone patch?
Sorry, my process has been thrown off by it being labeled 3/3 but not
being part of a thread. Do not do this for single patches please.
And I suspect 0-day machinery didn't process it either.
Can you repost as a stand-alone patch please?
I will then process ASAP.

Thanks!

> > -Original Message-
> > From: Eli Cohen 
> > Sent: Wednesday, April 6, 2022 11:53 AM
> > To: m...@redhat.com; jasow...@redhat.com
> > Cc: hdan...@sina.com; virtualization@lists.linux-foundation.org; 
> > linux-ker...@vger.kernel.org; Eli Cohen 
> > Subject: [PATCH RESEND V3 3/3] vdpa/mlx5: Use consistent RQT size
> > 
> > The current code evaluates RQT size based on the configured number of
> > virtqueues. This can raise an issue in the following scenario:
> > 
> > Assume MQ was negotiated.
> > 1. mlx5_vdpa_set_map() gets called.
> > 2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower
> >than the configured max VQs.
> > 3. A second set_map gets called, but now a smaller number of VQs is used
> >to evaluate the size of the RQT.
> > 4. handle_ctrl_mq() is called with a value larger than what the RQT can
> >hold. This will emit errors and the driver state is compromised.
> > 
> > To fix this, we use a new field in struct mlx5_vdpa_net to hold the
> > required number of entries in the RQT. This value is evaluated in
> > mlx5_vdpa_set_driver_features() where we have the negotiated features
> > all set up.
> > 
> > In addition to that, we take into consideration the max capability of RQT
> > entries early when the device is added so we don't need to consider
> > it when creating the RQT.
> > 
> > Last, we remove the use of mlx5_vdpa_max_qps() which just returns the
> > max_vqs / 2, and make the code clearer.
> > 
> > Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support")
> > Acked-by: Jason Wang 
> > Signed-off-by: Eli Cohen 
> > ---
> > V2 -> V3:
> > Fix typo in change log
> > Add acked-by Jason
> > 
> >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 61 +++
> >  1 file changed, 21 insertions(+), 40 deletions(-)
> > 
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> > b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index 79001301b383..e0de44000d92 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -161,6 +161,7 @@ struct mlx5_vdpa_net {
> > struct mlx5_flow_handle *rx_rule_mcast;
> > bool setup;
> > u32 cur_num_vqs;
> > +   u32 rqt_size;
> > struct notifier_block nb;
> > struct vdpa_callback config_cb;
> > struct mlx5_vdpa_wq_ent cvq_ent;
> > @@ -204,17 +205,12 @@ static __virtio16 cpu_to_mlx5vdpa16(struct 
> > mlx5_vdpa_dev *mvdev, u16 val)
> > return __cpu_to_virtio16(mlx5_vdpa_is_little_endian(mvdev), val);
> >  }
> > 
> > -static inline u32 mlx5_vdpa_max_qps(int max_vqs)
> > -{
> > -   return max_vqs / 2;
> > -}
> > -
> >  static u16 ctrl_vq_idx(struct mlx5_vdpa_dev *mvdev)
> >  {
> > if (!(mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
> > return 2;
> > 
> > -   return 2 * mlx5_vdpa_max_qps(mvdev->max_vqs);
> > +   return mvdev->max_vqs;
> >  }
> > 
> >  static bool is_ctrl_vq_idx(struct mlx5_vdpa_dev *mvdev, u16 idx)
> > @@ -1236,25 +1232,13 @@ static void teardown_vq(struct mlx5_vdpa_net *ndev, 
> > struct mlx5_vdpa_virtqueue *
> >  static int create_rqt(struct mlx5_vdpa_net *ndev)
> >  {
> > __be32 *list;
> > -   int max_rqt;
> > void *rqtc;
> > int inlen;
> > void *in;
> > int i, j;
> > int err;
> > -   int num;
> > -
> > -   if (!(ndev->mvdev.actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
> > -   num = 1;
> > -   else
> > -   num = ndev->cur_num_vqs / 2;
> > 
> > -   max_rqt = min_t(int, roundup_pow_of_two(num),
> > -   1 << MLX5_CAP_GEN(ndev->mvdev.mdev, log_max_rqt_size));
> > -   if (max_rqt < 1)
> > -   return -EOPNOTSUPP;
> > -
> > -   inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + max_rqt * 
> > MLX5_ST_SZ_BYTES(rq_num);
> > +   inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + ndev->rqt_size * 
> > MLX5_ST_SZ_BYTES(rq_num);
> > in = kzalloc(inlen, GFP_KERNEL);
> > if (!in)
> > return -ENOMEM;
> > @@ -1263,12 +1247,12 @@ static int create_rqt(struct mlx5_vdpa_net *ndev)
> > rqtc = MLX5_ADDR_OF(create_rqt_in, in, rqt_context);
> > 
> > MLX5_SET(rqtc, rqtc, list_q_type, MLX5_RQTC_LIST_Q_TYPE_VIRTIO_NET_Q);
> > -   MLX5_SET(rqtc, rqtc, rqt_max_size, max_rqt);
> > +   MLX5_SET(rqtc, rqtc, rqt_max_size, ndev->rqt_size);
> > list = MLX5_ADDR_OF(rqtc, rqtc, rq_num[0]);
> > -   for (i = 0, j = 0; i < max_rqt; i++, j += 2)
> > -   list[i] = cpu_to_be32(ndev->vqs[j % (2 * num)].virtq_id);
> > +   for (i = 0, j = 0; i < ndev->rqt_size; i++, j += 2)
> > +

Re: [GIT PULL] virtio: last minute fixup

2022-05-13 Thread Michael S. Tsirkin
On Wed, May 11, 2022 at 02:24:23PM +0200, Jörg Rödel wrote:
> On Tue, May 10, 2022 at 11:23:11AM -0700, Linus Torvalds wrote:
> > And - once again - I want to complain about the "Link:" in that commit.
> 
> I have to say that for me (probably for others as well) those Link tags
> pointing to the patch submission have quite some value:
> 
>   1) First of all it is an easy proof that the patch was actually
>  submitted somewhere for public review before it went into a
>  maintainers tree.
> 
>   2) The patch submission is often the entry point to the
>  discussion which lead to this patch. From that email I can
>  see what was discussed and often there is even a link to
>  previous versions and the discussions that happened there. It
>  helps to better understand how a patch came to be the way it
>  is. I know this should ideally be part of the commit message,
>  but in reality this is what I also use the link tag for.
> 
>   3) When backporting a patch to a downstream kernel it often
>  helps a lot to see the whole patch-set the change was
>  submitted in, especially when it comes to fixes. With the
>  Link: tag the whole submission thread is easy to find.
> 
> I can stop adding them to patches if you want, but as I said, I think
> there is some value in them which make me want to keep them.
> 
> Regards,
> 
>   Joerg

Yea, me too ... Linus, will it be less problematic if it's a different
tag, other than Link? What if it's Message-Id: ? Still a
problem?


-- 
MST



Re: [PATCH v5 0/9] Introduce akcipher service for virtio-crypto

2022-05-13 Thread Michael S. Tsirkin
On Fri, May 13, 2022 at 06:19:10AM -0400, Michael S. Tsirkin wrote:
> On Thu, Apr 28, 2022 at 09:59:34PM +0800, zhenwei pi wrote:
> > Hi, Lei & MST
> > 
> > Daniel has started to review the akcipher framework and nettle & gcrypt
> > implementation, this part seems to be ready soon. Thanks a lot to Daniel!
> > 
> > And the last patch "crypto: Introduce RSA algorithm" handles akcipher
> > requests from guest and uses the new akcipher service. The new feature
> > can be used to test by the builtin driver. I would appreciate it if you
> > could review patch.
> 
> 
> I applied the first 6 patches. Tests need to address Daniel's comments.

Oh sorry, spoke too soon - I noticed mingw issues, and in fact Daniel noticed 
them too.
Pls address and repost the series. Thanks!

> > v4 -> v5:
> > - Move QCryptoAkCipher into akcipherpriv.h, and modify the related comments.
> > - Rename asn1_decoder.c to der.c.
> > - Code style fix: use 'cleanup' & 'error' labels.
> > - Allow autoptr type to auto-free.
> > - Add test cases for rsakey to handle DER error.
> > - Other minor fixes.
> > 
> > v3 -> v4:
> > - Coding style fix: Akcipher -> AkCipher, struct XXX -> XXX, Rsa -> RSA,
> > XXX-alg -> XXX-algo.
> > - Change version info in qapi/crypto.json, from 7.0 -> 7.1.
> > - Remove ecdsa from qapi/crypto.json, it would be introduced with the 
> > implementation later.
> > - Use QCryptoHashAlgorithm instead of QCryptoRSAHashAlgorithm(removed) in 
> > qapi/crypto.json.
> > - Rename arguments of qcrypto_akcipher_XXX to keep aligned with 
> > qcrypto_cipher_XXX(dec/enc/sign/verify -> in/out/in2), and add 
> > qcrypto_akcipher_max_XXX APIs.
> > - Add new API: qcrypto_akcipher_supports.
> > - Change the return value of qcrypto_akcipher_enc/dec/sign, these functions 
> > return the actual length of result.
> > - Separate ASN.1 source code and test case clean.
> > - Disable RSA raw encoding for akcipher-nettle.
> > - Separate RSA key parser into rsakey.{hc}, and implement it with 
> > builtin-asn1-decoder and nettle respectively.
> > - Implement RSA(pkcs1 and raw encoding) algorithm by gcrypt. This has 
> > higher priority than nettle.
> > - For some akcipher operations(eg, decryption of pkcs1pad(rsa)), the length 
> > of returned result may be less than the dst buffer size, return the actual 
> > length of result instead of the buffer length to the guest side. (in 
> > function virtio_crypto_akcipher_input_data_helper)
> > - Other minor changes.
> > 
> > Thanks to Daniel!
> > 
> > Eric pointed out this missing part of use case, send it here again.
> > 
> > In our plan, the feature is designed for HTTPS offloading case and other 
> > applications which use kernel RSA/ecdsa by keyctl syscall. The full picture 
> > is shown below:
> > 
> > 
> >   Nginx/openssl[1] ... Apps
> > Guest   -
> >virtio-crypto driver[2]
> > -
> >virtio-crypto backend[3]
> > Host-
> >   /  |  \
> >   builtin[4]   vhost keyctl[5] ...
> > 
> > 
> > [1] User applications can offload RSA calculation to kernel by keyctl 
> > syscall. There is no keyctl engine in openssl currently, we developed an 
> > engine and tried to contribute it to openssl upstream, but openssl 1.x does 
> > not accept new feature. Link:
> > https://github.com/openssl/openssl/pull/16689
> > 
> > This branch is available and maintained by Lei 
> > https://github.com/TousakaRin/openssl/tree/OpenSSL_1_1_1-kctl_engine
> > 
> > We tested nginx(change config file only) with openssl keyctl engine, it 
> > works fine.
> > 
> > [2] virtio-crypto driver is used to communicate with host side, send 
> > requests to host side to do asymmetric calculation.
> > https://lkml.org/lkml/2022/3/1/1425
> > 
> > [3] virtio-crypto backend handles requests from guest side, and forwards 
> > request to crypto backend driver of QEMU.
> > 
> > [4] Currently RSA is supported only in builtin driver. This driver is 
> > supposed to test the full feature without other software(Ex vhost process) 
> > and hardware dependence. ecdsa is introduced into qapi type without 
> > implementation, this may be implemented in Q3-2022 or later. If ecdsa type 
> > definition should be added with the implementation together, I'll remove 
> &

Re: [PATCH v5 0/9] Introduce akcipher service for virtio-crypto

2022-05-13 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 09:59:34PM +0800, zhenwei pi wrote:
> Hi, Lei & MST
> 
> Daniel has started to review the akcipher framework and nettle & gcrypt
> implementation, this part seems to be ready soon. Thanks a lot to Daniel!
> 
> And the last patch "crypto: Introduce RSA algorithm" handles akcipher
> requests from guest and uses the new akcipher service. The new feature
> can be used to test by the builtin driver. I would appreciate it if you
> could review patch.


I applied the first 6 patches. Tests need to address Daniel's comments.

> v4 -> v5:
> - Move QCryptoAkCipher into akcipherpriv.h, and modify the related comments.
> - Rename asn1_decoder.c to der.c.
> - Code style fix: use 'cleanup' & 'error' labels.
> - Allow autoptr type to auto-free.
> - Add test cases for rsakey to handle DER error.
> - Other minor fixes.
> 
> v3 -> v4:
> - Coding style fix: Akcipher -> AkCipher, struct XXX -> XXX, Rsa -> RSA,
> XXX-alg -> XXX-algo.
> - Change version info in qapi/crypto.json, from 7.0 -> 7.1.
> - Remove ecdsa from qapi/crypto.json, it would be introduced with the 
> implementation later.
> - Use QCryptoHashAlgorithm instead of QCryptoRSAHashAlgorithm(removed) in 
> qapi/crypto.json.
> - Rename arguments of qcrypto_akcipher_XXX to keep aligned with 
> qcrypto_cipher_XXX(dec/enc/sign/verify -> in/out/in2), and add 
> qcrypto_akcipher_max_XXX APIs.
> - Add new API: qcrypto_akcipher_supports.
> - Change the return value of qcrypto_akcipher_enc/dec/sign, these functions 
> return the actual length of result.
> - Separate ASN.1 source code and test case clean.
> - Disable RSA raw encoding for akcipher-nettle.
> - Separate RSA key parser into rsakey.{hc}, and implement it with 
> builtin-asn1-decoder and nettle respectively.
> - Implement RSA(pkcs1 and raw encoding) algorithm by gcrypt. This has higher 
> priority than nettle.
> - For some akcipher operations(eg, decryption of pkcs1pad(rsa)), the length 
> of returned result may be less than the dst buffer size, return the actual 
> length of result instead of the buffer length to the guest side. (in function 
> virtio_crypto_akcipher_input_data_helper)
> - Other minor changes.
> 
> Thanks to Daniel!
> 
> Eric pointed out this missing part of use case, send it here again.
> 
> In our plan, the feature is designed for HTTPS offloading case and other 
> applications which use kernel RSA/ecdsa by keyctl syscall. The full picture 
> is shown below:
> 
> 
>   Nginx/openssl[1] ... Apps
> Guest   -
>virtio-crypto driver[2]
> -
>virtio-crypto backend[3]
> Host-
>   /  |  \
>   builtin[4]   vhost keyctl[5] ...
> 
> 
> [1] User applications can offload RSA calculation to kernel by keyctl 
> syscall. There is no keyctl engine in openssl currently, we developed an 
> engine and tried to contribute it to openssl upstream, but openssl 1.x does 
> not accept new feature. Link:
> https://github.com/openssl/openssl/pull/16689
> 
> This branch is available and maintained by Lei 
> https://github.com/TousakaRin/openssl/tree/OpenSSL_1_1_1-kctl_engine
> 
> We tested nginx(change config file only) with openssl keyctl engine, it works 
> fine.
> 
> [2] virtio-crypto driver is used to communicate with host side, send requests 
> to host side to do asymmetric calculation.
> https://lkml.org/lkml/2022/3/1/1425
> 
> [3] virtio-crypto backend handles requests from guest side, and forwards 
> request to crypto backend driver of QEMU.
> 
> [4] Currently RSA is supported only in builtin driver. This driver is 
> supposed to test the full feature without other software(Ex vhost process) 
> and hardware dependence. ecdsa is introduced into qapi type without 
> implementation, this may be implemented in Q3-2022 or later. If ecdsa type 
> definition should be added with the implementation together, I'll remove this 
> in next version.
> 
> [5] keyctl backend is in development, we will post this feature in Q2-2022. 
> keyctl backend can use hardware acceleration(Ex, Intel QAT).
> 
> Setup the full environment, tested with Intel QAT on host side, the QPS of 
> HTTPS increase to ~200% in a guest.
> 
> VS PCI passthrough: the most important benefit of this solution makes the VM 
> migratable.
> 
> v2 -> v3:
> - Introduce akcipher types to qapi
> - Add test/benchmark suite for akcipher class
> - Separate 'virtio_crypto: Support virtio crypto asym operation' into:
>   - crypto: Introduce akcipher crypto class
>   - virtio-crypto: Introduce RSA algorithm
> 
> v1 -> v2:
> - Update virtio_crypto.h from v2 version of related kernel patch.
> 
> v1:
> - Support akcipher for virtio-crypto.
> - Introduce akcipher class.
> - Introduce ASN1 decoder into QEMU.
> - Implement RSA backend by nettle/hogweed.
> 
> Lei He (6):
>   qapi: crypto-akcipher: Introduce akcipher types 

Re: [PATCH v1] vdpa: Warn if MTU configured is too low

2022-05-11 Thread Michael S. Tsirkin
On Wed, May 11, 2022 at 01:56:42PM +0300, Eli Cohen wrote:
> Following the recommendation in virtio spec 1.1, a device offering
> VIRTIO_NET_F_MTU should set the mtu to at least 1280 bytes.
> 
> Print a warning if this recommendation is not met.
> 
> Signed-off-by: Eli Cohen 
> ---
> v0 -> v1:
>   change pr_warn to a netlink warning to userspace
> 
>  drivers/vdpa/vdpa.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 91f4c13c7c7c..0fb4a615f267 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -583,6 +583,9 @@ vdpa_nl_cmd_mgmtdev_get_dumpit(struct sk_buff *msg, 
> struct netlink_callback *cb)
>BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MTU) | \
>BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MAX_VQP))
>  
> +/* Recommended virtio spec 1.1 section 5.1.4.1 */

I'd add name of section here too.

> +#define VIRTIO_MIN_PREFERRED_MTU 1280
> +

Preferred is kind of confusing here. I guess you are trying to
say it's not mandatory, but I don't think this conveys that.

Recommended (matching text below)?


>  static int vdpa_nl_cmd_dev_add_set_doit(struct sk_buff *skb, struct 
> genl_info *info)
>  {
>   struct vdpa_dev_set_config config = {};
> @@ -634,6 +637,10 @@ static int vdpa_nl_cmd_dev_add_set_doit(struct sk_buff 
> *skb, struct genl_info *i
>   err = PTR_ERR(mdev);
>   goto err;
>   }
> + if ((mdev->supported_features & BIT_ULL(VIRTIO_NET_F_MTU)) &&
> + (config.mask & BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MTU) &&
> + config.net.mtu < VIRTIO_MIN_PREFERRED_MTU))
> + NL_SET_ERR_MSG_MOD(info->extack, "MTU is below recommended 
> value\n");
>   if ((config.mask & mdev->config_attr_mask) != config.mask) {
>   NL_SET_ERR_MSG_MOD(info->extack,
>  "All provided attributes are not supported");


Pity we can't include the actual value here, but oh well. At least let's
include the recommended value, we can do that:
"MTU is below recommended value of " 
__stringify(VIRTIO_MIN_PREFERRED_MTU) "\n"
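
I.e. something along these lines (untested sketch combining the check from the
patch with the message above; __stringify() comes from <linux/stringify.h>):

    /* sketch only: report the recommended minimum in the extack message */
    if ((mdev->supported_features & BIT_ULL(VIRTIO_NET_F_MTU)) &&
        (config.mask & BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MTU)) &&
        config.net.mtu < VIRTIO_MIN_PREFERRED_MTU)
            NL_SET_ERR_MSG_MOD(info->extack,
                               "MTU is below recommended value of "
                               __stringify(VIRTIO_MIN_PREFERRED_MTU) "\n");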


> @@ -1135,7 +1142,7 @@ static const struct nla_policy 
> vdpa_nl_policy[VDPA_ATTR_MAX + 1] = {
>   [VDPA_ATTR_DEV_NAME] = { .type = NLA_STRING },
>   [VDPA_ATTR_DEV_NET_CFG_MACADDR] = NLA_POLICY_ETH_ADDR,
>   /* virtio spec 1.1 section 5.1.4.1 for valid MTU range */
> - [VDPA_ATTR_DEV_NET_CFG_MTU] = NLA_POLICY_MIN(NLA_U16, 68),
> + [VDPA_ATTR_DEV_NET_CFG_MTU] = NLA_POLICY_MIN(NLA_U16, ETH_MIN_MTU),
>  };
>  
>  static const struct genl_ops vdpa_nl_ops[] = {
> -- 
> 2.35.1



Re: [PATCH] vdpa: Warn if MTU configured is too low

2022-05-11 Thread Michael S. Tsirkin
On Wed, May 11, 2022 at 05:34:25PM +0800, Jason Wang wrote:
> On Wed, May 11, 2022 at 4:48 PM Eli Cohen  wrote:
> >
> > Following the recommendation in virtio spec 1.1, a device offering
> > VIRTIO_NET_F_MTU should set the mtu to at least 1280 bytes.
> >
> > Print a warning if this recommendation is not met.
> >
> > Signed-off-by: Eli Cohen 
> 
> I wonder why it's a must?


It's a SHOULD in the spec.  I guess 1280 is to allow IPv6.

> > ---
> >  drivers/vdpa/vdpa.c | 9 -
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> > index 91f4c13c7c7c..961168fe9094 100644
> > --- a/drivers/vdpa/vdpa.c
> > +++ b/drivers/vdpa/vdpa.c
> > @@ -583,6 +583,9 @@ vdpa_nl_cmd_mgmtdev_get_dumpit(struct sk_buff *msg, 
> > struct netlink_callback *cb)
> >  BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MTU) | \
> >  BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MAX_VQP))
> >
> > +/* Recommended virtio spec 1.1 section 5.1.4.1 */
> > +#define VIRTIO_MIN_PREFERRED_MTU 1280
> > +
> >  static int vdpa_nl_cmd_dev_add_set_doit(struct sk_buff *skb, struct 
> > genl_info *info)
> >  {
> > struct vdpa_dev_set_config config = {};
> > @@ -634,6 +637,10 @@ static int vdpa_nl_cmd_dev_add_set_doit(struct sk_buff 
> > *skb, struct genl_info *i
> > err = PTR_ERR(mdev);
> > goto err;
> > }
> > +   if ((mdev->supported_features & BIT_ULL(VIRTIO_NET_F_MTU)) &&
> > +   (config.mask & BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MTU) &&
> > +   config.net.mtu < VIRTIO_MIN_PREFERRED_MTU))
> 
> Should be <= ?
> 
> Thanks



> > +   pr_warn("MTU is below recommended value\n");
> > if ((config.mask & mdev->config_attr_mask) != config.mask) {
> > NL_SET_ERR_MSG_MOD(info->extack,
> >"All provided attributes are not 
> > supported");
> > @@ -1135,7 +1142,7 @@ static const struct nla_policy 
> > vdpa_nl_policy[VDPA_ATTR_MAX + 1] = {
> > [VDPA_ATTR_DEV_NAME] = { .type = NLA_STRING },
> > [VDPA_ATTR_DEV_NET_CFG_MACADDR] = NLA_POLICY_ETH_ADDR,
> > /* virtio spec 1.1 section 5.1.4.1 for valid MTU range */
> > -   [VDPA_ATTR_DEV_NET_CFG_MTU] = NLA_POLICY_MIN(NLA_U16, 68),
> > +   [VDPA_ATTR_DEV_NET_CFG_MTU] = NLA_POLICY_MIN(NLA_U16, ETH_MIN_MTU),
> >  };
> >
> >  static const struct genl_ops vdpa_nl_ops[] = {
> > --
> > 2.35.1
> >



Re: [GIT PULL] virtio: last minute fixup

2022-05-11 Thread Michael S. Tsirkin
On Tue, May 10, 2022 at 04:50:47PM -0700, Linus Torvalds wrote:
> On Tue, May 10, 2022 at 4:12 PM Nathan Chancellor  wrote:
> >
> > For what it's worth, as someone who is frequently tracking down and
> > reporting issues, a link to the mailing list post in the commit message
> > makes it much easier to get these reports into the right hands, as the
> > original posting is going to have all relevant parties in one location
> > and it will usually have all the context necessary to triage the
> > problem.
> 
> Honestly, I think such a thing would be trivial to automate with
> something like just a patch-id lookup, rather than a "Link:".
> 
> And such a lookup model ("where was this patch posted") would work for
> any patch (and often also find previous unmodified versions of
> it when it has been posted multiple times).
> 
> I suspect that most of the building blocks of such automation
> effectively already exists, since I think the lore infrastructure
> already integrates with patchwork, and patchwork already has a "look
> up by patch id".
> 
> Wouldn't it be cool if you had some webby interface to just go from
> commit SHA1 to patch ID to a lore.kernel.org lookup of where said
> patch was done?

Yes, that would be cool!

> Of course, I personally tend to just search by the commit contents
> instead, which works just about as well. If the first line of the
> commit isn't very unique, add a "f:author" to the search.
>
> IOW, I really don't find much value in the "Link to original
> submission", because that thing is *already* trivial to find, and the
> lore search is actually better in many ways (it also tends to find
> people *reporting* that commit, which is often what you really want -
> the reason you're doing the search is that there's something going on
> with it).
> 
> My argument here really is that "find where this commit was posted" is
> 
>  (a) not generally the most interesting thing
> 
>  (b) doesn't even need that "Link:" line.
> 
> but what *is* interesting, and where the "Link:" line is very useful,
> is finding where the original problem that *caused* that patch to be
> posted in the first place.
> 
> Yes, obviously you can find that original problem by searching too if
> the commit message has enough other information.
> 
> For example, if there is an oops quoted in the commit message, I have
> personally searched for parts of that kind of information to find the
> original report and discussion.
> 
> So that whole "searching is often an option" is true for pretty much
> _any_ Link:, but I think that for the whole "original submission" it's
> so mindless and can be automated that it really doesn't add much real
> value at all.
> 
> Linus

For me a problematic use-case is multiple versions of the patchset.
So I have a tree and I apply a patchset, start testing etc. Meanwhile the author
posts another version. At that point I want to know which version
I applied. Since people put that within [] in the subject, it
gets stripped off.

Thinking about it some more, how about sticking a link to the *cover
letter* in the commit, instead?  That would serve an extra useful purpose of
being able to figure out which patches are part of the same patchset.
And maybe change "Link:" to "Patchset:" or "Cover-letter:"?

-- 
MST



Re: [GIT PULL] virtio: last minute fixup

2022-05-11 Thread Michael S. Tsirkin
On Tue, May 10, 2022 at 11:23:11AM -0700, Linus Torvalds wrote:
> On Tue, May 10, 2022 at 5:24 AM Michael S. Tsirkin  wrote:
> >
> > A last minute fixup of the transitional ID numbers.
> > Important to get these right - if users start to depend on the
> > wrong ones they are very hard to fix.
> 
> Hmm. I've pulled this, but those numbers aren't exactly "new".
> 
> They've been that way since 5.14, so what makes you think people
> haven't already started depending on them?

Yes they have been in the header but they are not used by *Linux* yet.
My worry is for when we start using them and then someone backports
the patches without backporting the macro fix.
Maybe we should just drop these until there's a user, but I am
a bit wary of a step like this so late in the cycle.

> And - once again - I want to complain about the "Link:" in that commit.
> 
> It points to a completely useless patch submission. It doesn't point
> to anything useful at all.
> 
> I think it's a disease that likely comes from "b4", and people decided
> that "hey, I can use the -l parameter to add that Link: field", and it
> looks better that way.
> 
> And then they add it all the time, whether it makes any sense or not.
> 
> I've mainly noticed it with the -tip tree, but maybe that's just
> because I've happened to look at it.
> 
> I really hate those worthless links that basically add zero actual
> information to the commit.
> 
> The "Link" field is for _useful_ links. Not "let's add a link just
> because we can".
> 
>Linus


OK I will stop doing this.
I thought they are handy for when there are several versions of the
patch. It helps me make sure I applied the latest one. Saving the
message ID of the original mail in some other way would also be ok.
Any suggestions for a better way to do this?

-- 
MST



[GIT PULL] virtio: last minute fixup

2022-05-10 Thread Michael S. Tsirkin
The following changes since commit 1c80cf031e0204fde471558ee40183695773ce13:

  vdpa: mlx5: synchronize driver status with CVQ (2022-03-30 04:18:14 -0400)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to 7ff960a6fe399fdcbca6159063684671ae57eee9:

  virtio: fix virtio transitional ids (2022-05-10 07:22:28 -0400)


virtio: last minute fixup

A last minute fixup of the transitional ID numbers.
Important to get these right - if users start to depend on the
wrong ones they are very hard to fix.

Signed-off-by: Michael S. Tsirkin 


Shunsuke Mie (1):
  virtio: fix virtio transitional ids

 include/uapi/linux/virtio_ids.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)



Re: [PATCH V4 8/9] virtio: harden vring IRQ

2022-05-10 Thread Michael S. Tsirkin
On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
> This is a rework on the previous IRQ hardening that is done for
> virtio-pci where several drawbacks were found and were reverted:
> 
> 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
>that is used by some device such as virtio-blk
> 2) done only for PCI transport
> 
> The vq->broken is re-used in this patch for implementing the IRQ
> hardening. The vq->broken is set to true during both initialization
> and reset. And the vq->broken is set to false in
> virtio_device_ready(). Then vring_interrupt can check and return when
> vq->broken is true. And in this case, switch to return IRQ_NONE to let
> the interrupt core aware of such invalid interrupt to prevent IRQ
> storm.
> 
> The reason of using a per queue variable instead of a per device one
> is that we may need it for per queue reset hardening in the future.
> 
> Note that the hardening is only done for vring interrupt since the
> config interrupt hardening is already done in commit 22b7050a024d7
> ("virtio: defer config changed notifications"). But the method that is
> used by config interrupt can't be reused by the vring interrupt
> handler because it uses spinlock to do the synchronization which is
> expensive.
> 
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Cc: Halil Pasic 
> Cc: Cornelia Huck 
> Signed-off-by: Jason Wang 
> ---
>  drivers/virtio/virtio.c   | 15 ---
>  drivers/virtio/virtio_ring.c  | 11 +++
>  include/linux/virtio_config.h | 12 
>  3 files changed, 31 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 8dde44ea044a..696f5ba4f38e 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -220,6 +220,15 @@ static int virtio_features_ok(struct virtio_device *dev)
>   * */
>  void virtio_reset_device(struct virtio_device *dev)
>  {
> + /*
> +  * The below virtio_synchronize_cbs() guarantees that any
> +  * interrupt for this line arriving after
> +  * virtio_synchronize_vqs() has completed is guaranteed to see
> +  * driver_ready == false.
> +  */
> + virtio_break_device(dev);
> + virtio_synchronize_cbs(dev);
> +
>   dev->config->reset(dev);
>  }
>  EXPORT_SYMBOL_GPL(virtio_reset_device);
> @@ -428,6 +437,9 @@ int register_virtio_device(struct virtio_device *dev)
>   dev->config_enabled = false;
>   dev->config_change_pending = false;
>  
> + INIT_LIST_HEAD(&dev->vqs);
> + spin_lock_init(&dev->vqs_list_lock);
> +
>   /* We always start by resetting the device, in case a previous
>* driver messed it up.  This also tests that code path a little. */
>   virtio_reset_device(dev);
> @@ -435,9 +447,6 @@ int register_virtio_device(struct virtio_device *dev)
>   /* Acknowledge that we've seen the device. */
>   virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
>  
> - INIT_LIST_HEAD(&dev->vqs);
> - spin_lock_init(&dev->vqs_list_lock);
> -
>   /*
>* device_add() causes the bus infrastructure to look for a matching
>* driver.
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 5b7df7c455f0..9dfad2890d7a 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -1690,7 +1690,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
>   vq->we_own_ring = true;
>   vq->notify = notify;
>   vq->weak_barriers = weak_barriers;
> - vq->broken = false;
> + vq->broken = true;
>   vq->last_used_idx = 0;
>   vq->event_triggered = false;
>   vq->num_added = 0;
> @@ -2136,8 +2136,11 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
>   return IRQ_NONE;
>   }
>  
> - if (unlikely(vq->broken))
> - return IRQ_HANDLED;
> + if (unlikely(vq->broken)) {
> + dev_warn_once(&vq->vq.vdev->dev,
> +   "virtio vring IRQ raised before DRIVER_OK");
> + return IRQ_NONE;
> + }
>  
>   /* Just a hint for performance: so it's ok that this can be racy! */
>   if (vq->event)
> @@ -2179,7 +2182,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
> index,
>   vq->we_own_ring = false;
>   vq->notify = notify;
>   vq->weak_barriers = weak_barriers;
> - vq->broken = false;
> + vq->broken = true;
>   vq->last_used_idx = 0;
>   vq->event_triggered = false;
>   vq->num_added = 0;
> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> index d8a2340f928e..23f1694cdbd5 100644
> --- a/include/linux/virtio_config.h
> +++ b/include/linux/virtio_config.h
> @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
>   unsigned status = dev->config->get_status(dev);
>  
>   BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> +
> + /*
> +  * The virtio_synchronize_cbs() makes sure vring_interrupt()
> 

Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()

2022-05-10 Thread Michael S. Tsirkin
On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> This patch tries to implement the synchronize_cbs() for ccw. For the
> vring_interrupt() that is called via virtio_airq_handler(), the
> synchronization is simply done via the airq_info's lock. For the
> vring_interrupt() that is called via virtio_ccw_int_handler(), a
> per-device spinlock for irq is introduced and used in the
> synchronization method.
> 
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Cc: Halil Pasic 
> Cc: Cornelia Huck 
> Signed-off-by: Jason Wang 
> ---
>  drivers/s390/virtio/virtio_ccw.c | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c 
> b/drivers/s390/virtio/virtio_ccw.c
> index d35e7a3f7067..001e1f0e6037 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -62,6 +62,7 @@ struct virtio_ccw_device {
>   unsigned int revision; /* Transport revision */
>   wait_queue_head_t wait_q;
>   spinlock_t lock;
> + rwlock_t irq_lock;
>   struct mutex io_lock; /* Serializes I/O requests */
>   struct list_head virtqueues;
>   bool is_thinint;
> @@ -984,6 +985,27 @@ static const char *virtio_ccw_bus_name(struct 
> virtio_device *vdev)
>   return dev_name(&vcdev->cdev->dev);
>  }
>  
> +static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
> +{
> + struct virtio_ccw_device *vcdev = to_vc_device(vdev);
> + struct airq_info *info = vcdev->airq_info;
> +
> + if (info) {
> + /*
> +  * Synchronize with the vring_interrupt() with airq indicator
> +  */
> + write_lock(&info->lock);
> + write_unlock(&info->lock);
> + } else {
> + /*
> +  * Synchronize with the vring_interrupt() called by
> +  * virtio_ccw_int_handler().
> +  */
> + write_lock(&vcdev->irq_lock);
> + write_unlock(&vcdev->irq_lock);
> + }
> +}
> +
>  static const struct virtio_config_ops virtio_ccw_config_ops = {
>   .get_features = virtio_ccw_get_features,
>   .finalize_features = virtio_ccw_finalize_features,
> @@ -995,6 +1017,7 @@ static const struct virtio_config_ops 
> virtio_ccw_config_ops = {
>   .find_vqs = virtio_ccw_find_vqs,
>   .del_vqs = virtio_ccw_del_vqs,
>   .bus_name = virtio_ccw_bus_name,
> + .synchronize_cbs = virtio_ccw_synchronize_cbs,
>  };
>  
>  
> @@ -1079,6 +1102,7 @@ static void virtio_ccw_int_handler(struct ccw_device 
> *cdev,
>  {
>   __u32 activity = intparm & VIRTIO_CCW_INTPARM_MASK;
>   struct virtio_ccw_device *vcdev = dev_get_drvdata(&cdev->dev);
> + unsigned long flags;
>   int i;
>   struct virtqueue *vq;
>  
> @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device 
> *cdev,
>   vcdev->err = -EIO;
>   }
>   virtio_ccw_check_activity(vcdev, activity);
> + read_lock_irqsave(&vcdev->irq_lock, flags);
>   for_each_set_bit(i, indicators(vcdev),
>sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
>   /* The bit clear must happen before the vring kick. */

Cornelia sent a lockdep trace on this.

Basically I think this gets the irqsave/restore logic wrong.
It attempts to disable irqs in the handler (which already runs in
interrupt context), and it does not disable irqs in .synchronize_cbs.

As a result an interrupt might try to take the read lock while
.synchronize_cbs holds the write lock, resulting in a deadlock.

I think you want a plain read_lock in the handler and write_lock_irq
in .synchronize_cbs.
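
Roughly, as an untested sketch against this patch (reusing the
vcdev->irq_lock it introduces; the airq branch would stay as it is):

static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
{
	struct virtio_ccw_device *vcdev = to_vc_device(vdev);

	/*
	 * Disable irqs while holding the writer lock, so the interrupt
	 * handler can never spin on the read lock on the same CPU that
	 * holds the write lock.
	 */
	write_lock_irq(&vcdev->irq_lock);
	write_unlock_irq(&vcdev->irq_lock);
}

and, in virtio_ccw_int_handler(), which already runs with irqs disabled:

	read_lock(&vcdev->irq_lock);
	for_each_set_bit(i, indicators(vcdev),
			 sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
		/* ... */
		vring_interrupt(0, vq);
	}
	read_unlock(&vcdev->irq_lock);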


> @@ -1114,6 +1139,7 @@ static void virtio_ccw_int_handler(struct ccw_device 
> *cdev,
>   vq = virtio_ccw_vq_by_ind(vcdev, i);
>   vring_interrupt(0, vq);
>   }
> + read_unlock_irqrestore(&vcdev->irq_lock, flags);
>   if (test_bit(0, indicators2(vcdev))) {
>   virtio_config_changed(&vcdev->vdev);
>   clear_bit(0, indicators2(vcdev));
> @@ -1284,6 +1310,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>   init_waitqueue_head(&vcdev->wait_q);
>   INIT_LIST_HEAD(&vcdev->virtqueues);
>   spin_lock_init(&vcdev->lock);
> + rwlock_init(&vcdev->irq_lock);
>   mutex_init(&vcdev->io_lock);
>  
>   spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
> -- 
> 2.25.1



Re: [RFC PATCH] virtio: fix virtio transitional ids

2022-05-10 Thread Michael S. Tsirkin
On Tue, May 10, 2022 at 07:27:23PM +0900, Shunsuke Mie wrote:
> This commit fixes the transitional PCI device ID.
> 
> Fixes: d61914ea6ada ("virtio: update virtio id table, add transitional ids")
> Signed-off-by: Shunsuke Mie 

Absolutely! I don't understand how I could have missed this.

Applied, thanks!

> ---
>  include/uapi/linux/virtio_ids.h | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
> index 80d76b75bccd..7aa2eb766205 100644
> --- a/include/uapi/linux/virtio_ids.h
> +++ b/include/uapi/linux/virtio_ids.h
> @@ -73,12 +73,12 @@
>   * Virtio Transitional IDs
>   */
>  
> -#define VIRTIO_TRANS_ID_NET      1000 /* transitional virtio net */
> -#define VIRTIO_TRANS_ID_BLOCK    1001 /* transitional virtio block */
> -#define VIRTIO_TRANS_ID_BALLOON  1002 /* transitional virtio balloon */
> -#define VIRTIO_TRANS_ID_CONSOLE  1003 /* transitional virtio console */
> -#define VIRTIO_TRANS_ID_SCSI     1004 /* transitional virtio SCSI */
> -#define VIRTIO_TRANS_ID_RNG      1005 /* transitional virtio rng */
> -#define VIRTIO_TRANS_ID_9P       1009 /* transitional virtio 9p console */
> +#define VIRTIO_TRANS_ID_NET      0x1000 /* transitional virtio net */
> +#define VIRTIO_TRANS_ID_BLOCK    0x1001 /* transitional virtio block */
> +#define VIRTIO_TRANS_ID_BALLOON  0x1002 /* transitional virtio balloon */
> +#define VIRTIO_TRANS_ID_CONSOLE  0x1003 /* transitional virtio console */
> +#define VIRTIO_TRANS_ID_SCSI     0x1004 /* transitional virtio SCSI */
> +#define VIRTIO_TRANS_ID_RNG      0x1005 /* transitional virtio rng */
> +#define VIRTIO_TRANS_ID_9P       0x1009 /* transitional virtio 9p console */
>  
>  #endif /* _LINUX_VIRTIO_IDS_H */
> -- 
> 2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio_net: Remove unused case in virtio_skb_set_hash()

2022-05-09 Thread Michael S. Tsirkin
On Mon, May 09, 2022 at 09:14:32PM +0800, Tang Bin wrote:
> In this function, "VIRTIO_NET_HASH_REPORT_NONE" is included
> in "default", so it can be removed.
> 
> Signed-off-by: Tang Bin 

What's the point of this?

> ---
>  drivers/net/virtio_net.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 87838cbe3..b3e5d8637 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1172,7 +1172,6 @@ static void virtio_skb_set_hash(const struct 
> virtio_net_hdr_v1_hash *hdr_hash,
>   case VIRTIO_NET_HASH_REPORT_IPv6_EX:
>   rss_hash_type = PKT_HASH_TYPE_L3;
>   break;
> - case VIRTIO_NET_HASH_REPORT_NONE:
>   default:
>   rss_hash_type = PKT_HASH_TYPE_NONE;
>   }
> -- 
> 2.20.1.windows.1
> 
> 



Re: [PATCH v2 00/19] Control VQ support in vDPA

2022-05-09 Thread Michael S. Tsirkin
On Mon, May 09, 2022 at 11:42:10AM +0800, Jason Wang wrote:
> On Thu, Mar 31, 2022 at 2:05 AM Gautam Dawar  wrote:
> >
> > Hi All:
> >
> > This series tries to add the support for control virtqueue in vDPA.
> >
> > Control virtqueue is used by networking device for accepting various
> > commands from the driver. It's a must to support multiqueue and other
> > configurations.
> >
> > When used by vhost-vDPA bus driver for VM, the control virtqueue
> > should be shadowed via userspace VMM (Qemu) instead of being assigned
> > directly to Guest. This is because Qemu needs to know the device state
> > in order to start and stop device correctly (e.g for Live Migration).
> >
> > This requires isolating the memory mapping for the control virtqueue
> > presented by vhost-vDPA, to prevent the guest from accessing it directly.
> >
> > To achieve this, vDPA introduce two new abstractions:
> >
> > - address space: identified through address space id (ASID) and a set
> >  of memory mappings is maintained
> > - virtqueue group: the minimal set of virtqueues that must share an
> >  address space
> >
> > Device needs to advertise the following attributes to vDPA:
> >
> > - the number of address spaces supported in the device
> > - the number of virtqueue groups supported in the device
> > - the mappings from a specific virtqueue to its virtqueue groups
> >
> > The mappings from virtqueue to virtqueue groups is fixed and defined
> > by vDPA device driver. E.g:
> >
> > - For the device that has hardware ASID support, it can simply
> >   advertise a per virtqueue group.
> > - For the device that does not have hardware ASID support, it can
> >   simply advertise a single virtqueue group that contains all
> >   virtqueues. Or if it wants a software emulated control virtqueue, it
> >   can advertise two virtqueue groups, one is for cvq, another is for
> >   the rest virtqueues.
> >
> > vDPA also allow to change the association between virtqueue group and
> > address space. So in the case of control virtqueue, userspace
> > VMM(Qemu) may use a dedicated address space for the control virtqueue
> > group to isolate the memory mapping.
> >
> > The vhost/vhost-vDPA is also extend for the userspace to:
> >
> > - query the number of virtqueue groups and address spaces supported by
> >   the device
> > - query the virtqueue group for a specific virtqueue
> > - associate a virtqueue group with an address space
> > - send ASID based IOTLB commands
> >
> > This will help userspace VMM(Qemu) to detect whether the control vq
> > could be supported and isolate memory mappings of control virtqueue
> > from the others.
> >
> > To demonstrate the usage, vDPA simulator is extended to support
> > setting MAC address via a emulated control virtqueue.
> >
> > Please review.
> 
> Michael, this looks good to me, do you have comments on this?
> 
> Thanks


I'll merge this for next.

> >
> > Changes since RFC v2:
> >
> > - Fixed memory leak for asid 0 in vhost_vdpa_remove_as()
> > - Removed unnecessary NULL check for iotlb in vhost_vdpa_unmap() and
> >   changed its return type to void.
> > - Removed insignificant used_as member field from struct vhost_vdpa.
> > - Corrected the iommu parameter in call to vringh_set_iotlb() from
> >   vdpasim_set_group_asid()
> > - Fixed build errors with vdpa_sim_net
> > - Updated alibaba, vdpa_user and virtio_pci vdpa parent drivers to
> >   call updated vDPA APIs and ensured successful build
> > - Tested control (MAC address configuration) and data-path using
> >   single virtqueue pair on Xilinx (now AMD) SN1022 SmartNIC device
> >   and vdpa_sim_net software device using QEMU release at [1]
> > - Removed two extra blank lines after set_group_asid() in
> >   include/linux/vdpa.h
> >
> > Changes since v1:
> >
> > - Rebased the v1 patch series on vhost branch of MST vhost git repo
> >   git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/log/?h=vhost
> > - Updates to accommodate vdpa_sim changes from monolithic module in
> >   kernel used v1 patch series to current modularized class (net, block)
> >   based approach.
> > - Added new attributes (ngroups and nas) to "vdpasim_dev_attr" and
> >   propagated them from vdpa_sim_net to vdpa_sim
> > - Widened the data-type for "asid" member of vhost_msg_v2 to __u32
> >   to accommodate PASID
> > - Fixed the buildbot warnings
> > - Resolved all checkpatch.pl errors and warnings
> > - Tested both control and datapath with Xilinx Smartnic SN1000 series
> >   device using QEMU implementing the Shadow virtqueue and support for
> >   VQ groups and ASID available at [1]
> >
> > Changes since RFC:
> >
> > - tweak vhost uAPI documentation
> > - switch to use device specific IOTLB really in patch 4
> > - tweak the commit log
> > - fix that ASID in vhost is claimed to be 32 actually but 16bit
> >   actually
> > - fix use after free when using ASID with IOTLB batching requests
> > - switch to use Stefano's patch for having separated iov
> > - remove unused 

Re: [PATCH net-next 5/6] net: virtio: switch to netif_napi_add_weight()

2022-05-07 Thread Michael S. Tsirkin
On Fri, May 06, 2022 at 10:07:50AM -0700, Jakub Kicinski wrote:
> virtio netdev driver uses a custom napi weight, switch to the new
> API for setting custom weight.
> 
> Signed-off-by: Jakub Kicinski 

Acked-by: Michael S. Tsirkin 

> ---
> CC: m...@redhat.com
> CC: jasow...@redhat.com
> CC: virtualization@lists.linux-foundation.org
> ---
>  drivers/net/virtio_net.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ebb98b796352..db05b5e930be 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3313,8 +3313,8 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
>   INIT_DELAYED_WORK(&vi->refill, refill_work);
>   for (i = 0; i < vi->max_queue_pairs; i++) {
>   vi->rq[i].pages = NULL;
> - netif_napi_add(vi->dev, &vi->rq[i].napi, virtnet_poll,
> -napi_weight);
> + netif_napi_add_weight(vi->dev, &vi->rq[i].napi, virtnet_poll,
> +   napi_weight);
>   netif_napi_add_tx_weight(vi->dev, &vi->sq[i].napi,
>virtnet_poll_tx,
>napi_tx ? napi_weight : 0);
> -- 
> 2.34.1



Re: RE: [PATCH v5 5/5] virtio-crypto: enable retry for virtio-crypto-dev

2022-05-06 Thread Michael S. Tsirkin
On Fri, May 06, 2022 at 05:55:33PM +0800, zhenwei pi wrote:
> On 5/6/22 17:34, Gonglei (Arei) wrote:
> > 
> > 
> > > -Original Message-
> > > From: zhenwei pi [mailto:pizhen...@bytedance.com]
> > > Sent: Thursday, May 5, 2022 5:24 PM
> > > To: Gonglei (Arei) ; m...@redhat.com
> > > Cc: jasow...@redhat.com; herb...@gondor.apana.org.au;
> > > linux-ker...@vger.kernel.org; virtualization@lists.linux-foundation.org;
> > > linux-cry...@vger.kernel.org; helei.si...@bytedance.com;
> > > pizhen...@bytedance.com; da...@davemloft.net
> > > Subject: [PATCH v5 5/5] virtio-crypto: enable retry for virtio-crypto-dev
> > > 
> > > From: lei he 
> > > 
> > > Enable retry for virtio-crypto-dev, so that crypto-engine can process
> > > cipher-requests in parallel.
> > > 
> > > Cc: Michael S. Tsirkin 
> > > Cc: Jason Wang 
> > > Cc: Gonglei 
> > > Signed-off-by: lei he 
> > > Signed-off-by: zhenwei pi 
> > > ---
> > >   drivers/crypto/virtio/virtio_crypto_core.c | 3 ++-
> > >   1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/crypto/virtio/virtio_crypto_core.c
> > > b/drivers/crypto/virtio/virtio_crypto_core.c
> > > index 60490ffa3df1..f67e0d4c1b0c 100644
> > > --- a/drivers/crypto/virtio/virtio_crypto_core.c
> > > +++ b/drivers/crypto/virtio/virtio_crypto_core.c
> > > @@ -144,7 +144,8 @@ static int virtcrypto_find_vqs(struct virtio_crypto 
> > > *vi)
> > >   spin_lock_init(&vi->data_vq[i].lock);
> > >   vi->data_vq[i].vq = vqs[i];
> > >   /* Initialize crypto engine */
> > > - vi->data_vq[i].engine = crypto_engine_alloc_init(dev, 1);
> > > + vi->data_vq[i].engine = crypto_engine_alloc_init_and_set(dev, true,
> > > + NULL, 1,
> > > + virtqueue_get_vring_size(vqs[i]));
> > 
> > Here the '1' can be 'true' too.
> > 
> > Sure, you can add
> > 
> > Reviewed-by: Gonglei 
> > 
> > Regards,
> > -Gonglei
> > 
> > >   if (!vi->data_vq[i].engine) {
> > >   ret = -ENOMEM;
> > >   goto err_engine;
> > > --
> > > 2.20.1
> > 
> 
> Thanks to Lei!
> 
> Hi, Michael
> I would appreciate it if you could apply this minor change, or should I
> send the v6 series? Which one do you prefer?
> 
> -- 


send v6 with acks and change pls



Re: PING: [PATCH v4 0/5] virtio-crypto: Improve performance

2022-05-04 Thread Michael S. Tsirkin
On Thu, May 05, 2022 at 03:14:40AM +, Gonglei (Arei) wrote:
> 
> 
> > -Original Message-
> > From: zhenwei pi [mailto:pizhen...@bytedance.com]
> > Sent: Thursday, May 5, 2022 10:35 AM
> > To: Gonglei (Arei) ; m...@redhat.com;
> > jasow...@redhat.com
> > Cc: herb...@gondor.apana.org.au; linux-ker...@vger.kernel.org;
> > virtualization@lists.linux-foundation.org; linux-cry...@vger.kernel.org;
> > helei.si...@bytedance.com; da...@davemloft.net
> > Subject: PING: [PATCH v4 0/5] virtio-crypto: Improve performance
> > 
> > Hi, Lei
> > 
> > Jason replied in another patch:
> > Still hundreds of lines of changes, I'd leave this change to other 
> > maintainers to
> > decide.
> > 
> > Quite frankly, the virtio crypto driver has changed very little in the
> > past, and the performance of the control queue is not good enough. I
> > suspect this driver is not widely used, so I'd like to rework a lot of
> > it; it would be best to complete this work in the 5.18 window.
> > 
> > This is a different point from Jason's. I would appreciate it if you
> > could give me any hint.
> > 
> 
> This is already in my todo list.
> 
> Regards,
> -Gonglei

It's been out a month though, which is not really acceptable review latency.
So I would apply this for next, but you need to address Dan Carpenter's
comment and look for similar patterns elsewhere in your patch.


> > On 4/24/22 18:41, zhenwei pi wrote:
> > > Hi, Lei
> > > I'd like to move helper and callback functions(Eg, 
> > > virtcrypto_clear_request
> > >   and virtcrypto_ctrlq_callback) from xx_core.c to xx_common.c, then
> > > the xx_core.c supports:
> > >- probe/remove/irq affinity seting for a virtio device
> > >- basic virtio related operations
> > >
> > > xx_common.c supports:
> > >- common helpers/functions for algos
> > >
> > > Do you have any suggestion about this?
> > >
> > > v3 -> v4:
> > >   - Don't create new file virtio_common.c, the new functions are added
> > > into virtio_crypto_core.c
> > >   - Split the first patch into two parts:
> > >   1, change code style,
> > >   2, use private buffer instead of shared buffer
> > >   - Remove relevant change.
> > >   - Other minor changes.
> > >
> > > v2 -> v3:
> > >   - Jason suggested that spliting the first patch into two part:
> > >   1, using private buffer
> > >   2, remove the busy polling
> > > Rework as Jason's suggestion, this makes the smaller change in
> > > each one and clear.
> > >
> > > v1 -> v2:
> > >   - Use kfree instead of kfree_sensitive for insensitive buffer.
> > >   - Several coding style fix.
> > >   - Use memory from current node, instead of memory close to device
> > >   - Add more message in commit, also explain why removing per-device
> > > request buffer.
> > >   - Add necessary comment in code to explain why using kzalloc to
> > > allocate struct virtio_crypto_ctrl_request.
> > >
> > > v1:
> > > The main point of this series is to improve the performance for virtio
> > > crypto:
> > > - Use wait mechanism instead of busy polling for ctrl queue, this
> > >reduces CPU and lock racing, it's possiable to create/destroy session
> > >parallelly, QPS increases from ~40K/s to ~200K/s.
> > > - Enable retry on crypto engine to improve performance for data queue,
> > >this allows the larger depth instead of 1.
> > > - Fix dst data length in akcipher service.
> > > - Other style fix.
> > >
> > > lei he (2):
> > >virtio-crypto: adjust dst_len at ops callback
> > >virtio-crypto: enable retry for virtio-crypto-dev
> > >
> > > zhenwei pi (3):
> > >virtio-crypto: change code style
> > >virtio-crypto: use private buffer for control request
> > >virtio-crypto: wait ctrl queue instead of busy polling
> > >
> > >   .../virtio/virtio_crypto_akcipher_algs.c  |  83 ++-
> > >   drivers/crypto/virtio/virtio_crypto_common.h  |  21 ++-
> > >   drivers/crypto/virtio/virtio_crypto_core.c|  55 ++-
> > >   .../virtio/virtio_crypto_skcipher_algs.c  | 140 --
> > >   4 files changed, 180 insertions(+), 119 deletions(-)
> > >
> > 
> > --
> > zhenwei pi



Re: [PATCH net-next 0/2] vsock/virtio: add support for device suspend/resume

2022-05-02 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 03:22:39PM +0200, Stefano Garzarella wrote:
> Vilas reported that virtio-vsock no longer worked properly after
> suspend/resume (echo mem >/sys/power/state).
> It was impossible to connect to the host and vice versa.
> 
> Indeed, the support has never been implemented.
> 
> This series implement .freeze and .restore callbacks of struct virtio_driver
> to support device suspend/resume.
> 
> The first patch factors our the code to initialize and delete VQs.
> The second patch uses that code to support device suspend/resume.
> 
> Signed-off-by: Stefano Garzarella 


Acked-by: Michael S. Tsirkin 

> Stefano Garzarella (2):
>   vsock/virtio: factor our the code to initialize and delete VQs
>   vsock/virtio: add support for device suspend/resume
> 
>  net/vmw_vsock/virtio_transport.c | 197 ---
>  1 file changed, 131 insertions(+), 66 deletions(-)
> 
> -- 
> 2.35.1



Re: [PATCH 3/3] virtio-pci: Use cpumask_available to fix compilation error

2022-04-28 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 11:48:01AM +0200, Christophe Marie Francois Dupont de 
Dinechin wrote:
> 
> 
> > On 15 Apr 2022, at 10:48, Michael S. Tsirkin  wrote:
> > 
> > On Thu, Apr 14, 2022 at 05:08:55PM +0200, Christophe de Dinechin wrote:
> >> With GCC 12 and defconfig, we get the following error:
> >> 
> >> |   CC  drivers/virtio/virtio_pci_common.o
> >> | drivers/virtio/virtio_pci_common.c: In function ‘vp_del_vqs’:
> >> | drivers/virtio/virtio_pci_common.c:257:29: error: the comparison will
> >> |  always evaluate as ‘true’ for the pointer operand in
> >> |  ‘vp_dev->msix_affinity_masks + (sizetype)((long unsigned int)i * 8)’
> >> |  must not be NULL [-Werror=address]
> >> |   257 | if (vp_dev->msix_affinity_masks[i])
> >> |   | ^~
> >> 
> >> This happens in the case where CONFIG_CPUMASK_OFFSTACK is not defined,
> >> since we typedef cpumask_var_t as an array. The compiler is essentially
> >> complaining that an array pointer cannot be NULL. This is not a very
> >> important warning, but there is a function called cpumask_available that
> >> seems to be defined just for that case, so the fix is easy.
> >> 
> >> Signed-off-by: Christophe de Dinechin 
> >> Signed-off-by: Christophe de Dinechin 
> > 
> > There was an alternate patch proposed for this by
> > Murilo Opsfelder Araujo. What do you think about that approach?
> 
> I responded on the other thread, but let me share the response here:
> 
> [to muri...@linux.ibm.com]
> Apologies for the delay in responding, broken laptop…
> 
> In the case where CONFIG_CPUMASK_OFFSTACK is not defined, we have:
> 
>   typedef struct cpumask cpumask_var_t[1];
> 
> So that vp_dev->msix_affinity_masks[i] is statically not null (that’s the 
> warning)
> but also a static pointer, so not kfree-safe IMO.


Not sure I understand what you are saying here.

> > 
> > 
> >> ---
> >> drivers/virtio/virtio_pci_common.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/drivers/virtio/virtio_pci_common.c 
> >> b/drivers/virtio/virtio_pci_common.c
> >> index d724f676608b..5c44a2f13c93 100644
> >> --- a/drivers/virtio/virtio_pci_common.c
> >> +++ b/drivers/virtio/virtio_pci_common.c
> >> @@ -254,7 +254,7 @@ void vp_del_vqs(struct virtio_device *vdev)
> >> 
> >>if (vp_dev->msix_affinity_masks) {
> >>for (i = 0; i < vp_dev->msix_vectors; i++)
> >> -  if (vp_dev->msix_affinity_masks[i])
> >> +  if (cpumask_available(vp_dev->msix_affinity_masks[i]))
> >>
> >> free_cpumask_var(vp_dev->msix_affinity_masks[i]);
> >>}
> >> 
> >> -- 
> >> 2.35.1
> > 


Re: [PATCH] virtio-pci: Remove wrong address verification in vp_del_vqs()

2022-04-28 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 11:55:31AM +0200, Christophe Marie Francois Dupont de 
Dinechin wrote:
> 
> 
> > On 28 Apr 2022, at 11:51, Christophe Marie Francois Dupont de Dinechin 
> >  wrote:
> > 
> > 
> > 
> >> On 28 Apr 2022, at 11:46, Christophe Marie Francois Dupont de Dinechin 
> >>  wrote:
> >> 
> >> 
> >> 
> >>> On 15 Apr 2022, at 05:51, Murilo Opsfelder Araújo  
> >>> wrote:
> >>> 
> >>> On 4/14/22 23:30, Murilo Opsfelder Araujo wrote:
>  GCC 12 enhanced -Waddress when comparing array address to null [0],
>  which warns:
>  drivers/virtio/virtio_pci_common.c: In function ‘vp_del_vqs’:
>  drivers/virtio/virtio_pci_common.c:257:29: warning: the comparison will 
>  always evaluate as ‘true’ for the pointer operand in 
>  ‘vp_dev->msix_affinity_masks + (sizetype)((long unsigned int)i * 256)’ 
>  must not be NULL [-Waddress]
>  257 | if (vp_dev->msix_affinity_masks[i])
>  | ^~
>  In fact, the verification is comparing the result of a pointer
>  arithmetic, the address "msix_affinity_masks + i", which will always
>  evaluate to true.
>  Under the hood, free_cpumask_var() calls kfree(), which is safe to pass
>  NULL, not requiring non-null verification. So remove the verification
>  to make compiler happy (happy compiler, happy life).
>  [0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102103
>  Signed-off-by: Murilo Opsfelder Araujo 
>  ---
>  drivers/virtio/virtio_pci_common.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>  diff --git a/drivers/virtio/virtio_pci_common.c 
>  b/drivers/virtio/virtio_pci_common.c
>  index d724f676608b..5046efcffb4c 100644
>  --- a/drivers/virtio/virtio_pci_common.c
>  +++ b/drivers/virtio/virtio_pci_common.c
>  @@ -254,8 +254,7 @@ void vp_del_vqs(struct virtio_device *vdev)
>   if (vp_dev->msix_affinity_masks) {
>   for (i = 0; i < vp_dev->msix_vectors; i++)
>  -if (vp_dev->msix_affinity_masks[i])
>  -
>  free_cpumask_var(vp_dev->msix_affinity_masks[i]);
>  +
>  free_cpumask_var(vp_dev->msix_affinity_masks[i]);
>   }
>   if (vp_dev->msix_enabled) {
> >>> 
> >>> After I sent this message, I realized that Christophe (copied here)
> >>> had already proposed a fix:
> >>> 
> >>> https://lore.kernel.org/lkml/20220414150855.2407137-4-dinec...@redhat.com/
> >>> 
> >>> Christophe,
> >>> 
> >>> Since free_cpumask_var() calls kfree() and kfree() is null-safe,
> >>> can we just drop this null verification and call free_cpumask_var() right 
> >>> away?
> >> 
> >> Apologies for the delay in responding, broken laptop…
> >> 
> >> In the case where CONFIG_CPUMASK_OFFSTACK is not defined, we have:
> >> 
> >>typedef struct cpumask cpumask_var_t[1];
> >> 
> >> So that vp_dev->msix_affinity_masks[i] is statically not null (that’s the 
> >> warning)
> >> but also a static pointer, so not kfree-safe IMO.
> > 
> > … which also renders my own patch invalid :-/
> > 
> > Compiler warnings are good. Clearly not sufficient.
> 
> Ah, I just noticed that free_cpumask_var is a noop in that case.
> 
> So yes, your fix is better :-)

ACK then?
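
For reference, the check buys nothing in either configuration; roughly
paraphrasing include/linux/cpumask.h (worth double-checking against the
actual tree):

#ifdef CONFIG_CPUMASK_OFFSTACK
typedef struct cpumask *cpumask_var_t;
void free_cpumask_var(cpumask_var_t mask);	/* kfree(), so NULL-safe */
#else
typedef struct cpumask cpumask_var_t[1];
static inline void free_cpumask_var(cpumask_var_t mask)
{
	/* no-op: the mask is embedded in the containing allocation */
}
#endif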


Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-28 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 02:02:16PM +0800, Jason Wang wrote:
> On Thu, Apr 28, 2022 at 1:55 PM Michael S. Tsirkin  wrote:
> >
> > On Thu, Apr 28, 2022 at 01:51:59PM +0800, Jason Wang wrote:
> > > On Thu, Apr 28, 2022 at 1:24 PM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Thu, Apr 28, 2022 at 11:04:41AM +0800, Jason Wang wrote:
> > > > > > But my guess is that rwlock + some testing for the legacy indicator 
> > > > > > case
> > > > > > just to double check if there is a heavy regression despite of our
> > > > > > expectations to see none should do the trick.
> > > > >
> > > > > I suggest this, rwlock (for not airq) seems better than spinlock, but
> > > > > at worst case it will cause cache line bouncing. But I wonder if it's
> > > > > noticeable (anyhow it has been used for airq).
> > > > >
> > > > > Thanks
> > > >
> > > > Which existing rwlock does airq use right now? Can we take it to sync?
> > >
> > > It's the rwlock in airq_info, it has already been used in this patch.
> > >
> write_lock(&info->lock);
> write_unlock(&info->lock);
> > >
> > > But the problem is, it looks to me there could be a case that airq is
> > > not used, (virtio_ccw_int_hander()). That's why the patch use a
> > > spinlock, it could be optimized with using a rwlock as well.
> > >
> > > Thanks
> >
> > Ah, right. So let's take that on the legacy path too and Halil promises
> > to test to make sure performance isn't impacted too badly?
> 
> I think what you meant is using a dedicated rwlock instead of trying
> to reuse one of the airq_info locks.
> 
> If this is true, it should be fine.
> 
> Thanks

yes

> >
> > > >
> > > > --
> > > > MST
> > > >
> >



Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 01:51:59PM +0800, Jason Wang wrote:
> On Thu, Apr 28, 2022 at 1:24 PM Michael S. Tsirkin  wrote:
> >
> > On Thu, Apr 28, 2022 at 11:04:41AM +0800, Jason Wang wrote:
> > > > But my guess is that rwlock + some testing for the legacy indicator case
> > > > just to double check if there is a heavy regression despite of our
> > > > expectations to see none should do the trick.
> > >
> > > I suggest this, rwlock (for not airq) seems better than spinlock, but
> > > at worst case it will cause cache line bouncing. But I wonder if it's
> > > noticeable (anyhow it has been used for airq).
> > >
> > > Thanks
> >
> > Which existing rwlock does airq use right now? Can we take it to sync?
> 
> It's the rwlock in airq_info, it has already been used in this patch.
> 
> write_lock(&info->lock);
> write_unlock(&info->lock);
> 
> But the problem is, it looks to me there could be a case that airq is
> not used, (virtio_ccw_int_hander()). That's why the patch use a
> spinlock, it could be optimized with using a rwlock as well.
> 
> Thanks

Ah, right. So let's take that on the legacy path too and Halil promises
to test to make sure performance isn't impacted too badly?

> >
> > --
> > MST
> >



Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 11:04:41AM +0800, Jason Wang wrote:
> > But my guess is that rwlock + some testing for the legacy indicator case
> > just to double check if there is a heavy regression despite of our
> > expectations to see none should do the trick.
> 
> I suggest this, rwlock (for not airq) seems better than spinlock, but
> at worst case it will cause cache line bouncing. But I wonder if it's
> noticeable (anyhow it has been used for airq).
> 
> Thanks

Which existing rwlock does airq use right now? Can we take it to sync?

-- 
MST



Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Thu, Apr 28, 2022 at 04:43:15AM +0200, Halil Pasic wrote:
> On Wed, 27 Apr 2022 11:27:03 +0200
> Cornelia Huck  wrote:
> 
> > On Tue, Apr 26 2022, "Michael S. Tsirkin"  wrote:
> > 
> > > On Tue, Apr 26, 2022 at 05:47:17PM +0200, Cornelia Huck wrote:  
> > >> On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> > >>   
> > >> > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:  
> > >> >> On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:  
> > >> >> > 
> > >> >> > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:  
> > >> >> > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin 
> > >> >> > > wrote:  
> > >> >> > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:  
> > >> >> > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > >> >> > > > > "Michael S. Tsirkin"  wrote:
> > >> >> > > > >   
> > >> >> > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> > >> >> > > > > > wrote:  
> > >> >> > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" 
> > >> >> > > > > > >  wrote:  
> > >> >> > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang 
> > >> >> > > > > > > > wrote:  
> > >> >> > > > > > > > > This patch tries to implement the synchronize_cbs() 
> > >> >> > > > > > > > > for ccw. For the
> > >> >> > > > > > > > > vring_interrupt() that is called via 
> > >> >> > > > > > > > > virtio_airq_handler(), the
> > >> >> > > > > > > > > synchronization is simply done via the airq_info's 
> > >> >> > > > > > > > > lock. For the
> > >> >> > > > > > > > > vring_interrupt() that is called via 
> > >> >> > > > > > > > > virtio_ccw_int_handler(), a per
> > >> >> > > > > > > > > device spinlock for irq is introduced ans used in the 
> > >> >> > > > > > > > > synchronization
> > >> >> > > > > > > > > method.
> > >> >> > > > > > > > > 
> > >> >> > > > > > > > > Cc: Thomas Gleixner 
> > >> >> > > > > > > > > Cc: Peter Zijlstra 
> > >> >> > > > > > > > > Cc: "Paul E. McKenney" 
> > >> >> > > > > > > > > Cc: Marc Zyngier 
> > >> >> > > > > > > > > Cc: Halil Pasic 
> > >> >> > > > > > > > > Cc: Cornelia Huck 
> > >> >> > > > > > > > > Signed-off-by: Jason Wang   
> > >> >> > > > > > > > 
> > >> >> > > > > > > > This is the only one that is giving me pause. Halil, 
> > >> >> > > > > > > > Cornelia,
> > >> >> > > > > > > > should we be concerned about the performance impact 
> > >> >> > > > > > > > here?
> > >> >> > > > > > > > Any chance it can be tested?  
> > >> >> > > > > > > We can have a bunch of devices using the same airq 
> > >> >> > > > > > > structure, and the
> > >> >> > > > > > > sync cb creates a choke point, same as 
> > >> >> > > > > > > registering/unregistering.  
> > >> >> > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at 
> > >> >> > > > > > the moment?  
> > >> >> > > > > I'm not sure I understand the question.
> > >> >> > > > > 
> > >> >> > > > > I do think we can have multiple CPUs that are executing some 
> > >> >> > > > > portion of
> > >> >> > > > > virtio_ccw_int_handler(). So I guess the answer is y

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Wed, Apr 27, 2022 at 03:57:57PM +0800, Jason Wang wrote:
> On Wed, Apr 27, 2022 at 2:30 PM Michael S. Tsirkin  wrote:
> >
> > On Wed, Apr 27, 2022 at 11:53:25AM +0800, Jason Wang wrote:
> > > On Tue, Apr 26, 2022 at 2:30 PM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Tue, Apr 26, 2022 at 12:07:39PM +0800, Jason Wang wrote:
> > > > > On Tue, Apr 26, 2022 at 11:55 AM Michael S. Tsirkin  
> > > > > wrote:
> > > > > >
> > > > > > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > > > > > > >
> > > > > > > > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> > > > > > > > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin 
> > > > > > > > > wrote:
> > > > > > > > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > > > > > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > > > > > > > > "Michael S. Tsirkin"  wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" 
> > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason 
> > > > > > > > > > > > > > Wang wrote:
> > > > > > > > > > > > > > > This patch tries to implement the 
> > > > > > > > > > > > > > > synchronize_cbs() for ccw. For the
> > > > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > > > virtio_airq_handler(), the
> > > > > > > > > > > > > > > synchronization is simply done via the 
> > > > > > > > > > > > > > > airq_info's lock. For the
> > > > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > > > virtio_ccw_int_handler(), a per
> > > > > > > > > > > > > > > device spinlock for irq is introduced ans used in 
> > > > > > > > > > > > > > > the synchronization
> > > > > > > > > > > > > > > method.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > > > > > > > > Cc: Marc Zyngier 
> > > > > > > > > > > > > > > Cc: Halil Pasic 
> > > > > > > > > > > > > > > Cc: Cornelia Huck 
> > > > > > > > > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This is the only one that is giving me pause. 
> > > > > > > > > > > > > > Halil, Cornelia,
> > > > > > > > > > > > > > should we be concerned about the performance impact 
> > > > > > > > > > > > > > here?
> > > > > > > > > > > > > > Any chance it can be tested?
> > > > > > > > > > > > > We can have a bunch of devices using the same airq 
> > > > > > > > > > > > > structure, and the
> > > > > > > > > > > > > sync cb creates a choke point, same as 
> > > > > > > > > > > > > registering/unregistering.
> > > > > > > > > > > > BTW can callbacks for multiple VQs run on multiple CPUs 
> > > > > > > > >

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Wed, Apr 27, 2022 at 11:53:25AM +0800, Jason Wang wrote:
> On Tue, Apr 26, 2022 at 2:30 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Apr 26, 2022 at 12:07:39PM +0800, Jason Wang wrote:
> > > On Tue, Apr 26, 2022 at 11:55 AM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > > > > On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > > > > >
> > > > > > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> > > > > > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin 
> > > > > > > wrote:
> > > > > > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > > > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > > > > > > "Michael S. Tsirkin"  wrote:
> > > > > > > > >
> > > > > > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> > > > > > > > > > wrote:
> > > > > > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang 
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > This patch tries to implement the synchronize_cbs() 
> > > > > > > > > > > > > for ccw. For the
> > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > virtio_airq_handler(), the
> > > > > > > > > > > > > synchronization is simply done via the airq_info's 
> > > > > > > > > > > > > lock. For the
> > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > virtio_ccw_int_handler(), a per
> > > > > > > > > > > > > device spinlock for irq is introduced ans used in the 
> > > > > > > > > > > > > synchronization
> > > > > > > > > > > > > method.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > > > > > > Cc: Marc Zyngier 
> > > > > > > > > > > > > Cc: Halil Pasic 
> > > > > > > > > > > > > Cc: Cornelia Huck 
> > > > > > > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > > > > > >
> > > > > > > > > > > > This is the only one that is giving me pause. Halil, 
> > > > > > > > > > > > Cornelia,
> > > > > > > > > > > > should we be concerned about the performance impact 
> > > > > > > > > > > > here?
> > > > > > > > > > > > Any chance it can be tested?
> > > > > > > > > > > We can have a bunch of devices using the same airq 
> > > > > > > > > > > structure, and the
> > > > > > > > > > > sync cb creates a choke point, same as 
> > > > > > > > > > > registering/unregistering.
> > > > > > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at 
> > > > > > > > > > the moment?
> > > > > > > > > I'm not sure I understand the question.
> > > > > > > > >
> > > > > > > > > I do think we can have multiple CPUs that are executing some 
> > > > > > > > > portion of
> > > > > > > > > virtio_ccw_int_handler(). So I guess the answer is yes. 
> > > > > > > > > Connie what do you think?
> > > > > > > > >
> > > > > > > > > On the other hand we could also end up serializing 
> > > > > > > > > synchr

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-27 Thread Michael S. Tsirkin
On Wed, Apr 27, 2022 at 11:53:25AM +0800, Jason Wang wrote:
> On Tue, Apr 26, 2022 at 2:30 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Apr 26, 2022 at 12:07:39PM +0800, Jason Wang wrote:
> > > On Tue, Apr 26, 2022 at 11:55 AM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > > > > On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > > > > >
> > > > > > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> > > > > > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin 
> > > > > > > wrote:
> > > > > > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > > > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > > > > > > "Michael S. Tsirkin"  wrote:
> > > > > > > > >
> > > > > > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck 
> > > > > > > > > > wrote:
> > > > > > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang 
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > This patch tries to implement the synchronize_cbs() 
> > > > > > > > > > > > > for ccw. For the
> > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > virtio_airq_handler(), the
> > > > > > > > > > > > > synchronization is simply done via the airq_info's 
> > > > > > > > > > > > > lock. For the
> > > > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > > > virtio_ccw_int_handler(), a per
> > > > > > > > > > > > > device spinlock for irq is introduced ans used in the 
> > > > > > > > > > > > > synchronization
> > > > > > > > > > > > > method.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > > > > > > Cc: Marc Zyngier 
> > > > > > > > > > > > > Cc: Halil Pasic 
> > > > > > > > > > > > > Cc: Cornelia Huck 
> > > > > > > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > > > > > >
> > > > > > > > > > > > This is the only one that is giving me pause. Halil, 
> > > > > > > > > > > > Cornelia,
> > > > > > > > > > > > should we be concerned about the performance impact 
> > > > > > > > > > > > here?
> > > > > > > > > > > > Any chance it can be tested?
> > > > > > > > > > > We can have a bunch of devices using the same airq 
> > > > > > > > > > > structure, and the
> > > > > > > > > > > sync cb creates a choke point, same as 
> > > > > > > > > > > registering/unregistering.
> > > > > > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at 
> > > > > > > > > > the moment?
> > > > > > > > > I'm not sure I understand the question.
> > > > > > > > >
> > > > > > > > > I do think we can have multiple CPUs that are executing some 
> > > > > > > > > portion of
> > > > > > > > > virtio_ccw_int_handler(). So I guess the answer is yes. 
> > > > > > > > > Connie what do you think?
> > > > > > > > >
> > > > > > > > > On the other hand we could also end up serializing 
> > > > > > > > > synchro

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-26 Thread Michael S. Tsirkin
On Tue, Apr 26, 2022 at 05:47:17PM +0200, Cornelia Huck wrote:
> On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> 
> > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> >> On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> >> > 
> >> > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> >> > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote:
> >> > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> >> > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> >> > > > > "Michael S. Tsirkin"  wrote:
> >> > > > > 
> >> > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
> >> > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  
> >> > > > > > > wrote:
> >> > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:
> >> > > > > > > > > This patch tries to implement the synchronize_cbs() for 
> >> > > > > > > > > ccw. For the
> >> > > > > > > > > vring_interrupt() that is called via 
> >> > > > > > > > > virtio_airq_handler(), the
> >> > > > > > > > > synchronization is simply done via the airq_info's lock. 
> >> > > > > > > > > For the
> >> > > > > > > > > vring_interrupt() that is called via 
> >> > > > > > > > > virtio_ccw_int_handler(), a per
> >> > > > > > > > > device spinlock for irq is introduced ans used in the 
> >> > > > > > > > > synchronization
> >> > > > > > > > > method.
> >> > > > > > > > > 
> >> > > > > > > > > Cc: Thomas Gleixner 
> >> > > > > > > > > Cc: Peter Zijlstra 
> >> > > > > > > > > Cc: "Paul E. McKenney" 
> >> > > > > > > > > Cc: Marc Zyngier 
> >> > > > > > > > > Cc: Halil Pasic 
> >> > > > > > > > > Cc: Cornelia Huck 
> >> > > > > > > > > Signed-off-by: Jason Wang 
> >> > > > > > > > 
> >> > > > > > > > This is the only one that is giving me pause. Halil, 
> >> > > > > > > > Cornelia,
> >> > > > > > > > should we be concerned about the performance impact here?
> >> > > > > > > > Any chance it can be tested?
> >> > > > > > > We can have a bunch of devices using the same airq structure, 
> >> > > > > > > and the
> >> > > > > > > sync cb creates a choke point, same as 
> >> > > > > > > registering/unregistering.
> >> > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at the 
> >> > > > > > moment?
> >> > > > > I'm not sure I understand the question.
> >> > > > > 
> >> > > > > I do think we can have multiple CPUs that are executing some 
> >> > > > > portion of
> >> > > > > virtio_ccw_int_handler(). So I guess the answer is yes. Connie 
> >> > > > > what do you think?
> >> > > > > 
> >> > > > > On the other hand we could also end up serializing 
> >> > > > > synchronize_cbs()
> >> > > > > calls for different devices if they happen to use the same 
> >> > > > > airq_info. But
> >> > > > > this probably was not your question
> >> > > > 
> >> > > > I am less concerned about  synchronize_cbs being slow and more about
> >> > > > the slowdown in interrupt processing itself.
> >> > > > 
> >> > > > > > this patch serializes them on a spinlock.
> >> > > > > > 
> >> > > > > Those could then pile up on the newly introduced spinlock.
> 
> How bad would that be in practice? IIUC, we hit on the spinlock when
> - doing synchronize_cbs (should be rare)
> - processing queue interrupts for devices using per-device indicators
>   (which is the non-preferred path, which I would basically only expect
>   when running on an ancient or non-standard hypervisor)

this one is my concern. I am worried serializing everything on a single lock
will drastically regress performance here.


> - configuration change interrupts (should be rare)
> - during setup, reset, etc. (should not be a concern)
> 
> >> > > > > 
> >> > > > > Regards,
> >> > > > > Halil
> >> > > > Hmm yea ... not good.
> >> > > Is there any other way to synchronize with all callbacks?
> >> > 
> >> > 
> >> > Maybe using rwlock as airq handler?
> >> > 
> >> > Thanks
> >> > 
> >> 
> >> rwlock is still a shared cacheline bouncing between CPUs and
> >> a bunch of ordering instructions.
> >> Maybe something per-cpu + some IPIs to run things on all CPUs instead?
> >
> > ... and I think classic and device interrupts are different enough
> > here ...
> 
> You mean classic (per-device) and adapter interrupts, right?


Re: [PATCH v2] vduse: Fix NULL pointer dereference on sysfs access

2022-04-26 Thread Michael S. Tsirkin
On Tue, Apr 26, 2022 at 10:37:17PM +0800, Yongji Xie wrote:
> On Tue, Apr 26, 2022 at 10:18 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Apr 26, 2022 at 10:02:02PM +0800, Yongji Xie wrote:
> > > > This should not be needed, when your module is unloaded, all devices it
> > > > handled should be properly removed by it.
> > > >
> > >
> > > I see. But it's not easy to achieve that currently. Maybe we need
> > > something like DEVICE_NEEDS_RESET support in virtio core.
> >
> > Not sure what the connection is.
> >
> 
> If we want to forcibly remove all working vduse devices during module
> unload, we might need to send a DEVICE_NEEDS_RESET notification to the
> device driver so it can do some cleanup first, e.g., return errors for
> all inflight I/Os.
> 
> Thanks,
> Yongji

IMHO DEVICE_NEEDS_RESET won't help much with that; it's more for the
case where the device is still there but needs a reset to start working.

-- 
MST



Re: [PATCH v2] vduse: Fix NULL pointer dereference on sysfs access

2022-04-26 Thread Michael S. Tsirkin
On Tue, Apr 26, 2022 at 10:02:02PM +0800, Yongji Xie wrote:
> > This should not be needed, when your module is unloaded, all devices it
> > handled should be properly removed by it.
> >
> 
> I see. But it's not easy to achieve that currently. Maybe we need
> something like DEVICE_NEEDS_RESET support in virtio core.

Not sure what the connection is.

-- 
MST



Re: [PATCH v9 00/32] virtio pci support VIRTIO_F_RING_RESET (refactor vring)

2022-04-26 Thread Michael S. Tsirkin
On Wed, Apr 06, 2022 at 11:43:14AM +0800, Xuan Zhuo wrote:
> The virtio spec already supports the virtio queue reset function. This patch 
> set
> is to add this function to the kernel. The relevant virtio spec information is
> here:
> 
> https://github.com/oasis-tcs/virtio-spec/issues/124
> 
> Also regarding MMIO support for queue reset, I plan to support it after this
> patch is passed.

Regarding the spec, there's now an issue proposing
some changes to the interface. What do you think about that
proposal? Could you respond on that thread on the virtio TC mailing list?


> This patch set implements the refactoring of vring. Finally, the
> virtuque_resize() interface is provided based on the reset function of the
> transport layer.
> 
> Test environment:
> Host: 4.19.91
> Qemu: QEMU emulator version 6.2.50 (with vq reset support)
> Test Cmd:  ethtool -G eth1 rx $1 tx $2; ethtool -g eth1
> 
> The default is split mode, modify Qemu virtio-net to add PACKED feature 
> to test
> packed mode.
> 
> Qemu code:
> 
> https://github.com/fengidri/qemu/compare/89f3bfa3265554d1d591ee4d7f1197b6e3397e84...master
> 
> In order to simplify the review of this patch set, the function of reusing
> the old buffers after resize will be introduced in subsequent patch sets.
> 
> Please review. Thanks.
> 
> v9:
>   1. Provide a virtqueue_resize() interface directly
>   2. A patch set including vring resize, virtio pci reset, virtio-net resize
>   3. No more separate structs
> 
> v8:
>   1. Provide a virtqueue_reset() interface directly
>   2. Split the two patch sets, this is the first part
>   3. Add independent allocation helper for allocating state, extra
> 
> v7:
>   1. fix #6 subject typo
>   2. fix #6 ring_size_in_bytes is uninitialized
>   3. check by: make W=12
> 
> v6:
>   1. virtio_pci: use synchronize_irq(irq) to sync the irq callbacks
>   2. Introduce virtqueue_reset_vring() to implement the reset of vring during
>  the reset process. May use the old vring if num of the vq not change.
>   3. find_vqs() support sizes to special the max size of each vq
> 
> v5:
>   1. add virtio-net support set_ringparam
> 
> v4:
>   1. just the code of virtio, without virtio-net
>   2. Performing reset on a queue is divided into these steps:
> 1. reset_vq: reset one vq
> 2. recycle the buffer from vq by virtqueue_detach_unused_buf()
> 3. release the ring of the vq by vring_release_virtqueue()
> 4. enable_reset_vq: re-enable the reset queue
>   3. Simplify the parameters of enable_reset_vq()
>   4. add container structures for virtio_pci_common_cfg
> 
> v3:
>   1. keep vq, irq unreleased
> 
> Xuan Zhuo (32):
>   virtio: add helper virtqueue_get_vring_max_size()
>   virtio: struct virtio_config_ops add callbacks for queue_reset
>   virtio_ring: update the document of the virtqueue_detach_unused_buf
> for queue reset
>   virtio_ring: remove the arg vq of vring_alloc_desc_extra()
>   virtio_ring: extract the logic of freeing vring
>   virtio_ring: split: extract the logic of alloc queue
>   virtio_ring: split: extract the logic of alloc state and extra
>   virtio_ring: split: extract the logic of attach vring
>   virtio_ring: split: extract the logic of vq init
>   virtio_ring: split: introduce virtqueue_reinit_split()
>   virtio_ring: split: introduce virtqueue_resize_split()
>   virtio_ring: packed: extract the logic of alloc queue
>   virtio_ring: packed: extract the logic of alloc state and extra
>   virtio_ring: packed: extract the logic of attach vring
>   virtio_ring: packed: extract the logic of vq init
>   virtio_ring: packed: introduce virtqueue_reinit_packed()
>   virtio_ring: packed: introduce virtqueue_resize_packed()
>   virtio_ring: introduce virtqueue_resize()
>   virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
>   virtio: queue_reset: add VIRTIO_F_RING_RESET
>   virtio_pci: queue_reset: update struct virtio_pci_common_cfg and
> option functions
>   virtio_pci: queue_reset: extract the logic of active vq for modern pci
>   virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
>   virtio: find_vqs() add arg sizes
>   virtio_pci: support the arg sizes of find_vqs()
>   virtio_mmio: support the arg sizes of find_vqs()
>   virtio: add helper virtio_find_vqs_ctx_size()
>   virtio_net: set the default max ring size by find_vqs()
>   virtio_net: get ringparam by virtqueue_get_vring_max_size()
>   virtio_net: split free_unused_bufs()
>   virtio_net: support rx/tx queue resize
>   virtio_net: support set_ringparam
> 
>  arch/um/drivers/virtio_uml.c |   3 +-
>  drivers/net/virtio_net.c | 219 +++-
>  drivers/platform/mellanox/mlxbf-tmfifo.c |   3 +
>  drivers/remoteproc/remoteproc_virtio.c   |   3 +
>  drivers/s390/virtio/virtio_ccw.c |   4 +
>  drivers/virtio/virtio_mmio.c |  11 +-
>  drivers/virtio/virtio_pci_common.c   |  28 +-
>  drivers/virtio/virtio_pci_common.h   |   3 +-
>  

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-26 Thread Michael S. Tsirkin
On Tue, Apr 26, 2022 at 12:07:39PM +0800, Jason Wang wrote:
> On Tue, Apr 26, 2022 at 11:55 AM Michael S. Tsirkin  wrote:
> >
> > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > > On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > > >
> > > > 在 2022/4/26 11:38, Michael S. Tsirkin 写道:
> > > > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote:
> > > > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > > > > "Michael S. Tsirkin"  wrote:
> > > > > > >
> > > > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
> > > > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:
> > > > > > > > > > > This patch tries to implement the synchronize_cbs() for 
> > > > > > > > > > > ccw. For the
> > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > virtio_airq_handler(), the
> > > > > > > > > > > synchronization is simply done via the airq_info's lock. 
> > > > > > > > > > > For the
> > > > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > > > virtio_ccw_int_handler(), a per
> > > > > > > > > > > device spinlock for irq is introduced ans used in the 
> > > > > > > > > > > synchronization
> > > > > > > > > > > method.
> > > > > > > > > > >
> > > > > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > > > > Cc: Marc Zyngier 
> > > > > > > > > > > Cc: Halil Pasic 
> > > > > > > > > > > Cc: Cornelia Huck 
> > > > > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > > > >
> > > > > > > > > > This is the only one that is giving me pause. Halil, 
> > > > > > > > > > Cornelia,
> > > > > > > > > > should we be concerned about the performance impact here?
> > > > > > > > > > Any chance it can be tested?
> > > > > > > > > We can have a bunch of devices using the same airq structure, 
> > > > > > > > > and the
> > > > > > > > > sync cb creates a choke point, same as 
> > > > > > > > > registering/unregistering.
> > > > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at the 
> > > > > > > > moment?
> > > > > > > I'm not sure I understand the question.
> > > > > > >
> > > > > > > I do think we can have multiple CPUs that are executing some 
> > > > > > > portion of
> > > > > > > virtio_ccw_int_handler(). So I guess the answer is yes. Connie 
> > > > > > > what do you think?
> > > > > > >
> > > > > > > On the other hand we could also end up serializing 
> > > > > > > synchronize_cbs()
> > > > > > > calls for different devices if they happen to use the same 
> > > > > > > airq_info. But
> > > > > > > this probably was not your question
> > > > > >
> > > > > > I am less concerned about  synchronize_cbs being slow and more about
> > > > > > the slowdown in interrupt processing itself.
> > > > > >
> > > > > > > > this patch serializes them on a spinlock.
> > > > > > > >
> > > > > > > Those could then pile up on the newly introduced spinlock.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Halil
> > > > > > Hmm yea ... not good.
> > > > > Is there any other way to synchronize with all callbacks?
> > > >
> > > >
> > > > Maybe using rwlock as airq handler?
> > > >
> > > > Thanks
> > > >
> > >
> > > rwlock is still a shared cacheline bouncing between CPUs and
> > > a bunch of ordering instructions.
> 
> Yes, but it should be faster than spinlocks anyhow.
> 
> > > Maybe something per-cpu + some IPIs to run things on all CPUs instead?
> 
> Is this something like a customized version of synchronize_rcu_expedited()?

With interrupts running in an RCU read-side critical section?
Quite possibly that is also an option.
This will need a bunch of documentation, since this is not
a standard use of RCU, and we would probably need confirmation
from the RCU maintainers that whatever assumptions we make
are guaranteed to hold down the road.
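(To make the idea concrete - this is only a sketch with made-up placement,
assuming the handler body may legitimately be treated as a read-side
critical section:

    static void virtio_ccw_int_handler(struct ccw_device *cdev,
                                       unsigned long intparm,
                                       struct irb *irb)
    {
            rcu_read_lock();
            /* existing body: check activity, scan the indicators and
             * call vring_interrupt() for every pending virtqueue
             */
            rcu_read_unlock();
    }

    static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
    {
            /* wait for all in-flight handlers to leave the read-side
             * critical section above
             */
            synchronize_rcu();
    }

whether doing this from hard interrupt context is a legitimate use of RCU
is exactly what would need the maintainers' confirmation.)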

> >
> > ... and I think classic and device interrupts are different enough
> > here ...
> 
> Yes.
> 
> Thanks
> 
> >
> > > > >
> > > > > > --
> > > > > > MST
> >

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-25 Thread Michael S. Tsirkin
On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > 
> > On 2022/4/26 11:38, Michael S. Tsirkin wrote:
> > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote:
> > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > > "Michael S. Tsirkin"  wrote:
> > > > > 
> > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
> > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:
> > > > > > > > > This patch tries to implement the synchronize_cbs() for ccw. 
> > > > > > > > > For the
> > > > > > > > > vring_interrupt() that is called via virtio_airq_handler(), 
> > > > > > > > > the
> > > > > > > > > synchronization is simply done via the airq_info's lock. For 
> > > > > > > > > the
> > > > > > > > > vring_interrupt() that is called via 
> > > > > > > > > virtio_ccw_int_handler(), a per
> > > > > > > > > device spinlock for irq is introduced ans used in the 
> > > > > > > > > synchronization
> > > > > > > > > method.
> > > > > > > > > 
> > > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > > Cc: Marc Zyngier 
> > > > > > > > > Cc: Halil Pasic 
> > > > > > > > > Cc: Cornelia Huck 
> > > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > > 
> > > > > > > > This is the only one that is giving me pause. Halil, Cornelia,
> > > > > > > > should we be concerned about the performance impact here?
> > > > > > > > Any chance it can be tested?
> > > > > > > We can have a bunch of devices using the same airq structure, and 
> > > > > > > the
> > > > > > > sync cb creates a choke point, same as registering/unregistering.
> > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at the 
> > > > > > moment?
> > > > > I'm not sure I understand the question.
> > > > > 
> > > > > I do think we can have multiple CPUs that are executing some portion 
> > > > > of
> > > > > virtio_ccw_int_handler(). So I guess the answer is yes. Connie what 
> > > > > do you think?
> > > > > 
> > > > > On the other hand we could also end up serializing synchronize_cbs()
> > > > > calls for different devices if they happen to use the same airq_info. 
> > > > > But
> > > > > this probably was not your question
> > > > 
> > > > I am less concerned about  synchronize_cbs being slow and more about
> > > > the slowdown in interrupt processing itself.
> > > > 
> > > > > > this patch serializes them on a spinlock.
> > > > > > 
> > > > > Those could then pile up on the newly introduced spinlock.
> > > > > 
> > > > > Regards,
> > > > > Halil
> > > > Hmm yea ... not good.
> > > Is there any other way to synchronize with all callbacks?
> > 
> > 
> > Maybe using rwlock as airq handler?
> > 
> > Thanks
> > 
> 
> rwlock is still a shared cacheline bouncing between CPUs and
> a bunch of ordering instructions.
> Maybe something per-cpu + some IPIs to run things on all CPUs instead?

... and I think classic and device interrupts are different enough
here ...

> > > 
> > > > -- 
> > > > MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-25 Thread Michael S. Tsirkin
On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> 
> On 2022/4/26 11:38, Michael S. Tsirkin wrote:
> > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote:
> > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > > > "Michael S. Tsirkin"  wrote:
> > > > 
> > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
> > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:
> > > > > > > > This patch tries to implement the synchronize_cbs() for ccw. 
> > > > > > > > For the
> > > > > > > > vring_interrupt() that is called via virtio_airq_handler(), the
> > > > > > > > synchronization is simply done via the airq_info's lock. For the
> > > > > > > > vring_interrupt() that is called via virtio_ccw_int_handler(), 
> > > > > > > > a per
> > > > > > > > device spinlock for irq is introduced ans used in the 
> > > > > > > > synchronization
> > > > > > > > method.
> > > > > > > > 
> > > > > > > > Cc: Thomas Gleixner 
> > > > > > > > Cc: Peter Zijlstra 
> > > > > > > > Cc: "Paul E. McKenney" 
> > > > > > > > Cc: Marc Zyngier 
> > > > > > > > Cc: Halil Pasic 
> > > > > > > > Cc: Cornelia Huck 
> > > > > > > > Signed-off-by: Jason Wang 
> > > > > > > 
> > > > > > > This is the only one that is giving me pause. Halil, Cornelia,
> > > > > > > should we be concerned about the performance impact here?
> > > > > > > Any chance it can be tested?
> > > > > > We can have a bunch of devices using the same airq structure, and 
> > > > > > the
> > > > > > sync cb creates a choke point, same as registering/unregistering.
> > > > > BTW can callbacks for multiple VQs run on multiple CPUs at the moment?
> > > > I'm not sure I understand the question.
> > > > 
> > > > I do think we can have multiple CPUs that are executing some portion of
> > > > virtio_ccw_int_handler(). So I guess the answer is yes. Connie what do 
> > > > you think?
> > > > 
> > > > On the other hand we could also end up serializing synchronize_cbs()
> > > > calls for different devices if they happen to use the same airq_info. 
> > > > But
> > > > this probably was not your question
> > > 
> > > I am less concerned about  synchronize_cbs being slow and more about
> > > the slowdown in interrupt processing itself.
> > > 
> > > > > this patch serializes them on a spinlock.
> > > > > 
> > > > Those could then pile up on the newly introduced spinlock.
> > > > 
> > > > Regards,
> > > > Halil
> > > Hmm yea ... not good.
> > Is there any other way to synchronize with all callbacks?
> 
> 
> Maybe using rwlock as airq handler?
> 
> Thanks
> 

rwlock is still a shared cacheline bouncing between CPUs and
a bunch of ordering instructions.
Maybe something per-cpu + some IPIs to run things on all CPUs instead?
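Very roughly, and only as a sketch (it assumes vring_interrupt() always
runs with interrupts disabled on the local CPU, so a pending IPI can only
be serviced once the handler has returned):

    static void synchronize_cbs_ipi(void *unused)
    {
            /* nothing to do - we only care that every CPU ran this */
    }

    static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
    {
            /*
             * Once every CPU has executed the empty function, any
             * handler that was running when we started has finished.
             * Nothing extra is taken in the hot interrupt path.
             */
            on_each_cpu(synchronize_cbs_ipi, NULL, 1);
    }

i.e. the cost moves from every interrupt to the (rare) synchronization
call.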

> > 
> > > -- 
> > > MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-25 Thread Michael S. Tsirkin
On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote:
> On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > On Mon, 25 Apr 2022 09:59:55 -0400
> > "Michael S. Tsirkin"  wrote:
> > 
> > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
> > > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> > > >   
> > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:  
> > > > >> This patch tries to implement the synchronize_cbs() for ccw. For the
> > > > >> vring_interrupt() that is called via virtio_airq_handler(), the
> > > > >> synchronization is simply done via the airq_info's lock. For the
> > > > >> vring_interrupt() that is called via virtio_ccw_int_handler(), a per
> > > > >> device spinlock for irq is introduced and used in the synchronization
> > > > >> method.
> > > > >> 
> > > > >> Cc: Thomas Gleixner 
> > > > >> Cc: Peter Zijlstra 
> > > > >> Cc: "Paul E. McKenney" 
> > > > >> Cc: Marc Zyngier 
> > > > >> Cc: Halil Pasic 
> > > > >> Cc: Cornelia Huck 
> > > > >> Signed-off-by: Jason Wang   
> > > > >
> > > > >
> > > > > This is the only one that is giving me pause. Halil, Cornelia,
> > > > > should we be concerned about the performance impact here?
> > > > > Any chance it can be tested?  
> > > > 
> > > > We can have a bunch of devices using the same airq structure, and the
> > > > sync cb creates a choke point, same as registering/unregistering.  
> > > 
> > > BTW can callbacks for multiple VQs run on multiple CPUs at the moment?
> > 
> > I'm not sure I understand the question.
> > 
> > I do think we can have multiple CPUs that are executing some portion of
> > virtio_ccw_int_handler(). So I guess the answer is yes. Connie what do you 
> > think?
> > 
> > On the other hand we could also end up serializing synchronize_cbs()
> > calls for different devices if they happen to use the same airq_info. But
> > this probably was not your question
> 
> 
> I am less concerned about  synchronize_cbs being slow and more about
> the slowdown in interrupt processing itself.
> 
> > > this patch serializes them on a spinlock.
> > >
> > 
> > Those could then pile up on the newly introduced spinlock.
> > 
> > Regards,
> > Halil
> 
> Hmm yea ... not good.

Is there any other way to synchronize with all callbacks?

> -- 
> MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-25 Thread Michael S. Tsirkin
On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> On Mon, 25 Apr 2022 09:59:55 -0400
> "Michael S. Tsirkin"  wrote:
> 
> > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
> > > On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> > >   
> > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:  
> > > >> This patch tries to implement the synchronize_cbs() for ccw. For the
> > > >> vring_interrupt() that is called via virtio_airq_handler(), the
> > > >> synchronization is simply done via the airq_info's lock. For the
> > > >> vring_interrupt() that is called via virtio_ccw_int_handler(), a per
> > > >> device spinlock for irq is introduced and used in the synchronization
> > > >> method.
> > > >> 
> > > >> Cc: Thomas Gleixner 
> > > >> Cc: Peter Zijlstra 
> > > >> Cc: "Paul E. McKenney" 
> > > >> Cc: Marc Zyngier 
> > > >> Cc: Halil Pasic 
> > > >> Cc: Cornelia Huck 
> > > >> Signed-off-by: Jason Wang   
> > > >
> > > >
> > > > This is the only one that is giving me pause. Halil, Cornelia,
> > > > should we be concerned about the performance impact here?
> > > > Any chance it can be tested?  
> > > 
> > > We can have a bunch of devices using the same airq structure, and the
> > > sync cb creates a choke point, same as registering/unregistering.  
> > 
> > BTW can callbacks for multiple VQs run on multiple CPUs at the moment?
> 
> I'm not sure I understand the question.
> 
> I do think we can have multiple CPUs that are executing some portion of
> virtio_ccw_int_handler(). So I guess the answer is yes. Connie what do you 
> think?
> 
> On the other hand we could also end up serializing synchronize_cbs()
> calls for different devices if they happen to use the same airq_info. But
> this probably was not your question


I am less concerned about  synchronize_cbs being slow and more about
the slowdown in interrupt processing itself.

> > this patch serializes them on a spinlock.
> >
> 
> Those could then pile up on the newly introduced spinlock.
> 
> Regards,
> Halil

Hmm yea ... not good.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V3 7/9] virtio: allow to unbreak virtqueue

2022-04-25 Thread Michael S. Tsirkin
On Mon, Apr 25, 2022 at 02:44:06PM +0200, Cornelia Huck wrote:
> On Mon, Apr 25 2022, Jason Wang  wrote:
> 
> > This patch allows the virtio_break_device() to accept a boolean value
> > then we can unbreak the virtqueue.
> >
> > Signed-off-by: Jason Wang 
> > ---
> >  drivers/char/virtio_console.c  | 2 +-
> >  drivers/crypto/virtio/virtio_crypto_core.c | 2 +-
> >  drivers/s390/virtio/virtio_ccw.c   | 4 ++--
> >  drivers/virtio/virtio_pci_common.c | 2 +-
> >  drivers/virtio/virtio_ring.c   | 4 ++--
> >  include/linux/virtio.h | 2 +-
> >  6 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> > index e3c430539a17..afede977f7b3 100644
> > --- a/drivers/char/virtio_console.c
> > +++ b/drivers/char/virtio_console.c
> > @@ -1958,7 +1958,7 @@ static void virtcons_remove(struct virtio_device 
> > *vdev)
> > spin_unlock_irq(_lock);
> >  
> > /* Device is going away, exit any polling for buffers */
> > -   virtio_break_device(vdev);
> > +   virtio_break_device(vdev, true);
> > if (use_multiport(portdev))
> > flush_work(>control_work);
> > else
> > diff --git a/drivers/crypto/virtio/virtio_crypto_core.c 
> > b/drivers/crypto/virtio/virtio_crypto_core.c
> > index c6f482db0bc0..fd17f3f2e958 100644
> > --- a/drivers/crypto/virtio/virtio_crypto_core.c
> > +++ b/drivers/crypto/virtio/virtio_crypto_core.c
> > @@ -215,7 +215,7 @@ static int virtcrypto_update_status(struct 
> > virtio_crypto *vcrypto)
> > dev_warn(>vdev->dev,
> > "Unknown status bits: 0x%x\n", status);
> >  
> > -   virtio_break_device(vcrypto->vdev);
> > +   virtio_break_device(vcrypto->vdev, true);
> > return -EPERM;
> > }
> >  
> > diff --git a/drivers/s390/virtio/virtio_ccw.c 
> > b/drivers/s390/virtio/virtio_ccw.c
> > index c19f07a82d62..9a963f5af5b5 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> > @@ -1211,7 +1211,7 @@ static void virtio_ccw_remove(struct ccw_device *cdev)
> >  
> > if (vcdev && cdev->online) {
> > if (vcdev->device_lost)
> > -   virtio_break_device(>vdev);
> > +   virtio_break_device(>vdev, true);
> > unregister_virtio_device(>vdev);
> > spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
> > dev_set_drvdata(>dev, NULL);
> > @@ -1228,7 +1228,7 @@ static int virtio_ccw_offline(struct ccw_device *cdev)
> > if (!vcdev)
> > return 0;
> > if (vcdev->device_lost)
> > -   virtio_break_device(>vdev);
> > +   virtio_break_device(>vdev, true);
> > unregister_virtio_device(>vdev);
> > spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
> > dev_set_drvdata(>dev, NULL);
> > diff --git a/drivers/virtio/virtio_pci_common.c 
> > b/drivers/virtio/virtio_pci_common.c
> > index d724f676608b..39a711ddff30 100644
> > --- a/drivers/virtio/virtio_pci_common.c
> > +++ b/drivers/virtio/virtio_pci_common.c
> > @@ -583,7 +583,7 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
> >  * layers can abort any ongoing operation.
> >  */
> > if (!pci_device_is_present(pci_dev))
> > -   virtio_break_device(_dev->vdev);
> > +   virtio_break_device(_dev->vdev, true);
> >  
> > pci_disable_sriov(pci_dev);
> >  
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index cfb028ca238e..6da13495a70c 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -2382,7 +2382,7 @@ EXPORT_SYMBOL_GPL(virtqueue_is_broken);
> >   * This should prevent the device from being used, allowing drivers to
> >   * recover.  You may need to grab appropriate locks to flush.
> >   */
> > -void virtio_break_device(struct virtio_device *dev)
> > +void virtio_break_device(struct virtio_device *dev, bool broken)
> 
> I think we need to be careful to say when it is safe to unset 'broken'.
> 
> The current callers set all queues to broken in case of surprise removal
> (ccw, pci), removal (console), or the device behaving badly
> (crypto). There's also code setting individual queues to broken. We do
> not want to undo any of these, unless the device has gone through a
> reset in the meanwhile. Maybe add:
> 
> "It is only safe to call this function to *remove* the broken flag for a
> device that is (re)transitioning to becoming usable; calling it that way
> during normal usage may have unpredictable consequences."
> 
> (Not sure how to word this; especially if we consider future usage of
> queue reset.)


Right. I would prefer __virtio_unbreak_device or something similar,
with a big comment explaining that it's only safe to call during probe.
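Something along these lines, mirroring virtio_break_device() in
virtio_ring.c (just a sketch, name and comment wording to taste):

    /*
     * __virtio_unbreak_device - undo virtio_break_device()
     *
     * This must only be used while the device is being set up (e.g.
     * during probe/restore, before DRIVER_OK), when no callbacks can
     * be running; calling it on a live device is a bug.
     */
    void __virtio_unbreak_device(struct virtio_device *dev)
    {
            struct virtqueue *_vq;

            spin_lock(&dev->vqs_list_lock);
            list_for_each_entry(_vq, &dev->vqs, list) {
                    struct vring_virtqueue *vq = to_vvq(_vq);

                    /* Pairs with READ_ONCE() in virtqueue_is_broken(). */
                    WRITE_ONCE(vq->broken, false);
            }
            spin_unlock(&dev->vqs_list_lock);
    }

That also avoids touching every existing virtio_break_device() caller
just to pass "true".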

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-25 Thread Michael S. Tsirkin
On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
> On Mon, Apr 25 2022, "Michael S. Tsirkin"  wrote:
> 
> > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:
> >> This patch tries to implement the synchronize_cbs() for ccw. For the
> >> vring_interrupt() that is called via virtio_airq_handler(), the
> >> synchronization is simply done via the airq_info's lock. For the
> >> vring_interrupt() that is called via virtio_ccw_int_handler(), a per
> >> device spinlock for irq is introduced and used in the synchronization
> >> method.
> >> 
> >> Cc: Thomas Gleixner 
> >> Cc: Peter Zijlstra 
> >> Cc: "Paul E. McKenney" 
> >> Cc: Marc Zyngier 
> >> Cc: Halil Pasic 
> >> Cc: Cornelia Huck 
> >> Signed-off-by: Jason Wang 
> >
> >
> > This is the only one that is giving me pause. Halil, Cornelia,
> > should we be concerned about the performance impact here?
> > Any chance it can be tested?
> 
> We can have a bunch of devices using the same airq structure, and the
> sync cb creates a choke point, same as registering/unregistering.

BTW, can callbacks for multiple VQs run on multiple CPUs at the moment?
This patch serializes them on a spinlock.

> If
> invoking the sync cb is a rare operation (same as (un)registering), it
> should not affect interrupt processing for other devices too much, but
> it really should be rare.
> 
> For testing, you would probably want to use a setup with many devices
> that share the same airq area (you can fit a lot of devices if they have
> few queues), generate traffic on the queues, and then do something that
> triggers the callback (adding/removing a new device in a loop?)
> 
> I currently don't have such a setup handy; Halil, would you be able to
> test that?
> 
> >
> >> ---
> >>  drivers/s390/virtio/virtio_ccw.c | 27 +++
> >>  1 file changed, 27 insertions(+)
> >> 
> >> diff --git a/drivers/s390/virtio/virtio_ccw.c 
> >> b/drivers/s390/virtio/virtio_ccw.c
> >> index d35e7a3f7067..c19f07a82d62 100644
> >> --- a/drivers/s390/virtio/virtio_ccw.c
> >> +++ b/drivers/s390/virtio/virtio_ccw.c
> >> @@ -62,6 +62,7 @@ struct virtio_ccw_device {
> >>unsigned int revision; /* Transport revision */
> >>wait_queue_head_t wait_q;
> >>spinlock_t lock;
> >> +  spinlock_t irq_lock;
> >>struct mutex io_lock; /* Serializes I/O requests */
> >>struct list_head virtqueues;
> >>bool is_thinint;
> >> @@ -984,6 +985,27 @@ static const char *virtio_ccw_bus_name(struct 
> >> virtio_device *vdev)
> >>return dev_name(>cdev->dev);
> >>  }
> >>  
> >> +static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
> >> +{
> >> +  struct virtio_ccw_device *vcdev = to_vc_device(vdev);
> >> +  struct airq_info *info = vcdev->airq_info;
> >> +
> >> +  /*
> >> +   * Synchronize with the vring_interrupt() called by
> >> +   * virtio_ccw_int_handler().
> >> +   */
> >> +  spin_lock(>irq_lock);
> >> +  spin_unlock(>irq_lock);
> >> +
> >> +  if (info) {
> >> +  /*
> >> +   * Synchronize with the vring_interrupt() with airq indicator
> >> +   */
> >> +  write_lock(>lock);
> >> +  write_unlock(>lock);
> >> +  }
> 
> I think we can make this an either/or operation (devices will either use
> classic interrupts or adapter interrupts)?
> 
> >> +}
> >> +
> >>  static const struct virtio_config_ops virtio_ccw_config_ops = {
> >>.get_features = virtio_ccw_get_features,
> >>.finalize_features = virtio_ccw_finalize_features,

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net v3] virtio_net: fix wrong buf address calculation when using xdp

2022-04-25 Thread Michael S. Tsirkin
  [   41.322193]  skb_release_data+0x13f/0x1c0
>  [   41.322902]  __kfree_skb+0x20/0x30
>  [   41.343870]  tcp_recvmsg_locked+0x671/0x880
>  [   41.363764]  tcp_recvmsg+0x5e/0x1c0
>  [   41.384102]  inet_recvmsg+0x42/0x100
>  [   41.406783]  ? sock_recvmsg+0x1d/0x70
>  [   41.428201]  sock_read_iter+0x84/0xd0
>  [   41.445592]  ? 0xa300
>  [   41.462442]  new_sync_read+0x148/0x160
>  [   41.479314]  ? 0xa300
>  [   41.496937]  vfs_read+0x138/0x190
>  [   41.517198]  ksys_read+0x87/0xc0
>  [   41.535336]  do_syscall_64+0x3b/0x90
>  [   41.551637]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>  [   41.568050] RIP: 0033:0x48765b
>  [   41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a 
> fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 
> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
>  [   41.632818] RSP: 002b:00c000a2f5b8 EFLAGS: 0212 ORIG_RAX: 
> 
>  [   41.664588] RAX: ffda RBX: 00c62000 RCX: 
> 0048765b
>  [   41.681205] RDX: 5e54 RSI: 00c000e66000 RDI: 
> 0016
>  [   41.697164] RBP: 00c000a2f608 R08: 0001 R09: 
> 01b4
>  [   41.713034] R10: 00b6 R11: 0212 R12: 
> 00e9
>  [   41.728755] R13: 0001 R14: 00c000a92000 R15: 
> 
>  [   41.744254]  
>  [   41.758585] Modules linked in: br_netfilter bridge veth netconsole 
> virtio_net
> 
>  and
> 
>  [   33.524802] BUG: Bad page state in process systemd-network  pfn:11e60
>  [   33.528617] page e05dc0147b00 e05dc04e7a00 8ae9851ec000 (1) 
> len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data 8ae9851ec10c 
> data_meta 8ae9851ec108 data_end 8ae9851ec14e
>  [   33.529764] page:3792b5ba refcount:0 mapcount:-512 
> mapping: index:0x0 pfn:0x11e60
>  [   33.532463] flags: 0xfc000(node=0|zone=1|lastcpupid=0x1f)
>  [   33.532468] raw: 000fc000  dead0122 
> 
>  [   33.532470] raw:   fdff 
> 
>  [   33.532471] page dumped because: nonzero mapcount
>  [   33.532472] Modules linked in: br_netfilter bridge veth netconsole 
> virtio_net
>  [   33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not 
> tainted 5.18.0-rc1+ #37
>  [   33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> 1.15.0-1.fc35 04/01/2014
>  [   33.532484] Call Trace:
>  [   33.532496]  
>  [   33.532500]  dump_stack_lvl+0x45/0x5a
>  [   33.532506]  bad_page.cold+0x63/0x94
>  [   33.532510]  free_pcp_prepare+0x290/0x420
>  [   33.532515]  free_unref_page+0x1b/0x100
>  [   33.532518]  skb_release_data+0x13f/0x1c0
>  [   33.532524]  kfree_skb_reason+0x3e/0xc0
>  [   33.532527]  ip6_mc_input+0x23c/0x2b0
>  [   33.532531]  ip6_sublist_rcv_finish+0x83/0x90
>  [   33.532534]  ip6_sublist_rcv+0x22b/0x2b0
> 
> [3] XDP program to reproduce(xdp_pass.c):
>  #include 
>  #include 
> 
>  SEC("xdp_pass")
>  int xdp_pkt_pass(struct xdp_md *ctx)
>  {
>   bpf_xdp_adjust_head(ctx, -(int)32);
>   return XDP_PASS;
>  }
> 
>  char _license[] SEC("license") = "GPL";
> 
>  compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
>  load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass
> 
> CC: sta...@vger.kernel.org
> CC: Jason Wang 
> CC: Xuan Zhuo 
> CC: Daniel Borkmann 
> CC: "Michael S. Tsirkin" 
> CC: virtualization@lists.linux-foundation.org
> Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
> Signed-off-by: Nikolay Aleksandrov 

Acked-by: Michael S. Tsirkin 
> ---
> v3: Add a comment explaining why offset and headroom are equal,
> no code changes
> v2: Recalculate headroom based on data, data_hard_start and data_meta
> 
>  drivers/net/virtio_net.c | 20 +++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 87838cbe38cf..cbba9d2e8f32 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1005,6 +1005,24 @@ static struct sk_buff *receive_mergeable(struct 
> net_device *dev,
>* xdp.data_meta were adjusted
>*/
>   len = xdp.data_end - xdp.data + vi->hdr_len + metasize;
> +
> + /* recalculate headroom if xdp.data or xdp_data_meta
> +  * were adjusted, note that offset should always point
> +  

Re: [PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()

2022-04-25 Thread Michael S. Tsirkin
On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:
> This patch tries to implement the synchronize_cbs() for ccw. For the
> vring_interrupt() that is called via virtio_airq_handler(), the
> synchronization is simply done via the airq_info's lock. For the
> vring_interrupt() that is called via virtio_ccw_int_handler(), a per
> device spinlock for irq is introduced and used in the synchronization
> method.
> 
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Cc: Halil Pasic 
> Cc: Cornelia Huck 
> Signed-off-by: Jason Wang 


This is the only one that is giving me pause. Halil, Cornelia,
should we be concerned about the performance impact here?
Any chance it can be tested?

> ---
>  drivers/s390/virtio/virtio_ccw.c | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c 
> b/drivers/s390/virtio/virtio_ccw.c
> index d35e7a3f7067..c19f07a82d62 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -62,6 +62,7 @@ struct virtio_ccw_device {
>   unsigned int revision; /* Transport revision */
>   wait_queue_head_t wait_q;
>   spinlock_t lock;
> + spinlock_t irq_lock;
>   struct mutex io_lock; /* Serializes I/O requests */
>   struct list_head virtqueues;
>   bool is_thinint;
> @@ -984,6 +985,27 @@ static const char *virtio_ccw_bus_name(struct 
> virtio_device *vdev)
>   return dev_name(>cdev->dev);
>  }
>  
> +static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
> +{
> + struct virtio_ccw_device *vcdev = to_vc_device(vdev);
> + struct airq_info *info = vcdev->airq_info;
> +
> + /*
> +  * Synchronize with the vring_interrupt() called by
> +  * virtio_ccw_int_handler().
> +  */
> + spin_lock(>irq_lock);
> + spin_unlock(>irq_lock);
> +
> + if (info) {
> + /*
> +  * Synchronize with the vring_interrupt() with airq indicator
> +  */
> + write_lock(>lock);
> + write_unlock(>lock);
> + }
> +}
> +
>  static const struct virtio_config_ops virtio_ccw_config_ops = {
>   .get_features = virtio_ccw_get_features,
>   .finalize_features = virtio_ccw_finalize_features,
> @@ -995,6 +1017,7 @@ static const struct virtio_config_ops 
> virtio_ccw_config_ops = {
>   .find_vqs = virtio_ccw_find_vqs,
>   .del_vqs = virtio_ccw_del_vqs,
>   .bus_name = virtio_ccw_bus_name,
> + .synchronize_cbs = virtio_ccw_synchronize_cbs,
>  };
>  
>  
> @@ -1079,6 +1102,7 @@ static void virtio_ccw_int_handler(struct ccw_device 
> *cdev,
>  {
>   __u32 activity = intparm & VIRTIO_CCW_INTPARM_MASK;
>   struct virtio_ccw_device *vcdev = dev_get_drvdata(>dev);
> + unsigned long flags;
>   int i;
>   struct virtqueue *vq;
>  
> @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device 
> *cdev,
>   vcdev->err = -EIO;
>   }
>   virtio_ccw_check_activity(vcdev, activity);
> + spin_lock_irqsave(>irq_lock, flags);
>   for_each_set_bit(i, indicators(vcdev),
>sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
>   /* The bit clear must happen before the vring kick. */
> @@ -1114,6 +1139,7 @@ static void virtio_ccw_int_handler(struct ccw_device 
> *cdev,
>   vq = virtio_ccw_vq_by_ind(vcdev, i);
>   vring_interrupt(0, vq);
>   }
> + spin_unlock_irqrestore(>irq_lock, flags);
>   if (test_bit(0, indicators2(vcdev))) {
>   virtio_config_changed(>vdev);
>   clear_bit(0, indicators2(vcdev));
> @@ -1284,6 +1310,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>   init_waitqueue_head(>wait_q);
>   INIT_LIST_HEAD(>virtqueues);
>   spin_lock_init(>lock);
> + spin_lock_init(>irq_lock);
>   mutex_init(>io_lock);
>  
>   spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
> -- 
> 2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: virtio-net: Unpermitted usage of virtqueue before virtio driver initialization

2022-04-20 Thread Michael S. Tsirkin
On Wed, Apr 20, 2022 at 08:57:18PM +0200, Maciej Szymański wrote:
> On 20.04.2022 19:54, Michael S. Tsirkin wrote:
> > On Wed, Apr 20, 2022 at 04:58:51PM +0200, Maciej Szymański wrote:
> > > On 20.04.2022 13:10, Michael S. Tsirkin wrote:
> > > > On Wed, Apr 20, 2022 at 10:17:27AM +0200, Maciej Szymański wrote:
> > > > > > > > > > Hmm so we have this:
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > >if ((dev->features ^ features) & NETIF_F_GRO_HW) 
> > > > > > > > > > {
> > > > > > > > > >if (vi->xdp_enabled)
> > > > > > > > > >return -EBUSY;
> > > > > > > > > > 
> > > > > > > > > >if (features & NETIF_F_GRO_HW)
> > > > > > > > > >offloads = 
> > > > > > > > > > vi->guest_offloads_capable;
> > > > > > > > > >else
> > > > > > > > > >offloads = 
> > > > > > > > > > vi->guest_offloads_capable &
> > > > > > > > > > ~GUEST_OFFLOAD_GRO_HW_MASK;
> > > > > > > > > > 
> > > > > > > > > >err = virtnet_set_guest_offloads(vi, 
> > > > > > > > > > offloads);
> > > > > > > > > >if (err)
> > > > > > > > > >return err;
> > > > > > > > > >vi->guest_offloads = offloads;
> > > > > > > > > >}
> > > > > > > > > > 
> > > > > > > > > > which I guess should have prevented 
> > > > > > > > > > virtnet_set_guest_offloads
> > > > > > > > > > from ever running.
> > > > > > > > > > 
> > > > > > > > > >From your description it sounds like you have observed 
> > > > > > > > > > this
> > > > > > > > > > in practice, right?
> > > > > > > > > > 
> > > > > > > Yes. I have proprietary virtio-net device which advertises 
> > > > > > > following
> > > > > > > guest offload features :
> > > > > > > - VIRTIO_NET_F_GUEST_CSUM
> > > > > > > - VIRTIO_NET_F_GUEST_TSO4
> > > > > > > - VIRTIO_NET_F_GUEST_TSO6
> > > > > > > - VIRTIO_NET_F_GUEST_UFO
> > > > > > > 
> > > > > > > This feature set passes the condition in virtnet_set_features.
> > > > So why isn't dev->features equal to features?
> > > > 
> > > I just double verified and found that my device advertises
> > > VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6 but not
> > > VIRTIO_NET_F_GUEST_CSUM as mentioned before.
> > So, your device is out of spec:
> > 
> > VIRTIO_NET_F_GUEST_TSO4 Requires VIRTIO_NET_F_GUEST_CSUM.
> > 
> > And
> > 
> > The device MUST NOT offer a feature which requires another feature which 
> > was not offered.
> > 
> > 
> > Is this a production device? Can it be fixed?
> The problem seems to be more complicated. In fact
> VIRTIO_NET_F_GUEST_CSUM is offered by our device, but during feature
> negotiation it is being dropped.
> This most likely does not happen when we use MMIO, but for some reason
> happens in QEMU for VHOST_USER + PCI.
> I need to investigate this more deeply...


I don't see where Linux would drop it. I suspect it's dropped between
QEMU and vhost-user. I'd say let's fix it in the device first.
We can next consider marking vqs broken before the device is ready -
Jason, what do you think?
Finally, we can add code to avoid acking dependent features
if the feature they depend on has not been negotiated - doing
so is also a spec violation after all.
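Roughly something like this in virtnet_validate(), next to the existing
check that clears VIRTIO_NET_F_MTU (only a sketch):

    if (!virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM)) {
            /* The device is out of spec here: these offloads all
             * require GUEST_CSUM, so refuse to ack them.
             */
            __virtio_clear_bit(vdev, VIRTIO_NET_F_GUEST_TSO4);
            __virtio_clear_bit(vdev, VIRTIO_NET_F_GUEST_TSO6);
            __virtio_clear_bit(vdev, VIRTIO_NET_F_GUEST_ECN);
            __virtio_clear_bit(vdev, VIRTIO_NET_F_GUEST_UFO);
    }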


> 
> > 
> > > That leads to the following situation :
> > > 
> > > in virtio_probe :
> > > 
> > >if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM))
> > >  dev->features |= NETIF_F_RXCSUM;
> > >if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
> > &

Re: virtio-net: Unpermitted usage of virtqueue before virtio driver initialization

2022-04-20 Thread Michael S. Tsirkin
On Wed, Apr 20, 2022 at 04:58:51PM +0200, Maciej Szymański wrote:
> On 20.04.2022 13:10, Michael S. Tsirkin wrote:
> > On Wed, Apr 20, 2022 at 10:17:27AM +0200, Maciej Szymański wrote:
> > > > > > > > Hmm so we have this:
> > > > > > > > 
> > > > > > > > 
> > > > > > > >   if ((dev->features ^ features) & NETIF_F_GRO_HW) {
> > > > > > > >   if (vi->xdp_enabled)
> > > > > > > >   return -EBUSY;
> > > > > > > > 
> > > > > > > >   if (features & NETIF_F_GRO_HW)
> > > > > > > >   offloads = vi->guest_offloads_capable;
> > > > > > > >   else
> > > > > > > >   offloads = vi->guest_offloads_capable 
> > > > > > > > &
> > > > > > > > ~GUEST_OFFLOAD_GRO_HW_MASK;
> > > > > > > > 
> > > > > > > >   err = virtnet_set_guest_offloads(vi, 
> > > > > > > > offloads);
> > > > > > > >   if (err)
> > > > > > > >   return err;
> > > > > > > >   vi->guest_offloads = offloads;
> > > > > > > >   }
> > > > > > > > 
> > > > > > > > which I guess should have prevented virtnet_set_guest_offloads
> > > > > > > > from ever running.
> > > > > > > > 
> > > > > > > >   From your description it sounds like you have observed this
> > > > > > > > in practice, right?
> > > > > > > > 
> > > > > Yes. I have proprietary virtio-net device which advertises following
> > > > > guest offload features :
> > > > > - VIRTIO_NET_F_GUEST_CSUM
> > > > > - VIRTIO_NET_F_GUEST_TSO4
> > > > > - VIRTIO_NET_F_GUEST_TSO6
> > > > > - VIRTIO_NET_F_GUEST_UFO
> > > > > 
> > > > > This feature set passes the condition in virtnet_set_features.
> > So why isn't dev->features equal to features?
> > 
> I just double verified and found that my device advertises
> VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6 but not
> VIRTIO_NET_F_GUEST_CSUM as mentioned before.

So, your device is out of spec:

VIRTIO_NET_F_GUEST_TSO4 Requires VIRTIO_NET_F_GUEST_CSUM.

And

The device MUST NOT offer a feature which requires another feature which was 
not offered.


Is this a production device? Can it be fixed?


> That leads to the following situation :
> 
> in virtio_probe :
> 
>   if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM))
> dev->features |= NETIF_F_RXCSUM;
>   if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
>   virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6))
> dev->features |= NETIF_F_GRO_HW;
>   if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS))
> dev->hw_features |= NETIF_F_GRO_HW;
>
>
> while in netdev_fix_features :
> 
>   if (!(features & NETIF_F_RXCSUM)) {
> /* NETIF_F_GRO_HW implies doing RXCSUM since every packet
>  * successfully merged by hardware must also have the
>  * checksum verified by hardware.  If the user does not
>  * want to enable RXCSUM, logically, we should disable GRO_HW.
>  */
> if (features & NETIF_F_GRO_HW) {
>   netdev_dbg(dev, "Dropping NETIF_F_GRO_HW since no RXCSUM
> feature.\n");
>   features &= ~NETIF_F_GRO_HW;
> }
>   }
> 
> As a result, dev->features and the features passed from
> __netdev_update_features differ exactly in the NETIF_F_GRO_HW bit.
> 
> 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: virtio-net: Unpermitted usage of virtqueue before virtio driver initialization

2022-04-20 Thread Michael S. Tsirkin
On Wed, Apr 20, 2022 at 10:17:27AM +0200, Maciej Szymański wrote:
> > > > > > Hmm so we have this:
> > > > > > 
> > > > > > 
> > > > > >  if ((dev->features ^ features) & NETIF_F_GRO_HW) {
> > > > > >  if (vi->xdp_enabled)
> > > > > >  return -EBUSY;
> > > > > > 
> > > > > >  if (features & NETIF_F_GRO_HW)
> > > > > >  offloads = vi->guest_offloads_capable;
> > > > > >  else
> > > > > >  offloads = vi->guest_offloads_capable &
> > > > > > ~GUEST_OFFLOAD_GRO_HW_MASK;
> > > > > > 
> > > > > >  err = virtnet_set_guest_offloads(vi, offloads);
> > > > > >  if (err)
> > > > > >  return err;
> > > > > >  vi->guest_offloads = offloads;
> > > > > >  }
> > > > > > 
> > > > > > which I guess should have prevented virtnet_set_guest_offloads
> > > > > > from ever running.
> > > > > > 
> > > > > >  From your description it sounds like you have observed this
> > > > > > in practice, right?
> > > > > > 
> > > Yes. I have proprietary virtio-net device which advertises following
> > > guest offload features :
> > > - VIRTIO_NET_F_GUEST_CSUM
> > > - VIRTIO_NET_F_GUEST_TSO4
> > > - VIRTIO_NET_F_GUEST_TSO6
> > > - VIRTIO_NET_F_GUEST_UFO
> > > 
> > > This feature set passes the condition in virtnet_set_features.

So why isn't dev->features equal to features?

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: virtio-net: Unpermitted usage of virtqueue before virtio driver initialization

2022-04-20 Thread Michael S. Tsirkin
On Wed, Apr 20, 2022 at 11:07:00AM +0800, Jason Wang wrote:
> On Tue, Apr 19, 2022 at 11:03 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Apr 19, 2022 at 04:12:31PM +0200, Maciej Szymański wrote:
> > > Hello,
> > >
> > > I've found a problem in virtio-net driver.
> > > If virtio-net backend device advertises guest offload features, there is
> > > an unpermitted usage of control virtqueue before driver is initialized.
> > > According to VIRTIO specification 2.1.2 :
> > > "The device MUST NOT consume buffers or send any used buffer
> > > notifications to the driver before DRIVER_OK."
> >
> > Right.
> >
> > > During an initialization, driver calls register_netdevice which invokes
> > > callback function virtnet_set_features from __netdev_update_features.
> > > If guest offload features are advertised by the device,
> > > virtnet_set_guest_offloads is using virtnet_send_command to write and
> > > read from VQ.
> > > That leads to initialization stuck as device is not permitted yet to use 
> > > VQ.
> >
> >
> >
> > Hmm so we have this:
> >
> >
> > if ((dev->features ^ features) & NETIF_F_GRO_HW) {
> > if (vi->xdp_enabled)
> > return -EBUSY;
> >
> > if (features & NETIF_F_GRO_HW)
> > offloads = vi->guest_offloads_capable;
> > else
> > offloads = vi->guest_offloads_capable &
> >~GUEST_OFFLOAD_GRO_HW_MASK;
> >
> > err = virtnet_set_guest_offloads(vi, offloads);
> > if (err)
> > return err;
> > vi->guest_offloads = offloads;
> > }
> >
> > which I guess should have prevented virtnet_set_guest_offloads from ever 
> > running.
> >
> > From your description it sounds like you have observed this
> > in practice, right?
> >
> >
> >
> > > I have attached a patch for kernel 5.18-rc3 which fixes the problem by
> > > deferring feature set after virtio driver initialization.
> > >
> > > Best Regards,
> > >
> > > --
> > > Maciej Szymański
> > > Senior Staff Engineer
> > >
> > > OpenSynergy GmbH
> > > Rotherstr. 20, 10245 Berlin
> > >
> > > Phone:+49 30 60 98 54 0 -86
> > > Fax:  +49 30 60 98 54 0 -99
> > > E-Mail:   maciej.szyman...@opensynergy.com
> > >
> > > www.opensynergy.com
> > >
> > > Handelsregister/Commercial Registry: Amtsgericht Charlottenburg, HRB 
> > > 108616B
> > > Geschäftsführer/Managing Director: Regis Adjamah
> > >
> > > Please mind our privacy 
> > > notice<https://www.opensynergy.com/datenschutzerklaerung/privacy-notice-for-business-partners-pursuant-to-article-13-of-the-general-data-protection-regulation-gdpr/>
> > >  pursuant to Art. 13 GDPR. // Unsere Hinweise zum Datenschutz gem. Art. 
> > > 13 DSGVO finden Sie 
> > > hier.<https://www.opensynergy.com/de/datenschutzerklaerung/datenschutzhinweise-fuer-geschaeftspartner-gem-art-13-dsgvo/>
> >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 87838cb..a44462d 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -264,6 +264,8 @@ struct virtnet_info {
> > > unsigned long guest_offloads;
> > > unsigned long guest_offloads_capable;
> > >
> > > +   netdev_features_t features;
> > > +
> >
> > I don't much like how we are forced to keep a copy of features
> > here :( At least pls add a comment explaining what's going on,
> > who owns this etc.
> >
> > > /* failover when STANDBY feature enabled */
> > > struct failover *failover;
> > >  };
> > > @@ -2976,6 +2978,15 @@ static int virtnet_get_phys_port_name(struct 
> > > net_device *dev, char *buf,
> > >
> > >  static int virtnet_set_features(struct net_device *dev,
> > > netdev_features_t features)
> > > +{
> > > +   struct virtnet_info *vi = netdev_priv(dev);
> > > +   vi->features = features;
> > > +
> > > +   return 0;
> > > +}
> >
> >
> > Looks like this breaks changing features after initialization -
> > 

Re: virtio-net: Unpermitted usage of virtqueue before virtio driver initialization

2022-04-19 Thread Michael S. Tsirkin
On Tue, Apr 19, 2022 at 04:12:31PM +0200, Maciej Szymański wrote:
> Hello,
> 
> I've found a problem in virtio-net driver.
> If virtio-net backend device advertises guest offload features, there is
> an unpermitted usage of control virtqueue before driver is initialized.
> According to VIRTIO specification 2.1.2 :
> "The device MUST NOT consume buffers or send any used buffer
> notifications to the driver before DRIVER_OK."

Right.

> During an initialization, driver calls register_netdevice which invokes
> callback function virtnet_set_features from __netdev_update_features.
> If guest offload features are advertised by the device,
> virtnet_set_guest_offloads is using virtnet_send_command to write and
> read from VQ.
> That leads to initialization stuck as device is not permitted yet to use VQ.



Hmm so we have this:


if ((dev->features ^ features) & NETIF_F_GRO_HW) {
if (vi->xdp_enabled)
return -EBUSY;

if (features & NETIF_F_GRO_HW)
offloads = vi->guest_offloads_capable;
else
offloads = vi->guest_offloads_capable &
   ~GUEST_OFFLOAD_GRO_HW_MASK;

err = virtnet_set_guest_offloads(vi, offloads);
if (err)
return err;
vi->guest_offloads = offloads;
}

which I guess should have prevented virtnet_set_guest_offloads from ever 
running.

From your description it sounds like you have observed this
in practice, right?



> I have attached a patch for kernel 5.18-rc3 which fixes the problem by
> deferring feature set after virtio driver initialization.
> 
> Best Regards,
> 
> --
> Maciej Szymański
> Senior Staff Engineer
> 
> OpenSynergy GmbH
> Rotherstr. 20, 10245 Berlin
> 
> Phone:+49 30 60 98 54 0 -86
> Fax:  +49 30 60 98 54 0 -99
> E-Mail:   maciej.szyman...@opensynergy.com
> 
> www.opensynergy.com
> 
> Handelsregister/Commercial Registry: Amtsgericht Charlottenburg, HRB 108616B
> Geschäftsführer/Managing Director: Regis Adjamah
> 

> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 87838cb..a44462d 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -264,6 +264,8 @@ struct virtnet_info {
> unsigned long guest_offloads;
> unsigned long guest_offloads_capable;
>  
> +   netdev_features_t features;
> +

I don't much like how we are forced to keep a copy of features
here :( At least pls add a comment explaining what's going on,
who owns this etc.

> /* failover when STANDBY feature enabled */
> struct failover *failover;
>  };
> @@ -2976,6 +2978,15 @@ static int virtnet_get_phys_port_name(struct 
> net_device *dev, char *buf,
>  
>  static int virtnet_set_features(struct net_device *dev,
> netdev_features_t features)
> +{
> +   struct virtnet_info *vi = netdev_priv(dev);
> +   vi->features = features;
> +
> +   return 0;
> +}


Looks like this breaks changing features after initialization -
these will never be propagated to hardware now.

> +
> +static int virtnet_set_features_deferred(struct net_device *dev,
> +   netdev_features_t features)
>  {
> struct virtnet_info *vi = netdev_priv(dev);
> u64 offloads;
> @@ -3644,6 +3655,13 @@ static int virtnet_probe(struct virtio_device *vdev)
>  
> virtio_device_ready(vdev);
>  
> +   /* Deferred feature set after device ready */
> +   err = virtnet_set_features_deferred(dev, vi->features);


It seems that if this is called e.g. for a device without a CVQ and
there are things that actually need to change then it will BUG_ON.


> +   if (err) {
> +   pr_debug("virtio_net: set features failed\n");
> +   goto free_unregister_netdev;
> +   }
> +
> err = virtnet_cpu_notif_add(vi);
> if (err) {
> pr_debug("virtio_net: registering cpu notifier failed\n");
> 

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH net 1/2] net/af_packet: adjust network header position for VLAN tagged packets

2022-04-19 Thread Michael S. Tsirkin
On Tue, Apr 19, 2022 at 09:56:02AM -0400, Willem de Bruijn wrote:
> > >
> > > We should also maintain feature consistency between packet_snd,
> > > tpacket_snd and to the limitations of its feature set to
> > > packet_sendmsg_spkt. The no_fcs is already lacking in tpacket_snd as
> > > far as I can tell. But packet_sendmsg_spkt also sets it and calls
> > > packet_parse_headers.
> >
> > Yes, I think we could fix the tpacket_snd() in another patch.
> >
> > There are also some duplicated codes in these *_snd functions.
> > I think we can move them out to one single function.
> 
> Please don't refactor this code. It will complicate future backports
> of stable fixes.

Hmm, I don't know offhand which duplication this refers to specifically,
so maybe it's not worth addressing here, but generally, not
cleaning up code just because of backports seems wrong ...

> > > Because this patch touches many other packets besides the ones
> > > intended, I am a bit concerned about unintended consequences. Perhaps
> >
> > Yes, makes sense.
> >
> > > stretching the definition of the flags to include VLAN is acceptable
> > > (unlike outright tunnels), but even then I would suggest for net-next.
> >
> > As I asked, I'm not familiar with virtio code. Do you think if I should
> > add a new VIRTIO_NET_HDR_GSO_VLAN flag? It's only a L2 flag without any L3
> > info. If I add something like VIRTIO_NET_HDR_GSO_VLAN_TCPV4/TCPV6/UDP. That
> > would add more combinations. Which doesn't like a good idea.
> 
> I would prefer a new flag to denote this type, so that we can be
> strict and only change the datapath for packets that have this flag
> set (and thus express the intent).
> 
> But the VIRTIO_NET_HDR types are defined in the virtio spec. The
> maintainers should probably chime in.

Yes, it's a UAPI extension, not to be done lightly. In this case IIUC
gso_type in the header is only u8 - 8 bits and 5 of these are already
used.  So I don't think the virtio TC will be all that happy to burn up
a bit unless a clear benefit can be demonstrated. 

I agree with the net-next proposal; I think it's more a feature than a
bugfix. In particular I think the Fixes tag can also be dropped, given that
IIUC GSO for VLAN packets didn't work even before that commit - right?

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: Re: [PATCH 1/4] virtio-crypto: wait ctrl queue instead of busy polling

2022-04-15 Thread Michael S. Tsirkin
On Fri, Apr 15, 2022 at 06:50:19PM +0800, zhenwei pi wrote:
> On 4/15/22 16:41, Michael S. Tsirkin wrote:
> > > diff --git a/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c 
> > > b/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> > > index f3ec9420215e..bf7c1aa4be37 100644
> > > --- a/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> > > +++ b/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> > > @@ -102,107 +102,100 @@ static int 
> > > virtio_crypto_alg_akcipher_init_session(struct virtio_crypto_akcipher
> > >   {
> > >   struct scatterlist outhdr_sg, key_sg, inhdr_sg, *sgs[3];
> > >   struct virtio_crypto *vcrypto = ctx->vcrypto;
> > > + struct virtio_crypto_ctrl_request *vc_ctrl_req = NULL;
> > 
> > this is initialized down the road, I think you can skip = NULL here.
> > 
> OK.
> > >   uint8_t *pkey;
> > > - unsigned int inlen;
> > > - int err;
> > > + int err = -ENOMEM;
> > 
> > I would assign this in the single case where this value is used.
> > 
> OK
> > >   unsigned int num_out = 0, num_in = 0;
> > > + int node = dev_to_node(>vdev->dev);
> > are you sure it is
> > better to allocate close to device and not to current node
> > which is the default?
> > 
> Also with this part:
>  /* Internal representation of a data virtqueue */
> @@ -65,11 +66,6 @@ struct virtio_crypto {
>   /* Maximum size of per request */
>   u64 max_size;
> 
> - /* Control VQ buffers: protected by the ctrl_lock */
> - struct virtio_crypto_op_ctrl_req ctrl;
> - struct virtio_crypto_session_input input;
> - struct virtio_crypto_inhdr ctrl_status;
> -
>   unsigned long status;
>   atomic_t ref_count;
> 
> Orignally virtio crypto driver allocates ctrl_status per-device,
> and protects this with ctrl_lock. This is the reason why the control queue
> reaches the bottleneck of performance. I'll append this in the next version
> in commit message.
> 
> Instead of the single request buffer, declare struct
> virtio_crypto_ctrl_request {
> struct virtio_crypto_op_ctrl_req ctrl;
> struct virtio_crypto_session_input input;
> struct virtio_crypto_inhdr ctrl_status;
>   ... }
> 
> The motivation of this change is to allocate buffer from the same node with
> device during control queue operations.

But are you sure it's a win? Quite possibly it's a win to
have it close to the driver, not close to the device.
This kind of change is really best done separately, with some
testing showing it's a win. If that is too much to ask,
make it a separate patch and add some analysis explaining
why the device accesses the structure more than the driver does.
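To spell out the two options being compared (illustrative only):

    /* allocate near the device's NUMA node ... */
    vc_ctrl_req = kzalloc_node(sizeof(*vc_ctrl_req), GFP_KERNEL,
                               dev_to_node(&vcrypto->vdev->dev));

    /* ... versus the default, which allocates on the node of the CPU
     * currently running the driver code:
     */
    vc_ctrl_req = kzalloc(sizeof(*vc_ctrl_req), GFP_KERNEL);

Which one wins depends on who touches the buffer more, the CPU building
the request or the device DMAing into it - hence the request for numbers.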


> > 
> > >   pkey = kmemdup(key, keylen, GFP_ATOMIC);
> > >   if (!pkey)
> > >   return -ENOMEM;
> > > - spin_lock(>ctrl_lock);
> > > - memcpy(>ctrl.header, header, sizeof(vcrypto->ctrl.header));
> > > - memcpy(>ctrl.u, para, sizeof(vcrypto->ctrl.u));
> > > - vcrypto->input.status = cpu_to_le32(VIRTIO_CRYPTO_ERR);
> > > + vc_ctrl_req = kzalloc_node(sizeof(*vc_ctrl_req), GFP_KERNEL, node);
> > > + if (!vc_ctrl_req)
> > > + goto out;
> > 
> > do you need to allocate it with kzalloc?
> > is anything wrong with just keeping it part of device?
> > even if yes this change is better split in a separate patch, would make the 
> > patch smaller.
> Because there are padding field in
> virtio_crypto_op_ctrl_req_crypto_session_input, I suppose the
> original version also needs to clear padding field.
> So I use kzalloc to make sure that the padding field gets cleared.
> If this is reasonable, to separate this patch is OK to me, or I append this
> reason into commit message and comments in code.

Not sure I understand. Maybe add a code comment explaining
what is cleared and why.

> > > +
> > > +void virtcrypto_ctrlq_callback(struct virtqueue *vq)
> > > +{
> > > + struct virtio_crypto *vcrypto = vq->vdev->priv;
> > > + struct virtio_crypto_ctrl_request *vc_ctrl_req;
> > > + unsigned long flags;
> > > + unsigned int len;
> > > +
> > > + spin_lock_irqsave(>ctrl_lock, flags);
> > > + do {
> > > + virtqueue_disable_cb(vq);
> > > + while ((vc_ctrl_req = virtqueue_get_buf(vq, )) != NULL) {
> > 
> > 
> > you really need to break out of this loop if vq is broken,
> > virtqueue_get_buf will keep returning NULL in this case.
> > 
> I'm a little confused here, if vir

Re: [PATCH 3/3] virtio-pci: Use cpumask_available to fix compilation error

2022-04-15 Thread Michael S. Tsirkin
On Thu, Apr 14, 2022 at 05:08:55PM +0200, Christophe de Dinechin wrote:
> With GCC 12 and defconfig, we get the following error:
> 
> |   CC  drivers/virtio/virtio_pci_common.o
> | drivers/virtio/virtio_pci_common.c: In function ‘vp_del_vqs’:
> | drivers/virtio/virtio_pci_common.c:257:29: error: the comparison will
> |  always evaluate as ‘true’ for the pointer operand in
> |  ‘vp_dev->msix_affinity_masks + (sizetype)((long unsigned int)i * 8)’
> |  must not be NULL [-Werror=address]
> |   257 | if (vp_dev->msix_affinity_masks[i])
> |   | ^~
> 
> This happens in the case where CONFIG_CPUMASK_OFFSTACK is not defined,
> since we typedef cpumask_var_t as an array. The compiler is essentially
> complaining that an array pointer cannot be NULL. This is not a very
> important warning, but there is a function called cpumask_available that
> seems to be defined just for that case, so the fix is easy.
> 
> Signed-off-by: Christophe de Dinechin 
> Signed-off-by: Christophe de Dinechin 

There was an alternate patch proposed for this by
Murilo Opsfelder Araujo. What do you think about that approach?


> ---
>  drivers/virtio/virtio_pci_common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/virtio/virtio_pci_common.c 
> b/drivers/virtio/virtio_pci_common.c
> index d724f676608b..5c44a2f13c93 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -254,7 +254,7 @@ void vp_del_vqs(struct virtio_device *vdev)
>  
>   if (vp_dev->msix_affinity_masks) {
>   for (i = 0; i < vp_dev->msix_vectors; i++)
> - if (vp_dev->msix_affinity_masks[i])
> + if (cpumask_available(vp_dev->msix_affinity_masks[i]))
>   
> free_cpumask_var(vp_dev->msix_affinity_masks[i]);
>   }
>  
> -- 
> 2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH 0/6] virtio: Solution to restrict memory access under Xen using xen-virtio DMA ops layer

2022-04-15 Thread Michael S. Tsirkin
On Thu, Apr 14, 2022 at 10:19:27PM +0300, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko 
> 
> Hello all.
> 
> The purpose of this RFC patch series is to add support for restricting memory 
> access under Xen using specific
> grant table based DMA ops layer. Patch series is based on Juergen Gross’ 
> initial work [1] which implies using
> grant references instead of raw guest physical addresses (GPA) for the virtio 
> communications (some kind of
> the software IOMMU).
> 
> The high level idea is to create new Xen’s grant table based DMA ops layer 
> for the guest Linux whose main
> purpose is to provide a special 64-bit DMA address which is formed by using 
> the grant reference (for a page
> to be shared with the backend) with offset and setting the highest address 
> bit (this is for the backend to
> be able to distinguish grant ref based DMA address from normal GPA). For this 
> to work we need the ability
> to allocate contiguous (consecutive) grant references for multi-page 
> allocations. And the backend then needs
> to offer VIRTIO_F_ACCESS_PLATFORM and VIRTIO_F_VERSION_1 feature bits (it 
> must support virtio-mmio modern
> transport for 64-bit addresses in the virtqueue).

I'm not enough of a xen expert to review this, and I didn't get
all patches, but I'm very happy to see that approach being
taken. VIRTIO_F_ACCESS_PLATFORM and VIRTIO_F_VERSION_1 are
exactly the way to declare not all of memory is accessible.
Thanks!
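(Purely as an illustration for readers of the archive: the address encoding
described above amounts to roughly the following. The macro and function
names are made up here and are not taken from the series.)

/* High bit tells the backend "grant reference, not a guest physical address". */
#define GRANT_DMA_ADDR_FLAG	(1ULL << 63)

static dma_addr_t grant_ref_to_dma(grant_ref_t ref, unsigned long offset)
{
	/* The grant ref selects the shared page, the low bits keep the offset. */
	return GRANT_DMA_ADDR_FLAG |
	       ((dma_addr_t)ref << PAGE_SHIFT) | (offset & ~PAGE_MASK);
}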

> Xen's grant mapping mechanism is the secure and safe solution to share pages 
> between domains which proven
> to work and works for years (in the context of traditional Xen PV drivers for 
> example). So far, the foreign
> mapping is used for the virtio backend to map and access guest memory. With 
> the foreign mapping, the backend
> is able to map arbitrary pages from the guest memory (or even from Dom0 
> memory). And as the result, the malicious
> backend which runs in a non-trusted domain can take advantage of this. 
> Instead, with the grant mapping
> the backend is only allowed to map pages which were explicitly granted by the 
> guest before and nothing else. 
> According to the discussions in various mainline threads this solution would 
> likely be welcome because it
> perfectly fits in the security model Xen provides. 
> 
> What is more, the grant table based solution requires zero changes to the Xen 
> hypervisor itself at least
> with virtio-mmio and DT (in comparison, for example, with "foreign mapping + 
> virtio-iommu" solution which would
> require the whole new complex emulator in hypervisor in addition to new 
> functionality/hypercall to pass IOVA
> from the virtio backend running elsewhere to the hypervisor and translate it 
> to the GPA before mapping into
> P2M or denying the foreign mapping request if no corresponding IOVA-GPA 
> mapping present in the IOMMU page table
> for that particular device). We only need to update toolstack to insert a new 
> "xen,dev-domid" property to
> the virtio-mmio device node when creating a guest device-tree (this is an 
> indicator for the guest to use grants
> and the ID of Xen domain where the corresponding backend resides, it is used 
> as an argument to the grant mapping
> APIs). It worth mentioning that toolstack patch is based on non  upstreamed 
> yet “Virtio support for toolstack
> on Arm” series which is on review now [2].
> 
> Please note the following:
> - Patch series only covers Arm and virtio-mmio (device-tree) for now. To 
> enable the restricted memory access
>   feature on Arm the following options should be set:
>   CONFIG_XEN_VIRTIO = y
>   CONFIG_XEN_HVM_VIRTIO_GRANT = y
> - Some callbacks in xen-virtio DMA ops layer (map_sg/unmap_sg, etc) are not 
> implemented yet as they are not
>   needed/used in the first prototype
> 
> Patch series is rebased on Linux 5.18-rc2 tag and tested on Renesas 
> Salvator-X board + H3 ES3.0 SoC (Arm64)
> with standalone userspace (non-Qemu) virtio-mmio based virtio-disk backend 
> running in Driver domain and Linux
> guest running on existing virtio-blk driver (frontend). No issues were 
> observed. Guest domain 'reboot/destroy'
> use-cases work properly. I have also tested other use-cases such as assigning 
> several virtio block devices
> or a mix of virtio and Xen PV block devices to the guest. 
> 
> 1. Xen changes located at (last patch):
> https://github.com/otyshchenko1/xen/commits/libxl_virtio_next
> 2. Linux changes located at:
> https://github.com/otyshchenko1/linux/commits/virtio_grant5
> 3. virtio-disk changes located at:
> https://github.com/otyshchenko1/virtio-disk/commits/virtio_grant
> 
> Any feedback/help would be highly appreciated.
> 
> [1] https://www.youtube.com/watch?v=IrlEdaIUDPk
> [2] 
> https://lore.kernel.org/xen-devel/1649442065-8332-1-git-send-email-olekst...@gmail.com/
> 
> Juergen Gross (2):
>   xen/grants: support allocating consecutive grants
>   virtio: add option to restrict memory access under Xen
> 
> Oleksandr Tyshchenko 

Re: [PATCH 1/4] virtio-crypto: wait ctrl queue instead of busy polling

2022-04-15 Thread Michael S. Tsirkin
On Fri, Apr 15, 2022 at 02:41:33PM +0800, zhenwei pi wrote:
> Originally, after submitting request into virtio crypto control
> queue, the guest side polls the result from the virt queue. There
> are two problems:
> 1, The queue depth is always 1, the performance of a virtio crypto
>device gets limited. Multi user processes share a single control
>queue, and hit spin lock race from control queue. Test on Intel
>Platinum 8260, a single worker gets ~35K/s create/close session
>operations, and 8 workers get ~40K/s operations with 800% CPU
>utilization.
> 2, The control request is supposed to get handled immediately, but
>in the current implementation of QEMU(v6.2), the vCPU thread kicks
>another thread to do this work, the latency also gets unstable.
>Tracking latency of virtio_crypto_alg_akcipher_close_session in 5s:
> usecs           : count
> 0 -> 1          : 0
> 2 -> 3          : 7
> 4 -> 7          : 72
> 8 -> 15         : 186485
> 16 -> 31        : 687
> 32 -> 63        : 5
> 64 -> 127       : 3
> 128 -> 255      : 1
> 256 -> 511      : 0
> 512 -> 1023     : 0
> 1024 -> 2047    : 0
> 2048 -> 4095    : 0
> 4096 -> 8191    : 0
> 8192 -> 16383   : 2
> 
> To improve the performance of control queue, wait completion instead
> of busy polling without lock race, and get completed by control queue
> callback. Test this patch, the guest side get ~200K/s operations with
> 300% CPU utilization.
> 
> Signed-off-by: zhenwei pi 
> ---
>  drivers/crypto/virtio/Makefile|   1 +
>  .../virtio/virtio_crypto_akcipher_algs.c  |  87 ++--
>  drivers/crypto/virtio/virtio_crypto_common.c  |  61 
>  drivers/crypto/virtio/virtio_crypto_common.h  |  23 ++-
>  drivers/crypto/virtio/virtio_crypto_core.c|   2 +-
>  .../virtio/virtio_crypto_skcipher_algs.c  | 134 --
>  6 files changed, 183 insertions(+), 125 deletions(-)
>  create mode 100644 drivers/crypto/virtio/virtio_crypto_common.c
> 
> diff --git a/drivers/crypto/virtio/Makefile b/drivers/crypto/virtio/Makefile
> index bfa6cbae342e..49c1fa80e465 100644
> --- a/drivers/crypto/virtio/Makefile
> +++ b/drivers/crypto/virtio/Makefile
> @@ -3,5 +3,6 @@ obj-$(CONFIG_CRYPTO_DEV_VIRTIO) += virtio_crypto.o
>  virtio_crypto-objs := \
>   virtio_crypto_skcipher_algs.o \
>   virtio_crypto_akcipher_algs.o \
> + virtio_crypto_common.o \
>   virtio_crypto_mgr.o \
>   virtio_crypto_core.o
> diff --git a/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c 
> b/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> index f3ec9420215e..bf7c1aa4be37 100644
> --- a/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> +++ b/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> @@ -102,107 +102,100 @@ static int 
> virtio_crypto_alg_akcipher_init_session(struct virtio_crypto_akcipher
>  {
>   struct scatterlist outhdr_sg, key_sg, inhdr_sg, *sgs[3];
>   struct virtio_crypto *vcrypto = ctx->vcrypto;
> + struct virtio_crypto_ctrl_request *vc_ctrl_req = NULL;

this is initialized down the road, I think you can skip = NULL here.

>   uint8_t *pkey;
> - unsigned int inlen;
> - int err;
> + int err = -ENOMEM;

I would assign this in the single case where this value is used.

>   unsigned int num_out = 0, num_in = 0;
> + int node = dev_to_node(>vdev->dev);
>  
are you sure it is
better to allocate close to device and not to current node
which is the default?


>   pkey = kmemdup(key, keylen, GFP_ATOMIC);
>   if (!pkey)
>   return -ENOMEM;
>  
> - spin_lock(>ctrl_lock);
> - memcpy(>ctrl.header, header, sizeof(vcrypto->ctrl.header));
> - memcpy(>ctrl.u, para, sizeof(vcrypto->ctrl.u));
> - vcrypto->input.status = cpu_to_le32(VIRTIO_CRYPTO_ERR);
> + vc_ctrl_req = kzalloc_node(sizeof(*vc_ctrl_req), GFP_KERNEL, node);
> + if (!vc_ctrl_req)
> + goto out;

do you need to allocate it with kzalloc?
is anything wrong with just keeping it part of device?
even if yes this change is better split in a separate patch, would make the 
patch smaller.
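(As an aside: independent of where the request struct lives, the
wait-instead-of-poll flow the changelog describes is essentially the standard
completion pattern below. A sketch only, using the generic completion API,
not the patch's exact code.)

	struct completion done;

	init_completion(&done);

	/* submit the control request and kick the device */
	err = virtqueue_add_sgs(vq, sgs, num_out, num_in, req, GFP_ATOMIC);
	if (!err)
		virtqueue_kick(vq);

	/*
	 * Sleep until the ctrl queue callback calls complete(&done) when
	 * the device has consumed the request, instead of busy polling.
	 */
	wait_for_completion(&done);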



>  
> - sg_init_one(_sg, >ctrl, sizeof(vcrypto->ctrl));
> + memcpy(_ctrl_req->ctrl.header, header, 
> sizeof(vc_ctrl_req->ctrl.header));
> + memcpy(_ctrl_req->ctrl.u, para, sizeof(vc_ctrl_req->ctrl.u));
> + sg_init_one(_sg, _ctrl_req->ctrl, 

Re: [PATCH] vDPA/ifcvf: allow userspace to suspend a queue

2022-04-13 Thread Michael S. Tsirkin
On Wed, Apr 13, 2022 at 04:25:22PM +0800, Jason Wang wrote:
> On Mon, Apr 11, 2022 at 11:18 AM Zhu Lingshan  wrote:
> >
> > Formerly, ifcvf driver has implemented a lazy-initialization mechanism
> > for the virtqueues, it would store all virtqueue config fields that
> > passed down from the userspace, then load them to the virtqueues and
> > enable the queues upon DRIVER_OK.
> >
> > To allow the userspace to suspend a virtqueue,
> > this commit passes queue_enable to the virtqueue directly through
> > set_vq_ready().
> >
> > This feature requires, and this commit implements, all virtqueue
> > ops (set_vq_addr, set_vq_num and set_vq_ready) taking immediate
> > action rather than lazy initialization, so ifcvf_hw_enable() is retired.
> >
> > To avoid losing virtqueue configurations caused by multiple
> > rounds of reset(), this commit also refactors the device reset
> > routine: now it simply resets the config handler and the virtqueues,
> > and calls device-reset() only once.
> >
> > Signed-off-by: Zhu Lingshan 
> > ---
> >  drivers/vdpa/ifcvf/ifcvf_base.c | 94 -
> >  drivers/vdpa/ifcvf/ifcvf_base.h | 11 ++--
> >  drivers/vdpa/ifcvf/ifcvf_main.c | 57 +---
> >  3 files changed, 75 insertions(+), 87 deletions(-)
> >
> > diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c 
> > b/drivers/vdpa/ifcvf/ifcvf_base.c
> > index 48c4dadb0c7c..19eb0dcac123 100644
> > --- a/drivers/vdpa/ifcvf/ifcvf_base.c
> > +++ b/drivers/vdpa/ifcvf/ifcvf_base.c
> > @@ -175,16 +175,12 @@ u8 ifcvf_get_status(struct ifcvf_hw *hw)
> >  void ifcvf_set_status(struct ifcvf_hw *hw, u8 status)
> >  {
> > vp_iowrite8(status, >common_cfg->device_status);
> > +   vp_ioread8(>common_cfg->device_status);
> 
> This looks confusing, the name of the function is to set the status
> but what actually implemented here is to get the status.
> 
> >  }
> >
> >  void ifcvf_reset(struct ifcvf_hw *hw)
> >  {
> > -   hw->config_cb.callback = NULL;
> > -   hw->config_cb.private = NULL;
> > -
> > ifcvf_set_status(hw, 0);
> > -   /* flush set_status, make sure VF is stopped, reset */
> > -   ifcvf_get_status(hw);
> >  }
> >
> >  static void ifcvf_add_status(struct ifcvf_hw *hw, u8 status)
> > @@ -331,68 +327,94 @@ int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, 
> > u16 num)
> > ifcvf_lm = (struct ifcvf_lm_cfg __iomem *)hw->lm_cfg;
> > q_pair_id = qid / hw->nr_vring;
> > avail_idx_addr = _lm->vring_lm_cfg[q_pair_id].idx_addr[qid % 
> > 2];
> > -   hw->vring[qid].last_avail_idx = num;
> > vp_iowrite16(num, avail_idx_addr);
> > +   vp_ioread16(avail_idx_addr);
> 
> This looks like a bug fix.

is this to flush out the status write?  pls add a comment
explaining when and why it's needed.
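Something like this, perhaps (the comment wording is only a suggestion):

	vp_iowrite16(num, avail_idx_addr);
	/*
	 * The write above may be posted; read the register back so it is
	 * known to have reached the device before we return success to
	 * the caller.
	 */
	vp_ioread16(avail_idx_addr);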

> >
> > return 0;
> >  }
> >
> > -static int ifcvf_hw_enable(struct ifcvf_hw *hw)
> > +void ifcvf_set_vq_num(struct ifcvf_hw *hw, u16 qid, u32 num)
> >  {
> > -   struct virtio_pci_common_cfg __iomem *cfg;
> > -   u32 i;
> > +   struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
> >
> > -   cfg = hw->common_cfg;
> > -   for (i = 0; i < hw->nr_vring; i++) {
> > -   if (!hw->vring[i].ready)
> > -   break;
> > +   vp_iowrite16(qid, >queue_select);
> > +   vp_iowrite16(num, >queue_size);
> > +   vp_ioread16(>queue_size);
> > +}
> >
> > -   vp_iowrite16(i, >queue_select);
> > -   vp_iowrite64_twopart(hw->vring[i].desc, >queue_desc_lo,
> > ->queue_desc_hi);
> > -   vp_iowrite64_twopart(hw->vring[i].avail, 
> > >queue_avail_lo,
> > - >queue_avail_hi);
> > -   vp_iowrite64_twopart(hw->vring[i].used, >queue_used_lo,
> > ->queue_used_hi);
> > -   vp_iowrite16(hw->vring[i].size, >queue_size);
> > -   ifcvf_set_vq_state(hw, i, hw->vring[i].last_avail_idx);
> > -   vp_iowrite16(1, >queue_enable);
> > -   }
> > +int ifcvf_set_vq_address(struct ifcvf_hw *hw, u16 qid, u64 desc_area,
> > +u64 driver_area, u64 device_area)
> > +{
> > +   struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
> > +
> > +   vp_iowrite16(qid, >queue_select);
> > +   vp_iowrite64_twopart(desc_area, >queue_desc_lo,
> > +>queue_desc_hi);
> > +   vp_iowrite64_twopart(driver_area, >queue_avail_lo,
> > +>queue_avail_hi);
> > +   vp_iowrite64_twopart(device_area, >queue_used_lo,
> > +>queue_used_hi);
> > +   /* to flush IO */
> > +   vp_ioread16(>queue_select);
> 
> Why do we need to flush I/O here?
> 
> >
> > return 0;
> >  }
> >
> > -static void ifcvf_hw_disable(struct ifcvf_hw *hw)
> > +void ifcvf_set_vq_ready(struct ifcvf_hw *hw, u16 qid, bool ready)
> >  {
> > -   u32 i;
> > +   struct 

Re:

2022-04-12 Thread Michael S. Tsirkin
On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> On Mon, Mar 28 2022 at 06:40, Michael S. Tsirkin wrote:
> > On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> >> > > So I think we might talk different issues:
> >> > >
> >> > > 1) Whether request_irq() commits the previous setups, I think the
> >> > > answer is yes, since the spin_unlock of desc->lock (release) can
> >> > > guarantee this though there seems no documentation around
> >> > > request_irq() to say this.
> >> > >
> >> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> >> > > using smp_wmb() before the request_irq().
> 
> That's a complete bogus example especially as there is not a single
> smp_rmb() which pairs with the smp_wmb().
> 
> >> > > And even if write is ordered we still need read to be ordered to be
> >> > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they be?
> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
> 
> There is nothing to fix:
> 
> request_irq()
>raw_spin_lock_irq(desc->lock);   // ACQUIRE
>
>raw_spin_unlock_irq(desc->lock); // RELEASE
> 
> interrupt()
>raw_spin_lock(desc->lock);   // ACQUIRE
>set status to IN_PROGRESS
>raw_spin_unlock(desc->lock); // RELEASE
>invoke handler()
> 
> So anything which the driver set up _before_ request_irq() is visible to
> the interrupt handler. No?
> 
> >> What happens if an interrupt is raised in the middle like:
> >> 
> >> smp_store_release(dev->irq_soft_enabled, true)
> >> IRQ handler
> >> synchornize_irq()
> 
> This is bogus. The obvious order of things is:
> 
> dev->ok = false;
> request_irq();
> 
> moar_setup();
> synchronize_irq();  // ACQUIRE + RELEASE
> dev->ok = true;
> 
> The reverse operation on teardown:
> 
> dev->ok = false;
> synchronize_irq();  // ACQUIRE + RELEASE
> 
> teardown();
> 
> So in both cases a simple check in the handler is sufficient:
> 
> handler()
> if (!dev->ok)
>   return;

Does this need to be if (!READ_ONCE(dev->ok)) ?
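i.e., spelled out, a sketch of the pattern under discussion (not a concrete
patch; mydev and the handler are just placeholder names):

struct mydev {
	bool ok;
	/* ... driver state ... */
};

static irqreturn_t mydev_interrupt(int irq, void *data)
{
	struct mydev *dev = data;

	if (!READ_ONCE(dev->ok))	/* pairs with WRITE_ONCE() in setup */
		return IRQ_NONE;

	/* ... normal handling ... */
	return IRQ_HANDLED;
}

	/* setup side */
	dev->ok = false;
	request_irq(irq, mydev_interrupt, 0, "mydev", dev);
	/* moar_setup(); */
	synchronize_irq(irq);		/* ACQUIRE + RELEASE */
	WRITE_ONCE(dev->ok, true);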



> I'm not understanding what you folks are trying to "fix" here. If any
> driver does this in the wrong order, then the driver is broken.
> 
> Sure, you can do the same with:
> 
> dev->ok = false;
> request_irq();
> moar_setup();
> smp_wmb();
> dev->ok = true;
> 
> for the price of a smp_rmb() in the interrupt handler:
> 
> handler()
> if (!dev->ok)
>   return;
> smp_rmb();
> 
> but that's only working for the setup case correctly and not for
> teardown.
> 
> Thanks,
> 
> tglx

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 4/5] virtio-pci: implement synchronize_vqs()

2022-04-11 Thread Michael S. Tsirkin
On Mon, Apr 11, 2022 at 04:22:19PM +0800, Jason Wang wrote:
> On Sun, Apr 10, 2022 at 3:51 PM Michael S. Tsirkin  wrote:
> >
> > On Fri, Apr 08, 2022 at 03:03:07PM +0200, Halil Pasic wrote:
> > > On Wed, 06 Apr 2022 15:04:32 +0200
> > > Cornelia Huck  wrote:
> > >
> > > > On Wed, Apr 06 2022, "Michael S. Tsirkin"  wrote:
> > > >
> > > > > On Wed, Apr 06, 2022 at 04:35:37PM +0800, Jason Wang wrote:
> > > > >> This patch implements PCI version of synchronize_vqs().
> > > > >>
> > > > >> Cc: Thomas Gleixner 
> > > > >> Cc: Peter Zijlstra 
> > > > >> Cc: "Paul E. McKenney" 
> > > > >> Cc: Marc Zyngier 
> > > > >> Signed-off-by: Jason Wang 
> > > > >
> > > > > Please add implementations at least for ccw and mmio.
> > > >
> > > > I'm not sure what (if anything) can/should be done for ccw...
> > >
> > > If nothing needs to be done I would like to have at least a comment in
> > > the code that explains why. So that somebody who reads the code
> > > doesn't wonder: why is virtio-ccw not implementing that callback.
> >
> > Right.
> >
> > I am currently thinking instead of making this optional in the
> > core we should make it mandatory, and have transports which do not
> > need to sync have an empty stub with documentation explaining why.
> >
> > Also, do we want to document this sync is explicitly for irq enable/disable?
> > synchronize_irq_enable_disable?
> 
> I would not since the transport is not guaranteed to use an interrupt
> for callbacks.

OK, but let's then document this in more detail.
More readers will wonder what the callback is
trying to accomplish, and Halil requested that as well.

For example, let's document why sync is required on enable.

> >
> >
> > > >
> > > > >
> > > > >> ---
> > > > >>  drivers/virtio/virtio_pci_common.c | 14 ++
> > > > >>  drivers/virtio/virtio_pci_common.h |  2 ++
> > > > >>  drivers/virtio/virtio_pci_legacy.c |  1 +
> > > > >>  drivers/virtio/virtio_pci_modern.c |  2 ++
> > > > >>  4 files changed, 19 insertions(+)
> > > > >>
> > > > >> diff --git a/drivers/virtio/virtio_pci_common.c 
> > > > >> b/drivers/virtio/virtio_pci_common.c
> > > > >> index d724f676608b..b78c8bc93a97 100644
> > > > >> --- a/drivers/virtio/virtio_pci_common.c
> > > > >> +++ b/drivers/virtio/virtio_pci_common.c
> > > > >> @@ -37,6 +37,20 @@ void vp_synchronize_vectors(struct virtio_device 
> > > > >> *vdev)
> > > > >>  synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
> > > > >>  }
> > > > >>
> > > > >> +void vp_synchronize_vqs(struct virtio_device *vdev)
> > > > >> +{
> > > > >> +struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> > > > >> +int i;
> > > > >> +
> > > > >> +if (vp_dev->intx_enabled) {
> > > > >> +synchronize_irq(vp_dev->pci_dev->irq);
> > > > >> +return;
> > > > >> +}
> > > > >> +
> > > > >> +for (i = 0; i < vp_dev->msix_vectors; ++i)
> > > > >> +synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
> > > > >> +}
> > > > >> +
> > > >
> > > > ...given that this seems to synchronize threaded interrupt handlers?
> > > > Halil, do you think ccw needs to do anything? (AFAICS, we only have one
> > > > 'irq' for channel devices anyway, and the handler just calls the
> > > > relevant callbacks directly.)
> > >
> > > Sorry I don't understand enough yet. A more verbose documentation on
> > > "virtio_synchronize_vqs - synchronize with virtqueue callbacks" would
> > > surely benefit me. It may be more than enough for a back-belt but it
> > > ain't enough for me to tell what is the callback supposed to accomplish.
> > >
> > > I will have to study this discussion and the code more thoroughly.
> > > Tentatively I side with Jason and Michael in a sense, that I don't
> > > believe virtio-ccw is safe against rough interrupts.
> 
> That's my feeling as well.
> 
> Thanks
> 
> > >
> > > Sorry for the late response. I intend to revisit this on Monday. If
> > > I don't please feel encouraged to ping.
> > >
> > > Regards,
> > > Halil
> >

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 4/5] virtio-pci: implement synchronize_vqs()

2022-04-10 Thread Michael S. Tsirkin
On Fri, Apr 08, 2022 at 03:03:07PM +0200, Halil Pasic wrote:
> On Wed, 06 Apr 2022 15:04:32 +0200
> Cornelia Huck  wrote:
> 
> > On Wed, Apr 06 2022, "Michael S. Tsirkin"  wrote:
> > 
> > > On Wed, Apr 06, 2022 at 04:35:37PM +0800, Jason Wang wrote:  
> > >> This patch implements PCI version of synchronize_vqs().
> > >> 
> > >> Cc: Thomas Gleixner 
> > >> Cc: Peter Zijlstra 
> > >> Cc: "Paul E. McKenney" 
> > >> Cc: Marc Zyngier 
> > >> Signed-off-by: Jason Wang   
> > >
> > > Please add implementations at least for ccw and mmio.  
> > 
> > I'm not sure what (if anything) can/should be done for ccw...
> 
> If nothing needs to be done I would like to have at least a comment in
> the code that explains why. So that somebody who reads the code
> doesn't wonder: why is virtio-ccw not implementing that callback.

Right.

I am currently thinking instead of making this optional in the
core we should make it mandatory, and have transports which do not
need to sync have an empty stub with documentation explaining why.

Also, do we want to document this sync is explicitly for irq enable/disable?
synchronize_irq_enable_disable?
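Concretely, for a transport that turns out not to need it, the empty stub
could look like the sketch below (hypothetical: the name is invented, and
whether ccw really needs nothing here is exactly the open question):

static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
{
	/*
	 * Intentionally empty.  Document here why callbacks for this
	 * transport cannot race with the caller, or implement a real
	 * synchronization if they can.
	 */
}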


> > 
> > >  
> > >> ---
> > >>  drivers/virtio/virtio_pci_common.c | 14 ++
> > >>  drivers/virtio/virtio_pci_common.h |  2 ++
> > >>  drivers/virtio/virtio_pci_legacy.c |  1 +
> > >>  drivers/virtio/virtio_pci_modern.c |  2 ++
> > >>  4 files changed, 19 insertions(+)
> > >> 
> > >> diff --git a/drivers/virtio/virtio_pci_common.c 
> > >> b/drivers/virtio/virtio_pci_common.c
> > >> index d724f676608b..b78c8bc93a97 100644
> > >> --- a/drivers/virtio/virtio_pci_common.c
> > >> +++ b/drivers/virtio/virtio_pci_common.c
> > >> @@ -37,6 +37,20 @@ void vp_synchronize_vectors(struct virtio_device 
> > >> *vdev)
> > >>  synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
> > >>  }
> > >>  
> > >> +void vp_synchronize_vqs(struct virtio_device *vdev)
> > >> +{
> > >> +struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> > >> +int i;
> > >> +
> > >> +if (vp_dev->intx_enabled) {
> > >> +synchronize_irq(vp_dev->pci_dev->irq);
> > >> +return;
> > >> +}
> > >> +
> > >> +for (i = 0; i < vp_dev->msix_vectors; ++i)
> > >> +synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
> > >> +}
> > >> +  
> > 
> > ...given that this seems to synchronize threaded interrupt handlers?
> > Halil, do you think ccw needs to do anything? (AFAICS, we only have one
> > 'irq' for channel devices anyway, and the handler just calls the
> > relevant callbacks directly.)
> 
> Sorry I don't understand enough yet. A more verbose documentation on
> "virtio_synchronize_vqs - synchronize with virtqueue callbacks" would
> surely benefit me. It may be more than enough for a back-belt but it
> ain't enough for me to tell what is the callback supposed to accomplish.
> 
> I will have to study this discussion and the code more thoroughly.
> Tentatively I side with Jason and Michael in a sense, that I don't
> believe virtio-ccw is safe against rough interrupts.
> 
> Sorry for the late response. I intend to revisit this on Monday. If
> I don't please feel encouraged to ping.
> 
> Regards,
> Halil

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 4/5] virtio-pci: implement synchronize_vqs()

2022-04-06 Thread Michael S. Tsirkin
On Wed, Apr 06, 2022 at 03:04:32PM +0200, Cornelia Huck wrote:
> On Wed, Apr 06 2022, "Michael S. Tsirkin"  wrote:
> 
> > On Wed, Apr 06, 2022 at 04:35:37PM +0800, Jason Wang wrote:
> >> This patch implements PCI version of synchronize_vqs().
> >> 
> >> Cc: Thomas Gleixner 
> >> Cc: Peter Zijlstra 
> >> Cc: "Paul E. McKenney" 
> >> Cc: Marc Zyngier 
> >> Signed-off-by: Jason Wang 
> >
> > Please add implementations at least for ccw and mmio.
> 
> I'm not sure what (if anything) can/should be done for ccw...
> 
> >
> >> ---
> >>  drivers/virtio/virtio_pci_common.c | 14 ++
> >>  drivers/virtio/virtio_pci_common.h |  2 ++
> >>  drivers/virtio/virtio_pci_legacy.c |  1 +
> >>  drivers/virtio/virtio_pci_modern.c |  2 ++
> >>  4 files changed, 19 insertions(+)
> >> 
> >> diff --git a/drivers/virtio/virtio_pci_common.c 
> >> b/drivers/virtio/virtio_pci_common.c
> >> index d724f676608b..b78c8bc93a97 100644
> >> --- a/drivers/virtio/virtio_pci_common.c
> >> +++ b/drivers/virtio/virtio_pci_common.c
> >> @@ -37,6 +37,20 @@ void vp_synchronize_vectors(struct virtio_device *vdev)
> >>synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
> >>  }
> >>  
> >> +void vp_synchronize_vqs(struct virtio_device *vdev)
> >> +{
> >> +  struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> >> +  int i;
> >> +
> >> +  if (vp_dev->intx_enabled) {
> >> +  synchronize_irq(vp_dev->pci_dev->irq);
> >> +  return;
> >> +  }
> >> +
> >> +  for (i = 0; i < vp_dev->msix_vectors; ++i)
> >> +  synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
> >> +}
> >> +
> 
> ...given that this seems to synchronize threaded interrupt handlers?

No, any handlers at all. The point is to make sure any memory changes
made prior to this op are visible to callbacks.

Jason, maybe add that to the documentation?
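e.g. something along these lines (the wording is only a suggestion):

/**
 * virtio_synchronize_vqs - synchronize with virtqueue callbacks
 * @dev: the virtio device
 *
 * Waits until all in-flight virtqueue callbacks have finished.  After this
 * returns, any memory write done before the call is guaranteed to be
 * visible to callbacks that run afterwards, regardless of how the
 * transport delivers them (MSI-X, INTx, or something else entirely).
 */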

> Halil, do you think ccw needs to do anything? (AFAICS, we only have one
> 'irq' for channel devices anyway, and the handler just calls the
> relevant callbacks directly.)

Then you need to synchronize with that.

> >>  /* the notify function used when creating a virt queue */
> >>  bool vp_notify(struct virtqueue *vq)
> >>  {

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 5/5] virtio: harden vring IRQ

2022-04-06 Thread Michael S. Tsirkin
On Wed, Apr 06, 2022 at 04:35:38PM +0800, Jason Wang wrote:
> This is a rework of the previous IRQ hardening that was done for
> virtio-pci, where several drawbacks were found and the commits were reverted:
> 
> 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
>that is used by some device such as virtio-blk
> 2) done only for PCI transport
> 
> In this patch, we try to borrow the idea from the INTX IRQ hardening
> in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> by introducing a global device_ready variable for each
> virtio_device. Then we can toggle it during
> virtio_reset_device()/virtio_device_ready(). A
> virtio_synchronize_vqs() is used in both virtio_device_ready() and
> virtio_reset_device() to synchronize with the vring callbacks. With
> this, vring_interrupt() can check and return early if driver_ready is
> false.
> 
> Note that the hardening is only done for vring interrupt since the
> config interrupt hardening is already done in commit 22b7050a024d7
> ("virtio: defer config changed notifications"). But the method that is
> used by config interrupt can't be reused by the vring interrupt
> handler because it uses spinlock to do the synchronization which is
> expensive.
> 
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Signed-off-by: Jason Wang 
> ---
>  drivers/virtio/virtio.c   | 11 +++
>  drivers/virtio/virtio_ring.c  |  9 -
>  include/linux/virtio.h|  2 ++
>  include/linux/virtio_config.h |  8 
>  4 files changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 8dde44ea044a..2f3a6f8e3d9c 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -220,6 +220,17 @@ static int virtio_features_ok(struct virtio_device *dev)
>   * */
>  void virtio_reset_device(struct virtio_device *dev)
>  {
> + if (READ_ONCE(dev->driver_ready)) {
> + /*
> +  * The below virtio_synchronize_vqs() guarantees that any
> +  * interrupt for this line arriving after
> +  * virtio_synchronize_vqs() has completed is guaranteed to see
> +  * driver_ready == false.
> +  */
> + WRITE_ONCE(dev->driver_ready, false);
> + virtio_synchronize_vqs(dev);
> + }
> +
>   dev->config->reset(dev);
>  }
>  EXPORT_SYMBOL_GPL(virtio_reset_device);
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cfb028ca238e..a4592e55c9f8 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -2127,10 +2127,17 @@ static inline bool more_used(const struct 
> vring_virtqueue *vq)
>   return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
>  }
>  
> -irqreturn_t vring_interrupt(int irq, void *_vq)
> +irqreturn_t vring_interrupt(int irq, void *v)
>  {
> + struct virtqueue *_vq = v;
> + struct virtio_device *vdev = _vq->vdev;
>   struct vring_virtqueue *vq = to_vvq(_vq);
>  
> + if (!READ_ONCE(vdev->driver_ready)) {


I am not sure why we need READ_ONCE here, it's done under lock.


Accordingly, same thing above for READ_ONCE and WRITE_ONCE.


> + dev_warn_once(>dev, "virtio vring IRQ raised before 
> DRIVER_OK");
> + return IRQ_NONE;
> + }
> +
>   if (!more_used(vq)) {
>   pr_debug("virtqueue interrupt with no work for %p\n", vq);
>   return IRQ_NONE;
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index 5464f398912a..dfa2638a293e 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -95,6 +95,7 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
>   * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
>   * @config_enabled: configuration change reporting enabled
>   * @config_change_pending: configuration change reported while disabled
> + * @driver_ready: whehter the driver is ready (e.g for vring callbacks)
>   * @config_lock: protects configuration change reporting
>   * @dev: underlying device.
>   * @id: the device type identification (used to match it with a driver).
> @@ -109,6 +110,7 @@ struct virtio_device {
>   bool failed;
>   bool config_enabled;
>   bool config_change_pending;
> + bool driver_ready;
>   spinlock_t config_lock;
>   spinlock_t vqs_list_lock; /* Protects VQs list access */
>   struct device dev;
> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> index 08b73d9bbff2..c9e207bf2c9c 100644
> --- a/include/linux/virtio_config.h
> +++ b/include/linux/virtio_config.h
> @@ -246,6 +246,14 @@ void virtio_device_ready(struct virtio_device *dev)
>  {
>   unsigned status = dev->config->get_status(dev);
>  
> + virtio_synchronize_vqs(dev);
> +/*
> + * The above virtio_synchronize_vqs() make sure


makes sure

> + * vring_interrupt() will 

Re: [PATCH V2 4/5] virtio-pci: implement synchronize_vqs()

2022-04-06 Thread Michael S. Tsirkin
On Wed, Apr 06, 2022 at 04:35:37PM +0800, Jason Wang wrote:
> This patch implements PCI version of synchronize_vqs().
> 
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Signed-off-by: Jason Wang 

Please add implementations at least for ccw and mmio.

> ---
>  drivers/virtio/virtio_pci_common.c | 14 ++
>  drivers/virtio/virtio_pci_common.h |  2 ++
>  drivers/virtio/virtio_pci_legacy.c |  1 +
>  drivers/virtio/virtio_pci_modern.c |  2 ++
>  4 files changed, 19 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_pci_common.c 
> b/drivers/virtio/virtio_pci_common.c
> index d724f676608b..b78c8bc93a97 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -37,6 +37,20 @@ void vp_synchronize_vectors(struct virtio_device *vdev)
>   synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
>  }
>  
> +void vp_synchronize_vqs(struct virtio_device *vdev)
> +{
> + struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> + int i;
> +
> + if (vp_dev->intx_enabled) {
> + synchronize_irq(vp_dev->pci_dev->irq);
> + return;
> + }
> +
> + for (i = 0; i < vp_dev->msix_vectors; ++i)
> + synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
> +}
> +
>  /* the notify function used when creating a virt queue */
>  bool vp_notify(struct virtqueue *vq)
>  {
> diff --git a/drivers/virtio/virtio_pci_common.h 
> b/drivers/virtio/virtio_pci_common.h
> index eb17a29fc7ef..2b84d5c1b5bc 100644
> --- a/drivers/virtio/virtio_pci_common.h
> +++ b/drivers/virtio/virtio_pci_common.h
> @@ -105,6 +105,8 @@ static struct virtio_pci_device *to_vp_device(struct 
> virtio_device *vdev)
>  void vp_synchronize_vectors(struct virtio_device *vdev);
>  /* the notify function used when creating a virt queue */
>  bool vp_notify(struct virtqueue *vq);
> +/* synchronize with callbacks */
> +void vp_synchronize_vqs(struct virtio_device *vdev);
>  /* the config->del_vqs() implementation */
>  void vp_del_vqs(struct virtio_device *vdev);
>  /* the config->find_vqs() implementation */
> diff --git a/drivers/virtio/virtio_pci_legacy.c 
> b/drivers/virtio/virtio_pci_legacy.c
> index 6f4e34ce96b8..5a9e62320edc 100644
> --- a/drivers/virtio/virtio_pci_legacy.c
> +++ b/drivers/virtio/virtio_pci_legacy.c
> @@ -192,6 +192,7 @@ static const struct virtio_config_ops 
> virtio_pci_config_ops = {
>   .reset  = vp_reset,
>   .find_vqs   = vp_find_vqs,
>   .del_vqs= vp_del_vqs,
> + .synchronize_vqs = vp_synchronize_vqs,
>   .get_features   = vp_get_features,
>   .finalize_features = vp_finalize_features,
>   .bus_name   = vp_bus_name,
> diff --git a/drivers/virtio/virtio_pci_modern.c 
> b/drivers/virtio/virtio_pci_modern.c
> index a2671a20ef77..584850389855 100644
> --- a/drivers/virtio/virtio_pci_modern.c
> +++ b/drivers/virtio/virtio_pci_modern.c
> @@ -394,6 +394,7 @@ static const struct virtio_config_ops 
> virtio_pci_config_nodev_ops = {
>   .reset  = vp_reset,
>   .find_vqs   = vp_modern_find_vqs,
>   .del_vqs= vp_del_vqs,
> + .synchronize_vqs = vp_synchronize_vqs,
>   .get_features   = vp_get_features,
>   .finalize_features = vp_finalize_features,
>   .bus_name   = vp_bus_name,
> @@ -411,6 +412,7 @@ static const struct virtio_config_ops 
> virtio_pci_config_ops = {
>   .reset  = vp_reset,
>   .find_vqs   = vp_modern_find_vqs,
>   .del_vqs= vp_del_vqs,
> + .synchronize_vqs = vp_synchronize_vqs,
>   .get_features   = vp_get_features,
>   .finalize_features = vp_finalize_features,
>   .bus_name   = vp_bus_name,
> -- 
> 2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 3/5] virtio: introduce config op to synchronize vring callbacks

2022-04-06 Thread Michael S. Tsirkin
On Wed, Apr 06, 2022 at 04:35:36PM +0800, Jason Wang wrote:
> This patch introduce

introduces

> a new

new

> virtio config ops to vring
> callbacks. Transport specific method is required to call
> synchornize_irq() on the IRQs. For the transport that doesn't provide
> synchronize_vqs(), use synchornize_rcu() as a fallback.
> 
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Signed-off-by: Jason Wang 
> ---
>  include/linux/virtio_config.h | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> index b341dd62aa4d..08b73d9bbff2 100644
> --- a/include/linux/virtio_config.h
> +++ b/include/linux/virtio_config.h
> @@ -57,6 +57,8 @@ struct virtio_shm_region {
>   *   include a NULL entry for vqs unused by driver
>   *   Returns 0 on success or error status
>   * @del_vqs: free virtqueues found by find_vqs().
> + * @synchronize_vqs: synchronize with the virtqueue callbacks.
> + *   vdev: the virtio_device

I think I prefer synchronize_callbacks

>   * @get_features: get the array of feature bits for this device.
>   *   vdev: the virtio_device
>   *   Returns the first 64 feature bits (all we currently need).
> @@ -89,6 +91,7 @@ struct virtio_config_ops {
>   const char * const names[], const bool *ctx,
>   struct irq_affinity *desc);
>   void (*del_vqs)(struct virtio_device *);
> + void (*synchronize_vqs)(struct virtio_device *);
>   u64 (*get_features)(struct virtio_device *vdev);
>   int (*finalize_features)(struct virtio_device *vdev);
>   const char *(*bus_name)(struct virtio_device *vdev);
> @@ -217,6 +220,19 @@ int virtio_find_vqs_ctx(struct virtio_device *vdev, 
> unsigned nvqs,
> desc);
>  }
>  
> +/**
> + * virtio_synchronize_vqs - synchronize with virtqueue callbacks
> + * @vdev: the device
> + */
> +static inline
> +void virtio_synchronize_vqs(struct virtio_device *dev)
> +{
> + if (dev->config->synchronize_vqs)
> + dev->config->synchronize_vqs(dev);
> + else
> + synchronize_rcu();

I am not sure about this fallback and the latency impact.
Maybe synchronize_rcu_expedited is better here.
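i.e. (sketch):

	if (dev->config->synchronize_vqs)
		dev->config->synchronize_vqs(dev);
	else
		synchronize_rcu_expedited();

The expedited variant trades a few IPIs for a much shorter grace period,
which seems acceptable on a slow path like device reset/ready.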

> +}
> +
>  /**
>   * virtio_device_ready - enable vq use in probe function
>   * @vdev: the device
> -- 
> 2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 2/5] virtio: use virtio_reset_device() when possible

2022-04-06 Thread Michael S. Tsirkin
On Wed, Apr 06, 2022 at 04:35:35PM +0800, Jason Wang wrote:
> This allows us to do common extension without duplicating codes.

codes -> code

> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Signed-off-by: Jason Wang 
> ---
>  drivers/virtio/virtio.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 75c8d560bbd3..8dde44ea044a 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -430,7 +430,7 @@ int register_virtio_device(struct virtio_device *dev)
>  
>   /* We always start by resetting the device, in case a previous
>* driver messed it up.  This also tests that code path a little. */
> - dev->config->reset(dev);
> + virtio_reset_device(dev);
>  
>   /* Acknowledge that we've seen the device. */
>   virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
> @@ -496,7 +496,7 @@ int virtio_device_restore(struct virtio_device *dev)
>  
>   /* We always start by resetting the device, in case a previous
>* driver messed it up. */
> - dev->config->reset(dev);
> + virtio_reset_device(dev);
>  
>   /* Acknowledge that we've seen the device. */
>   virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
> -- 
> 2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 1/5] virtio: use virtio_device_ready() in virtio_device_restore()

2022-04-06 Thread Michael S. Tsirkin
Patch had the wrong mime type. I managed to extract it but pls fix.

> 
> 
> From: Stefano Garzarella 
> 
> It will allows us

will allow us

> to do extension on virtio_device_ready() without
> duplicating codes.

code

> 
> Cc: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Marc Zyngier 
> Signed-off-by: Stefano Garzarella 
> Signed-off-by: Jason Wang 
> ---
>  drivers/virtio/virtio.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 22f15f444f75..75c8d560bbd3 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -526,8 +526,9 @@ int virtio_device_restore(struct virtio_device *dev)
>   goto err;
>   }
>  
> - /* Finally, tell the device we're all set */
> - virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> + /* If restore didn't do it, mark device DRIVER_OK ourselves. */
> + if (!(dev->config->get_status(dev) & VIRTIO_CONFIG_S_DRIVER_OK))
> + virtio_device_ready(dev);
>  
>   virtio_config_enable(dev);

it's unfortunate that this adds an extra vmexit since virtio_device_ready
calls get_status too.

We now have:

static inline
void virtio_device_ready(struct virtio_device *dev)
{
unsigned status = dev->config->get_status(dev);

BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
}


I propose adding a helper and putting common code there.
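One possible shape (a sketch only; the helper name is made up):

static inline
void virtio_device_ready_status(struct virtio_device *dev, unsigned status)
{
	BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
	dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
}

static inline
void virtio_device_ready(struct virtio_device *dev)
{
	virtio_device_ready_status(dev, dev->config->get_status(dev));
}

The restore path could then read the status once and reuse it, avoiding the
second get_status() exit.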

>  
> -- 
> 2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 0/5] rework on the IRQ hardening of virtio

2022-04-06 Thread Michael S. Tsirkin
On Wed, Apr 06, 2022 at 04:35:33PM +0800, Jason Wang wrote:
> Hi All:
> 
> This is a rework of the IRQ hardening for virtio that was done
> previously by the following commits, which have been reverted:
> 
> 9e35276a5344 ("virtio_pci: harden MSI-X interrupts")
> 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> 
> The reason is that it depends on the IRQF_NO_AUTOEN which may conflict
> with the assumption of the affinity managed IRQ that is used by some
> virtio drivers. And what's more, it is only done for virtio-pci but
> not other transports.
> 
> In this rework, I try to implement a general virtio solution which
> borrows the idea of the INTX hardening by introducing a boolean for
> virtqueue callback enabling and toggling it in virtio_device_ready()
> and virtio_reset_device(). Then vring_interrupt() can simply check and
> return early if the driver is not ready.


All of a sudden all patches have the wrong mime type.

It is application/octet-stream; should be text/plain

Pls fix and repost, thanks!

> Please review.
> 
> Changes since v1:
> 
> - Use transport specific irq synchronization method when possible
> - Drop the module parameter and enable the hardening unconditionally
> - Tweak the barrier/ordering facilities used in the code
> - Rename irq_soft_enabled to driver_ready
> - Avoid unnecessary IRQ synchronization (e.g. during boot)
> 
> Jason Wang (4):
>   virtio: use virtio_reset_device() when possible
>   virtio: introduce config op to synchronize vring callbacks
>   virtio-pci: implement synchronize_vqs()
>   virtio: harden vring IRQ
> 
> Stefano Garzarella (1):
>   virtio: use virtio_device_ready() in virtio_device_restore()
> 
>  drivers/virtio/virtio.c| 20 
>  drivers/virtio/virtio_pci_common.c | 14 ++
>  drivers/virtio/virtio_pci_common.h |  2 ++
>  drivers/virtio/virtio_pci_legacy.c |  1 +
>  drivers/virtio/virtio_pci_modern.c |  2 ++
>  drivers/virtio/virtio_ring.c   |  9 -
>  include/linux/virtio.h |  2 ++
>  include/linux/virtio_config.h  | 24 
>  8 files changed, 69 insertions(+), 5 deletions(-)
> 
> -- 
> 2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v3 0/4] Introduce akcipher service for virtio-crypto

2022-04-06 Thread Michael S. Tsirkin
On Tue, Apr 05, 2022 at 10:33:42AM +0200, Cornelia Huck wrote:
> On Tue, Apr 05 2022, "Michael S. Tsirkin"  wrote:
> 
> > On Mon, Apr 04, 2022 at 05:39:24PM +0200, Cornelia Huck wrote:
> >> On Mon, Mar 07 2022, "Michael S. Tsirkin"  wrote:
> >> 
> >> > On Mon, Mar 07, 2022 at 10:42:30AM +0800, zhenwei pi wrote:
> >> >> Hi, Michael & Lei
> >> >> 
> >> >> The full patchset has been reviewed by Gonglei, thanks to Gonglei.
> >> >> Should I modify the virtio crypto specification(use "__le32 
> >> >> akcipher_algo;"
> >> >> instead of "__le32 reserve;" only, see v1->v2 change), and start a new 
> >> >> issue
> >> >> for a revoting procedure?
> >> >
> >> > You can but not it probably will be deferred to 1.3. OK with you?
> >> >
> >> >> Also cc Cornelia Huck.
> >> 
> >> [Apologies, I'm horribly behind on my email backlog, and on virtio
> >> things in general :(]
> >> 
> >> The akcipher update had been deferred for 1.2, so I think it will be 1.3
> >> material. However, I just noticed while browsing the fine lwn.net merge
> >> window summary that this seems to have been merged already. That
> >> situation is less than ideal, although I don't expect any really bad
> >> problems, given that there had not been any negative feedback for the
> >> spec proposal that I remember.
> >
> > Let's open a 1.3 branch? What do you think?
> 
> Yes, that's probably best, before things start piling up.

OK, want to do it? And we can then start voting on 1.3 things
straight away.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 8/8] virtio_ring.h: do not include from exported header

2022-04-05 Thread Michael S. Tsirkin
On Tue, Apr 05, 2022 at 08:29:36AM +0200, Arnd Bergmann wrote:
> On Tue, Apr 5, 2022 at 7:35 AM Christoph Hellwig  wrote:
> >
> > On Mon, Apr 04, 2022 at 10:04:02AM +0200, Arnd Bergmann wrote:
> > > The header is shared between kernel and other projects using virtio, such 
> > > as
> > > qemu and any boot loaders booting from virtio devices. It's not 
> > > technically a
> > > /kernel/ ABI, but it is an ABI and for practical reasons the kernel 
> > > version is
> > > maintained as the master copy if I understand it correctly.
> >
> > Besides that fact that as you correctly states these are not a UAPI at
> > all, qemu and bootloades are not specific to Linux and can't require a
> > specific kernel version.  So the same thing we do for file system
> > formats or network protocols applies here:  just copy the damn header.
> > And as stated above any reasonably portable userspace needs to have a
> > copy anyway.
> 
> I think the users all have their own copies, at least the ones I could
> find on codesearch.debian.org.

kvmtool does not seem to have its own copy, just grep vring_init.

> However, there are 27 virtio_*.h
> files in include/uapi/linux that probably should stay together for
> the purpose of defining the virtio protocol, and some others might
> be uapi relevant.
> 
> I see that at least include/uapi/linux/vhost.h has ioctl() definitions
> in it, and includes the virtio_ring.h header indirectly.
> 
> Adding the virtio maintainers to Cc to see if they can provide
> more background on this.
> 
> > If it is just as a "master copy" it can live in drivers/virtio/, just
> > like we do for other formats.
> 
> It has to be in include/linux/ at least because it's used by a number
> of drivers outside of drivers/virtio/.
> 
> Arnd
> ___
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
> 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 8/8] virtio_ring.h: do not include from exported header

2022-04-05 Thread Michael S. Tsirkin
On Tue, Apr 05, 2022 at 12:01:35AM -0700, Christoph Hellwig wrote:
> On Tue, Apr 05, 2022 at 08:29:36AM +0200, Arnd Bergmann wrote:
> > I think the users all have their own copies, at least the ones I could
> > find on codesearch.debian.org. However, there are 27 virtio_*.h
> > files in include/uapi/linux that probably should stay together for
> > the purpose of defining the virtio protocol, and some others might
> > be uapi relevant.
> > 
> > I see that at least include/uapi/linux/vhost.h has ioctl() definitions
> > in it, and includes the virtio_ring.h header indirectly.
> 
> Uhh.  We had a similar mess (but at a smaller scale) in nvme, where
> the uapi nvme.h contained both the UAPI and the protocol definition.
> We took a hard break to only have a nvme_ioctl.h in the uapi header
> and linux/nvme.h for the protocol.  This did break a bit of userspace
> compilation (but not running obviously) at the time, but really made
> the headers much easier to main.  Some userspace keeps on copying
> nvme.h with the protocol definitions.

So far we are quite happy with the status quo, I don't see any issues
maintaining the headers. And yes, through vhost and vringh they are part
of UAPI.

Yes, users have their own copies but they sync with the kernel.

That's the general picture. Specifically, the vring_init thing is a legacy
helper used by kvmtool and maybe others, and it inits the ring in the way
that vring/virtio expect.  It has been there since day 1 and we are careful not
to add more stuff like that, so I don't see a lot of gain from incurring
this pain for users.
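For reference, the helper in question is tiny; roughly this (paraphrased
from include/uapi/linux/virtio_ring.h, so treat details as approximate):

static inline void vring_init(struct vring *vr, unsigned int num, void *p,
			      unsigned long align)
{
	vr->num = num;
	vr->desc = p;
	vr->avail = (struct vring_avail *)((char *)p +
					   num * sizeof(struct vring_desc));
	vr->used = (void *)(((uintptr_t)&vr->avail->ring[num] +
			     sizeof(__virtio16) + align - 1) & ~(align - 1));
}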

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v3 0/4] Introduce akcipher service for virtio-crypto

2022-04-04 Thread Michael S. Tsirkin
On Mon, Apr 04, 2022 at 05:39:24PM +0200, Cornelia Huck wrote:
> On Mon, Mar 07 2022, "Michael S. Tsirkin"  wrote:
> 
> > On Mon, Mar 07, 2022 at 10:42:30AM +0800, zhenwei pi wrote:
> >> Hi, Michael & Lei
> >> 
> >> The full patchset has been reviewed by Gonglei, thanks to Gonglei.
> >> Should I modify the virtio crypto specification(use "__le32 akcipher_algo;"
> >> instead of "__le32 reserve;" only, see v1->v2 change), and start a new 
> >> issue
> >> for a revoting procedure?
> >
> > You can but not it probably will be deferred to 1.3. OK with you?
> >
> >> Also cc Cornelia Huck.
> 
> [Apologies, I'm horribly behind on my email backlog, and on virtio
> things in general :(]
> 
> The akcipher update had been deferred for 1.2, so I think it will be 1.3
> material. However, I just noticed while browsing the fine lwn.net merge
> window summary that this seems to have been merged already. That
> situation is less than ideal, although I don't expect any really bad
> problems, given that there had not been any negative feedback for the
> spec proposal that I remember.

Let's open a 1.3 branch? What do you think?

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 2/2] virtio_ring: add unlikely annotation for free descs check

2022-04-04 Thread Michael S. Tsirkin
On Mon, Apr 04, 2022 at 11:11:16PM +0800, Xianting Tian wrote:
> I can't find it in next branch, will you apply this patch?

yes, thanks!

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH RESEND V2 3/3] vdpa/mlx5: Use consistent RQT size

2022-04-04 Thread Michael S. Tsirkin
On Mon, Apr 04, 2022 at 11:07:36AM +, Eli Cohen wrote:
> > From: Michael S. Tsirkin 
> > Sent: Monday, April 4, 2022 1:35 PM
> > To: Jason Wang 
> > Cc: Eli Cohen ; hdan...@sina.com; 
> > virtualization@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> > Subject: Re: [PATCH RESEND V2 3/3] vdpa/mlx5: Use consistent RQT size
> > 
> > On Tue, Mar 29, 2022 at 12:21:09PM +0800, Jason Wang wrote:
> > > From: Eli Cohen 
> > >
> > > The current code evaluates RQT size based on the configured number of
> > > virtqueues. This can raise an issue in the following scenario:
> > >
> > > Assume MQ was negotiated.
> > > 1. mlx5_vdpa_set_map() gets called.
> > > 2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower
> > >than the configured max VQs.
> > > 3. A second set_map gets called, but now a smaller number of VQs is used
> > >to evaluate the size of the RQT.
> > > 4. handle_ctrl_mq() is called with a value larger than what the RQT can
> > >hold. This will emit errors and the driver state is compromised.
> > >
> > > To fix this, we use a new field in struct mlx5_vdpa_net to hold the
> > > required number of entries in the RQT. This value is evaluated in
> > > mlx5_vdpa_set_driver_features() where we have the negotiated features
> > > all set up.
> > >
> > > In addtion
> > 
> > addition?
> 
> Do you need me to send another version?

It's a bit easier that way but I can handle it manually too.

> If so, let's wait for Jason's reply.

Right.

> > 
> > > to that, we take into consideration the max capability of RQT
> > > entries early when the device is added so we don't need to take consider
> > > it when creating the RQT.
> > >
> > > Last, we remove the use of mlx5_vdpa_max_qps() which just returns the
> > > max_vas / 2 and make the code clearer.
> > >
> > > Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support")
> > > Signed-off-by: Eli Cohen 
> > 
> > Jason I don't have your ack or S.O.B on this one.
> > 
> > 
> > > ---
> > >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 61 +++
> > >  1 file changed, 21 insertions(+), 40 deletions(-)
> > >
> > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > index 53b8c1a68f90..61bec1ed0bc9 100644
> > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > @@ -161,6 +161,7 @@ struct mlx5_vdpa_net {
> > >   struct mlx5_flow_handle *rx_rule_mcast;
> > >   bool setup;
> > >   u32 cur_num_vqs;
> > > + u32 rqt_size;
> > >   struct notifier_block nb;
> > >   struct vdpa_callback config_cb;
> > >   struct mlx5_vdpa_wq_ent cvq_ent;
> > > @@ -204,17 +205,12 @@ static __virtio16 cpu_to_mlx5vdpa16(struct 
> > > mlx5_vdpa_dev *mvdev, u16 val)
> > >   return __cpu_to_virtio16(mlx5_vdpa_is_little_endian(mvdev), val);
> > >  }
> > >
> > > -static inline u32 mlx5_vdpa_max_qps(int max_vqs)
> > > -{
> > > - return max_vqs / 2;
> > > -}
> > > -
> > >  static u16 ctrl_vq_idx(struct mlx5_vdpa_dev *mvdev)
> > >  {
> > >   if (!(mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
> > >   return 2;
> > >
> > > - return 2 * mlx5_vdpa_max_qps(mvdev->max_vqs);
> > > + return mvdev->max_vqs;
> > >  }
> > >
> > >  static bool is_ctrl_vq_idx(struct mlx5_vdpa_dev *mvdev, u16 idx)
> > > @@ -1236,25 +1232,13 @@ static void teardown_vq(struct mlx5_vdpa_net 
> > > *ndev, struct mlx5_vdpa_virtqueue *
> > >  static int create_rqt(struct mlx5_vdpa_net *ndev)
> > >  {
> > >   __be32 *list;
> > > - int max_rqt;
> > >   void *rqtc;
> > >   int inlen;
> > >   void *in;
> > >   int i, j;
> > >   int err;
> > > - int num;
> > > -
> > > - if (!(ndev->mvdev.actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
> > > - num = 1;
> > > - else
> > > - num = ndev->cur_num_vqs / 2;
> > >
> > > - max_rqt = min_t(int, roundup_pow_of_two(num),
> > > - 1 << MLX5_CAP_GEN(ndev->mvdev.mdev, log_max_rqt_size));
> > > - if (max_rqt < 1)
> > > - return -EOPNOTSUPP;
> > > -
> > > - inlen = MLX5_ST_SZ_BYTES(create_rqt_in

Re: [PATCH RESEND V2 3/3] vdpa/mlx5: Use consistent RQT size

2022-04-04 Thread Michael S. Tsirkin
On Tue, Mar 29, 2022 at 12:21:09PM +0800, Jason Wang wrote:
> From: Eli Cohen 
> 
> The current code evaluates RQT size based on the configured number of
> virtqueues. This can raise an issue in the following scenario:
> 
> Assume MQ was negotiated.
> 1. mlx5_vdpa_set_map() gets called.
> 2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower
>than the configured max VQs.
> 3. A second set_map gets called, but now a smaller number of VQs is used
>to evaluate the size of the RQT.
> 4. handle_ctrl_mq() is called with a value larger than what the RQT can
>hold. This will emit errors and the driver state is compromised.
> 
> To fix this, we use a new field in struct mlx5_vdpa_net to hold the
> required number of entries in the RQT. This value is evaluated in
> mlx5_vdpa_set_driver_features() where we have the negotiated features
> all set up.
> 
> In addtion

addition?

> to that, we take into consideration the max capability of RQT
> entries early when the device is added so we don't need to take consider
> it when creating the RQT.
> 
> Last, we remove the use of mlx5_vdpa_max_qps() which just returns the
> max_vas / 2 and make the code clearer.
> 
> Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support")
> Signed-off-by: Eli Cohen 

Jason I don't have your ack or S.O.B on this one.


> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 61 +++
>  1 file changed, 21 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 53b8c1a68f90..61bec1ed0bc9 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -161,6 +161,7 @@ struct mlx5_vdpa_net {
>   struct mlx5_flow_handle *rx_rule_mcast;
>   bool setup;
>   u32 cur_num_vqs;
> + u32 rqt_size;
>   struct notifier_block nb;
>   struct vdpa_callback config_cb;
>   struct mlx5_vdpa_wq_ent cvq_ent;
> @@ -204,17 +205,12 @@ static __virtio16 cpu_to_mlx5vdpa16(struct 
> mlx5_vdpa_dev *mvdev, u16 val)
>   return __cpu_to_virtio16(mlx5_vdpa_is_little_endian(mvdev), val);
>  }
>  
> -static inline u32 mlx5_vdpa_max_qps(int max_vqs)
> -{
> - return max_vqs / 2;
> -}
> -
>  static u16 ctrl_vq_idx(struct mlx5_vdpa_dev *mvdev)
>  {
>   if (!(mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
>   return 2;
>  
> - return 2 * mlx5_vdpa_max_qps(mvdev->max_vqs);
> + return mvdev->max_vqs;
>  }
>  
>  static bool is_ctrl_vq_idx(struct mlx5_vdpa_dev *mvdev, u16 idx)
> @@ -1236,25 +1232,13 @@ static void teardown_vq(struct mlx5_vdpa_net *ndev, 
> struct mlx5_vdpa_virtqueue *
>  static int create_rqt(struct mlx5_vdpa_net *ndev)
>  {
>   __be32 *list;
> - int max_rqt;
>   void *rqtc;
>   int inlen;
>   void *in;
>   int i, j;
>   int err;
> - int num;
> -
> - if (!(ndev->mvdev.actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
> - num = 1;
> - else
> - num = ndev->cur_num_vqs / 2;
>  
> - max_rqt = min_t(int, roundup_pow_of_two(num),
> - 1 << MLX5_CAP_GEN(ndev->mvdev.mdev, log_max_rqt_size));
> - if (max_rqt < 1)
> - return -EOPNOTSUPP;
> -
> - inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + max_rqt * 
> MLX5_ST_SZ_BYTES(rq_num);
> + inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + ndev->rqt_size * 
> MLX5_ST_SZ_BYTES(rq_num);
>   in = kzalloc(inlen, GFP_KERNEL);
>   if (!in)
>   return -ENOMEM;
> @@ -1263,12 +1247,12 @@ static int create_rqt(struct mlx5_vdpa_net *ndev)
>   rqtc = MLX5_ADDR_OF(create_rqt_in, in, rqt_context);
>  
>   MLX5_SET(rqtc, rqtc, list_q_type, MLX5_RQTC_LIST_Q_TYPE_VIRTIO_NET_Q);
> - MLX5_SET(rqtc, rqtc, rqt_max_size, max_rqt);
> + MLX5_SET(rqtc, rqtc, rqt_max_size, ndev->rqt_size);
>   list = MLX5_ADDR_OF(rqtc, rqtc, rq_num[0]);
> - for (i = 0, j = 0; i < max_rqt; i++, j += 2)
> - list[i] = cpu_to_be32(ndev->vqs[j % (2 * num)].virtq_id);
> + for (i = 0, j = 0; i < ndev->rqt_size; i++, j += 2)
> + list[i] = cpu_to_be32(ndev->vqs[j % 
> ndev->cur_num_vqs].virtq_id);
>  
> - MLX5_SET(rqtc, rqtc, rqt_actual_size, max_rqt);
> + MLX5_SET(rqtc, rqtc, rqt_actual_size, ndev->rqt_size);
>   err = mlx5_vdpa_create_rqt(&ndev->mvdev, in, inlen, &ndev->res.rqtn);
>   kfree(in);
>   if (err)
> @@ -1282,19 +1266,13 @@ static int create_rqt(struct mlx5_vdpa_net *ndev)
>  static int modify_rqt(struct mlx5_vdpa_net *ndev, int num)
>  {
>   __be32 *list;
> - int max_rqt;
>   void *rqtc;
>   int inlen;
>   void *in;
>   int i, j;
>   int err;
>  
> - max_rqt = min_t(int, roundup_pow_of_two(ndev->cur_num_vqs / 2),
> - 1 << MLX5_CAP_GEN(ndev->mvdev.mdev, log_max_rqt_size));
> - if (max_rqt < 1)
> - return -EOPNOTSUPP;
> -
> - inlen = MLX5_ST_SZ_BYTES(modify_rqt_in) + max_rqt * 
> MLX5_ST_SZ_BYTES(rq_num);
> + inlen = 

[GIT PULL] virtio: fixes, cleanups

2022-04-04 Thread Michael S. Tsirkin
The following changes since commit ad6dc1daaf29f97f23cc810d60ee01c0e83f4c6b:

  vdpa/mlx5: Avoid processing works if workqueue was destroyed (2022-03-28 
16:54:30 -0400)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to 1c80cf031e0204fde471558ee40183695773ce13:

  vdpa: mlx5: synchronize driver status with CVQ (2022-03-30 04:18:14 -0400)


virtio: fixes, cleanups

A couple of mlx5 fixes related to cvq
A couple of reverts dropping useless code (code that used it got reverted
earlier)

Signed-off-by: Michael S. Tsirkin 


Jason Wang (2):
  vdpa: mlx5: prevent cvq work from hogging CPU
  vdpa: mlx5: synchronize driver status with CVQ

Michael S. Tsirkin (2):
  Revert "virtio: use virtio_device_ready() in virtio_device_restore()"
  Revert "virtio_config: introduce a new .enable_cbs method"

 drivers/vdpa/mlx5/net/mlx5_vnet.c | 62 ++-
 drivers/virtio/virtio.c   |  5 ++--
 include/linux/virtio_config.h |  6 
 3 files changed, 43 insertions(+), 30 deletions(-)



[GIT PULL] virtio: features, fixes

2022-03-31 Thread Michael S. Tsirkin
The following changes since commit f443e374ae131c168a065ea1748feac6b2e76613:

  Linux 5.17 (2022-03-20 13:14:17 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to ad6dc1daaf29f97f23cc810d60ee01c0e83f4c6b:

  vdpa/mlx5: Avoid processing works if workqueue was destroyed (2022-03-28 
16:54:30 -0400)


virtio: features, fixes

vdpa generic device type support
More virtio hardening for broken devices
On the same theme, revert some virtio hotplug hardening patches -
they were misusing some interrupt flags, so they have to be reverted.
RSS support in virtio-net
max device MTU support in mlx5 vdpa
akcipher support in virtio-crypto
shared IRQ support in ifcvf vdpa
a minor performance improvement in vhost
Enable virtio mem for ARM64
beginnings of advance dma support

Cleanups, fixes all over the place.

Signed-off-by: Michael S. Tsirkin 


Andrew Melnychenko (4):
  drivers/net/virtio_net: Fixed padded vheader to use v1 with hash.
  drivers/net/virtio_net: Added basic RSS support.
  drivers/net/virtio_net: Added RSS hash report.
  drivers/net/virtio_net: Added RSS hash report control.

Anirudh Rayabharam (1):
  vhost: handle error while adding split ranges to iotlb

Eli Cohen (2):
  net/mlx5: Add support for configuring max device MTU
  vdpa/mlx5: Avoid processing works if workqueue was destroyed

Gautam Dawar (1):
  Add definition of VIRTIO_F_IN_ORDER feature bit

Gavin Shan (1):
  drivers/virtio: Enable virtio mem for ARM64

Jason Wang (2):
  Revert "virtio-pci: harden INTX interrupts"
  Revert "virtio_pci: harden MSI-X interrupts"

Keir Fraser (1):
  virtio: pci: check bar values read from virtio config space

Longpeng (3):
  vdpa: support exposing the config size to userspace
  vdpa: change the type of nvqs to u32
  vdpa: support exposing the count of vqs to userspace

Miaohe Lin (1):
  mm/balloon_compaction: make balloon page compaction callbacks static

Michael Qiu (1):
  vdpa/mlx5: re-create forwarding rules after mac modified

Michael S. Tsirkin (2):
  tools/virtio: fix after premapped buf support
  tools/virtio: compile with -pthread

Stefano Garzarella (2):
  vhost: cache avail index in vhost_enable_notify()
  virtio: use virtio_device_ready() in virtio_device_restore()

Xuan Zhuo (3):
  virtio_ring: rename vring_unmap_state_packed() to 
vring_unmap_extra_packed()
  virtio_ring: remove flags check for unmap split indirect desc
  virtio_ring: remove flags check for unmap packed indirect desc

Zhu Lingshan (5):
  vDPA/ifcvf: make use of virtio pci modern IO helpers in ifcvf
  vhost_vdpa: don't setup irq offloading when irq_num < 0
  vDPA/ifcvf: implement device MSIX vector allocator
  vDPA/ifcvf: implement shared IRQ feature
  vDPA/ifcvf: cacheline alignment for ifcvf_hw

zhenwei pi (4):
  virtio_crypto: Introduce VIRTIO_CRYPTO_NOSPC
  virtio-crypto: introduce akcipher service
  virtio-crypto: implement RSA algorithm
  virtio-crypto: rename skcipher algs

 drivers/crypto/virtio/Kconfig  |   3 +
 drivers/crypto/virtio/Makefile |   3 +-
 .../crypto/virtio/virtio_crypto_akcipher_algs.c| 585 +
 drivers/crypto/virtio/virtio_crypto_common.h   |   7 +-
 drivers/crypto/virtio/virtio_crypto_core.c |   6 +-
 drivers/crypto/virtio/virtio_crypto_mgr.c  |  17 +-
 ...crypto_algs.c => virtio_crypto_skcipher_algs.c} |   4 +-
 drivers/net/virtio_net.c   | 389 +-
 drivers/vdpa/ifcvf/ifcvf_base.c| 140 ++---
 drivers/vdpa/ifcvf/ifcvf_base.h|  24 +-
 drivers/vdpa/ifcvf/ifcvf_main.c| 323 ++--
 drivers/vdpa/mlx5/net/mlx5_vnet.c  |  84 ++-
 drivers/vdpa/vdpa.c|   6 +-
 drivers/vhost/iotlb.c  |   6 +-
 drivers/vhost/vdpa.c   |  45 +-
 drivers/vhost/vhost.c  |   3 +-
 drivers/virtio/Kconfig |   7 +-
 drivers/virtio/virtio.c|   5 +-
 drivers/virtio/virtio_pci_common.c |  48 +-
 drivers/virtio/virtio_pci_common.h |   7 +-
 drivers/virtio/virtio_pci_legacy.c |   5 +-
 drivers/virtio/virtio_pci_modern.c |  18 +-
 drivers/virtio/virtio_pci_modern_dev.c |   9 +-
 drivers/virtio/virtio_ring.c   |  53 +-
 include/linux/balloon_compaction.h |  22 -
 include/linux/vdpa.h   |   9 +-
 include/uapi/linux/vhost.h |   7 +
 include/

Re: [PATCH v2 0/9] virtio: support advance DMA

2022-03-30 Thread Michael S. Tsirkin
On Wed, Mar 30, 2022 at 06:51:53PM +0800, Xuan Zhuo wrote:
> On Wed, 30 Mar 2022 06:51:03 -0400, "Michael S. Tsirkin"  
> wrote:
> > On Wed, Mar 30, 2022 at 05:03:32PM +0800, Xuan Zhuo wrote:
> > > On Wed, 30 Mar 2022 16:38:18 +0800, Jason Wang  
> > > wrote:
> > > > On Wed, Mar 30, 2022 at 2:59 PM Xuan Zhuo  
> > > > wrote:
> > > > >
> > > > > On Wed, 30 Mar 2022 14:56:17 +0800, Jason Wang  
> > > > > wrote:
> > > > > > On Wed, Mar 30, 2022 at 2:34 PM Michael S. Tsirkin 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Thu, Feb 24, 2022 at 07:03:53PM +0800, Xuan Zhuo wrote:
> > > > > > > > virtqueue_add() only supports virtual addresses, dma is 
> > > > > > > > completed in
> > > > > > > > virtqueue_add().
> > > > > > > >
> > > > > > > > In some scenarios (such as the AF_XDP scenario), DMA is 
> > > > > > > > completed in advance, so
> > > > > > > > it is necessary for us to support passing the DMA address to 
> > > > > > > > virtqueue_add().
> > > > > > >
> > > > > > > I picked up a couple of patches. Others are waiting for some acks
> > > > > > > (Jason?) and improved commit logs for documentation.
> > > > > >
> > > > > > I will review them.
> > > > >
> > > > > hi, the core code of premapped, I will merge it into 'virtio pci 
> > > > > support
> > > > > VIRTIO_F_RING_RESET' because this function will be used when reusing 
> > > > > the buffer
> > > > > after resize.
> > > >
> > > > I still prefer not to do that.
> > > >
> > > > We can make rest work for resize first and add pre mapping on top. It
> > > > will simplify the review.
> > >
> > > Yes, I am also worried about the review problem, the number of my local 
> > > resize
> > > patch has reached 44 (including reuse bufs).
> > >
> > > hi, Michael, can we implement resize on top of v8 first? (drop unused 
> > > bufs directly)
> > >
> > > Then we implement premmapd and reuse the bufs after resize.
> > >
> > > We need to get the address (DMA address) and len from the reset ring and 
> > > submit
> > > it to the new vq through virtqueue_add(). So let virtqueue_add() support
> > > premapped first.
> > >
> > > Thanks.
> >
> > Not sure I understand.
> > So the plan is
> > - remap
> > - resize on top
> > ?
> 
> #1 resize with drop unused bufs directly
> #2 premapped and resize support reuse the unused bufs
> 
> This way "premaped" will have a user.
> 
> Between #1 and #2, I may submit some code with optimized formatting, because
> jason doesn't like my introduction of struct vring_split and struct 
> vring_packed
> in #1 for passing parameters between extracted functions.
> 
> Thanks.

So let's start with 1. Direct drop for now is ok.


> 
> >
> >
> >
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > >
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > > v2:
> > > > > > > > 1. rename predma -> premapped
> > > > > > > > 2. virtio net xdp tx use virtio dma api
> > > > > > > >
> > > > > > > > v1:
> > > > > > > >1. All sgs requested at one time are required to be unified 
> > > > > > > > PREDMA, and several
> > > > > > > >   of them are not supported to be PREDMA
> > > > > > > >2. virtio_dma_map() is removed from this patch set and will 
> > > > > > > > be submitted
> > > > > > > >   together with the next time AF_XDP supports virtio dma
> > > > > > > >3. Added patch #2 #3 to remove the check for flags when 
> > > > > > > > performing unmap
> > > > > > > >   indirect desc
> > > > > > > >
> > > > > > > > Xuan Zhuo (9):
> > > > > > > >   virtio_ring: rename vring_unmap_state_packed() to
> > > > > > > > vring_unmap_extra_packed()
> > > > > > > >   virtio_ring: remove flags check for unmap split indirect desc
> > > > > > > >   virtio_ring: remove flags check for unmap packed indirect desc
> > > > > > > >   virtio_ring: virtqueue_add() support premapped
> > > > > > > >   virtio_ring: split: virtqueue_add_split() support premapped
> > > > > > > >   virtio_ring: packed: virtqueue_add_packed() support premapped
> > > > > > > >   virtio_ring: add api virtio_dma_map() for advance dma
> > > > > > > >   virtio_ring: introduce virtqueue_add_outbuf_premapped()
> > > > > > > >   virtio_net: xdp xmit use virtio dma api
> > > > > > > >
> > > > > > > >  drivers/net/virtio_net.c |  42 +-
> > > > > > > >  drivers/virtio/virtio_ring.c | 280 
> > > > > > > > ++-
> > > > > > > >  include/linux/virtio.h   |  12 ++
> > > > > > > >  3 files changed, 254 insertions(+), 80 deletions(-)
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.31.0
> > > > > > >
> > > > > >
> > > > >
> > > >
> >



Re: [PATCH v2 0/9] virtio: support advance DMA

2022-03-30 Thread Michael S. Tsirkin
On Wed, Mar 30, 2022 at 05:03:32PM +0800, Xuan Zhuo wrote:
> On Wed, 30 Mar 2022 16:38:18 +0800, Jason Wang  wrote:
> > On Wed, Mar 30, 2022 at 2:59 PM Xuan Zhuo  
> > wrote:
> > >
> > > On Wed, 30 Mar 2022 14:56:17 +0800, Jason Wang  
> > > wrote:
> > > > On Wed, Mar 30, 2022 at 2:34 PM Michael S. Tsirkin  
> > > > wrote:
> > > > >
> > > > > On Thu, Feb 24, 2022 at 07:03:53PM +0800, Xuan Zhuo wrote:
> > > > > > virtqueue_add() only supports virtual addresses, dma is completed in
> > > > > > virtqueue_add().
> > > > > >
> > > > > > In some scenarios (such as the AF_XDP scenario), DMA is completed 
> > > > > > in advance, so
> > > > > > it is necessary for us to support passing the DMA address to 
> > > > > > virtqueue_add().
> > > > >
> > > > > I picked up a couple of patches. Others are waiting for some acks
> > > > > (Jason?) and improved commit logs for documentation.
> > > >
> > > > I will review them.
> > >
> > > hi, the core code of premapped, I will merge it into 'virtio pci support
> > > VIRTIO_F_RING_RESET' because this function will be used when reusing the 
> > > buffer
> > > after resize.
> >
> > I still prefer not to do that.
> >
> > We can make rest work for resize first and add pre mapping on top. It
> > will simplify the review.
> 
> Yes, I am also worried about the review problem, the number of my local resize
> patch has reached 44 (including reuse bufs).
> 
> hi, Michael, can we implement resize on top of v8 first? (drop unused bufs 
> directly)
> 
> Then we implement premmapd and reuse the bufs after resize.
> 
> We need to get the address (DMA address) and len from the reset ring and 
> submit
> it to the new vq through virtqueue_add(). So let virtqueue_add() support
> premapped first.
> 
> Thanks.

Not sure I understand.
So the plan is
- remap
- resize on top
?



> 
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > Thanks!
> > > > >
> > > > > > v2:
> > > > > > 1. rename predma -> premapped
> > > > > > 2. virtio net xdp tx use virtio dma api
> > > > > >
> > > > > > v1:
> > > > > >1. All sgs requested at one time are required to be unified 
> > > > > > PREDMA, and several
> > > > > >   of them are not supported to be PREDMA
> > > > > >2. virtio_dma_map() is removed from this patch set and will be 
> > > > > > submitted
> > > > > >   together with the next time AF_XDP supports virtio dma
> > > > > >3. Added patch #2 #3 to remove the check for flags when 
> > > > > > performing unmap
> > > > > >   indirect desc
> > > > > >
> > > > > > Xuan Zhuo (9):
> > > > > >   virtio_ring: rename vring_unmap_state_packed() to
> > > > > > vring_unmap_extra_packed()
> > > > > >   virtio_ring: remove flags check for unmap split indirect desc
> > > > > >   virtio_ring: remove flags check for unmap packed indirect desc
> > > > > >   virtio_ring: virtqueue_add() support premapped
> > > > > >   virtio_ring: split: virtqueue_add_split() support premapped
> > > > > >   virtio_ring: packed: virtqueue_add_packed() support premapped
> > > > > >   virtio_ring: add api virtio_dma_map() for advance dma
> > > > > >   virtio_ring: introduce virtqueue_add_outbuf_premapped()
> > > > > >   virtio_net: xdp xmit use virtio dma api
> > > > > >
> > > > > >  drivers/net/virtio_net.c |  42 +-
> > > > > >  drivers/virtio/virtio_ring.c | 280 
> > > > > > ++-
> > > > > >  include/linux/virtio.h   |  12 ++
> > > > > >  3 files changed, 254 insertions(+), 80 deletions(-)
> > > > > >
> > > > > > --
> > > > > > 2.31.0
> > > > >
> > > >
> > >
> >



Re: [PATCH v3] virtio: pci: check bar values read from virtio config space

2022-03-30 Thread Michael S. Tsirkin
On Wed, Mar 30, 2022 at 10:43:40AM +0800, Jason Wang wrote:
> On Wed, Mar 23, 2022 at 10:07 PM Keir Fraser  wrote:
> >
> > virtio pci config structures may in future have non-standard bar
> > values in the bar field. We should anticipate this by skipping any
> > structures containing such a reserved value.
> >
> > The bar value should never change: check for harmful modified values
> > we re-read it from the config space in vp_modern_map_capability().
> >
> > Also clean up an existing check to consistently use PCI_STD_NUM_BARS.
> >
> > Signed-off-by: Keir Fraser 
> 
> Acked-by: Jason Wang 
> 


Thanks! I don't want to rebase anymore though, so I'm not adding this ack.
Sorry!

> >  drivers/virtio/virtio_pci_modern.c | 12 +---
> >  drivers/virtio/virtio_pci_modern_dev.c |  9 -
> >  2 files changed, 17 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_pci_modern.c 
> > b/drivers/virtio/virtio_pci_modern.c
> > index 5455bc041fb6..6adfcd0297a7 100644
> > --- a/drivers/virtio/virtio_pci_modern.c
> > +++ b/drivers/virtio/virtio_pci_modern.c
> > @@ -293,7 +293,7 @@ static int virtio_pci_find_shm_cap(struct pci_dev *dev, 
> > u8 required_id,
> >
> > for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR); pos > 0;
> >  pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) {
> > -   u8 type, cap_len, id;
> > +   u8 type, cap_len, id, res_bar;
> > u32 tmp32;
> > u64 res_offset, res_length;
> >
> > @@ -315,9 +315,14 @@ static int virtio_pci_find_shm_cap(struct pci_dev 
> > *dev, u8 required_id,
> > if (id != required_id)
> > continue;
> >
> > -   /* Type, and ID match, looks good */
> > pci_read_config_byte(dev, pos + offsetof(struct 
> > virtio_pci_cap,
> > -bar), &bar);
> > +bar), &res_bar);
> > +   if (res_bar >= PCI_STD_NUM_BARS)
> > +   continue;
> > +
> > +   /* Type and ID match, and the BAR value isn't reserved.
> > +* Looks good.
> > +*/
> >
> > /* Read the lower 32bit of length and offset */
> > pci_read_config_dword(dev, pos + offsetof(struct 
> > virtio_pci_cap,
> > @@ -337,6 +342,7 @@ static int virtio_pci_find_shm_cap(struct pci_dev *dev, 
> > u8 required_id,
> >  length_hi), &tmp32);
> > res_length |= ((u64)tmp32) << 32;
> >
> > +   *bar = res_bar;
> > *offset = res_offset;
> > *len = res_length;
> >
> > diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
> > b/drivers/virtio/virtio_pci_modern_dev.c
> > index e8b3ff2b9fbc..591738ad3d56 100644
> > --- a/drivers/virtio/virtio_pci_modern_dev.c
> > +++ b/drivers/virtio/virtio_pci_modern_dev.c
> > @@ -35,6 +35,13 @@ vp_modern_map_capability(struct virtio_pci_modern_device 
> > *mdev, int off,
> > pci_read_config_dword(dev, off + offsetof(struct virtio_pci_cap, 
> > length),
> >   &length);
> >
> > +   /* Check if the BAR may have changed since we requested the region. 
> > */
> > +   if (bar >= PCI_STD_NUM_BARS || !(mdev->modern_bars & (1 << bar))) {
> > +   dev_err(&dev->dev,
> > +   "virtio_pci: bar unexpectedly changed to %u\n", 
> > bar);
> > +   return NULL;
> > +   }
> > +
> > if (length <= start) {
> > dev_err(&dev->dev,
> > "virtio_pci: bad capability len %u (>%u 
> > expected)\n",
> > @@ -120,7 +127,7 @@ static inline int virtio_pci_find_capability(struct 
> > pci_dev *dev, u8 cfg_type,
> >  );
> >
> > /* Ignore structures with reserved BAR values */
> > -   if (bar > 0x5)
> > +   if (bar >= PCI_STD_NUM_BARS)
> > continue;
> >
> > if (type == cfg_type) {
> > --
> > 2.35.1.894.gb6a874cedc-goog
> >



Re: [PATCH v2 0/9] virtio: support advance DMA

2022-03-30 Thread Michael S. Tsirkin
On Thu, Feb 24, 2022 at 07:03:53PM +0800, Xuan Zhuo wrote:
> virtqueue_add() only supports virtual addresses, dma is completed in
> virtqueue_add().
> 
> In some scenarios (such as the AF_XDP scenario), DMA is completed in advance, 
> so
> it is necessary for us to support passing the DMA address to virtqueue_add().

I picked up a couple of patches. Others are waiting for some acks
(Jason?) and improved commit logs for documentation.

Thanks!
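
As an illustration of what the premapped flow aims at (the
virtqueue_add_outbuf_premapped() signature below is a guess based on the
patch titles, not the actual series):

/* Hypothetical sketch: the driver maps the buffer itself and hands the
 * ring a DMA address instead of a virtual address. */
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>
#include <linux/virtio.h>

static int sketch_xmit_premapped(struct virtqueue *vq, struct device *dma_dev,
                                 void *buf, size_t len)
{
        struct scatterlist sg;
        dma_addr_t addr;

        /* DMA is completed in advance, outside virtqueue_add() */
        addr = dma_map_single(dma_dev, buf, len, DMA_TO_DEVICE);
        if (dma_mapping_error(dma_dev, addr))
                return -ENOMEM;

        sg_init_table(&sg, 1);
        sg_dma_address(&sg) = addr;
        sg_dma_len(&sg) = len;

        /* hypothetical helper: the sg already carries DMA addresses */
        return virtqueue_add_outbuf_premapped(vq, &sg, 1, buf, GFP_ATOMIC);
}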

> v2:
> 1. rename predma -> premapped
> 2. virtio net xdp tx use virtio dma api
> 
> v1:
>1. All sgs requested at one time are required to be unified PREDMA, and 
> several
>   of them are not supported to be PREDMA
>2. virtio_dma_map() is removed from this patch set and will be submitted
>   together with the next time AF_XDP supports virtio dma
>3. Added patch #2 #3 to remove the check for flags when performing unmap
>   indirect desc
> 
> Xuan Zhuo (9):
>   virtio_ring: rename vring_unmap_state_packed() to
> vring_unmap_extra_packed()
>   virtio_ring: remove flags check for unmap split indirect desc
>   virtio_ring: remove flags check for unmap packed indirect desc
>   virtio_ring: virtqueue_add() support premapped
>   virtio_ring: split: virtqueue_add_split() support premapped
>   virtio_ring: packed: virtqueue_add_packed() support premapped
>   virtio_ring: add api virtio_dma_map() for advance dma
>   virtio_ring: introduce virtqueue_add_outbuf_premapped()
>   virtio_net: xdp xmit use virtio dma api
> 
>  drivers/net/virtio_net.c |  42 +-
>  drivers/virtio/virtio_ring.c | 280 ++-
>  include/linux/virtio.h   |  12 ++
>  3 files changed, 254 insertions(+), 80 deletions(-)
> 
> --
> 2.31.0



[PATCH 2/2] Revert "virtio_config: introduce a new .enable_cbs method"

2022-03-30 Thread Michael S. Tsirkin
This reverts commit d50497eb4e554e1f0351e1836ee7241c059592e6.

The new callback ended up not being used, and it's asymmetrical:
just enable, no disable.

Signed-off-by: Michael S. Tsirkin 
---
 include/linux/virtio_config.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index dafdc7f48c01..b341dd62aa4d 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -23,8 +23,6 @@ struct virtio_shm_region {
  *   any of @get/@set, @get_status/@set_status, or @get_features/
  *   @finalize_features are NOT safe to be called from an atomic
  *   context.
- * @enable_cbs: enable the callbacks
- *  vdev: the virtio_device
  * @get: read the value of a configuration field
  * vdev: the virtio_device
  * offset: the offset of the configuration field
@@ -78,7 +76,6 @@ struct virtio_shm_region {
  */
 typedef void vq_callback_t(struct virtqueue *);
 struct virtio_config_ops {
-   void (*enable_cbs)(struct virtio_device *vdev);
void (*get)(struct virtio_device *vdev, unsigned offset,
void *buf, unsigned len);
void (*set)(struct virtio_device *vdev, unsigned offset,
@@ -233,9 +230,6 @@ void virtio_device_ready(struct virtio_device *dev)
 {
unsigned status = dev->config->get_status(dev);
 
-   if (dev->config->enable_cbs)
-  dev->config->enable_cbs(dev);
-
BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
 }
-- 
MST



[PATCH 1/2] Revert "virtio: use virtio_device_ready() in virtio_device_restore()"

2022-03-30 Thread Michael S. Tsirkin
This reverts commit 8d65bc9a5be3f23c5e2ab36b6b8ef40095165b18.

We reverted the problematic changes, so there is no more need for
workarounds on restore.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/virtio/virtio.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 75c8d560bbd3..22f15f444f75 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -526,9 +526,8 @@ int virtio_device_restore(struct virtio_device *dev)
goto err;
}
 
-   /* If restore didn't do it, mark device DRIVER_OK ourselves. */
-   if (!(dev->config->get_status(dev) & VIRTIO_CONFIG_S_DRIVER_OK))
-   virtio_device_ready(dev);
+   /* Finally, tell the device we're all set */
+   virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
 
virtio_config_enable(dev);
 
-- 
MST



Re: [PATCH 1/3] virtio: use virtio_device_ready() in virtio_device_restore()

2022-03-29 Thread Michael S. Tsirkin
On Fri, Mar 25, 2022 at 11:05:22AM +0800, Jason Wang wrote:
> 
> 在 2022/3/24 下午7:31, Stefano Garzarella 写道:
> > On Thu, Mar 24, 2022 at 07:07:09AM -0400, Michael S. Tsirkin wrote:
> > > On Thu, Mar 24, 2022 at 12:03:07PM +0100, Stefano Garzarella wrote:
> > > > On Thu, Mar 24, 2022 at 06:48:05AM -0400, Michael S. Tsirkin wrote:
> > > > > On Thu, Mar 24, 2022 at 04:40:02PM +0800, Jason Wang wrote:
> > > > > > From: Stefano Garzarella 
> > > > > >
> > > > > > This avoids setting DRIVER_OK twice for those drivers that call
> > > > > > virtio_device_ready() in the .restore
> > > > >
> > > > > Is this trying to say it's faster?
> > > > 
> > > > Nope, I mean, when I wrote the original version, I meant to do the same
> > > > things that we do in virtio_dev_probe() where we called
> > > > virtio_device_ready() which not only set the state, but also called
> > > > .enable_cbs callback.
> > > > 
> > > > Was this a side effect and maybe more compliant with the spec?
> > > 
> > > 
> > > Sorry I don't understand the question. it says "avoids setting
> > > DRIVER_OK twice" -
> > > why is that advantageous and worth calling out in the commit log?
> > 
> > I just wanted to say that it seems strange to set DRIVER_OK twice if we
> > read the spec. I don't think it's wrong, but weird.
> > 
> > Yes, maybe we should rewrite the commit message saying that we want to
> > use virtio_device_ready() everywhere to complete the setup before
> > setting DRIVER_OK so we can do all the necessary operations inside (like
> > in patch 3 or call enable_cbs).
> > 
> > Jason rewrote the commit log, so I don't know if he agrees.
> > 
> > Thanks,
> > Stefano
> 
> 
> I agree, I will tweak the log in V2.
> 
> Thanks

Still waiting for that v2.


Re:

2022-03-29 Thread Michael S. Tsirkin
On Wed, Mar 30, 2022 at 10:40:59AM +0800, Jason Wang wrote:
> On Tue, Mar 29, 2022 at 10:09 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > > > And requesting irq commits all memory otherwise all drivers 
> > > > > > > > would be
> > > > > > > > broken,
> > > > > > >
> > > > > > > So I think we might talk different issues:
> > > > > > >
> > > > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > > > guarantee this though there seems no documentation around
> > > > > > > request_irq() to say this.
> > > > > > >
> > > > > > > And I can see at least 
> > > > > > > drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > > > using smp_wmb() before the request_irq().
> > > > > > >
> > > > > > > And even if write is ordered we still need read to be ordered to 
> > > > > > > be
> > > > > > > paired with that.
> > > >
> > > > IMO it synchronizes with the CPU to which irq is
> > > > delivered. Otherwise basically all drivers would be broken,
> > > > wouldn't they be?
> > >
> > > I guess it's because most of the drivers don't care much about the
> > > buggy/malicious device.  And most of the devices may require an extra
> > > step to enable device IRQ after request_irq(). Or it's the charge of
> > > the driver to do the synchronization.
> >
> > It is true that the use-case of malicious devices is somewhat boutique.
> > But I think most drivers do want to have their hotplug routines to be
> > robust, yes.
> >
> > > > I don't know whether it's correct on all platforms, but if not
> > > > we need to fix request_irq.
> > > >
> > > > > > >
> > > > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > > > virtio.
> > > > > > >
> > > > > > > 2) virtio drivers might do a lot of setups between request_irq() 
> > > > > > > and
> > > > > > > virtio_device_ready():
> > > > > > >
> > > > > > > request_irq()
> > > > > > > driver specific setups
> > > > > > > virtio_device_ready()
> > > > > > >
> > > > > > > CPU 0 probe) request_irq()
> > > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > > CPU 0 probe) driver specific setups
> > > > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit 
> > > > > > > the setups
> > > > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > > > That will guarantee all setup is visible to the specific IRQ,
> > > > >
> > > > > Only the interrupt after synchronize_irq() returns.
> > > >
> > > > Anything else is a buggy device though.
> > >
> > > Yes, but the goal of this patch is to prevent the possible attack from
> > > buggy(malicious) devices.
> >
> > Right. However if a driver of a *buggy* device somehow sees driver_ok =
> > false even though it's actually initialized, that is not a deal breaker
> > as that does not open us up to an attack.
> >
> > > >
> > > > > >this
> > > > > > is what it's point is.
> > > > >
> > > > > What happens if an interrupt is raised in the middle like:
> > > > >
> > > > > smp_store_release(dev->irq_soft_enabled, true)
> > > > > IRQ handler
> > > > > synchornize_irq()
> > > > >
> > > > > If we don't enforce a reading order, the IRQ handler may still see the
> > > > > uninitialized variable.
> > > > >
> > > > > Thanks
> > > >
> &

Re:

2022-03-29 Thread Michael S. Tsirkin
On Wed, Mar 30, 2022 at 10:38:06AM +0800, Jason Wang wrote:
> On Wed, Mar 30, 2022 at 6:04 AM Michael S. Tsirkin  wrote:
> >
> > On Tue, Mar 29, 2022 at 08:13:57PM +0200, Thomas Gleixner wrote:
> > > On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> > > > On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> > > > We are trying to fix the driver since at the moment it does not
> > > > have the dev->ok flag at all.
> > > >
> > > > And I suspect virtio is not alone in that.
> > > > So it would have been nice if there was a standard flag
> > > > replacing the driver-specific dev->ok above, and ideally
> > > > would also handle the case of an interrupt triggering
> > > > too early by deferring the interrupt until the flag is set.
> > > >
> > > > And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> > > > enable_irq instead of dev->ok = true, except
> > > > - it doesn't work with affinity managed IRQs
> > > > - it does not work with shared IRQs
> > > >
> > > > So using dev->ok as you propose above seems better at this point.
> > >
> > > Unless there is a big enough amount of drivers which could make use of a
> > > generic mechanism for that.
> > >
> > > >> If any driver does this in the wrong order, then the driver is
> > > >> broken.
> > > >
> > > > I agree, however:
> > > > $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> > > > 113
> > > > $ git grep -l request_irq drivers/net/|wc -l
> > > > 397
> > > >
> > > > I suspect there are more drivers which in theory need the
> > > > synchronize_irq dance but in practice do not execute it.
> > >
> > > That really depends on when the driver requests the interrupt, when
> > > it actually enables the interrupt in the device itself
> >
> > This last point does not matter since we are talking about protecting
> > against buggy/malicious devices. They can inject the interrupt anyway
> > even if driver did not configure it.
> >
> > > and how the
> > > interrupt service routine works.
> > >
> > > So just doing that grep dance does not tell much. You really have to do
> > > a case by case analysis.
> > >
> > > Thanks,
> > >
> > > tglx
> >
> >
> > I agree. In fact, at least for network the standard approach is to
> > request interrupts in the open call, virtio net is unusual
> > in doing it in probe. We should consider changing that.
> > Jason?
> 
> This probably works only for virtio-net and it looks like not trivial
> since we don't have a specific core API to request interrupts.
> 
> Thanks

We'll need a new API, for sure. E.g.  find vqs with no
callback on probe, and then virtio_request_vq_callbacks separately.

The existing API that specifies callbacks during find vqs
can be used by other drivers.
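
Very roughly, and purely as an illustration - virtio_request_vq_callbacks()
does not exist, its name and signature below are guesses at the proposed
split:

/* Hypothetical sketch of the probe/open split suggested above. */
#include <linux/virtio.h>
#include <linux/virtio_config.h>

static void rx_done(struct virtqueue *vq) { }
static void tx_done(struct virtqueue *vq) { }

static int sketch_probe(struct virtio_device *vdev, struct virtqueue *vqs[2])
{
        /* NULL callbacks: no vq interrupt work can run during probe */
        vq_callback_t *cbs[2] = { NULL, NULL };
        static const char * const names[2] = { "rx", "tx" };

        return virtio_find_vqs(vdev, 2, vqs, cbs, names, NULL);
}

static int sketch_open(struct virtio_device *vdev, struct virtqueue *vqs[2])
{
        vq_callback_t *cbs[2] = { rx_done, tx_done };

        /* hypothetical helper: attach callbacks once setup is complete */
        return virtio_request_vq_callbacks(vdev, 2, vqs, cbs);
}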

> >
> > --
> > MST
> >



Re:

2022-03-29 Thread Michael S. Tsirkin
On Tue, Mar 29, 2022 at 08:13:57PM +0200, Thomas Gleixner wrote:
> On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> > On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> > We are trying to fix the driver since at the moment it does not
> > have the dev->ok flag at all.
> >
> > And I suspect virtio is not alone in that.
> > So it would have been nice if there was a standard flag
> > replacing the driver-specific dev->ok above, and ideally
> > would also handle the case of an interrupt triggering
> > too early by deferring the interrupt until the flag is set.
> >
> > And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> > enable_irq instead of dev->ok = true, except
> > - it doesn't work with affinity managed IRQs
> > - it does not work with shared IRQs
> >
> > So using dev->ok as you propose above seems better at this point.
> 
> Unless there is a big enough amount of drivers which could make use of a
> generic mechanism for that.
> 
> >> If any driver does this in the wrong order, then the driver is
> >> broken.
> > 
> > I agree, however:
> > $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> > 113
> > $ git grep -l request_irq drivers/net/|wc -l
> > 397
> >
> > I suspect there are more drivers which in theory need the
> > synchronize_irq dance but in practice do not execute it.
> 
> That really depends on when the driver requests the interrupt, when
> it actually enables the interrupt in the device itself

This last point does not matter since we are talking about protecting
against buggy/malicious devices. They can inject the interrupt anyway
even if driver did not configure it.

> and how the
> interrupt service routine works.
> 
> So just doing that grep dance does not tell much. You really have to do
> a case by case analysis.
> 
> Thanks,
> 
> tglx


I agree. In fact, at least for network the standard approach is to
request interrupts in the open call, virtio net is unusual
in doing it in probe. We should consider changing that.
Jason?
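
For comparison, the conventional shape in network drivers is roughly the
following (made-up names, not virtio-net code):

/* Sketch: IRQs requested in .ndo_open, released in .ndo_stop. */
#include <linux/interrupt.h>
#include <linux/netdevice.h>

struct sketch_priv {
        unsigned int irq;
};

static irqreturn_t sketch_intr(int irq, void *data)
{
        return IRQ_HANDLED;
}

static int sketch_open(struct net_device *ndev)
{
        struct sketch_priv *priv = netdev_priv(ndev);

        /* by the time open runs, the device and rings are fully set up */
        return request_irq(priv->irq, sketch_intr, 0, ndev->name, ndev);
}

static int sketch_stop(struct net_device *ndev)
{
        struct sketch_priv *priv = netdev_priv(ndev);

        free_irq(priv->irq, ndev);
        return 0;
}

static const struct net_device_ops sketch_netdev_ops = {
        .ndo_open = sketch_open,
        .ndo_stop = sketch_stop,
};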

-- 
MST



Re:

2022-03-29 Thread Michael S. Tsirkin
On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> On Mon, Mar 28 2022 at 06:40, Michael S. Tsirkin wrote:
> > On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> >> > > So I think we might talk different issues:
> >> > >
> >> > > 1) Whether request_irq() commits the previous setups, I think the
> >> > > answer is yes, since the spin_unlock of desc->lock (release) can
> >> > > guarantee this though there seems no documentation around
> >> > > request_irq() to say this.
> >> > >
> >> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> >> > > using smp_wmb() before the request_irq().
> 
> That's a complete bogus example especially as there is not a single
> smp_rmb() which pairs with the smp_wmb().
> 
> >> > > And even if write is ordered we still need read to be ordered to be
> >> > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they be?
> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
> 
> There is nothing to fix:
> 
> request_irq()
>raw_spin_lock_irq(desc->lock);   // ACQUIRE
>
>raw_spin_unlock_irq(desc->lock); // RELEASE
> 
> interrupt()
>raw_spin_lock(desc->lock);   // ACQUIRE
>set status to IN_PROGRESS
>raw_spin_unlock(desc->lock); // RELEASE
>invoke handler()
> 
> So anything which the driver set up _before_ request_irq() is visible to
> the interrupt handler. No?
> >> What happens if an interrupt is raised in the middle like:
> >> 
> >> smp_store_release(dev->irq_soft_enabled, true)
> >> IRQ handler
> >> synchornize_irq()
> 
> This is bogus. The obvious order of things is:
> 
> dev->ok = false;
> request_irq();
> 
> moar_setup();
> synchronize_irq();  // ACQUIRE + RELEASE
> dev->ok = true;
> 
> The reverse operation on teardown:
> 
> dev->ok = false;
> synchronize_irq();  // ACQUIRE + RELEASE
> 
> teardown();
> 
> So in both cases a simple check in the handler is sufficient:
> 
> handler()
> if (!dev->ok)
>   return;


Thanks a lot for the analysis, Thomas. This is more or less what I was
thinking.

> 
> I'm not understanding what you folks are trying to "fix" here.

We are trying to fix the driver since at the moment it does not
have the dev->ok flag at all.


And I suspect virtio is not alone in that.
So it would have been nice if there was a standard flag
replacing the driver-specific dev->ok above, and ideally
would also handle the case of an interrupt triggering
too early by deferring the interrupt until the flag is set.

And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
enable_irq instead of dev->ok = true, except
- it doesn't work with affinity managed IRQs
- it does not work with shared IRQs

So using dev->ok as you propose above seems better at this point.
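
To make the comparison concrete, a minimal sketch of the IRQF_NO_AUTOEN
variant (my_dev, my_handler() and moar_setup() are made-up names, not code
from any existing driver):

/*
 * Sketch only: the IRQ stays masked until enable_irq(), which plays
 * the role of dev->ok = true above.
 */
#include <linux/interrupt.h>

struct my_dev {
        unsigned int irq;
};

static void moar_setup(struct my_dev *dev)
{
        /* driver specific setup that the handler depends on */
}

static irqreturn_t my_handler(int irq, void *data)
{
        /*
         * The line is still disabled until enable_irq() below runs,
         * so everything done in moar_setup() is visible here.
         */
        return IRQ_HANDLED;
}

static int my_probe(struct my_dev *dev)
{
        /* IRQF_NO_AUTOEN: request_irq() does not enable the interrupt */
        int err = request_irq(dev->irq, my_handler, IRQF_NO_AUTOEN,
                              "my_dev", dev);
        if (err)
                return err;

        moar_setup(dev);
        enable_irq(dev->irq);
        return 0;
}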

> If any
> driver does this in the wrong order, then the driver is broken.

I agree, however:
$ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
113
$ git grep -l request_irq drivers/net/|wc -l
397

I suspect there are more drivers which in theory need the
synchronize_irq dance but in practice do not execute it.


> Sure, you can do the same with:
> 
> dev->ok = false;
> request_irq();
> moar_setup();
> smp_wmb();
> dev->ok = true;
> 
> for the price of a smp_rmb() in the interrupt handler:
> 
> handler()
> if (!dev->ok)
>   return;
> smp_rmb();
> 
> but that's only working for the setup case correctly and not for
> teardown.
> 
> Thanks,
> 
> tglx



Re:

2022-03-29 Thread Michael S. Tsirkin
On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > broken,
> > > > >
> > > > > So I think we might talk different issues:
> > > > >
> > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > guarantee this though there seems no documentation around
> > > > > request_irq() to say this.
> > > > >
> > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > using smp_wmb() before the request_irq().
> > > > >
> > > > > And even if write is ordered we still need read to be ordered to be
> > > > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they be?
> 
> I guess it's because most of the drivers don't care much about the
> buggy/malicious device.  And most of the devices may require an extra
> step to enable device IRQ after request_irq(). Or it's the charge of
> the driver to do the synchronization.

It is true that the use-case of malicious devices is somewhat boutique.
But I think most drivers do want to have their hotplug routines to be
robust, yes.

> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
> >
> > > > >
> > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > virtio.
> > > > >
> > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > virtio_device_ready():
> > > > >
> > > > > request_irq()
> > > > > driver specific setups
> > > > > virtio_device_ready()
> > > > >
> > > > > CPU 0 probe) request_irq()
> > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > CPU 0 probe) driver specific setups
> > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the 
> > > > > setups
> > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > >
> > > > > Thanks
> > > >
> > > >
> > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > That will guarantee all setup is visible to the specific IRQ,
> > >
> > > Only the interrupt after synchronize_irq() returns.
> >
> > Anything else is a buggy device though.
> 
> Yes, but the goal of this patch is to prevent the possible attack from
> buggy(malicious) devices.

Right. However if a driver of a *buggy* device somehow sees driver_ok =
false even though it's actually initialized, that is not a deal breaker
as that does not open us up to an attack.

> >
> > > >this
> > > > is what it's point is.
> > >
> > > What happens if an interrupt is raised in the middle like:
> > >
> > > smp_store_release(dev->irq_soft_enabled, true)
> > > IRQ handler
> > > synchornize_irq()
> > >
> > > If we don't enforce a reading order, the IRQ handler may still see the
> > > uninitialized variable.
> > >
> > > Thanks
> >
> > IMHO variables should be initialized before request_irq
> > to a value meaning "not a valid interrupt".
> > Specifically driver_ok = false.
> > Handler in the scenario you describe will then see !driver_ok
> > and exit immediately.
> 
> So just to make sure we're on the same page.
> 
> 1) virtio_reset_device() will set the driver_ok to false;
> 2) virtio_device_ready() will set the driver_ok to true
> 
> So for virtio drivers, it often did:
> 
> 1) virtio_reset_device()
> 2) find_vqs() which will call request_irq()
> 3) other driver specific setups
> 4) virtio_device_ready()
> 
> In virtio_device_ready(), the patch perform the following currently:
> 
> smp_store_release(driver_ok, true);
> set_status(DRIVER_OK);
> 
> Per your suggestion, to add synchronize_irq() after
> smp_store_release() so we had
> 
> smp_store_release(driver_ok, true);
> synchornize_irq()
> set_status(DRIVER_OK)
> 
> Suppose there's a interrupt raised before the synchronize_irq(), if we do:
> 
> if (READ_ONCE(driver_ok)) {
>   vq->callback()
> }
> 
> It will see the driver_ok as true but how can we make sure
> vq->callback sees the driver specific setups (3) above?
> 
> And an example is virtio_scsi():
> 
> virtio_reset_device()
> virtscsi_probe()
> virtscsi_init()
> virtio_find_vqs()
> ...
> virtscsi_init_vq(&vscsi->event_vq, vqs[1])
> 
> virtio_device_ready()
> 
> In virtscsi_event_done():
> 
> virtscsi_event_done():
> virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
> 
> We need to make sure the even_done reads driver_ok before read 
> vscsi->event_vq.
> 
> Thanks


See the response by Thomas. A simple if (!dev->driver_ok) should be enough;
it's all under a lock.
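
For reference, a rough sketch of what that check could look like in the
vring interrupt path. The driver_ok field is an assumption for illustration
only, not the actual patch:

/*
 * Sketch: "driver_ok" is an illustrative bool on struct virtio_device,
 * set by virtio_device_ready() before it calls synchronize_irq() and
 * cleared by virtio_reset_device().
 */
irqreturn_t vring_interrupt(int irq, void *_vq)
{
        struct vring_virtqueue *vq = to_vvq(_vq);

        /*
         * Reject interrupts raised before virtio_device_ready(); as
         * discussed above, the handler runs under the irq descriptor
         * lock, so a plain load is used here.
         */
        if (!vq->vq.vdev->driver_ok)
                return IRQ_NONE;

        if (!more_used(vq)) {
                pr_debug("virtqueue interrupt with no work for %p\n", vq);
                return IRQ_NONE;
        }

        if (vq->vq.callback)
                vq->vq.callback(&vq->vq);

        return IRQ_HANDLED;
}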

> >
> >
> > > >
> > > >
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > > > We use smp_store_relase()
> > > > > > > > > to make sure the driver commits the setup before enabling the 
> > > > > > > > > irq. It
> > > > > > > > > means the read needs to be 

Re: [PATCH v1] net/mlx5: Add support for configuring max device MTU

2022-03-28 Thread Michael S. Tsirkin
On Mon, Feb 21, 2022 at 02:19:27PM +0200, Eli Cohen wrote:
> Allow an admin creating a vdpa device to specify the max MTU for the
> net device.
> 
> For example, to create a device with max MTU of 1000, the following
> command can be used:
> 
> $ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 mtu 1000
> 
> This configuration mechanism assumes that vdpa is the sole real user of
> the function. mlx5_core could theoretically change the mtu of the
> function using the ip command on the mlx5_core net device but this
> should not be done.
> 
> Reviewed-by: Si-Wei Liu

missing space before <

> Signed-off-by: Eli Cohen 
> ---
> v0 -> v1:
> Update changelog


can you rebase pls? does not seem to apply

>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 32 ++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 6156cf6e9377..be095dc1134e 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -2689,6 +2689,28 @@ static int event_handler(struct notifier_block *nb, 
> unsigned long event, void *p
>   return ret;
>  }
>  
> +static int config_func_mtu(struct mlx5_core_dev *mdev, u16 mtu)
> +{
> + int inlen = MLX5_ST_SZ_BYTES(modify_nic_vport_context_in);
> + void *in;
> + int err;
> +
> + in = kvzalloc(inlen, GFP_KERNEL);
> + if (!in)
> + return -ENOMEM;
> +
> + MLX5_SET(modify_nic_vport_context_in, in, field_select.mtu, 1);
> + MLX5_SET(modify_nic_vport_context_in, in, nic_vport_context.mtu,
> +  mtu + MLX5V_ETH_HARD_MTU);
> + MLX5_SET(modify_nic_vport_context_in, in, opcode,
> +  MLX5_CMD_OP_MODIFY_NIC_VPORT_CONTEXT);
> +
> + err = mlx5_cmd_exec_in(mdev, modify_nic_vport_context, in);
> +
> + kvfree(in);
> + return err;
> +}
> +
>  static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name,
>const struct vdpa_dev_set_config *add_config)
>  {
> @@ -2749,6 +2771,13 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev 
> *v_mdev, const char *name,
>   mutex_init(&ndev->reslock);
>   mutex_init(&ndev->numq_lock);
>   config = &ndev->config;
> +
> + if (add_config->mask & BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MTU)) {
> + err = config_func_mtu(mdev, add_config->net.mtu);
> + if (err)
> + goto err_mtu;
> + }
> +
>   err = query_mtu(mdev, &mtu);
>   if (err)
>   goto err_mtu;
> @@ -2867,7 +2896,8 @@ static int mlx5v_probe(struct auxiliary_device *adev,
>   mgtdev->mgtdev.device = mdev->device;
>   mgtdev->mgtdev.id_table = id_table;
>   mgtdev->mgtdev.config_attr_mask = 
> BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MACADDR) |
> -   
> BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MAX_VQP);
> +   
> BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MAX_VQP) |
> +   BIT_ULL(VDPA_ATTR_DEV_NET_CFG_MTU);
>   mgtdev->mgtdev.max_supported_vqs =
>   MLX5_CAP_DEV_VDPA_EMULATION(mdev, max_num_virtio_queues) + 1;
>   mgtdev->mgtdev.supported_features = get_supported_features(mdev);
> -- 
> 2.35.1



Re: [PATCH] virito-pci-modern: Remove useless DMA-32 fallback configuration

2022-03-28 Thread Michael S. Tsirkin
Typo in subj, and pls make subject match actual file name (_ not -).


On Fri, Mar 18, 2022 at 12:58:58AM +, cgel@gmail.com wrote:
> From: Minghao Chi 
> 
> As stated in [1], dma_set_mask() with a 64-bit mask will never fail if
> dev->dma_mask is non-NULL.
> So, if it fails, the 32 bits case will also fail for the same reason.
> 
> Simplify code and remove some dead code accordingly.
> 
> [1]: https://lkml.org/lkml/2021/6/7/398
> 
> Reported-by: Zeal Robot 
> Signed-off-by: Minghao Chi 
> ---
>  drivers/virtio/virtio_pci_modern_dev.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_pci_modern_dev.c 
> b/drivers/virtio/virtio_pci_modern_dev.c
> index e8b3ff2b9fbc..dff0b15a239d 100644
> --- a/drivers/virtio/virtio_pci_modern_dev.c
> +++ b/drivers/virtio/virtio_pci_modern_dev.c
> @@ -255,9 +255,6 @@ int vp_modern_probe(struct virtio_pci_modern_device *mdev)
>   }
>  
>   err = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(64));
> - if (err)
> - err = dma_set_mask_and_coherent(&pci_dev->dev,
> - DMA_BIT_MASK(32));
>   if (err)
>   dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. 
>  Trying to continue, but this might not work.\n");
>  
> -- 
> 2.25.1



Re: [PATCH v2 1/2] dt-bindings: virtio: mmio: add optional wakeup-source property

2022-03-28 Thread Michael S. Tsirkin
On Fri, Mar 25, 2022 at 09:59:45AM +0800, Minghao Xue wrote:
> Some systems want to set the interrupt of virtio_mmio device
> as a wakeup source. On such systems, we'll use the existence
> of the "wakeup-source" property as a signal of requirement.
> 
> Signed-off-by: Minghao Xue 

I don't have enough of a clue about dt to review this.
Pls get some acks from people with DT expertise.

> ---
> v1 -> v2: rename property from "virtio,wakeup" to "wakeup-source"
> 
>  Documentation/devicetree/bindings/virtio/mmio.yaml | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/virtio/mmio.yaml 
> b/Documentation/devicetree/bindings/virtio/mmio.yaml
> index 4b7a027..160b21b 100644
> --- a/Documentation/devicetree/bindings/virtio/mmio.yaml
> +++ b/Documentation/devicetree/bindings/virtio/mmio.yaml
> @@ -31,6 +31,10 @@ properties:
>  description: Required for devices making accesses thru an IOMMU.
>  maxItems: 1
>  
> +  wakeup-source:
> +type: boolean
> +description: Required for setting irq of a virtio_mmio device as wakeup 
> source.
> +
>  required:
>- compatible
>- reg
> -- 
> 2.7.4



Re:

2022-03-28 Thread Michael S. Tsirkin
On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> On Mon, Mar 28, 2022 at 1:59 PM Michael S. Tsirkin  wrote:
> >
> > On Mon, Mar 28, 2022 at 12:56:41PM +0800, Jason Wang wrote:
> > > On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > > > > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin  
> > > > > wrote:
> > > > > >
> > > > > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > Bcc:
> > > > > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > > > > Message-ID: <20220325021422-mutt-send-email-...@kernel.org>
> > > > > > > > Reply-To:
> > > > > > > > In-Reply-To: 
> > > > > > > >
> > > > > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > > > > >
> > > > > > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > > > > This is a rework on the previous IRQ hardening that is 
> > > > > > > > > > > done for
> > > > > > > > > > > virtio-pci where several drawbacks were found and were 
> > > > > > > > > > > reverted:
> > > > > > > > > > >
> > > > > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to 
> > > > > > > > > > > affinity managed IRQ
> > > > > > > > > > > that is used by some device such as virtio-blk
> > > > > > > > > > > 2) done only for PCI transport
> > > > > > > > > > >
> > > > > > > > > > > In this patch, we tries to borrow the idea from the INTX 
> > > > > > > > > > > IRQ hardening
> > > > > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden 
> > > > > > > > > > > INTX interrupts")
> > > > > > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > > > > > virtio_device. Then we can to toggle it during
> > > > > > > > > > > virtio_reset_device()/virtio_device_ready(). A 
> > > > > > > > > > > synchornize_rcu() is
> > > > > > > > > > > used in virtio_reset_device() to synchronize with the IRQ 
> > > > > > > > > > > handlers. In
> > > > > > > > > > > the future, we may provide config_ops for the transport 
> > > > > > > > > > > that doesn't
> > > > > > > > > > > use IRQ. With this, vring_interrupt() can return check 
> > > > > > > > > > > and early if
> > > > > > > > > > > irq_soft_enabled is false. This lead to 
> > > > > > > > > > > smp_load_acquire() to be used
> > > > > > > > > > > but the cost should be acceptable.
> > > > > > > > > > Maybe it should be but is it? Can't we use synchronize_irq 
> > > > > > > > > > instead?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Even if we allow the transport driver to synchornize through
> > > > > > > > > synchronize_irq() we still need a check in the 
> > > > > > > > > vring_interrupt().
> > > > > > > > >
> > > > > > > > > We do something like the following previously:
> > > > > > > > >
> > > > > > > > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > > > > return IRQ_NONE;
> > > > > > > > >
> > > > > > > > > But it looks like a bug since speculative read can be done 
> > > > &

Re:

2022-03-27 Thread Michael S. Tsirkin
On Mon, Mar 28, 2022 at 12:56:41PM +0800, Jason Wang wrote:
> On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin  wrote:
> >
> > On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin  
> > > > > wrote:
> > > > > >
> > > > > > Bcc:
> > > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > > Message-ID: <20220325021422-mutt-send-email-...@kernel.org>
> > > > > > Reply-To:
> > > > > > In-Reply-To: 
> > > > > >
> > > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > > >
> > > > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > > This is a rework on the previous IRQ hardening that is done 
> > > > > > > > > for
> > > > > > > > > virtio-pci where several drawbacks were found and were 
> > > > > > > > > reverted:
> > > > > > > > >
> > > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to 
> > > > > > > > > affinity managed IRQ
> > > > > > > > > that is used by some device such as virtio-blk
> > > > > > > > > 2) done only for PCI transport
> > > > > > > > >
> > > > > > > > > In this patch, we tries to borrow the idea from the INTX IRQ 
> > > > > > > > > hardening
> > > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX 
> > > > > > > > > interrupts")
> > > > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > > > virtio_device. Then we can to toggle it during
> > > > > > > > > virtio_reset_device()/virtio_device_ready(). A 
> > > > > > > > > synchornize_rcu() is
> > > > > > > > > used in virtio_reset_device() to synchronize with the IRQ 
> > > > > > > > > handlers. In
> > > > > > > > > the future, we may provide config_ops for the transport that 
> > > > > > > > > doesn't
> > > > > > > > > use IRQ. With this, vring_interrupt() can return check and 
> > > > > > > > > early if
> > > > > > > > > irq_soft_enabled is false. This lead to smp_load_acquire() to 
> > > > > > > > > be used
> > > > > > > > > but the cost should be acceptable.
> > > > > > > > Maybe it should be but is it? Can't we use synchronize_irq 
> > > > > > > > instead?
> > > > > > >
> > > > > > >
> > > > > > > Even if we allow the transport driver to synchornize through
> > > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > > >
> > > > > > > We do something like the following previously:
> > > > > > >
> > > > > > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > > return IRQ_NONE;
> > > > > > >
> > > > > > > But it looks like a bug since speculative read can be done before 
> > > > > > > the check
> > > > > > > where the interrupt handler can't see the uncommitted setup which 
> > > > > > > is done by
> > > > > > > the driver.
> > > > > >
> > > > > > I don't think so - if you sync after setting the value then
> > > > > > you are guaranteed that any handler running afterwards
> > > > > > will see the new value.
> > > > >
> > > > > The problem is not disabled but the enable.
> > > >
> > > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > > imo.
> > >
> > > It's the interrupt raised before setting irq_soft_enabled

Re: [PATCH v8 16/16] virtio_ring: introduce virtqueue_resize()

2022-03-25 Thread Michael S. Tsirkin
On Mon, Mar 14, 2022 at 05:34:55PM +0800, Xuan Zhuo wrote:
> Introduce virtqueue_resize() to implement resizing of the vring.
> Based on this, the driver can dynamically adjust the size of the vring,
> for example via ethtool -G.
> 
> virtqueue_resize() implements the resize based on the vq reset function.
> If allocating a new vring fails, it gives up the resize and keeps using
> the original vring.
> 
> During this process, if re-enabling the reset vq fails, the vq can no
> longer be used, although the probability of this situation is not high.
> 
> The recycle parameter is used to recycle the buffers that are no longer
> used.
> 
> Signed-off-by: Xuan Zhuo 
> ---
>  drivers/virtio/virtio_ring.c | 67 
>  include/linux/virtio.h   |  3 ++
>  2 files changed, 70 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index fb0abf9a2f57..b1dde086a8a4 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -2528,6 +2528,73 @@ struct virtqueue *vring_create_virtqueue(
>  }
>  EXPORT_SYMBOL_GPL(vring_create_virtqueue);
>  
> +/**
> + * virtqueue_resize - resize the vring of vq
> + * @vq: the struct virtqueue we're talking about.
> + * @num: new ring num
> + * @recycle: callback to recycle buffers that are no longer used
> + *
> + * When it is really necessary to create a new vring, it will set the
> + * current vq into the reset state. Then it calls the passed callback to
> + * recycle the buffers that are no longer used. Only after the new vring
> + * is successfully created is the old vring released.
> + *
> + * Caller must ensure we don't call this with other virtqueue operations
> + * at the same time (except where noted).
> + *
> + * Returns zero or a negative error.
> + * -ENOMEM: failed to create the new vring, but the vq can still work
> + * -EBUSY:  failed to reset/re-enable the vq; the vq may no longer work
> + * -ENOENT: resize is not supported
> + * -E2BIG/-EINVAL: invalid num parameter
> + */
> +int virtqueue_resize(struct virtqueue *vq, u32 num,
> +  void (*recycle)(struct virtqueue *vq, void *buf))
> +{
> + struct virtio_device *vdev = vq->vdev;
> + void *buf;
> + int err;
> +
> + if (num > vq->num_max)
> + return -E2BIG;
> +
> + if (!num)
> + return -EINVAL;
> +
> + if (to_vvq(vq)->packed.vring.num == num)
> + return 0;
> +
> + if (!vq->vdev->config->reset_vq)
> + return -ENOENT;
> +
> + if (!vq->vdev->config->enable_reset_vq)
> + return -ENOENT;
> +
> + err = vq->vdev->config->reset_vq(vq);
> + if (err) {
> + if (err != -ENOENT)
> + err = -EBUSY;
> + return err;
> + }
> +
> + while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> + recycle(vq, buf);


So all this callback can do now is drop all the buffers, and I think that
is not great.  Can we store them and invoke the callback after the queue
is re-enabled?
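
Something along these lines is what I have in mind (untested sketch only:
the stash array, its sizing and the surrounding error handling are
illustrative, not part of the posted patch):

        void **stash;
        unsigned int n = 0, i;

        /* the vring size is an upper bound on the number of unused buffers */
        stash = kvmalloc_array(virtqueue_get_vring_size(vq),
                               sizeof(*stash), GFP_KERNEL);
        if (!stash)
                return -ENOMEM;

        /* detach as before, but keep the buffers instead of recycling now */
        while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
                stash[n++] = buf;

        /* ... resize the ring and re-enable the vq as in the patch ... */

        /* only now hand the old buffers back to the driver */
        for (i = 0; i < n; i++)
                recycle(vq, stash[i]);
        kvfree(stash);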


> +
> + if (virtio_has_feature(vdev, VIRTIO_F_RING_PACKED))
> + err = virtqueue_resize_packed(vq, num);
> + else
> + err = virtqueue_resize_split(vq, num);
> +
> + if (err)
> + err = -ENOMEM;
> +
> + if (vq->vdev->config->enable_reset_vq(vq))
> + return -EBUSY;
> +
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(virtqueue_resize);
> +
>  /* Only available for split ring */
>  struct virtqueue *vring_new_virtqueue(unsigned int index,
> unsigned int num,
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index d59adc4be068..c86ff02e0ca0 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -91,6 +91,9 @@ dma_addr_t virtqueue_get_desc_addr(struct virtqueue *vq);
>  dma_addr_t virtqueue_get_avail_addr(struct virtqueue *vq);
>  dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
>  
> +int virtqueue_resize(struct virtqueue *vq, u32 num,
> +  void (*recycle)(struct virtqueue *vq, void *buf));
> +
>  /**
>   * virtio_device - representation of a device using virtio
>   * @index: unique position on the virtio bus
> -- 
> 2.31.0
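
Just for context, the ethtool -G path this is meant for would end up doing
something roughly like the following on the driver side (hypothetical
sketch; the rx queue structure and the free/refill helpers are made up,
they are not from any posted driver patch):

        struct my_rx_queue {                    /* hypothetical driver state */
                struct virtqueue *vq;
        };

        static void my_recycle_rx_buf(struct virtqueue *vq, void *buf)
        {
                /* hypothetical helper: return the buffer to the driver's pool */
                my_free_rx_buf(vq->vdev->priv, buf);
        }

        static int my_resize_rx(struct my_rx_queue *rq, u32 ring_num)
        {
                int err;

                err = virtqueue_resize(rq->vq, ring_num, my_recycle_rx_buf);
                if (err)
                        return err;

                /* refill the (possibly larger) ring with fresh buffers */
                return my_refill_rx(rq);
        }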

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re:

2022-03-25 Thread Michael S. Tsirkin
On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin  wrote:
> >
> > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > Bcc:
> > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > Message-ID: <20220325021422-mutt-send-email-...@kernel.org>
> > > > Reply-To:
> > > > In-Reply-To: 
> > > >
> > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > >
> > > > > On 2022/3/24 7:03 PM, Michael S. Tsirkin wrote:
> > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > This is a rework of the previous IRQ hardening that was done for
> > > > > > > virtio-pci, where several drawbacks were found and the changes
> > > > > > > were reverted:
> > > > > > >
> > > > > > > 1) it tried to use IRQF_NO_AUTOEN, which is not friendly to the
> > > > > > > affinity-managed IRQs used by some devices such as virtio-blk
> > > > > > > 2) it was done only for the PCI transport
> > > > > > >
> > > > > > > In this patch, we try to borrow the idea from the INTX IRQ
> > > > > > > hardening in the reverted commit 080cd7c3ac87 ("virtio-pci: harden
> > > > > > > INTX interrupts") by introducing a global irq_soft_enabled
> > > > > > > variable for each virtio_device. Then we can toggle it during
> > > > > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu()
> > > > > > > is used in virtio_reset_device() to synchronize with the IRQ
> > > > > > > handlers. In the future, we may provide config_ops for transports
> > > > > > > that don't use IRQs. With this, vring_interrupt() can check
> > > > > > > irq_soft_enabled and return early if it is false. This leads to
> > > > > > > smp_load_acquire() being used, but the cost should be acceptable.
> > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > >
> > > > >
> > > > > Even if we allow the transport driver to synchronize through
> > > > > synchronize_irq(), we still need a check in vring_interrupt().
> > > > >
> > > > > We did something like the following previously:
> > > > >
> > > > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > return IRQ_NONE;
> > > > >
> > > > > But it looks like a bug, since a speculative read can be done before
> > > > > the check, in which case the interrupt handler can't see the
> > > > > uncommitted setup done by the driver.
> > > >
> > > > I don't think so - if you sync after setting the value then
> > > > you are guaranteed that any handler running afterwards
> > > > will see the new value.
> > >
> > > The problem is not the disable path but the enable path.
> >
> > So a misbehaving device can lose interrupts? That's not a problem at all
> > imo.
> 
> It's the interrupt raised before setting irq_soft_enabled to true:
> 
> CPU 0 probe) driver-specific setup (not committed)
> CPU 1 IRQ handler) read the uninitialized variable
> CPU 0 probe) set irq_soft_enabled to true
> CPU 1 IRQ handler) read irq_soft_enable as true
> CPU 1 IRQ handler) use the uninitialized variable
> 
> Thanks

Yea, it hurts if you do it.  So do not do it then ;).

irq_soft_enabled (I think driver_ok or status is a better name)
should be initialized to false *before* irq is requested.

And requesting an irq commits all memory; otherwise all drivers would be
broken. If it doesn't, that needs to be fixed, not worked around in
virtio.
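
To spell out the ordering being argued about, a minimal sketch (the field
name, the flags and the exact call sites are illustrative only, this is
not the actual patch):

        /* probe path: the flag starts out false before the IRQ exists */
        vdev->driver_ok = false;
        err = request_irq(irq, vring_interrupt, 0, "virtio-vq", vq);
        if (err)
                return err;

        /* ... driver-specific setup: callbacks, buffers, etc. ... */

        /* publish the setup; pairs with the acquire below */
        smp_store_release(&vdev->driver_ok, true);

        /* vring_interrupt(): bail out until the driver is ready */
        if (!smp_load_acquire(&vdev->driver_ok))
                return IRQ_NONE;

Whether the release/acquire pair is really needed, or whether the barriers
implied by request_irq() and the device reset are enough, is exactly the
question here.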


> >
> > > We use smp_store_release()
> > > to make sure the driver commits the setup before enabling the irq. It
> > > means the read needs to be ordered as well in vring_interrupt().
> > >
> > > >
> > > > Although I couldn't find anything about this in memory-barriers.txt

Re:

2022-03-25 Thread Michael S. Tsirkin
On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin  wrote:
> >
> > Bcc:
> > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > Message-ID: <20220325021422-mutt-send-email-...@kernel.org>
> > Reply-To:
> > In-Reply-To: 
> >
> > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > >
> > > On 2022/3/24 7:03 PM, Michael S. Tsirkin wrote:
> > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > This is a rework of the previous IRQ hardening that was done for
> > > > > virtio-pci, where several drawbacks were found and the changes were
> > > > > reverted:
> > > > >
> > > > > 1) it tried to use IRQF_NO_AUTOEN, which is not friendly to the
> > > > > affinity-managed IRQs used by some devices such as virtio-blk
> > > > > 2) it was done only for the PCI transport
> > > > >
> > > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX
> > > > > interrupts") by introducing a global irq_soft_enabled variable for
> > > > > each virtio_device. Then we can toggle it during
> > > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > > used in virtio_reset_device() to synchronize with the IRQ handlers.
> > > > > In the future, we may provide config_ops for transports that don't
> > > > > use IRQs. With this, vring_interrupt() can check irq_soft_enabled and
> > > > > return early if it is false. This leads to smp_load_acquire() being
> > > > > used, but the cost should be acceptable.
> > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > >
> > >
> > > Even if we allow the transport driver to synchronize through
> > > synchronize_irq(), we still need a check in vring_interrupt().
> > >
> > > We did something like the following previously:
> > >
> > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > return IRQ_NONE;
> > >
> > > But it looks like a bug, since a speculative read can be done before the
> > > check, in which case the interrupt handler can't see the uncommitted
> > > setup done by the driver.
> >
> > I don't think so - if you sync after setting the value then
> > you are guaranteed that any handler running afterwards
> > will see the new value.
> 
> The problem is not the disable path but the enable path.

So a misbehaving device can lose interrupts? That's not a problem at all
imo.

> We use smp_store_release()
> to make sure the driver commits the setup before enabling the irq. It
> means the read needs to be ordered as well in vring_interrupt().
> 
> >
> > Although I couldn't find anything about this in memory-barriers.txt
> > which surprises me.
> >
> > CC Paul to help make sure I'm right.
> >
> >
> > >
> > > >
> > > > > To avoid breaking legacy devices which can send an IRQ before DRIVER_OK,
> > > > > a module parameter is introduced to enable the hardening, so the
> > > > > hardening is disabled by default.
> > > > Which devices are these? How come they send an interrupt before there
> > > > are any buffers in any queues?
> > >
> > >
> > > I copied this from the commit log for 22b7050a024d7
> > >
> > > "
> > >
> > > This change will also benefit old hypervisors (before 2009)
> > > that send interrupts without checking DRIVER_OK: previously,
> > > the callback could race with driver-specific initialization.
> > > "
> > >
> > > If this is only for config interrupt, I can remove the above log.
> >
> >
> > This is only for config interrupt.
> 
> Ok.
> 
> >
> > >
> > > >
> > > > > Note that the hardening is only done for vring interrupt since the
> > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > handler because it uses spinlock to do the synchronization which is
> > > > > expensive.
> > > 
