[virtio-dev] RE: [PATCH v12 3/3] admin: Add group member legacy register access commands

2023-07-07 Thread Parav Pandit
> From: Michael S. Tsirkin 
> Sent: Friday, July 7, 2023 5:31 AM


> ok getting there. mostly notification needs a bit more work but it's close.

Fixed all the comments in v13.
Thanks.



[virtio-dev] [PATCH v13] admin: Add group member legacy register access commands

2023-07-07 Thread Parav Pandit
Introduce group member legacy common configuration and legacy device
configuration access read/write commands.

Group member legacy register access commands enable the group owner driver
to access legacy registers on behalf of the guest virtual machine.

Usecase:

1. A hypervisor/system needs to provide transitional
   virtio devices to the guest VM at scale of thousands,
   typically, one to eight devices per VM.

2. A hypervisor/system needs to provide such devices using a
   vendor agnostic driver in the hypervisor system.

3. A hypervisor system prefers to have single stack regardless of
   virtio device type (net/blk) and be future compatible with a
   single vfio stack using SR-IOV or other scalable device
   virtualization technology to map PCI devices to the guest VM.
   (as transitional or otherwise)

Motivation/Background:
=
The existing virtio transitional PCI device is missing support for
PCI SR-IOV based devices. Currently it does not work beyond the
PCI PF, or in reality works only as a software-emulated device. It
has the system-level limitations cited below:

[a] PCIe spec citation:
VFs do not support I/O Space and thus VF BARs shall not indicate I/O Space.

[b] cpu arch citation:
Intel 64 and IA-32 Architectures Software Developer’s Manual:
The processor’s I/O address space is separate and distinct from
the physical-memory address space. The I/O address space consists
of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.

[c] PCIe spec citation:
If a bridge implements an I/O address range,...I/O address range will be
aligned to a 4 KB boundary.

Overview:
=
The above use case requirements are solved by the PCI PF group owner
accessing its group member PCI VFs' legacy registers using the
administration commands of the group owner PCI PF.

Two types of administration commands are added which read/write PCI VF
registers.
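
To make the command layout concrete, here is a rough C sketch of the
command-specific data a group owner driver could pass with a legacy register
write command. The struct name, padding size and field layout are illustrative
assumptions for this example, not the normative structures from the patch
itself:

#include <stdint.h>

/* Illustrative only: command-specific data for a legacy register write.
 * The admin command header (opcode, group_member_id, ...) is not shown. */
struct legacy_cfg_write_example {
    uint8_t offset;       /* byte offset into the member's legacy register space */
    uint8_t reserved[7];  /* assumed padding for alignment (size is an assumption) */
    uint8_t registers[];  /* little-endian register bytes to write */
};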

Software usage example:
===

1. One way to use and map to the guest VM is by using vfio driver
framework in Linux kernel.

+--+
|pci_dev_id = 0x100X   |
+---|pci_rev_id = 0x0  |-+
|vfio device|BAR0 = I/O region | |
|   |Other attributes  | |
|   +--+ |
||
+   +--+ +-+ |
|   |I/O BAR to AQ | | Other vfio  | |
|   |rd/wr mapper& | | functionalities | |
|   | forwarder| | | |
|   +--+ +-+ |
||
+--+-+---+
   | |
   Config region |
 accessDriver notifications
   | |
  +++   +++
  | +-+ |   | PCI VF device A |
  | | AQ  |-+>+-+ |
  | +-+ |   |   | | legacy regs | |
  | PCI PF device   |   |   | +-+ |
  +-+   |   +-+
|
|   +++
|   | PCI VF device N |
+>+-+ |
| | legacy regs | |
| +-+ |
+-+

2. Continue to use the virtio pci driver to bind to the
   listed device id and use it as in the host.

3. Use it in a light weight hypervisor to run bare-metal OS.

Fixes: https://github.com/oasis-tcs/virtio-spec/issues/167
Signed-off-by: Parav Pandit 
---
changelog:
v12->v13:
- added article
- add hyphen between little and endian
- mentioned vq index depth of 16-bit
- rewrote alternative approach line
- mention vq index, length and endianness in mmio description
- fixed padding bytes size from 7 to 6 bytes
- rewrote bar field description
- offset alignment text added
- added text to ignore reserved notification entries
- device and driver conformance lines added for notification info command fields
- dropped group member prefix to the driver
- reworded text for flags requirements
- reworded to say all driver notifications in conformance
- itemize conformance entries under command to ease reading
v11->v12:
- added missing article the at few places
- rewrote group_member_id statements like other existing
  commands which is cleaner and shorter
- added length and alignment lines to multiple commands
- rewrote fast path to separate dedicated mechanism
- rewrote example and description para for legacy notification command
- made separate paragraph for the notify info command
- dropped citation to virtio pci capabilities for member device
- notification region changed to notification address throughout
- added description to all the fields of the info struct
- avoided union in spirit of keeping all for pci
- used single listing
- 

[virtio-dev] RE: [PATCH v12 0/3] admin: Access legacy registers using admin commands

2023-07-07 Thread Parav Pandit


> From: Cornelia Huck 
> Sent: Friday, July 7, 2023 7:46 AM
> > Patch summary:
> > --
> > patch-1 fix split rows of admin opcode tables by a line
> > patch-2 fix section numbering
> 
> Pushed 1+2 as editorial updates.

Thanks.




[virtio-dev] RE: [virtio-comment] Re: [PATCH v12 3/3] admin: Add group member legacy register access commands

2023-07-07 Thread Parav Pandit


> From: virtio-comm...@lists.oasis-open.org On Behalf Of Michael S. Tsirkin
> Sent: Friday, July 7, 2023 10:17 AM

> > > > +group member driver SHOULD use the notification address to send a
> > > > +driver notification to the device.
> > >
> > > No, the group member driver is not involved.
> > Ok.
> >
> > > Just say the driver. And "driver notifications" - you want this for all 
> > > of them
> no?
> > >
> > Didn’t understand the question.
> > Do you mean "all devices"?
> 
> all driver notifications.
Oh ok. yes. adding it.
Thanks.


[virtio-dev] Re: [PATCH v12 3/3] admin: Add group member legacy register access commands

2023-07-07 Thread Michael S. Tsirkin
On Fri, Jul 07, 2023 at 01:26:56PM +, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin 
> > Sent: Friday, July 7, 2023 5:31 AM
> 
> [..]
> > > +For VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_WRITE and
> > > +VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_READ commands, the driver
> > SHOULD set
> > > +\field{offset} and the length of the \field{data} to refer to a
> > > +single field within the virtio device-specific configuration.
> > > +
> > > +If VIRTIO_ADMIN_CMD_LEGACY_NOTIFY_INFO command is supported, the
> > > +group member driver SHOULD use the notification address to send a
> > > +driver notification to the device.
> > 
> > No, the group member driver is not involved.
> Ok.
> 
> > Just say the driver. And "driver notifications" - you want this for all of 
> > them no?
> > 
> Didn’t understand the question. 
> Do you mean "all devices"?

all driver notifications.

-- 
MST





[virtio-dev] RE: [PATCH v12 3/3] admin: Add group member legacy register access commands

2023-07-07 Thread Parav Pandit


> From: Michael S. Tsirkin 
> Sent: Friday, July 7, 2023 5:31 AM

[..]
> > +For VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_WRITE and
> > +VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_READ commands, the driver
> SHOULD set
> > +\field{offset} and the length of the \field{data} to refer to a
> > +single field within the virtio device-specific configuration.
> > +
> > +If VIRTIO_ADMIN_CMD_LEGACY_NOTIFY_INFO command is supported, the
> > +group member driver SHOULD use the notification address to send a
> > +driver notification to the device.
> 
> No, the group member driver is not involved.
Ok.

> Just say the driver. And "driver notifications" - you want this for all of 
> them no?
> 
Didn’t understand the question. 
Do you mean "all devices"?
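
For reference, a driver notification through the legacy interface is a 16-bit
little-endian write of the virtqueue index to the notification address. A
minimal C sketch, assuming the notification address returned by
VIRTIO_ADMIN_CMD_LEGACY_NOTIFY_INFO has already been mapped by the driver,
could be:

#include <stdint.h>

/* Sketch only: 'notify_addr' is assumed to be the mapped notification
 * address obtained via VIRTIO_ADMIN_CMD_LEGACY_NOTIFY_INFO. */
static inline void legacy_driver_notify(volatile uint16_t *notify_addr,
                                        uint16_t vq_index)
{
    /* Legacy driver notification: write the 16-bit virtqueue index,
     * little-endian, to the notification address. The plain store assumes
     * a little-endian CPU; a real driver would use its MMIO write helper. */
    *notify_addr = vq_index;
}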


[virtio-dev] Re: [RFC PATCH] docs/interop: define STANDALONE protocol feature for vhost-user

2023-07-07 Thread Alex Bennée


"Michael S. Tsirkin"  writes:

> On Fri, Jul 07, 2023 at 08:58:00AM +0100, Alex Bennée wrote:
>> 
>> "Michael S. Tsirkin"  writes:
>> 
>> > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
>> >> Currently QEMU has to know some details about the back-end to be able
>> >> to setup the guest. While various parts of the setup can be delegated
>> >> to the backend (for example config handling) this is a very piecemeal
>> >> approach.
>> >
>> >> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
>> >> which the back-end can advertise which allows a probe message to be
>> >> sent to get all the details QEMU needs to know in one message.
>> >
>> > The reason we do piecemeal is that these existing pieces can be reused
>> > as others evolve or fall by wayside.
>> 
>> Sure I have no objection in principle but we then turn code like:
>> 
>> if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STANDALONE)) {
>>     err = vhost_user_get_backend_specs(dev, errp);
>>     if (err < 0) {
>>         error_setg_errno(errp, EPROTO, "vhost_get_backend_specs failed");
>>         return -EPROTO;
>>     }
>> }
>> 
>> to
>> 
>> if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_ID) &&
>>     dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_CFGSZ) &&
>>     dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MINVQ) &&
>>     dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MAXVQ)
>> ) {
>>     err = vhost_user_get_virtio_id(dev, errp);
>>     if (err < 0) {
>>         error_setg_errno(errp, EPROTO, "vhost_get_backend_id failed");
>>         return -EPROTO;
>>     }
>>     err = vhost_user_get_virtio_cfgsz(dev, errp);
>>     if (err < 0) {
>>         error_setg_errno(errp, EPROTO, "vhost_get_backend_cfgsz failed");
>>         return -EPROTO;
>>     }
>>     err = vhost_user_get_virtio_minvq(dev, errp);
>>     if (err < 0) {
>>         error_setg_errno(errp, EPROTO, "vhost_get_backend_minvq failed");
>>         return -EPROTO;
>>     }
>>     err = vhost_user_get_virtio_maxvq(dev, errp);
>>     if (err < 0) {
>>         error_setg_errno(errp, EPROTO, "vhost_get_backend_maxvq failed");
>>         return -EPROTO;
>>     }
>>     dev->specs.valid = true;
>> }
>> 
>> for little gain IMHO.
>>
>> > For example, I can think of instances where you want to connect
>> > specifically to e.g. networking backend, and specify it
>> > on command line. Reasons could be many, e.g. for debugging,
>> > or to prevent connecting to wrong device on wrong channel
>> > (kind of like type safety).
>> 
>> I don't quite follow what you are trying to say here.
>
> That some or all of these might be better on qemu command line
> not come from backend. Then we'll want to *send* it to backend.
> All this at our discretion without protocol changes.

That doesn't solve the standalone problem though (not all VMMs are QEMU
after all). I'm currently putting together a PoC with the
vhost-user-device and I was intending:

 - no CLI args, probe and if nothing fail
 - CLI args, probe with no response, continue with CLI args
 - CLI args, probe with response, check args match (or in bounds for
   vqs) and fail if not

Stefan wasn't super keen on the vhost-user-device in v2 being user
creatable because things could go weird quite quickly in hard to debug
ways:

  Message-Id: <20230418162140.373219-1-alex.ben...@linaro.org>
  Date: Tue, 18 Apr 2023 17:21:27 +0100
  Subject: [PATCH v2 00/13] virtio: add vhost-user-generic and reduce copy and 
paste
  From: =?UTF-8?q?Alex=20Benn=C3=A9e?= 

However it certainly is useful from a development point of view being
able to plug in new VirtIO backends without having to copy and paste
another slightly different stub into QEMU. I was pondering a middle
ground of maybe making the CLI options all x- variants to emphasise the
"here be dragons please know what you are doing" aspect of them.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro




[virtio-dev] Re: [PATCH v12 0/3] admin: Access legacy registers using admin commands

2023-07-07 Thread Cornelia Huck
On Fri, Jul 07 2023, Parav Pandit  wrote:

> This short series introduces legacy region access commands by the group owner
> device for its member devices, allowing the owner to access the legacy
> registers of the member VFs.
> Currently it is applicable to the PCI PF and VF devices. If in future any
> SIOV devices need to support legacy registers, they can be easily supported
> using the same commands with the group member identifiers of the future SIOV
> devices.
>
> More details as overview, motivation, use case are further described
> below.
>
> Patch summary:
> --
> patch-1 fix split rows of admin opcode tables by a line
> patch-2 fix section numbering

Pushed 1+2 as editorial updates.

> patch-3 add legacy region access commands





[virtio-dev] Re: [PATCH v11 0/3] admin: Access legacy registers using admin commands

2023-07-07 Thread Cornelia Huck
On Thu, Jul 06 2023, "Michael S. Tsirkin"  wrote:

> On Fri, Jul 07, 2023 at 12:27:19AM +0300, Parav Pandit wrote:
>> This short series introduces legacy region access commands by the group owner
>> device for its member devices, allowing the owner to access the legacy
>> registers of the member VFs.
>> Currently it is applicable to the PCI PF and VF devices. If in future any
>> SIOV devices need to support legacy registers, they can be easily supported
>> using the same commands with the group member identifiers of the future SIOV
>> devices.
>> 
>> More details as overview, motivation, use case are further described
>> below.
>
> Cornelia, want to apply 1,2 as editorial?

I guess that makes sense to get them out of the way.





[virtio-dev] Re: [RFC PATCH] docs/interop: define STANDALONE protocol feature for vhost-user

2023-07-07 Thread Stefano Garzarella

On Thu, Jul 06, 2023 at 05:31:15PM +0100, Alex Bennée wrote:


Alex Bennée  writes:


Currently QEMU has to know some details about the back-end to be able
to setup the guest. While various parts of the setup can be delegated
to the backend (for example config handling) this is a very piecemeal
approach.

This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
which the back-end can advertise which allows a probe message to be
sent to get all the details QEMU needs to know in one message.

Signed-off-by: Alex Bennée 

---
Initial RFC for discussion. I intend to prototype this work with QEMU
and one of the rust-vmm vhost-user daemons.
---
 docs/interop/vhost-user.rst | 37 +
 hw/virtio/vhost-user.c  |  8 
 2 files changed, 45 insertions(+)

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 5a070adbc1..85b1b1583a 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -275,6 +275,21 @@ Inflight description

 :queue size: a 16-bit size of virtqueues

+Backend specifications
+^^
+
++-----------+-------------+------------+------------+
+| device id | config size |   min_vqs  |   max_vqs  |
++-----------+-------------+------------+------------+
+
+:device id: a 32-bit value holding the VirtIO device ID
+
+:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
+
+:min_vqs: a 32-bit value holding the minimum number of vqs supported
+
+:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
+
 C structure
 ---

@@ -296,6 +311,7 @@ In QEMU the vhost-user message is implemented with the 
following struct:
   VhostUserConfig config;
   VhostUserVringArea area;
   VhostUserInflight inflight;
+  VhostUserBackendSpecs specs;
   };
   } QEMU_PACKED VhostUserMsg;

@@ -316,6 +332,7 @@ replies. Here is a list of the ones that do:
 * ``VHOST_USER_GET_VRING_BASE``
 * ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
 * ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
+* ``VHOST_USER_GET_BACKEND_SPECS`` (if ``VHOST_USER_PROTOCOL_F_STANDALONE``)

 .. seealso::

@@ -885,6 +902,13 @@ Protocol features
   #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
   #define VHOST_USER_PROTOCOL_F_STATUS   16
   #define VHOST_USER_PROTOCOL_F_XEN_MMAP 17
+  #define VHOST_USER_PROTOCOL_F_STANDALONE   18
+
+Some features are only valid in the presence of other supporting
+features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
+backend must also support ``VHOST_USER_PROTOCOL_F_CONFIG`` and
+``VHOST_USER_PROTOCOL_F_STATUS``.
+


This is too tight a restriction as not all VirtIO backends manage a
config space. So I suggest the following:

 Some features are only valid in the presence of other supporting
 features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
 backend must also support ``VHOST_USER_PROTOCOL_F_STATUS`` and
 optionally ``VHOST_USER_PROTOCOL_F_CONFIG`` (if there is a config space).


Right, but could we describe it more as a kind of dependence between
features?

Something like this:

Some features depend on others to be supported:

* ``VHOST_USER_PROTOCOL_F_STANDALONE`` depends on:

  * ``VHOST_USER_PROTOCOL_F_STATUS``
  * ``VHOST_USER_PROTOCOL_F_CONFIG`` (if there is a config space)
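
In front-end code such a dependency could be enforced with a small check.
This is only a sketch: the bit values follow docs/interop/vhost-user.rst
(with 18 being the value proposed in this RFC), and the helper name is made
up for the example.

#include <stdbool.h>
#include <stdint.h>

#define VHOST_USER_PROTOCOL_F_CONFIG      9
#define VHOST_USER_PROTOCOL_F_STATUS      16
#define VHOST_USER_PROTOCOL_F_STANDALONE  18

/* Returns true when F_STANDALONE is either absent or accompanied by the
 * features it depends on. */
static bool standalone_deps_ok(uint64_t protocol_features, bool has_config_space)
{
    if (!(protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STANDALONE))) {
        return true;  /* nothing to check */
    }
    if (!(protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STATUS))) {
        return false; /* STATUS is always required */
    }
    if (has_config_space &&
        !(protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_CONFIG))) {
        return false; /* CONFIG is required when there is a config space */
    }
    return true;
}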

Thanks,
Stefano





Re: [virtio-dev] [RFC PATCH] docs/interop: define STANDALONE protocol feature for vhost-user

2023-07-07 Thread Stefano Garzarella

On Tue, Jul 04, 2023 at 04:02:42PM +0100, Alex Bennée wrote:


Stefano Garzarella  writes:


On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:

Currently QEMU has to know some details about the back-end to be able
to setup the guest. While various parts of the setup can be delegated
to the backend (for example config handling) this is a very piecemeal
approach.

This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
which the back-end can advertise which allows a probe message to be
sent to get all the details QEMU needs to know in one message.

Signed-off-by: Alex Bennée 

---
Initial RFC for discussion. I intend to prototype this work with QEMU
and one of the rust-vmm vhost-user daemons.


Thanks for starting this discussion!

I'm comparing with vhost-vdpa IOCTLs, so my questions may be
superficial, but they help me understand the differences.


I did have a quick read-through to get a handle on vhost-vdpa but the
docs are fairly empty. However I see there is more detail in the linux
commit so after looking at that I do wonder:

* The kernel commit defines a subset of VIRTIO_F feature flags. Should
  we do the same for this interface?


Sorry, I didn't get this.

Do you mean that the kernel is filtering some features?
Or are you talking about backend features?



* The VDPA GET/SET STATUS and GET/SET CONFIG ioctls are already covered
  by the equivalent VHOST_USER messages?


Yep, I think so.




---
docs/interop/vhost-user.rst | 37 +
hw/virtio/vhost-user.c  |  8 
2 files changed, 45 insertions(+)

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 5a070adbc1..85b1b1583a 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -275,6 +275,21 @@ Inflight description

:queue size: a 16-bit size of virtqueues

+Backend specifications
+^^
+
++---+-+++
+| device id | config size |   min_vqs  |   max_vqs  |
++---+-+++
+
+:device id: a 32-bit value holding the VirtIO device ID
+
+:config size: a 32-bit value holding the config size (see 
``VHOST_USER_GET_CONFIG``)
+
+:min_vqs: a 32-bit value holding the minimum number of vqs supported


Why do we need the minimum?


We need to know the minimum number because some devices have fixed VQs
that must be present.


But does QEMU need to know this?

Or is it okay that the driver will then fail in the guest if there
are not the right number of queues?






+
+:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be 
>= min_vqs


Does this overlap with VHOST_USER_GET_QUEUE_NUM?


Yes it does and I considered implementing a bunch of messages to fill in
around what we already have. However that seemed like it would add a
bunch of complexity to the interface when we could get all the initial
data in one go.


Yes I understand, though if we need to add new things to return in the
future how do we do it? Do we need to provide features for this
structure?






+
C structure
---

@@ -296,6 +311,7 @@ In QEMU the vhost-user message is implemented with the 
following struct:
  VhostUserConfig config;
  VhostUserVringArea area;
  VhostUserInflight inflight;
+  VhostUserBackendSpecs specs;
  };
  } QEMU_PACKED VhostUserMsg;

@@ -316,6 +332,7 @@ replies. Here is a list of the ones that do:
* ``VHOST_USER_GET_VRING_BASE``
* ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
* ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
+* ``VHOST_USER_GET_BACKEND_SPECS`` (if ``VHOST_USER_PROTOCOL_F_STANDALONE``)

.. seealso::

@@ -885,6 +902,13 @@ Protocol features
  #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
  #define VHOST_USER_PROTOCOL_F_STATUS   16
  #define VHOST_USER_PROTOCOL_F_XEN_MMAP 17
+  #define VHOST_USER_PROTOCOL_F_STANDALONE   18
+
+Some features are only valid in the presence of other supporting
+features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
+backend must also support ``VHOST_USER_PROTOCOL_F_CONFIG`` and
+``VHOST_USER_PROTOCOL_F_STATUS``.
+


What about adding a new section where we will describe what we mean
with "standalone" devices?

For example that the entire virtio device is emulated in the backend,
etc.

By the way, I was thinking more about F_FULL_DEVICE, but I'm not good
with names, so I'll just throw out an idea :-)


Naming things is hard ;-)


I know :-)



I could add a new section although AIUI there is nothing really in
existing daemons which is split between the front-end and back-end. The
stubs are basically boilerplate and ensure DT/PCIe hubs make the device
appear so once things are discovered QEMU never really gets involved
aside from being a dumb relay.


For the backend I don't think there is much to say, but for the frontend
it changes a lot in my opinion 

[virtio-dev] Re: [RFC PATCH] docs/interop: define STANDALONE protocol feature for vhost-user

2023-07-07 Thread Michael S. Tsirkin
On Fri, Jul 07, 2023 at 08:58:00AM +0100, Alex Bennée wrote:
> 
> "Michael S. Tsirkin"  writes:
> 
> > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> >> Currently QEMU has to know some details about the back-end to be able
> >> to setup the guest. While various parts of the setup can be delegated
> >> to the backend (for example config handling) this is a very piecemeal
> >> approach.
> >
> >> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> >> which the back-end can advertise which allows a probe message to be
> >> sent to get all the details QEMU needs to know in one message.
> >
> > The reason we do piecemeal is that these existing pieces can be reused
> > as others evolve or fall by wayside.
> 
> Sure I have no objection in principle but we then turn code like:
> 
> if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STANDALONE)) {
>     err = vhost_user_get_backend_specs(dev, errp);
>     if (err < 0) {
>         error_setg_errno(errp, EPROTO, "vhost_get_backend_specs failed");
>         return -EPROTO;
>     }
> }
> 
> to
> 
> if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_ID) &&
>     dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_CFGSZ) &&
>     dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MINVQ) &&
>     dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MAXVQ)
> ) {
>     err = vhost_user_get_virtio_id(dev, errp);
>     if (err < 0) {
>         error_setg_errno(errp, EPROTO, "vhost_get_backend_id failed");
>         return -EPROTO;
>     }
>     err = vhost_user_get_virtio_cfgsz(dev, errp);
>     if (err < 0) {
>         error_setg_errno(errp, EPROTO, "vhost_get_backend_cfgsz failed");
>         return -EPROTO;
>     }
>     err = vhost_user_get_virtio_minvq(dev, errp);
>     if (err < 0) {
>         error_setg_errno(errp, EPROTO, "vhost_get_backend_minvq failed");
>         return -EPROTO;
>     }
>     err = vhost_user_get_virtio_maxvq(dev, errp);
>     if (err < 0) {
>         error_setg_errno(errp, EPROTO, "vhost_get_backend_maxvq failed");
>         return -EPROTO;
>     }
>     dev->specs.valid = true;
> }
> 
> for little gain IMHO.
>
> > For example, I can think of instances where you want to connect
> > specifically to e.g. networking backend, and specify it
> > on command line. Reasons could be many, e.g. for debugging,
> > or to prevent connecting to wrong device on wrong channel
> > (kind of like type safety).
> 
> I don't quite follow what you are trying to say here.

That some or all of these might be better on qemu command line
not come from backend. Then we'll want to *send* it to backend.
All this at our discretion without protocol changes.


> > What is the reason to have 1 message? startup latency?
> > How about we allow pipelining several messages then?
> > Will be easier.
> 
> I'm not overly worried about performance because this is all at
> start-up. I am worried about excessive complexity though. We already
> have quite a lot of interacting protocol messages.
> 
> >
> >
> >> 
> >> Signed-off-by: Alex Bennée 
> >> 
> >> ---
> >> Initial RFC for discussion. I intend to prototype this work with QEMU
> >> and one of the rust-vmm vhost-user daemons.
> >> ---
> >>  docs/interop/vhost-user.rst | 37 +
> >>  hw/virtio/vhost-user.c  |  8 
> >>  2 files changed, 45 insertions(+)
> >> 
> >> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> >> index 5a070adbc1..85b1b1583a 100644
> >> --- a/docs/interop/vhost-user.rst
> >> +++ b/docs/interop/vhost-user.rst
> >> @@ -275,6 +275,21 @@ Inflight description
> >>  
> >>  :queue size: a 16-bit size of virtqueues
> >>  
> >> +Backend specifications
> >> +^^
> >> +
> >> ++---+-+++
> >> +| device id | config size |   min_vqs  |   max_vqs  |
> >> ++---+-+++
> >> +
> >> +:device id: a 32-bit value holding the VirtIO device ID
> >> +
> >> +:config size: a 32-bit value holding the config size (see 
> >> ``VHOST_USER_GET_CONFIG``)
> >> +
> >> +:min_vqs: a 32-bit value holding the minimum number of vqs supported
> >> +
> >> +:max_vqs: a 32-bit value holding the maximum number of vqs supported, 
> >> must be >= min_vqs
> >> +
> >
> > looks like a weird set of info.
> 
> It's basically the information you need for -device vhost-user-device to
> start-up (and what is essentially the information set by the stubs as
> they start-up).
> 
> > why would we want # of vqs and not their sizes?
> 
> I thought the vrings themselves were allocated by the driver. We only
> need the number of vqs so we can allocate the 

[virtio-dev] Re: [PATCH v12 3/3] admin: Add group member legacy register access commands

2023-07-07 Thread Michael S. Tsirkin
On Fri, Jul 07, 2023 at 06:54:01AM +0300, Parav Pandit wrote:
> Introduce group member legacy common configuration and legacy device
> configuration access read/write commands.
> 
> Group member legacy registers access commands enable group owner driver
> software to access legacy registers on behalf of the guest virtual
> machine.
> 
> Usecase:
> 
> 1. A hypervisor/system needs to provide transitional
>virtio devices to the guest VM at scale of thousands,
>typically, one to eight devices per VM.
> 
> 2. A hypervisor/system needs to provide such devices using a
>vendor agnostic driver in the hypervisor system.
> 
> 3. A hypervisor system prefers to have single stack regardless of
>virtio device type (net/blk) and be future compatible with a
>single vfio stack using SR-IOV or other scalable device
>virtualization technology to map PCI devices to the guest VM.
>(as transitional or otherwise)
> 
> Motivation/Background:
> =
> The existing virtio transitional PCI device is missing support for
> PCI SR-IOV based devices. Currently it does not work beyond
> PCI PF, or as software emulated device in reality. Currently it
> has below cited system level limitations:
> 
> [a] PCIe spec citation:
> VFs do not support I/O Space and thus VF BARs shall not indicate I/O Space.
> 
> [b] cpu arch citation:
> Intel 64 and IA-32 Architectures Software Developer’s Manual:
> The processor’s I/O address space is separate and distinct from
> the physical-memory address space. The I/O address space consists
> of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.
> 
> [c] PCIe spec citation:
> If a bridge implements an I/O address range,...I/O address range will be
> aligned to a 4 KB boundary.
> 
> Overview:
> =
> Above usecase requirements is solved by PCI PF group owner accessing
> its group member PCI VFs legacy registers using the administration
> commands of the group owner PCI PF.
> 
> Two types of administration commands are added which read/write PCI VF
> registers.
> 
> Software usage example:
> ===
> 
> 1. One way to use and map to the guest VM is by using vfio driver
> framework in Linux kernel.
> 
> +--+
> |pci_dev_id = 0x100X   |
> +---|pci_rev_id = 0x0  |-+
> |vfio device|BAR0 = I/O region | |
> |   |Other attributes  | |
> |   +--+ |
> ||
> +   +--+ +-+ |
> |   |I/O BAR to AQ | | Other vfio  | |
> |   |rd/wr mapper& | | functionalities | |
> |   | forwarder| | | |
> |   +--+ +-+ |
> ||
> +--+-+---+
>| |
>Config region |
>  accessDriver notifications
>| |
>   +++   +++
>   | +-+ |   | PCI VF device A |
>   | | AQ  |-+>+-+ |
>   | +-+ |   |   | | legacy regs | |
>   | PCI PF device   |   |   | +-+ |
>   +-+   |   +-+
> |
> |   +++
> |   | PCI VF device N |
> +>+-+ |
> | | legacy regs | |
> | +-+ |
> +-+
> 
> 2. Continue to use the virtio pci driver to bind to the
>listed device id and use it as in the host.
> 
> 3. Use it in a light weight hypervisor to run bare-metal OS.
> 
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/167
> Signed-off-by: Parav Pandit 


ok getting there. mostly notification needs a bit more work
but it's close.


> ---
> changelog:
> v11->v12:
> - added missing article the at few places
> - rewrote group_member_id statements like other existing
>   commands which is cleaner and shorter
> - added length and alignment lines to multiple commands
> - rewrote fast path to separate dedicated mechanism
> - rewrote example and description para for legacy notification command
> - made separate paragraph for the notify info command
> - dropped citation to virtio pci capabilities for member device
> - notification region changed to notification address throughout
> - added description to all the fields of the info struct
> - avoided union in spirit of keeping all for pci
> - used single listing
> - moved description to end which was in between two structs
> - added 4 entry and preference description
> - added conformance line for notification via mmio works same way as
>   admin command
> v10->v11:
> - replaced tab with white spaces in read structure
> - 

Re: [virtio-dev][PATCH 1/2] virtio-spi: define the DEVICE ID for virtio SPI master

2023-07-07 Thread Haixu Cui

Hi Michael S. Tsirkin, Cornelia Huck,
I have created an issue 
https://github.com/oasis-tcs/virtio-spec/issues/174 to request Device ID 
45 for SPI master. Please help to deal with it, thank you very much.


Best Regards
Haixu Cui

On 6/30/2023 10:57 PM, Michael S. Tsirkin wrote:

On Mon, Apr 17, 2023 at 10:03:52PM +0800, Haixu Cui wrote:

Define the DEVICE ID of virtio SPI master device as 45.


It does not look like patch 2 will make it in time for 1.3
but should we vote on reserving device id maybe?
If yes pls create a github issue and request vote on list.


---
  content.tex | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/content.tex b/content.tex
index cff548a..29f2859 100644
--- a/content.tex
+++ b/content.tex
@@ -682,6 +682,8 @@ \chapter{Device Types}\label{sec:Device Types}
  \hline
  44 &   ISM device \\
  \hline
+45 &   SPI master \\
+\hline
  \end{tabular}
  
  Some of the devices above are unspecified by this document,

--
2.17.1









[virtio-dev] Re: [virtio-comment] RE: [PATCH v10 4/4] transport-pci: Introduce group legacy group member config region access

2023-07-07 Thread Michael S. Tsirkin
On Fri, Jul 07, 2023 at 10:12:35AM +0200, Cornelia Huck wrote:
> [Also, I notice that the discussion seems to get bogged down in tiny
> details, fundamental statements, or a mixture of both. It might
> sometimes be helpful to just step back, re-read the original proposal,
> and only then answer. Otherwise, this leads to frustration for everyone
> involved.]

I feel in this instance we are actually good. All that's left is
small details that's why we are discussing them.

-- 
MST





[virtio-dev] Re: [PATCH v11 3/3] admin: Add group member legacy register access commands

2023-07-07 Thread Michael S. Tsirkin
On Fri, Jul 07, 2023 at 03:54:31AM +, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin 
> > Sent: Thursday, July 6, 2023 6:36 PM
> 
> [..]
> 
> > I notice you decided to silently ignore my suggestion to document how
> > notifications are performed. Repeating myself like this is dispiriting for me.
> I am sorry if it appeared that way,
> but no, I didn’t silently ignore it.
> 
> I added the description as best I could find it, but you commented about it 
> being messy with mixing up the terminology.
> I rewrote it in v12; it looks better now. Please check.
> 
> > Pls re-add especially since you already document it for the cfg_Write access
> > method anyway.
> > 
> Added without citation to hypervisor etc.
> 
> > 
> > also in a conformance section, document the effect of notification being the
> > same as notification through legacy interface.
> 
> PCI specific things were copied from the current spec reference in the 
> notification capability section.
> All the changes you suggested are done.
> Captured in the change log of v12.
> 
> Thanks a lot.
> Since many parts are rewritten as you suggested in the thread,
> I would prefer to add your Signed-off-by in v13 if we need to roll it, or
> when it is merged; please apply it if you find it appropriate.
> It is important to me to add your Signed-off-by.

If you like sure.

-- 
MST





[virtio-dev] Re: [virtio-comment] RE: [PATCH v10 4/4] transport-pci: Introduce group legacy group member config region access

2023-07-07 Thread Cornelia Huck
On Thu, Jul 06 2023, Parav Pandit  wrote:

>> From: Michael S. Tsirkin 
>> Sent: Thursday, July 6, 2023 4:42 PM
>
> [..]
>> > Can we please avoid this over engineering?
>> > Interface has the doors open for driver to make wise decision depending on
>> its environment.
>> 
>> what if driver can access both with the same ease? this is the case that 
>> bothers
>> me and I think it's practical since it will be common on linux.
>
> Then Linux can say my preference is order, so it picks member.

Let's step back and consider what our goal is here: to provide a spec
that someone wanting to implement a driver can follow and end up with a
driver that works with devices that adhere to this spec.

I don't think expecting the driver to make "wise decisions" is helpful
for the person wanting to implement a driver. "Do this, and in case this
does not work in your environment, do that" seems much more helpful, and
still gives enough flexibility to cover different environments. In
short, make things simple for the consumer of the spec.

[Also, I notice that the discussion seems to get bogged down in tiny
details, fundamental statements, or a mixture of both. It might
sometimes be helpful to just step back, re-read the original proposal,
and only then answer. Otherwise, this leads to frustration for everyone
involved.]





[virtio-dev] Re: [RFC PATCH] docs/interop: define STANDALONE protocol feature for vhost-user

2023-07-07 Thread Alex Bennée


"Michael S. Tsirkin"  writes:

> On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
>> Currently QEMU has to know some details about the back-end to be able
>> to setup the guest. While various parts of the setup can be delegated
>> to the backend (for example config handling) this is a very piecemeal
>> approach.
>
>> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
>> which the back-end can advertise which allows a probe message to be
>> sent to get all the details QEMU needs to know in one message.
>
> The reason we do piecemeal is that these existing pieces can be reused
> as others evolve or fall by wayside.

Sure I have no objection in principle but we then turn code like:

if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STANDALONE)) {
    err = vhost_user_get_backend_specs(dev, errp);
    if (err < 0) {
        error_setg_errno(errp, EPROTO, "vhost_get_backend_specs failed");
        return -EPROTO;
    }
}

to

if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_ID) &&
    dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_CFGSZ) &&
    dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MINVQ) &&
    dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MAXVQ)
) {
    err = vhost_user_get_virtio_id(dev, errp);
    if (err < 0) {
        error_setg_errno(errp, EPROTO, "vhost_get_backend_id failed");
        return -EPROTO;
    }
    err = vhost_user_get_virtio_cfgsz(dev, errp);
    if (err < 0) {
        error_setg_errno(errp, EPROTO, "vhost_get_backend_cfgsz failed");
        return -EPROTO;
    }
    err = vhost_user_get_virtio_minvq(dev, errp);
    if (err < 0) {
        error_setg_errno(errp, EPROTO, "vhost_get_backend_minvq failed");
        return -EPROTO;
    }
    err = vhost_user_get_virtio_maxvq(dev, errp);
    if (err < 0) {
        error_setg_errno(errp, EPROTO, "vhost_get_backend_maxvq failed");
        return -EPROTO;
    }
    dev->specs.valid = true;
}

for little gain IMHO.

> For example, I can think of instances where you want to connect
> specifically to e.g. networking backend, and specify it
> on command line. Reasons could be many, e.g. for debugging,
> or to prevent connecting to wrong device on wrong channel
> (kind of like type safety).

I don't quite follow what you are trying to say here.

> What is the reason to have 1 message? startup latency?
> How about we allow pipelining several messages then?
> Will be easier.

I'm not overly worried about performance because this is all at
start-up. I am worried about excessive complexity though. We already
have quite a lot of interacting protocol messages.

>
>
>> 
>> Signed-off-by: Alex Bennée 
>> 
>> ---
>> Initial RFC for discussion. I intend to prototype this work with QEMU
>> and one of the rust-vmm vhost-user daemons.
>> ---
>>  docs/interop/vhost-user.rst | 37 +
>>  hw/virtio/vhost-user.c  |  8 
>>  2 files changed, 45 insertions(+)
>> 
>> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
>> index 5a070adbc1..85b1b1583a 100644
>> --- a/docs/interop/vhost-user.rst
>> +++ b/docs/interop/vhost-user.rst
>> @@ -275,6 +275,21 @@ Inflight description
>>  
>>  :queue size: a 16-bit size of virtqueues
>>  
>> +Backend specifications
>> +^^
>> +
>> ++---+-+++
>> +| device id | config size |   min_vqs  |   max_vqs  |
>> ++---+-+++
>> +
>> +:device id: a 32-bit value holding the VirtIO device ID
>> +
>> +:config size: a 32-bit value holding the config size (see 
>> ``VHOST_USER_GET_CONFIG``)
>> +
>> +:min_vqs: a 32-bit value holding the minimum number of vqs supported
>> +
>> +:max_vqs: a 32-bit value holding the maximum number of vqs supported, must 
>> be >= min_vqs
>> +
>
> looks like a weird set of info.

It's basically the information you need for -device vhost-user-device to
start-up (and what is essentially the information set by the stubs as
they start-up).

> why would we want # of vqs and not their sizes?

I thought the vrings themselves were allocated by the driver. We only
need the number of vqs so we can allocate the tracking structures.

> why config size but not config itself?

We already have GET_CONFIG and SET_CONFIG but without knowing the size
of the config space we can't properly set it up.
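
Put together, the fields from the proposed "Backend specifications" table map
onto a small structure of four 32-bit values. This is only a sketch of what
the VhostUserBackendSpecs union member added in the RFC might look like; the
field names follow the table, while the exact packing is an assumption.

#include <stdint.h>

typedef struct VhostUserBackendSpecs {
    uint32_t device_id;   /* VirtIO device ID */
    uint32_t config_size; /* config space size (see VHOST_USER_GET_CONFIG) */
    uint32_t min_vqs;     /* minimum number of vqs supported */
    uint32_t max_vqs;     /* maximum number of vqs supported, >= min_vqs */
} VhostUserBackendSpecs;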



-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro




Re: [virtio-dev][PATCH 2/2] virtio-spi: add the device specification

2023-07-07 Thread Haixu Cui

Hi Jonathon Reinhart,

Thank you so much for all your helpful advice and info!

I took a look at the latest Linux SPI driver and found that, for
cs_setup/cs_hold/cs_inactive set in the device tree, there are the following
two conditions:
1) if the SPI controller supports CS timing configuration and CS is not
using a GPIO line, then the SPI driver sets the
cs_setup/cs_hold/cs_inactive values into the corresponding SPI CS registers;
2) if CS is using a GPIO line, or the SPI controller doesn't support CS
timing configuration, the software performs the
cs_setup/cs_hold/cs_inactive delays, which is implemented in the function
spi_set_cs in drivers/spi/spi.c.


So I agree it is necessary to pass cs_setup/cs_hold/cs_inactive to 
the host for each transfer.


For condition 1, the host should set these values into the CS timing
registers. And as you mentioned, "one might require different CS
setup/hold times, depending on which slave_id they are talking to (on
the same bus)"; if so, the host needs to overwrite the CS timing registers
between two transfers talking to different slave_ids.


For condition 2, having the SPI driver running in the host perform the
cs_setup/cs_hold/cs_inactive delays seems more accurate than
performing the delays in the guest.



And the latest virtio spi transfer structure is defined as follows:

 struct spi_transfer_head {
u8 slave_id;
u8 bits_per_word;
u8 cs_change;
u8 tx_nbits;
u8 rx_nbits;
u8 reserved[3];
u32 mode;
u32 freq;
u32 word_delay_ns;
u32 delay_ns;
u32 cs_change_delay_ns;
u32 cs_setup_ns;
u32 cs_hold_ns;
u32 cs_inactive_ns;
 };

 @slave_id: chipselect index the SPI transfer used
 @bits_per_word: the number of bits in each SPI transfer word
 @cs_change: True to deselect device before starting the next transfer
 @tx_nbits: number of bits used for writing
 @rx_nbits: number of bits used for reading
 @reserved: reserved for future use
 @mode: the spi mode defines how data is clocked out and in
 @freq: transfer speed
 @word_delay_ns: delay to be inserted between consecutive words of 
a transfer, in ns unit
 @delay_ns: delay to be introduced after this transfer before 
(optionally) changing the chipselect status, in ns unit
 @cs_change_delay_ns: delay between cs deassert and assert when 
cs_change is set, in ns unit
 @cs_setup_ns: delay to be introduced by the controller after CS is 
asserted, in ns unit
 @cs_hold_ns: delay to be introduced by the controller before CS is 
deasserted, in ns unit
 @cs_inactive_ns: delay to be introduced by the controller after CS 
is deasserted, in ns unit
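

For illustration only, here is how a guest driver might populate the header
for a single transfer; every value below is made up for the example, and the
struct definition simply repeats the proposal above so the snippet is
self-contained.

#include <stdint.h>
#include <string.h>

typedef uint8_t  u8;
typedef uint32_t u32;

struct spi_transfer_head {
    u8 slave_id;
    u8 bits_per_word;
    u8 cs_change;
    u8 tx_nbits;
    u8 rx_nbits;
    u8 reserved[3];
    u32 mode;
    u32 freq;
    u32 word_delay_ns;
    u32 delay_ns;
    u32 cs_change_delay_ns;
    u32 cs_setup_ns;
    u32 cs_hold_ns;
    u32 cs_inactive_ns;
};

/* Hypothetical example: chipselect 0, 8-bit words, SPI mode 0, 10 MHz,
 * with explicit CS setup/hold/inactive delays. */
static void fill_example_transfer(struct spi_transfer_head *h)
{
    memset(h, 0, sizeof(*h));
    h->slave_id = 0;
    h->bits_per_word = 8;
    h->cs_change = 1;        /* deselect the device before the next transfer */
    h->tx_nbits = 1;         /* single-line write */
    h->rx_nbits = 1;         /* single-line read */
    h->mode = 0;             /* CPOL=0, CPHA=0 */
    h->freq = 10000000;      /* 10 MHz */
    h->cs_setup_ns = 100;    /* delay after CS is asserted */
    h->cs_hold_ns = 100;     /* delay before CS is deasserted */
    h->cs_inactive_ns = 200; /* delay after CS is deasserted */
}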



Could you please help check whether this new structure contains enough
information? I really appreciate your kind help.


Best Regards
Haixu Cui

On 7/1/2023 5:59 AM, Jonathon Reinhart wrote:

Hi @jrreinh...@google.com,
  Thank you very much for your helpful comments.

  I missed the delay and cs_change_delay parameters. I will add both
of them, although cs_change_delay cannot be set from userspace, only
from kernel space.


  For CS delays such as PRE_DELAY/POST_DELAY, it seems that they are
defined in structure spi_device:

  @cs_setup: delay to be introduced by the controller after CS is
asserted.

  @cs_hold: delay to be introduced by the controller before CS is
deasserted.

  @cs_inactive: delay to be introduced by the controller after CS is
deasserted. If @cs_change_delay is used from @spi_transfer, then the
two delays will be added up.

  They show the SPI controller's ability to control the CS
assertion/deassertion timing and should not be changed for each transfer
(because they can be updated by setting structure spi_transfer or
structure spi_ioc_transfer). I think it is better to define these parameters
in the host OS rather than in the guest OS, since it is the host OS that
operates the hardware SPI controller directly. Besides, it also avoids passing
the same values from guest to host time after time.

  What's your opinion on this topic? Again, thank you very much.


Hi Haixu,

I took another look at the Linux structures and attempted to map the
delay parameters to the various points in a SPI sequence. Please correct
me if I'm wrong about any of this.

Consider this diagram where CS# is an active-low chip select, and SCLK
is the serial clock:

   (Timing diagram: CS# is asserted low around a burst of SCLK cycles, with
   delay intervals labelled A through E marked between the edges.)