On 9/7/2023 2:34 AM, Zhu, Lingshan wrote:
On 9/7/2023 4:09 PM, Eugenio Perez Martin wrote:
On Tue, Sep 5, 2023 at 11:08 AM Zhu, Lingshan
<lingshan....@intel.com> wrote:
On 8/21/2023 5:26 PM, Eugenio Perez Martin wrote:
On Fri, Aug 18, 2023 at 11:44 AM Zhu, Lingshan
<lingshan....@intel.com> wrote:
On 8/17/2023 11:19 PM, Eugenio Perez Martin wrote:
On Tue, Aug 15, 2023 at 1:30 PM Zhu, Lingshan
<lingshan....@intel.com> wrote:
On 8/15/2023 8:34 AM, Jason Wang wrote:
On Mon, Aug 14, 2023 at 7:29 PM Zhu Lingshan
<lingshan....@intel.com> wrote:
This commit specifies the constraints of the virtqueue state,
and the actions should be taken by the device when SUSPEND
and DRIVER_OK is set
Signed-off-by: Jason Wang <jasow...@redhat.com>
Signed-off-by: Zhu Lingshan <lingshan....@intel.com>
---
content.tex | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/content.tex b/content.tex
index 43bd5de..f6ac581 100644
--- a/content.tex
+++ b/content.tex
@@ -587,6 +587,37 @@ \subsection{\field{Used State} Field}
See also \ref{sec:Packed Virtqueues / Driver and Device
Ring Wrap Counters}.
+\drivernormative{\subsection}{Virtqueue State}{Basic
Facilities of a Virtio Device / Virtqueue State}
+
+If VIRTIO_F_QUEUE_STATE has been negotiated, the driver MUST
set SUSPEND in \field{device status}
+first before getting or setting Virtqueue State of any
virtqueues.
I don't get why this is a must. It could be useful for debugging.
To avoid race conditions with the device and make the device
implementation easier
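A minimal sketch of the ordering rule discussed here, assuming a mock device rather than any real transport. The struct, the function names and the SUSPEND bit value are hypothetical and only illustrate "set SUSPEND first, then access Virtqueue State"; DRIVER_OK = 4 is the only value taken from the virtio spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_DRIVER_OK            0x04u /* value from the virtio spec */
#define STATUS_SUSPEND_HYPOTHETICAL 0x10u /* placeholder, NOT taken from the patch */
#define NUM_VQS 2u

/* Hypothetical mock of a device exposing per-virtqueue Available State. */
struct mock_device {
    uint8_t  status;
    bool     queue_state_negotiated; /* stands in for VIRTIO_F_QUEUE_STATE */
    uint16_t avail_state[NUM_VQS];
};

/* Device side: ignore the access unless the feature is negotiated and
 * SUSPEND is set. Returns true only if the write was applied. */
static bool dev_write_avail_state(struct mock_device *dev, size_t vq,
                                  uint16_t value)
{
    if (vq >= NUM_VQS || !dev->queue_state_negotiated)
        return false;
    if (!(dev->status & STATUS_SUSPEND_HYPOTHETICAL))
        return false; /* device is not suspended: access is ignored */
    dev->avail_state[vq] = value;
    return true;
}

/* Driver side: set SUSPEND before touching Virtqueue State. A real driver
 * would re-read the status until the device presents SUSPEND; the mock
 * reflects the bit immediately. */
static bool drv_set_avail_state(struct mock_device *dev, size_t vq,
                                uint16_t value)
{
    dev->status |= STATUS_SUSPEND_HYPOTHETICAL;
    return dev_write_avail_state(dev, vq, value);
}
```

With this model a write issued while the device is still running is dropped, so the driver never races with a device that is still consuming descriptors.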
+
+If VIRTIO_F_QUEUE_STATE has been negotiaged but
VIRTIO_RING_F_PACKED not been negotiated,
typo
yes
+the driver MUST NOT access \field{Used State} of any
virtqueues, it should use the
+used index in the used ring.
+
+\devicenormative{\subsection}{Virtqueue State}{Basic
Facilities of a Virtio Device / Virtqueue State}
+
+If VIRTIO_F_QUEUE_STATE has been negotiated but SUSPEND is
not set in \field{device status},
+the device MUST ignore any accesses against Virtqueue State
of any virtqueues.
Btw, do we need to clarify the behavior of ring reset after
suspending?
I think once suspended, the device should ignore resetting a queue
Actually shadow virtqueue could benefit from the ability to
change vq
properties (addresses) while the device is suspended, and then just
resume it. I've been told that ring reset is overkill for that.
If ring reset is overkill, is SUSPEND even more overkill?
It depends on the cost of recreating the vq in the device, I think. But
it has more to do with *what* is changed in the vq, as it seems some
parameters (vq size) have more impact than others, like the vq address. The
way the device is stopped does not matter, but ring reset already offers the
possibility of changing all of the parameters.
Adding Si-Wei and Dragos here, as they pointed it out in the
virtio-networking upstream meeting.
But probably it is better to address it on top, with another
feature flag.
I think if we want to change the vq properties, there must be a
mechanism to
stop the queue and then resume it.
How about allowing queue_enable = 0 to stop it and = 1 to
resume it and
force it to reinitialize?
Yes, I think that is better suited. But maybe this is better to be
added on top, so we maintain this series small.
Hi Eugenio,
I had second thoughts while implementing the queue_enable = 0 idea above;
it doesn't provide any advantage over queue_reset:
1) queue_reset can help to stop a queue, and the vq properties can be
reconfigured during queue_reset --> queue_enable.
2) once the driver sees SUSPEND presented by the device, it assumes the
device states and vq states are stable; at that point the driver can
read reliable device configurations. So vq reset should be ignored
once SUSPEND is present, and if we implement queue stop, it should be
ignored too under SUSPEND.
The relation between SUSPEND and ring_reset needs to be described in
this series, yes. This is a good start, but I'm not sure if this one
meets all the requirements for SW assisted live migration.
We can always add new feature flags to define a different interaction
in the future, like for devices that can support the change of vq
attributes in the suspend. To not steal the merit, this idea was
proposed by Si-Wei in a recent virtio-networking meeting.
If so, we don't even need a new feature bit. We can just allow
resetting vqs after the device presents SUSPEND.
For the single bit of feature interaction with queue_reset this looks
fine, but queue_reset is perhaps not the only feature that needs to
interact with SUSPEND. While on the other hand I suspect it's probably
not easy to converge on everything all at once for the moment. Just to
avoid the lure of hijacking this thread for other things, it'd be easier
I feel to define a pristine SUSPEND method starting with the most
restrictive mandates, describing every possible means of prohibiting
*any* change to the config space for a device in suspension. This not only
keeps (backward) compatibility on the table, which is consistent with
the assumption of the various SUSPEND implementations available today, but
would also make it possible to customize different flavors of interaction
guarded by different feature flags in the future. For instance, today
queue_reset may mostly work the best on software device implementation
where one can introduce a specific SUSPEND_RING_RESET_ALLOWED feature
flag to unlock/override part of the restriction from the pristine
SUSPEND feature when both are negotiated and used together. In future,
if there's any need to revisit this part, e.g. because a hardware device
implementation of queue_reset cannot meet a desired
performance (downtime) goal, then a new feature might have to be
introduced to define another hardware-biased means of interaction with
a suspended device.
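The layering proposed above (a restrictive base SUSPEND, with a separate feature flag unlocking queue_reset) could be sketched roughly as follows. This is only an illustration: the bit positions are invented, and SUSPEND_RING_RESET_ALLOWED is the example name from this mail, not a defined feature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, chosen only for this sketch. */
#define F_SUSPEND                    (1u << 0)
#define F_SUSPEND_RING_RESET_ALLOWED (1u << 1)

/* Policy check: with the pristine SUSPEND alone, a suspended device
 * refuses queue_reset; the extra flag unlocks it when both features
 * are negotiated. A running device always accepts the reset. */
static bool queue_reset_permitted(uint32_t negotiated, bool suspended)
{
    if (!suspended)
        return true;
    return (negotiated & F_SUSPEND) &&
           (negotiated & F_SUSPEND_RING_RESET_ALLOWED);
}
```

A later, hardware-oriented feature could add another branch to the same check without changing the behaviour of devices that only negotiate the base SUSPEND.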
The device presenting SUSPEND indicates that the device config space
is stabilized at that moment, ready for the driver to fetch fields
data there.
Then the driver is allowed to reset, re-config and re-enable the vqs.
Maybe not for this case, but for completeness I find a very relevant
question is: as your patch defines SUSPEND in the context of live
migration, how do you envision resuming/restarting the device immediately
in place on the source host (say migration is cancelled after all
devices are suspended, or migration failed at the last minute for some
reason)? Reset the device and start to recover everything from scratch?
Or do queue_reset then queue_enable on every virtqueue while keeping the
other device states (those already populated through ctrl vq) around? Or
suppose right now we have a symmetric RESUME feature that keeps every
device state including the queue state in place. Which option would a
hardware vendor pick if the user/customer would like to have the
least downtime? Does the hardware's choice matter much for software
device implementations?
As can be seen amongst these options, there's perhaps no single best
solution between software and hardware devices, or even between
different hardware vendors. So instead of ruling out possibility for
future extension to flavor other implementations, be it hardware or
software, I feel it's probably not the best thing for now to get SUSPEND
hard wired to queue_reset or RESUME. Device reset is the base case that
every device has to implement, that I feel might be the only failsafe
method to get the device out of the suspension state with pristine SUSPEND.
The only requirement is: the driver is responsible for maintaining
the integrity and validity of the config space fields, because
the device is read-only to the config space at that moment (SUSPEND-ed),
and the driver should be responsible for its actions and perform proper
synchronization, e.g., re-read.
It looks fine, though as stated above, please leave it to a different
feature flag with another patch to define the queue_reset interaction
with SUSPEND.
Thanks,
-Siwei
Does this work for you?
Thanks
3) the device should only accept resetting a queue when !SUSPEND, and
the driver can flush the queue buffers before resetting it to avoid
losing buffers,
and we will have a tracker for in-flight descriptors later.
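Points 1) to 3) could be modelled roughly as below. This is a sketch against a mock device, not real driver code: the struct layout, the function names and the SUSPEND bit value are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STATUS_SUSPEND_HYPOTHETICAL 0x10u /* placeholder, NOT taken from the patch */

/* Hypothetical mock of one virtqueue's configurable properties. */
struct mock_vq {
    bool     enabled;
    uint64_t desc_addr;
};

struct mock_device {
    uint8_t        status;
    struct mock_vq vq;
};

/* Device side: queue_reset is only accepted when !SUSPEND (point 3).
 * Under SUSPEND the request is ignored and the vq is left untouched. */
static bool dev_queue_reset(struct mock_device *dev)
{
    if (dev->status & STATUS_SUSPEND_HYPOTHETICAL)
        return false;
    dev->vq.enabled = false;
    dev->vq.desc_addr = 0;
    return true;
}

/* Driver side: queue_reset --> reconfigure --> queue_enable (point 1).
 * A real driver would flush in-flight buffers before the reset. */
static bool drv_move_ring(struct mock_device *dev, uint64_t new_desc_addr)
{
    if (!dev_queue_reset(dev))
        return false;
    dev->vq.desc_addr = new_desc_addr;
    dev->vq.enabled = true;
    return true;
}
```

Because the suspended device leaves the vq untouched, whatever the driver read after SUSPEND was presented stays valid, which is the stability assumption in point 2).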
Any thoughts?
Thanks
Thanks!
Thanks
Zhu Lingshan
+
+When VIRTIO_F_QUEUE_STATE has been negotiated but
VIRTIO_RING_F_PACKED is not,
+the device MUST ignore any accesses against \field{Used State}.
+
+If VIRTIO_F_QUEUE_STATE has been negotiaged, the device MUST
reset
+the Virtqueue State of every virtqueue upon a reset.
Need to define the meaning of "reset" this is important for
packed virtqueue.
I will remove this as Stefan suggested.
+
+If VIRTIO_F_QUEUE_STATE and VIRTIO_RING_F_PACKED have been
negotiaged, when SUSPEND is set,
+the device MUST record the Virtqueue State of every enabled
virtqueue
+in \field{Available State} and \field{Used State} respectively,
+and correspondingly restore the Virtqueue State of every
enabled virtqueue
+from \field{Avaiable State} and \field{Used State} when
DRIVER_OK is set.
We can just let the device report those states in any case then we
don't need to care about those details, or did you see any
blockers?
Agreed, I will add the definition of used_state for split vqs in
the
next version
Thanks
Thanks
+
+If VIRTIO_F_QUEUE_STATE has been negotiated but
VIRTIO_RING_F_PACKED has been not, when SUSPEND is set,
+the device MUST record the available state of every enabled
virtqueue in \field{Available State},
+and restore the available state of every enabled virtqueue
from \field{Avaiable State}
+when DRIVER_OK is set.
+
\input{admin.tex}
\chapter{General Initialization And Device
Operation}\label{sec:General Initialization And Device Operation}
--
2.35.3