[..]
>> On 02/16/2018 08:24 AM, Michael S. Tsirkin wrote:
>>> Performance analysis of this is in my kvm forum 2016 presentation. The
>>> idea is to have a r/w descriptor in a ring structure, replacing the used
>>> and available ring, index and descriptor buffer.
>>>
>>> This is also easier for devices to implement than the 1.0 layout.
>>> Several more enhancements will be necessary to actually make this
>>> efficient for devices to use.
>>>
>>> Signed-off-by: Michael S. Tsirkin <[email protected]>
>>> ---
>>> content.tex | 28 ++-
>>> packed-ring.tex | 646
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 671 insertions(+), 3 deletions(-)
>>> create mode 100644 packed-ring.tex
[..]
>>> +
>>> +\subsection{Element Address and Length}
>>> +\label{sec:Packed Virtqueues / Element Address and Length}
>>> +
>>> +In an available descriptor, Element Address corresponds to the
>>> +physical address of the buffer element. The length of the element assumed
>>> +to be physically contiguous is stored in Element Length.
>>> +
>>> +In a used descriptor, Element Address is unused. Element Length
>>> +specifies the length of the buffer that has been initialized
>>> +(written to) by the device.
>>> +
>>> +Element length is reserved for used descriptors without the
>>> +VIRTQ_DESC_F_WRITE flag, and is ignored by drivers.
>>> +
>>> +\subsection{Scatter-Gather Support}
>>
>> [Consistent wording] Both types of virtqueues support scatter-gather
>> but the term is used only for packed. Maybe we could unify the wording.
>> Or is this something for later?
>
> I'll take a look but this can be safely done later too.
>
>
Agreed.
[..]
>>> +
>>> +\subsection{Multi-buffer requests}
>>> +\label{sec:Packed Virtqueues / Multi-descriptor batches}
>>> +Some devices combine multiple buffers as part of processing of a
>>> +single request. These devices always mark the first descriptor
>>> +in the request used after the rest of the descriptors in the
>>> +request have been written out into the ring. This guarantees that
>>> +the driver will never observe a partial request in the ring.
>>> +
>>
>> I see you've changed s/in the request available/in the request used/.
>> But I still don't understand this paragraph. I will try to figure
>> it out later (and will come back to you if I fail).
>
> FYI this applies to mergeable buffers for the network device.
>
>
Yeah, that was my understanding too, but I will have to look into the
details starting from there. Will come back to you if I can't
clear it up for myself.
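
A minimal sketch of one reading of the quoted paragraph, in case it helps the discussion. It is not taken from the patch: the descriptor layout, the bit positions (7 and 15) and all `sketch_` names are assumptions for illustration, and "used" follows the rule stated further down in this thread (avail == used means used). The device writes out every descriptor of the request except the first, and only then marks the first one used with a release store, so a driver that polls the first descriptor never sees a partial request.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; the quoted patch only shows the names. */
#define VIRTQ_DESC_F_AVAIL 7
#define VIRTQ_DESC_F_USED  15

/* Hypothetical descriptor layout, for illustration only. */
struct sketch_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    _Atomic uint16_t flags;
};

/* Rule from this thread: avail == used means used. */
static bool sketch_is_used(uint16_t flags)
{
    bool avail = flags & (1u << VIRTQ_DESC_F_AVAIL);
    bool used = flags & (1u << VIRTQ_DESC_F_USED);
    return avail == used;
}

/* Return flags with the USED bit made equal to the AVAIL bit. */
static uint16_t sketch_to_used(uint16_t flags)
{
    uint16_t used_bit = (uint16_t)(1u << VIRTQ_DESC_F_USED);
    bool avail = flags & (1u << VIRTQ_DESC_F_AVAIL);
    return avail ? (uint16_t)(flags | used_bit)
                 : (uint16_t)(flags & ~used_bit);
}

/* Mark a request of `count` descriptors starting at `head` as used. */
static void sketch_mark_request_used(struct sketch_desc *ring, uint16_t size,
                                     uint16_t head, uint16_t count,
                                     const uint32_t *written)
{
    /* Write out descriptors 1..count-1 of the request first. */
    for (uint16_t i = 1; i < count; i++) {
        struct sketch_desc *d = &ring[(head + i) % size];
        uint16_t f = atomic_load_explicit(&d->flags, memory_order_relaxed);
        d->len = written[i];
        atomic_store_explicit(&d->flags, sketch_to_used(f),
                              memory_order_relaxed);
    }

    /* Mark the first descriptor last; the release store orders all
     * earlier writes before it. */
    struct sketch_desc *first = &ring[head];
    uint16_t f = atomic_load_explicit(&first->flags, memory_order_relaxed);
    first->len = written[0];
    atomic_store_explicit(&first->flags, sketch_to_used(f),
                          memory_order_release);
}
```

A driver following this sketch would read the first descriptor's flags with acquire semantics before looking at the remaining descriptors of the request.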
[..]
>>> +
>>> +\devicenormative{\subsection}{The Virtqueue Descriptor Table}{Basic
>>> Facilities of a Virtio Device / Packed Virtqueues / The Virtqueue
>>> Descriptor Table}
>>> +A device MUST NOT write to a device-readable buffer, and a device SHOULD
>>> NOT
>>> +read a device-writable buffer.
>>> +A device MUST NOT use a descriptor unless it observes
>>> +VIRTQ_DESC_F_AVAIL bit in its \field{flags} being changed.
>>
>> I don't really understand this. How does the device observe
>> the VIRTQ_DESC_F_AVAIL bit being changed?
>
> By reading the descriptor.
>
:) My point is: to observe a change one usually either needs at
least one reading before and at least one reading after the change,
or one needs to know that a certain reading means change. The latter
is possible if we know that at the beginning of the time frame under
consideration (t_0) only a certain set of values, let's say B like before,
is possible, and after the change only a certain other set of values,
let's say A like after, is possible, and A and B are disjoint
($A \cap B = \emptyset$).
I guess here the latter is supposed to be the case. But then I think
we need a more detailed description here. Please see also my other email
(response to Jens).
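
A minimal sketch of that second option, under assumptions that are not spelled out in the quoted text: the device keeps a one-bit wrap counter, the driver makes a descriptor available by setting AVAIL equal to its wrap counter and USED to the inverse, and the device marks it used by setting both bits equal to the wrap counter. The bit positions (7 and 15) and all `sketch_` names are hypothetical. Under these assumptions the "before" and "after" flag values are disjoint for a given wrap counter, so a single read is enough to detect the change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; the quoted patch only shows the names. */
#define VIRTQ_DESC_F_AVAIL 7
#define VIRTQ_DESC_F_USED  15

/* Flags a driver would write to make a descriptor available (assumption). */
static uint16_t sketch_avail_flags(bool wrap)
{
    return (uint16_t)(((unsigned)wrap << VIRTQ_DESC_F_AVAIL) |
                      ((unsigned)!wrap << VIRTQ_DESC_F_USED));
}

/* Flags a device would write to mark a descriptor used (assumption). */
static uint16_t sketch_used_flags(bool wrap)
{
    return (uint16_t)(((unsigned)wrap << VIRTQ_DESC_F_AVAIL) |
                      ((unsigned)wrap << VIRTQ_DESC_F_USED));
}

/* Device-side check: one read of flags decides whether the descriptor
 * was made available during the lap identified by `wrap`. */
static bool sketch_desc_is_avail(uint16_t flags, bool wrap)
{
    bool avail = flags & (1u << VIRTQ_DESC_F_AVAIL);
    bool used = flags & (1u << VIRTQ_DESC_F_USED);
    return avail == wrap && used != wrap;
}
```

With the ring zero-initialised and the wrap counter starting at 1, set B (before) is "both bits 0" and set A (after) is "AVAIL 1, USED 0". After a wrap, B becomes "both bits 1" and A becomes "AVAIL 0, USED 1".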
[..]
>>> +Suppression Structure Format}.
>>> +
>>> +\begin{note}
>>> +For optimal performance, a driver MAY disable interrupts while processing
>>> +the used buffers, but beware the problem of missing interrupts between
>>> +emptying the ring and reenabling interrupts. This is usually handled by
>>> +re-checking for more used buffers after interrupts are re-enabled:
>>> +\end{note}
>>> +
>>> +\begin{lstlisting}
>>> +vq->driver_event.flags = 0x2;
>>> +
>>> +for (;;) {
>>> + struct virtq_desc *d = vq->desc[vq->next_used];
>>> +
>>> + flags = d->flags;
>>> + bool avail = flags & (1 << VIRTQ_DESC_F_AVAIL);
>>> + bool used = flags & (1 << VIRTQ_DESC_F_USED);
>>> +
>>> + if (avail != used) {
>>
>> I don't understand the condition which is AFAIU supposed to
>> correspond to the descriptor *not* being used.
>
> So avail == used means used. avail != used means available.
>
Please see the follow up with Jens.
>>> + vq->driver_event.flags = 0x1;
>>> + memory_barrier();
>>> +
>>> + flags = d->flags;
>>> + bool avail = flags & (1 << VIRTQ_DESC_F_AVAIL);
>>> + bool used = flags & (1 << VIRTQ_DESC_F_USED);
>>> + if (avail != used) {
>>> + break;
>>> + }
>>> +
>>> + vq->driver_event.flags = 0x2;
>>> + }
>>> +
>>> + read_memory_barrier();
[..]
>> I'm pretty much confused about how this scheme with the available
>> and used wrap counters (or device and driver wrap counters) is
>> supposed to work. A working implementation in C would really help
>> me to understand this.
>
> DPDK based implementation has been posted.
>
Thank you very much for the hint. Slipped past me unfortunately.
Regards,
Halil
>>> + process_buffer(d);
>>> + vq->next_used++;
>>> + if (vq->next_used >= vq->size) {
>>> + vq->next_used = 0;
>>> + }
>>> +}
>>> +\end{lstlisting}
>>>