On Wed, Feb 08, 2017 at 06:41:40PM +0100, Paolo Bonzini wrote:
>
>
> On 08/02/2017 04:20, Michael S. Tsirkin wrote:
> > * Scatter/gather support
> >
> > We can use 1 bit to chain s/g entries in a request, same as virtio 1.0:
> >
> > /* This marks a buffer as continuing via the next field. */
> > #define VRING_DESC_F_NEXT 1
> >
> > Unlike virtio 1.0, all descriptors must have distinct ID values.
> >
> > Also unlike virtio 1.0, use of this flag will be an optional feature
> > (e.g. VIRTIO_F_DESC_NEXT) so both devices and drivers can opt out of it.
>
> I would still prefer that we had _either_ single-direct or
> multiple-indirect descriptors, i.e. no VRING_DESC_F_NEXT. I can propose
> my idea for this in a separate message.
All it costs us spec-wise is a single bit :)
The cost of indirect is an extra cache miss.
We couldn't decide what's better for everyone in 1.0 days and I doubt
we'll be able to now, but yes, benchmarking is needed to make
sure it's required. Very easy to remove or not to use/support in
drivers/devices though.
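
For reference, a rough sketch of what following VRING_DESC_F_NEXT looks
like on the consumer side. It assumes the virtio 1.0 split-ring
descriptor layout (the layout for the new ring is not settled here),
ignores endianness conversion, and chain_total_len() is a made-up helper
name for illustration only:

```c
#include <stdint.h>

/* This marks a buffer as continuing via the next field. */
#define VRING_DESC_F_NEXT 1

/* virtio 1.0 split-ring descriptor, used here only as an assumption. */
struct vring_desc {
	uint64_t addr;   /* buffer address */
	uint32_t len;    /* buffer length in bytes */
	uint16_t flags;  /* VRING_DESC_F_* bits */
	uint16_t next;   /* next descriptor index, valid if F_NEXT is set */
};

/*
 * Hypothetical helper: add up the buffer lengths of one chained request
 * starting at 'head'. Returns 0 on a bad index or if the chain does not
 * terminate within 'table_size' entries (guards against loops).
 */
static uint64_t chain_total_len(const struct vring_desc *table,
				uint16_t table_size, uint16_t head)
{
	uint64_t total = 0;
	uint16_t i = head;
	uint16_t seen;

	for (seen = 0; seen < table_size; seen++) {
		if (i >= table_size)
			return 0;               /* index out of range */
		total += table[i].len;
		if (!(table[i].flags & VRING_DESC_F_NEXT))
			return total;           /* last entry of this request */
		i = table[i].next;              /* follow the chain */
	}
	return 0;                               /* chain never terminated */
}
```

Each hop through 'next' may touch a different cache line, which is the
cost to weigh against the extra miss for an indirect table.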
> > * Batching descriptors:
> >
> > virtio 1.0 allows passing a batch of descriptors in both directions, by
> > incrementing the used/avail index by values > 1. We can support this by
> > chaining a list of descriptors through a bit in the flags field.
> > To allow use together with s/g, a different bit will be used.
> >
> > #define VRING_DESC_F_BATCH_NEXT 0x0010
> >
> > Batching works for both driver and device descriptors.
>
> I'm still not sure how this would be useful.
So this is used at least by virtio-net mergeable buffers to combine
many buffers into a single packet.
Similarly, on transmit Linux sometimes supplies packets in batches
(XMIT_MORE flag). If the other side processes them, it seems nice to tell
it: there's more to come soon, so if you see this it is wise to poll now.
That's why I kind of felt it's better as a standard bit.
> It cannot be mandatory to
> set the bit, I think, because you don't know when the host/guest is
> going to read descriptors. So both host and guest always have to look
> ahead one element in any case.
Right, but the point is what to do if you find nothing there.
If you saw VRING_DESC_F_BATCH_NEXT, it's a hint that
you should poll: there's more to come soon.
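
Roughly, the consumer-side decision could look like the sketch below.
This only illustrates the hint semantics; the enum and the
on_ring_empty() helper are hypothetical names, not part of any spec or
existing driver:

```c
#include <stdint.h>

#define VRING_DESC_F_BATCH_NEXT 0x0010

/* What to do when the look-ahead finds no new descriptor (hypothetical). */
enum idle_action {
	IDLE_WAIT_FOR_NOTIFY,   /* re-enable notifications and stop */
	IDLE_POLL               /* keep polling, more is expected soon */
};

/*
 * 'last_flags' is the flags field of the last descriptor we processed.
 * The bit is only a hint: it is never required for correctness, it just
 * biases the choice between polling and waiting.
 */
static enum idle_action on_ring_empty(uint16_t last_flags)
{
	if (last_flags & VRING_DESC_F_BATCH_NEXT)
		return IDLE_POLL;
	return IDLE_WAIT_FOR_NOTIFY;
}
```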
> > * Non power-of-2 ring sizes
> >
> > As the ring simply wraps around, there's no reason to
> > require ring size to be power of two.
> > It can be made a separate feature though.
>
> Power of 2 ring sizes are required in order to ignore the high bits of
> the indices. With non-power-of-2 sizes you are forced to keep the
> indices less than the ring size.
Right. So
	if (unlikely(++idx >= size))
		idx = 0;
OTOH a ring that's up to twice as large as necessary
because of the power-of-two requirement wastes cache.
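
To make the comparison concrete, here is a standalone sketch of the two
index schemes. The function names are made up for illustration, and the
kernel's unlikely() annotation is left out so it builds anywhere:

```c
#include <stdint.h>

/*
 * Arbitrary ring size: the index must stay below 'size', so advancing
 * costs a compare and a (rarely taken) branch.
 */
static inline uint16_t ring_next(uint16_t idx, uint16_t size)
{
	if (++idx >= size)
		idx = 0;
	return idx;
}

/*
 * Power-of-2 ring size: the index can run freely (and overflow), the
 * slot is found by masking off the high bits. No branch needed.
 * 'size' must be a power of two.
 */
static inline uint16_t ring_slot_pow2(uint16_t free_running, uint16_t size)
{
	return free_running & (uint16_t)(size - 1);
}
```

With size 3, ring_next() walks 0, 1, 2, 0, ... and never yields an index
equal to the size.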
> Alternatively you can do this:
>
> > * Event index would be in the range 0 to 2 * Queue Size
> > (to detect wrap arounds) and wrap to 0 after that.
> >
> > The assumption is that each side maintains an internal
> > descriptor counter 0 to 2 * Queue Size that wraps to 0.
> > In that case, interrupt triggers when counter reaches
> > the given value.
>
> but it seems more complicated than just forcing power-of-2 and ignoring
> the high bits.
>
> Thanks,
>
> Paolo
Absolutely, power of 2 lets you save a branch.
At this stage I'm just recording all the ideas
and then as a next step we can micro-benchmark prototypes
and compare.
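
As a starting point for such a prototype, the wrap-detecting counter
quoted above could be sketched as follows. This assumes "0 to 2 * Queue
Size" means the half-open range [0, 2 * size), which the proposal does
not spell out; all names are hypothetical:

```c
#include <stdint.h>

/* Advance a counter that lives in [0, 2 * size) and wraps to 0. */
static inline uint32_t counter_next(uint32_t counter, uint32_t size)
{
	if (++counter >= 2 * size)
		counter = 0;
	return counter;
}

/* Ring slot the counter refers to, in [0, size). */
static inline uint32_t counter_slot(uint32_t counter, uint32_t size)
{
	return counter >= size ? counter - size : counter;
}

/* 1 on odd passes over the ring, 0 on even ones: tells wraps apart. */
static inline int counter_wrapped(uint32_t counter, uint32_t size)
{
	return counter >= size;
}
```

This works for any ring size but needs two compares where the
power-of-2 scheme needs one mask, so it is worth measuring.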
--
MST