[Qemu-devel] Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx

2010-06-06 Thread Michael S. Tsirkin
On Sat, Jun 05, 2010 at 01:40:26PM +0930, Rusty Russell wrote:
 On Fri, 4 Jun 2010 09:12:05 pm Michael S. Tsirkin wrote:
  On Fri, Jun 04, 2010 at 08:46:49PM +0930, Rusty Russell wrote:
   I'm uncomfortable with moving a field.
   
   We haven't done that before and I wonder what will break with old code.
  
  With e.g. my patch, we only do this conditionally when the bit is negotiated.
 
 Of course, but see this change:
 
 commit ef688e151c00e5d529703be9a04fd506df8bc54e
 Author: Rusty Russell ru...@rustcorp.com.au
 Date:   Fri Jun 12 22:16:35 2009 -0600
 
 virtio: meet virtio spec by finalizing features before using device
 
 Virtio devices are supposed to negotiate features before they start using
 the device, but the current code doesn't do this.  This is because the
 driver's probe() function invariably has to add buffers to a virtqueue,
 or probe the disk (virtio_blk).
 
 This currently doesn't matter since no existing backend is strict about
 the feature negotiation.  But it's possible to imagine a future feature
 which completely changes how a device operates: in this case, we'd need
 to acknowledge it before using the device.
 
 Signed-off-by: Rusty Russell ru...@rustcorp.com.au
 
 Now, this isn't impossible to overcome: we know that if they use the ring
 before completing feature negotiation then they don't understand the new
 format.
 
 But we have to be aware of that on the qemu side.  Are we?

I think we are ok. virtqueue_init, which sets the avail/used pointers, is
called when we write the base address.  So we only need to be careful
not to change this feature bit after creating the rings.
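
A minimal sketch of that rule, not QEMU's actual code. Every name below
(demo_dev, DEMO_F_NEW_LAYOUT, demo_write_base, demo_set_features) is
hypothetical. It assumes the layout is picked once, when the base address is
written, and that a later attempt to flip the layout bit is simply refused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit selecting the new ring layout. */
#define DEMO_F_NEW_LAYOUT (1u << 0)

/* Hypothetical device state; not the real QEMU structures. */
struct demo_dev {
    uint32_t guest_features;   /* features acknowledged so far */
    uint64_t ring_base;        /* valid once rings_created is set */
    bool rings_created;        /* set when the base address is written */
    bool ring_new_layout;      /* layout chosen when the ring was created */
};

/* Writing the base address creates the ring and freezes its layout. */
static void demo_write_base(struct demo_dev *dev, uint64_t base)
{
    assert(dev != NULL);
    dev->ring_base = base;
    dev->ring_new_layout = (dev->guest_features & DEMO_F_NEW_LAYOUT) != 0;
    dev->rings_created = true;
}

/* Returns false if the write would change the layout bit of a live ring. */
static bool demo_set_features(struct demo_dev *dev, uint32_t features)
{
    bool wants_new = (features & DEMO_F_NEW_LAYOUT) != 0;

    assert(dev != NULL);
    if (dev->rings_created && wants_new != dev->ring_new_layout)
        return false;
    dev->guest_features = features;
    return true;
}
```

A driver that writes the base address before acknowledging any features ends
up with the old layout frozen in, which matches the "they don't understand
the new format" case.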


   Should we instead just abandon the flags field and use last_used only?
   Or, more radically, put flags == last_used when the feature is on?
   
   Thoughts?
   Rusty.
  
  Hmm, e.g. with TX and virtio net, we almost never want interrupts,
  whatever the index value.
 
 Good point.  OK, I give in, I'll take your patch which moves the fields
 to the end.  Is that your preference?

Yes, I think so.
You mean PATCHv3 unchanged with 254 byte padding?

 Please be careful with the qemu side though...
 
 It's not inconceivable that I'll write that virtio cacheline simulator this
 (coming) week, too...
 
 Thanks.
 Rusty.





[Qemu-devel] Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx

2010-06-04 Thread Michael S. Tsirkin
On Fri, Jun 04, 2010 at 12:04:57PM +0930, Rusty Russell wrote:
 On Wed, 2 Jun 2010 12:17:12 am Michael S. Tsirkin wrote:
  This adds an (unused) option to put available ring before control (avail
  index, flags), and adds padding between index and flags. This avoids
  cache line sharing between control and ring, and also makes it possible
  to extend avail control without incurring extra cache misses.
  
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
 
 No no no no.  254?  You're trying to Morton me![1]

Hmm, I wonder what we will do if we want a 3rd field on
a separate cacheline. But ok.

 How's this (untested):

I think we want to put flags there as well:
they are used on the interrupt path, together with the last used index.

 diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
 --- a/include/linux/virtio_ring.h
 +++ b/include/linux/virtio_ring.h
 @@ -74,8 +74,8 @@ struct vring {
  /* The standard layout for the ring is a continuous chunk of memory which looks
   * like this.  We assume num is a power of 2.
   *
 - * struct vring
 - * {
 + * struct vring {
 + *   *** The driver writes to this part.
   *   // The actual descriptors (16 bytes each)
   *   struct vring_desc desc[num];
   *
 @@ -84,9 +84,11 @@ struct vring {
   *   __u16 avail_idx;
   *   __u16 available[num];
   *
 - *   // Padding to the next align boundary.
 + *   // Padding so used_flags is on the next align boundary.
   *   char pad[];
 + *   __u16 last_used; // On a cacheline of its own.
   *
 + *   *** The device writes to this part.
   *   // A ring of used descriptor heads with free-running index.
   *   __u16 used_flags;
   *   __u16 used_idx;
 @@ -110,6 +112,12 @@ static inline unsigned vring_size(unsign
   + sizeof(__u16) * 2 + sizeof(struct vring_used_elem) * num;
  }
  
 +/* Last used index sits at the very end of the driver part of the struct */
 +static inline __u16 *vring_last_used_idx(const struct vring *vr)
 +{
 + return (__u16 *)vr->used - 1;
 +}
 +
  #ifdef __KERNEL__
  #include <linux/irqreturn.h>
  struct virtio_device;
 
 Cheers,
 Rusty.
 [1] Andrew Morton has this technique where he posts a solution so ugly it
 forces others to fix it properly.  Ego-roping, basically.



[Qemu-devel] Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx

2010-06-04 Thread Rusty Russell
On Fri, 4 Jun 2010 08:05:43 pm Michael S. Tsirkin wrote:
 On Fri, Jun 04, 2010 at 12:04:57PM +0930, Rusty Russell wrote:
  On Wed, 2 Jun 2010 12:17:12 am Michael S. Tsirkin wrote:
   This adds an (unused) option to put available ring before control (avail
   index, flags), and adds padding between index and flags. This avoids
   cache line sharing between control and ring, and also makes it possible
   to extend avail control without incurring extra cache misses.
   
   Signed-off-by: Michael S. Tsirkin m...@redhat.com
  
  No no no no.  254?  You're trying to Morton me![1]
 
 Hmm, I wonder what we will do if we want a 3rd field on
 a separate cacheline. But ok.
 
  How's this (untested):
 
 I think we want to put flags there as well:
 they are used on the interrupt path, together with the last used index.

I'm uncomfortable with moving a field.

We haven't done that before and I wonder what will break with old code.

Should we instead just abandon the flags field and use last_used only?
Or, more radically, put flags == last_used when the feature is on?

Thoughts?
Rusty.



[Qemu-devel] Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx

2010-06-04 Thread Michael S. Tsirkin
On Fri, Jun 04, 2010 at 08:46:49PM +0930, Rusty Russell wrote:
 On Fri, 4 Jun 2010 08:05:43 pm Michael S. Tsirkin wrote:
  On Fri, Jun 04, 2010 at 12:04:57PM +0930, Rusty Russell wrote:
   On Wed, 2 Jun 2010 12:17:12 am Michael S. Tsirkin wrote:
    This adds an (unused) option to put available ring before control (avail
    index, flags), and adds padding between index and flags. This avoids
    cache line sharing between control and ring, and also makes it possible
    to extend avail control without incurring extra cache misses.
    
    Signed-off-by: Michael S. Tsirkin m...@redhat.com
   
   No no no no.  254?  You're trying to Morton me![1]
  
  Hmm, I wonder what we will do if we want a 3rd field on
  a separate cacheline. But ok.
  
   How's this (untested):
  
  I think we want to put flags there as well:
  they are used on the interrupt path, together with the last used index.
 
 I'm uncomfortable with moving a field.
 
 We haven't done that before and I wonder what will break with old code.

With e.g. my patch, we only do this conditionally when the bit is negotiated.

 Should we instead just abandon the flags field and use last_used only?
 Or, more radically, put flags == last_used when the feature is on?
 
 Thoughts?
 Rusty.

Hmm, e.g. with TX and virtio net, we almost never want interrupts,
whatever the index value.
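
To make that concrete, here is a rough sketch of a device-side check. It is
not taken from any patch in this thread: demo_should_interrupt,
DEMO_AVAIL_F_NO_INTERRUPT and the index-based rule are all assumptions made
for illustration. The point is only that an index comparison on its own
cannot say "never interrupt me".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's "do not interrupt me" flag. */
#define DEMO_AVAIL_F_NO_INTERRUPT 1u

/*
 * Decide whether the device should interrupt after adding used buffers.
 * old_used_idx is used_idx as it was before the new entries were added.
 */
static bool demo_should_interrupt(uint16_t avail_flags, uint16_t last_used,
                                  uint16_t old_used_idx)
{
    /* TX-style driver: never interrupt, whatever the index value. */
    if (avail_flags & DEMO_AVAIL_F_NO_INTERRUPT)
        return false;

    /*
     * Assumed index-based rule: the driver had consumed everything
     * (last_used == old_used_idx), so it may be idle and needs waking.
     * If it is still behind, it will find the new entries on its own.
     */
    return last_used == old_used_idx;
}

/* Usage: the three cases described above. */
static void demo_check_cases(void)
{
    assert(!demo_should_interrupt(DEMO_AVAIL_F_NO_INTERRUPT, 5, 5));
    assert(demo_should_interrupt(0, 5, 5));
    assert(!demo_should_interrupt(0, 3, 5));
}
```

With the flag set the answer is "no" even when the indexes match, which is
the TX case; dropping the flag and keeping only last_used would lose it.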

-- 
MST



[Qemu-devel] Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx

2010-06-04 Thread Rusty Russell
On Fri, 4 Jun 2010 09:12:05 pm Michael S. Tsirkin wrote:
 On Fri, Jun 04, 2010 at 08:46:49PM +0930, Rusty Russell wrote:
  I'm uncomfortable with moving a field.
  
  We haven't done that before and I wonder what will break with old code.
 
 With e.g. my patch, we only do this conditionally when the bit is negotiated.

Of course, but see this change:

commit ef688e151c00e5d529703be9a04fd506df8bc54e
Author: Rusty Russell ru...@rustcorp.com.au
Date:   Fri Jun 12 22:16:35 2009 -0600

virtio: meet virtio spec by finalizing features before using device

Virtio devices are supposed to negotiate features before they start using
the device, but the current code doesn't do this.  This is because the
driver's probe() function invariably has to add buffers to a virtqueue,
or probe the disk (virtio_blk).

This currently doesn't matter since no existing backend is strict about
the feature negotiation.  But it's possible to imagine a future feature
which completely changes how a device operates: in this case, we'd need
to acknowledge it before using the device.

Signed-off-by: Rusty Russell ru...@rustcorp.com.au

Now, this isn't impossible to overcome: we know that if they use the ring
before completing feature negotiation then they don't understand the new
format.

But we have to be aware of that on the qemu side.  Are we?

  Should we instead just abandon the flags field and use last_used only?
  Or, more radically, put flags == last_used when the feature is on?
  
  Thoughts?
  Rusty.
 
 Hmm, e.g. with TX and virtio net, we almost never want interrupts,
 whatever the index value.

Good point.  OK, I give in, I'll take your patch which moves the fields
to the end.  Is that your preference?

Please be careful with the qemu side though...

It's not inconceivable that I'll write that virtio cacheline simulator this
(coming) week, too...

Thanks.
Rusty.



[Qemu-devel] Re: [PATCHv3 1/2] virtio: support layout with avail ring before idx

2010-06-03 Thread Rusty Russell
On Wed, 2 Jun 2010 12:17:12 am Michael S. Tsirkin wrote:
 This adds an (unused) option to put available ring before control (avail
 index, flags), and adds padding between index and flags. This avoids
 cache line sharing between control and ring, and also makes it possible
 to extend avail control without incurring extra cache misses.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com

No no no no.  254?  You're trying to Morton me![1]

How's this (untested):

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -74,8 +74,8 @@ struct vring {
 /* The standard layout for the ring is a continuous chunk of memory which looks
  * like this.  We assume num is a power of 2.
  *
- * struct vring
- * {
+ * struct vring {
+ * *** The driver writes to this part.
  * // The actual descriptors (16 bytes each)
  * struct vring_desc desc[num];
  *
@@ -84,9 +84,11 @@ struct vring {
  * __u16 avail_idx;
  * __u16 available[num];
  *
- * // Padding to the next align boundary.
+ * // Padding so used_flags is on the next align boundary.
  * char pad[];
+ * __u16 last_used; // On a cacheline of its own.
  *
+ * *** The device writes to this part.
  * // A ring of used descriptor heads with free-running index.
  * __u16 used_flags;
  * __u16 used_idx;
@@ -110,6 +112,12 @@ static inline unsigned vring_size(unsign
        + sizeof(__u16) * 2 + sizeof(struct vring_used_elem) * num;
 }
 
+/* Last used index sits at the very end of the driver part of the struct */
+static inline __u16 *vring_last_used_idx(const struct vring *vr)
+{
+   return (__u16 *)vr->used - 1;
+}
+
 #ifdef __KERNEL__
 #include <linux/irqreturn.h>
 struct virtio_device;

Cheers,
Rusty.
[1] Andrew Morton has this technique where he posts a solution so ugly it
forces others to fix it properly.  Ego-roping, basically.
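
For readers following the layout comment in the diff, a small sketch of where
last_used would land. This is an illustration under stated assumptions, not
code from the patch: the demo_* names are hypothetical, avail_flags is assumed
to precede avail_idx as in the standard layout, and the padding is assumed to
leave at least two bytes free before the used part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DESC_SIZE 16u   /* "The actual descriptors (16 bytes each)" */

/* Round x up to a multiple of align; align must be a power of 2. */
static size_t demo_align_up(size_t x, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/* End of desc[num], avail_flags, avail_idx and available[num]. */
static size_t demo_driver_end(unsigned num)
{
    return (size_t)DEMO_DESC_SIZE * num + sizeof(uint16_t) * (2 + (size_t)num);
}

/* Offset of used_flags: the next align boundary after the driver part. */
static size_t demo_used_offset(unsigned num, size_t align)
{
    return demo_align_up(demo_driver_end(num), align);
}

/* Offset of last_used: the final two bytes of the padding. */
static size_t demo_last_used_offset(unsigned num, size_t align)
{
    size_t used = demo_used_offset(num, align);

    /* Assumption: the padding must leave room for one __u16. */
    assert(used - demo_driver_end(num) >= sizeof(uint16_t));
    return used - sizeof(uint16_t);
}
```

With num = 256 and align = 4096, the driver part ends at byte 4612, the used
part starts at 8192, and last_used sits at 8190, which is the same slot that
(__u16 *)vr->used - 1 points at.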