Re: breakage in blkfront with ring_pages > 1

2011-07-05 Thread Colin Percival
On 07/05/11 19:42, Colin Percival wrote:
> On 07/05/11 19:04, Justin T. Gibbs wrote:
>>  On 7/5/11 7:14 PM, Colin Percival wrote:
>>> Maybe the right option is to have a loader tunable dev.xn.linuxback to
>>> control which version of the protocol is used?
>>
>> What a mess.
> 
> Yep.  Mess or not, shall I go ahead with having a loader tunable control this,
> or can you think of a better solution?

Does anyone object to the attached patch?  It keeps the differing behaviour to
a minimum -- we MUST set ring-ref with a FreeBSD blkback, and we MUST NOT set
it with a linux blkback -- but otherwise errs in the direction of setting more
variables than are needed, to maximize the possibility of a future blkback
being compatible with both blkback_is_linux=0 and blkback_is_linux=1.

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid
Index: sys/dev/xen/blkfront/blkfront.c
===
--- sys/dev/xen/blkfront/blkfront.c	(revision 223815)
+++ sys/dev/xen/blkfront/blkfront.c	(working copy)
@@ -616,7 +616,16 @@
 	if (setup_blkring(sc) != 0)
 		return;
 
+	/* Different backends use different names for this variable. */
 	error = xs_printf(XST_NIL, node_path,
+			 "num-ring-pages","%u", sc->ring_pages);
+	if (error) {
+		xenbus_dev_fatal(sc->xb_dev, error,
+ "writing %s/num-ring-pages",
+ node_path);
+		return;
+	}
+	error = xs_printf(XST_NIL, node_path,
 			 "ring-pages","%u", sc->ring_pages);
 	if (error) {
 		xenbus_dev_fatal(sc->xb_dev, error,
@@ -673,6 +682,9 @@
 	xenbus_set_state(sc->xb_dev, XenbusStateInitialised);
 }
 
+static int blkback_is_linux = 0;
+TUNABLE_INT("dev.xbd.blkback_is_linux", &blkback_is_linux);
+
 static int 
 setup_blkring(struct xb_softc *sc)
 {
@@ -702,14 +714,16 @@
 			return (error);
 		}
 	}
-	error = xs_printf(XST_NIL, xenbus_get_node(sc->xb_dev),
-			  "ring-ref","%u", sc->ring_ref[0]);
-	if (error) {
-		xenbus_dev_fatal(sc->xb_dev, error, "writing %s/ring-ref",
- xenbus_get_node(sc->xb_dev));
-		return (error);
+	if (!blkback_is_linux || sc->ring_pages == 1) {
+		error = xs_printf(XST_NIL, xenbus_get_node(sc->xb_dev),
+  "ring-ref","%u", sc->ring_ref[0]);
+		if (error) {
+			xenbus_dev_fatal(sc->xb_dev, error, "writing %s/ring-ref",
+	 xenbus_get_node(sc->xb_dev));
+			return (error);
+		}
 	}
-	for (i = 1; i < sc->ring_pages; i++) {
+	for (i = 0; i < sc->ring_pages; i++) {
 		char ring_ref_name[]= "ring_refXX";
 
 		snprintf(ring_ref_name, sizeof(ring_ref_name), "ring-ref%u", i);
___
freebsd-xen@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-xen
To unsubscribe, send any mail to "freebsd-xen-unsubscr...@freebsd.org"


Re: breakage in blkfront with ring_pages > 1

2011-07-05 Thread Colin Percival
[oops, let's try sending this again with reply-all instead of reply...]

On 07/05/11 19:04, Justin T. Gibbs wrote:
>  On 7/5/11 7:14 PM, Colin Percival wrote:
>> On 06/10/11 13:30, Justin T. Gibbs wrote:
>>> On 6/9/11 9:26 PM, Colin Percival wrote:
 Has anyone seen anything like this? Is it possible that there's a bug
 in how our blkfront negotiates the request ring? Does anyone have
 ring_pages > 1 in use?
>>>
>>> The only backend driver I know of that can support more than one ring page
>>> is FreeBSD. So, the problem likely is that blkfront is getting the
>>> negotiation wrong and only 1 page is in use.
>>
>> Turns out that the linux backend in question really does support multiple
>> pages.
> 
> Can you provide a source or spec reference for this driver? 

Ah, here it is, dating from 2008:
http://xen.1045712.n5.nabble.com/attachment/2527534/0/big_ring.patch

Sorry, I had several windows open and thought I was looking at kernel.org
tree commits rather than a separate patch.

> There was a
> proposal (in 2009?) to add multiple page support to the XenSource
> provided blkfront/back, but it didn't get committed to the Xen repository.
> It also doesn't use the same XenStore variables that you list.  Is this
> something Amazon developed in house?

I don't know what hand if any Amazon had in writing this, but it looks like
they've convinced at least Ubuntu and RedHat to use this version.

>> There's also an inconsistency about how multiple rings are negotiated:
>> We set:
>> * ring-pages to the number of pages blkfront wants to use
>> * ring-ref to the reference for the first page
>> * ring-refXX to the references for other pages
>> while linux sets:
>> * num-ring-pages to the number of pages blkfront wants to use
>> * ring-refXX to the page references.
> 
> You forgot to mention that the FreeBSD driver allows request size to
> be negotiated and explicitly supports "chained" requests to break the
> 44k request size limit imposed by the original interface.

I was trying to highlight the compatibility issues, not list all the ways
that FreeBSD is superior to Linux. ;-)

>> It seems to be impossible to be compatible with both, since Linux
> interprets
>> having a value set for ring-ref to indicate that the single-ring
> protocol is
>> being used and doesn't check anything else.
> 
> It's not "Linux".  There is no multi-page support in either the kernel.org
> hosted drivers or those in the Linux tree hosted in the official Xen
> repository.

Agreed.  But it's multiple Linux distributions (and EC2).

>> Is there any official source which defines the blkback/front protocol?
> 
> Only the source code in the Xen repository.  Both the FreeBSD extensions
> that I did and those for the Linux system you are using are likely backwards
> compatible to the original interface, but unfortunately are not compatible
> with each other.

Yes, both versions are definitely backwards compatible.

>> Maybe
>> the right option is to have a loader tunable dev.xn.linuxback to
> control which
>> version of the protocol is used?
> 
> What a mess.

Yep.  Mess or not, shall I go ahead with having a loader tunable control this,
or can you think of a better solution?

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid
___
freebsd-xen@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-xen
To unsubscribe, send any mail to "freebsd-xen-unsubscr...@freebsd.org"


Re: breakage in blkfront with ring_pages > 1

2011-07-05 Thread Justin T. Gibbs

On 7/5/11 7:14 PM, Colin Percival wrote:

 On 06/10/11 13:30, Justin T. Gibbs wrote:
> On 6/9/11 9:26 PM, Colin Percival wrote:
>> Has anyone seen anything like this? Is it possible that there's a bug
>> in how our blkfront negotiates the request ring? Does anyone have
>> ring_pages > 1 in use?
>
> The only backend driver I know of that can support more than one ring 

page

> is FreeBSD. So, the problem likely is that blkfront is getting the
> negotiation wrong and only 1 page is in use.

 Turns out that the linux backend in question really does support multiple
 pages.


Can you provide a source or spec reference for this driver?  There was a
proposal (in 2009?) to add multiple page support to the XenSource
provided blkfront/back, but it didn't get committed to the Xen repository.
It also doesn't use the same XenStore variables that you list.  Is this
something Amazon developed in house?


 There's also an inconsistency about how multiple rings are negotiated:
 We set:
 * ring-pages to the number of pages blkfront wants to use
 * ring-ref to the reference for the first page
 * ring-refXX to the references for other pages
 while linux sets:
 * num-ring-pages to the number of pages blkfront wants to use
 * ring-refXX to the page references.


You forgot to mention that the FreeBSD driver allows request size to
be negotiated and explicitly supports "chained" requests to break the
44k request size limit imposed by the original interface.  The 2009
proposal did not allow larger requests which is why I didn't follow
it when I implemented the FreeBSD extension.  Allowing larger requests
both improves performance and removes the need to do costly split/coalesce
operations on both sides of the blkif interface.

 It seems to be impossible to be compatible with both, since Linux 

interprets
 having a value set for ring-ref to indicate that the single-ring 

protocol is

 being used and doesn't check anything else.


It's not "Linux".  There is no multi-page support in either the kernel.org
hosted drivers or those in the Linux tree hosted in the official Xen
repository.


 Is there any official source which defines the blkback/front protocol?


Only the source code in the Xen repository.  Both the FreeBSD extensions
that I did and those for the Linux system you are using are likely backwards
compatible to the original interface, but unfortunately are not compatible
with each other.


 Maybe
 the right option is to have a loader tunable dev.xn.linuxback to 

control which

 version of the protocol is used?


What a mess.

--
Justin

___
freebsd-xen@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-xen
To unsubscribe, send any mail to "freebsd-xen-unsubscr...@freebsd.org"


Re: breakage in blkfront with ring_pages > 1

2011-07-05 Thread Colin Percival
On 06/10/11 13:30, Justin T. Gibbs wrote:
>  On 6/9/11 9:26 PM, Colin Percival wrote:
>> Has anyone seen anything like this? Is it possible that there's a bug
>> in how our blkfront negotiates the request ring? Does anyone have
>> ring_pages > 1 in use?
> 
> The only backend driver I know of that can support more than one ring page
> is FreeBSD.  So, the problem likely is that blkfront is getting the
> negotiation wrong and only 1 page is in use.

Turns out that the linux backend in question really does support multiple pages.
There's also an inconsistency about how multiple rings are negotiated:
We set:
  * ring-pages to the number of pages blkfront wants to use
  * ring-ref to the reference for the first page
  * ring-refXX to the references for other pages
while linux sets:
  * num-ring-pages to the number of pages blkfront wants to use
  * ring-refXX to the page references.

It seems to be impossible to be compatible with both, since Linux interprets
having a value set for ring-ref to indicate that the single-ring protocol is
being used and doesn't check anything else.

Is there any official source which defines the blkback/front protocol?  Maybe
the right option is to have a loader tunable dev.xn.linuxback to control which
version of the protocol is used?

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid
___
freebsd-xen@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-xen
To unsubscribe, send any mail to "freebsd-xen-unsubscr...@freebsd.org"