Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-19 Thread Avi Kivity

On 05/18/2010 10:22 PM, Stefan Hajnoczi wrote:

On Tue, May 18, 2010 at 6:18 PM, Avi Kivitya...@redhat.com  wrote:
   

The block multiwrite code pretends to be able to merge overlapping requests,
but doesn't do so in fact.  This leads to I/O errors (for example on mkfs
of a large virtio disk).
 

Are overlapping write requests correct guest behavior?  I thought the
ordering semantics require a flush between overlapping writes to
ensure A is written before B.

What cache= mode are you running?
   


writeback.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-19 Thread Stefan Hajnoczi
On Wed, May 19, 2010 at 9:09 AM, Avi Kivity a...@redhat.com wrote:
 On 05/18/2010 10:22 PM, Stefan Hajnoczi wrote:
 What cache= mode are you running?

 writeback.

In the cache=writeback case the virtio-blk guest driver does:

blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)

Stefan
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-19 Thread Avi Kivity

On 05/19/2010 12:01 PM, Stefan Hajnoczi wrote:

On Wed, May 19, 2010 at 9:09 AM, Avi Kivitya...@redhat.com  wrote:
   

On 05/18/2010 10:22 PM, Stefan Hajnoczi wrote:
 

What cache= mode are you running?
   

writeback.
 

In the cache=writeback case the virtio-blk guest driver does:

blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)
   


I don't follow.  What's the implication?

btw I really dislike how the cache attribute (which I see as a pure host 
choice) is exposed to the guest.  It means we can't change caching mode 
on the fly (for example after live migration), or that changing caching 
mode during a restart may expose a previously hidden guest bug.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-19 Thread Stefan Hajnoczi
On Wed, May 19, 2010 at 10:06 AM, Avi Kivity a...@redhat.com wrote:
 In the cache=writeback case the virtio-blk guest driver does:

 blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)


 I don't follow.  What's the implication?

I was wondering whether the queue is incorrectly set to a mode where
overlapping write requests aren't being ordered.  Anyway, Christoph
says overlapping write requests can be issued so my theory is broken.

 btw I really dislike how the cache attribute (which I see as a pure host
 choice) is exposed to the guest.  It means we can't change caching mode on
 the fly (for example after live migration), or that changing caching mode
 during a restart may expose a previously hidden guest bug.

Christoph has mentioned wanting to make the write cache switchable
from inside the guest.

Stefan
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-19 Thread Christoph Hellwig
On Wed, May 19, 2010 at 10:23:44AM +0100, Stefan Hajnoczi wrote:
 On Wed, May 19, 2010 at 10:06 AM, Avi Kivity a...@redhat.com wrote:
  In the cache=writeback case the virtio-blk guest driver does:
 
  blk_queue_ordered(q, QUEUE_ORDERED_DRAIN_FLUSH, ...)
 
 
  I don't follow. ?What's the implication?
 
 I was wondering whether the queue is incorrectly set to a mode where
 overlapping write requests aren't being ordered.  Anyway, Christoph
 says overlapping write requests can be issued so my theory is broken.

They can happen, but won't during usual pagecache based writeback.  So
this should not happen for the pagecache based mke2fs workload.  It
could happen for a workload with mkfs that uses O_DIRECT.  And we
need to handle it either way.

And btw, our barrier handling for devices using multiwrite (aka virtio)
is utterly busted right now, patch will follow ASAP.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-18 Thread Avi Kivity
The block multiwrite code pretends to be able to merge overlapping requests,
but doesn't do so in fact.  This leads to I/O errors (for example on mkfs
of a large virtio disk).

Signed-off-by: Avi Kivity a...@redhat.com
---
 block.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 48305b7..0e44e26 100644
--- a/block.c
+++ b/block.c
@@ -1956,8 +1956,8 @@ static int multiwrite_merge(BlockDriverState *bs, 
BlockRequest *reqs,
 int64_t oldreq_last = reqs[outidx].sector + reqs[outidx].nb_sectors;
 
 // This handles the cases that are valid for all block drivers, namely
-// exactly sequential writes and overlapping writes.
-if (reqs[i].sector = oldreq_last) {
+// exactly sequential writes
+if (reqs[i].sector == oldreq_last) {
 merge = 1;
 }
 
-- 
1.6.6.1

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-18 Thread Stefan Hajnoczi
On Tue, May 18, 2010 at 6:18 PM, Avi Kivity a...@redhat.com wrote:
 The block multiwrite code pretends to be able to merge overlapping requests,
 but doesn't do so in fact.  This leads to I/O errors (for example on mkfs
 of a large virtio disk).

Are overlapping write requests correct guest behavior?  I thought the
ordering semantics require a flush between overlapping writes to
ensure A is written before B.

What cache= mode are you running?

Stefan
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-18 Thread Stefan Hajnoczi
I just caught up on mails and saw you had already mentioned that
overlapping writes from the guest look fishy in the the 1Tb block
issue.  Cache mode might still be interesting because it affects how
guest virtio-blk chooses queue ordering mode.

Stefan
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH +stable] block: don't attempt to merge overlapping requests

2010-05-18 Thread Michael Tokarev

18.05.2010 23:37, Stefan Hajnoczi wrote:

I just caught up on mails and saw you had already mentioned that
overlapping writes from the guest look fishy in the the1Tb block
issue.  Cache mode might still be interesting because it affects how
guest virtio-blk chooses queue ordering mode.


What I can say for sure is that the issue mentioned (1Tb block)
occurs regardless of the cache mode.  I tried all 3, with exactly
the same results (well, not entirely exactly, since the prob
depends on timing too, and timing is different depending on the
cache mode), and performed all further tests with cache=writeback
since it's the mode which lets the guest to finish mkfs'ing the
1.5Tb disk in a reasonable time (or else it takes hours).

But actually I don't quite see where that dependency is: guest
does not know how its data is cached on host...

/mjt
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html