Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-12-05 Thread Matthias Brugger
2013/11/11 Stefan Hajnoczi stefa...@redhat.com:
 On Mon, Nov 11, 2013 at 11:00:45AM +0100, Matthias Brugger wrote:
 2013/11/5 Stefan Hajnoczi stefa...@gmail.com:
  I'd also like to see the thread pool implementation you wish to add
  before we add a layer of indirection which has no users yet.

 Fair enough, I will evaluate whether it makes more sense to implement a
 new AIO infrastructure instead of trying to reuse the thread pool.
 My implementation will differ in that we will have several worker
 threads, each with its own queue. Requests will be distributed between
 them based on an identifier. The request function that the worker
 threads call will be the same as with aio=threads, so I'm not quite
 sure which way to go. Any opinions and hints, like the one you gave,
 are highly appreciated.

 If I understand the slides you linked to correctly, the guest will pass
 an identifier with each request.  The host has worker threads allowing
 each stream of requests to be serviced independently.  The idea is to
 associate guest processes with unique identifiers.

 The guest I/O scheduler is supposed to submit requests in a way that
 meets certain policies (e.g. fairness between processes, deadlines,
 etc).

 Why is it necessary to push this task down into the host?  I don't
 understand the advantage of this approach except that maybe it works
 around certain misconfigurations, I/O scheduler quirks, or plain old
 bugs - all of which should be investigated and fixed at the source
 instead of adding another layer of code to mask them.

It is about I/O scheduling. CFQ, the state-of-the-art I/O scheduler,
merges adjacent requests from the same PID before dispatching them to
the disk.
If we can distinguish between the different threads of a virtual
machine that read/write a file, the I/O scheduler in the host can
merge requests effectively for sequential access. QEMU fails at this
because of its architecture. Apart from the fact that at the moment
there is no way to distinguish the guest threads from each other (I'm
working on some kernel patches), QEMU has one big queue from which
several worker threads grab requests and dispatch them to the disk.
Even if you have one large read from just one thread in the guest, the
I/O scheduler in the host will get the requests from different PIDs (=
worker threads) and won't be able to merge them.
In former versions, some work was done to merge requests in QEMU, but
I don't think it was very useful, because you don't know how the image
file is laid out on the physical disk. Anyway, I think those code
parts have been removed.
The only layer where you really know how the blocks of the virtual
disk image are distributed over the disk is the block layer of the
host. So you have to do the block request merging there. With the new
architecture this would come for free, as you can map every thread
from a guest to one worker thread of QEMU.

Matthias


 Stefan



-- 
motzblog.wordpress.com



Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-12-05 Thread Paolo Bonzini
On 05/12/2013 09:40, Matthias Brugger wrote:
 CFQ, the state-of-the-art I/O scheduler

The deadline scheduler typically provides much better performance for
server usage (including hosting VMs).  It doesn't support some features
such as I/O throttling via cgroups, but QEMU now has a very good
throttling mechanism implemented by Benoit Canet.

I suggest that you repeat your experiments using all six configurations:
- deadline scheduler with aio=native
- deadline scheduler with aio=threads
- deadline scheduler with aio=threads + your patches
- CFQ scheduler with aio=native
- CFQ scheduler with aio=threads
- CFQ scheduler with aio=threads + your patches
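
For reference, switching between the configurations could look like
this (a hypothetical sketch: the device name, image path, and guest
options are placeholders, not a recommended command line; aio=native
requires cache=none for O_DIRECT):

```shell
# Pick the elevator for the backing device (placeholder: /dev/sdb).
echo deadline > /sys/block/sdb/queue/scheduler   # or: echo cfq > ...
cat /sys/block/sdb/queue/scheduler               # active one shown in brackets

# Run the identical benchmark workload under each aio mode:
qemu-system-x86_64 -m 1024 -drive file=test.img,if=virtio,cache=none,aio=native
qemu-system-x86_64 -m 1024 -drive file=test.img,if=virtio,cache=none,aio=threads
```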

 In former versions, some work was done to merge requests in QEMU, but
 I don't think it was very useful, because you don't know how the image
 file is laid out on the physical disk. Anyway, I think those code
 parts have been removed.

This is still there for writes, in bdrv_aio_multiwrite.  Only
virtio-blk.c uses it, but it's there.

 The only layer where you really know how the blocks of the virtual
 disk image are distributed over the disk is the block layer of the
 host. So you have to do the block request merging there. With the new
 architecture this would come for free, as you can map every thread
 from a guest to one worker thread of QEMU.

This also assumes a relatively dumb guest.  If the guest itself uses a
thread pool, you would have exactly the same problem, wouldn't you?

Paolo



Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-12-05 Thread Stefan Hajnoczi
On Thu, Dec 05, 2013 at 09:40:56AM +0100, Matthias Brugger wrote:
 2013/11/11 Stefan Hajnoczi stefa...@redhat.com:
  On Mon, Nov 11, 2013 at 11:00:45AM +0100, Matthias Brugger wrote:
  2013/11/5 Stefan Hajnoczi stefa...@gmail.com:
   I'd also like to see the thread pool implementation you wish to add
   before we add a layer of indirection which has no users yet.
 
  Fair enough, I will evaluate whether it makes more sense to implement a
  new AIO infrastructure instead of trying to reuse the thread pool.
  My implementation will differ in that we will have several worker
  threads, each with its own queue. Requests will be distributed between
  them based on an identifier. The request function that the worker
  threads call will be the same as with aio=threads, so I'm not quite
  sure which way to go. Any opinions and hints, like the one you gave,
  are highly appreciated.
 
  If I understand the slides you linked to correctly, the guest will pass
  an identifier with each request.  The host has worker threads allowing
  each stream of requests to be serviced independently.  The idea is to
  associate guest processes with unique identifiers.
 
  The guest I/O scheduler is supposed to submit requests in a way that
  meets certain policies (e.g. fairness between processes, deadlines,
  etc).
 
  Why is it necessary to push this task down into the host?  I don't
  understand the advantage of this approach except that maybe it works
  around certain misconfigurations, I/O scheduler quirks, or plain old
  bugs - all of which should be investigated and fixed at the source
  instead of adding another layer of code to mask them.
 
 It is about I/O scheduling. CFQ, the state-of-the-art I/O scheduler,
 merges adjacent requests from the same PID before dispatching them to
 the disk.
 If we can distinguish between the different threads of a virtual
 machine that read/write a file, the I/O scheduler in the host can
 merge requests effectively for sequential access. QEMU fails at this
 because of its architecture. Apart from the fact that at the moment
 there is no way to distinguish the guest threads from each other (I'm
 working on some kernel patches), QEMU has one big queue from which
 several worker threads grab requests and dispatch them to the disk.
 Even if you have one large read from just one thread in the guest, the
 I/O scheduler in the host will get the requests from different PIDs (=
 worker threads) and won't be able to merge them.

From clone(2):

  If several threads are doing I/O on behalf of the same process
  (aio_read(3), for instance), they should employ CLONE_IO to get better
  I/O performance.

I think we should just make sure that worker threads share the same
io_context in QEMU.  That way the host I/O scheduler will treat their
requests as one.

Can you benchmark CLONE_IO and see if it improves the situation?

 In former versions, some work was done to merge requests in QEMU, but
 I don't think it was very useful, because you don't know how the image
 file is laid out on the physical disk. Anyway, I think those code
 parts have been removed.

If you mean bdrv_aio_multiwrite(), it's still there and used by
virtio-blk emulation.  It's only used for write requests, not reads.

Stefan



Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-11-11 Thread Matthias Brugger
2013/11/5 Stefan Hajnoczi stefa...@gmail.com:
 On Mon, Nov 04, 2013 at 11:28:41AM +0100, Matthias Brugger wrote:
 v2:
  - fix issues found by checkpatch.pl
  - change the description of patch 3

 This patch series makes the thread pool implementation modular.
 This allows each drive to use a special implementation.
 The patch series prepares QEMU to be able to include thread pools
 different from the one currently implemented. It will allow
 implementing approaches like paravirtualized block requests [1].

 [1] 
 http://www.linux-kvm.org/wiki/images/5/53/2012-forum-Brugger-lightningtalk.pdf

 -drive aio=threads|native already distinguishes between different AIO
 implementations.  IMO -drive aio= is the logical interface if you want
 to add a new AIO mechanism.

Good point.


 I'd also like to see the thread pool implementation you wish to add
 before we add a layer of indirection which has no users yet.

Fair enough, I will evaluate whether it makes more sense to implement a
new AIO infrastructure instead of trying to reuse the thread pool.
My implementation will differ in that we will have several worker
threads, each with its own queue. Requests will be distributed between
them based on an identifier. The request function that the worker
threads call will be the same as with aio=threads, so I'm not quite
sure which way to go. Any opinions and hints, like the one you gave,
are highly appreciated.


-- 
motzblog.wordpress.com



Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-11-11 Thread Stefan Hajnoczi
On Mon, Nov 11, 2013 at 11:00:45AM +0100, Matthias Brugger wrote:
 2013/11/5 Stefan Hajnoczi stefa...@gmail.com:
  I'd also like to see the thread pool implementation you wish to add
  before we add a layer of indirection which has no users yet.
 
 Fair enough, I will evaluate whether it makes more sense to implement a
 new AIO infrastructure instead of trying to reuse the thread pool.
 My implementation will differ in that we will have several worker
 threads, each with its own queue. Requests will be distributed between
 them based on an identifier. The request function that the worker
 threads call will be the same as with aio=threads, so I'm not quite
 sure which way to go. Any opinions and hints, like the one you gave,
 are highly appreciated.

If I understand the slides you linked to correctly, the guest will pass
an identifier with each request.  The host has worker threads allowing
each stream of requests to be serviced independently.  The idea is to
associate guest processes with unique identifiers.

The guest I/O scheduler is supposed to submit requests in a way that
meets certain policies (e.g. fairness between processes, deadlines,
etc).

Why is it necessary to push this task down into the host?  I don't
understand the advantage of this approach except that maybe it works
around certain misconfigurations, I/O scheduler quirks, or plain old
bugs - all of which should be investigated and fixed at the source
instead of adding another layer of code to mask them.

Stefan



Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-11-11 Thread Alex Bligh

On 11 Nov 2013, at 12:43, Stefan Hajnoczi wrote:

 Why is it necessary to push this task down into the host?  I don't
 understand the advantage of this approach except that maybe it works
 around certain misconfigurations, I/O scheduler quirks, or plain old
 bugs - all of which should be investigated and fixed at the source
 instead of adding another layer of code to mask them.

I can see an argument why a guest with two very differently
performing disks attached might be best served by two worker
threads, particularly if one such thread was in part CPU bound
(inventing this use case is left as an exercise for the reader).

-- 
Alex Bligh







Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-11-11 Thread Paolo Bonzini
On 11/11/2013 18:59, Alex Bligh wrote:
  Why is it necessary to push this task down into the host?  I don't
  understand the advantage of this approach except that maybe it works
  around certain misconfigurations, I/O scheduler quirks, or plain old
  bugs - all of which should be investigated and fixed at the source
  instead of adding another layer of code to mask them.
 
 I can see an argument why a guest with two very differently
 performing disks attached might be best served by two worker
 threads, particularly if one such thread was in part CPU bound
 (inventing this use case is left as an exercise for the reader).

In most cases you want to use aio=native anyway, and then the QEMU
thread pool is entirely bypassed.

Paolo



Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-11-11 Thread Alex Bligh

On 11 Nov 2013, at 18:01, Paolo Bonzini wrote:

 On 11/11/2013 18:59, Alex Bligh wrote:
 Why is it necessary to push this task down into the host?  I don't
 understand the advantage of this approach except that maybe it works
 around certain misconfigurations, I/O scheduler quirks, or plain old
 bugs - all of which should be investigated and fixed at the source
 instead of adding another layer of code to mask them.
 
 I can see an argument why a guest with two very differently
 performing disks attached might be best served by two worker
 threads, particularly if one such thread was in part CPU bound
 (inventing this use case is left as an exercise for the reader).
 
 In most cases you want to use aio=native anyway, and then the QEMU
 thread pool is entirely bypassed.

'most cases' - really? I thought anything using either qcow2 or
ceph wouldn't support that? Also I am guessing that if aio=native is
used then it by definition won't be CPU bound... :-)

From one of the Edinburgh presentations I had thought we were going
towards everything using the thread pool (subject to appropriate
rearchitecture).

-- 
Alex Bligh







Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-11-11 Thread Paolo Bonzini
On 11/11/2013 19:32, Alex Bligh wrote:
 
 On 11 Nov 2013, at 18:01, Paolo Bonzini wrote:
 
  On 11/11/2013 18:59, Alex Bligh wrote:
 Why is it necessary to push this task down into the host?  I don't
 understand the advantage of this approach except that maybe it works
 around certain misconfigurations, I/O scheduler quirks, or plain old
 bugs - all of which should be investigated and fixed at the source
 instead of adding another layer of code to mask them.

 I can see an argument why a guest with two very differently
 performing disks attached might be best served by two worker
 threads, particularly if one such thread was in part CPU bound
 (inventing this use case is left as an exercise for the reader).

 In most cases you want to use aio=native anyway, and then the QEMU
 thread pool is entirely bypassed.
 
 'most cases' - really? I thought anything using either qcow2 or
 ceph won't support that?

qcow2 works very well with aio=native.

ceph, libiscsi, gluster, etc. indeed will not support aio=native, but
then they won't use the thread pool either, so I wasn't thinking about
them (only files and block devices).

Paolo



Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-11-05 Thread Stefan Hajnoczi
On Mon, Nov 04, 2013 at 11:28:41AM +0100, Matthias Brugger wrote:
 v2:
  - fix issues found by checkpatch.pl
  - change the description of patch 3

 This patch series makes the thread pool implementation modular.
 This allows each drive to use a special implementation.
 The patch series prepares QEMU to be able to include thread pools
 different from the one currently implemented. It will allow
 implementing approaches like paravirtualized block requests [1].
 
 [1] 
 http://www.linux-kvm.org/wiki/images/5/53/2012-forum-Brugger-lightningtalk.pdf

-drive aio=threads|native already distinguishes between different AIO
implementations.  IMO -drive aio= is the logical interface if you want
to add a new AIO mechanism.

I'd also like to see the thread pool implementation you wish to add
before we add a layer of indirection which has no users yet.



[Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-11-04 Thread Matthias Brugger
v2:
 - fix issues found by checkpatch.pl
 - change the description of patch 3

This patch series makes the thread pool implementation modular.
This allows each drive to use a special implementation.
The patch series prepares QEMU to be able to include thread pools
different from the one currently implemented. It will allow
implementing approaches like paravirtualized block requests [1].

[1] 
http://www.linux-kvm.org/wiki/images/5/53/2012-forum-Brugger-lightningtalk.pdf

Matthias Brugger (3):
  Make thread pool implementation modular
  Block layer uses modular thread pool
  Add workerthreads configuration option

 async.c |  4 ++--
 block/raw-posix.c   | 15 +++
 block/raw-win32.c   |  9 +++--
 blockdev.c  | 13 +
 include/block/aio.h |  2 +-
 include/block/thread-pool.h | 11 +++
 include/qemu-common.h   |  2 ++
 qemu-options.hx |  2 +-
 thread-pool.c   | 33 +
 9 files changed, 81 insertions(+), 10 deletions(-)

-- 
1.8.1.2