Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 27.11.2014 at 17:40, Paolo Bonzini wrote: On 27/11/2014 11:27, Peter Lieven wrote: +static __thread struct CoRoutinePool { +Coroutine *ptrs[POOL_MAX_SIZE]; +unsigned int size; +unsigned int nextfree; +} CoPool; The per-thread ring unfortunately didn't work well last time

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 11:28, Paolo Bonzini wrote: On 28/11/2014 09:13, Peter Lieven wrote: On 27.11.2014 at 17:40, Paolo Bonzini wrote: On 27/11/2014 11:27, Peter Lieven wrote: +static __thread struct CoRoutinePool { +Coroutine *ptrs[POOL_MAX_SIZE]; +unsigned int size; +unsigned

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Paolo Bonzini
master: Run operation 4000 iterations 12.851414 s, 3112K operations/s, 321ns per coroutine paolo: Run operation 4000 iterations 11.951720 s, 3346K operations/s, 298ns per coroutine Nice. :) Can you please try coroutine: Use __thread … together, too? I still see 11% time

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 12:14, Paolo Bonzini wrote: master: Run operation 4000 iterations 12.851414 s, 3112K operations/s, 321ns per coroutine paolo: Run operation 4000 iterations 11.951720 s, 3346K operations/s, 298ns per coroutine Nice. :) Can you please try coroutine: Use __thread

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Paolo Bonzini
On 28/11/2014 12:21, Peter Lieven wrote: On 28.11.2014 at 12:14, Paolo Bonzini wrote: master: Run operation 4000 iterations 12.851414 s, 3112K operations/s, 321ns per coroutine paolo: Run operation 4000 iterations 11.951720 s, 3346K operations/s, 298ns per coroutine Nice. :)

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 12:23, Paolo Bonzini wrote: On 28/11/2014 12:21, Peter Lieven wrote: On 28.11.2014 at 12:14, Paolo Bonzini wrote: master: Run operation 4000 iterations 12.851414 s, 3112K operations/s, 321ns per coroutine paolo: Run operation 4000 iterations 11.951720 s, 3346K

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 12:23, Paolo Bonzini wrote: On 28/11/2014 12:21, Peter Lieven wrote: On 28.11.2014 at 12:14, Paolo Bonzini wrote: master: Run operation 4000 iterations 12.851414 s, 3112K operations/s, 321ns per coroutine paolo: Run operation 4000 iterations 11.951720 s, 3346K

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 12:32, Peter Lieven wrote: On 28.11.2014 at 12:23, Paolo Bonzini wrote: On 28/11/2014 12:21, Peter Lieven wrote: On 28.11.2014 at 12:14, Paolo Bonzini wrote: master: Run operation 4000 iterations 12.851414 s, 3112K operations/s, 321ns per coroutine paolo: Run

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Paolo Bonzini
On 28/11/2014 12:32, Peter Lieven wrote: On 28.11.2014 at 12:23, Paolo Bonzini wrote: On 28/11/2014 12:21, Peter Lieven wrote: On 28.11.2014 at 12:14, Paolo Bonzini wrote: master: Run operation 4000 iterations 12.851414 s, 3112K operations/s, 321ns per coroutine paolo: Run

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 13:21, Paolo Bonzini wrote: On 28/11/2014 12:32, Peter Lieven wrote: On 28.11.2014 at 12:23, Paolo Bonzini wrote: On 28/11/2014 12:21, Peter Lieven wrote: On 28.11.2014 at 12:14, Paolo Bonzini wrote: master: Run operation 4000 iterations 12.851414 s, 3112K

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Paolo Bonzini
On 28/11/2014 12:46, Peter Lieven wrote: I get: Run operation 4000 iterations 9.883958 s, 4046K operations/s, 247ns per coroutine Ok, understood, it steals the whole pool, right? Isn't that bad if we have more than one thread in need of a lot of coroutines? Overall the algorithm

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 13:26, Paolo Bonzini wrote: On 28/11/2014 12:46, Peter Lieven wrote: I get: Run operation 4000 iterations 9.883958 s, 4046K operations/s, 247ns per coroutine Ok, understood, it steals the whole pool, right? Isn't that bad if we have more than one thread in need of a

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Stefan Hajnoczi
On Thu, Nov 27, 2014 at 11:27:06AM +0100, Peter Lieven wrote: diff --git a/iothread.c b/iothread.c index 342a23f..b53529b 100644 --- a/iothread.c +++ b/iothread.c @@ -15,6 +15,7 @@ #include "qom/object_interfaces.h" #include "qemu/module.h" #include "block/aio.h" +#include "block/coroutine.h"

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Paolo Bonzini
On 28/11/2014 13:39, Peter Lieven wrote: On 28.11.2014 at 13:26, Paolo Bonzini wrote: On 28/11/2014 12:46, Peter Lieven wrote: I get: Run operation 4000 iterations 9.883958 s, 4046K operations/s, 247ns per coroutine Ok, understood, it steals the whole pool, right? Isn't that bad if

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 13:45, Paolo Bonzini wrote: On 28/11/2014 13:39, Peter Lieven wrote: On 28.11.2014 at 13:26, Paolo Bonzini wrote: On 28/11/2014 12:46, Peter Lieven wrote: I get: Run operation 4000 iterations 9.883958 s, 4046K operations/s, 247ns per coroutine Ok, understood, it

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Paolo Bonzini
On 28/11/2014 13:49, Peter Lieven wrote: Idea: If the release_pool is full why not put the coroutine in the thread alloc_pool instead of throwing it away? :-) Because you can only waste 64 coroutines per thread. But numbers cannot s/only// be sneezed at, so it's worth doing it as a
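The idea discussed in this message can be illustrated with a minimal, single-threaded sketch. It assumes a shared release_pool and a per-thread alloc_pool, as named in the thread; the helper names (pool_push, pool_pop, coroutine_get, coroutine_put), the stand-in Coroutine type and the release-pool cap are hypothetical and are not taken from the patch. A real implementation would need atomic operations on the shared pool, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RELEASE_POOL_MAX 64 /* assumed cap for the shared pool */
#define ALLOC_POOL_MAX   64 /* "64 coroutines per thread", per the thread */

/* Stand-in for the real Coroutine type (hypothetical). */
typedef struct Coroutine {
    struct Coroutine *next;
} Coroutine;

typedef struct Pool {
    Coroutine *head;
    unsigned int size;
} Pool;

static Pool release_pool;        /* shared; unsynchronized in this sketch */
static __thread Pool alloc_pool; /* private to each thread */

static void pool_push(Pool *p, Coroutine *co)
{
    assert(co != NULL);
    co->next = p->head;
    p->head = co;
    p->size++;
}

static Coroutine *pool_pop(Pool *p)
{
    Coroutine *co = p->head;
    if (co) {
        p->head = co->next;
        p->size--;
    }
    return co;
}

/* Get a coroutine: refill the thread's pool by taking the whole
 * release_pool when the local pool is empty, else allocate. */
static Coroutine *coroutine_get(void)
{
    Coroutine *co;

    if (alloc_pool.size == 0 && release_pool.size > 0) {
        alloc_pool = release_pool; /* steal the whole pool */
        release_pool.head = NULL;
        release_pool.size = 0;
    }
    co = pool_pop(&alloc_pool);
    if (!co) {
        co = calloc(1, sizeof(*co));
    }
    return co;
}

/* Release a coroutine: prefer the shared pool; if it is full, keep
 * the coroutine in the thread's own pool instead of freeing it. */
static void coroutine_put(Coroutine *co)
{
    if (release_pool.size < RELEASE_POOL_MAX) {
        pool_push(&release_pool, co);
    } else if (alloc_pool.size < ALLOC_POOL_MAX) {
        pool_push(&alloc_pool, co);
    } else {
        free(co);
    }
}
```

With these caps, a thread can hold back at most ALLOC_POOL_MAX idle coroutines, which is the bound on waste mentioned in the reply.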

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 13:39, Peter Lieven wrote: On 28.11.2014 at 13:26, Paolo Bonzini wrote: On 28/11/2014 12:46, Peter Lieven wrote: I get: Run operation 4000 iterations 9.883958 s, 4046K operations/s, 247ns per coroutine Ok, understood, it steals the whole pool, right? Isn't that bad if

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 13:45, Paolo Bonzini wrote: On 28/11/2014 13:39, Peter Lieven wrote: On 28.11.2014 at 13:26, Paolo Bonzini wrote: On 28/11/2014 12:46, Peter Lieven wrote: I get: Run operation 4000 iterations 9.883958 s, 4046K operations/s, 247ns per coroutine Ok, understood, it

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Paolo Bonzini
On 28/11/2014 14:17, Peter Lieven wrote: The release_pool is not cleaned up on termination I think. That's not necessary, it is global. I don't see where you iterate over release_pool and destroy all coroutines? The OS does that for us when we exit. Paolo

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-28 Thread Peter Lieven
On 28.11.2014 at 15:17, Paolo Bonzini wrote: On 28/11/2014 14:17, Peter Lieven wrote: The release_pool is not cleaned up on termination I think. That's not necessary, it is global. I don't see where you iterate over release_pool and destroy all coroutines? The OS does that for us when we

[Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-27 Thread Peter Lieven
This patch creates a ring structure for the coroutine pool instead of a linked list. The implementation of the list has the issue that it always throws away the latest coroutines instead of the oldest ones. This is a drawback since the most recently used coroutines are more likely to be cached than old ones.
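A minimal sketch of such a ring, assuming the struct layout quoted elsewhere in this thread (ptrs, size, nextfree). The helper names pool_put and pool_get, the stand-in Coroutine type and the small POOL_MAX_SIZE are hypothetical; this is not the code from the patch. The sketch hands out the most recently released coroutine first and, when the ring is full, evicts the oldest entry rather than the newest.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_MAX_SIZE 4 /* kept small for illustration (assumed value) */

/* Stand-in for the real Coroutine type (hypothetical). */
typedef struct Coroutine {
    int id;
} Coroutine;

typedef struct CoRoutinePool {
    Coroutine *ptrs[POOL_MAX_SIZE];
    unsigned int size;     /* number of pooled coroutines */
    unsigned int nextfree; /* slot the next released coroutine goes into */
} CoRoutinePool;

/* Store co in the ring. If the ring is full, the oldest entry is
 * overwritten and returned so the caller can destroy it; otherwise
 * NULL is returned. */
static Coroutine *pool_put(CoRoutinePool *p, Coroutine *co)
{
    Coroutine *evicted = NULL;

    assert(co != NULL);
    if (p->size == POOL_MAX_SIZE) {
        /* When full, nextfree has wrapped around to the oldest entry. */
        evicted = p->ptrs[p->nextfree];
    } else {
        p->size++;
    }
    p->ptrs[p->nextfree] = co;
    p->nextfree = (p->nextfree + 1) % POOL_MAX_SIZE;
    return evicted;
}

/* Take the most recently released coroutine, or NULL if the ring
 * is empty. */
static Coroutine *pool_get(CoRoutinePool *p)
{
    if (p->size == 0) {
        return NULL;
    }
    p->nextfree = (p->nextfree + POOL_MAX_SIZE - 1) % POOL_MAX_SIZE;
    p->size--;
    return p->ptrs[p->nextfree];
}
```

In the patch the pool is declared `static __thread`, so each thread would have its own ring and no locking would be needed; the sketch passes the pool explicitly only to stay self-contained.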

Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

2014-11-27 Thread Paolo Bonzini
On 27/11/2014 11:27, Peter Lieven wrote: +static __thread struct CoRoutinePool { +Coroutine *ptrs[POOL_MAX_SIZE]; +unsigned int size; +unsigned int nextfree; +} CoPool; The per-thread ring unfortunately didn't work well last time it was tested. Devices that do not use