Thank you Jonathan and Andrew, that makes it clearer to me.
I'm adding time counters to the source code to try to find a good
compromise.
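
For example, something along these lines (a rough sketch, not my exact code):

import time
import asyncio

@asyncio.coroutine
def timed(name, coro):
    # Measure the wall-clock time spent in a single coroutine call.
    start = time.monotonic()
    result = yield from coro
    print('%s took %.6f s' % (name, time.monotonic() - start))
    return result

# Usage: world = yield from timed('update', update_random_record(container))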

Regards.

--
Ludovic Gasc

On Thu, Jan 29, 2015 at 5:10 PM, Andrew Svetlov <[email protected]>
wrote:

> Task creation requires one extra event loop iteration to start.
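>
> Roughly, for illustration (untested sketch; only the standard asyncio API
> is assumed):
>
> import asyncio
>
> @asyncio.coroutine
> def child():
>     print('child started')
>
> @asyncio.coroutine
> def parent(loop):
>     yield from child()                # runs child() inline, immediately
>     task = loop.create_task(child())  # only schedules it; its body starts
>     yield from task                   # on the next loop iteration
>
> loop = asyncio.get_event_loop()
> loop.run_until_complete(parent(loop))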
>
> On Thu, Jan 29, 2015 at 6:03 PM, Jonathan Slenders
> <[email protected]> wrote:
> > From my experience, Task creation can result in noticeable overhead when
> > you try to split up work into many *really* small tasks. "Yield from" is
> > really light, but when you create a Task, you instantiate a full Python
> > object, and "yield from task" will actually proxy through Task.__iter__
> > (I think).
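> >
> > Roughly, the difference I mean (simplified sketch, reusing the names from
> > your code):
> >
> > # Cheap: no extra object, the coroutine body runs inline in the caller.
> > result = yield from update_random_record(container)
> >
> > # Heavier: allocates a Task, schedules it on the loop, and awaiting it
> > # goes through the Task machinery.
> > task = container.loop.create_task(update_random_record(container))
> > result = yield from task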
> >
> > So, performance-wise, I think you should only create tasks for coroutines
> > which spend a certain amount of their time waiting for I/O, when you want
> > to fill these "I/O" gaps with other tasks.
> > But in a web server, where you have many requests, there's a chance that
> > these gaps are already filled by parallel requests anyway. So, creating
> > more tasks could reduce latency when the load is low, but increase the
> > latency (because of CPU saturation) when the load is high.
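> >
> > As a toy illustration of what I mean (hypothetical numbers, untested):
> >
> > import asyncio
> >
> > @asyncio.coroutine
> > def io_bound():
> >     yield from asyncio.sleep(0.1)  # simulated I/O wait
> >
> > # 20 sequential "yield from io_bound()" calls take about 2 s; 20 parallel
> > # tasks take about 0.1 s, because the waits overlap. If the coroutine did
> > # mostly CPU work instead of waiting, the tasks could not overlap and the
> > # extra Task objects would only add overhead.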
> >
> > I hope this explains it. (And please correct me if I'm wrong.)
> >
> > About the last example, I'm not sure.
> >
> >
> >
> > On Sunday, January 25, 2015 at 00:07:41 UTC+1, Ludovic Gasc wrote:
> >>
> >> Hi,
> >>
> >> I have a "strange" behaviour with AsyncIO: the more I try to improve
> >> performance, the slower it gets.
> >> I certainly missed something in AsyncIO, or maybe you have some tips
> >> other than trying and benchmarking each change.
> >>
> >> For example, here are two coroutines that update data in PostgreSQL,
> >> executed by aiohttp.web and API-Hour:
> >> (If you want to see all the code, it's available here:
> >> https://github.com/Eyepea/FrameworkBenchmarks/tree/API-Hour/frameworks/Python/API-Hour/hello/hello
> >> )
> >>
> >> @asyncio.coroutine
> >> def update_random_records(container, limit):
> >>     results = []
> >>     for i in range(limit):
> >>         results.append((yield from update_random_record(container)))
> >>
> >>     return results
> >>
> >>
> >> @asyncio.coroutine
> >> def update_random_record(container):
> >>     pg = yield from container.engines['pg']
> >>
> >>     world = yield from get_random_record(container)
> >>
> >>     with (yield from pg.cursor()) as cur:
> >>         yield from cur.execute(
> >>             'UPDATE world SET randomnumber=%(random_number)s WHERE id=%(idx)s',
> >>             {'random_number': randint(1, 10000), 'idx': world['Id']})
> >>     return world
> >>
> >>
> >> When I launch wrk (an HTTP benchmarking tool) against the HTTP server,
> >> I get this result:
> >>
> >>
> >> lg@steroids:~$ wrk -t8 -c256 -d30s http://127.0.0.1:8008/updates?queries=20
> >>
> >> Running 30s test @ http://127.0.0.1:8008/updates?queries=20
> >>   8 threads and 256 connections
> >>   Thread Stats   Avg      Stdev     Max   +/- Stdev
> >>     Latency   547.86ms  428.03ms   2.90s    85.37%
> >>     Req/Sec    59.53     12.62    92.00     69.32%
> >>   14283 requests in 30.04s, 10.99MB read
> >>   Socket errors: connect 0, read 0, write 0, timeout 37
> >> Requests/sec:    475.42
> >> Transfer/sec:    374.46KB
> >>
> >>
> >> Now, when I change the update_random_records coroutine to launch all the
> >> update_random_record coroutines at the same time, instead of waiting for
> >> the end of one update_random_record before launching the next:
> >>
> >>
> >> @asyncio.coroutine
> >> def update_random_records(container, limit):
> >>     tasks = []
> >>     results = []
> >>     for i in range(limit):
> >>         tasks.append(container.loop.create_task(update_random_record(container)))
> >>     yield from asyncio.wait(tasks)
> >>     for task in tasks:
> >>         results.append(task.result())
> >>     return results
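> >>
> >> (I think the same thing could be written more compactly with
> >> asyncio.gather, which also wraps each coroutine in a Task and returns
> >> the results in order; untested sketch:
> >>
> >> @asyncio.coroutine
> >> def update_random_records(container, limit):
> >>     coros = [update_random_record(container) for i in range(limit)]
> >>     return (yield from asyncio.gather(*coros))
> >> )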
> >>
> >>
> >> With this create_task version, I get this result:
> >>
> >>
> >> lg@steroids:~$ wrk -t8 -c256 -d30s http://127.0.0.1:8008/updates?queries=20
> >> Running 30s test @ http://127.0.0.1:8008/updates?queries=20
> >>   8 threads and 256 connections
> >>   Thread Stats   Avg      Stdev     Max   +/- Stdev
> >>     Latency   585.21ms  563.88ms   3.95s    89.03%
> >>     Req/Sec    57.56     18.82   118.00     66.89%
> >>   13480 requests in 30.04s, 10.37MB read
> >>   Socket errors: connect 0, read 0, write 0, timeout 193
> >> Requests/sec:    448.76
> >> Transfer/sec:    353.49KB
> >>
> >>
> >> As you can see, fewer requests/sec, but also more HTTP requests in
> >> timeout. The bottleneck should be my PostgreSQL database.
> >>
> >> And now, if I add a Semaphore(value=10) to limit the number of
> >> concurrent update_random_record coroutines:
> >>
> >>
> >> @asyncio.coroutine
> >> def update_random_record(container):
> >>     with (yield from container.semaphores['updates']):
> >>         pg = yield from container.engines['pg']
> >>
> >>         world = yield from get_random_record(container)
> >>
> >>         with (yield from pg.cursor()) as cur:
> >>             yield from cur.execute(
> >>                 'UPDATE world SET randomnumber=%(random_number)s WHERE id=%(idx)s',
> >>                 {'random_number': randint(1, 10000), 'idx': world['Id']})
> >>         return world
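> >>
> >> (For completeness: the semaphore is created once at startup, along these
> >> lines; simplified sketch, not the exact setup code:
> >>
> >> container.semaphores = {'updates': asyncio.Semaphore(value=10)}
> >> )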
> >>
> >>
> >> Now:
> >>
> >> lg@steroids:~$ wrk -t8 -c256 -d30s http://127.0.0.1:8008/updates?queries=20
> >>
> >> Running 30s test @ http://127.0.0.1:8008/updates?queries=20
> >>   8 threads and 256 connections
> >>   Thread Stats   Avg      Stdev     Max   +/- Stdev
> >>     Latency   619.24ms  476.83ms   3.20s    81.59%
> >>     Req/Sec    52.74      9.49    81.00     69.92%
> >>   12590 requests in 30.03s, 9.68MB read
> >>   Socket errors: connect 0, read 0, write 0, timeout 53
> >> Requests/sec:    419.23
> >> Transfer/sec:    330.21KB
> >>
> >>
> >> It's better, but still slower than the first attempt with a simple yield
> >> from in a loop. Changing the Semaphore value doesn't change the result much.
> >>
> >> I already had this problem with AsyncIO in another context, slurping all
> >> the database content from a CouchDB instance: when I tried to launch
> >> several HTTP requests at the same time, the Python script needed more time.
> >>
> >>
> >> Could it be the context switching between coroutines that reduces
> >> performance? How can I identify/measure that?
> >>
> >>
> >> Thanks for your ideas.
> >>
> >>
> >> Regards.
> >>
> >>
> >
>
>
>
> --
> Thanks,
> Andrew Svetlov
>
