Hi,

I've a "strange" behaviour with AsyncIO: more I try to improve the 
performances, more it's slow.
I certainly missed something in AsyncIO, or maybe you have some tips other 
than try and benchmark for each change.

For example, here are two coroutines that update data in PostgreSQL, 
executed by aiohttp.web and API-Hour:
(If you want to see all the code, it's available here:
https://github.com/Eyepea/FrameworkBenchmarks/tree/API-Hour/frameworks/Python/API-Hour/hello/hello)

@asyncio.coroutine
def update_random_records(container, limit):
    results = []
    for i in range(limit):
        results.append((yield from update_random_record(container)))

    return results


@asyncio.coroutine
def update_random_record(container):
    pg = yield from container.engines['pg']

    world = yield from get_random_record(container)

    with (yield from pg.cursor()) as cur:
        yield from cur.execute('UPDATE world SET randomnumber=%(random_number)s WHERE id=%(idx)s',
                               {'random_number': randint(1, 10000), 'idx': world['Id']})
    return world


When I launch wrk (an HTTP benchmarking tool) against the HTTP server, I get 
this result:


lg@steroids:~$ wrk -t8 -c256 -d30s http://127.0.0.1:8008/updates?queries=20
Running 30s test @ http://127.0.0.1:8008/updates?queries=20
  8 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   547.86ms  428.03ms   2.90s    85.37%
    Req/Sec    59.53     12.62    92.00     69.32%
  14283 requests in 30.04s, 10.99MB read
  Socket errors: connect 0, read 0, write 0, timeout 37
Requests/sec:    475.42
Transfer/sec:    374.46KB


Now, when I change the update_random_records coroutine to launch all 
update_random_record coroutines at the same time, instead of waiting for each 
one to finish before launching the next:


@asyncio.coroutine
def update_random_records(container, limit):
    tasks = []
    results = []
    for i in range(limit):
        tasks.append(container.loop.create_task(update_random_record(container)))
    yield from asyncio.wait(tasks)
    for task in tasks:
        results.append(task.result())
    return results
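
(If I understand the documentation correctly, asyncio.gather should be an 
equivalent, shorter way to write this; I assume it schedules the coroutines 
concurrently and returns the results in order, like create_task + wait above:)

@asyncio.coroutine
def update_random_records(container, limit):
    # gather runs all the update coroutines concurrently and
    # returns their results in the order they were passed in
    return (yield from asyncio.gather(
        *[update_random_record(container) for i in range(limit)],
        loop=container.loop))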


With the asyncio.wait version, I get this result:


lg@steroids:~$ wrk -t8 -c256 -d30s http://127.0.0.1:8008/updates?queries=20
Running 30s test @ http://127.0.0.1:8008/updates?queries=20
  8 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   585.21ms  563.88ms   3.95s    89.03%
    Req/Sec    57.56     18.82   118.00     66.89%
  13480 requests in 30.04s, 10.37MB read
  Socket errors: connect 0, read 0, write 0, timeout 193
Requests/sec:    448.76
Transfer/sec:    353.49KB


As you can see, fewer requests/sec, but also more HTTP requests timing out. 
The bottleneck should be my PostgreSQL database.

And now, if I add a Semaphore(value=10) to limit the number of concurrent 
update_random_record coroutines:


@asyncio.coroutine
def update_random_record(container):
    with (yield from container.semaphores['updates']):
        pg = yield from container.engines['pg']

        world = yield from get_random_record(container)

        with (yield from pg.cursor()) as cur:
            yield from cur.execute('UPDATE world SET randomnumber=%(random_number)s WHERE id=%(idx)s',
                                   {'random_number': randint(1, 10000), 'idx': world['Id']})
        return world
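
(The semaphore itself is created once at startup; that part of my container 
setup isn't shown here, but it's roughly this, using the 'updates' key from 
the code above:)

container.semaphores['updates'] = asyncio.Semaphore(value=10)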


Now:

lg@steroids:~$ wrk -t8 -c256 -d30s http://127.0.0.1:8008/updates?queries=20
Running 30s test @ http://127.0.0.1:8008/updates?queries=20
  8 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   619.24ms  476.83ms   3.20s    81.59%
    Req/Sec    52.74      9.49    81.00     69.92%
  12590 requests in 30.03s, 9.68MB read
  Socket errors: connect 0, read 0, write 0, timeout 53
Requests/sec:    419.23
Transfer/sec:    330.21KB


The timeouts are better, but it's still slower than the first attempt with a 
simple yield from in a loop. Changing the Semaphore value doesn't change the 
result much.

I've already had this problem with AsyncIO in another context: slurping the 
entire contents of all databases in a CouchDB instance. When I tried to launch 
several HTTP I/O requests at the same time, the Python script needed more time.


Could it be the context switching between coroutines that reduces performance?
How can I identify/measure that?
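
(Maybe asyncio's debug mode could help here? It logs callbacks that block the 
event loop for longer than slow_callback_duration, though I'm not sure that 
captures the overhead of switching between coroutines:)

import asyncio
import logging

logging.basicConfig(level=logging.DEBUG)

loop = asyncio.get_event_loop()
loop.set_debug(True)                # enable asyncio debug mode
loop.slow_callback_duration = 0.01  # warn about callbacks longer than 10 ms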


Thanks for your ideas.


Regards.

