George Neuner writes:

> But Python's DB pool is threaded, and Python's threads are core
> limited by the GIL in all the major implementations (excepting
> Jython).

Python's Postgres pooling does not[1] use POSIX threads under the hood
to manage the connections, if that's what you mean, nor is the
concurrency of the Python applications based on system threads.  All of
the Python examples use either asyncio + fork(2) or green threads +
fork(2).  This includes the Django example[3].
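To make the point in [1] concrete, here is a minimal sketch of a pool
that is "threaded" only in the sense that it guards its bookkeeping with
a mutex from the `threading' module.  `LockedPool' and its `factory'
argument are hypothetical names for illustration; the method names are
loosely modelled on psycopg2's pool, but this is not its code.

```python
import threading


class LockedPool:
    """Hypothetical pool: shares connections under a lock, starts no threads."""

    def __init__(self, factory, maxconn):
        self._factory = factory        # callable that opens a new connection
        self._maxconn = maxconn        # upper bound on checked-out connections
        self._idle = []                # connections ready for reuse
        self._in_use = 0               # connections currently checked out
        self._lock = threading.Lock()  # a mutex only; no thread is spawned

    def getconn(self):
        with self._lock:
            if self._idle:
                conn = self._idle.pop()
            elif self._in_use < self._maxconn:
                conn = self._factory()
            else:
                raise RuntimeError("connection pool exhausted")
            self._in_use += 1
            return conn

    def putconn(self, conn):
        with self._lock:
            self._in_use -= 1
            self._idle.append(conn)
```

Creating and using such a pool leaves `threading.active_count()'
unchanged; the lock only serializes access if the caller happens to be
multi-threaded.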

> There are a few things Python can do faster than Racket, but the VAST
> difference in performance shown in the techempower tests isn't
> explained by them.

Here's a benchmark that doesn't touch the DB at all, showing an even
bigger difference in throughput between the two:

https://www.techempower.com/benchmarks/#section=data-r19&hw=ph&test=json

Here's the same benchmark running on my local machine where I
intentionally limited the Django app to a single CPU and I made it use
the `gevent' library for its workers so it is more comparable to the
Racket implementation:

https://www.techempower.com/benchmarks/#section=test&shareid=14ecbf16-cdb3-4501-8b7d-a2b8a549f73c&hw=ph&test=json&a=2

And here's what happens when I let it use as much parallelism as it can:

https://www.techempower.com/benchmarks/#section=test&shareid=d3ad4d79-c7a7-4ca0-b297-ffda549947c8&hw=ph&test=json&a=2

I do agree that improving the parallelism part wouldn't be enough to
catch up (clearly, there's a 2x difference even on a single core), but
it is a large factor here.

I wrote the latest implementation of the Racket code for that benchmark
and I considered doing things like bypassing the "standard"
`dispatch/servlet' implementation to avoid the overhead of all the
continuation machinery in the web server, but that felt like cheating.

Another area where the web server does more work than it should is in
generating responses: the web server uses chunked transfer encoding for
all responses, whereas all the Python web servers simply write the
response directly to the socket when the length of the content is known
ahead of time.
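For illustration, here is a rough sketch of the two framings for the
same JSON body, following the HTTP/1.1 wire format.  The function names,
the header set, and the 4096-byte chunk size are my assumptions; this is
not what either the Racket web server or any of the Python servers
literally emits.

```python
# Status line and headers shared by both framings (assumed for the sketch).
HEAD = b"HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n"


def with_content_length(body: bytes) -> bytes:
    """Length known up front: one extra header, then the raw body."""
    return HEAD + b"Content-Length: %d\r\n\r\n" % len(body) + body


def with_chunked(body: bytes, chunk_size: int = 4096) -> bytes:
    """Chunked: each chunk gets a hex size line and a trailing CRLF."""
    out = [HEAD, b"Transfer-Encoding: chunked\r\n\r\n"]
    for i in range(0, len(body), chunk_size):
        chunk = body[i:i + chunk_size]
        out.append(b"%x\r\n" % len(chunk) + chunk + b"\r\n")
    out.append(b"0\r\n\r\n")  # zero-length chunk terminates the body
    return b"".join(out)


body = b'{"message":"Hello, World!"}'
plain = with_content_length(body)
chunked = with_chunked(body)
```

For a small body like this one the chunked form is only a few bytes
longer on the wire; the extra cost is mostly in the per-chunk
bookkeeping on the server side.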

Another thing of note about the Django implementation is that it uses
ujson, written in C with the express intent of being as fast as
possible, to generate the JSON data.


[1]: They call the default implementation a `ThreadedConnectionPool'[2],
but that's just because it uses the mutexes that the `threading' module
provides.

[2]: 
https://github.com/psycopg/psycopg2/blob/779a1370ceeac130de07edc0510f2c55846be1bd/lib/pool.py#L155

[3]: 
https://github.com/TechEmpower/FrameworkBenchmarks/blob/c49524762379a2cdf82627b0032c654f3a9eafb6/frameworks/Python/django/gunicorn_conf.py#L8-L21
