On Thu, Mar 31, 2011 at 11:59 AM, Mark McWiggins
<[email protected]> wrote:
> Hi All,
> I've just been working on a project to connect to a Postgresql database with
> Python, and I found:
> * SQLObject too slow
> * SQLAlchemy too slow
> * psycopg2 very buggy
> * psycopg somewhat buggy
> Is the MySQL interface any better?

This depends heavily on the number of records and kinds of queries and
joins you're doing. First check the raw speed of the queries in psql
and mysql -- that'll tell you how much time the database itself is using
and whether PostgreSQL or MySQL is faster for that dataset and those
queries.
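
If you want to script that check rather than eyeball \timing in psql,
something like this works on the PostgreSQL side (the DSN, table, and
query are made up; EXPLAIN ANALYZE reports the server's own execution
time, independent of driver overhead):

    import time
    import psycopg2

    conn = psycopg2.connect("dbname=mydb user=me")
    cur = conn.cursor()

    # Server-side time, as measured by PostgreSQL itself.
    cur.execute("EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42")
    for line in cur.fetchall():
        print(line[0])

    # Full round trip through the driver, for comparison.
    start = time.time()
    cur.execute("SELECT * FROM orders WHERE customer_id = 42")
    rows = cur.fetchall()
    print("round trip: %.3f sec, %d rows" % (time.time() - start, len(rows)))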

MySQL with MyISAM tables is traditionally faster for
write-occasionally, read-often workloads, but you give up the
transactional integrity you get with PostgreSQL or MySQL/InnoDB.

How did you determine that psycopg2 is buggy? Is it based on your own
tests or something you read? psycopg2 is the default PostgreSQL driver
in SQLAlchemy, so it should be reliable in most cases. SQLAlchemy
doesn't even support psycopg, so they seem to think it's obsolete or
not worth bothering with.

PostgreSQL is unusual in having several Python drivers, some optimized
for particular use cases, and some more bug-free than others.

The MySQL driver (MySQLdb) has gone from well supported, to
indifferently maintained, and back to well supported. It has a long
history and is very widely used.

There are three levels of overhead with Python SQL libraries. The
DBAPI modules (psycopg2) have the lowest overhead: they send a SQL
query and convert the result to a list of tuples. You can't get better
than that without custom C code.
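
For example, the whole interaction at the DBAPI level is roughly this
(connection string and table made up):

    import psycopg2

    conn = psycopg2.connect("dbname=mydb user=me")
    cur = conn.cursor()
    cur.execute("SELECT id, name FROM users WHERE name LIKE %s", ("M%",))
    rows = cur.fetchall()   # plain list of tuples, e.g. [(1, 'Mark'), ...]
    conn.close()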

The second level is SQLAlchemy's SQL expression level. It builds
queries in an OO way and returns rows in a more flexible type than a
plain tuple. But these are just small in-memory calculations and
lazily evaluated field values, so the speed difference compared to
raw DBAPI is negligible.
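
For example (table and columns invented; this uses the select([...])
call form from the 0.6/0.7 era -- newer versions take the columns
directly):

    from sqlalchemy import (create_engine, MetaData, Table, Column,
                            Integer, String, select)

    engine = create_engine("postgresql+psycopg2://me@localhost/mydb")
    metadata = MetaData()
    users = Table("users", metadata,
                  Column("id", Integer, primary_key=True),
                  Column("name", String(50)))

    stmt = select([users.c.id, users.c.name]).where(users.c.name.like("M%"))
    for row in engine.connect().execute(stmt):
        print("%s %s" % (row.id, row.name))  # rows allow access by column name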

The third level is an object-relational mapper, which copies every
result row into a class instance. This takes significantly more
overhead, which is very noticeable with thousands of result records.
The answer is not to use the ORM for bulk queries like that. In most
cases you just need a pageful of results to display, and any bulk work
can be embedded in the query itself, so it happens inside the
database server, which is optimized for it. In SQLAlchemy you can
drop to the SQL level for bulk
queries. In SQLObject you can't because there is no SQL level (or at
least there wasn't as of a few years ago).
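
A sketch of the difference in SQLAlchemy (model, table, and engine
URL invented):

    from sqlalchemy import create_engine, Column, Integer, String
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import sessionmaker

    Base = declarative_base()

    class User(Base):
        __tablename__ = "users"
        id = Column(Integer, primary_key=True)
        name = Column(String(50))
        status = Column(String(10))

    engine = create_engine("postgresql+psycopg2://me@localhost/mydb")
    session = sessionmaker(bind=engine)()

    # ORM: a pageful of mapped instances -- fine for display.
    page = session.query(User).order_by(User.name).limit(20).all()

    # SQL level: bulk work stays in the database; no per-row instances.
    session.execute(
        User.__table__.update().where(User.status == "old").values(status="archived")
    )
    session.commit()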

-- 
Mike Orr <[email protected]>
