On Thu, Mar 31, 2011 at 11:59 AM, Mark McWiggins <[email protected]> wrote:
> Hi All,
> I've just been working on a project to connect to a Postgresql database with
> Python, and I found:
> * SQLObject too slow
> * SQLAlchemy too slow
> * psycopg2 very buggy
> * psycopg somewhat buggy
> Is the MySQL interface any better?
This depends heavily on the number of records and the kinds of queries and
joins you're doing. First check the raw speed of the queries in psql and
mysql -- that will tell you how much time the database itself is using, and
whether PostgreSQL or MySQL is faster for that dataset and those queries.
MySQL with MyISAM tables is traditionally faster for write-occasionally,
read-often workloads, but you lose the transactional integrity of PostgreSQL
or MySQL/InnoDB.

How did you determine that psycopg2 is buggy? Is that based on your own
tests or on something you read? psycopg2 is the default PostgreSQL driver in
SQLAlchemy, so it should be reliable in most cases. SQLAlchemy doesn't even
support the older psycopg, so they seem to consider it obsolete or not worth
bothering with. PostgreSQL is unusual in having several Python drivers, some
optimized for particular use cases and some more bug-free than others. The
MySQL driver has gone from well supported to mediocre and back to well
supported; it has a long history and is very widely used.

There are three levels of overhead with Python SQL libraries. (I've put
rough sketches of all three at the end of this message.) The DBAPI modules
(psycopg2) have the lowest overhead: they send a SQL query and convert the
result to a list of tuples. You can't do better than that without custom C
code.

The second level is SQLAlchemy's SQL layer. It builds queries in an
object-oriented way and returns a more flexible row type. But those are
minimal in-memory calculations and lazily evaluated field values, so the
speed difference compared to raw DBAPI is negligible.

The third level is an object-relational mapper, which copies every result
row into a class instance. That adds significant overhead, and it becomes
very noticeable with thousands of result records. The answer is not to use
the ORM for bulk queries like that. In most cases you only need a pageful of
results to display, and any bulk work is embedded in the query itself, so it
happens inside the database server, which is optimized for it. In SQLAlchemy
you can drop down to the SQL layer for bulk queries; in SQLObject you can't,
because there is no SQL layer (or at least there wasn't as of a few years
ago).
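Here's roughly what the raw DBAPI level looks like, with a crude timing
check thrown in since you're benchmarking anyway. This is an untested
sketch; the connection string, table, and column names are all made up:

    import time
    import psycopg2

    # Connection parameters are placeholders -- substitute your own.
    conn = psycopg2.connect("dbname=test user=me")
    cur = conn.cursor()

    start = time.time()
    cur.execute("SELECT id, name FROM users WHERE active = %s", (True,))
    rows = cur.fetchall()   # plain list of tuples, e.g. [(1, 'alice')]
    print("%d rows in %.3f sec" % (len(rows), time.time() - start))

    cur.close()
    conn.close()

Compare that number against the same query in psql with \timing on; the
difference is the driver's overhead, which should be small.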
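The SQL layer looks something like this. Again an untested sketch, written
against the current 0.6-era API and a made-up table definition:

    from sqlalchemy import (create_engine, MetaData, Table, Column,
                            Integer, String, select)

    engine = create_engine("postgresql+psycopg2://me@localhost/test")
    metadata = MetaData()
    users = Table("users", metadata,
                  Column("id", Integer, primary_key=True),
                  Column("name", String(50)))

    # The query is built as Python objects, then compiled to SQL once.
    query = select([users.c.id, users.c.name]).where(users.c.name.like("a%"))
    for row in engine.execute(query):
        print(row["name"])   # rows index like tuples *and* like dicts

The flexible row type is that last line: you can treat the row as a tuple or
look fields up by name, and neither costs anything until you actually do it.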
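And here's the ORM level next to the escape hatch I mentioned, in one
session. Same caveats: untested, made-up schema:

    from sqlalchemy import create_engine, Column, Integer, String, select
    from sqlalchemy.orm import sessionmaker
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class User(Base):
        __tablename__ = "users"
        id = Column(Integer, primary_key=True)
        name = Column(String(50))

    engine = create_engine("postgresql+psycopg2://me@localhost/test")
    Session = sessionmaker(bind=engine)
    session = Session()

    # ORM level: one mapped User instance per row -- fine for a pageful.
    page = session.query(User).filter(User.name.like("a%")).limit(20).all()

    # Bulk work: drop to the SQL layer on the same session and get plain
    # rows back, skipping the per-row object construction entirely.
    rows = session.execute(select([User.__table__])).fetchall()

The .all() call is where the per-row instance construction happens; the
session.execute() line returns the same data without paying for it.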
--
Mike Orr <[email protected]>
