Cristian Gafton wrote:
>> In database applications you rarely want huge result sets
>> in one go. You typically try to read in the data in chunks where the
>> actual fetching of the chunks is done using multiple SQL statements,
> 
> That's my point exactly - the only one that knows what to expect back from 
> the backend is the application writer, because on the more popular 
> databases (like MySQL and PostgreSQL), fetching results in chunks adds a 
> sizeable cost in extra roundtrips and slows down retrieval of the results.

Right, but this isn't something for the DB API to define. You
have to use the SQL of a particular database backend and its
features (such as server side cursors, ability to limit/offset
the query result set, etc.).

A module author can make things a little easier for the programmer
by providing this functionality via the cursor.scroll()
method, e.g.

cursor.execute('select * from mytable')
cursor.scroll(9999)
rs = cursor.fetchmany(100)

to fetch 100 rows at offset 9999 of the result set.
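For backends without .scroll() support, the same chunking can be done
with backend-specific SQL (LIMIT/OFFSET on PostgreSQL, MySQL and
SQLite). A minimal runnable sketch using the stdlib sqlite3 module;
the table name and data are made up for illustration:

```python
import sqlite3

# In-memory database with sample data (illustration only)
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE mytable (id INTEGER)')
cur.executemany('INSERT INTO mytable (id) VALUES (?)',
                [(i,) for i in range(200)])

# Fetch 100 rows starting at offset 50 using backend-specific SQL
cur.execute('SELECT id FROM mytable ORDER BY id LIMIT 100 OFFSET 50')
rs = cur.fetchall()
```

Note that on some backends OFFSET still forces the server to scan the
skipped rows, so this is not always cheaper than a server-side cursor.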

>>>> Maybe I'm missing something, but doesn't the programmer let the
>>>> database module know by using either .fetchmany() or
>>>> .fetchall() ?!
>>> It doesn't. The C level APIs of the databases are written in such a way
>>> that at the end of the low level on-the-wire execute() call you are making
>>> you get returned the entire result set. There is nothing fetchXXX can do
>>> to help you there.
>> If that's the case for PostgreSQL, perhaps you need to add a
>> non-standard method to choose the fetch strategy before doing
>> the .executexxx() ?!
> 
> Okay, that's what I was proposing with
>       cursor = db.itercursor()
> 
> which would set up the cursor to iterate through the results, in a similar 
> fashion to what dict.iteritems() does compared to dict.items(). I take it 
> you agree with that approach then?

It would be better to define the fetching strategy on a regular
cursor object, e.g. cursor.setprefetchsize(1024) to have the
database module prepare fetches of 1024 rows or
cursor.setprefetchsize(sys.maxint) to always read the whole
result set.
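To make the idea concrete, here is a hypothetical sketch of such a
setprefetchsize() API (the class name and method are my invention,
not part of DB API 2.0), layered over a real cursor by batching
fetchmany() calls:

```python
import sqlite3

class PrefetchCursor:
    """Hypothetical wrapper sketching the proposed setprefetchsize()
    API: iterates a result set in prefetch-sized batches using the
    underlying cursor's fetchmany()."""

    def __init__(self, cursor, size=100):
        self._cursor = cursor
        self._size = size

    def setprefetchsize(self, size):
        # How many rows to pull from the backend per round trip
        self._size = size

    def execute(self, operation, parameters=()):
        return self._cursor.execute(operation, parameters)

    def __iter__(self):
        # Yield rows batch by batch until the result set is exhausted
        while True:
            rows = self._cursor.fetchmany(self._size)
            if not rows:
                break
            for row in rows:
                yield row

# Demo with sqlite3 (illustration only)
conn = sqlite3.connect(':memory:')
cur = PrefetchCursor(conn.cursor())
cur.execute('CREATE TABLE t (id INTEGER)')
conn.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(10)])
cur.setprefetchsize(3)
cur.execute('SELECT id FROM t ORDER BY id')
rows = list(cur)
```

A real driver would of course translate the prefetch size into its
native batched-fetch mechanism rather than emulate it in Python.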

You could also use the cursor.arraysize attribute as an
indicator of how many rows to pre-fetch: cursor.arraysize defines
the default number of rows fetched by cursor.fetchmany() when it
is called without an argument.
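A runnable sketch of arraysize acting as that hint, again using
sqlite3 with made-up data:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE t (id INTEGER)')
cur.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(10)])

cur.arraysize = 4           # default chunk size for fetchmany()
cur.execute('SELECT id FROM t ORDER BY id')
chunk = cur.fetchmany()     # no argument: honours cur.arraysize
```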

Aside: cursors can optionally implement the iteration
protocol, so you can write:

for row in cursor:
    print row
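sqlite3's cursor, for one, already implements this protocol
(illustration with made-up data):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE t (id INTEGER)')
cur.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(3)])

cur.execute('SELECT id FROM t ORDER BY id')
rows = [row for row in cur]   # iterating the cursor directly
```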

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 07 2006)
DB-SIG maillist  -  DB-SIG@python.org