On Thu, May 31, 2007 14:17, Trigve Siver wrote:

> I mean if there is some diffrence using pqxx Transaction.exec() and
> pqxx::absolute_cursor. AFAIK cursor read results "on demand" while
> Transaction.exec()
> return whole result for executed query. Is it true? Can be there some
> performance gain when using cursors?

Ah, so it's not really about absolute_cursor but about cursors in general.

You understood correctly.  While a cursor only returns chunks of the work,
and only when you ask for them, transaction_base::exec() simply reads the
whole result set at once.

When talking purely about the overall speed it takes to retrieve all your
results, and in a normal program, a cursor will always be slower than
exec().  Don't complicate your code with cursors if you don't need to!

So why do we have cursors at all?  They make sense in many cases:

 * If your result set can be huge, and you only look at one row at a time,
there's no reason to keep the entire result set in memory.  With a cursor
you can download and process just one chunk at a time.

 * If you want to give quick first results, and it takes too long to get
the whole result set, use a cursor to retrieve and process a bit of your
data relatively soon.

 * If you want to give regular progress reports, e.g. a progress bar, on
large result sets, break them up into chunks with a cursor.

 * If your code has situations where you often discover that you want to
skip the next n rows of a result, use a cursor to skip right over them.

 * If you really, really want to speed up your program and it's worth a
bit of extra complexity, have one thread/process retrieve pieces of your
result set and hand them to another thread/process that works with the
data as it comes in.  In the ideal case, if you spend exactly half your
time waiting for results and half your time processing them, and you have
a large data set with consistent timings, this trick will let you
approximate a 2x speedup.

BTW, a cursor does *not* make sense if you want to retrieve only a part of
your result set, and ignore the rest.  For that, use the LIMIT and OFFSET
parameters to the SELECT command.

There is also a cachedresult class that combines the advantages of both,
so it feels like you've got your entire result set but in reality chunks
are fetched on demand.  In the future it may even incorporate that
optimization I described.  But for now, cachedresult is awaiting a big
overhaul.


Jeroen


_______________________________________________
Libpqxx-general mailing list
[email protected]
http://gborg.postgresql.org/mailman/listinfo/libpqxx-general

Reply via email to