I am trying to export about 5 million records (with 6 float fields each)
from a table and write them to a file with Python. I query the database
with the pg module. I have 4 GB of RAM, but the Python script crashes
during getresult() with a MemoryError.
A float is stored as a double in Python (usually 8 bytes), so the data
itself should fit into memory, but how much of the 4 GB is really free?
Maybe Postgres has grabbed a large part of it as shared memory, or other
processes are using it up. If you can send me the traceback or even a
demo script, I can check whether things can be optimized a bit.
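For a rough back-of-the-envelope estimate of the raw payload (this
ignores the per-object overhead of the Python floats and tuples that
getresult() builds, which adds considerably on top):

    # raw payload: 5 million rows x 6 floats x 8 bytes per double
    rows, fields, bytes_per_double = 5000000, 6, 8
    raw_bytes = rows * fields * bytes_per_double
    print("%.0f MB" % (raw_bytes / (1024.0 * 1024.0)))   # about 229 MB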
The pgdb module supports the fetchmany() method, but it is considerably
slower than the pg module.
What do you think is the best way to extract a huge amount of data from
a database query and write it to a file with a Python script? A loop
over fetchmany() with the pgdb module?
Unfortunately, the pg module does not support cursors; getresult()
always loads the complete result set into memory. You have to use pgdb
for cursors. Is fetchmany() with an appropriately sized chunk really
that slow?
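A minimal sketch of such a loop, assuming a pgdb connection and
tab-separated output (connection parameters, table and column names are
illustrative):

    import pgdb

    con = pgdb.connect(database='mydb')    # connection parameters are illustrative
    cur = con.cursor()
    cur.execute("select f1, f2, f3, f4, f5, f6 from mytable")

    out = open('/mypath/myfile.dat', 'w')
    while True:
        rows = cur.fetchmany(10000)        # chunk size worth experimenting with
        if not rows:
            break
        for row in rows:
            out.write('\t'.join(map(str, row)) + '\n')
    out.close()
    cur.close()
    con.close()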
Two other methods that immediately come to mind:
* Use the copy command:
db.query("copy mytable to '/mypath/myfile.dat'")
* Perhaps it is possible to partition your selects and read the data in
chunks, something like:
db.query('select ... where primary_key '
         'between 2000 and 3000').getresult()
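Expanding on that second idea, a minimal sketch with the classic pg
module, assuming an integer primary key column named id (all names and
paths are illustrative):

    import pg

    db = pg.DB(dbname='mydb')   # connection parameters are illustrative
    out = open('/mypath/myfile.dat', 'w')

    chunk = 100000
    max_id = db.query("select max(id) from mytable").getresult()[0][0]
    lo = 0
    while lo <= max_id:
        rows = db.query("select f1, f2, f3, f4, f5, f6 from mytable "
                        "where id between %d and %d"
                        % (lo, lo + chunk - 1)).getresult()
        for row in rows:
            out.write('\t'.join(map(str, row)) + '\n')
        lo += chunk
    out.close()
    db.close()

This way only one chunk of rows is held in memory at a time.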
-- Christoph