On Feb 12, 2006, at 12:35 AM, Mike Orr wrote:

I'm not sure that there is a programmatic solution within SQLObject
for tables with many, many, many rows, is there?  Does anyone out
there have any experience dealing with tables of this size within
SQLObject?

Your situation is the opposite of mine so I don't know.  I have a few
huge records; you have a lot of small ones.  I think SQLObject holds
weakrefs to the records and they disappear immediately when they go
out of scope.  My problem has been, is SQLite memory efficient?
connection.queryAll works fine so I wouldn't hesitate to use it.  You
can use Select() and sqlbuilder and COLUMN.q to construct your SQL if
you want.  Everything else in SQLObject seems tied to the record
object, so I think you get entire records.

For others who might be in the same situation as me, here's how I currently deal with processing very large tables:

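    # process.py
    from itertools import imap

    def step(selection, size):
        """Yield the rows of `selection`, fetching `size` rows per slice."""
        i = 0
        while True:
            output = selection[i:i+size]
            if output.count() > 0:
                for row in output:
                    yield row
                i += size
            else:
                return  # no rows left; this ends the generator

    def process(f, selection, stepsize=1000):
        # Lazily apply f to each row; nothing runs until the result is iterated.
        return imap(f, step(selection, stepsize))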
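
For anyone who wants to try the batching idea without a database, here is a minimal, self-contained sketch of the same approach. It is written for Python 3 (so `map` stands in for `imap`), and `FakeSelection` is a hypothetical stand-in for a sliceable select result, not part of SQLObject. It also materializes each slice into a list instead of calling `count()`, on the assumption that `count()` costs an extra query per batch.

```python
class FakeSelection:
    """Hypothetical stand-in for a sliceable, iterable select result."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __getitem__(self, key):
        # Slicing returns another selection, like a lazy query result would.
        return FakeSelection(self._rows[key])

    def __iter__(self):
        return iter(self._rows)


def step(selection, size):
    """Yield rows from `selection`, reading `size` rows per slice."""
    start = 0
    while True:
        batch = list(selection[start:start + size])
        if not batch:
            return  # an empty slice means the rows are exhausted
        for row in batch:
            yield row
        start += size


def process(f, selection, stepsize=1000):
    """Lazily apply `f` to every row, one batch at a time."""
    return map(f, step(selection, stepsize))


# Ten rows read in slices of three; the last slice is short.
results = list(process(lambda n: n * 2, FakeSelection(range(10)), stepsize=3))
```

Only one batch is held by `step` at a time, so the memory used by the generator itself should stay bounded by `stepsize` rows.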


And, using it would look something like this:

    # processor.py
    from process import process

    def f(item):
        pass  # do the processing work here

    gen = process(f, bigtable.select())
    for _ in gen:  # do the processing without collecting the results
        pass


In this setup, there shouldn't be more than 1000 SQLObject instances in memory at any one time. *But* the size of the Python process grows and grows as it processes the rows.

In my test of a 3 million row table where 'f' does nothing at all, it doesn't look like any of the sqlobject rows are being removed from memory with the above functions. Does anyone have any suggestions about how to 'release' those objects?

I thought the generators would efficiently take care of it. As soon as the local variables 'output' and 'row' are reassigned with the new objects, the old objects should have no more references. Is that correct?
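
One way to check that reasoning in isolation is to track the row objects with weak references and count how many are still alive while the generator runs. The sketch below does that in Python 3 with a hypothetical `Row` class and `fetch` function standing in for the table and the query; no SQLObject is involved, so it only tests the generator logic, not the library.

```python
import gc
import weakref


class Row:
    """Hypothetical stand-in for a row object."""

    def __init__(self, n):
        self.n = n


# Weak references do not keep the rows alive, so len(live) counts
# only the rows that something else still references.
live = weakref.WeakSet()


def fetch(start, size, total):
    """Hypothetical query: build the rows for one slice of the table."""
    rows = [Row(n) for n in range(start, min(start + size, total))]
    for r in rows:
        live.add(r)
    return rows


def step(size, total):
    start = 0
    while True:
        batch = fetch(start, size, total)
        if not batch:
            return
        for row in batch:
            yield row
        start += size


peak = 0
for row in step(100, 1000):
    peak = max(peak, len(live))

del row        # drop the consumer's reference to the last row
gc.collect()   # not needed on CPython, but harmless elsewhere
remaining = len(live)
```

On CPython the peak stays around one batch and nothing remains afterwards, which suggests the generators themselves do release the old rows. If the real process still grows, something outside these functions is probably holding references; a cache inside the library or result buffering in the database driver would be my first guesses, but those are assumptions I haven't verified.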

I'm using Python 2.4, Fedora Core 4, and MySQL 4.1 (and an older version of SQLObject: r569).

Thanks,
--Tracy


