Hi,
I am using sqlalchemy through its orm layer. The basic flow of my
program is this:
1. Download about 18000 web pages and store the raw HTML in a table
2. Iterate through each web page and use pyparsing to parse it. I
then insert the results of that
parsing into about a dozen different tables.
The second step is taking a long time (above and beyond the fixed
parsing time) and I want to make sure that I am handling the session
in an optimal manner. Here is a sketch of what I do in step 2:
session = make_my_session()
query = session.query(MyModel)
for obj in query: # 18,000 of these
try:
obj.parse() # creates a bunch of orm classes in my db (in
about 12 tables)
session.commit()
except:
session.rollback()
Is this a good way of doing this? Is there a better/faster way?
How should I setup the session? Right now I am going
transactional=True and autoflush=True
I am using sqlalchemy 0.4 series with the sqlite that comes with
python 2.5 on Win32.
Thanks!
Brian
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---