Great. I implemented your suggestions and it was (predictably) a significant speedup. Cheers!
On Sep 15, 4:31 pm, Michael Bayer <[email protected]> wrote: > On Sep 15, 2011, at 10:20 AM, Jeff wrote: > > > Thanks for the guidance! > > > In a situation which a script submits a swarm of independent jobs for > > a cluster, and then finishes before some/all of those jobs have > > started running, each job will need to create the engine, yes? Or is > > there a better way to do it? > > Anytime you start a new process, that's where create_engine() would need to > be called once. > > When I use the Python multiprocessing library for example, I have a function > init_for_subprocess() which I can pass as the "on init" function to the > various multiprocessing functions, or if I'm writing a function that I know > is the starting point of the process boundary I'd put init_for_subprocess() > at the top. init_for_subprocess() then ultimately does the create_engine() > and establishes it as a module level global in the appropriate place). > > > > > > > > > > > On Sep 15, 2:30 pm, Michael Bayer <[email protected]> wrote: > >> On Sep 15, 2011, at 6:39 AM, Jeff wrote: > > >>> SQLAlchemy version 0.7.1 > >>> MySQL Ver 14.12 Distrib 5.0.77 > > >>> We have a series of tables with one to many connections: > >>> A -> B -> C -> D->E etc. > > >>> Script1 has a big for loop over several hundred/thousand values. In > >>> each loop iteration it goes through A,B,C, makes some new entries, > >>> then calls Function1 (passing some ids from A,B,C). > >>> Function1 makes a new entry in D, then calls Function2 (passing ids > >>> from A,B,C,D). > >>> Function2 makes modification to the entry in D and makes several new > >>> entries in E. > > >>> Not far into the loop we get an error saying the MySQL database has > >>> run out of connections: > >>> (Operational Error) (1040, 'Too many connections') > > >> Your scripts call create_engine() essentially in a loop. This isn't > >> really the appropriate usage of create_engine(). The Engine does not > >> represent a single database connection; is an expensive-to-create registry > >> of information about your database and DBAPI as well as a connection pool > >> (seehttp://www.sqlalchemy.org/docs/core/engines.htmlforan overview). > >> Dereferencing it will eventually close out connections which were open > >> from it, but not immediately as the garbage collector thread typically > >> needs to find those unreachable objects. > > >> The appropriate scope for Engine is once per url per application, at the > >> module level. That means if your application has only one URL, there > >> should be exactly one call to create_engine() in just one place, and the > >> resulting Engine should be placed in a single module made available for > >> other modules to import. Otherwise you're working against the intended > >> design of create_engine(). > > >> With that, all functions that call upon the Engine will be calling upon > >> the underlying connection pool so that the total number of connections > >> used by the application can be managed. > > >> The guidelines for Session are less stringent, though again generally a > >> single Session is shared among all functions and methods for a particular > >> operation. I didn't read your script carefully but typically a single > >> Session is passed along all functions that need to operate on data, so > >> that all those functions can share the same pool of objects which all > >> interact cleanly, not to mention all within one transaction. The script > >> as it is now creates many new transactions. > > >> If you really do want to use a Session inside a function you can forego > >> the usage of sessionmaker as again that function is just a helper for > >> declaring module-level patterns. The Session constructor can be called > >> directly, i.e. session = Session(engine). > > > -- > > You received this message because you are subscribed to the Google Groups > > "sqlalchemy" group. > > To post to this group, send email to [email protected]. > > To unsubscribe from this group, send email to > > [email protected]. > > For more options, visit this group > > athttp://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
