Great. I implemented your suggestions and it was (predictably) a
significant speedup. Cheers!

On Sep 15, 4:31 pm, Michael Bayer <[email protected]> wrote:
> On Sep 15, 2011, at 10:20 AM, Jeff wrote:
>
> > Thanks for the guidance!
>
> > In a situation which a script submits a swarm of independent jobs for
> > a cluster, and then finishes before some/all of those jobs have
> > started running, each job will need to create the engine, yes? Or is
> > there a better way to do it?
>
> Anytime you start a new process, that's where create_engine() would need to 
> be called once.    
>
> When I use the Python multiprocessing library for example, I have a function 
> init_for_subprocess() which I can pass as the "on init" function to the 
> various multiprocessing functions, or if I'm writing a function that I know 
> is the starting point of the process boundary I'd put init_for_subprocess() 
> at the top.   init_for_subprocess() then ultimately does the create_engine() 
> and establishes it as a module level global in the appropriate place).
>
>
>
>
>
>
>
>
>
> > On Sep 15, 2:30 pm, Michael Bayer <[email protected]> wrote:
> >> On Sep 15, 2011, at 6:39 AM, Jeff wrote:
>
> >>> SQLAlchemy version 0.7.1
> >>> MySQL Ver 14.12 Distrib 5.0.77
>
> >>> We have a series of tables with one to many connections:
> >>> A -> B -> C -> D->E etc.
>
> >>> Script1 has a big for loop over several hundred/thousand values. In
> >>> each loop iteration it goes through A,B,C, makes some new entries,
> >>> then calls Function1 (passing some ids from A,B,C).
> >>> Function1 makes a new entry in D, then calls Function2 (passing ids
> >>> from A,B,C,D).
> >>> Function2 makes modification to the entry in D and makes several new
> >>> entries in E.
>
> >>> Not far into the loop we get an error saying the MySQL database has
> >>> run out of connections:
> >>> (Operational Error) (1040, 'Too many connections')
>
> >> Your scripts call create_engine() essentially in a loop.     This isn't 
> >> really the appropriate usage of create_engine().   The Engine does not 
> >> represent a single database connection; is an expensive-to-create registry 
> >> of information about your database and DBAPI as well as a connection pool 
> >> (seehttp://www.sqlalchemy.org/docs/core/engines.htmlforan overview).  
> >> Dereferencing it will eventually close out connections which were open 
> >> from it, but not immediately as the garbage collector thread typically 
> >> needs to find those unreachable objects.
>
> >> The appropriate scope for Engine is once per url per application, at the 
> >> module level.   That means if your application has only one URL, there 
> >> should be exactly one call to create_engine() in just one place, and the 
> >> resulting Engine should be placed in a single module made available for 
> >> other modules to import.  Otherwise you're working against the intended 
> >> design of create_engine().
>
> >> With that, all functions that call upon the Engine will be calling upon 
> >> the underlying connection pool so that the total number of connections 
> >> used by the application can be managed.
>
> >> The guidelines for  Session are less stringent, though again generally a 
> >> single Session is shared among all functions and methods for a particular 
> >> operation.   I didn't read your script carefully but typically a single 
> >> Session is passed along all functions that need to operate on data, so 
> >> that all those functions can share the same pool of objects which all 
> >> interact cleanly, not to mention all within one transaction.    The script 
> >> as it is now creates many new transactions.
>
> >> If you really do want to use a Session inside a function you can forego 
> >> the usage of sessionmaker as again that function is just a helper for 
> >> declaring module-level patterns.    The Session constructor can be called 
> >> directly, i.e. session = Session(engine).
>
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "sqlalchemy" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to 
> > [email protected].
> > For more options, visit this group 
> > athttp://groups.google.com/group/sqlalchemy?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Reply via email to