I did something like this last night (hacked into the SingletonThreadPool though, as I didn't know about the poolclass argument to create_engine before).  The problem with this approach is that a new connection is created for each Query you pop off of your session; in our case, that works out to around 7-8 connections opened and closed per request.  No resources are leaked, but it's very hard on performance (req/s dropped to 4.5 using this approach compared to a baseline of 35 req/s on my laptop), and it exacerbates the "too many open files" errors.

I tinkered, and came up with this combination of the OpenClose and SingletonThread pools -- no connections are leaked, BUT you need to explicitly dig into the pool after your request cycle is complete to clean up the one connection that was used throughout:

import sqlalchemy.pool as sapool
import thread
class OpenClosePool(sapool.Pool):
    def __init__(self, creator, **params):
        sapool.Pool.__init__(self, **params)
        self.creator = creator
        self._conns = {}
       
    def status(self):
        return "OpenClosePool"

    def do_return_conn(self, conn):
        self._conns[thread.get_ident()] = conn

    def cleanupThread(self):
        try:
            conn = self._conns[thread.get_ident()]
            conn.close()
            del self._conns[thread.get_ident()]
        except KeyError:
            pass

    def do_return_invalid(self, conn):
        pass

    def do_get(self):
        return self.creator()

sqlite = create_engine(sqliteuri, poolclass=OpenClosePool)

#### and, in your controller:

def __after__(self, environ, start_response):
    sqlite.engine.connection_provider._pool.cleanupThread()

That's probably the most deterministic approach that can be taken, but it also isn't as slick and automatic as the rest of SA.  It also doesn't resolve the "too many open files" issue.  I think the latter is simply the way it is -- when your connections are request-bound (due to the threading model), you're going to be in a world of hurt when you get flooded.
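One way to make that explicit cleanup step unconditional is to push it into WSGI middleware, so it runs whether or not the request handler raised. A minimal sketch -- the class name is hypothetical, and `pool` is assumed to be an object exposing the cleanupThread() method from the OpenClosePool above:

```python
class PoolCleanupMiddleware:
    """Run the per-thread pool cleanup after every request,
    whether or not the wrapped app raised.  (Hypothetical class;
    `pool` is assumed to expose cleanupThread() as above.)"""

    def __init__(self, app, pool):
        self.app = app
        self.pool = pool

    def __call__(self, environ, start_response):
        try:
            return self.app(environ, start_response)
        finally:
            # runs on both the success and the exception path
            self.pool.cleanupThread()
```

This keeps the cleanup out of every individual controller, at the cost of one more layer in the WSGI stack.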

Thanks for your continuing assistance and dialogue.  FYI, we've put together a caching strategy that will make the database issues largely irrelevant until our next release in the fall (at which point we'll probably be swinging back to postgres).

- Charlie


Michael Bayer <[EMAIL PROTECTED]> wrote:
the close() that you're doing there is potentially not closing every connection... since the cleanup() phase can jump in there at any time and remove the connection from the pool, when you say connection(None) again, it can sometimes open another new connection.  yet another problem with pysqlite is that you can't even *close* a connection that was opened from a different thread... so perhaps that's creating issues.

So here is yet another thing you might try; no pooling at all:

import sqlalchemy.pool as pool

class OpenClosePool(pool.Pool):
    def __init__(self, creator, **params):
        pool.Pool.__init__(self, **params)
        self.creator = creator

    def status(self):
        return "OpenClosePool"

    def do_return_conn(self, conn):
        conn.close()

    def do_return_invalid(self, conn):
        pass

    def do_get(self):
        return self.creator()

engine = create_engine('sqlite:///some_file', poolclass=OpenClosePool)

for sqlite this might be a decent option.  if it works better, then I will make this the standard pool used for sqlite file-based access.
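Stripped of the sqlalchemy.pool plumbing, the behaviour of this pool is simply connect-on-checkout, close-on-checkin, with no reuse at all. A standalone sketch (hypothetical class, plain stdlib sqlite3, not the SQLAlchemy Pool subclass above):

```python
import sqlite3


class NoPool:
    """No pooling at all: open a fresh connection on every checkout,
    close it on every checkin.  (Hypothetical standalone class.)"""

    def __init__(self, creator):
        self.creator = creator

    def get(self):
        return self.creator()   # brand-new connection per checkout

    def return_conn(self, conn):
        conn.close()            # closed immediately on checkin


pool = NoPool(lambda: sqlite3.connect(":memory:"))
```

Since sqlite connections are cheap to open, the per-request cost of this scheme is far lower than it would be for a network database.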

On Aug 5, 2006, at 1:43 PM, [EMAIL PROTECTED] wrote:

Thanks very much -- that certainly clears out the connections properly, at least as far as the python references go.

I'm still getting "too many open files" errors when I push apachebench up to ~50 requests/sec, so it looks like sqlite connections can't get out of their own way in a highly-concurrent environment.  I tried adding an explicit close call at the end of our app's response cycle:

    session.connection(None).connection.close()

but that had no effect.  I'm guessing the native part of the sqlite connection isn't breaking the open file handle fast enough.

Interestingly, according to [http://www.sqlite.org/cvstrac/wiki?p=MultiThreading], starting with sqlite 3.3.1, sqlite connections may be used across threads, assuming your underlying OS has appropriate locking behaviour.  pysqlite2 checks thread IDs, though, so my attempt at using a QueuePool with a sqlite engine didn't go very far.

This is mostly off-topic, but if anyone has any tips on how to make sqlite more robust within a threaded webapp environment, I'm all ears.  At this point, it looks like we'll have to drop back to postgres to get proper capacity.

BTW, You mentioned FCGI -- we're not using FCGI, but the problem I was having was pretty consistent across SCGI as well as when using the default internal pylons/WSGI webserver.

Thanks again!

- Charlie

Michael Bayer <[EMAIL PROTECTED]> wrote:
here's another patch that is probably better -- it does not require an explicit step on your part, as it does a cleanup pass.  checked in to rev 1765.  see if this one works out; a test program on this end created 50 threads and maintained a size of 5 connections.  fortunately sqlite is pretty casual/fast about connections, and we probably could just have a straight connect/disconnect module with more or less the same performance, if it weren't for :memory: connections.

Index: lib/sqlalchemy/pool.py
===================================================================
--- lib/sqlalchemy/pool.py      (revision 1764)
+++ lib/sqlalchemy/pool.py      (working copy)
@@ -184,10 +184,11 @@
class SingletonThreadPool(Pool):
     """Maintains one connection per each thread, never moving to another thread.  this is
     used for SQLite."""
-    def __init__(self, creator, **params):
+    def __init__(self, creator, pool_size=5, **params):
         Pool.__init__(self, **params)
         self._conns = {}
         self._creator = creator
+        self.size = pool_size
     def dispose(self):
         for key, conn in self._conns.items():
@@ -203,7 +204,16 @@
             del self._conns[thread.get_ident()]
         except KeyError:
             pass
-                   
+   
+    def cleanup(self):
+        for key in self._conns.keys():
+            try:
+                del self._conns[key]
+            except KeyError:
+                pass
+            if len(self._conns) <= self.size:
+                return
+                                   
     def status(self):
         return "SingletonThreadPool id:%d thread:%d size: %d" % (id(self), thread.get_ident(), len(self._conns))
@@ -222,6 +232,8 @@
         except KeyError:
             c = self._creator()
             self._conns[thread.get_ident()] = c
+            if len(self._conns) > self.size:
+                self.cleanup()
             return c
    
class QueuePool(Pool):

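Unrolled from the diff, the patched pool amounts to something like the following standalone sketch (hypothetical class name, modern threading module, not the committed code). Note that cleanup() only drops the dict entries -- evicted connections are dereferenced, not close()d, exactly as in the patch:

```python
import threading


class SizeCappedThreadPool:
    """Standalone rendition of the patched SingletonThreadPool above:
    one connection per thread, with arbitrary entries evicted once the
    dict grows past pool_size.  (Hypothetical class, for illustration.)"""

    def __init__(self, creator, pool_size=5):
        self._creator = creator
        self._conns = {}
        self.size = pool_size

    def cleanup(self):
        # drop entries until we are back under the cap; evicted
        # connections are merely dereferenced, not explicitly closed
        for key in list(self._conns):
            self._conns.pop(key, None)
            if len(self._conns) <= self.size:
                return

    def connect(self):
        key = threading.get_ident()
        try:
            return self._conns[key]
        except KeyError:
            c = self._conns[key] = self._creator()
            if len(self._conns) > self.size:
                self.cleanup()
            return c
```

Because the cleanup pass triggers on the checkout path, the dict can never grow past pool_size for long, regardless of how many distinct threads have come and gone.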

On Aug 5, 2006, at 2:11 AM, [EMAIL PROTECTED] wrote:

We're about ready to go to production with our new website (built using pylons, sqlalchemy 0.2.6/sqlite, and kid), so I thought I would point apachebench at it to see how things performed.  I was horrified to see memory consumption going through the roof (i.e. >100MB in less than 10,000 requests!).

I did a bunch of profiling, and eventually found that *every* database connection was being leaked.  This was confirmed by adding this line to our base controller's __call__ function, which showed the pool's size growing mercilessly (`sqlite` is a reference to the database engine returned by create_engine):

    sqlite.engine.connection_provider._pool.status()

Well, after tearing my hair out on this one, I finally came across the dispose() function on the connection pool.  Adding this line to our base controller (after all of the request processing has completed) results in proper functionality, and the pool's size never gets above 10 (even when it's getting pounded by apachebench):

    sqlite.engine.connection_provider._pool.dispose()

Now, we're creating our database engine simply enough:

    sqlite = create_engine(appconfig.get('sqlite.dburi'), strategy='threadlocal')

And I believe our usage of SA is straightforward enough (modelled after zblog's usage, actually):

    query = create_session(bind_to=sqlite).query(MyMappedClass)
    instances = query.select_by(name='blahblah')

We don't hang onto any ORM-ed instances at all -- indeed, if we don't include the pool.dispose() function call after each request, we can cause the connection leak behaviour within a no-op request handler (one that only runs a query using SA -- no templating or other logic), so I can't imagine that we're somehow actively causing the connections to be retained via loaded mapped class instances.

Given this, I certainly didn't expect that calling the pool.dispose() function would be necessary for the sqlite connections to be cleaned up properly -- indeed, this is either a bug, or we're doing something very wrong.

Thoughts?  Is the pool.dispose() call at the end of each request a reasonable stopgap, or will things break in unexpected ways if that function is used?  Further, does anyone have any suggestions on what we might be doing wrong that our connection pool is growing like a weed?

Thanks!

Charlie
[EMAIL PROTECTED]


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
Sqlalchemy-users mailing list



