On 02/18/2016 01:39 PM, Uri Okrent wrote:
Looks like forking from a thread causes other issues. I think I've resolved the hang in psycopg2 by creating a brand new engine in the forked subprocess, but now I'm getting occasional hangs here:

#1 Waiting for a lock (e.g. GIL)
#2 <built-in method acquire of thread.lock object at remote 0x7fc061c17b58>
#4 file '/usr/lib64/python2.6/site-packages/sqlalchemy/util/langhelpers.py', in '_next'
#8 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/loading.py', in 'instances'
#16 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in '__getitem__'
#24 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in 'first'

At this point:

    def counter():
        """Return a threadsafe counter function."""

        lock = compat.threading.Lock()
        counter = itertools.count(1)

        # avoid the 2to3 "next" transformation...
        def _next():
            lock.acquire()
            try:
                return next(counter)
            finally:
                lock.release()

        return _next

Not sure a process pool will solve the issue, since I can't necessarily tell the volume of concurrent queries beforehand. I'm wondering if there's a way to clear the lock manually in the forked process, or somehow protect this section when forking. Either one is probably a monkey patch...
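(For what it's worth, newer Pythons have a direct hook for that second idea: os.register_at_fork(), added in 3.7, runs callbacks in the child right after fork(), which is exactly the place to swap a possibly-locked inherited lock for a fresh one. Nothing equivalent exists on 2.6; the lock and helper below are hypothetical stand-ins, not the SQLAlchemy code above:

    import os
    import threading

    # stand-in for a module-level lock that a child process can inherit
    # in the locked state if another thread held it at fork time
    _counter_lock = threading.Lock()

    def _reset_counter_lock():
        # hypothetical helper: runs in the child immediately after
        # fork(), replacing the inherited (possibly locked) lock with
        # a brand-new, unlocked one
        global _counter_lock
        _counter_lock = threading.Lock()

    # available on Python 3.7+ only
    if hasattr(os, "register_at_fork"):
        os.register_at_fork(after_in_child=_reset_counter_lock)

)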
if you're using Python multiprocessing, that thing has a lot of deadlock-ish behavior in it, especially on an older Python like 2.6 - it uses threads in various ways to communicate with process pools and such. Can't really say why your counter is locking; I'd have to at least see how you're using it, and I don't quite get how this counter function is interacting with query.first(). But locks are not "interruptible" unless you kill the whole thread in which they run, so you'd need to put a timeout on the acquire.
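For reference, a timed lock acquire is built in on Python 3 (threading.Lock.acquire() grew a timeout parameter in 3.2); on 2.6 you'd have to poll acquire(False) yourself. A minimal sketch of the Python 3 form:

    import threading

    lock = threading.Lock()

    # acquire() returns False instead of blocking forever if the lock
    # doesn't free up within the timeout (in seconds)
    if lock.acquire(timeout=5.0):
        try:
            pass  # ... the protected work goes here ...
        finally:
            lock.release()
    else:
        raise RuntimeError(
            "gave up waiting for the lock; it may be held by a thread "
            "that no longer exists after the fork")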
On Wednesday, February 17, 2016 at 10:41:00 AM UTC-8, Mike Bayer wrote:

On 02/17/2016 11:33 AM, Uri Okrent wrote:
> Maybe this is a psycopg question, and if so, please say so.
>
> I have a multi-threaded server which maintains a thread pool (and a
> corresponding connection pool) for servicing requests. In order to
> mitigate Python's high-water-mark memory usage behavior for large
> queries, I'm attempting to handle large queries in particular using a
> forked subprocess spawned from the request thread.
>
> I'm using the connection invalidation recipe described here (the second
> one, which adds listeners to the Pool):
> http://docs.sqlalchemy.org/en/latest/core/pooling.html#using-connection-pools-with-multiprocessing
>
> It seems to be working correctly -- that is, I can see that the child
> process is indeed creating a new connection. However, I'm still
> experiencing intermittent hangs in the child process during connection
> creation. I've gotten a stack trace using gdb, and I think I understand
> what is going on, but I'm not sure how to protect the critical section.
>
> It looks like threads creating connections in the parent process acquire
> some threading synchronization primitive inside psycopg's _connect
> function (that's in C, so I didn't see the actual source). This
> apparently occasionally coincides with the fork, so that the child
> process never sees the primitive released in the parent process and
> hangs forever. Interestingly, the hangs stop after the server has been
> running for a while, presumably because the parent process is warmed up,
> has a full connection pool, and is no longer creating connections.
>
> Here is my stack on a hung process:
> #17 <built-in function _connect>
> #19 file '/usr/lib64/python2.6/site-packages/psycopg2/__init__.py', in 'connect'
> #24 file '/usr/lib64/python2.6/site-packages/sqlalchemy/engine/default.py', in 'connect'
> #29 file '/usr/lib64/python2.6/site-packages/sqlalchemy/engine/strategies.py', in 'connect'
> #33 file '/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py', in '__connect'
> #36 file '/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py', in 'get_connection'
> #39 file '/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py', in 'checkout'
> #43 file '/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py', in '_checkout'
> #47 file '/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py', in 'connect'
> #50 file '/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py', in '_wrap_pool_connect'
> #54 file '/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py', in 'contextual_connect'
> #58 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py', in '_connection_for_bind'
> #61 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py', in '_connection_for_bind'
> #65 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py', in 'connection'
> #70 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in '_connection_from_session'
> #74 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in '_execute_and_instances'
> #77 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in '__iter__'
> #91 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in '__getitem__'
> #99 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in 'first'
>
> I'm using sqlalchemy 1.0.12 and psycopg 2.5.3
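That recipe, roughly, stamps each pooled connection with the pid that created it and invalidates any connection that crosses a fork boundary. A sketch paraphrased from those docs (assuming "engine" is the application's Engine; this is not the original poster's exact code):

    import os
    from sqlalchemy import event, exc

    @event.listens_for(engine, "connect")
    def connect(dbapi_connection, connection_record):
        # remember which process created this DBAPI connection
        connection_record.info['pid'] = os.getpid()

    @event.listens_for(engine, "checkout")
    def checkout(dbapi_connection, connection_record, connection_proxy):
        pid = os.getpid()
        if connection_record.info['pid'] != pid:
            # a forked child inherited this record: discard the socket
            # rather than share it with the parent; the pool will open
            # a fresh connection in its place
            connection_record.connection = connection_proxy.connection = None
            raise exc.DisconnectionError(
                "Connection record belongs to pid %s, "
                "attempting to check out in pid %s" %
                (connection_record.info['pid'], pid))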
> My quick and dirty fix would be to fill the connection pool in the
> parent process by force before servicing requests,

well, I'd not want to transfer a psycopg2 connection from a parent to a child fork, because now that same filehandle is in both processes and you'll get unsafe concurrent access on it.

I've used multiprocessing with psycopg2 for years in a wide variety of scenarios and I've never seen it hang on the actual psycopg2.connect call. But perhaps that's because I've never called fork() from inside a thread that is not the main thread - if that is what's triggering it here, I'd use a pattern such as a process pool or similar, where the forking is done from the main thread ahead of time.

> but that is a hack,
> and in case of an invalidated connection the server would be susceptible
> to the issue again while recreating the invalid connection in the parent
> process.
>
> I apparently need to synchronize my fork in one thread with connections
> being created in others, but I'm not sure how to do that. Any pointers
> would be great.
>
> TIA,
> Uri
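A sketch of that kind of pattern - the worker pool forked from the main thread up front, with each worker building its own Engine in an initializer, so no connection ever crosses a fork (the URL, table, and function names here are illustrative, not from the original application):

    import multiprocessing
    from sqlalchemy import create_engine, text

    DB_URL = "postgresql+psycopg2://user:pass@localhost/mydb"  # illustrative

    _engine = None

    def _init_worker():
        # runs once in each freshly forked worker; the Engine and its
        # connection pool are created entirely within the child process
        global _engine
        _engine = create_engine(DB_URL)

    def run_query(ident):
        # executed inside a worker; touches only that worker's own engine
        with _engine.connect() as conn:
            return conn.execute(
                text("SELECT data FROM things WHERE id = :id"),
                {"id": ident}).fetchall()

    if __name__ == "__main__":
        # the pool is forked from the main thread, ahead of time, before
        # any request threads or parent-side DB connections exist
        pool = multiprocessing.Pool(processes=4, initializer=_init_worker)
        results = pool.map(run_query, [1, 2, 3])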