as well as the log that shows this:
+ if self in pool._pool.queue:
+     pool.logger.debug("jjs rec already in pool checkin, skip %r", self)
+     return
actually happening.
On Sun, May 13, 2018 at 10:42 PM, Mike Bayer <[email protected]> wrote:
> On Sun, May 13, 2018 at 10:35 PM, <[email protected]> wrote:
>> "jjs" is short for my name and "cjr" for my colleague's (JinRong Cai). We
>> work together and add logs in the same environment, prefixing our names to
>> get a keyword for log filtering.
>>
>> The doubled ConnectionRecords (with None connection) are obtained by two
>> different coroutines, and the variable record.connection is overwritten by
>> the second coroutine.
>> After the record is put back into the QueuePool again, there are two
>> identical records with the same connection in the pool, which causes the
>> next two coroutines to conflict and deadlock in psycopg2 as above.
>
> OK, I need the detailed step by step that shows how the two coroutines
> are interacting such that they both are returning the same connection.
> The nova log above seems to show just one coroutine, I assume, since
> there is just one nova request ID, if I understand correctly.
>
>>
>> The logs of psycopg2 has been pasted at
>> https://github.com/psycopg/psycopg2/issues/703
>
> The above issue says the bug has been found in SQLAlchemy but there is
> no bug report yet and I've not yet been shown how the behavior here is
> possible. I posted a straightforward example above; can you
> please show me where the two coroutines get mixed up?
>
>
>
>
>>
>>
>>
>> On Friday, May 11, 2018 at 3:05:46 AM UTC+8, Mike Bayer wrote:
>>>
>>> I tried to follow your log lines, which is difficult because I don't
>>> know where those log statements actually are and I don't know what
>>> acronyms like "jjs", "cjr", mean. But it does look like it's all in
>>> one thread (so not async GC) and all in one Nova request (so perhaps
>>> just one greenlet involved, unless other greenlets are affecting this
>>> without them being displayed in this log snippet). I do see the
>>> connection record doubled at the end. So here is an illustration of
>>> those steps in isolation, where the connection record is not doubled. Can
>>> you alter this program to show this condition occurring?
>>>
>>> from sqlalchemy import create_engine
>>>
>>> e = create_engine("postgresql://scott:tiger@localhost/test",
>>> echo_pool='debug')
>>> e.connect().close()
>>>
>>> assert len(e.pool._pool.queue) == 1
>>>
>>> c1 = e.connect()
>>>
>>> assert len(e.pool._pool.queue) == 0
>>>
>>> c1.engine.pool._invalidate(c1._Connection__connection, None)
>>> c1.invalidate(e)
>>>
>>> # connection record is back
>>> assert len(e.pool._pool.queue) == 1
>>>
>>> # do a reconnect
>>> c1.connection
>>>
>>> # uses the record again
>>> assert len(e.pool._pool.queue) == 0
>>>
>>> # close the Connection
>>> c1.close()
>>>
>>> # record is back
>>> assert len(e.pool._pool.queue) == 1
>>>
>>> # etc.
>>> c1 = e.connect()
>>> assert len(e.pool._pool.queue) == 0
>>>
>>> c1.close()
>>> assert len(e.pool._pool.queue) == 1
>>>
>>>
>>>
>>>
>>> On Thu, May 10, 2018 at 12:25 PM, Mike Bayer <[email protected]>
>>> wrote:
>>> > Here's a print statement you might want to try:
>>> >
>>> > diff --git a/lib/sqlalchemy/pool.py b/lib/sqlalchemy/pool.py
>>> > index 89a4cea7c..452db4883 100644
>>> > --- a/lib/sqlalchemy/pool.py
>>> > +++ b/lib/sqlalchemy/pool.py
>>> > @@ -689,6 +689,19 @@ def _finalize_fairy(connection, connection_record,
>>> > been garbage collected.
>>> >
>>> > """
>>> > +    print(
>>> > +        "FINALIZE FAIRY: GREENLET ID: %s, DB CONN: %s, FAIRY: %s, "
>>> > +        "REF: %s CONNECTION_REC: %s CONNECTION_REC REF %s" % (
>>> > +            "put greenlet id here",
>>> > +            connection,
>>> > +            fairy,
>>> > +            ref,
>>> > +            connection_record,
>>> > +            connection_record.fairy_ref
>>> > +            if connection_record is not None else "n/a"
>>> > +        )
>>> > +    )
>>> > +
>>> > _refs.discard(connection_record)
>>> >
>>> > if ref is not None:
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > On Thu, May 10, 2018 at 12:21 PM, Mike Bayer <[email protected]>
>>> > wrote:
>>> >> On Thu, May 10, 2018 at 7:23 AM, <[email protected]> wrote:
>>> >>> Hi Mike,
>>> >>>
>>> >>> I added more logs and finally found the reason and a method to avoid
>>> >>> the deadlock between coroutines.
>>> >>>
>>> >>> In summary, the deadlock happened because two different coroutines
>>> >>> used the same ConnectionRecord with the same connection.
>>> >>> It happens under QueuePool, when the DB connections are reset by the
>>> >>> DB server (not related to the backend driver).
>>> >>>
>>> >>> 1. A coroutine gets a ConnectionRecord into a ConnectionFairy, and
>>> >>> creates a weakref to the ConnectionFairy with the callback function
>>> >>> `_finalize_fairy`.
>>> >>> 2. When the connection gets disconnected, it receives an exception
>>> >>> (psycopg2.OperationalError) and goes to
>>> >>> Connection._handle_dbapi_exception.
>>> >>> 3. The `finally` block of `_handle_dbapi_exception` calls both
>>> >>> `ConnectionFairy.invalidate` and `self.invalidate`.
>>> >>> 4. In `ConnectionFairy.invalidate`, `_ConnectionRecord._checkin` is
>>> >>> called, which puts the ConnectionRecord back into the QueuePool.
>>> >>> 5. In `self.invalidate`, `del self._root.__connection` is called,
>>> >>> which **del**s the ConnectionFairy.
>>> >>> 6. The **del** operation makes the weakref to the ConnectionFairy
>>> >>> dead, and the callback `_finalize_fairy` is executed.
>>> >>> 7. In `_finalize_fairy`, another `_ConnectionRecord._checkin` is
>>> >>> called, putting the ConnectionRecord into the QueuePool again.
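The seven steps quoted above can be modeled in isolation with a small stdlib sketch. These are toy classes standing in for _ConnectionRecord and _ConnectionFairy, not the real SQLAlchemy code, and the sketch also includes the `fairy_ref` guard discussed later in this thread:

```python
import weakref
from collections import deque

# Toy stand-ins for _ConnectionRecord / _ConnectionFairy; names mirror
# the discussion but this is illustrative, not SQLAlchemy's code.
class Record:
    def __init__(self, pool):
        self.pool = pool
        self.fairy_ref = None

    def checkin(self):
        self.fairy_ref = None      # the explicit checkin clears the ref...
        self.pool.append(self)     # ...and returns the record to the pool

class Fairy:
    pass

pool = deque()
rec = Record(pool)
fairy = Fairy()

def finalize(ref):
    # Mirrors the guard in _finalize_fairy: if the ref that fired is no
    # longer the one stored on the record, a checkin already happened.
    if rec.fairy_ref is not ref:
        return
    rec.checkin()

rec.fairy_ref = weakref.ref(fairy, finalize)

rec.checkin()   # step 4: explicit checkin
del fairy       # steps 5-6: weakref dies, callback fires (CPython refcounting)

# Step 7 is stopped by the guard: the record is in the pool only once.
assert list(pool).count(rec) == 1
```

Under the steps as written, the guard makes the second checkin a no-op, so the doubled record seen in the logs implies the guard was somehow bypassed; that is the open question in this thread.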
>>> >>
>>> >> Here's what does not make sense. You are claiming that a simple
>>> >> invalidation will result in the same ConnectionRecord being checked in
>>> >> twice. This is obviously not the case; what is special about your
>>> >> situation that makes this happen?
>>> >>
>>> >> If I follow the steps you refer to, in step 4, when
>>> >> ConnectionFairy._checkin is called, that calls finalize_fairy
>>> >> directly, which then calls ConnectionRecord.checkin(). Within
>>> >> ConnectionRecord.checkin(), ConnectionRecord.fairy_ref is set to None.
>>> >> Then back in ConnectionFairy._checkin(), it also sets
>>> >> self._connection_record = None.
>>> >>
>>> >> If we then come back into finalize_fairy a *second* time, as a result
>>> >> of the "del" which occurs in _handle_error(), and the
>>> >> connection_record is present, the function will exit immediately,
>>> >> because:
>>> >>
>>> >> if connection_record.fairy_ref is not ref:
>>> >> return
>>> >>
>>> >> So the situation as given is not possible without concurrent access to
>>> >> the ConnectionFairy being introduced externally to the pool.
>>> >>
>>> >> There is exactly one thing I can think of that is both unique to the
>>> >> psycopg2-style of asynchronous connection, not the PyMySQL form of it,
>>> >> that might be important here, and it's that I think psycopg2's
>>> >> connection.close() method will defer to another greenlet. That means
>>> >> when we're inside of ConnectionRecord.invalidate(), and we call
>>> >> self.__close(), that could be deferring to another greenlet. But
>>> >> this all happens before the connection is returned to the pool, so I
>>> >> still don't know what is happening.
>>> >>
>>> >>>
>>> >>> The detailed logs are below. To prove my conclusion is right, I tried
>>> >>> to avoid the scenario with this patch, and it worked.
>>> >>>
>>> >>> def checkin(self):
>>> >>>     self.fairy_ref = None
>>> >>>     connection = self.connection
>>> >>>     pool = self.__pool
>>> >>>
>>> >>>     while self.finalize_callback:
>>> >>>         finalizer = self.finalize_callback.pop()
>>> >>>         finalizer(connection)
>>> >>> +   if self in pool._pool.queue:
>>> >>> +       pool.logger.debug("jjs rec already in pool checkin, skip %r",
>>> >>> +                         self)
>>> >>> +       return
>>> >>
>>> >> As mentioned above, it's not possible to reach this point without
>>> >> concurrent access to the pooled connection, which means this fix could
>>> >> fail also (with at least threaded-style concurrency, not greenlet
>>> >> style). The original source of the greenlet issue must still be
>>> >> identified.
>>> >>
>>> >> I would need a self-contained proof of concept that demonstrates this
>>> >> issue occurring. The log output is not that helpful because it at
>>> >> least would need to write out the identifier of the current greenlet,
>>> >> since it is very likely that greenlets have something to do with this
>>> >> issue.
>>> >>
>>> >>
>>> >>
>>> >>>
>>> >>>     if pool.dispatch.checkin:
>>> >>>         pool.dispatch.checkin(connection, self)
>>> >>>     pool._return_conn(self)
>>> >>>
>>> >>> Trace the ConnectionRecord 0x88fc5d0
>>> >>>
>>> >>> 2018-05-10T14:02:17.927+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] Created new connection
>>> >>> <connection object at 0x81f69b0; closed: 0>
>>> >>> 2018-05-10T14:02:17.928+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs get conn success
>>> >>> <connection
>>> >>> object at 0x81f69b0; closed: 0>, rec.connection <connection object at
>>> >>> 0x81f69b0; closed: 0>, pool queue
>>> >>> deque([<sqlalchemy.pool._ConnectionRecord
>>> >>> object at 0x7a04810>, <sqlalchemy.pool._ConnectionRecord object at
>>> >>> 0x79fdad0>])
>>> >>> 2018-05-10T14:02:17.929+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs checkout fairy
>>> >>> <sqlalchemy.pool._ConnectionFairy object at 0x329c0d0> rec
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0> pool queue
>>> >>> deque([<sqlalchemy.pool._ConnectionRecord object at 0x7a04810>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x79fdad0>])
>>> >>> 2018-05-10T14:02:17.929+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] Connection <connection
>>> >>> object at
>>> >>> 0x81f69b0; closed: 0> checked out from pool, status Pool size: 10
>>> >>> Connections in pool: 2 Current Overflow: 0 Max Overflow: 40 Current
>>> >>> Checked
>>> >>> out connections: 8, queue deque([<sqlalchemy.pool._ConnectionRecord
>>> >>> object
>>> >>> at 0x7a04810>, <sqlalchemy.pool._ConnectionRecord object at
>>> >>> 0x79fdad0>])
>>> >>> 2018-05-10T14:02:17.930+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs
>>> >>> _ConnectionFairy._checkout
>>> >>> conn <connection object at 0x81f69b0; closed: 0>, rec
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>, rec.conn
>>> >>> <connection object at 0x81f69b0; closed: 0>
>>> >>> 2018-05-10T14:02:49.716+08:00 localhost nova-conductor ERROR
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:122 error]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] cjr error conn
>>> >>> <sqlalchemy.engine.base.Connection object at 0x329c150>, err
>>> >>> connection
>>> >>> pointer is NULL
>>> >>> 2018-05-10T14:02:49.717+08:00 localhost nova-conductor ERROR
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:122 error]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] cjr enter newraise
>>> >>> 2018-05-10T14:02:49.717+08:00 localhost nova-conductor ERROR
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:122 error]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] cjr enter finally
>>> >>>>>Connection._handle_dbapi_exception --> finally -->
>>> >>>>> **self.engine.pool._invalidate** & **self.invalidate**
>>> >>>>>self.engine.pool._invalidate --> Pool._invalidate -->
>>> >>>>> ConnectionFairy.invalidate --> _ConnectionRecord.invalidate &
>>> >>>>> **self._checkin()**
>>> >>>>>self.invalidate() --> **del self._root.__connection**
>>> >>> 2018-05-10T14:02:49.718+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs connectionFairy
>>> >>> invalidate
>>> >>> 2018-05-10T14:02:49.718+08:00 localhost nova-conductor INFO [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:109 info]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] Invalidate connection
>>> >>> <connection
>>> >>> object at 0x81f69b0; closed: 2> (reason: OperationalError:connection
>>> >>> pointer
>>> >>> is NULL
>>> >>>>>_ConnectionRecord.invalidate(e) -->
>>> >>>>> self.__pool.dispatch.invalidate(self.connection, self, e) &
>>> >>>>> self.__close() &
>>> >>>>> self.connection = None
>>> >>>>>self.__close() --> self.finalize_callback.clear() &
>>> >>>>> self.__pool._close_connection(self.connection)
>>> >>> 2018-05-10T14:02:49.719+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] Closing connection
>>> >>> <connection
>>> >>> object at 0x81f69b0; closed: 2>
>>> >>>>>Pool._close_connection(connection) -->
>>> >>>>> self._dialect.do_close(connection)
>>> >>>>> --> _ConnDialect.do_close(dbapi_connection) -->
>>> >>>>> dbapi_connection.close() FIN
>>> >>> 2018-05-10T14:02:49.719+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs
>>> >>> _ConnectionFairy._checkin
>>> >>> conn None rec <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>
>>> >>> rec.conn None
>>> >>>>>_ConnectionFairy._checkin --> _finalize_fairy(conn, conn_rec, pool,
>>> >>>>> None,
>>> >>>>> echo, fairy) & self.conn = None & self.conn_rec = None
>>> >>> 2018-05-10T14:02:49.720+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs _finalize_fairy args
>>> >>> conn
>>> >>> None, rec.conn None, rec, <sqlalchemy.pool._ConnectionRecord object at
>>> >>> 0x88fc5d0>. ref None, fairy <sqlalchemy.pool._ConnectionFairy object
>>> >>> at
>>> >>> 0x329c0d0>
>>> >>>>>_finalize_fairy --> connection_record.checkin()
>>> >>> 2018-05-10T14:02:49.720+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs going to
>>> >>> connection_record.checkin
>>> >>> 2018-05-10T14:02:49.721+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs before conn checkin
>>> >>> None,
>>> >>> pool queue deque([<sqlalchemy.pool._ConnectionRecord object at
>>> >>> 0x7524590>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7a2b990>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0xa856b10>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7ebee90>]),
>>> >>> finalize_callback
>>> >>> deque([])
>>> >>> 2018-05-10T14:02:49.721+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs after conn checkin,
>>> >>> pool
>>> >>> queue deque([<sqlalchemy.pool._ConnectionRecord object at 0x7524590>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7a2b990>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0xa856b10>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7ebee90>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>])
>>> >>> 2018-05-10T14:02:49.723+08:00 localhost nova-conductor ERROR
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:122 error]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] Exception during reset or
>>> >>> similar
>>> >>> 2018-05-10T14:02:49.723+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs going to
>>> >>> connection_record.invalidate
>>> >>> 2018-05-10T14:02:49.724+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs going to
>>> >>> connection_record.checkin
>>> >>> 2018-05-10T14:02:49.724+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs before conn checkin
>>> >>> None,
>>> >>> pool queue deque([<sqlalchemy.pool._ConnectionRecord object at
>>> >>> 0x7524590>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7a2b990>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0xa856b10>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7ebee90>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>]),
>>> >>> finalize_callback
>>> >>> deque([])
>>> >>> 2018-05-10T14:02:49.732+08:00 localhost nova-conductor DEBUG
>>> >>> [pid:4528]
>>> >>> [MainThread] [tid:122769072] [log.py:104 debug]
>>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs after conn checkin,
>>> >>> pool
>>> >>> queue deque([<sqlalchemy.pool._ConnectionRecord object at 0x7524590>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7a2b990>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0xa856b10>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7ebee90>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>,
>>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>])
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Tuesday, May 8, 2018 at 11:06:04 PM UTC+8, [email protected] wrote:
>>> >>>>
>>> >>>> I found why we could get the same connection twice from the pool...
>>> >>>>
>>> >>>> Because we put it into pool twice.
>>> >>>>
>>> >>>> Any suggestions for finding out why the same connection is put into
>>> >>>> the pool from different coroutines?
>>> >>>>
>>> >>>>
>>> >>>> 2018-05-08T22:39:17.448+08:00 localhost nova-conductor DEBUG
>>> >>>> [pid:15755]
>>> >>>> [MainThread] [tid:187954992] [log.py:104 debug]
>>> >>>> [req-7c2199c0-d9a6-4131-8358-2aebcf5a4ce1] jjs conn checkin
>>> >>>> <connection
>>> >>>> object at 0x111d9740; closed: 0>
>>> >>>> 2018-05-08T22:39:17.502+08:00 localhost nova-conductor DEBUG
>>> >>>> [pid:15755]
>>> >>>> [MainThread] [tid:178645264] [log.py:104 debug]
>>> >>>> [req-7ab83c30-91bd-4278-b694-85a71035992c] jjs conn checkin
>>> >>>> <connection
>>> >>>> object at 0x111d9740; closed: 0>
>>> >>>> ...
>>> >>>> 2018-05-08T22:39:17.992+08:00 localhost nova-conductor DEBUG
>>> >>>> [pid:15755]
>>> >>>> [MainThread] [tid:309091856] [log.py:104 debug]
>>> >>>> [req-dee5dd2d-866f-408e-999b-b415ea7ce953] Connection <connection
>>> >>>> object at
>>> >>>> 0x111d9740; closed: 0> checked out from pool
>>> >>>> 2018-05-08T22:39:18.039+08:00 localhost nova-conductor DEBUG
>>> >>>> [pid:15755]
>>> >>>> [MainThread] [tid:178645104] [log.py:104 debug]
>>> >>>> [req-ac8017d7-c97e-44ca-92d2-99f180858e14] Connection <connection
>>> >>>> object at
>>> >>>> 0x111d9740; closed: 0> checked out from pool
>>> >>>>
>>> >>>>
>>> >>>> On Tuesday, May 8, 2018 at 10:18:52 PM UTC+8, [email protected] wrote:
>>> >>>>>
>>> >>>>> I added logs in sqlalchemy/pool.py
>>> >>>>>
>>> >>>>> QueuePool.checkout
>>> >>>>>
>>> >>>>> @classmethod
>>> >>>>> def checkout(cls, pool):
>>> >>>>>     rec = pool._do_get()
>>> >>>>>     try:
>>> >>>>>         dbapi_connection = rec.get_connection()
>>> >>>>>         pool.logger.debug("jjs get conn success %s",
>>> >>>>>                           dbapi_connection)
>>> >>>>>     except:
>>> >>>>>         pool.logger.debug("jjs get error in get_connection")
>>> >>>>>         with util.safe_reraise():
>>> >>>>>             rec.checkin()
>>> >>>>>     echo = pool._should_log_debug()
>>> >>>>>     fairy = _ConnectionFairy(dbapi_connection, rec, echo)
>>> >>>>>     rec.fairy_ref = weakref.ref(
>>> >>>>>         fairy,
>>> >>>>>         lambda ref: _finalize_fairy and
>>> >>>>>         _finalize_fairy(
>>> >>>>>             dbapi_connection,
>>> >>>>>             rec, pool, ref, echo)
>>> >>>>>     )
>>> >>>>>     _refs.add(rec)
>>> >>>>>     if echo:
>>> >>>>>         pool.logger.debug("jjs conn %s, pool status %s",
>>> >>>>>                           dbapi_connection, pool.status())
>>> >>>>>         pool.logger.debug("Connection %r checked out from pool",
>>> >>>>>                           dbapi_connection)
>>> >>>>>     return fairy
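The weakref wiring in the quoted checkout() can be seen in isolation with a minimal stdlib example (names here are illustrative): the callback fires as soon as the fairy is garbage collected, which in CPython happens immediately when the last strong reference is dropped.

```python
import weakref

class Fairy:
    """Stand-in for _ConnectionFairy."""

events = []

fairy = Fairy()
# Like rec.fairy_ref = weakref.ref(fairy, ...): the callback receives
# the now-dead weak reference itself as its argument.
ref = weakref.ref(fairy, lambda r: events.append("finalize_fairy"))

assert ref() is fairy   # still alive, dereferencing works
del fairy               # last strong reference gone; CPython collects now
assert events == ["finalize_fairy"]   # the callback has already run
assert ref() is None    # the weakref is dead
```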
>>> >>>>>
>>> >>>>> Then I got logs below:
>>> >>>>>
>>> >>>>> coroutine A:
>>> >>>>>
>>> >>>>> 2018-05-08T21:39:59.055+08:00 localhost nova-conductor INFO
>>> >>>>> [pid:55528]
>>> >>>>> [MainThread] [tid:197123760] [log.py:109 info]
>>> >>>>> [req-ba1ab897-c80a-4bc4-922a-39044d669da1] jjs Pool wait is False,
>>> >>>>> timeout
>>> >>>>> 30
>>> >>>>> 2018-05-08T21:39:59.078+08:00 localhost nova-conductor DEBUG
>>> >>>>> [pid:55528]
>>> >>>>> [MainThread] [tid:197123760] [log.py:104 debug]
>>> >>>>> [req-ba1ab897-c80a-4bc4-922a-39044d669da1] Created new connection
>>> >>>>> <connection object at 0x903f8c0; closed: 0>
>>> >>>>> 2018-05-08T21:39:59.079+08:00 localhost nova-conductor DEBUG
>>> >>>>> [pid:55528]
>>> >>>>> [MainThread] [tid:197123760] [log.py:104 debug]
>>> >>>>> [req-ba1ab897-c80a-4bc4-922a-39044d669da1] jjs get conn success
>>> >>>>> <connection
>>> >>>>> object at 0x903f8c0; closed: 0>
>>> >>>>> 2018-05-08T21:39:59.080+08:00 localhost nova-conductor DEBUG
>>> >>>>> [pid:55528]
>>> >>>>> [MainThread] [tid:197123760] [log.py:104 debug]
>>> >>>>> [req-ba1ab897-c80a-4bc4-922a-39044d669da1] jjs conn <connection
>>> >>>>> object at
>>> >>>>> 0x903f8c0; closed: 0>, pool status Pool size: 10 Connections in
>>> >>>>> pool: 1
>>> >>>>> Current Overflow: 36 Max Overflow: 40 Current Checked out
>>> >>>>> connections: 45
>>> >>>>> 2018-05-08T21:39:59.080+08:00 localhost nova-conductor DEBUG
>>> >>>>> [pid:55528]
>>> >>>>> [MainThread] [tid:197123760] [log.py:104 debug]
>>> >>>>> [req-ba1ab897-c80a-4bc4-922a-39044d669da1] Connection <connection
>>> >>>>> object at
>>> >>>>> 0x903f8c0; closed: 0> checked out from pool
>>> >>>>>
>>> >>>>> coroutine B:
>>> >>>>>
>>> >>>>> 2018-05-08T21:39:59.212+08:00 localhost nova-conductor INFO
>>> >>>>> [pid:55528]
>>> >>>>> [MainThread] [tid:151519312] [log.py:109 info]
>>> >>>>> [req-a78075b5-c55c-40c4-86d0-65957e89cb56] jjs Pool wait is False,
>>> >>>>> timeout
>>> >>>>> 30
>>> >>>>> 2018-05-08T21:39:59.213+08:00 localhost nova-conductor DEBUG
>>> >>>>> [pid:55528]
>>> >>>>> [MainThread] [tid:151519312] [log.py:104 debug]
>>> >>>>> [req-a78075b5-c55c-40c4-86d0-65957e89cb56] jjs get conn success
>>> >>>>> <connection
>>> >>>>> object at 0x903f8c0; closed: 0>
>>> >>>>> 2018-05-08T21:39:59.213+08:00 localhost nova-conductor DEBUG
>>> >>>>> [pid:55528]
>>> >>>>> [MainThread] [tid:151519312] [log.py:104 debug]
>>> >>>>> [req-a78075b5-c55c-40c4-86d0-65957e89cb56] jjs conn <connection
>>> >>>>> object at
>>> >>>>> 0x903f8c0; closed: 0>, pool status Pool size: 10 Connections in
>>> >>>>> pool: 0
>>> >>>>> Current Overflow: 36 Max Overflow: 40 Current Checked out
>>> >>>>> connections: 46
>>> >>>>> 2018-05-08T21:39:59.214+08:00 localhost nova-conductor DEBUG
>>> >>>>> [pid:55528]
>>> >>>>> [MainThread] [tid:151519312] [log.py:104 debug]
>>> >>>>> [req-a78075b5-c55c-40c4-86d0-65957e89cb56] Connection <connection
>>> >>>>> object at
>>> >>>>> 0x903f8c0; closed: 0> checked out from pool
>>> >>>>>
>>> >>>>>
>>> >>>>> It seems that when we create_connection in QueuePool._do_get, we may
>>> >>>>> not be holding the lock _pool.mutex (i.e. Queue.mutex).
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> On Monday, May 7, 2018 at 8:41:14 PM UTC+8, Mike Bayer wrote:
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On Mon, May 7, 2018, 7:01 AM <[email protected]> wrote:
>>> >>>>>>>
>>> >>>>>>> We added coroutine_id in psycopg2, and found that two coroutines
>>> >>>>>>> use the same connection before the pthread_mutex_lock is released.
>>> >>>>>>>
>>> >>>>>>> Maybe something in the connection pool goes wrong?
>>> >>>>>>> [pid] [coroutine_id] msg
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> This seems unlikely. OpenStack uses the SQLAlchemy connection pool
>>> >>>>>> with eventlet patching in conjunction with the MySQL drivers, and
>>> >>>>>> nothing like that happens. The biggest issue SQLAlchemy had with
>>> >>>>>> greenlets is that a timed-out greenlet would throw GreenletExit and
>>> >>>>>> cause a dirty connection to go back into the pool; that was fixed
>>> >>>>>> in 1.2.
>>> >>>>>>
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db730]before PyObject_CallFunctionObjArgs conn
>>> >>>>>>> 0x94122f0,
>>> >>>>>>> cb 0x23ad320
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db730]conn_poll: status = 2, conn 0x94122f0
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db730]conn_poll: async_status = ASYNC_WRITE 0x94122f0
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]before EXC_IF_ASYNC_IN_PROGRESS conn 0x94122f0,
>>> >>>>>>> async_cursor 0x881ac00
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]before EXC_IF_ASYNC_IN_PROGRESS conn 0x94122f0,
>>> >>>>>>> async_cursor 0x881ac00
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]pq_abort: enter pgconn = 0x94122f0, autocommit
>>> >>>>>>> = 0,
>>> >>>>>>> status = 2
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]before lock pgconn = 0x94122f0, owner 49174
>>> >>>>>>> __lock 1
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> Below is the logs grep by coroutine id:
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db730]finish send query, before psyco_wait, conn
>>> >>>>>>> 0x94122f0
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db730]before have_wait_callback conn 0x94122f0
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db730]before PyObject_CallFunctionObjArgs conn
>>> >>>>>>> 0x94122f0,
>>> >>>>>>> cb 0x23ad320
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db730]conn_poll: status = 2, conn 0x94122f0
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db730]conn_poll: async_status = ASYNC_WRITE 0x94122f0
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db730]conn_poll: poll writing
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db730]conn_poll: async_status -> ASYNC_READ
>>> >>>>>>>
>>> >>>>>>> -----
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]psyco_conn_cursor: new unnamed cursor for
>>> >>>>>>> connection
>>> >>>>>>> at 0x8de2d30
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]cursor_setup: init cursor object at 0xa6c2650
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]cursor_setup: parameters: name = (null), conn =
>>> >>>>>>> 0x8de2d30
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]cursor_setup: good cursor object at 0xa6c2650,
>>> >>>>>>> refcnt = 1
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]psyco_conn_cursor: new cursor at 0xa6c2650:
>>> >>>>>>> refcnt =
>>> >>>>>>> 1
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]before EXC_IF_ASYNC_IN_PROGRESS conn 0x94122f0,
>>> >>>>>>> async_cursor 0x881ac00
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]before EXC_IF_ASYNC_IN_PROGRESS conn 0x94122f0,
>>> >>>>>>> async_cursor 0x881ac00
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]pq_abort: enter pgconn = 0x94122f0, autocommit
>>> >>>>>>> = 0,
>>> >>>>>>> status = 2
>>> >>>>>>>
>>> >>>>>>> [49174] [0xa5db4b0]before lock pgconn = 0x94122f0, owner 49174
>>> >>>>>>> __lock 1
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On Saturday, April 28, 2018 at 4:07:34 PM UTC+8, [email protected] wrote:
>>> >>>>>>>>
>>> >>>>>>>> We reproduced this problem and added logs in psycopg2, found a
>>> >>>>>>>> confusing thing and reported it to psycopg2.
>>> >>>>>>>>
>>> >>>>>>>> It seems the sqlalchemy & psycopg2 stack flow is:
>>> >>>>>>>> 1. from sqlalchemy to psycopg: pq_execute/pq_commit
>>> >>>>>>>> 2. from psycopg to sqlalchemy: PyWeakref_NewRef(conn)
>>> >>>>>>>> 3. get an exception in sqlalchemy, and do_rollback
>>> >>>>>>>> 4. from sqlalchemy to psycopg: pq_abort
>>> >>>>>>>> 5. psycopg hits a pthread_mutex_lock deadlock
>>> >>>>>>>>
>>> >>>>>>>> What we did to reproduce it: stop the master pg-server and
>>> >>>>>>>> promote the slave pg-server to master, moving the FIP from the
>>> >>>>>>>> old master to the slave. At the same time, have nova-conductor
>>> >>>>>>>> issue quite a lot of DB query requests.
>>> >>>>>>>>
>>> >>>>>>>> [1] https://github.com/psycopg/psycopg2/issues/703
>>> >>>>>>>>
>>> >>>>>>>> On Monday, April 23, 2018 at 9:45:04 PM UTC+8, Mike Bayer wrote:
>>> >>>>>>>>>
>>> >>>>>>>>> On Mon, Apr 23, 2018 at 9:03 AM, <[email protected]> wrote:
>>> >>>>>>>>> > Sorry for replying on this topic.
>>> >>>>>>>>> >
>>> >>>>>>>>> > We recently get the same problem in our production
>>> >>>>>>>>> > environment.
>>> >>>>>>>>> >
>>> >>>>>>>>> > I found a patch in another lib [1], where they added
>>> >>>>>>>>> > conn.close() in the psycopg2.ProgrammingError exception
>>> >>>>>>>>> > handler.
>>> >>>>>>>>> >
>>> >>>>>>>>> > Shall we do the same in [2] ?
>>> >>>>>>>>>
>>> >>>>>>>>> SQLAlchemy does things much more carefully than that: we parse
>>> >>>>>>>>> the error message for specific ones that correspond to
>>> >>>>>>>>> "connection is no longer usable". We call these "is_disconnect",
>>> >>>>>>>>> but it can be any invalidating condition.
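A rough sketch of that message-parsing idea is below. The phrase list and helper name are made up for illustration and are not SQLAlchemy's actual disconnect detection; "connection pointer is NULL" is the psycopg2 message seen earlier in this thread:

```python
# Hypothetical phrase list; real dialects maintain driver-specific ones.
DISCONNECT_PHRASES = (
    "connection pointer is null",
    "server closed the connection unexpectedly",
    "connection already closed",
)

def looks_like_disconnect(exc):
    """Return True if the error message suggests the connection is dead."""
    msg = str(exc).lower()
    return any(phrase in msg for phrase in DISCONNECT_PHRASES)

assert looks_like_disconnect(Exception("connection pointer is NULL"))
assert not looks_like_disconnect(Exception("syntax error at or near SELECT"))
```

In a handle_error listener, a match like this would be used to set is_disconnect so the pool invalidates the connection instead of reusing it.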
>>> >>>>>>>>>
>>> >>>>>>>>> You can make these yourself, and they can also be made to be
>>> >>>>>>>>> part of
>>> >>>>>>>>> oslo.db, using the handle_error event:
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> http://docs.sqlalchemy.org/en/latest/core/events.html?highlight=handle_error#sqlalchemy.events.ConnectionEvents.handle_error
>>> >>>>>>>>>
>>> >>>>>>>>> within oslo.db you would want to propose a change here:
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/exc_filters.py#L387
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> >
>>> >>>>>>>>> > [1]
>>> >>>>>>>>> > https://github.com/aio-libs/aiopg/pull/415/files?diff=split
>>> >>>>>>>>> >
>>> >>>>>>>>> > [2]
>>> >>>>>>>>> >
>>> >>>>>>>>> >
>>> >>>>>>>>> > https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/engine/base.py#L1289
>>> >>>>>>>>> >
>>> >>>>>>>>> >
>>> >>>>>>>>> > On Monday, November 13, 2017 at 10:44:31 AM UTC+8, JinRong Cai wrote:
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> Hi Michael ,
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> I am using openstack with postgresql, where the sqlalchemy
>>> >>>>>>>>> >> and oslo_db modules are used.
>>> >>>>>>>>> >> And there are some problems after my pg database switched
>>> >>>>>>>>> >> over.
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> Here is my switch over process:
>>> >>>>>>>>> >> 1. nova-conductor(python application) is running with DB
>>> >>>>>>>>> >> connection
>>> >>>>>>>>> >> strings point to vip , which is in primary site(A) of pg.
>>> >>>>>>>>> >> 2. switch VIP from primary(A) to new primary(B)
>>> >>>>>>>>> >> 3. switch over pg: shut down primary (A), promote standby (B)
>>> >>>>>>>>> >> to new primary.
>>> >>>>>>>>> >> 4. nova-conductor is running in the whole process.
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> After some seconds, I found some nova-conductor processes
>>> >>>>>>>>> >> hung with status futex_wait_queue_me, and the status of the
>>> >>>>>>>>> >> query in the DB is "idle in transaction"; the transaction was
>>> >>>>>>>>> >> not committed or rolled back!
>>> >>>>>>>>> >> I think disconnection is handled in oslo_db, which will send
>>> >>>>>>>>> >> a ping (select 1) to the DB.
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> If the DB was switched over, the connection in the pool would
>>> >>>>>>>>> >> be marked invalid, and reconnect on the next checkout.
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> ### error messages from nova-conductor
>>> >>>>>>>>> >> localhost nova-conductor ERROR [pid:36365] [MainThread] [tid:122397712] [exc_filters.py:330 _raise_for_remaining_DBAPIError] [req-2bd8a290-e17b-4178-80a6-4b36d5793d85] DBAPIError exception wrapped from (psycopg2.ProgrammingError) execute cannot be used while an asynchronous query is underway [SQL: 'SELECT 1']
>>> >>>>>>>>> >> 36365 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last):
>>> >>>>>>>>> >> 36365 ERROR oslo_db.sqlalchemy.exc_filters   File "/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
>>> >>>>>>>>> >> 36365 ERROR oslo_db.sqlalchemy.exc_filters     context)
>>> >>>>>>>>> >> 36365 ERROR oslo_db.sqlalchemy.exc_filters   File "/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
>>> >>>>>>>>> >> 36365 ERROR oslo_db.sqlalchemy.exc_filters     cursor.execute(statement, parameters)
>>> >>>>>>>>> >> 36365 ERROR oslo_db.sqlalchemy.exc_filters ProgrammingError: execute cannot be used while an asynchronous query is underway
>>> >>>>>>>>> >> 36365 ERROR oslo_db.sqlalchemy.exc_filters
>>> >>>>>>>>> >> localhost nova-conductor ERROR [pid:36365] [MainThread] [tid:122397712] [log.py:122 error] [req-2bd8a290-e17b-4178-80a6-4b36d5793d85] Error closing cursor
>>> >>>>>>>>> >> 36365 ERROR sqlalchemy.pool.QueuePool Traceback (most recent call last):
>>> >>>>>>>>> >> 36365 ERROR sqlalchemy.pool.QueuePool   File "/python2.7/site-packages/sqlalchemy/engine/base.py", line 1226, in _safe_close_cursor
>>> >>>>>>>>> >> 36365 ERROR sqlalchemy.pool.QueuePool     cursor.close()
>>> >>>>>>>>> >> 36365 ERROR sqlalchemy.pool.QueuePool ProgrammingError: close cannot be used while an asynchronous query is underway
>>> >>>>>>>>> >> 36365 ERROR sqlalchemy.pool.QueuePool
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> ### kernel stack of a hung nova-conductor process
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> POD6-Mongodb03:/var/log/uvp-getosstat/statistics20171106101500log # cat /proc/33316/stack
>>> >>>>>>>>> >> [<ffffffff810e4c24>] futex_wait_queue_me+0xc4/0x120
>>> >>>>>>>>> >> [<ffffffff810e5799>] futex_wait+0x179/0x280
>>> >>>>>>>>> >> [<ffffffff810e782e>] do_futex+0xfe/0x5b0
>>> >>>>>>>>> >> [<ffffffff810e7d60>] SyS_futex+0x80/0x180
>>> >>>>>>>>> >> [<ffffffff81654e09>] system_call_fastpath+0x16/0x1b
>>> >>>>>>>>> >> [<ffffffffffffffff>] 0xffffffffffffffff
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> ### user-space stack of the nova-conductor process
>>> >>>>>>>>> >> POD6-Mongodb03:/tmp # pstack 33316
>>> >>>>>>>>> >> #0 0x00002b8449e35f4d in __lll_lock_wait () from /lib64/libpthread.so.0
>>> >>>>>>>>> >> #1 0x00002b8449e31d02 in _L_lock_791 () from /lib64/libpthread.so.0
>>> >>>>>>>>> >> #2 0x00002b8449e31c08 in pthread_mutex_lock () from /lib64/libpthread.so.0
>>> >>>>>>>>> >> #3 0x00002b84554c44ab in pq_abort () from /python2.7/site-packages/psycopg2/_psycopg.so
>>> >>>>>>>>> >> #4 0x00002b84554c955e in psyco_conn_rollback () from /python2.7/site-packages/psycopg2/_psycopg.so
>>> >>>>>>>>> >> #5 0x00002b8449b42b50 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
>>> >>>>>>>>> >> #6 0x00002b8449b42ad0 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> psycopg2 was trying to close the cursor (rolling back via
>>> >>>>>>>>> >> pq_abort) and waiting on the connection's mutex in
>>> >>>>>>>>> >> pthread_mutex_lock, but it seems the cursor was being used by
>>> >>>>>>>>> >> another session.
>>> >>>>>>>>> >>
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> Questions:
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> 1. What does the error "ProgrammingError: close cannot be
>>> >>>>>>>>> >> used while an asynchronous query is underway" mean?
>>> >>>>>>>>> >> AFAIK, it is raised by psycopg2 when a second operation is
>>> >>>>>>>>> >> attempted on a connection that already has an asynchronous
>>> >>>>>>>>> >> query in flight.
>>> >>>>>>>>> >> But I think SQLAlchemy is safe here, since psycopg2 is made
>>> >>>>>>>>> >> green-thread aware by eventlet; see the details in
>>> >>>>>>>>> >> eventlet/support/psycopg2_patcher.py.
>>> >>>>>>>>> >> And we can see different green thread numbers in the log, e.g.:
>>> >>>>>>>>> >> [pid:36365] [MainThread] [tid:122397712]
>>> >>>>>>>>> >> [pid:36365] [MainThread] [tid:122397815]
>>> >>>>>>>>> >> So I guess the connection pool within one process is safe.
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> 2. nova-conductor is a Python service that forks several
>>> >>>>>>>>> >> child processes:
>>> >>>>>>>>> >> ps -elf | grep -i nova-conductor
>>> >>>>>>>>> >> 30878 1 pool_s /usr/bin/nova-conductor
>>> >>>>>>>>> >> 36364 1 ep_pol /usr/bin/nova-conductor
>>> >>>>>>>>> >> 36365 1 futex_ /usr/bin/nova-conductor
>>> >>>>>>>>> >> 36366 1 ep_pol /usr/bin/nova-conductor
>>> >>>>>>>>> >> 36367 1 ep_pol /usr/bin/nova-conductor
>>> >>>>>>>>> >> 36368 1 ep_pol /usr/bin/nova-conductor
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> If nova-conductor is started with only one child, the problem
>>> >>>>>>>>> >> does not happen.
>>> >>>>>>>>> >> Does this mean the connection/engine CANNOT be shared across
>>> >>>>>>>>> >> these child processes?
>>> >>>>>>>>> >>
>>> >>>>>>>>> >> Thanks.
>>> >>>>>>>>> >
>>> >>>>>>>>> > --
>>> >>>>>>>>> > SQLAlchemy -
>>> >>>>>>>>> > The Python SQL Toolkit and Object Relational Mapper
>>> >>>>>>>>> >
>>> >>>>>>>>> > http://www.sqlalchemy.org/
>>> >>>>>>>>> >
>>> >>>>>>>>> > To post example code, please provide an MCVE: Minimal,
>>> >>>>>>>>> > Complete,
>>> >>>>>>>>> > and
>>> >>>>>>>>> > Verifiable Example. See http://stackoverflow.com/help/mcve for
>>> >>>>>>>>> > a
>>> >>>>>>>>> > full
>>> >>>>>>>>> > description.
>>> >>>>>>>>> > ---
>>> >>>>>>>>> > You received this message because you are subscribed to the
>>> >>>>>>>>> > Google
>>> >>>>>>>>> > Groups
>>> >>>>>>>>> > "sqlalchemy" group.
>>> >>>>>>>>> > To unsubscribe from this group and stop receiving emails from
>>> >>>>>>>>> > it,
>>> >>>>>>>>> > send an
>>> >>>>>>>>> > email to [email protected].
>>> >>>>>>>>> > To post to this group, send email to
>>> >>>>>>>>> > [email protected].
>>> >>>>>>>>> > Visit this group at
>>> >>>>>>>>> > https://groups.google.com/group/sqlalchemy.
>>> >>>>>>>>> > For more options, visit https://groups.google.com/d/optout.
>>> >>>>>>>
>>> >>>
>>