On Tue, May 15, 2018, 3:26 AM <jiajunsu....@gmail.com> wrote:
> We got the log "jjs _finalize_fairy args" just once, but the code in
> function _finalize_fairy seems to be called twice. It also confused me.
It's because that code originated elsewhere and got blocked, then it picks up again but the callstack is somehow corrupted (or logging isn't telling the truth). Can you please give me:

Exact version of eventlet in use
Exact version of psycopg2 and psycogreen
Exact Python interpreter version
Exact version of SQLAlchemy
Exact version of Oslo.db

Thanks

> BTW, I tried to add code[1] to avoid the double checkin in _finalize_fairy but
> failed.
> That means the code in pool.py#L670-692 was called without L646-668.
>
> [1] https://github.com/jiajunsu/test_logs/blob/master/pool.py#L659
>
> On Tuesday, May 15, 2018 at 12:41:17 PM UTC+8, Mike Bayer wrote:
>>
>> On Mon, May 14, 2018 at 9:38 PM, <jiajun...@gmail.com> wrote:
>> > The tid is a coroutine id; we get it by id(greenlet.getcurrent()) in each
>> > coroutine (green thread).
>> >
>> > We run nova-conductor with monkey_patch, the same as [1],
>> > eventlet.monkey_patch(os=False)
>> >
>> > Our nova-conductor is running with multiple workers.
>> >
>> > It's confusing why _finalize_fairy is called twice but only logs once at
>> > the entrance; that's why I guess it's called by the weakref's callback.
>>
>> For 130446032, which is the tid you mentioned, _finalize_fairy is only
>> called once in your log:
>>
>> $ cat nc_5679.log | grep -e "130446032.*jjs _finalize"
>> 2018-05-14T11:11:39.955+08:00 localhost nova-conductor DEBUG [pid:5679] [MainThread] [tid:130446032] [log.py:104 debug] [req-841bb432-7cd2-4fa3-9def-559a34b05a21] jjs _finalize_fairy args conn None, rec.conn None, rec, <sqlalchemy.pool._ConnectionRecord object at 0x70cdb10>.
ref None, fairy <sqlalchemy.pool._ConnectionFairy object at 0x8a87390> debug
>>
>> Also, the log shows that no _finalize_fairy is ever called due to
>> "del" - this would show the weakref "ref" as non-None:
>>
>> $ cat nc_5679.log | grep -e "FINALIZE FAIRY.*REF:.*" | grep -v "REF: None"
>> $
>>
>> I would assume that by "finalize fairy" called twice, you're referring to
>> the fact that you see the "exception during reset or similar" message.
>> But that message is still AFAICT impossible to emit without the
>> function being entered, and a corresponding "FINALIZE FAIRY" log line
>> preceding it, unless something is going wrong with the logging, or the
>> interpreter is somehow not functioning correctly (I wonder if eventlet
>> had any such issues). I'm struggling to come up with any other possibility.
>>
>> One weird thing I'm trying to figure out is that I see a connection
>> being returned to the pool normally, but then nothing is logged about
>> it after that:
>>
>> 2018-05-14T11:11:21.183+08:00 localhost nova-conductor DEBUG [pid:5679] [MainThread] [tid:130446352] [log.py:104 debug] [req-19a57f61-bc64-4575-8253-00b5bfe62a74] Connection <connection object at 0x7f419c0; closed: 0> being returned to pool, connection_record.connection <connection object at 0x7f419c0; closed: 0> debug
>>
>> the next log message from tid:130446352 should be "jjs conn reset".
>> But that never happens.
The next we see of that tid is:
>>
>> 2018-05-14T11:11:21.549+08:00 localhost nova-conductor DEBUG [pid:5679] [MainThread] [tid:130446352] [log.py:104 debug] [req-0ef61146-5cb6-4f46-9788-406b6d376843] jjs _ConnectionFairy._checkout fariy None debug
>> 2018-05-14T11:11:21.549+08:00 localhost nova-conductor DEBUG [pid:5679] [MainThread] [tid:130446352] [log.py:104 debug] [req-0ef61146-5cb6-4f46-9788-406b6d376843] jjs before pool._do_get pool queue deque([<sqlalchemy.pool._ConnectionRecord object at 0x7777e10>, <sqlalchemy.pool._ConnectionRecord object at 0x75fb7d0>, <sqlalchemy.pool._ConnectionRecord object at 0x778d810>, <sqlalchemy.pool._ConnectionRecord object at 0x7600610>, <sqlalchemy.pool._ConnectionRecord object at 0x6ea7e10>, <sqlalchemy.pool._ConnectionRecord object at 0x7798050>, <sqlalchemy.pool._ConnectionRecord object at 0x796d110>, <sqlalchemy.pool._ConnectionRecord object at 0x7798d50>, <sqlalchemy.pool._ConnectionRecord object at 0x75f5710>]) debug
>>
>> that is, it's checking out a new connection? If it were blocking on
>> trying to reset the connection, we wouldn't see that tid again. I
>> wonder if the greenlet was interrupted or something. I'm finding
>> various impossibilities like this going on and am not able to come up
>> with any theory other than that the interpreter is somehow not functioning
>> correctly, or the logging here is somehow incomplete.
>>
>> I notice you're on a very old version of SQLAlchemy (three years old).
>> You really should be on at least the 1.1 series, because for
>> greenlets you need to have:
>>
>> http://docs.sqlalchemy.org/en/latest/changelog/migration_11.html#change-3803
>>
>> > [1] https://github.com/openstack/nova/blob/master/nova/cmd/__init__.py
>> >
>> > On Monday, May 14, 2018 at 11:59:43 PM UTC+8, Mike Bayer wrote:
>> >>
>> >> I notice in this log file there are multiple "tid", which I assume to
>> >> be thread identifiers? E.g. OS-level threads?
>> >> Are you using eventlet or not? Are you monkeypatching SQLAlchemy and
>> >> then using multiple OS-level threads at the same time? Because that
>> >> likely doesn't work.
>> >>
>> >> Particularly, the log here illustrates an impossible codepath.
>> >>
>> >> We enter finalize_fairy with connection set to None:
>> >>
>> >> 2018-05-14T11:11:39.961+08:00 localhost nova-conductor DEBUG [pid:5679] [MainThread] [tid:130446352] [log.py:104 debug] [req-74ef6531-2d92-40ea-a524-f54e11edab06] FINALIZE FAIRY: DB CONN: None, FAIRY: <sqlalchemy.pool._ConnectionFairy object at 0x87dd3d0>, REF: None CONNECTION_REC: <sqlalchemy.pool._ConnectionRecord object at 0x776fe90> CONNECTION_REC REF <weakref at 0x7c5cdb8; to '_ConnectionFairy' at 0x87dd3d0> debug
>> >>
>> >> this seems to make it through to the checkin at the bottom.
>> >>
>> >> Then, we see this:
>> >>
>> >> 2018-05-14T11:11:39.968+08:00 localhost nova-conductor ERROR [pid:5679] [MainThread] [tid:130446032] [log.py:122 error] [req-841bb432-7cd2-4fa3-9def-559a34b05a21] Exception during reset or similar
>> >>
>> >> The only place that message is generated is *in* finalize_fairy,
>> >> *before* the checkin occurs, and *only* if _finalize_fairy was called
>> >> *with* a connection that is non-None. There is no other call to
>> >> _finalize_fairy with that request identifier, therefore I see no
>> >> evidence of where this "exception during reset" message can be
>> >> generated from - there would at least need to be an additional log
>> >> line with "FINALIZE FAIRY" and the connection non-None for this to be
>> >> reached. I see no way that the pool.py shipped with SQLAlchemy can
>> >> possibly generate this series of steps.
>> >>
>> >> This is an extremely intricate issue and I still have only the vaguest
>> >> notion of how you are running SQLAlchemy / Nova.
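[Editor's note: to make the "impossible codepath" above concrete, here is a heavily reduced model of _finalize_fairy's control flow. The names follow pool.py of that era, but the bodies are stripped down to just the two branches under discussion; this is a sketch for illustration, not SQLAlchemy's shipped code.]

```python
# Reduced sketch (not SQLAlchemy's actual code): _finalize_fairy only
# attempts a connection reset -- and hence can only log "Exception during
# reset or similar" -- when it is handed a non-None connection.

reset_attempts = []

class _ConnectionRecord(object):
    def __init__(self):
        self.fairy_ref = None
        self.checkin_count = 0

    def checkin(self):
        self.fairy_ref = None
        self.checkin_count += 1

def _finalize_fairy(connection, connection_record, ref=None):
    if ref is not None and connection_record.fairy_ref is not ref:
        # stale weakref callback: the record already belongs to
        # a different fairy, so exit immediately
        return
    if connection is not None:
        # the reset happens here; only this branch could ever emit
        # "Exception during reset or similar"
        reset_attempts.append(connection)
    connection_record.checkin()

rec = _ConnectionRecord()
_finalize_fairy(None, rec)   # mirrors the logged call: conn None, ref None
```

With conn None and ref None, the record is checked in once and no reset is attempted, so the error message in the log has no apparent origin.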
>> >>
>> >> On Mon, May 14, 2018 at 11:10 AM, Mike Bayer <mik...@zzzcomputing.com> wrote:
>> >> > OK I need the exact pool.py you are using as well, including all
>> >> > these logging statements and any other adjustments that were made.
>> >> >
>> >> > On Mon, May 14, 2018 at 8:02 AM, <jiajun...@gmail.com> wrote:
>> >> >> I've uploaded the logs here:
>> >> >> https://raw.githubusercontent.com/jiajunsu/test_logs/master/nc_5679.log
>> >> >>
>> >> >> The ConnectionRecord is 0x70cdb10, tid (coroutine id) is 130446032
>> >> >>
>> >> >> At 2018-05-14T11:11:39.970, we got the log "jjs rec already in pool
>> >> >> checkin", but I just deleted `return` in the code, so the problem
>> >> >> happened again.
>> >> >>
>> >> >> On Monday, May 14, 2018 at 10:48:49 AM UTC+8, Mike Bayer wrote:
>> >> >>>
>> >> >>> as well as the log that shows this:
>> >> >>>
>> >> >>> +        if self in pool._pool.queue:
>> >> >>> +            pool.logger.debug("jjs rec already in pool checkin, skip %r", self)
>> >> >>> +            return
>> >> >>>
>> >> >>> actually happening.
>> >> >>>
>> >> >>> On Sun, May 13, 2018 at 10:42 PM, Mike Bayer <mik...@zzzcomputing.com> wrote:
>> >> >>> > On Sun, May 13, 2018 at 10:35 PM, <jiajun...@gmail.com> wrote:
>> >> >>> >> "jjs" is short for my name and "cjr" is my colleague's (JinRong Cai). We
>> >> >>> >> work together and add logs in the same environment, prefixing our names
>> >> >>> >> to get a keyword for log filtering.
>> >> >>> >>
>> >> >>> >> The doubled ConnectionRecords (with None connection) are obtained by two
>> >> >>> >> different coroutines, and the variable record.connection is rewritten by
>> >> >>> >> the second coroutine.
>> >>> >> After the record is put back into the QueuePool again, there are two
>> >>> >> identical records with the same connection in the pool, which will cause
>> >>> >> the next two coroutines to get a conflict and deadlock in psycopg2 as above.
>> >>> >
>> >>> > OK, I need the detailed step by step that shows how the two coroutines
>> >>> > are interacting such that they both are returning the same connection.
>> >>> > The nova log above seems to show just one coroutine, I assume, since
>> >>> > there is just one nova request ID, if I understand correctly.
>> >>> >
>> >>> >> The psycopg2 logs have been pasted at
>> >>> >> https://github.com/psycopg/psycopg2/issues/703
>> >>> >
>> >>> > The above issue says the bug has been found in SQLAlchemy, but there is
>> >>> > no bug report yet and I've not yet been shown how the behavior here is
>> >>> > possible. I posted a straightforward example above; can you
>> >>> > please show me where the two coroutines get mixed up?
>> >>> >
>> >>> >> On Friday, May 11, 2018 at 3:05:46 AM UTC+8, Mike Bayer wrote:
>> >>> >>>
>> >>> >>> I tried to follow your log lines, which is difficult because I don't
>> >>> >>> know where those log statements actually are and I don't know what
>> >>> >>> acronyms like "jjs", "cjr" mean. But it does look like it's all in
>> >>> >>> one thread (so not async GC) and all in one Nova request (so perhaps
>> >>> >>> just one greenlet involved, unless other greenlets are affecting this
>> >>> >>> without them being displayed in this log snippet). I do see the
>> >>> >>> connection record doubled at the end.
So here is the >> >> >>> >>> illustration of >> >> >>> >>> those steps in isolation, the connection record is not >> doubled. >> >> >>> >>> Can >> >> >>> >>> you alter this program to show this condition occurring? >> >> >>> >>> >> >> >>> >>> from sqlalchemy import create_engine >> >> >>> >>> >> >> >>> >>> e = create_engine("postgresql://scott:tiger@localhost/test", >> >> >>> >>> echo_pool='debug') >> >> >>> >>> e.connect().close() >> >> >>> >>> >> >> >>> >>> assert len(e.pool._pool.queue) == 1 >> >> >>> >>> >> >> >>> >>> c1 = e.connect() >> >> >>> >>> >> >> >>> >>> assert len(e.pool._pool.queue) == 0 >> >> >>> >>> >> >> >>> >>> c1.engine.pool._invalidate(c1._Connection__connection, None) >> >> >>> >>> c1.invalidate(e) >> >> >>> >>> >> >> >>> >>> # connection record is back >> >> >>> >>> assert len(e.pool._pool.queue) == 1 >> >> >>> >>> >> >> >>> >>> # do a reconnect >> >> >>> >>> c1.connection >> >> >>> >>> >> >> >>> >>> # uses the record again >> >> >>> >>> assert len(e.pool._pool.queue) == 0 >> >> >>> >>> >> >> >>> >>> # close the Connection >> >> >>> >>> c1.close() >> >> >>> >>> >> >> >>> >>> # record is back >> >> >>> >>> assert len(e.pool._pool.queue) == 1 >> >> >>> >>> >> >> >>> >>> # etc. 
>> >>> >>> c1 = e.connect()
>> >>> >>> assert len(e.pool._pool.queue) == 0
>> >>> >>>
>> >>> >>> c1.close()
>> >>> >>> assert len(e.pool._pool.queue) == 1
>> >>> >>>
>> >>> >>> On Thu, May 10, 2018 at 12:25 PM, Mike Bayer <mik...@zzzcomputing.com> wrote:
>> >>> >>> > Here's a print statement you might want to try:
>> >>> >>> >
>> >>> >>> > diff --git a/lib/sqlalchemy/pool.py b/lib/sqlalchemy/pool.py
>> >>> >>> > index 89a4cea7c..452db4883 100644
>> >>> >>> > --- a/lib/sqlalchemy/pool.py
>> >>> >>> > +++ b/lib/sqlalchemy/pool.py
>> >>> >>> > @@ -689,6 +689,19 @@ def _finalize_fairy(connection, connection_record,
>> >>> >>> >          been garbage collected.
>> >>> >>> >
>> >>> >>> >      """
>> >>> >>> > +    print(
>> >>> >>> > +        "FINALIZE FAIRY: GREENLET ID: %s, DB CONN: %s, FAIRY: %s, REF: %s "
>> >>> >>> > +        "CONNECTION_REC: %s CONNECTION_REC REF %s" % (
>> >>> >>> > +            "put greenlet id here",
>> >>> >>> > +            connection,
>> >>> >>> > +            fairy,
>> >>> >>> > +            ref,
>> >>> >>> > +            connection_record,
>> >>> >>> > +            connection_record.fairy_ref
>> >>> >>> > +            if connection_record is not None else "n/a"
>> >>> >>> > +        )
>> >>> >>> > +    )
>> >>> >>> > +
>> >>> >>> >      _refs.discard(connection_record)
>> >>> >>> >
>> >>> >>> >      if ref is not None:
>> >>> >>> >
>> >>> >>> > On Thu, May 10, 2018 at 12:21 PM, Mike Bayer <mik...@zzzcomputing.com> wrote:
>> >>> >>> >> On Thu, May 10, 2018 at 7:23 AM, <jiajun...@gmail.com> wrote:
>> >>> >>> >>> Hi Mike,
>> >>> >>> >>>
>> >>> >>> >>> I added more logs and finally found the reason and a method to avoid
>> >>> >>> >>> the deadlock between coroutines.
>> >> >>> >>> >>> >> >> >>> >>> >>> In summary, the deadlock happend because two different >> >> >>> >>> >>> coroutines >> >> >>> >>> >>> use >> >> >>> >>> >>> the >> >> >>> >>> >>> same ConnectionRecord with same connection. >> >> >>> >>> >>> It happends under QueuePool, when the DB connections are >> reset >> >> >>> >>> >>> by >> >> >>> >>> >>> the >> >> >>> >>> >>> DB >> >> >>> >>> >>> server(not related with the backend driver). >> >> >>> >>> >>> >> >> >>> >>> >>> 1.A coroutine get a ConnectionRecord into ConnectionFairy, >> and >> >> >>> >>> >>> apply a >> >> >>> >>> >>> weakref of the ConnectionFairy, with callback function >> >> >>> >>> >>> `_finalize_fairy`. >> >> >>> >>> >>> 2.When the connection get disconnected, it receives an >> >> >>> >>> >>> exception(psycopg2.OperationalError), and goto >> >> >>> >>> >>> Connection._handle_dbapi_exception. >> >> >>> >>> >>> 3.In `_handle_dbapi_exception.finnaly`, it will call both >> >> >>> >>> >>> `ConnectionFairy.invalidate` and `self.invalidate`. >> >> >>> >>> >>> 4.In `ConnectionFairy.invalidate`, >> >> >>> >>> >>> `_ConnectionRecord._checkin` is >> >> >>> >>> >>> called, >> >> >>> >>> >>> and put the ConnectionRecord back to the QueuePool. >> >> >>> >>> >>> 5.In `self.invalidate`, `del self._root.__connection` is >> >> >>> >>> >>> called, >> >> >>> >>> >>> and >> >> >>> >>> >>> **del** >> >> >>> >>> >>> the ConnectionFairy. >> >> >>> >>> >>> 6.The **del** operation will make weakref of >> ConnectionFairy >> >> >>> >>> >>> to >> >> >>> >>> >>> dead, >> >> >>> >>> >>> and >> >> >>> >>> >>> the callback `_finalize_fairy` executed. >> >> >>> >>> >>> 7.In `_finalize_fairy`, another >> `_ConnectionRecord._checkin` >> >> >>> >>> >>> is >> >> >>> >>> >>> called, put >> >> >>> >>> >>> the ConnectionRecord to the QueuePool again. >> >> >>> >>> >> >> >> >>> >>> >> Here's what does not make sense. 
You are claiming that a >> >> >>> >>> >> simple >> >> >>> >>> >> invalidation will result in the same ConnectionRecord being >> >> >>> >>> >> checked >> >> >>> >>> >> in >> >> >>> >>> >> twice. This is obviously not the case, what is special in >> >> >>> >>> >> your >> >> >>> >>> >> situation that makes this happen? >> >> >>> >>> >> >> >> >>> >>> >> If I follow the steps you refer towards, in step 4, when >> >> >>> >>> >> ConnectionFairy._checkin is called, that calls >> finalize_fairy >> >> >>> >>> >> directly, which then calls ConnectionRecord.checkin(). >> Within >> >> >>> >>> >> ConnectionRecord.checkin(), ConnectionRecord.fairy_ref is >> set >> >> >>> >>> >> to >> >> >>> >>> >> None. >> >> >>> >>> >> Then back in ConnectionFairy._checkin(), it also sets >> >> >>> >>> >> self._connection_record = None. >> >> >>> >>> >> >> >> >>> >>> >> If we come back into finalize_fairy a *second* time then, >> as a >> >> >>> >>> >> result >> >> >>> >>> >> of the "del" which occurs in _handle_error(), and the >> >> >>> >>> >> connection_record is present, the function will exit >> >> >>> >>> >> immediately, >> >> >>> >>> >> because: >> >> >>> >>> >> >> >> >>> >>> >> if connection_record.fairy_ref is not ref: >> >> >>> >>> >> return >> >> >>> >>> >> >> >> >>> >>> >> So the situation as given is not possible without >> concurrent >> >> >>> >>> >> access >> >> >>> >>> >> to >> >> >>> >>> >> the ConnectionFairy being introduced externally to the >> pool. >> >> >>> >>> >> >> >> >>> >>> >> There is exactly one thing I can think of that is both >> unique >> >> >>> >>> >> to >> >> >>> >>> >> the >> >> >>> >>> >> psycopg2-style of asynchronous connection, not the PyMySQL >> form >> >> >>> >>> >> of >> >> >>> >>> >> it, >> >> >>> >>> >> that might be important here, and it's that I think >> psycopg2's >> >> >>> >>> >> connection.close() method will defer to another greenlet. 
>> >>> >> That means when we're inside of ConnectionRecord.invalidate(), and we call
>> >>> >> self.__close(), that could be deferring to another greenlet. But
>> >>> >> this all happens before the connection is returned to the pool, so I
>> >>> >> still don't know what is happening.
>> >>> >>
>> >>> >>> The detailed logs are below. To prove my result is right, I tried to
>> >>> >>> avoid the scenario with this patch, and it worked.
>> >>> >>>
>> >>> >>>     def checkin(self):
>> >>> >>>         self.fairy_ref = None
>> >>> >>>         connection = self.connection
>> >>> >>>         pool = self.__pool
>> >>> >>>
>> >>> >>>         while self.finalize_callback:
>> >>> >>>             finalizer = self.finalize_callback.pop()
>> >>> >>>             finalizer(connection)
>> >>> >>> +       if self in pool._pool.queue:
>> >>> >>> +           pool.logger.debug("jjs rec already in pool checkin, skip %r", self)
>> >>> >>> +           return
>> >>> >>
>> >>> >> as mentioned above it's not possible to reach this point without
>> >>> >> concurrent access to the pooled connection, which means this fix could
>> >>> >> fail also (with at least threaded-style concurrency, not greenlet
>> >>> >> style). The original source of the greenlet issue must still be
>> >>> >> identified.
>> >>> >>
>> >>> >> I would need a self-contained proof of concept that demonstrates this
>> >>> >> issue occurring.
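[Editor's note: the [tid:...] values in these logs were produced with id(greenlet.getcurrent()), as the reporter explains earlier in the thread. A helper along these lines could tag every log line in such a proof of concept; the fallback to the OS thread id when greenlet is not installed is my own addition, not something from the thread.]

```python
import threading

def current_tid():
    # Mirrors how the reporter computed [tid:...]: the id() of the
    # current greenlet. Falls back to the OS thread id when greenlet
    # is not installed (the fallback is an editorial assumption).
    try:
        import greenlet
        return id(greenlet.getcurrent())
    except ImportError:
        return threading.get_ident()

print("[tid:%d]" % current_tid())
```

Calling this at the top of each instrumented pool method would make it possible to tell whether two log lines came from the same green thread.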
The log output is not that helpful >> because it >> >> >>> >>> >> at >> >> >>> >>> >> least would need to write out the identifier of the current >> >> >>> >>> >> greenlet, >> >> >>> >>> >> since it is very likely that greenlets have something to do >> >> >>> >>> >> with >> >> >>> >>> >> this >> >> >>> >>> >> issue. >> >> >>> >>> >> >> >> >>> >>> >> >> >> >>> >>> >> >> >> >>> >>> >>> >> >> >>> >>> >>> if pool.dispatch.checkin: >> >> >>> >>> >>> pool.dispatch.checkin(connection, self) >> >> >>> >>> >>> pool._return_conn(self) >> >> >>> >>> >>> >> >> >>> >>> >>> Trace the ConnectionRecord 0x88fc5d0 >> >> >>> >>> >>> >> >> >>> >>> >>> 2018-05-10T14:02:17.927+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] Created new >> >> >>> >>> >>> connection >> >> >>> >>> >>> <connection object at 0x81f69b0; closed: 0> >> >> >>> >>> >>> 2018-05-10T14:02:17.928+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs get conn >> >> >>> >>> >>> success >> >> >>> >>> >>> <connection >> >> >>> >>> >>> object at 0x81f69b0; closed: 0>, rec.connection >> <connection >> >> >>> >>> >>> object >> >> >>> >>> >>> at >> >> >>> >>> >>> 0x81f69b0; closed: 0>, pool queue >> >> >>> >>> >>> deque([<sqlalchemy.pool._ConnectionRecord >> >> >>> >>> >>> object at 0x7a04810>, <sqlalchemy.pool._ConnectionRecord >> >> >>> >>> >>> object at >> >> >>> >>> >>> 0x79fdad0>]) >> >> >>> >>> >>> 2018-05-10T14:02:17.929+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs checkout >> fairy >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionFairy object at 0x329c0d0> rec >> >> >>> >>> >>> 
<sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0> >> pool >> >> >>> >>> >>> queue >> >> >>> >>> >>> deque([<sqlalchemy.pool._ConnectionRecord object at >> >> >>> >>> >>> 0x7a04810>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x79fdad0>]) >> >> >>> >>> >>> 2018-05-10T14:02:17.929+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] Connection >> >> >>> >>> >>> <connection >> >> >>> >>> >>> object at >> >> >>> >>> >>> 0x81f69b0; closed: 0> checked out from pool, status Pool >> size: >> >> >>> >>> >>> 10 >> >> >>> >>> >>> Connections in pool: 2 Current Overflow: 0 Max Overflow: >> 40 >> >> >>> >>> >>> Current >> >> >>> >>> >>> Checked >> >> >>> >>> >>> out connections: 8, queue >> >> >>> >>> >>> deque([<sqlalchemy.pool._ConnectionRecord >> >> >>> >>> >>> object >> >> >>> >>> >>> at 0x7a04810>, <sqlalchemy.pool._ConnectionRecord object >> at >> >> >>> >>> >>> 0x79fdad0>]) >> >> >>> >>> >>> 2018-05-10T14:02:17.930+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs >> >> >>> >>> >>> _ConnectionFairy._checkout >> >> >>> >>> >>> conn <connection object at 0x81f69b0; closed: 0>, rec >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>, >> >> >>> >>> >>> rec.conn >> >> >>> >>> >>> <connection object at 0x81f69b0; closed: 0> >> >> >>> >>> >>> 2018-05-10T14:02:49.716+08:00 localhost nova-conductor >> ERROR >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:122 error] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] cjr error conn >> >> >>> >>> >>> <sqlalchemy.engine.base.Connection object at 0x329c150>, >> err >> >> >>> >>> >>> connection >> >> >>> >>> >>> pointer is NULL >> >> >>> >>> >>> 
2018-05-10T14:02:49.717+08:00 localhost nova-conductor >> ERROR >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:122 error] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] cjr enter >> newraise >> >> >>> >>> >>> 2018-05-10T14:02:49.717+08:00 localhost nova-conductor >> ERROR >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:122 error] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] cjr enter >> finally >> >> >>> >>> >>>>>Connection._handle_dbapi_exception --> finally --> >> >> >>> >>> >>>>> **self.engine.pool._invalidate** & **self.invalidate** >> >> >>> >>> >>>>>self.engine.pool._invalidate --> Pool._invalidate --> >> >> >>> >>> >>>>> ConnectionFairy.invalidate --> >> _ConnectionRecord.invalidate >> >> >>> >>> >>>>> & >> >> >>> >>> >>>>> **self._checkin()** >> >> >>> >>> >>>>>self.invalidate() --> **del self._root.__connection** >> >> >>> >>> >>> 2018-05-10T14:02:49.718+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs >> connectionFairy >> >> >>> >>> >>> invalidate >> >> >>> >>> >>> 2018-05-10T14:02:49.718+08:00 localhost nova-conductor >> INFO >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:109 info] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] Invalidate >> >> >>> >>> >>> connection >> >> >>> >>> >>> <connection >> >> >>> >>> >>> object at 0x81f69b0; closed: 2> (reason: >> >> >>> >>> >>> OperationalError:connection >> >> >>> >>> >>> pointer >> >> >>> >>> >>> is NULL >> >> >>> >>> >>>>>_ConnectionRecord.invalidate(e) --> >> >> >>> >>> >>>>> self.__pool.dispatch.invalidate(self.connection, self, >> e) & >> >> >>> >>> >>>>> self.__close() & >> >> >>> >>> >>>>> self.connection = None >> >> >>> >>> >>>>>self.__close() --> self.finalize_callback.clear() & >> >> >>> >>> >>>>> 
self.__pool._close_connection(self.connection) >> >> >>> >>> >>> 2018-05-10T14:02:49.719+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] Closing >> connection >> >> >>> >>> >>> <connection >> >> >>> >>> >>> object at 0x81f69b0; closed: 2> >> >> >>> >>> >>>>>Pool._close_connection(connection) --> >> >> >>> >>> >>>>> self._dialect.do_close(connection) >> >> >>> >>> >>>>> --> _ConnDialect.do_close(dbapi_connection) --> >> >> >>> >>> >>>>> dbapi_connection.close() FIN >> >> >>> >>> >>> 2018-05-10T14:02:49.719+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs >> >> >>> >>> >>> _ConnectionFairy._checkin >> >> >>> >>> >>> conn None rec <sqlalchemy.pool._ConnectionRecord object at >> >> >>> >>> >>> 0x88fc5d0> >> >> >>> >>> >>> rec.conn None >> >> >>> >>> >>>>>_ConnectionFairy._checkin --> _finalize_fairy(conn, >> conn_rec, >> >> >>> >>> >>>>> pool, >> >> >>> >>> >>>>> None, >> >> >>> >>> >>>>> echo, fairy) & self.conn = None & self.conn_rec = None >> >> >>> >>> >>> 2018-05-10T14:02:49.720+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs >> _finalize_fairy >> >> >>> >>> >>> args >> >> >>> >>> >>> conn >> >> >>> >>> >>> None, rec.conn None, rec, >> <sqlalchemy.pool._ConnectionRecord >> >> >>> >>> >>> object at >> >> >>> >>> >>> 0x88fc5d0>. 
ref None, fairy >> <sqlalchemy.pool._ConnectionFairy >> >> >>> >>> >>> object >> >> >>> >>> >>> at >> >> >>> >>> >>> 0x329c0d0> >> >> >>> >>> >>>>>_finalize_fairy --> connection_record.checkin() >> >> >>> >>> >>> 2018-05-10T14:02:49.720+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs going to >> >> >>> >>> >>> connection_record.checkin >> >> >>> >>> >>> 2018-05-10T14:02:49.721+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs before conn >> >> >>> >>> >>> checkin >> >> >>> >>> >>> None, >> >> >>> >>> >>> pool queue deque([<sqlalchemy.pool._ConnectionRecord >> object at >> >> >>> >>> >>> 0x7524590>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7a2b990>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0xa856b10>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7ebee90>]), >> >> >>> >>> >>> finalize_callback >> >> >>> >>> >>> deque([]) >> >> >>> >>> >>> 2018-05-10T14:02:49.721+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs after conn >> >> >>> >>> >>> checkin, >> >> >>> >>> >>> pool >> >> >>> >>> >>> queue deque([<sqlalchemy.pool._ConnectionRecord object at >> >> >>> >>> >>> 0x7524590>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7a2b990>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0xa856b10>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7ebee90>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>]) >> >> >>> >>> >>> 2018-05-10T14:02:49.723+08:00 localhost nova-conductor >> ERROR >> >> 
>>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:122 error] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] Exception >> during >> >> >>> >>> >>> reset >> >> >>> >>> >>> or >> >> >>> >>> >>> similar >> >> >>> >>> >>> 2018-05-10T14:02:49.723+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs going to >> >> >>> >>> >>> connection_record.invalidate >> >> >>> >>> >>> 2018-05-10T14:02:49.724+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs going to >> >> >>> >>> >>> connection_record.checkin >> >> >>> >>> >>> 2018-05-10T14:02:49.724+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs before conn >> >> >>> >>> >>> checkin >> >> >>> >>> >>> None, >> >> >>> >>> >>> pool queue deque([<sqlalchemy.pool._ConnectionRecord >> object at >> >> >>> >>> >>> 0x7524590>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7a2b990>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0xa856b10>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x7ebee90>, >> >> >>> >>> >>> <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>]), >> >> >>> >>> >>> finalize_callback >> >> >>> >>> >>> deque([]) >> >> >>> >>> >>> 2018-05-10T14:02:49.732+08:00 localhost nova-conductor >> DEBUG >> >> >>> >>> >>> [pid:4528] >> >> >>> >>> >>> [MainThread] [tid:122769072] [log.py:104 debug] >> >> >>> >>> >>> [req-9e51bb8d-cdfe-434b-859a-9211ac2087f9] jjs after conn >> >> >>> >>> >>> checkin, >> >> >>> >>> >>> pool >> >> >>> >>> >>> queue deque([<sqlalchemy.pool._ConnectionRecord object at >> 
0x7524590>, <sqlalchemy.pool._ConnectionRecord object at 0x7a2b990>, <sqlalchemy.pool._ConnectionRecord object at 0xa856b10>, <sqlalchemy.pool._ConnectionRecord object at 0x7ebee90>, <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>, <sqlalchemy.pool._ConnectionRecord object at 0x88fc5d0>])
>> >>> >>>
>> >>> >>> On Tuesday, May 8, 2018 at 11:06:04 PM UTC+8, jiajun...@gmail.com wrote:
>> >>> >>>>
>> >>> >>>> I found why we could get the same connection twice from the pool...
>> >>> >>>>
>> >>> >>>> Because we put it into the pool twice.
>> >>> >>>>
>> >>> >>>> Any suggestions for finding why the same connection was put into the
>> >>> >>>> pool by different coroutines?
>> >>> >>>>
>> >>> >>>> 2018-05-08T22:39:17.448+08:00 localhost nova-conductor DEBUG [pid:15755] [MainThread] [tid:187954992] [log.py:104 debug] [req-7c2199c0-d9a6-4131-8358-2aebcf5a4ce1] jjs conn checkin <connection object at 0x111d9740; closed: 0>
>> >>> >>>> 2018-05-08T22:39:17.502+08:00 localhost nova-conductor DEBUG [pid:15755] [MainThread] [tid:178645264] [log.py:104 debug] [req-7ab83c30-91bd-4278-b694-85a71035992c] jjs conn checkin <connection object at 0x111d9740; closed: 0>
>> >>> >>>> ...
>> 2018-05-08T22:39:17.992+08:00 localhost nova-conductor DEBUG [pid:15755]
>> [MainThread] [tid:309091856] [log.py:104 debug]
>> [req-dee5dd2d-866f-408e-999b-b415ea7ce953] Connection <connection
>> object at 0x111d9740; closed: 0> checked out from pool
>> 2018-05-08T22:39:18.039+08:00 localhost nova-conductor DEBUG [pid:15755]
>> [MainThread] [tid:178645104] [log.py:104 debug]
>> [req-ac8017d7-c97e-44ca-92d2-99f180858e14] Connection <connection
>> object at 0x111d9740; closed: 0> checked out from pool
>>
>> On Tuesday, May 8, 2018 at 10:18:52 PM UTC+8, jiajun...@gmail.com wrote:
>>>
>>> I added logs in sqlalchemy/pool.py
>>>
>>> QueuePool.checkout
>>>
>>>     @classmethod
>>>     def checkout(cls, pool):
>>>         rec = pool._do_get()
>>>         try:
>>>             dbapi_connection = rec.get_connection()
>>>             pool.logger.debug("jjs get conn success %s",
>>>                               dbapi_connection)
>>>         except:
>>>             pool.logger.debug("jjs get error in get_connection")
>>>             with util.safe_reraise():
>>>                 rec.checkin()
>>>         echo = pool._should_log_debug()
>>>         fairy = _ConnectionFairy(dbapi_connection, rec, echo)
>>>         rec.fairy_ref = weakref.ref(
>>>             fairy,
>>>             lambda ref: _finalize_fairy and
>>>             _finalize_fairy(
>>>                 dbapi_connection,
>>>                 rec, pool, ref, echo)
>>>         )
>>>         _refs.add(rec)
>>>         if echo:
>>>             pool.logger.debug("jjs conn %s, pool status %s",
>>>                               dbapi_connection, pool.status())
>>>             pool.logger.debug("Connection %r checked out from pool",
>>>                               dbapi_connection)
>>>         return fairy
>>>
>>> Then I got the logs below:
>>>
>>> coroutine A:
>>>
>>> 2018-05-08T21:39:59.055+08:00 localhost nova-conductor INFO [pid:55528]
>>> [MainThread] [tid:197123760] [log.py:109 info]
>>> [req-ba1ab897-c80a-4bc4-922a-39044d669da1] jjs Pool wait is False,
>>> timeout 30
>>> 2018-05-08T21:39:59.078+08:00 localhost nova-conductor DEBUG [pid:55528]
>>> [MainThread] [tid:197123760] [log.py:104 debug]
>>> [req-ba1ab897-c80a-4bc4-922a-39044d669da1] Created new connection
>>> <connection object at 0x903f8c0; closed: 0>
>>> 2018-05-08T21:39:59.079+08:00 localhost nova-conductor DEBUG [pid:55528]
>>> [MainThread] [tid:197123760] [log.py:104 debug]
>>> [req-ba1ab897-c80a-4bc4-922a-39044d669da1] jjs get conn success
>>> <connection object at 0x903f8c0; closed: 0>
>>> 2018-05-08T21:39:59.080+08:00 localhost nova-conductor DEBUG [pid:55528]
>>> [MainThread] [tid:197123760] [log.py:104 debug]
>>> [req-ba1ab897-c80a-4bc4-922a-39044d669da1] jjs conn <connection object
>>> at 0x903f8c0; closed: 0>, pool status Pool size: 10  Connections in
>>> pool: 1  Current Overflow: 36  Max Overflow: 40  Current Checked out
>>> connections: 45
>>> 2018-05-08T21:39:59.080+08:00 localhost nova-conductor DEBUG [pid:55528]
>>> [MainThread] [tid:197123760] [log.py:104 debug]
>>> [req-ba1ab897-c80a-4bc4-922a-39044d669da1] Connection <connection
>>> object at 0x903f8c0; closed: 0> checked out from pool
>>>
>>> coroutine B:
>>>
>>> 2018-05-08T21:39:59.212+08:00 localhost nova-conductor INFO [pid:55528]
>>> [MainThread] [tid:151519312] [log.py:109 info]
>>> [req-a78075b5-c55c-40c4-86d0-65957e89cb56] jjs Pool wait is False,
>>> timeout 30
>>> 2018-05-08T21:39:59.213+08:00 localhost nova-conductor DEBUG [pid:55528]
>>> [MainThread] [tid:151519312] [log.py:104 debug]
>>> [req-a78075b5-c55c-40c4-86d0-65957e89cb56] jjs get conn success
>>> <connection object at 0x903f8c0; closed: 0>
>>> 2018-05-08T21:39:59.213+08:00 localhost nova-conductor DEBUG [pid:55528]
>>> [MainThread] [tid:151519312] [log.py:104 debug]
>>> [req-a78075b5-c55c-40c4-86d0-65957e89cb56] jjs conn <connection object
>>> at 0x903f8c0; closed: 0>, pool status Pool size: 10  Connections in
>>> pool: 0  Current Overflow: 36  Max Overflow: 40  Current Checked out
>>> connections: 46
>>> 2018-05-08T21:39:59.214+08:00 localhost nova-conductor DEBUG [pid:55528]
>>> [MainThread] [tid:151519312] [log.py:104 debug]
>>> [req-a78075b5-c55c-40c4-86d0-65957e89cb56] Connection <connection
>>> object at 0x903f8c0; closed: 0> checked out from pool
>>>
>>> It seems that when we create_connection in QueuePool._do_get, we may
>>> not be holding the lock _pool.mutex (~Queue.mutex).
>>>
>>> On Monday, May 7, 2018 at 8:41:14 PM UTC+8, Mike Bayer wrote:
>>>>
>>>> On Mon, May 7, 2018, 7:01 AM <jiajun...@gmail.com> wrote:
>>>>>
>>>>> We added coroutine_id in psycopg2, and found that two coroutines
>>>>> use the same connection before the pthread_mutex_lock is released.
>>>>>
>>>>> Maybe something in the connection pool goes wrong?
>>>>> [pid] [coroutine_id] msg
>>>>
>>>> This seems unlikely. OpenStack uses the SQLAlchemy connection pool
>>>> with eventlet patching in conjunction with the MySQL drivers, and
>>>> nothing like that happens. The biggest issue SQLAlchemy had with
>>>> greenlets was that a timed-out greenlet would throw GreenletExit and
>>>> cause a dirty connection to go back into the pool; that was fixed
>>>> in 1.2.
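(A rough sketch of that GreenletExit failure mode, with stand-in names rather than SQLAlchemy's actual cleanup code: because GreenletExit derives from BaseException, cleanup guarded by `except Exception` never sees it, while an unconditional checkin in `finally` still returns the dirty connection to the pool.)

```python
import queue


class FakeGreenletExit(BaseException):
    """Stand-in for greenlet.GreenletExit, which derives from BaseException."""


pool_queue = queue.Queue()


def run_query(conn):
    try:
        # Simulate the greenthread being killed mid-transaction.
        raise FakeGreenletExit()
    except Exception:
        # An "except Exception" cleanup path never runs here, because
        # FakeGreenletExit is a BaseException, not an Exception.
        raise
    finally:
        # A naive checkin-in-finally still returns the connection to the
        # pool, even though its transaction was never rolled back.
        pool_queue.put(conn)


try:
    run_query("conn-A")
except FakeGreenletExit:
    pass

# The dirty connection is back in the pool for the next greenthread.
recovered = pool_queue.get()
print(recovered)  # conn-A
```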
>>>>>
>>>>> [49174] [0xa5db730]before PyObject_CallFunctionObjArgs conn
>>>>> 0x94122f0, cb 0x23ad320
>>>>> [49174] [0xa5db730]conn_poll: status = 2, conn 0x94122f0
>>>>> [49174] [0xa5db730]conn_poll: async_status = ASYNC_WRITE 0x94122f0
>>>>> [49174] [0xa5db4b0]before EXC_IF_ASYNC_IN_PROGRESS conn 0x94122f0,
>>>>> async_cursor 0x881ac00
>>>>> [49174] [0xa5db4b0]before EXC_IF_ASYNC_IN_PROGRESS conn 0x94122f0,
>>>>> async_cursor 0x881ac00
>>>>> [49174] [0xa5db4b0]pq_abort: enter pgconn = 0x94122f0,
>>>>> autocommit = 0, status = 2
>>>>> [49174] [0xa5db4b0]before lock pgconn = 0x94122f0, owner 49174
>>>>> __lock 1
>>>>>
>>>>> Below are the logs grepped by coroutine id:
>>>>>
>>>>> [49174] [0xa5db730]finish send query, before psyco_wait, conn
>>>>> 0x94122f0
>>>>> [49174] [0xa5db730]before have_wait_callback con
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.