This patch worked: https://gerrit.sqlalchemy.org/#/c/zzzeek/sqlalchemy/+/747/

But I didn't get the log "Double checkin attempted"; maybe the real working code is:

    if connection_record and connection_record.fairy_ref is not None:
        connection_record.checkin()
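As a minimal runnable sketch of the guard being proposed above, with stub objects standing in for SQLAlchemy's pool internals (the assumption, per the 1.x-era pool, is that _ConnectionRecord.fairy_ref is cleared once a record has been checked in, so a second checkin attempt can be detected and skipped):

    import logging

    log = logging.getLogger("pool")

    class StubRecord:
        # Stand-in for _ConnectionRecord: fairy_ref is None once the
        # record has already been checked back in to the pool.
        fairy_ref = None

        def checkin(self):
            log.debug("checked in %r", self)

    def checkin_once(connection_record):
        # The guard suggested above: only check in records that still
        # hold a live fairy_ref; otherwise log the double checkin attempt.
        if connection_record is not None and connection_record.fairy_ref is not None:
            connection_record.checkin()
        elif connection_record is not None:
            log.debug("Double checkin attempted on %r", connection_record)

    checkin_once(StubRecord())  # logs the attempt instead of re-queuing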
On Wednesday, May 16, 2018 at 4:04:21 PM UTC+8, jiajun...@gmail.com wrote:
> Thanks a lot.
>
> I'll test this patch in our environment.

On Wednesday, May 16, 2018 at 2:58:56 AM UTC+8, Mike Bayer wrote:
> good news, I reproduced it.
>
> Even better news for you, vindication!  nothing to do with greenlets,
> threads, concurrency, or anything (though it doesn't have to do with
> the "del" or the weakref; this is much simpler).
>
> This has to do with a special rule in the pool when there is a
> so-called "reset agent" involved that cleans up a pooled connection
> while taking into account the fact that the enclosing Connection had a
> transaction in progress.  That is introducing the
> _handle_error() -> invalidate() routine *inside* the normal
> _finalize_fairy routine, and that is why we see _finalize_fairy twice.
> The _finalize_fairy that seems to start in the middle of the operation
> is called first; then the _finalize_fairy we see failing is inside of
> that.  So they aren't sibling operations, they are nested.
>
> https://bitbucket.org/zzzeek/sqlalchemy/issues/4252/double-checkin-due-to-handle_error-called
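A stand-in sketch (not SQLAlchemy's actual code) of the nesting described in that reply: a failed reset triggers the error handling, and hence an invalidate and a first checkin, while the outer finalize is still on the stack, so the two operations are nested rather than siblings:

    # Schematic only: stand-in functions named after the real ones.
    class Record:
        def invalidate(self, e=None):
            print("inner path: _handle_error() -> invalidate() -> checkin #1")

    def reset(record):
        # The reset agent tries to roll back the enclosing transaction.
        raise RuntimeError("connection already closed by the server")

    def finalize_fairy(record):
        print("outer _finalize_fairy enters")
        try:
            reset(record)
        except Exception as e:
            record.invalidate(e=e)  # nested inside the outer finalize
        print("outer _finalize_fairy resumes: checkin #2 of the same record")

    finalize_fairy(Record())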
On Tue, May 15, 2018 at 1:26 PM, Mike Bayer <cla...@zzzcomputing.com> wrote:
> can you add these lines to your logging, ensure that the check for
> "already in pool" is present, and produce new logs?
>
>     try:
>         fairy = fairy or _ConnectionFairy(
>             connection, connection_record, echo)
>         assert fairy.connection is connection
>         pool.logger.debug(
>             "ABOUT TO RESET CONNECTION %s CONNECTION REC %s FAIRY %s",
>             connection, connection_record, fairy)
>         fairy._reset(pool)
>         pool.logger.debug(
>             "COMPLETED RESET CONNECTION %s CONNECTION REC %s FAIRY %s",
>             connection, connection_record, fairy)
>
>         # Immediately close detached instances
>         if not connection_record:
>             pool._close_connection(connection)
>     except BaseException as e:
>         pool.logger.debug(
>             "EXCEPTION RAISED IN RESET CONNECTION %s CONNECTION REC %s "
>             "FAIRY %s: %s", connection, connection_record, fairy, e)
>         pool.logger.error(
>             "Exception during reset or similar", exc_info=True)
>         if connection_record:
>             pool.logger.debug("jjs going to connection_record.invalidate")
>             connection_record.invalidate(e=e)
>         if not isinstance(e, Exception):
>             pool.logger.debug("jjs e %s is not Exception, raise again", e)
>             raise
>
> my current theory is that a greenlet is being blocked in _reset(),
> then waking up later but being corrupted at the greenlet level in
> some way.

On Tue, May 15, 2018 at 12:15 PM, Mike Bayer <cla...@zzzcomputing.com> wrote:
> On Tue, May 15, 2018, 3:26 AM <jiajun...@gmail.com> wrote:
> > We got the log "jjs _finalize_fairy args" just once, but the code in
> > the function _finalize_fairy seems to be called twice.  It also
> > confused me.
>
> It's because that code originated elsewhere and got blocked; then it
> picks up again, but the call stack is somehow corrupted (or logging
> isn't telling the truth).
>
> Can you please give me:
>
> - the exact version of eventlet in use
> - the exact version of psycopg2 and psycogreen
> - the exact Python interpreter version
> - the exact version of SQLAlchemy
> - the exact version of oslo.db
>
> Thanks
>
> > BTW, I tried to add code[1] to avoid the double checkin in
> > _finalize_fairy, but failed.  That means the code in pool.py#L670-692
> > was called without L646-668.
> >
> > [1] https://github.com/jiajunsu/test_logs/blob/master/pool.py#L659

On Tuesday, May 15, 2018 at 12:41:17 PM UTC+8, Mike Bayer wrote:
> On Mon, May 14, 2018 at 9:38 PM, <jiajun...@gmail.com> wrote:
> > The tid is the coroutine id; we get it by id(greenlet.getcurrent())
> > in each coroutine (green thread).
> >
> > We run nova-conductor with monkey_patch, the same as [1]:
> > eventlet.monkey_patch(os=False)
> >
> > Our nova-conductor is running with multiple workers.
> >
> > It's confusing why _finalize_fairy is called twice but only logs once
> > at the entrance; that's why I guess it's called by the weakref's
> > callback.
> >
> > [1] https://github.com/openstack/nova/blob/master/nova/cmd/__init__.py
>
> For 130446032, which is the tid you mentioned, _finalize_fairy is only
> called once in your log:
>
>     $ cat nc_5679.log | grep -e "130446032.*jjs _finalize"
>     2018-05-14T11:11:39.955+08:00 localhost nova-conductor DEBUG [pid:5679] [MainThread] [tid:130446032] [log.py:104 debug] [req-841bb432-7cd2-4fa3-9def-559a34b05a21] jjs _finalize_fairy args conn None, rec.conn None, rec <sqlalchemy.pool._ConnectionRecord object at 0x70cdb10>, ref None, fairy <sqlalchemy.pool._ConnectionFairy object at 0x8a87390> debug
>
> Also, the log shows that no _finalize_fairy is ever called due to
> "del" - this would show the weakref "ref" as non-None:
>
>     $ cat nc_5679.log | grep -e "FINALIZE FAIRY.*REF:.*" | grep -v "REF: None"
>     $
>
> I assume that by "_finalize_fairy called twice" you're referring to the
> fact that you see the "Exception during reset or similar" message.  But
> that message is still, AFAICT, impossible to emit without the function
> being entered, and without a corresponding "FINALIZE FAIRY" log line
> preceding it, unless something is going wrong with the logging or the
> interpreter is somehow not functioning correctly (I wonder if eventlet
> has had any such issues).  I'm struggling to see any other possibility
> but not coming up with any.
>
> One weird thing I'm trying to figure out: I see a connection being
> returned to the pool normally, but then nothing is logged about it
> after that:
>
>     2018-05-14T11:11:21.183+08:00 localhost nova-conductor DEBUG [pid:5679] [MainThread] [tid:130446352] [log.py:104 debug] [req-19a57f61-bc64-4575-8253-00b5bfe62a74] Connection <connection object at 0x7f419c0; closed: 0> being returned to pool, connection_record.connection <connection object at 0x7f419c0; closed: 0> debug
>
> The next log message from tid:130446352 should be "jjs conn reset", but
> that never happens.  The next we see of that tid is:
>
>     2018-05-14T11:11:21.549+08:00 localhost nova-conductor DEBUG [pid:5679] [MainThread] [tid:130446352] [log.py:104 debug] [req-0ef61146-5cb6-4f46-9788-406b6d376843] jjs _ConnectionFairy._checkout fariy None debug
>     2018-05-14T11:11:21.549+08:00 localhost nova-conductor DEBUG [pid:5679] [MainThread] [tid:130446352] [log.py:104 debug] [req-0ef61146-5cb6-4f46-9788-406b6d376843] jjs before pool._do_get pool queue deque([<sqlalchemy.pool._ConnectionRecord object at 0x7777e10>, <sqlalchemy.pool._ConnectionRecord object at 0x75fb7d0>, <sqlalchemy.pool._ConnectionRecord object at 0x778d810>, <sqlalchemy.pool._ConnectionRecord object at 0x7600610>, <sqlalchemy.pool._ConnectionRecord object at 0x6ea7e10>, <sqlalchemy.pool._ConnectionRecord object at 0x7798050>, <sqlalchemy.pool._ConnectionRecord object at 0x796d110>, <sqlalchemy.pool._ConnectionRecord object at 0x7798d50>, <sqlalchemy.pool._ConnectionRecord object at 0x75f5710>]) debug
>
> That is, it's checking out a new connection?  If it were blocking on
> trying to reset the connection, we wouldn't see that tid again.  I
> wonder if the greenlet was interrupted or something.  I'm finding
> various impossibilities like this going on and am not able to come up
> with any theory other than that the interpreter is somehow not
> functioning correctly, or the logging here is somehow incomplete.
>
> I notice you're on a very old version of SQLAlchemy (three years old).
> You really should be on at least the 1.1 series, because for greenlets
> you need to have:
>
> http://docs.sqlalchemy.org/en/latest/changelog/migration_11.html#change-3803
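For reference, a minimal sketch of how a "tid" like the ones in these logs can be produced, mirroring the id(greenlet.getcurrent()) approach and the eventlet.monkey_patch(os=False) call described above; the logging filter itself is illustrative, not the poster's actual code:

    import eventlet
    eventlet.monkey_patch(os=False)  # as nova/cmd/__init__.py does

    import greenlet
    import logging

    class GreenletIdFilter(logging.Filter):
        # Attach the id of the current greenlet to every record so log
        # lines can be grouped per coroutine, like the "tid" above.
        def filter(self, record):
            record.tid = id(greenlet.getcurrent())
            return True

    logging.basicConfig(format="[tid:%(tid)s] %(message)s", level=logging.DEBUG)
    log = logging.getLogger("pool")
    log.addFilter(GreenletIdFilter())
    log.debug("connection checked out")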
On Monday, May 14, 2018 at 11:59:43 PM UTC+8, Mike Bayer wrote:
> I notice in this log file there are multiple "tid"s, which I assume to
> be thread identifiers, e.g. OS-level threads?  Are you using eventlet
> or not?  Are you monkeypatching SQLAlchemy and then using multiple
> OS-level threads at the same time?  Because that likely doesn't work.
>
> Particularly, the log here illustrates an impossible code path.
>
> We enter _finalize_fairy with connection as None:
>
>     2018-05-14T11:11:39.961+08:00 localhost nova-conductor DEBUG [pid:5679] [MainThread] [tid:130446352] [log.py:104 debug] [req-74ef6531-2d92-40ea-a524-f54e11edab06] FINALIZE FAIRY: DB CONN: None, FAIRY: <sqlalchemy.pool._ConnectionFairy object at 0x87dd3d0>, REF: None CONNECTION_REC: <sqlalchemy.pool._ConnectionRecord object at 0x776fe90> CONNECTION_REC REF <weakref at 0x7c5cdb8; to '_ConnectionFairy' at 0x87dd3d0> debug
>
> This seems to make it through to the checkin at the bottom.
>
> Then, we see this:
>
>     2018-05-14T11:11:39.968+08:00 localhost nova-conductor ERROR [pid:5679] [MainThread] [tid:130446032] [log.py:122 error] [req-841bb432-7cd2-4fa3-9def-559a34b05a21] Exception during reset or similar
>
> The only place that message is generated is *in* _finalize_fairy,
> *before* the checkin occurs, and *only* if _finalize_fairy was called
> *with* a connection that is non-None.  There is no other call to
> _finalize_fairy with that request identifier, therefore I see no
> evidence of where this "Exception during reset" message can be
> generated from - there would at least need to be an additional log line
> with "FINALIZE FAIRY" and a non-None connection for this to be reached.
> I see no way that the pool.py shipped with SQLAlchemy can possibly
> generate this series of steps.
>
> This is an extremely intricate issue and I still have only the vaguest
> notion of how you are running SQLAlchemy / Nova.
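To make the control flow being argued here concrete, the following is an abridged reconstruction of the 1.0-era pool._finalize_fairy, pieced together from the snippets quoted in this thread rather than copied from any particular release, so treat it as a sketch: the error message can only be emitted on the non-None-connection path, before the checkin at the bottom:

    def _finalize_fairy(connection, connection_record, pool, ref, echo, fairy=None):
        # Abridged; see the logging patch quoted earlier in the thread
        # for the real reset/error path.
        _refs.discard(connection_record)

        if connection is not None:  # only this path can log the error
            try:
                fairy = fairy or _ConnectionFairy(
                    connection, connection_record, echo)
                fairy._reset(pool)
                if not connection_record:
                    pool._close_connection(connection)
            except BaseException as e:
                pool.logger.error(
                    "Exception during reset or similar", exc_info=True)
                if connection_record:
                    connection_record.invalidate(e=e)
                if not isinstance(e, Exception):
                    raise

        if connection_record:
            connection_record.checkin()  # "the checkin at the bottom"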
On Mon, May 14, 2018 at 11:10 AM, Mike Bayer <mik...@zzzcomputing.com> wrote:
> OK, I need the exact pool.py you are using as well, including all of
> these logging statements and any other adjustments that were made.

On Mon, May 14, 2018 at 8:02 AM, <jiajun...@gmail.com> wrote:
> I've uploaded the logs here:
>
> https://raw.githubusercontent.com/jiajunsu/test_logs/master/nc_5679.log
>
> The ConnectionRecord is 0x70cdb10; the tid (coroutine id) is 130446032.
>
> At 2018-05-14T11:11:39.970 we got the log "jjs rec already in pool
> checkin", but I had just deleted the `return` from the code, so the
> problem happened again.

On Monday, May 14, 2018 at 10:48:49 AM UTC+8, Mike Bayer wrote:
> as well as the log that shows this:
>
>     + if self in pool._pool.queue:
>     +     pool.logger.debug("jjs rec already in pool checkin, skip %r", self)
>     +     return
>
> actually happening.

On Sun, May 13, 2018 at 10:42 PM, Mike Bayer <mik...@zzzcomputing.com> wrote:
> On Sun, May 13, 2018 at 10:35 PM, <jiajun...@gmail.com> wrote:
> > "jjs" is short for my name, and "cjr" is my colleague's (JinRong Cai).
> > We work together and add logs in the same environment, adding a name
> > prefix as a keyword for log filtering.
> >
> > The doubled ConnectionRecords (with None connection) are got by two
> > different coroutines, and the variable record.connection is rewritten
> > by the second coroutine.  After the record is put back into the
> > QueuePool again, there are two identical records with the same
> > connection in the pool, which causes the next two coroutines to
> > conflict and deadlock in psycopg2 as above.
>
> OK, I need the detailed step-by-step that shows how the two coroutines
> are interacting such that they both return the same connection.  The
> nova log above seems to show just one coroutine, I assume, since there
> is just one nova request ID, if I understand correctly.
>
> > The logs of psycopg2 have been pasted at
> > https://github.com/psycopg/psycopg2/issues/703
>
> The above issue says the bug has been found in SQLAlchemy, but there is
> no bug report yet, and I've not yet been shown how the behavior here is
> possible.  I posted a straightforward example above; can you please
> show me where the two coroutines get mixed up?
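A toy illustration (a plain queue, not SQLAlchemy) of the failure mode being described: once the same record object has been checked in twice, two checkouts hand the same DBAPI connection to two coroutines, which then interleave traffic on one psycopg2 connection:

    from queue import Queue

    class Record:
        # Toy stand-in for _ConnectionRecord.
        def __init__(self, connection):
            self.connection = connection

    pool = Queue()
    rec = Record(connection="<one psycopg2 connection>")

    pool.put(rec)  # first checkin
    pool.put(rec)  # erroneous second checkin of the same record

    a = pool.get()  # coroutine A checks out
    b = pool.get()  # coroutine B checks out
    assert a is b and a.connection is b.connection  # shared connection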
On Friday, May 11, 2018 at 3:05:46 AM UTC+8, Mike Bayer wrote:
> I tried to follow your log lines, which is difficult because I don't
> know where those log statements actually are, and I don't know what
> acronyms like "jjs" and "cjr" mean.  But it does look like it's all in
> one thread (so not async GC) and all in one Nova request (so perhaps
> just one greenlet involved, unless other greenlets are affecting this
> without being displayed in this log snippet).  I do see the connection
> record doubled at the end.  So here is the illustration of those steps
> in isolation; the connection record is not doubled.  Can you alter this
> program to show this condition occurring?
>     from sqlalchemy import create_engine
>
>     e = create_engine("postgresql://scott:tiger@localhost/test",
>                       echo_pool='debug')
>     e.connect().close()
>
>     assert len(e.pool._pool.queue) == 1
>
>     c1 = e.connect()
>
>     assert len(e.pool._pool.queue) == 0
>
>     c1.engine.pool._invalidate(c1._Connection__connection, None)
>     c1.invalidate(e)
>
>     # connection record is back
>     assert len(e.pool._pool.queue) == 1
>
>     # do a reconnect
>     c1.connection
>
>     # uses the record again
>     assert len(e.pool._pool.queue) == 0
>
>     # close the Connection
>     c1.close()
>
>     # record is back
>     assert len(e.pool._pool.queue) == 1
>
>     # etc.
>     c1 = e.connect()
>     assert len(e.pool._pool.queue) == 0
>
>     c1.close()
>     assert len(e.pool._pool.queue) == 1

On Thu, May 10, 2018 at 12:25 PM, Mike Bayer <mik...@zzzcomputing.com> wrote:
> Here's a print statement you might want to try:
>
>     diff --git a/lib/sqlalchemy/pool.py b/lib/sqlalchemy/pool.py
>     index 89a4cea7c..452db4883 100644
>     --- a/lib/sqlalchemy/pool.py
>     +++ b/lib/sqlalchemy/pool.py
>     @@ -689,6 +689,19 @@ def _finalize_fairy(connection, connection_record,
>              been garbage collected.
>
>              """
>     +        print(
>     +            "FINALIZE FAIRY: GREENLET ID: %s, DB CONN: %s, FAIRY: %s, "
>     +            "REF: %s CONNECTION_REC: %s CONNECTION_REC REF %s" % (
>     +                "put greenlet id here",
>     +                connection,
>     +                fairy,
>     +                ref,
>     +                connection_record,
>     +                connection_record.fairy_ref
>     +                if connection_record is not None else "n/a"
>     +            )
>     +        )
>     +
>             _refs.discard(connection_record)
>
>             if ref is not None:

On Thu, May 10, 2018 at 12:21 PM, Mike Bayer <mik...@zzzcomputing.com> wrote:
> On Thu, May 10, 2018 at 7:23 AM, <jiajun...@gmail.com> wrote:
> > Hi Mike,
> >
> > I added more logs and finally found the reason, and a method to avoid
> > the deadlock between coroutines.
> > In summary, the deadlock happened because two different coroutines
> > used the same ConnectionRecord with the same connection.  It happens
> > under QueuePool, when the DB connections are reset by the DB server
> > (not related to the backend driver).
> >
> > 1. A coroutine gets a ConnectionRecord into a ConnectionFairy, and
> >    creates a weakref to the ConnectionFairy with the callback function
> >    `_finalize_fairy`.
> > 2. When the connection gets disconnected, it receives an exception
> >    (psycopg2.OperationalError) and goes to
> >    Connection._handle_dbapi_exception.
> > 3. In `_handle_dbapi_exception`'s `finally`, it calls both
> >    `ConnectionFairy.invalidate` and `self.invalidate`.
> > 4. In `ConnectionFairy.invalidate`, `_ConnectionRecord._checkin` is
> >    called, which puts the ConnectionRecord back into the QueuePool.
> > 5. In `self.invalidate`, `del self._root.__connection` is called,
> >    which **del**s the ConnectionFairy.
> > 6. The **del** operation kills the weakref to the ConnectionFairy,
> >    and the callback `_finalize_fairy` is executed.
> > 7. In `_finalize_fairy`, another `_ConnectionRecord._checkin` is
> >    called, putting the ConnectionRecord into the QueuePool again.
>
> Here's what does not make sense.  You are claiming that a simple
> invalidation will result in the same ConnectionRecord being checked in
> twice.  This is obviously not the case; what is special in your
> situation that makes this happen?
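A runnable toy model (stand-in classes, not SQLAlchemy's actual code) of steps 4 through 7 above, showing how the weakref callback produces the second checkin of the same record:

    import weakref
    from queue import Queue

    pool_queue = Queue()

    class Record:
        # Toy _ConnectionRecord: checkin puts the record into the pool.
        def checkin(self):
            pool_queue.put(self)

    class Fairy:
        # Toy _ConnectionFairy holding a record.
        def __init__(self, record):
            self.record = record

    rec = Record()
    fairy = Fairy(rec)

    # Step 1: a weakref to the fairy whose callback checks the record in,
    # as the pool does with _finalize_fairy.
    ref = weakref.ref(fairy, lambda _ref: rec.checkin())  # steps 6-7

    fairy.record.checkin()  # step 4: first checkin, via invalidate
    del fairy               # step 5: last reference dropped; callback fires

    assert pool_queue.qsize() == 2  # the same record is now queued twice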