Hello,
I have faced a problem in my production server (gevent based); when a
connection attempt is made and MySQL server does not respond (due to listen
backlog full), the whole application hangs. This seems to be because
SQLAlchemy QueuePool does not allow multiple connection attempts
simultaneously. It is waiting for overflow count lock. I suggest that we
allow multiple connection attempts at the same time as I don't see any side
effects of doing so. Details follow.
Details of the problem:
python-sqlalchemy 0.7.4-1
python 2.7.3-0ubuntu2
python-gevent 1.0~b1-1
mysql-server 5.5.34-0ubuntu0.12.0
I opened a gevent backdoor connection to the hung server and created a test
method and ran it.
def test():
import pdb
pdb.set_trace()
import sqlalchemy
import mysql.connector
p = sqlalchemy.pool.manage(mysql.connector, pool_size=128)
p.connect(host='myhost', port=3306, user='myuser', password='mypassword',
buffered=True)
It ran with following trace (excerpt):
(Pdb) step
--Call--
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(931)connect()
-> def connect(self, *args, **kw):
[...]
(Pdb)
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(210)connect()
-> return _ConnectionFairy(self).checkout()
[...]
(Pdb)
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(371)__init__()
-> rec = self._connection_record = pool._do_get()
[...]
(Pdb)
--Call--
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(671)_do_get()
-> def _do_get(self):
[...]
(Pdb)
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(676)_do_get()
-> except sqla_queue.Empty:
(Pdb)
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(677)_do_get()
-> if self._max_overflow > -1 and \
(Pdb)
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(678)_do_get()
-> self._overflow >= self._max_overflow:
(Pdb)
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(687)_do_get()
-> if self._overflow_lock is not None:
(Pdb)
> /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py(688)_do_get()
-> self._overflow_lock.acquire()
(Pdb)
> /usr/lib/pymodules/python2.7/gevent/coros.py(98)acquire()
-> def acquire(self, blocking=True, timeout=None):
Analysis:
Before making a connection attempt the overflow counter lock is obtained
and it is being released only after the connection either succeeds or
fails. In my case, a connection remained hung possibly because of a surge
in new DB connections and SYN backlog overflew on the database server (I
have since added a timeout and tuned my database server to have much higher
backlog). While this connection didn't respond, any new connection attempt
as seen in the above trace waited trying to acquire overflow lock. The
whole application became in capable of serving requests. Cause is this code:
class QueuePool(Pool):
def _do_get(self):
[...]
if self._overflow_lock is not None:
self._overflow_lock.acquire()
if self._max_overflow > -1 and \
self._overflow >= self._max_overflow:
if self._overflow_lock is not None:
self._overflow_lock.release()
return self._do_get()
try:
con = self._create_connection()
self._overflow += 1
finally:
if self._overflow_lock is not None:
self._overflow_lock.release()
return con
Changeset 5f0a7bb cleaned up this code but does not seem to have changed
the flow (behaviour should be the same on trunk). Since disabling the
overflow with max_overflow = -1 does not use lock at all, this behaviour is
possibly an oversight rather than intended behaviour.
Possible solution:
Since the overflow lock seems to be to only maintain overflow count, I
suggest that we increment the counter *before* connection attempt, don't
hold the lock during connection attempt and then decrement the counter in
case of an error. If there is interest in doing this, I shall find time for
a patch and possibly a test case.
Thank you,
--
Sunil Mohan Adapa
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.