On Friday 06 December 2013 09:45 PM, Michael Bayer wrote:
> [...]
> OK, I see this is with gevent - while I like the idea of gevent, I'm
> not deeply familiar with best practices for it. The QueuePool
> specifically uses thread-based locks to achieve its work. I can't
> comment on what modifications might be needed for it to work with
> gevent's model, but overall I'd suggest an entirely different pool
> implementation optimized for gevent. When I spent some time trying
> out gevent I noticed that QueuePool might have been having problems,
> and this is not surprising.
> For starters, I'd probably use NullPool with a gevent-based
> application, if there are in fact gevent-specific issues occurring.
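For anyone following along, switching to NullPool is a one-line change when creating the engine. A minimal sketch (the sqlite URL here is just a stand-in for a real DSN):

```python
# NullPool opens a fresh DBAPI connection per checkout and closes it on
# release, so none of QueuePool's lock-based bookkeeping is involved.
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

engine = create_engine("sqlite://", poolclass=NullPool)
with engine.connect() as conn:
    pass  # on close, the DBAPI connection is really closed, not pooled
```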
The threading API is transparently replaced with gevent's own
lightweight threading implementation via monkey patching, and this
includes the lock implementation. After monkey patching, a library like
SQLAlchemy will unknowingly spawn gthreads instead of threads, and will
unknowingly use gthread locks instead of regular thread locks. Where in
the traditional model a lock blocks a thread while other threads
continue to run, a gthread lock suspends the current gthread and
returns control to the event loop, which carries on processing and
running other events/gthreads.
This all usually works fine except in rare situations, and as far as I
can see there is nothing in SQLAlchemy/QueuePool that would prevent
this from working properly. I am happy to report that I have been
running gevent/SQLAlchemy/QueuePool for quite some time in a highly
available setup with ~2-3k QPS of database load.
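To make the above concrete, here is a minimal sketch of what the monkey patching does (the comments describe gevent's documented behaviour; no SQLAlchemy internals involved):

```python
# After patch_all(), names in the _thread/threading modules point at
# gevent's cooperative implementations, so a library like SQLAlchemy
# picks up gthread locks without knowing it.
from gevent import monkey
monkey.patch_all()  # must run before anything else imports threading

import threading

lock = threading.Lock()
# "lock" is now gevent's semaphore-based lock: acquiring it when
# contended yields to the event loop instead of blocking the OS
# thread, so other gthreads keep running.
assert "gevent" in type(lock).__module__
```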
As noted by you later, the problem at hand has nothing to do with gevent
and would occur in a traditional threading model too. Sorry to have
introduced gevent confusion, but I felt obliged to mention it for the
purpose of a full report.
[...]
>> Changeset 5f0a7bb cleaned up this code but does not seem to have
>> changed the flow (behaviour should be the same on trunk). Since
>> disabling the overflow with max_overflow = -1 does not use the lock
>> at all, this behaviour is possibly an oversight rather than intended
>> behaviour.
> Noting that I haven’t deeply gotten into this code at the moment,
> overall I’m confused about “the application became incapable of
> serving requests” - if the QueuePool serves out as many connections
> as it’s supposed to, it’s supposed to block all callers at that point.
> If you set max_overflow to -1, then there is no overflow_lock present
> at all; it’s set to None in the constructor. Otherwise, blocking on
> the call is what it’s supposed to do, in a traditionally threaded
> application. If when using gevent this means that other workers are
> blocked because the whole thing expects any kind of waiting to be
> handled “async style”, then that suggests we need a totally different
> approach for gevent.
>> Since the overflow lock seems to exist only to maintain the
>> overflow count, I suggest that we increment the counter *before* the
>> connection attempt, not hold the lock during the connection attempt,
>> and then decrement the counter in case of an error. If there is
>> interest in doing this, I shall find time for a patch and possibly a
>> test case.
> How would that work with a traditionally threaded application? My
> program goes to get a connection, the QueuePool says there’s none
> available yet and I should wait, then the call returns with… what,
> if it isn’t waiting? I apologize that I have only a fuzzy view of
> how things work with gevent, and at this time of the morning I’m
> probably not engaging the traditional threading model in my head so
> well either.
As you predicted in your later mail, this problem did in fact occur
well before the pool size was reached.
Pool limit = 128 + 10 overflow
Checked out connections at the time of the problem = 27
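The scheme I suggested above could look roughly like the following. This is only a sketch with hypothetical names, not QueuePool's actual internals: the lock is held only for the counter bookkeeping, never across the (potentially slow) connection attempt, and a failed attempt rolls the counter back.

```python
import threading

class OverflowCounter:
    """Hypothetical sketch: overflow bookkeeping under the lock,
    connection attempt outside it."""

    def __init__(self, max_overflow):
        self._lock = threading.Lock()
        self._max = max_overflow
        self.overflow = 0

    def checkout(self, connect):
        with self._lock:                 # held only for the counter
            if self.overflow >= self._max:
                raise RuntimeError("overflow limit reached")
            self.overflow += 1
        try:
            return connect()             # no lock held while connecting
        except Exception:
            with self._lock:
                self.overflow -= 1       # roll back on failure
            raise
```

With this shape, a slow or failing connect never makes other callers wait on the lock; they either get their own overflow slot or fail fast.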
--
Sunil
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.