On 10/12/2016 05:49 PM, Alfred Perlstein wrote:
Thank you, in fact thank you very, very much. Writing out the code was
really above and beyond my expectations. I will use this snippet in our
The following problems remain for us (and also likely other users of
sqla) although they are not immediate issues:
1) I need to make sure that I can use "threading.local", since we are
based on flask we actually use werkzeug's threads, which I now have to
investigate if they are compatible with this snippet. Likewise anyone
using greenlets may have to modify the snippet to fit with greenlets.
That said, I do believe that porting this snippet to werkzeug's thread
primitives should insulate us from greenlet/native-threads problems, but
then those folks not using werkzeug don't have that option. This is why
overall it appears problematic to me.... however I've not confirmed
this yet, it's possible that "threading.local" works for most cases
(greenlets and werkzeug) but... I have to test.
threading.local() should work with greenlets if you're doing global
monkeypatching. Otherwise there should be a similar construct in
gevent that does this, or more simply don't use any kind of "local()"
object, just check a dictionary for greenlet or thread id in the filter.
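As a rough illustration of the "no local() object" variant, here is a minimal sketch of a filter that looks up the current thread id in a shared collection. The names ThreadIdFilter, ListHandler and the "demo.thread_filter" logger are hypothetical and only for this example; a set is used in place of a dictionary, and a greenlet-based setup would presumably key on the current greenlet instead of threading.get_ident() (an assumption, not tested here).

```python
import logging
import threading


class ThreadIdFilter(logging.Filter):
    """Hypothetical filter: pass records only from threads that opted in."""

    def __init__(self):
        super().__init__()
        self._enabled = set()          # ids of threads that want log output
        self._lock = threading.Lock()  # guards changes to the set

    def enable_current(self):
        with self._lock:
            self._enabled.add(threading.get_ident())

    def disable_current(self):
        with self._lock:
            self._enabled.discard(threading.get_ident())

    def filter(self, record):
        # Assumption: with greenlets, the lookup key would be the current
        # greenlet rather than the thread id.
        return threading.get_ident() in self._enabled


class ListHandler(logging.Handler):
    """Collects rendered messages in a list so the demo can inspect them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def demo():
    logger = logging.getLogger("demo.thread_filter")  # stand-in logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False

    flt = ThreadIdFilter()
    handler = ListHandler()
    handler.addFilter(flt)
    logger.addHandler(handler)

    def worker(name, enable):
        if enable:
            flt.enable_current()
        try:
            logger.info("statement from %s", name)
        finally:
            # Always clear the id, since thread ids can be reused.
            flt.disable_current()

    threads = [
        threading.Thread(target=worker, args=("t1", True)),
        threading.Thread(target=worker, args=("t2", False)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    logger.removeHandler(handler)
    return handler.messages
```

Only the thread that called enable_current() gets its record through; the other thread's record is dropped at the handler before any formatting happens.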
2) It looks to me like by using the engine's logger we would be
turning on logging across all threads. What happens when we have 1000
threads? What will happen to us performance wise?
So I've looked in detail at how echo works and it is true that the
InstanceLogger does not make use of a global log level and instead calls
upon the ._log() method of the logger, doing the level calculation
itself. If the Connection had this logger directly, none of the other
Connection objects would see that they need to send records to the
However, all the records passed to the log methods are passed as fixed
objects with little to no overhead, and would be blocked by the filter
before being processed. The logging calls make sure to defer all string
interpolation and __repr__ calls of objects until they are rendered by a
formatter which would not take place with the filter blocking the
records from getting there.
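The deferral described above can be checked with the standard library alone. The following is a small sketch, not SQLAlchemy code: Expensive, BlockAll and the "demo.deferred" logger names are made up for the example. It counts how often __repr__ runs when a record is blocked by a filter versus when it reaches a formatter.

```python
import io
import logging


class Expensive:
    """Hypothetical object whose __repr__ we want to avoid calling."""

    repr_calls = 0

    def __repr__(self):
        Expensive.repr_calls += 1
        return "<Expensive>"


class BlockAll(logging.Filter):
    """Rejects every record, standing in for a filter that blocks most of them."""

    def filter(self, record):
        return False


def run(blocked):
    Expensive.repr_calls = 0
    logger = logging.getLogger("demo.deferred.%s" % blocked)
    logger.setLevel(logging.INFO)
    logger.propagate = False

    handler = logging.StreamHandler(io.StringIO())
    if blocked:
        handler.addFilter(BlockAll())
    logger.addHandler(handler)

    # The object is passed as an argument, not pre-formatted into the
    # message, so %r is applied only if a formatter renders the record.
    logger.info("params: %r", Expensive())

    logger.removeHandler(handler)
    return Expensive.repr_calls
```

Calling run(blocked=True) leaves the repr counter at zero, while run(blocked=False) triggers exactly one __repr__ call when the handler formats the record. The remaining cost in the blocked case is creating the LogRecord and calling the filter.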
The whole way that Connection even checks these flags is not really how
logging was meant to be used, you're supposed to just send messages to
the log objects and let the handlers and filters work it out. The
flags are just to eke out that tiny bit extra reduction in method calls,
but these wouldn't have much of an effect even on thousands of threads.
But it's true, it isn't zero either.
> So as a question, if I ever get to it, would you entertain patches to do
> this without turning on global logging and/or being tied to the
> threading implementation?
So, to keep it zero, the event approach would be one way to go, since
events can be added to a Connection object directly (that was a big
change made some years ago, used to be just Engine level). Adding a
new attribute connection.logger that defaults to self.engine.logger
would allow InstanceLogger to be local to a connection but still the
_echo flag would need to somehow work into the log system; this is of
course possible, however I'm trying to keep everything to do with how
logging / echo works in one place, and this would incur some kind of
two-level system of "echo" between Engine and Connection that would need
a lot of tests. If Connection has its own logger then we'd think that
each Connection should be able to have independent logging names like
Engine does, etc. It would not be a 2 line pull request.
I also worry that the precedent being set would be that anytime someone
needs to do unusual things with logging, they are going to want to add
new complexity to the ".echo" flag rather than going through the normal
logging system which IMO is extremely flexible and should be trusted to
scale up as well as everything else.
Of course if turning on logging.INFO and adding the filter that blocks
99% of all log messages does prove to add some significant performance
impact, that changes everything and we'd have to decide that Python
logging does need to be worked around at scale. But Python logging is
very widely used deep inside many networking related systems without
much performance impact being noticed.
btw, if you're wondering where I'm coming from with these insane scaling
questions.... I used to be CTO of OKCupid and scaled them, now at a new
place, so these things matter to me and my team.
the "1000 greenlets" model is one I'm familiar with in Openstack (in
that they've used that setting, but I found that that number of
greenlets was never utilized for real). Are your servers truly using
1000 database connections in a single process?
thanks again Mike and apologies for the tone of my original email!
we can all get along, no worries.