On 10/12/2016 05:49 PM, Alfred Perlstein wrote:


Mike,

Thank you, in fact thank you very, very much.   Writing out the code was
really above and beyond my expectations.  I will use this snippet in our
app.

The following problems remain for us (and also likely other users of
sqla) although they are not immediate issues:

1) I need to make sure that I can use "threading.local"; since we are
based on flask we actually use werkzeug's threads, so I now have to
investigate whether they are compatible with this snippet.  Likewise
anyone using greenlets may have to modify the snippet to fit with
greenlets.  That said, I do believe that porting this snippet to
werkzeug's thread primitives should insulate us from
greenlet/native-threads problems, but then those folks not using
werkzeug don't have that option.  This is why overall it appears
problematic to me.... however I've not confirmed this yet; it's
possible that "threading.local" works for most cases (greenlets and
werkzeug) but... I have to test.

threading.local() should work with greenlets if you're doing global monkeypatching. Otherwise there should be a similar construct in gevent that does this, or, more simply, don't use any kind of "local()" object at all; just check a dictionary for the greenlet or thread id in the filter.
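
To make that concrete, here's a minimal sketch of the "check a dictionary in the filter" idea - the names (EchoThreadFilter, echo_thread_ids, echo_on / echo_off) are made up for illustration, and for greenlets you'd key on id(gevent.getcurrent()) rather than the thread id:

import logging
import threading

echo_thread_ids = set()      # threads that have opted in to SQL echo
_echo_lock = threading.Lock()

class EchoThreadFilter(logging.Filter):
    def filter(self, record):
        # reject the record unless the current thread opted in
        return threading.current_thread().ident in echo_thread_ids

def echo_on():
    with _echo_lock:
        echo_thread_ids.add(threading.current_thread().ident)

def echo_off():
    with _echo_lock:
        echo_thread_ids.discard(threading.current_thread().ident)

# attach the filter to the handler so it applies to records coming
# from the child "sqlalchemy.engine.*" loggers as well
handler = logging.StreamHandler()
handler.addFilter(EchoThreadFilter())
sqla_log = logging.getLogger("sqlalchemy.engine")
sqla_log.addHandler(handler)
sqla_log.setLevel(logging.INFO)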


2) It looks to me like by using the engine's logger we would be
turning on logging across all threads.  What happens when we have 1000
threads?  What will happen to us performance-wise?

So I've looked in detail at how echo works, and it is true that InstanceLogger does not make use of a global log level; instead it calls upon the ._log() method of the logger, doing the level calculation itself. If the Connection had this logger directly, none of the other Connection objects would see that they need to send records to the logging calls.

However, the arguments passed to the log methods are fixed objects with little to no overhead, and the records would be blocked by the filter before being processed. The logging calls defer all string interpolation and __repr__ calls of objects until a formatter renders them, which would not take place with the filter blocking the records from getting there.
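
Here's a quick standalone illustration of that deferral, using plain stdlib logging and a made-up Expensive class, nothing SQLAlchemy-specific:

import logging

class Expensive(object):
    def __repr__(self):
        # only runs if a handler actually formats the record
        print("expensive __repr__ called")
        return "<Expensive>"

log = logging.getLogger("demo")
log.addHandler(logging.StreamHandler())
log.setLevel(logging.INFO)

class BlockAll(logging.Filter):
    def filter(self, record):
        return False

log.addFilter(BlockAll())

# the filter rejects the record before any formatter runs, so the %r
# interpolation and Expensive.__repr__ never happen
log.info("value is %r", Expensive())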

The whole way that Connection even checks these flags is not really how logging was meant to be used; you're supposed to just send messages to the log objects and let the handlers and filters work it out. The flags are just to eke out a tiny bit of extra reduction in method calls, and that wouldn't have much of an effect even on thousands of threads. But it's true, it isn't zero either.
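
The generic pattern those flags follow looks roughly like this (not SQLAlchemy's actual code, just an illustration of caching an "is enabled" check to skip the log call entirely):

import logging

log = logging.getLogger("demo")

# compute the flag once rather than having log.info() re-check
# isEnabledFor() on every statement; the tradeoff is that a later
# change to the log level isn't picked up by the cached flag
_echo = log.isEnabledFor(logging.INFO)

for n in range(3):
    if _echo:
        log.info("statement %d", n)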

>
> So as a question, if I ever get to it, would you entertain patches to do
> this without turning on global logging and/or being tied to the
> threading implementation?


So, to keep it zero, the event approach would be one way to go, since events can be added to a Connection object directly (that was a big change made some years ago; it used to be just Engine level). Adding a new attribute connection.logger that defaults to self.engine.logger would allow InstanceLogger to be local to a connection, but the _echo flag would still need to somehow work into the log system; this is of course possible, however I'm trying to keep everything to do with how logging / echo works in one place, and this would incur some kind of two-level system of "echo" between Engine and Connection that would need a lot of tests. If Connection has its own logger then we'd think that each Connection should be able to have independent logging names like Engine does, etc. It would not be a 2 line pull request.
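
For reference, a rough sketch of the event approach as it works today; the "myapp.sql_trace" logger name and handler setup are just illustrative, and this assumes a version where connection-level event listeners are available:

import logging
from sqlalchemy import create_engine, event, text

trace_log = logging.getLogger("myapp.sql_trace")
trace_log.addHandler(logging.StreamHandler())
trace_log.setLevel(logging.INFO)

engine = create_engine("sqlite://")

def log_statement(conn, cursor, statement, parameters, context, executemany):
    trace_log.info("SQL: %s parameters: %r", statement, parameters)

conn = engine.connect()
# the listener is local to this Connection; other connections on the
# same Engine are unaffected and engine-wide echo stays off
event.listen(conn, "before_cursor_execute", log_statement)
conn.execute(text("select 1"))
conn.close()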

I also worry that the precedent being set would be that anytime someone needs to do unusual things with logging, they are going to want to add new complexity to the ".echo" flag rather than going through the normal logging system, which IMO is extremely flexible and should be trusted to scale up as well as everything else.

Of course, if turning on logging.INFO and adding the filter that blocks 99% of all log messages does prove to add some significant performance impact, that changes everything and we'd have to decide that Python logging does need to be worked around at scale. But Python logging is very widely used deep inside many networking-related systems without much performance impact being noticed.




btw, if you're wondering where I'm coming from with these insane scaling
questions.... I used to be CTO of OKCupid and scaled them; I'm now at a
new place, so these things matter to me and my team.

the "1000 greenlets" model is one I'm familiar with in Openstack (in that they've used that setting, but I found that that number of greenlets was never utilized for real). Are your servers truly using 1000 database connections in a single process?



thanks again Mike and apologies for the tone of my original email!

we can all get along, no worries.


-Alfred

