Re: [openstack-dev] Oslo logging eats system level tracebacks by default

Doug Hellmann Wed, 28 May 2014 08:46:17 -0700

On Wed, May 28, 2014 at 10:38 AM, Sean Dague <[email protected]> wrote:
> When attempting to build a new tool for Tempest, I found that my Python
> syntax errors were being completely eaten. After 2 days of debugging I
> found that oslo log.py does the following *very unexpected* things:
>
>  - replaces sys.excepthook with its own function
>  - eats the exception traceback unless debug or verbose is set to True
>  - sets debug and verbose to False by default
>  - prints out a completely useless summary log message at Critical
> ([CRITICAL] [-] 'id' was my favorite of these)
>
> This is basically for an exit level event. Something so breaking that
> your program just crashed.
>
> Note this has nothing to do with preventing stack traces that are
> currently littering up the logs that happen at many logging levels, it's
> only about removing the stack trace of a CRITICAL level event that's
> going to very possibly result in a crashed daemon with no information as
> to why.
>
> So simply including oslo log makes the code immediately undebuggable
> unless you change your config file away from the defaults.
>
> Whether or not there was justification for this before, one of the
> things we heard loud and clear from the operator's meetup was:
>
>  - Most operators are running at DEBUG level for all their OpenStack
> services because you can't actually do problem determination in
> OpenStack at any level below that.
>  - Operators reacted negatively to the idea of removing stack traces
> from logs, as that's typically the only way to figure out what's going
> on. It took a while of back and forth to explain that our initiative to
> do that wasn't about removing them per se, but about having the code
> correctly recover.
>
> So the current oslo logging behavior seems inconsistent (we spew
> exceptions at INFO and WARN levels, and hide all the important stuff
> with a legitimately uncaught system level crash), undebuggable, and
> completely against the prevailing wishes of the operator community.
>
> I'd like to change that here - https://review.openstack.org/#/c/95860/
>
>         -Sean


I agree. We should dump as much detail as we can when we encounter an
unhandled exception that causes an app to die.
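
To make that concrete, here is a minimal sketch of an excepthook that
always logs the full traceback at CRITICAL, whatever debug or verbose
are set to. It uses only the standard library and is not the actual
oslo log.py code; install_excepthook and log_uncaught are hypothetical
names chosen for illustration.

```python
import logging
import sys


def install_excepthook(logger):
    """Install a sys.excepthook that logs uncaught exceptions in full.

    Hypothetical helper for illustration; not part of oslo.
    """

    def log_uncaught(exc_type, exc_value, exc_tb):
        # Passing exc_info makes the logging module append the complete
        # traceback to the CRITICAL record, independent of any
        # debug/verbose setting.
        logger.critical(
            "Unhandled %s: %s",
            exc_type.__name__,
            exc_value,
            exc_info=(exc_type, exc_value, exc_tb),
        )

    sys.excepthook = log_uncaught
    return log_uncaught
```

With something like this, a crash on a bad dictionary lookup would show
the exception type and the stack that led to it, rather than only the
key name.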

Doug

>
> --
> Sean Dague
> http://dague.net
>
>
> _______________________________________________
> OpenStack-dev mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

