[issue43656] TracebackException or StackSummary.extract with capture_locals=True fail to catch exceptions raised by repr() on value of frame local variable in FrameSummary.__init__.
Joe Wells added the comment: Some quick comments on Martin's pull request. 1. The changes are sufficient to let the user make things work the way it is requested that they work and anyone who does not start using the new format_locals parameter will get the old behavior. 2. A side comment: I just noticed that the docstring for FrameSummary.__init__ does not document the filename, lineno, and name parameters. Perhaps this could be fixed at the same time? 3. The new parameter format_locals is slightly confusingly named, because it does not format a single string from all the locals but instead gives a dictionary of strings with one string for each local. 4. I personally think it would be better to include something like the example value for format_locals as an extra attribute of the traceback module so everybody doesn't have to write the same function. That said, the documented example is sufficient. 5. It is not clear to me why this stringify-ing of the locals can't be done in StackSummary._extract_from_extended_frame_gen. It passes the dictionary of locals and also the function to transform it to FrameSummary. It would make more sense to transform it first. I suppose there might be some user somewhere who is directly calling FrameSummary so the interface needs to stay. 6. I fantasize that the new format_locals parameter would have access to the frame object. In order for this to happen, the frame object would have to be passed to FrameSummary instead of the 3 values (locals, name, filename) that are extracted from it. I think the current interface of FrameSummary was designed to interoperate with very old versions of Python. Oh well. 7. If anyone wanted to ever refactor FrameSummary (e.g., to enable my fantasy in point #6 just above), it becomes harder after this. This change has the effect of exposing details of the interface of FrameSummary to users of StackSummary.extract and TracebackException. The four parameters that the new parameter format_locals takes are most of the parameters of FrameSummary. -- ___ Python tracker <https://bugs.python.org/issue43656> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43656] TracebackException or StackSummary.extract with capture_locals=True fail to catch exceptions raised by repr() on value of frame local variable in FrameSummary.__init__.
Joe Wells added the comment: 1. As background, it is worth remembering that a major motivation for why FrameSummary.__init__ stringifies the local variable values in its parameter locals is to prevent the resulting data structure from keeping those values live. This enables them to be garbage collected. 2. It has been suggested that TracebackException.__init__ or StackSummary.extract could be given a function parameter. This could either be a new parameter or could involve using the existing capture_locals parameter with a non-bool. The purpose of this suggestion is to allow customizing the use of repr to stringify local variable values. If this suggestion is followed, it would be quite useful if this function parameter was given not just the local variable values, but also the name of the local variable and maybe also the stack frame it is in. This would enable filtering out undesired variables. For example, I would usually prefer to filter most of the variables from the __main__ module frame, because they are just noise that makes it hard to see the important details. Some ability to filter would also help with the security issue that is discussed in issue 23597 that Irit pointed to. 3. I doubt that new students will be setting up calls to TracebackException or modifying sys.excepthook on their own. Although a simple interface is clearly good, it might be more important to have an interface that maximizes the usefulness of the feature. 4. I no longer have the tracebacks my students were getting. You can look at the traceback from the example in msg 404319 for a traceback. While I was debugging this, I got much more complicated tracebacks because two of the classes in traceback.py also throw exceptions during their __init__ method that leave the not-yet-initialized object in a repr-will-raise-an-exception state. Despite having decades of programming experience, it actually took me a long time before I realized that the exceptions I was debugging were junk exceptions due to attempts to call repr on not-yet-initialized objects. I think figuring this out would be extremely challenging for my second-year undergraduate students. 5. If one of the calls to repr in FrameSummary.__init__ raises an exception, this prevents all the other local variable values from being inspected. If it is desired to report local variable values that cause repr to raise an exception, then it would be good to collect all such exceptions, which requires handling each exception and then continuing to traverse the traceback stack. Because of point 1 above, it might often be best to turn each such exception into a string. To avoid infinite loops in the debugging/logging tools, it would often be best not to attempt to traverse the traceback stack of each of these extra exceptions. If this argument is a good argument, this seems to mean that my most recent proposed fix is a good balance. 6. It is not clear if there is a reliable way to detect whether repr raises an exception due to an object's initialization having been interrupted, but all of the failures that I observed of this sort were for the variable name “self”. In one case, the repr failure was for a normal local variable of an __new__ method which was not a parameter of this method; the variable was named “self” by convention, but this convention would be difficult to automatically verify. In the two other cases, the repr failure was for a parameter named “self” which was the first parameter of an __init__ method. So it would help to give special treatment to the first parameter of __init__ methods, but this would not solve the problem for __new__ methods. 7. In some cases, it might be a bit complicated to ensure the object is always in a safe state for repr-ing during initialization. This is because there may be many attributes that need to be modified and this is not generally done atomically. One attribute could be designated to indicate that initialization is done, so that repr will be extra careful if that attribute does not have an expected value. Given that this is only (so far) an issue for debuggers and traceback inspection tools, it is not clear how to motivate everyone who writes a __new__, __init__, or __repr__ method to do this safely. Documentation can of course help. -- ___ Python tracker <https://bugs.python.org/issue43656> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43656] TracebackException or StackSummary.extract with capture_locals=True fail to catch exceptions raised by repr() on value of frame local variable in FrameSummary.__init__.
Joe Wells added the comment: In the hopes of convincing someone to install a fix to this bug, I will mention a few additional points. When I mention “the capture_locals feature”, I mean calls of the form traceback.TracebackException(..., capture_locals=True) and traceback.StackSummary.extract(..., capture_locals=True). 1. Right now, the capture_locals feature is unreliable. If any frame on the entire stack has a single local variable for which repr raises an exception, the user gets no information whatsoever back for the entire stack except for that exception, and that exception does not identify the offending stack frame or local variable. Also, every use of the capture_locals feature must be inside a try-except statement. 2. The exceptions raised by the capture_locals feature will often be junk exceptions carrying no useful information. The meaning of these exceptions will often be “there is an incompletely initialized object in a variable somewhere on the stack”. Because this is a fairly normal state of affairs, this will often not be useful information. 3. Although it is a excellent idea to encourage Python coders to ensure that __repr__ method never raise exceptions, it seems unlikely this will cause many __repr__ methods to be fixed in the near future. My impression is that __repr__ methods that raise exceptions on incompletely initialized objects are common. One reason for this might be that authors of __repr__ methods rarely suffer from this problem personally, because they will generally supply correct arguments to their own class constructors, and even when they don't (e.g., while building unit tests), they are unlikely to try to format to strings all the local variable values on the stack in the exception traceback. 4. Right now, the caller of traceback.FrameSummary(..., locals=x) needs to go through the entire dict x and for every value v in x test whether repr(v) raises an exception and if so either remove the key/value pair or change the value to something which can be safely repr-ed. Then FrameSummary.__init__ will repeat this work and run repr on every value in x again. So using FrameSummary safely requires duplicating the work, i.e., running repr on every value in the dict both before and during the call to FrameSummary. 5. Anyone who is using the capture_locals feature on an exception traceback has already caught an exception, and therefore decided not to let that exception “bubble up”. Their attempts to log or otherwise cope with the exception will be spoiled by the capture_locals feature raising an unrelated exception. This is even worse when the exception raised by the capture_locals feature is a junk exception that merely indicates there is an incompletely initialized object on the stack. This is likely to happen because exceptions will often happen during the initialization of an object. 6. The most recent suggested fix does in fact report the extra repr exception discovered by the capture_locals feature in the string that represents the local variable's value. The main effect of the current code propagating this repr exception is to make it harder to deal with the original exception. I hope these points are useful. -- ___ Python tracker <https://bugs.python.org/issue43656> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43656] TracebackException or StackSummary.extract with capture_locals=True fail to catch exceptions raised by repr() on value of frame local variable in FrameSummary.__init__.
Joe Wells added the comment: Andrei, thanks very much for the pointer to bug/issue https://bugs.python.org/issue39228. I had not noticed the earlier comment by Irit pointing to that bug. (Is there some way to merge bugs so that someone visiting one of the bugs will see the merged stream of comments?) The remarks in that bug are precisely why my recommended fix has this line: d[k] = '' instead of this: d[k] = object.__repr__(v) Does this not meet the concern expressed there? Something that would more thoroughly meet that would be the following: if locals: d = {} self.locals = d for k, v in locals.items(): try: d[k] = repr(v) except Exception as e: d[k] = '' else: self.locals = None I use object.__repr__ instead of repr because when repr(v) has already failed it is likely that repr(e) on the exception e would also be likely to fail. Although we could also try repr(e) first to see if it raises an exception. Your suggested reaction to this bug (documenting Python best practice) is something that should be done regardless. I completely agree. Unfortunately, better documentation of how to write good Python does not address the point of this bug which is that debugging tools should not fail when used to debug buggy code. The purpose of a debugging tool is to help with badly written code, not to work only with well written code. Right now the capture_locals=True feature is not safe to use without wrapping it with an exception handler. As a result of this bug, I have been forced to rewrite this: def format_exception_chunks(exception): return (traceback.TracebackException.from_exception(exception, capture_locals=True).format()) to instead be this: def format_exception_chunks(exception): try: tbe = (traceback.TracebackException . from_exception(exception, capture_locals=True)) return tbe.format() except Exception: pass # the traceback module fails to catch exceptions that occur when # formatting local variables' values, so we retry without # requesting the local variables. try: tbe = traceback.TracebackException.from_exception(exception) return ('WARNING: Exceptions were raised while trying to' ' format exception traceback with local variable' ' values,\n' 'so the exception traceback was formatted without' ' local variable values.\n', *tbe.format()) except Exception: return ('WARNING: Exceptions were raised while trying to format' ' exception traceback,\n' 'so no formatted traceback information is being' ' provided.\n',) My argument is that it is better to fix the bug once in traceback.py instead of every user of the capture_locals=True feature having to discover the hard way (with hours of painful bug investigation) that they need to write lots of exception handling and bug workarounds around their use of the capture_locals=True feature. -- ___ Python tracker <https://bugs.python.org/issue43656> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43656] TracebackException or StackSummary.extract with capture_locals=True fail to catch exceptions raised by repr() on value of frame local variable in FrameSummary.__init__.
Joe Wells added the comment: I'm sorry Andrei: I misread your alteration of my example and misunderstood its purpose. For anyone else reading these messages: In my most recent comment above, please ignore the part of my comment about Andrei's example. So yes, Andrei, that is how people should write their code! Your code does not trigger the bug because it is written in a better way. Agreed completely. However, this bug still needs fixed because it is currently not possible to use the capture_locals=True feature of the traceback module when debugging code written by people who do not follow Andrei's example of best practice. -- ___ Python tracker <https://bugs.python.org/issue43656> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43656] TracebackException or StackSummary.extract with capture_locals=True fail to catch exceptions raised by repr() on value of frame local variable in FrameSummary.__init__.
Joe Wells added the comment: Here are my thoughts inspired by Andrei's insightful comments. 1. In response to the major issue Andrei raises, I agree that it is desirable that repr would never raise an exception. The reasons Andrei mentions seem quite correct to me. However, I think the only practical way to make that change would be the change the code of the repr builtin. (Expecting the authors of every class in all Python code ever written to make sure that their __repr__ method will never raise an exception is unrealistic.) The bug that is the topic of the issue in this bug report can be fixed by merely handling exceptions raised by one particular call to repr in the code of FrameSummary.__init__. That change can only affect code that uses the traceback module to get nicer tracebacks, and only if the capture_locals=True feature is requested Changing the implementation of the repr builtin could conceivably cause unexpected problems for lots of deployed 3rd party code during normal use, because repr is widely used in deployed code, and hence more care and thought is needed. In contrast, changing FrameSummary.__init__ as I propose in this bug report will only affect code using the capture_locals=True feature of the traceback module, and is likely to make such code more reliable because right now that feature is failure-prone due to this bug. So I propose that making repr never raise exceptions should be a different bug. This bug does not need to wait for that bug to be fixed. 2. In response to a minor point of Andrei, I would like to mention that I encountered this because I had given students a coursework template containing this code: import traceback, sys def format_exception_chunks(exception): return (traceback.TracebackException.from_exception(exception, capture_locals=True).format()) def print_exception(_ignored_type, exception, _ignored_traceback): for chunk in format_exception_chunks(exception): print(chunk, file=sys.stderr, end="") # This change to sys.excepthook adds local variables to stack traces. sys.excepthook = print_exception This had the unfortunate effect that when a class constructor decided that it did not like its arguments, the students were overwhelmed by a baffling cascade of exception tracebacks. So while this was indeed “for convenience of interactive debugging”, it had the actual effect of making it nearly impossible for these beginner Python programmers to do any debugging at all. The overall effect of this bug is that it makes it potentially unwise to routinely rely on the capture_locals=True feature. A feature that could be useful for beginners actually makes it worse for them. 3. In response to Andrei's suggested change to my minimal example to reproduce the bug, I have a few comments. First, his minimal example is just as good as mine for reproducing the bug. Second, my example is inspired by the two classes I mention whose constructors trigger the bug: both of them do it by leaving the incompletely initialized object missing an attribute that repr depends on, so they both raise a missing attribute exception when an attempt is made to print the incompletely initialized object. I suspect this is a quite common pattern in deployed Python code. I suspect that people have not made efforts to avoid this pattern because it only triggers a bug when exception tracebacks are investigated, because the only reference to the incompletely initialized object is in the stack frame of the constructor method (either __new__ or __init__) which in turn is only referenced from the traceback. Third, my example deliberately has as little code as possible in the __repr__ method to convey the key issue. Fourth, one of these two examples (mine or Andrei's) should go into the unit tests when the bug is fixed. 4. I notice that the code of unittest.util.safe_repr gives an example of how to use object.__repr__(obj) to get something formatted for an object whose normal __repr__ method has raised an exception. So I alter my recommended fix to the following: if locals: d = {} self.locals = d for k, v in locals.items(): try: d[k] = repr(v) except Exception: d[k] = '' else: self.locals = None 5. Can someone please change the Stage of this issue to something other than “resolved” and the Resolution of this issue to something other than “not a bug”? I hope these comments are helpful and this bug gets somehow fixed. -- ___ Python tracker <https://bugs.python.org/issue43656> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43656] TracebackException or StackSummary.extract with capture_locals=True fail to catch exceptions raised by repr() on value of frame local variable in FrameSummary.__init__.
Joe Wells added the comment: I would like to request that this bug be repopened and fixed. I've changed (or at least tried to change, I'm not sure if it will let me) the title of the bug to point out that the failure happens in FrameSummary.__init__. It does not happen in StackSummary.format. This problem makes the capture_locals=True feature of TracebackException and StackSummary and the "locals" parameter of FrameSummary unreliable. If any one of the local variables in any frame on the stack is in an inconsistent state such that repr will raise an exception, the processing of the traceback fails. This kind of inconsistent state is of course likely to happen during debugging, which is precisely when you would want the capture_locals feature to actually work so you can see what is going wrong. Just one example of an exception traceback being created with an unsafe local variable value in one of the stack frames is in the following line: from _pydecimal import Decimal as D; D(None) The _pydecimal.Decimal.__new__ method raises an exception when it sees a value it doesn't know how to convert to Decimal. When it does this, the new object it was creating is left in an inconsistent state missing the _sign attribute. When you try to inspect the resulting exception traceback with traceback.TracebackException(..., capture_locals=True), this raises an exception. While I was tracking this bug down, I discovered that the traceback.TracebackException.__init__ method has the same problem: it initializes the _str attribute used by the __str__ method quite late and when it raised an exception before this point, the incompletely initialized TracebackException object caused repr to fail. There's at least one more class in traceback.py that also has this problem, but I don't remember which one it is. The cascade of exceptions causing exceptions causing exceptions makes the capture_locals feature difficult to use and debug. Here is a short snippet that reproduces the bug on Python 3.9.7: import traceback class TriggerTracebackBug: def __init__(self): raise RuntimeError("can't build a TriggerTracebackBug object for some reason") self._repr = 'if we reached this line, this object would have a repr result' def __repr__(self): return self._repr try: TriggerTracebackBug() except Exception as e: # this method call fails because there is a stack frame with a local # variable (self) such that repr fails on that variable's value: traceback.TracebackException.from_exception(e, capture_locals=True) It's clear that it is too much to expect every class to implement a safe __repr__ method. So it also seems clear to me that traceback.FrameSummary.__init__ is the place to fix it. My suggested fix is to replace this line in the latest Lib/traceback.py: self.locals = {k: repr(v) for k, v in locals.items()} if locals else None with something like this code (written in a somewhat awkward way because that helped while I was debugging it): if locals: d = {} self.locals = d for k, v in locals.items(): try: d[k] = repr(v) except Exception: d[k] = '' else: self.locals = None I've tested this code in an older version of Python and it fixed the problem for me. -- nosy: +jbw title: StackSummary.format fails if repr(value) fails -> TracebackException or StackSummary.extract with capture_locals=True fail to catch exceptions raised by repr() on value of frame local variable in FrameSummary.__init__. ___ Python tracker <https://bugs.python.org/issue43656> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com