On Sat, 8 May 2021 at 10:00, Devin Jeanpierre (<jeanpierr...@gmail.com>) wrote:
> > What are people's thoughts on the feature?
>
> I'm +1, this level of detail in the bytecode is very useful. My main
> interest is actually from the AST though. :) In order to be in the
> bytecode, one assumes it must first be in the AST. That information is
> incredibly useful for refactoring tools like https://github.com/ssbr/refex
> (n.b. author=me) or https://github.com/gristlabs/asttokens (which refex
> builds on). Currently, asttokens actually attempts to re-discover that kind
> of information after the fact, which is error-prone and difficult.

The AST already has column offsets
(https://docs.python.org/3.10/library/ast.html#ast.AST.col_offset).

> This could also be useful for finer-grained code coverage tracking and/or
> debugging. One can actually imagine highlighting the spans of code which
> were only partially executed: e.g. if only x() were ever executed in
> "x() and y()". Ned Batchelder once did wild hacks in this space, and maybe
> this proposal could lead in the future to something non-hacky?
> https://nedbatchelder.com/blog/200804/wicked_hack_python_bytecode_tracing.html
>
> I say "in the future" because it doesn't just automatically work, since as
> I understand it, coverage currently doesn't track spans, but lines hit by
> the line-based debugger. Something else is needed to be able to track which
> spans were hit rather than which lines, and it may be similarly hacky if
> it's isolated to coveragepy. If, for example, enough were exposed to let a
> debugger skip to bytecode for the next different (sub) span, then this
> would be useful for both coverage and actual debugging as you step through
> an expression. This is probably way out of scope for your PEP, but even so,
> the feature may be laying some useful groundwork here.
>
> -- Devin
>
> On Fri, May 7, 2021 at 2:52 PM Pablo Galindo Salgado <pablog...@gmail.com>
> wrote:
>
>> Hi there,
>>
>> We are preparing a PEP and we would like to start some early discussion
>> about one of the main aspects of the PEP.
>>
>> The work we are preparing is to allow the interpreter to produce more
>> fine-grained error messages, pointing to the source associated with the
>> instructions that are failing. For example:
>>
>> Traceback (most recent call last):
>>   File "test.py", line 14, in <module>
>>     lel3(x)
>>     ^^^^^^^
>>   File "test.py", line 12, in lel3
>>     return lel2(x) / 23
>>            ^^^^^^^
>>   File "test.py", line 9, in lel2
>>     return 25 + lel(x) + lel(x)
>>                 ^^^^^^
>>   File "test.py", line 6, in lel
>>     return 1 + foo(a,b,c=x['z']['x']['y']['z']['y'], d=e)
>>                          ^^^^^^^^^^^^^^^^^^^^^
>> TypeError: 'NoneType' object is not subscriptable
>>
>> The cost of this is having the start column number and end column number
>> information for every bytecode instruction, and this is what we want to
>> discuss (there is also some stack cost to re-raise exceptions, but that's
>> not a big problem in any case). Given that column numbers are not very big
>> compared with line numbers, we plan to store these as unsigned chars or
>> unsigned shorts. We ran some experiments over the standard library and we
>> found that the overhead for all pyc files is:
>>
>> * If we use shorts, the total overhead is ~3% (total size 28 MB and the
>>   extra size is 0.88 MB).
>> * If we use chars, the total overhead is ~1.5% (total size 28 MB and the
>>   extra size is 0.44 MB).
>>
>> One of the disadvantages of using chars is that we can only report
>> columns from 1 to 255, so if an error happens in a column bigger than that
>> then we would have to exclude it (and not show the highlighting) for that
>> frame. Unsigned shorts will allow the values to go from 0 to 65535.
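
To make the char-versus-short trade-off above concrete, here is a minimal
sketch. It is not the encoding proposed for CPython: the helper
pack_columns, the sample spans, and the use of (0, 0) as a "no highlighting"
placeholder are all assumptions made for illustration. It only shows that
one-byte entries halve the storage but force wide columns to be dropped.

```python
from array import array


def pack_columns(spans, typecode):
    """Pack (start_col, end_col) pairs into a typed array.

    Hypothetical helper for illustration only.
    typecode "B" is an unsigned char, "H" is an unsigned short.
    """
    # Largest value one entry can hold, derived from the item size in bytes.
    limit = 2 ** (8 * array(typecode).itemsize) - 1
    packed = array(typecode)
    dropped = 0
    for start, end in spans:
        if start > limit or end > limit:
            # The column does not fit: store a placeholder meaning
            # "no highlighting for this instruction" (an assumption).
            packed.extend((0, 0))
            dropped += 1
        else:
            packed.extend((start, end))
    return packed, dropped


# Made-up spans: two ordinary ones, one past column 255, one past 65535.
spans = [(4, 11), (11, 18), (250, 300), (70000, 70010)]

as_chars, dropped_chars = pack_columns(spans, "B")
as_shorts, dropped_shorts = pack_columns(spans, "H")

print("chars: ", len(as_chars.tobytes()), "bytes,", dropped_chars, "dropped")
print("shorts:", len(as_shorts.tobytes()), "bytes,", dropped_shorts, "dropped")
```

With these sample spans the char version loses the highlighting for both
wide spans, while the short version loses only the one past column 65535.
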
>>
>> Unfortunately these numbers are not easily compressible, as every
>> instruction would have very different offsets.
>>
>> There is also the possibility of not doing this based on some build flag
>> or when using -O, to allow users to opt out. But given that these numbers
>> can be quite useful to other tools like coverage measuring tools, tracers,
>> profilers and the like, adding conditional logic to many places would
>> complicate the implementation considerably and would potentially reduce
>> the usability of those tools, so we prefer not to have the conditional
>> logic. We believe this extra cost is very much worth the better error
>> reporting, but we understand and respect other points of view.
>>
>> Does anyone see a better way to encode this information **without
>> complicating the implementation a lot**? What are people's thoughts on the
>> feature?
>>
>> Thanks in advance,
>>
>> Regards from cloudy London,
>> Pablo Galindo Salgado
>>
>> _______________________________________________
>> Python-Dev mailing list -- python-dev@python.org
>> To unsubscribe send an email to python-dev-le...@python.org
>> https://mail.python.org/mailman3/lists/python-dev.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-dev@python.org/message/DB3RTYBF2BXTY6ZHP3Z4DXCRWPJIQUFD/
>> Code of Conduct: http://python.org/psf/codeofconduct/
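
For readers who want to see the AST column offsets discussed in this thread,
the following is a small sketch using only the standard ast module. The
sample source string reuses the "x() and y()" example from the quoted
message; the variable names are made up. Note that col_offset and
end_col_offset are UTF-8 byte offsets, so slicing the source string with
them as done here is only safe for ASCII source.

```python
import ast

# Sample source, reusing the "x() and y()" example from the thread.
source = "result = x() and y()\n"
tree = ast.parse(source)

call_spans = []
for node in ast.walk(tree):
    if isinstance(node, ast.Call):
        # end_col_offset is available on Python 3.8 and later.
        segment = source[node.col_offset:node.end_col_offset]
        call_spans.append((node.lineno, node.col_offset,
                           node.end_col_offset, segment))

for lineno, start, end, segment in sorted(call_spans):
    print(f"line {lineno}, columns {start}-{end}: {segment}")
```

Each call node carries its own start and end column, which is the kind of
span information a tool could use to tell x() apart from y() on one line.
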
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/BCR3UN3LCXRBPOO7Y7KME6FX4BLVBDRO/
Code of Conduct: http://python.org/psf/codeofconduct/