[issue47212] Minor issues in reported Syntax errors
Change by Matthieu Dartiailh : -- pull_requests: +30391 pull_request: https://github.com/python/cpython/pull/32334 ___ Python tracker <https://bugs.python.org/issue47212> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47212] Minor issues in reported Syntax errors
Change by Matthieu Dartiailh : -- keywords: +patch pull_requests: +30364 stage: -> patch review pull_request: https://github.com/python/cpython/pull/32302 ___ Python tracker <https://bugs.python.org/issue47212> ___
[issue47212] Minor issues in reported Syntax errors
New submission from Matthieu Dartiailh :

Hi,

While working on Pegen I noticed that:
- the invalid_arguments rule does not point to the full generator expression in its second and fifth alternatives
- when reporting an indentation error after a bare except, the error raised is actually a SyntaxError rather than an IndentationError

I will open a PR shortly to address both, since the changes are quite minimal IMO.

-- components: Parser messages: 416662 nosy: lys.nikolaou, mdartiailh, pablogsal priority: normal severity: normal status: open title: Minor issues in reported Syntax errors type: behavior versions: Python 3.10, Python 3.11 ___ Python tracker <https://bugs.python.org/issue47212> ___
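The second point can be observed with a small reproducer. This is only a sketch: the source string and the helper name error_class_for are made up for illustration, and the class that gets reported depends on the interpreter version.

```python
def error_class_for(source: str) -> str:
    """Return the name of the exception class raised when compiling source."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        # IndentationError is a subclass of SyntaxError, so both land here.
        return type(exc).__name__
    return "no error"


# Hypothetical reproducer: the body of the bare except is not indented.
BARE_EXCEPT = "try:\n    pass\nexcept:\npass\n"

print(error_class_for(BARE_EXCEPT))
```

On an affected interpreter this would be expected to print SyntaxError, and IndentationError once the rule raises the more specific class.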
[issue44667] tokenize.py emits spurious NEWLINE if file ends on a comment without a newline
New submission from Matthieu Dartiailh :

Using tokenize.py to tokenize the attached file yields:

    0,0-0,0:    ENCODING   'utf-8'
    1,0-1,2:    NAME       'if'
    1,3-1,4:    NAME       'a'
    1,4-1,5:    OP         ':'
    1,5-1,7:    NEWLINE    '\r\n'
    2,0-2,4:    INDENT     ''
    2,4-2,5:    NAME       'b'
    2,6-2,7:    OP         '='
    2,8-2,9:    NUMBER     '1'
    2,9-2,11:   NEWLINE    '\r\n'
    3,0-3,2:    NL         '\r\n'
    4,0-4,6:    COMMENT    '# test'
    4,6-4,6:    NL         ''
    4,6-4,7:    NEWLINE    ''
    5,0-5,0:    DEDENT     ''
    5,0-5,0:    ENDMARKER  ''

This output is wrong in that it adds two newlines after the comment: one as an NL, which is correct, and one as a NEWLINE, which is not, since there is no preceding code. If a newline is added at the end of the file, one gets:

    0,0-0,0:    ENCODING   'utf-8'
    1,0-1,2:    NAME       'if'
    1,3-1,4:    NAME       'a'
    1,4-1,5:    OP         ':'
    1,5-1,7:    NEWLINE    '\r\n'
    2,0-2,4:    INDENT     ''
    2,4-2,5:    NAME       'b'
    2,6-2,7:    OP         '='
    2,8-2,9:    NUMBER     '1'
    2,9-2,11:   NEWLINE    '\r\n'
    3,0-3,2:    NL         '\r\n'
    4,0-4,6:    COMMENT    '# test'
    4,6-4,8:    NL         '\r\n'
    5,0-5,0:    DEDENT     ''
    5,0-5,0:    ENDMARKER  ''

Similarly, if code is added before the comment, a single NEWLINE is generated (with no text since it is fake).

The extra NEWLINE found when tokenizing the attached file can cause issues when parsing the file. It was found in https://github.com/we-like-parsers/pegen/pull/11#issuecomment-881926767 where a pure Python parser based on pegen is being built. The extra NEWLINE is an issue since the grammar does not accept a NEWLINE at the end of a block, which causes parsing to fail when using the same rules as the Python grammar, while the CPython parser can handle this file without any issue.

I believe this issue stems from https://github.com/python/cpython/blob/3.9/Lib/tokenize.py#L605 where the check does not account for a last line limited to comments. Adding a check to determine if the line starts with a # should be sufficient to avoid emitting the extra NEWLINE.
-- components: Library (Lib) files: no_newline_at_end_of_file_with_comment.py messages: 397750 nosy: mdartiailh priority: normal severity: normal status: open title: tokenize.py emits spurious NEWLINE if file ends on a comment without a newline type: behavior versions: Python 3.8 Added file: https://bugs.python.org/file50157/no_newline_at_end_of_file_with_comment.py ___ Python tracker <https://bugs.python.org/issue44667> ___
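The token stream can be inspected without the attached file. The sketch below uses an assumed stand-in source (with \n rather than \r\n line endings) whose last line is a comment with no trailing newline; the tokens that follow the comment depend on the Python version, so none are asserted here.

```python
import io
import tokenize

# Assumed stand-in for the attached file: the last line is a comment
# and the file does not end with a newline.
SOURCE = b"if a:\n    b = 1\n\n# test"


def token_names(source: bytes) -> list:
    """Return (token name, token string) pairs for the given source bytes."""
    tokens = tokenize.tokenize(io.BytesIO(source).readline)
    return [(tokenize.tok_name[tok.type], tok.string) for tok in tokens]


names = token_names(SOURCE)
comment_index = names.index(("COMMENT", "# test"))

# Token names emitted after the final comment. A NEWLINE appearing here,
# in addition to NL, is the spurious token described in the report.
trailing = [name for name, _ in names[comment_index + 1:]]
print(trailing)
```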
[issue43129] Number of object on stack required by opcode
New submission from Matthieu Dartiailh :

When constructing bytecode objects manually, as can be done using the bytecode library (https://github.com/MatthieuDartiailh/bytecode which was first developed by V Stinner), one can use dis.stack_effect to compute the required stack size, thus avoiding stack overflows. However, it can be interesting for those manually built bytecode objects to also check that no underflow can occur. This computation is straightforward once one knows the number of elements on the stack a specific opcode expects. This work has been done manually in the bytecode project, but it may be interesting to provide a way in the dis module to access this information with an interface similar to dis.stack_effect.

If there is an interest in such a feature I would be happy to contribute it. I would however like some opinions on how to do that in an optimal manner. I assume it would require adding the implementation in https://github.com/python/cpython/blob/master/Python/compile.c and exposing it in a similar manner to stack_effect.

-- components: Library (Lib) messages: 386494 nosy: mdartiailh priority: normal severity: normal status: open title: Number of object on stack required by opcode type: enhancement versions: Python 3.10 ___ Python tracker <https://bugs.python.org/issue43129> ___
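A minimal sketch of the underflow check described above. The REQUIRED table and the check_no_underflow helper are hypothetical: the dis module does not provide the per-opcode requirement, which is exactly what this issue asks for, so the two entries are filled in by hand. Only dis.stack_effect and dis.opmap are real APIs here.

```python
import dis

# Hypothetical hand-written table: number of items each opcode needs on the
# stack before it runs. The dis module does not expose this information.
REQUIRED = {"NOP": 0, "POP_TOP": 1}


def check_no_underflow(opnames, initial_depth=0):
    """Return True if no opcode in the sequence runs with too few stack items."""
    depth = initial_depth
    for name in opnames:
        if depth < REQUIRED[name]:
            return False
        # dis.stack_effect gives the net change in stack depth for the opcode.
        depth += dis.stack_effect(dis.opmap[name])
    return True


print(check_no_underflow(["NOP", "POP_TOP"]))
print(check_no_underflow(["NOP", "POP_TOP"], initial_depth=1))
```

The first call reports an underflow because POP_TOP runs on an empty stack; the second starts with one item and passes.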
[issue41703] Most bytecode changes are absent from Python 3.9 What's new
Matthieu Dartiailh added the comment: Looking at the current version of the page https://docs.python.org/3.9/whatsnew/3.9.html#cpython-bytecode-changes I still see only LOAD_ASSERTION_ERROR. It seems the changelog got updated but not the What's new. -- ___ Python tracker <https://bugs.python.org/issue41703> ___
[issue41703] Most bytecode changes are absent from Python 3.9 What's new
New submission from Matthieu Dartiailh : A number of bytecodes have been added or removed in Python 3.9, as documented in https://docs.python.org/3.9/library/dis.html. However, only the addition of LOAD_ASSERTION_ERROR is currently documented in What's New. The relevant bpo issues are: - https://bugs.python.org/issue33387 - https://bugs.python.org/issue39320 -- messages: 376298 nosy: mdartiailh priority: normal severity: normal status: open title: Most bytecode changes are absent from Python 3.9 What's new ___ Python tracker <https://bugs.python.org/issue41703> ___
[issue41191] PyType_FromModuleAndSpec is not mentioned in 3.9 What's new
Change by Matthieu Dartiailh : -- resolution: -> fixed ___ Python tracker <https://bugs.python.org/issue41191> ___
[issue41191] PyType_FromModuleAndSpec is not mentioned in 3.9 What's new
New submission from Matthieu Dartiailh : Looking at the What's new for Python 3.9 I noticed that there was no mention of PEP 573. The added functions are properly documented and should probably be mentioned in the What's new. -- assignee: docs@python components: Documentation messages: 372790 nosy: docs@python, mdartiailh priority: normal severity: normal status: open title: PyType_FromModuleAndSpec is not mentioned in 3.9 What's new versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue41191> ___
[issue30588] Missing documentation for codecs.escape_decode
Matthieu Dartiailh added the comment:

The issue is that unicode_escape will not properly handle strings mixing unicode characters and escaped characters, as it assumes latin-1 compatible characters only. For example, given the literal string 'Δ\nΔ', one cannot encode it using latin-1, and encoding it using utf-8 then using unicode_escape produces a wrong output: 'Î\x94\nÎ\x94'. However, using codecs.escape_decode(r'Δ\nΔ'.encode('utf-8'))[0].decode('utf-8') gives the proper output.

Internally the Python parser handles this case, but I was unable to find where, and this is the closest solution I found. I guess it may be possible using error handlers, but it seems much more cumbersome.

Best regards
Matthieu

-- ___ Python tracker <http://bugs.python.org/issue30588> ___
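The two approaches compared in this comment can be put side by side. This sketch relies on codecs.escape_decode, which is the undocumented function this issue is about; the variable names are chosen for illustration.

```python
import codecs

# Raw string: a backslash followed by 'n', surrounded by non-latin-1 characters.
raw = r'Δ\nΔ'

# unicode_escape treats the UTF-8 bytes as latin-1, so each Δ (bytes CE 94)
# is decoded as two separate characters.
wrong = raw.encode('utf-8').decode('unicode_escape')

# escape_decode only processes the escape sequences and returns
# (bytes, length consumed); the bytes can then be decoded as UTF-8.
right = codecs.escape_decode(raw.encode('utf-8'))[0].decode('utf-8')

print(repr(wrong))  # 'Î\x94\nÎ\x94'
print(repr(right))  # 'Δ\nΔ'
```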
[issue30588] Missing documentation for codecs.escape_decode
New submission from Matthieu Dartiailh:

codecs.escape_decode does not appear in the codecs documentation. This function is, to my knowledge, the only convenient way to process the escaped characters in a literal string (actually found here: https://stackoverflow.com/questions/4020539/process-escape-sequences-in-a-string-in-python). It is most useful when implementing a parser for a language extending Python semantics while retaining Python's processing of strings (cf. https://github.com/MatthieuDartiailh/enaml).

Is there a reason for that function not being documented?

-- assignee: docs@python components: Documentation messages: 295342 nosy: docs@python, mdartiailh priority: normal severity: normal status: open title: Missing documentation for codecs.escape_decode versions: Python 3.3, Python 3.4, Python 3.5, Python 3.6, Python 3.7 ___ Python tracker <http://bugs.python.org/issue30588> ___
[issue28810] Document bytecode changes in 3.6
Matthieu Dartiailh added the comment: Could anyone review this? Working on bytecode manipulation for different projects, I wish I had known this existed before. -- nosy: +mdartiailh ___ Python tracker <http://bugs.python.org/issue28810> ___
[issue29607] Broken stack_effect for CALL_FUNCTION_EX
Matthieu Dartiailh added the comment: I added the Misc/NEWS entry under Python 3.7. I guess it will be backported to 3.6 when cherry-picking. -- ___ Python tracker <http://bugs.python.org/issue29607> ___
[issue29607] Broken stack_effect for CALL_FUNCTION_EX
Changes by Matthieu Dartiailh : -- pull_requests: +168 ___ Python tracker <http://bugs.python.org/issue29607> ___
[issue29607] Broken stack_effect for CALL_FUNCTION_EX
New submission from Matthieu Dartiailh:

The computation of the stack effect of CALL_FUNCTION_EX does not reflect how the opcode uses its argument. Currently stack_effect expects two flags (one on 0x01 and one on 0x02) corresponding to whether positional arguments and keyword arguments are being passed. However, in the current implementation the argument of CALL_FUNCTION_EX is either 0 or 1, depending on the presence of keyword arguments. According to Serhiy Storchaka, the behavior of stack_effect is a leftover of the previous implementation and should be fixed.

-- components: Interpreter Core messages: 288230 nosy: mdartiailh priority: normal severity: normal status: open title: Broken stack_effect for CALL_FUNCTION_EX type: behavior versions: Python 3.6, Python 3.7 ___ Python tracker <http://bugs.python.org/issue29607> ___
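The expected value can be worked out by hand. The sketch below is not the CPython implementation: it assumes the stack layout given in the Python 3.6 dis documentation for CALL_FUNCTION_EX (a callable, an iterable of positional arguments, and, when the lowest bit of the flags is set, a mapping of keyword arguments), and the function name is made up for illustration.

```python
def call_function_ex_effect(oparg: int) -> int:
    """Net stack effect of CALL_FUNCTION_EX under the assumed 3.6 layout."""
    # Only the lowest bit matters: it signals a keyword-arguments mapping.
    has_kwargs = 1 if oparg & 0x01 else 0
    popped = 2 + has_kwargs  # callable, positional args iterable, optional kwargs
    pushed = 1               # the return value of the call
    return pushed - popped


print(call_function_ex_effect(0))  # -1
print(call_function_ex_effect(1))  # -2
```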