[issue10785] parser: store the filename as an unicode object
Roundup Robot devnull@devnull added the comment: New changeset 6e9dc970ac0e by Victor Stinner in branch 'default': Issue #10785: Store the filename as Unicode in the Python parser. http://hg.python.org/cpython/rev/6e9dc970ac0e -- nosy: +python-dev ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10785 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10785] parser: store the filename as an unicode object
Changes by STINNER Victor victor.stin...@haypocalc.com: -- resolution: -> fixed status: open -> closed
[issue10785] parser: store the filename as an unicode object
STINNER Victor victor.stin...@haypocalc.com added the comment: @Benjamin: You told me that you don't want two versions of pgen, but I don't remember why. As my work on #3080 is mostly done, I now plan to patch the Python parser to store the filename as Unicode. So could you please review the patch attached to this issue? --
[issue10785] parser: store the filename as an unicode object
Changes by Antoine Pitrou pit...@free.fr: -- nosy: +benjamin.peterson
[issue10785] parser: store the filename as an unicode object
STINNER Victor victor.stin...@haypocalc.com added the comment: err_clear() should set err->filename to NULL. -- versions: -Python 3.2
[issue10785] parser: store the filename as an unicode object
STINNER Victor victor.stin...@haypocalc.com added the comment: Version 3 of the patch, which also fixes #9319. -- Added file: http://bugs.python.org/file20271/parser_filename_obj-3.patch
[issue10785] parser: store the filename as an unicode object
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file20180/parser_filename_obj.patch
[issue10785] parser: store the filename as an unicode object
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file20184/parser_filename_obj-2.patch
[issue10785] parser: store the filename as an unicode object
Alexander Belopolsky belopol...@users.sourceforge.net added the comment: I like the idea, but I don't like the trend that parser code continues to diverge from pgen. I understand that most of the Python runtime is not available to pgen, but maybe a more elegant solution than changing the type conditional on PGEN can be found. For example, maybe filename could be decoded from FS encoding to UTF-8? -- nosy: +belopolsky
[issue10785] parser: store the filename as an unicode object
STINNER Victor victor.stin...@haypocalc.com added the comment:
> maybe a more elegant solution than changing the type conditional on PGEN can be found
In pgen, the filename is only used to display the following warning, in indenterror():
    filename: inconsistent use of tabs and spaces in indentation
In practice, this warning never occurs on Grammar/Grammar: this file doesn't use indentation at all, only continuation lines. Maybe a better solution is just to drop the filename for pgen. Anyway, pgen only compiles *one* file (Grammar/Grammar), so we don't need the input filename ;-) --
[issue10785] parser: store the filename as an unicode object
STINNER Victor victor.stin...@haypocalc.com added the comment: When testing my patch, I found and fixed two bugs in pgen:
- r87557: PGEN was not defined to compile pgenmain.c and printgrammar.c
- r87558: pgen error was ignored on make Parser/pgen.stamp (when executing pgen to compile the grammar)
--
[issue10785] parser: store the filename as an unicode object
STINNER Victor victor.stin...@haypocalc.com added the comment: Version 2 of the patch:
- remove the filename attribute from the perrdetail and tok_state structures in PGEN mode, and add a comment to explain why
- rename filename_obj to filename
- indenterror() no longer prints the input filename in PGEN mode
-- Added file: http://bugs.python.org/file20184/parser_filename_obj-2.patch
[issue10785] parser: store the filename as an unicode object
New submission from STINNER Victor victor.stin...@haypocalc.com: The Python parser stores the filename as a byte string, but it decodes the filename on error because most Python functions now use Unicode strings. Instead of decoding the filename on error, which may raise a new error, I propose to decode the filename when the parser object is created and to store it only as Unicode. This issue would prepare the last part of the full Unicode support (#3080). -- components: Interpreter Core, Unicode files: parse_filename_obj.patch keywords: patch messages: 124755 nosy: haypo priority: normal severity: normal status: open title: parser: store the filename as an unicode object versions: Python 3.2, Python 3.3 Added file: http://bugs.python.org/file20179/parse_filename_obj.patch
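The difference between decoding on error and decoding at creation can be sketched in Python. This is only an illustration of the idea in the submission, not the patch itself: the class names EagerParser and LazyParser and the syntax_error() helper are hypothetical, and strict UTF-8 decoding is an assumption chosen so that the failure is easy to trigger (the real parser is written in C and uses the filesystem encoding).

```python
class LazyParser:
    """Hypothetical model of the old behaviour: keep bytes, decode on error."""

    def __init__(self, filename: bytes):
        self.filename = filename  # stored undecoded

    def syntax_error(self, lineno: int, message: str) -> SyntaxError:
        # Decoding happens while another error is being reported, so a
        # decoding failure here replaces the syntax error being built.
        decoded = self.filename.decode("utf-8")  # assumption: strict UTF-8
        return SyntaxError(message, (decoded, lineno, None, None))


class EagerParser:
    """Hypothetical model of the proposal: decode once, at creation."""

    def __init__(self, filename: bytes):
        # A bad filename fails here, before any parsing starts.
        self.filename = filename.decode("utf-8")  # assumption: strict UTF-8

    def syntax_error(self, lineno: int, message: str) -> SyntaxError:
        # The error path only uses an already-decoded str and cannot
        # raise a decoding error of its own.
        return SyntaxError(message, (self.filename, lineno, None, None))
```

With an undecodable name such as b"\xff.py", EagerParser raises UnicodeDecodeError in its constructor, while LazyParser accepts the name and only fails later, inside syntax_error(), where the decoding error hides the syntax error that was being reported.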
[issue10785] parser: store the filename as an unicode object
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file20179/parse_filename_obj.patch
[issue10785] parser: store the filename as an unicode object
Changes by STINNER Victor victor.stin...@haypocalc.com: Added file: http://bugs.python.org/file20180/parser_filename_obj.patch