[issue47152] Reorganize the re module sources

2022-04-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: See issue47211 for removing re.TEMPLATE.

[issue47152] Reorganize the re module sources

2022-04-04 Thread Ma Lin
Ma Lin added the comment: > cryptic name In very early versions, "mark" was called register/region. https://github.com/python/cpython/blob/v1.0.1/Modules/regexpr.h#L48-L52 If span is accessed repeatedly, it's faster than Match.span(). Maybe consider renaming it, and make it as public

[issue47152] Reorganize the re module sources

2022-04-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > Match.regs is an undocumented attribute, it seems it has existed since 1991. Can it be removed? It was kept for compatibility with the pre-SRE implementation of the re module. It was an implementation detail in the original Python code, but I am sure

[issue47152] Reorganize the re module sources

2022-04-04 Thread Matthew Barnett
Matthew Barnett added the comment: For reference, I also implemented .regs in the regex module for compatibility, but I've never used it myself. I had to do some investigating to find out what it did! It returns a tuple of the spans of the groups. Perhaps I might have used it if it didn't
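
As an illustration of what the comment describes, here is a minimal sketch comparing the undocumented Match.regs attribute with the same data built from the documented Match.span() method. The pattern and input string are made up for the example.

```python
import re

m = re.match(r"(\d+)-(\d+)", "10-20")

# Undocumented attribute: one (start, end) pair per group, group 0 first.
regs = m.regs

# The same information assembled from the documented API.
spans = tuple(m.span(i) for i in range(m.re.groups + 1))

print(regs)           # ((0, 5), (0, 2), (3, 5))
print(regs == spans)  # True
```

Because regs is undocumented, code that must stay portable is probably safer with the span() form.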

[issue47152] Reorganize the re module sources

2022-04-04 Thread Ma Lin
Ma Lin added the comment: Match.regs is an undocumented attribute, it seems it has existed since 1991. Can it be removed? https://github.com/python/cpython/blob/ff2cf1d7d5fb25224f3ff2e0c678d36f78e1f3cb/Modules/_sre/sre.c#L2871

[issue47152] Reorganize the re module sources

2022-04-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset ff2cf1d7d5fb25224f3ff2e0c678d36f78e1f3cb by Serhiy Storchaka in branch 'main': bpo-47152: Remove unused import in re (GH-32298) https://github.com/python/cpython/commit/ff2cf1d7d5fb25224f3ff2e0c678d36f78e1f3cb

[issue47152] Reorganize the re module sources

2022-04-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 1578f06c1c69fbbb942b90bfbacd512784b599fa by Serhiy Storchaka in branch 'main': bpo-47152: Move sources of the _sre module into a subdirectory (GH-32290) https://github.com/python/cpython/commit/1578f06c1c69fbbb942b90bfbacd512784b599fa

[issue47152] Reorganize the re module sources

2022-04-04 Thread Serhiy Storchaka
Change by Serhiy Storchaka: pull_requests: +30357; pull_request: https://github.com/python/cpython/pull/32298

[issue47152] Reorganize the re module sources

2022-04-03 Thread Serhiy Storchaka
Change by Serhiy Storchaka: pull_requests: +30351; pull_request: https://github.com/python/cpython/pull/32290

[issue47152] Reorganize the re module sources

2022-04-03 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: There are two very different classes with similar names: _sre.SRE_Scanner and re.Scanner. The former is used to implement the Pattern.finditer() method, but it could be used in other cases. The latter is an experimental implementation of generalized lexer
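
To make the distinction concrete, here is a small sketch of both objects. It assumes the undocumented Pattern.scanner() method as the way to reach the low-level scanner from Python code. The patterns, token names, and input strings are invented for the example.

```python
import re

# 1. The scanner object behind Pattern.finditer(), obtained through the
#    undocumented Pattern.scanner() method. Each search() call resumes
#    where the previous match ended and returns None when exhausted.
low_level = re.compile(r"\d+").scanner("10 20 30")
numbers = [m.group() for m in iter(low_level.search, None)]

# 2. re.Scanner: a lexer built from (pattern, action) pairs.
#    An action of None discards the matched text.
lexer = re.Scanner([
    (r"\d+", lambda scanner, text: ("INT", int(text))),
    (r"[a-z]+", lambda scanner, text: ("NAME", text)),
    (r"\s+", None),
])
tokens, remainder = lexer.scan("x 42 y")

print(numbers)
print(tokens, repr(remainder))
```

Neither interface is documented, so both may change between releases.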

[issue47152] Reorganize the re module sources

2022-04-02 Thread STINNER Victor
STINNER Victor added the comment: The re.template() function and the re.TEMPLATE flag are not documented and not tested. The re.Scanner class is not documented but has a test_scanner() test in test_re.

[issue47152] Reorganize the re module sources

2022-04-02 Thread STINNER Victor
STINNER Victor added the comment: See also bpo-40259: "re.Scanner groups".

[issue47152] Reorganize the re module sources

2022-04-02 Thread STINNER Victor
STINNER Victor added the comment: Old python-dev discussions on re.Scanner from 2000 to 2004: * "[Python-Dev] A standard lexer?" (July 2000) https://mail.python.org/archives/list/python-...@python.org/message/MQ4OMCVIVRJWNGHYGI3OUVZQPN5NNNAU/ thread:

[issue47152] Reorganize the re module sources

2022-04-02 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > Is the "import _locale" still used in re/__init__.py? I cannot see any > reference to it in the code and test_re still passes if it's removed. It is true. > *Maybe* it's time to consider that re.template() and re.Scanner are no longer > experimental? Maybe

[issue47152] Reorganize the re module sources

2022-04-02 Thread Anthony Sottile
Anthony Sottile added the comment: would it be possible to expose `parse_template` -- or at least some way to validate that a regex replacement string is correct prior to executing the replacement? I'm currently using that for my text editor:
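
One possible workaround that stays on the public API is sketched below. It is not the approach used in the comment's editor, and replacement_is_valid is a hypothetical helper name. It relies on an assumption about observed CPython behaviour: Pattern.sub() parses a replacement template containing backslashes before it searches, so a bad template raises even when the input is empty.

```python
import re

def replacement_is_valid(pattern: str, repl: str) -> bool:
    """Return True if `repl` parses as a replacement template for `pattern`."""
    compiled = re.compile(pattern)
    try:
        # Substituting into an empty string: the template is parsed even
        # if nothing matches (observed CPython behaviour, an assumption).
        compiled.sub(repl, "")
    except (re.error, IndexError):
        # IndexError covers unknown group names on some Python versions.
        return False
    return True

print(replacement_is_valid(r"(a)", r"\1"))
print(replacement_is_valid(r"(a)", r"\2"))
```

A dedicated, documented validation function would make this dependence on implementation behaviour unnecessary, which is what the request asks for.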

[issue47152] Reorganize the re module sources

2022-04-02 Thread Ma Lin
Ma Lin added the comment: In the `Modules` folder, there are _sre.c/sre.h/sre_constants.h/sre_lib.h files. Will they be put into a folder?

[issue47152] Reorganize the re module sources

2022-04-02 Thread STINNER Victor
STINNER Victor added the comment: It's funny to still see mentions of "experimental stuff" in Python 3.11 (2022), when this "experimental stuff" has been there for 20 years. *Maybe* it's time to consider that re.template() and re.Scanner are no longer experimental? Maybe change their status

[issue47152] Reorganize the re module sources

2022-04-02 Thread STINNER Victor
STINNER Victor added the comment: Is the "import _locale" still used in re/__init__.py? I cannot see any reference to it in the code and test_re still passes if it's removed. The last reference to the _locale module was removed in 2017 by the commit 898ff03e1e7925ecde3da66327d3cdc7e07625ba.

[issue47152] Reorganize the re module sources

2022-04-02 Thread STINNER Victor
STINNER Victor added the comment: $ ls Lib/re/ _compiler.py _constants.py __init__.py _parser.py Thanks, that's a nice enhancement! Serhiy: Would you mind explicitly documenting the 3 deprecated modules in What's New in Python 3.11?

[issue47152] Reorganize the re module sources

2022-04-02 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 1be3260a90f16aae334d993aecf7b70426f98013 by Serhiy Storchaka in branch 'main': bpo-47152: Convert the re module into a package (GH-32177) https://github.com/python/cpython/commit/1be3260a90f16aae334d993aecf7b70426f98013

[issue47152] Reorganize the re module sources

2022-04-01 Thread Guido van Rossum
Guido van Rossum added the comment: 1. If we're reorganizing anyway, I see no reason to keep the old names. 2. For maximum backwards compatibility, I'd say keep as much as you can, as long as keeping it won't interfere with the reorganization.

[issue47152] Reorganize the re module sources

2022-04-01 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Modules with old names are kept (deprecated). The questions are: 1. Should we keep the sre_ prefix in new submodules? Should we prefix them with underscores? 2. Should we keep only non-underscored names in the sre_* modules or underscored names too?

[issue47152] Reorganize the re module sources

2022-04-01 Thread Guido van Rossum
Guido van Rossum added the comment: I don't mind reorganizing this, but I would insist that we keep code using old undocumented things (like the sre_* modules) working for several releases, using the standard deprecation approach.

[issue47152] Reorganize the re module sources

2022-04-01 Thread STINNER Victor
STINNER Victor added the comment: sre_constants, sre_compile and sre_parse are not tested and are not documented. I don't currently consider them public API. If someone has a good reason to use them, IMO we must clearly define which exact API is needed, properly document and test it. If

[issue47152] Reorganize the re module sources

2022-03-30 Thread Ma Lin
Change by Ma Lin: pull_requests: +30266; pull_request: https://github.com/python/cpython/pull/32188

[issue47152] Reorganize the re module sources

2022-03-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: It turns out that pip uses sre_constants in its copy of pyparsing. The problem is already fixed in the upstream of pyparsing and soon should be fixed in pip. We still need to keep sre_constants and maybe other sre_* modules, but deprecate them. > Could

[issue47152] Reorganize the re module sources

2022-03-29 Thread Ma Lin
Ma Lin added the comment: Please don't merge too close to the 3.11 beta1 release date, I'll submit PRs after this is merged.

[issue47152] Reorganize the re module sources

2022-03-29 Thread Dominic Davis-Foster
Dominic Davis-Foster added the comment: Could the sre_parse and sre_constants modules be kept with public names (i.e. without the leading underscore) but within the re namespace? I use them to tokenize and then syntax highlight regular expressions. I did a quick search and found a few other
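
A rough sketch of the kind of tokenizing described here. It assumes the parser is importable either as the private re._parser submodule (after the reorganization) or as the older top-level sre_parse module. Neither is a documented public API, and tokenize is a hypothetical helper name.

```python
import warnings

try:
    # After the reorganization: private submodule of the re package.
    from re import _parser as regex_parser
except ImportError:
    # Before the reorganization: top-level module, later deprecated.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        import sre_parse as regex_parser

def tokenize(pattern):
    """Return the top-level (opcode, argument) pairs of a parsed pattern."""
    return list(regex_parser.parse(pattern))

tokens = tokenize(r"ab+")
for opcode, argument in tokens:
    print(opcode, argument)
```

A syntax highlighter would walk these pairs recursively, since repeat and group opcodes carry nested sub-patterns as their arguments.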

[issue47152] Reorganize the re module sources

2022-03-29 Thread Serhiy Storchaka
Change by Serhiy Storchaka: keywords: +patch; pull_requests: +30255; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/32177

[issue47152] Reorganize the re module sources

2022-03-29 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: I proposed it several years ago on the Python-Dev mailing list and that change was approved in general. The reorganization was deferred because there were several known bugs in the RE engine (fixes for which could potentially be backported) and there