[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread Matthew Barnett
Matthew Barnett added the comment: If we did decide to remove it, but there was still a demand for octal escapes, then I'd suggest introducing \oXXX. -- ___ Python tracker

[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread Ma Lin
Ma Lin added the comment: > I'd still retain \0 as a special case, since it really is useful. Yes, maybe \0 is used widely, I didn't think of it. Changing is troublesome, let's keep it as is. -- ___ Python tracker

[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread Vedran Čačić
Vedran Čačić added the comment: Not very useful, surely (now that we have hex escapes). [I'd still retain \0 as a special case, since it really is useful.] But a lot more useful than a hundred backreferences. And I'm as a matter of principle opposed to changing something that's been in the

[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread Ma Lin
Ma Lin added the comment: Octal escape:
\ooo  Character with octal value ooo
As in Standard C, up to three octal digits are accepted. It only accepts UCS1 characters (ooo <= 0o377):
>>> ord('\377')
255
>>> len('\378')
2
>>> '\378' == '\37' + '8'
True

[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread Vedran Čačić
Vedran Čačić added the comment: The documentation clearly says: > This special sequence can only be used to match one of the first 99 groups. > If the first digit of number is 0, or number is 3 octal digits long, it will > not be interpreted as a group match, but as the character with octal
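The rule quoted from the documentation can be checked directly. The following is a minimal sketch, assuming the running CPython `re` module behaves as the quoted text describes; the variable names are illustrative only.

```python
import re

# One or two digits: a backreference to an earlier group.
backref = re.fullmatch(r"(a)\1", "aa")

# Three octal digits: the character chr(0o101) == "A", not group 101.
three_digits = re.fullmatch(r"\101", "A")

# Leading zero: always an octal escape; 0o12 is the newline character.
leading_zero = re.fullmatch(r"\012", "\n")

print(bool(backref), bool(three_digits), bool(leading_zero))
```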

[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread veaba
veaba <908662...@qq.com> added the comment: Aha, it's me. It's the mysterious power from the East. I have only just learned Python. I've solved my problem. It's a very simple string replacement, and it's solved in three lines. I'm trying to solve the problem of inadvertently finding out in the

[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I do not believe anybody uses handwritten regular expressions with more than 100 groups. But if you generate a regular expression, you can use named groups: (?P<g12345>...) (?P=g12345). -- ___ Python tracker
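As a rough illustration of the named-group suggestion, here is a sketch of a generated pattern with more than 100 groups. The group names `g1` to `g150` and the sample text are hypothetical and chosen only for this example.

```python
import re

# Hypothetical generated pattern: 150 one-character groups named g1..g150,
# followed by a named backreference to the last group.
names = [f"g{i}" for i in range(1, 151)]
pattern = re.compile("".join(f"(?P<{n}>.)" for n in names) + "(?P=g150)")

# 149 filler characters, then "z" (captured by g150), then "z" again
# (matched by the backreference).
text = "x" * 149 + "zz"

match = pattern.fullmatch(text)
last = match.group("g150")

# In a replacement template, \g<name> refers to the group by name.
replaced = pattern.sub(r"\g<g150>", text)
```

Referring to groups by name sidesteps the numeric escape rules entirely, so the number of groups does not affect how the reference is parsed.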

[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread veaba
veaba <908662...@qq.com> added the comment: Yes, this is not a good place to use regular expressions. Using regular expressions: def actual_re_demo(): import re # This is an indefinite string... text = "tf.where(condition, x=None, y=None, name=None) tf.batch_gather ..." #

[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread Vedran Čačić
Vedran Čačić added the comment: I have no problem with long regexes. But those are not only long, those must be _deeply nested_ regexes, where simply 100 is an arbitrary limit. I'm quite sure if you really need depth 100, you must also need a dynamic depth of nesting, which you cannot really

[issue38582] re: backreference number in replace string can't >= 100

2019-10-24 Thread Ma Lin
Ma Lin added the comment: @veaba Posting only in English is fine. > Is this actually needed? Maybe very, very few people dynamically generate some large patterns. > However, \g<...> is not accepted in a pattern. > in the "regex" module I added support for it in a pattern too. Yes, backreference

[issue38582] re: backreference number in replace string can't >= 100

2019-10-24 Thread veaba
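veaba <908662...@qq.com> added the comment: This comes from one of my actual projects (https://github.com/veaba/tensorflow-docs/blob/master/scripts/spider_tensorflow_docs.py#L39-L56). Of course, my approach may not be the right one; it is just an attempt from when I had only just started learning Python. The project works like this: it extracts text according to HTML tags and converts it to Markdown format. Tags need to be wrapped in the "`" symbol, and then, inside a loop, the matched characters are substituted back in via \\*. So, as you can see, I ran into this regex overflow error.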

[issue38582] re: backreference number in replace string can't >= 100

2019-10-24 Thread Vedran Čačić
Vedran Čačić added the comment: Is this actually needed? I can't remember ever needing more than 4 (in a pattern). I find it very hard to believe someone might actually have such a regex with more than a hundred backreferences. Probably it's just a misguided attempt to parse a nested

[issue38582] re: backreference number in replace string can't >= 100

2019-10-24 Thread Matthew Barnett
Matthew Barnett added the comment: A numeric escape of 3 digits is an octal (base 8) escape; the octal escape "\100" gives the same character as the hexadecimal escape "\x40". In a replacement template, you can use "\g<100>" if you want group 100 because \g<...> accepts both numeric and
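The difference between the two template forms can be seen in a small sketch. The 100-group pattern and the sample text below are made up for illustration, and the behaviour shown assumes the standard `re` module as described in this comment.

```python
import re

# Illustrative pattern with exactly 100 capturing groups, one character each.
pattern = re.compile("(.)" * 100)
text = "".join(chr(ord("a") + i % 26) for i in range(100))

# In a replacement template, "\100" is an octal escape: chr(0o100) == "@".
octal_result = pattern.sub(r"\100", text)

# "\g<100>" is an explicit reference to group 100.
group_result = pattern.sub(r"\g<100>", text)

print(repr(octal_result), repr(group_result))
```

The `\g<...>` form is unambiguous for any group number, which is why it is the suggested spelling once a template needs group 100 or higher.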

[issue38582] re: backreference number in replace string can't >= 100

2019-10-24 Thread Ma Lin
Ma Lin added the comment: Backreference number in replace string can't >= 100 https://github.com/python/cpython/blob/v3.8.0/Lib/sre_parse.py#L1022-L1036 If no one takes this, I will try to fix this issue tomorrow. -- nosy: +serhiy.storchaka title: Regular match overflow -> re: