[Python-Dev] Adding new escapes to regex module
Other regex implementations have escape sequences for horizontal whitespace (`\h` and `\H`) and vertical whitespace (`\v` and `\V`).

The regex module already supports `\h`, but I can't use `\v` because it already represents `\x0b` (the vertical tab character), as it does in the re module.

Now that someone has asked for it, I'm trying to find a nice way of adding it. I'm currently thinking that maybe I could use `\y` and `\Y` instead, as they look a little like `\v` and `\V`, and vertical whitespace is also sort-of in the y-direction.

As far as I can tell, only PostgreSQL uses them, and, even then, it's for what everyone else writes as `\b` and `\B`.

I want the regex module to remain compatible with the re module, in case they get added there sometime in the future.

Opinions?
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/AYOYEAFOJW4ZHVYBDVMH4MWKXNLBBJ62/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Adding new escapes to regex module
> On 16 Aug 2022, at 21:24, MRAB wrote:
>
> Other regex implementations have escape sequences for horizontal whitespace
> (`\h` and `\H`) and vertical whitespace (`\v` and `\V`).
>
> The regex module already supports `\h`, but I can't use `\v` because it
> represents `\x0b`, as it does in the re module.

You seem to be mixing the use of \ as the escape for strings with the \ that re uses.

Is it the behaviour that '\' becomes '\\' that means this is a breaking change?

Won't this work?

```
re.compile('\v:\\v')
# which is the same as
re.compile(r'\x0b:\v')
```

Barry

> Now that someone has asked for it, I'm trying to find a nice way of adding
> it, and I'm currently thinking that maybe I could use `\y` and `\Y` instead
> as they look a little like `\v` and `\V`, and, also, vertical whitespace is
> sort-of in the y-direction.
>
> As far as I can tell, only PostgreSQL uses them, and, even then, it's for
> what everyone else writes as `\b` and `\B`.
>
> I want the regex module to remain compatible with the re module, in case they
> get added there sometime in the future.
>
> Opinions?
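For readers following along, the two layers of escaping in question can be checked directly. This is a minimal sketch using only the standard re module; the variable names are illustrative and do not come from the thread.

```python
import re

# Layer 1: the Python string literal.
# '\v' is a single character (VT, 0x0b); '\\v' is two characters,
# a backslash followed by 'v'.
literal = '\v:\\v'
literal_length = len(literal)          # VT, ':', backslash, 'v'

# A raw string keeps every backslash, so it is a different string
# even though it describes the same regex.
raw = r'\x0b:\v'
raw_length = len(raw)                  # '\', 'x', '0', 'b', ':', '\', 'v'

# Layer 2: the regex parser.
# It accepts a literal VT character, the `\x0b` escape, and the `\v`
# escape, and treats all three as "match the VT character".
subject = '\x0b:\x0b'
literal_matches = re.fullmatch(literal, subject) is not None
raw_matches = re.fullmatch(raw, subject) is not None

print(literal_length, raw_length, literal_matches, raw_matches)
```

Both patterns compile and match the same subject, which shows that the regex-level `\v` is already taken: it cannot be given a new meaning without changing what existing patterns match.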
[Python-Dev] Re: Adding new escapes to regex module
On 2022-08-16 22:14, Barry Scott wrote:
> On 16 Aug 2022, at 21:24, MRAB wrote:
>>
>> Other regex implementations have escape sequences for horizontal whitespace (`\h` and `\H`) and vertical whitespace (`\v` and `\V`).
>>
>> The regex module already supports `\h`, but I can't use `\v` because it represents `\x0b`, as it does in the re module.
>
> You seem to be mixing the use of \ as the escape for strings with the \ that re uses.
>
> Is it the behaviour that '\' becomes '\\' that means this is a breaking change?
>
> Won't this work?
>
> ```
> re.compile('\v:\\v')
> # which is the same as
> re.compile(r'\x0b:\v')
> ```

Some languages, e.g. Perl, have a dedicated syntax for writing regexes, and they take `\n` (a backslash followed by 'n') to mean "match a newline".

Other languages, including Python, write regexes as string literals, which can contain an actual newline, but they also take `\n` (a backslash followed by 'n') to mean "match a newline".

Thus:

    >>> import re
    >>> print(re.match('\n', '\n'))  # Literal newline.
    <re.Match object; span=(0, 1), match='\n'>
    >>> print(re.match('\\n', '\n'))  # `\n` sequence.
    <re.Match object; span=(0, 1), match='\n'>

On the other hand:

    >>> print(re.match('\b', '\b'))  # Literal backspace.
    <re.Match object; span=(0, 1), match='\x08'>
    >>> print(re.match('\\b', '\b'))  # `\b` sequence, which means a word boundary.
    None
    >>>

The problem is that the re and regex modules already take the `\v` sequence (a backslash followed by 'v') to mean "match the '\v' character", so:

    re.compile('\v')

and:

    re.compile('\\v')

mean exactly the same thing.

>> Now that someone has asked for it, I'm trying to find a nice way of adding it, and I'm currently thinking that maybe I could use `\y` and `\Y` instead as they look a little like `\v` and `\V`, and, also, vertical whitespace is sort-of in the y-direction.
>>
>> As far as I can tell, only PostgreSQL uses them, and, even then, it's for what everyone else writes as `\b` and `\B`.
>>
>> I want the regex module to remain compatible with the re module, in case they get added there sometime in the future.
>>
>> Opinions?
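The point about `\v` can be checked directly. Until a dedicated escape exists, vertical whitespace can also be approximated with an explicit character class. The sketch below uses only the standard re module. `VERTICAL_WS` and `split_on_vertical_ws` are hypothetical names, and the set of seven code points is an assumption borrowed from what Perl documents for its `\v`. It is not something the re or regex modules define.

```python
import re

VT = '\x0b'

# Both spellings compile to a pattern that matches one VT character:
# the first passes a real VT to the regex parser, the second passes
# the two-character escape `\v`, which the parser also reads as VT.
literal_pattern = re.compile('\v')
escaped_pattern = re.compile('\\v')

literal_hits_vt = literal_pattern.fullmatch(VT) is not None
escaped_hits_vt = escaped_pattern.fullmatch(VT) is not None

# Hypothetical stand-in for a "vertical whitespace" escape.
# Assumption: LF, VT, FF, CR, NEL, LINE SEPARATOR, PARAGRAPH SEPARATOR,
# i.e. the characters Perl lists for its own \v.
VERTICAL_WS = re.compile('[\n\x0b\f\r\x85\u2028\u2029]')


def split_on_vertical_ws(text):
    """Split text at every single vertical-whitespace character."""
    return VERTICAL_WS.split(text)


parts = split_on_vertical_ws('a\nb\x0bc\u2028d')
print(literal_hits_vt, escaped_hits_vt, parts)
```

An explicit class like this works in both modules today, at the cost of being longer and easier to get wrong than a single escape such as the proposed `\y`.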