Hi all,

Regexes are really useful in many places, and to me it's sad to see the builtin "re" module require the source string as a separate argument. It would be much more elegant to simply write "s.search(pattern)" than "re.search(pattern, s)". I suggest building all regex operations into the str class itself, along with a new syntax for regular expressions.
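For comparison, here is a rough sketch of how far the method-style calls can be approximated today, without any new syntax. The name "RegexStr" is made up purely for this illustration and is not part of Python or of the proposal; it only wraps the existing re functions, and patterns are still ordinary strings.

```python
import re


class RegexStr(str):
    """Hypothetical str subclass for illustration; it just delegates to re."""

    def search(self, pattern, flags=0):
        # The string itself is the implicit subject of the search.
        return re.search(pattern, self, flags)

    def findall(self, pattern, flags=0):
        return re.findall(pattern, self, flags)

    def sub(self, pattern, repl, flags=0):
        # re.sub takes count before flags, so flags is passed by keyword.
        return re.sub(pattern, repl, self, flags=flags)


s = RegexStr("1A3c5E7g9I")
print(s.findall(r"[a-z]", re.IGNORECASE))
print(s.sub(r"[a-z]", "hovercraft", re.IGNORECASE))
```

The obvious limitation is that only strings explicitly wrapped in RegexStr get these methods; plain string literals do not, which is part of why the proposal asks for support in str itself.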
With the proposed syntax, a "findall" for any lowercase letter in a string would look like this:

>>> "1a3c5e7g9i".findall(!%[a-z]%)
['a', 'c', 'e', 'g', 'i']

A "findall" for any letter, case insensitive:

>>> "1A3c5E7g9I".findall(!%[a-z]%i)
['A', 'c', 'E', 'g', 'I']

A substitution replacing every lowercase letter with the string " WOOF WOOF ":

>>> "1a3c5e7g9i".sub(!%[a-z]% WOOF WOOF %)
'1 WOOF WOOF 3 WOOF WOOF 5 WOOF WOOF 7 WOOF WOOF 9 WOOF WOOF '

A substitution replacing every letter, case insensitive, with the string "hovercraft":

>>> "1A3c5E7g9I".sub(!%[a-z]%hovercraft%i)
'1hovercraft3hovercraft5hovercraft7hovercraft9hovercraft'

You may wonder why I chose the regex delimiters as "!%" ... "%" [ ... "%" ]. The choice of "%" was purely arbitrary; I just thought of it since there seems to be a convention to use "%" in PHP regex patterns. The "!" is in front to disambiguate it from the "%" modulo operator or the "%" string formatting operator, and because "!" on its own is currently not an operator in Python.

Another potential idea is to simply use "!" to denote the start of a regex, and use the character immediately following it to delimit the regex. Thus all of the following would be regexes matching a single lowercase letter:

!%[a-z]%
!#[a-z]#
!?[a-z]?
!/[a-z]/

And all of the following would be substitution regexes replacing a single case-insensitive letter with "@":

!%[a-z]%@%i
!#[a-z]#@#i
!?[a-z]?@?i
!/[a-z]/@/i

Some examples of how to use this:

>>> "pneumonoultramicroscopicsilicovolcanokoniosis".findall(!%[aeiou]+%)
['eu', 'o', 'ou', 'a', 'i', 'o', 'o', 'i', 'i', 'i', 'o', 'o', 'a', 'o', 'o', 'io', 'i']

>>> "GMzKqtnnyGdqIQNlQSLidbDlqpdhoRbHrrUAgyhMgkZKYVhQuI".search(!%[^A-Z][A-Z]{3}([a-z])[A-Z]{3}[^A-Z]%)
<regex_match; span=(11, 20); match='qIQNlQSLi'>

>>> "My name is Joanne.".findall(!%[A-Z][a-z]+%)
['My', 'Joanne']

Thoughts?

Sincerely,
Ken
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/