> a huge advantage of REs is that they are common to many
> languages. You can take a regex from grep to Perl to your editor to
> Python. They're not absolutely identical, of course, but the basics
> are all the same. Creating a new search language means everyone has to
> learn anew.
>
> ChrisA

1) I'm not suggesting we get rid of the re module (the VE implementation I linked requires it)
2) You can easily output regex from verbal expressions
3) Verbal expressions are implemented in many different languages too: https://verbalexpressions.github.io/
4) It even has a generic interface that all implementations are meant to follow: https://github.com/VerbalExpressions/implementation/wiki/List-of-methods-to-implement

Note that the entire documentation is 250 words, while just the syntax portion of the Python docs for the re module is over 3000 words.

> You think that example is more readable than the proposed translation
>     ^(http)(s)?(\:\/\/)(www\.)?([^\ ]*)$
> which is better written
>     ^https?://(www\.)?[^ ]*$
> or even
>     ^https?://[^ ]*$

Yes. I find it *far* more readable. It's not a soup of symbols like Perl code. I can only surmise that you're fluent in regex because it seems difficult for you to see how the above could be less readable than English words.

> which makes it obvious that the regexp is not very useful from the
> word "^"? (It matches only URLs which are the only thing, including
> whitespace, on the line, probably not what was intended.)

I could tell it only matches URLs that are the only thing inside the string because it clearly says: start_of_line() and end_of_line(). I would have had to refer to a reference to know that "^" doesn't always mean "not", it sometimes means "start of string" and probably other things. I would also have to check a reference to know that "$" can mean "end of string" (and probably other things).

> Are those groups capturing in Verbal Expressions? The use of "find"
> (~ "search") rather than "match" is disconcerting to the experienced
> user.

You can alternately use the word "then". The source code is just one Python file. It's very easy to read.

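As a rough illustration of point 2 above (producing a regex from chained method calls), here is a minimal sketch. `TinyVerbal` is a hypothetical class written for this message, not the actual VerbalExpressions implementation; it only borrows the method names mentioned in this thread, and the real package may well build its patterns differently.

```python
import re


class TinyVerbal:
    """Hypothetical, minimal builder. NOT the real VerbalExpressions package."""

    def __init__(self):
        self._parts = []

    def _add(self, fragment):
        self._parts.append(fragment)
        return self  # returning self is what allows the calls to be chained

    def start_of_line(self):
        return self._add("^")

    def end_of_line(self):
        return self._add("$")

    def then(self, text):
        # re.escape makes characters such as "." match literally.
        return self._add("(?:" + re.escape(text) + ")")

    find = then  # assumed alias, mirroring the "find"/"then" remark in the thread

    def maybe(self, text):
        return self._add("(?:" + re.escape(text) + ")?")

    def anything_but(self, chars):
        return self._add("[^" + re.escape(chars) + "]*")

    def source(self):
        # The plain regex string, usable with the re module directly.
        return "".join(self._parts)

    def compile(self):
        return re.compile(self.source())


url = (
    TinyVerbal()
    .start_of_line()
    .then("http")
    .maybe("s")
    .then("://")
    .maybe("www.")
    .anything_but(" ")
    .end_of_line()
)
pattern = url.compile()

print(url.source())
print(bool(pattern.search("https://www.python.org")))      # True
print(bool(pattern.search("see https://www.python.org")))  # False: "^" anchors at the start
print(bool(pattern.search("https://a b")))                 # False: a space is not allowed
```

The exact pattern text printed by `url.source()` depends on how `re.escape` treats punctuation in your Python version, so only the match results are annotated.
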
I actually like "then" over "find" for the example:

verbal_expression.start_of_line()
    .then('http')
    .maybe('s')
    .then('://')
    .maybe('www.')
    .anything_but(' ')
    .end_of_line()

> What does alternation look like?

.OR(option1).OR(option2).OR(option3)...

> How about alternation of
> non-trivial regular expressions?

.OR(other_verbal_expression)

> As far as I can see, Verbal Expressions are basically a way of making
> it so painful to write regular expressions that people will restrict
> themselves to regular expressions

What's so painful to write about them? Does your IDE not have autocompletion? I find REs so painful to write that I usually just use string methods if at all feasible.

> I don't think that this failure to respect the
> developer's taste is restricted to this particular implementation,
> either.

I generally find it distasteful to write a pseudolanguage in strings inside of other languages (this applies to SQL as well), especially when the design principles of that pseudolanguage are *diametrically opposed* to the design principles of the host language. A key principle of Python's design is: "you read code a lot more often than you write code, so emphasize readability". Regex seems to be based on: "Do the most with the fewest keystrokes. Readability be damned!". It makes a lot more sense to wrap the pseudolanguage in constructs that bring it in line with the host language than to take on the mental burden of trying to comprehend two different languages at the same time.

If you disagree, nothing's stopping you from continuing to write REs the old-fashioned way. Can we at least agree that baking special re syntax directly into the language is a bad idea?

On Wed, Mar 29, 2017 at 11:49 PM, Nick Coghlan <ncogh...@gmail.com> wrote:

> On 28 March 2017 at 01:17, Simon D.
> <si...@acoeuro.com> wrote:
> > It would ease the use of regexps in Python
>
> We don't really want to ease the use of regexps in Python - while
> they're an incredibly useful tool in a programmer's toolkit, they're
> so cryptic that they're almost inevitably a maintainability nightmare.
>
> Baking them directly into the language runtime also locks people in to
> a particular regex engine implementation, rather than being able to
> swap in a third party one if they choose to do so (as many folks
> currently do with the `regex` PyPI module).
>
> So it's appropriate to keep them as a string-based library level
> capability, and hence on a relatively level playing field with less
> comprehensive, but typically easier to maintain, options like string
> methods and third party text parsing libraries (such as
> https://pypi.python.org/pypi/parse for something close to the inverse
> of str.format)
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
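
To make the "string methods" option from the quoted message concrete, here is a minimal sketch that checks the same condition as the shortest pattern discussed in this thread, `^https?://[^ ]*$`, without a regex. `looks_like_url` is a hypothetical helper name chosen for illustration, and neither version validates URLs in any real sense.

```python
import re

# The shortest pattern from the thread, kept here only for comparison.
URL_RE = re.compile(r"^https?://[^ ]*$")


def looks_like_url(text):
    """Hypothetical helper: string-method counterpart of URL_RE."""
    # str.startswith accepts a tuple of alternative prefixes.
    return text.startswith(("http://", "https://")) and " " not in text


samples = [
    "http://example.com",
    "https://www.python.org/dev",
    "ftp://example.com",
    "https://a b",
    "see http://example.com",
]
for sample in samples:
    # Print both results side by side so they can be compared.
    print(sample, looks_like_url(sample), bool(URL_RE.search(sample)))
```

For these samples the two checks agree; which one reads better is the question this thread is arguing about.
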