Ben Last wrote:
north_american_number_re = (RE().start
    .literal('(').followed_by.exactly(3).digits.then.literal(')')
    .then.one.literal("-").then.exactly(3).digits
    .then.one.dash.followed_by.exactly(4).digits.then.end
    .as_string())

Very cool. It's a bit verbose for my taste, and I'm not sure how well it will cope with nested structure.

Here's my take on what readable regexps could look like:

north_american_number_re = RE.compile(r"""
    ^
    "(" digit{3} ")"  # And why shouldn't a regexp
    "-" digit{3}      # include an embedded comment?
    "-" digit{4}
    $
""")

The problem with Perl-style regexp notation isn't so much that it's terse - it's that the syntax is irregular (sic) and doesn't follow modern principles for lexical structure in computer languages. You can get a long way just by ignoring whitespace, putting literals in quotes and allowing embedded comments.
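To give a feel for how little machinery those three rules need, here is a rough sketch of a translator from the quoted-literal notation into an ordinary regexp string. Everything in it is an assumption made for illustration: the function name translate, the token grammar, and the table of named classes (only digit appears in the example above). It is not an existing library, and anything that is not a comment, whitespace, a quoted literal or a known name is simply passed through untouched.

```python
import re

# Hypothetical token grammar for the quoted-literal notation.
_TOKEN = re.compile(r"""
    (?P<comment> \#[^\n]* )     # comment runs to the end of the line
  | (?P<space>   \s+ )          # whitespace is insignificant
  | (?P<literal> "[^"]*" )      # quoted literal, to be matched verbatim
  | (?P<name>    [A-Za-z_]+ )   # named character class, e.g. digit
  | (?P<other>   . )            # anything else passes through unchanged
""", re.VERBOSE)

# Assumed names; only `digit` is used in the example notation.
_NAMED_CLASSES = {"digit": r"\d", "word": r"\w", "space": r"\s"}


def translate(readable):
    """Translate the readable notation into an ordinary regexp string."""
    parts = []
    for m in _TOKEN.finditer(readable):
        kind, text = m.lastgroup, m.group()
        if kind in ("comment", "space"):
            continue                              # dropped entirely
        if kind == "literal":
            parts.append(re.escape(text[1:-1]))   # strip quotes, escape contents
        elif kind == "name":
            parts.append(_NAMED_CLASSES[text])    # KeyError on an unknown name
        else:
            parts.append(text)                    # ^, $, {3}, and so on
    return "".join(parts)


north_american_number_re = re.compile(translate(r"""
    ^
    "(" digit{3} ")"  # And why shouldn't a regexp
    "-" digit{3}      # include an embedded comment?
    "-" digit{4}
    $
"""))
```

With this sketch, north_american_number_re.match("(555)-123-4567") succeeds and the same number without the parentheses is rejected. A literal containing a double quote is not handled; that would need an escape rule the notation above does not define.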

Setting the re.VERBOSE flag achieves two out of three, so you can write:

import re

north_american_number_re = re.compile(r"""
    ^
    \( \d{3} \)   # Definite improvement, though I really miss putting
    - \d{3}       # literals in quotes.
    - \d{4}
    $
""", re.VERBOSE)

It's too bad re.VERBOSE isn't the default.
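For anyone who wants to try this, a small self-contained check follows. Two things to note: the literal parentheses have to be escaped, since in verbose mode an unescaped ( ... ) is still a capturing group, and verbose mode can also be switched on from inside the pattern with the inline flag (?x), which is about as close to "default" as the standard re module gets. The sample phone numbers are made up for illustration.

```python
import re

# Verbose mode requested through the flags argument.
north_american_number_re = re.compile(r"""
    ^
    \( \d{3} \)   # three digits inside literal parentheses
    - \d{3}       # dash, three digits
    - \d{4}       # dash, four digits
    $
""", re.VERBOSE)

# The same pattern with verbose mode switched on by the inline flag (?x),
# which must appear at the very start of the pattern.
inline_re = re.compile(r"(?x) ^ \( \d{3} \) - \d{3} - \d{4} $")

for candidate in ["(555)-123-4567", "555-123-4567"]:
    print(candidate,
          bool(north_american_number_re.match(candidate)),
          bool(inline_re.match(candidate)))
```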

regards, Anders

