Re: Regexps (was Re: Invalid Operating System)

Robert Rothenberg Sun, 17 Dec 2006 21:30:09 +0100 (CET)

On 17/12/06 18:16 demerphq wrote:
> On 12/17/06, Robert Rothenberg <[email protected]> wrote:
>> On 17/12/06 08:52 Dave Hodgkinson wrote:
>>
>> > Reach for the root cause: regexps themselves are hateful. Nasy,
>> > cryptic line noise.
>>
>> As has been said in another message, regexps are their own language, which
>> has origins in theoretical computer science and mathematics.  Like most
>> expressions in mathematics and logic, it looks like "nas[t]y cryptic line
>> noise," but it makes sense to those who know how to read it, and it's the
>> most efficient means of expressing the concept.


 [...]

> Well, the two come from different eras so it's hardly surprising that
> they don't match. I mean you'd find it hard to read English from the
> 15th century, and someone from the 15th century would have the same
> troubles reading modern English.

Bad comparison: traditional regexps are much easier to read than the ones
used in contemporary programming languages.

That issue aside, note that I said "Like most expressions in mathematics and
logic... it's the most efficient means of expressing the concept."  Regexps
are mathematical expressions for strings instead of numbers.

You could just as well complain that an expression such as

  dist = sqrt( sqr(x_0 - x_1) + sqr(y_0 - y_1) )

is too cryptic.  You could spell it out in several lines with lots of
comments for the mathematically illiterate, but the compiler may produce
sub-optimal code, and it will make less sense to those who know how to read
equations.
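
To illustrate the point about spelling it out, here is a rough sketch in
Python (my choice of language for the example; the function and variable
names are made up).  Both functions compute the same distance.  The sketch
only illustrates the readability argument and says nothing about what code
a compiler would generate for either form.

```python
import math

def dist_compact(x0, y0, x1, y1):
    # One line, mirroring the equation directly.
    return math.sqrt((x0 - x1) ** 2 + (y0 - y1) ** 2)

def dist_spelled_out(x0, y0, x1, y1):
    # The same computation with one step per line.
    dx = x0 - x1                      # horizontal difference
    dy = y0 - y1                      # vertical difference
    dx_squared = dx * dx
    dy_squared = dy * dy
    sum_of_squares = dx_squared + dy_squared
    return math.sqrt(sum_of_squares)
```

Anyone who reads equations will recognise the first version at a glance;
the second has to be reassembled in your head before you can see what it is.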

Likewise, you could spell out your regexp with dozens of lines of indexof()
and substr() function calls, but the result would be less comprehensible
than a single regexp, more likely to have bugs, and not compiled into an
efficient finite automaton.
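
As a small sketch of that comparison, again in Python: the task (splitting
a "key = value" line), the pattern and the function names are all invented
for illustration and do not come from this thread.  The hand-written
version uses find() and slicing as stand-ins for indexof() and substr().
Note that Python's re module is a backtracking matcher, so this shows the
readability side of the argument only, not the automaton side.

```python
import re

# Hypothetical task: split a line shaped like "key = value" into two parts.
KEY_VALUE = re.compile(r"^\s*(\w+)\s*=\s*(.*?)\s*$")

def parse_with_regexp(line):
    """Return (key, value), or None if the line does not match."""
    m = KEY_VALUE.match(line)
    return (m.group(1), m.group(2)) if m else None

def parse_by_hand(line):
    """The same job with find() and slicing instead of a regexp."""
    pos = line.find("=")
    if pos == -1:
        return None                   # no "=" at all
    key = line[:pos].strip()
    value = line[pos + 1:].strip()
    # The key must be one or more letters, digits or underscores.
    if not key or not all(c.isalnum() or c == "_" for c in key):
        return None
    return (key, value)
```

The two are meant to agree on simple inputs such as "  name = Robert  ",
but I would not swear that they agree on every input, which is rather the
point: each rule that the pattern states once has to be re-implemented,
and re-checked, by hand.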
