Jim Wright <[EMAIL PROTECTED]> writes:

> what definition of regexp would you be following?  or would this be
> making up something new?

It wouldn't be new; Mauro is definitely referring to regexps as
normally understood.  The regexp APIs found on today's Unix systems
might be usable, but unfortunately those are not available on Windows.
They also lack support for the very useful non-greedy matching
quantifier (the "?" modifier to the "*" operator) introduced by Perl 5
and supported by most of today's major regexp implementations: Python,
Java, Tcl, etc.
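
To illustrate why the non-greedy quantifier matters, here is a small
sketch using Python's re module (one of the implementations mentioned
above).  The sample HTML string and the variable names are made up
for the example; nothing here is taken from Wget itself.

```python
import re

# Hypothetical input: two links on one line.
text = '<a href="one.html">1</a> <a href="two.html">2</a>'

# Greedy ".*" runs to the last quote on the line, so both links
# are swallowed into a single match.
greedy = re.findall(r'href="(.*)"', text)

# Non-greedy ".*?" stops at the first closing quote, so each link
# is matched separately.
non_greedy = re.findall(r'href="(.*?)"', text)

print(greedy)
print(non_greedy)
```

With only greedy quantifiers one would have to write something like
[^"]* instead, which works for a single delimiter character but gets
awkward for multi-character delimiters.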

One idea was to use PCRE, bundling it with Wget for the sake of
Windows and systems without PCRE.  Another (http://tinyurl.com/elp7h)
was to use and bundle Emacs's regex.c, the version of GNU regex
shipped with GNU Emacs.  It is small (one source file) and offers
Unix-compatible basic and extended regexps, but also supports the
non-greedy quantifier and non-capturing groups.

See the message and the related discussion at http://tinyurl.com/mdwhx
for more about this topic.

> I'm not quite understanding the comment about the comma and needing
> escaping for literal commas.

Supporting PATTERN1,PATTERN2,... would require having a way to quote
the comma character.  But there is little reason for a specific comma
syntax since one can always use (PATTERN1|PATTERN2|...).

Being unable to have a comma in the pattern is a shortcoming in the
current -R/-A options.
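
A rough sketch of the comma problem, again in Python.  The helper
split_on_commas is hypothetical and only mimics the idea of a
comma-separated list; it is not Wget's actual option parsing, and the
patterns are invented for the example.

```python
import re

def split_on_commas(spec):
    """Naive comma-separated list parsing (hypothetical helper)."""
    return spec.split(",")

# Two intended patterns, but the second contains a comma in {1,3}.
spec = r"\.jpe?g$,^img[0-9]{1,3}\.png$"
parts = split_on_commas(spec)
# The list now has three pieces, and the second pattern is cut in
# half at the comma inside the {1,3} interval.
print(parts)

# With alternation no separator is needed, so the comma stays an
# ordinary part of the pattern.
combined = re.compile(r"(\.jpe?g$|^img[0-9]{1,3}\.png$)")
print(bool(combined.search("photo.jpeg")))
print(bool(combined.search("img42.png")))
print(bool(combined.search("img1234.png")))
```

The alternative would be to invent an escape for the comma, which
means one more layer of quoting on top of the regexp's own.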

> I do like the [file|path|domain]: approach.  very nice and flexible.

Thanks.
