On 04/04/2012 12:02 PM, Ángel González wrote:
> On 04/04/12 20:16, Gijs van Tulder wrote:
>> 1. You can match complete urls, instead of just the directory prefix
>> or the file name suffix (which you can do with --accept and
>> --include-directories).
>> 2. You can use regular expressions to do the matching, which is
>> sometimes easier than using a list of wildcard patterns.
>>
>> Now this isn't a new idea (there are long discussions in the archive,
>> see [1]). But somehow the previous attempts didn't make it, so I
>> thought I'd send my own version. It's a small patch, I've been using
>> it for a while and found it really useful.
>>
>> I've made two versions of the patch: one uses PCRE, the other uses the
>> gnulib regex library, which is probably easier to integrate.
>>
>> Regards,
>>
>> Gijs
> I really like PCRE, but I think the default should be POSIX regex (those
> you called "gnulib regex library"), just as every other command-line
> tool, such as sed or grep. There could be a --perl-regexp switch to
> change it (which could take advantage of the posix interface of pcre).

sed and grep's default regexes (BREs) are next-to-useless, and are only
used by default for historical compatibility. They manage to be useful
much of the time for grep, but sed is greatly hampered - it's somewhat
improved by GNU extensions to the POSIX BRE syntax, which become hard to
live without when you try to use sed on a system that lacks them.

EREs (what you get with "grep -E" or "egrep") are a big step up, and are
still POSIX, so that's what I'd recommend as a default. OTOH, PCREs are
completely compatible with well-formed EREs, in which case there seems
little harm in letting those be default. But it would be nice to fall
back onto POSIX regexes when PCRE is not found.

There should be information added to the --version output, to declare
the presence of regex support, and what types are supported. Also, I
think regex type selection should be with something like
--regex-type=pcre, rather than something like --perl-regexp, to allow
for easy expansion if need be.

-mjc
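
To illustrate the BRE/ERE point above, here is a minimal sketch using the POSIX regcomp()/regexec() interface. It is not taken from either patch: the helper name url_matches, the pattern and the sample URLs are all made up for the example. It only shows that the same pattern string is read differently depending on whether REG_EXTENDED is passed.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of wget or of either patch):
   returns true if `url` matches `pattern`.
   `cflags` picks the POSIX flavor: 0 for BRE, REG_EXTENDED for ERE. */
bool url_matches(const char *pattern, const char *url, int cflags)
{
    regex_t re;
    bool matched;

    assert(pattern != NULL && url != NULL);

    /* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return false;   /* this sketch treats a bad pattern as "no match" */

    matched = (regexec(&re, url, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

With REG_EXTENDED, a pattern such as "\.(jpg|png)$" uses grouping and alternation, so it accepts URLs ending in .jpg or .png. Without that flag, POSIX BRE treats the unescaped "(", "|" and ")" as ordinary characters, so the same pattern no longer matches such URLs.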
