The probability of encountering a $ sign somewhere inside a URL is not insignificant... I agree that it's very unlikely (perhaps even illegal) to use ^ in URLs, but $ is sometimes used.
I'd have to take a look at the spec, but I think both characters should be URL-encoded anyway. Maybe it'd be a good idea to include a URL-normalizing filter that would encode everything properly (according to www-url-encoding) before the regexes are applied?
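
Something along these lines, as a minimal sketch only: the class name UrlEscapeSketch and the method escape are hypothetical and not part of Nutch, and the assumption that only $ and ^ need encoding is mine. A real normalizer would follow whatever the URL spec actually requires.

```java
// Hypothetical helper, not part of Nutch: percent-encodes characters that
// would otherwise collide with regex metacharacters in a URL string.
final class UrlEscapeSketch {

    // Assumption: only '$' and '^' are encoded here; the full set of
    // characters to encode should be taken from the URL spec.
    private static final String TO_ENCODE = "$^";

    static String escape(String url) {
        StringBuilder out = new StringBuilder(url.length() + 8);
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            if (TO_ENCODE.indexOf(c) >= 0) {
                // Emit the two-digit uppercase hex code, e.g. '$' becomes %24.
                out.append('%').append(String.format("%02X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Since '%' itself is left alone, running this over an already-encoded URL does not change it again, so the filter could be applied safely before the regex step.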
D.
