RFC 2396 includes a regular expression for parsing URIs, as well as
the Backus-Naur description of a URI from which you could write a
parser or a regular expression.
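
As a rough illustration, here is a minimal Python sketch using the expression given in Appendix B of RFC 2396. The helper name split_uri and the dictionary keys are illustrative only and do not come from the RFC. Note that the expression splits a string already known to be a URI into its components; it does not locate URLs inside arbitrary text, and it leaves the authority part (user info, host, port) as one piece.

```python
import re

# Regular expression from RFC 2396, Appendix B.
# Groups: 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
URI_RE = re.compile(r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')


def split_uri(uri):
    """Split a URI string into its five top-level components.

    Every part of the expression is optional, so match() always succeeds;
    components that are absent come back as None.
    """
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


if __name__ == "__main__":
    print(split_uri("http://www.foo.com:80/bar"))
```

Because the authority group stops at the first "/", "?" or "#", an "@" that appears later in the path cannot be mistaken for the user-info separator.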
Charles Yeomans
On Apr 18, 2006, at 1:44 AM, Christer Olsson wrote:
I'm trying to write a regex for extracting URLs, and have one
tricky case I can't catch. I'm now using the following (greedy) regex
(http|https)://(((.*):(.*)@)?)(@)?([^:>/\s""]*)
which (as far as I can see) will catch URLs like
http://www.foo.com
http://www.foo.com/bar
http://www.foo.com:80/bar
http://www.foo.com:80/bar:bar
http://user:[EMAIL PROTECTED]:80/bar
http://@www.foo.com:80/bar
but will fail on this
http://@www.foo.com:80/[EMAIL PROTECTED]
Any help very much appreciated (and I would prefer to stay pure
regex, with no post processing of the match)
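
The failure most likely comes from the greedy (.*):(.*)@ part, which can run past the host and stop at an "@" that sits in the path. The sketch below is one possible way to tighten it, not a definitive answer: it uses Python only for illustration, the name TIGHTER and the test URL are hypothetical (the real failing URL is obscured in the archive), and the pattern uses only basic Perl-style syntax, so it should carry over to similar engines.

```python
import re

# The pattern from the question, unchanged (group 7 is the host).
ORIGINAL = re.compile(r'(http|https)://(((.*):(.*)@)?)(@)?([^:>/\s""]*)')

# Hypothetical tightened variant: user and password may not contain
# '@', '/', or whitespace, so the optional user:password@ part cannot
# run into the path. Groups: 1 = scheme, 2 = user, 3 = password, 4 = host.
TIGHTER = re.compile(r'(https?)://(?:([^:@/\s]*):([^@/\s]*)@)?@?([^:>/\s"]*)')

# Stand-in for the failing case: an "@" appears after the first "/".
url = "http://@www.foo.com:80/bar@baz"

old_host = ORIGINAL.search(url).group(7)
new_host = TIGHTER.search(url).group(4)

print("original pattern host:", old_host)
print("tightened pattern host:", new_host)
```

With this stand-in URL the original pattern reports "baz" as the host, while the tightened one reports "www.foo.com". The cases with a user and password before the host still match as before.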