On Friday, August 17, 2001, at 01:07, Sven Neuhaus wrote:

> Are mailto: and news: URLs official? If so they're missing..

I don't know of any "official" list of valid schemes in URLs. 
RFC 2396 ("Uniform Resource Identifiers (URI): Generic Syntax") 
has this to say:

3. URI Syntactic Components

    The URI syntax is dependent upon the scheme.  In general, absolute
    URI are written as follows:

       <scheme>:<scheme-specific-part>

    An absolute URI contains the name of the scheme being used (<scheme>)
    followed by a colon (":") and then a string (the <scheme-specific-
    part>) whose interpretation depends on the scheme.

    The URI syntax does not require that the scheme-specific-part have
    any general structure or set of semantics which is common among all
    URI.  However, a subset of URI do share a common syntax for
    representing hierarchical relationships within the namespace.  This
    "generic URI" syntax consists of a sequence of four main components:

       <scheme>://<authority><path>?<query>

    each of which, except <scheme>, may be absent from a particular URI.

It goes on (in section 3.1) to define "scheme" as a letter 
followed by any number of letters, digits, "+", "-", or ".", 
which corresponds to the regex /^[a-z][a-z0-9+.-]*$/i.
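
For anyone who wants to try that check, here is a minimal sketch in 
Python. The names SCHEME_RE and is_valid_scheme are my own, not 
anything from the RFC. The grammar in section 3.1 is 
scheme = alpha *( alpha | digit | "+" | "-" | "." ), so the sketch 
uses "*" after the first letter and accepts one-letter schemes.

```python
import re

# RFC 2396, section 3.1:
#   scheme = alpha *( alpha | digit | "+" | "-" | "." )
# One letter, then zero or more letters, digits, "+", "." or "-".
SCHEME_RE = re.compile(r"[a-z][a-z0-9+.-]*", re.IGNORECASE)


def is_valid_scheme(name: str) -> bool:
    """Return True if name is syntactically a valid URI scheme."""
    # fullmatch anchors at both ends, so no ^ or $ is needed.
    return SCHEME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["http", "mailto", "news", "x-foo.bar+baz", "3com", "no_underscores"]:
        print(candidate, is_valid_scheme(candidate))
```

This only tests syntax. It says nothing about whether a scheme is 
registered anywhere or understood by any client.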

So, it appears that in URIs and (as a subset) URLs, not only are 
"mailto" and "news" syntactically valid schemes, but so is an 
unbounded number of other strings. Furthermore, unless the 
original problem (find URLs in a string) is constrained to a 
known set of schemes, matching on the generic syntax alone will 
generate a multitude of false positives.
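
To illustrate the false-positive problem, here is a rough Python 
sketch. Everything in it is an assumption of mine for the sake of 
the example: the deliberately naive pattern NAIVE_URI_RE, the 
KNOWN_SCHEMES set (which schemes belong there is a policy decision, 
not something RFC 2396 settles), and the sample text.

```python
import re

# Deliberately naive: anything shaped like <scheme>:<non-space characters>.
NAIVE_URI_RE = re.compile(r"\b[a-z][a-z0-9+.-]*:\S+", re.IGNORECASE)

# Hypothetical list of schemes the caller cares about.
KNOWN_SCHEMES = {"http", "https", "ftp", "mailto", "news"}


def find_candidates(text):
    """Return every substring that fits the generic scheme:rest shape."""
    return NAIVE_URI_RE.findall(text)


def find_known(text):
    """Keep only candidates whose scheme is in KNOWN_SCHEMES."""
    return [
        match
        for match in find_candidates(text)
        if match.split(":", 1)[0].lower() in KNOWN_SCHEMES
    ]


sample = (
    "Re:your mail, see Data::Dumper and http://example.com/ "
    "or mailto:someone@example.com at 10:30"
)

if __name__ == "__main__":
    print(find_candidates(sample))
    print(find_known(sample))
```

On that sample the naive pattern also picks up "Re:your" and 
"Data::Dumper", while the filtered version returns only the http 
and mailto entries. The price of the filter is that any scheme 
missing from the list is silently ignored.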

--
Craig S. Cottingham
[EMAIL PROTECTED]
PGP key available from: 
<http://pgp.ai.mit.edu:11371/pks/lookup?op=get&search=0xA2FFBE41>
ID=0xA2FFBE41, fingerprint=6AA8 2E28 2404 8A95 B8FC 7EFC 136F 
0CEF A2FF BE41
