On Mar 13, 2007, at 11:35 AM, Mario Minati wrote:


With Regexp::Common I found some addresses that still validate but are not
valid; at least I've never seen addresses like them:

https://minati.de./
(the dot after 'de' shouldn't be valid)

Yes it is. All hostnames actually end with a '.' (the DNS root), but nobody requires you to enter it, since they all end with one. There are instances where you DO want to include it, though. For example, if your DNS search order includes 'foo.com' and you type http://www/ into your web browser, it takes you to http://www.foo.com/. But what happens if your DNS search order includes foo.com and you have a host named minati.de.foo.com? Will you go to minati.de or to minati.de.foo.com? To
make sure you get just minati.de and not minati.de.foo.com, you can use
http://minati.de./
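
To make the search-order point concrete, here is a rough Python sketch. It is a simplified model and not how any particular resolver is implemented: the helper name candidate_names and the search list are made up for illustration, and the order in which a real resolver tries the names depends on its configuration (for example the ndots option in resolv.conf).

```python
from urllib.parse import urlsplit


def candidate_names(host, search_domains):
    """Hypothetical helper: names a resolver *might* try for `host`.

    Simplified model only. The list order is not meaningful, since real
    resolvers decide the order from their own configuration.
    """
    if host.endswith("."):
        # A trailing dot marks the name as rooted, so no search
        # domain is ever appended to it.
        return [host]
    # Without the trailing dot, each search domain is a possible suffix,
    # alongside the name taken as-is.
    return [f"{host}.{domain}." for domain in search_domains] + [host + "."]


search = ["foo.com"]  # assumed search list, as in the example above

# urlsplit keeps the trailing dot as part of the hostname.
rooted = urlsplit("http://minati.de./").hostname
unrooted = urlsplit("http://minati.de/").hostname

print(candidate_names(rooted, search))
print(candidate_names(unrooted, search))
```

With the trailing dot there is only one candidate, which is the whole reason to type it.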

On the other hand this url
https://minati.de/index.html#lkj
is invalid, but that might be some trouble with the '#' and the encoding. (I'm fighting with UTF-8 at the moment; do you have experience with that, Carl?)


From 'RFC Uniform Resource Identifiers (URI): Generic Syntax'

2.4.3. Excluded US-ASCII Characters

Although they are disallowed within the URI syntax, we include here a
   description of those US-ASCII characters that have been excluded and
   the reasons for their exclusion.

... snip ...

The character "#" is excluded because it is used to delimit a URI from a
   fragment identifier in URI references (Section 4).

Your URI isn't valid unless you encode the '#' (as %23).
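
If it helps, here is a small Python sketch of the two ways to deal with the '#' before validating. It only uses the standard urllib.parse module; the helper names split_fragment and encode_literal are my own, and whether your validator should see the fragment at all is an assumption you would need to check against what the form is meant to accept.

```python
from urllib.parse import quote, urldefrag


def split_fragment(url_reference):
    """Separate the URI proper from the fragment identifier after '#'."""
    uri, fragment = urldefrag(url_reference)
    return uri, fragment


def encode_literal(text):
    """Percent-encode text so that '#' becomes %23 instead of a delimiter."""
    return quote(text, safe="")


reference = "https://minati.de/index.html#lkj"

# Option 1: validate only the part before the '#'.
uri, fragment = split_fragment(reference)
print(uri, fragment)

# Option 2: if the '#' is meant literally, encode it before building the URI.
print("https://minati.de/index.html" + encode_literal("#lkj"))
```

Option 1 treats the input as a URI reference and checks just the URI part; option 2 is for the case where the '#' really belongs to the path or query.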

--
Jason Kohles
[EMAIL PROTECTED]
http://www.jasonkohles.com/
"A witty saying proves nothing."  -- Voltaire


