On 3 jan 2010, at 21.34, A. Craig West wrote:
> 2010/1/3 Claes Jakobsson <[email protected]>:
>> Altho the specification forbids any # in the fragment part I think it's
>> better to look for the leftmost # and consider everything after that the
>> fragment. That way "http://foo/#bar#baz" won't cause any problems.
>
> There are a few issues to keep in mind when using the leftmost #.
> There are some locations within a URL where I believe a # is a legal
> character. It may occur within a password, for example. I'm not sure
> if there is a requirement for the # character to be escaped if it
> occurs in the path.
# is one of the reserved characters (a gen-delim), so a literal # in a URL
always has to be percent-encoded as %23. An unencoded # is therefore not valid
in the password or in the path. / and ? are valid characters in the fragment
part, though. As the RFC (3986) says:

   reserved    = gen-delims / sub-delims
   gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
   sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
               / "*" / "+" / "," / ";" / "="

   authority   = [ userinfo "@" ] host [ ":" port ]
   userinfo    = *( unreserved / pct-encoded / sub-delims / ":" )

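
To make the leftmost-# rule concrete, here is a rough sketch in Python. It is only an illustration, not libcurl code; the names split_fragment and encode_userinfo_part are made up for this mail, and the only library call used is urllib.parse.quote from the standard library.

```python
from urllib.parse import quote


def split_fragment(url):
    """Split a URL at the leftmost '#'.

    Returns (url_without_fragment, fragment). The fragment is None when
    the URL contains no '#' at all, and '' when it ends in a bare '#'.
    """
    base, sep, fragment = url.partition("#")
    # partition() sets sep to "" when no '#' was found.
    return base, (fragment if sep else None)


def encode_userinfo_part(text):
    """Percent-encode a user name or password for use in userinfo.

    safe="" makes quote() encode every reserved character, so a
    literal '#' becomes '%23' and can never be mistaken for the
    fragment delimiter. This encodes more than the grammar strictly
    requires (sub-delims would be allowed raw), which is harmless.
    """
    return quote(text, safe="")
```

With this, split_fragment("http://foo/#bar#baz") gives ("http://foo/", "bar#baz"), and a password such as "pa#ss" is written as "pa%23ss" in the URL, so the first raw # seen is always the fragment delimiter.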
/claes