Dear all,
While working on the encodings, I found the following issue with
NaviServer's URL decoding. RFC 3986 (as well as earlier RFCs) defines a
path as a sequence of segments, separated by slashes "/":
    path-abempty  = *( "/" segment )
    path-absolute = "/" [ segment-nz *( "/" segment ) ]
    path-noscheme = segment-nz-nc *( "/" segment )
    path-rootless = segment-nz *( "/" segment )
In request.c, NaviServer decodes the whole URL with a single call to
Ns_UrlPathDecode(), which is effectively the decode operation for a
single segment (!). This means that the following two request paths are
treated identically:
    /foo/bar%2fbaz.tcl
    /foo/bar/baz.tcl
whereas they should yield two different [ns_conn urlv] values,
respectively:
    {foo bar/baz.tcl}
    {foo bar baz.tcl}
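To make the difference between the two decoding orders concrete, here is
a minimal sketch in Python. It is not NaviServer code: the function names
are made up for this illustration, and dropping empty segments is an
assumption about how a urlv-style list would be built.

```python
from typing import List
from urllib.parse import unquote


def decode_then_split(raw_path: str) -> List[str]:
    """Decode the whole path first, then split on "/".

    This mirrors the problematic order: an encoded slash (%2f) becomes a
    real "/" before splitting and is mistaken for a segment separator.
    """
    decoded = unquote(raw_path)
    # Assumption: empty segments (leading or doubled slashes) are dropped.
    return [seg for seg in decoded.split("/") if seg]


def split_then_decode(raw_path: str) -> List[str]:
    """Split on literal "/" first, then decode each segment separately.

    Only unencoded slashes act as separators, so %2f stays inside its
    segment and is decoded to "/" afterwards.
    """
    return [unquote(seg) for seg in raw_path.split("/") if seg]


if __name__ == "__main__":
    print(decode_then_split("/foo/bar%2fbaz.tcl"))  # ['foo', 'bar', 'baz.tcl']
    print(split_then_decode("/foo/bar%2fbaz.tcl"))  # ['foo', 'bar/baz.tcl']
    print(split_then_decode("/foo/bar/baz.tcl"))    # ['foo', 'bar', 'baz.tcl']
```

With the first order the encoded and the unencoded slash become
indistinguishable; with the second order the segment list preserves the
hierarchy as sent by the client.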
See also [1], which states explicitly that the URIs
    http://www.w3.org/albert/bertram/marie-claude
and
    http://www.w3.org/albert/bertram%2Fmarie-claude
are NOT identical, since in the second case the encoded slash does not
have hierarchical significance.
Currently, a NaviServer user has no means to tell these two cases apart,
since the server treats them as identical. Interestingly, Apache by
default rejects requests with paths containing %2f (see the discussion
in [2]). I am currently considering keeping [ns_conn url] as it is, but
returning the correct hierarchical structure in [ns_conn urlv].

Comments?

-g

[1] https://www.w3.org/Addressing/URL/4_URI_Recommentations.html
[2] http://stackoverflow.com/questions/3235219/urlencoded-forward-slash-is-breaking-url
_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel