Dear all,

While working on the encodings, I found the following issue with NaviServer URL decoding. RFC 3986 (like the earlier RFCs) defines a path as a sequence of segments separated by slashes ("/"):

   path-abempty  = *( "/" segment )
   path-absolute = "/" [ segment-nz *( "/" segment ) ]
   path-noscheme = segment-nz-nc *( "/" segment )
   path-rootless = segment-nz *( "/" segment )

In request.c, NaviServer decodes the whole URL with a single Ns_UrlPathDecode() call, which is effectively the decode operation for a single segment (!). This means that the following two entries are treated identically:

  /foo/bar%2fbaz.tcl
  /foo/bar/baz.tcl

whereas they should yield the following two different [ns_conn urlv] values, respectively:

  {foo bar/baz.tcl}
  {foo bar baz.tcl}
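
To make the difference concrete, here is a minimal sketch in Python (not NaviServer code). The function names are hypothetical, Python's urllib.parse.unquote stands in for the segment decoder, and dropping empty segments is an assumption made only for this illustration. It contrasts decoding the whole path before splitting with splitting on literal slashes before decoding each segment:

```python
from urllib.parse import unquote


def urlv_decode_then_split(raw_path):
    """Decode the whole path first, then split on "/".

    An encoded slash (%2f) becomes indistinguishable from a real
    separator, which mirrors the behaviour described above.
    """
    return [seg for seg in unquote(raw_path).split("/") if seg]


def urlv_split_then_decode(raw_path):
    """Split on literal "/" first, then decode each segment on its own.

    An encoded slash stays inside its segment, as the RFC 3986 path
    grammar suggests. Empty segments are dropped (an assumption).
    """
    return [unquote(seg) for seg in raw_path.split("/") if seg]


if __name__ == "__main__":
    for path in ("/foo/bar%2fbaz.tcl", "/foo/bar/baz.tcl"):
        print(path)
        print("  decode, then split:", urlv_decode_then_split(path))
        print("  split, then decode:", urlv_split_then_decode(path))
```

With the first approach both paths give the same three-element list; with the second, the path containing %2f gives two elements, the last one being "bar/baz.tcl".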

See also [1], which states explicitly that the URIs

    http://www.w3.org/albert/bertram/marie-claude
  and
    http://www.w3.org/albert/bertram%2Fmarie-claude

are NOT identical, as in the second case the encoded slash does not have hierarchical significance.

It is not good that a NaviServer user currently has no means of distinguishing these two cases, since the server treats them as identical. Interestingly, Apache by default rejects requests whose paths contain %2f (see the discussion in [2]).

I am currently considering keeping [ns_conn url] as it is, but returning the correct hierarchical structure in [ns_conn urlv]. Comments?

-g

[1] https://www.w3.org/Addressing/URL/4_URI_Recommentations.html
[2] http://stackoverflow.com/questions/3235219/urlencoded-forward-slash-is-breaking-url