From: "John M. Dlugosz" <[email protected]>
On 3/15/2011 4:56 AM, Octavian Rasnita orasnita-at-gmail.com |Catalyst/Allow to home| wrote:

uri_for() escapes only the chars which are not in the following list (from URI.pm):

$reserved   = q(;/?:@&=+$,[]);
$mark       = q(-_.!~*'());                                    #'; emacs
$unreserved = "A-Za-z0-9\Q$mark\E";

The char "&" is a valid char in a URI, so it does not need to be escaped. In other words, the following URL is OK:

http://localhost/dir1/dir2/ham%20&%20eggs.jpg
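
To make the rule concrete, here is a minimal sketch in Python (not Perl) that mimics the character lists quoted above with the standard library's urllib.parse.quote. The function name escape_like_uri_pm is hypothetical, and the code only mirrors the quoted lists; it is not how uri_for() itself is implemented.

```python
from urllib.parse import quote

# Character lists copied from the URI.pm excerpt quoted above.
RESERVED = ";/?:@&=+$,[]"
MARK = "-_.!~*'()"


def escape_like_uri_pm(text: str) -> str:
    """Percent-encode every char that is neither reserved nor unreserved.

    Assumption: this only mirrors the quoted lists, not uri_for() itself.
    """
    # quote() always keeps ASCII letters and digits; 'safe' adds the rest.
    return quote(text, safe=RESERVED + MARK)


print(escape_like_uri_pm("ham & eggs.jpg"))  # ham%20&%20eggs.jpg
```

The space is escaped because it is in neither list, while "&" passes through untouched because it is in the reserved list.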

uri_for() generates the URI as it must be sent to the server, not as it should be written in an HTML page. To be written correctly in HTML, the "&" char must be HTML-encoded, so the TT html filter must be used:

<a href="[% c.uri_for('/path', 'eggs & ham.jpg', {a=1, b=2}).path_query | html%]">label</a>

It will give:

<a href="/path/eggs%20&amp;%20ham.jpg?a=1&amp;b=2">label</a>
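
The HTML-encoding step is independent of URI escaping, which a small Python sketch can show (standard library only; the helper name href_attribute is made up for illustration, and html.escape stands in for the TT html filter):

```python
from html import escape


def href_attribute(url: str) -> str:
    """HTML-encode an already URI-escaped URL for use inside an attribute."""
    # Only &, <, > and quotes are touched; percent-escapes stay as they are.
    return escape(url, quote=True)


url = "/path/eggs%20&%20ham.jpg?a=1&b=2"
print(f'<a href="{href_attribute(url)}">label</a>')
```

The browser decodes &amp; back to & before requesting the URL, so the server still sees the URI exactly as uri_for() produced it.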


In contrast, the 'uri' filter in TT works by "converting any characters outside of the permitted URI character set (as defined by RFC 2396)", and that includes "&", "@", "/", ";", ":", "=", "+", "?" and "$".
The 'url' filter in TT is less aggressive, and does not include those.
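
The difference between the two filters, as described here, can be approximated in Python. This is only a sketch under the assumption that 'uri' keeps just the unreserved chars and 'url' also keeps the reserved ones; the function names are hypothetical and this is not TT's actual code.

```python
from urllib.parse import quote

RESERVED = ";/?:@&=+$,"   # reserved chars listed above
MARK = "-_.!~*'()"        # unreserved punctuation


def uri_filter_like(text: str) -> str:
    # Aggressive: reserved chars such as & and / are percent-encoded too.
    return quote(text, safe=MARK)


def url_filter_like(text: str) -> str:
    # Less aggressive: reserved chars are left as they are.
    return quote(text, safe=RESERVED + MARK)


name = "eggs & ham.jpg"
print(uri_filter_like(name))  # eggs%20%26%20ham.jpg
print(url_filter_like(name))  # eggs%20&%20ham.jpg
```

The first form contains no bare "&", which is why its output needs no further HTML-encoding.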


Those chars must be escaped in query-string values, but they are permitted in the path of a URL. The "?", "&", "=", "+" and ";" signs are used to separate the path from the query string, to delimit the parts of the query string, and to represent a space char. Apart from "?", they can also be used unescaped in the names of files in the path. For example, the following URL is valid:

http://localhost/static/a%20&%20@%20;%20$%20+%20=.txt

If you want, you can escape these chars everywhere, not only in the query strings, but why would you want to do this?

The '&' is a "Reserved Character" according to §2.2 of RFC 2396. That is what the code sample you quoted notes: the set of reserved characters. They may have specific meanings as delimiters within the overall URI, so should be escaped. Just skimming, I see that it's reserved within the query component.


Yes, but uri_for() escapes them in the query components (where they need to be escaped).

For example:

[% file = 'a+b = c & $î @â'; a = 'a+b = c & $î @â'; b= 'a+b = c & $î @â' %]
<a href="[% c.uri_for('/path', file, {a=a, b=b}).path_query %]">label</a>

will display:

<a href="/path/a+b%20=%20c%20&%20$%C3%AE%20@%C3%A2?a=a%2Bb+%3D+c+%26+%24%C3%AE+%40%C3%A2&b=a%2Bb+%3D+c+%26+%24%C3%AE+%40%C3%A2">label</a>

Note that I didn't HTML-encode the URL, so that the result is easier to read.
As you can see, the reserved chars are escaped by uri_for() only where they need to be escaped.
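
The same path-versus-query behaviour can be reproduced with a short Python sketch. It is an illustration only: build_path_query is a hypothetical helper, urllib.parse stands in for what uri_for() does, and a file name containing "/" or "?" is not handled.

```python
from urllib.parse import quote, urlencode

# Chars left unescaped in the path: the reserved and mark lists from URI.pm.
PATH_SAFE = ";/?:@&=+$,[]" + "-_.!~*'()"


def build_path_query(base: str, filename: str, params: dict) -> str:
    """Escape a file name leniently and query values strictly."""
    path = base + "/" + quote(filename, safe=PATH_SAFE)
    # urlencode() escapes &, =, +, $ and @ in values and turns spaces into +.
    return path + "?" + urlencode(params)


value = "a+b = c & $î @â"
print(build_path_query("/path", value, {"a": value, "b": value}))
```

In the path the reserved chars survive as they are; in the query values they are percent-encoded, so the only bare "&" left in the query string is the one separating the two parameters.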

And of course, if you need to print this URL in an HTML document, you can add the TT html filter and the "&" chars will be displayed as &amp;.


Anyway, using the TT 'uri' filter on the dynamic path component means I don't have to use the html filter also!


Why would you want to escape every path component separately with the TT uri filter, escaping the reserved chars even where they can be used as they are, instead of applying the html filter once?

If you want, you can uri-escape even the [a-zA-Z0-9] chars, but why would you want to escape chars where they don't need to be escaped? :-)

Octavian


_______________________________________________
List: [email protected]
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/[email protected]/
Dev site: http://dev.catalyst.perl.org/
