Re: [Quixote-users] urllib.quote() and cgi.escape()

mario ruggier Thu, 26 Jan 2006 04:16:29 -0800


On Jan 24, 2006, at 8:21 PM, Titus Brown wrote:

> So, should htmlescape deal with this differently?
>
> Right now it does this:
>
> print str(htmlescape("'"))
> '
> print str(htmlescape('"'))
> &quot;

If you were trying to use these characters in a URI value (for their normal meaning in that context!) then my understanding is that you have to use their HTML char entities: &amp; &lt; &gt; &quot;. This way, the HTML document can be valid.

If, however, you are trying to use them as a string literal value in a URI context, then you should use the %xx mechanism.

(I was, however, unable to easily find a clear and convenient statement of the above in RFC 2396.)
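
To make the two mechanisms concrete, here is a minimal Python sketch. It assumes Python 3, where html.escape and urllib.parse.quote stand in for the cgi.escape and urllib.quote of the subject line; the sample string is made up purely for illustration.

```python
from html import escape
from urllib.parse import quote

# Hypothetical sample text containing the characters under discussion.
sample = 'a&b<c>"d"'

# HTML escaping: the special characters become char entities,
# so the surrounding markup stays valid.
as_html = escape(sample)

# URL quoting: the same characters become %xx escapes,
# so they are treated as literal data inside a URI.
as_uri = quote(sample)

print(as_html)
print(as_uri)
```

The same input gives two different results, and which one is wanted depends on whether the character is meant as markup/URI syntax or as literal data.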

Now, in your original question, you were actually trying to use such characters in the literal value attribute of an input element. As this value can become part of the URL for the page (e.g. in the querystring), it should follow that it should be escaped with urllib.quote(), i.e. the %xx mechanism.


So, similar to your original example:

'<input name="one" value="%s" />'%("""contains'different"quotes&stuff""")


and assume some other input field:
'<input name="two" value="normal" />'

If we submit the form (or specify the fields in the querystring for the page) we should end up with a querystring such as:

?one=contains%27different%22quotes%26stuff&amp;two=normal

Note that the &amp; (as delimiter!) is HTML escaped as it should be, but the & as literal value (%26) is URL escaped (as it should be?).
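
A rough sketch of how that querystring could be put together, again assuming Python 3 (urllib.parse.quote and html.escape in place of urllib.quote and cgi.escape). The field names and values are the ones from the example above; the two-step order (quote the values first, HTML-escape the assembled string last) is my reading of it, not an established recipe.

```python
from html import escape
from urllib.parse import quote

# Field names and values taken from the example above.
fields = [("one", "contains'different\"quotes&stuff"), ("two", "normal")]

# Step 1: %xx-quote each literal value, then join the pairs with a bare "&".
query = "?" + "&".join(f"{name}={quote(value)}" for name, value in fields)

# Step 2: HTML-escape the whole string before placing it in markup.
# Only the delimiter "&" is left to escape at this point.
in_html = escape(query)

print(query)
print(in_html)
```

After step 1 the literal & is already %26, so the HTML escaping in step 2 can only touch the delimiter.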

But, re your actual question above, I was under the impression that the "'" character should also be escaped with &apos; ... but I see that this char entity is not even listed in <http://www.w3.org/TR/REC-html40/sgml/entities.html>. So, maybe not.
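
For comparison only, a small sketch of how Python 3's html.escape treats the two quote characters. This is not Quixote's htmlescape, just a stand-in I am assuming is a fair point of reference. It writes the single quote as a numeric character reference rather than a named entity.

```python
from html import escape

# With quote=True (the default) both quote characters are escaped.
single = escape("'")   # numeric character reference
double = escape('"')   # named entity

# With quote=False neither quote character is touched.
untouched = escape("'\"", quote=False)

print(single)
print(double)
print(untouched)
```

A numeric reference sidesteps the question of whether a named entity for the single quote exists in the HTML 4 list.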


mario

