This is somewhat off-topic for this group, so I beg your forgiveness in advance: I'm in the process of setting up a comments system on my personal blog and have been trying to determine how to properly filter user-submitted content to minimize the possibility of cross-site scripting (XSS) attacks. I've read a number of articles on the topic [1], but still have one or two questions. (One of my questions is actually about Firefox, to bring this back on topic a bit.)

(You may also ask: Frank, doesn't your blogging software take care of this? And I answer: no, because I was determined to be different, I run Blosxom instead of WordPress, MT, etc., and I'm writing my own code to handle comments because I don't like the options available for Blosxom :-)

First, I won't be allowing HTML tags in submitted comments. My plan was simply to run each submitted comment through Perl's CGI::escapeHTML function (Blosxom is written in Perl), which converts '&', '<', '>', and the double quote (plus, for ISO-8859-1 and similar charsets, 0x8b and 0x9b) to the corresponding HTML character entities, before the comment is saved and displayed. Is this sufficient, or should I be escaping other characters as well?
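For concreteness, here's a minimal sketch of what I have in mind (the sub name and the $comment variable are just for illustration, not actual Blosxom plugin code):

  use CGI ();

  # Minimal sketch: escape a submitted comment before it's saved and
  # displayed. CGI::escapeHTML converts '&', '<', '>', and '"' to
  # entities; with an ISO-8859-1 charset it also handles "'", 0x8b,
  # and 0x9b.
  sub sanitize_comment {
      my ($comment) = @_;
      return CGI::escapeHTML($comment);
  }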

Second, and more important (because I'm still unclear on this): I'll be accepting URLs submitted with comments (as part of an email/URL text field), and I obviously need to do something with them to avoid XSS problems. The question is, what? I've gotten the impression that URL-encoding characters like '<' that might appear in submitted URLs is not a total solution, and that retaining such characters in the URL, even in encoded form, could be a problem.

What's the recommended approach? One thought I had was to parse the URL, go through any query parameters one by one, decode them, strip out any resulting '<' and related characters entirely, and then put the URL back together again (a rough sketch of this follows below). Is this overkill? Still not enough?
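Here's a rough sketch of the general idea, somewhat simplified: instead of decoding each query parameter individually, it re-serializes the URL via the URI module's canonical form. The sub name is made up, and it assumes the standard URI and CGI modules from CPAN rather than anything Blosxom-specific:

  use CGI ();
  use URI;

  # Sketch only: accept http/https URLs, re-serialize them through
  # the URI module, strip any angle brackets or quotes that survive,
  # and HTML-escape the result before it goes into an href attribute.
  sub sanitize_url {
      my ($raw) = @_;
      my $uri = URI->new($raw);
      return undef
          unless defined $uri->scheme and $uri->scheme =~ /^https?$/;
      my $url = $uri->canonical->as_string;
      $url =~ s/[<>"']//g;    # belt and braces
      return CGI::escapeHTML($url);
  }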

Now for the Firefox question: in doing some testing I noticed that if you enter a '<', etc., in the location bar, Firefox will URL-encode it before sending it to the server. Also, if such an encoded URL appears on the page as a link, Firefox will send it to the host still in encoded form. Are there other scenarios where Firefox will send such a URL to the host *not* in encoded form (i.e., with a bare '<')? Or was that possibility closed off as part of anti-XSS changes?

Thanks in advance for any advice you all can give!

Frank


[1] Documents I've looked at include

Amit Klein's "Cross Site Scripting Explained"
  http://crypto.stanford.edu/cs155/CSS.pdf

CERT/CC's "Understanding Malicious Content Mitigation for Web Developers"
  http://www.cert.org/tech_tips/malicious_code_mitigation.html

CERT/CC's "How To Remove Meta-characters From User-Supplied Data In CGI Scripts"
  http://www.cert.org/tech_tips/cgi_metacharacters.html

cgisecurity.com's "The Cross Site Scripting FAQ"
  http://www.cgisecurity.com/articles/xss-faq.shtml

Apache's "Cross Site Script Info: Encoding Examples"
  http://httpd.apache.org/info/css-security/encoding_examples.html

plus a few others.

--
Frank Hecker
[EMAIL PROTECTED]