This is somewhat off-topic for this group, so I beg your forgiveness in advance: I'm in the process of setting up a comments system on my personal blog and have been trying to determine how to properly filter user-submitted content to minimize the possibility of cross-site scripting (XSS) attacks. I've read a number of articles on the topic [1], but still have one or two questions. (One of my questions is actually about Firefox, to bring this back on topic a bit.)

(You may also ask: Frank, doesn't your blogging software take care of this? And I answer: no, because I was determined to be different, I run Blosxom instead of WordPress, MT, etc., and I'm writing my own code to handle comments because I don't like the options available for Blosxom :-)

First, I won't be allowing HTML tags in submitted comments. My plan was simply to run each submitted comment through Perl's CGI::escapeHTML function (Blosxom is written in Perl), which converts '&', '<', '>', and the double quote (plus, for ISO-8859-1 and similar charsets, 0x8b and 0x9b) to the corresponding HTML character entities, before the comment is saved and displayed. Is this sufficient, or should I be escaping other characters as well?
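For concreteness, here's a minimal sketch of what I have in mind (the sub name and the $comment variable are just for illustration, not actual Blosxom plugin code):

  use CGI ();

  # Minimal sketch: escape a submitted comment before it's saved and
  # displayed. CGI::escapeHTML converts '&', '<', '>', and '"' to
  # entities; with an ISO-8859-1 charset it also handles "'", 0x8b,
  # and 0x9b.
  sub sanitize_comment {
      my ($comment) = @_;
      return CGI::escapeHTML($comment);
  }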

Second, and more important (because I'm still unclear on this): I'll be accepting URLs submitted with comments (as part of an email/URL text field), and I obviously need to do something with them to avoid XSS problems. The question is, what? I've gotten the impression that URL-encoding characters like '<' that might appear in submitted URLs is not a total solution, and that retaining such characters in the URL, even in encoded form, could be a problem.

What's the recommended approach? One thought I had was to parse the URL, go through any query parameters one by one, decode them, strip out any resulting '<' and related characters entirely, and then put the URL back together again (a rough sketch of this follows below). Is this overkill? Still not enough?
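Here's a rough sketch of the general idea, somewhat simplified: instead of decoding each query parameter individually, it re-serializes the URL via the URI module's canonical form. The sub name is made up, and it assumes the standard URI and CGI modules from CPAN rather than anything Blosxom-specific:

  use CGI ();
  use URI;

  # Sketch only: accept http/https URLs, re-serialize them through
  # the URI module, strip any angle brackets or quotes that survive,
  # and HTML-escape the result before it goes into an href attribute.
  sub sanitize_url {
      my ($raw) = @_;
      my $uri = URI->new($raw);
      return undef
          unless defined $uri->scheme and $uri->scheme =~ /^https?$/;
      my $url = $uri->canonical->as_string;
      $url =~ s/[<>"']//g;    # belt and braces
      return CGI::escapeHTML($url);
  }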

Now for the Firefox question: in doing some testing I noticed that if you enter a '<', etc., in the location bar, Firefox will URL-encode it before sending it to the server. Also, if such an encoded URL appears on the page as a link, Firefox will send it to the host still in encoded form. Are there other scenarios where Firefox will send such a URL to the host *not* in encoded form (i.e., with a bare '<')? Or was that possibility closed off as part of anti-XSS changes?

Thanks in advance for any advice you all can give!

Frank


[1] Documents I've looked at include

Amit Klein's "Cross Site Scripting Explained"
  http://crypto.stanford.edu/cs155/CSS.pdf

CERT/CC's "Understanding Malicious Content Mitigation for Web Developers"
  http://www.cert.org/tech_tips/malicious_code_mitigation.html

CERT/CC's "How To Remove Meta-characters From User-Supplied Data In CGI Scripts"
  http://www.cert.org/tech_tips/cgi_metacharacters.html

cgisecurity.com's "The Cross Site Scripting FAQ"
  http://www.cgisecurity.com/articles/xss-faq.shtml

Apache's "Cross Site Script Info: Encoding Examples"
  http://httpd.apache.org/info/css-security/encoding_examples.html

plus a few others.

--
Frank Hecker
[EMAIL PROTECTED]