Re: Cross site scripting issue

Dale Newfield Thu, 15 Mar 2007 21:56:41 -0800

There are two discussions here that are getting convoluted: WHEN to "clean" and HOW to clean. I still have yet to find a good comprehensive way to do the latter (more below), but right here I'm responding to the former.


Christopher Schultz wrote:

If you /are/ capturing text you will be using that /can/ contain HTML
markup, then cleaning it as it comes in is still a mistake. Let's say
you have a bug in your cleansing code. In that case, bad stuff gets into
your database where it's hard to root out and fix.

If that data is hard to find then you haven't cleanly defined your DB schema.

WHEN to do the cleaning is not a question of security and maintainability, but a question of amortizing clock cycles to try to get responses out to browsers as quickly as possible. There is no reason to clean the same piece of text with the same algorithm more than once, so why not do it on the input side? If you find a bug in your cleansing code, then once you change it, re-run it ONCE on all the potentially dangerous text blocks. Those should map directly to columns in your DB. If you can't look at your DB schema and tell me which columns are displayed without escaping their contents, then something is wrong.
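A rough sketch of the "clean on input, re-run once after a fix" idea, under some loud assumptions: the table and column names are hypothetical, keeping the raw text next to the cleaned text is my own addition so that a re-run starts from the original input, and html.escape is only a stand-in for a real tag filter.

```python
import html
import sqlite3

CLEANER_VERSION = 2  # hypothetical: bump whenever the cleaning code changes


def clean(text: str) -> str:
    """Stand-in cleaner that escapes everything; a real one would filter tags."""
    return html.escape(text, quote=True)


def store_comment(db: sqlite3.Connection, raw: str) -> None:
    """Clean once on the way in and record which cleaner version did it."""
    db.execute(
        "INSERT INTO comment (body_raw, body_clean, cleaner_version)"
        " VALUES (?, ?, ?)",
        (raw, clean(raw), CLEANER_VERSION),
    )


def reclean_stale(db: sqlite3.Connection) -> int:
    """After a cleaner fix, re-run it once over rows cleaned by older code."""
    stale = db.execute(
        "SELECT id, body_raw FROM comment WHERE cleaner_version < ?",
        (CLEANER_VERSION,),
    ).fetchall()
    for row_id, raw in stale:
        db.execute(
            "UPDATE comment SET body_clean = ?, cleaner_version = ? WHERE id = ?",
            (clean(raw), CLEANER_VERSION, row_id),
        )
    return len(stale)


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE comment (id INTEGER PRIMARY KEY, body_raw TEXT,"
    " body_clean TEXT, cleaner_version INTEGER)"
)
# A row left behind by an older, buggy cleaner that let markup through.
db.execute(
    "INSERT INTO comment (body_raw, body_clean, cleaner_version)"
    " VALUES (?, ?, ?)",
    ("<script>x</script>", "<script>x</script>", 1),
)
store_comment(db, "a < b")
recleaned = reclean_stale(db)  # touches only the stale row
```

The version column is what keeps the re-run a one-time cost: rows already cleaned by the current code are skipped, and the page-rendering path only ever reads body_clean.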

I agree with Leon: cleaning input is not usually a good idea. Cleaning
output is where the real money is -- from a security and maintainability
standpoint.

I'd be happy to change my mind if you can suggest any other reason to re-do that work more frequently than changes to the filtering module / data that backs the filtering module?

The acknowledgment that said algorithm also needs backing data leads us right back to the question of HOW.

I believe all filtering efforts will eventually come down to "What tags/attributes are OK?" (among other critical questions, like "What values for attributes are OK?"). (If you're stuck in the "what tags/attributes are NOT OK" world then we have need of a different discussion: white lists vs black lists.)

So, does anyone have a good list of "safe" tags/attributes that should be allowed through (assuming the attribute values also pass muster)?

For example, here are my (woefully incomplete) lists (plus a crossover table (allowed_xhtml_tag_attribute_map) not shown linking allowable combinations of the two):

allowed_xhtml_tag: a b blockquote br cite del div em font h1 h2 h3 h4 h5 h6 i img ins li ol p pre span strong sub sup table td th tr u ul

allowed_xhtml_attribute: alt border cite class color href name src style title

For example, I already know I need to add caption and tbody to the first table, but I've been delaying more by-hand tweaks in hopes of finding a more systematic way to fill the tables. I've yet to find it. Any suggestions?
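To make the white-list idea concrete, here is a minimal sketch of a filter driven by a tag-to-attribute map. The map below is an abbreviated, hypothetical stand-in for the crossover table (the real one is not shown), and the set of acceptable URL schemes is my own assumption about what "pass muster" might mean for href/src values. It is nowhere near a complete XSS defense (it ignores style values entirely, for instance).

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical, abbreviated tag -> allowed-attribute map.
ALLOWED = {
    "a": {"href", "title", "name"},
    "b": set(),
    "em": set(),
    "p": {"class"},
    "img": {"src", "alt", "title"},
    "br": set(),
}
URL_ATTRS = {"href", "src", "cite"}
SAFE_SCHEMES = {"", "http", "https", "mailto"}  # assumption; "" means relative
VOID = {"br", "img"}  # elements written as <tag /> with no closing tag


def url_is_safe(value: str) -> bool:
    """Accept only URLs whose scheme is on the (assumed) safe list."""
    try:
        scheme = urlsplit(value.strip()).scheme.lower()
    except ValueError:
        return False
    return scheme in SAFE_SCHEMES


class WhitelistFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself; its text is still emitted, escaped
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            if name in URL_ATTRS and not url_is_safe(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        closer = " />" if tag in VOID else ">"
        self.out.append(f"<{tag}{''.join(kept)}{closer}")

    def handle_endtag(self, tag):
        if tag in ALLOWED and tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(markup: str) -> str:
    f = WhitelistFilter()
    f.feed(markup)
    f.close()  # flush any buffered trailing text
    return "".join(f.out)
```

Anything not named in the map is dropped by default, which is the point of a white list: a tag or attribute nobody thought about fails closed instead of slipping through.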



-Dale Newfield
 [EMAIL PROTECTED]

P.S.: the "tagsoup parse" suggestion is also good because it guarantees that anything you do reflect back to users is valid XHTML (and so won't screw up other parts of your page with illegally nested/unbalanced tags).
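To illustrate just the balancing part of that idea, here is a small sketch. It is not TagSoup (a Java library) and not its API; it uses Python's standard html.parser, drops attributes for brevity, and only guarantees that every emitted opening tag gets a matching close. It does not check XHTML content-model rules.

```python
from html import escape
from html.parser import HTMLParser

VOID = {"br", "img"}  # elements that never take a closing tag


class Balancer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []  # stack of currently open elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            self.out.append(f"<{tag} />")
            return
        self.open_tags.append(tag)
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closer with no matching opener: drop it
        # Close everything opened after the matching opener, innermost first.
        while True:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def balance(markup: str) -> str:
    p = Balancer()
    p.feed(markup)
    p.close()  # flush any buffered trailing text
    while p.open_tags:  # close whatever the input left open
        p.out.append(f"</{p.open_tags.pop()}>")
    return "".join(p.out)
```

With this, a user comment that forgets a closing tag cannot leave the rest of the page bold or swallow the surrounding layout.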

