OK, I'm going to focus on the problem at hand, the XSS filter:

I am using a 'blacklist' because my users need to be able to enter as much (X)HTML
as I can possibly allow.

So, the tags I'm initially NOT allowing are:

<applet> <script> <embed> <object> <server> <frame> <iframe> <frameset> <html> <body>

I'm removing all JavaScript event attributes (e.g. onclick="alert('xss');")

Removing all javascript escaped quotes:    \'  and   \"

In any tag left that has a link in it (src|href|action), I'm making sure it is NOT relative and NOT to my server: <a> <img> <ilayer> <form>
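A minimal sketch of that link check, assuming Python and its standard urllib.parse module. The OWN_HOSTS set and the is_allowed_link name are hypothetical, made up for illustration. It accepts only absolute http(s) URLs whose host is not one of ours, so relative and protocol-relative URLs are rejected along with non-http schemes.

```python
from urllib.parse import urlsplit

# Hypothetical: the host names this site is served from.
OWN_HOSTS = {"example.com", "www.example.com"}

def is_allowed_link(url: str) -> bool:
    """Accept only absolute http(s) URLs that point away from our own hosts."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        return False  # malformed URL (e.g. a broken IPv6 literal)
    # Relative URLs have no scheme; javascript:, data:, etc. fail here too.
    if parts.scheme.lower() not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    return bool(host) and host not in OWN_HOSTS
```

The same function would be applied to every src, href and action value that survives the tag filter.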

I'm changing any 'target' attribute to target='_blank', although I still think there is a security flaw here: a popup window could try to run code on
the originating page.

I will be checking CSS urls.


Also, these dangerous strings:

javascript:

java   \n   script:

String.fromCharCode(x)  //mainly for js quotes or parentheses

charCodeAt

eval(
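The "java \n script:" entry shows why a plain substring match is not enough for the first two items. Here is a rough sketch in Python (standard library only) of normalizing an attribute value before looking for the scheme. The has_script_scheme name is hypothetical, and this covers only the whitespace and character-entity tricks, not every encoding a browser may accept.

```python
import html
import re

def has_script_scheme(value: str) -> bool:
    """Return True if an attribute value starts with a disguised javascript: scheme."""
    # Decode character references such as &#x61; or &#106; first.
    decoded = html.unescape(value)
    # Drop whitespace and control characters, which browsers may ignore.
    collapsed = re.sub(r"[\x00-\x20]+", "", decoded).lower()
    return collapsed.startswith("javascript:")
```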


Well, that's a start anyway.


-Joe

Dale Newfield wrote:
There are two discussions here that are getting conflated: WHEN to "clean" and HOW to clean. I have yet to find a good comprehensive way to do the latter (more below), but right here I'm responding to the former.

Christopher Schultz wrote:
If you /are/ capturing text you will be using that /can/ contain HTML
markup, then cleaning it as it comes in is still a mistake. Let's say
you have a bug in your cleansing code. In that case, bad stuff gets into
your database where it's hard to root out and fix.

If that data is hard to find, then you haven't cleanly defined your DB schema.

WHEN to do the cleaning is not a question of security and maintainability, but a question of amortizing clock cycles to try to get responses out to browsers as quickly as possible. There is no reason to clean the same piece of text with the same algorithm more than once, so why not do it on the input side? If you find a bug in your cleansing code, then once you change it, re-run it ONCE on all the potentially dangerous text blocks. Those should map directly to columns in your DB. If you can't look at your DB schema and tell me which columns are displayed without escaping their contents, then something is wrong.
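The "re-run it ONCE" step could look roughly like the sketch below, assuming Python with the standard sqlite3 module. The UNESCAPED_COLUMNS registry, the post/body names and the reclean_all function are all hypothetical; the point is only that the list of columns displayed without escaping is explicit, so a fixed cleaner can be re-applied to exactly those columns.

```python
import sqlite3

# Hypothetical registry of (table, column) pairs displayed without escaping.
# These names are trusted constants, never user input.
UNESCAPED_COLUMNS = [("post", "body")]

def reclean_all(conn, clean):
    """Apply `clean` once to every stored value in the registered columns.

    Returns the number of rows whose content changed.
    """
    changed = 0
    for table, column in UNESCAPED_COLUMNS:
        rows = conn.execute(f"SELECT rowid, {column} FROM {table}").fetchall()
        for rowid, text in rows:
            if text is None:
                continue
            cleaned = clean(text)
            if cleaned != text:
                conn.execute(
                    f"UPDATE {table} SET {column} = ? WHERE rowid = ?",
                    (cleaned, rowid),
                )
                changed += 1
    conn.commit()
    return changed
```

Usage would be a one-off call such as reclean_all(conn, new_filter) right after deploying a fixed filter.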

I agree with Leon: cleaning input is not usually a good idea. Cleaning
output is where the real money is -- from a security and maintainability
standpoint.

I'd be happy to change my mind if you can suggest any other reason to re-do that work more often than the filtering module, or the data that backs it, changes.

The acknowledgment that said algorithm also needs backing data leads us right back to the question of HOW.

I believe all filtering efforts will eventually come down to "What tags/attributes are OK?" (among other critical questions, like "What values for attributes are OK?"). If you're stuck in the "what tags/attributes are NOT OK" world, then we need a different discussion: white lists vs. black lists.

So, does anyone have a good list of "safe" tags/attributes that should be allowed through (assuming the attribute values also pass muster)?

For example, here are my (woefully incomplete) lists, plus a crossover table, allowed_xhtml_tag_attribute_map (not shown), that links allowable combinations of the two:

allowed_xhtml_tag: a b blockquote br cite del div em font h1 h2 h3 h4 h5 h6 i img ins li ol p pre span strong sub sup table td th tr u ul

allowed_xhtml_attribute: alt border cite class color href name src style title

For example, I already know I need to add caption and tbody to the first table, but I've been delaying more by-hand tweaks in hopes of finding a more systematic way to fill the tables. I've yet to find it. Any suggestions?
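To make the tables concrete, here is a minimal white-list sketch in Python using the standard html.parser module. ALLOWED_TAGS is the allowed_xhtml_tag list above; ALLOWED_ATTRS is a hypothetical excerpt standing in for the crossover table, which is not shown in this message. Attribute values are only re-escaped here, not validated, and the output is not guaranteed to be balanced, so this is not a complete filter.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = set(
    "a b blockquote br cite del div em font h1 h2 h3 h4 h5 h6 i img ins "
    "li ol p pre span strong sub sup table td th tr u ul".split()
)
# Hypothetical excerpt of the tag/attribute crossover table.
ALLOWED_ATTRS = {
    "a": {"href", "name", "title"},
    "img": {"src", "alt", "border", "title"},
    "blockquote": {"cite"},
    "font": {"color"},
}
VOID_TAGS = {"br", "img"}  # elements that never get a closing tag

class WhitelistFilter(HTMLParser):
    """Re-emit only allowed tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; any text inside is still escaped
        allowed = ALLOWED_ATTRS.get(tag, ())
        kept = [(k, v) for k, v in attrs if k in allowed and v is not None]
        rendered = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

def filter_html(text: str) -> str:
    parser = WhitelistFilter()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)
```

Anything not named in the two tables is dropped by default, so forgetting an entry (such as caption or tbody) loses markup rather than letting something dangerous through.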


-Dale Newfield
 [EMAIL PROTECTED]

P.S.: the "tagsoup parse" suggestion is also good because it guarantees that anything you do reflect back to users is valid XHTML (and so won't screw up other parts of your page with illegally nested/unbalanced tags).

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



