Ed Griebel wrote:
So it seems like you want to a) render untrusted HTML, and b) render
secure HTML. Sounds like the basic requirements are at odds? You could
do something like Slashdot and other BB systems do: restrict the
amount of valid markup to make your parsing job easier.
Ultimately, restricting allowed markup helps but doesn't make the hard
cases much easier :-) You're right that (a) and (b) conflict somewhat,
though. But think about something like Google Mail: it needs to be able to
display as much of a user's mail as possible whilst still remaining secure
against XSS attacks.
Actually, I'm not sure offhand whether Gmail *does* support showing
HTML-formatted email, but you see what I mean.
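For illustration only, the restricted-markup idea could be sketched roughly as follows: escape everything, then re-enable a small whitelist of attribute-free tags. The class name, method names and the tag list are all hypothetical, not taken from Slashdot or any existing library, and the sketch deliberately sidesteps the hard cases (attributes, links, embedded CSS) by not allowing them at all.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RestrictedMarkup {
    // Hypothetical whitelist: bare formatting tags only, no attributes.
    // The pattern runs against already-escaped text, hence &lt; and &gt;.
    private static final Pattern ALLOWED = Pattern.compile(
            "&lt;(/?)(b|i|em|strong|p|br)&gt;", Pattern.CASE_INSENSITIVE);

    /** Escape every HTML metacharacter so nothing is interpreted as markup. */
    public static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Escape everything, then turn only the whitelisted bare tags back into markup. */
    public static String render(String untrusted) {
        Matcher m = ALLOWED.matcher(escape(untrusted));
        return m.replaceAll("<$1$2>");
    }

    public static void main(String[] args) {
        // Prints: <b>hi</b> &lt;script&gt;alert(1)&lt;/script&gt;
        System.out.println(render("<b>hi</b> <script>alert(1)</script>"));
    }
}
```

A tag carrying any attribute (say, an onclick handler on a b element) does not match the whitelist pattern and so stays escaped, which is exactly why this approach is safe but far too restrictive for something like HTML mail.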
Another idea, one single regexp won't do it, but have you thought of
making multiple passes through the data as a check? You could translate
Unicode, remove line splits, perform XML entity substitution, etc.,
then if it "passes", store the original HTML page as entered.
I'm not sure I ever want to store a modified copy, but the multi-pass regex
approach is valid in any case. It's probably the best way to go if you're
not willing to use a complete HTML+CSS parser in your XSS filter.
I'm guessing that your requirement is to store and re-present the original
markup as entered :-)
Pretty much, sans XSS hacks, of course :-)
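A minimal sketch of that multi-pass check, assuming the passes are: decode character references, drop control characters (which covers line splits), lower-case, then test a handful of patterns. If the check passes, the caller stores the original markup untouched. Everything here (class name, method names, the pattern list) is hypothetical, and the pattern list is illustrative rather than anything close to complete; it errs on the side of rejecting too much.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultiPassXssCheck {
    // Decimal or hex numeric character reference, trailing semicolon optional.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));?");

    // Illustrative, incomplete blacklist applied to the normalized text.
    private static final Pattern[] SUSPICIOUS = {
        Pattern.compile("<\\s*script"),
        Pattern.compile("javascript:"),
        Pattern.compile("\\bon[a-z]+\\s*="),   // inline event handlers
        Pattern.compile("expression\\s*\\(")   // CSS expression(...)
    };

    /** Pass 1a: replace numeric character references with the characters they denote. */
    static String decodeNumericRefs(String s) {
        Matcher m = NUMERIC_REF.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int cp;
            try {
                cp = (m.group(1) != null)
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2));
            } catch (NumberFormatException e) {
                cp = 0xFFFD; // out-of-range reference: use the replacement character
            }
            if (!Character.isValidCodePoint(cp)) {
                cp = 0xFFFD;
            }
            String decoded = new String(Character.toChars(cp));
            m.appendReplacement(out, Matcher.quoteReplacement(decoded));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Pass 1b: a few named entities; &amp; goes last to avoid double decoding. */
    static String decodeNamedRefs(String s) {
        return s.replace("&lt;", "<").replace("&gt;", ">")
                .replace("&quot;", "\"").replace("&amp;", "&");
    }

    /** All passes: decode references, drop control characters, lower-case. */
    static String normalize(String html) {
        String s = decodeNamedRefs(decodeNumericRefs(html));
        s = s.replaceAll("\\p{Cntrl}", ""); // removes CR, LF, TAB, NUL, ...
        return s.toLowerCase(Locale.ROOT);
    }

    /** True if nothing suspicious was found; the caller then stores the ORIGINAL html. */
    public static boolean passes(String html) {
        String normalized = normalize(html);
        for (Pattern p : SUSPICIOUS) {
            if (p.matcher(normalized).find()) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, passes() accepts plain formatting markup but rejects an href containing jav&#x61;script:, a 'javascript' keyword split across two lines, and an onerror attribute. It is still a blacklist, so it shares the basic weakness discussed in this thread: anything not anticipated in the pattern list gets through.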
Also, have you tried doing some research into what the PHP world does
to prevent it? It might give a good point of reference for Java.
I spent a little time hunting around in the PHP world today, though I've
yet to find anything particularly useful. Most of the implementations I've
looked at so far do a fairly minimal job to defeat just the most common
sorts of attack.
L.
-ed
On 7/18/05, Laurie Harper <[EMAIL PROTECTED]> wrote:
Frank W. Zammetti wrote:
Yeah, wouldn't help you filter on output, but I pointed that out before :)
True enough :)
Note that it does allow you to specify your own regex, so in reality you
can filter for whatever you want. I did this specifically so when
someone spots something I didn't think of it's easy to make it catch
those too.
The trouble is, I doubt it would be possible to construct a single regex
that did a robust job -- including handling of character references (as in
my example), differing syntax rules in embedded CSS, browsers recombining
keywords like 'javascript' that are split over multiple lines, etc. etc...
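To make the point concrete, here is a small demonstration with a hypothetical single-pass pattern (not the one shipped with Java Web Parts, which I haven't reproduced here). It catches the obvious form but misses the same payload once a character reference or a line break is introduced.

```java
import java.util.regex.Pattern;

public class SingleRegexDemo {
    // Hypothetical single-pass blacklist, applied to the raw input.
    static final Pattern NAIVE =
            Pattern.compile("<script|javascript:", Pattern.CASE_INSENSITIVE);

    public static boolean flagged(String html) {
        return NAIVE.matcher(html).find();
    }

    public static void main(String[] args) {
        String plain   = "<a href=\"javascript:alert(1)\">x</a>";
        String encoded = "<a href=\"jav&#x61;script:alert(1)\">x</a>";
        String split   = "<a href=\"java\nscript:alert(1)\">x</a>";

        System.out.println(flagged(plain));   // true
        System.out.println(flagged(encoded)); // false: the reference hides the keyword
        System.out.println(flagged(split));   // false: the line break hides the keyword
    }
}
```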
FYI, while I find it ironic to reference a Microsoft resource on a
security exploit, they actually do have a decent little page about XSS...
http://support.microsoft.com/default.aspx?scid=kb;en-us;252985
The solutions it discusses, though, really don't help much when the
requirement is to render untrusted HTML. There's a lot more detail on
what's involved in some of the CERT advisories, for example:
http://www.cert.org/advisories/CA-2000-02.html
http://www.cert.org/tech_tips/malicious_code_mitigation.html
L.
Frank
Laurie Harper wrote:
Frank W. Zammetti wrote:
Not a problem...
http://javawebparts.sourceforge.net/javadocs/index.html
In the javawebparts.filter package, you should see the
CrossSiteScriptingFilter.
This will filter any incoming parameters, and optionally attributes
(good if you're forwarding somewhere), for a list of characters (you can
alter what it looks for via regex).
Ah, I initially skipped that package, thinking a servlet filter wasn't
really what I was after. Browsing through the code, it seems I was right.
For one thing, I want to filter text on output, not filter request
parameters on input. But more important, your filter only checks for
(and rejects) anything with a few particular characters -- all of
which are valid in most cases from an XSS-prevention standpoint.
For what it's worth, injecting XSS attacks through that filter is
pretty easy. For example, the following wouldn't be caught:
<script type="text/javascript">HOSTILE CODE HERE</script>
I'm hoping I can find something that addresses all the nefarious XSS
strategies out there. It's not easy to implement something that's
complete, especially when you try to deal with embedded CSS in the
HTML you're trying to sanitize...!
Thanks for the link though :-)
--
Laurie, Open Source advocate, Java geek and novice blogger:
http://www.holoweb.net/laurie
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]