On 7/11/07, Allen Gilliland <[EMAIL PROTECTED]> wrote:
Okay, this work is basically complete, but after doing a fair amount of
testing I am a little uncertain if our handling of html in comments is
working how we expected.  My understanding from looking at the UI is
that we are supposed to have these options ...

1. HTML disabled and not allowed in comments.  Triggered by checking
"escape comment html" on global config page.  This is also the property
which toggles the "HTML Syntax: Enabled/Disabled" text on the comment form.

What this means is that, if you enter an HTML tag it will be escaped
on display and therefore tags you enter will be displayed as text and
not interpreted by the browser as HTML tags.


2. HTML subset available.  Only certain html tags work.  I'm not sure I
fully understand how this is supposed to be enforced, it looks like the
code tries to escape all html into &lt; &gt; syntax and then unescapes
certain tags.

In this case, we escape on display and then un-escape certain tags
that we deem to be safe.


3. HTML enabled.  Basically, anything goes.


Yep, that's how they are supposed to work.



Now, from looking at the code it doesn't look like we are properly
providing these options.  From my testing, when HTML is supposed to be
disabled it's not, and I could still put HTML in comments.  And to a
larger degree, it looks like the escapeHTML() method isn't really doing
much.  This basically means that the HTML subset option wasn't working
either because the comment was never being escaped in the first place,
so that option was just doing nothing.

Works for me. Escaping completely prevents HTML tags from being
interpreted by the browser.


So can someone verify that my assumptions above are correct and that's
the actual functionality we are shooting for?

I think that fixing #1 should be relatively easy and we can just display
a comment error message to users if they put html in a comment when they
aren't supposed to.

If we escape (as we are doing now) that works just fine.


To handle the conditions for #1 and #3 I'd like to
just migrate the old "escape comment html" property to a new one called
"enable html in comments" and if it's enabled then we don't do anything
to check html in comments, if it's disabled then we reject any comments
with html in them and display and error message.

#2 is a bit more tricky since you then need to be parsing the comment
and looking for specific html.  At the end of the day this one feels
more like a comment validator than a built-in option to me because
effectively all you are trying to do is check if an incoming comment
uses any html that you are not allowing, which is the purpose of our
comment validators.  So I think that one should be rewritten as a
comment validator.

What's the problem with the white-list based HTML subsetting we have now?

- Dave

Reply via email to