Dave wrote:
On 7/11/07, Allen Gilliland <[EMAIL PROTECTED]> wrote:
Okay, well for some reason I am not getting escaped html even though the
comment wrapper is escaping html from the comment content before
returning it, so that is one thing that needs to be fixed.
The other problem is that there isn't (and hasn't been) a way to get to
option #3 where html is fully allowed. This is what is used to render
comment content in macros in 3.1 ...
#if($config.commentEscapeHtml)
#set($content = $utils.escapeHTML($comment.content))
#else
#set($content = $utils.transformToHTMLSubset($utils.escapeHTML($
comment.content)))
#end
Hmmm... that must have slipped in during XSS fixes. I'd better check
3.1.1 for that issue.
Yeah, that's definitely possible.
... and that basically means your only options are #1 (no html) or #2
(partial html), which is not really what we are aiming for i don't think.
Then, even if html escaping is working properly for #1, i don't think
that's the right way of providing that functionality. If you don't want
people putting html in comments then we should be rejecting those
comments, not just leaving the html there and escaping it. The problem
with just escaping the html is that 1) it's ugly and if someone puts
lots of html in their comment and it gets escaped then it makes the
comment almost unreadable and 2) it's dangerous because technically if
at some later point in time html escaping isn't applied anymore then
potentially dangerous html in the comment can become active again. So
it makes far more sense to me that if the admin wants to prevent html in
comments then we should reject those comments with an error message.
So, getting back to my original suggestions again I think we can revise
a couple things.
1. I think we still need a new runtime property for "enable html in
comments?" This is kind of like a built in comment validator which if
enabled will reject comments with html in them.
Hmm... When HTML in comments is disabled I'm not sure if we should
reject HTML, strip it or simply escape it as we have been doing.
What if somebody wants to write about the <h1> tag and wants the
comment to display the string <h1> and not have it interpreted as
HTML. If HTML is turned off, why should should we reject <h1>. Why not
just treat it as text by escaping it at display time?
If we had a WYSIWYG comment editor then they could type <h1> and it
would appear as <h1>, but since we don't they have to type <h1>
which is bit of a pain. Then again, if somebody is writing about HTML
elements then they're probably smart enough to know how to escape
them.
Not sure what the right thing is to do here, but we need to be careful
about making such a significant change in comment behavior.
I agree, there are a variety of options and the choice isn't obvious.
I personally like the idea of rejecting comments with html in them when
the admin has disabled html in comments because it seems like the safest
and most consistent thing to do. My feeling is that it's more likely
that users will enter in a comment and try to use html than it is that
users will be using the < and > characters in a non-html fashion. So
rejecting the comment and telling them not to use those characters will
be appropriate and fitting most of the time.
There certainly will be some cases where users aren't trying to put in
html and their comment gets rejected, but my feeling is that those cases
will be relatively few and with a decent error message they should be
able to fix it without much of a problem.
A WYSIWYG comment editor would be nice, but ultimately that doesn't
solve the problem because at the end of the day we still need to
properly validate the comment on the server side. So even if a WYSIWYG
editor was escaping content for people there is still comment spam and
other situations where someone could try to submit html in their comment
and we would need to deal with it.
-- Allen
2. For doing the html subset we can basically leave it how it is now as
a comment plugin and what it will do is the same thing it does now, just
escape all html then unescape the supported tags. So the expected use
case here would be if the user checked the "enable html in comments?"
box to allow for html in their comments, then checked the HTML Subset
plugin which transforms the comment contents so that only certain html
tags are left unescaped.
Yes, that sounds good.
- Dave