Okay, this work is basically complete, but after doing a fair amount of
testing I am a little uncertain if our handling of html in comments is
working how we expected. My understanding from looking at the UI is
that we are supposed to have these options ...
1. HTML disabled and not allowed in comments. Triggered by checking
"escape comment html" on global config page. This is also the property
which toggles the "HTML Syntax: Enabled/Disabled" text on the comment form.
2. HTML subset available. Only certain html tags work. I'm not sure I
fully understand how this is supposed to be enforced, it looks like the
code tries to escape all html into < > syntax and then unescapes
certain tags.
3. HTML enabled. Basically, anything goes.
Now, from looking at the code it doesn't look like we are properly
providing these options. From my testing, when HTML is supposed to be
disabled it's not, and I could still put HTML in comments. And to a
larger degree, it looks like the escapeHTML() method isn't really doing
much. This basically means that the HTML subset option wasn't working
either because the comment was never being escaped in the first place,
so that option was just doing nothing.
So can someone verify that my assumptions above are correct and that's
the actual functionality we are shooting for?
I think that fixing #1 should be relatively easy and we can just display
a comment error message to users if they put html in a comment when they
aren't supposed to. To handle the conditions for #1 and #3 I'd like to
just migrate the old "escape comment html" property to a new one called
"enable html in comments" and if it's enabled then we don't do anything
to check html in comments, if it's disabled then we reject any comments
with html in them and display and error message.
#2 is a bit more tricky since you then need to be parsing the comment
and looking for specific html. At the end of the day this one feels
more like a comment validator than a built-in option to me because
effectively all you are trying to do is check if an incoming comment
uses any html that you are not allowing, which is the purpose of our
comment validators. So I think that one should be rewritten as a
comment validator.
Does that all make sense?
-- Allen
Allen Gilliland wrote:
Dave wrote:
On 7/10/07, Allen Gilliland <[EMAIL PROTECTED]> wrote
One thing that I didn't mention but is probably worth saying ... it's
also possible to design things a bit differently such that instead of
saving comments unmodified and then applying reformatting plugins as
they get displayed, we could apply the plugins on the incoming comment
and save the transformed version.
Each approach has it's own benefits. Applying plugins on the incoming
comment simplifies things quite a bit because we no longer need to add
the 'plugins' attribute to the db or pojo and it would improve
performance a fair amount since we wouldn't need to apply XX plugins per
comment during rendering. The downside is that since you modify the
comment before saving it you have no way of getting back to the original
comment if you wanted to, although that's not likely to be necessary.
I kind of like the idea of just applying the plugins once as the comment
is being submitted and then not having to worry about it after that, but
the solution I detailed above is slightly more flexible.
I don't have a strong feeling about this, but I guess I prefer to
store the raw comment data.
Actually, I remember why I abandoned this option as I was prototyping.
The major difficulty here is upgrade process because currently we are
doing transformations of comment content as it gets displayed and so if
we wanted to change that then it means going through all comments in the
db, applying the tranformations, and then re-saving. Basically, a major
PITA.
-- Allen
- Dave