Okay, this work is basically complete, but after doing a fair amount of testing I am a little uncertain if our handling of html in comments is working how we expected. My understanding from looking at the UI is that we are supposed to have these options ...

1. HTML disabled and not allowed in comments. Triggered by checking "escape comment html" on global config page. This is also the property which toggles the "HTML Syntax: Enabled/Disabled" text on the comment form.

2. HTML subset available. Only certain html tags work. I'm not sure I fully understand how this is supposed to be enforced, it looks like the code tries to escape all html into < > syntax and then unescapes certain tags.

3. HTML enabled.  Basically, anything goes.

Now, from looking at the code it doesn't look like we are properly providing these options. From my testing, when HTML is supposed to be disabled it's not, and I could still put HTML in comments. And to a larger degree, it looks like the escapeHTML() method isn't really doing much. This basically means that the HTML subset option wasn't working either because the comment was never being escaped in the first place, so that option was just doing nothing.

So can someone verify that my assumptions above are correct and that's the actual functionality we are shooting for?

I think that fixing #1 should be relatively easy and we can just display a comment error message to users if they put html in a comment when they aren't supposed to. To handle the conditions for #1 and #3 I'd like to just migrate the old "escape comment html" property to a new one called "enable html in comments" and if it's enabled then we don't do anything to check html in comments, if it's disabled then we reject any comments with html in them and display and error message.

#2 is a bit more tricky since you then need to be parsing the comment and looking for specific html. At the end of the day this one feels more like a comment validator than a built-in option to me because effectively all you are trying to do is check if an incoming comment uses any html that you are not allowing, which is the purpose of our comment validators. So I think that one should be rewritten as a comment validator.

Does that all make sense?

-- Allen


Allen Gilliland wrote:


Dave wrote:
On 7/10/07, Allen Gilliland <[EMAIL PROTECTED]> wrote
One thing that I didn't mention but is probably worth saying ... it's
also possible to design things a bit differently such that instead of
saving comments unmodified and then applying reformatting plugins as
they get displayed, we could apply the plugins on the incoming comment
and save the transformed version.

Each approach has it's own benefits.  Applying plugins on the incoming
comment simplifies things quite a bit because we no longer need to add
the 'plugins' attribute to the db or pojo and it would improve
performance a fair amount since we wouldn't need to apply XX plugins per
comment during rendering.  The downside is that since you modify the
comment before saving it you have no way of getting back to the original
comment if you wanted to, although that's not likely to be necessary.

I kind of like the idea of just applying the plugins once as the comment
is being submitted and then not having to worry about it after that, but
the solution I detailed above is slightly more flexible.

I don't have a strong feeling about this, but I guess I prefer to
store the raw comment data.


Actually, I remember why I abandoned this option as I was prototyping. The major difficulty here is upgrade process because currently we are doing transformations of comment content as it gets displayed and so if we wanted to change that then it means going through all comments in the db, applying the tranformations, and then re-saving. Basically, a major PITA.

-- Allen



- Dave

Reply via email to