Dave wrote:
On 7/11/07, Allen Gilliland <[EMAIL PROTECTED]> wrote:
I personally like the idea of rejecting comments with html in them when
the admin has disabled html in comments because it seems like the safest
  and most consistent thing to do.  My feeling is that it's more likely
that users will enter in a comment and try to use html than it is that
users will be using the < and > characters in a non-html fashion.  So
rejecting the comment and telling them not to use those characters will
be appropriate and fitting most of the time.

There certainly will be some cases where users aren't trying to put in
html and their comment gets rejected, but my feeling is that those cases
will be relatively few and with a decent error message they should be
able to fix it without much of a problem.

I don't like the idea of rejecting a comment because it has a bracket
that definitely doesn't seem right. If HTML Is disabled then we should
treat the comment as plain text. An angle bracket is allowed in plain
text and is not considered HTML.


I don't like it either, but I'm not sure that will happen all that often. What we really need is a better parser so that we could actually distinguish between real html and casual use of the < and > characters.



I think the most consistent thing to do is this:

If HTML is disabled in comments that means that Roller consider the
comment content to be of type text/plain. Add a content-type field to
the comment table so we can save that. And whenever Roller displays
content of type text/plain in HTML or XML it must be escaped.

If HTML is enabled then consider the content type to be text/html.
When we display HTML in a comment we always do the HTML
subsetting/white-listing bit. I don't think we should ever display raw
comment HTML.


Okay, that sounds like a pretty good plan to me. However, I'm not sure I agree about forcing the HTML subsetting/white-listing part. While I agree that it is typically dangerous and undesirable, we have to remember that many times blog applications are used in trusted environments now, such as corporate intranets, and in those cases it would be entirely valid to just allow any html in comments.

I guess I also think that it seems silly to not make that an option, since undoubtably someone will want to do things that way.

So I will add one more column to the roller_comment table for content-type and tweak the upgrade process so that it gets properly set based on the previous value of the 'users.comments.escapehtml' property. If that property was enabled then the content-type will be text/plain and if not then it will be text/html.



Going farther down that road, maybe we shouldn't have a "Allow HTML in
comments" property. Instead, have a "Set content-type to be used for
new comments" and offer options Text, HTML and XHTML.

This will also make it clear how comments should be treated when in a
feed. See the Atom <content> elements type attribute here:
http://tools.ietf.org/html/rfc4287#page-14


I think that's fine, but it's extra effort to do that and I don't have time for it right now. I think just sticking with a simple html or plain text option will be fine for now, and there won't be any issues with expanding on that down the road and allowing admins to choose their own content-type if they want.

So for right now I'm just going to leave things as is and provide just a single "Allow HTML in comments?" checkbox, which is effectively the same functionality we have right now. If HTML is allowed then the content-type for incoming comments will be set to text/html, otherwise it will be set to text/plain.

Cool?

-- Allen




If we don't want to go down that road and add a content-type, I think
we should stick with the practice of escaping comments that are not
supposed to be HTML rather than rejecting any angle brackets or HTML
tags. I don't think you have enough justification for making such a
behavior change.

- Dave





- Dave

Reply via email to