I've been skimming this thread, but not thoroughly reading each message. One of the things I'd like to see (that doesn't happen now) is to allow full HTML - no ifs ands or buts. Meaning that I can add JavaScript or any HTML tags I'd like to a comment. In addition, it would be nice to add some comment-post pre-processors that allow replacing \n with <br/> before the entry gets put in the database. The first feature is for me. I seem to comment as much on my blog as others do. It's frustrating for me when my HTML gets escaped. Furthermore, since I get almost instant notification of comments - I know when to delete those that have XSS text.
The 2nd feature is for readers. A lot of people comment on my blog w/o doing "preview" first and end up with comments that run together - even though there's line breaks in their entry. I realize I can change the macro to render the output different, but that affects all comments - whereas I just want to affect new ones. I regularly go into my database and format comments by adding <br/><br/> tags, but it'd be nice not to have to do that. Matt On 7/12/07, Allen Gilliland <[EMAIL PROTECTED]> wrote:
Dave wrote: > On 7/11/07, Allen Gilliland <[EMAIL PROTECTED]> wrote: >> I personally like the idea of rejecting comments with html in them when >> the admin has disabled html in comments because it seems like the safest >> and most consistent thing to do. My feeling is that it's more likely >> that users will enter in a comment and try to use html than it is that >> users will be using the < and > characters in a non-html fashion. So >> rejecting the comment and telling them not to use those characters will >> be appropriate and fitting most of the time. >> >> There certainly will be some cases where users aren't trying to put in >> html and their comment gets rejected, but my feeling is that those cases >> will be relatively few and with a decent error message they should be >> able to fix it without much of a problem. > > I don't like the idea of rejecting a comment because it has a bracket > that definitely doesn't seem right. If HTML Is disabled then we should > treat the comment as plain text. An angle bracket is allowed in plain > text and is not considered HTML. I don't like it either, but I'm not sure that will happen all that often. What we really need is a better parser so that we could actually distinguish between real html and casual use of the < and > characters. > > I think the most consistent thing to do is this: > > If HTML is disabled in comments that means that Roller consider the > comment content to be of type text/plain. Add a content-type field to > the comment table so we can save that. And whenever Roller displays > content of type text/plain in HTML or XML it must be escaped. > > If HTML is enabled then consider the content type to be text/html. > When we display HTML in a comment we always do the HTML > subsetting/white-listing bit. I don't think we should ever display raw > comment HTML. Okay, that sounds like a pretty good plan to me. However, I'm not sure I agree about forcing the HTML subsetting/white-listing part. While I agree that it is typically dangerous and undesirable, we have to remember that many times blog applications are used in trusted environments now, such as corporate intranets, and in those cases it would be entirely valid to just allow any html in comments. I guess I also think that it seems silly to not make that an option, since undoubtably someone will want to do things that way. So I will add one more column to the roller_comment table for content-type and tweak the upgrade process so that it gets properly set based on the previous value of the 'users.comments.escapehtml' property. If that property was enabled then the content-type will be text/plain and if not then it will be text/html. > > Going farther down that road, maybe we shouldn't have a "Allow HTML in > comments" property. Instead, have a "Set content-type to be used for > new comments" and offer options Text, HTML and XHTML. > > This will also make it clear how comments should be treated when in a > feed. See the Atom <content> elements type attribute here: > http://tools.ietf.org/html/rfc4287#page-14 I think that's fine, but it's extra effort to do that and I don't have time for it right now. I think just sticking with a simple html or plain text option will be fine for now, and there won't be any issues with expanding on that down the road and allowing admins to choose their own content-type if they want. So for right now I'm just going to leave things as is and provide just a single "Allow HTML in comments?" checkbox, which is effectively the same functionality we have right now. If HTML is allowed then the content-type for incoming comments will be set to text/html, otherwise it will be set to text/plain. Cool? -- Allen > > > If we don't want to go down that road and add a content-type, I think > we should stick with the practice of escaping comments that are not > supposed to be HTML rather than rejecting any angle brackets or HTML > tags. I don't think you have enough justification for making such a > behavior change. > > - Dave > > > > > > - Dave
-- http://raibledesigns.com
