* Louis-David Mitterrand <vindex+lists-markdown-disc...@apartia.org> 
[2010-05-05 16:05]:
> What would be a "reasonable defaults" whitelist for html tags
> in a forum context?

All the tags Markdown has syntax for:

    em strong a img code br
    p ul ol li blockquote pre h1 h2 h3 h4 h5 h6

Plus a few very reasonable extras:

    i b cite del ins
    dl dd dt

Attributes that should be allowed:

    a: href title
    img: src alt title
    ol: start
    blockquote: cite

That’s the minimal reasonable set, I think.
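
To make the lists above concrete, here is a minimal sketch of a whitelist scrubber built on Python's standard `html.parser`. The names `scrub`, `Scrubber`, `ALLOWED_TAGS` and `ALLOWED_ATTRS` are made up for illustration, and this is an assumption about how one might wire it up rather than a vetted sanitizer. It does not check URL values at all, so that still has to happen separately.

```python
from html import escape
from html.parser import HTMLParser

# Tags and attributes copied from the lists above.
ALLOWED_TAGS = set(
    "em strong a img code br p ul ol li blockquote pre "
    "h1 h2 h3 h4 h5 h6 i b cite del ins dl dd dt".split()
)
ALLOWED_ATTRS = {
    "a": {"href", "title"},
    "img": {"src", "alt", "title"},
    "ol": {"start"},
    "blockquote": {"cite"},
}
VOID_TAGS = {"br", "img"}  # tags that never take a closing tag


class Scrubber(HTMLParser):
    """Re-emit only whitelisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is kept, escaped
        allowed = ALLOWED_ATTRS.get(tag, set())
        kept = "".join(
            ' {}="{}"'.format(name, escape(value or "", quote=True))
            for name, value in attrs
            if name in allowed  # `style`, `onclick`, etc. never get through
        )
        self.out.append("<{}{}>".format(tag, kept))

    def handle_startendtag(self, tag, attrs):
        # Treat <br/> like <br>.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def scrub(html):
    """Return `html` with everything outside the whitelist removed."""
    parser = Scrubber()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Rebuilding the output from parsed pieces, instead of deleting bad bits from the input string, means anything the code does not explicitly re-emit simply disappears. Comments and doctype declarations are dropped for the same reason.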

You may or may not want to also whitelist the table-related tags:

    table tr td th
    tbody tfoot thead caption

Most of their possible attributes should be allowed in that case.

For those, you’ll need to tidy the HTML, not just scrub it, else
people will be able to break your layout in malicious ways.
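
As a rough illustration of one part of what tidying involves, the sketch below closes any tags left open and drops stray closing tags, so a post cannot leave a `table` or `td` dangling into the rest of the page. The names `balance` and `Balancer` are hypothetical, and the code assumes its input has already been scrubbed against the whitelist. A real tidier does considerably more (implied end tags, legal nesting of table elements and so on), so treat this as a starting point only.

```python
from html import escape
from html.parser import HTMLParser

VOID_TAGS = {"br", "img"}  # tags that never take a closing tag


class Balancer(HTMLParser):
    """Re-emit markup so that every opened tag is closed, in order."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []  # stack of currently open tag names

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            ' {}="{}"'.format(name, escape(value or "", quote=True))
            for name, value in attrs
        )
        self.out.append("<{}{}>".format(tag, rendered))
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: drop it
        # Close everything opened since the matching start tag.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</{}>".format(top))
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def balance(html):
    """Return `html` with unclosed tags closed and stray closes removed."""
    parser = Balancer()
    parser.feed(html)
    parser.close()
    while parser.open_tags:  # close whatever is still open at the end
        parser.out.append("</{}>".format(parser.open_tags.pop()))
    return "".join(parser.out)
```

For example, `balance("<table><tr><td>cell</div></div>")` yields a fully closed table and discards the two `</div>` tags that would otherwise close the forum's own wrappers.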

You ***DON’T*** want to whitelist the `style` attribute under any
circumstances, unless you also have a very very very careful CSS
scrubber, because otherwise it’s possible to inject JavaScript
that way.

You’ll also want to validate `...@href` values to keep people from
putting `javascript:` URIs or similar foolishness in there. If in
doubt, allow too little.
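
A minimal sketch of such a check for `href` values, in the "allow too little" spirit: accept only URLs with no scheme (relative links) or with a scheme from a short allow-list. The function name `is_safe_url` and the particular set of schemes are my assumptions, so adjust them to taste. It expects the attribute value after the HTML parser has already decoded character references.

```python
import re

# Assumption: these are the only schemes a forum needs to permit.
SAFE_SCHEMES = {"http", "https", "mailto"}

# A URI scheme per RFC 3986: a letter, then letters, digits, "+", "." or "-".
SCHEME_RE = re.compile(r"^([a-zA-Z][a-zA-Z0-9+.\-]*):")


def is_safe_url(url):
    """Return True if `url` is relative or uses an allow-listed scheme."""
    # Browsers ignore whitespace and control characters inside a scheme,
    # so remove them before checking ("java\tscript:" must not slip by).
    cleaned = "".join(ch for ch in url if ch > " " and ch != "\x7f")
    match = SCHEME_RE.match(cleaned)
    if match is None:
        return True  # no scheme at all: a relative URL
    return match.group(1).lower() in SAFE_SCHEMES
```

The point of listing the good schemes instead of the bad ones is that `javascript:`, `data:`, `vbscript:` and whatever gets invented next are all rejected without having to be named.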

Those are the main considerations out of the way.

Personally I’d also whitelist `small` and `big`, much like `i`
and `b`. You need the latter because `em` and `strong` are wrong
to use for some well-reasoned formatting that isn’t emphatic
(such as italicising names in citations) – likewise, if you only
leave the header tags for smaller/bigger text, people will abuse
them for setting large or small text that’s not a headline.

For similar reasons, I’d also whitelist `tt`.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
_______________________________________________
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss
