On Fri, 2010-05-21 at 14:24 +0100, David Otton wrote:
> On 20 May 2010 16:51, Al <n...@ridersite.org> wrote:
> > I'm not being clear. First pass is through the blacklist, which
> > effectively tells the hacker not to bother and deletes the entry entirely.
> > If the raw entry gets past the blacklist, it must then only contain my
> > whitelist tags. e.g., the two examples you cited were caught by the
> > whitelist parser.
> Ah, gotcha. That seems like a much better approach to me. But if the
> whitelist's going to stop the submission, then why bother with a
> blacklist at all?
I still think you might be better off using BBCode, which websites use
for exactly this purpose. When any input comes back, you can
remove all the HTML completely and replace the BBCode tags that you
allow. This should guarantee that the only HTML in the text is what you
put there. That way, the only chance someone has to enter malicious code
is to manipulate your replacement algorithm.
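
To make that concrete, here is a rough sketch in Python of the kind of
replacement step I mean. It is only an illustration under a few
assumptions: the ALLOWED_TAGS table and the bbcode_to_html name are made
up for this example, and it escapes the submitted HTML rather than
stripping it, which is a substitution on my part that has the same
effect of leaving no user-supplied markup active.

```python
import html
import re

# Hypothetical whitelist: BBCode tag name -> HTML element emitted for it.
ALLOWED_TAGS = {"b": "strong", "i": "em", "u": "u"}


def bbcode_to_html(raw: str) -> str:
    """Escape all submitted HTML, then expand only whitelisted BBCode pairs."""
    # Step 1: neutralise any HTML the user typed by escaping it.
    text = html.escape(raw, quote=True)
    # Step 2: replace attribute-free BBCode pairs from the whitelist.
    # Anything not in ALLOWED_TAGS (e.g. [url=...]) is left as literal text.
    for bb, tag in ALLOWED_TAGS.items():
        pattern = re.compile(
            r"\[%s\](.*?)\[/%s\]" % (bb, bb),
            re.IGNORECASE | re.DOTALL,
        )
        text = pattern.sub(r"<%s>\1</%s>" % (tag, tag), text)
    return text


if __name__ == "__main__":
    print(bbcode_to_html("[b]hi[/b] <script>alert(1)</script>"))
```

Because the tags here take no attributes, nothing the user types ends up
inside an HTML tag. Supporting something like [url=...] would need its
own validation of the URL, and that is exactly the part of the
replacement algorithm an attacker would go after.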