On Monday, 12 December 2016 at 11:32:42 UTC, Nicholas Wilson wrote:
for strip_tags I would look for an xml library (e.g. arsd.dom) and parse it and then reprint it without the tags. There's probably a better way to do it though. I'm sure Adam Ruppe will be able to help you there.

Well, it depends what you are doing with it. If you are just outputting user data, I wouldn't allow any HTML at all... but I'd do it by encoding it all. So if they write "<script>" in the form, the output will be "&lt;script&gt;", which is harmless.

dom.d's htmlEntitiesEncode will do that:

http://dpldocs.info/experimental-docs/arsd.dom.htmlEntitiesEncode.html

auto safe = htmlEntitiesEncode(user_data);


It is comparable to htmlentities() in PHP.



If you want to allow some HTML but not all, then yeah, you can use the full DOM parser and rip stuff out that way. Element.stripOut <http://dpldocs.info/experimental-docs/arsd.dom.Element.stripOut.html> can help with that, or innerText <http://dpldocs.info/experimental-docs/arsd.dom.Element.innerText.1.html>.
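
A rough sketch of the DOM route, assuming dom.d (arsd.dom) is on your import path. The helper names plainText and unwrapTags, the wrapper div with id "user", and the sample input are made up for illustration; the dom.d calls used are Document.parseGarbage, querySelectorAll, requireSelector, Element.stripOut, innerText and innerHTML. Treat it as a starting point rather than a complete sanitizer: it does nothing about attributes such as onclick.

```d
// Requires Adam Ruppe's arsd.dom module (dom.d) on the import path.
import arsd.dom;
import std.stdio;

/// Hypothetical helper: drop every tag and keep only the text,
/// roughly what PHP's strip_tags() gives you.
string plainText(string userHtml) {
    auto document = new Document();
    // parseGarbage is the lenient parser, suited to untrusted, messy markup
    document.parseGarbage("<html><body>" ~ userHtml ~ "</body></html>");
    return document.root.innerText;
}

/// Hypothetical helper: remove the tags matched by `selector`
/// but keep their children in place (that is what stripOut does).
/// The selector must not match the wrapper div itself.
string unwrapTags(string userHtml, string selector) {
    auto document = new Document();
    document.parseGarbage(
        "<html><body><div id=\"user\">" ~ userHtml ~ "</div></body></html>");
    foreach (element; document.querySelectorAll(selector))
        element.stripOut();
    return document.requireSelector("#user").innerHTML;
}

void main() {
    string user_data = "<font color=\"red\">Hello <b>world</b></font>";
    writeln(plainText(user_data));          // text only, no tags
    writeln(unwrapTags(user_data, "font")); // <font> removed, <b> kept
}
```

If the result goes back into a page as text rather than markup, run it through htmlEntitiesEncode as well.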


Ask me if you need more.
