Re: Sanatising HTML content through sandboxing

Charles Pritchard Thu, 10 Nov 2011 01:13:37 -0800

+1 to an HTMLParser object.
Many other methods end up loading resources when image elements are created.



On 11/8/11 11:54 PM, Adam Barth wrote:

Also, a div doesn't represent a security boundary.  It's difficult to
sandbox something unless you have a security boundary around it.
IMHO, an easy way to solve this problem is to just expose an
HTMLParser object, analogous to DOMParser, which folks can use to
safely parse HTML, e.g., from XMLHttpRequest.

Adam


On Tue, Nov 8, 2011 at 11:28 PM, Jonas Sicking wrote:

Given that this type of sandbox would work very differently from the
iframe sandbox, I think reusing the same attribute name would be
confusing.

Additionally, what's the behavior if you remove the attribute? What if
you do elem.innerHTML += "foo" on the element after having removed the
sandbox? Or on an element's parent?

Or what happens if you do foo.innerHTML = bar.innerHTML where a parent
of bar has sandbox set?

When sanitizing, I strongly feel that we should simply remove all
content that could execute script, so as to ensure that it doesn't leak
somewhere else when markup is copied. Trying to ensure that it never
executes, while still allowing it to exist, is too high risk IMO.
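
To illustrate the removal approach described above, here is a minimal
sketch. It is not code from the thread and not a complete sanitiser. The
name stripExecutableContent and the tag list are assumptions made for
illustration; it only handles a few script-capable elements, on* handler
attributes, and javascript: URLs.

```javascript
// Hypothetical helper, not part of any proposal in this thread.
// Tags treated as script-capable here; this list is an assumption
// and is deliberately incomplete.
const SCRIPT_CAPABLE_TAGS = new Set(["script", "iframe", "object", "embed"]);

// Walks the element children of `root` and removes content that could
// execute script. The attributes of `root` itself are not inspected.
function stripExecutableContent(root) {
  // Copy the child list first, because removing nodes mutates it.
  for (const child of Array.from(root.children)) {
    if (SCRIPT_CAPABLE_TAGS.has(child.tagName.toLowerCase())) {
      root.removeChild(child);
      continue;
    }
    // Copy the attribute list for the same reason.
    for (const attr of Array.from(child.attributes)) {
      const name = attr.name.toLowerCase();
      const value = String(attr.value).trim().toLowerCase();
      // Drop inline event handlers and javascript: URLs (naive check).
      if (name.startsWith("on") || value.startsWith("javascript:")) {
        child.removeAttribute(attr.name);
      }
    }
    stripExecutableContent(child);
  }
  return root;
}
```

The sketch assumes the markup has already been parsed into a tree that is
not attached to the live document, so nothing runs or loads before the
walk happens. Because the risky content is removed rather than disabled,
copying the resulting markup elsewhere cannot reintroduce it.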

/ Jonas

On Tue, Nov 8, 2011 at 5:21 PM, Ryan Seddon wrote:

Right now there is no simple way to sanitise HTML content by stripping it of
any potentially malicious HTML such as scripts etc.

In the "innerHTML in DocumentFragment" thread I suggested following the
sandbox attribute approach that can be applied to iframes. I've moved this
out into its own thread, as Jonas suggested, so as not to dilute the
innerHTML discussion.

There was mention of a suggested API called innerStaticHTML as a potential
solution to this. I personally would prefer to reuse the sandbox approach
that the iframes use.

e.g.

xhr.responseText = "<script src='malicious.js'></script><div><h1>content</h1></div>";

var div = document.createElement("div");

div.sandbox = ""; // Static content only
div.innerHTML = xhr.responseText;

document.body.appendChild(div);

This could also apply to a DocumentFragment and any other applicable DOM
APIs. Being able to let the HTML parser do what it does best would make
sense.

The advantage of this over a new API is that it would also allow the use of
space-separated tokens to whitelist certain things within the HTML
being parsed into the document, and would leave it open to future extension.
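
As a rough sketch of how such a space-separated token list might be read
(an illustration only, not part of the proposal): the function name
parseSandboxTokens is hypothetical, and the token names are borrowed from
the iframe sandbox attribute. Whether they would apply to a div is an
assumption.

```javascript
// Tokens recognised by this sketch. The names come from the iframe
// sandbox attribute; their meaning on other elements is assumed.
const KNOWN_TOKENS = new Set(["allow-scripts", "allow-forms"]);

// Hypothetical helper: turns an attribute value such as
// "allow-forms allow-scripts" into a set of permitted features.
// An empty value yields an empty set, i.e. static content only.
function parseSandboxTokens(value) {
  const allowed = new Set();
  for (const token of value.split(/\s+/)) {
    const normalised = token.toLowerCase();
    // Unknown tokens are ignored, which leaves room for future extension.
    if (KNOWN_TOKENS.has(normalised)) {
      allowed.add(normalised);
    }
  }
  return allowed;
}
```

Ignoring unrecognised tokens instead of rejecting the whole value means
an older parser would fall back to the more restrictive behaviour when it
meets a token added later.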

-Ryan
