Re: Cross Origin Web Components: Fixing iframes

Ryosuke Niwa Tue, 03 Dec 2013 18:48:36 -0800

On Nov 26, 2013, at 10:15 PM, Dominic Cooney <domin...@google.com> wrote:


> On Wed, Nov 27, 2013 at 2:19 PM, Ryosuke Niwa <rn...@apple.com> wrote:
> 
> On Nov 27, 2013, at 8:57 AM, Dominic Cooney <domin...@google.com> wrote:
> 
>> On Tue, Nov 26, 2013 at 2:03 PM, Ryosuke Niwa <rn...@apple.com> wrote:
>> Hi,
>> 
>> I have been having informal discussions of our earlier proposal for 
>> cross-origin use cases and declarative syntax for web components, and I 
>> realized there was a lot of confusion about our motivations and design 
>> decisions.  So I wanted to explain why/how we came up with that proposal in 
>> this email.
>> 
>> 
>> Problem: A lot of websites embed SNS widgets, increasing the security 
>> surface of embedders.  The old version of techcrunch.com, for example, had 
>> 5+ social share buttons on each article.  If any one of those SNS websites 
>> got compromised, then the embedder will also get compromised.
>> 
>> This is a valid problem. Does anyone have related use cases that might be 
>> in-scope for this discussion?
> 
> Comment forms (e.g. DISQUS) are another important use case.
> 
>> What if we used iframe?
>> What if we replaced each such instance with an iframe?  That would give us a 
>> security boundary.
>> 
>> On the other hand, using an iframe for each social button is very expensive 
>> because each iframe loads a document, creates its own security origin, JS 
>> global object, and so forth. Initializing a new script context (a.k.a. "VM", 
>> "world", "isolate", etc…) for every single SNS widget on a page is quite 
>> expensive.  If we had 10 articles, and each article had 5 social buttons, 
>> we'll have 50 iframes, each of which needs to load megabytes of JavaScript.
>> 
>> iframe is also heavily restricted in terms of its ability to lay itself out. 
>> Comment widgets (e.g. DISQUS) for example need to stretch themselves to the 
>> height of their content.
>> 
>> We also need a better mechanism to pass arguments and communicate with 
>> cross-origin frames than postMessage.
>> 
>> 
>> What if we made iframe lighter & used seamless iframe?
>> The cost of iframe could be reduced substantially if we cached and 
>> internally shared each page's JavaScript.  However, we still have to 
>> instantiate its own script context, document, and window objects.
>> 
>> We can also use seamless iframe to address the comment widget use case.
>> 
>> 
>> What if we let each iframe create multiple "views"?
>> The problem with using an iframe for a cross-origin widget is that each 
>> iframe creates its own document, window, etc… even if there are multiple 
>> widgets from the same origin.  e.g. if we had a tweet button on 10 different 
>> articles, we have to create its own document, window, etc… for each tweet 
>> button.
>> 
>> We can reduce this cost if we could share the single frame, and have it 
>> render multiple "views".  Naturally, each such view will be represented as a 
>> separate DOM tree.  In this model, a single iframe owns multiple DOM trees, 
>> each of which will be displayed at different locations in the host document. 
>>  Each such DOM tree is inaccessible from the host document, and the host 
>> document is inaccessible from the iframe.
>> 
>> This model dramatically reduces the cost of having multiple widgets from the 
>> same origin.  e.g. if we have 10 instances of widgets from 5 different 
>> social networks, then we'll have only 5 iframes (each of which will have 10 
>> "views") as opposed to 50 of them.
>> 
>> 
>> What if we provided a declarative syntax to create such a view?
>> Providing a better API proved to be challenging.  We could have let page 
>> authors register a custom element for each cross-origin widget but that 
>> would mean that page authors have to write a lot of script just to embed 
>> some third-party widgets.  We need some declarative syntax to let authors 
>> wrap an iframe.
>> 
>> Furthermore, if we wanted to use the multiple-views-per-iframe, then we'll 
>> need a mechanism to declare where each instance of such a view is placed in 
>> the host document with arguments/configuration options for each view.
>> 
>> A custom element seemed like a natural fit for this task but the 
>> prototype/element object cannot be instantiated in the host document since 
>> the cross-origin widgets' script can't run in the host document and 
>> prototype objects, etc… cannot be shared between the host document and the 
>> shared iframes.  So we'll need some mechanism for the shared iframe to 
>> define custom element names, and have the host document explicitly import 
>> them as needed.
>> 
>> 
>> At this point, the set of features we needed looked very similar to the 
>> existing custom element and shadow DOM.  Each "view" of the shared iframe 
>> was basically a shadow DOM with a security boundary sitting between the host 
>> element and the shadow root.  The declarative syntax for the "view" was 
>> basically a declarative syntax of a custom element that happens to 
>> instantiate a shadow DOM with a caveat that the shadow host is inaccessible 
>> from the component, and the shadow DOM is inaccessible from the host 
>> document.  It also seemed natural for such a "shared iframe" to be loaded 
>> using HTML imports.
>> 
>> 
>> You can think of our proposal as breaking iframe down into two pieces:
>> 1. Creating a new document/window
>> 2. Creating a new view
>> I think decomposing the problem this way is a good step.
>> 
>> Re: creating a new document/window, purely in terms of *mechanics*, IFRAME 
>> does this already. Is anything else required?
> 
> The problem is that iframe does both 1 and 2 but I agree that iframe already 
> provides this mechanism if we set style=display:none.  But it would be really 
> ugly and cumbersome if we had to import various SNS widgets with iframe with 
> style set to display:none.
> 
> Right. I think it will help us divide and conquer the problem if we can work 
> on (a) API aesthetics, and separately (b) mechanics in terms of as much 
> existing stuff as possible (IFRAME, viewport, etc.) I don't necessarily mean 
> taking that existing stuff as-is, but maybe pulling chunks out of the 
> existing stuff and specing it so it will explain the legacy stuff and work in 
> these new combinations for this new use case.

Yeah, that makes sense.

>> Re: creating a new view, this is really interesting to me. It seems there 
>> are a few different parts, I think most of these are needed for the use case 
>> above; I've also noted where we might break out and "explain" some existing 
>> part of the platform.
>> 
>> - Arranging the rendering of a DOM (sub)tree into a "view". IFRAME, 
>> ShadowRoot and indeed just "rendering in general" do this.
>> - Arranging the rendering of something else into a "view". Replaced elements 
>> like OBJECT and IMG do this. Maybe this is just trivially "arrange the 
>> rendering of a DOM containing CANVAS" though.
>> - Communicating or blocking layout across the "view" boundary. Cases where 
>> information flows outside-in: the viewport-document relationship; IFRAME. 
>> Cases where information flows two ways: seamless IFRAME, Shadow DOM, layout 
>> in general.
>> - Something about laying things out/rendering outside the bounds of the 
>> "view". Shadow DOM does this (you can rel/abs/fixed position stuff 
>> outside of the host element bounds.) This is a tricky one... in scope or 
>> does Shadow DOM remain a special case? Would some embedders trust a 
>> component enough to let them clickjack them, just not steal their cookies, 
>> etc.?
> 
> Right.  We need to add something like overflow: clip by default to prevent 
> click hijacking.
> 
> Is overflow: clip sufficient?
> 
> If we're trying to map this to primitives, does this mean that the UA 
> stylesheet has a high specificity rule which says "if you're one of these 
> elements entangled with a viewport, overflow: clip"?
> 
> I note that fb:like has a "flyout". Do you think it is a reasonable use case? 
> Should the component author be allowed to detect when their view-thing will 
> clip them or not?
>> and providing a mechanism to do 2 without doing 1 (or doing 2 multiple 
>> times after doing 1 once), and making it usable with a declarative syntax.
>> 
>> This definitely deserves to be bullet 3--usable with declarative syntax.
>> 
>> To clarify that I understand--the importance of succinct declarative syntax 
>> is so that the embedder doesn't end up including the "shim" script for Foo's 
>> widget from foo.com, which means trusting foo.com which was the whole point! 
>> Right?
> 
> Right.  Using Foo widget from foo.com should NOT involve running scripts from 
> foo.com in the host document.
> 
>> It would be nice if we could solve this problem in a layered way. For 
>> example, I think the "view" stuff above is a lower-level primitive, and the 
>> declarative syntax should be explained in terms of (something for getting a 
>> window+document--IFRAME?) plus "view" plus (extremely small alpha that 
>> explains how the stuff is wired up.)
> 
> That makes sense although we haven't come up with use cases where we just 
> want to use the multiple "views" cross-origin without the declarative syntax.
> 
> What about the status quo, where the embedder trusts the component being 
> embedded, but the component doesn't trust the embedder?

Authors can keep using script elements for that use case.  Is there some 
existing problem we want to solve in that use case?

> The component will be running scripts in the embedder's context (perhaps the 
> component has script API built on postMessage to the IFRAME) but for 
> efficiency it's desirable to have one IFRAME for the multiple like buttons, 
> etc.
>> I guess it is OK if the API is not declarative on the widget side? If we 
>> assume the widget enjoys the isolation of an IFRAME, is performance the 
>> primary motivator on this side?
> 
> Being declarative will definitely benefit the performance because preload 
> scanner, etc… could detect what kind of "views" are exposed/implemented in a 
> given "slave" (or "widget") document without running scripts.
> 
> Also, I'd imagine a lot of widgets would end up using templates so having to 
> manually instantiate those templates would be an annoyance.
> 
> Having written some basic apps with Polymer, it's evidently feasible to wrap 
> the template stamping up in a library. 
>> It would be nice if the widget author could get something rendered very 
>> quickly.
> 
> Right.
> 
>> I think this "declarative" part of the problem breaks down this way:
>> 
>> - How the page author "invokes" something in the embedded component. How is 
>> it named and how does the author mention the name?
> 
> So I think a custom element is the natural mechanism. e.g.
> 
> <import src="http://foo.com/widget.html" customelements="foo-button">
> <foo-button>Foo this</foo-button>
> 
> I see the appeal of Custom Elements, because it has a way to define a name 
> (document.register), mention a name (createElement, write markup, etc.) and 
> has a model of instantiating elements. But it has baggage you don't want, 
> like prototypes and constructors (on the embedder side.) There's also all the 
> details of this viewport entangling. Likewise with HTML Imports, they've got 
> some things you want (new document) but some things you don't (shared window, 
> shared globals) and some things I'm unsure about (synchronous versus 
> asynchronous).

Right.  Perhaps we could extract the common base of custom elements.

Alternatively, we can provide a mechanism to auto-create custom elements as a 
wrapper for cross-origin widgets.  i.e. we want to have the imported document 
create a DOM tree given a name of tag/element, and then securely insert it 
somewhere in the host document as a custom element.

The downside of this approach is that now authors have to deal with two ways of 
defining widgets/components for same origin and cross origin use cases.
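
To make the explicit opt-in concrete, here is a rough sketch of how the host's 
list of imported names could gate what the widget document is allowed to expose.  
It is only a model with plain values, not a proposed API: WidgetImport, 
parseAllowedNames and exposedNames are hypothetical names, and treating the 
customelements value as a space-separated list is an assumption.

```typescript
// Hypothetical model: the host lists the element names it imports (the
// customelements attribute in the example markup), and views may only be
// created for names that appear on that list.
interface WidgetImport {
  src: string;
  allowedNames: string[];
}

// Parse a customelements value. Space separation is an assumption.
function parseAllowedNames(value: string): string[] {
  return value.split(/\s+/).filter((name) => name.length > 0);
}

// Names defined by the widget document that this host has agreed to import.
// Anything the widget defines beyond the host's list is ignored.
function exposedNames(imp: WidgetImport, definedByWidget: string[]): string[] {
  return definedByWidget.filter((name) => imp.allowedNames.indexOf(name) !== -1);
}
```

With this shape, a widget document that defines extra names the host never 
asked for simply has no way to get them instantiated in the host document.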

> I don't immediately have any better ideas so this is the straw man for now. 
> As we work through the details we might come up with some tweaks or 
> alternatives.
> 
> I guess that's another reason to sweat the small stuff--if we had prototype 
> implementations of element-view entangling and so on we could polyfill some 
> of the high level declarative syntax ideas and bounce prototypes off real web 
> developers and use cases.
>  
> Note that we can't let the imported "slave" document define an arbitrary set 
> of custom elements by default.
> 
> is=blah syntax isn't as useful/interesting here because it's unusual to use a 
> cross-origin widget to replace an existing built in HTML element.
> 
>> - How does the embedding page understand that there's an "instance" of their 
>> stuff contributing to the main page now?
> 
> Again, the custom element's created callback is a very nice mechanism for 
> that.
> 
>> - How does the author configure an instance from the embedded component? 
>> Presumably the button needs to know some things from its embedder, like 
>> API keys, etc.
> 
> 
> If we decided that each "view" is a custom element, then a very natural way 
> for it to communicate the information is via data attributes.
> 
> I note that fb:like already uses data- so there's precedent for that.

Right.

> Are there problems with data-?

Not that I know of.

> Where would the component access them? Can they see updates? Are updates one 
> way or bi-directional?

So if we had used shadow DOM as the security boundary, we can expose dataset on 
the shadow root, and have it sync'ed with data attributes on the shadow host.
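
A minimal sketch of that sync, assuming it is one way (host to component) and 
that the component only ever sees a read-only copy.  Plain objects stand in for 
the shadow host's attributes; datasetKey and snapshotDataset are hypothetical 
helper names.  Only the data-foo-bar to fooBar naming follows the existing DOM 
dataset rule.

```typescript
// Convert "data-api-key" to "apiKey", following the DOM dataset naming rule.
// Returns null for attributes that are not data attributes.
function datasetKey(attributeName: string): string | null {
  if (attributeName.indexOf("data-") !== 0) return null;
  return attributeName
    .slice("data-".length)
    .replace(/-([a-z])/g, (_match: string, letter: string) => letter.toUpperCase());
}

// One-way sync (an assumption): copy the host's data attributes into a frozen
// object for the component side. Non-data attributes are never exposed.
function snapshotDataset(
  hostAttributes: Record<string, string>
): Readonly<Record<string, string>> {
  const dataset: Record<string, string> = {};
  Object.keys(hostAttributes).forEach((name) => {
    const key = datasetKey(name);
    if (key !== null) dataset[key] = hostAttributes[name];
  });
  return Object.freeze(dataset);
}
```

Re-running snapshotDataset whenever a data attribute changes on the host would 
let the component see updates without being able to write back.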

- R. Niwa
