The desktop Firefox team is building a new Toolkit module that captures thumbnails of off-screen web pages. Critically, we want to avoid capturing any data in these thumbnails that could identify the user. More generally, we're looking for a way to visit pages in a sandboxed manner that does not interact with the user's normal browsing session. Does anyone know of such a way or know how we might change Gecko to support something like that?
To provide context, I'll describe the original problem that motivated the new thumbnail module, the poor way in which the new module interacts with the user's browsing state, and then some things we think we need in order to sandbox pages. See bug 870179 for the motivation for this post.

The Original Problem

Toolkit already has a thumbnail module, [PageThumbs], but it can only capture thumbnails of open content windows, exactly as they appear to the user. Windows may contain sensitive data that should not be recorded in an image, however, like bank account numbers, so Firefox uses some [heuristics] to determine when it's safe to capture a given window. If it's not safe, then Firefox makes no further attempt to capture the window's page until the user happens to visit it again. If it's never safe to capture any visit to the page, then the page never has a thumbnail. This is why you might end up with lots of blank thumbnails in Firefox's about:newtab page.

(One of the most notable heuristics is the presence of the header Cache-Control: [no-cache]. Firefox treats it as an indication that a page may contain sensitive data and therefore should not be captured.)

Our Solution

We wrote a [new module] that's not limited to capturing thumbnails of open windows. It loads pages on demand in a xul:browser in the hidden window and then captures them. This browser is remote so that its page loads don't block the main thread (to the extent that page loads in remote browsers don't block the main thread). Further, the browser uses private browsing mode to sandbox its pages from the user's normal browsing session. If you're logged in to a site in a main window, your logged-in status is not reflected in thumbnails of that site.

The Problem with Our Solution

The thumbnail browser sandboxes its pages from your normal browsing session, but of course it doesn't sandbox them from your private browsing session.
If you're logged in to a site in a [private browsing] window, then you'll also be logged in on the page we load in the thumbnail browser, which is the exact thing we're trying to avoid by using a private browser. This is a consequence of private browsing's binary design: it's either on or off, and all the various Gecko bits that manage state are designed to use one state for private requests and another state for normal requests. There's no notion of multiple concurrent private browsing sessions.

We avoid this problem in a crude way by ignoring capture requests while there are private windows open. We need a better solution that allows us to capture sandboxed pages no matter what the user is doing.

Requirements

We think we need the following to sandbox pages in our thumbnail browser:

(a) Requests must be stateless. Requests must not include any information derived from previous requests. For example, requests must not include cookies.

(b) Requests must leave no trace. Requests must not be stored in whole or in part, and no information about or derived from requests may be stored. For example, the request must not be recorded in the user's history.

(c) Responses must leave no trace. Responses must not be stored in whole or in part, and no information about or derived from responses may be stored. For example, cookies returned by the response must be ignored, and the response must not be recorded in the user's history.

A fourth requirement unrelated to sandboxing is:

(d) We need to load pages off-screen, and the user should never have to know about it. It shouldn't impact browser responsiveness, windows and dialogs triggered by pages shouldn't pop up, audio shouldn't be audible, etc. (We've done some of this work already by making the browser remote, putting it in the hidden window, and in bugs 875157 and 759964.)

What do people think? If what I've described is not already possible, how much work would it be to support it in Gecko, if only for this narrow use case of thumbnailing?
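
To make the binary-session problem and requirements (a) and (c) a little more concrete, here is a toy Python model. It is only a sketch under loud assumptions: `BinaryCookieStore` and `StatelessLoader` are hypothetical names invented for illustration, not Gecko classes, and real Gecko state covers far more than cookies (cache, history, storage, and so on).

```python
# Toy model only: none of these classes exist in Gecko.
from dataclasses import dataclass, field


@dataclass
class BinaryCookieStore:
    """Models the binary design: one jar for normal requests, one for private."""
    jars: dict = field(default_factory=lambda: {False: {}, True: {}})

    def cookies_for(self, host, private):
        # Return a copy of whatever the selected jar holds for this host.
        return dict(self.jars[private].get(host, {}))

    def store(self, host, private, name, value):
        self.jars[private].setdefault(host, {})[name] = value


class StatelessLoader:
    """Models requirements (a) and (c): send nothing, keep nothing."""

    def request_cookies(self, host):
        # (a) Nothing derived from earlier requests is attached.
        return {}

    def handle_response(self, host, set_cookies):
        # (c) Cookies returned by the response are dropped, not stored.
        return None


store = BinaryCookieStore()

# The user logs in to a site in a private window.
store.store("bank.example", True, "session", "secret")

# A thumbnail browser that merely sets the private flag shares that jar.
leaked = store.cookies_for("bank.example", private=True)

# The normal session is isolated, which is the part that already works.
normal = store.cookies_for("bank.example", private=False)

# A stateless loader never consults the store at all.
clean = StatelessLoader().request_cookies("bank.example")

print(leaked)  # {'session': 'secret'}
print(normal)  # {}
print(clean)   # {}
```

The point of the model is that keying state on a single private/normal flag cannot separate the thumbnail browser from the user's private windows; either a third, independent state or a loader that carries no state at all would be needed.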
Thanks,
Drew

[PageThumbs] http://mxr.mozilla.org/mozilla-central/source/toolkit/components/thumbnails/PageThumbs.jsm
[heuristics] http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser-thumbnails.js#127
[no-cache] https://bugzilla.mozilla.org/show_bug.cgi?id=754608
[new module] https://bugzilla.mozilla.org/show_bug.cgi?id=841495
[private browsing] https://bugzilla.mozilla.org/show_bug.cgi?id=870179