Re: [blink-dev] Intent to Experiment: Load common payloads from privacy-preserving single-keyed cache

Mike Taylor Wed, 27 Apr 2022 06:16:27 -0700

Thanks Daisuke - sounds good. I don't think we'll need to extend beyond M102 (but I probably just jinxed it...).


On 4/26/22 8:50 PM, Daisuke Enomoto wrote:

Hi Mike,

Thank you for your question! We're targeting M103 to start the experiment. So, IIUC, it would not interact with the double-key experiment running through M102 unless it's extended.

On Wed, Apr 27, 2022 at 5:55 AM Mike Taylor <[email protected]> wrote:


    Hi Daisuke,

    Can you clarify the timeline of the experiment? Would it begin in
    M103? I have concerns about interactions with the current
    double-key experiment
    <https://groups.google.com/a/chromium.org/g/blink-dev/c/WQtp7Ixd1RU>
    we're running for Network State Partitioning in M101 and M102.

    On 4/26/22 7:59 AM, Daisuke Enomoto wrote:

Contact emails

[email protected], [email protected], [email protected]

Explainer

https://docs.google.com/document/d/1pvaMg7J5beBXD7trzHJH_MDULc_wRHLx40MFYAmjknE/edit

Specification

N/A (because there are no web-exposed changes)

Summary

This limited experiment measures how much "pervasive payloads"
contribute to the performance impact of the split HTTP cache in
each Chrome channel over a three-week period. Pervasive payloads
are those third-party payloads included on at least 500 sites and
fetched at least 10M times in a month, based on Chrome's analysis
(payload list included below). This experiment further measures
the impact on Core Web Vitals metrics of restoring pervasive
payloads (and only pervasive payloads) to a single-keyed cache
regime. The privacy benefits of the split HTTP cache are preserved.
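As a rough illustration of the definition above, the two thresholds can be written as a simple predicate. This is only a sketch: the `PayloadStats` record and its field names are hypothetical and do not come from Chrome's analysis pipeline; only the two threshold values are taken from the summary.

```python
from dataclasses import dataclass

# Thresholds quoted in the summary above.
MIN_SITES = 500
MIN_MONTHLY_FETCHES = 10_000_000


@dataclass(frozen=True)
class PayloadStats:
    """Hypothetical per-URL aggregate; the field names are illustrative."""
    url: str
    site_count: int       # distinct sites that include the payload
    monthly_fetches: int  # fetches observed over one month


def is_pervasive(stats: PayloadStats) -> bool:
    """Return True only when both thresholds from the summary are met."""
    return (stats.site_count >= MIN_SITES
            and stats.monthly_fetches >= MIN_MONTHLY_FETCHES)
```

Both conditions must hold: a payload fetched very often from only a handful of sites would not qualify under this reading.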

Blink component

Blink>Network

<https://bugs.chromium.org/p/chromium/issues/list?q=component:Blink%3ENetwork>

Motivation

Browsers split HTTP caches based on the top-frame visited origin
(“double-keyed” or "triple-keyed" caching) to prevent sites from
tracking users via a timing attack on a cross-site client cache.
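The difference between the two keying schemes can be sketched with two toy caches. This is a simplified model, not Chrome's implementation: the class names are made up for illustration, and real cache keys contain more than a top-frame origin and a URL.

```python
from typing import Dict, Optional, Tuple


class SingleKeyedCache:
    """Toy cache keyed by resource URL only (shared across all sites)."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}

    def put(self, top_frame_origin: str, url: str, body: bytes) -> None:
        self._entries[url] = body  # the top-frame origin is ignored

    def get(self, top_frame_origin: str, url: str) -> Optional[bytes]:
        return self._entries.get(url)


class DoubleKeyedCache:
    """Toy cache keyed by (top-frame origin, resource URL)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], bytes] = {}

    def put(self, top_frame_origin: str, url: str, body: bytes) -> None:
        self._entries[(top_frame_origin, url)] = body

    def get(self, top_frame_origin: str, url: str) -> Optional[bytes]:
        return self._entries.get((top_frame_origin, url))
```

In the single-keyed model, a page on one origin can observe a fast cache hit for a resource that was first loaded under another origin, which is the timing signal the split cache removes. In the double-keyed model the same lookup is a miss.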

Chrome’s analysis estimates that split caching results in a 3%
increase in cache misses, i.e. fetches for which a payload exists
in the cache of the user's device, but is unavailable to the page
because it was fetched by the user while loading a page from a
different origin. This results in approximately 4% more total
bytes being fetched over the network.

Our analysis further revealed that many of the redundant fetches
caused by split caching were for common payloads associated with
displaying user content (libraries, fonts, widgets, ads) or
common payloads that assist in operating online businesses
(analytics). The delayed arrival of these common payloads
resulted in visible "jank" for users, impacting performance
metrics like LCP <https://web.dev/lcp>, FCP
<https://web.dev/fcp>, and CLS <https://web.dev/cls/>. This jank
has been associated with negative effects on online businesses'
engagement and conversion rates. Furthermore, delayed loads of
analytics and ads payloads can result in missed ads impressions
and dropped analytics hits.

Initial public proposal

This experiment sends Chrome a list of 100 <URL, hash> pairs
whose payloads are considered pervasive (the "pervasive payloads
list"). During the three-week experiment period, if Chrome
fetches a payload that matches both the URL and its hash on the
pervasive payloads list, it is inserted into a local single-keyed
cache. This payload is then available for use by Chrome when
loading pages on other sites that include the matching URL. All
other fetches for URLs not on the pervasive payloads list are
cached according to the existing split HTTP cache.

The hash covers the payload body and most response headers,
except for those which change on every response.
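The routing described above can be sketched as follows. This is a minimal model under stated assumptions, not the actual Chrome code: the hash function (SHA-256), the set of excluded headers in `VOLATILE_HEADERS`, the header serialization, and the class and method names are all hypothetical. The handling of a listed URL whose hash does not match (falling back to the split cache) is also an assumption.

```python
import hashlib
from typing import Dict, Mapping, Optional, Tuple

# Assumption: examples of headers that change on every response.
# The proposal does not enumerate which headers are excluded.
VOLATILE_HEADERS = frozenset({"date", "age", "expires"})


def payload_hash(body: bytes, headers: Mapping[str, str]) -> str:
    """Hash the body plus all non-volatile headers (illustrative format)."""
    digest = hashlib.sha256()
    for name, value in sorted((k.lower(), v) for k, v in headers.items()):
        if name in VOLATILE_HEADERS:
            continue
        digest.update(f"{name}:{value}\n".encode("utf-8"))
    digest.update(b"\n")
    digest.update(body)
    return digest.hexdigest()


class CacheTransparencySketch:
    """Routes a response to the single-keyed or the split cache."""

    def __init__(self, pervasive: Mapping[str, str]) -> None:
        self._pervasive = dict(pervasive)  # URL -> expected hash
        self._single_keyed: Dict[str, bytes] = {}
        self._split: Dict[Tuple[str, str], bytes] = {}

    def store(self, top_frame_origin: str, url: str, body: bytes,
              headers: Mapping[str, str]) -> str:
        expected = self._pervasive.get(url)
        if expected is not None and expected == payload_hash(body, headers):
            self._single_keyed[url] = body
            return "single-keyed"
        # Everything else, including hash mismatches, stays partitioned.
        self._split[(top_frame_origin, url)] = body
        return "split"

    def lookup(self, top_frame_origin: str, url: str) -> Optional[bytes]:
        if url in self._single_keyed:
            return self._single_keyed[url]
        return self._split.get((top_frame_origin, url))
```

Requiring the hash to match as well as the URL means that a listed URL serving a different or personalized response is never shared across sites in this model.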

To ensure we do not degrade the privacy profile of any users
during this experiment, only users with third-party cookies
currently enabled will be eligible for the experiment. We will
compare the experience of users in experiment and control arms
according to total bytes loaded and page performance metrics like
the Core Web Vitals <https://web.dev/vitals>.

The pervasive payloads list was produced by crawling the web and
aggregating the most commonly referenced third-party resource
URLs included in HTML content. We then used pseudonymous
URL-keyed metrics from Chrome to estimate the traffic to sites
and the number of impressions of third-party resources.
Individually identifiable browsing or search histories were not
used in the creation of the pervasive payload list (for more
information about Chrome's data collection policies and privacy
policies, see google.com/chrome/privacy
<https://google.com/chrome/privacy>). The resulting list was
further filtered for any URLs that might contain PII (e.g. URLs
with extensive or opaque query parameters). The list was also
manually reviewed to ensure it included only payloads reasonably
expected to be pervasive; the manual review did not result in any
payloads being removed.

The privacy properties of the split HTTP cache are considered
essential to users and this proposal intends to maintain those
properties, specifically by maintaining split HTTP caching for
all payloads not on the pervasive payloads list.

API semantics are unchanged. User-facing functionality is
unchanged (though we expect performance to be slightly improved).

The list of the top 100 Pervasive URLs for use in this experiment
is pending internal reviews and will be shared on this thread
upon approval.

Future directions

This experiment is the first step in a path exploring improved
handling of pervasive payloads in the browser cache. We outline
the intended future functionality here to clarify the intentions
behind the current experiment. The overview below is not complete
or final and subsequent parts of the design and implementation
will be presented and discussed in further Intents to Experiment
and Prototype.

At a high level, a future improvement to the handling of
pervasive payloads may involve:

1. Assembling a list of pervasive payloads that meets the following
   criteria:

   - Maintains the privacy of user browsing histories in its creation.

   - Fairly represents pervasive payloads as they have been chosen by
     developers on the web, not payloads selected or favored by any
     particular library or browser vendor.

   This experiment will initially use a static list of predefined URLs
   assembled as described in the 'Initial public proposal' section
   above. A future implementation will likely dynamically update the
   payloads list on, for example, a weekly cadence.

2. Implementing shared caching for pervasive payloads that meets the
   following criteria:

   - Materially improves load times and responsiveness for web users
     (under study in this experiment).

   - Does not create a new tracking vector based on cache timing
     attacks.

   - Does not require users to fetch payloads before the browser knows
     they will need them (i.e. we don't plan to bundle payloads with
     browser installs or updates).

   - Does not increase local storage required by browser installs or
     caches.

To privately and fairly assemble the list of pervasive payloads,
we are exploring the use of Private Heavy Hitters
<https://www.tensorflow.org/federated/tutorials/private_heavy_hitters>.
To implement a privacy-preserving shared cache after the
deprecation of third-party cookies, we are exploring adding a
measure of randomness to the observed presence or absence of a
pervasive payload in the shared cache.
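One way to picture "a measure of randomness" in the observed cache state is a randomized-response style flip, sketched below. This is purely illustrative: the proposal does not say how the randomness would be added, and the function name, the `flip_probability` parameter, and the choice of randomized response are all assumptions made for this example.

```python
import random


def observed_hit(actually_cached: bool, flip_probability: float,
                 rng: random.Random) -> bool:
    """Return the cache state an observer sees.

    With probability `flip_probability` the true state is inverted, so a
    single observation is only a noisy signal of the real cache contents.
    """
    if not 0.0 <= flip_probability <= 0.5:
        raise ValueError("flip_probability must be between 0 and 0.5")
    if rng.random() < flip_probability:
        return not actually_cached
    return actually_cached
```

A higher flip probability weakens the timing signal but also gives up more of the performance benefit, since some true hits would have to be treated as misses.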

However, this work is only worthwhile if it results in materially
improved load times for real users. This Intent to Experiment
covers only whether or not we should attempt to measure the
performance gains that might be realized if pervasive payloads
were placed in a shared cache, as one data point among others
that will contribute to discussions about future steps for the
proposal.

TAG review

None yet.

TAG review status

N/A

Risks

Interoperability and Compatibility

Chrome's compliance with the relevant standards is unchanged.
Caching behavior differs between browsers so interoperability
will not be affected.

The list of popular payloads is specifically chosen to minimize
compatibility risks.

Gecko: No signal

WebKit: No signal

Web developers: No signals

Other signals:

WebView application risks

Does this intent deprecate or change behavior of existing APIs,
such that it has potentially high risk for Android WebView-based
applications? No

Debuggability

There is no developer-exposed API for this feature, so most
DevTools support is not relevant. It would be useful to indicate
whether a resource was served from the single-keyed cache in the
network tab; however, this will not be implemented in the initial
experiment.

Security and privacy

Single-keyed caching introduces global state shared between
different browsing contexts. A shared cache can introduce
information leaks based on cache probing
(https://xsleaks.dev/docs/attacks/cache-probing/), including
XS-Search (https://xsleaks.dev/docs/attacks/xs-search/) in applications
which conditionally load a single-keyed-cache eligible resource
based on authenticated user state. The state of the cache,
queried across different contexts, could also be used as a
fingerprint, permitting user tracking; however, in this case, we
believe this does not provide tracking capabilities beyond those
of third-party cookies.

To protect users during this experiment, we limit the experiment
population to those users with third-party cookies enabled.
Recognizing that third-party cookies will eventually be switched
off for most users <https://privacysandbox.com/>, we are
developing protections such as slightly randomizing cache
hit/miss checks, disallowing eviction, or guaranteeing that attempts
to read from the cache reliably populate that cache entry. These
protections will be proposed and incorporated before any future
optimizations are launched.

For the purposes of the current experiment, we will be using the
same implementation of single-keyed caching that Chrome used
before the HTTP cache was partitioned in M77
(https://chromestatus.com/feature/5730772021411840).

To summarize, the security and privacy restrictions on this
experiment are as follows:

We will exclude users that have third-party cookies disabled.

Only a small percentage of users will be included in the
experiment, reducing the likelihood and impact of any attacks
abusing the single-keyed cache.

We will strictly limit the duration of the experiment on each
channel to 3 weeks.

We will only serve pervasive resources from the single-keyed
cache.

We can turn off the experiment immediately (independent of
browser updates) if any other threats appear.

Is this feature fully tested by web-platform-tests

<https://chromium.googlesource.com/chromium/src/+/master/docs/testing/web_platform_tests.md>?

This behavior is specific to Chrome and not part of any standard,
so it will not be tested in web platform tests.

Flag name

CacheTransparency

Requires code in //chrome?

No, but the list of popular payloads and the mechanism for
distributing it to the browser will be Chrome-specific.

Tracking bug

https://bugs.chromium.org/p/chromium/issues/detail?id=1309002

Launch bug

https://bugs.chromium.org/p/chromium/issues/detail?id=1309353

Estimated milestones

M103 for off-by-default experiment

Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/5768521127559168

--
You received this message because you are subscribed to the Google Groups
"blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAA5e6990s-e4aYUnYK5%2BqzQpAyFzJa42y%2B%3D_MAnL19z%3DqemnWg%40mail.gmail.com.

