We had a discussion about this in the API OWNERS meeting and I was asked to make my thoughts public.

There are a couple of aspects to this:

1. This is trying to mitigate some negative effects of a security/privacy-enhancing change (the triple-keyed cache). The negative effects are, as far as I can determine, in the form of reduced ad revenue ("web ecosystem") because some ad-network scripts will have to be reloaded from servers.

2. There is a fundamental idea that a browser should treat every resource on the web equally (modulo security and some other exceptions). This proposal violates that idea.

3. The list of resources that will be allowed to bypass the extra cache keys was created through a reasonably impartial method. Still, because the web is what it is, that resulted in certain entities getting a lot of their resources onto the list. If I recall correctly, 30% of the list was Google resources in one form or another.

4. Every resource on the list opens up the potential for the security/privacy issues that the triple-keyed cache was meant to protect against. Is there a point where the list has undermined enough of the benefits that the whole triple-keyed cache should be dropped instead?

All of these aspects, and more, have to be weighed against each other to reach a solution with an "ideal" balance. I currently do not know what that balance is.

I do not like the look of certain aspects here, but on the other hand I do like the security/privacy improvements, and it would be sad if those had to be reverted because some considered them too costly.

This post does not really change anything; it is just so that you know what concerns are being voiced. And if someone has more to add, please do add your own information and thoughts, especially if I have misunderstood or mischaracterized something. That is what these threads are for.

/Daniel

On 2025-11-09 16:46, Patrick Meenan wrote:


On Sat, Nov 8, 2025 at 1:32 PM Yoav Weiss (@Shopify) <[email protected]> wrote:

    I'm extremely supportive of this effort, with multiple hats on.

    I'd have loved it if this weren't restricted to users with 3P
    cookies enabled, but one can imagine abuse where pervasive resource
    *patterns* are used with unique hashes that are not deployed in the
    wild, and where each such URL is used as a cross-origin bit of
    entropy.


Yep, there are two risks for explicit tracking (both effectively moot when you can track directly anyway): varying the content of some of the responses some of the time (maybe for a slightly different URL than the "current" version that still matches the pattern), and using a broad sample of not-current URLs across a bunch of origins as a fingerprint. We can make some of that harder, but I couldn't think of any way to completely eliminate the risk.
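
To put a rough number on that second risk (the figures below are hypothetical, just to show the scale): every probed URL that matches an allowed pattern but is never actually served as the "current" version contributes one cached/not-cached bit that is observable across origins.

# Hypothetical back-of-the-envelope; probe_urls is however many
# pattern-matching-but-not-current URLs an attacker chooses to seed.
probe_urls = 32
distinguishable_states = 2 ** probe_urls   # ~4.3 billion distinct profiles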

    On Sat, Nov 8, 2025 at 7:04 AM Patrick Meenan
    <[email protected]> wrote:

        The list construction should already be completely objective.
        I changed the manual origin-owner validation to trust and
        require "cache-control: public" instead. The rest of the
        criteria
        <https://docs.google.com/document/d/1xaoF9iSOojrlPrHZaKIJMK4iRZKA3AD6pQvbSy4ueUQ/edit?tab=t.0>
        should be well-defined and objective. I'm not sure if they can
        be fully automated yet (though that might just be my pre-AI
        thinking).
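
        As a rough illustration of what automating those checks could
        look like (the direct fetch, the content-type matching and the
        helper name below are all illustrative, not the actual
        pipeline):

        import requests
        from urllib.parse import urlsplit

        def meets_objective_criteria(url: str) -> bool:
            # Candidates must not use query parameters.
            if urlsplit(url).query:
                return False
            resp = requests.get(url, timeout=10)
            cc = resp.headers.get("Cache-Control", "").lower()
            ctype = resp.headers.get("Content-Type", "").lower()
            return (
                "public" in cc                        # explicit opt-in
                and "set-cookie" not in resp.headers  # no cookie setting
                and any(t in ctype for t in ("javascript", "css"))
            )

        The same-content-across-sites and pattern-stability checks
        would still need the HTTP Archive data (or a human).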

        The main need for humans in the loop right now is to create
        the patterns so that they each represent a "single" resource
        that is stable over time with URL changes (version/hash) and
        distinguishing those stable files from random hash bundles
        that aren't stable from release to release. That's fairly easy
        for a human to do (and get right).
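
        For what it's worth, the "stable over time with URL changes"
        part looks at least partly scriptable. A sketch (my own
        illustration, with made-up URLs and a hash heuristic that a
        human would still need to sanity-check):

        import re

        def derive_pattern(observed_urls):
            """Collapse the changing hash token into a wildcard, e.g.
            .../0a1b2c3d-common.js + .../4e5f6a7b-common.js
            -> .../*-common.js. Returns None if the URLs don't share
            a stable shape."""
            parts = [re.split(r"([0-9a-f]{8,})", u) for u in observed_urls]
            fixed = {tuple(p[::2]) for p in parts}  # everything but the hash
            if len(fixed) != 1 or len(parts[0]) < 3:
                return None  # fixed parts differ, or no hash-like token
            return "*".join(parts[0][::2])

        Distinguishing that from the random hash bundles that aren't
        stable from release to release is exactly the part that still
        needs a human (or a lot more crawl history).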


    This is something that origins that use compression dictionaries
    already do by themselves - define the "match" pattern that covers
    the URL's semantics. Can we somehow use that for automation where
    it exists?


We can use the match patterns for script and style destinations as an input when defining the patterns. If the resource URL matches the match pattern and the match pattern is reasonably long (not /app/*.js), then it's probably a good pattern (and could be validated across months of HTTP Archive logs). There are patterns where dictionaries aren't used as strict delta updates for the same file (i.e. a script with a lot of common code, portions of which might be in other scripts used on other pages), so I wouldn't want to use it blindly, but it is a very strong possibility.
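
If we went that route, even the "reasonably long" check could be mechanical. A sketch (the 15-character threshold is a guess, not a proposal):

def match_pattern_is_specific(pattern: str, min_literal: int = 15) -> bool:
    """Reject overly broad dictionary match patterns like "/app/*.js"
    while keeping ones like "/static/js/*-runtime-main.min.js"."""
    literal = pattern.replace("*", "")
    return "*" in pattern and len(literal) >= min_literal

Anything that passes this plus the months-of-HTTP-Archive validation above would be a reasonable candidate; anything that fails falls back to manual review.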




        On Fri, Nov 7, 2025 at 4:47 PM Rick Byers
        <[email protected]> wrote:

            Thanks Pat. I am personally a big fan of things which
            increase publisher ad revenue across the web broadly
            without hurting (or ideally improving) the user
            experience, and this seems likely to do exactly that. In
            particular I recall all the debate around
            stale-while-revalidate
            <https://web.dev/articles/stale-while-revalidate> and am
            proud that we pushed
            <https://groups.google.com/a/chromium.org/g/blink-dev/c/rspPrQHfFkI/m/c5j3xJQRDAAJ?e=48417069>
            through it with urgency and confirmed it indeed increased
            publisher ad revenue across the web
            <https://web.dev/case-studies/ads-case-study-stale-while-revalidate>.


            Reading the Mozilla feedback carefully, the point that
            resonates most with me is the risk of "gatekeeping" and
            the potential to mitigate that by establishing objective
            rules for inclusion. Is it plausible to imagine a version
            of this where the list construction would be entirely
            objective? What would the tradeoffs be?

            Thanks,
               Rick




            On Thu, Oct 30, 2025 at 3:50 PM Patrick Meenan
            <[email protected]> wrote:

                Reaching out to site owners was mostly for a sanity
                check that the resource is not expecting to be
                partitioned for some reason (even though the payloads
                are known to be identical). If it helps, we can
                replace the reach-out step with a requirement that the
                responses be "Cache-Control: public" (and hard-enforce
                it in the browser by not writing the resource to cache
                if it isn't). That is an explicit indicator that the
                resources are cacheable in shared upstream caches.
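
                A minimal sketch of what that hard enforcement could
                look like at cache-write time (my own illustration of
                the rule, not Chromium's actual cache code; the example
                pattern and the fall-back to the normal partitioned key
                are assumptions):

                import re

                PERVASIVE = [re.compile(r"https://cdn\.example/.*-common\.js$")]

                def cache_key(url, top_site, frame_site, response_headers):
                    cc = response_headers.get("cache-control", "").lower()
                    if any(p.match(url) for p in PERVASIVE) and "public" in cc:
                        return (url,)  # shared, single-keyed entry
                    # Not on the list, or not "public": normal partitioned
                    # key (skipping the write would be the stricter option).
                    return (top_site, frame_site, url)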

                I removed the 2 items from the design doc that were
                specifically targeted at direct fingerprinting since
                that's moot with the 3PC link (as well as the
                fingerprinting bits from the validation with resource
                owners).

                On the site-preferencing concern, it doesn't actually
                preference large sites but it does preference
                currently-popular third-party resources (most of which
                are provided by large corporations). The benefit is
                spread across all of the sites that they are embedded
                in (funnily enough, most large sites won't benefit
                because they don't tend to use third-parties).

                Determining the common resources at a local level
                exposes the same XS Leak issues as allowing all
                resources (i.e. your local map tiles will show up in
                multiple cache partitions because they all reference
                your current location, but they can be used to identify
                your location since they are not globally common).
                Instead of using the HTTP Archive to collect the
                candidates, we could presumably build a centralized
                list based on aggregated common resources that are
                seen across cache partitions by each user but that
                feels like an awful lot of complexity for a very small
                number of resulting resources.

                On the test results, sorry, I thought I had included
                the experiment results in the I2S but it looks like I
                may not have.

                The test was specifically just with the patterns for
                the Google ads scripts because we aren't expecting
                this feature to impact the vitals for the main
                page/content since most of the pervasive resources are
                third-party content that is usually async already and
                not critical-path. It's possible some video or map
                embeds might trigger LCP in some cases but that's the
                exception more than the norm. This is more geared to
                making those supporting things work better while
                maintaining the user experience. Ads has the kind of
                instrumentation that we'd need to be able to get
                visibility into the success (or failure) of that
                assumption and to be able to measure small changes.

                The results were stat-sig positive but relatively
                small. The ad iframes displayed their content slightly
                faster and transmitted fewer bytes for each frame
                (very low single digit percentages).

                The guardrail metrics (including vitals) were all
                neutral, which is what we were hoping for (improvement
                without a cost of increased contention).

                If you'd feel more comfortable with gathering more
                data, I wouldn't be opposed to running the full list
                at 1% to check the guardrail metrics again before
                fully launching. We won't necessarily expect to see
                positive movement to justify a launch since the
                resources are still async but we can validate that
                assumption with the full list at least (if that is the
                only remaining concern).


                On Thu, Oct 30, 2025 at 5:28 PM Rick Byers
                <[email protected]> wrote:

                    Thanks Erik and Patrick, of course that makes
                    sense. Sorry for the naive question. My naive
                    reading of the design doc suggested to me that a
                    lot of the privacy mitigations were about
                    preventing the cross-site tracking risk. Could the
                    design be simplified by removing some of those
                    mitigations? For example, the section about
                    reaching out to the resource owners, to what
                    extent is that really necessary when all we're
                    trying to mitigate is XS leaks? Don't the
                    popularity properties alone mitigate that
                    sufficiently?

                    What can you share about the magnitude of the
                    performance benefit in practice in your
                    experiments? In particular for LCP, since we know
                    <https://wpostats.com/> that correlates well with
                    user engagement (and against abandonment) and so
                    presumably user value.

                    The concern about not wanting to further advantage
                    more popular sites over less popular ones
                    resonates with me. Part of that argument seems to
                    apply broadly to the idea of any LRU cache
                    (especially one with a reuse bias which I believe
                    ours has
                    <https://www.chromium.org/developers/design-documents/network-stack/disk-cache/#eviction>?).
                    But perhaps an important distinction here is that
                    the benefits are determined globally vs. on a
                    user-by-user basis? But I think any solution that
                    worked on a user-by-user basis would have the XS
                    leak problem, right? Perhaps it's worth reflecting
                    on our stance on using crowd-sourced data to try
                    to improve the experience for all users while
                    still being fair to sites broadly. In general I
                    think this is something Chromium is much more open
                    to (where it brings significant user benefit) than
                    other engines. For example, our Media Engagement
                    Index <https://developer.chrome.com/blog/autoplay>
                    system has some similar properties in terms of
                    using aggregate user behaviour to help decide
                    which sites have the power to play audio on page
                    load and which don't. I was personally uncertain
                    at the time if the complexity would prove to be
                    worth the benefit, but now I'm quite convinced it
                    is. Playing audio on load is just something users
                    and developers want in a few cases, but not most
                    cases. I wonder if perhaps cross-site caching is
                    similar?

                    Rick

                    On Thu, Oct 30, 2025 at 9:09 AM Matt Menke
                    <[email protected]> wrote:

                        Note that even with Vary: Origin, we still
                        have to load the HTTP request headers from the
                        disk cache to apply the vary header, which
                        leaks timing information, so "Vary: Origin" is
                        not a sufficient security mechanism to prevent
                        that sort of cross-site attack.
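
                        To spell out the timing point, a toy lookup
                        flow (purely illustrative, not the real disk
                        cache code):

                        def lookup(disk_cache, url, request_origin):
                            entry = disk_cache.read(url)  # disk I/O here
                            if entry is None:
                                return None  # fast miss: nothing on disk
                            vary = entry.response_headers.get("vary", "")
                            if ("origin" in vary.lower()
                                    and entry.request_origin != request_origin):
                                return None  # slow miss: read, then rejected
                            return entry

                        Both rejections look the same to the caller,
                        but only after different amounts of disk work,
                        so a timing probe can still tell whether the
                        entry exists.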

                        On Wednesday, October 29, 2025 at 5:08:42 PM
                        UTC-4 Erik Anderson wrote:

                            My understanding was that there was
                            believed to be a meaningful security
                            benefit with partitioning the cache.
                            That’s because it would limit a party’s
                            ability to infer that you’ve visited
                            some other site by measuring a side effect
                            tied to how quickly a resource loads. That
                            observation could potentially be made even
                            if that specific adversary doesn’t have
                            any of their own content loaded on the
                            other site.

                            Of course, if there is an entity with a
                            resource loaded across both sites with a
                            3p cookie /and/ they’re willing to share
                            that info/collude, there’s not much
                            benefit. And even when partitioned, if 3p
                            cookies are enabled, there are potentially
                            measurable side effects that differ based
                            on if the resource request had some
                            specific state in a 3p cookie.

                            Does that incremental security benefit of
                            partitioning the cache justify the
                            performance costs when 3p cookies are
                            still enabled? I’m not sure.

                            Even if partitioning were eliminated, a
                            site could protect itself a bit by
                            specifying Vary: Origin, but that probably
                            doesn’t sufficiently cover iframe
                            scenarios (nor would I expect most sites
                            to get it right).

                            *From:* Rick Byers <[email protected]>
                            *Sent:* Wednesday, October 29, 2025 11:56 AM
                            *To:* Patrick Meenan <[email protected]>
                            *Cc:* Mike Taylor <[email protected]>;
                            blink-dev <[email protected]>
                            *Subject:* [EXTERNAL] Re: [blink-dev]
                            Intent to ship: Cache sharing for
                            extremely-pervasive resources

                            If this is enabled only when 3PCs are
                            enabled, then what are the tradeoffs of
                            going through all this complexity and
                            governance vs. just broadly coupling HTTP
                            cache keying behavior to 3PC status in
                            some way? What can a tracker credibly do
                            with a single-keyed HTTP cache that they
                            cannot do with 3PCs? Are there also
                            concerns about accidental cross-site
                            resource sharing which could be mitigated
                            more simply by other means, e.g. by scoping
                            just to ETag-based caching?

                            I remember the controversy and some real
                            evidence of harm to users and businesses
                            in 2020 when we partitioned the HTTP
                            cache, but I was convinced that we had to
                            accept that harm in order to credibly
                            achieve 3PCD. At the time I was personally
                            a fan of a proposal like this (even for
                            users without 3PCs) in order to mitigate
                            the harm. But now it seems to me that if
                            we're going to start talking about poking
                            holes in that decision, perhaps we
                            should be doing a larger review of the
                            options in that space with the knowledge
                            that most Chrome users are likely to
                            continue to have 3PCs enabled. WDYT?

                            Thanks,

                             Rick

                            On Mon, Oct 27, 2025 at 10:27 AM Patrick
                            Meenan <[email protected]> wrote:

                                I don't believe the
                                security/privacy protections actually
                                rely on the assertions (and it's
                                unlikely those would be public). It's
                                more for awareness and to make sure
                                they don't accidentally break
                                something with their app if they were
                                relying on the responses being
                                partitioned by site.

                                As far as query params go, the browser
                                code already only matches requests
                                with no query params, so any resources
                                that do rely on query params won't get
                                included anyway.

                                The same goes for cookies. Since the
                                feature is only enabled when
                                third-party cookies are enabled,
                                adding cookies to these responses or
                                putting unique content in them won't
                                actually pierce any new boundaries,
                                but it goes against the intent of only
                                using it for public/static resources,
                                and they'd lose the benefit of the
                                shared cache when it gets updated. The
                                same goes for the fingerprinting risks
                                if the pattern were abused.

                                On Mon, Oct 27, 2025 at 9:39 AM Mike
                                Taylor <[email protected]> wrote:

                                    On 10/22/25 5:48 p.m., Patrick
                                    Meenan wrote:

                                        The candidate list goes down
                                        to 20k occurrences in order to
                                        catch resources that were
                                        updated mid-crawl and may have
                                        multiple entries with
                                        different hashes that add up
                                        to 100k+ occurrences. In the
                                        candidate list, without any
                                        filtering, the 100k cutoff is
                                        around 600, I'd estimate that
                                        well less than 25% of the
                                        candidates make it through the
                                        filtering for stable pattern,
                                        correct resource type and
                                        reliable pattern. First
                                        release will likely be 100-200
                                        and I don't expect it will
                                        ever grow above 500.

                                    Thanks - I see the living document
                                    has been updated to mention 500 as
                                    a ceiling.

                                        As far as cadence goes, I
                                        expect there will be a lot of
                                        activity for the next few
                                        releases as individual
                                        patterns are coordinated with
                                        the origin owners but then it
                                        will settle down to a much
                                        more bursty pattern of updates
                                        every few Chrome releases
                                        (likely linked with an origin
                                        changing their application and
                                        adding more/different
                                        resources). And yes, it is manual.

                                        As far as the process goes,
                                        resource owners need to
                                        actively assert that their
                                        resource is appropriate for
                                        the single-keyed cache and
                                        that they would like it
                                        included (usually in response
                                        to active outreach from us but
                                        we have the external-facing
                                        list for owner-initiated
                                        contact as well).  The design
                                        doc has the documentation for
                                        what it means to be
                                        appropriate (and the doc will
                                        be moved to a readme page in
                                        the repository next to the
                                        actual list so it's not a
                                        hard-to-find Google doc):

                                    Will there be any kind of public
                                    record of this assertion? What
                                    happens if a site starts using
                                    query params or sending cookies?
                                    Does the person in charge of
                                    manual list curation discover that
                                    in the next release? Does that
                                    require a new release (I don't
                                    know if this lives in component
                                    updater, or in the binary itself)?

                                        *5. Require resource owner opt-in*
                                        For each URL to be included,
                                        reach out to the team/company
                                        responsible for the resource
                                        to validate the URL pattern
                                        and get assurances that the
                                        pattern will always serve the
                                        same content to all sites and
                                        not be abused for tracking (by
                                        using unique URLs within the
                                        pattern mask as a bit-mask for
                                        fingerprinting). They will
                                        also need to validate that the
                                        URLs covered by the pattern
                                        will not rely on being able to
                                        set cookies over HTTP using a
                                        Set-Cookie HTTP response header
                                        because they will not be
                                        re-applied across cache
                                        boundaries (the set-cookie is
                                        not cached with the resource).

                                        On Wed, Oct 22, 2025 at
                                        5:31 PM Mike Taylor
                                        <[email protected]> wrote:

                                            On 10/18/25 8:34 a.m.,
                                            Patrick Meenan wrote:

                                                Sorry, I missed a step
                                                in making the
                                                candidate resource
                                                list public. I have
                                                moved it to my
                                                chromium account and
                                                made it public here
                                                <https://docs.google.com/spreadsheets/d/1TgWhdeqKbGm6hLM9WqnnXLn-iiO4Y9HTjDXjVO2aBqI/edit?usp=sharing>.


                                                Not everything in that
                                                list meets all of the
                                                criteria - it's just
                                                the first step in the
                                                manual curation (same
                                                URL served the same
                                                content across > 20k
                                                sites in the HTTP
                                                Archive dataset).

                                                The manual steps from
                                                there for meeting the
                                                criteria are basically:

                                                - Cull the list for
                                                scripts, stylesheets
                                                and compression
                                                dictionaries.

                                                - Remove any URLs that
                                                use query parameters.

                                                - Exclude any
                                                responses that
                                                set cookies.

                                                - Identify URLs that
                                                are not manually
                                                versioned by site
                                                embedders (i.e. the
                                                embedded resource cannot
                                                get stale). This is
                                                is either in-place
                                                updating resources or
                                                automatically
                                                versioned resources.

                                                - Only include URLs
                                                that can reliably
                                                target a single
                                                resource by pattern
                                                (i.e.
                                                ..../<hash>-common.js
                                                but not ..../<hash>.js)

                                                - Get confirmation
                                                from the resource
                                                owner that the given
                                                URL Pattern is and
                                                will continue to be
                                                appropriate for the
                                                single-keyed cache

                                            A few questions on list
                                            curation:

                                            Can you clarify how big
                                            the list will be? The
                                            privacy review at
                                            https://chromestatus.com/feature/5202380930678784?gate=5174931459145728 mentions
                                            ~500, while the design doc
                                            mentions 1000. I see the
                                            candidate resource list
                                            starts at ~5000, then
                                            presumably manual curation
                                            begins to get to one of
                                            those numbers.

                                            What is the expected list
                                            curation/update cadence?
                                            Is it actually manual?

                                            Is there any recourse
                                            process for owners of
                                            resources that don't want
                                            to be included? Do we have
                                            documentation on what it
                                            means to be appropriate for
                                            the single-keyed cache?

                                            thanks,
                                            Mike
