Re: [EXTERNAL] Re: [blink-dev] Intent to ship: Cache sharing for extremely-pervasive resources

Alex Russell Tue, 02 Dec 2025 12:53:22 -0800

I've been thinking a *lot* about this Intent, and I would feel much better 
about it if we were targeting the effects towards users that can benefit, 
rather than subsidising the excesses of JS libraries that have grown 
without push-back. The YT player, e.g., is egregiously sized, and it makes 
me queasy to think that we're going to create even more induced demand.


I'd support a policy, e.g., that made this feature more likely to trigger 
the lower-spec'd the device and/or network the user is on.

Is that something you'd consider supporting, Pat?

Best,

Alex

On Thursday, November 27, 2025 at 9:15:26 AM UTC-8 Yoav Weiss wrote:

> LGTM1
>
> This is an important optimization that will take a small step at evening 
> out the playing field for smaller sites (that use 3Ps and distributed 
> platforms) compared to large, centralized ones with many repeat visitors. 
> It will become even more important once it's integrated with Compression 
> Dictionaries, as it would significantly increase cache reusability even 
> with high-frequency updates.
>
> While it does taint the security model a bit compared to pure triple-keyed 
> cache, I think the benefits outweigh that cost.
>
>
>
> On Wednesday, November 19, 2025 at 11:45:47 PM UTC+1 Joe Mason wrote:
>
> My thought was that the risk of fingerprinting is MUCH lower with this 
> cache than with 3p cookies in general, so a user that wants to disable 3p 
> cookies for better anonymity might be ok with leaving the pervasive cache 
> on. Tying it to 3p cookies seems to inflate the danger.
>
> I like the idea of making the top-level name something like "allow me to 
> be tracked across websites", but I wouldn't expect the pervasive cache as 
> described to actually allow that so it would be misleading to tie it to 
> that language. (Also, I'm wary of a setting called "allow me to be 
> tracked", because we can't fully DISALLOW that - information leaks will 
> always be possible, we can only make fingerprinting harder or easier.)
>
> What about a "Difficulty of tracking me across websites" slider, from 
> "trivial" (3p cookies enabled) to "moderate" (features like pervasive cache 
> and other minor fingerprinting risks enabled) to "difficult" (but never 
> "impossible").
>
> This is getting far afield, though. I don't want to derail the discussion, 
> just noodling.
>
> On Wednesday, November 19, 2025 at 12:49:34 PM UTC-5 Patrick Meenan wrote:
>
> On Wed, Nov 19, 2025 at 12:25 PM Joe Mason <[email protected]> wrote:
>
> Regarding the concern that tying this to 3P cookies will make it harder to 
> disable 3P cookies completely: have you considered making this a separate 
> UI toggle instead? 
>
>
> A separate UI toggle would be more confusing than useful to the user.
>
> The link to cookies is because of the risk of explicit user tracking and 
> fingerprinting (which is reduced with the protections but non-zero with the 
> shared cache). Third-party cookies is a stand-in for "allow me to be 
> tracked across websites". It would probably be more accurate to have a more 
> friendly top-level name like that and cookies would be a feature that is 
> automatically toggled (along with a shared cache) but I don't see there 
> being a useful distinction for users to be able to toggle the different 
> fingerprinting risks individually (AFAIK, there are a few different 
> features linked to cookie tracking for the same reason).
>  
>
>
> On Wednesday, November 19, 2025 at 12:07:48 PM UTC-5 Daniel Bratell wrote:
>
> We had a discussion about this in the API OWNERS meeting and I was asked 
> to make my thoughts public. 
>
> There are a couple of aspects to this:
>
> 1. This is trying to mitigate some negative effects of a security/privacy 
> enhancing change (triple keyed cache). The negative effects are, as far as 
> I can determine, in the form of reduced ad revenue ("web ecosystem") 
> because some ad network scripts will have to be reloaded from servers.
>
> 2. There is a fundamental idea that a browser should treat every resource 
> on the web equally (modulo security and some other exceptions). This is 
> violating that idea.
>
> 3. The list of resources that will be allowed to bypass the third cache 
> key was created through a reasonably impartial method. Still, because the 
> web is what it is, that resulted in certain entities getting a lot of their 
> resources on the list. If I recall correctly, 30% of the list was Google 
> resources in one form or another.
>
> 4. Every resource on the list opens up the potential for the 
> security/privacy issues that the triple keyed cache was meant to protect 
> against. Is there a point where the list has undermined enough of the 
> benefits that the whole triple keyed cache should be dropped instead?
>
> All of this, and more, has to be weighed against each other to get to a 
> solution with an "ideal" balance. I currently do not know what that balance 
> is. 
>
> I do not like the look of certain aspects here, but on the other hand I do 
> like the security/privacy improvements and it would be sad if those have to 
> be reverted because some considered them too costly.
>
> This post does not really change anything, but just so that you know what 
> is being voiced. And if someone has more to add, please do add your own 
> information and thoughts. Especially if I have misunderstood or 
> mischaracterized something. That is what these threads are for.
>
> /Daniel
> On 2025-11-09 16:46, Patrick Meenan wrote:
>
>
>
> On Sat, Nov 8, 2025 at 1:32 PM Yoav Weiss (@Shopify) <[email protected]> 
> wrote:
>
> I'm extremely supportive of this effort, with multiple hats on.
>
> I'd have loved if this wasn't restricted to users with 3P cookies enabled, 
> but one can imagine abuse where pervasive resource *patterns* are used, but 
> with unique hashes that are not deployed in the wild, and where each such 
> URL is used as a cross-origin bit of entropy.
>
>
> Yep, there are 2 risks for explicit tracking (that are effectively moot 
> when you can track directly anyway): varying the content of some of the 
> responses some of the time (maybe for a slightly different URL than the 
> "current" version that still matches the pattern) and using a broad sample 
> of not-current URLs across a bunch of origins as a fingerprint. We can make 
> some of that harder but I couldn't think of any way to completely eliminate 
> the risk.
>  
>
> On Sat, Nov 8, 2025 at 7:04 AM Patrick Meenan <[email protected]> wrote:
>
> The list construction should already be completely objective. I changed 
> the manual origin-owner validation to trust and require "cache-control: 
> public" instead. The rest of the criteria 
> <https://docs.google.com/document/d/1xaoF9iSOojrlPrHZaKIJMK4iRZKA3AD6pQvbSy4ueUQ/edit?tab=t.0>
>  
> should be well-defined and objective. I'm not sure if they can be fully 
> automated yet (though that might just be my pre-AI thinking). 
>
> The main need for humans in the loop right now is to create the patterns 
> so that they each represent a "single" resource that is stable over time 
> with URL changes (version/hash) and distinguishing those stable files from 
> random hash bundles that aren't stable from release to release. That's 
> fairly easy for a human to do (and get right).
>
>
> This is something that origins that use compression dictionaries already 
> do by themselves - define the "match" pattern that covers the URL's 
> semantics. Can we somehow use that for automation where it exists?
>
>
> We can use the match patterns for script and style destinations as an 
> input when defining the patterns. If the resource URL matches the match 
> pattern and the match pattern is reasonably long (not /app/*.js) then it's 
> probably a good pattern (and could be validated across months of HTTP 
> Archive logs). There are patterns where dictionaries aren't used as strict 
> delta updates for the same file (i.e. a script with a lot of common code 
> that portions of which might be in other scripts used on other pages) so I 
> wouldn't want to use it blindly but it is a very strong possibility.
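The heuristic described above (the resource URL matches the dictionary match pattern, and the pattern is not as broad as /app/*.js) could be sketched roughly as follows. This is only an illustration: `fnmatchcase` stands in for real match-pattern evaluation, and `MIN_LITERAL_CHARS` and both function names are hypothetical, not anything taken from Chrome.

```python
from fnmatch import fnmatchcase

# Hypothetical threshold: minimum number of literal (non-wildcard) characters
# a match pattern needs before it is treated as specific enough.
MIN_LITERAL_CHARS = 16


def literal_length(pattern: str) -> int:
    """Count the characters in the pattern that are not '*' wildcards."""
    return sum(1 for ch in pattern if ch != "*")


def is_promising_pattern(url_path: str, match_pattern: str) -> bool:
    """Return True if the path matches the pattern and the pattern is not too broad."""
    # Shell-style matching is an approximation of the real pattern syntax.
    if not fnmatchcase(url_path, match_pattern):
        return False
    return literal_length(match_pattern) >= MIN_LITERAL_CHARS
```

A pattern that passes this check would still need to be validated against historical crawl data, as the message notes, since a dictionary pattern does not always describe a single stable file.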
>  
>
>  
>
>
>
>
> On Fri, Nov 7, 2025 at 4:47 PM Rick Byers <[email protected]> wrote:
>
> Thanks Pat. I am personally a big fan of things which increase publisher 
> ad revenue across the web broadly without hurting (or ideally improving) 
> the user experience, and this seems likely to do exactly that. In 
> particular I recall all the debate around stale-while-revalidate 
> <https://web.dev/articles/stale-while-revalidate> and am proud that we 
> pushed 
> <https://groups.google.com/a/chromium.org/g/blink-dev/c/rspPrQHfFkI/m/c5j3xJQRDAAJ?e=48417069>
>  
> through it with urgency and confirmed it indeed increased publisher ad 
> revenue across the web 
> <https://web.dev/case-studies/ads-case-study-stale-while-revalidate>. 
>
> Reading the Mozilla feedback carefully the point that resonates most with 
> me is the risk of "gatekeeping" and the potential to mitigate that by 
> establishing objective rules for inclusion. Is it plausible to imagine a 
> version of this where the list construction would be entirely objective? 
> What would the tradeoffs be?
>
> Thanks,
>    Rick 
>
>
>
>
> On Thu, Oct 30, 2025 at 3:50 PM Patrick Meenan <[email protected]> 
> wrote:
>
> Reaching out to site owners was mostly for a sanity check that the 
> resource is not expecting to be partitioned for some reason (even though 
> the payloads are known to be identical). If it helps, we can replace the 
> reach-out step with a requirement that the responses be "Cache-Control: 
> public" (and hard-enforce it in the browser by not writing the resource to 
> cache if it isn't). That is an explicit indicator that the resources are 
> cacheable in shared upstream caches. 
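As a rough sketch of the "Cache-Control: public" requirement described above, a check along these lines would refuse to write a response to the shared cache unless the header carries an explicit `public` directive. The function name is hypothetical and the parsing is simplified (it does not handle quoted directive values that contain commas); it is not the browser's actual implementation.

```python
from typing import Optional


def is_explicitly_public(cache_control: Optional[str]) -> bool:
    """Return True only if the Cache-Control value contains a 'public' directive."""
    if not cache_control:
        return False
    # Split on commas, drop any '=value' argument, and compare case-insensitively.
    directives = {
        part.strip().split("=", 1)[0].strip().lower()
        for part in cache_control.split(",")
    }
    return "public" in directives
```

Under this sketch, a response with "public, max-age=31536000" would be eligible, while one with only "max-age=60" or with "private" would not.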
>
> I removed the 2 items from the design doc that were specifically targeted 
> at direct fingerprinting since that's moot with the 3PC link (as well as 
> the fingerprinting bits from the validation with resource owners). 
>
> On the site-preferencing concern, it doesn't actually preference large 
> sites but it does preference currently-popular third-party resources (most 
> of which are provided by large corporations). The benefit is spread across 
> all of the sites that they are embedded in (funnily enough, most large 
> sites won't benefit because they don't tend to use third-parties).
>
> Determining the common resources at a local level exposes the same XS Leak 
> issues as allowing all resources (i.e. your local map tiles will show up in 
> multiple cache partitions because they all reference your current location 
> but they can be used to identify your location since they are not globally 
> common). Instead of using the HTTP Archive to collect the candidates, we 
> could presumably build a centralized list based on aggregated common 
> resources that are seen across cache partitions by each user but that feels 
> like an awful lot of complexity for a very small number of resulting 
> resources.
>
> On the test results, sorry, I thought I had included the experiment 
> results in the I2S but it looks like I may not have.
>
> The test was specifically just with the patterns for the Google ads 
> scripts because we aren't expecting this feature to impact the vitals for 
> the main page/content since most of the pervasive resources are third-party 
> content that is usually async already and not critical-path. It's possible 
> some video or map embeds might trigger LCP in some cases but that's the 
> exception more than the norm. This is more geared to making those 
> supporting things work better while maintaining the user experience. Ads 
> has the kind of instrumentation that we'd need to be able to get visibility 
> into the success (or failure) of that assumption and to be able to measure 
> small changes.
>
> The results were stat-sig positive but relatively small. The ad iframes 
> displayed their content slightly faster and transmitted fewer bytes for 
> each frame (very low single digit percentages).
>
> The guardrail metrics (including vitals) were all neutral which is what we 
> were hoping for (improvement without a cost of increased contention).
>
> If you'd feel more comfortable with gathering more data, I wouldn't be 
> opposed to running the full list at 1% to check the guardrail metrics again 
> before fully launching. We won't necessarily expect to see positive 
> movement to justify a launch since the resources are still async but we can 
> validate that assumption with the full list at least (if that is the only 
> remaining concern).
>
>
> On Thu, Oct 30, 2025 at 5:28 PM Rick Byers <[email protected]> wrote:
>
> Thanks Erik and Patrick, of course that makes sense. Sorry for the naive 
> question. My naive reading of the design doc suggested to me that a lot of 
> the privacy mitigations were about preventing the cross-site tracking risk. 
> Could the design be simplified by removing some of those mitigations? For 
> example, the section about reaching out to the resource owners, to what 
> extent is that really necessary when all we're trying to mitigate is XS 
> leaks? Don't the popularity properties alone mitigate that sufficiently? 
>
> What can you share about the magnitude of the performance benefit in 
> practice in your experiments? In particular for LCP, since we know 
> <https://wpostats.com/> that correlates well with user engagement (and 
> against abandonment) and so presumably user value. 
>
> The concern about not wanting to further advantage more popular sites over 
> less popular ones resonates with me. Part of that argument seems to apply 
> broadly to the idea of any LRU cache (especially one with a reuse bias 
> which I believe ours has 
> <https://www.chromium.org/developers/design-documents/network-stack/disk-cache/#eviction>?).
>  
> But perhaps an important distinction here is that the benefits are 
> determined globally vs. on a user-by-user basis? But I think any solution 
> that worked on a user-by-user basis would have the XS leak problem, right? 
> Perhaps it's worth reflecting on our stance on using crowd-sourced data to 
> try to improve the experience for all users while still being fair to sites 
> broadly. In general I think this is something Chromium is much more open to 
> (where it brings significant user benefit) than other engines. For example, 
> our Media Engagement Index <https://developer.chrome.com/blog/autoplay> 
> system has some similar properties in terms of using aggregate user 
> behaviour to help decide which sites have the power to play audio on page 
> load and which don't. I was personally uncertain at the time if the 
> complexity would prove to be worth the benefit, but now I'm quite convinced 
> it is. Playing audio on load is just something users and developers want in 
> a few cases, but not most cases. I wonder if perhaps cross-site caching is 
> similar?
>
> Rick
>
> On Thu, Oct 30, 2025 at 9:09 AM Matt Menke <[email protected]> wrote:
>
> Note that even with Vary: Origin, we still have to load the HTTP request 
> headers from the disk cache to apply the vary header, which leaks timing 
> information, so "Vary: Origin" is not a sufficient security mechanism to 
> prevent that sort of cross-site attack.
>
> On Wednesday, October 29, 2025 at 5:08:42 PM UTC-4 Erik Anderson wrote:
>
> My understanding was that there was believed to be a meaningful security 
> benefit with partitioning the cache. That’s because it would limit a party 
> from being able to infer that you’ve visited some other site by measuring 
> a side effect tied to how quickly a resource loads. That observation could 
> potentially be made even if that specific adversary doesn’t have any of 
> their own content loaded on the other site.
>
>  
>
> Of course, if there is an entity with a resource loaded across both sites 
> with a 3p cookie *and* they’re willing to share that info/collude, 
> there’s not much benefit. And even when partitioned, if 3p cookies are 
> enabled, there are potentially measurable side effects that differ based on 
> if the resource request had some specific state in a 3p cookie.
>
>  
>
> Does that incremental security benefit of partitioning the cache justify 
> the performance costs when 3p cookies are still enabled? I’m not sure.
>
>  
>
> Even if partitioning was eliminated, a site could protect themselves a bit 
> by specifying Vary: Origin, but that probably doesn’t sufficiently cover 
> iframe scenarios (nor would I expect most sites to hold it right).
>
>  
>
> *From:* Rick Byers <[email protected]> 
> *Sent:* Wednesday, October 29, 2025 11:56 AM
> *To:* Patrick Meenan <[email protected]>
> *Cc:* Mike Taylor <[email protected]>; blink-dev <[email protected]>
> *Subject:* [EXTERNAL] Re: [blink-dev] Intent to ship: Cache sharing for 
> extremely-pervasive resources
>
>  
>
> If this is enabled only when 3PCs are enabled, then what are the tradeoffs 
> of going through all this complexity and governance vs. just broadly 
> coupling HTTP cache keying behavior to 3PC status in some way? What can a 
> tracker credibly do with a single-keyed HTTP cache that they cannot do with 
> 3PCs? Are there also concerns about accidental cross-site resource sharing 
> which could be mitigated more simply by other means, e.g. by scoping to 
> just ETag-based caching?
>
>  
>
> I remember the controversy and some real evidence of harm to users and 
> businesses in 2020 when we partitioned the HTTP cache, but I was convinced 
> that we had to accept that harm in order to credibly achieve 3PCD. At the 
> time I was personally a fan of a proposal like this (even for users without 
> 3PCs) in order to mitigate the harm. But now it seems to me that if we're 
> going to start talking about poking holes in that decision, perhaps we 
> should be doing a larger review of the options in that space with the 
> knowledge that most Chrome users are likely to continue to have 3PCs 
> enabled. WDYT?
>
>  
>
> Thanks,
>
>    Rick
>
>  
>
> On Mon, Oct 27, 2025 at 10:27 AM Patrick Meenan <[email protected]> 
> wrote:
>
> I don't believe the security/privacy protections actually rely on the 
> assertions (and it's unlikely those would be public). It's more for 
> awareness and to make sure they don't accidentally break something with 
> their app if they were relying on the responses being partitioned by site.
>
>  
>
> As far as query params go, the browser code already only filters for 
> requests with no query params so any that do rely on query params won't get 
> included anyway.
>
>  
>
> The same goes for cookies. Since the feature is only enabled when 
> third-party cookies are enabled, adding cookies to these responses or 
> putting unique content in them won't actually pierce any new boundaries but 
> it goes against the intent of only using it for public/static resources and 
> they'd lose the benefit of the shared cache when it gets updated. Same goes 
> for the fingerprinting risks if the pattern was abused.
>
>  
>
> On Mon, Oct 27, 2025 at 9:39 AM Mike Taylor <[email protected]> wrote:
>
> On 10/22/25 5:48 p.m., Patrick Meenan wrote:
>
> The candidate list goes down to 20k occurrences in order to catch 
> resources that were updated mid-crawl and may have multiple entries with 
> different hashes that add up to 100k+ occurrences. In the candidate list, 
> without any filtering, the 100k cutoff is around 600. I'd estimate that 
> well under 25% of the candidates make it through the filtering for 
> stable pattern, correct resource type and reliable pattern. First release 
> will likely be 100-200 and I don't expect it will ever grow above 500.
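The reason for the 20k floor can be illustrated with a small sketch: entries for the same resource under different hashes are summed by pattern, and only patterns whose total reaches 100k are kept. The row layout, the constant names and the function name are assumptions made for illustration; the real candidate list is a spreadsheet curated by hand.

```python
from collections import defaultdict

CANDIDATE_FLOOR = 20_000    # per-URL floor mentioned in the thread
PERVASIVE_CUTOFF = 100_000  # combined occurrences needed per pattern


def pervasive_patterns(rows):
    """Sum occurrences per pattern and keep patterns that reach the cutoff.

    rows is an iterable of (pattern, url, occurrences) tuples (hypothetical layout).
    """
    totals = defaultdict(int)
    for pattern, _url, occurrences in rows:
        # Entries below the floor never make it into the candidate list.
        if occurrences >= CANDIDATE_FLOOR:
            totals[pattern] += occurrences
    return {p: n for p, n in totals.items() if n >= PERVASIVE_CUTOFF}
```

Two hash variants seen 60k and 55k times would individually miss the 100k cutoff but together qualify, which is the mid-crawl update case the message describes.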
>
> Thanks - I see the living document has been updated to mention 500 as a 
> ceiling. 
>
>  
>
> As far as cadence goes, I expect there will be a lot of activity for the 
> next few releases as individual patterns are coordinated with the origin 
> owners but then it will settle down to a much more bursty pattern of 
> updates every few Chrome releases (likely linked with an origin changing 
> their application and adding more/different resources). And yes, it is 
> manual.
>
> As far as the process goes, resource owners need to actively assert that 
> their resource is appropriate for the single-keyed cache and that they 
> would like it included (usually in response to active outreach from us but 
> we have the external-facing list for owner-initiated contact as well).  The 
> design doc has the documentation for what it means to be appropriate (and 
> the doc will be moved to a readme page in the repository next to the actual 
> list so it's not a hard-to-find Google doc):
>
> Will there be any kind of public record of this assertion? What happens if 
> a site starts using query params or sending cookies? Does the person in 
> charge of manual list curation discover that in the next release? Does that 
> require a new release (I don't know if this lives in component updater, or 
> in the binary itself)? 
>
>  
>
> *5. Require resource owner opt-in*
> For each URL to be included, reach out to the team/company responsible for 
> the resource to validate the URL pattern and get assurances that the 
> pattern will always serve the same content to all sites and not be abused 
> for tracking (by using unique URLs within the pattern mask as a bit-mask 
> for fingerprinting). They will also need to validate that the URLs covered 
> by the pattern will not rely on being able to set cookies over HTTP using a 
> Set-Cookie HTTP response header because they will not be re-applied 
> across cache boundaries (the set-cookie is not cached with the resource).
>
>  
>
> On Wed, Oct 22, 2025 at 5:31 PM Mike Taylor <[email protected]> wrote:
>
> On 10/18/25 8:34 a.m., Patrick Meenan wrote:
>
> Sorry, I missed a step in making the candidate resource list public. I 
> have moved it to my chromium account and made it public here 
> <https://docs.google.com/spreadsheets/d/1TgWhdeqKbGm6hLM9WqnnXLn-iiO4Y9HTjDXjVO2aBqI/edit?usp=sharing>.
>  
>
>
>  
>
> Not everything in that list meets all of the criteria - it's just the 
> first step in the manual curation (same URL served the same content across 
> > 20k sites in the HTTP Archive dataset).
>
>  
>
> The manual steps from there for meeting the criteria are basically:
>
>  
>
> - Cull the list for scripts, stylesheets and compression dictionaries.
>
> - Remove any URLs that use query parameters.
>
> - Exclude any responses that set cookies.
>
> - Identify URLs that are not manually versioned by site embedders (i.e. 
> the embedded resource can not get stale). This is either in-place updating 
> resources or automatically versioned resources.
>
> - Only include URLs that can reliably target a single resource by pattern 
> (i.e. ..../<hash>-common.js but not ..../<hash>.js)
>
> - Get confirmation from the resource owner that the given URL Pattern is 
> and will continue to be appropriate for the single-keyed cache
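The mechanical parts of the steps above (resource type, query parameters, cookies, and the > 20k site threshold) could be expressed as a simple filter, sketched below. The `Candidate` fields and the function name are hypothetical. The remaining steps (judging whether a URL is embedder-versioned, whether a pattern reliably targets a single resource, and owner confirmation) need human judgment and are not modelled here.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Resource types named in the curation steps.
ALLOWED_TYPES = {"script", "style", "dictionary"}
MIN_SITES = 20_000


@dataclass
class Candidate:
    url: str
    resource_type: str  # hypothetical labels: "script", "style", "dictionary", ...
    sets_cookie: bool   # True if the response carried a Set-Cookie header
    site_count: int     # number of sites serving identical content


def passes_automatic_checks(c: Candidate) -> bool:
    """Apply only the checks that need no human judgment."""
    if c.site_count <= MIN_SITES:
        return False
    if c.resource_type not in ALLOWED_TYPES:
        return False
    if urlsplit(c.url).query:  # any query string disqualifies the URL
        return False
    if c.sets_cookie:
        return False
    return True
```

Anything that survives this filter is only a candidate; it still has to go through the manual pattern-definition steps listed above.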
>
> A few questions on list curation:
>
> Can you clarify how big the list will be? The privacy review at 
> https://chromestatus.com/feature/5202380930678784?gate=5174931459145728 
> mentions 
> ~500, while the design doc mentions 1000. I see the candidate resource list 
> starts at ~5000, then presumably manual curation begins to get to one of 
> those numbers.
>
> What is the expected list curation/update cadence? Is it actually manual?
>
> Is there any recourse process for owners of resources that don't want to 
> be included? Do we have documentation on what it means to be appropriate for 
> the single-keyed cache?
>
> thanks,
> Mike
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "blink-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To view this discussion visit https://groups.google.com/a/
> chromium.org/d/msgid/blink-dev/CAPq58w6UFSnxxzhGKBnY1BJKiZZeH
> 7BUm7PmcjQm_%2BLjGyrtYg%40mail.gmail.com 
> <https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAPq58w6UFSnxxzhGKBnY1BJKiZZeH7BUm7PmcjQm_%2BLjGyrtYg%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/a/chromium.org/d/msgid/blink-dev/1b5e23ed-1423-4757-a166-e29186a8fbffn%40chromium.org.
