On Sat, Nov 8, 2025 at 1:32 PM Yoav Weiss (@Shopify)
<[email protected]> wrote:
I'm extremely supportive of this effort, with multiple hats on.
I'd have loved for this not to be restricted to users with 3P cookies
enabled, but one can imagine abuse where pervasive resource
*patterns* are used with unique hashes that are not deployed
in the wild, and where each such URL is used as a cross-origin bit
of entropy.
Yep, there are 2 risks for explicit tracking (both effectively
moot when you can track directly anyway): serving different content for
some of the responses some of the time (maybe for a slightly different
URL than the "current" version that still matches the pattern), and
using a broad sample of not-current URLs across a bunch of origins as
a fingerprint. We can make some of that harder but I couldn't think of
any way to completely eliminate the risk.
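For illustration, a rough Python sketch of that second risk (everything here is hypothetical; is_cache_hit stands in for whatever timing side-channel is used):

# Hypothetical sketch of the "not-current URLs as a fingerprint" risk above.
# With k pattern-matching URLs that are never actually deployed, an attacker
# gets roughly k bits of cross-origin entropy from a single-keyed cache.

def write_identifier(user_bits: str, probe_urls: list[str]) -> list[str]:
    # On a site the tracker controls, fetch one probe URL per '1' bit so a
    # unique subset of pattern-matching URLs ends up in the shared cache.
    return [url for bit, url in zip(user_bits, probe_urls) if bit == "1"]

def read_identifier(probe_urls: list[str], is_cache_hit) -> str:
    # On any other site, recover the bits by observing which probe URLs are
    # already cached (e.g. via load timing).
    return "".join("1" if is_cache_hit(url) else "0" for url in probe_urls)

Which is why the gating on 3P cookies matters: in that population the same tracking is already possible directly.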
On Sat, Nov 8, 2025 at 7:04 AM Patrick Meenan
<[email protected]> wrote:
The list construction should already be completely objective.
I changed the manual origin-owner validation to instead trust and
require "Cache-Control: public". The rest of the
criteria
<https://docs.google.com/document/d/1xaoF9iSOojrlPrHZaKIJMK4iRZKA3AD6pQvbSy4ueUQ/edit?tab=t.0>
should be well-defined and objective. I'm not sure if they can
be fully automated yet (though that might just be my pre-AI
thinking).
The main need for humans in the loop right now is to create
the patterns so that they each represent a "single" resource
that is stable over time across URL changes (version/hash), and
to distinguish those stable files from random hash bundles
that aren't stable from release to release. That's fairly easy
for a human to do (and get right).
This is something that origins that use compression dictionaries
already do by themselves - define the "match" pattern that covers
the URL's semantics. Can we somehow use that for automation where
it exists?
We can use the match patterns for script and style destinations as an
input when defining the patterns. If the resource URL matches the
match pattern and the match pattern is reasonably long (not /app/*.js)
then it's probably a good pattern (and could be validated across
months of HTTP Archive logs). There are patterns where dictionaries
aren't used as strict delta updates for the same file (e.g. a script
with a lot of common code, portions of which might be in other
scripts used on other pages), so I wouldn't want to use it blindly, but
it is a very strong possibility.
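For concreteness, a rough Python sketch of using the dictionary match pattern as an input (the specificity heuristic and names here are mine, just to illustrate, not anything agreed on):

import re
from urllib.parse import urlsplit

def match_pattern_to_regex(match_pattern: str) -> re.Pattern:
    # Very simplified: '*' matches within a path segment. Real URLPattern
    # semantics (named groups, full wildcards, etc.) are richer than this.
    return re.compile("^" + re.escape(match_pattern).replace(r"\*", "[^/]*") + "$")

def pattern_is_usable(resource_url: str, match_pattern: str, min_literal_chars: int = 12) -> bool:
    # Candidate check: the resource path must match the dictionary's match
    # pattern, and the pattern must have enough literal characters that it
    # plausibly pins down a single resource rather than e.g. "/app/*.js".
    path = urlsplit(resource_url).path
    if not match_pattern_to_regex(match_pattern).match(path):
        return False
    return len(match_pattern.replace("*", "")) >= min_literal_chars

# pattern_is_usable("https://cdn.example/lib/1a2b3c-common.js", "/lib/*-common.js") -> True
# pattern_is_usable("https://cdn.example/app/1a2b3c.js", "/app/*.js") -> False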
On Fri, Nov 7, 2025 at 4:47 PM Rick Byers
<[email protected]> wrote:
Thanks Pat. I am personally a big fan of things which
increase publisher ad revenue across the web broadly
without hurting (or ideally while improving) the user
experience, and this seems likely to do exactly that. In
particular I recall all the debate around
stale-while-revalidate
<https://web.dev/articles/stale-while-revalidate> and am
proud that we pushed
<https://groups.google.com/a/chromium.org/g/blink-dev/c/rspPrQHfFkI/m/c5j3xJQRDAAJ?e=48417069>
through it with urgency and confirmed it indeed increased
publisher ad revenue across the web
<https://web.dev/case-studies/ads-case-study-stale-while-revalidate>.
Reading the Mozilla feedback carefully the point that
resonates most with me is the risk of "gatekeeping" and
the potential to mitigate that by establishing objective
rules for inclusion. Is it plausible to imagine a version
of this where the list construction would be entirely
objective? What would the tradeoffs be?
Thanks,
Rick
On Thu, Oct 30, 2025 at 3:50 PM Patrick Meenan
<[email protected]> wrote:
Reaching out to site owners was mostly for a sanity
check that the resource is not expecting to be
partitioned for some reason (even though the payloads
are known to be identical). If it helps, we can
replace the reach-out step with a requirement that the
responses be "Cache-Control: public" (and hard-enforce
it in the browser by not writing the resource to cache
if it isn't). That is an explicit indicator that the
resources are cacheable in shared upstream caches.
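As a minimal sketch of what that hard enforcement could look like (illustrative Python, not the actual Chromium code):

def eligible_for_shared_cache(response_headers: dict) -> bool:
    # Only admit a pervasive-resource response to the single-keyed cache when
    # it explicitly opts into shared caching and isn't otherwise marked
    # private or uncacheable.
    cache_control = response_headers.get("cache-control", "").lower()
    directives = {d.strip().split("=")[0] for d in cache_control.split(",") if d.strip()}
    return "public" in directives and not ({"private", "no-store"} & directives)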
I removed the 2 items from the design doc that were
specifically targeted at direct fingerprinting since
that's moot with the 3PC link (as well as the
fingerprinting bits from the validation with resource
owners).
On the site-preferencing concern, it doesn't actually
preference large sites but it does preference
currently-popular third-party resources (most of which
are provided by large corporations). The benefit is
spread across all of the sites that they are embedded
in (funnily enough, most large sites won't benefit
because they don't tend to use third-parties).
Determining the common resources at a local level
exposes the same XS Leak issues as allowing all
resources (e.g. your local map tiles will show up in
multiple cache partitions because they all reference
your current location, but they can be used to identify
your location since they are not globally common).
Instead of using the HTTP Archive to collect the
candidates, we could presumably build a centralized
list based on aggregated common resources that are
seen across cache partitions by each user but that
feels like an awful lot of complexity for a very small
number of resulting resources.
On the test results, sorry, I thought I had included
the experiment results in the I2S but it looks like I
may not have.
The test was specifically just with the patterns for
the Google ads scripts because we aren't expecting
this feature to impact the vitals for the main
page/content since most of the pervasive resources are
third-party content that is usually async already and
not critical-path. It's possible some video or map
embeds might trigger LCP in some cases but that's the
exception more than the norm. This is more geared to
making those supporting things work better while
maintaining the user experience. Ads has the kind of
instrumentation that we'd need to be able to get
visibility into the success (or failure) of that
assumption and to be able to measure small changes.
The results were stat-sig positive but relatively
small. The ad iframes displayed their content slightly
faster and transmitted fewer bytes for each frame
(very low single digit percentages).
The guardrail metrics (including vitals) were all
neutral, which is what we were hoping for (improvement
without the cost of increased contention).
If you'd feel more comfortable with gathering more
data, I wouldn't be opposed to running the full list
at 1% to check the guardrail metrics again before
fully launching. We won't necessarily expect to see
positive movement to justify a launch since the
resources are still async but we can validate that
assumption with the full list at least (if that is the
only remaining concern).
On Thu, Oct 30, 2025 at 5:28 PM Rick Byers
<[email protected]> wrote:
Thanks Erik and Patrick, of course that makes
sense. Sorry for the naive question. My initial
reading of the design doc suggested to me that a
lot of the privacy mitigations were about
preventing the cross-site tracking risk. Could the
design be simplified by removing some of those
mitigations? For example, the section about
reaching out to the resource owners, to what
extent is that really necessary when all we're
trying to mitigate is XS leaks? Don't the
popularity properties alone mitigate that
sufficiently?
What can you share about the magnitude of the
performance benefit in practice in your
experiments? In particular for LCP, since we know
<https://wpostats.com/> that correlates well with
user engagement (and against abandonment) and so
presumably user value.
The concern about not wanting to further advantage
more popular sites over less popular ones
resonates with me. Part of that argument seems to
apply broadly to the idea of any LRU cache
(especially one with a reuse bias which I believe
ours has
<https://www.chromium.org/developers/design-documents/network-stack/disk-cache/#eviction>?).
But perhaps an important distinction here is that
the benefits are determined globally vs. on a
user-by-user basis? But I think any solution that
worked on a user-by-user basis would have the XS
leak problem, right? Perhaps it's worth reflecting
on our stance on using crowd-sourced data to try
to improve the experience for all users while
still being fair to sites broadly. In general I
think this is something Chromium is much more open
to (where it brings significant user benefit) than
other engines. For example, our Media Engagement
Index <https://developer.chrome.com/blog/autoplay>
system has some similar properties in terms of
using aggregate user behaviour to help decide
which sites have the power to play audio on page
load and which don't. I was personally uncertain
at the time if the complexity would prove to be
worth the benefit, but now I'm quite convinced it
is. Playing audio on load is just something users
and developers want in a few cases, but not most
cases. I wonder if perhaps cross-site caching is
similar?
Rick
On Thu, Oct 30, 2025 at 9:09 AM Matt Menke
<[email protected]> wrote:
Note that even with Vary: Origin, we still
have to load the HTTP request headers from the
disk cache to apply the vary header, which
leaks timing information, so "Vary: Origin" is
not a sufficient security mechanism to prevent
that sort of cross-site attack.
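To make that concrete, a rough sketch of the lookup order (a simplification for illustration, not the actual disk cache code):

def lookup(cache, request):
    entry = cache.find_entry(request.url)          # keyed by URL only
    if entry is None:
        return None                                 # fast path: no disk read
    stored = cache.read_headers_from_disk(entry)    # disk I/O happens here...
    for header in stored.vary_headers:              # ...before Vary is applied
        if request.headers.get(header) != stored.request_headers.get(header):
            return None                             # rejected, but timing already differed
    return cache.read_body_from_disk(entry)

So a cross-site probe can still distinguish "entry exists" from "entry missing" by timing, even when the Vary check ultimately rejects reuse.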
On Wednesday, October 29, 2025 at 5:08:42 PM
UTC-4 Erik Anderson wrote:
My understanding was that there was
believed to be a meaningful security
benefit with partitioning the cache.
That’s because it would limit a party from
being able to infer that you’ve visited
some other site by measuring a side effect
tied to how quickly a resource loads. That
observation could potentially be made even
if that specific adversary doesn’t have
any of their own content loaded on the
other site.
Of course, if there is an entity with a
resource loaded across both sites with a
3p cookie /and/ they’re willing to share
that info/collude, there’s not much
benefit. And even when partitioned, if 3p
cookies are enabled, there are potentially
measurable side effects that differ based
on if the resource request had some
specific state in a 3p cookie.
Does that incremental security benefit of
partitioning the cache justify the
performance costs when 3p cookies are
still enabled? I’m not sure.
Even if partitioning was eliminated, a
site could protect themselves a bit by
specifying Vary: Origin, but that probably
doesn’t sufficiently cover iframe
scenarios (nor would I expect most sites
to get it right).
*From:* Rick Byers <[email protected]>
*Sent:* Wednesday, October 29, 2025 11:56 AM
*To:* Patrick Meenan <[email protected]>
*Cc:* Mike Taylor <[email protected]>;
blink-dev <[email protected]>
*Subject:* [EXTERNAL] Re: [blink-dev]
Intent to ship: Cache sharing for
extremely-pervasive resources
If this is enabled only when 3PCs are
enabled, then what are the tradeoffs of
going through all this complexity and
governance vs. just broadly coupling HTTP
cache keying behavior to 3PC status in
some way? What can a tracker credibly do
with a single-keyed HTTP cache that they
cannot do with 3PCs? Are there also
concerns about accidental cross-site
resource sharing which could be mitigated
more simply by other means, e.g. by scoping
just to ETag-based caching?
I remember the controversy and some real
evidence of harm to users and businesses
in 2020 when we partitioned the HTTP
cache, but I was convinced that we had to
accept that harm in order to credibly
achieve 3PCD. At the time I was personally
a fan of a proposal like this (even for
users without 3PCs) in order to mitigate
the harm. But now it seems to me that if
we're going to start talking about poking
holes in that decision, perhaps we
should be doing a larger review of the
options in that space with the knowledge
that most Chrome users are likely to
continue to have 3PCs enabled. WDYT?
Thanks,
Rick
On Mon, Oct 27, 2025 at 10:27 AM Patrick
Meenan <[email protected]> wrote:
I don't believe the
security/privacy protections actually
rely on the assertions (and it's
unlikely those would be public). It's
more for awareness and to make sure
they don't accidentally break
something with their app if they were
relying on the responses being
partitioned by site.
As far as query params go, the browser
code already only filters for requests
with no query params so any that do
rely on query params won't get
included anyway.
The same goes for cookies. Since the
feature is only enabled when
third-party cookies are enabled,
adding cookies to these responses or
putting unique content in them won't
actually pierce any new boundaries, but
it goes against the intent of only
using it for public/static resources,
and they'd lose the benefit of the
shared cache when it gets updated.
Same goes for the fingerprinting risks
if the pattern was abused.
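For illustration, a rough sketch of that request-side gating (3P cookies enabled, no query string, URL matches a listed pattern); names are made up, not the actual browser code:

from urllib.parse import urlsplit

def request_uses_shared_cache(url: str, listed_patterns, third_party_cookies_enabled: bool) -> bool:
    # The feature is gated on third-party cookies being enabled, and requests
    # with a query string are never considered, so query-param-dependent
    # resources drop out automatically.
    if not third_party_cookies_enabled:
        return False
    parts = urlsplit(url)
    if parts.query:
        return False
    return any(p.match(parts.path) for p in listed_patterns)  # compiled regexes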
On Mon, Oct 27, 2025 at 9:39 AM Mike
Taylor <[email protected]> wrote:
On 10/22/25 5:48 p.m., Patrick
Meenan wrote:
The candidate list goes down
to 20k occurrences in order to
catch resources that were
updated mid-crawl and may have
multiple entries with
different hashes that add up
to 100k+ occurrences. In the
candidate list, without any
filtering, the 100k cutoff is
around 600 entries; I'd estimate that
well under 25% of the
candidates make it through the
filtering for stable pattern,
correct resource type and
reliable pattern. First
release will likely be 100-200
and I don't expect it will
ever grow above 500.
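To illustrate why the candidate cut is at 20k rather than 100k (made-up numbers, hypothetical URLs):

from collections import defaultdict

# One resource that rotated its hash mid-crawl: each URL is under 100k
# occurrences on its own, but the pattern-level total clears the bar.
occurrences = {
    "/lib/1a2b3c-common.js": 70_000,
    "/lib/9f8e7d-common.js": 45_000,
}

by_pattern = defaultdict(int)
for url, count in occurrences.items():
    by_pattern["/lib/<hash>-common.js"] += count

print(dict(by_pattern))  # {'/lib/<hash>-common.js': 115000}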
Thanks - I see the living document
has been updated to mention 500 as
a ceiling.
As far as cadence goes, I
expect there will be a lot of
activity for the next few
releases as individual
patterns are coordinated with
the origin owners but then it
will settle down to a much
more bursty pattern of updates
every few Chrome releases
(likely linked with an origin
changing their application and
adding more/different
resources). And yes, it is manual.
As far as the process goes,
resource owners need to
actively assert that their
resource is appropriate for
the single-keyed cache and
that they would like it
included (usually in response
to active outreach from us but
we have the external-facing
list for owner-initiated
contact as well). The design
doc has the documentation for
what it means to be
appropriate (and the doc will
be moved to a readme page in
the repository next to the
actual list so it's not a
hard-to-find Google doc):
Will there be any kind of public
record of this assertion? What
happens if a site starts using
query params or sending cookies?
Does the person in charge of
manual list curation discover that
in the next release? Does that
require a new release (I don't
know if this lives in component
updater, or in the binary itself)?
*5. Require resource owner opt-in*
For each URL to be included,
reach out to the team/company
responsible for the resource
to validate the URL pattern
and get assurances that the
pattern will always serve the
same content to all sites and
not be abused for tracking (by
using unique URLs within the
pattern mask as a bit-mask for
fingerprinting). They will
also need to validate that the
URLs covered by the pattern
will not rely on being able to
set cookies over HTTP using a
Set-Cookie HTTP response header
because they will not be
re-applied across cache
boundaries (the set-cookie is
not cached with the resource).
On Wed, Oct 22, 2025 at
5:31 PM Mike Taylor
<[email protected]> wrote:
On 10/18/25 8:34 a.m.,
Patrick Meenan wrote:
Sorry, I missed a step
in making the
candidate resource
list public. I have
moved it to my
chromium account and
made it public here
<https://docs.google.com/spreadsheets/d/1TgWhdeqKbGm6hLM9WqnnXLn-iiO4Y9HTjDXjVO2aBqI/edit?usp=sharing>.
Not everything in that
list meets all of the
criteria - it's just
the first step in the
manual curation (same
URL served the same
content across > 20k
sites in the HTTP
Archive dataset).
The manual steps from
there for meeting the
criteria are basically
(see the sketch after
this list):
- Cull the list for
scripts, stylesheets
and compression
dictionaries.
- Remove any URLs that
use query parameters.
- Exclude any
responses that
set cookies.
- Identify URLs that
are not manually
versioned by site
embedders (i.e. the
embedded resource cannot
get stale). This
is either in-place
updating resources or
automatically
versioned resources.
- Only include URLs
that can reliably
target a single
resource by pattern
(i.e.
..../<hash>-common.js
but not ..../<hash>.js)
- Get confirmation
from the resource
owner that the given
URL Pattern is and
will continue to be
appropriate for the
single-keyed cache
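A rough sketch of the first few steps as automated checks (heuristics, field names and thresholds here are illustrative, not the actual curation tooling):

import re
from urllib.parse import urlsplit

ALLOWED_TYPES = {"script", "stylesheet", "compression-dictionary"}

def passes_basic_checks(candidate: dict) -> bool:
    # candidate is a hypothetical record with "url", "type" and "set_cookie" fields.
    parts = urlsplit(candidate["url"])
    return (candidate["type"] in ALLOWED_TYPES   # scripts, stylesheets, dictionaries
            and not parts.query                  # no query parameters
            and not candidate["set_cookie"])     # no Set-Cookie responses

def pattern_targets_single_resource(url_pattern: str) -> bool:
    # Accept e.g. "/lib/<hash>-common.js" but not "/lib/<hash>.js": require enough
    # literal characters around the hash placeholder to pin down one resource.
    literal = re.sub(r"<hash>|\*", "", url_pattern)
    return len(literal.rsplit("/", 1)[-1]) >= 8  # arbitrary illustrative threshold

# Stability over time (in-place vs. auto-versioned) and owner confirmation still
# need humans or out-of-band data and aren't captured here.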
A few questions on list
curation:
Can you clarify how big
the list will be? The
privacy review at
https://chromestatus.com/feature/5202380930678784?gate=5174931459145728 mentions
~500, while the design doc
mentions 1000. I see the
candidate resource list
starts at ~5000, then
presumably manual curation
begins to get to one of
those numbers.
What is the expected list
curation/update cadence?
Is it actually manual?
Is there any recourse
process for owners of
resources that don't want
to be included? Do we have
documentation on what it
means to be appropriate for
the single-keyed cache?
thanks,
Mike