Re: [webkit-dev] Request for position: Topics API

John Wilander via webkit-dev Wed, 06 Apr 2022 12:20:04 -0700

Hi Josh!

Thanks for reaching out. Hope to see you in person at some standards meeting 
soon! Please see our feedback on your proposal below.

   Regards, John

> On Mar 17, 2022, at 9:04 AM, Josh Karlin via webkit-dev 
> <[email protected]> wrote:
> 
> Hi WebKit-Dev,
> 
> We've been working on the Topics API that allows for interest-based 
> advertising in a browser ecosystem in which storage is partitioned by 
> top-frame site. This API replaces our first proposal in this area, FLoC. We 
> would like to hear what you think about it. Note that Chrome is implementing 
> (with spec following shortly after) but we're quite open to evolving the API 
> over time and are appreciative of your feedback.
> 
> explainer: https://github.com/jkarlin/topics
> chromestatus: https://chromestatus.com/feature/5680923054964736
> spec: TBD

The Topics API explainer is in a personal repository, which gives us pause on 
commenting since it’s unclear what the proposal’s official status is.

Our analysis of the proposal assumes full per-site partitioning and no 
high-entropy fingerprinting signals, such as IP address, available cross-site. 
It’s important that any pre-existing privacy deficiencies on the web not be 
used as excuses for privacy deficiencies in new specs and proposals.

Apple does not think the Topics API is a good addition to the web platform. 
Here’s why:

- Cross-site data. We don’t think cross-site data about the user’s browsing 
  should be exposed in APIs. We’ve been working for ten years in the opposite 
  direction, partitioning data per-site.

- Cross-site sharing default. We don’t think cross-site data sharing should be 
  on by default as a web platform feature. Users must have agency over 
  expressing their personal interests to websites and third parties. A browser 
  exposing this data by default is not acting as a user agent. Further, using 
  the user’s browsing history as the basis of determining interests undermines 
  users’ trust in the browser as their agent.

- Cross-site targeting by default. We don’t think cross-site targeting of ads 
  should be on by default as a web platform feature. Put another way, we don’t 
  think cross-site targeting of ads should be the default experience on the 
  web.

- Safe to roam. The web should be safe to roam and the user agent should be 
  working in that direction. Exposing cross-site data by default to facilitate 
  personalized ad targeting would make the web less safe to roam. Users would 
  always have to think twice about which sites they visit and how that can be 
  used to manipulate or target them.

- Enrichment of user profiles. Websites which already know a lot about a user 
  can learn more through cross-site data APIs like the Topics API. Prime 
  examples of such sites are the user’s search engine or social networking 
  sites. Worse, topics connected to the user’s browsing will evolve over time, 
  allowing continuous enrichment of the user profile as an ongoing privacy 
  exposure. An example: The user was interested in honeymoons, then baby 
  clothing, then lawyers.

- Sensitive topics. What counts as sensitive information differs between, for 
  instance, cultures, religions, ages, communities, and individuals. It is 
  therefore not just hard but also foolish to think that browser vendors can 
  come up with a safe set of personalized topics to expose to ad networks.

- Topic bias. We understand that the current set of topics is not the one 
  intended to be used in production. However, the set shows a concerning 
  affluent western lifestyle bias and we worry that the eventual standardized 
  taxonomy will contain such biases too. A prime example in the current 
  taxonomy is “World Music” as a term for all non-Western music.

- Hidden patterns. We believe that technologies like machine learning will be 
  able to glean personal data and patterns out of something like the Topics 
  API that go far beyond whatever “safe” set of topics browser vendors define.

- Advantages established players. The Topics API will only provide cross-site 
  topic data to callers who called the API in the past for this particular 
  user and on a site about that topic. This benefits entities that have 
  scripts or frames embedded on many sites, e.g. already prevalent ad 
  trackers, or owners of embeds with an ostensibly non-ad-related purpose such 
  as social or video. And it perpetuates the incentive for more embedding 
  solely for the purpose of cross-site data usage and not for any clear user 
  benefit, thus needlessly hurting performance and battery life.

- Who will classify sites? The open questions at the end of the explainer 
  suggest that a taxonomy should be produced, and that it should become an 
  industry standard. A sample taxonomy is available. But a taxonomy (at least 
  as presented) is just a list of categories. Who decides which sites or pages 
  are in which category? Is this a globally maintained list? Would it be a 
  Google-provided service that requires Google’s permission to access? Would 
  each browser do it separately (and perhaps differently)? Would sites 
  self-label? Perhaps the intent is that the industry standard taxonomy would 
  bucket sites or pages in the categories, but if so that’s not clear from the 
  explainer, and if not, it seems like a major problem left unaddressed.
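
To illustrate the "Advantages established players" point, here is a minimal 
sketch of the caller restriction as described above: a caller only receives a 
topic if it previously called the API for this user on a site about that 
topic. This is a toy model, not the proposal's actual algorithm. The class 
name, method names, and the example.* domains are all hypothetical, and the 
sketch omits every other detail of the proposal.

```python
class TopicsModel:
    """Toy model of per-caller topic filtering (hypothetical, illustration only)."""

    def __init__(self):
        # Topics inferred from the sites this user has visited.
        self._user_topics = set()
        # caller -> topics of the sites where that caller invoked the API.
        self._observed = {}

    def visit(self, site_topic, embedded_callers):
        """The user visits a site about `site_topic`; each embedded caller calls the API."""
        self._user_topics.add(site_topic)
        for caller in embedded_callers:
            self._observed.setdefault(caller, set()).add(site_topic)

    def topics_for(self, caller):
        """Return only the user's topics that this caller has itself observed."""
        return sorted(self._user_topics & self._observed.get(caller, set()))


model = TopicsModel()
# A widely embedded caller is present on two of the three visited sites.
model.visit("travel", ["big-adtech.example"])
model.visit("fitness", ["big-adtech.example", "small-adtech.example"])
model.visit("cooking", [])

print(model.topics_for("big-adtech.example"))    # ['fitness', 'travel']
print(model.topics_for("small-adtech.example"))  # ['fitness']
print(model.topics_for("newcomer.example"))      # []
```

Under these assumptions, the caller embedded on the most sites learns the most 
about the user, while a caller with no existing embeds learns nothing.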

_______________________________________________
webkit-dev mailing list
[email protected]
https://lists.webkit.org/mailman/listinfo/webkit-dev
