[blink-dev] Intent to Prototype: Embedding API

Ian Zhao Tue, 26 May 2026 14:24:40 -0700

*Contact emails*
[email protected], [email protected], [email protected],
[email protected], [email protected]


*Explainer*
https://github.com/explainers-by-googlers/embedding-api

*Specification*
*No information provided*

*Summary*
The Embedding API is a proposed Web Platform API that allows developers to
generate high-dimensional vector representations (embeddings) of content
directly on the user's device.

By leveraging Chrome's on-device AI infrastructure and a shared on-device
model, this API enables semantic understanding features such as semantic
search, Retrieval-Augmented Generation (RAG), and content clustering,
without the network latency, cost, and privacy trade-offs of cloud
services. Compared to DIY client-side approaches, it benefits users (saving
bandwidth and local storage, because sites no longer each download their
own large model) and developers (abstracting away model delivery and the
work of keeping WebAssembly/WebGPU frameworks up to date).

*Blink component*
Blink>AI>Embedder
<https://issues.chromium.org/issues?q=customfield1222907:%22Blink%3EAI%3EEmbedder%22>

*Web Feature ID*
Missing feature

*Motivation*
While existing web technologies like WebAssembly and WebGPU provide
standardized, high-performance, and privacy-preserving execution
environments, deploying an embedding model still forces developers into a
difficult trade-off:

   - WebAssembly/WebGPU (DIY): Leads to significant storage and memory
   bloat, as every site must download its own multi-hundred-megabyte model.
   - Cloud APIs: Introduce network latency, financial costs for developers,
   and require sending potentially sensitive user text to third-party servers.

By ensuring stateless execution and explicitly not persisting embeddings
globally, an on-device API allows the browser to safely share a single,
optimized model across all origins, drastically reducing the resource
footprint while providing a simple, high-level JavaScript primitive for
generalist developers.

*Key Use Cases*

   - Semantic Search: Enable note-taking or documentation apps to find
   content based on meaning rather than keywords, entirely offline and
   private.
   - On-Device RAG: Power local Q&A bots that retrieve relevant context
   from a user’s own data.
   - Real-time Content Intelligence: Provide proactive moderation hints or
   content categorization as a user types, before content is ever transmitted
   to a server.
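
As a rough illustration of the semantic search and RAG retrieval steps above,
the sketch below ranks stored items by cosine similarity to a query vector.
It assumes the embeddings have already been produced by some means; this
message does not define the API surface, so no Embedding API calls appear
here. The names `EmbeddedDoc`, `cosineSimilarity`, and `topK` are
hypothetical and used only for this example.

```typescript
// Hypothetical shape: a piece of content paired with its embedding vector.
// How the vector is generated is outside the scope of this sketch.
interface EmbeddedDoc {
  id: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns 0 if either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k documents most similar to the query vector, best match first.
function topK(query: number[], docs: EmbeddedDoc[], k: number): EmbeddedDoc[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(query, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.doc);
}
```

In a RAG setting, the items returned by `topK` would be the context passed
along with the user's question. A linear scan like this is probably adequate
for small, per-user collections; larger corpora would likely need an index.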

*Anticipated questions*
Below are open questions that we want to discuss with other browser
vendors and the Web Machine Learning Community Group (WebML CG) as part of
the standards process, to ensure interoperability. (The explainer lists
more in the "Ensuring an Interoperable API Design" section.)

   - Model and Space Choices: Exploring requirements for open-weight models
   and allowing developers to specify or provide their own models, to ensure
   compatibility with server-side embedding databases.
   - Content Mediation: Can we develop some sort of mediation when
   embeddings must be used server-side?


*Initial public proposal*
https://github.com/webmachinelearning/proposals/issues/18


*Requires code in //chrome?*
True

*Tracking bug*
https://crbug.com/428233906

*Estimated milestones*

No milestones specified


*Link to entry on the Chrome Platform Status*
https://chromestatus.com/feature/5115796490682368?gate=5187435874091008

This intent message was generated by Chrome Platform Status
<https://chromestatus.com/>.
