Hey all,

*tl;dr*: I'm proposing an official Provider Registry for the Airflow
project, deployed at https://airflow.apache.org/registry/.

Preview is up at https://airflow.staged.apache.org/registry/ -- take a look
and let me know what you think.

PR to Airflow repo incoming in a couple of hours :)

*Why now*
With AIP-95 approved, Airflow now has a formal provider lifecycle:
incubation, production, mature, and deprecated. That opens the door for
accepting more community-built providers and giving them an official home,
while setting clear expectations about maturity and support. But lifecycle
stages only work if users can actually see them.

Right now, there's no place on airflow.apache.org where someone can browse
providers, check their lifecycle stage, or discover what modules they ship.

This registry fills that gap. It gives the PMC a tool to communicate
provider maturity to users, and it gives the community an official way to
surface new providers -- clearly labelled with their lifecycle stage.

*What it does*
The registry currently catalogs 99 providers and 1,648 modules across all
11 module types (operators, hooks, sensors, triggers, transfers, executors,
notifiers, secret backends, logging handlers, Dag bundles, and decorators).

It's built with Eleventy <https://www.11ty.dev/> (thanks Ash, for the
suggestion and for prototyping an approach with it) and auto-generated
directly from the provider.yaml files in the repo -- no separate data
pipeline, no manual curation. When a provider is added or updated, the next
CI build picks it up automatically.

The entire registry is a static site (HTML, CSS, JS): no server, no
database, same deployment model as the existing Airflow docs. It's
generated at build time from the provider.yaml files and served from S3 via
CloudFront.

*A few things you can do with it*:

   - Search across all providers and modules (Cmd+K, powered by Pagefind)
   - Browse by category (Cloud, Databases, AI & ML, etc.)
   - Filter/sort by lifecycle stage, downloads, module count
   - Explore provider detail pages with per-version module listings,
   connection types, parameters, and install commands
   - Access JSON API endpoints (/api/providers.json, /api/modules.json) for
   programmatic access -- useful for AI agents and tooling


The design is deliberately discovery-first: it links out to the API
reference docs and user guides rather than hosting everything itself. This
avoids duplicating content between provider docs and registry entries.

CI/CD is integrated with our existing docs pipeline and syncs to S3
automatically. Nothing in provider code, provider.yaml schemas, core
Airflow, or the docs build is changed by this.

*How it relates to the Astronomer Registry*
Many of you know the Astronomer Registry (https://registry.astronomer.io),
which has been the go-to for discovering Airflow providers for years. Big
thanks to Astronomer and Josh Fell for building and maintaining it.

This new registry is designed to be a community-owned successor on
airflow.apache.org, with the eventual goal of redirecting
registry.astronomer.io traffic here once it's stable.


*Remaining work*
Still to do after this lands:

   - apache/airflow-site PR for .htaccess rewrite and a "Registry" nav link
   - Redirect registry.astronomer.io traffic once the official one is stable
   - A way to add third-party providers that are not in the Airflow repo,
   like Great Expectations, Cosmos etc - I have a POC working on this.


*Future ideas (will create GH issues)*

   - Explicit categories in provider.yaml (currently keyword-based matching)
   - LLM-friendly exports (llms.txt, "Copy for AI" buttons etc.)
   - Example DAGs for each Provider.
   - and many more – but I think the current state is valuable enough
   already


I'd appreciate feedback and reviews!

Regards,
Kaxil

Reply via email to