Hello Airflow Community,

I would like to propose a new provider for Vespa.ai. There's a PR for this here: https://github.com/apache/airflow/pull/63988

Motivation: We'd like to make it easy for Vespa users to use Airflow to write data and run queries (and potentially other tasks in the future). And vice versa: for Airflow users to use Vespa for retrieval.

Background: Vespa.ai is an open-source retrieval engine, somewhat similar to search engines like Elasticsearch or vector DBs like Qdrant. Vespa is used for e-commerce and real-time personalization (e.g., Spotify, Yahoo!, Vinted), web-scale RAG (e.g., Perplexity), and other use cases that need flexible ranking at scale. Users typically combine signals such as vector or lexical similarity, recency, etc., via custom ranking logic or models (GBDT, ONNX). They normally have complex data pipelines, which is where Airflow comes in: many are already Airflow users or should be :)

It's in our interest for Airflow users to have a good experience (e.g., reliable, secure), so we're committed to maintaining this provider.

This brings us to maintenance (AIP-95). We have two stewards:

- Radu Gheorghe (github: radu-gheorghe) - me
- Thomas Hjelde Thoresen (github: thomasht86) - maintainer of pyvespa (on which the provider is based)

Jarek Potiuk has agreed to be our Committer Sponsor. Thanks a lot for your guidance in this whole process, Jarek!

Looking forward to your feedback.

Best regards,
Radu Gheorghe
