I'd love to hear what others in the community who already use Qdrant think :)

A few comments for Anush: I reviewed the links and did the usual research.

1) Re: requirements - the provider does not introduce any big issues. urllib3 < 2 is a bit strange (but we are limited by botocore now anyhow, so it is not a big issue; I hope it can be removed in the future).

Requires-Dist: fastembed (==0.1.1) ; (python_version < "3.12") and (extra == "fastembed")
Requires-Dist: grpcio (>=1.41.0)
Requires-Dist: grpcio-tools (>=1.41.0)
Requires-Dist: httpx[http2] (>=0.14.0)
Requires-Dist: numpy (<1.21) ; python_version < "3.8"
Requires-Dist: numpy (>=1.21) ; python_version >= "3.8" and python_version < "3.12"
Requires-Dist: numpy (>=1.26) ; python_version >= "3.12"
Requires-Dist: portalocker (>=2.7.0,<3.0.0)
Requires-Dist: pydantic (>=1.10.8)
Requires-Dist: urllib3 (>=1.26.14,<2.0.0)

2) The open-source version seems to be fully supported and alive.

This looks pretty cool after looking at the information provided. The code is small and literally just calls the library's QdrantClient, so it does not seem like something that will require a lot of maintenance.

My concerns are with testability and future-proof maintenance. This is a fast-moving area and there will be breaking changes. Yes, there are unit tests and system tests there, but we have no time or possibility to run our tests "by hand" against a real Qdrant server, and especially not against one running in the cloud.

So, two points:

1) Open-source version: Similar to the Kafka provider - it seems Qdrant has a nicely dockerized version that can be installed from officially released images (https://qdrant.tech/documentation/quick-start/) - so it seems like a perfect candidate for running integration tests on our CI. If that is in place, we can both easily make sure the provider continues to work, and equally easily bump the version of Qdrant when a new major/minor release is out and have our tests run automatically in our CI.
And it will nicely run in Breeze with `breeze --integration qdrant` when someone wants to run the integration tests locally. See https://github.com/apache/airflow/tree/main/tests/integration/providers/apache/kafka and https://github.com/apache/airflow/blob/main/scripts/ci/docker-compose/integration-kafka.yml - I think that should be a condition of approving it.

2) Cloud version: It would also help (especially if you want to run the system tests against your cloud) if you could provide dashboards similar to those we have for Amazon and other LLM providers (maintained by Astronomer), which would show the status of the system tests you run against the main version.

Are you OK with extending the PR, adding integration tests, and committing to maintaining such a dashboard?

If there are voices from the community saying "yeah it's useful" - and points 1) and 2) are addressed - I am quite positive about accepting the provider :)

J

On Tue, Jan 16, 2024 at 1:41 PM Anush Shetty <anush.she...@qdrant.com> wrote:

> Hello, Airflow community,
>
> I am Anush - an Integrations engineer at Qdrant. This discussion proposes
> to include Qdrant as a supported provider for Airflow.
> Following up on https://lists.apache.org/list.html?dev@airflow.apache.org.
>
> Qdrant - https://github.com/qdrant/qdrant, is an open-source vector search
> engine and database, governed by the Apache-2.0 license, allowing complete
> freedom for commercial usage and redistribution.
>
> Proposed provider PR: https://github.com/apache/airflow/pull/36805
>
> Qdrant ranks amongst the most performant and most used vector databases
> available today.
> - https://qdrant.tech/benchmarks/
> - https://ossinsight.io/collections/vector-search-engine/
>
> We believe Qdrant would be a valuable addition for Airflow users to have as
> an option when building DAGs.
>
> Qdrant can be deployed by users on their own or via Qdrant's cloud
> offering.
>
> The proposed provider supports interfacing with Qdrant instances through
> both REST and GRPC interfaces without any restrictions on the mode of
> deployment used.
>
> As part of our commitment, the Qdrant team is willing to undertake the
> responsibility of maintaining and updating the provider as per user
> requests or any identified needs.
>
> Anush
>
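
To make the integration-test idea in point 1) a bit more concrete, here is a minimal Python sketch (not taken from the PR) of a readiness check that a test fixture could run before talking to a dockerized Qdrant. The helper names `fetch_json` and `wait_for_qdrant` are hypothetical. The sketch assumes the container exposes REST on port 6333 and that the REST root returns a JSON document with a `version` field; both should be verified against the Qdrant image actually used.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_json(url: str, timeout: float = 2.0) -> dict:
    """GET a URL and decode the JSON body (stdlib only)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_qdrant(
    base_url: str = "http://localhost:6333",  # assumed REST address of the container
    fetch: Callable[[str], dict] = fetch_json,
    retries: int = 10,
    delay: float = 1.0,
) -> str:
    """Poll the REST root until it answers; return the reported version.

    Assumption: the root endpoint returns JSON containing a "version" key.
    If the key is missing, "unknown" is returned instead of failing.
    """
    last_error = None
    for _ in range(retries):
        try:
            info = fetch(base_url + "/")
            return str(info.get("version", "unknown"))
        except (OSError, ValueError) as exc:
            # OSError covers connection failures (including urllib's URLError);
            # ValueError covers a body that is not valid JSON yet.
            last_error = exc
            time.sleep(delay)
    raise RuntimeError(f"Qdrant not reachable at {base_url}") from last_error
```

The `fetch` argument is injectable so the retry logic can be unit-tested without a running server; in CI it would simply be left at its default.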