TL;DR: We are generally very cautious about adding new "external
service" providers to the community, and it's highly unlikely we would
want Weaviate in, but you are absolutely free (and encouraged) to
release the provider on your own. Even when there is an open-source
engine (like yours), if there is a cloud service behind it, those who
build and run the service should, in general, take responsibility for
releasing their integration with Airflow.

In the vast majority of cases (and very likely in yours) it is far
better for services like yours to build and release the provider on
their own. There is absolutely nothing a community provider can do
that an Airflow provider released by you cannot. It's much
simpler for you to support Airflow and release your own provider
than for the Airflow community to maintain yet another external service
provider (on top of the 70+ providers we already manage in the community).
We are happy to add information about your provider to our ecosystem
page
https://airflow.apache.org/ecosystem/#third-party-airflow-plugins-and-providers
and you can also add it to the Astronomer Registry. You will find
links to that registry (and to any other registries that might be
listed) on the ecosystem page.

We would only consider accepting a donated provider if a number of
conditions are fulfilled. Mocking is not enough: we also expect
those who wish to donate such a service provider to run system tests
against the real services (using their own resources) and to commit to
keeping those system tests running and maintained (otherwise we will
stop releasing the provider). At the same time we impose a lot of
limitations on such a provider, including a minimum supported Airflow
version (in April we will bump all providers to only support Airflow
2.4+, for example), and it is bound to our very strict release process
https://github.com/apache/airflow#release-process-for-providers.
Recently this requirement caused Cloudera, for example, to release
their provider on their own (see the ecosystem page for the link).
Other popular providers that we already have are catching up with
this requirement. For example, AWS recently released a dashboard of
system tests
https://aws-mwaa.github.io/open-source/system-tests/dashboard.html
that they run and maintain, and we are going to use it in our release
process when releasing the AWS provider. You would have to do something
similar as a very basic requirement to get a Weaviate provider
adopted in the Airflow community.

You can take a look at the recent discussions we had about this to get
more context. Please read all of them before responding. Those
threads likely have the answers to all the questions you might
have:

* https://lists.apache.org/thread/hvl2sg7mc6gwxs1h5kzhrcdtt8cc36dd
* https://lists.apache.org/thread/1gtw5vyypxh0p72wh4dss7cllcvhgh01
* https://lists.apache.org/thread/qk2co6trd7gm57744shprw2fhgmjr637
* https://lists.apache.org/thread/8b1jvld3npgzz2z0o3gv14lvtornbdrm

A bit discouraging, I understand, but we have debated and discussed this
a lot, and this is the approach we apply to all similar requests.

J.

On Sun, Jan 29, 2023 at 8:49 PM Marcus Eagan <m...@marcuseagan.com> wrote:
>
> Hi Devs,
>
> In keeping with the open source ethos and the need for DAG workflows in 
> Neural Search pipelines, I welcome feedback on the idea of adding a Weaviate 
> provider to Airflow. It's the best open source neural search engine.
>
> I see the need for a test and would be willing to invest in a mock if 
> necessary, but I'm curious about the appetite for such work in general.
>
> I'm open to feedback. I've contributed a lot to various open source projects 
> and one very small contribution to Airflow back in the day to help with 
> enterprise adoption.
>
> Best,
>
> Marcus
>
