I think an important question here is maintainability, and the role in it of Microsoft (which I assume is behind this proposal - more or less directly). Could you please confirm and explain your affiliation with Microsoft? I think a big part of the community here is that we are transparent and open about our affiliations and about who is behind the code contributed here, so that we also have clarity about the long-term maintainability of the code.
I understand you work as a software developer for Microsoft (from your GitHub profile)? And let me be very clear - this is not something directed at you - but such a provider discussion is not really (or should not be) just between the community and a single developer submitting the code, but between the community and a team that will commit, and prove the commitment, to maintain it and engage back in the community. Could you (or someone else from Microsoft) please explain what the role of Microsoft is in leading that integration and its future maintenance? This is something that, for example, the Teradata team explained and promised - and they actually keep their promise.

Ideally we would love whoever from Microsoft leadership is behind this (if that's the case) to explain what role they are going to take in future maintenance, whether they have plans to develop and maintain system test dashboards for the current Azure provider, and whether they have plans (after that) to build a system test dashboard as part of contributing the new provider. For me this is an absolute prerequisite for such a provider - it MUST have a system test dashboard, and the dashboard MUST be maintained and run by the stakeholder who is interested in having the provider in the community.

Let me explain why I am asking. So far we have seen 0 activity from the Microsoft Fabric team, which has Airflow as a service (compared to Amazon, Google, Astronomer - who are all actively contributing to Airflow). The Amazon, Google and Astronomer teams not only solve issues and maintain code for their respective providers but also participate actively in developing core Airflow. All three of them (and the Teradata team) build and maintain system test dashboards https://airflow.apache.org/ecosystem/#airflow-provider-system-test-dashboards - where they contribute multiple hundreds of system tests to their respective providers, and they run the system tests and maintain them.
This means that they take responsibility for monitoring and fixing issues in their providers, and having the dashboards show the status is not only good for them but also good for us, because we know, when we release, that the provider to be released is probably good. And we are even talking about next steps - machine-readable data from the dashboards that we will be able to aggregate to get an overview of all those "big and important" providers.

We call it the "mixed governance" approach - where the code is contributed to the community and is developed according to the Apache Way and the rules of the ASF, while testing and maintenance are largely led by the respective stakeholder teams, who contribute a lot of engineering effort back. They do not expect to drop the code so that the "community" will keep maintaining it. The leadership of Google, Astronomer and Amazon is deeply involved with the community (organizing Airflow Summit, taking an active part in Airflow dev calls, discussions and planning).

Now. We see precisely zero such collaboration from Microsoft, despite us nagging and reaching out in multiple ways. It seems that the Azure provider is not "taken care of" by Microsoft (and here I might simply not be aware of some people contributing and being supported and sponsored by Microsoft, so I might be wrong). But I do not see people from Microsoft (or paid by Microsoft) who would actively fix bugs, develop system tests and monitor their "greenness". And unlike in the case of Amazon, Google, Astronomer and Teradata, I do not see anyone from Microsoft taking care of issues raised in any of our channels (issues/discussions/Slack channels). The only commit I can associate with Microsoft is your https://github.com/apache/airflow/pull/35091, where a new operator was added.
Again - I might not be aware of some of Microsoft's efforts here, but if I am right, and nothing changes here, I assume we might expect the same from the new provider - that the code will be dropped and the whole burden of maintaining it will be on the maintainers of Airflow and the community. And if we see such an approach, I think we all here in the community agree that the answer is "no thanks, release and maintain your own provider, please". We are not able to take on more burden to support something that could and should be supported by the stakeholder who wants to make sure things are working well for their services.

I think if we are going to seriously discuss this provider, we need to be absolutely sure that there is commitment (and we should see it) from the big stakeholder that is kind of Missing In Action here - while all the others are fully present and contributing back. This is at least my personal opinion - drawn from years of collaboration with all those stakeholders and seeing a win-win-win (users - maintainers - stakeholders) when such cooperation works.

Can you please bring someone from Microsoft (if you cannot speak yourself) to explain what their plans are in this regard?

J

On Wed, Jul 24, 2024 at 4:24 AM ambika garg <ambikagarg1...@gmail.com> wrote:

> Hi Apache Airflow Community,
>
> I hope this message finds you well.
>
> TL; DR; I am writing to propose the addition of a new provider to Apache
> Airflow for Microsoft Fabric <https://www.microsoft.com/microsoft-fabric>,
> an end-to-end, unified analytics and data platform. This integration will
> streamline workflow management and offer robust capabilities for Airflow
> users leveraging Microsoft Fabric's comprehensive services.
>
> *What is Microsoft Fabric?*
>
> Microsoft Fabric is an end-to-end, unified analytics and data platform
> designed for enterprises seeking a cohesive solution. Operating on a
> Software as a Service (SaaS) model, it provides a suite of services
> including Data Engineering, Data Factory, Data Science, Real-Time
> Analytics, Data Warehouse, and Databases in a single, integrated ecosystem,
> eliminating the need for disparate services from multiple vendors. Learn
> more about Microsoft Fabric
> <https://docs.microsoft.com/en-us/fabric/overview/>.
>
> *Why should Apache Airflow accept Microsoft Fabric as a provider?*
>
> 1. *Leverage the Microsoft Fabric items:* By integrating Microsoft
> Fabric as a provider in Apache Airflow, we can leverage its comprehensive
> suite of services such as Fabric notebooks, pipelines, warehouses etc. to
> enhance workflow management for a variety of use cases.
>
> 2. *Unified Platform*: Microsoft Fabric offers a comprehensive set
> of analytics experiences designed to work together seamlessly. Users don’t
> need to assemble and manage disparate services from multiple vendors,
> leading to more robust, simplified, and reliable workflows for those who
> already rely on Airflow for orchestration.
>
> 3. *SaaS Model Efficiency*: As a SaaS platform, Microsoft Fabric
> offers scalability, maintenance, and updates handled by Microsoft, reducing
> the operational burden on users. Airflow users can leverage these
> efficiencies while orchestrating workflows that involve Fabric services.
>
> 4. *Fabric is lake-centric and open:* Microsoft Fabric's
> lake-centric design addresses the complexity and messiness of traditional
> data lakes. By integrating Fabric with Airflow, users can leverage OneLake,
> a multi-cloud data lake, to simplify data management and reduce data
> duplication.
>
> 5. *Market Demand*: As enterprises increasingly adopt Microsoft
> Fabric for their analytics needs, there will be growing demand for seamless
> integration with existing and well-established tools like Apache Airflow.
>
> What do you guys think?
>
> Best Regards,
> Ambika Garg