I very much agree that those different groups need different treatment - in documentation and support, as well as slightly different messaging, tooling and level of "hand-holding". I do not want to dive deep into the technicalities of those differences; this is something we can figure out when/if we turn the observation into specific actions.
But I think I need to add one thing - a few thoughts of mine on why both 1) and 2) are super important for Airflow as a product and for the community as a whole. You might think 2) should be the primary and possibly the only community focus. It's very easy to fall into the trap of "let's focus on 2) only".

Let me be the devil's advocate here and throw out three statements that you might easily come up with:

1) are not important and take too much of our time. They need to learn a lot, ask a lot of questions and bring zero value. They are just introducing a lot of noise.

2) are the only important segment that the community should focus on - they bring the most to the community and they have enough skilled teams and experience to consume Airflow's complexity and make the best out of it. Not so much with the others.

3) are important - but they will figure it out. They will make forks, or otherwise throw a lot of money at their offering and they will "make it happen for their case for the money of their customers". They will not give back anyway and they can go on with their forks as they want, we do not care.

You could think that way, yes. But there is one huge fallacy here. It assumes the world of data is "static". I think it's as far from that as possible, actually, and if we want the Airflow community and Airflow as a product to thrive and stay relevant 5 years from now, we need to account for what *might* happen and be prepared for different scenarios.

The thing is that today's 1)s will be tomorrow's decision makers at 2)s and might eventually become customers of the 3)s. And the more we nurture and help 1)s now and give them a great and pleasant experience, the more of them will, in their future roles, pick Airflow for 2)s.

Also the 3)s should make a conscious effort to invest in making a good experience for 1)s so that they become 2)s, and to make sure 2)s get a great experience with the product, so that in the future the 3)s can offer them extra stability/maintenance/SLAs/platform integration etc. and convert them into their customers. If they don't do it, their growth and customer base will be impacted most.

In the last case, what is important is to provide stability to the development of the product that Airflow is - fueling the efforts of the maintainers and contributors, often by paying for their work; contributing valuable, big features that take months or years and the concerted effort of a number of people to implement; and contributing to the growth of the community by helping to organise events, doing it in a way that lets them also reap the benefit of it without diverging and without "vendor lock-in".

So writing those statements a bit differently:

1) are super important to get people "on-board" and to give them the feeling that "Airflow is cool and easy", so that when they become decision makers in 2)s they have an obvious winner.

2) are super important as they are the ones who bring and take most of the value from Airflow on multiple levels.

3) are super important to fuel the community - with investments, guiding the product direction, and giving the community the stability needed to develop the product.

That would be my addition to Vikram's statement, which I wholeheartedly agree with.

J.

On Fri, Jul 1, 2022 at 5:50 AM Vikram Koka <[email protected]> wrote:
>
> Hi everyone,
>
> As I have been looking through the recent AIPs, development features, and
> mailing list discussions, it struck me that we have effectively three
> different audiences here for Airflow.
>
> 1. Individuals and small teams using Airflow for their purpose,
> 2. Enterprises managing Airflow for large teams of data engineers and data
> scientists, and
> 3. Service providers making "Airflow as a service" available for many
> customers, either external or internal.
>
> Why does this even matter? Let me elaborate below:
>
> Clearly, a lot of "data practitioners", people who are primarily focused on
> creating pipelines and working with data are spread across all three
> audiences above.
> However, "Airflow administrators" i.e. people who are focused on running
> Airflow for data practitioners, especially at scale are primarily in the
> audiences (2) and (3) above.
> It is my observation that a lot of work being done right now in Airflow such
> as multi-tenancy (but not limited to it), is focused on Airflow
> administration.
> I am concerned that we are overwhelming our audience segment (1) with the
> work and configurations around running Airflow at scale.
>
>
> If this is true, I would like to propose that we segment our Airflow
> configurations, our packaging including our docs, and even our release notes
> to make it easier for our audience (1), who is almost certainly the largest
> block of our Airflow user community.
>
> I would like the opinions of the community on this topic.
>
> Best regards,
> Vikram
>
