potiuk commented on PR #25780: URL: https://github.com/apache/airflow/pull/25780#issuecomment-1234236371
> Yeah it will work, I'm just concerned about "encouraging" users to create `just one single image with multiple predefined envs`. Sometimes users create chaotic stack design because something works out of the box :) I am not sure if we want to do it to be honest.

I do not think we should encourage it at all (we should present it as an option, and we do), because everyone's mileage is different. I spoke to a few users of Airflow (my customers) and, IMHO, it really depends on what stage you are at and how much experience you have.

1) Some customers do not have many "system" dependency problems and are not "super" excited about infrastructure and build pipelines for it. They will likely prefer a single image with multiple venvs. (Actually, the idea of this operator came precisely from that discussion. I met them yesterday and they cannot wait for 2.4 to be available, because it solves a huge problem they have with multiple teams in a simple way.)

2) But there are more sophisticated customers who have multiple separate teams with multiple complex requirements (including system dependencies). There, only multiple container images will cut it. An extreme case: think GPU and ARM support for one team, crossed with having to install Python 2.7 on an old Debian because this is the only environment where all dependencies will be installable (as much as we would not like it, Python 2.7 is apparently still popular in the gaming industry: Unity added Python bindings a few years ago with Python 2.7 only https://docs.unity3d.com/Packages/[email protected]/manual/installation.html and it is still 2.7 only !!! :scream: ). This is of course extreme, but you get the idea.

So rather than advising the user to choose one over the other, I chose a different route, similar to the "installation" page (https://airflow.apache.org/docs/apache-airflow/stable/installation/index.html). If you look at the "best practices" chapter in my PR, I simply describe all the options and explain the pros, cons and consequences of each approach. This description is targeted precisely at the users who will ask us "which is the best approach". Since, IMHO, we cannot answer this question authoritatively, and we do not want to engage in long discussions with each user to figure out which option is best for them (this does not scale), we will simply send the user to that page, which they can read and then decide on their own. We simply cannot make the decisions for them, but we give them all the information in a way that lets them make the decision themselves.

I tried to make this "best practices" chapter unbiased, factual and very precise in describing the pros and cons of each approach. The approaches are grouped in one chapter, progressing from the simplest (PythonVirtualenv) to the most complex and involved (Kubernetes). I think this is the best we can do.
