kaxil commented on pull request #19637: URL: https://github.com/apache/airflow/pull/19637#issuecomment-972130383
> Hmm. Maybe I am a little overcautious here, but it comes from the experience of responding and trying to help many of our users. I think it really boils down to what we want to optimize for. > > My current optimisation "goal" is to decrease the number of problems our users even approach us with. So if there is a problem we can anticipate and solve for our users without them even knowing it, I am all for that. Adding more and more escape hatches like that is going quite against that "optimisation" goal, because we admit that there is a problem and we try to "throw the problems at our users to solve" - and this (from the experience) will hit us back. > > This was the case with the warning for "removed" task instance entries. I know why we added it but IMHO it exposed an internal detail of airlfow to our users, and this kinda 'undermines' the feeling that airlfow is "reliable" and "safe to upgrade" - users who just want to install airflow without knowing how it works intenrnally should be first-class citizens as well, and those kind of "escape hatches" just leaks the "airlfow" abstraction. I think sometimes we tend to forget that our Ops (even admins) are not "developers" or even no "sql admins" should not eventually even know that there is a database behind. This is really an implementation detail for many. > > But If I am the only one to be worried, I am ok with the flag (I just hope in the next few months we will not have to handle a floo od of issues upgrading from say 2.0.2). I am "-1" for the same reasons that I don't want our users to be to have to "unnecessarily" wait and instead of actually making sure that we don't break serialization we are cleaning the serialized_dags. And this comes from years of experience and helping users and companies as well. >Adding more and more escape hatches like that is going quite against that "optimisation" goal, because we admit that there is a problem and we try to "throw the problems at our users to solve" - and this (from the experience) will hit us back. That's exactly why we should not just clean the table and can still achieve what we need without comprising the cost of reserializing the table. And it is not "throw the problems at our users to solve" , it is if we goof up for whatever reason we have a "backup" that helps users without compromising it for the versions that don't have it. It is a different way of looking at the coin !! > This was the case with the warning for "removed" task instance entries. I know why we added it but IMHO it exposed an internal detail of airlfow to our users, and this kinda 'undermines' the feeling that airlfow is "reliable" and "safe to upgrade" That is the reason we adopted the solution I suggested to move things to a temporary table then ask them to cleanup dangling TIs where they didn't even know what they need to run. And it is not "one size fits all" problem -- there are users who don't care about those but there are corporate, banks and others who have policies to retain data for a certain period, so they can decide if they want that data or not. >and this kinda 'undermines' the feeling that airlfow is "reliable" and "safe to upgrade" - users who just want to install airflow without knowing how it works intenrnally should be first-class citizens as well, and those kind of "escape hatches" just leaks the "airlfow" abstraction Again, there are two kinds of users and it is not one size fits all. For "new users" / "new installs" neither is the problem -- nor the TI issue or the dag serialization issue. It is for the users during the "upgrade" -- the main benefit of open source is transparency (and this is my view). To debug why a DAG is not working, you _might_ need to look at parsing logs and the serialized representation of the DAG -- so it is not leak of abstration but understanding of a core process. >I think sometimes we tend to forget that our Ops (even admins) are not "developers" or even no "sql admins" should not eventually even know that there is a database behind. This is really an implementation detail for many. And we should not forget that we in trying to optimize for new users we don't forget the users and companies running Airflow for years who want to optimize it and they need tools from us. I would love to make it as easy to upgrade but without compromising, the solution you mentions only achieves one thing which is why I am trying to find the best of both worlds as we have both type of users. >But If I am the only one to be worried, I am ok with the flag I will be very happy to hear other solutions that fixes it for both type of users and I have no strong opinion on the flag, I have proposed a separate sub-command solution to similar to https://github.com/apache/airflow/pull/19471 . > (I just hope in the next few months we will not have to handle a flood of issues upgrading from say 2.0.2). Do you think of all the issues that we are flooded with, this one issue will have more? I understand what you want to solve and why and I do appreciate it but I think we should not take this shortcut. However this is my personal opinion only :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
