> Sending frequent messages is a good idea. I think we can also (additionally) make airflow display a warning same as sqlite warning / sequential executor).
I think one difference between Sequential Executor and _PIP_ADDITIONAL_REQUIREMENTS. Sequential Executor is part of Airflow Core, and there is no appearance of _PIP_ADDITIONAL_REQUIREMENTS in Airflow Core, this only part of docker image. For implements this warning we need to make Airflow Core know about this environment variable, even if it os.environ.get("_PIP_ADDITIONAL_REQUIREMENTS", "") ---- Best Wishes *Andrey Anshin* On Tue, 29 Aug 2023 at 17:04, Jarek Potiuk <ja...@potiuk.com> wrote: > @Artira - yes. Sending frequent messages is a good idea. I think we can > also (additionally) make airflow display a warning same as sqlite warning / > sequential executor). > > (though I know for a fact a number of users are just conditioned to click > any warnings like that and ignore them). Only after we ask many questions > it turns out they are using Sequential Executor which we explicitly tell > them not to use and display both log and UI warnings about. Even if they > have 100% perfectly viable production-ready LocalExecutor with parallelism > = 1 even if they want to do something "like Sequential Executor". > > So I am worried if that will be enough :) > > J. > > On Tue, Aug 29, 2023 at 2:57 PM Jarek Potiuk <ja...@potiuk.com> wrote: > > > Yes. I contemplated removing it completely and we could straight away > fail > > an image that has it set and tell the user how to build the image and how > > to use the `--build` flag of docker-compose. > > > > That was my second-best option. > > > > J. > > > > > > On Tue, Aug 29, 2023 at 2:49 PM Maciej Obuchowski < > > obuchowski.mac...@gmail.com> wrote: > > > >> I feel like straight up disallowing the feature, or disabling it in > future > >> versions > >> and just strongly logging for now is the better option than failing > after > >> 10 minutes. > >> > >> Generally I think introducing "feature" that would work locally and on > dev > >> environment without noticeable problems - and failing on production is > >> wrong. > >> If I'm doing something wrong then either don't let me do that or just > tell > >> me to not do it. > >> > >> For all the pros, they hold also for removing > _PIP_ADDITIONAL_REQUIREMENTS > >> - so what's the rationale for still keeping that? > >> > >> wt., 29 sie 2023 o 14:30 Aritra Basu <aritrabasu1...@gmail.com> > >> napisał(a): > >> > >> > I haven't used this feature much yet, so not sure on the use case but > >> the > >> > 10 minutes seems a bit arbitrary to me considering I do often test out > >> dags > >> > that take longer than 10 mins to run? Would it perhaps make more sense > >> that > >> > if you launch using the variable on any failure it throws an error > >> saying > >> > you used this variable and that might cause this, retry without using > >> this > >> > and see if you still hit the issue? And perhaps send out warnings > every > >> X > >> > minutes/seconds saying you're using this feature which is not > >> recommended. > >> > I'm not sure how frequent warnings of that kind might discourage it's > >> usage > >> > in prod? > >> > > >> > -- > >> > Regards, > >> > Aritra Basu > >> > > >> > On Tue, Aug 29, 2023, 4:59 PM Andrey Anshin <andrey.ans...@taragol.is > > > >> > wrote: > >> > > >> > > Airflow Trial Mode ON :D > >> > > > >> > > > It is aggressive, yes. I wonder what you think. Is it too > >> aggressive? > >> > > > >> > > I think is not enough :D This one would be > >> > > > >> > > if [[ -n "${_PIP_ADDITIONAL_REQUIREMENTS=}" ]] ; then > >> > > echo "Longrid about best practices" > >> > > exit 42 > >> > > > >> > > In this case it is very easy to get a quick answer for an > >> > issue/discussion > >> > > like "My Airflow deployment does not start with exit code 42", > rather > >> > than > >> > > "My Airflow deployment constantly restarted". > >> > > > >> > > But I'm not sure if it is applicable and user-friendly. > >> > > > >> > > ---- > >> > > Best Wishes > >> > > *Andrey Anshin* > >> > > > >> > > > >> > > > >> > > On Tue, 29 Aug 2023 at 15:01, Jarek Potiuk <ja...@potiuk.com> > wrote: > >> > > > >> > > > Hello everyone, > >> > > > > >> > > > It happens more and more often (and seems that it is pretty > >> > > > widespread among "small" users of Airflow using Docker Compose and > >> > > official > >> > > > Airflow image that they are using _PIP_ADDITIONAL_REQUIREMENTS for > >> > > > production usage - even if building and hosting your modified > image > >> is > >> > as > >> > > > easy as copy & pasting few lines to a Dockerfile and running > `docker > >> > > build` > >> > > > following by `docker push. > >> > > > > >> > > > While I understand (and that's why we have it from the very > >> beginning) > >> > > why > >> > > > you would like to use it for local quick iteration and testing, > >> using > >> > it > >> > > > for anything close to production has a lot of negative > consequences > >> and > >> > > > leads to mysterious errors that take a lot of time for those users > >> who > >> > > are > >> > > > trying to investigating some dependency problems, slow start of > >> > > containers, > >> > > > race conditions and it makes them prone to security issues. > >> > > > > >> > > > I thought about whether we should remove the feature completely > (it > >> is > >> > > > clearly marked as "do not use it for anything in production - so > we > >> > > could). > >> > > > But I thought that a better solution could be to make the default > >> image > >> > > of > >> > > > ours simply unusable for production use if you use the flag. > >> > > > > >> > > > So I came up with this proposed PR: > >> > > > https://github.com/apache/airflow/pull/33879 > >> > > > > >> > > > It has a few protections against other side effects (such as > >> > accidentally > >> > > > downgrading or upgrading airflow by adding conflicting > requirements, > >> > and > >> > > > verification if the dependencies pass `pip check`) - but the gist > of > >> > it - > >> > > > the container will shut down automatically after 10 minutes if it > >> has > >> > > been > >> > > > started with _PIP_ADDITIONAL_REQUIREMENTS (and will print > >> instructions > >> > > what > >> > > > to do - how to build your image - to avoid it). And we have much > >> more > >> > > > comprehensive and detailed instructions on how to build your image > >> and > >> > > even > >> > > > how to integrate the "--build" docker-compose workflow to do the > >> same > >> > > > following the way docker-compose was designed. > >> > > > > >> > > > Yes. It is super aggressive. Yes it will break a number of user's > >> > > workflows > >> > > > (especially when they are using 3rd-party charts that encourage > >> using > >> > > > _PIP_ADDITIONAL_REQUIREMENTS) - part of the problem is that some > >> charts > >> > > out > >> > > > there are exposing this as a "production functionality" despite it > >> not > >> > > > being one. > >> > > > > >> > > > But IMHO - using the variable in production does more harm than > good > >> > and > >> > > it > >> > > > has very easy alternatives, which users don't use because they > >> choose > >> > > > shortcuts without realising the consequences - which is not the > way > >> you > >> > > > should treat a production environment. > >> > > > > >> > > > Also restarting the container after 10 minutes is pretty > consistent > >> > with > >> > > > the usage that _PIP_ADDITIONAL_REQUIRENENTS has been added for. > >> > > Restarting > >> > > > the container in a quick test/dependency iteration environment > >> > regularly > >> > > > could even be seen as a "feature" if you approach the "formal" > >> status > >> > of > >> > > > this variable. > >> > > > > >> > > > Also this has a bit of a "sneaky" way of tricking people into > >> > > > "good practices" learning. In order to avoid the restart you > really > >> > need > >> > > to > >> > > > .... build your own image ... So this has also an educational > >> aspect - > >> > > > people will have to learn about building the image anyway if they > >> will > >> > > want > >> > > > to learn how to "not build their images" and use something like > the > >> > > > variable again and it will be even more complex than just adding > the > >> > > > requirements to the custom image - so they will literally have to > >> learn > >> > > the > >> > > > "good practice". > >> > > > > >> > > > It is aggressive, yes. I wonder what you think. Is it too > >> aggressive? > >> > > > Should we do it? > >> > > > > >> > > > Maybe there are really good reasons why it's impossible to use > your > >> own > >> > > > custom images other than the extra step and annoyance of being > >> able to > >> > > > push your images to some - public (free) or private (internal) > >> > registry. > >> > > I > >> > > > can't imagine how someone deploying docker-compose or k8s does not > >> have > >> > > the > >> > > > way of having their own images available. Even if it means . that > >> you > >> > > have > >> > > > to go through some kind of infra of yours you generally should do > >> for > >> > > > production. > >> > > > > >> > > > Looking forward to your comments :D > >> > > > > >> > > > J. > >> > > > > >> > > > >> > > >> > > >