[
https://issues.apache.org/jira/browse/AIRFLOW-6823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039157#comment-17039157
]
Jarek Potiuk commented on AIRFLOW-6823:
---------------------------------------
That's a very simplified view: it applies mostly to production-ready Docker
images, but not to the CI image, which is meant to be development- and
CI-friendly. Many of the "best practices" for a production Docker image do not
apply to a CI/development one.
For one, we have other software installed and configured in the image that we
do not mount (minikube and a number of other tools that are installed and
configured during the Docker build). That includes (but is not limited to) the
airflow binary, which is not mounted from the sources but built during the
Docker build. If you build it as the airflow user and do not make it globally
available to all users, it will not be executable for the user you pass with
--user when you enter the container.
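A minimal sketch of the underlying permission problem (all paths and names here
are illustrative, not Airflow's actual image layout): tools installed per-user
default to owner-only permissions, which is exactly what an arbitrary --user
uid runs into.

```shell
# Sketch (paths are illustrative): a per-user installed tool defaults to
# owner-only permissions, so any other uid cannot execute it.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/home/airflow/.local/bin"
printf '#!/bin/sh\necho "airflow"\n' > "$tmp/home/airflow/.local/bin/airflow"
chmod 700 "$tmp/home/airflow/.local/bin/airflow"   # rwx for the owner only
# Any other uid gets "Permission denied" here; making it work would need
# o+rx on the file AND on every directory on the path leading to it.
ls -l "$tmp/home/airflow/.local/bin/airflow"
rm -rf "$tmp"
```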
We install a number of those manually in the context of the user the Dockerfile
is built with, and until we remove those dependencies I would not like to
change that and make them available to all users (including the --user one).
There are quite a number of problems involved in installing a "development"
version of Airflow, as well as a number of dependencies
(minicluster/hadoop/hive/kubectl/gcloud/aws etc.). Those tools are usually
intended (in the versions we use) for development mode, i.e. for the current
user rather than for all users on the machine, and by default they come with
group/other access disabled. Rather than fixing those one by one to be
accessible to other users, I prefer to run them as the same user they were
installed with.
An Airflow installation in development mode ('-e') is never intended to be used
by a user other than the one it was installed with. You can certainly install
Airflow in production mode from PyPI for all users, but installing it in
"development" mode for all users is quite a bit more complex due to
permissions. We also simply cannot mount the whole "airflow sources" folder,
because on Mac that has very bad side effects which I experienced in the past:
all the locally built .egg-info directories (and the other folders and files
created in the Airflow sources) leak into the container. For example, if the
.egg-info folders were built locally for your local virtual environment, they
are mapped into the container and cause all sorts of compatibility problems
when you try to use them from inside it. You would literally have to remove
all the .egg-info directories manually every time you enter the container and
re-install Airflow in -e development mode again.
Mapping all the generated files back to the host also makes changes made
inside Docker leak back to the host. For example, if you have Python 2 locally
and Python 3 in the container, the whole thing breaks in a number of
unexpected ways, at very strange moments, and without explainable error
messages.
This means that we have to live with (for example) .egg-info directories
generated during the Docker build rather than mounted from the host. In this
case, if we changed the user while entering the container, we would have to
either change ownership of those .egg-info files to the new user or make those
folders other-writable (I am not sure which would be needed). They also have
to be group-writable to allow, for example, installing new dependencies. And
there are quite a few other similar folders.
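As a rough sketch of what that permission widening means per folder (the
directory name is illustrative), the group bits alone look like this; truly
arbitrary --user uids would need the "other" bits opened as well:

```shell
# Sketch: opening one generated folder to the group (name is illustrative).
# g+rwX gives the group read/write plus traverse on directories; arbitrary
# --user uids would additionally need o+rwX.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/apache_airflow.egg-info"
chmod 700 "$tmp/apache_airflow.egg-info"         # typical owner-only default
chmod -R g+rwX "$tmp/apache_airflow.egg-info"    # now group-writable too
stat -c '%A' "$tmp/apache_airflow.egg-info"      # drwxrwx---
rm -rf "$tmp"
```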
The alternative is to find and modify, during the Docker build, all the
scripts, folders, etc. that are not executable/readable by "others" and make
them so, and there are quite a few of those. That is something I would rather
not do, because you never know whether you have missed this or that
permission. And there are other problems I have already forgotten about.
If you would like to pursue it, I am happy to review it and help with
debugging. But I gave up trying to do that, and simply prefer to fix root
permissions for the generated files and run Airflow/tests inside the container
as root. It is literally one find + chmod command to execute at the start of
the container, rather than trying to fix all the problems above. That sounds
like child's play by comparison.
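The "one find + chmod" fix could look roughly like this (a sketch under
assumptions; the path and the exact mode bits are illustrative, not what the
Breeze scripts actually run):

```shell
# Sketch (path is illustrative): after running as root, re-open everything
# root created so the host user can still read/write it. X sets the execute
# bit only on directories and already-executable files.
find /opt/airflow -user root \( -type d -o -type f \) -exec chmod og+rwX {} +
```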
> Breeze broken on current master on Linux
> ----------------------------------------
>
> Key: AIRFLOW-6823
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6823
> Project: Apache Airflow
> Issue Type: Bug
> Components: breeze
> Affects Versions: 2.0.0
> Reporter: Aleksander Nitecki
> Assignee: Ash Berlin-Taylor
> Priority: Minor
> Attachments: Breeze on current master log.txt
>
>
> See the attachment for log.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)