[
https://issues.apache.org/jira/browse/AIRFLOW-6823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039157#comment-17039157
]
Jarek Potiuk commented on AIRFLOW-6823:
---------------------------------------
That's a very simplified view: it applies mostly to production-ready Docker
images, but not to the CI image, which is meant to be development- and
CI-friendly. Many of the "best practices" for a production Docker image do not
apply to a CI/development one.
For one, we have other software installed and configured in the image that we
do not mount (minikube and a number of other tools that are installed and
configured during the Docker build). That includes (but is not limited to) the
airflow binary, which is not mounted from the sources but built during the
Docker build. If you build it as the airflow user and do not make it globally
available to all users, it will not be executable for the user you pass with
--user when you enter the container.
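A minimal sketch of the underlying permission problem (all paths and names here
are illustrative, not Airflow's actual image layout): tools installed per-user
default to owner-only permissions, which is exactly what an arbitrary --user
uid runs into.

```shell
# Sketch (paths are illustrative): a per-user installed tool defaults to
# owner-only permissions, so any other uid cannot execute it.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/home/airflow/.local/bin"
printf '#!/bin/sh\necho "airflow"\n' > "$tmp/home/airflow/.local/bin/airflow"
chmod 700 "$tmp/home/airflow/.local/bin/airflow"   # rwx for the owner only
# Any other uid gets "Permission denied" here; making it work would need
# o+rx on the file AND on every directory on the path leading to it.
ls -l "$tmp/home/airflow/.local/bin/airflow"
rm -rf "$tmp"
```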
We install a number of those manually in the context of the user the Dockerfile
is built with, and until we remove those dependencies I would not like to
change that and make them available to all users (including the --user one).
There are quite a number of problems involved in installing a "development"
version of Airflow, as well as a number of dependencies
(minicluster/hadoop/hive/kubectl/gcloud/aws etc.). Those tools are usually
intended (in the versions we use) for development mode, i.e. for the current
user rather than for all users on the machine, and by default they come with
group/other access disabled. Rather than fixing those one by one to be
accessible to other users, I prefer to run them as the same user they were
installed with.
An Airflow installation in development mode ('-e') is never intended to be used
by a user other than the one it was installed with. You can certainly install
Airflow in production mode from PyPI for all users, but installing it in
"development" mode for all users is quite a bit more complex due to
permissions. We also simply cannot mount the whole "airflow sources" folder,
because on Mac that has very bad side effects which I experienced in the past:
all the locally built .egg-info directories (and the other folders and files
created in the Airflow sources) leak into the container. For example, if the
.egg-info folders were built locally for your local virtual environment, they
are mapped into the container and cause all sorts of compatibility problems
when you try to use them from inside it. You would literally have to remove
all the .egg-info directories manually every time you enter the container and
re-install Airflow in -e development mode again.
Mapping all the generated files back to the host also makes changes made
inside Docker leak back to the host. For example, if you have Python 2 locally
and Python 3 in the container, the whole thing breaks in a number of
unexpected ways, at very strange moments, and without explainable error
messages.
This means that we have to live with (for example) .egg-info directories
generated during the Docker build rather than mounted from the host. In this
case, if we changed the user while entering the container, we would have to
either change ownership of those .egg-info files to the new user or make those
folders other-writable (I am not sure which would be needed). They also have
to be group-writable to allow, for example, installing new dependencies. And
there are quite a few other similar folders.
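As a rough sketch of what that permission widening means per folder (the
directory name is illustrative), the group bits alone look like this; truly
arbitrary --user uids would need the "other" bits opened as well:

```shell
# Sketch: opening one generated folder to the group (name is illustrative).
# g+rwX gives the group read/write plus traverse on directories; arbitrary
# --user uids would additionally need o+rwX.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/apache_airflow.egg-info"
chmod 700 "$tmp/apache_airflow.egg-info"         # typical owner-only default
chmod -R g+rwX "$tmp/apache_airflow.egg-info"    # now group-writable too
stat -c '%A' "$tmp/apache_airflow.egg-info"      # drwxrwx---
rm -rf "$tmp"
```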
The alternative is to find and modify, during the Docker build, all the
scripts, folders, etc. that are not executable/readable by "others" and make
them so, and there are quite a few of those. That is something I would rather
not do, because you never know whether you have missed this or that
permission. And there are other problems I have already forgotten about.
If you would like to pursue it, I am happy to review it and help with
debugging. But I gave up trying to do that, and simply prefer to fix root
permissions for the generated files and run Airflow/tests inside the container
as root. It is literally one find + chmod command to execute at the start of
the container, rather than trying to fix all the problems above. That sounds
like child's play by comparison.
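The "one find + chmod" fix could look roughly like this (a sketch under
assumptions; the path and the exact mode bits are illustrative, not what the
Breeze scripts actually run):

```shell
# Sketch (path is illustrative): after running as root, re-open everything
# root created so the host user can still read/write it. X sets the execute
# bit only on directories and already-executable files.
find /opt/airflow -user root \( -type d -o -type f \) -exec chmod og+rwX {} +
```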
> Breeze broken on current master on Linux
> ----------------------------------------
>
> Key: AIRFLOW-6823
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6823
> Project: Apache Airflow
> Issue Type: Bug
> Components: breeze
> Affects Versions: 2.0.0
> Reporter: Aleksander Nitecki
> Assignee: Ash Berlin-Taylor
> Priority: Minor
> Attachments: Breeze on current master log.txt
>
>
> See the attachment for log.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)