potiuk commented on issue #4938: [AIRFLOW-4117] Multi-staging Image - Travis CI 
tests [Step 3/3]
URL: https://github.com/apache/airflow/pull/4938#issuecomment-507790924
 
 
   The case of the root user: explained above (and also in a few resolved 
comments above): "For one, in the new environment everything runs as root. This 
is another fix (along the way) because of the way local sources are mounted on 
Linux when you try to run tests locally. The current way, where sources are 
mounted and the airflow user is used, works on Mac, but when you try to run it 
on Linux with docker it breaks in cases of generated npm code etc. In the new 
environment, starting docker with mounted local sources also works on Linux."
   
   Longer explanation: 
   
   The goal of the CI image (and the upcoming breeze environment) was to make 
an easily reproducible and manageable Travis CI-like environment using 
docker-compose. Ideally, when you run docker-compose locally with -v <your 
sources>:/opt/airflow, you should get the same experience as when you run 
Travis CI, but with locally modified code mounted from the host. This is how 
run_tests and docker-compose were implemented in the original environment. And 
it's a great developer experience.
   
   @ashb, you use macOS for development, so you get two things with it: 
   * shared volumes are slow (try npm build in docker vs. npm on the host and 
you will see the difference) 
   * you have no problems with ownership of files on mapped volumes because it 
is handled by osxfs - changes of ownership do not propagate to macOS (see 
below)
   
   People using Linux have a different shared-volume experience (been there, 
done that):
   * shared volumes are super-fast (pretty much native speed) 
   * by default users are not mapped, so permissions and user/group ownership 
(UID/GID) are the same on the host and in the Docker container. Re-mapping 
users between host and Docker requires some daemon-level modifications and is 
system-wide rather than per-docker-run (see below - I provided some sources). 
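
The UID/GID point above can be checked directly. The sketch below is only an illustration: `owner_ids` is a hypothetical helper name, and it assumes GNU `stat` (the `-c` format option) as found on typical Linux hosts and images. Running it on a source directory on the host, and again on the same directory mounted inside a container, should print the same numeric pair when users are not remapped.

```shell
# Hypothetical helper (not part of Airflow or Breeze): print the numeric
# owner and group of a path as UID:GID. Assumes GNU stat (-c option).
owner_ids() {
    stat -c '%u:%g' "$1"
}

# Usage (path is an example): owner_ids /opt/airflow
```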
   
   The latter means that if Airflow runs inside the Docker container as the 
"airflow" user (UID 501 - I think), it has no access to the mounted volumes - 
unless the permissions for all airflow sources are set to o+rw(x). The effect 
is that the files get mounted into the Docker container but the "airflow" user 
has no access to them. I tested it on a plain Ubuntu desktop and that is how it 
behaves with the old environment. It does not behave like that in the 
Travis/old CI environment because of this line in the old `run_ci.sh` script, 
which is the first thing that happens when you enter the environment:
   ```
   sudo chown -R airflow.airflow . $HOME/.cache $HOME/.wheelhouse/ $HOME/.cache/pip
   ```
   
   And that's fine for using the same docker-compose on Mac because changing 
ownership does not propagate to the desktop user on Mac. But it propagates (or 
actually is simply natively changed) to the Linux desktop user. This means that 
after running the `run_ci.sh` script in the Docker container you end up with 
all files owned by "501:501" on the host (I believe), because this is the 
airflow user ID in Docker. 
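
To see whether a source tree on a Linux host was affected by an ownership change like the one described above, a quick check along these lines may help. This is a minimal sketch; `not_owned_by_me` is a hypothetical helper name, not something from the Airflow scripts.

```shell
# Hypothetical helper: list everything under a directory that is not owned
# by the user running this script (compared by numeric UID).
not_owned_by_me() {
    find "$1" ! -user "$(id -u)"
}

# Usage (path is an example): not_owned_by_me ~/code/airflow
# Prints nothing when every file and directory is owned by you.
```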
   
   If you are a Lucky(TM) Linux Developer and you have the same user ID 501 on 
the host, nothing changes for you. But it is really distro-dependent and not 
guaranteed in any way, so in the end a lot of people will end up with their 
airflow sources owned by another, or even non-existing, user (if they try to 
use the docker-compose environment) just after entering the docker-compose 
environment. This is hardly a nice development experience.
   
   If we want to reach the developer base that has Linux desktops (I bet we 
do), and give them an environment that easily reproduces the Travis CI one, 
then we have to make sure it works seamlessly for any developer with a Linux 
workstation, not only Macs.
   
   So far the only way I found that works (in a few companies across 2 years) 
is to make the applications in Docker run as root. Root has access to all files 
no matter who owns them and can create new files (for example .pyc) as needed. 
The side effect is that files created by the container in the airflow sources 
are owned by root on the host as well. This is another thing we have to deal 
with, but in the breeze environment I solved it by simply cleaning up generated 
files in docker, and there is an easy way to delete them when you need to 
(usually when you switch branches and some directories change). This is one of 
the features of Breeze that helps with a case that is otherwise difficult to 
even understand if you are not aware of Docker internals.
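
The cleanup mentioned above could look roughly like the sketch below. It is not Breeze's actual implementation: `clean_generated` is a hypothetical helper, and limiting it to Python bytecode (`*.pyc` files and `__pycache__` directories) is an assumption made for illustration. For directories that were created by root, it would likely have to run as root too (for example inside the container).

```shell
# Hypothetical helper: remove generated Python bytecode under a source tree.
clean_generated() {
    src_dir="$1"
    # Remove whole bytecode cache directories without descending into them.
    find "$src_dir" -type d -name '__pycache__' -prune -exec rm -rf {} +
    # Remove any stray .pyc files left next to the sources.
    find "$src_dir" -type f -name '*.pyc' -exec rm -f {} +
}

# Usage (path is an example): clean_generated /opt/airflow
```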
   
   Some additional sources: https://docs.docker.com/docker-for-mac/osxfs/ 
describes how file sharing works on Mac, including how permissions and 
ownership are shared.
   
   Here is also the discussion on how you can achieve user mapping on Linux. 
It requires either daemon modifications for docker or changing the ownership 
after you enter the environment (but then the ownership also changes on the 
Linux host, so it's not really good for the desktop case and seamless 
sharing): https://github.com/moby/moby/issues/22258 
   
   I hope this explains it in detail :)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
