potiuk commented on a change in pull request #4937: [AIRFLOW-4116] 
Multi-staging includes CI image [Step 2/3]
URL: https://github.com/apache/airflow/pull/4937#discussion_r268895206
 
 

 ##########
 File path: Dockerfile
 ##########
 @@ -85,14 +85,134 @@ RUN adduser airflow \
     && echo "airflow ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/airflow \
     && chmod 0440 /etc/sudoers.d/airflow
 
+############################################################################################################
+# This is an image with all APT dependencies needed by CI. It is built on top of the airflow APT image
+# Parameters:
+#     airflow-apt-deps - this is the base image for CI deps image.
+############################################################################################################
+FROM airflow-apt-deps as airflow-ci-apt-deps
+
+SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]
+
+ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
+
+ARG APT_DEPS_IMAGE
+ENV APT_DEPS_IMAGE=${APT_DEPS_IMAGE}
+
+RUN echo "${APT_DEPS_IMAGE}"
+
+# Note the ifs below might be removed if Buildkit will become usable. It should skip building this
+# image automatically if it is not used. For now we still go through all layers below but they are empty
+# Note missing directories on debian-stretch https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=863199
+RUN if [[ "${APT_DEPS_IMAGE}" == "airflow-ci-apt-deps" ]]; then \
+        mkdir -pv /usr/share/man/man1 \
+        && mkdir -pv /usr/share/man/man7 \
+        && apt-get update \
+        && apt-get install --no-install-recommends -y \
+          lsb-release gnupg dirmngr openjdk-8-jdk \
+          vim tmux less unzip net-tools netcat \
+          ldap-utils postgresql-client sqlite3 \
+          krb5-user openssh-client openssh-server \
+          python-selinux \
+        && apt-get autoremove -yqq --purge \
+        && apt-get clean \
+        && rm -rf /var/lib/apt/lists/* \
+        ;\
+    fi
+
+RUN if [[ "${APT_DEPS_IMAGE}" == "airflow-ci-apt-deps" ]]; then \
+        KEY="A4A9406876FCBD3C456770C88C718D3B5072E1F5" \
+        && GNUPGHOME="$(mktemp -d)" \
+        && export GNUPGHOME \
+        && for KEYSERVER in $(shuf -e \
+                ha.pool.sks-keyservers.net \
+                hkp://p80.pool.sks-keyservers.net:80 \
+                keyserver.ubuntu.com \
+                hkp://keyserver.ubuntu.com:80 \
+                pgp.mit.edu) ; do \
+              gpg --keyserver "${KEYSERVER}" --recv-keys "${KEY}" && break || true ; \
+           done \
+        && gpg --export "${KEY}" > /etc/apt/trusted.gpg.d/mysql.gpg \
+        && gpgconf --kill all \
+        rm -rf "${GNUPGHOME}"; \
+        apt-key list > /dev/null \
+        && echo "deb http://repo.mysql.com/apt/ubuntu/ trusty mysql-5.7" | \
+            tee -a /etc/apt/sources.list.d/mysql.list \
+        && apt-get update \
+        && MYSQL_PASS="secret" \
+        && debconf-set-selections <<< \
+            "mysql-community-server mysql-community-server/data-dir select ''" \
+        && debconf-set-selections <<< \
+            "mysql-community-server mysql-community-server/root-pass password ${MYSQL_PASS}" \
+        && debconf-set-selections <<< \
+            "mysql-community-server mysql-community-server/re-root-pass password ${MYSQL_PASS}" \
 
 Review comment:
   I am not 100% sure now, but I believe I struggled with similar problems when 
implementing CloudSQL operators (with cloudsqlproxy). I spent quite some time 
trying to find the best way to install the mysql client on debian-stretch, so 
this might be the result of my struggles and trial and error back then, and 
once I clean it up, it might not be needed. 
   
   I might find another, simpler solution - like not using the root user for 
the mysql db at all, or enforcing the TCP protocol. 
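
   A minimal sketch of what the "enforce TCP, avoid root" idea could look like. 
It is only an illustration under assumptions: the `airflow` database user, the 
`run_query_over_tcp` helper and the `MYSQL_BIN` override are hypothetical names 
that do not come from the PR, the user is assumed to already exist with rights 
on the airflow database, and credentials are left out on purpose. The only 
client option relied on is `--protocol=TCP`.

```shell
#!/bin/sh
set -eu

# Run one SQL statement through the mysql client, always over TCP.
# MYSQL_BIN is a hypothetical override so the call can be dry-run with `echo`.
run_query_over_tcp() {
    "${MYSQL_BIN:-mysql}" --protocol=TCP --host="$1" --user="$2" --execute="$3"
}

# Dry run: print the arguments instead of contacting a server.
MYSQL_BIN=echo
run_query_over_tcp "${MYSQL_HOST:-mysql}" airflow \
    'drop database if exists airflow; create database airflow'
```

   Dropping the `MYSQL_BIN=echo` line would make the helper call the real 
client; whether a non-root user is enough depends on the grants set up on the 
server side.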
   
   Sorry for being so verbose, but I am venting my frustrations with mysql 
connectivity issues :). For some people it might even be interesting :). 
   
   
   OK. Beginning of rant on MySQL.
   
   The problem is that the original scripts from ci_script use a passwordless, 
remote root connection to create the airflow database, and I probably want to 
change that. This is the most probable reason why the Travis tests fail now - 
some authentication option somewhere, on either the client or the server, 
prevents the query below from running. We currently get "Host mysql does not 
exist", but really this is a generic error raised for any authentication 
problem. Yeah - security first - better to pretend that you are not there than 
to admit that the password is wrong.
   
   Here is the offending query:
   
       mysql -h ${MYSQL_HOST} -u root -e 'drop database if exists airflow; create database airflow'
   
   The main reason is that MySQL 5.7 changed the security model: now MySQL root 
login requires sudo. By default you cannot log in as the root user unless you 
have sudo. And that's where the journey starts.
   
   For mysql, "root only with sudo" is checked and enforced in both the client 
and the server, and both check it using server configuration variables (Yay!). 
The problem is that by default the root user is only allowed to log in using 
UNIX sockets - which allow only local connectivity and are the only way to 
check the user's privileges/sudo. Imagine trying to connect via cloudsqlproxy, 
which can forward a UNIX socket ... but then loses the sudo check as we are on 
a different machine (yeah!).
   
   There is also more trickery connected with how mysql treats localhost and 
127.0.0.1 differently, and possibly with how the local docker hostname is 
treated in the docker network created by docker compose. There are multiple 
threads on StackOverflow and AskUbuntu, some with several hundred upvotes (like 
this one: 
https://askubuntu.com/questions/766334/cant-login-as-mysql-user-root-from-normal-user-account-in-ubuntu-16-04).
This one is my favourite (almost 500 upvotes): 
https://stackoverflow.com/questions/7739645/install-mysql-on-ubuntu-without-a-password-prompt
There are many ways you can set this up, and it differs between versions of 
mysql. If you look at this closely, most of the suggested ways are really a 
catch-22: you need to connect to a running database and run a query to change 
the root user configuration before you can connect via TCP. If you want to run 
it remotely on a Dockerised database - you are basically out of luck :).
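
   To make the localhost vs 127.0.0.1 difference concrete, here is a toy model 
of the documented Unix behaviour of the mysql client: a host of `localhost` (or 
no host at all) means the UNIX socket, any other host means TCP, and an 
explicit `--protocol` wins over both. The `mysql_transport` function is a 
hypothetical helper written for this illustration, not part of any mysql 
tooling, and it ignores option files, Windows and named pipes.

```shell
#!/bin/sh
set -eu

# Report which transport the stock mysql client picks on Unix for a given
# --host value and an optional --protocol value (empty means "not given").
mysql_transport() {
    host="${1:-localhost}"
    protocol="${2:-}"
    case "${protocol}" in
        TCP|tcp)       echo "tcp"; return ;;
        SOCKET|socket) echo "socket"; return ;;
    esac
    if [ "${host}" = "localhost" ]; then
        echo "socket"
    else
        echo "tcp"
    fi
}

# Compare a few host values, then localhost with the protocol forced to TCP.
for host in localhost 127.0.0.1 mysql; do
    printf '%-28s -> %s\n' "${host}" "$(mysql_transport "${host}")"
done
printf '%-28s -> %s\n' "localhost --protocol=TCP" "$(mysql_transport localhost TCP)"
```

   This only covers the client's choice of transport; which accounts the 
server accepts on that transport is a separate question.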
   
   MySQL connectivity for root is a big ball of mud really.
   
   And yes - this is for the server, but it requires some trickery (setting the 
server variables) to change this default behaviour on the client side as well 
as on the server. If you don't set those parameters, the client will always try 
to use UNIX socket authentication for the root user, no matter what you specify 
as the host. Setting the server parameters is the trick that makes the client 
side allow TCP communication for the root user and gets the client installed 
without asking for the root password (wait what? yeah, I know).
   
   Just for curiosity - yet another interesting problem I had initially when I 
used wheels. I created the wheel packages in a different image (based on the 
same python image), which used mariadb (the default) rather than mysql, which I 
had manually installed only in the CI image. That was the first problem, and 
the result was rather strange: the wheel packages (mysql db api) were 
compiled/linked against mariadb.so, while the CI image had only the mysql 
libraries (the mariadb one had been removed), so the connection attempts failed 
of course. This last problem is now gone, as it turned out that wheel packages 
give marginal improvements and complicate things a lot.
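
   For anyone hitting the same mismatch, one way to spot it is to look at what 
the compiled module is linked against. The sketch below filters `ldd` output 
for the two client library families. `client_lib_family` is a hypothetical 
helper, the substring patterns are an assumption about how the shared 
libraries are named, and the module path in the comment is only a placeholder.

```shell
#!/bin/sh
set -eu

# Read `ldd` output on stdin and name the client library family it mentions.
client_lib_family() {
    libs="$(cat)"
    case "${libs}" in
        *libmariadb*)     echo "mariadb" ;;
        *libmysqlclient*) echo "mysql" ;;
        *)                echo "none" ;;
    esac
}

# Hypothetical usage against a compiled DB API module:
#   ldd /path/to/compiled/module.so | client_lib_family
```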

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
