potiuk commented on a change in pull request #19189:
URL: https://github.com/apache/airflow/pull/19189#discussion_r737947221
##########
File path: Dockerfile
##########
@@ -441,25 +443,53 @@ RUN chmod a+x /scripts/docker/install_mysql.sh && \
find "${AIRFLOW_HOME}" -executable -print0 | xargs --null chmod g+x && \
find "${AIRFLOW_HOME}" -print0 | xargs --null chmod g+rw
-COPY --chown=airflow:root --from=airflow-build-image /root/.local "${AIRFLOW_USER_HOME_DIR}/.local"
+COPY --chown=airflow:root --from=airflow-build-image /.venv /.venv
COPY --chown=airflow:root scripts/in_container/prod/entrypoint_prod.sh /entrypoint
COPY --chown=airflow:root scripts/in_container/prod/clean-logs.sh /clean-logs
# Make /etc/passwd root-group-writeable so that user can be dynamically added by OpenShift
# See https://github.com/apache/airflow/issues/9248
RUN chmod a+x /entrypoint /clean-logs && \
- chmod g=u /etc/passwd && \
- bash /scripts/docker/install_pip_version.sh
+ chmod g=u /etc/passwd
+
+# Set default groups for airflow and root user
+RUN usermod -g 0 airflow -G 0
+
+# make sure that the venv is activated for all users
+# interactive, non-interactive shells and plain sudo, sudo with --interactive flag
+RUN echo ". /.venv/bin/activate" >> "${AIRFLOW_USER_HOME_DIR}/.bashrc" && \
+ echo ". /.venv/bin/activate" >> /root/.bashrc && \
+    sed --in-place=.bak "s/secure_path=\"/secure_path=\"\/.venv\/bin:/" /etc/sudoers
Review comment:
I believe it's not enough.
See the comment in this discussion:
https://github.com/pypa/pip/issues/10556#issuecomment-950304351
@mik-laj is trying to convince them to change the way they create the
executable `dbt` script:
https://github.com/dbt-labs/dbt-core/issues/4035#issuecomment-941674931.
This might actually "work" in the `dbt` case, if the `dbt` binary script
really runs the Python interpreter from the PATH, but I am not sure this is
always the case for all the tools we might have installed there.
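To illustrate the PATH-dependence mentioned above, here is a hedged sketch (the tool name and paths are made up for illustration, not taken from `dbt` itself) of a wrapper script that resolves its interpreter from the caller's PATH:

```shell
# Hypothetical wrapper mimicking a tool whose shebang relies on PATH
# resolution; /tmp/demo-tool is an illustrative path only.
cat > /tmp/demo-tool <<'EOF'
#!/usr/bin/env python3
import sys
print(sys.executable)  # which interpreter actually ran the script?
EOF
chmod +x /tmp/demo-tool

# Without "source /.venv/bin/activate", /usr/bin/env resolves "python3"
# from the caller's PATH -- typically the system interpreter, so imports
# of packages installed only in the venv would fail at this point.
/tmp/demo-tool
```

Whether such a tool works without activation therefore depends entirely on what happens to be first on the PATH when it is invoked.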
Kamil explained it very well in the issue - this is something we also used
to have in Airflow 1.10. The side effect of this approach (very popular among
package creators) is that it requires full activation of the virtual
environment. If you execute the binary script via its absolute path
without activating the environment, it will fail. Unfortunately, the
explanation in https://docs.python.org/3/library/venv.html only says that you
do not have to activate the virtualenv to run the **python interpreter**
directly from its path; it does not mean that any other package installed
there can also be run without activation.
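As a concrete contrast to the interpreter-only guarantee above, a minimal sketch (the venv path is illustrative) showing the shebang that pip writes into a generated console script:

```shell
# Create a throwaway venv and inspect the shebang of a console script
# generated inside it (pip itself). pip hard-codes the absolute path of
# the venv's interpreter, so *this* particular script happens to work
# without activation -- but that is a property of how the script was
# generated, not a guarantee that holds for every installed tool.
python3 -m venv /tmp/demo-venv
head -1 /tmp/demo-venv/bin/pip
```

Tools that instead resolve their interpreter via `#!/usr/bin/env python` (or spawn subprocesses from the PATH) get no such guarantee, which is exactly the case the venv docs do not cover.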
I am not sure if there are other cases - I believe it might or might not
work, depending on how the "binary scripts" are written. They are usually
only tested with an "activated" virtualenv, so there can be some very nasty
surprises.
We have no control over how these tools implement their binary scripts -
and many of those tools are already used in providers or simply often used
with Airflow.
That's why I think just adding `bin` to the PATH is not enough. And that's
one of the main reasons why "switch to virtualenv" is not enough and why I
believe (as I stated in
https://discuss.python.org/t/graceful-cooperation-between-external-and-python-package-managers-pep-668/10302/18?u=potiuk
- still waiting for comments) PEP 668 should give more precise
recommendations and guidance to image creators - because while those steps
are all necessary to make use of a venv in the image, none of them are
mentioned or even obvious (and a lot of people do not realize these bad
consequences).
Unless I am totally mistaken and we can always trust that any binary
script in `.venv/bin` can be run without venv activation - but I don't
believe so.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]