potiuk commented on a change in pull request #4932: [AIRFLOW-3611] Breeze URL: https://github.com/apache/airflow/pull/4932#discussion_r316808688
########## File path: BREEZE.rst ##########

@@ -0,0 +1,766 @@

.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements. See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership. The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied. See the License for the
   specific language governing permissions and limitations
   under the License.

.. raw:: html

    <p align="center">
      <img src="images/AirflowBreeze_logo.png" alt="Airflow Breeze Logo"/>
    </p>


Table of Contents
=================

* `Airflow Breeze <#airflow-breeze>`_
* `Installation <#installation>`_
* `Setting up autocomplete <#setting-up-autocomplete>`_
* `Using the Airflow Breeze environment <#using-the-airflow-breeze-environment>`_
  - `Entering the environment <#entering-the-environment>`_
  - `Running tests in Airflow Breeze <#running-tests-in-airflow-breeze>`_
  - `Debugging with ipdb <#debugging-with-ipdb>`_
  - `Airflow directory structure in Docker <#airflow-directory-structure-inside-docker>`_
  - `Port forwarding <#port-forwarding>`_
* `Using your host IDE <#using-your-host-ide>`_
  - `Configuring local virtualenv <#configuring-local-virtualenv>`_
  - `Running unit tests via IDE <#running-unit-tests-via-ide>`_
  - `Debugging Airflow Breeze Tests in IDE <#debugging-airflow-breeze-tests-in-ide>`_
* `Running commands via Airflow Breeze <#running-commands-via-airflow-breeze>`_
  - `Running static code checks <#running-static-code-checks>`_
  - `Building the documentation <#building-the-documentation>`_
  - `Running tests <#running-tests>`_
  - `Running commands inside Docker <#running-commands-inside-docker>`_
  - `Running Docker Compose commands <#running-docker-compose-commands>`_
  - `Convenience scripts <#convenience-scripts>`_
* `Keeping images up-to-date <#keeping-images-up-to-date>`_
  - `Updating dependencies <#updating-dependencies>`_
  - `Pulling the images <#pulling-the-images>`_
* `Airflow Breeze flags <#airflow-breeze-flags>`_

Airflow Breeze
==============

Airflow Breeze is an easy-to-use integration test environment managed via
`Docker Compose <https://docs.docker.com/compose/>`_.
The environment is easy to use locally, and it is also used by Airflow's Travis CI tests.

It's called **Airflow Breeze** as in "It's a *Breeze* to develop Airflow".

The advantages and disadvantages of using the environment vs. other ways of testing Airflow
are described in `CONTRIBUTING.md <CONTRIBUTING.md#integration-test-development-environment>`_.

Here is a short 10-minute video about Airflow Breeze:

.. image:: http://img.youtube.com/vi/ffKFHV6f3PQ/0.jpg
   :width: 480px
   :height: 360px
   :scale: 100 %
   :alt: Airflow Breeze Simplified Development Workflow
   :align: center
   :target: http://www.youtube.com/watch?v=ffKFHV6f3PQ


Installation
============

Prerequisites for the installation:

* If you are on MacOS, you need gnu getopt to get Airflow Breeze running. Typically
  you need to run ``brew install gnu-getopt`` and then follow the instructions
  (you need to link the gnu getopt version so that it comes first on the PATH).

* The latest stable Docker Community Edition installed and on the PATH. It should be
  configured so that you can run ``docker`` commands directly, not only via the root
  user: your user should be in the ``docker`` group.
  See the `Docker installation guide <https://docs.docker.com/install/>`_.

* The latest stable Docker Compose installed and on the PATH. It should be
  configured so that you can run the ``docker-compose`` command.
  See the `Docker Compose installation guide <https://docs.docker.com/compose/install/>`_.

* On MacOS you also need ``gstat``, installed via brew or port. The ``stat`` command
  that ships with MacOS is very old and limited. The best way to install ``gstat`` is
  via ``brew install coreutils``.

Your entry point for Airflow Breeze is the `./breeze <./breeze>`_
script. You can run it with the ``-h`` option to see the list of available flags.
If you have only one airflow repository checked out, you can add it to your PATH
and run breeze without the ``./`` prefix and from any directory.

See `Airflow Breeze flags <#airflow-breeze-flags>`_ for details.

The first time you run the `./breeze <./breeze>`_ script, it pulls and builds a local version of the docker images.
It pulls the latest Airflow CI images from `Apache Airflow DockerHub <https://hub.docker.com/r/apache/airflow>`_
and uses them to build your local docker images from your latest sources.
Further on, ``breeze`` uses md5sum calculations and Docker caching mechanisms to rebuild only what is needed.
Airflow Breeze detects when the Docker images need to be rebuilt and asks you for confirmation.

Setting up autocomplete
=======================

The ``breeze`` command comes with built-in bash/zsh autocomplete. When you start typing
the `./breeze <./breeze>`_ command, you can use <TAB> to show all the available switches
and to get autocompletion on typical values of the parameters you can use.

You can set up autocomplete automatically by running this command (``-a`` is a shortcut for ``--setup-autocomplete``):

.. code-block:: bash

   ./breeze --setup-autocomplete


Autocomplete starts working once you re-enter the shell.

Zsh autocompletion is currently limited to completing flags only. Bash autocompletion also completes
flag values (for example the python version or a static check name).
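Under the hood, bash autocompletion of this kind works by registering a completion function for the command. The snippet below is a minimal, illustrative sketch of that mechanism only - the real script installed by ``--setup-autocomplete`` is more elaborate, and the flag list shown here is an example subset, not the authoritative list:

.. code-block:: bash

   # Illustrative sketch of bash completion registration (NOT the actual
   # breeze completion script; the flag list is an example subset only).
   _breeze_complete() {
       local cur="${COMP_WORDS[COMP_CWORD]}"
       local opts="--python --backend --env --help --setup-autocomplete"
       COMPREPLY=( $(compgen -W "${opts}" -- "${cur}") )
   }
   # Tell bash to call the function when completing the "breeze" command.
   complete -F _breeze_complete breeze

After a snippet like this is sourced from your shell startup files, typing ``breeze --ba<TAB>`` completes to ``--backend``. This is also why autocompletion only starts working once you re-enter the shell: the registration has to be sourced by a new shell session.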


Using the Airflow Breeze environment
====================================

Entering the environment
------------------------

You enter the integration test environment by running the `./breeze <./breeze>`_ script.

You can specify the python version, the backend and the environment to use for testing - so that you can
recreate the same environments as in the matrix builds in Travis CI. The defaults when you
run the environment are reasonable (python 3.6, sqlite, docker).

What happens next: the appropriate docker images are pulled, your local sources are used to build a local
version of the image, and you are dropped into a bash shell of the airflow container -
with all the necessary dependencies started up. Note that the first run (per python version) might take up
to 10 minutes on a fast connection to start. Subsequent runs should be much faster.

.. code-block:: bash

   ./breeze

You can choose the optional flags you need with `./breeze <./breeze>`_.

For example, you can run python 3.6 tests with mysql as the backend in the docker
environment with:

.. code-block:: bash

   ./breeze --python 3.6 --backend mysql --env docker

The choices you make are persisted in the ``./.build/`` cache directory, so the next time you use the
`./breeze <./breeze>`_ script it reuses the values from the previous run. This way you do not
have to specify them every time you run the script. You can delete the ``./.build/`` directory to
restore the default settings.

The relevant airflow sources are mounted inside the ``airflow-testing`` container that you enter,
which means that you can keep editing your changes on the host in your favourite IDE and have them
visible in docker immediately, ready to test without rebuilding images. This can be disabled by specifying
the ``--skip-mounting-source-volume`` flag when running breeze, in which case the sources are
embedded in the container - and changes to those sources will not be persistent.
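The flag persistence described above can be pictured as writing each chosen value into a small file under the cache directory and reading it back on the next run. Here is a minimal sketch of that pattern - the file layout and function names are hypothetical, not breeze's actual implementation:

.. code-block:: bash

   # Sketch of persisting a flag value in a cache dir and reusing it later.
   # CACHE_DIR mirrors the ./.build/ idea; the file names are made up.
   CACHE_DIR=".build"

   save_choice() {  # usage: save_choice NAME VALUE
       mkdir -p "${CACHE_DIR}"
       printf '%s\n' "$2" > "${CACHE_DIR}/.$1"
   }

   load_choice() {  # usage: load_choice NAME DEFAULT
       if [ -f "${CACHE_DIR}/.$1" ]; then
           cat "${CACHE_DIR}/.$1"
       else
           printf '%s\n' "$2"
       fi
   }

   save_choice BACKEND mysql      # explicit choice on the first run
   load_choice BACKEND sqlite     # later runs print "mysql", not the default

Removing the cache directory is then exactly what restores the defaults, which matches the behaviour of deleting ``./.build/``.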

Once you enter the environment, you are dropped into a bash shell and can run tests immediately.

Running tests in Airflow Breeze
-------------------------------

Once you enter the Airflow Breeze environment, you can simply run
``run-tests`` at will. Note that if you want to pass extra parameters to nose,
you should add them after ``--``.

For example, to execute just the "core" unit tests, run the following:

.. code-block:: bash

   run-tests tests.core:CoreTest -- -s --logging-level=DEBUG

or a single test method:

.. code-block:: bash

   run-tests tests.core:CoreTest.test_check_operators -- -s --logging-level=DEBUG


The tests run ``airflow db reset`` and ``airflow db init`` the first time you
run tests in a running container, so you can count on the database being initialized.

All subsequent test executions within the same container run without database
initialisation.

You can optionally add the ``--with-db-init`` flag if you want to re-initialize
the database:

.. code-block:: bash

   run-tests --with-db-init tests.core:CoreTest.test_check_operators -- -s --logging-level=DEBUG

Debugging with ipdb
-------------------

If you prefer console debugging, you can debug any code you run in the container using the ``ipdb`` debugger.
It is as easy as copying and pasting this line into your code:

.. code-block:: python

   import ipdb; ipdb.set_trace()

Once you hit the line, you are dropped into an interactive ipdb debugger with colors
and auto-completion to guide your debugging. This works from the console where you started your program.
Note that with ``nosetests`` you need to pass the ``--nocapture`` flag, otherwise nosetests captures the
stdout of your process.

Airflow directory structure inside Docker
-----------------------------------------

When you are in the container, note that the following directories are used:

.. code-block:: text

   /opt/airflow - here the sources of Airflow are mounted from the host (AIRFLOW_SOURCES)
   /root/airflow - all the "dynamic" Airflow files are created here (AIRFLOW_HOME):
       airflow.db - sqlite database in case sqlite is used
       dags - folder where non-test dags are stored (test dags are in /opt/airflow/tests/dags)
       logs - logs from airflow executions are created there
       unittest.cfg - unit test configuration generated when entering the environment
       webserver_config.py - webserver configuration generated when running airflow in the container

Note that when run in your local environment, the ``/root/airflow/logs`` folder is actually mounted from
the ``logs`` directory in your airflow sources, so all logs created in the container are automatically
visible on the host as well. Every time you enter the container, the logs directory is cleaned so that
logs do not accumulate.

Port forwarding
---------------

When you run Airflow Breeze, the following ports are automatically forwarded:

* 28080 -> forwarded to the airflow webserver -> airflow-testing:8080
* 25433 -> forwarded to the postgres database -> postgres:5432
* 23306 -> forwarded to the mysql database -> mysql:3306

You can connect to those ports/databases using:

* Webserver: http://127.0.0.1:28080
* Postgres: ``jdbc:postgresql://127.0.0.1:25433/airflow?user=postgres&password=airflow``
* Mysql: ``jdbc:mysql://localhost:23306/airflow?user=root``

Note that you need to start the webserver manually with the ``airflow webserver`` command if you want to
connect to the webserver (you can use ``tmux`` to multiply terminals).

For the databases, you need to run ``airflow resetdb`` at least once after you start Airflow Breeze to get
the database/tables created. You can then connect to the databases with your IDE or any other database client:

.. raw:: html

    <p align="center">
      <img src="images/database_view.png" alt="Database view"/>
    </p>

You can change the host port numbers used by setting the appropriate environment variables:

* WEBSERVER_HOST_PORT
* POSTGRES_HOST_PORT
* MYSQL_HOST_PORT

When you set those variables, the new ports take effect the next time you enter the environment.

Using your host IDE
===================

Configuring local virtualenv
----------------------------

In order to use your host IDE (for example IntelliJ's PyCharm/IDEA) you need to have virtual environments
set up. Ideally you should have virtualenvs for all python versions that Airflow supports (2.7, 3.5, 3.6).

Review comment:
   Resolved

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
