dimberman commented on a change in pull request #4938: [AIRFLOW-4117] Multi-staging Image - Travis CI tests [Step 3/3]
URL: https://github.com/apache/airflow/pull/4938#discussion_r298810750
##########
File path: CONTRIBUTING.md
##########
@@ -95,123 +97,272 @@ cd docs
 ./start_doc_server.sh
 ```
-Only a subset of the API reference documentation builds. Install additional
-extras to build the full API reference.
+# Setting up a development environment
-## Development and Testing
+There are three ways to setup an Apache Airflow development environment:
-### Setting up a development environment
+* [Local virtualenv development environment](#local-virtualenv-development-environment)
+ using tools and libraries directly on your system
+* [Docker container development environment](#docker-container-development-environment)
+ using simple manually managed docker container
+* [Integration test development environment](#integration-test-development-environment)
+ using Docker compose based environment - the same that we use to run integration tests in Travis CI
-There are three ways to setup an Apache Airflow development environment.
+## Local virtualenv development environment
-1. Using tools and libraries installed directly on your system
+You can create local virtualenv with all requirements required by Airflow.
- Install Python (2.7.x or 3.5.x), MySQL, and libxml by using system-level package
- managers like yum, apt-get for Linux, or Homebrew for Mac OS at first. Refer to the [base CI Dockerfile](https://github.com/apache/airflow-ci/blob/master/Dockerfile) for
- a comprehensive list of required packages.
+Advantage of local installation is that everything works locally, you do not have to enter Docker/container
+environment and you can easily debug the code locally. You can also have access to python virtualenv that
+contains all the necessary requirements and use it in your local IDE - this aids autocompletion, and
+running tests directly from within the IDE.
- Then install python development requirements. It is usually best to work in a virtualenv:
+The disadvantage is that you have to maintain your dependencies and local environment consistent with
+other development environments that you have on your local machine.
- ```bash
- cd $AIRFLOW_HOME
- virtualenv env
- source env/bin/activate
- pip install -e '.[devel]'
- ```
+Another disadvantage is that you cannot run tests that require
+external components - mysql, postgres database, hadoop, mongo, cassandra, redis etc.
+The tests in Airflow are a mixture of unit and integration tests and some of them
+require those components to be setup. Only real unit tests can be run by default in local environment.
-2. Using a Docker container
+If you want to run integration tests, you need to configure and install the dependencies on your own.
- Go to your Airflow directory and start a new docker container. You can choose between Python 2 or 3, whatever you prefer.
+It's also very difficult to make sure that your local environment is consistent with others' environments.
+This can often lead to "works for me" syndrome. It's better to use the Docker Compose integration test
+environment in case you want reproducible environment consistent with other people.
- ```
- # Start docker in your Airflow directory
- docker run -t -i -v `pwd`:/airflow/ -w /airflow/ python:3 bash
+### Installation
- # To install all of airflows dependencies to run all tests (this is a lot)
- pip install -e .
-
- # To run only certain tests install the devel requirements and whatever is required
- # for your test. See setup.py for the possible requirements. For example:
- pip install -e '.[gcp,devel]'
+Install Python (3.5 or 3.6), MySQL, and libxml by using system-level package
+managers like yum, apt-get for Linux, or Homebrew for Mac OS at first.
+Refer to the [Dockerfile](Dockerfile) for a comprehensive list of required packages.
- # Init the database
- airflow initdb
+In order to use your IDE you can use the virtual environment. Ideally
+you should setup virtualenv for all python versions that Airflow supports (2.7, 3.5, 3.6).
+An easy way to create the virtualenv is to use
+[virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/) - it allows
+you to easily switch between virtualenvs using `workon` command and manage
+your virtual environments more easily. Typically creating the environment can be done by:
- nosetests -v tests/hooks/test_druid_hook.py
+```
+mkvirtualenv <ENV_NAME> --python=python<VERSION>
+```
+
+Then you need to install python PIP requirements. Typically it can be done with:
+`pip install -e ".[devel]"`. Then you need to run `airflow initdb` to create sqlite database.
- test_get_first_record (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
- test_get_records (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
- test_get_uri (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
- test_get_conn_url (tests.hooks.test_druid_hook.TestDruidHook) ... ok
- test_submit_gone_wrong (tests.hooks.test_druid_hook.TestDruidHook) ... ok
- test_submit_ok (tests.hooks.test_druid_hook.TestDruidHook) ... ok
- test_submit_timeout (tests.hooks.test_druid_hook.TestDruidHook) ... ok
- test_submit_unknown_response (tests.hooks.test_druid_hook.TestDruidHook) ... ok
+Once initialization is done, you should select the virtualenv you initialized as the
+project's default virtualenv in your IDE and run tests efficiently.
- ----------------------------------------------------------------------
- Ran 8 tests in 3.036s
+After setting it up - you can use the usual "Run Test" option of the IDE and have
+the autocomplete and documentation support from IDE as well as you can
+debug and view the sources of Airflow - which is very helpful during
+development.
- OK
- ```
+### Running tests
+
+To run tests locally, once you activate virtualenv you should be able to simply run
+``run-tests`` at will. Note that if you want to pass extra parameters to nose
+you should do it after '--'
+
+For example, in order to just execute the "core" unit tests, run the following:
- The Airflow code is mounted inside of the Docker container, so if you change something using your favorite IDE, you can directly test it in the container.
+```bash
+run-tests tests.core:CoreTest -- -s --logging-level=DEBUG
+```
+or a single test method:
-3. Using [Docker Compose](https://docs.docker.com/compose/) and Airflow's CI scripts
+```bash
+run-tests tests.core:CoreTest.test_check_operators -- -s --logging-level=DEBUG
+```
- Start a docker container through Compose for development to avoid installing the packages directly on your system. The following will give you a shell inside a container, run all required service containers (MySQL, PostgresSQL, krb5 and so on) and install all the dependencies:
+### Running tests directly from the IDE
- ```bash
- docker-compose -f scripts/ci/docker-compose.yml run airflow-testing bash
- # From the container
- export TOX_ENV=py35-backend_mysql-env_docker
- /app/scripts/ci/run-ci.sh
- ```
+Once you configure your tests to use the virtualenv you created, running tests
+from IDE is as simple as:
- If you wish to run individual tests inside of Docker environment you can do as follows:
+
- ```bash
- # From the container (with your desired environment) with druid hook
- export TOX_ENV=py35-backend_mysql-env_docker
- /app/scripts/ci/run-ci.sh -- tests/hooks/test_druid_hook.py
- ```
+Note that while most of the tests are typical "unit" tests that do not
+require external components, there are a number of tests that are more of
+"integration" or even "system" tests (depending on the convention you use).
+Those tests interact with external components. For those tests
+you need to run complete Docker Compose - based
+[Integration test development environment](#integration-test-development-environment).
+## Docker container development environment

Review comment:
Have you figured out how to do docker-compose-based testing within an IDE? I was trying to do so within pycharm and found it pretty unintuitive. This would be "best of both worlds" if possible.
