mik-laj commented on a change in pull request #6283: [AIRFLOW-XXXX] Google Season of Docs updates to CONTRIBUTING doc
URL: https://github.com/apache/airflow/pull/6283#discussion_r332492928
##########
File path: CONTRIBUTING.rst
##########
@@ -0,0 +1,662 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements. See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership. The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied. See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+.. contents:: :local:
+
+Contributions
+=============
+
+Contributions are welcome and are greatly appreciated! Every little bit helps,
+and credit will always be given.
+
+Report Bugs
+-----------
+
+Report bugs through `Apache
+Jira <https://issues.apache.org/jira/browse/AIRFLOW>`__.
+
+Please report relevant information and preferably code that exhibits the
+problem.
+
+Fix Bugs
+--------
+
+Look through the JIRA issues for bugs. Anything is open to whoever wants to
+implement it.
+
+Implement Features
+------------------
+
+Look through the `Apache
+JIRA <https://issues.apache.org/jira/browse/AIRFLOW>`__ for features.
+
+Any unassigned "Improvement" issue is open to whoever wants to implement it.
+
+We've created the operators, hooks, macros and executors we needed, but we've
+made sure that this part of Airflow is extensible. New operators, hooks, macros
+and executors are very welcome!
+
+Improve Documentation
+---------------------
+
+Airflow could always use better documentation, whether as part of the official
+Airflow docs, in docstrings, ``docs/*.rst`` or even on the web as blog posts or
+articles.
+
+Submit Feedback
+---------------
+
+The best way to send feedback is to open an issue on `Apache
+JIRA <https://issues.apache.org/jira/browse/AIRFLOW>`__.
+
+If you are proposing a feature:
+
+- Explain in detail how it would work.
+- Keep the scope as narrow as possible to make it easier to implement.
+- Remember that this is a volunteer-driven project, and that contributions are
+  welcome :)
+
+Documentation
+=============
+
+The latest API documentation is usually available
+`here <https://airflow.apache.org/>`__.
+
+To generate a local version:
+
+1. Set up an Airflow development environment.
+
+2. Install the ``doc`` extra.
+
+.. code-block:: bash
+
+  pip install -e '.[doc]'
+
+
+3. Generate and serve the documentation as follows:
+
+.. code-block:: bash
+
+  cd docs
+  ./build.sh
+  ./start_doc_server.sh
+
+
+Pull Request Guidelines
+=======================
+
+Before you submit a pull request (PR) from your forked repo, check that it meets
+these guidelines:
+
+- Include tests, either as doctests, unit tests, or both, in your pull
+  request.
+
+  The Airflow repo uses `Travis CI <https://travis-ci.org/apache/airflow>`__ to
+  run the tests and `codecov <https://codecov.io/gh/apache/airflow>`__ to track
+  coverage. You can set up both for free on your fork (see the
+  `Travis CI Testing Framework <#travis-ci-testing-framework>`__ section below).
+  It will help you make sure you do not break the build with your PR and
+  that you help increase coverage.
+
+- `Rebase your fork <http://stackoverflow.com/a/7244456/1110993>`__, squash
+  commits, and resolve all conflicts.
+
+- When merging PRs, wherever possible try to use **Squash and Merge** instead of **Rebase and Merge**.
+
+- Make sure every pull request has an associated
+  `JIRA <https://issues.apache.org/jira/browse/AIRFLOW/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel>`__
+  ticket. The JIRA link should also be added to the PR description.
+
+- Preface your commit's subject & PR title with **[AIRFLOW-XXX] COMMIT_MSG (#PR_NUMBER)** where *XXX*
+  is the JIRA number. For example: [AIRFLOW-5574] Fix Google Analytics script loading (#6218).
+  We compose Airflow release notes from all commit titles in a release. By placing the JIRA number in the
+  commit title and hence in the release notes, we let Airflow users look into
+  JIRA and GitHub PRs for more details about a particular change.
+
+- Add an `Apache License <http://www.apache.org/legal/src-headers.html>`__ header
+  to all new files.
+
+  If you have `pre-commit hooks <#pre-commit-hooks>`_ enabled, they automatically add
+  license headers during commit.
+
+- If your pull request adds functionality, make sure to update the docs as part
+  of the same PR. A docstring is often sufficient. Make sure to follow the
+  Sphinx-compatible standards.
+
+- Make sure the pull request works for Python 3.5 and 3.6.
+
+- Run tests locally before opening a PR.
+
+  As Airflow grows as a project, we try to enforce a more consistent style and
+  follow the Python community guidelines. We currently enforce most of
+  `PEP8 <https://www.python.org/dev/peps/pep-0008/>`__ and a few other linting
+  rules described in the `Running static code analysis locally <#running-static-code-analysis-locally>`__ section.
+
+- Adhere to guidelines for commit messages described in this `article <http://chris.beams.io/posts/git-commit/>`__.
+  This makes the lives of those who come after you a lot easier.
+
+
+
+Development Environments
+========================
+
+There are two environments, available on Linux and macOS, that you can use to
+develop Apache Airflow:
+
+- `Local virtualenv development environment <#local-virtualenv-development-environment>`_
+  that supports running unit tests and can be used in your IDE.
+
+- `Breeze Docker-based development environment <#breeze-development-environment>`_ that provides
+  an end-to-end CI solution with all software dependencies covered.
+
+The table below summarizes differences between the two environments:
+
+
+========================= ================================ =====================================
+**Property**              **Local virtualenv**             **Breeze environment**
+========================= ================================ =====================================
+Test coverage             - (-) unit tests only            - (+) integration and unit tests
+------------------------- -------------------------------- -------------------------------------
+Setup                     - (+) automated with breeze cmd  - (+) automated with breeze cmd
+------------------------- -------------------------------- -------------------------------------
+Installation difficulty   - (-) depends on the OS setup    - (+) works whenever Docker works
+------------------------- -------------------------------- -------------------------------------
+Team synchronization      - (-) difficult to achieve       - (+) reproducible within team
+------------------------- -------------------------------- -------------------------------------
+Reproducing CI failures   - (-) not possible in many cases - (+) fully reproducible
+------------------------- -------------------------------- -------------------------------------
+Ability to update         - (-) requires manual updates    - (+) automated update via breeze cmd
+------------------------- -------------------------------- -------------------------------------
+Disk space and CPU usage  - (+) relatively lightweight     - (-) uses GBs of disk and many CPUs
+------------------------- -------------------------------- -------------------------------------
+IDE integration           - (+) straightforward            - (-) via remote debugging only
+========================= ================================ =====================================
+
+
+We typically recommend using both of these environments, depending on your needs.
+
+Local virtualenv Development Environment
+----------------------------------------
+
+All details about using and running the local virtualenv environment for Airflow can be found in `LOCAL_VIRTUALENV.rst <LOCAL_VIRTUALENV.rst>`__.
+
+Benefits:
+
+- Packages are installed locally. No container environment is required.
+
+- You can benefit from local debugging within your IDE.
+
+- With the virtualenv in your IDE, you can benefit from autocompletion and running tests directly from the IDE.
+
+Limitations:
+
+- You have to keep your dependencies and local environment consistent with
+  other development environments that you have on your local machine.
+
+- You cannot run tests that require external components, such as mysql,
+  postgres database, hadoop, mongo, cassandra, redis, etc.
+
+  The tests in Airflow are a mixture of unit and integration tests and some of
+  them require these components to be set up. Local virtualenv supports only
+  real unit tests. Technically, to run integration tests, you can configure
+  and install the dependencies on your own, but it is usually complex.
+  Instead, we recommend using the
+  `Breeze development environment <#breeze-development-environment>`__ with all required packages
+  pre-installed.
+
+- You need to make sure that your local environment is consistent with other
+  developer environments. This often leads to a "works for me" syndrome. The
+  Breeze container-based solution provides a reproducible environment that is
+  consistent with other developers.
+
+Possible extensions:
+
+- You are **STRONGLY** encouraged to also install and use `pre-commit hooks <#pre-commit-hooks>`_
+  for your local virtualenv development environment.
+  Pre-commit hooks can speed up your development cycle a lot.
+
+Breeze Development Environment
+------------------------------
+
+All details about using and running Airflow Breeze can be found in
+`BREEZE.rst <BREEZE.rst>`__.
+
+The Airflow Breeze solution is intended to ease your local development as "*It's
+a Breeze to develop Airflow*".
+
+Benefits:
+
+- Breeze is a complete environment that includes external components, such as
+  mysql database, hadoop, mongo, cassandra, redis, etc., required by some of
+  the Airflow tests. Breeze provides a preconfigured Docker Compose environment
+  where all these services are available and can be used by tests
+  automatically.
+
+- The Breeze environment is almost the same as the one used in `Travis CI <https://travis-ci.com/>`__ automated builds.
+  So, if the tests run in your Breeze environment, they will most likely work in Travis CI as well.
+
+Limitations:
+
+- The Breeze environment takes significant space in your local Docker cache. There
+  are separate environments for different Python and Airflow versions, and
+  each of the images takes around 3GB in total.
+
+- Though the Airflow Breeze setup is automated, it takes time. The Breeze
+  environment uses pre-built images from DockerHub and it takes time to
+  download and extract those images. Building the environment for a particular
+  Python version takes less than 10 minutes.
+
+- The Breeze environment runs in the background taking precious resources, such as
+  disk space and CPU. You can stop the environment manually after you use it
+  or even use a ``bare`` environment to decrease resource usage.
+
+**NOTE:** Breeze CI images are not supposed to be used in production environments.
+They are optimized for repeatability of tests, maintainability and speed of building rather
+than production performance. The production images are not yet officially published.
+
+Pylint Checks
+=============
+
+We are in the process of fixing code flagged with pylint checks for the whole Airflow project.
+This is a huge task so we implemented an incremental approach for the process.
+Currently most of the code is excluded from pylint checks via scripts/ci/pylint_todo.txt.
+We have an open JIRA issue AIRFLOW-4364 which has a number of sub-tasks for each of
+the modules that should be made compatible. Fixing problems identified with pylint is a
+straightforward and easy (but time-consuming) task, so if you are a first-time
+contributor to Airflow, you can choose one of the sub-tasks as your first issue to fix.
+
+To fix a pylint issue, do the following:
+
+1. Remove module/modules from the
+   `scripts/ci/pylint_todo.txt <scripts/ci/pylint_todo.txt>`__.
+
+2. Run `scripts/ci/ci_pylint.sh <scripts/ci/ci_pylint.sh>`__.
+
+3. Fix all the issues reported by pylint.
+
+4. Re-run `scripts/ci/ci_pylint.sh <scripts/ci/ci_pylint.sh>`__.
+
+5. If you see "success", submit a PR following
+   `Pull Request guidelines <#pull-request-guidelines>`__.
+
+
+These are guidelines for fixing errors reported by pylint:
+
+- Fix the errors rather than disable pylint checks. Often you can easily
+  refactor the code (IntelliJ/PyCharm might be helpful when extracting methods
+  in complex code or moving methods around).
+
+- If disabling a particular problem, make sure to disable only that error by
+  using the symbolic name of the error as reported by pylint.
+
+.. code-block:: python
+
+  from airflow import *  # pylint: disable=wildcard-import
+
+
+- If there is a single line where you need to disable a particular error,
+  consider adding a comment to the line that causes the problem. For example:
+
+.. code-block:: python
+
+  def MakeSummary(pcoll, metric_fn, metric_keys):  # pylint: disable=invalid-name
+
+
+- For multiple lines/block of code, to disable an error, you can surround the
+  block with ``pylint:disable/pylint:enable`` comment lines. For example:
+
+.. code-block:: python
+
+  # pylint: disable=too-few-public-methods
+  class LoginForm(Form):
+  """Form for the user"""

Review comment:
It requires indentation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
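A minimal sketch of what the review comment on the last quoted block seems to ask for, assuming it refers to the docstring under ``class LoginForm``: the class body is indented one level below the ``class`` line (otherwise Python raises an ``IndentationError``), and the block is closed with a matching ``pylint: enable`` comment. The ``Form`` base class here is a hypothetical stand-in so the snippet runs on its own; it is not the class used in Airflow.

```python
class Form:  # hypothetical stand-in for the real form base class
    """Minimal base class so that this example is self-contained."""


# pylint: disable=too-few-public-methods
class LoginForm(Form):
    """Form for the user"""
# pylint: enable=too-few-public-methods
```

The disable/enable pair limits the suppression to this one class, so the same check still applies to the rest of the module.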
