This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git
commit 6f6b5600c371783d91a1f1234c627f895a3bbef9
Author: Ry Walker <4283+...@users.noreply.github.com>
AuthorDate: Fri Oct 30 15:10:30 2020 -0400

    Move Project focus and Principles higher in the README (#11973)

    (cherry picked from commit 3c723e35a58b274962dc47e21cbb05389263d97a)
---
 README.md | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/README.md b/README.md
index 5e638a5..ccac802 100644
--- a/README.md
+++ b/README.md
@@ -40,13 +40,13 @@ Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The
 <!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
 **Table of contents**
 
+- [Project Focus](#project-focus)
+- [Principles](#principles)
 - [Requirements](#requirements)
 - [Getting started](#getting-started)
 - [Installing from PyPI](#installing-from-pypi)
 - [Official source code](#official-source-code)
 - [Convenience packages](#convenience-packages)
-- [Project Focus](#project-focus)
-- [Principles](#principles)
 - [User Interface](#user-interface)
 - [Contributing](#contributing)
 - [Who uses Apache Airflow?](#who-uses-apache-airflow)
@@ -57,6 +57,21 @@ Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The
 
 <!-- END doctoc generated TOC please keep comment here to allow auto update -->
 
+## Project Focus
+
+Airflow works best with workflows that are mostly static and slowly changing. When the structure is similar from one run to the next, it allows for clarity around unit of work and continuity. Other similar projects include [Luigi](https://github.com/spotify/luigi), [Oozie](http://oozie.apache.org/) and [Azkaban](https://azkaban.github.io/).
+
+Airflow is commonly used to process data, but has the opinion that tasks should ideally be idempotent, and should not pass large quantities of data from one task to the next (though tasks can pass metadata using Airflow's [Xcom feature](https://airflow.apache.org/docs/stable/concepts.html#xcoms)). For high-volume, data-intensive tasks, a best practice is to delegate to external services that specialize on that type of work.
+
+Airflow **is not** a streaming solution. Airflow is not in the [Spark Streaming](http://spark.apache.org/streaming/) or [Storm](https://storm.apache.org/) space.
+
+## Principles
+
+- **Dynamic**: Airflow pipelines are configuration as code (Python), allowing for dynamic pipeline generation. This allows for writing code that instantiates pipelines dynamically.
+- **Extensible**: Easily define your own operators, executors and extend the library so that it fits the level of abstraction that suits your environment.
+- **Elegant**: Airflow pipelines are lean and explicit. Parameterizing your scripts is built into the core of Airflow using the powerful **Jinja** templating engine.
+- **Scalable**: Airflow has a modular architecture and uses a message queue to orchestrate an arbitrary number of workers.
+
 ## Requirements
 
 Apache Airflow is tested with:
@@ -150,21 +165,6 @@ All those artifacts are not official releases, but they are prepared using offic
 
 Some of those artifacts are "development" or "pre-release" ones, and they are clearly marked as such following the ASF Policy.
 
-## Project Focus
-
-Airflow works best with workflows that are mostly static and slowly changing. When the structure is similar from one run to the next, it allows for clarity around unit of work and continuity. Other similar projects include [Luigi](https://github.com/spotify/luigi), [Oozie](http://oozie.apache.org/) and [Azkaban](https://azkaban.github.io/).
-
-Airflow is commonly used to process data, but has the opinion that tasks should ideally be idempotent, and should not pass large quantities of data from one task to the next (though tasks can pass metadata using Airflow's [Xcom feature](https://airflow.apache.org/docs/stable/concepts.html#xcoms)). For high-volume, data-intensive tasks, a best practice is to delegate to external services that specialize on that type of work.
-
-Airflow **is not** a streaming solution. Airflow is not in the [Spark Streaming](http://spark.apache.org/streaming/) or [Storm](https://storm.apache.org/) space.
-
-## Principles
-
-- **Dynamic**: Airflow pipelines are configuration as code (Python), allowing for dynamic pipeline generation. This allows for writing code that instantiates pipelines dynamically.
-- **Extensible**: Easily define your own operators, executors and extend the library so that it fits the level of abstraction that suits your environment.
-- **Elegant**: Airflow pipelines are lean and explicit. Parameterizing your scripts is built into the core of Airflow using the powerful **Jinja** templating engine.
-- **Scalable**: Airflow has a modular architecture and uses a message queue to orchestrate an arbitrary number of workers.
-
 ## User Interface
 
 - **DAGs**: Overview of all DAGs in your environment.
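For readers following this change: the moved "Project Focus" text notes that tasks can pass small pieces of metadata between one another via Airflow's XCom feature. Below is a minimal sketch of that pattern, not part of the commit; it assumes Airflow 1.10 import paths (matching the v1-10-test branch), and the DAG id, task ids, and payload are illustrative.

```python
# Sketch: passing small metadata between tasks via XCom (not bulk data).
# Airflow 1.10 import paths; names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def produce(**context):
    # The callable's return value is automatically pushed to XCom.
    return {"row_count": 42}


def consume(**context):
    # Pull the upstream task's return value from XCom.
    meta = context["ti"].xcom_pull(task_ids="produce")
    print("upstream reported", meta["row_count"], "rows")


dag = DAG(
    dag_id="example_xcom_metadata",
    start_date=datetime(2020, 10, 1),
    schedule_interval=None,
)

push = PythonOperator(task_id="produce", python_callable=produce,
                      provide_context=True, dag=dag)
pull = PythonOperator(task_id="consume", python_callable=consume,
                      provide_context=True, dag=dag)
push >> pull
```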
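The "Dynamic" bullet in the moved "Principles" list says pipelines are configuration as code, allowing dynamic pipeline generation. A sketch of what that means in practice, under the same assumptions (Airflow 1.10-era imports; the table list is a stand-in for whatever drives generation). The templated bash_command also shows the Jinja parameterization the "Elegant" bullet refers to.

```python
# Sketch: because a DAG file is plain Python, tasks can be generated in a
# loop. Table names and the DAG id are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    dag_id="example_dynamic_dag",
    start_date=datetime(2020, 10, 1),
    schedule_interval="@daily",
)

for table in ["users", "orders", "events"]:  # hypothetical inputs
    BashOperator(
        task_id="process_{}".format(table),
        # bash_command is a Jinja-templated field: {{ ds }} is the run date.
        bash_command="echo processing {{ params.table }} for {{ ds }}",
        params={"table": table},
        dag=dag,
    )
```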
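Likewise, the "Extensible" bullet claims it is easy to define your own operators. A minimal illustration, again hedged (Airflow 1.10 API; the operator's name and behavior are made up): subclass BaseOperator and implement execute().

```python
# Sketch: a custom operator is just a BaseOperator subclass.
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults


class GreetOperator(BaseOperator):
    # Fields listed here are rendered with Jinja before execute() runs.
    template_fields = ("name",)

    @apply_defaults
    def __init__(self, name, *args, **kwargs):
        super(GreetOperator, self).__init__(*args, **kwargs)
        self.name = name

    def execute(self, context):
        self.log.info("Hello, %s", self.name)
```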