This is an automated email from the ASF dual-hosted git repository.

kamilbregula pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
     new 8080929  Improvements for database setup docs (#13696)
8080929 is described below

commit 808092928a66908f36aec585b881c5390d365130
Author: Kamil Breguła <[email protected]>
AuthorDate: Sat Jan 16 00:21:14 2021 +0100

    Improvements for database setup docs (#13696)
---
 docs/apache-airflow/howto/index.rst               |   2 +-
 docs/apache-airflow/howto/initialize-database.rst | 105 ----------------
 docs/apache-airflow/howto/set-up-database.rst     | 145 ++++++++++++++++++++++
 docs/apache-airflow/installation.rst              |  16 +--
 docs/apache-airflow/production-deployment.rst     |   2 +-
 docs/apache-airflow/redirects.txt                 |   3 +
 6 files changed, 154 insertions(+), 119 deletions(-)

diff --git a/docs/apache-airflow/howto/index.rst 
b/docs/apache-airflow/howto/index.rst
index a5b0e2c..0baec68 100644
--- a/docs/apache-airflow/howto/index.rst
+++ b/docs/apache-airflow/howto/index.rst
@@ -31,7 +31,7 @@ configuring an Airflow environment.
 
     add-dag-tags
     set-config
-    initialize-database
+    set-up-database
     operator/index
     customize-state-colors-ui
     custom-operator
diff --git a/docs/apache-airflow/howto/initialize-database.rst 
b/docs/apache-airflow/howto/initialize-database.rst
deleted file mode 100644
index 2b8d309..0000000
--- a/docs/apache-airflow/howto/initialize-database.rst
+++ /dev/null
@@ -1,105 +0,0 @@
- .. Licensed to the Apache Software Foundation (ASF) under one
-    or more contributor license agreements.  See the NOTICE file
-    distributed with this work for additional information
-    regarding copyright ownership.  The ASF licenses this file
-    to you under the Apache License, Version 2.0 (the
-    "License"); you may not use this file except in compliance
-    with the License.  You may obtain a copy of the License at
-
- ..   http://www.apache.org/licenses/LICENSE-2.0
-
- .. Unless required by applicable law or agreed to in writing,
-    software distributed under the License is distributed on an
-    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-    KIND, either express or implied.  See the License for the
-    specific language governing permissions and limitations
-    under the License.
-
-
-
-Initializing a Database Backend
-===============================
-
-If you want to take a real test drive of Airflow, you should consider
-setting up a real database backend and switching to the LocalExecutor.
-
-Airflow was built to interact with its metadata using SqlAlchemy
-with **MySQL**,  **Postgres** and **SQLite** as supported backends (SQLite is 
used primarily for development purpose).
-
-.. seealso:: :ref:`Scheduler HA Database Requirements 
<scheduler:ha:db_requirements>` if you plan on running
-   more than one scheduler
-
-.. note:: We rely on more strict ANSI SQL settings for MySQL in order to have
-   sane defaults. Make sure to have specified 
``explicit_defaults_for_timestamp=1``
-   in your my.cnf under ``[mysqld]``
-
-.. note:: If you decide to use **MySQL**, we recommend using the 
``mysqlclient``
-   driver and specifying it in your SqlAlchemy connection string. (I.e.,
-   ``mysql+mysqldb://<user>:<password>@<host>[:<port>]/<dbname>``.)
-   But we also support the ``mysql-connector-python`` driver (I.e.,
-   ``mysql+mysqlconnector://<user>:<password>@<host>[:<port>]/<dbname>``.) 
which lets you connect through SSL
-   without any cert options provided. However if you want to use other drivers 
visit the
-   `SqlAlchemy docs <https://docs.sqlalchemy.org/en/13/dialects/mysql.html>`_ 
for more information regarding download
-   and setup of the SqlAlchemy connection.
-
-.. note:: If you decide to use **Postgres**, we recommend using the 
``psycopg2``
-   driver and specifying it in your SqlAlchemy connection string. (I.e.,
-   ``postgresql+psycopg2://<user>:<password>@<host>/<db>``.)
-   Also note that since SqlAlchemy does not expose a way to target a
-   specific schema in the Postgres connection URI, you may
-   want to set a default schema for your role with a
-   command similar to ``ALTER ROLE username SET search_path = airflow, 
foobar;``
-
-Setup your database to host Airflow
------------------------------------
-
-Create a database called ``airflow`` and a database user that Airflow
-will use to access this database.
-
-Example, for **MySQL**:
-
-.. code-block:: sql
-
-   CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci;
-   CREATE USER 'airflow' IDENTIFIED BY 'airflow';
-   GRANT ALL PRIVILEGES ON airflow.* TO 'airflow';
-
-Example, for **Postgres**:
-
-.. code-block:: sql
-
-   CREATE DATABASE airflow;
-   CREATE USER airflow WITH PASSWORD 'airflow';
-   GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;
-
-You may need to update your Postgres ``pg_hba.conf`` to add the
-``airflow`` user to the database access control list; and to reload
-the database configuration to load your change. See
-`The pg_hba.conf File 
<https://www.postgresql.org/docs/current/auth-pg-hba-conf.html>`__
-in the Postgres documentation to learn more.
-
-Configure Airflow's database connection string
-----------------------------------------------
-
-Once you have setup your database to host Airflow, you'll need to alter the
-SqlAlchemy connection string located in ``sql_alchemy_conn`` option in 
``[core]`` section in your configuration file
-``$AIRFLOW_HOME/airflow.cfg``.
-
-You can also define connection URI using ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` 
environment variable.
-
-Configure a worker that supports parallelism
---------------------------------------------
-
-You should then also change the ``executor`` option in the ``[core]`` option 
to use ``LocalExecutor``, an executor that can parallelize task instances 
locally.
-
-Initialize the database
------------------------
-
-.. code-block:: bash
-
-    # initialize the database
-    airflow db init
-
-.. spelling::
-
-     hba
diff --git a/docs/apache-airflow/howto/set-up-database.rst 
b/docs/apache-airflow/howto/set-up-database.rst
new file mode 100644
index 0000000..b13fdc4
--- /dev/null
+++ b/docs/apache-airflow/howto/set-up-database.rst
@@ -0,0 +1,145 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+
+Set up a Database Backend
+=========================
+
+Airflow was built to interact with its metadata using `SqlAlchemy 
<https://docs.sqlalchemy.org/en/13/>`__.
+
+The document below describes the database engine configurations, the necessary 
changes to their configuration to be used with Airflow, as well as changes to 
the Airflow configurations to connect to these databases.
+
+Choosing database backend
+-------------------------
+
+If you want to take a real test drive of Airflow, you should consider setting up a database backend on **MySQL** or **PostgreSQL**.
+By default, Airflow uses **SQLite**, which is intended for development purposes only.
+
+Airflow supports the following database engine versions, so check which version you have. Older versions may not support all SQL statements.
+
+  * PostgreSQL:  9.6, 10, 11, 12, 13
+  * MySQL: 5.7, 8
+  * SQLite: 3.15.0+
+
+If you plan on running more than one scheduler, you have to meet additional 
requirements.
+For details, see :ref:`Scheduler HA Database Requirements 
<scheduler:ha:db_requirements>`.
+
+Database URI
+------------
+
+Airflow uses SQLAlchemy to connect to the database, which requires you to configure the database URL.
+You can do this in the ``sql_alchemy_conn`` option in the ``[core]`` section. It is also common to configure
+this option with the ``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` environment variable.
+
+.. note::
+    For more information on setting the configuration, see 
:doc:`/howto/set-config`.
+
+If you want to check the current value, you can use the ``airflow config get-value core sql_alchemy_conn`` command, as in
+the example below.
+
+.. code-block:: bash
+
+    $ airflow config get-value core sql_alchemy_conn
+    sqlite:////tmp/airflow/airflow.db
+
+The exact format is described in the SQLAlchemy documentation; see `Database Urls <https://docs.sqlalchemy.org/en/14/core/engines.html>`__. We will also show you some examples below.
+
+Setting up a MySQL Database
+---------------------------
+
+You need to create a database and a database user that Airflow will use to 
access this database.
+In the example below, a database ``airflow_db`` and a user with username ``airflow_user`` and password ``airflow_pass`` will be created.
+
+.. code-block:: sql
+
+   CREATE DATABASE airflow_db CHARACTER SET utf8 COLLATE utf8_unicode_ci;
+   CREATE USER 'airflow_user' IDENTIFIED BY 'airflow_pass';
+   GRANT ALL PRIVILEGES ON airflow_db.* TO 'airflow_user';
+
+We rely on stricter ANSI SQL settings for MySQL in order to have sane defaults.
+Make sure to specify the ``explicit_defaults_for_timestamp=1`` option under the ``[mysqld]`` section
+in your ``my.cnf`` file. You can also activate this option with the ``--explicit-defaults-for-timestamp`` switch passed to the ``mysqld`` executable.
+
+We recommend using the ``mysqlclient`` driver and specifying it in your 
SqlAlchemy connection string.
+
+.. code-block:: text
+
+    mysql+mysqldb://<user>:<password>@<host>[:<port>]/<dbname>
+
+But we also support the ``mysql-connector-python`` driver, which lets you 
connect through SSL
+without any cert options provided.
+
+.. code-block:: text
+
+   mysql+mysqlconnector://<user>:<password>@<host>[:<port>]/<dbname>
+
+However, if you want to use other drivers, visit the `MySQL Dialect <https://docs.sqlalchemy.org/en/13/dialects/mysql.html>`__ page in the SQLAlchemy documentation for more information regarding download
+and setup of the SqlAlchemy connection.
+
+Setting up a PostgreSQL Database
+--------------------------------
+
+You need to create a database and a database user that Airflow will use to 
access this database.
+In the example below, a database ``airflow_db`` and a user with username ``airflow_user`` and password ``airflow_pass`` will be created.
+
+.. code-block:: sql
+
+   CREATE DATABASE airflow_db;
+   CREATE USER airflow_user WITH PASSWORD 'airflow_pass';
+   GRANT ALL PRIVILEGES ON DATABASE airflow_db TO airflow_user;
+
+You may need to update your Postgres ``pg_hba.conf`` to add the
+``airflow_user`` user to the database access control list; and to reload
+the database configuration to load your change. See
+`The pg_hba.conf File 
<https://www.postgresql.org/docs/current/auth-pg-hba-conf.html>`__
+in the Postgres documentation to learn more.
+
+We recommend using the ``psycopg2`` driver and specifying it in your 
SqlAlchemy connection string.
+
+.. code-block:: text
+
+   postgresql+psycopg2://<user>:<password>@<host>/<db>
+
+Also note that since SqlAlchemy does not expose a way to target a specific 
schema in the database URI, you may
+want to set a default schema for your role with a SQL statement similar to 
``ALTER ROLE username SET search_path = airflow, foobar;``
+
+For more information regarding setup of the PostgreSQL connection, see `PostgreSQL dialect <https://docs.sqlalchemy.org/en/13/dialects/postgresql.html>`__ in SQLAlchemy documentation.
+
+.. spelling::
+
+     hba
+
+Other configuration options
+---------------------------
+
+There are more configuration options for configuring SQLAlchemy behavior. For details, see :ref:`reference documentation <config:core>` for the ``sql_alchemy_*`` options in the ``[core]`` section.
+
+Initialize the database
+-----------------------
+
+After configuring the database and connecting to it in Airflow configuration, 
you should create the database schema.
+
+.. code-block:: bash
+
+    airflow db init
+
+What's next?
+------------
+
+By default, Airflow uses ``SequentialExecutor``, which does not provide 
parallelism. You should consider
+configuring a different :doc:`executor </executor/index>` for better 
performance.
diff --git a/docs/apache-airflow/installation.rst 
b/docs/apache-airflow/installation.rst
index dd31ba5..b16a5ed 100644
--- a/docs/apache-airflow/installation.rst
+++ b/docs/apache-airflow/installation.rst
@@ -209,20 +209,12 @@ release schedule of Python, nicely summarized in the
    it works in our CI pipeline (which might not be immediate) and release a 
new version of Airflow
    (non-Patch version) based on this CI set-up.
 
-Initializing Airflow Database
-'''''''''''''''''''''''''''''
+Set up a database
+'''''''''''''''''
 
-Airflow requires a database to be initialized before you can run tasks. If
-you're just experimenting and learning Airflow, you can stick with the
+Airflow requires a database. If you're just experimenting and learning 
Airflow, you can stick with the
 default SQLite option. If you don't want to use SQLite, then take a look at
-:doc:`howto/initialize-database` to setup a different database.
-
-After configuration, you'll need to initialize the database before you can
-run tasks:
-
-.. code-block:: bash
-
-    airflow db init
+:doc:`howto/set-up-database` to set up a different database.
 
 
 Troubleshooting
diff --git a/docs/apache-airflow/production-deployment.rst 
b/docs/apache-airflow/production-deployment.rst
index 02194ac..47e477f 100644
--- a/docs/apache-airflow/production-deployment.rst
+++ b/docs/apache-airflow/production-deployment.rst
@@ -26,7 +26,7 @@ Database backend
 
 Airflow comes with an ``SQLite`` backend by default. This allows the user to 
run Airflow without any external database.
 However, such a setup is meant to be used for testing purposes only; running 
the default setup in production can lead to data loss in multiple scenarios.
-If you want to run production-grade Airflow, make sure you :doc:`configure the 
backend <howto/initialize-database>` to be an external database such as 
PostgreSQL or MySQL.
+If you want to run production-grade Airflow, make sure you :doc:`configure the 
backend <howto/set-up-database>` to be an external database such as PostgreSQL 
or MySQL.
 
 You can change the backend using the following config
 
diff --git a/docs/apache-airflow/redirects.txt 
b/docs/apache-airflow/redirects.txt
index dc0addd..36a72fa 100644
--- a/docs/apache-airflow/redirects.txt
+++ b/docs/apache-airflow/redirects.txt
@@ -27,6 +27,9 @@ howto/connection/index.rst howto/connection.rst
 # Web UI
 howto/add-new-role.rst security/access-control.rst
 
+# Set up a database
+howto/initialize-database.rst howto/set-up-database.rst
+
 # Logging & Monitoring
 howto/check-health.rst logging-monitoring/check-health.rst
 errors.rst logging-monitoring/errors.rst
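
The connection-string templates documented in set-up-database.rst above
(``<driver>://<user>:<password>@<host>[:<port>]/<dbname>``) can be filled in
programmatically. The following is only a minimal sketch and is not part of
this commit: the helper name ``build_conn_uri`` is hypothetical, the
credentials are the example values from the docs, and ``localhost`` is an
assumed host. It percent-encodes the user and password so that characters
such as ``@`` do not break the URL, then exports the result through the
``AIRFLOW__CORE__SQL_ALCHEMY_CONN`` environment variable.

```python
import os
from urllib.parse import quote_plus


def build_conn_uri(scheme, user, password, host, dbname, port=None):
    """Fill in <scheme>://<user>:<password>@<host>[:<port>]/<dbname>.

    Hypothetical helper for illustration; not an Airflow or SQLAlchemy API.
    """
    # Percent-encode credentials so special characters survive URL parsing.
    netloc = f"{quote_plus(user)}:{quote_plus(password)}@{host}"
    if port is not None:
        netloc += f":{port}"
    return f"{scheme}://{netloc}/{dbname}"


# Example values taken from the docs; "localhost" is an assumption.
uri = build_conn_uri(
    "postgresql+psycopg2", "airflow_user", "airflow_pass", "localhost", "airflow_db"
)

# Airflow reads this variable in place of [core] sql_alchemy_conn.
os.environ["AIRFLOW__CORE__SQL_ALCHEMY_CONN"] = uri
print(uri)
```

With the variable set in the same environment, ``airflow config get-value
core sql_alchemy_conn`` should report the assembled URI.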
