This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
     new b35b08ec41 Improve pre-commit to generate Airflow diagrams as a code (#36333)
b35b08ec41 is described below

commit b35b08ec41814b6fe5d7388296db83a726e6d6d0
Author: Jarek Potiuk <[email protected]>
AuthorDate: Wed Dec 20 19:44:33 2023 +0100

    Improve pre-commit to generate Airflow diagrams as a code (#36333)
    
    Since we are getting more diagrams generated in Airflow using the
    "diagram as code" approach, this PR makes the pre-commit better
    suited to generating images that come from different sources, are
    placed in different directories and are generated independently,
    so that the whole process is more distributed and it is easy
    for whoever creates diagrams to add their own.
    
    The changes implemented in this PR:
    
    * the code to generate each diagram is now next to the diagram it
      generates. It has the same name as the diagram, but with the .py
      extension. This way it is immediately visible where the source
      of each diagram is (right next to the diagram itself)
    
    * each of the .py diagram files is runnable on its own. This
      way you can easily regenerate a diagram by running the corresponding
      Python file, or even automate it with a "save" action that
      regenerates the diagram by running the Python code every time
      the file is saved. That makes a very nice workflow for iterating on
      each diagram independently of the others
    
    * the pre-commit script is given a set of folders which should be
      scanned, and it finds and runs the diagrams on pre-commit. It also
      creates and verifies the md5sum hash of the source Python file
      separately for each diagram and only runs diagram generation when
      the source file has changed since the hash was last saved and
      committed. The hash is stored next to the image and its source,
      in a file with the .md5sum extension
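
The per-diagram hash check described in the last bullet can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: the helper names `file_md5`, `needs_regeneration` and `record_hash` are hypothetical, and only the `.md5sum` suffix and the "regenerate when the stored hash differs" rule are taken from the description above.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_md5(path: Path) -> str:
    """Return the hex MD5 digest of the file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def needs_regeneration(source_file: Path) -> bool:
    """True when no stored hash exists or it differs from the current source hash."""
    # Hypothetical helper: the hash file sits next to the source, e.g.
    # diagram_x.py -> diagram_x.md5sum
    hash_file = source_file.with_suffix(".md5sum")
    if not hash_file.exists():
        return True
    return hash_file.read_text().strip() != file_md5(source_file)


def record_hash(source_file: Path) -> None:
    """Store the current source hash next to the source file."""
    source_file.with_suffix(".md5sum").write_text(file_md5(source_file) + "\n")
```

A caller would check `needs_regeneration(path)` for each diagram source, run the source file when it returns `True`, and then call `record_hash(path)` so that the next run is skipped until the source changes again.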
    
    Also updated the documentation in CONTRIBUTING.rst, explaining how
    to generate the diagrams and how the generation mechanism
    works.
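
As a rough illustration of which files the hook picks up, the sketch below applies the `files:` pattern from the `.pre-commit-config.yaml` change in this commit to a few paths. It assumes the pattern is applied to repository-relative paths with `re.search`, which is my understanding of how pre-commit treats `files:`; the helper name `is_diagram_source` is hypothetical.

```python
import re

# Pattern copied from the `files:` entry of the generate-airflow-diagrams hook.
DIAGRAM_SOURCE_PATTERN = re.compile(r"^docs/.*/diagram_[^/]*\.py$")


def is_diagram_source(path: str) -> bool:
    """Hypothetical helper: True when a repo-relative path matches the hook pattern."""
    return DIAGRAM_SOURCE_PATTERN.search(path) is not None


# A diagram source in a docs subdirectory matches.
print(is_diagram_source("docs/apache-airflow/img/diagram_basic_airflow_architecture.py"))
# The generated image next to it does not.
print(is_diagram_source("docs/apache-airflow/img/diagram_basic_airflow_architecture.png"))
# A file directly under docs/ does not match either, since the pattern
# requires at least one directory level between docs/ and the file name.
print(is_diagram_source("docs/diagram_top_level.py"))
```

Under these assumptions, only `diagram_*.py` files that live at least one directory below `docs/` are passed to the script.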
---
 .pre-commit-config.yaml                            |   4 +-
 .rat-excludes                                      |   3 +
 CONTRIBUTING.rst                                   |  45 +++++
 ...am_fab_auth_manager_airflow_architecture.md5sum |   1 +
 ...agram_fab_auth_manager_airflow_architecture.png | Bin 81823 -> 81735 bytes
 ...iagram_fab_auth_manager_airflow_architecture.py |  74 +++++++
 ...iagram_auth_manager_airflow_architecture.md5sum |   1 +
 .../diagram_auth_manager_airflow_architecture.png  | Bin 53958 -> 54220 bytes
 .../diagram_auth_manager_airflow_architecture.py   |  73 +++++++
 .../img/diagram_basic_airflow_architecture.md5sum  |   1 +
 .../img/diagram_basic_airflow_architecture.png     | Bin 100899 -> 87096 bytes
 .../img/diagram_basic_airflow_architecture.py      |  77 ++++++++
 ...agram_dag_processor_airflow_architecture.md5sum |   1 +
 .../diagram_dag_processor_airflow_architecture.png | Bin 121666 -> 106642 bytes
 .../diagram_dag_processor_airflow_architecture.py  |  84 ++++++++
 ...agram_fab_auth_manager_airflow_architecture.png | Bin 0 -> 49545 bytes
 .../diagrams/python_multiprocess_logo.png          | Bin
 .../pre_commit_generate_airflow_diagrams.py        | 217 ++-------------------
 18 files changed, 381 insertions(+), 200 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index dfed80447b..18c1d7d64c 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -416,8 +416,8 @@ repos:
         name: Generate airflow diagrams
         entry: ./scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py
         language: python
-        files: ^scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py
-        pass_filenames: false
+        files: ^docs/.*/diagram_[^/]*\.py$
+        pass_filenames: true
         additional_dependencies: ['rich>=12.4.4', "diagrams>=0.23.4"]
       - id: update-supported-versions
         name: Updates supported versions in documentation
diff --git a/.rat-excludes b/.rat-excludes
index 751742b1af..d881787de9 100644
--- a/.rat-excludes
+++ b/.rat-excludes
@@ -145,3 +145,6 @@ doap_airflow.rdf
 
 # PKG-INFO file
 PKG-INFO
+
+# checksum files
+.*\.md5sum
diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst
index 818ca6120d..3640b1eb31 100644
--- a/CONTRIBUTING.rst
+++ b/CONTRIBUTING.rst
@@ -981,6 +981,51 @@ Documentation for ``apache-airflow`` package and other packages that are closely
 providers packages are in ``/docs/`` directory. For detailed information on documentation development,
 see: `docs/README.rst <docs/README.rst>`_
 
+Diagrams
+========
+
+We started to use (and gradually convert old diagrams to use it) `Diagrams <https://diagrams.mingrammer.com/>`_
+as our tool of choice to generate diagrams. The diagrams are generated from Python code and can be
+automatically updated when the code changes. The diagrams are generated using pre-commit hooks (See
+static checks below) but they can also be generated manually by running the corresponding Python code.
+
+To run the code you need to install the dependencies in the virtualenv you use to run it:
+* ``pip install diagrams rich``. You need to have graphviz installed in your
+system (``brew install graphviz`` on macOS for example).
+
+The source code of the diagrams are next to the generated diagram, the difference is that the source
+code has ``.py`` extension and the generated diagram has ``.png`` extension. The pre-commit hook
+ ``generate-airflow-diagrams`` will look for ``diagram_*.py`` files in the ``docs`` subdirectories
+to find them and runs them when the sources changed and the diagrams are not up to date (the
+pre-commit will automatically generate an .md5sum hash of the sources and store it next to the diagram
+file).
+
+In order to generate the diagram manually you can run the following command:
+
+.. code-block:: bash
+
+    python <path-to-diagram-file>.py
+
+You can also generate all diagrams by:
+
+.. code-block:: bash
+
+    pre-commit run generate-airflow-diagrams
+
+or with Breeze:
+
+.. code-block:: bash
+
+    breeze static-checks --type generate-airflow-diagrams --all-files
+
+When you iterate over a diagram, you can also setup a "save" action in your IDE to run the python
+file automatically when you save the diagram file.
+
+Once you've done iteration and you are happy with the diagram, you can commit the diagram, the source
+code and the .md5sum file. The pre-commit hook will then not run the diagram generation until the
+source code for it changes.
+
+
 Static code checks
 ==================
 
diff --git a/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.md5sum b/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.md5sum
new file mode 100644
index 0000000000..fb928aa691
--- /dev/null
+++ b/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.md5sum
@@ -0,0 +1 @@
+aa73a8292341145e0f60682f7047503b
diff --git a/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.png b/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.png
index 4299bb28d2..9c7a1d1561 100644
Binary files a/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.png and b/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.png differ
diff --git a/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.py b/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.py
new file mode 100644
index 0000000000..393d988bb2
--- /dev/null
+++ b/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.py
@@ -0,0 +1,74 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from pathlib import Path
+
+from diagrams import Cluster, Diagram, Edge
+from diagrams.custom import Custom
+from diagrams.onprem.client import User
+from diagrams.onprem.database import PostgreSQL
+from rich.console import Console
+
+MY_DIR = Path(__file__).parent
+MY_FILENAME = Path(__file__).with_suffix("").name
+PYTHON_MULTIPROCESS_LOGO = MY_DIR.parents[1] / "diagrams" / "python_multiprocess_logo.png"
+
+console = Console(width=400, color_system="standard")
+
+
+def generate_fab_auth_manager_airflow_diagram():
+    image_file = (MY_DIR / MY_FILENAME).with_suffix(".png")
+    console.print(f"[bright_blue]Generating architecture image {image_file}")
+    with Diagram(
+        name="",
+        show=False,
+        direction="LR",
+        curvestyle="ortho",
+        filename=MY_FILENAME,
+    ):
+        user = User("User")
+        with Cluster("Airflow environment"):
+            webserver = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+            with Cluster("FAB provider"):
+                fab_auth_manager = Custom("FAB auth manager", PYTHON_MULTIPROCESS_LOGO.as_posix())
+            with Cluster("Core Airflow"):
+                auth_manager_interface = Custom(
+                    "Auth manager\ninterface", PYTHON_MULTIPROCESS_LOGO.as_posix()
+                )
+
+            db = PostgreSQL("Metadata DB")
+
+        user >> Edge(color="black", style="solid", reverse=True, label="Access to the console") >> webserver
+        (
+            webserver
+            >> Edge(color="black", style="solid", reverse=True, label="Is user authorized?")
+            >> fab_auth_manager
+        )
+        (fab_auth_manager >> Edge(color="black", style="solid", reverse=True) >> db)
+        (
+            fab_auth_manager
+            >> Edge(color="black", style="dotted", reverse=False, label="Inherit")
+            >> auth_manager_interface
+        )
+
+    console.print(f"[green]Generating architecture image {image_file}")
+
+
+if __name__ == "__main__":
+    generate_fab_auth_manager_airflow_diagram()
diff --git a/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.md5sum b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.md5sum
new file mode 100644
index 0000000000..ac3e24d848
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.md5sum
@@ -0,0 +1 @@
+5b82cba489898a46dcfe5f458eeee33b
diff --git a/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.png b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.png
index ba6cfaef61..35f3f418f2 100644
Binary files a/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.png and b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.png differ
diff --git a/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.py b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.py
new file mode 100644
index 0000000000..453d17267c
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.py
@@ -0,0 +1,73 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from pathlib import Path
+
+from diagrams import Cluster, Diagram, Edge
+from diagrams.custom import Custom
+from diagrams.onprem.client import User
+from rich.console import Console
+
+MY_DIR = Path(__file__).parent
+MY_FILENAME = Path(__file__).with_suffix("").name
+PYTHON_MULTIPROCESS_LOGO = MY_DIR.parents[1] / "diagrams" / "python_multiprocess_logo.png"
+
+console = Console(width=400, color_system="standard")
+
+
+def generate_auth_manager_airflow_diagram():
+    image_file = (MY_DIR / MY_FILENAME).with_suffix(".png")
+
+    console.print(f"[bright_blue]Generating architecture image {image_file}")
+    with Diagram(
+        name="",
+        show=False,
+        direction="LR",
+        curvestyle="ortho",
+        filename=MY_FILENAME,
+    ):
+        user = User("User")
+        with Cluster("Airflow environment"):
+            webserver = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+            with Cluster("Provider X"):
+                auth_manager = Custom("X auth manager", PYTHON_MULTIPROCESS_LOGO.as_posix())
+            with Cluster("Core Airflow"):
+                auth_manager_interface = Custom(
+                    "Auth manager\ninterface", PYTHON_MULTIPROCESS_LOGO.as_posix()
+                )
+
+        (user >> Edge(color="black", style="solid", reverse=True, label="Access to the console") >> webserver)
+
+        (
+            webserver
+            >> Edge(color="black", style="solid", reverse=True, label="Is user authorized?")
+            >> auth_manager
+        )
+
+        (
+            auth_manager
+            >> Edge(color="black", style="dotted", reverse=False, label="Inherit")
+            >> auth_manager_interface
+        )
+
+    console.print(f"[green]Generating architecture image {image_file}")
+
+
+if __name__ == "__main__":
+    generate_auth_manager_airflow_diagram()
diff --git a/docs/apache-airflow/img/diagram_basic_airflow_architecture.md5sum b/docs/apache-airflow/img/diagram_basic_airflow_architecture.md5sum
new file mode 100644
index 0000000000..d20c0307d4
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_basic_airflow_architecture.md5sum
@@ -0,0 +1 @@
+ac9bd11824e7faf5ed5232ff242c3157
diff --git a/docs/apache-airflow/img/diagram_basic_airflow_architecture.png b/docs/apache-airflow/img/diagram_basic_airflow_architecture.png
index 51f571e0e8..feae0a63bb 100644
Binary files a/docs/apache-airflow/img/diagram_basic_airflow_architecture.png and b/docs/apache-airflow/img/diagram_basic_airflow_architecture.png differ
diff --git a/docs/apache-airflow/img/diagram_basic_airflow_architecture.py b/docs/apache-airflow/img/diagram_basic_airflow_architecture.py
new file mode 100644
index 0000000000..d65a6ae83a
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_basic_airflow_architecture.py
@@ -0,0 +1,77 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from pathlib import Path
+
+from diagrams import Cluster, Diagram, Edge
+from diagrams.custom import Custom
+from diagrams.onprem.client import User
+from diagrams.onprem.database import PostgreSQL
+from diagrams.programming.flowchart import MultipleDocuments
+from rich.console import Console
+
+MY_DIR = Path(__file__).parent
+MY_FILENAME = Path(__file__).with_suffix("").name
+PYTHON_MULTIPROCESS_LOGO = MY_DIR.parents[1] / "diagrams" / "python_multiprocess_logo.png"
+
+console = Console(width=400, color_system="standard")
+
+
+def generate_basic_airflow_diagram():
+    image_file = (MY_DIR / MY_FILENAME).with_suffix(".png")
+
+    console.print(f"[bright_blue]Generating architecture image {image_file}")
+    with Diagram(
+        name="", show=False, direction="LR", curvestyle="ortho", filename=MY_FILENAME, outformat="png"
+    ):
+        with Cluster("Parsing & Scheduling"):
+            schedulers = Custom("Scheduler(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+        metadata_db = PostgreSQL("Metadata DB")
+
+        dag_author = User("DAG Author")
+        dag_files = MultipleDocuments("DAG files")
+
+        dag_author >> Edge(color="black", style="dashed", reverse=False) >> dag_files
+
+        with Cluster("Execution"):
+            workers = Custom("Worker(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+            triggerer = Custom("Triggerer(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+        schedulers - Edge(color="blue", style="dashed", taillabel="Executor") - workers
+
+        schedulers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+        workers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+        triggerer >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+
+        operations_user = User("Operations User")
+        with Cluster("UI"):
+            webservers = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+        webservers >> Edge(color="black", style="dashed", reverse=True) >> operations_user
+
+        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> webservers
+
+        dag_files >> Edge(color="brown", style="solid") >> workers
+        dag_files >> Edge(color="brown", style="solid") >> schedulers
+        dag_files >> Edge(color="brown", style="solid") >> triggerer
+    console.print(f"[green]Generating architecture image {image_file}")
+
+
+if __name__ == "__main__":
+    generate_basic_airflow_diagram()
diff --git a/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.md5sum b/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.md5sum
new file mode 100644
index 0000000000..ebe1a15d56
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.md5sum
@@ -0,0 +1 @@
+e189c45f79a7a878802bde13be27a112
diff --git a/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.png b/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.png
index f44eaa35ec..8a2d48df19 100644
Binary files a/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.png and b/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.png differ
diff --git a/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.py b/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.py
new file mode 100644
index 0000000000..714049d349
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.py
@@ -0,0 +1,84 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from pathlib import Path
+
+from diagrams import Cluster, Diagram, Edge
+from diagrams.custom import Custom
+from diagrams.onprem.client import User
+from diagrams.onprem.database import PostgreSQL
+from diagrams.programming.flowchart import MultipleDocuments
+from rich.console import Console
+
+MY_DIR = Path(__file__).parent
+MY_FILENAME = Path(__file__).with_suffix("").name
+PYTHON_MULTIPROCESS_LOGO = MY_DIR.parents[1] / "diagrams" / "python_multiprocess_logo.png"
+
+console = Console(width=400, color_system="standard")
+
+
+def generate_dag_processor_airflow_diagram():
+    dag_processor_architecture_image_file = (MY_DIR / MY_FILENAME).with_suffix(".png")
+    console.print(f"[bright_blue]Generating architecture image {dag_processor_architecture_image_file}")
+    with Diagram(
+        name="",
+        show=False,
+        direction="LR",
+        curvestyle="ortho",
+        filename=MY_FILENAME,
+        outformat="png",
+    ):
+        operations_user = User("Operations User")
+        with Cluster("No DAG Python Code Execution", graph_attr={"bgcolor": "lightgrey"}):
+            with Cluster("Scheduling"):
+                schedulers = Custom("Scheduler(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+            with Cluster("UI"):
+                webservers = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+        webservers >> Edge(color="black", style="dashed", reverse=True) >> operations_user
+
+        metadata_db = PostgreSQL("Metadata DB")
+
+        dag_author = User("DAG Author")
+        with Cluster("DAG Python Code Execution"):
+            with Cluster("Execution"):
+                workers = Custom("Worker(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+                triggerer = Custom("Triggerer(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+            with Cluster("Parsing"):
+                dag_processors = Custom("DAG\nProcessor(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+            dag_files = MultipleDocuments("DAG files")
+
+        dag_author >> Edge(color="black", style="dashed", reverse=False) >> dag_files
+
+        workers - Edge(color="blue", style="dashed", headlabel="Executor") - schedulers
+
+        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> webservers
+        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> schedulers
+        dag_processors >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+        workers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+        triggerer >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+
+        dag_files >> Edge(color="brown", style="solid") >> workers
+        dag_files >> Edge(color="brown", style="solid") >> dag_processors
+        dag_files >> Edge(color="brown", style="solid") >> triggerer
+    console.print(f"[green]Generating architecture image {dag_processor_architecture_image_file}")
+
+
+if __name__ == "__main__":
+    generate_dag_processor_airflow_diagram()
diff --git a/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.png b/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.png
new file mode 100644
index 0000000000..4057a67615
Binary files /dev/null and b/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.png differ
diff --git a/images/diagrams/python_multiprocess_logo.png b/docs/diagrams/python_multiprocess_logo.png
similarity index 100%
rename from images/diagrams/python_multiprocess_logo.png
rename to docs/diagrams/python_multiprocess_logo.png
diff --git a/scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py b/scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py
index 0afb9b9bf5..f809d566e3 100755
--- a/scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py
+++ b/scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py
@@ -18,217 +18,38 @@
 from __future__ import annotations
 
 import hashlib
-import os
+import subprocess
+import sys
 from pathlib import Path
 
-from diagrams import Cluster, Diagram, Edge
-from diagrams.custom import Custom
-from diagrams.onprem.client import User
-from diagrams.onprem.database import PostgreSQL
-from diagrams.programming.flowchart import MultipleDocuments
 from rich.console import Console
 
 console = Console(width=400, color_system="standard")
 
 LOCAL_DIR = Path(__file__).parent
 AIRFLOW_SOURCES_ROOT = Path(__file__).parents[3]
-DOCS_IMAGES_DIR = AIRFLOW_SOURCES_ROOT / "docs" / "apache-airflow" / "img"
-FAB_PROVIDER_DOCS_IMAGES_DIR = AIRFLOW_SOURCES_ROOT / "docs" / "apache-airflow-providers-fab" / "img"
-PYTHON_MULTIPROCESS_LOGO = AIRFLOW_SOURCES_ROOT / "images" / "diagrams" / "python_multiprocess_logo.png"
 
-BASIC_ARCHITECTURE_IMAGE_NAME = "diagram_basic_airflow_architecture"
-DAG_PROCESSOR_AIRFLOW_ARCHITECTURE_IMAGE_NAME = "diagram_dag_processor_airflow_architecture"
-AUTH_MANAGER_AIRFLOW_ARCHITECTURE_IMAGE_NAME = "diagram_auth_manager_airflow_architecture"
-FAB_AUTH_MANAGER_AIRFLOW_ARCHITECTURE_IMAGE_NAME = "diagram_fab_auth_manager_airflow_architecture"
-DIAGRAM_HASH_FILE_NAME = "diagram_hash.txt"
 
-
-def generate_basic_airflow_diagram():
-    basic_architecture_image_file = (DOCS_IMAGES_DIR / BASIC_ARCHITECTURE_IMAGE_NAME).with_suffix(".png")
-    console.print(f"[bright_blue]Generating architecture image {basic_architecture_image_file}")
-    with Diagram(
-        name="", show=False, direction="LR", curvestyle="ortho", filename=BASIC_ARCHITECTURE_IMAGE_NAME
-    ):
-        with Cluster("Parsing & Scheduling"):
-            schedulers = Custom("Scheduler(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-        metadata_db = PostgreSQL("Metadata DB")
-
-        dag_author = User("DAG Author")
-        dag_files = MultipleDocuments("DAG files")
-
-        dag_author >> Edge(color="black", style="dashed", reverse=False) >> dag_files
-
-        with Cluster("Execution"):
-            workers = Custom("Worker(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-            triggerer = Custom("Triggerer(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-        schedulers - Edge(color="blue", style="dashed", taillabel="Executor") - workers
-
-        schedulers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-        workers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-        triggerer >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-
-        operations_user = User("Operations User")
-        with Cluster("UI"):
-            webservers = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-        webservers >> Edge(color="black", style="dashed", reverse=True) >> operations_user
-
-        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> webservers
-
-        dag_files >> Edge(color="brown", style="solid") >> workers
-        dag_files >> Edge(color="brown", style="solid") >> schedulers
-        dag_files >> Edge(color="brown", style="solid") >> triggerer
-    console.print(f"[green]Generating architecture image {basic_architecture_image_file}")
-
-
-def generate_dag_processor_airflow_diagram():
-    dag_processor_architecture_image_file = (
-        DOCS_IMAGES_DIR / DAG_PROCESSOR_AIRFLOW_ARCHITECTURE_IMAGE_NAME
-    ).with_suffix(".png")
-    console.print(f"[bright_blue]Generating architecture image {dag_processor_architecture_image_file}")
-    with Diagram(
-        name="",
-        show=False,
-        direction="LR",
-        curvestyle="ortho",
-        filename=DAG_PROCESSOR_AIRFLOW_ARCHITECTURE_IMAGE_NAME,
-    ):
-        operations_user = User("Operations User")
-        with Cluster("No DAG Python Code Execution", graph_attr={"bgcolor": "lightgrey"}):
-            with Cluster("Scheduling"):
-                schedulers = Custom("Scheduler(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-            with Cluster("UI"):
-                webservers = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-        webservers >> Edge(color="black", style="dashed", reverse=True) >> operations_user
-
-        metadata_db = PostgreSQL("Metadata DB")
-
-        dag_author = User("DAG Author")
-        with Cluster("DAG Python Code Execution"):
-            with Cluster("Execution"):
-                workers = Custom("Worker(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-                triggerer = Custom("Triggerer(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-            with Cluster("Parsing"):
-                dag_processors = Custom("DAG\nProcessor(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-            dag_files = MultipleDocuments("DAG files")
-
-        dag_author >> Edge(color="black", style="dashed", reverse=False) >> dag_files
-
-        workers - Edge(color="blue", style="dashed", headlabel="Executor") - schedulers
-
-        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> webservers
-        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> schedulers
-        dag_processors >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-        workers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-        triggerer >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-
-        dag_files >> Edge(color="brown", style="solid") >> workers
-        dag_files >> Edge(color="brown", style="solid") >> dag_processors
-        dag_files >> Edge(color="brown", style="solid") >> triggerer
-    console.print(f"[green]Generating architecture image {dag_processor_architecture_image_file}")
-
-
-def generate_auth_manager_airflow_diagram():
-    auth_manager_architecture_image_file = (
-        DOCS_IMAGES_DIR / AUTH_MANAGER_AIRFLOW_ARCHITECTURE_IMAGE_NAME
-    ).with_suffix(".png")
-    console.print(f"[bright_blue]Generating architecture image {auth_manager_architecture_image_file}")
-    with Diagram(
-        name="",
-        show=False,
-        direction="LR",
-        curvestyle="ortho",
-        filename=AUTH_MANAGER_AIRFLOW_ARCHITECTURE_IMAGE_NAME,
-    ):
-        user = User("User")
-        with Cluster("Airflow environment"):
-            webserver = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-            with Cluster("Provider X"):
-                auth_manager = Custom("X auth manager", PYTHON_MULTIPROCESS_LOGO.as_posix())
-            with Cluster("Core Airflow"):
-                auth_manager_interface = Custom(
-                    "Auth manager\ninterface", PYTHON_MULTIPROCESS_LOGO.as_posix()
-                )
-
-        (user >> Edge(color="black", style="solid", reverse=True, label="Access to the console") >> webserver)
-
-        (
-            webserver
-            >> Edge(color="black", style="solid", reverse=True, label="Is user authorized?")
-            >> auth_manager
-        )
-
-        (
-            auth_manager
-            >> Edge(color="black", style="dotted", reverse=False, label="Inherit")
-            >> auth_manager_interface
-        )
-
-    console.print(f"[green]Generating architecture image {auth_manager_architecture_image_file}")
-
-
-def generate_fab_auth_manager_airflow_diagram():
-    auth_manager_architecture_image_file = (
-        FAB_PROVIDER_DOCS_IMAGES_DIR / FAB_AUTH_MANAGER_AIRFLOW_ARCHITECTURE_IMAGE_NAME
-    ).with_suffix(".png")
-    console.print(f"[bright_blue]Generating architecture image {auth_manager_architecture_image_file}")
-    with Diagram(
-        name="",
-        show=False,
-        direction="LR",
-        curvestyle="ortho",
-        filename=FAB_AUTH_MANAGER_AIRFLOW_ARCHITECTURE_IMAGE_NAME,
-    ):
-        user = User("User")
-        with Cluster("Airflow environment"):
-            webserver = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-            with Cluster("FAB provider"):
-                fab_auth_manager = Custom("FAB auth manager", PYTHON_MULTIPROCESS_LOGO.as_posix())
-            with Cluster("Core Airflow"):
-                auth_manager_interface = Custom(
-                    "Auth manager\ninterface", PYTHON_MULTIPROCESS_LOGO.as_posix()
-                )
-
-            db = PostgreSQL("Metadata DB")
-
-        user >> Edge(color="black", style="solid", reverse=True, label="Access to the console") >> webserver
-        (
-            webserver
-            >> Edge(color="black", style="solid", reverse=True, label="Is user authorized?")
-            >> fab_auth_manager
-        )
-        (fab_auth_manager >> Edge(color="black", style="solid", reverse=True) >> db)
-        (
-            fab_auth_manager
-            >> Edge(color="black", style="dotted", reverse=False, label="Inherit")
-            >> auth_manager_interface
-        )
-
-    console.print(f"[green]Generating architecture image {auth_manager_architecture_image_file}")
+def _get_file_hash(file_to_check: Path) -> str:
+    hash_md5 = hashlib.md5()
+    hash_md5.update(Path(file_to_check).resolve().read_bytes())
+    return hash_md5.hexdigest()
 
 
 def main():
-    hash_md5 = hashlib.md5()
-    hash_md5.update(Path(__file__).resolve().read_bytes())
-    my_file_hash = hash_md5.hexdigest()
-    hash_file = LOCAL_DIR / DIAGRAM_HASH_FILE_NAME
-    if not hash_file.exists() or not hash_file.read_text().strip() == str(my_file_hash).strip():
-        os.chdir(DOCS_IMAGES_DIR)
-        generate_basic_airflow_diagram()
-        generate_dag_processor_airflow_diagram()
-        generate_auth_manager_airflow_diagram()
-        os.chdir(FAB_PROVIDER_DOCS_IMAGES_DIR)
-        generate_fab_auth_manager_airflow_diagram()
-        os.chdir(DOCS_IMAGES_DIR)
-        hash_file.write_text(str(my_file_hash) + "\n")
-    else:
-        console.print("[bright_blue]No changes to generation script. Not regenerating the images.")
+    # get all files as arguments
+    for arg in sys.argv[1:]:
+        source_file = Path(arg).resolve()
+        checksum = _get_file_hash(source_file)
+        hash_file = source_file.with_suffix(".md5sum")
+        if not hash_file.exists() or not hash_file.read_text().strip() == str(checksum).strip():
+            console.print(f"[bright_blue]Changes in {source_file}. Regenerating the image.")
+            subprocess.run(
+                [sys.executable, source_file.resolve().as_posix()], check=True, cwd=source_file.parent
+            )
+            hash_file.write_text(str(checksum) + "\n")
+        else:
+            console.print(f"[bright_blue]No changes in {source_file}. Not regenerating the image.")
 
 
 if __name__ == "__main__":

