This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch v2-8-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit ae0ceb382dc0dfb91adc5d22b480c321ffbe3fe0
Author: Jarek Potiuk <[email protected]>
AuthorDate: Wed Dec 20 19:44:33 2023 +0100

    Improve pre-commit to generate Airflow diagrams as a code (#36333)
    
    Since more diagrams in Airflow are being generated using the
    "diagram as code" approach, this PR improves the pre-commit hook so
    that it supports generating images that come from different sources,
    live in different directories and are generated independently. This
    makes the whole process more distributed and makes it easy for
    whoever creates a diagram to add their own.
    
    The changes implemented in this PR:
    
    * the code that generates each diagram is now placed next to the
      diagram it generates. It has the same name as the diagram, but
      with the .py extension. This way it is immediately visible where
      the source of each diagram is (right next to the diagram)
    
    * each of the diagram .py files is runnable on its own. This way
      you can easily regenerate a diagram by running the corresponding
      Python file, or even automate it by configuring a "save" action
      that runs the Python code every time the file is saved. That makes
      for a very nice workflow when iterating on each diagram,
      independently of the others
    
    * the pre-commit script is given a set of folders which should be
      scanned, and it finds and runs the diagrams on pre-commit. It also
      creates and verifies the md5sum hash of the source Python file
      separately for each diagram and only runs diagram generation when
      the source file has changed since the hash was last saved and
      committed. The hash is stored next to the image and its source,
      in a file with the .md5sum extension
    
    Also updated the documentation in CONTRIBUTING.rst to explain how
    to generate the diagrams and how the generation mechanism works.
    
    (cherry picked from commit b35b08ec41814b6fe5d7388296db83a726e6d6d0)
---
 .pre-commit-config.yaml                            |   4 +-
 .rat-excludes                                      |   3 +
 CONTRIBUTING.rst                                   |  45 ++++++++
 ...agram_fab_auth_manager_airflow_architecture.png | Bin 0 -> 81735 bytes
 ...iagram_auth_manager_airflow_architecture.md5sum |   1 +
 .../diagram_auth_manager_airflow_architecture.png  | Bin 0 -> 54220 bytes
 .../diagram_auth_manager_airflow_architecture.py   |  73 +++++++++++++
 .../img/diagram_basic_airflow_architecture.md5sum  |   1 +
 .../img/diagram_basic_airflow_architecture.py      |  77 +++++++++++++
 ...agram_dag_processor_airflow_architecture.md5sum |   1 +
 .../diagram_dag_processor_airflow_architecture.py  |  84 ++++++++++++++
 ...am_fab_auth_manager_airflow_architecture.md5sum |   1 +
 ...agram_fab_auth_manager_airflow_architecture.png | Bin 0 -> 49545 bytes
 ...iagram_fab_auth_manager_airflow_architecture.py |  74 +++++++++++++
 .../diagrams/python_multiprocess_logo.png          | Bin
 .../pre_commit_generate_airflow_diagrams.py        | 121 ++++-----------------
 16 files changed, 381 insertions(+), 104 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index c12be3a5f1..7a7b2a64e5 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -416,8 +416,8 @@ repos:
         name: Generate airflow diagrams
         entry: ./scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py
         language: python
-        files: ^scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py
-        pass_filenames: false
+        files: ^docs/.*/diagram_[^/]*\.py$
+        pass_filenames: true
         additional_dependencies: ['rich>=12.4.4', "diagrams>=0.23.4"]
       - id: update-supported-versions
         name: Updates supported versions in documentation
diff --git a/.rat-excludes b/.rat-excludes
index 751742b1af..d881787de9 100644
--- a/.rat-excludes
+++ b/.rat-excludes
@@ -145,3 +145,6 @@ doap_airflow.rdf
 
 # PKG-INFO file
 PKG-INFO
+
+# checksum files
+.*\.md5sum
diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst
index c015e457bd..abb9ed59da 100644
--- a/CONTRIBUTING.rst
+++ b/CONTRIBUTING.rst
@@ -942,6 +942,51 @@ Documentation for ``apache-airflow`` package and other packages that are closely
 providers packages are in ``/docs/`` directory. For detailed information on documentation development,
 see: `docs/README.rst <docs/README.rst>`_
 
+Diagrams
+========
+
+We started to use (and gradually convert old diagrams to use it) `Diagrams <https://diagrams.mingrammer.com/>`_
+as our tool of choice to generate diagrams. The diagrams are generated from Python code and can be
+automatically updated when the code changes. The diagrams are generated using pre-commit hooks (See
+static checks below) but they can also be generated manually by running the corresponding Python code.
+
+To run the code you need to install the dependencies in the virtualenv you use to run it:
+* ``pip install diagrams rich``. You need to have graphviz installed in your
+system (``brew install graphviz`` on macOS for example).
+
+The source code of the diagrams are next to the generated diagram, the difference is that the source
+code has ``.py`` extension and the generated diagram has ``.png`` extension. The pre-commit hook
+ ``generate-airflow-diagrams`` will look for ``diagram_*.py`` files in the ``docs`` subdirectories
+to find them and runs them when the sources changed and the diagrams are not up to date (the
+pre-commit will automatically generate an .md5sum hash of the sources and store it next to the diagram
+file).
+
+In order to generate the diagram manually you can run the following command:
+
+.. code-block:: bash
+
+    python <path-to-diagram-file>.py
+
+You can also generate all diagrams by:
+
+.. code-block:: bash
+
+    pre-commit run generate-airflow-diagrams
+
+or with Breeze:
+
+.. code-block:: bash
+
+    breeze static-checks --type generate-airflow-diagrams --all-files
+
+When you iterate over a diagram, you can also setup a "save" action in your IDE to run the python
+file automatically when you save the diagram file.
+
+Once you've done iteration and you are happy with the diagram, you can commit the diagram, the source
+code and the .md5sum file. The pre-commit hook will then not run the diagram generation until the
+source code for it changes.
+
+
 Static code checks
 ==================
 
diff --git a/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.png b/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.png
new file mode 100644
index 0000000000..9c7a1d1561
Binary files /dev/null and b/docs/apache-airflow-providers-fab/img/diagram_fab_auth_manager_airflow_architecture.png differ
diff --git a/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.md5sum b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.md5sum
new file mode 100644
index 0000000000..ac3e24d848
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.md5sum
@@ -0,0 +1 @@
+5b82cba489898a46dcfe5f458eeee33b
diff --git a/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.png b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.png
new file mode 100644
index 0000000000..35f3f418f2
Binary files /dev/null and b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.png differ
diff --git a/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.py b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.py
new file mode 100644
index 0000000000..453d17267c
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_auth_manager_airflow_architecture.py
@@ -0,0 +1,73 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from pathlib import Path
+
+from diagrams import Cluster, Diagram, Edge
+from diagrams.custom import Custom
+from diagrams.onprem.client import User
+from rich.console import Console
+
+MY_DIR = Path(__file__).parent
+MY_FILENAME = Path(__file__).with_suffix("").name
+PYTHON_MULTIPROCESS_LOGO = MY_DIR.parents[1] / "diagrams" / "python_multiprocess_logo.png"
+
+console = Console(width=400, color_system="standard")
+
+
+def generate_auth_manager_airflow_diagram():
+    image_file = (MY_DIR / MY_FILENAME).with_suffix(".png")
+
+    console.print(f"[bright_blue]Generating architecture image {image_file}")
+    with Diagram(
+        name="",
+        show=False,
+        direction="LR",
+        curvestyle="ortho",
+        filename=MY_FILENAME,
+    ):
+        user = User("User")
+        with Cluster("Airflow environment"):
+            webserver = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+            with Cluster("Provider X"):
+                auth_manager = Custom("X auth manager", PYTHON_MULTIPROCESS_LOGO.as_posix())
+            with Cluster("Core Airflow"):
+                auth_manager_interface = Custom(
+                    "Auth manager\ninterface", PYTHON_MULTIPROCESS_LOGO.as_posix()
+                )
+
+        (user >> Edge(color="black", style="solid", reverse=True, label="Access to the console") >> webserver)
+
+        (
+            webserver
+            >> Edge(color="black", style="solid", reverse=True, label="Is user authorized?")
+            >> auth_manager
+        )
+
+        (
+            auth_manager
+            >> Edge(color="black", style="dotted", reverse=False, label="Inherit")
+            >> auth_manager_interface
+        )
+
+    console.print(f"[green]Generating architecture image {image_file}")
+
+
+if __name__ == "__main__":
+    generate_auth_manager_airflow_diagram()
diff --git a/docs/apache-airflow/img/diagram_basic_airflow_architecture.md5sum b/docs/apache-airflow/img/diagram_basic_airflow_architecture.md5sum
new file mode 100644
index 0000000000..d20c0307d4
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_basic_airflow_architecture.md5sum
@@ -0,0 +1 @@
+ac9bd11824e7faf5ed5232ff242c3157
diff --git a/docs/apache-airflow/img/diagram_basic_airflow_architecture.py b/docs/apache-airflow/img/diagram_basic_airflow_architecture.py
new file mode 100644
index 0000000000..d65a6ae83a
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_basic_airflow_architecture.py
@@ -0,0 +1,77 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from pathlib import Path
+
+from diagrams import Cluster, Diagram, Edge
+from diagrams.custom import Custom
+from diagrams.onprem.client import User
+from diagrams.onprem.database import PostgreSQL
+from diagrams.programming.flowchart import MultipleDocuments
+from rich.console import Console
+
+MY_DIR = Path(__file__).parent
+MY_FILENAME = Path(__file__).with_suffix("").name
+PYTHON_MULTIPROCESS_LOGO = MY_DIR.parents[1] / "diagrams" / "python_multiprocess_logo.png"
+
+console = Console(width=400, color_system="standard")
+
+
+def generate_basic_airflow_diagram():
+    image_file = (MY_DIR / MY_FILENAME).with_suffix(".png")
+
+    console.print(f"[bright_blue]Generating architecture image {image_file}")
+    with Diagram(
+        name="", show=False, direction="LR", curvestyle="ortho", filename=MY_FILENAME, outformat="png"
+    ):
+        with Cluster("Parsing & Scheduling"):
+            schedulers = Custom("Scheduler(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+        metadata_db = PostgreSQL("Metadata DB")
+
+        dag_author = User("DAG Author")
+        dag_files = MultipleDocuments("DAG files")
+
+        dag_author >> Edge(color="black", style="dashed", reverse=False) >> dag_files
+
+        with Cluster("Execution"):
+            workers = Custom("Worker(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+            triggerer = Custom("Triggerer(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+        schedulers - Edge(color="blue", style="dashed", taillabel="Executor") - workers
+
+        schedulers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+        workers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+        triggerer >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+
+        operations_user = User("Operations User")
+        with Cluster("UI"):
+            webservers = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+        webservers >> Edge(color="black", style="dashed", reverse=True) >> operations_user
+
+        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> webservers
+
+        dag_files >> Edge(color="brown", style="solid") >> workers
+        dag_files >> Edge(color="brown", style="solid") >> schedulers
+        dag_files >> Edge(color="brown", style="solid") >> triggerer
+    console.print(f"[green]Generating architecture image {image_file}")
+
+
+if __name__ == "__main__":
+    generate_basic_airflow_diagram()
diff --git a/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.md5sum b/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.md5sum
new file mode 100644
index 0000000000..ebe1a15d56
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.md5sum
@@ -0,0 +1 @@
+e189c45f79a7a878802bde13be27a112
diff --git a/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.py b/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.py
new file mode 100644
index 0000000000..714049d349
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_dag_processor_airflow_architecture.py
@@ -0,0 +1,84 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from pathlib import Path
+
+from diagrams import Cluster, Diagram, Edge
+from diagrams.custom import Custom
+from diagrams.onprem.client import User
+from diagrams.onprem.database import PostgreSQL
+from diagrams.programming.flowchart import MultipleDocuments
+from rich.console import Console
+
+MY_DIR = Path(__file__).parent
+MY_FILENAME = Path(__file__).with_suffix("").name
+PYTHON_MULTIPROCESS_LOGO = MY_DIR.parents[1] / "diagrams" / "python_multiprocess_logo.png"
+
+console = Console(width=400, color_system="standard")
+
+
+def generate_dag_processor_airflow_diagram():
+    dag_processor_architecture_image_file = (MY_DIR / MY_FILENAME).with_suffix(".png")
+    console.print(f"[bright_blue]Generating architecture image {dag_processor_architecture_image_file}")
+    with Diagram(
+        name="",
+        show=False,
+        direction="LR",
+        curvestyle="ortho",
+        filename=MY_FILENAME,
+        outformat="png",
+    ):
+        operations_user = User("Operations User")
+        with Cluster("No DAG Python Code Execution", graph_attr={"bgcolor": "lightgrey"}):
+            with Cluster("Scheduling"):
+                schedulers = Custom("Scheduler(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+            with Cluster("UI"):
+                webservers = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+        webservers >> Edge(color="black", style="dashed", reverse=True) >> operations_user
+
+        metadata_db = PostgreSQL("Metadata DB")
+
+        dag_author = User("DAG Author")
+        with Cluster("DAG Python Code Execution"):
+            with Cluster("Execution"):
+                workers = Custom("Worker(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+                triggerer = Custom("Triggerer(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+            with Cluster("Parsing"):
+                dag_processors = Custom("DAG\nProcessor(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+            dag_files = MultipleDocuments("DAG files")
+
+        dag_author >> Edge(color="black", style="dashed", reverse=False) >> dag_files
+
+        workers - Edge(color="blue", style="dashed", headlabel="Executor") - schedulers
+
+        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> webservers
+        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> schedulers
+        dag_processors >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+        workers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+        triggerer >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
+
+        dag_files >> Edge(color="brown", style="solid") >> workers
+        dag_files >> Edge(color="brown", style="solid") >> dag_processors
+        dag_files >> Edge(color="brown", style="solid") >> triggerer
+    console.print(f"[green]Generating architecture image {dag_processor_architecture_image_file}")
+
+
+if __name__ == "__main__":
+    generate_dag_processor_airflow_diagram()
diff --git a/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.md5sum b/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.md5sum
new file mode 100644
index 0000000000..fb928aa691
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.md5sum
@@ -0,0 +1 @@
+aa73a8292341145e0f60682f7047503b
diff --git a/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.png b/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.png
new file mode 100644
index 0000000000..4057a67615
Binary files /dev/null and b/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.png differ
diff --git a/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.py b/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.py
new file mode 100644
index 0000000000..393d988bb2
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_fab_auth_manager_airflow_architecture.py
@@ -0,0 +1,74 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from pathlib import Path
+
+from diagrams import Cluster, Diagram, Edge
+from diagrams.custom import Custom
+from diagrams.onprem.client import User
+from diagrams.onprem.database import PostgreSQL
+from rich.console import Console
+
+MY_DIR = Path(__file__).parent
+MY_FILENAME = Path(__file__).with_suffix("").name
+PYTHON_MULTIPROCESS_LOGO = MY_DIR.parents[1] / "diagrams" / "python_multiprocess_logo.png"
+
+console = Console(width=400, color_system="standard")
+
+
+def generate_fab_auth_manager_airflow_diagram():
+    image_file = (MY_DIR / MY_FILENAME).with_suffix(".png")
+    console.print(f"[bright_blue]Generating architecture image {image_file}")
+    with Diagram(
+        name="",
+        show=False,
+        direction="LR",
+        curvestyle="ortho",
+        filename=MY_FILENAME,
+    ):
+        user = User("User")
+        with Cluster("Airflow environment"):
+            webserver = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
+
+            with Cluster("FAB provider"):
+                fab_auth_manager = Custom("FAB auth manager", PYTHON_MULTIPROCESS_LOGO.as_posix())
+            with Cluster("Core Airflow"):
+                auth_manager_interface = Custom(
+                    "Auth manager\ninterface", PYTHON_MULTIPROCESS_LOGO.as_posix()
+                )
+
+            db = PostgreSQL("Metadata DB")
+
+        user >> Edge(color="black", style="solid", reverse=True, label="Access to the console") >> webserver
+        (
+            webserver
+            >> Edge(color="black", style="solid", reverse=True, label="Is user authorized?")
+            >> fab_auth_manager
+        )
+        (fab_auth_manager >> Edge(color="black", style="solid", reverse=True) >> db)
+        (
+            fab_auth_manager
+            >> Edge(color="black", style="dotted", reverse=False, label="Inherit")
+            >> auth_manager_interface
+        )
+
+    console.print(f"[green]Generating architecture image {image_file}")
+
+
+if __name__ == "__main__":
+    generate_fab_auth_manager_airflow_diagram()
diff --git a/images/diagrams/python_multiprocess_logo.png b/docs/diagrams/python_multiprocess_logo.png
similarity index 100%
rename from images/diagrams/python_multiprocess_logo.png
rename to docs/diagrams/python_multiprocess_logo.png
diff --git a/scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py b/scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py
index 22f05715b1..f809d566e3 100755
--- a/scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py
+++ b/scripts/ci/pre_commit/pre_commit_generate_airflow_diagrams.py
@@ -18,121 +18,38 @@
 from __future__ import annotations
 
 import hashlib
-import os
+import subprocess
+import sys
 from pathlib import Path
 
-from diagrams import Cluster, Diagram, Edge
-from diagrams.custom import Custom
-from diagrams.onprem.client import User
-from diagrams.onprem.database import PostgreSQL
-from diagrams.programming.flowchart import MultipleDocuments
 from rich.console import Console
 
 console = Console(width=400, color_system="standard")
 
 LOCAL_DIR = Path(__file__).parent
 AIRFLOW_SOURCES_ROOT = Path(__file__).parents[3]
-DOCS_IMAGES_DIR = AIRFLOW_SOURCES_ROOT / "docs" / "apache-airflow" / "img"
-PYTHON_MULTIPROCESS_LOGO = AIRFLOW_SOURCES_ROOT / "images" / "diagrams" / "python_multiprocess_logo.png"
 
-BASIC_ARCHITECTURE_IMAGE_NAME = "diagram_basic_airflow_architecture"
-DAG_PROCESSOR_AIRFLOW_ARCHITECTURE_IMAGE_NAME = "diagram_dag_processor_airflow_architecture"
-DIAGRAM_HASH_FILE_NAME = "diagram_hash.txt"
 
-
-def generate_basic_airflow_diagram(filename: str):
-    basic_architecture_image_file = (DOCS_IMAGES_DIR / BASIC_ARCHITECTURE_IMAGE_NAME).with_suffix(".png")
-    console.print(f"[bright_blue]Generating architecture image {basic_architecture_image_file}")
-    with Diagram(name="", show=False, direction="LR", curvestyle="ortho", filename=filename):
-        with Cluster("Parsing & Scheduling"):
-            schedulers = Custom("Scheduler(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-        metadata_db = PostgreSQL("Metadata DB")
-
-        dag_author = User("DAG Author")
-        dag_files = MultipleDocuments("DAG files")
-
-        dag_author >> Edge(color="black", style="dashed", reverse=False) >> dag_files
-
-        with Cluster("Execution"):
-            workers = Custom("Worker(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-            triggerer = Custom("Triggerer(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-        schedulers - Edge(color="blue", style="dashed", taillabel="Executor") - workers
-
-        schedulers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-        workers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-        triggerer >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-
-        operations_user = User("Operations User")
-        with Cluster("UI"):
-            webservers = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-        webservers >> Edge(color="black", style="dashed", reverse=True) >> operations_user
-
-        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> webservers
-
-        dag_files >> Edge(color="brown", style="solid") >> workers
-        dag_files >> Edge(color="brown", style="solid") >> schedulers
-        dag_files >> Edge(color="brown", style="solid") >> triggerer
-    console.print(f"[green]Generating architecture image {basic_architecture_image_file}")
-
-
-def generate_dag_processor_airflow_diagram(filename: str):
-    dag_processor_architecture_image_file = (
-        DOCS_IMAGES_DIR / DAG_PROCESSOR_AIRFLOW_ARCHITECTURE_IMAGE_NAME
-    ).with_suffix(".png")
-    console.print(f"[bright_blue]Generating architecture image {dag_processor_architecture_image_file}")
-    with Diagram(name="", show=False, direction="LR", curvestyle="ortho", filename=filename):
-        operations_user = User("Operations User")
-        with Cluster("No DAG Python Code Execution", graph_attr={"bgcolor": "lightgrey"}):
-            with Cluster("Scheduling"):
-                schedulers = Custom("Scheduler(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-            with Cluster("UI"):
-                webservers = Custom("Webserver(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-
-        webservers >> Edge(color="black", style="dashed", reverse=True) >> operations_user
-
-        metadata_db = PostgreSQL("Metadata DB")
-
-        dag_author = User("DAG Author")
-        with Cluster("DAG Python Code Execution"):
-            with Cluster("Execution"):
-                workers = Custom("Worker(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-                triggerer = Custom("Triggerer(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-            with Cluster("Parsing"):
-                dag_processors = Custom("DAG\nProcessor(s)", PYTHON_MULTIPROCESS_LOGO.as_posix())
-            dag_files = MultipleDocuments("DAG files")
-
-        dag_author >> Edge(color="black", style="dashed", reverse=False) >> dag_files
-
-        workers - Edge(color="blue", style="dashed", headlabel="Executor") - schedulers
-
-        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> webservers
-        metadata_db >> Edge(color="red", style="dotted", reverse=True) >> schedulers
-        dag_processors >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-        workers >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-        triggerer >> Edge(color="red", style="dotted", reverse=True) >> metadata_db
-
-        dag_files >> Edge(color="brown", style="solid") >> workers
-        dag_files >> Edge(color="brown", style="solid") >> dag_processors
-        dag_files >> Edge(color="brown", style="solid") >> triggerer
-    console.print(f"[green]Generating architecture image {dag_processor_architecture_image_file}")
+def _get_file_hash(file_to_check: Path) -> str:
+    hash_md5 = hashlib.md5()
+    hash_md5.update(Path(file_to_check).resolve().read_bytes())
+    return hash_md5.hexdigest()
 
 
 def main():
-    hash_md5 = hashlib.md5()
-    hash_md5.update(Path(__file__).resolve().read_bytes())
-    my_file_hash = hash_md5.hexdigest()
-    hash_file = LOCAL_DIR / DIAGRAM_HASH_FILE_NAME
-    if not hash_file.exists() or not hash_file.read_text().strip() == str(my_file_hash).strip():
-        os.chdir(DOCS_IMAGES_DIR)
-        generate_basic_airflow_diagram(BASIC_ARCHITECTURE_IMAGE_NAME)
-        generate_dag_processor_airflow_diagram(DAG_PROCESSOR_AIRFLOW_ARCHITECTURE_IMAGE_NAME)
-        hash_file.write_text(str(my_file_hash) + "\n")
-    else:
-        console.print("[bright_blue]No changes to generation script. Not regenerating the images.")
+    # get all files as arguments
+    for arg in sys.argv[1:]:
+        source_file = Path(arg).resolve()
+        checksum = _get_file_hash(source_file)
+        hash_file = source_file.with_suffix(".md5sum")
+        if not hash_file.exists() or not hash_file.read_text().strip() == str(checksum).strip():
+            console.print(f"[bright_blue]Changes in {source_file}. Regenerating the image.")
+            subprocess.run(
+                [sys.executable, source_file.resolve().as_posix()], check=True, cwd=source_file.parent
+            )
+            hash_file.write_text(str(checksum) + "\n")
+        else:
+            console.print(f"[bright_blue]No changes in {source_file}. Not regenerating the image.")
 
 
 if __name__ == "__main__":

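Illustration (not part of the commit): the per-diagram staleness check described in the commit message can be sketched roughly as below. This is a minimal sketch that assumes the same convention as the patch, namely an .md5sum file stored next to each diagram source; the helper names needs_regeneration and record_hash are hypothetical and do not appear in the Airflow sources, and the sketch leaves out the step that actually runs the diagram script.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path: Path) -> str:
    """Return the hex MD5 digest of the file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def needs_regeneration(source_file: Path) -> bool:
    """True when no hash is stored yet or the stored hash differs from the source.

    Hypothetical helper: mirrors the condition used in the patched main().
    """
    hash_file = source_file.with_suffix(".md5sum")
    if not hash_file.exists():
        return True
    return hash_file.read_text().strip() != file_md5(source_file)


def record_hash(source_file: Path) -> None:
    """Store the current source hash next to the source (hypothetical helper)."""
    source_file.with_suffix(".md5sum").write_text(file_md5(source_file) + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "diagram_example.py"
        src.write_text("print('v1')\n")
        print(needs_regeneration(src))  # no .md5sum file yet
        record_hash(src)
        print(needs_regeneration(src))  # hash matches the source
        src.write_text("print('v2')\n")
        print(needs_regeneration(src))  # source changed after the hash was stored
```

Keeping one hash file per diagram means that editing one diagram source only marks that diagram as stale, which matches the "generated independently" goal stated in the commit message.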