This is an automated email from the ASF dual-hosted git repository.

ryw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
     new 72b2be7  [AIRFLOW-XXX] Add task execution process on Celery Execution 
diagram (#6961)
72b2be7 is described below

commit 72b2be71b054d6d9abfe62e46d833e220c284113
Author: Kamil BreguĊ‚a <[email protected]>
AuthorDate: Wed Sep 2 13:47:34 2020 +0200

    [AIRFLOW-XXX] Add task execution process on Celery Execution diagram (#6961)
---
 .pre-commit-config.yaml                   |   4 +-
 docs/executor/celery.rst                  |  35 ++++++++++++++
 docs/img/run_task_on_celery_executor.png  | Bin 0 -> 55939 bytes
 docs/img/run_task_on_celery_executor.puml |  77 ++++++++++++++++++++++++++++++
 4 files changed, 114 insertions(+), 2 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index f7dce85..2e5f9e1 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -58,8 +58,8 @@ repos:
           - --fuzzy-match-generates-todo
         files: \.rst$
       - id: insert-license
-        name: Add license for all JS/CSS files
-        files: \.(js|css)$
+        name: Add license for all JS/CSS/PUML files
+        files: \.(js|css|puml)$
         exclude: ^\.github/.*$
         args:
           - --comment-style
diff --git a/docs/executor/celery.rst b/docs/executor/celery.rst
index a1245cb..bab42e0 100644
--- a/docs/executor/celery.rst
+++ b/docs/executor/celery.rst
@@ -158,6 +158,41 @@ The components communicate with each other in many places
 * [10] **Scheduler** --> **Celery's result backend** - Gets information about 
the status of completed tasks
 * [11] **Scheduler** --> **Celery's broker** - Put the commands to be executed
 
+Task execution process
+----------------------
+
+.. figure:: ../img/run_task_on_celery_executor.png
+    :scale: 50 %
+
+    Sequence diagram - task execution process
+
+Initially, two processes are running:
+
+- SchedulerProcess - process the tasks and run using CeleryExecutor
+- WorkerProcess - observes the queue waiting for new tasks to appear
+- WorkerChildProcess - waits for new tasks
+
+Two databases are also available:
+
+- QueueBroker
+- ResultBackend
+
+During this process, two 2 process are created:
+
+- LocalTaskJobProcess - It logic is described by LocalTaskJob. It is 
monitoring RawTaskProcess. New processes are started using TaskRunner.
+- RawTaskProcess - It is process with the user code e.g. 
:meth:`~airflow.models.BaseOperator.execute`.
+
+| [1] **SchedulerProcess** processes the tasks and when it finds a task that 
needs to be done, sends it to the **QueueBroker**.
+| [2] **QueueBroker** also begins to periodically query **ResultBackend** for 
the status of the task.
+| [3] **QueueBroker**, when it becomes aware of the task, sends information 
about it to one WorkerProcess.
+| [4] **WorkerProcess** assigns a single task to a one **WorkerChildProcess**.
+| [5] **WorkerChildProcess** performs the proper task handling functions - 
:meth:`~airflow.executor.celery_executor.execute_command`. It creates a new 
process - **LocalTaskJobProcess**.
+| [6] LocalTaskJobProcess logic is described by 
:class:`~airflow.jobs.local_task_job.LocalTaskJob` class. It starts new process 
using TaskRunner.
+| [7][8] Process **RawTaskProcess** and **LocalTaskJobProcess** is stopped 
when they have finished their work.
+| [10][12] **WorkerChildProcess** notifies the main process - 
**WorkerProcess** about the end of the task and the availability of subsequent 
tasks.
+| [11] **WorkerProcess** saves status information in **ResultBackend**.
+| [13] When **SchedulerProcess** asks **ResultBackend** again about the 
status, it will get information about the status of the task.
+
 Queues
 ------
 
diff --git a/docs/img/run_task_on_celery_executor.png 
b/docs/img/run_task_on_celery_executor.png
new file mode 100644
index 0000000..e3433c6
Binary files /dev/null and b/docs/img/run_task_on_celery_executor.png differ
diff --git a/docs/img/run_task_on_celery_executor.puml 
b/docs/img/run_task_on_celery_executor.puml
new file mode 100644
index 0000000..f344217
--- /dev/null
+++ b/docs/img/run_task_on_celery_executor.puml
@@ -0,0 +1,77 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/**
+ * This file contains source code of run_task_on_celery_executor.png image.
+ *
+ * If you want regenerate this image, you should follow instructions here:
+ * https://plantuml.com/starting
+ */
+
+@startuml
+autonumber
+
+box Scheduler
+    participant SchedulerProcess order 10
+endbox
+database QueueBroker order 20
+database ResultBackend order 30
+box Worker
+    participant WorkerProcess order 40
+    participant WorkerChildProcess order 50
+    participant LocalTaskJobProcess order 60
+    participant RawTaskProcess order 70
+endbox
+
+activate SchedulerProcess
+activate WorkerChildProcess
+
+SchedulerProcess->>QueueBroker: Send task
+activate QueueBroker
+SchedulerProcess->ResultBackend: Pool celery \ntask state
+deactivate SchedulerProcess
+WorkerChildProcess->QueueBroker: Pool task
+QueueBroker->WorkerChildProcess: Send task
+deactivate QueueBroker
+activate WorkerChildProcess
+create LocalTaskJobProcess
+WorkerChildProcess->LocalTaskJobProcess: Start process
+deactivate
+create RawTaskProcess
+activate LocalTaskJobProcess
+LocalTaskJobProcess->RawTaskProcess: Start process
+deactivate LocalTaskJobProcess
+activate RawTaskProcess
+RawTaskProcess->RawTaskProcess: Execute user code
+RawTaskProcess-->LocalTaskJobProcess: Finish process
+destroy RawTaskProcess
+activate LocalTaskJobProcess
+LocalTaskJobProcess-->WorkerChildProcess: Finish process
+destroy LocalTaskJobProcess
+activate WorkerChildProcess
+WorkerChildProcess-->WorkerProcess: Report task result
+deactivate WorkerChildProcess
+activate WorkerProcess
+WorkerProcess-->ResultBackend: Save Celery task state
+deactivate WorkerProcess
+activate ResultBackend
+ResultBackend-->SchedulerProcess: Send celery task state
+deactivate ResultBackend
+
+@enduml

Reply via email to