jscheffl commented on code in PR #43737:
URL: https://github.com/apache/airflow/pull/43737#discussion_r1833445046


##########
docs/apache-airflow-providers-edge/edge_executor.rst:
##########
@@ -209,6 +210,52 @@ could take thousands of tasks without a problem), or from an environment
 perspective (you want a worker running from a specific location where required
 infrastructure is available).
 
+Capacity handling
+-----------------
+
+Some task may need more resources than other tasks, to handle these use case the Edge worker supports
+capacity handling. The logic behind this is the same as the pool slot feature
+see :doc:`apache-airflow:administration-and-deployment/pools`.
+If a task needs more resource, the need_capacity value can be increased. The value can be used to block
+other task from being executed in parallel on the same worker. The need_capacity value works together
+with the capacity value of the worker. A need_capacity of 2 and a worker capacity of 3 means
+that a worker which executes this task can only execute a job with a need_capacity of 1 in parallel.
+If not capacity is defined for a task the default value is 1. The need_capacity value only supports
+integer values.

Review Comment:
   Some wording nits.
   ```suggestion
   Some tasks may need more resources than others. To handle these use cases, the Edge worker supports
   capacity handling. The logic behind this is the same as the pool slot feature,
   see :doc:`apache-airflow:administration-and-deployment/pools`.
   If a task needs more resources, the ``need_capacity`` value can be increased to reduce concurrency. The value can be used to block
   other tasks from being executed in parallel on the same worker. The ``need_capacity`` value works together
   with the capacity value of the worker. A ``need_capacity`` of 2 and a worker capacity of 3 means
   that a worker which executes this task can only execute a job with a ``need_capacity`` of 1 in parallel.
   If no capacity is defined for a task, the default value is 1. The ``need_capacity`` value only supports
   integer values.
   ```
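
To make the accounting in the paragraph above concrete, here is a minimal sketch of how the two values could interact. `WorkerSlots`, `free` and `try_start` are hypothetical names chosen for illustration and are not the Edge worker's actual implementation; only the numbers (worker capacity 3, `need_capacity` 2, default 1) come from the documentation text.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerSlots:
    """Hypothetical bookkeeping of a worker's capacity (illustration only)."""

    capacity: int
    # need_capacity of every job currently running on this worker
    running: list[int] = field(default_factory=list)

    @property
    def free(self) -> int:
        """Capacity not yet claimed by running jobs."""
        return self.capacity - sum(self.running)

    def try_start(self, need_capacity: int = 1) -> bool:
        """Accept a job only if its need fits into the free capacity."""
        if need_capacity > self.free:
            return False
        self.running.append(need_capacity)
        return True


worker = WorkerSlots(capacity=3)
started_heavy = worker.try_start(need_capacity=2)         # claims 2 of 3
started_second_heavy = worker.try_start(need_capacity=2)  # only 1 left, rejected
started_light = worker.try_start()                        # default need of 1 fits
```

With a capacity of 3, the first job with a `need_capacity` of 2 leaves room for exactly one more job with a `need_capacity` of 1, which matches the example in the text.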



##########
providers/src/airflow/providers/edge/provider.yaml:
##########
@@ -78,12 +78,12 @@ config:
         type: integer
         example: "10"
         default: "30"
-      worker_concurrency:
+      worker_capacity:
         description: |
-          The concurrency that will be used when starting workers with the
-          ``airflow edge worker`` command. This defines the number of task instances that
-          a worker will take, so size up your workers based on the resources on
-          your worker box and the nature of your tasks
+          The capacity defined the max parallel running task instances and can be defined during
+          starting worker with the ``airflow edge worker`` command. So size up your workers
+          based on the resources on your worker box and the nature of your tasks. The parameter
+          works together with the need_capacity parameter of a task.

Review Comment:
   Small nit in wording, probably a typo?
   ```suggestion
             The capacity defines the default maximum number of task instances running in parallel and can
             also be set when starting a worker via a parameter of the ``airflow edge worker`` command.
             Size your workers so that the available resources match the nature of your tasks. The parameter
             works together with the ``need_capacity`` parameter of a task.
   ```



##########
providers/src/airflow/providers/edge/models/edge_job.py:
##########
@@ -59,6 +59,7 @@ class EdgeJobModel(Base, LoggingMixin):
     try_number = Column(Integer, primary_key=True, default=0)
     state = Column(String(20))
     queue = Column(String(256))
+    need_capacity = Column(Integer, default=1)

Review Comment:
   As you require the field to be set, and there is no migration because you drop the table, no default is needed.
   ```suggestion
       need_capacity = Column(Integer)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
