jscheffl commented on code in PR #43737:
URL: https://github.com/apache/airflow/pull/43737#discussion_r1833445046
##########
docs/apache-airflow-providers-edge/edge_executor.rst:
##########
@@ -209,6 +210,52 @@ could take thousands of tasks without a problem), or from
an environment
perspective (you want a worker running from a specific location where required
infrastructure is available).
+Capacity handling
+-----------------
+
+Some task may need more resources than other tasks, to handle these use case the Edge worker supports
+capacity handling. The logic behind this is the same as the pool slot feature
+see :doc:`apache-airflow:administration-and-deployment/pools`.
+If a task needs more resource, the need_capacity value can be increased. The value can be used to block
+other task from being executed in parallel on the same worker. The need_capacity value works together
+with the capacity value of the worker. A need_capacity of 2 and a worker capacity of 3 means
+that a worker which executes this task can only execute a job with a need_capacity of 1 in parallel.
+If not capacity is defined for a task the default value is 1. The need_capacity value only supports
+integer values.
Review Comment:
Some wording nits.
```suggestion
Some tasks may need more resources than others. To handle these use cases, the Edge worker supports
capacity handling. The logic behind this is the same as the pool slot feature,
see :doc:`apache-airflow:administration-and-deployment/pools`.
If a task needs more resources, the ``need_capacity`` value can be increased to reduce concurrency. The value can be used to block
other tasks from being executed in parallel on the same worker. The ``need_capacity`` value works together
with the capacity value of the worker. A ``need_capacity`` of 2 and a worker capacity of 3 means
that a worker which executes this task can only execute a job with a ``need_capacity`` of 1 in parallel.
If no capacity is defined for a task, the default value is 1. The ``need_capacity`` value only supports
integer values.
```
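To make the arithmetic in the paragraph concrete, here is a minimal sketch of how I read the capacity accounting. It is an assumption about the intended behaviour, not the provider's code: `WorkerSketch`, `try_start` and `free_capacity` are hypothetical names used only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class WorkerSketch:
    """Hypothetical stand-in for an Edge worker; not the provider's real class."""

    capacity: int
    # need_capacity of every job currently running on this worker
    running: list[int] = field(default_factory=list)

    @property
    def free_capacity(self) -> int:
        # Capacity still available after subtracting all running jobs
        return self.capacity - sum(self.running)

    def try_start(self, need_capacity: int = 1) -> bool:
        """Accept a job only if it fits into the remaining capacity."""
        if need_capacity > self.free_capacity:
            return False
        self.running.append(need_capacity)
        return True


worker = WorkerSketch(capacity=3)
print(worker.try_start(need_capacity=2))  # True: 2 of 3 slots are now used
print(worker.try_start(need_capacity=2))  # False: only 1 slot is left
print(worker.try_start())                 # True: default need_capacity of 1 fits
print(worker.free_capacity)               # 0
```

With a worker capacity of 3 and a running job that needs 2, only a job with a ``need_capacity`` of 1 still fits, which matches the example in the text.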
##########
providers/src/airflow/providers/edge/provider.yaml:
##########
@@ -78,12 +78,12 @@ config:
type: integer
example: "10"
default: "30"
- worker_concurrency:
+ worker_capacity:
description: |
- The concurrency that will be used when starting workers with the
- ``airflow edge worker`` command. This defines the number of task instances that
- a worker will take, so size up your workers based on the resources on
- your worker box and the nature of your tasks
+ The capacity defined the max parallel running task instances and can be defined during
+ starting worker with the ``airflow edge worker`` command. So size up your workers
+ based on the resources on your worker box and the nature of your tasks. The parameter
+ works together with the need_capacity parameter of a task.
Review Comment:
Small nit in wording, probably a typo?
```suggestion
The capacity defines the default maximum number of task instances running in parallel and can also be set
when starting a worker via a parameter of the ``airflow edge worker`` command. The size of the workers
and their resources must support the nature of your tasks. The parameter
works together with the ``need_capacity`` parameter of a task.
```
##########
providers/src/airflow/providers/edge/models/edge_job.py:
##########
@@ -59,6 +59,7 @@ class EdgeJobModel(Base, LoggingMixin):
try_number = Column(Integer, primary_key=True, default=0)
state = Column(String(20))
queue = Column(String(256))
+ need_capacity = Column(Integer, default=1)
Review Comment:
As you require the field to be set, and there is no migration because you drop the
table, no default is needed.
```suggestion
need_capacity = Column(Integer)
```