potiuk commented on a change in pull request #10956:
URL: https://github.com/apache/airflow/pull/10956#discussion_r499630961
##########
File path: airflow/models/dag.py
##########
@@ -1824,10 +1960,34 @@ class DagModel(Base):
# Tags for view filter
tags = relationship('DagTag', cascade='all,delete-orphan',
backref=backref('dag'))
+ concurrency = Column(Integer, nullable=False)
+
+ has_task_concurrency_limits = Column(Boolean, nullable=False)
+
+ # The execution_date of the next dag run
+ next_dagrun = Column(UtcDateTime)
+ # Earliest time at which this ``next_dagrun`` can be created
+ next_dagrun_create_after = Column(UtcDateTime)
+
__table_args__ = (
Index('idx_root_dag_id', root_dag_id, unique=False),
+ Index('idx_next_dagrun_create_after', next_dagrun_create_after,
unique=False),
)
+ NUM_DAGS_PER_DAGRUN_QUERY = conf.getint(
+ 'scheduler',
+ 'num_dags_needing_dagrun_per_scheduler_loop',
+ fallback=10
+ )
Review comment:
Yeah. I think we should keep it documented, along with some of the
consequences: what we "expect" to happen if we decrease or increase it. This
can all go in the docs, similar to the comments we have for the other
parameter added.
As I see it, changing this number has the following effects (I hope I am
interpreting it correctly):
If we decrease it, scheduling overall might take longer, but the random
latency for some Dag Runs gets smaller. If we have more dags to process in one
transaction, there might be a delay in scheduling the dags that are at the
beginning of the batch. However, decreasing it can also reduce the overall
"capacity" of the scheduler, because processing DagRuns in batches simply uses
fewer resources than processing them one by one. With smaller batches there is
also less contention possible (do we have other queries/processes that compete
for those locks?). If so, then there is a higher chance of lock contention
happening, and the overall capacity of the system can be impacted by that.
Another impact here is that when we have a few big dags (in a multi-scheduler
scenario), one scheduler will end up doing most of the work.
Is that a correct description?
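To make the trade-off above concrete, here is a minimal sketch of a toy cost
model. It is not Airflow code: the function name `batch_tradeoff` and the
`per_txn_overhead` and `per_dag_cost` values are hypothetical, chosen only to
illustrate how batch size might trade total scheduling time against the wait
seen by the first dag in a batch.

```python
import math


def batch_tradeoff(total_dags, batch_size, per_txn_overhead=0.05, per_dag_cost=0.01):
    """Toy model of batch size vs. latency. All costs are made-up seconds.

    Returns (transactions, total_time, first_in_batch_wait).
    """
    if total_dags < 0 or batch_size < 1:
        raise ValueError("total_dags must be >= 0 and batch_size >= 1")

    # Each transaction handles at most `batch_size` dags.
    transactions = math.ceil(total_dags / batch_size)

    # Assumption: every transaction pays a fixed overhead (locks, round trips),
    # and every dag pays a fixed processing cost.
    total_time = transactions * per_txn_overhead + total_dags * per_dag_cost

    # Assumption: the first dag picked up in a batch only becomes visible
    # once the whole transaction commits.
    dags_in_first_batch = min(batch_size, total_dags)
    first_in_batch_wait = per_txn_overhead + dags_in_first_batch * per_dag_cost

    return transactions, total_time, first_in_batch_wait


if __name__ == "__main__":
    for size in (1, 10, 50):
        print(size, batch_tradeoff(100, size))
```

Under these assumptions, a smaller batch size means more transactions and a
longer total time, while a larger batch size makes the first dag in each batch
wait longer for the commit. Lock contention and multi-scheduler balance are not
modelled here.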