potiuk commented on a change in pull request #10956:
URL: https://github.com/apache/airflow/pull/10956#discussion_r499643538
##########
File path: airflow/models/dag.py
##########
@@ -1939,6 +2099,37 @@ def deactivate_deleted_dags(cls, alive_dag_filelocs:
List[str], session=None):
session.rollback()
raise
+ @classmethod
+ def dags_needing_dagruns(cls, session: Session):
+ """
+ Return (and lock) a list of Dag objects that are due to create a new
DagRun This will return a
+ resultset of rows that is row-level-locked with a "SELECT ... FOR
UPDATE" query, you should ensure
+ that any scheduling decisions are made in a single transaction -- as
soon as the transaction is
+ committed it will be unlocked.
+ """
+ # TODO[HA]: Bake this query, it is run _A lot_
+ # TODO[HA]: Make this limit a tunable. We limit so that _one_ scheduler
+ # doesn't try to do all the creation of dag runs
+ query = session.query(cls).filter(
+ cls.is_paused.is_(False),
+ cls.is_active.is_(True),
+ cls.next_dagrun_create_after <= func.now(),
+ ).order_by(
+ cls.next_dagrun_create_after
+ ).limit(cls.NUM_DAGS_PER_DAGRUN_QUERY)
+
+ return with_for_update(query, of=cls, **skip_locked(session=session))
+
+
+STATICA_HACK = True
Review comment:
Yeah I know, but there we do it because we had to keep the compatibility
with the API, and here it seems that we are doing it here same way but I feel
it can be done differently. We will also take a look at that but, this does not
seem is the best and most "readable" approach for the "actual code" (not the
references to the code from __init__.py )
I understand that the problem is that we want to have a non-DAG (in the
Directed Acyclic Graph sense) DB model relationships. But I would argue then,
that maybe we want to define those models that are so closely coupled in the
same module and use then the technique from
https://stackoverflow.com/questions/53842124/sqlalchemy-avoid-cyclic-dependencies-in-orm-without-using-strings
(which is what is happening here but the STATIC_HACK would not be needed) -
just to reflect their close coupling. I think all the SerializedDag and Dag -
they are manifestation of the same "logical" object and having the closely
coupled objects together in a single module might simply make sense.
And I am not talking we should do it now. When we reviewed this code, we
agreed, that some of further refactoring and separation should be done later -
to minimise number of changed/new files, I just wanted to raise the point here
that this might be something we want to do very early after that change. Simply
this change is mostly about optimisations, which is likely (and does) make the
code more complex and intertwined because that's usually the nature of
optmisations. But it also increases the complexity of the software and changes
meaning of some of the functions and classes (several of the function names,
architecture decisions about splitting (or not) the classes that were made some
time ago are not valid any more (some of the classes we have more than one
responsibility, some of the classes started to share logic from several
unrelated now areas and some are split where they should be joined).
I think if we agree that we should continue our refactoring after this
merge I think I will be fine in deferring this work to the beta iterations.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]