XD-DENG commented on PR #37778:
URL: https://github.com/apache/airflow/pull/37778#issuecomment-1995139815

   > A simple example on how this can be used (that cannot be covered 
otherwise) would be useful.
   
   Hi @uranusjr, sure thing. Let me elaborate a bit more and get your input 
on this.
   
   We have found that our TIs may have extra dependencies of different kinds, 
depending on the scenario. For example,
   - Scenario 1: The TI may need a separate hardware resource, such as 
external GPU/acceleration hardware, beyond CPU/memory resources (this is 
conceptually similar to the built-in Deps `PoolSlotsAvailableDep` and 
`DagTISlotsAvailableDep`).
   - Scenario 2: The TI should only start once an external event is detected 
(the event can be checked via an API, but it is not represented as a `Dataset` 
in Airflow, and adding an Operator to express the dependency is not desirable).
   - Scenario 3: Across all the TIs (which may not belong to the same DAG), we 
may want them to be executed in a customized order. By default, Airflow 
executes the TIs in a roughly "FIFO" order, or takes the TI `Priority Weight` 
into consideration. We have a bigger idea in mind: based on the TIs' expected 
duration, the global concurrency we allow, and resource availability, we may 
want to reorder the execution of the TIs to achieve the best global 
efficiency.
   
   The easiest way to achieve the ideas above, as far as my team can see, is 
to be able to add our custom TI Deps to the DagRun's TI scheduling 
decision-making process.
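
To make Scenario 1 a bit more concrete, here is a minimal stand-alone sketch 
of the kind of Dep we have in mind. It deliberately does not import Airflow: 
`FakeTI`, `DepStatus`, `GpuSlotsAvailableDep`, and `schedulable` are 
hypothetical names used only for illustration, loosely modeled on how the 
built-in Deps report pass/fail statuses. It is not the API proposed in this PR.

```python
from dataclasses import dataclass
from typing import Iterator, List, NamedTuple


class DepStatus(NamedTuple):
    """Hypothetical result of evaluating one dependency for one TI."""
    dep_name: str
    passed: bool
    reason: str


@dataclass
class FakeTI:
    """Hypothetical stand-in for a task instance (not Airflow's TaskInstance)."""
    task_id: str
    gpus_needed: int = 0


class GpuSlotsAvailableDep:
    """Hypothetical custom Dep: passes only if enough GPUs are currently free."""

    name = "GPU Slots Available"

    def __init__(self, total_gpus: int, gpus_in_use: int) -> None:
        self.total_gpus = total_gpus
        self.gpus_in_use = gpus_in_use

    def get_dep_statuses(self, ti: FakeTI) -> Iterator[DepStatus]:
        free = self.total_gpus - self.gpus_in_use
        if ti.gpus_needed <= free:
            yield DepStatus(self.name, True, f"{free} GPU(s) free")
        else:
            yield DepStatus(
                self.name, False, f"needs {ti.gpus_needed}, only {free} free"
            )


def schedulable(tis: List[FakeTI], deps: List[GpuSlotsAvailableDep]) -> List[FakeTI]:
    """Return the TIs for which every status from every Dep passes."""
    return [
        ti
        for ti in tis
        if all(status.passed for dep in deps for status in dep.get_dep_statuses(ti))
    ]


# Usage: 4 GPUs in total, 1 already in use, so 3 are free.
tis = [FakeTI("train", 2), FakeTI("report", 0), FakeTI("big_train", 8)]
deps = [GpuSlotsAvailableDep(total_gpus=4, gpus_in_use=1)]
ready = [ti.task_id for ti in schedulable(tis, deps)]
```

In this sketch each TI is checked independently against the same free-GPU 
count. A real Dep would also have to account for GPUs claimed by TIs 
scheduled in the same loop, which is left out here to keep the example short.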
   
   I would love to hear what you think of this, or whether you have a good 
alternative solution to share. Many thanks!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
