jedcunningham commented on PR #24743: URL: https://github.com/apache/airflow/pull/24743#issuecomment-1175635715
I think "any" is a more general and simpler first pass. If we go with "all" and one of the datasets is produced really infrequently, we are basically forced to provide more functionality initially. For instance, if you have a DAG that produces a company's holidays for the year, while the other datasets are produced far more frequently.

This sorta cuts the other way too, though, e.g. depending on 2 daily datasets, one of which is slower/later in the day. However, in this case you could short circuit if only 1 of the 2 is ready, since you at least have a dagrun. Not great, but doable (and arguably better than rerunning infrequent jobs at the lowest common frequency, if we kept it simple).

> if you have 10 upstream datasets, and each is in its own dag, you could easily get 10 dag runs created.

Yep, exactly. I'd say that's desired for "any".
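
To make the difference concrete, here is a toy model of the two policies. It is only a sketch and not Airflow's scheduler code: the function name `runs_triggered`, the dataset names, and the assumption that "all" resets its bookkeeping after each run are hypothetical choices made for illustration.

```python
from typing import Iterable, Set


def runs_triggered(events: Iterable[str], upstream: Set[str], mode: str) -> int:
    """Count consumer DAG runs created for a stream of dataset update events.

    Toy model only (hypothetical, not an Airflow API):
    - "any": every update to an upstream dataset creates one run.
    - "all": a run is created once every upstream dataset has been updated
      since the previous run; the bookkeeping then resets (an assumption).
    """
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")

    runs = 0
    pending = set()  # datasets updated since the last run (used by "all")
    for name in events:
        if name not in upstream:
            continue  # ignore datasets this DAG does not depend on
        if mode == "any":
            runs += 1
        else:
            pending.add(name)
            if pending == upstream:
                runs += 1
                pending.clear()
    return runs


# One rarely produced dataset plus one frequently produced dataset.
upstream = {"holidays", "sales"}
events = ["holidays"] + ["sales"] * 5
any_runs = runs_triggered(events, upstream, "any")
all_runs = runs_triggered(events, upstream, "all")

# Ten upstream datasets, each updated once by its own DAG.
ten = {"ds_%d" % i for i in range(10)}
ten_any_runs = runs_triggered(sorted(ten), ten, "any")
ten_all_runs = runs_triggered(sorted(ten), ten, "all")
```

In this model, "all" stops creating runs after the first one until `holidays` is produced again, which is the infrequent-dataset problem described above, while "any" creates a run per update, including the 10 runs for 10 upstream datasets.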
