uranusjr edited a comment on issue #19058:
URL: https://github.com/apache/airflow/issues/19058#issuecomment-946580267


   OK, I think I know what's going on. The log handler uses a template to
render the log's filename, which (by default) is
`{{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log`. The problem is
the `{{ ts }}` part. Prior to AIP-39, this was simply the ISO-8601 rendition of
`execution_date`, but AIP-39 decoupled a DAG run's _identifying timestamp_
(`logical_date`) from its _operating period_ (`data_interval`), splitting
`execution_date`'s semantic meaning into two. This meant we needed to decide
which of the two the "derived" variables, such as `ts` and `ts_nodash`, should
use. We chose `data_interval.start` in 2.2.0 because we guessed that's what
most people would want.
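
To make the effect on the path concrete, here is a minimal sketch using only the standard library. It is not the log handler's actual code: `render_log_filename` is a hypothetical helper, `str.format` placeholders stand in for the Jinja expressions, and the DAG, task, and timestamps are made-up values for a run whose logical date and data interval start differ.

```python
from datetime import datetime, timezone

# Stand-in for the default Jinja template quoted above; str.format
# placeholders replace the {{ ... }} expressions to stay stdlib-only.
LOG_FILENAME_TEMPLATE = "{dag_id}/{task_id}/{ts}/{try_number}.log"


def render_log_filename(dag_id, task_id, ts_source, try_number):
    """Render the log path, taking `ts` as the ISO-8601 form of `ts_source`."""
    return LOG_FILENAME_TEMPLATE.format(
        dag_id=dag_id,
        task_id=task_id,
        ts=ts_source.isoformat(),
        try_number=try_number,
    )


# Hypothetical run whose two timestamps differ (values are made up).
logical_date = datetime(2021, 10, 19, 8, 15, 30, tzinfo=timezone.utc)
data_interval_start = datetime(2021, 10, 18, tzinfo=timezone.utc)

# Same run, same template, two different paths depending on what `ts` means.
path_from_logical_date = render_log_filename("my_dag", "my_task", logical_date, 1)
path_from_interval_start = render_log_filename(
    "my_dag", "my_task", data_interval_start, 1
)
```

Whenever the two timestamps differ, the choice of source for `ts` changes the directory the log is looked up in.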
   
   Another change we made when we decoupled `logical_date` from `data_interval`
was to "fix" manual DAG runs' data interval. Prior to AIP-39, since a DAG run's
operating period was inferred from `execution_date`, a manual DAG run's data
interval was nonsensical: `execution_date` was set to when the run was
triggered, so the run didn't have a logical end time at all. So 2.2.0
introduced new logic to "align" a manual run's data interval to match the
_most recent completed schedule_, while keeping its `logical_date` at the same
value `execution_date` had previously. But this introduces a problem for log
file identification with `ts`, as shown here.
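
The alignment can be sketched as follows, assuming a schedule that runs daily at midnight UTC. `align_manual_daily_interval` is a hypothetical helper written for illustration, not the scheduler's real logic, and the trigger times are invented.

```python
from datetime import datetime, timedelta, timezone


def align_manual_daily_interval(triggered_at):
    """Return the most recent completed midnight-to-midnight interval."""
    end = triggered_at.replace(hour=0, minute=0, second=0, microsecond=0)
    return end - timedelta(days=1), end


# Two hypothetical manual runs triggered on the same day.
first_trigger = datetime(2021, 10, 19, 8, 15, 30, tzinfo=timezone.utc)
second_trigger = datetime(2021, 10, 19, 14, 45, 0, tzinfo=timezone.utc)

first_interval = align_manual_daily_interval(first_trigger)
second_interval = align_manual_daily_interval(second_trigger)

# Both runs align to the same interval, yet their logical dates differ.
same_interval_start = first_interval[0] == second_interval[0]
distinct_logical_dates = first_trigger != second_trigger
```

Under these assumptions both runs share one `data_interval.start`, so a `ts` derived from it no longer matches the moment either run was triggered, while `logical_date` still does.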
   
   So the easiest way out here is to change the default log filename template
to use `{{ logical_date|ts }}` instead of `{{ ts }}`. But this would also mean
that any user-specified custom `log_filename_template` configuration would
still be broken and need to be migrated, which does not sound viable (and is
compatibility-breaking). Therefore, I think the only viable fix available is to
roll back the semantic change we made to `ts`, `ts_nodash`, etc. so they again
indicate `execution_date`, i.e. `logical_date`. This is quite unfortunate since
it'd make migration from the pre-AIP-39 implicit data interval to modern data
interval-based semantics more difficult, but it's probably the only reasonable
approach.
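
A rough sketch of what the rolled-back derivation would look like, with both variables computed from `logical_date` alone. `derive_ts_variables` is a hypothetical name, and the compact format used for `ts_nodash` is an assumption rather than something stated in this comment.

```python
from datetime import datetime, timezone


def derive_ts_variables(logical_date):
    """Build `ts` and `ts_nodash` from the run's logical date."""
    return {
        # ISO-8601 rendition, as described for pre-AIP-39 `execution_date`.
        "ts": logical_date.isoformat(),
        # Assumed compact form: no dashes, colons, or UTC offset.
        "ts_nodash": logical_date.strftime("%Y%m%dT%H%M%S"),
    }


# Hypothetical manual run triggered mid-morning.
logical_date = datetime(2021, 10, 19, 8, 15, 30, tzinfo=timezone.utc)
variables = derive_ts_variables(logical_date)
```

With this derivation, existing custom `log_filename_template` values that use `{{ ts }}` would keep resolving to the timestamp that identifies the run, without any migration.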

