Hi guys,

While using Airflow, I found that the dag `run_id` shown on the page is in UTC, 
while my time zone is +8:00. That makes it quite hard to tell exactly which run 
is which.

For example, if I trigger a dag run at `2020-08-18 10:10:00` local time, the dag 
`run_id` shows `2020-08-18 02:10:00`.
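
To make the offset concrete, here is a minimal sketch using only the Python standard library. It is not Airflow code: the helper name `to_local` and the fixed +08:00 offset are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Fixed +08:00 offset, matching the time zone mentioned above (assumption:
# a real deployment would use a named zone rather than a fixed offset).
LOCAL_TZ = timezone(timedelta(hours=8))


def to_local(utc_text: str) -> str:
    """Read a 'YYYY-MM-DD HH:MM:SS' string as UTC and render it in LOCAL_TZ."""
    utc_dt = datetime.strptime(utc_text, "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return utc_dt.astimezone(LOCAL_TZ).strftime("%Y-%m-%d %H:%M:%S")


print(to_local("2020-08-18 02:10:00"))  # 2020-08-18 10:10:00
```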

So I created a PR here: https://github.com/apache/airflow/pull/17502 to localize 
the dag `run_id`. The PR is still WIP.

But I think we should also discuss the `run_id` more broadly. After reading the 
source, I am quite confused about how the `run_id` is defined.

My points are:

1. Most of the time we use the `execution_date` to query the dag_runs, and 
there is already a UNIQUE_KEY(dag_id + execution_date). Why do we still need 
another key to query by? In fact, the `execution_date` could already serve as 
the `run_id`, so a separate `run_id` seems unnecessary.

2. If the purpose of the `run_id` is to let users know exactly when a task ran, 
it is in UTC, which is very hard for users to work with.

3. In some places we derive the run_type from the `run_id`, but we have never 
set a clear rule for the `run_id` format. That is a risk for the future, 
because it is a hidden rule of the dag `run_id`.

My suggestions:

1. We should clarify the definition of the `run_id` and set a clear rule for 
it.

2. Avoid deriving the `run_type` from the `run_id`; use only the `run_type` 
stored in the dag_run.

3. Change the `run_id` to local time so that users can easily see the exact 
run time.
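
To illustrate the second suggestion, here is a rough sketch of why parsing the type out of the id is fragile. Everything in it is hypothetical: `RunType`, `Run`, and both helper functions are stand-ins I made up, not Airflow classes, and the `<type>__<timestamp>` id layout is my assumption about the hidden convention.

```python
from dataclasses import dataclass
from enum import Enum


class RunType(Enum):
    """Hypothetical stand-in for the run types of a dag run."""
    MANUAL = "manual"
    SCHEDULED = "scheduled"
    BACKFILL = "backfill"


@dataclass
class Run:
    """Hypothetical stand-in for a dag_run row."""
    run_id: str
    run_type: RunType


def run_type_from_id(run_id: str) -> RunType:
    """Fragile: depends on an assumed '<type>__<timestamp>' id convention."""
    prefix = run_id.split("__", 1)[0]
    # Raises ValueError when the id does not follow the convention.
    return RunType(prefix)


def run_type_from_field(run: Run) -> RunType:
    """Robust: reads the explicit field, whatever the id looks like."""
    return run.run_type


custom = Run(run_id="my_custom_run", run_type=RunType.MANUAL)
print(run_type_from_field(custom))  # RunType.MANUAL
try:
    run_type_from_id(custom.run_id)
except ValueError:
    print("cannot infer the run type from a free-form run_id")
```

With an explicit field, a user-chosen or localized `run_id` cannot break the type lookup.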

 

 

This is meant as a wider discussion, so let me know what you think.

Thanks a lot

 

 

From,

Lionel Zhao

 
