yuqian90 commented on a change in pull request #7324: [AIRFLOW-6704] Copy 
common TaskInstance attributes from Task
URL: https://github.com/apache/airflow/pull/7324#discussion_r379829531
 
 

 ##########
 File path: airflow/models/taskinstance.py
 ##########
 @@ -888,13 +899,11 @@ def _run_raw_task(
         from airflow.sensors.base_sensor_operator import BaseSensorOperator
 
         task = self.task
-        self.pool = pool or task.pool
-        self.pool_slots = task.pool_slots
         self.test_mode = test_mode
+        self.refresh_from_task(task, pool_override=pool)
 
 Review comment:
   This is mostly trying to preserve the existing behavior and also move some 
duplicated code into `refresh_from_task()`. However, you are right that this 
part is not perfect:
   
   Ideally we should first call `refresh_from_db()` and then call 
`refresh_from_task()`. The call to `refresh_from_db()` loads the 
**cumulative** values such as `self.try_number` and `self.max_tries` from the 
db so that individual runs of the task can increment them. The call to 
`refresh_from_task()` picks up the configurable values from the latest DAG 
definition. However, at the moment `refresh_from_db()` loads both cumulative 
values and configurable attributes. So it also sets configurable values such 
as `self.queue` and `self.operator`, which would most likely be better read 
from the DAG definition via `refresh_from_task()`.
   
   This PR is not trying to fix everything. It only consolidates some 
duplicated code and makes attributes such as `self.queue` and `self.pool` 
updatable when tasks are cleared in `clear_task_instances()`. It's probably 
worth a separate, bigger PR to make sure `refresh_from_db()` only reads the 
attributes that really should come from the db and leaves the other 
attributes to `refresh_from_task()`.
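
   To illustrate the split described above, here is a minimal, self-contained 
sketch. It does not use Airflow itself: `Task`, `StoredRow` and 
`ToyTaskInstance` are hypothetical stand-ins, and only the method names and 
the `pool_override` argument are taken from this PR. It assumes the ideal 
case where `refresh_from_db()` restores only cumulative values and 
`refresh_from_task()` restores only configurable ones.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """Hypothetical stand-in for a task in the latest DAG definition."""
    pool: str = "default_pool"
    pool_slots: int = 1
    queue: str = "default"


@dataclass
class StoredRow:
    """Hypothetical stand-in for the task instance row stored in the db."""
    try_number: int = 0
    max_tries: int = 0
    queue: str = "default"  # possibly stale copy written by an earlier run


class ToyTaskInstance:
    """Toy model only; not the real airflow.models.TaskInstance."""

    def __init__(self, task: Task) -> None:
        self.task = task
        self.try_number = 0
        self.max_tries = 0
        self.refresh_from_task(task)

    def refresh_from_db(self, row: StoredRow) -> None:
        # Assumed ideal behavior: load only the cumulative values.
        self.try_number = row.try_number
        self.max_tries = row.max_tries

    def refresh_from_task(self, task: Task,
                          pool_override: Optional[str] = None) -> None:
        # Configurable values always come from the current task definition.
        self.pool = pool_override or task.pool
        self.pool_slots = task.pool_slots
        self.queue = task.queue


# The DAG author changed the queue and pool after two earlier tries.
row = StoredRow(try_number=2, max_tries=3, queue="old_queue")
ti = ToyTaskInstance(Task(pool="etl", queue="new_queue"))

ti.refresh_from_db(row)            # cumulative state from the db
ti.refresh_from_task(ti.task)      # configuration from the DAG definition

print(ti.try_number, ti.max_tries, ti.queue, ti.pool)
```

   With this ordering the counters survive across runs, while `queue` and 
`pool` follow whatever the DAG definition currently says rather than the 
stale copy in the stored row.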

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
