chonein opened a new issue, #40623:
URL: https://github.com/apache/airflow/issues/40623

   ### Apache Airflow version
   
   2.9.2
   
   ### If "Other Airflow 2 version" selected, which one?
   
   Also tested on main and 2.8.0
   
   ### What happened?
   
   When a task instance's state changes to success, the `updated_at` column is not updated.
   I have a task that is just a `sleep 50` bash command. As shown in the screenshot, `start_date` and `end_date` make sense (~50 seconds apart), but `updated_at` doesn't make sense at all; it is very close to `start_date`.
   
   
![image](https://github.com/apache/airflow/assets/61818445/c4306db3-f2d4-434e-a822-4abf01ec5f6c)
   
   ### What you think should happen instead?
   
   `updated_at` for a row should be refreshed whenever any column in that row changes.
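   
   In SQLAlchemy terms, that is exactly what a column-level `onupdate` default is for. A minimal sketch of the expected behavior (hypothetical `Item` model, assuming SQLAlchemy 1.4+; this is not Airflow's actual model):
   ```py
   from datetime import datetime
   
   from sqlalchemy import Column, DateTime, Integer, String, create_engine
   from sqlalchemy.orm import Session, declarative_base
   
   Base = declarative_base()
   
   class Item(Base):
       __tablename__ = "item"
       id = Column(Integer, primary_key=True)
       name = Column(String)
       # onupdate should refresh this column on every UPDATE that does not
       # set it explicitly
       updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
   
   engine = create_engine("sqlite://")
   Base.metadata.create_all(engine)
   
   with Session(engine) as session:
       session.add(Item(id=1, name="a"))
       session.commit()
       first = session.get(Item, 1).updated_at
   
       session.get(Item, 1).name = "b"  # ordinary in-session change
       session.commit()
       second = session.get(Item, 1).updated_at
   
   print(second > first)  # onupdate fired, so updated_at moved forward
   ```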
   
   ### How to reproduce
   
   1. Host an Airflow instance (I was able to reproduce with both a standalone SQLite instance and a PostgreSQL instance)
   2. Create a DAG with a sleep task
   ```py
   from datetime import datetime, timedelta
   from textwrap import dedent
   
   from airflow import DAG
   
   from airflow.operators.bash import BashOperator
   with DAG(
       'test_1',
       default_args={
           'depends_on_past': False,
           'email': ['[email protected]'],
           'email_on_failure': False,
           'email_on_retry': False,
           'retries': 1,
           'retry_delay': timedelta(minutes=5),
       },
       description='A simple tutorial DAG',
       schedule_interval=timedelta(days=1),
       start_date=datetime(2021, 1, 1),
       catchup=False,
       tags=['example'],
   ) as dag:
   
       # t1, t2 and t3 are examples of tasks created by instantiating operators
       t1 = BashOperator(
           task_id='print_date',
           bash_command='date',
       )
   
       t2 = BashOperator(
           task_id='sleep',
           depends_on_past=False,
           bash_command='sleep 50',
           retries=3,
       )
       t1.doc_md = dedent(
           """\
       #### Task Documentation
       You can document your task using the attributes `doc_md` (markdown),
       `doc` (plain text), `doc_rst`, `doc_json`, `doc_yaml` which gets
       rendered in the UI's Task Instance Details page.
       
![img](http://montcs.bloomu.edu/~bobmon/Semesters/2012-01/491/import%20soul.png)
   
       """
       )
   
       dag.doc_md = __doc__  # providing that you have a docstring at the 
beginning of the DAG
       dag.doc_md = """
       This is a documentation placed anywhere
       """  # otherwise, type it like this
       templated_command = dedent(
           """
       {% for i in range(5) %}
           echo "{{ ds }}"
           echo "{{ macros.ds_add(ds, 7)}}"
       {% endfor %}
       """
       )
   
       t3 = BashOperator(
           task_id='templated',
           depends_on_past=False,
           bash_command=templated_command,
       )
   
       t1 >> [t2, t3]
   ```
   3. Trigger a DAG run and wait for it to complete
   4. In the database, query the task instance row for the `sleep` task and compare `start_date`, `end_date`, and `updated_at`
   
   ### Operating System
   
   RHEL8 and MacOS 14.5
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   Problem occurs every time.
   I am not very familiar with SQLAlchemy, but I believe the problem is that `onupdate` (https://github.pie.apple.com/IPR/apache-airflow/blob/main/airflow/models/taskinstance.py#L3262) does not trigger for `session.merge`.
   
   I added `echo=True` to the SQLAlchemy engine so that it logs the SQL statements being run. This is what shows up in the log for the `sleep` task (logs are from version 2.9.2):
   ```
   [2024-07-03T16:01:09.092-0700] {taskinstance.py:1206} INFO - Marking task as 
SUCCESS. dag_id=test_1, task_id=sleep, 
run_id=manual__2024-07-03T23:00:14.090496+00:00, 
execution_date=20240703T230014, start_date=20240703T230018, 
end_date=20240703T230109
   [2024-07-03T16:01:09.138-0700] {base.py:1865} INFO - UPDATE task_instance 
SET end_date=%(end_date)s, duration=%(duration)s, state=%(state)s, 
updated_at=%(updated_at)s WHERE task_instance.dag_id = %(task_instance_dag_id)s 
AND task_instance.task_id = %(task_instance_task_id)s AND task_instance.run_id 
= %(task_instance_run_id)s AND task_instance.map_index = 
%(task_instance_map_index)s
   [2024-07-03T16:01:09.139-0700] {base.py:1870} INFO - [generated in 0.01674s] 
{'end_date': datetime.datetime(2024, 7, 3, 23, 1, 9, 92275, 
tzinfo=Timezone('UTC')), 'duration': 50.378017, 'state': 
<TaskInstanceState.SUCCESS: 'success'>, 'updated_at': datetime.datetime(2024, 
7, 3, 23, 0, 18, 736664, tzinfo=Timezone('UTC')), 'task_instance_dag_id': 
'test_1', 'task_instance_task_id': 'sleep', 'task_instance_run_id': 
'manual__2024-07-03T23:00:14.090496+00:00', 'task_instance_map_index': -1}
   ```
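   
   The stale value in that `updated_at` bind parameter is consistent with SQLAlchemy's documented behavior: `onupdate` only fires when the column is not otherwise part of the UPDATE, and `session.merge` copies the detached object's old `updated_at` onto the row, so the column lands in the SET clause with the stale value. A minimal sketch outside Airflow (hypothetical model, assuming SQLAlchemy 1.4+):
   ```py
   from datetime import datetime
   
   from sqlalchemy import Column, DateTime, Integer, String, create_engine
   from sqlalchemy.orm import Session, declarative_base
   
   Base = declarative_base()
   
   class Item(Base):
       __tablename__ = "item"
       id = Column(Integer, primary_key=True)
       name = Column(String)
       updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
   
   engine = create_engine("sqlite://")
   Base.metadata.create_all(engine)
   
   # Insert a row, then detach the instance (like a task instance that
   # crosses a process boundary)
   with Session(engine) as session:
       item = Item(id=1, name="a")
       session.add(item)
       session.commit()
       session.refresh(item)
       original = item.updated_at
       session.expunge(item)
   
   # Meanwhile the row changes in another session (e.g. a heartbeat),
   # so the database copy of updated_at moves forward
   with Session(engine) as session:
       session.get(Item, 1).name = "heartbeat"
       session.commit()
   
   # Merging the stale detached object copies its old updated_at onto the
   # row; the column is now explicitly in the SET clause, so onupdate is
   # skipped and the old timestamp is written back
   with Session(engine) as session:
       item.name = "done"
       merged = session.merge(item)
       session.commit()
       session.refresh(merged)
       print(merged.updated_at == original)
   ```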
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
