potiuk commented on issue #33869:
URL: https://github.com/apache/airflow/issues/33869#issuecomment-1709917907

   Cannot reproduce. Generally, the temporary task run is removed when the 
`test` command completes. 
   
   I think you must be doing or expecting something different. If so, you 
should provide more evidence and an example where the test run remains in the 
database. If you are expecting that `airflow tasks test` will not create those 
run_ids at all, then your expectation is wrong. The command is not meant to be 
"side-effect free" when run against a production server. 
   
   You should never, ever do that - you should have a dedicated staging/test 
system to run your tests on.
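   If a temporary run really did remain after the command exits, the most useful evidence to attach would be the query result itself. Here is a minimal sketch of that check, using an in-memory SQLite stand-in for the `dag_run` table with a hypothetical leftover row; against the real metadata DB you would run only the `SELECT` inside `airflow db shell`:

   ```python
   import sqlite3

   # Minimal stand-in for the metadata DB's dag_run table
   # (hypothetical subset of columns; the real table has many more).
   conn = sqlite3.connect(":memory:")
   conn.execute("CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, state TEXT)")

   # Simulate a leftover temporary test run (hypothetical values).
   conn.execute(
       "INSERT INTO dag_run VALUES (?, ?, ?)",
       ("example_bash_operator",
        "__user_temporary_run_2023-09-07T10:31:39.109979+00:00__",
        "success"),
   )

   # This SELECT is the check you would run in `airflow db shell` to surface
   # any temporary test runs that were not cleaned up.
   leftovers = conn.execute(
       "SELECT dag_id, run_id FROM dag_run WHERE run_id LIKE '%temporary_run%'"
   ).fetchall()
   print(leftovers)
   ```

   On a healthy system that query should return zero rows once the test command has finished.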
   
   
   Here is what happens when you run it: the temporary run is removed after 
the test command completes. 
   
   ```
   airflow=# select * from dag_run;
    id | dag_id | queued_at | execution_date | start_date | end_date | state | run_id | creating_job_id | external_trigger | run_type | conf | data_interval_start | data_interval_end | last_scheduling_decision | dag_hash | log_template_id | updated_at 
   ----+--------+-----------+----------------+------------+----------+-------+--------+-----------------+------------------+----------+------+---------------------+-------------------+--------------------------+----------+-----------------+------------
   (0 rows)

   airflow=# \q
   
   
   root@bd424c042845:/opt/airflow# airflow tasks test example_bash_operator runme_0 '20230907T102850'

   /opt/airflow/airflow/models/taskinstance.py:3021 SAWarning: Can't validate argument 'foreign_key'; can't locate any SQLAlchemy dialect named 'foreign'
   /opt/airflow/airflow/models/dagrun.py:1413 SAWarning: Can't validate argument 'foreign_key'; can't locate any SQLAlchemy dialect named 'foreign'
   [2023-09-07T10:31:38.265+0000] {dagbag.py:541} INFO - Filling up the DagBag from /files/dags
   [2023-09-07T10:31:39.137+0000] {taskinstance.py:1163} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: example_bash_operator.runme_0 __***_temporary_run_2023-09-07T10:31:39.109979+00:00__ [None]>
   [2023-09-07T10:31:39.141+0000] {taskinstance.py:1163} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: example_bash_operator.runme_0 __***_temporary_run_2023-09-07T10:31:39.109979+00:00__ [None]>
   [2023-09-07T10:31:39.141+0000] {taskinstance.py:1365} INFO - Starting attempt 1 of 1
   [2023-09-07T10:31:39.141+0000] {taskinstance.py:1434} WARNING - cannot record queued_duration for task runme_0 because previous state change time has not been saved
   [2023-09-07T10:31:39.142+0000] {taskinstance.py:1386} INFO - Executing <Task(BashOperator): runme_0> on 2023-09-07T10:28:50+00:00
   [2023-09-07T10:31:39.178+0000] {taskinstance.py:1666} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='***' AIRFLOW_CTX_DAG_ID='example_bash_operator' AIRFLOW_CTX_TASK_ID='runme_0' AIRFLOW_CTX_EXECUTION_DATE='2023-09-07T10:28:50+00:00' AIRFLOW_CTX_TRY_NUMBER='1' AIRFLOW_CTX_DAG_RUN_ID='__***_temporary_run_2023-09-07T10:31:39.109979+00:00__'
   [2023-09-07T10:31:39.180+0000] {subprocess.py:63} INFO - Tmp dir root location: /tmp
   [2023-09-07T10:31:39.180+0000] {subprocess.py:75} INFO - Running command: ['/bin/bash', '-c', 'echo "example_bash_operator__runme_0__20230907" && sleep 1']
   [2023-09-07T10:31:39.187+0000] {subprocess.py:86} INFO - Output:
   [2023-09-07T10:31:39.188+0000] {subprocess.py:93} INFO - example_bash_operator__runme_0__20230907
   [2023-09-07T10:31:40.189+0000] {subprocess.py:97} INFO - Command exited with return code 0
   [2023-09-07T10:31:40.200+0000] {taskinstance.py:1404} INFO - Marking task as SUCCESS. dag_id=example_bash_operator, task_id=runme_0, execution_date=20230907T102850, start_date=, end_date=20230907T103140
   
   root@bd424c042845:/opt/airflow# airflow db shell 
   DB: postgresql+psycopg2://postgres:***@postgres/airflow
   [2023-09-07T10:31:49.750+0000] {process_utils.py:205} INFO - Executing cmd: psql
   psql (15.4 (Debian 15.4-1.pgdg110+1), server 11.16 (Debian 11.16-1.pgdg90+1))
   Type "help" for help.

   airflow=# select * from dag_run;
    id | dag_id | queued_at | execution_date | start_date | end_date | state | run_id | creating_job_id | external_trigger | run_type | conf | data_interval_start | data_interval_end | last_scheduling_decision | dag_hash | log_template_id | updated_at 
   ----+--------+-----------+----------------+------------+----------+-------+--------+-----------------+------------------+----------+------+---------------------+-------------------+--------------------------+----------+-----------------+------------
   (0 rows)
   
   ```
   
   Converting to a discussion if more discussion is needed.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
