stromal opened a new issue, #27131:
URL: https://github.com/apache/airflow/issues/27131

   ### Apache Airflow version
   
   2.4.1
   
   ### What happened
   
   # Situation
   
   - I have a DAG file that is scheduled to run at 1:30 am UTC
   - but the logs show that it started running at 2022-10-19, 08:09:12 UTC
   - It ran for about the same duration as before and delivered the data correctly from point A to B
   - I haven't modified this DAG file for days
   - there is nothing else scheduled for this time period at all
   - If I look at the previous logs, it sometimes starts running at 1:30 am UTC and sometimes at totally different, random times
      - (these are Saturday and Sunday runs,
      - I am the only one with access to Airflow and the files,
      - no one else modified them,
      - and I definitely did not touch or trigger them on Saturday and Sunday)
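
   For reference, here is a minimal sketch (stdlib `datetime` only — my own illustration, not Airflow code) of when this run would be expected to start under Airflow's data-interval scheduling, and how late the logged start actually is:

   ```
   from datetime import datetime, timedelta, timezone

   # Logical date (the timestamp in the run_id) for the daily "30 1 * * *" schedule.
   logical_date = datetime(2022, 10, 18, 1, 30, tzinfo=timezone.utc)

   # Airflow triggers a scheduled run once its data interval has *ended*:
   # for a daily schedule that is logical_date + 1 day.
   expected_start = logical_date + timedelta(days=1)
   print(expected_start.isoformat())  # 2022-10-19T01:30:00+00:00

   # The log further down shows the task actually starting at 08:09:12 UTC.
   actual_start = datetime(2022, 10, 19, 8, 9, 12, tzinfo=timezone.utc)
   print(actual_start - expected_start)  # 6:39:12
   ```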
   
   # System
   
   - ubuntu 20.04 LTS
   - Docker Airflow
   - AWS EC2
   - x86 AMD / x86 Intel CPU
   
   # Dockerfile
   
   ```
   FROM apache/airflow:2.4.1-python3.8
   
   # ExternalPythonOperator
   # Required: switch this parameter to use your own venv via ExternalPythonOperator
   ENV PIP_USER=false
   
   # Python venv setup
   RUN python3 -m venv /opt/airflow/my_virtual_env
   
   # Install dependencies:
   COPY airflow/requirements.txt .
   # Copy over the folder that contains my own local pip package
   COPY airflow/my_own_pip_package_folder /opt/airflow/my_own_pip_package_folder
   # Copy over whatever file
   COPY my_file.py /opt/airflow/my_file.py
   
   # Install packages into my own virtual environment inside the Docker image
   RUN /opt/airflow/my_virtual_env/bin/pip install -r requirements.txt
   RUN /opt/airflow/my_virtual_env/bin/pip install /opt/airflow/my_own_pip_package_folder
   RUN /opt/airflow/my_virtual_env/bin/pip install 'apache-airflow==2.4.1' --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.4.1/constraints-3.8.txt"
   
   ```
   
   
   
   
   
   
   # My DAG File
   
   ```
   """
   Example DAG demonstrating the usage of the TaskFlow API to execute Python 
functions natively and within a
   virtual environment.
   """
   from __future__ import annotations
   
   import logging
   import os
   import shutil
   import sys
   import tempfile
   import time
   from pprint import pprint
   
   import pendulum
   
   from airflow import DAG
   from airflow.decorators import task
   
   log = logging.getLogger(__name__)
   PYTHON = sys.executable
   BASE_DIR = tempfile.gettempdir()
   
   
   with DAG(
       dag_id='my_dag',
        schedule='30 1 * * *',  # cron is in UTC; e.g. 11:12 UTC = 12:12 BST (UK summer time)
       start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
       catchup=False,
       tags=['Redshift'],
       ) as dag:
   
       @task.external_python(task_id="my_dag_task", python='/opt/airflow/my_virtual_env/bin/python3')
       def go():
           """
           Example function that will be performed in a virtual environment.
           Importing at the module level ensures that it will not attempt to 
import the
           library before it is installed.
           """
          # my imports
          # my code
   
       external_python_task = go()
   ```
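
   As a sanity check on the cron expression itself, here is a minimal stdlib sketch of the next fire time of `'30 1 * * *'` (the `next_fire` helper is my own hypothetical stand-in for what Airflow's cron timetable computes, not an Airflow API):

   ```
   from datetime import datetime, time, timedelta, timezone

   def next_fire(after: datetime) -> datetime:
       """Next fire time of the daily cron '30 1 * * *' (hypothetical helper)."""
       candidate = datetime.combine(after.date(), time(1, 30), tzinfo=timezone.utc)
       if candidate <= after:
           candidate += timedelta(days=1)
       return candidate

   # Every instant between 01:30 one day and 01:30 the next maps to a single
   # fire time, so nothing in the expression itself explains an 08:09 start.
   print(next_fire(datetime(2022, 10, 19, 0, 0, tzinfo=timezone.utc)))
   # 2022-10-19 01:30:00+00:00
   ```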
   
   
   # Logs
   
   ## Incorrect start-time run log, but ran without failure (Wednesday early morning)
   
   ```
   4d37309d53a3
   *** Reading local file: /opt/airflow/logs/dag_id=my_dag/run_id=scheduled__2022-10-18T01:30:00+00:00/task_id=my_dag_task/attempt=1.log
   [2022-10-19, 08:09:12 UTC] {taskinstance.py:1165} INFO - Dependencies all met for <TaskInstance: my_dag.my_dag_task scheduled__2022-10-18T01:30:00+00:00 [queued]>
   [2022-10-19, 08:09:12 UTC] {taskinstance.py:1165} INFO - Dependencies all met for <TaskInstance: my_dag.my_dag_task scheduled__2022-10-18T01:30:00+00:00 [queued]>
   [2022-10-19, 08:09:12 UTC] {taskinstance.py:1362} INFO - 
   --------------------------------------------------------------------------------
   [2022-10-19, 08:09:12 UTC] {taskinstance.py:1363} INFO - Starting attempt 1 of 1
   [2022-10-19, 08:09:12 UTC] {taskinstance.py:1364} INFO - 
   --------------------------------------------------------------------------------
   [2022-10-19, 08:09:12 UTC] {taskinstance.py:1383} INFO - Executing <Task(_PythonExternalDecoratedOperator): my_dag_task> on 2022-10-18 01:30:00+00:00
   [2022-10-19, 08:09:12 UTC] {standard_task_runner.py:54} INFO - Started process 58 to run task
   [2022-10-19, 08:09:12 UTC] {standard_task_runner.py:82} INFO - Running: ['airflow', 'tasks', 'run', 'my_dag', 'my_dag_task', 'scheduled__2022-10-18T01:30:00+00:00', '--job-id', '235', '--raw', '--subdir', 'DAGS_FOLDER/my_dag.py', '--cfg-path', '/tmp/tmpps6y22oi']
   [2022-10-19, 08:09:12 UTC] {standard_task_runner.py:83} INFO - Job 235: Subtask my_dag_task
   [2022-10-19, 08:09:12 UTC] {dagbag.py:525} INFO - Filling up the DagBag from /opt/airflow/dags/my_dag.py
   [2022-10-19, 08:09:12 UTC] {task_command.py:384} INFO - Running <TaskInstance: my_dag.my_dag_task scheduled__2022-10-18T01:30:00+00:00 [running]> on host 4d37309d53a3
   [2022-10-19, 08:09:13 UTC] {taskinstance.py:1590} INFO - Exporting the following env vars:
   AIRFLOW_CTX_DAG_OWNER=airflow
   AIRFLOW_CTX_DAG_ID=my_dag
   AIRFLOW_CTX_TASK_ID=my_dag_task
   AIRFLOW_CTX_EXECUTION_DATE=2022-10-18T01:30:00+00:00
   AIRFLOW_CTX_TRY_NUMBER=1
   AIRFLOW_CTX_DAG_RUN_ID=scheduled__2022-10-18T01:30:00+00:00
   [2022-10-19, 08:09:16 UTC] {process_utils.py:179} INFO - Executing cmd: /opt/airflow/my_virtual_env/bin/python3 /tmp/tmdkd47ow0r/script.py /tmp/tmdkd47ow0r/script.in /tmp/tmdkd47ow0r/script.out /tmp/tmdkd47ow0r/string_args.txt
   [2022-10-19, 08:09:16 UTC] {process_utils.py:183} INFO - Output:
   [2022-10-19, 08:09:18 UTC] {process_utils.py:187} INFO - WARNING:root:/opt/airflow/logs/scheduler/latest already exists as a dir/file. Skip creating symlink.
   [2022-10-19, 08:09:22 UTC] {process_utils.py:187} INFO - startdate: 2022-10-18T00:00:00.000 enddate: 2022-10-18T23:59:59.000
   [2022-10-19, 08:09:22 UTC] {process_utils.py:187} INFO - [2022-10-19 08:09:22,282] {credentials.py:1108} INFO - Found credentials from IAM Role: BI-Read-Creds-2
   [2022-10-19, 08:09:30 UTC] {python.py:177} INFO - Done. Returned value was: [True, 'schema.table_sql']
   [2022-10-19, 08:09:30 UTC] {taskinstance.py:1401} INFO - Marking task as SUCCESS. dag_id=my_dag, task_id=my_dag_task, execution_date=20221018T013000, start_date=20221019T080912, end_date=20221019T080930
   [2022-10-19, 08:09:31 UTC] {local_task_job.py:164} INFO - Task exited with return code 0
   [2022-10-19, 08:09:31 UTC] {local_task_job.py:273} INFO - 0 downstream tasks scheduled from follow-on schedule check
   
   ```
   
   
   ## Correct start-time run log, ran without failure (Tuesday evening)
   
   ```
   9adeff0b1844
   *** Reading local file: /opt/airflow/logs/dag_id=my_dag/run_id=scheduled__2022-10-17T01:30:00+00:00/task_id=my_dag_task/attempt=1.log
   [2022-10-18, 01:30:00 UTC] {taskinstance.py:1165} INFO - Dependencies all met for <TaskInstance: my_dag.my_dag_task scheduled__2022-10-17T01:30:00+00:00 [queued]>
   [2022-10-18, 01:30:00 UTC] {taskinstance.py:1165} INFO - Dependencies all met for <TaskInstance: my_dag.my_dag_task scheduled__2022-10-17T01:30:00+00:00 [queued]>
   [2022-10-18, 01:30:00 UTC] {taskinstance.py:1362} INFO - 
   --------------------------------------------------------------------------------
   [2022-10-18, 01:30:00 UTC] {taskinstance.py:1363} INFO - Starting attempt 1 of 1
   [2022-10-18, 01:30:00 UTC] {taskinstance.py:1364} INFO - 
   --------------------------------------------------------------------------------
   [2022-10-18, 01:30:00 UTC] {taskinstance.py:1383} INFO - Executing <Task(_PythonExternalDecoratedOperator): my_dag_task> on 2022-10-17 01:30:00+00:00
   [2022-10-18, 01:30:00 UTC] {standard_task_runner.py:54} INFO - Started process 23538 to run task
   [2022-10-18, 01:30:00 UTC] {standard_task_runner.py:82} INFO - Running: ['airflow', 'tasks', 'run', 'my_dag', 'my_dag_task', 'scheduled__2022-10-17T01:30:00+00:00', '--job-id', '212', '--raw', '--subdir', 'DAGS_FOLDER/my_dag.py', '--cfg-path', '/tmp/tmp4aa6swro']
   [2022-10-18, 01:30:00 UTC] {standard_task_runner.py:83} INFO - Job 212: Subtask my_dag_task
   [2022-10-18, 01:30:00 UTC] {dagbag.py:525} INFO - Filling up the DagBag from /opt/airflow/dags/my_dag.py
   [2022-10-18, 01:30:01 UTC] {task_command.py:384} INFO - Running <TaskInstance: my_dag.my_dag_task scheduled__2022-10-17T01:30:00+00:00 [running]> on host 9adeff0b1844
   [2022-10-18, 01:30:01 UTC] {taskinstance.py:1590} INFO - Exporting the following env vars:
   AIRFLOW_CTX_DAG_OWNER=airflow
   AIRFLOW_CTX_DAG_ID=my_dag
   AIRFLOW_CTX_TASK_ID=my_dag_task
   AIRFLOW_CTX_EXECUTION_DATE=2022-10-17T01:30:00+00:00
   AIRFLOW_CTX_TRY_NUMBER=1
   AIRFLOW_CTX_DAG_RUN_ID=scheduled__2022-10-17T01:30:00+00:00
   [2022-10-18, 01:30:02 UTC] {process_utils.py:179} INFO - Executing cmd: /opt/airflow/my_virtual_env/bin/python3 /tmp/tmdebmcqmsm/script.py /tmp/tmdebmcqmsm/script.in /tmp/tmdebmcqmsm/script.out /tmp/tmdebmcqmsm/string_args.txt
   [2022-10-18, 01:30:02 UTC] {process_utils.py:183} INFO - Output:
   [2022-10-18, 01:30:03 UTC] {process_utils.py:187} INFO - WARNING:root:/opt/airflow/logs/scheduler/latest already exists as a dir/file. Skip creating symlink.
   [2022-10-18, 01:30:06 UTC] {process_utils.py:187} INFO - startdate: 2022-10-17T00:00:00.000 enddate: 2022-10-17T23:59:59.000
   [2022-10-18, 01:30:06 UTC] {process_utils.py:187} INFO - [2022-10-18 01:30:06,223] {credentials.py:1108} INFO - Found credentials from ......[0m
   [2022-10-18, 01:30:17 UTC] {python.py:177} INFO - Done. Returned value was: [True, 'schema.table_sql']
   [2022-10-18, 01:30:17 UTC] {taskinstance.py:1401} INFO - Marking task as SUCCESS. dag_id=my_dag, task_id=my_dag_task, execution_date=20221017T013000, start_date=20221018T013000, end_date=20221018T013017
   [2022-10-18, 01:30:17 UTC] {local_task_job.py:164} INFO - Task exited with return code 0
   [2022-10-18, 01:30:17 UTC] {local_task_job.py:273} INFO - 0 downstream tasks scheduled from follow-on schedule check
   ```
   
   
   
   ### What you think should happen instead
   
   The DAG run should start at the scheduled time, even if it then fails.
   
   ### How to reproduce
   
   _No response_
   
   ### Operating System
   
   Ubuntu 20.04 LTS
   
   ### Versions of Apache Airflow Providers
   
   Image extended: `apache/airflow:2.4.1-python3.8`
   
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

