Taragolis commented on PR #29616:
URL: https://github.com/apache/airflow/pull/29616#issuecomment-1437330747

   Yeah, as soon as I stepped away to do my daily routine, it finally failed 🥇 💯 😢 
   
   There are quite a few interesting things here, although some of them could well be wrong assumptions on my part.
   
   DAG Runs
   ---
   
   ```console
     HTTP: GET dags/example_bash_operator/dagRuns
     {'dag_runs': [{'conf': {},
                    'dag_id': 'example_bash_operator',
                    'dag_run_id': 'test_dag_run_id',
                    'data_interval_end': '2023-02-20T00:00:00+00:00',
                    'data_interval_start': '2023-02-19T00:00:00+00:00',
                    'end_date': None,
                    'execution_date': '2023-02-20T10:30:00.702880+00:00',
                    'external_trigger': True,
                    'last_scheduling_decision': None,
                    'logical_date': '2023-02-20T10:30:00.702880+00:00',
                    'note': None,
                    'run_type': 'manual',
                    'start_date': None,
                    'state': 'queued'}],
      'total_entries': 1}
   ```
   
   The `example_bash_operator` DAG has a schedule interval, so we should see two DAG runs here: one scheduled and one manual. In this case we see only one, the manual run that was created during the test.
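
   As a rough sketch of how the test could assert this, the snippet below counts runs per `run_type` in a `dagRuns` response. The helper names, `base_url`, and `auth_header` are assumptions for illustration, and the sample payload is a trimmed copy of the response shown above.

```python
import json
from collections import Counter
from urllib.request import Request, urlopen


def count_run_types(payload: dict) -> Counter:
    """Count DAG runs per run_type in a dagRuns list response."""
    return Counter(run["run_type"] for run in payload.get("dag_runs", []))


def fetch_dag_runs(base_url: str, dag_id: str, auth_header: str) -> dict:
    """Fetch DAG runs from the REST API (hypothetical helper, not called here)."""
    request = Request(
        f"{base_url}/api/v1/dags/{dag_id}/dagRuns",
        headers={"Authorization": auth_header},
    )
    with urlopen(request, timeout=10) as response:
        return json.load(response)


# Trimmed copy of the response from the failed test run.
sample = {
    "dag_runs": [
        {"dag_run_id": "test_dag_run_id", "run_type": "manual", "state": "queued"}
    ],
    "total_entries": 1,
}

counts = count_run_types(sample)
missing = {"scheduled", "manual"} - set(counts)
print(dict(counts), "missing:", sorted(missing))
```

   With the payload above, only the manual run is counted and `scheduled` shows up as missing, which matches the symptom.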
   
   Scheduler Logs
   ---
   
   ```console
     airflow-scheduler_1  | 
     airflow-scheduler_1  | BACKEND=redis
     airflow-scheduler_1  | DB_HOST=redis
     airflow-scheduler_1  | DB_PORT=6379
     airflow-scheduler_1  | 
     airflow-scheduler_1  | /home/airflow/.local/lib/python3.7/site-packages/airflow/models/base.py:49 MovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings.  Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
     airflow-scheduler_1  |   ____________       _____________
     airflow-scheduler_1  |  ____    |__( )_________  __/__  /________      __
     airflow-scheduler_1  | ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
     airflow-scheduler_1  | ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
     airflow-scheduler_1  |  _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
     airflow-scheduler_1  | [2023-02-20T10:29:14.618+0000] {executor_loader.py:114} INFO - Loaded executor: CeleryExecutor
     airflow-scheduler_1  | [2023-02-20T10:29:14.664+0000] {scheduler_job.py:724} INFO - Starting the scheduler
     airflow-scheduler_1  | [2023-02-20T10:29:14.665+0000] {scheduler_job.py:731} INFO - Processing each file at most -1 times
     airflow-scheduler_1  | [2023-02-20T10:29:14.669+0000] {manager.py:164} INFO - Launched DagFileProcessorManager with pid: 33
     airflow-scheduler_1  | [2023-02-20T10:29:14.671+0000] {scheduler_job.py:1437} INFO - Resetting orphaned tasks for active dag runs
     airflow-scheduler_1  | [2023-02-20T10:29:14.685+0000] {settings.py:61} INFO - Configured default timezone Timezone('UTC')
   ```
   
   That's all. It seems like the scheduler just hangs, yet the service reports that it is healthy. Could this be a problem with the recent health check changes in https://github.com/apache/airflow/pull/29408, or perhaps a problem with the simple HTTP server in the scheduler?
   I would add the output from the `/health` endpoint in case of failure.
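
   A minimal sketch of what dumping `/health` on failure could look like. The helper names and the URL passed to `fetch_health` are assumptions, and the sample payload uses made-up values in the shape the webserver `/health` endpoint returns; it is not output from the failed run.

```python
import json
from urllib.request import urlopen


def summarize_health(payload: dict) -> str:
    """Render a /health payload as a single log-friendly line."""
    parts = []
    for component in sorted(payload):
        info = payload[component]
        text = f"{component}={info.get('status')}"
        if "latest_scheduler_heartbeat" in info:
            text += f" (last heartbeat: {info['latest_scheduler_heartbeat']})"
        parts.append(text)
    return ", ".join(parts)


def fetch_health(url: str, timeout: float = 10.0) -> dict:
    """Fetch and decode a /health endpoint (hypothetical helper, not called here)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Illustrative payload only; the values are invented for this example.
sample = {
    "metadatabase": {"status": "healthy"},
    "scheduler": {"status": "unhealthy", "latest_scheduler_heartbeat": None},
}

summary = summarize_health(sample)
print(summary)
```

   In the test teardown this could be called as something like `summarize_health(fetch_health("http://localhost:8080/health"))`, assuming the webserver is published on port 8080 as in the `docker ps` output.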
   
   Docker services after test failure
   ---
   
   ```console
     $ docker ps
     CONTAINER ID   IMAGE                                                                                 COMMAND                  CREATED         STATUS                   PORTS                                       NAMES
     8da8ebd97f17   ghcr.io/apache/airflow/main/prod/python3.7:a8723aa63be724652809c141714af95493aea68c   "/usr/bin/dumb-init …"   2 minutes ago   Up 2 minutes (healthy)   8080/tcp                                    quick-start_airflow-triggerer_1
     88a829428ce8   ghcr.io/apache/airflow/main/prod/python3.7:a8723aa63be724652809c141714af95493aea68c   "/usr/bin/dumb-init …"   2 minutes ago   Up 2 minutes (healthy)   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp   quick-start_airflow-webserver_1
     f3baa9496225   ghcr.io/apache/airflow/main/prod/python3.7:a8723aa63be724652809c141714af95493aea68c   "/usr/bin/dumb-init …"   2 minutes ago   Up 2 minutes (healthy)   8080/tcp                                    quick-start_airflow-scheduler_1
     134b3356ed96   ghcr.io/apache/airflow/main/prod/python3.7:a8723aa63be724652809c141714af95493aea68c   "/usr/bin/dumb-init …"   2 minutes ago   Up 2 minutes (healthy)   8080/tcp                                    quick-start_airflow-worker_1
     a5f5e8250820   redis:latest                                                                          "docker-entrypoint.s…"   3 minutes ago   Up 3 minutes (healthy)   6379/tcp                                    quick-start_redis_1
     de963f245166   postgres:13                                                                           "docker-entrypoint.s…"   3 minutes ago   Up 3 minutes (healthy)   5432/tcp                                    quick-start_postgres_1
   ```
   
   All healthy, which means the services initially passed their health checks after the start period.
   
   Versions
   ---
   
   ```console
   $ docker version
     Client:
      Version:           20.10.23+azure-2
      API version:       1.41
      Go version:        go1.19.6
      Git commit:        715524332ff91d0f9ec5ab2ec95f051456ed1dba
      Built:             Wed Jan 18 20:42:16 UTC 2023
      OS/Arch:           linux/amd64
      Context:           default
      Experimental:      true
     
     Server:
      Engine:
       Version:          20.10.22+azure-1
       API version:      1.41 (minimum version 1.12)
       Go version:       go1.18.9
       Git commit:       42c8b314993e5eb3cc2776da0bbe41d5eb4b707b
       Built:            Thu Dec 15 22:17:04 2022
       OS/Arch:          linux/amd64
       Experimental:     false
      containerd:
       Version:          1.6.18+azure-1
       GitCommit:        2456e983eb9e37e47538f59ea18f2043c9a73640
      runc:
       Version:          1.1.4
       GitCommit:        5fd4c4d144137e991c4acebb2146ab1483a97925
      docker-init:
       Version:          0.19.0
       GitCommit:        
   ```
   
   ```console
     $ docker-compose version
     docker-compose version 1.29.2, build 5becea4c
     docker-py version: 5.0.0
     CPython version: 3.7.10
     OpenSSL version: OpenSSL 1.1.0l  10 Sep 2019
   ```
   
   That is more interesting. I've seen static checks sometimes fail with this particular Docker version, `20.10.23+azure-2`, and I haven't seen that happen with Docker builds without the `azure-X` suffix.
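
   To make that correlation easier to check across CI runs, a small sketch like the one below could log the build suffix of the Docker client. The helper names are assumptions; `docker version --format` is the standard Go-template option of the Docker CLI, and the plain `20.10.23` entry is a hypothetical value added for comparison.

```python
import subprocess


def docker_client_version() -> str:
    """Ask the local Docker CLI for its client version (not called here)."""
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Client.Version}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def build_suffix(version: str) -> str:
    """Return the part after '+' in a version string, or '' if there is none."""
    _, _, suffix = version.partition("+")
    return suffix


# The first two versions come from the output above; the last one is hypothetical.
for version in ("20.10.23+azure-2", "20.10.22+azure-1", "20.10.23"):
    print(version, "->", build_suffix(version) or "no build suffix")
```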
   
   Other strange things
   ---
   
   The `Prepare Breeze and PROD image` step has a lot of errors that refer to permission denied:
   
   ```console
   Received 27910740 of 32105044 (86.9%), 26.6 MBs/sec
   Received 32105044 of 32105044 (100.0%), 29.8 MBs/sec
   Cache Size: ~31 MB (32105044 B)
   /usr/bin/tar -xf /home/runner/work/_temp/00fdf96d-139b-4954-ad8c-852b0f051104/cache.tgz -P -C /home/runner/work/airflow/airflow -z
   /usr/bin/tar: ../../../../.local: Cannot mkdir: Permission denied
   /usr/bin/tar: ../../../../.local/pipx: Cannot mkdir: No such file or directory
   /usr/bin/tar: ../../../../.local: Cannot mkdir: Permission denied
   /usr/bin/tar: ../../../../.local/pipx/shared: Cannot mkdir: No such file or directory
   /usr/bin/tar: ../../../../.local: Cannot mkdir: Permission denied
   /usr/bin/tar: ../../../../.local/pipx/shared/lib: Cannot mkdir: No such file or directory
   /usr/bin/tar: ../../../../.local: Cannot mkdir: Permission denied
   /usr/bin/tar: ../../../../.local/pipx/shared/lib/python3.7: Cannot mkdir: No such file or directory
   /usr/bin/tar: ../../../../.local: Cannot mkdir: Permission denied
   /usr/bin/tar: ../../../../.local/pipx/shared/lib/python3.7/site-packages: Cannot mkdir: No such file or directory
   /usr/bin/tar: ../../../../.local: Cannot mkdir: Permission denied
   /usr/bin/tar: ../../../../.local/pipx/shared/lib/python3.7/site-packages/_distutils_hack: Cannot mkdir: No such file or directory
   /usr/bin/tar: ../../../../.local: Cannot mkdir: Permission denied
   /usr/bin/tar: ../../../../.local/pipx/shared/lib/python3.7/site-packages/_distutils_hack/override.py: Cannot open: No such file or directory
   ```
   
   

