Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-06-07 Thread via GitHub
eladkal closed issue #39088: Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested URL: https://github.com/apache/airflow/issues/39088 -- This is an automated message from the Apache Git

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-05-04 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2094155454 @RNHTTR Hi, I created a PR to fix this issue: https://github.com/apache/airflow/pull/39406 PTAL at your convenience. -- This is an automated message from the Apache Git

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-05-03 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2093473579 I think I found the cause of this issue. The logic of `adopt_or_reset_orphaned_tasks` is called 2 times in a very short time due to the use of `run_with_db_retries`

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-23 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2072822564 @RNHTTR Tested and reproduced the issue on Airflow 2.8.4, below is the error logs from scheduler and one terminated pod ``` 2024-04-23T15:56:15.805+]

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-23 Thread via GitHub
RNHTTR commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2072601414 @tanvn Can you share the full traceback? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-23 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2072363288 @paramjeet01 I use pip to install from the source code locally: https://github.com/apache/airflow/blob/4ae85d754e9f8a65d461e86eb6111d3b9974a065/INSTALL#L73 on a docker

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-23 Thread via GitHub
paramjeet01 commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2071929213 > @dstandish > > ~I would like to build a Docker image from the source code (which I am going to add some modifications for testing) on my Macbook (Apple M2), is there

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-21 Thread via GitHub
paramjeet01 commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2068007395 @tanvn @RNHTTR , Could this issue be related to the one I raised : https://github.com/apache/airflow/issues/39096 We are running 8 schedulers and we see this issue

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-18 Thread via GitHub
RNHTTR commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2065334748 Thanks for the extra details! I've assigned you to the Issue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-18 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2063824238 @dstandish I would like to build a Docker image from the source code (which I am going to add some modifications for testing) on my Macbook (Apple M2), is there any document

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-18 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2063329564 **Just tested with Airflow 2.8.4 and confirmed that the issue still happens.** ``` [2024-04-18T08:29:32.060+] {kubernetes_executor.py:661} INFO - attempting to

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-18 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2063070213 I managed to reproduce this issue on my development environment by creating a DAG with about 100 tasks. When there are about 15 running tasks, I run a helm command to upgrade the

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-17 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2062863762 @dstandish Thanks! Yes, I believe I can reproduce on my dev environment and I want to build a docker image from the source code, if there is a document, I would appreciate if you

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-17 Thread via GitHub
RNHTTR commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2062447790 @tanvn Can you check the [audit logs](https://airflow.apache.org/docs/apache-airflow/stable/security/audit_logs.html) for a task with the following log? ``` [2024-04-17,

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-17 Thread via GitHub
dstandish commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2061970202 Hi @tanvn It seems you've made some progress with your repro scenario. What I would recommend doing is continue digging in. Add more logging to try to see exactly what

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-17 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2061708972 And @dstandish it seems that this issue is related to the health check server of the scheduler so I wonder if you might have any idea on this  -- This is an automated

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-17 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2061362205 I also checked the log of the scheduler ``` [2024-04-17T08:14:19.661+] {kubernetes_executor.py:749} INFO - attempting to adopt pod pod-3dbaxxx

Re: [I] Many running task instances are cleared by the new scheduler when an old scheduler is terminated and its health check server is periodically requested [airflow]

2024-04-17 Thread via GitHub
tanvn commented on issue #39088: URL: https://github.com/apache/airflow/issues/39088#issuecomment-2061264796 @potiuk @ephraimbuddy Hi, do you even have a clue about this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub