gakhrejah opened a new issue #12229:
URL: https://github.com/apache/airflow/issues/12229


   Hi Team,
   
   We are getting the error logs below while running Apache Airflow on AWS EKS.
   All the pods (tasks) are in the Completed state but are not removed by
   Airflow. After a manual restart of the scheduler, everything works for
   2-3 days, and then all the tasks get stuck again.
   
   ERROR LOGS
   [2020-11-10 07:00:07,752] {{kubernetes_executor.py:447}} ERROR - Error while health checking kube watcher process. Process died for unknown reasons
   [2020-11-10 07:00:07,765] {{kubernetes_executor.py:351}} INFO - Event: and now my watch begins starting at resource_version: 107544455
   [2020-11-10 07:00:07,782] {{kubernetes_executor.py:342}} ERROR - Unknown error in KubernetesJobWatcher. Failing
   Traceback (most recent call last):
     File "/usr/local/lib/python3.7/site-packages/airflow/contrib/executors/kubernetes_executor.py", line 340, in run
       self.worker_uuid, self.kube_config)
     File "/usr/local/lib/python3.7/site-packages/airflow/contrib/executors/kubernetes_executor.py", line 364, in _run
       **kwargs):
     File "/usr/local/lib/python3.7/site-packages/kubernetes/watch/watch.py", line 177, in stream
       status=obj['code'], reason=reason)
   kubernetes.client.exceptions.ApiException: (410)
   Reason: Gone: too old resource version: 107544455 (108550177)
   
   Process KubernetesJobWatcher-135237:
   Traceback (most recent call last):
     File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
       self.run()
     File "/usr/local/lib/python3.7/site-packages/airflow/contrib/executors/kubernetes_executor.py", line 340, in run
       self.worker_uuid, self.kube_config)
     File "/usr/local/lib/python3.7/site-packages/airflow/contrib/executors/kubernetes_executor.py", line 364, in _run
       **kwargs):
     File "/usr/local/lib/python3.7/site-packages/kubernetes/watch/watch.py", line 177, in stream
       status=obj['code'], reason=reason)
   kubernetes.client.exceptions.ApiException: (410)
   Reason: Gone: too old resource version: 107544455 (108550177)
   
   AIRFLOW_VERSION=1.10.9
   ENVIRONMENT: QA | PROD
   Docker Image : python:3.7-slim-buster
   
   Please let us know if you require any more information, and how we can
   resolve this issue. We have also tried upgrading Airflow to 1.10.10, but
   with no luck.
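
For discussion, here is a minimal sketch of how a watcher loop could recover from the 410 "too old resource version" error instead of dying. It is not the Airflow implementation. `WatchExpired` and `watch_with_reset` are hypothetical names, and `stream` stands in for whatever callable produces watch events. The assumption is that restarting the watch from resource version "0" is acceptable for the caller, even though it may replay events that were already seen.

```python
import time
from typing import Callable, Iterable, Iterator


class WatchExpired(Exception):
    """Hypothetical stand-in for an API error that carries an HTTP status."""

    def __init__(self, status: int, reason: str = ""):
        super().__init__(f"({status}) {reason}")
        self.status = status


def watch_with_reset(
    stream: Callable[[str], Iterable[dict]],
    resource_version: str,
    max_restarts: int = 3,
    backoff_seconds: float = 0.0,
) -> Iterator[dict]:
    """Yield events from stream(resource_version).

    If the stream fails with status 410, restart it from version "0",
    up to max_restarts times. Any other error is re-raised.
    """
    restarts = 0
    while True:
        try:
            for event in stream(resource_version):
                # Track the newest version seen so far.
                resource_version = event["resource_version"]
                yield event
            return  # The stream ended normally.
        except WatchExpired as err:
            if err.status != 410 or restarts >= max_restarts:
                raise
            restarts += 1
            # Assumption: "0" asks the server for any available version.
            resource_version = "0"
            time.sleep(backoff_seconds)


# Usage with a fake stream that rejects the stale version once.
seen_versions = []


def fake_stream(version: str) -> Iterator[dict]:
    seen_versions.append(version)
    if version != "0":
        raise WatchExpired(410, "Gone: too old resource version")
    yield {"resource_version": "5", "type": "ADDED"}
    yield {"resource_version": "6", "type": "MODIFIED"}


events = list(watch_with_reset(fake_stream, "107544455"))
```

With the real client, the exception to catch would presumably be the `kubernetes.client.exceptions.ApiException` shown in the traceback, checking its `status` field for 410.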
   

