cccs-cat001 opened a new issue #16298:
URL: https://github.com/apache/airflow/issues/16298


   **Apache Airflow version**: 2.1.0
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   ```
   Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
   Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.7", GitCommit:"6b3f9b283463c1d5a2455df301182805e65c7145", GitTreeState:"clean", BuildDate:"2021-05-19T22:28:47Z", GoVersion:"go1.15.12", Compiler:"gc", Platform:"linux/amd64"}
   ```
   **Environment**:
   
   - **Cloud provider or hardware configuration**: Azure
   - **OS** (e.g. from /etc/os-release): ubuntu 18.04
   - **Kernel** (e.g. `uname -a`):
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   Since I deployed Airflow 2.1.0 on our cluster on Friday, the scheduler has failed 716 times, each time with a `BrokenPipeError`:
   ```
   [2021-06-07 12:07:19,362] {scheduler_job.py:1205} INFO - Executor reports execution of demo_git_notebook_parameterized.demo_git_notebook_parameterized execution_date=2021-06-07 12:05:41.835167+00:00 exited with status None for try_number 1
   [2021-06-07 12:07:22,798] {scheduler_job.py:748} INFO - Exiting gracefully upon receiving signal 15
   [2021-06-07 12:07:23,800] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 55
   [2021-06-07 12:07:24,154] {process_utils.py:207} INFO - Waiting up to 5 seconds for processes to exit...
   [2021-06-07 12:07:24,211] {process_utils.py:207} INFO - Waiting up to 5 seconds for processes to exit...
   [2021-06-07 12:07:24,265] {process_utils.py:66} INFO - Process psutil.Process(pid=55, status='terminated', exitcode=0, started='12:02:39') (55) terminated with exit code 0
   [2021-06-07 12:07:24,266] {process_utils.py:66} INFO - Process psutil.Process(pid=7433, status='terminated', started='12:07:23') (7433) terminated with exit code None
   [2021-06-07 12:07:24,266] {process_utils.py:66} INFO - Process psutil.Process(pid=7432, status='terminated', started='12:07:22') (7432) terminated with exit code None
   [2021-06-07 12:07:24,266] {kubernetes_executor.py:759} INFO - Shutting down Kubernetes executor
   [2021-06-07 12:07:24,266] {scheduler_job.py:1308} ERROR - Exception when executing Executor.end
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1286, in _execute
       self._run_scheduler_loop()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1400, in _run_scheduler_loop
       time.sleep(min(self._processor_poll_interval, next_event))
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 751, in _exit_gracefully
       sys.exit(os.EX_OK)
   SystemExit: 0
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1306, in _execute
       self.executor.end()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/kubernetes_executor.py", line 761, in end
       self._flush_task_queue()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/kubernetes_executor.py", line 714, in _flush_task_queue
       self.log.debug('Executor shutting down, task_queue approximate size=%d', self.task_queue.qsize())
     File "<string>", line 2, in qsize
     File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 834, in _callmethod
       conn.send((self._id, methodname, args, kwds))
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 206, in send
       self._send_bytes(_ForkingPickler.dumps(obj))
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
       self._send(header + buf)
     File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 368, in _send
       n = write(self._handle, buf)
   BrokenPipeError: [Errno 32] Broken pipe
   [2021-06-07 12:07:24,268] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 55
   [2021-06-07 12:07:24,268] {scheduler_job.py:1313} INFO - Exited execute loop
   ```
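
The second traceback shows `self.task_queue.qsize()` going through a `multiprocessing` manager proxy and failing in `conn.send` with `BrokenPipeError`, shortly after SIGTERM was sent to the process group. That would be consistent with the manager process having already exited, although the log does not confirm which process that was. The following is only a minimal sketch of that failure mode and of one way to guard the call; `safe_qsize` and `demo` are hypothetical names and are not part of Airflow.

```python
import multiprocessing
from typing import Optional


def safe_qsize(queue_proxy) -> Optional[int]:
    """Return the approximate queue size, or None if the proxy is unreachable.

    Hypothetical helper: BrokenPipeError and ConnectionResetError are both
    subclasses of OSError, and EOFError covers a connection closed mid-reply.
    """
    try:
        return queue_proxy.qsize()
    except (OSError, EOFError):
        return None


def demo() -> None:
    manager = multiprocessing.Manager()
    task_queue = manager.Queue()
    task_queue.put("task")

    # The manager process is alive, so the proxied call succeeds.
    print("before shutdown:", safe_qsize(task_queue))

    # Stop the manager process; the proxy keeps its now-dead connection.
    manager.shutdown()

    # Without the guard, this call would raise instead of returning.
    print("after shutdown:", safe_qsize(task_queue))


if __name__ == "__main__":
    demo()
```

Whether a guard like this is appropriate inside the executor's shutdown path is a separate question; it would hide the symptom in `Executor.end`, not explain why the scheduler receives signal 15 in the first place.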
   
   **What you expected to happen**:
   For the scheduler to keep running without failing with `BrokenPipeError`.
   
   **How to reproduce it**:
   I'm not sure. Could it be an issue with Airflow 2.1.0 itself, reproducible just by launching it in a cluster? We are using the KubernetesExecutor, with no Celery.
   Could it be an issue with Azure?
   
   **Anything else we need to know**:
   By my very rough calculation, it happens about every 6 minutes.
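
The rough figure can be checked from the numbers in this report. The sketch below assumes a launch time of Friday 2021-06-04 at 12:00, which is a placeholder because the report only says "Friday"; the failure count and the log timestamps are taken from the text above. `mean_interval_minutes` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def mean_interval_minutes(start: datetime, end: datetime, failures: int) -> float:
    """Average number of minutes between failures over the window [start, end]."""
    return (end - start).total_seconds() / 60 / failures


# Assumption: the exact launch time is not given in the report.
launch = datetime(2021, 6, 4, 12, 0)
# Timestamp of the scheduler exit shown in the log.
last_log = datetime(2021, 6, 7, 12, 7, 24)

print(round(mean_interval_minutes(launch, last_log, 716), 2))  # 6.04

# From the log: pid 55 started at 12:02:39 and was reported terminated at 12:07:24.
lifetime = datetime(2021, 6, 7, 12, 7, 24) - datetime(2021, 6, 7, 12, 2, 39)
print(lifetime)  # 0:04:45
```

Under that assumed launch time, the average is close to the 6-minute estimate, and the single process lifetime visible in the log (4 minutes 45 seconds) is of the same order.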
   



