cccs-cat001 opened a new issue #16298:
URL: https://github.com/apache/airflow/issues/16298
**Apache Airflow version**: 2.1.0
**Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
```
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.7", GitCommit:"6b3f9b283463c1d5a2455df301182805e65c7145", GitTreeState:"clean", BuildDate:"2021-05-19T22:28:47Z", GoVersion:"go1.15.12", Compiler:"gc", Platform:"linux/amd64"}
```
**Environment**:
- **Cloud provider or hardware configuration**: Azure
- **OS** (e.g. from /etc/os-release): Ubuntu 18.04
- **Kernel** (e.g. `uname -a`):
- **Install tools**:
- **Others**:
**What happened**:
Since I launched Airflow 2.1.0 on our cluster on Friday, the scheduler has
failed 716 times, each time with a `BrokenPipeError`:
```
[2021-06-07 12:07:19,362] {scheduler_job.py:1205} INFO - Executor reports execution of demo_git_notebook_parameterized.demo_git_notebook_parameterized execution_date=2021-06-07 12:05:41.835167+00:00 exited with status None for try_number 1
[2021-06-07 12:07:22,798] {scheduler_job.py:748} INFO - Exiting gracefully upon receiving signal 15
[2021-06-07 12:07:23,800] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 55
[2021-06-07 12:07:24,154] {process_utils.py:207} INFO - Waiting up to 5 seconds for processes to exit...
[2021-06-07 12:07:24,211] {process_utils.py:207} INFO - Waiting up to 5 seconds for processes to exit...
[2021-06-07 12:07:24,265] {process_utils.py:66} INFO - Process psutil.Process(pid=55, status='terminated', exitcode=0, started='12:02:39') (55) terminated with exit code 0
[2021-06-07 12:07:24,266] {process_utils.py:66} INFO - Process psutil.Process(pid=7433, status='terminated', started='12:07:23') (7433) terminated with exit code None
[2021-06-07 12:07:24,266] {process_utils.py:66} INFO - Process psutil.Process(pid=7432, status='terminated', started='12:07:22') (7432) terminated with exit code None
[2021-06-07 12:07:24,266] {kubernetes_executor.py:759} INFO - Shutting down Kubernetes executor
[2021-06-07 12:07:24,266] {scheduler_job.py:1308} ERROR - Exception when executing Executor.end
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1286, in _execute
    self._run_scheduler_loop()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1400, in _run_scheduler_loop
    time.sleep(min(self._processor_poll_interval, next_event))
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 751, in _exit_gracefully
    sys.exit(os.EX_OK)
SystemExit: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1306, in _execute
    self.executor.end()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/kubernetes_executor.py", line 761, in end
    self._flush_task_queue()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/kubernetes_executor.py", line 714, in _flush_task_queue
    self.log.debug('Executor shutting down, task_queue approximate size=%d', self.task_queue.qsize())
  File "<string>", line 2, in qsize
  File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 834, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
[2021-06-07 12:07:24,268] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 55
[2021-06-07 12:07:24,268] {scheduler_job.py:1313} INFO - Exited execute loop
```
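The second traceback ends in a `multiprocessing` manager proxy call (`qsize` via `_callmethod`), which would be consistent with the manager process already being gone by the time `Executor.end` runs. The following is a minimal sketch of that failure mode outside Airflow, not Airflow's own code: the function name `qsize_after_manager_exit` is hypothetical, and `manager.shutdown()` is only a stand-in for the manager process being terminated. It does not explain why the scheduler received signal 15 in the first place.

```python
import multiprocessing


def qsize_after_manager_exit():
    """Return the name of the exception raised by a proxy call made after
    the manager process has gone away, or None if the call succeeds."""
    # "fork" is the default start method on Linux for the Python 3.8 shown
    # in the traceback; it is requested explicitly so the sketch behaves the
    # same on newer interpreters. It is not available on Windows.
    ctx = multiprocessing.get_context("fork")
    manager = ctx.Manager()

    # task_queue is a proxy; the real queue lives in the manager process.
    task_queue = manager.Queue()
    task_queue.put("task")
    task_queue.qsize()  # succeeds and opens this thread's connection

    # Assumption: stands in for the manager process receiving SIGTERM.
    manager.shutdown()

    try:
        task_queue.qsize()  # the proxy now talks to a process that has exited
    except (OSError, EOFError) as exc:  # BrokenPipeError is an OSError subclass
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print(qsize_after_manager_exit())
```

If this is what happens here, the `BrokenPipeError` would be a side effect of the shutdown rather than its cause, and the signal 15 logged at 12:07:22 would be the thing to trace.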
**What you expected to happen**:
For the scheduler to keep running without these repeated failures.
**How to reproduce it**:
I'm not sure. It may be an issue with Airflow 2.1.0 itself, in which case
simply launching it in a cluster might reproduce it. We are using the
KubernetesExecutor, without Celery.
Could it be an issue with Azure?
**Anything else we need to know**:
By my very rough calculation, it happens about every 6 minutes.
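The interval could be measured rather than estimated by collecting the timestamps of the "Exiting gracefully upon receiving signal 15" entries from the scheduler log. A minimal sketch, assuming each entry sits on one line in the format shown in the log above; the function name `shutdown_intervals` and the sample lines are illustrative, and two of the three sample timestamps are made up.

```python
import re
from datetime import datetime
from statistics import mean

# Matches the timestamp of a scheduler shutdown entry, e.g.
# "[2021-06-07 12:07:22,798] {scheduler_job.py:748} INFO - Exiting gracefully ..."
SHUTDOWN_LINE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\] .*Exiting gracefully"
)


def shutdown_intervals(log_lines):
    """Return the seconds elapsed between consecutive scheduler shutdowns."""
    stamps = []
    for line in log_lines:
        match = SHUTDOWN_LINE.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


# Hypothetical sample: only the 12:07:22 timestamp comes from the log above;
# the other two are invented to exercise the function.
suffix = " {scheduler_job.py:748} INFO - Exiting gracefully upon receiving signal 15"
sample = [
    "[2021-06-07 12:01:22,798]" + suffix,
    "[2021-06-07 12:07:22,798]" + suffix,
    "[2021-06-07 12:13:22,798]" + suffix,
]

intervals = shutdown_intervals(sample)
print(mean(intervals) / 60)  # mean minutes between shutdowns in the sample
```

Passing an open log file instead of `sample` works the same way, since the function only iterates over lines.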