olchas commented on issue #12103:
URL: https://github.com/apache/airflow/issues/12103#issuecomment-722480796
Approximately 10 minutes after sending SIGTERM to pid 2130, the entire
pod was restarted, which effectively cleared the remaining hanging processes.
However, there is no error message in the log of pod
`airflow-worker-86455b549d-zkjsc` apart from the info line for the terminated
task, which exited with code `-15`:
```
[2020-11-05 12:45:14,735] {local_task_job.py:103} INFO - Task exited with
return code -15
```
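For context on the `-15` above: on POSIX systems, Python's `subprocess` module reports a child killed by a signal as a negative return code equal to the signal number, and SIGTERM is signal 15. The following is a minimal sketch of that convention only, not Airflow's code; the helper name `describe_return_code` is made up for illustration.

```python
import signal
import subprocess
import sys

# Start a child process that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Send SIGTERM, as was done manually to the task process in this report.
child.send_signal(signal.SIGTERM)
return_code = child.wait()


def describe_return_code(code):
    """Hypothetical helper: map a subprocess return code to a readable label."""
    if code < 0:
        # A negative value means the child was terminated by signal number -code.
        return signal.Signals(-code).name
    return "exit status {}".format(code)


print(return_code, describe_return_code(return_code))
```

So a return code of `-15` by itself is consistent with the task having been stopped by the SIGTERM that was sent, rather than with a failure inside the task.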
Around the time of the worker pod restart, I found the following error message
in the scheduler pod:
```
[2020-11-05 12:54:24,015] {dagbag.py:397} INFO - Filling up the DagBag from
/home/airflow/gcs/dags/elastic_dag.py
[2020-11-05 12:54:35,050] {dagbag.py:397} INFO - Filling up the DagBag from
/home/airflow/gcs/dags/airflow_monitoring.py
Process DagFileProcessor502-Process:
Traceback (most recent call last):
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line
2285, in _wrap_pool_connect
return fn()
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 363, in connect
return _ConnectionFairy._checkout(self)
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 773, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 492, in checkout
rec = pool._do_get()
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/impl.py",
line 139, in _do_get
self._dec_overflow()
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py",
line 69, in __exit__
exc_value, with_traceback=exc_tb,
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line
178, in raise_
raise exception
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/impl.py",
line 136, in _do_get
return self._create_connection()
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 308, in _create_connection
return _ConnectionRecord(self)
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 437, in __init__
self.__connect(first_connect_check=True)
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 657, in __connect
pool.logger.debug("Error on connect(): %s", e)
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py",
line 69, in __exit__
exc_value, with_traceback=exc_tb,
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line
178, in raise_
raise exception
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 652, in __connect
connection = pool._invoke_creator(self)
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py",
line 114, in connect
return dialect.connect(*cargs, **cparams)
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line
488, in connect
return self.dbapi.connect(*cargs, **cparams)
File "/opt/python3.6/lib/python3.6/site-packages/MySQLdb/__init__.py",
line 85, in Connect
return Connection(*args, **kwargs)
File "/opt/python3.6/lib/python3.6/site-packages/MySQLdb/connections.py",
line 208, in __init__
super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2006, "Unknown MySQL server host
'airflow-sqlproxy-service.default.svc.cluster.local' (110)")
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/python3.6/lib/python3.6/multiprocessing/process.py", line 258,
in _bootstrap
self.run()
File "/opt/python3.6/lib/python3.6/multiprocessing/process.py", line 93,
in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/airflow/airflow/jobs/scheduler_job.py", line 164, in
_run_file_processor
pickle_dags)
File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/airflow/airflow/jobs/scheduler_job.py", line 1571, in
process_file
dag.sync_to_db()
File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/airflow/airflow/models/dag.py", line 1506, in
sync_to_db
DagModel).filter(DagModel.dag_id == self.dag_id).first()
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/orm/query.py",
line 3298, in first
ret = list(self[0:1])
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/orm/query.py",
line 3076, in __getitem__
return list(res)
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/orm/query.py",
line 3403, in __iter__
return self._execute_and_instances(context)
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/orm/query.py",
line 3425, in _execute_and_instances
querycontext, self._connection_from_session, close_with_result=True
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/orm/query.py",
line 3440, in _get_bind_args
mapper=self._bind_mapper(), clause=querycontext.statement, **kw
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/orm/query.py",
line 3418, in _connection_from_session
conn = self.session.connection(**kw)
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line
1133, in connection
execution_options=execution_options,
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line
1139, in _connection_for_bind
engine, execution_options
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line
432, in _connection_for_bind
conn = bind._contextual_connect()
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line
2251, in _contextual_connect
self._wrap_pool_connect(self.pool.connect, None),
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line
2289, in _wrap_pool_connect
e, dialect, self
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line
1555, in _handle_dbapi_exception_noconnection
sqlalchemy_exception, with_traceback=exc_info[2], from_=e
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line
178, in raise_
raise exception
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line
2285, in _wrap_pool_connect
return fn()
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 363, in connect
return _ConnectionFairy._checkout(self)
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 773, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 492, in checkout
rec = pool._do_get()
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/impl.py",
line 139, in _do_get
self._dec_overflow()
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py",
line 69, in __exit__
exc_value, with_traceback=exc_tb,
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line
178, in raise_
raise exception
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/impl.py",
line 136, in _do_get
return self._create_connection()
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 308, in _create_connection
return _ConnectionRecord(self)
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 437, in __init__
self.__connect(first_connect_check=True)
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 657, in __connect
pool.logger.debug("Error on connect(): %s", e)
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py",
line 69, in __exit__
exc_value, with_traceback=exc_tb,
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line
178, in raise_
raise exception
File "/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/pool/base.py",
line 652, in __connect
connection = pool._invoke_creator(self)
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py",
line 114, in connect
return dialect.connect(*cargs, **cparams)
File
"/opt/python3.6/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line
488, in connect
return self.dbapi.connect(*cargs, **cparams)
File "/opt/python3.6/lib/python3.6/site-packages/MySQLdb/__init__.py",
line 85, in Connect
return Connection(*args, **kwargs)
File "/opt/python3.6/lib/python3.6/site-packages/MySQLdb/connections.py",
line 208, in __init__
super(Connection, self).__init__(*args, **kwargs2)
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2006,
"Unknown MySQL server host 'airflow-sqlproxy-service.default.svc.cluster.local'
(110)")
(Background on this error at: http://sqlalche.me/e/e3q8)
[2020-11-05 12:55:00,149] {dagbag.py:397} INFO - Filling up the DagBag from
/home/airflow/gcs/dags/airflow_monitoring.py
[2020-11-05 12:55:00,153] {dagbag.py:397} INFO - Filling up the DagBag from
/home/airflow/gcs/dags/elastic_dag.py
```
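The `OperationalError` in the traceback says the MySQL client could not resolve the host `airflow-sqlproxy-service.default.svc.cluster.local`; the trailing `(110)` is probably the OS error number, which on Linux corresponds to a connection timeout. A quick way to check name resolution from inside a pod might look like the sketch below. The function name `can_resolve` and the port `3306` (the MySQL default) are assumptions, not values taken from this deployment.

```python
import socket


def can_resolve(hostname, port=3306):
    """Return True if the hostname resolves to at least one address.

    Hypothetical helper; port 3306 is only the MySQL default, assumed here.
    """
    try:
        socket.getaddrinfo(hostname, port)
    except socket.gaierror:
        # Raised when the resolver cannot map the name to an address.
        return False
    return True


if __name__ == "__main__":
    # Hostname copied from the error message in the scheduler log.
    host = "airflow-sqlproxy-service.default.svc.cluster.local"
    print(host, "resolves:", can_resolve(host))
```

If this prints `False` only around the time of the worker pod restart, that would point to a temporary DNS or networking problem in the cluster rather than to the database itself.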