[
https://issues.apache.org/jira/browse/AIRFLOW-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005910#comment-17005910
]
zorro.shi commented on AIRFLOW-2516:
------------------------------------
Hi [~higrys] I had a similar issue happening at 1.10.6. The following is a
deadlock log that may help you
------------------------
LATEST DETECTED DEADLOCK
------------------------
2019-12-31 01:00:35 0x7fa024527700
*** (1) TRANSACTION:
TRANSACTION 12953238, ACTIVE 0 sec starting index read
mysql tables in use 3, locked 3
LOCK WAIT 16 lock struct(s), heap size 1136, 26 row lock(s)
MySQL thread id 4441025, OS thread handle 140324389902080, query id 132956673
iZwz9ds6obifnanjsnb2gqZ 172.18.38.223 airflow Sending data
UPDATE task_instance, (SELECT task_instance.try_number AS try_number,
task_instance.task_id AS task_id, task_instance.dag_id AS dag_id,
task_instance.execution_date AS execution_date, task_instance.start_date AS
start_date, task_instance.end_date AS end_date, task_instance.duration AS
duration, task_instance.state AS state, task_instance.max_tries AS max_tries,
task_instance.hostname AS hostname, task_instance.unixname AS unixname,
task_instance.job_id AS job_id, task_instance.pool AS pool, task_instance.queue
AS queue, task_instance.priority_weight AS priority_weight,
task_instance.operator AS operator, task_instance.queued_dttm AS queued_dttm,
task_instance.pid AS pid, task_instance.executor_config AS executor_config
FROM task_instance LEFT OUTER JOIN dag_run ON task_instance.dag_id =
dag_run.dag_id AND task_instance.execution_date = dag_run.execution_date
WHERE task_in
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 122 page no 39 n bits 160 index PRIMARY of table
`airflow`.`task_instance` trx id 12953238 lock_mode X locks rec but not gap
waiting
Record lock, heap no 88 PHYSICAL RECORD: n_fields 21; compact format; info bits 0
0: len 24; hex 776d735f6e6b635f7072696e74696e675f686973746f7279; asc
wms_nkc_printing_history;;
1: len 19; hex 776d735f617263686976652e776d735f6e6b63; asc
wms_archive.wms_nkc;;
2: len 7; hex 5e08db90000000; asc ^ ;;
3: len 6; hex 000000c5a694; asc ;;
4: len 7; hex 3b000001870928; asc ; (;;
5: len 7; hex 5e0a2d22050d7b; asc ^ -" {;;
6: len 7; hex 5e0a2d22050d88; asc ^ -" ;;
7: SQL NULL;
8: len 6; hex 717565756564; asc queued;;
9: len 4; hex 80000000; asc ;;
10: len 0; hex ; asc ;;
11: len 4; hex 726f6f74; asc root;;
12: SQL NULL;
13: len 12; hex 64656661756c745f706f6f6c; asc default_pool;;
14: len 7; hex 64656661756c74; asc default;;
15: len 4; hex 80000002; asc ;;
16: SQL NULL;
17: len 7; hex 5e0a2d33061021; asc ^ -3 !;;
18: SQL NULL;
19: len 4; hex 80000000; asc ;;
20: len 5; hex 80057d942e; asc } .;;
*** (2) TRANSACTION:
TRANSACTION 12953236, ACTIVE 0 sec updating or deleting, thread declared inside
InnoDB 4999
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 4445271, OS thread handle 140325780879104, query id 132956674
iZwz9ds6obifnanjsnb2gqZ 172.18.38.223 airflow updating
UPDATE task_instance SET state='queued', queued_dttm='2019-12-30
17:00:35.397345' WHERE task_instance.task_id = 'wms_nkc_printing_history' AND
task_instance.dag_id = 'wms_archive.wms_nkc' AND task_instance.execution_date =
'2019-12-29 17:00:00'
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 122 page no 39 n bits 160 index PRIMARY of table
`airflow`.`task_instance` trx id 12953236 lock_mode X locks rec but not gap
Record lock, heap no 88 PHYSICAL RECORD: n_fields 21; compact format; info bits 0
0: len 24; hex 776d735f6e6b635f7072696e74696e675f686973746f7279; asc
wms_nkc_printing_history;;
1: len 19; hex 776d735f617263686976652e776d735f6e6b63; asc
wms_archive.wms_nkc;;
2: len 7; hex 5e08db90000000; asc ^ ;;
3: len 6; hex 000000c5a694; asc ;;
4: len 7; hex 3b000001870928; asc ; (;;
5: len 7; hex 5e0a2d22050d7b; asc ^ -" {;;
6: len 7; hex 5e0a2d22050d88; asc ^ -" ;;
7: SQL NULL;
8: len 6; hex 717565756564; asc queued;;
9: len 4; hex 80000000; asc ;;
10: len 0; hex ; asc ;;
11: len 4; hex 726f6f74; asc root;;
12: SQL NULL;
13: len 12; hex 64656661756c745f706f6f6c; asc default_pool;;
14: len 7; hex 64656661756c74; asc default;;
15: len 4; hex 80000002; asc ;;
16: SQL NULL;
17: len 7; hex 5e0a2d33061021; asc ^ -3 !;;
18: SQL NULL;
19: len 4; hex 80000000; asc ;;
20: len 5; hex 80057d942e; asc } .;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 122 page no 17 n bits 320 index ti_dag_state of table
`airflow`.`task_instance` trx id 12953236 lock_mode X locks rec but not gap
waiting
Record lock, heap no 235 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 19; hex 776d735f617263686976652e776d735f6e6b63; asc
wms_archive.wms_nkc;;
1: len 9; hex 7363686564756c6564; asc scheduled;;
2: len 24; hex 776d735f6e6b635f7072696e74696e675f686973746f7279; asc
wms_nkc_printing_history;;
3: len 7; hex 5e08db90000000; asc ^ ;;
*** WE ROLL BACK TRANSACTION (2)
----------------------
Airflow exception log
----------------------
Traceback (most recent call last):
File
"/usr/local/python38/lib/python3.8/site-packages/airflow/models/taskinstance.py",
line 930, in _run_raw_task
result = task_copy.execute(context=context)
File
"/usr/local/python38/lib/python3.8/site-packages/airflow/operators/subdag_operator.py",
line 100, in execute
self.subdag.run(
File "/usr/local/python38/lib/python3.8/site-packages/airflow/models/dag.py",
line 1284, in run
job.run()
File
"/usr/local/python38/lib/python3.8/site-packages/airflow/jobs/base_job.py",
line 222, in run
self._execute()
File "/usr/local/python38/lib/python3.8/site-packages/airflow/utils/db.py",
line 74, in wrapper
return func(*args, **kwargs)
File
"/usr/local/python38/lib/python3.8/site-packages/airflow/jobs/backfill_job.py",
line 764, in _execute
self._execute_for_run_dates(run_dates=dates_to_process,
File "/usr/local/python38/lib/python3.8/site-packages/airflow/utils/db.py",
line 70, in wrapper
return func(*args, **kwargs)
File
"/usr/local/python38/lib/python3.8/site-packages/airflow/jobs/backfill_job.py",
line 694, in _execute_for_run_dates
processed_dag_run_dates = self._process_backfill_task_instances(
File "/usr/local/python38/lib/python3.8/site-packages/airflow/utils/db.py",
line 70, in wrapper
return func(*args, **kwargs)
File
"/usr/local/python38/lib/python3.8/site-packages/airflow/jobs/backfill_job.py",
line 586, in _process_backfill_task_instances
_per_task_process(task, key, ti)
File "/usr/local/python38/lib/python3.8/site-packages/airflow/utils/db.py",
line 74, in wrapper
return func(*args, **kwargs)
File
"/usr/local/python38/lib/python3.8/site-packages/airflow/jobs/backfill_job.py",
line 508, in _per_task_process
session.commit()
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py",
line 1027, in commit
self.transaction.commit()
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py",
line 494, in commit
self._prepare_impl()
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py",
line 473, in _prepare_impl
self.session.flush()
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py",
line 2470, in flush
self._flush(objects)
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py",
line 2608, in _flush
transaction.rollback(_capture_exception=True)
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py",
line 68, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/util/compat.py",
line 153, in reraise
raise value
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/session.py",
line 2568, in _flush
flush_context.execute()
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py",
line 422, in execute
rec.execute(self)
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py",
line 586, in execute
persistence.save_obj(
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py",
line 230, in save_obj
_emit_update_statements(
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py",
line 995, in _emit_update_statements
c = cached_connections[connection].execute(
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py",
line 982, in execute
return meth(self, multiparams, params)
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/sql/elements.py",
line 287, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py",
line 1095, in _execute_clauseelement
ret = self._execute_context(
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py",
line 1249, in _execute_context
self._handle_dbapi_exception(
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py",
line 1476, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/util/compat.py",
line 398, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/util/compat.py",
line 152, in reraise
raise value.with_traceback(tb)
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/base.py",
line 1245, in _execute_context
self.dialect.do_execute(
File
"/usr/local/python38/lib/python3.8/site-packages/sqlalchemy/engine/default.py",
line 581, in do_execute
cursor.execute(statement, parameters)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py",
line 255, in execute
self.errorhandler(self, exc, value)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/connections.py",
line 50, in defaulterrorhandler
raise errorvalue
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py",
line 252, in execute
res = self._query(query)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/cursors.py",
line 378, in _query
db.query(q)
File "/usr/local/python38/lib/python3.8/site-packages/MySQLdb/connections.py",
line 280, in query
_mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1213,
'Deadlock found when trying to get lock; try restarting transaction')
[SQL: UPDATE task_instance SET state=%s, queued_dttm=%s WHERE
task_instance.task_id = %s AND task_instance.dag_id = %s AND
task_instance.execution_date = %s]
[parameters: ('queued', datetime.datetime(2019, 12, 30, 17, 0, 35, 397345,
tzinfo=<Timezone [UTC]>), 'wms_nkc_printing_history', 'wms_archive.wms_nkc',
datetime.datetime(2019, 12, 29, 17, 0, tzinfo=<Timezone [UTC]>))]
> Deadlock found when trying to update task_instance table
> --------------------------------------------------------
>
> Key: AIRFLOW-2516
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2516
> Project: Apache Airflow
> Issue Type: Bug
> Components: DagRun
> Affects Versions: 1.8.0
> Reporter: Jeff Liu
> Priority: Major
> Attachments: Screenshot 2019-12-30 at 10.42.52.png,
> image-2019-12-30-10-48-41-313.png, image-2019-12-30-10-58-02-610.png, jobs.py
>
>
>
>
> {code:java}
> [2018-05-23 17:59:57,218] {base_task_runner.py:98} INFO - Subtask:
> [2018-05-23 17:59:57,217] {base_executor.py:49} INFO - Adding to queue:
> airflow run production_wipeout_wipe_manager.Carat Carat_20180227
> 2018-05-23T17:41:18.815809 --local -sd DAGS_FOLDER/wipeout/wipeout.py
> [2018-05-23 17:59:57,231] {base_task_runner.py:98} INFO - Subtask: Traceback
> (most recent call last):
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/bin/airflow", line 27, in <module>
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask:
> args.func(args)
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/airflow/bin/cli.py", line 392, in run
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask:
> pool=args.pool,
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in
> wrapper
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: result =
> func(*args, **kwargs)
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1532, in
> _run_raw_task
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask:
> self.handle_failure(e, test_mode, context)
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1641, in
> handle_failure
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask:
> session.merge(self)
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line
> 1920, in merge
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask:
> _resolve_conflict_map=_resolve_conflict_map)
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line
> 1974, in _merge
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: merged =
> self.query(mapper.class_).get(key[1])
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 882,
> in get
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: ident,
> loading.load_on_pk_identity)
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 952,
> in _get_impl
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: return
> db_load_fn(self, primary_key_identity)
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 247,
> in load_on_pk_i
> dentity
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: return
> q.one()
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2884,
> in one
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: ret =
> self.one_or_none()
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2854,
> in one_or_none
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: ret =
> list(self)
> [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2925,
> in __iter__
> [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: return
> self._execute_and_instances(context)
> [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2946,
> in _execute_and_instances
> [2018-05-23 17:59:57,240] {base_task_runner.py:98} INFO - Subtask:
> close_with_result=True)
> [2018-05-23 17:59:57,240] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2955,
> in _get_bind_ar
> s
> [2018-05-23 17:59:57,240] {base_task_runner.py:98} INFO - Subtask: **kw
> [2018-05-23 17:59:57,240] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2937,
> in _connection_f
> rom_session
> [2018-05-23 17:59:57,241] {base_task_runner.py:98} INFO - Subtask: conn =
> self.session.connection(**kw)
> [2018-05-23 17:59:57,241] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line
> 1035, in connection
> [2018-05-23 17:59:57,241] {base_task_runner.py:98} INFO - Subtask:
> execution_options=execution_options)
> [2018-05-23 17:59:57,241] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line
> 1040, in _connection
> _for_bind
> [2018-05-23 17:59:57,242] {base_task_runner.py:98} INFO - Subtask: engine,
> execution_options)
> [2018-05-23 17:59:57,242] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 388,
> in _connection_
> for_bind
> [2018-05-23 17:59:57,242] {base_task_runner.py:98} INFO - Subtask:
> self._assert_active()
> [2018-05-23 17:59:57,243] {base_task_runner.py:98} INFO - Subtask: File
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 276,
> in _assert_acti
> ve
> [2018-05-23 17:59:57,243] {base_task_runner.py:98} INFO - Subtask: %
> self._rollback_exception
> [2018-05-23 17:59:57,244] {base_task_runner.py:98} INFO - Subtask:
> sqlalchemy.exc.InvalidRequestError: This Session's transaction has been
> rolled back due to a previou
> s exception during flush. To begin a new transaction with this Session, first
> issue Session.rollback(). Original exception was:
> (_mysql_exceptions.OperationalError) (1
> 213, 'Deadlock found when trying to get lock; try restarting transaction')
> [SQL: u'UPDATE task_instance SET state=%s WHERE task_instance.task_id = %s
> AND task_instance
> .dag_id = %s AND task_instance.execution_date = %s'] [parameters: (u'queued',
> 'Carat_20180227', 'production_wipeout_wipe_manager.Carat',
> datetime.datetime(2018, 5, 23,
> 17, 41, 18, 815809))] (Background on this error at:
> http://sqlalche.me/e/e3q8){code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)