I am experiencing database deadlocks in a worker processes that is
being scheduled by web2py's new scheduler.
I'm using the latest web2py with Python 2.7.2 and MySQL (a recent
release) on OS X 10.7.3.
The worker does a lengthy calculation that I wanted to take off the
http thread. Every time I get a certain action, I schedule the thread
programmatically like this:
db.scheduler_task.insert(application_name = 'geo', task_name =
'match_task',
function_name = 'match_listings', vars =
json.dumps(vars))
In my testing I execute this task about once every two seconds. It has
failed in two separate tests after about an hour.
I'm getting this traceback:
File "/Users/davidp/dev/python/web2py/gluon/shell.py", line 214, in
run
exec(python_code, _env)
File "<string>", line 1, in <module>
File "/Users/davidp/dev/python/web2py/gluon/scheduler.py", line 363,
in loop
MetaScheduler.loop(self)
File "/Users/davidp/dev/python/web2py/gluon/scheduler.py", line 255,
in loop
task = self.pop_task()
File "/Users/davidp/dev/python/web2py/gluon/scheduler.py", line 392,
in pop_task
grabbed.update(assigned_worker_name='',status=QUEUED)
File "/Users/davidp/dev/python/web2py/gluon/dal.py", line 6346, in
update
return self.db._adapter.update(tablename,self.query,fields)
File "/Users/davidp/dev/python/web2py/gluon/dal.py", line 1093, in
update
self.execute(sql)
File "/Users/davidp/dev/python/web2py/gluon/dal.py", line 1359, in
execute
return self.log_execute(*a, **b)
File "/Users/davidp/dev/python/web2py/gluon/dal.py", line 1353, in
log_execute
ret = self.cursor.execute(*a, **b)
File "/Users/davidp/dev/python/web2py/gluon/contrib/pymysql/
cursors.py", line 108, in execute
self.errorhandler(self, exc, value)
File "/Users/davidp/dev/python/web2py/gluon/contrib/pymysql/
connections.py", line 184, in defaulterrorhandler
raise errorclass, errorvalue
InternalError: (1213, u'Deadlock found when trying to get lock; try
restarting transaction')
When executed, the task does this SQL select:
expr = '''SELECT *, ( 3959 * acos (cos (radians (%s)) * cos (radians
(lat)) *
cos (radians (lon) - radians (%s)) + sin (radians (%s)) * sin
(radians (lat))))
AS distance FROM users HAVING distance < radius;''' % (point[0],
point[1], point[0])
nearby_users = db.executesql(expr, as_dict = True)
and then does a read to collect some statistics to track performance:
num_listings = db (db.listings.id > 0).count()
During this time, I am receiving asynchronous http requests that
insert, update and delete from this table (and another table, too).
My MySQL client responds to a "show engine innodb status;" command
this way:
------------------------
LATEST DETECTED DEADLOCK
------------------------
120227 20:38:16
*** (1) TRANSACTION:
TRANSACTION 27AFF, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 376, 2 row lock(s), undo log
entries 1
MySQL thread id 19816, OS thread handle 0x10c5db000, query id 1558959
localhost 127.0.0.1 root Updating
UPDATE scheduler_task SET status='QUEUED',assigned_worker_name=''
WHERE ((scheduler_task.assigned_worker_name = 'Mycroft.local#d600fcf2-
a363-4870-bd24-1ad3694f6a62') AND (scheduler_task.status =
'ASSIGNED'))
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 3031 n bits 120 index `PRIMARY` of
table `geo`.`scheduler_task` trx id 27AFF lock_mode X waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 20; compact format;
info bits 0
0: len 4; hex 8000002c; asc ,;;
1: len 6; hex 000000024db1; asc M ;;
2: len 7; hex 6a00000d4c2f35; asc j L/5;;
3: len 3; hex 67656f; asc geo;;
4: len 10; hex 656d61696c207461736b; asc email task;;
5: len 4; hex 6d61696e; asc main;;
6: len 6; hex 515545554544; asc QUEUED;;
7: len 10; hex 73656e645f656d61696c; asc send_email;;
8: len 2; hex 5b5d; asc [];;
9: len 2; hex 7b7d; asc {};;
10: len 1; hex 54; asc T;;
11: len 8; hex 8000124c9afce86c; asc L l;;
12: len 8; hex 8000124c9afee908; asc L ;;
13: len 8; hex 8000124c9b0c193a; asc L :;;
14: len 4; hex 80000000; asc ;;
15: len 4; hex 80000e10; asc ;;
16: len 4; hex 80000dde; asc ;;
17: len 4; hex 8000000d; asc ;;
18: len 8; hex 8000124c9afec1f8; asc L ;;
19: len 0; hex ; asc ;;
*** (2) TRANSACTION:
TRANSACTION 27AFD, ACTIVE 0 sec fetching rows
mysql tables in use 2, locked 2
52 lock struct(s), heap size 6960, 2775 row lock(s), undo log entries
9
MySQL thread id 19817, OS thread handle 0x10c598000, query id 1558955
localhost 127.0.0.1 root Sending data
UPDATE scheduler_task SET status='QUEUED',assigned_worker_name=''
WHERE ((scheduler_task.assigned_worker_name IN (SELECT
scheduler_worker.worker_name FROM scheduler_worker WHERE
(scheduler_worker.last_heartbeat < '2012-02-27 20:38:07'))) AND
(scheduler_task.status IN ('RUNNING','ASSIGNED','QUEUED')))
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 3031 n bits 120 index `PRIMARY` of
table `geo`.`scheduler_task` trx id 27AFD lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format;
info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
There are pages of other stuff besides, but this seems most to the
point. I suppose if I knew more about databases that would mean
something to me.
I also have two other tasks that I schedule manually, one running
every hour and another every minute, although the first deadlock
occurred before the once-a-minute task was installed and neither time
did it occur near the time of execution of the hourly task.
I understand that deadlocks occur when different processes acquire
locks in different orders. But I don't see how I can control that if
I'm using the DAL.
Any insight would be appreciated.