Public bug reported:
[Description]
On a 3-node RabbitMQ cluster, after one of the nodes failed (kernel
crash), the following sequence of events was logged:
===
Initial crash report:
$ ack-grep -i mnesia_locker | wc -l
16
60-=CRASH REPORT==== 7-Sep-2015::14:48:45 ===
61- crasher:
--
66: {mnesia_locker,'rabbit@10-10-34-2',granted}}
67- in function gen_server2:terminate/3 (src/gen_server2.erl, line 1045)
68- ancestors: [worker_pool_sup,rabbit_sup,<0.110.0>]
69- messages: []
70- links: [<0.114.0>]
71- dictionary: [{{xtype_to_module,direct},rabbit_exchange_type_direct},
72- {{xtype_to_module,topic},rabbit_exchange_type_topic},
73- {random_seed,{5643,25632,27953}},
74- {{xtype_to_module,fanout},rabbit_exchange_type_fanout},
75- {worker_pool_worker,true},
76- {fhc_age_tree,{0,nil}},
--
88: Reason:
{unexpected_info,{mnesia_locker,'rabbit@10-10-34-2',granted}}
89- Offender: [{pid,<0.142.0>},
90- {name,25},
91- {mfargs,{worker_pool_worker,start_link,[25]}},
92- {restart_type,transient},
93- {shutdown,4294967295},
94- {child_type,worker}]
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/[email protected]
5166:Mnesia('rabbit@10-10-34-1'): ** ERROR ** mnesia_event got
{inconsistent_database, starting_partitioned_network, 'rabbit@10-10-34-2'}
sosreport-svz-op-fdc-os-3.00087142-20150909140820/var/log/rabbitmq/[email protected]
545:Mnesia('rabbit@10-10-34-3'): ** ERROR ** mnesia_event got
{inconsistent_database, starting_partitioned_network, 'rabbit@10-10-34-1'}
After this I started seeing these traces:
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/[email protected]:572:
=CRASH REPORT==== 7-Sep-2015::14:48:45 ===
  crasher:
    initial call: gen:init_it/6
    pid: <0.2346.189>
    registered_name: []
    exception exit: {{unexpected_cast,{next_job_from,<0.4960.181>}},
                     {gen_server2,call,
                      [<0.22014.255>,
                       {submit,#Fun<rabbit_misc.6.25154013>,<0.2862.189>},
                       infinity]}}
      in function gen_server2:terminate/3 (src/gen_server2.erl, line 1045)
    ancestors: [rabbit_mirror_queue_slave_sup,rabbit_sup,<0.110.0>]
    messages: []
    links: [<0.259.0>]
    dictionary: [{guid,{{3735536962,2437967587,3023977752,60868675},0}}]
    trap_exit: true
Related Bugs:
- https://bugs.launchpad.net/mos/+bug/1401948
- https://github.com/erlang/otp/compare/maint...dgud:dgud%3Bmnesia%3Bsticky-race%3BOTP-11375
Possible related upstream commit:
- http://hg.rabbitmq.com/rabbitmq-server/rev/5a63c9e273cc
This seems to be related to [1] and [2]; both fixes have been available since
rabbitmq-server 3.4.0.
[1] https://github.com/rabbitmq/rabbitmq-server/commit/1a32616b744f4dab09cba4e7a7e747c2b6550361#diff-3b9dc5e3c18be9549b0ab00763e4e123
[2] https://github.com/rabbitmq/rabbitmq-server/commit/238243b10ba6f666fcd8e84289525961fe6e68b9#diff-3b9dc5e3c18be9549b0ab00763e4e123
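For anyone triaging similar logs, here is a minimal Python sketch (not part of the original report) that pulls the Mnesia partition events quoted in the description out of a RabbitMQ log. The regular expression is inferred only from the two `inconsistent_database` lines shown in this report and may need adjusting for other log layouts. The names `find_partition_events` and `SAMPLE` are hypothetical.

```python
import re

# Pattern inferred from the two log lines quoted in this report (an assumption,
# not an official RabbitMQ/Mnesia log grammar). The event tuple may sit on the
# same line as "mnesia_event got" or on the following one, hence the \s+.
PARTITION_RE = re.compile(
    r"Mnesia\('(?P<reporter>[^']+)'\): \*\* ERROR \*\* mnesia_event got\s+"
    r"\{inconsistent_database,\s*(?P<context>\w+),\s*'(?P<peer>[^']+)'\}"
)


def find_partition_events(log_text):
    """Return (reporting node, context, peer node) for each partition event."""
    return [
        (m.group("reporter"), m.group("context"), m.group("peer"))
        for m in PARTITION_RE.finditer(log_text)
    ]


# Sample input copied from the log excerpts in the description.
SAMPLE = """\
Mnesia('rabbit@10-10-34-1'): ** ERROR ** mnesia_event got
{inconsistent_database, starting_partitioned_network, 'rabbit@10-10-34-2'}
Mnesia('rabbit@10-10-34-3'): ** ERROR ** mnesia_event got
{inconsistent_database, starting_partitioned_network, 'rabbit@10-10-34-1'}
"""

if __name__ == "__main__":
    for reporter, context, peer in find_partition_events(SAMPLE):
        print(f"{reporter} reported {context} against {peer}")
```

Run against each node's log, this gives a quick view of which nodes considered themselves partitioned from which peers after the kernel crash.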
** Affects: rabbitmq-server (Ubuntu)
Importance: Undecided
Status: New
** Tags: sts
** Tags added: sts
** Description changed:
-
Possible related upstream commit:
- http://hg.rabbitmq.com/rabbitmq-server/rev/5a63c9e273cc
+
+ This seems to be related to [1] and [2]; both fixes have been available since
+ rabbitmq-server 3.4.0.
+
+ [1] https://github.com/rabbitmq/rabbitmq-server/commit/1a32616b744f4dab09cba4e7a7e747c2b6550361#diff-3b9dc5e3c18be9549b0ab00763e4e123
+
+ [2] https://github.com/rabbitmq/rabbitmq-server/commit/238243b10ba6f666fcd8e84289525961fe6e68b9#diff-3b9dc5e3c18be9549b0ab00763e4e123
** Summary changed:
- Race condition in mnesia_locker after nodedown
+ Race condition in mnesia_locker after node down
--
https://bugs.launchpad.net/bugs/1496409
Title:
Race condition in mnesia_locker after node down