Public bug reported:

[Description]

On a 3 nodes rabbitMQ cluster after a failure on one of the nodes (kernel 
crash) the following succession
of events has been registered:

===
This initial crash report

$ ack-grep -i mnesia_locker  | wc -l
16

60-=CRASH REPORT==== 7-Sep-2015::14:48:45 ===
61-  crasher:
--
66:                        {mnesia_locker,'rabbit@10-10-34-2',granted}}
67-      in function  gen_server2:terminate/3 (src/gen_server2.erl, line 1045)
68-    ancestors: [worker_pool_sup,rabbit_sup,<0.110.0>]
69-    messages: []
70-    links: [<0.114.0>]
71-    dictionary: [{{xtype_to_module,direct},rabbit_exchange_type_direct},
72-                  {{xtype_to_module,topic},rabbit_exchange_type_topic},
73-                  {random_seed,{5643,25632,27953}},
74-                  {{xtype_to_module,fanout},rabbit_exchange_type_fanout},
75-                  {worker_pool_worker,true},
76-                  {fhc_age_tree,{0,nil}},
--
88:     Reason:     
{unexpected_info,{mnesia_locker,'rabbit@10-10-34-2',granted}}
89-     Offender:   [{pid,<0.142.0>},
90-                  {name,25},
91-                  {mfargs,{worker_pool_worker,start_link,[25]}},
92-                  {restart_type,transient},
93-                  {shutdown,4294967295},
94-                  {child_type,worker}]

sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1.log
5166:Mnesia('rabbit@10-10-34-1'): ** ERROR ** mnesia_event got 
{inconsistent_database, starting_partitioned_network, 'rabbit@10-10-34-2'}

sosreport-svz-op-fdc-os-3.00087142-20150909140820/var/log/rabbitmq/rab...@10-10-34-3.log.1
545:Mnesia('rabbit@10-10-34-3'): ** ERROR ** mnesia_event got 
{inconsistent_database, starting_partitioned_network, 'rabbit@10-10-34-1'}

After this I started seeing this traces:

sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log:572:=CRASH
 REPORT==== 7-Sep-2015::14:48:45 ===
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log:573:
  crasher:
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-574-
    initial call: gen:init_it/6
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-575-
    pid: <0.2346.189>
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-576-
    registered_name: []
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-577-
    exception exit: {{unexpected_cast,{next_job_from,<0.4960.181>}},
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-578-
                     {gen_server2,call,
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-579-
                         [<0.22014.255>,
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-580-
                          {submit,#Fun<rabbit_misc.6.25154013>,<0.2862.189>},
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-581-
                          infinity]}}
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-582-
      in function  gen_server2:terminate/3 (src/gen_server2.erl, line 1045)
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-583-
    ancestors: [rabbit_mirror_queue_slave_sup,rabbit_sup,<0.110.0>]
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-584-
    messages: []
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-585-
    links: [<0.259.0>]
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-586-
    dictionary: [{guid,{{3735536962,2437967587,3023977752,60868675},0}}]
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-587-
    trap_exit: true

Related Bugs:

- https://bugs.launchpad.net/mos/+bug/1401948
- 
https://github.com/erlang/otp/compare/maint...dgud:dgud%3Bmnesia%3Bsticky-race%3BOTP-11375

Possible related upstream commit:

- http://hg.rabbitmq.com/rabbitmq-server/rev/5a63c9e273cc

This seems to be related to [1] and [2] both fixes available since
rabbitmq-server 3.4.0

[1] https://github.com/rabbitmq/rabbitmq-
server/commit/1a32616b744f4dab09cba4e7a7e747c2b6550361#diff-
3b9dc5e3c18be9549b0ab00763e4e123

[2] https://github.com/rabbitmq/rabbitmq-
server/commit/238243b10ba6f666fcd8e84289525961fe6e68b9#diff-
3b9dc5e3c18be9549b0ab00763e4e123

** Affects: rabbitmq-server (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: sts

** Tags added: sts

** Description changed:

  [Description]
  
  On a 3 nodes rabbitMQ cluster after a failure on one of the nodes (kernel 
crash) the following succession
  of events has been registered:
  
  ===
  This initial crash report
  
  $ ack-grep -i mnesia_locker  | wc -l
  16
  
  60-=CRASH REPORT==== 7-Sep-2015::14:48:45 ===
  61-  crasher:
  --
  66:                        {mnesia_locker,'rabbit@10-10-34-2',granted}}
  67-      in function  gen_server2:terminate/3 (src/gen_server2.erl, line 1045)
  68-    ancestors: [worker_pool_sup,rabbit_sup,<0.110.0>]
  69-    messages: []
  70-    links: [<0.114.0>]
  71-    dictionary: [{{xtype_to_module,direct},rabbit_exchange_type_direct},
  72-                  {{xtype_to_module,topic},rabbit_exchange_type_topic},
  73-                  {random_seed,{5643,25632,27953}},
  74-                  {{xtype_to_module,fanout},rabbit_exchange_type_fanout},
  75-                  {worker_pool_worker,true},
  76-                  {fhc_age_tree,{0,nil}},
  --
  88:     Reason:     
{unexpected_info,{mnesia_locker,'rabbit@10-10-34-2',granted}}
  89-     Offender:   [{pid,<0.142.0>},
  90-                  {name,25},
  91-                  {mfargs,{worker_pool_worker,start_link,[25]}},
  92-                  {restart_type,transient},
  93-                  {shutdown,4294967295},
  94-                  {child_type,worker}]
  
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1.log
  5166:Mnesia('rabbit@10-10-34-1'): ** ERROR ** mnesia_event got 
{inconsistent_database, starting_partitioned_network, 'rabbit@10-10-34-2'}
  
  
sosreport-svz-op-fdc-os-3.00087142-20150909140820/var/log/rabbitmq/rab...@10-10-34-3.log.1
  545:Mnesia('rabbit@10-10-34-3'): ** ERROR ** mnesia_event got 
{inconsistent_database, starting_partitioned_network, 'rabbit@10-10-34-1'}
  
  After this I started seeing this traces:
  
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log:572:=CRASH
 REPORT==== 7-Sep-2015::14:48:45 ===
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log:573:
  crasher:
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-574-
    initial call: gen:init_it/6
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-575-
    pid: <0.2346.189>
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-576-
    registered_name: []
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-577-
    exception exit: {{unexpected_cast,{next_job_from,<0.4960.181>}},
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-578-
                     {gen_server2,call,
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-579-
                         [<0.22014.255>,
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-580-
                          {submit,#Fun<rabbit_misc.6.25154013>,<0.2862.189>},
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-581-
                          infinity]}}
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-582-
      in function  gen_server2:terminate/3 (src/gen_server2.erl, line 1045)
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-583-
    ancestors: [rabbit_mirror_queue_slave_sup,rabbit_sup,<0.110.0>]
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-584-
    messages: []
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-585-
    links: [<0.259.0>]
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-586-
    dictionary: [{guid,{{3735536962,2437967587,3023977752,60868675},0}}]
  
sosreport-svz-op-fdc-os-1.00087142-20150909104806/var/log/rabbitmq/rab...@10-10-34-1-sasl.log-587-
    trap_exit: true
  
  Related Bugs:
  
  - https://bugs.launchpad.net/mos/+bug/1401948
  - 
https://github.com/erlang/otp/compare/maint...dgud:dgud%3Bmnesia%3Bsticky-race%3BOTP-11375
  
- 
  Possible related upstream commit:
  
  - http://hg.rabbitmq.com/rabbitmq-server/rev/5a63c9e273cc
+ 
+ This seems to be related to [1] and [2] both fixes available since
+ rabbitmq-server 3.4.0
+ 
+ [1] https://github.com/rabbitmq/rabbitmq-
+ server/commit/1a32616b744f4dab09cba4e7a7e747c2b6550361#diff-
+ 3b9dc5e3c18be9549b0ab00763e4e123
+ 
+ [2] https://github.com/rabbitmq/rabbitmq-
+ server/commit/238243b10ba6f666fcd8e84289525961fe6e68b9#diff-
+ 3b9dc5e3c18be9549b0ab00763e4e123

** Summary changed:

- Race condition in mnesia_locker after nodedown
+ Race condition in mnesia_locker after node down

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1496409

Title:
  Race condition in mnesia_locker after node down

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/rabbitmq-server/+bug/1496409/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to