** Tags added: sts

** Description changed:

+ [Impact]
+ 
+ 
+ Affected
+  Bionic
+ Not affected
+  Focal
+ 
+ [Test Case]
+ TBD
+ 
+ 
+ [Where problems could occur]
+ TBD
+ 
+ [Others]
+ 
+ 
+ // original description
+ 
  Input:
-  - OpenStack Pike cluster with ~500 nodes
-  - DVR enabled in neutron
-  - Lots of messages
+  - OpenStack Pike cluster with ~500 nodes
+  - DVR enabled in neutron
+  - Lots of messages
  
  Scenario: failover of one rabbit node in a cluster
  
  Issue: after the failed rabbit node comes back online, some RPC
  communication appears broken.
  Logs from rabbit:
  
  =ERROR REPORT==== 10-Aug-2018::17:24:37 ===
  Channel error on connection <0.14839.1> (10.200.0.24:55834 -> 
10.200.0.31:5672, vhost: '/openstack', user: 'openstack'), channel 1:
  operation basic.publish caused a channel exception not_found: no exchange 
'reply_5675d7991b4a4fb7af5d239f4decb19f' in vhost '/openstack'
  
  Investigation:
  After the rabbit node comes back online, it immediately receives many
  new connections and fails to synchronize exchanges for some reason. The
  cluster had ~1600 exchanges, but the exchange count on the recovered
  node stays low and does not increase.
  
  Workaround: give the recovered node time to synchronize all exchanges
  by blocking new connections with iptables rules for some time (30 sec)
  after the failed node comes back online.
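
The workaround above could be scripted roughly as follows. This is only a sketch under assumptions, not the reporter's actual rules: it assumes clients connect on port 5672 (the port seen in the log), that rejecting only NEW connections on the INPUT chain is enough, and that the script runs as root on the recovered node. The names `iptables_rule` and `hold_off_clients` are hypothetical.

```python
import subprocess
import time

AMQP_PORT = 5672  # assumption: the client port seen in the rabbit log


def iptables_rule(action, port=AMQP_PORT):
    """Build the iptables argv that inserts ("-I") or deletes ("-D") the block rule."""
    if action not in ("-I", "-D"):
        raise ValueError("action must be '-I' or '-D'")
    return ["iptables", action, "INPUT", "-p", "tcp", "--dport", str(port),
            "-m", "conntrack", "--ctstate", "NEW", "-j", "REJECT"]


def hold_off_clients(seconds=30, run=subprocess.check_call, sleep=time.sleep):
    """Reject new client connections for `seconds`, then remove the rule again."""
    run(iptables_rule("-I"))
    try:
        sleep(seconds)  # window in which the node can synchronize exchanges
    finally:
        run(iptables_rule("-D"))  # always lift the block, even on error


if __name__ == "__main__":
    # Print the rules instead of applying them; call hold_off_clients() as root to apply.
    print(" ".join(iptables_rule("-I")))
    print(" ".join(iptables_rule("-D")))
```

Only the AMQP client port is blocked here, so inter-node cluster traffic, which uses other ports, is left untouched. The 30 second window is the value quoted in the report and may need tuning for other clusters.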
  
  Proposal: do not create new exchanges for direct messages; use the
  default exchange instead. This also fixes the issue.
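
To illustrate why the proposal avoids the not_found error, here is a toy in-memory model, not RabbitMQ or oslo.messaging code. It relies on one AMQP 0-9-1 fact: the default exchange (empty name) always exists and routes a message to the queue whose name equals the routing key. The class `ToyNode`, and the assumption that the reply exchange and routing key share the reply queue's name, are illustrative only.

```python
class ToyNode:
    """Toy model of one broker node: named exchanges may be missing after recovery."""

    def __init__(self):
        self.exchanges = set()  # named exchanges this node currently knows about
        self.queues = {}        # queue name -> list of delivered message bodies

    def declare_queue(self, name):
        self.queues.setdefault(name, [])

    def publish(self, exchange, routing_key, body):
        if exchange == "":
            # Default exchange: always present, routes by queue name.
            if routing_key in self.queues:
                self.queues[routing_key].append(body)
            return
        if exchange not in self.exchanges:
            # Mirrors the channel exception from the log above.
            raise LookupError("not_found: no exchange %r" % exchange)
        if routing_key in self.queues:
            self.queues[routing_key].append(body)


if __name__ == "__main__":
    node = ToyNode()
    node.declare_queue("reply_example")  # queue exists, exchange not yet synchronized
    try:
        node.publish("reply_example", "reply_example", "rpc reply")
    except LookupError as exc:
        print("named exchange:", exc)
    node.publish("", "reply_example", "rpc reply")
    print("default exchange delivered:", node.queues["reply_example"])
```

In this model the publish through the named exchange fails while the exchange is missing on the node, whereas the publish through the default exchange only needs the reply queue to exist.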
  
  Is there a good reason for creating new exchanges for direct messages?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1789177

Title:
  RabbitMQ fails to synchronize exchanges under high load

To manage notifications about this bug go to:
https://bugs.launchpad.net/oslo.messaging/+bug/1789177/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
