On 05/06/2014 10:42 PM, Roman Sokolkov wrote: > Hello, fuelers. > > I'm using Fuel 4.1A + Havana in HA mode. > > I permanently observe (on other deployments also) issue with stuck > "nova-compute" service. But i think problem is more fundamental and > relates to HA RabbitMQ and OpenStack AMQP driver implementation. > > Symptoms: > > * Random nova-compute from time to time marked as "XXX" for a while. > * I see that service itself works properly. In logs i see that it > sends status updates to conductor. But actually nothing is sent. > * "netstat" shows that all connections to/from rabbit "ESTABLISHED" > * rabbitmqctl shows that "compute.node-x" queue synced to all slaves. > * nothing has been broken before, i mean rabbitmq cluster, etc. > > Axe style solution: > > * /etc/init.d/openstack-nova-compute restart > > So here i've found a lot of interesting stuff (and solutions): > > https://bugs.launchpad.net/oslo.messaging/+bug/856764 > > > My questions are: > > * Are there any thoughts particular for Fuel to solve/workaround this > issue? > * Any fast solution for this in 4.1? Like adjust TCP keep-alive timeouts? > >
I submitted an issue for Fuel https://bugs.launchpad.net/fuel/+bug/1317488 and assigned it to Fuel hardening team. Feel free to update it as appropriate. > -- > Roman Sokolkov, > Deployment Engineer, > Mirantis, Inc. > Skype rsokolkov, > [email protected] <mailto:[email protected]> > > -- Best regards, Bogdan Dobrelya, Skype #bogdando_at_yahoo.com Irc #bogdando -- Mailing list: https://launchpad.net/~fuel-dev Post to : [email protected] Unsubscribe : https://launchpad.net/~fuel-dev More help : https://help.launchpad.net/ListHelp

