This seems like a totally reasonable solution, and would enable us to more thoroughly test the performance implications of this change between the 8.0 and 9.0 releases.
+1

-----Original Message-----
From: Davanum Srinivas [mailto:dava...@gmail.com]
Sent: Wednesday, December 02, 2015 9:32 AM
To: OpenStack Development Mailing List (not for usage questions) <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [Fuel][FFE] Disabling HA for RPC queues in RabbitMQ

Vova, Folks,

+1 to "set this option to false as an experimental feature"

Thanks,
Dims

On Wed, Dec 2, 2015 at 10:08 AM, Vladimir Kuklin <vkuk...@mirantis.com> wrote:
> Dmitry,
>
> Although I am a big fan of disabling replication for RPC, I think it is
> too late to introduce it by default at this point. I would suggest that
> we control this part of the OCF script with a specific parameter (e.g.
> 'enable RPC replication') and set it to 'true' by default. Then we can
> set this option to false as an experimental feature, run some tests, and
> decide whether it should be enabled by default or not. That way, users
> who are interested in this will be able to enable it when they need it,
> while we still stick to our old and tested approach.
>
> On Wed, Dec 2, 2015 at 5:52 PM, Konstantin Kalin <kka...@mirantis.com> wrote:
>>
>> On top of what Dmitry said: HA queues also increase the probability of
>> message duplication under certain scenarios (besides being ~10x slower).
>> Would OpenStack services tolerate a duplicated RPC request? What I've
>> learned so far - no. Also, if cluster_partition_handling=autoheal (what
>> we currently have), messages may be lost during failover scenarios just
>> as with non-HA queues.
>> Honestly, I believe there is no difference between HA queues and non-HA
>> queues in RPC-layer fault tolerance, given the way we use RabbitMQ.
>>
>> Thank you,
>> Konstantin.
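[Editor's note] Vladimir's suggestion above - gating RPC-queue replication behind a parameter that defaults to the old behaviour - amounts to a policy decision over queue names: mirror everything when the flag is on, otherwise exclude the transient RPC queues. The sketch below is a hypothetical illustration, not the actual Fuel patch; the queue-name patterns follow common oslo.messaging conventions and `should_mirror` is an invented name.

```python
import re

# Assumed oslo.messaging naming conventions for transient RPC queues;
# these patterns are illustrative, not taken from the patch under review.
RPC_PATTERNS = [
    r"^reply_",      # per-client RPC reply queues
    r"_fanout_",     # fanout queues
]

def should_mirror(queue_name, mirror_rpc=False):
    """Decide whether a queue should fall under the HA (mirroring) policy.

    mirror_rpc=True reproduces the old behaviour: mirror everything.
    mirror_rpc=False leaves transient RPC queues unmirrored.
    """
    if mirror_rpc:
        return True
    return not any(re.search(p, queue_name) for p in RPC_PATTERNS)

print(should_mirror("notifications.info"))       # True  - stays mirrored
print(should_mirror("reply_a1b2c3d4"))           # False - transient RPC reply
print(should_mirror("compute_fanout_a1b2c3d4"))  # False - transient fanout
```

In a real deployment this decision would be expressed as a RabbitMQ HA policy whose regex pattern is narrowed or widened by the proposed OCF parameter.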
>>
>> On Dec 2, 2015, at 4:05 AM, Dmitry Mescheryakov <dmescherya...@mirantis.com> wrote:
>>
>> 2015-12-02 12:48 GMT+03:00 Sergii Golovatiuk <sgolovat...@mirantis.com>:
>>>
>>> Hi,
>>>
>>> On Tue, Dec 1, 2015 at 11:34 PM, Peter Lemenkov <lemen...@gmail.com> wrote:
>>>>
>>>> Hello All!
>>>>
>>>> Well, side effects (or any other effects) are quite obvious and
>>>> predictable - this will decrease the availability of RPC queues a bit.
>>>> That's for sure.
>>>
>>> Imagine the case where a user creates a VM instance and some nova
>>> messages are lost. I am not sure we want half-created instances. Who is
>>> going to clean them up? Since we do not have results of destructive
>>> tests, I vote -2 for FFE for this feature.
>>
>> Sergii, actually the messaging layer cannot guarantee this will not
>> happen even if all messages are preserved. Assume the following
>> scenario:
>>
>> * nova-scheduler (or conductor?) sends a request to nova-compute to
>>   spawn a VM
>> * nova-compute receives the message and spawns the VM
>> * for some reason (RabbitMQ unavailable, nova-compute lagged),
>>   nova-compute does not respond within the timeout (1 minute, I think)
>> * nova-scheduler does not get a response within 1 minute and marks the
>>   VM with Error status
>>
>> In that scenario no message was lost, but we still have a half-spawned
>> VM, and it is up to Nova to handle the error and do the cleanup in that
>> case.
>>
>> Such issues already happen here and there when something glitches. For
>> instance, our favorite MessagingTimeout exception can be caused by
>> exactly such a scenario: in the example above, when nova-scheduler times
>> out waiting for a reply, it throws exactly that exception.
>>
>> My point is simple - let's increase our architecture's scalability by
>> 2-3 times at the cost of _maybe_ causing more errors for users during
>> failover.
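[Editor's note] The race Dmitry describes does not depend on message loss at all. A minimal, self-contained sketch (plain Python threads and queues standing in for nova-scheduler, nova-compute, and the RPC bus; all names and timings are invented stand-ins, not Nova code) shows the reply simply arriving after the caller's deadline:

```python
import queue
import threading
import time

CALL_TIMEOUT = 0.1   # stand-in for the ~60s oslo.messaging RPC timeout

def compute_worker(requests, replies, spawned):
    """Stand-in for nova-compute: does the work, but replies too late."""
    topic = requests.get()
    spawned.append(topic)        # the "VM" really is created...
    time.sleep(0.3)              # ...but the worker lags past the deadline
    replies.put("done: " + topic)

requests, replies, spawned = queue.Queue(), queue.Queue(), []
threading.Thread(target=compute_worker,
                 args=(requests, replies, spawned), daemon=True).start()

# Stand-in for nova-scheduler: send the request, wait for the reply.
requests.put("spawn vm-1")
try:
    replies.get(timeout=CALL_TIMEOUT)
    status = "ACTIVE"
except queue.Empty:              # analogue of MessagingTimeout
    status = "ERROR"

time.sleep(0.4)                  # let the worker finish in the background
print(status, spawned)           # ERROR ['spawn vm-1'] - half-created VM
```

No message was dropped anywhere in this run; the caller still reports an error while the work was actually done, which is exactly why the cleanup burden sits with the service, not the messaging layer.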
>> The failover time itself should not get worse (to be tested by me), and
>> errors should be correctly handled by services anyway.
>>
>>>> However, Dmitry's guess is that the overall messaging backplane
>>>> stability increase (RabbitMQ won't fail as often in some cases) would
>>>> compensate for this change. This issue is very much real - speaking
>>>> for myself, I've seen awful cluster performance degradation when a
>>>> failing RabbitMQ node was killed by some watchdog application (or,
>>>> even worse, wasn't killed at all). One of these issues was quite
>>>> recent, and I'd love to see them less frequently.
>>>>
>>>> That said, I'm uncertain about the stability impact of this change,
>>>> yet I see reasoning worth discussing behind it.
>>>>
>>>> 2015-12-01 20:53 GMT+01:00 Sergii Golovatiuk <sgolovat...@mirantis.com>:
>>>> > Hi,
>>>> >
>>>> > -1 for FFE for disabling HA for RPC queues, as we do not know all
>>>> > the side effects in HA scenarios.
>>>> >
>>>> > On Tue, Dec 1, 2015 at 7:34 PM, Dmitry Mescheryakov <dmescherya...@mirantis.com> wrote:
>>>> >>
>>>> >> Folks,
>>>> >>
>>>> >> I would like to request a feature freeze exception for disabling
>>>> >> HA for RPC queues in RabbitMQ [1].
>>>> >>
>>>> >> As I already wrote in another thread [2], I've conducted tests
>>>> >> which clearly show the benefit we will get from that change. The
>>>> >> change itself is a very small patch [3]. The only thing I want to
>>>> >> do before proposing to merge this change is to run destructive
>>>> >> tests against it, in order to make sure that we do not have a
>>>> >> regression here. That should take just several days, so if there
>>>> >> are no other objections, we will be able to merge the change in a
>>>> >> week or two.
>>>> >>
>>>> >> Thanks,
>>>> >>
>>>> >> Dmitry
>>>> >>
>>>> >> [1] https://review.openstack.org/247517
>>>> >> [2] http://lists.openstack.org/pipermail/openstack-dev/2015-December/081006.html
>>>> >> [3] https://review.openstack.org/249180
>>>> >>
>>>> >> __________________________________________________________________________
>>>> >> OpenStack Development Mailing List (not for usage questions)
>>>> >> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>
>>>> --
>>>> With best regards, Peter Lemenkov.
>
> --
> Yours Faithfully,
> Vladimir Kuklin,
> Fuel Library Tech Lead,
> Mirantis, Inc.
> +7 (495) 640-49-04
> +7 (926) 702-39-68
> Skype kuklinvv
> 35bk3, Vorontsovskaya Str.
> Moscow, Russia,
> www.mirantis.com
> www.mirantis.ru
> vkuk...@mirantis.com

--
Davanum Srinivas :: https://twitter.com/dims