I would add on top of that: Dmitry said that HA queues also increase the probability of message duplication under certain scenarios (besides that, they are ~10x slower). Would OpenStack services tolerate a duplicated RPC request? From what I've learned so far - no. Also, with cluster_partition_handling=autoheal (which we currently have), messages may be lost during failover just as with non-HA queues. Honestly, I believe there is no difference between HA and non-HA queues in RPC-layer fault tolerance, given the way we use RabbitMQ.
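To illustrate the duplication point: as far as I know, oslo.messaging RPC does not deduplicate deliveries, so a message redelivered after an HA-queue failover simply runs the handler again. Tolerating duplicates would need something like the guard below on the consumer side - a purely illustrative sketch, not existing oslo.messaging behavior:

    # Purely illustrative: a dedup guard that oslo.messaging RPC does NOT
    # provide. Without it, a redelivered RPC message just runs the handler
    # again, repeating its side effects (allocations, state transitions, ...).
    seen_message_ids = set()

    def dispatch(message_id, handler, *args, **kwargs):
        if message_id in seen_message_ids:
            return  # duplicate delivery after failover - drop it
        seen_message_ids.add(message_id)
        return handler(*args, **kwargs)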
Thank you, Konstantin.

> On Dec 2, 2015, at 4:05 AM, Dmitry Mescheryakov <dmescherya...@mirantis.com> wrote:
>
> 2015-12-02 12:48 GMT+03:00 Sergii Golovatiuk <sgolovat...@mirantis.com>:
> Hi,
>
> On Tue, Dec 1, 2015 at 11:34 PM, Peter Lemenkov <lemen...@gmail.com> wrote:
> Hello All!
>
> Well, side-effects (or any other effects) are quite obvious and
> predictable - this will decrease the availability of RPC queues a bit.
> That's for sure.
>
> Imagine the case when a user creates a VM instance and some Nova messages
> are lost. I am not sure we want half-created instances. Who is going to
> clean them up? Since we do not have results of destructive tests, I vote
> -2 for FFE for this feature.
>
> Sergii, actually the messaging layer cannot guarantee that this will not
> happen even if all messages are preserved. Assume the following scenario:
>
> * nova-scheduler (or conductor?) sends a request to nova-compute to spawn a VM
> * nova-compute receives the message and spawns the VM
> * for some reason (RabbitMQ unavailable, nova-compute lagged) nova-compute
>   does not respond within the timeout (1 minute, I think)
> * nova-scheduler does not get a response within 1 minute and marks the VM
>   with Error status
>
> In that scenario no message was lost, but we still have a half-spawned VM,
> and it is up to Nova to handle the error and do the cleanup in that case.
>
> Such issues already happen here and there when something glitches. For
> instance, our favorite MessagingTimeout exception can be caused by exactly
> this scenario: when nova-scheduler times out waiting for a reply, it throws
> precisely that exception.
>
> My point is simple - let's increase our architecture's scalability by 2-3
> times at the cost of _maybe_ causing more errors for users during failover.
> The failover time itself should not get worse (to be tested by me), and
> errors should be correctly handled by the services anyway.
>
> However, Dmitry's guess is that the overall increase in messaging backplane
> stability (RabbitMQ won't fail as often in some cases) would compensate
> for this change. This issue is very much real - speaking for myself, I've
> seen awful cluster performance degradation when a failing RabbitMQ node
> was killed by some watchdog application (or, even worse, wasn't killed at
> all). One of these issues happened quite recently, and I'd love to see
> them less frequently.
>
> That said, I'm uncertain about the stability impact of this change, yet I
> see reasoning worth discussing behind it.
>
> 2015-12-01 20:53 GMT+01:00 Sergii Golovatiuk <sgolovat...@mirantis.com>:
> > Hi,
> >
> > -1 for FFE for disabling HA for RPC queues, as we do not know all the
> > side effects in HA scenarios.
> >
> > On Tue, Dec 1, 2015 at 7:34 PM, Dmitry Mescheryakov
> > <dmescherya...@mirantis.com> wrote:
> >>
> >> Folks,
> >>
> >> I would like to request a feature freeze exception for disabling HA
> >> for RPC queues in RabbitMQ [1].
> >>
> >> As I already wrote in another thread [2], I've conducted tests which
> >> clearly show the benefit we will get from that change. The change
> >> itself is a very small patch [3]. The only thing I want to do before
> >> proposing to merge this change is to run destructive tests against it,
> >> in order to make sure that we do not have a regression here. That
> >> should take just several days, so if there are no other objections, we
> >> will be able to merge the change within a week or two.
> >>
> >> Thanks,
> >>
> >> Dmitry
> >>
> >> [1] https://review.openstack.org/247517
> >> [2] http://lists.openstack.org/pipermail/openstack-dev/2015-December/081006.html
> >> [3] https://review.openstack.org/249180
>
> --
> With best regards, Peter Lemenkov.
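To make Dmitry's timeout scenario above concrete, here is roughly what the caller side looks like with oslo.messaging. This is only a sketch: the topic, method name and arguments are made up, while the call/timeout mechanics and the MessagingTimeout exception are the real oslo.messaging API:

    import oslo_messaging
    from oslo_config import cfg

    # In a real service the transport settings come from its config files.
    transport = oslo_messaging.get_transport(cfg.CONF)
    # Illustrative target/method names, not the actual Nova internals.
    target = oslo_messaging.Target(topic='compute')
    client = oslo_messaging.RPCClient(transport, target)

    try:
        # call() blocks until a reply arrives or the timeout expires.
        client.prepare(timeout=60).call({}, 'spawn_vm', instance_id='abc')
    except oslo_messaging.MessagingTimeout:
        # No reply within 60 seconds. The server may still have spawned
        # the VM - the caller cannot tell, so the operation is marked as
        # failed even though no message was lost.
        pass

Note that once the timeout fires, the caller has no way to know whether the server processed the request - which is exactly why preserving every message does not by itself prevent half-created instances.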
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev