On 11/06/2015 18:52, Vilobh Meshram wrote:
A few more places which can trigger inconsistent behaviour:
- https://github.com/openstack/nova/blob/stable/kilo/nova/api/openstack/compute/contrib/services.py#L44
- https://github.com/openstack/nova/blob/stable/kilo/nova/api/openstack/compute/contrib/hypervisors.py#L98
- https://github.com/openstack/nova/blob/stable/kilo/nova/availability_zones.py#L130
- https://github.com/openstack/nova/blob/stable/kilo/nova/api/openstack/compute/contrib/availability_zone.py#L68
- https://github.com/openstack/nova/blob/stable/kilo/nova/api/openstack/compute/contrib/hosts.py#L88-L89
- https://github.com/openstack/nova/blob/stable/kilo/nova/compute/api.py#L3399-L3421
Blueprint which plans to fix this:
https://blueprints.launchpad.net/nova/+spec/servicegroup-api-control-plane
Related specs:
1) https://review.openstack.org/#/c/190322/
2) https://review.openstack.org/#/c/138607/
-Vilobh
tl;dr: checking a Service (is_up) should only tell us whether we can
send a message to it, not whether the related hypervisor(s) are up.
Having a reference in the services table mapping 1:1 to a reference in
a separate datastore is fine by me.
So, I'm going to review the specs above and leave my comments there.
That said, I also want to offer a humble opinion about what the
relationship should be between a Service and what could be called the
"ServiceGroup API" (badly named IMHO, since it only checks a service,
not a group ;-) )
From my perspective, the Service object is related to the AMQP service
tied to the queue and... that's it.
That has nothing to do with a hypervisor (since multiple hypervisors
can sit behind a single service). It only represents the single point
of failure for messages sent to a nova-compute service (and not to a
compute node, remember the distributed case), and since it is the only
way to communicate with the related hypervisor(s), we have to know its
status.
Again, that doesn't necessarily imply that if the service (which
listens on the AMQP queue) is up, the hypervisors will be up as well,
but it is strong enough to say that if the service is down, we are
sure the hypervisor(s) won't receive messages.
Whether the hypervisor keeps working while the service is down is a
corner case that the service status should not cover, IMHO.
That's exactly why we need to consider the service as a reference that
can be used as-is for any relationship with a list of hypervisors
(call them ComputeNodes now), and checking its state (with any driver)
should only tell us whether a message can be sent to it, *and not
whether the related hypervisor(s) are running*.
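To sketch that idea in code (hypothetical names only; this is not
actual Nova or oslo.messaging code), an is_up() check would only gate
whether we attempt an RPC cast, never whether we report the hypervisor
as healthy:

```python
# Hypothetical sketch: is_up() only tells us whether a message sent to the
# service is likely to be consumed; it says nothing about the state of the
# hypervisor(s) behind the service. Service and rpc_cast are illustrative
# names, not real Nova/oslo.messaging API.

import time

HEARTBEAT_TIMEOUT = 60  # seconds without a heartbeat before we call it down


class Service:
    def __init__(self, host):
        self.host = host
        self.last_heartbeat = time.time()

    def is_up(self):
        """True if we believe the service is still listening on its queue."""
        return (time.time() - self.last_heartbeat) < HEARTBEAT_TIMEOUT


def rpc_cast(service, message):
    """Send a message only if the service is likely consuming its queue.

    Note: is_up() == True does NOT mean the hypervisor is running; it only
    means the queue listener was recently alive.
    """
    if not service.is_up():
        raise RuntimeError("service %s is down; not casting" % service.host)
    # ... the actual AMQP cast would happen here ...
    return "cast %r to %s" % (message, service.host)
```

The point of the sketch is the asymmetry Sylvain describes: is_up() is
only trustworthy in the negative direction (down means the message will
not arrive), so it belongs at the send path and nowhere else.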
Given that disclaimer (which implies we need to be very clear about
when we ask is_up(service)), I'm fine with considering the references
stored in the DB (i.e. the services table) as just a list of pointers
to separate objects which can live in any datastore
(DB/Memcache/ZK/pick your favorite).
The only thing we need to make sure of is that there is a 1:1 mapping
between the two objects (e.g. the DB "service" row and the
"datastored" object), which can only be enforced logically.
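A minimal illustration of that 1:1 mapping (hypothetical: the dicts
below stand in for the services table and for Memcache/ZooKeeper/etc.;
nothing here is actual Nova code):

```python
# Hypothetical sketch of the 1:1 logical mapping between a DB "service"
# row and a liveness record kept in a separate datastore. Both sides are
# keyed identically, and the only invariant we can enforce is that the
# two key sets stay equal.

services_table = {}   # service_id -> service row (the DB reference)
liveness_store = {}   # service_id -> liveness record (separate datastore)


def register_service(service_id, host, topic):
    """Create both sides of the mapping together, under the same key."""
    services_table[service_id] = {"host": host, "topic": topic}
    liveness_store[service_id] = {"alive": True}


def check_mapping():
    """The invariant: every service row has exactly one liveness record."""
    return set(services_table) == set(liveness_store)
```

Since the two stores are independent, the mapping can drift (e.g. a
liveness record with no service row), which is why it "can only be
enforced logically" rather than by a foreign key.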
My 2 cts,
-Sylvain
On Mon, May 11, 2015 at 8:08 AM, Chris Friesen
<[email protected] <mailto:[email protected]>> wrote:
On 05/11/2015 07:13 AM, Attila Fazekas wrote:
From: "John Garbutt" <[email protected]>
* From the RPC API point of view, do we want to send a cast to
something that we know is dead? Maybe we want to. Should we wait for
calls to time out, or give up quicker?
How to fail sooner:
https://bugs.launchpad.net/oslo.messaging/+bug/1437955
We do not need a dedicated is_up just for this.
Is that really going to help? As I understand it, if nova-compute
dies (or is isolated) then the queue remains present on the server
but nothing will process messages from it.
Chris
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev