Maybe time to revive something like:

https://review.openstack.org/#/c/12759/


From experience, all sites (and those internal to Yahoo) provide a /status
(or equivalent) that is used for all sorts of things, from basic
load-balancer up/down checks to actually introspecting the
state of the process (or getting basics about what the process is doing).
Typically this is not exposed to the public (it's why
http://www.yahoo.com/status works for me but not for you). It seems like
something like that could help (but of course not completely solve) the
type of response Jay mentioned.
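
To make that concrete, here is a rough sketch of such an endpoint, borrowing
the 200 OK / 409 Conflict idea from Jay's message quoted below. It is only an
illustration using the Python standard library, not code from any OpenStack
service. The names evaluate_health, make_handler and serve, the "waiting_on"
body key, and the port are all made up for this example, and the dependency
checks are placeholders for whatever real probes a service would need.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: service type -> callable that returns True once
# that dependency has been heard from. Real probes would plug in here.
DependencyChecks = Dict[str, Callable[[], bool]]


def evaluate_health(checks: DependencyChecks) -> Tuple[int, dict]:
    """Return (HTTP status, body) for the given dependency checks.

    200 if every dependency reports ready, otherwise 409 with the list
    of service types this process is still waiting on.
    """
    waiting: List[str] = []
    for service_type, is_ready in sorted(checks.items()):
        try:
            ready = is_ready()
        except Exception:
            # Assumption: a probe that raises counts as "not ready yet".
            ready = False
        if not ready:
            waiting.append(service_type)
    status = 200 if not waiting else 409
    return status, {"waiting_on": waiting}


def make_handler(checks: DependencyChecks):
    """Build a request handler class bound to a set of checks."""

    class HealthcheckHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/healthcheck":
                self.send_error(404)
                return
            status, body = evaluate_health(checks)
            payload = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    return HealthcheckHandler


def serve(checks: DependencyChecks, port: int = 8080) -> None:
    """Serve /healthcheck on localhost only (not exposed publicly)."""
    HTTPServer(("127.0.0.1", port), make_handler(checks)).serve_forever()
```

Calling serve({"conductor": conductor_ok, "network": network_ok}) with two
probe functions of your own would answer GET /healthcheck with 409 and the
outstanding service types until both probes return True. Keeping the checks
as plain callables sidesteps the "not every plugin has an L2 agent" problem,
since each deployment can register whatever readiness test applies to it.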

-Josh

On 12/12/13 10:10 AM, "Jay Pipes" <jaypi...@gmail.com> wrote:

>On 12/12/2013 12:53 PM, Kyle Mestery wrote:
>> On Dec 12, 2013, at 11:44 AM, Jay Pipes <jaypi...@gmail.com> wrote:
>>> On 12/12/2013 12:36 PM, Clint Byrum wrote:
>>>> Excerpts from Russell Bryant's message of 2013-12-12 09:09:04 -0800:
>>>>> On 12/12/2013 12:02 PM, Clint Byrum wrote:
>>>>>> I've been chasing quite a few bugs in the TripleO automated bring-up
>>>>>> lately that have to do with failures because either there are no valid
>>>>>> hosts ready to have servers scheduled, or there are hosts listed and
>>>>>> enabled, but they can't bind to the network because for whatever reason
>>>>>> the L2 agent has not checked in with Neutron yet.
>>>>>>
>>>>>> This is only a problem in the first few minutes of a nova-compute host's
>>>>>> life. But it is critical for scaling up rapidly, so it is important for
>>>>>> me to understand how this is supposed to work.
>>>>>>
>>>>>> So I'm asking, is there a standard way to determine whether or not a
>>>>>> nova-compute is definitely ready to have things scheduled on it? This
>>>>>> can be via an API, or even by observing something on the nova-compute
>>>>>> host itself. I just need a definitive signal that "the compute host is
>>>>>> ready".
>>>>>
>>>>> If a nova compute host has registered itself to start having instances
>>>>> scheduled to it, it *should* be ready.  AFAIK, we're not doing any
>>>>> network sanity checks on startup, though.
>>>>>
>>>>> We already do some sanity checks on startup.  For example, nova-compute
>>>>> requires that it can talk to nova-conductor.  nova-compute will block on
>>>>> startup until nova-conductor is responding if they happened to be
>>>>> brought up at the same time.
>>>>>
>>>>> We could do something like this with a networking sanity check if
>>>>> someone could define what that check should look like.
>>>>>
>>>> Could we ask Neutron if our compute host has an L2 agent yet? That seems
>>>> like a valid sanity check.
>>>
>>> ++
>>>
>> This makes sense to me as well. Although, not all Neutron plugins have
>> an L2 agent, so I think the check needs to be more generic than that.
>> For example, the OpenDaylight MechanismDriver we have developed
>> doesn't need an agent. I also believe the Nicira plugin is agent-less,
>> perhaps there are others as well.
>>
>> And I should note, does this sort of integration also happen with cinder,
>> for example, when we're dealing with storage? Any other services which
>> have a requirement on startup around integration with nova as well?
>
>Right, it's more general than "is the L2 agent alive and running". It's
>more about having each service understand the relative dependencies it
>has on other supporting services.
>
>For instance, have each service implement a:
>
>GET /healthcheck
>
>that would return either a 200 OK or 409 Conflict with the body
>containing a list of service types that it is waiting to hear back from
>in order to provide a 200 OK for itself.
>
>Anyway, just some thoughts...
>
>-jay
>
>
>
>_______________________________________________
>OpenStack-dev mailing list
>OpenStack-dev@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
