On 15/10/15 21:10, Vladimir Kuklin wrote:
> Gilles,
>
> 5xx errors like 503 and 502/504 could always be intermittent operational
> issues. E.g. when you access your keystone backends through some proxy
> and there is a connectivity issue between the proxy and the backends
> which disappears in 10 seconds, you do not need to rerun puppet
> completely - just retry the request.
>

Look, I don't have much experience with those errors in real-world
scenarios. And just a detail, for my understanding: those errors come from
a running HTTP service, so this is not a connectivity issue to the service
but something going wrong beyond it.
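If I understand the suggestion correctly, it amounts to wrapping each API
call in a small retry loop before letting the resource fail. A minimal
sketch, assuming the transient failure surfaces as a Ruby exception (the
helper and the names below are made up for illustration, not existing
puppet-openstack code):

```ruby
require 'net/http'

# Failures that plausibly disappear on their own within a few seconds:
# refused/reset connections, timeouts, and 5xx responses turned into
# exceptions via Net::HTTPResponse#value.
TRANSIENT_ERRORS = [Errno::ECONNREFUSED, Errno::ECONNRESET,
                    Net::OpenTimeout, Net::ReadTimeout,
                    Net::HTTPFatalError].freeze

# Hypothetical helper: retry the block a few times on transient failures,
# re-raise anything else (or the last transient error) unchanged.
def with_retries(attempts: 3, delay: 2)
  tries = 0
  begin
    yield
  rescue *TRANSIENT_ERRORS
    tries += 1
    raise if tries >= attempts
    sleep(delay)
    retry
  end
end

# Usage sketch - the block would be the real request, e.g.:
#   with_retries do
#     res = Net::HTTP.get_response(URI('http://127.0.0.1:5000/v3'))
#     res.value  # raises Net::HTTPFatalError on 502/503/504
#     res
#   end
```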
> Regarding "REST interfaces for all Openstack API" - this is very close
> to another topic that I raised ([0]) - using a native Ruby application
> and handling the exceptions. Otherwise, whenever we have an OpenStack
> client (the generic one or the neutron/glance/etc. one) sending us a
> message like '[111] Connection refused', that message is very much
> determined by the framework that OpenStack is using for its clients in
> this release. It could be `requests` or any other framework, which sends
> a different text message depending on its version. So it is very
> bothersome to write a bunch of 'if' clauses or gigantic regexps instead
> of handling a simple Ruby exception. So I agree with you here - we need
> to work with the API directly. And, by the way, if you also support
> switching to a native Ruby OpenStack API client, please feel free to
> support the movement towards it in the thread [0].
>

Yes, I totally agree with you on that approach (native Ruby lib). That is
why I mentioned it here: for me the exception handling would be solved at
once.
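Just to make the difference concrete: today we have to guess the failure
from whatever text the client prints, while a native Ruby call gives us a
typed exception to rescue. Everything below is an invented comparison, not
code from any provider, and the URL is a placeholder:

```ruby
require 'open3'
require 'net/http'

# Today: shell out to the CLI and pattern-match free-form error text.
begin
  _out, err, status = Open3.capture3('openstack', 'endpoint', 'list')
  if !status.success? && err.include?('Connection refused')
    # transient? fatal? we are guessing from a string that can change
    # with the client (and `requests`) version
  end
rescue Errno::ENOENT
  # the CLI is not even installed - yet another case to special-case
end

# Native Ruby call: the same failure is a typed exception.
begin
  Net::HTTP.get_response(URI('http://127.0.0.1:5000/v3'))
rescue Errno::ECONNREFUSED, Net::OpenTimeout
  # unambiguous, version-independent, and trivially retryable
end
```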
> Matt and Gilles,
>
> Regarding puppet-healthcheck - I do not think that puppet-healthcheck
> handles exactly what I am mentioning here - it is not running at exactly
> the same time as we run the request.
>
> E.g. 10 seconds ago everything was OK, then we had a temporary
> connectivity issue, then everything is OK again in 10 seconds. Could you
> please describe how puppet-healthcheck can help us solve this problem?
>
> Or another example - there was an issue with keystone accessing the
> token database when you have several keystone instances running, or
> there was some desync between these instances, e.g. you fetched the
> token at keystone #1 and then you verify it against keystone #2.
> Keystone #2 had some issues verifying it, not because the token was bad,
> but because keystone #2 itself had some issues. We would get a 401 error
> and, instead of rerunning puppet, we would just need to handle this
> issue locally by retrying the request.
>
> [0] http://permalink.gmane.org/gmane.comp.cloud.openstack.devel/66423
>
> On Thu, Oct 15, 2015 at 12:23 PM, Gilles Dubreuil <[email protected]> wrote:
>
> > On 15/10/15 12:42, Matt Fischer wrote:
> > >
> > > On Thu, Oct 8, 2015 at 5:38 AM, Vladimir Kuklin <[email protected]> wrote:
> > >
> > > > Hi, folks
> > > >
> > > > * Intro
> > > >
> > > > Per our discussion at Meeting #54 [0] I would like to propose a
> > > > uniform approach to exception handling for all puppet-openstack
> > > > providers accessing any type of OpenStack API.
> > > >
> > > > * Problem Description
> > > >
> > > > While working on Fuel during deployment of multi-node HA-aware
> > > > environments we faced many intermittent operational issues, e.g.:
> > > >
> > > > 401/403 authentication failures when we were scaling the OpenStack
> > > > controllers, due to a difference in hashing view between keystone
> > > > instances
> > > > 503/502/504 errors due to temporary connectivity issues
> >
> > The 5xx errors are not connectivity issues:
> >
> > 500 Internal Server Error
> > 501 Not Implemented
> > 502 Bad Gateway
> > 503 Service Unavailable
> > 504 Gateway Timeout
> > 505 HTTP Version Not Supported
> >
> > I believe nothing should be done to trap them.
> >
> > The connectivity issues are a different matter (to be addressed as
> > mentioned by Matt).
> >
> > > > non-idempotent operations like deletion or creation - e.g. if you
> > > > are deleting an endpoint and someone is deleting it on another
> > > > node and you get a 404, you should continue with success instead
> > > > of failing. A 409 Conflict error should also signal us to re-fetch
> > > > the resource parameters and then decide what to do with them.
> > > >
> > > > Obviously, it is not optimal to rerun puppet to correct such
> > > > errors when we can just handle an exception properly.
> > > >
> > > > * Current State of Art
> > > >
> > > > There is some exception handling, but it does not cover all the
> > > > aforementioned use cases.
> > > >
> > > > * Proposed solution
> > > >
> > > > Introduce a library of exception handling methods which should be
> > > > the same for all puppet-openstack providers, as these exceptions
> > > > seem to be generic. Then, for each of the providers, we can
> > > > introduce provider-specific libraries that inherit from this one.
> > > >
> > > > Our mos-puppet team could add this to their backlog, work on it
> > > > upstream or downstream, and propose it upstream.
> > > >
> > > > What do you think of that, puppet folks?
> >
> > The real issue is that we're dealing with openstackclient, a CLI tool
> > and not an API. Therefore no error propagation is to be expected.
> >
> > Using the REST interfaces for all OpenStack APIs would expose all the
> > HTTP errors:
> >
> > Check "HTTP Response Classes" in
> > http://ruby-doc.org/stdlib-2.2.3/libdoc/net/http/rdoc/Net/HTTP.html
> >
> > > > [0]
> > > > http://eavesdrop.openstack.org/meetings/puppet_openstack/2015/puppet_openstack.2015-10-06-15.00.html
> > >
> > > I think that we should look into some solutions here as I'm
> > > generally for something we can solve once and re-use. Currently we
> > > solve some of this at TWC by serializing our deploys and disabling
> > > puppet site-wide while we do so. This avoids the issue of Keystone
> > > on one node removing an endpoint while the other nodes (which still
> > > have the old code) keep trying to add it back.
> > >
> > > For connectivity issues, especially after service restarts, we're
> > > using puppet-healthcheck [0] and I'd like to discuss that more in
> > > Tokyo as an alternative to explicit retries and delays. It's in the
> > > etherpad, so hopefully you can attend.
> > +1
> >
> > > [0] - https://github.com/puppet-community/puppet-healthcheck
>
> --
> Yours Faithfully,
> Vladimir Kuklin,
> Fuel Library Tech Lead,
> Mirantis, Inc.
> +7 (495) 640-49-04
> +7 (926) 702-39-68
> Skype kuklinvv
> 35bk3, Vorontsovskaya Str.
> Moscow, Russia,
> www.mirantis.com
> www.mirantis.ru
> [email protected]
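To close the loop on the 404/409 point and the Net::HTTP response classes
mentioned above, here is roughly the shape a shared handler could take
once we talk to the REST API directly. This is only a sketch to frame the
discussion - the module and method names are invented, not existing
puppet-openstack code:

```ruby
require 'net/http'
require 'uri'

# Sketch of a generic handler that all providers could share; provider-
# specific libraries would layer their own decisions on top of it.
module OpenstackRequestHandling
  def delete_resource(uri)
    response = Net::HTTP.start(uri.host, uri.port) do |http|
      http.delete(uri.request_uri)
    end

    case response
    when Net::HTTPSuccess, Net::HTTPNotFound
      # 404 on delete: someone else already removed it - treat as success.
      true
    when Net::HTTPConflict
      # 409: re-fetch the resource parameters and decide what to do.
      :refetch
    when Net::HTTPUnauthorized
      # 401 from a lagging keystone: worth one retry with a fresh token.
      :retry_with_new_token
    when Net::HTTPServerError
      # 5xx: the service answered but is unhealthy - fail loudly.
      raise "server error #{response.code} while deleting #{uri}"
    else
      raise "unexpected response #{response.code} while deleting #{uri}"
    end
  end
end
```

Whether something like this ends up in one shared library that
provider-specific code inherits from, as Vladimir proposes, or inside a
native Ruby client, the decisions it encodes stay the same.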
