On 09/10/2014 08:46 PM, Jamie Lennox wrote:
> ----- Original Message -----
>> From: "Steven Hardy" <sha...@redhat.com>
>> To: "OpenStack Development Mailing List (not for usage questions)" 
>> <openstack-dev@lists.openstack.org>
>> Sent: Thursday, September 11, 2014 1:55:49 AM
>> Subject: Re: [openstack-dev] [all] [clients] [keystone] lack of retrying 
>> tokens leads to overall OpenStack fragility
>> On Wed, Sep 10, 2014 at 10:14:32AM -0400, Sean Dague wrote:
>>> Going through the untriaged Nova bugs, and there are a few on a similar
>>> pattern:
>>> Nova operation in progress.... takes a while
>>> Crosses keystone token expiration time
>>> Timeout thrown
>>> Operation fails
>>> Terrible 500 error sent back to user
>> We actually have this exact problem in Heat, which I'm currently trying to
>> solve:
>> https://bugs.launchpad.net/heat/+bug/1306294
>> Can you clarify, is the issue either:
>> 1. Create novaclient object with username/password
>> 2. Do series of operations via the client object which eventually fail
>> after $n operations due to token expiry
>> or:
>> 1. Create novaclient object with username/password
>> 2. Some really long operation which means token expires in the course of
>> the service handling the request, blowing up and 500-ing
>> If the former, then it does sound like a client, or usage-of-client bug,
>> although note if you pass a *token* vs username/password (as is currently
>> done for glance and heat in tempest, because we lack the code to get the
>> token outside of the shell.py code..), there's nothing the client can do,
>> because you can't request a new token with longer expiry with a token...
>> However if the latter, then it seems like not really a client problem to
>> solve, as it's hard to know what action to take if a request failed
>> part-way through and thus things are in an unknown state.
>> This issue is a hard problem, which can possibly be solved by
>> switching to a trust scoped token (service impersonates the user), but then
>> you're effectively bypassing token expiry via delegation which sits
>> uncomfortably with me (despite the fact that we may have to do this in heat
>> to solve the aforementioned bug)
>>> It seems like we should have a standard pattern that on token expiration
>>> the underlying code at least gives one retry to try to establish a new
>>> token to complete the flow, however as far as I can tell *no* clients do
>>> this.
>> As has been mentioned, using sessions may be one solution to this, and
>> AFAIK session support (where it doesn't already exist) is getting into
>> various clients via the work being carried out to add support for v3
>> keystone by David Hu:
>> https://review.openstack.org/#/q/owner:david.hu%2540hp.com,n,z
>> I see patches for Heat (currently gating), Nova and Ironic.
>>> I know we had to add that into Tempest because tempest runs can exceed 1
>>> hr, and we want to avoid random fails just because we cross a token
>>> expiration boundary.
>> I can't claim great experience with sessions yet, but AIUI you could do
>> something like:
>> from keystoneclient.auth.identity import v3
>> from keystoneclient import session
>> from keystoneclient.v3 import client
>> auth = v3.Password(auth_url=OS_AUTH_URL,
>>                    username=USERNAME,
>>                    password=PASSWORD,
>>                    project_id=PROJECT,
>>                    user_domain_name='default')
>> sess = session.Session(auth=auth)
>> ks = client.Client(session=sess)
>> And if you can pass the same session into the various clients tempest
>> creates then the Password auth-plugin code takes care of reauthenticating
>> if the token cached in the auth plugin object is expired, or nearly
>> expired:
>> https://github.com/openstack/python-keystoneclient/blob/master/keystoneclient/auth/identity/base.py#L120
>> So in the tempest case, it seems like it may be a case of migrating the
>> code creating the clients to use sessions instead of passing a token or
>> username/password into the client object?
>> That's my understanding of it atm anyway, hopefully jamielennox will be along
>> soon with more details :)
>> Steve
> By clients here are you referring to the CLIs or the python libraries? 
> Implementation is at different points with each. 
> Sessions will handle automatically reauthenticating and retrying a request, 
> however it relies on the service throwing a 401 Unauthenticated error. If a 
> service is returning a 500 (or a timeout?) then there isn't much that a 
> client can/should do for that because we can't assume that trying again with 
> a new token will solve anything. 
> At the moment we have keystoneclient, novaclient, cinderclient, neutronclient 
> and then a number of the smaller projects with support for sessions. That 
> obviously doesn't mean that existing users of that code have transitioned to 
> the newer way though. David Hu has been working on using this code within the 
> existing CLIs. I have prototypes for at least nova to talk to neutron and 
> cinder which I'm waiting for Kilo to push. From there it should be easier to 
> do this for other services. 
> For service to service communication there are two types.
> 1) using the user's token like nova->cinder. If this token expires there is 
> really nothing that nova can do except raise 401 and make the client do it 
> again. 

In this case it would be really good to do at least 1 retry, because
it's completely silly for us to fail an action based on a token timeout.
The solution ops are doing is changing their token expiration back to
some really large number.
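
To make the "at least 1 retry" idea concrete, here is a minimal sketch of the pattern. It is not taken from any existing client: `Unauthorized`, `call_with_one_reauth`, `request` and `get_token` are all hypothetical names, and it assumes the caller has some way to obtain a fresh token (credentials or a service user). With only the user's original token there is nothing to reauthenticate with.

```python
class Unauthorized(Exception):
    """Hypothetical stand-in for a client's HTTP 401 error."""


def call_with_one_reauth(request, get_token):
    """Call request(token); on a 401, fetch a new token and retry once.

    Both arguments are hypothetical callables, not a real client API:
    request(token) performs the operation, get_token() returns a token.
    """
    try:
        return request(get_token())
    except Unauthorized:
        # Only a 401 triggers the single retry. Anything else (a 500,
        # a timeout) propagates, since a new token would not fix it.
        return request(get_token())
```

A second 401 is not caught, so the caller still sees the failure if the fresh token is rejected too.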

> 2) using a service user like nova->neutron. This should allow automatic 
> reauthentication and will be fixed/standardized by sessions. 

Ok, glanceclient should be a high target here, because that's often
involved in long running things (snapshot manip is slow).
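
For the service-user case, the behaviour described above (reauthenticate when the cached token is expired or nearly expired) could look roughly like the sketch below. This is an illustration only, not the keystoneclient implementation: `RefreshingToken`, `fetch` and `stale_window` are hypothetical names, and `fetch` is assumed to return a `(token, expires_at)` pair with `expires_at` in seconds.

```python
import time


class RefreshingToken:
    """Cache a token and refetch it shortly before it expires."""

    def __init__(self, fetch, stale_window=30.0, clock=time.time):
        self._fetch = fetch                # hypothetical: returns (token, expires_at)
        self._stale_window = stale_window  # seconds before expiry to treat as stale
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Refetch when there is no token yet, or when the cached one
        # is inside the stale window (expired or about to expire).
        if self._token is None or self._clock() >= self._expires_at - self._stale_window:
            self._token, self._expires_at = self._fetch()
        return self._token
```

Calling `get()` before each request in a slow operation (a snapshot upload, say) means a token is never handed out when it is within `stale_window` seconds of expiry.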


Sean Dague
