Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On 16 September 2014 01:28, Nathan Kinder nkin...@redhat.com wrote:
> The idea would be to leave normal tokens with a smaller validity period (like the current default of an hour), but also allow one-time use tokens to be requested.

Cinder backup makes many requests to swift during a backup: one per chunk to be uploaded, plus one or more for the metadata file.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On Wed, 17 Sep 2014 04:53:28 PM Duncan Thomas wrote:
> On 16 September 2014 01:28, Nathan Kinder nkin...@redhat.com wrote:
>> The idea would be to leave normal tokens with a smaller validity period (like the current default of an hour), but also allow one-time use tokens to be requested.
>
> Cinder backup makes many requests to swift during a backup, one per chunk to be uploaded plus one or more for the metadata file.

Right, and what if the HTTP connection times out and needs to be retried? Can I reuse my single-use token?

Also: single-use tokens scale badly, since they need a strongly consistent validation point that in normal use requires frequent writes.

--
 - Gus
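Gus's objection can be made concrete with a toy model. This is an illustrative sketch only (the class and method names are invented, not Keystone code): a one-time token is consumed by its first validation, so a transport-level retry that re-presents the same token fails even though the caller did nothing wrong.

```python
# Toy model of a one-time-use token store (hypothetical, not Keystone code).
# Validation consumes the token, which is exactly why a timed-out-and-retried
# request fails, and why the store needs a strongly consistent write on every
# validation.

class OneTimeTokenStore:
    def __init__(self):
        self._valid = set()

    def issue(self, token_id):
        # Issuing records the token as valid (one write).
        self._valid.add(token_id)

    def validate(self, token_id):
        # The first validation removes the token (another write), so a second
        # presentation of the same token -- e.g. a retried upload chunk after
        # an HTTP timeout -- is rejected.
        if token_id in self._valid:
            self._valid.remove(token_id)
            return True
        return False
```

A retried swift PUT carrying the same token hits the `False` branch, which is the reuse-after-timeout problem, and every `validate` call is a write against the strongly consistent store, which is the scaling problem.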
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On 09/15/2014 08:28 PM, Nathan Kinder wrote:
> On 09/12/2014 12:46 AM, Angus Lees wrote:
>> On Thu, 11 Sep 2014 03:21:52 PM Steven Hardy wrote:
>>> On Wed, Sep 10, 2014 at 08:46:45PM -0400, Jamie Lennox wrote:
>>>> For service to service communication there are two types. 1) using the user's token like nova-cinder. If this token expires there is really nothing that nova can do except raise 401 and make the client do it again. 2) using a service user like nova-neutron. This should allow automatic reauthentication and will be fixed/standardised by sessions.
>>>
>>> (1) is the problem I'm trying to solve in bug #1306294, and (for Heat at least) there seem to be two solutions, neither of which I particularly like:
>>>
>>> - Require username/password to be passed into the service (something we've been trying to banish via migrating to trusts for deferred authentication)
>>> - Create a trust, and impersonate the user for the duration of the request, or after the token expires until it is completed, using the service user credentials and the trust_id.
>>>
>>> It's the second one which I'm deliberating over - technically it will work, and we create the trust anyway (e.g. for later use to do autoscaling etc.), but can anyone from the keystone team comment on the legitimacy of the approach? Intuitively it seems wrong, but I can't see any other way if we want to support token-only auth and cope with folks doing stuff which takes 2 hours with a 1 hour token expiry?
>>
>> A possible 3rd option is some sort of longer lived, but limited scope "capability" token. The user would create a capability token that represents "anyone possessing this token is (eg) allowed to write to swift as $user". The token could be created by keystone as a trusted 3rd party or by swift (doesn't matter which), in response to a request authenticated as $user. The client then includes that token in the request *to cinder*, so cinder can pass it back to swift when doing the writes.
>>
>> This capability token would be of much longer duration (long enough to complete the cinder->swift task), which is ok because it is of a much more limited scope (ideally as fine grained as we can bother implementing).
>
> With UUID tokens, it would even be possible to implement a one-time use sort of token. Since Keystone needs to be asked to validate a UUID token, the token could be invalidated by Keystone after the first verification. Since the token is limited by its number of uses, there should be fewer concerns about a long validity period (though it would still make sense to use something sane). This approach wouldn't be possible with PKI tokens, since Keystone is not in the validation path. Your idea of passing the capability token in the request would work well with this, as the token only needs to be extracted and used once instead of being passed from service to service and validated at each hop (user->cinder->swift in your example). The idea would be to leave normal tokens with a smaller validity period (like the current default of an hour), but also allow one-time use tokens to be requested.

It is dumb to make a service get a token just to hand the token back to Keystone. Guang Yee has pushed for years to get a capability into Keystone where certain API calls did not require a token; the permission would instead be based on whatever the user's capabilities were at the time. The problem is that "admin" in the default policy (and hardcoded in V2) is defined to mean "user has the role admin on anything", which is, of course, suboptimal (to say the least). So validating a token should not require a token.

We could add to the request some standard stanza for saying "here is the project/domain that I want to do this with", so that we can at least keep Keystone's current behavior somewhat sane.

>> (I like this option)
>>
>> A 4th option is to have much longer lived tokens everywhere (long enough for this backup), but the user is able to expire them early via keystone whenever they feel they might be compromised (AIUI this is exactly how things work now - we just need to increase the timeout). Greater exposure to replay attacks, but if detected they can still be invalidated quickly. (This is the easiest option; it's basically just formalising what the operators are already doing.)
>>
>> A 5th option (wow) is to have the end user/client repeatedly push in fresh tokens during long-running operations (and heat is the uber-example, since it basically wants to impersonate the user forever). Those tokens would then need to be refreshed all the way down the stack for any outstanding operations that might need the new token.
>>
>> (This or the 4th option seems ugly but unavoidable for "forever" services like heat. There has to be some way to invalidate their access if they go rogue, either by time (and thus needs a refresh mechanism) or by invalidation-via-keystone (which implies the token lasts forever unless invalidated).)
>
> I think Keystone trusts are better for "forever" services, though I see no reason why a trust token also couldn't have a limited number of uses with a longer validity period.
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On Wed, Sep 10, 2014 at 9:14 AM, Sean Dague s...@dague.net wrote:
> Going through the untriaged Nova bugs, and there are a few on a similar pattern:
>
> Nova operation in progress takes a while
> Crosses keystone token expiration time
> Timeout thrown
> Operation fails
> Terrible 500 error sent back to user
>
> It seems like we should have a standard pattern that on token expiration the underlying code at least gives one retry to try to establish a new token to complete the flow, however as far as I can tell *no* clients do this. I know we had to add that into Tempest because tempest runs can exceed 1 hr, and we want to avoid random fails just because we cross a token expiration boundary.
>
> Anyone closer to the clients that can comment here?
>
> -Sean

Currently, a service holding a token can't always refresh it, because the service doesn't always have the user's credentials (which is good... the service shouldn't have the user's credentials), and even if the credentials were available the service might not be able to use them to authenticate (not all authentication is done using username and password).

The most obvious solution to me is to have the identity server provide an API where, given a token, you can get a new token with an expiration time of your choice. Use of the API would be limited to service users. When a service gets a token that it wants to send on to another service, it first uses the existing token to get a new token with whatever expiration time it thinks would be adequate. If the service knows that it's done with the token, it will hopefully revoke the new token to keep the token database clean.

The only thing missing from the existing auth API for getting a token from a token is being able to set the expiration time -- https://github.com/openstack/identity-api/blob/master/v3/src/markdown/identity-api-v3.md#authentication-authentication .

Keystone will also have to be enhanced to validate that, if the token-from-token request has a new expiration time, the requestor has the required role.

- Brant
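The "token" identity method in the v3 API already lets a caller exchange an existing token for a new one; what Brant proposes is adding a requested expiration to that exchange. A hedged sketch of what such a request body might look like (the top-level "expires_at" field below is the proposed extension from this thread, not part of the current API):

```python
import json


def token_from_token_body(existing_token_id, requested_expires_at=None):
    """Build a v3 auth request that trades one token for another.

    The "token" identity method is real v3 API; the top-level
    "expires_at" is the *hypothetical* extension discussed here, where a
    service user could request a specific expiry for the new token.
    """
    body = {
        "auth": {
            "identity": {
                "methods": ["token"],
                "token": {"id": existing_token_id},
            }
        }
    }
    if requested_expires_at is not None:
        # Proposed, not real: Keystone would have to check that the caller
        # holds the required role before honouring this field.
        body["auth"]["expires_at"] = requested_expires_at
    return json.dumps(body)
```

The validation Brant mentions would then be a policy check on that one extra field: plain token-for-token exchanges keep working for everyone, while a request carrying `expires_at` is only honoured for service users.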
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On 09/12/2014 12:46 AM, Angus Lees wrote:
> On Thu, 11 Sep 2014 03:21:52 PM Steven Hardy wrote:
>> On Wed, Sep 10, 2014 at 08:46:45PM -0400, Jamie Lennox wrote:
>>> For service to service communication there are two types. 1) using the user's token like nova-cinder. If this token expires there is really nothing that nova can do except raise 401 and make the client do it again. 2) using a service user like nova-neutron. This should allow automatic reauthentication and will be fixed/standardised by sessions.
>>
>> (1) is the problem I'm trying to solve in bug #1306294, and (for Heat at least) there seem to be two solutions, neither of which I particularly like:
>>
>> - Require username/password to be passed into the service (something we've been trying to banish via migrating to trusts for deferred authentication)
>> - Create a trust, and impersonate the user for the duration of the request, or after the token expires until it is completed, using the service user credentials and the trust_id.
>>
>> It's the second one which I'm deliberating over - technically it will work, and we create the trust anyway (e.g. for later use to do autoscaling etc.), but can anyone from the keystone team comment on the legitimacy of the approach? Intuitively it seems wrong, but I can't see any other way if we want to support token-only auth and cope with folks doing stuff which takes 2 hours with a 1 hour token expiry?
>
> A possible 3rd option is some sort of longer lived, but limited scope "capability" token. The user would create a capability token that represents "anyone possessing this token is (eg) allowed to write to swift as $user". The token could be created by keystone as a trusted 3rd party or by swift (doesn't matter which), in response to a request authenticated as $user. The client then includes that token in the request *to cinder*, so cinder can pass it back to swift when doing the writes.
>
> This capability token would be of much longer duration (long enough to complete the cinder->swift task), which is ok because it is of a much more limited scope (ideally as fine grained as we can bother implementing).

With UUID tokens, it would even be possible to implement a one-time use sort of token. Since Keystone needs to be asked to validate a UUID token, the token could be invalidated by Keystone after the first verification. Since the token is limited by its number of uses, there should be fewer concerns about a long validity period (though it would still make sense to use something sane). This approach wouldn't be possible with PKI tokens, since Keystone is not in the validation path. Your idea of passing the capability token in the request would work well with this, as the token only needs to be extracted and used once instead of being passed from service to service and validated at each hop (user->cinder->swift in your example). The idea would be to leave normal tokens with a smaller validity period (like the current default of an hour), but also allow one-time use tokens to be requested.

> (I like this option)
>
> A 4th option is to have much longer lived tokens everywhere (long enough for this backup), but the user is able to expire them early via keystone whenever they feel they might be compromised (AIUI this is exactly how things work now - we just need to increase the timeout). Greater exposure to replay attacks, but if detected they can still be invalidated quickly. (This is the easiest option; it's basically just formalising what the operators are already doing.)
>
> A 5th option (wow) is to have the end user/client repeatedly push in fresh tokens during long-running operations (and heat is the uber-example, since it basically wants to impersonate the user forever). Those tokens would then need to be refreshed all the way down the stack for any outstanding operations that might need the new token.
>
> (This or the 4th option seems ugly but unavoidable for "forever" services like heat. There has to be some way to invalidate their access if they go rogue, either by time (and thus needs a refresh mechanism) or by invalidation-via-keystone (which implies the token lasts forever unless invalidated).)

I think Keystone trusts are better for "forever" services, though I see no reason why a trust token also couldn't have a limited number of uses with a longer validity period. The trust itself doesn't need an expiration, so the trust can be executed at some future point in time to get a limited use trust token.

> However we do it: the permission to do the action should come from the original user - and this is expressed as tokens coming from the original client/user in some form. By allowing services to create something without the original client/user being involved, we're really just bypassing the token authentication mechanism (and there are easier ways to ignore the token ;)

Yeah, this is ugly. You give up any control you have
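Since no live Keystone is available here, the trust approach can only be sketched by building the arguments a Heat-style service might pass when creating a trust; the actual call (something like `ks.trusts.create(**kwargs)` in python-keystoneclient) is assumed rather than exercised, and the parameter names below are illustrative.

```python
def trust_create_kwargs(trustor_id, trustee_id, project_id, roles,
                        expires_at=None):
    """Arguments for a deferred-authentication trust (illustrative).

    Leaving expires_at as None models the 'forever' delegation discussed
    above: the trust itself has no expiry, the trustor can still revoke
    it through Keystone at any time, and each trust token obtained from
    it can remain short-lived.
    """
    return {
        "trustor_user": trustor_id,   # the end user delegating access
        "trustee_user": trustee_id,   # the service acting on their behalf
        "project": project_id,        # scope of the delegation
        "role_names": roles,          # only these roles are delegated
        "impersonation": True,        # trust tokens act as the trustor
        "expires_at": expires_at,     # None: revocation-only invalidation
    }
```

The design point this captures is Nathan's: invalidation moves from the token (short expiry, constant refreshing) to the trust (explicit revocation), while the tokens actually presented to other services stay limited in lifetime or number of uses.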
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On Thu, 11 Sep 2014 03:00:02 PM Duncan Thomas wrote:
> On 11 September 2014 03:17, Angus Lees g...@inodes.org wrote:
>> (As inspired by eg kerberos)
>>
>> 2. Ensure at some environmental/top layer that the advertised token lifetime exceeds the timeout set on the request, before making the request. This implies (since there's no special handling in place) failing if the token was expired earlier than expected.
>
> We've a related problem in cinder (cinder-backup uses the user's token to talk to swift, and the backup can easily take longer than the token expiry time) which could not be solved by this, since the time the backup takes is unknown (compression, service and resource contention, etc. alter the time by multiple orders of magnitude).

Yes, this sounds like another example of the cross-service problem I was describing with refreshing the token at the bottom layer - but I disagree that this is handled any better by refreshing tokens on-demand at the bottom layer. In order to have cinder refresh the token while talking to swift, it needs to know the user's password (ouch - why even have the token?) or have magic token-creating powers (in which case none of this matters, because cinder can just create tokens any time it wants).

As far as I can see, we either need to be able to:
1) generate tokens that _do_ last long enough,
2) pass user+password to cinder so it is capable of creating new tokens as necessary, or
3) only perform token-based auth once at the start of a long cinder->swift workflow like this, and then use some sort of limited-scope-but-unlimited-time session token for follow-on requests.

I think I'm advocating for (1) or (3), and (2) as a distant third. ... Unless there's some other option here? Your dismissal above sounded like there was already a solution for this - what's the current solution?

--
 - Gus
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On Thu, 11 Sep 2014 03:21:52 PM Steven Hardy wrote:
> On Wed, Sep 10, 2014 at 08:46:45PM -0400, Jamie Lennox wrote:
>> For service to service communication there are two types. 1) using the user's token like nova-cinder. If this token expires there is really nothing that nova can do except raise 401 and make the client do it again. 2) using a service user like nova-neutron. This should allow automatic reauthentication and will be fixed/standardised by sessions.
>
> (1) is the problem I'm trying to solve in bug #1306294, and (for Heat at least) there seem to be two solutions, neither of which I particularly like:
>
> - Require username/password to be passed into the service (something we've been trying to banish via migrating to trusts for deferred authentication)
> - Create a trust, and impersonate the user for the duration of the request, or after the token expires until it is completed, using the service user credentials and the trust_id.
>
> It's the second one which I'm deliberating over - technically it will work, and we create the trust anyway (e.g. for later use to do autoscaling etc.), but can anyone from the keystone team comment on the legitimacy of the approach? Intuitively it seems wrong, but I can't see any other way if we want to support token-only auth and cope with folks doing stuff which takes 2 hours with a 1 hour token expiry?

A possible 3rd option is some sort of longer lived, but limited scope "capability" token. The user would create a capability token that represents "anyone possessing this token is (eg) allowed to write to swift as $user". The token could be created by keystone as a trusted 3rd party or by swift (doesn't matter which), in response to a request authenticated as $user. The client then includes that token in the request *to cinder*, so cinder can pass it back to swift when doing the writes.

This capability token would be of much longer duration (long enough to complete the cinder->swift task), which is ok because it is of a much more limited scope (ideally as fine grained as we can bother implementing). (I like this option)

A 4th option is to have much longer lived tokens everywhere (long enough for this backup), but the user is able to expire them early via keystone whenever they feel they might be compromised (AIUI this is exactly how things work now - we just need to increase the timeout). Greater exposure to replay attacks, but if detected they can still be invalidated quickly. (This is the easiest option; it's basically just formalising what the operators are already doing.)

A 5th option (wow) is to have the end user/client repeatedly push in fresh tokens during long-running operations (and heat is the uber-example, since it basically wants to impersonate the user forever). Those tokens would then need to be refreshed all the way down the stack for any outstanding operations that might need the new token.

(This or the 4th option seems ugly but unavoidable for "forever" services like heat. There has to be some way to invalidate their access if they go rogue, either by time (and thus needs a refresh mechanism) or by invalidation-via-keystone (which implies the token lasts forever unless invalidated).)

However we do it: the permission to do the action should come from the original user - and this is expressed as tokens coming from the original client/user in some form. By allowing services to create something without the original client/user being involved, we're really just bypassing the token authentication mechanism (and there are easier ways to ignore the token ;)

--
 - Gus
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On 09/11/2014 01:44 PM, Sean Dague wrote:
> On 09/10/2014 08:46 PM, Jamie Lennox wrote:
>> ----- Original Message -----
>> From: Steven Hardy sha...@redhat.com
>> To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
>> Sent: Thursday, September 11, 2014 1:55:49 AM
>> Subject: Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
>>
>>> On Wed, Sep 10, 2014 at 10:14:32AM -0400, Sean Dague wrote:
>>>> Going through the untriaged Nova bugs, and there are a few on a similar pattern:
>>>>
>>>> Nova operation in progress takes a while
>>>> Crosses keystone token expiration time
>>>> Timeout thrown
>>>> Operation fails
>>>> Terrible 500 error sent back to user
>>>
>>> We actually have this exact problem in Heat, which I'm currently trying to solve: https://bugs.launchpad.net/heat/+bug/1306294
>>>
>>> Can you clarify, is the issue either:
>>>
>>> 1. Create novaclient object with username/password
>>> 2. Do series of operations via the client object which eventually fail after $n operations due to token expiry
>>>
>>> or:
>>>
>>> 1. Create novaclient object with username/password
>>> 2. Some really long operation which means the token expires in the course of the service handling the request, blowing up and 500-ing
>>>
>>> If the former, then it does sound like a client, or usage-of-client, bug - although note that if you pass a *token* vs username/password (as is currently done for glance and heat in tempest, because we lack the code to get the token outside of the shell.py code..), there's nothing the client can do, because you can't request a new token with longer expiry with a token...
>>>
>>> However if the latter, then it seems like not really a client problem to solve, as it's hard to know what action to take if a request failed part-way through and thus things are in an unknown state.
>>>
>>> This issue is a hard problem, which can possibly be solved by switching to a trust scoped token (service impersonates the user), but then you're effectively bypassing token expiry via delegation, which sits uncomfortably with me (despite the fact that we may have to do this in heat to solve the aforementioned bug).
>>>
>>>> It seems like we should have a standard pattern that on token expiration the underlying code at least gives one retry to try to establish a new token to complete the flow, however as far as I can tell *no* clients do this.
>>>
>>> As has been mentioned, using sessions may be one solution to this, and AFAIK session support (where it doesn't already exist) is getting into various clients via the work being carried out to add support for v3 keystone by David Hu: https://review.openstack.org/#/q/owner:david.hu%2540hp.com,n,z
>>>
>>> I see patches for Heat (currently gating), Nova and Ironic.
>>>
>>>> I know we had to add that into Tempest because tempest runs can exceed 1 hr, and we want to avoid random fails just because we cross a token expiration boundary.
>>>
>>> I can't claim great experience with sessions yet, but AIUI you could do something like:
>>>
>>>     from keystoneclient.auth.identity import v3
>>>     from keystoneclient import session
>>>     from keystoneclient.v3 import client
>>>
>>>     auth = v3.Password(auth_url=OS_AUTH_URL,
>>>                        username=USERNAME,
>>>                        password=PASSWORD,
>>>                        project_id=PROJECT,
>>>                        user_domain_name='default')
>>>     sess = session.Session(auth=auth)
>>>     ks = client.Client(session=sess)
>>>
>>> And if you can pass the same session into the various clients tempest creates, then the Password auth-plugin code takes care of reauthenticating if the token cached in the auth plugin object is expired, or nearly expired: https://github.com/openstack/python-keystoneclient/blob/master/keystoneclient/auth/identity/base.py#L120
>>>
>>> So in the tempest case, it seems like it may be a case of migrating the code creating the clients to use sessions instead of passing a token or username/password into the client object?
>>>
>>> That's my understanding of it atm anyway; hopefully jamielennox will be along soon with more details :)
>>>
>>> Steve
>>
>> By clients here are you referring to the CLIs or the python libraries? Implementation is at different points with each.
>>
>> Sessions will handle automatically reauthenticating and retrying a request, however it relies on the service throwing a 401 Unauthenticated error. If a service is returning a 500 (or a timeout?) then there isn't much that a client can/should do, because we can't assume that trying again with a new token will solve anything.
>>
>> At the moment we have keystoneclient, novaclient, cinderclient, neutronclient and then a number of the smaller projects with support for sessions. That obviously doesn't mean that existing users of that code have transitioned to the newer way though. David Hu has been working on using this code within the existing CLIs. I have prototypes for at least nova to talk to neutron and cinder which I'm waiting for Kilo to push. From there it should be easier to do this for other services.
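The "reauthenticate and retry on 401" behaviour Jamie describes can be sketched independently of keystoneclient. Everything below is illustrative (`Unauthorized`, `do_request` and `get_token` are stand-ins, not library API); the point is that a retry is only sensible on a 401, where a fresh token plausibly fixes the problem, and not on a 500 or a timeout:

```python
class Unauthorized(Exception):
    """Stand-in for the 401 error a real client would raise."""


def call_with_reauth(do_request, get_token):
    """Run do_request(token); on a 401, fetch a fresh token and retry once.

    A 500 or a timeout is deliberately *not* caught here: as noted in the
    thread, a new token can't be assumed to fix those, and retrying a
    request that failed part-way through may leave things in an unknown
    state.
    """
    token = get_token()
    try:
        return do_request(token)
    except Unauthorized:
        # The cached token likely expired mid-operation; reauthenticate
        # exactly once and re-issue the request.
        token = get_token()
        return do_request(token)
```

This is essentially the pattern Sean asked for at the top of the thread (one retry on token expiration), which the session auth plugins implement for the services that have adopted them.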
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On Thu, Sep 11, 2014 at 08:43:22PM -0400, Jamie Lennox wrote:
> ----- Original Message -----
> From: Steven Hardy sha...@redhat.com
> To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
> Sent: Friday, 12 September, 2014 12:21:52 AM
> Subject: Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
>
>> On Wed, Sep 10, 2014 at 08:46:45PM -0400, Jamie Lennox wrote:
>>> ----- Original Message -----
>>> From: Steven Hardy sha...@redhat.com
>>> To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
>>> Sent: Thursday, September 11, 2014 1:55:49 AM
>>> Subject: Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
>>>
>>>> On Wed, Sep 10, 2014 at 10:14:32AM -0400, Sean Dague wrote:
>>>>> Going through the untriaged Nova bugs, and there are a few on a similar pattern:
>>>>>
>>>>> Nova operation in progress takes a while
>>>>> Crosses keystone token expiration time
>>>>> Timeout thrown
>>>>> Operation fails
>>>>> Terrible 500 error sent back to user
>>>>
>>>> We actually have this exact problem in Heat, which I'm currently trying to solve: https://bugs.launchpad.net/heat/+bug/1306294
>>>>
>>>> Can you clarify, is the issue either:
>>>>
>>>> 1. Create novaclient object with username/password
>>>> 2. Do series of operations via the client object which eventually fail after $n operations due to token expiry
>>>>
>>>> or:
>>>>
>>>> 1. Create novaclient object with username/password
>>>> 2. Some really long operation which means the token expires in the course of the service handling the request, blowing up and 500-ing
>>>>
>>>> If the former, then it does sound like a client, or usage-of-client, bug - although note that if you pass a *token* vs username/password (as is currently done for glance and heat in tempest, because we lack the code to get the token outside of the shell.py code..), there's nothing the client can do, because you can't request a new token with longer expiry with a token...
>>>>
>>>> However if the latter, then it seems like not really a client problem to solve, as it's hard to know what action to take if a request failed part-way through and thus things are in an unknown state.
>>>>
>>>> This issue is a hard problem, which can possibly be solved by switching to a trust scoped token (service impersonates the user), but then you're effectively bypassing token expiry via delegation, which sits uncomfortably with me (despite the fact that we may have to do this in heat to solve the aforementioned bug).
>>>>
>>>>> It seems like we should have a standard pattern that on token expiration the underlying code at least gives one retry to try to establish a new token to complete the flow, however as far as I can tell *no* clients do this.
>>>>
>>>> As has been mentioned, using sessions may be one solution to this, and AFAIK session support (where it doesn't already exist) is getting into various clients via the work being carried out to add support for v3 keystone by David Hu: https://review.openstack.org/#/q/owner:david.hu%2540hp.com,n,z
>>>>
>>>> I see patches for Heat (currently gating), Nova and Ironic.
>>>>
>>>>> I know we had to add that into Tempest because tempest runs can exceed 1 hr, and we want to avoid random fails just because we cross a token expiration boundary.
>>>>
>>>> I can't claim great experience with sessions yet, but AIUI you could do something like:
>>>>
>>>>     from keystoneclient.auth.identity import v3
>>>>     from keystoneclient import session
>>>>     from keystoneclient.v3 import client
>>>>
>>>>     auth = v3.Password(auth_url=OS_AUTH_URL,
>>>>                        username=USERNAME,
>>>>                        password=PASSWORD,
>>>>                        project_id=PROJECT,
>>>>                        user_domain_name='default')
>>>>     sess = session.Session(auth=auth)
>>>>     ks = client.Client(session=sess)
>>>>
>>>> And if you can pass the same session into the various clients tempest creates, then the Password auth-plugin code takes care of reauthenticating if the token cached in the auth plugin object is expired, or nearly expired: https://github.com/openstack/python-keystoneclient/blob/master/keystoneclient/auth/identity/base.py#L120
>>>>
>>>> So in the tempest case, it seems like it may be a case of migrating the code creating the clients to use sessions instead of passing a token or username/password into the client object?
>>>>
>>>> That's my understanding of it atm anyway; hopefully jamielennox will be along soon with more details :)
>>>>
>>>> Steve
>>>
>>> By clients here are you referring to the CLIs or the python libraries? Implementation is at different points with each.

I think for both heat and tempest we're
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On 09/10/2014 08:46 PM, Jamie Lennox wrote:
> ----- Original Message -----
> From: Steven Hardy sha...@redhat.com
> To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
> Sent: Thursday, September 11, 2014 1:55:49 AM
> Subject: Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
>
>> On Wed, Sep 10, 2014 at 10:14:32AM -0400, Sean Dague wrote:
>>> Going through the untriaged Nova bugs, and there are a few on a similar pattern:
>>>
>>> Nova operation in progress takes a while
>>> Crosses keystone token expiration time
>>> Timeout thrown
>>> Operation fails
>>> Terrible 500 error sent back to user
>>
>> We actually have this exact problem in Heat, which I'm currently trying to solve: https://bugs.launchpad.net/heat/+bug/1306294
>>
>> Can you clarify, is the issue either:
>>
>> 1. Create novaclient object with username/password
>> 2. Do series of operations via the client object which eventually fail after $n operations due to token expiry
>>
>> or:
>>
>> 1. Create novaclient object with username/password
>> 2. Some really long operation which means the token expires in the course of the service handling the request, blowing up and 500-ing
>>
>> If the former, then it does sound like a client, or usage-of-client, bug - although note that if you pass a *token* vs username/password (as is currently done for glance and heat in tempest, because we lack the code to get the token outside of the shell.py code..), there's nothing the client can do, because you can't request a new token with longer expiry with a token...
>>
>> However if the latter, then it seems like not really a client problem to solve, as it's hard to know what action to take if a request failed part-way through and thus things are in an unknown state.
>>
>> This issue is a hard problem, which can possibly be solved by switching to a trust scoped token (service impersonates the user), but then you're effectively bypassing token expiry via delegation, which sits uncomfortably with me (despite the fact that we may have to do this in heat to solve the aforementioned bug).
>>
>>> It seems like we should have a standard pattern that on token expiration the underlying code at least gives one retry to try to establish a new token to complete the flow, however as far as I can tell *no* clients do this.
>>
>> As has been mentioned, using sessions may be one solution to this, and AFAIK session support (where it doesn't already exist) is getting into various clients via the work being carried out to add support for v3 keystone by David Hu: https://review.openstack.org/#/q/owner:david.hu%2540hp.com,n,z
>>
>> I see patches for Heat (currently gating), Nova and Ironic.
>>
>>> I know we had to add that into Tempest because tempest runs can exceed 1 hr, and we want to avoid random fails just because we cross a token expiration boundary.
>>
>> I can't claim great experience with sessions yet, but AIUI you could do something like:
>>
>>     from keystoneclient.auth.identity import v3
>>     from keystoneclient import session
>>     from keystoneclient.v3 import client
>>
>>     auth = v3.Password(auth_url=OS_AUTH_URL,
>>                        username=USERNAME,
>>                        password=PASSWORD,
>>                        project_id=PROJECT,
>>                        user_domain_name='default')
>>     sess = session.Session(auth=auth)
>>     ks = client.Client(session=sess)
>>
>> And if you can pass the same session into the various clients tempest creates, then the Password auth-plugin code takes care of reauthenticating if the token cached in the auth plugin object is expired, or nearly expired: https://github.com/openstack/python-keystoneclient/blob/master/keystoneclient/auth/identity/base.py#L120
>>
>> So in the tempest case, it seems like it may be a case of migrating the code creating the clients to use sessions instead of passing a token or username/password into the client object?
>>
>> That's my understanding of it atm anyway; hopefully jamielennox will be along soon with more details :)
>>
>> Steve
>
> By clients here are you referring to the CLIs or the python libraries? Implementation is at different points with each.
>
> Sessions will handle automatically reauthenticating and retrying a request, however it relies on the service throwing a 401 Unauthenticated error. If a service is returning a 500 (or a timeout?) then there isn't much that a client can/should do, because we can't assume that trying again with a new token will solve anything.
>
> At the moment we have keystoneclient, novaclient, cinderclient, neutronclient and then a number of the smaller projects with support for sessions. That obviously doesn't mean that existing users of that code have transitioned to the newer way though. David Hu has been working on using this code within the existing CLIs. I have prototypes for at least nova to talk to neutron and cinder which I'm waiting for Kilo to push. From there it should be easier to do this for other services.
>
> For service to service communication there are two types. 1) using the user's token like nova-cinder. If this token expires there is really nothing that nova can do except raise 401 and make the client do it again. 2) using a service user like nova-neutron. This should allow automatic reauthentication and will be fixed/standardised by sessions.
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On 09/10/2014 11:55 AM, Steven Hardy wrote: [snip] Can you clarify, is the issue either: 1. Create novaclient object with username/password 2. Do series of operations via the client object which eventually fail after $n operations due to token expiry or: 1. Create novaclient object with username/password 2. Some really long operation which means token expires in the course of the service handling the request, blowing up and 500-ing [snip]

From what I can tell of the Nova bugs, both are issues.

Honestly, it would probably be really telling to set up a test env with 10s token timeouts and see how badly it broke. I expect that our expiration logic, and how our components react to it, is actually a lot less coherent than we believe.

-Sean

-- Sean Dague http://dague.net

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On 11 September 2014 03:17, Angus Lees g...@inodes.org wrote: (As inspired by eg kerberos) 2. Ensure at some environmental/top layer that the advertised token lifetime exceeds the timeout set on the request, before making the request. This implies (since there's no special handling in place) failing if the token was expired earlier than expected.

We've a related problem in cinder (cinder-backup uses the user's token to talk to swift, and the backup can easily take longer than the token expiry time) which could not be solved by this, since the time the backup takes is unknown (compression, service and resource contention, etc alter the time by multiple orders of magnitude)
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On Wed, Sep 10, 2014 at 08:46:45PM -0400, Jamie Lennox wrote: [earlier thread quoted in full; snipped]

By clients here are you referring to the CLIs or the python libraries? Implementation is at different points with each.

I think for both heat and tempest we're talking about the python libraries (Client objects).

Sessions will handle automatically reauthenticating and retrying a request, however it relies on the service throwing a 401 Unauthenticated error. If a service is returning a 500 (or a timeout?) then there isn't much that a client can/should do for that because we can't assume that trying again with a new token will solve anything.

Hmm, I was hoping it would reauthenticate based on the auth_ref will_expire_soon, as it would fit better with our current usage of the auth_ref in heat.

[remainder of Jamie's quoted message snipped]
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
- Original Message - From: Sean Dague s...@dague.net To: openstack-dev@lists.openstack.org Sent: Thursday, 11 September, 2014 9:44:43 PM Subject: Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility

[earlier messages quoted in full; snipped]
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
- Original Message - From: Steven Hardy sha...@redhat.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Friday, 12 September, 2014 12:21:52 AM Subject: Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility

[earlier messages quoted in full; snipped]
[openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
Going through the untriaged Nova bugs, and there are a few on a similar pattern:

Nova operation in progress takes a while
Crosses keystone token expiration time
Timeout thrown
Operation fails
Terrible 500 error sent back to user

It seems like we should have a standard pattern that on token expiration the underlying code at least gives one retry to try to establish a new token to complete the flow, however as far as I can tell *no* clients do this. I know we had to add that into Tempest because tempest runs can exceed 1 hr, and we want to avoid random fails just because we cross a token expiration boundary.

Anyone closer to the clients that can comment here?

-Sean

-- Sean Dague http://dague.net
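The retry-once pattern described above could be sketched like this (a stdlib-only toy; `TokenExpired`, `fetch_token` and `do_request` are hypothetical names, not any real client's API):

```python
# Sketch of the "one retry on token expiry" pattern described above.
# All names (TokenExpired, fetch_token, do_request) are hypothetical;
# a real client would hook this into its HTTP layer.

class TokenExpired(Exception):
    """Raised when the server rejects a request with 401."""

def call_with_one_retry(do_request, fetch_token, token):
    """Run do_request(token); on expiry, re-auth once and retry."""
    try:
        return do_request(token)
    except TokenExpired:
        # Token expired mid-flight: get a fresh one and retry exactly once.
        # A second TokenExpired propagates -- we don't loop forever.
        token = fetch_token()
        return do_request(token)

# Toy demonstration: the caller starts with an already-expired token.
valid = {"t2"}  # tokens the fake backend currently accepts

def do_request(token):
    if token not in valid:
        raise TokenExpired(token)
    return "200 OK"

def fetch_token():
    return "t2"

result = call_with_one_retry(do_request, fetch_token, "t1")  # "200 OK"
```

The point of retrying exactly once is that a second 401 almost certainly means something other than simple expiry (revocation, clock skew), so looping would only hammer keystone.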
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
I think at least clients supporting keystone sessions that are configured to use the auth.Password mech support this, since re-auth is done by the session rather than by the service client itself.

2014-09-10 16:14 GMT+02:00 Sean Dague s...@dague.net: Going through the untriaged Nova bugs, and there are a few on a similar pattern: [snip] Anyone closer to the clients that can comment here? -Sean
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
Do we know which versions of the clients do that?

-Sean

On 09/10/2014 10:22 AM, Endre Karlson wrote: I think at least clients supporting keystone sessions that are configured to use the auth.Password mech supports this since re-auth is done by the session rather then the service client itself. [snip]

-- Sean Dague http://dague.net
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On Wed, Sep 10, 2014 at 10:14:32AM -0400, Sean Dague wrote: Going through the untriaged Nova bugs, and there are a few on a similar pattern: Nova operation in progress takes a while. Crosses keystone token expiration time. Timeout thrown. Operation fails. Terrible 500 error sent back to user.

We actually have this exact problem in Heat, which I'm currently trying to solve: https://bugs.launchpad.net/heat/+bug/1306294

Can you clarify, is the issue either:

1. Create novaclient object with username/password
2. Do series of operations via the client object which eventually fail after $n operations due to token expiry

or:

1. Create novaclient object with username/password
2. Some really long operation which means the token expires in the course of the service handling the request, blowing up and 500-ing

If the former, then it does sound like a client, or usage-of-client, bug - although note if you pass a *token* vs username/password (as is currently done for glance and heat in tempest, because we lack the code to get the token outside of the shell.py code..), there's nothing the client can do, because you can't request a new token with longer expiry with a token...

However if the latter, then it seems like not really a client problem to solve, as it's hard to know what action to take if a request failed part-way through and thus things are in an unknown state. This issue is a hard problem, which can possibly be solved by switching to a trust scoped token (service impersonates the user), but then you're effectively bypassing token expiry via delegation, which sits uncomfortably with me (despite the fact that we may have to do this in heat to solve the aforementioned bug).

It seems like we should have a standard pattern that on token expiration the underlying code at least gives one retry to try to establish a new token to complete the flow, however as far as I can tell *no* clients do this.

As has been mentioned, using sessions may be one solution to this, and AFAIK session support (where it doesn't already exist) is getting into various clients via the work being carried out to add support for v3 keystone by David Hu: https://review.openstack.org/#/q/owner:david.hu%2540hp.com,n,z - I see patches for Heat (currently gating), Nova and Ironic.

I know we had to add that into Tempest because tempest runs can exceed 1 hr, and we want to avoid random fails just because we cross a token expiration boundary.

I can't claim great experience with sessions yet, but AIUI you could do something like:

    from keystoneclient.auth.identity import v3
    from keystoneclient import session
    from keystoneclient.v3 import client

    auth = v3.Password(auth_url=OS_AUTH_URL,
                       username=USERNAME,
                       password=PASSWORD,
                       project_id=PROJECT,
                       user_domain_name='default')
    sess = session.Session(auth=auth)
    ks = client.Client(session=sess)

And if you can pass the same session into the various clients tempest creates, then the Password auth-plugin code takes care of reauthenticating if the token cached in the auth plugin object is expired, or nearly expired: https://github.com/openstack/python-keystoneclient/blob/master/keystoneclient/auth/identity/base.py#L120

So in the tempest case, it seems like it may be a case of migrating the code creating the clients to use sessions instead of passing a token or username/password into the client object?

That's my understanding of it atm anyway - hopefully jamielennox will be along soon with more details :)

Steve
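The "expired, or nearly expired" decision that the keystoneclient link above refers to can be sketched roughly like this. This is a stdlib-only toy with assumed names: `will_expire_soon` exists as a concept in keystoneclient, but this code and its `STALE_DURATION` value are illustrative, not the library's implementation:

```python
from datetime import datetime, timedelta

# Margin before advertised expiry at which we treat a token as stale.
# keystoneclient uses a similar safety margin; the value here is illustrative.
STALE_DURATION = timedelta(seconds=30)

def will_expire_soon(expires_at, now=None):
    """True if the token expires within STALE_DURATION from now."""
    now = now or datetime.utcnow()
    return now + STALE_DURATION >= expires_at

def get_token(cached_token, cached_expiry, authenticate, now=None):
    """Return the cached (token, expiry), re-authenticating if it's
    missing or (nearly) expired.  `authenticate` is a callable that
    returns a fresh (token, expires_at) pair."""
    if cached_token is None or will_expire_soon(cached_expiry, now=now):
        return authenticate()
    return cached_token, cached_expiry
```

The safety margin matters because a token that is valid when the request is dispatched can still expire before the server finishes validating it; refreshing slightly early avoids that race for short requests (though not for the multi-hour operations discussed elsewhere in this thread).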
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
- Original Message - From: Steven Hardy sha...@redhat.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Thursday, September 11, 2014 1:55:49 AM Subject: Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility [Steven's message quoted in full; snipped]

By clients here are you referring to the CLIs or the python libraries? Implementation is at different points with each.

Sessions will handle automatically reauthenticating and retrying a request, however it relies on the service throwing a 401 Unauthenticated error. If a service is returning a 500 (or a timeout?) then there isn't much that a client can/should do for that, because we can't assume that trying again with a new token will solve anything.

At the moment we have keystoneclient, novaclient, cinderclient, neutronclient and then a number of the smaller projects with support for sessions. That obviously doesn't mean that existing users of that code have transitioned to the newer way though. David Hu has been working on using this code within the existing CLIs. I have prototypes for at least nova to talk to neutron and cinder which I'm waiting for Kilo to push. From there it should be easier to do this for other services.

For service to service communication there are two types: 1) using the user's token, like nova-cinder. If this token expires there is really nothing that nova can do except raise 401 and make the client do it again. 2) using a service user, like nova-neutron. This should allow automatic reauthentication and will be fixed/standardised by sessions.
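Jamie's two service-to-service cases (passing the user's token through, nova-cinder style, versus a service user with its own credentials, nova-neutron style) can be illustrated with a toy sketch. All names here are hypothetical, not any OpenStack service's code:

```python
class Unauthorized(Exception):
    """The backend rejected the token (HTTP 401)."""

VALID = {"service-token-2"}  # tokens the fake backend currently accepts

def backend_call(token):
    if token not in VALID:
        raise Unauthorized(token)
    return "ok"

# Case 1: user's token passed through (nova -> cinder style).
# If it has expired, there is nothing the intermediate service can do
# except let the 401 propagate back so the end client re-authenticates.
def call_with_user_token(user_token):
    return backend_call(user_token)  # Unauthorized propagates upward

# Case 2: service user (nova -> neutron style).  The service holds its
# own credentials, so on a 401 it can fetch a fresh token and retry.
def call_as_service_user(get_service_token):
    token = "service-token-1"  # stale cached service token
    try:
        return backend_call(token)
    except Unauthorized:
        return backend_call(get_service_token())
```

The asymmetry is the whole point of the thread: case 2 is mechanically fixable with sessions, while case 1 needs something else (trusts, longer tokens, or pushing the retry back to the end user).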
Re: [openstack-dev] [all] [clients] [keystone] lack of retrying tokens leads to overall OpenStack fragility
On Wed, 10 Sep 2014 10:14:32 AM Sean Dague wrote: Going through the untriaged Nova bugs, and there are a few on a similar pattern: [snip] It seems like we should have a standard pattern that on token expiration the underlying code at least gives one retry to try to establish a new token to complete the flow, however as far as I can tell *no* clients do this.

Just because this came up in conversation a few weeks ago in the context of the ironic client: I've read some docs and written a keystone client, but I'm not super-familiar with keystone internals - apologies if I miss something fundamental.

There are two broadly different approaches to dealing with this:

1. (As described by Sean, and implemented in a few clients) At the bottom layer, try to refresh the token and immediately retry whenever a server response indicates the token has expired.

2. (As inspired by eg kerberos) Ensure at some environmental/top layer that the advertised token lifetime exceeds the timeout set on the request, before making the request. This implies (since there's no special handling in place) failing if the token was expired earlier than expected.

The primary distinction is that in (2) the client is ignorant of how to create tokens, and just assumes they're valid. (2) is particularly easy to code for simple one-shot command line clients. For a persistent client, the easiest approach is probably to have an asynchronous loop that just keeps refreshing the stored token whenever it approaches expiry - max_single_request_timeout.

My concern with (1) is that it involves passing username/password all the way down to the bottom layers - see the heat example, where this means crossing into another program/service.

Moreover, if the token was expired earlier than advertised, it probably means the admin has deliberately rejected the user or something, and the intent is that they _should_ be locked out - it would be unfortunate to have a synchronised retry attack on keystone from all the rejected clients at that point :/

-- 
 - Gus
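Approach (2) - checking the advertised token lifetime against the request timeout before starting - can be sketched as a stdlib-only toy (`TokenTooShort` and `check_lifetime` are hypothetical names). As Duncan notes elsewhere in the thread, this only works when the request timeout is actually known up front, which rules it out for open-ended jobs like cinder-backup:

```python
from datetime import datetime, timedelta

class TokenTooShort(Exception):
    """Advertised token lifetime won't cover the request timeout."""

def check_lifetime(expires_at, request_timeout, now=None):
    """Approach (2): refuse to start a request the token can't outlive.
    Raises TokenTooShort instead of letting the request die mid-flight."""
    now = now or datetime.utcnow()
    remaining = expires_at - now
    if remaining < request_timeout:
        raise TokenTooShort(
            "token lives %s but request may take %s" % (remaining, request_timeout))

# Example: a 1-hour token comfortably covers a 60s request timeout.
now = datetime(2014, 9, 10, 12, 0, 0)
expires = now + timedelta(hours=1)
check_lifetime(expires, timedelta(seconds=60), now=now)  # passes silently
```

Failing fast here is the kerberos-style trade-off: the client never learns how to mint tokens, at the cost of rejecting work that a retry-capable client could have completed.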