Re: [openstack-dev] [Openstack-operators] [all][log] Openstack HTTP error codes

2015-04-08 Thread Everett Toews
On Jan 29, 2015, at 8:34 PM, Rochelle Grober <rochelle.gro...@huawei.com> wrote:

Hi folks!

Changed the tags a bit because this is a discussion for all projects and 
dovetails with logging rationalization/standards/

At the Paris summit, we had a number of sessions on logging that kept circling 
back to Error Codes.  But, these codes would not be http codes, rather, as 
others have pointed out, codes related to the calling entities and referring 
entities and the actions that happened or didn’t.  Format suggestions were 
gathered from the Operators and from some senior developers.  The Logging 
Working Group is planning to put forth a spec for discussion on formats and 
standards before the Ops mid-cycle meetup.

Working from a Glance proposal on error codes:  
https://review.openstack.org/#/c/127482/ and discussions with operators and 
devs, we have a strawman to propose.  We also have a number of requirements 
from Ops and some Devs.

Here is the basic idea:

Code for logs would have four segments:

      Project            Vendor/Component       Error Catalog number   Criticality
Def   [A-Z][A-Z][A-Z]  - [{0-9}|{A-Z}][A-Z]   - [0000-9999]          - [0-9]
Ex.   CIN              - NA                   - 0001                 - 2
      Cinder             NetApp driver          error no               Criticality
Ex.   GLA              - 0A                   - 0051                 - 3
      Glance             Api                    error no               Criticality

Three letters for the project; either a two-letter vendor code or 0+letter for 
an internal component of the project (like API=0A, Controller=0C, etc.); a 
four-digit error number, which could be subsetted for even finer granularity; 
and a criticality number.
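
As a rough illustration of the strawman layout above, here is a minimal Python sketch that parses and re-renders such a code. The names `LogErrorCode` and `parse_code` are hypothetical, and so is the assumption that the four segments are joined with single dashes; the spec under discussion may settle on something different.

```python
import re
from typing import NamedTuple

# Assumed layout: PROJECT-VENDOR_OR_COMPONENT-NUMBER-CRITICALITY, dash separated.
CODE_RE = re.compile(
    r"(?P<project>[A-Z]{3})-"
    r"(?P<origin>[0-9A-Z][A-Z])-"
    r"(?P<number>[0-9]{4})-"
    r"(?P<criticality>[0-9])"
)


class LogErrorCode(NamedTuple):
    project: str      # three letters, e.g. "CIN" for Cinder
    origin: str       # two-letter vendor code, or digit+letter for an internal component
    number: int       # four-digit error catalog number
    criticality: int  # single digit

    @property
    def is_internal_component(self) -> bool:
        # A leading digit (e.g. "0A" for API) marks an internal component, not a vendor.
        return self.origin[0].isdigit()

    def __str__(self) -> str:
        return f"{self.project}-{self.origin}-{self.number:04d}-{self.criticality}"


def parse_code(text: str) -> LogErrorCode:
    """Split a code string into its four segments, rejecting malformed input."""
    match = CODE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid log error code: {text!r}")
    return LogErrorCode(
        project=match["project"],
        origin=match["origin"],
        number=int(match["number"]),
        criticality=int(match["criticality"]),
    )
```

With this sketch, `parse_code("GLA-0A-0051-3").is_internal_component` is true, while the Cinder/NetApp example is treated as a vendor code.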

This is for logging purposes and tracking down root cause faster for operators, 
but if an error is generated, why can't the same codes be used internally for the 
code as externally for the logs?  This also allows for a unique message to be 
associated with the error code that is more descriptive and that can be 
pre-translated.  Again, for logging purposes, the error code would not be part of 
the message payload, but part of the headers.  Referrer IDs and other info 
would still be expected in the payload of the message and could include 
instance ids/names, NICs or VIFs, etc.  The message headers are implemented in 
Oslo.log and will be easy to use through that library.
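
To make the header-versus-payload split concrete, here is a minimal sketch using only Python's standard `logging` module. It does not show the Oslo.log interface; the logger name, the `error_code` field name, and the instance id are all made up for illustration.

```python
import io
import logging


def build_logger(stream) -> logging.Logger:
    # The error code sits in the line prefix ("header"), not in the message text.
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s [%(error_code)s] %(name)s: %(message)s")
    )
    logger = logging.getLogger("cinder.volume.example")  # hypothetical logger name
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.propagate = False
    logger.setLevel(logging.INFO)
    return logger


stream = io.StringIO()
log = build_logger(stream)

# The payload still carries referrer details such as instance ids.
log.error(
    "volume attach failed for instance %s",
    "instance-1234",  # made-up id
    extra={"error_code": "CIN-NA-0001-2"},
)
print(stream.getvalue(), end="")
```

Because the code travels as a record attribute rather than inside the message string, the message itself can be translated or reworded without touching the code. In this sketch every record must supply `error_code`, since the format string refers to it.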

Since this discussion came up, I thought I needed to get this info out to folks 
and advertise that anyone will be able to comment on the spec to drive it to 
agreement.  I will be advertising it here and on Ops and Product-WG mailing 
lists.  I’d also like to invite anyone who wants to participate in discussions 
to join them.  We’ll be starting a bi-weekly or weekly IRC meeting (also 
announced in the stated MLs) in February.

And please realize that other than Oslo.log, the changes to make the errors 
more useable will be almost entirely community created standards with community 
created tools to help enforce them.  None of which exist yet, FYI.

Hi Rocky,

The API WG is trying to come up with a guideline for an error format for the 
HTTP APIs [1]. In that error format is a code field that I was hoping could 
match the code in the logs you mention above.

I noticed in the Logging WG meetings [2] that you mention an “Error Code Spec”. 
I’d like to be able to use info from that spec in the example [3] of the error 
format.

Has there been any progress on that spec? Can you link me to it?

Also, if you have time for a review of the error format, I’d like to hear your 
thoughts.

Thanks,
Everett

[1] https://review.openstack.org/#/c/167793/
[2] https://wiki.openstack.org/wiki/Meetings/log-wg
[3] https://review.openstack.org/#/c/167793/4/guidelines/errors-example.json,unified


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] [all][log] Openstack HTTP error codes

2015-01-30 Thread Dean Troyer
On Fri, Jan 30, 2015 at 3:08 PM, Everett Toews wrote:

> I like the idea of the log error codes being aligned with the API error
> codes but I have some thoughts/concerns.
>
>  Project: A client dealing with the API already knows what project
> (service) they’re dealing with. Including this in an API error message
> would be redundant. That’s not necessarily so bad and it could actually be
> convenient for client logging purposes to have this there.
>

Agreed that this is not necessary, but it is not objectionable if that
simplifies coding the server side.


> Vendor/Component: Including any vendor information at all would be leaking
> implementation details. This absolutely cannot be exposed in an API error
> message. Even including the component would be leaking too much.
>

++


> Error Catalog Number: If there could be alignment around this, that would
> be great.
>

I think the important alignment here is being able to trace a client-side
API error back to the service log for further research.  This might not be
a high-volume use, but I have to do this all the time for chasing down
client-side dev issues.  It's easy in DevStack, but in a deployed cloud of
any size not so much.  A timestamp and _anything_ that can map the
user-visible error into a log file is all that is really needed.  We often
can't even do that today.
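
A minimal sketch of that mapping, assuming a hypothetical `make_api_error` helper and made-up field names (`correlation_id`, `timestamp`); whatever error format the API WG guideline settles on may look different.

```python
import logging
import uuid
from datetime import datetime, timezone

LOG = logging.getLogger("api.errors.example")  # hypothetical logger name


def make_api_error(message: str, internal_detail: str) -> dict:
    """Log an error server-side and return the client-visible error body."""
    # One opaque id is shared by the log line and the client-visible error.
    correlation_id = uuid.uuid4().hex
    timestamp = datetime.now(timezone.utc).isoformat()
    LOG.error("%s [correlation_id=%s] %s", message, correlation_id, internal_detail)
    # The client sees only the id and the timestamp, not the internal detail.
    return {
        "message": message,
        "correlation_id": correlation_id,
        "timestamp": timestamp,
    }
```

An operator handed the `correlation_id` and `timestamp` from a client report can then search the service log for the matching line.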

dt

-- 

Dean Troyer
dtro...@gmail.com


Re: [openstack-dev] [Openstack-operators] [all][log] Openstack HTTP error codes

2015-01-30 Thread Everett Toews
On Jan 30, 2015, at 3:17 PM, Jesse Keating wrote:

> On 1/30/15 1:08 PM, Everett Toews wrote:
>> Project: A client dealing with the API already knows what project
>> (service) they’re dealing with. Including this in an API error message
>> would be redundant. That’s not necessarily so bad and it could actually
>> be convenient for client logging purposes to have this there.
>> 
> 
> Is this really true though? When your interaction with nova is being thwarted 
> by a problem with keystone, wouldn't the end user want to see the keystone 
> name in there as a helpful breadcrumb as to where the problem actually lies?

Once I have the token from Keystone, I’ll be talking directly to the services. 
So either something goes wrong with Keystone and I get no token or I get a 
token and talk directly to a service. Either way a client knows who it's 
talking to.

I suppose one possible case outside of that is token revocation. If I’m talking 
to a service and the token gets revoked, does the error originate in Keystone? 
I’m not really sure.

Everett




Re: [openstack-dev] [Openstack-operators] [all][log] Openstack HTTP error codes

2015-01-30 Thread Kevin L. Mitchell
On Fri, 2015-01-30 at 21:08 +0000, Everett Toews wrote:
> Project: A client dealing with the API already knows what project
> (service) they’re dealing with. Including this in an API error message
> would be redundant. That’s not necessarily so bad and it could
> actually be convenient for client logging purposes to have this there.

Do they?  We boot a server and interact with Cinder and Neutron, right?
What if the nova API is simply forwarding an error that originally came
from Cinder?

> Vendor/Component: Including any vendor information at all would be
> leaking implementation details. This absolutely cannot be exposed in
> an API error message. Even including the component would be leaking
> too much.

While I agree with you from a security standpoint, this is probably
coming in due to a desire to namespace the errors.  Ideally, we'd have a
set of common error codes to cover conditions that the API user could
rectify ("You picked a nic type we don't support" or something like
that), but I fear there may always be errors that are things the API
user could rectify but which don't fit into any of those buckets…

> Error Catalog Number: If there could be alignment around this, that
> would be great.
[snip]
> Criticality: This might be useful to clients? I don’t know. I don’t
> feel too strongly about it.

I feel this part of the code needs more thought to properly round out.
Is it intended to convey information similar to the distinction between
4xx and 5xx errors in HTTP?  ("You made an error" vs. "The server messed
up".)  Is it intended to convey a retryable condition?  ("If you retry
this, it may succeed.")  If it's intended to convey that the server
messed up spectacularly and that everything's broken now, well… :)
-- 
Kevin L. Mitchell 
Rackspace




Re: [openstack-dev] [Openstack-operators] [all][log] Openstack HTTP error codes

2015-01-30 Thread Everett Toews
On Jan 29, 2015, at 7:34 PM, Rochelle Grober <rochelle.gro...@huawei.com> wrote:

Hi folks!

Changed the tags a bit because this is a discussion for all projects and 
dovetails with logging rationalization/standards/

At the Paris summit, we had a number of sessions on logging that kept circling 
back to Error Codes.  But, these codes would not be http codes, rather, as 
others have pointed out, codes related to the calling entities and referring 
entities and the actions that happened or didn’t.  Format suggestions were 
gathered from the Operators and from some senior developers.  The Logging 
Working Group is planning to put forth a spec for discussion on formats and 
standards before the Ops mid-cycle meetup.

Working from a Glance proposal on error codes:  
https://review.openstack.org/#/c/127482/ and discussions with operators and 
devs, we have a strawman to propose.  We also have a number of requirements 
from Ops and some Devs.

Here is the basic idea:

Code for logs would have four segments:

      Project            Vendor/Component       Error Catalog number   Criticality
Def   [A-Z][A-Z][A-Z]  - [{0-9}|{A-Z}][A-Z]   - [0000-9999]          - [0-9]
Ex.   CIN              - NA                   - 0001                 - 2
      Cinder             NetApp driver          error no               Criticality
Ex.   GLA              - 0A                   - 0051                 - 3
      Glance             Api                    error no               Criticality

Three letters for the project; either a two-letter vendor code or 0+letter for 
an internal component of the project (like API=0A, Controller=0C, etc.); a 
four-digit error number, which could be subsetted for even finer granularity; 
and a criticality number.

This is for logging purposes and tracking down root cause faster for operators, 
but if an error is generated, why can't the same codes be used internally for the 
code as externally for the logs?

I like the idea of the log error codes being aligned with the API error codes 
but I have some thoughts/concerns.

Project: A client dealing with the API already knows what project (service) 
they’re dealing with. Including this in an API error message would be 
redundant. That’s not necessarily so bad and it could actually be convenient 
for client logging purposes to have this there.

Vendor/Component: Including any vendor information at all would be leaking 
implementation details. This absolutely cannot be exposed in an API error 
message. Even including the component would be leaking too much.

Error Catalog Number: If there could be alignment around this, that would be 
great.

Criticality: This might be useful to clients? I don’t know. I don’t feel too 
strongly about it.
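
Taken together, the four points above suggest an API-facing code could be derived from the log code by dropping the vendor/component segment. A minimal sketch, assuming the dash-separated layout from the strawman; the function name `api_facing_code`, its `include_project` switch, and the choice to drop criticality as well are all hypothetical.

```python
def api_facing_code(log_code: str, include_project: bool = True) -> str:
    """Derive a client-safe code from a full log code such as "CIN-NA-0001-2"."""
    # Unpacking raises ValueError if the code does not have exactly four segments.
    project, _origin, number, _criticality = log_code.split("-")
    # The vendor/component segment is never exposed; criticality is dropped here too.
    if include_project:
        return f"{project}-{number}"
    return number
```

For example, `api_facing_code("CIN-NA-0001-2")` gives `"CIN-0001"`, which still lets an operator find the full code in the logs by project and catalog number.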

Thanks,
Everett
