Re: [openstack-dev] [Openstack-operators] [all][log] Openstack HTTP error codes
On Jan 29, 2015, at 8:34 PM, Rochelle Grober <rochelle.gro...@huawei.com> wrote:

> Hi folks!
>
> Changed the tags a bit because this is a discussion for all projects and dovetails with logging rationalization/standards.
>
> At the Paris summit, we had a number of sessions on logging that kept circling back to error codes. But these codes would not be HTTP codes; rather, as others have pointed out, they would be codes related to the calling entities and referring entities and the actions that happened or didn’t. Format suggestions were gathered from the operators and from some senior developers. The Logging Working Group is planning to put forth a spec for discussion on formats and standards before the Ops mid-cycle meetup.
>
> Working from a Glance proposal on error codes (https://review.openstack.org/#/c/127482/) and discussions with operators and devs, we have a strawman to propose. We also have a number of requirements from Ops and some devs.
>
> Here is the basic idea. Code for logs would have four segments:
>
>      Project            Vendor/Component      Error Catalog number   Criticality
> Def  [A-Z][A-Z][A-Z] -  [{0-9}|{A-Z}][A-Z] -  [-] -                  [0-9]
> Ex.  CIN-               NA-                   0001-                  2
>      Cinder             NetApp driver         error no               Criticality
> Ex.  GLA-               0A-                   0051-                  3
>      Glance             API                   error no               Criticality
>
> Three letters for the project; either a two-letter vendor code, or a digit and a letter (0 + letter) for an internal component of the project (like API=0A, Controller=0C, etc.); a four-digit error number, which could be subsetted for even finer granularity; and a criticality number.
>
> This is for logging purposes and for tracking down root cause faster for operators, but if an error is generated, why can’t the same codes be used internally for the code as externally for the logs? This also allows for a unique message to be associated with the error code that is more descriptive and that can be pre-translated. Again, for logging purposes, the error code would not be part of the message payload, but part of the headers.
>
> Referrer IDs and other info would still be expected in the payload of the message and could include instance IDs/names, NICs or VIFs, etc. The message header code is in Oslo.log and, when using the Oslo.log library, will be easy to use.
>
> Since this discussion came up, I thought I needed to get this info out to folks and advertise that anyone will be able to comment on the spec to drive it to agreement. I will be advertising it here and on the Ops and Product-WG mailing lists. I’d also like to invite anyone who wants to participate in discussions to join them. We’ll be starting a bi-weekly or weekly IRC meeting (also announced on the stated MLs) in February.
>
> And please realize that, other than Oslo.log, the changes to make the errors more usable will be almost entirely community-created standards with community-created tools to help enforce them. None of which exist yet, FYI.

Hi Rocky,

The API WG is trying to come up with a guideline for an error format for the HTTP APIs [1]. That error format has a code field that I was hoping could match the code in the logs you mention above.

I noticed in the Logging WG meetings [2] that you mention an "Error Code Spec". I’d like to be able to use info from that spec in the example [3] of the error format. Has there been any progress on that spec? Can you link me to it?

Also, if you have time for a review of the error format, I’d like to hear your thoughts.

Thanks,
Everett

[1] https://review.openstack.org/#/c/167793/
[2] https://wiki.openstack.org/wiki/Meetings/log-wg
[3] https://review.openstack.org/#/c/167793/4/guidelines/errors-example.json,unified

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
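
For anyone who wants to experiment with the strawman, the four-segment log code described in the quoted message could be split apart with a small parser. This is only a sketch under stated assumptions: the canonical form is taken to be hyphen-separated with no spaces (e.g. CIN-NA-0001-2), and the names LOG_CODE_RE, LogErrorCode and parse_log_error_code are made up for illustration. None of this comes from the Logging WG spec or from Oslo.log.

```python
import re
from typing import NamedTuple, Optional

# Assumed canonical form: PPP-VV-NNNN-C, hyphen-separated, no spaces.
# PPP  = three-letter project
# VV   = two-letter vendor code, or digit + letter for an internal component
# NNNN = four-digit error catalog number
# C    = one-digit criticality
LOG_CODE_RE = re.compile(
    r"(?P<project>[A-Z]{3})-"
    r"(?P<component>[0-9A-Z][A-Z])-"
    r"(?P<number>[0-9]{4})-"
    r"(?P<criticality>[0-9])"
)


class LogErrorCode(NamedTuple):
    """Hypothetical container for the four segments of a log error code."""
    project: str
    component: str
    number: int
    criticality: int

    @property
    def is_internal_component(self) -> bool:
        # A leading digit (e.g. 0A for API) marks an internal component;
        # two letters (e.g. NA) mark a vendor.
        return self.component[0].isdigit()


def parse_log_error_code(text: str) -> Optional[LogErrorCode]:
    """Return the parsed code, or None if the text does not fit the format."""
    match = LOG_CODE_RE.fullmatch(text.strip())
    if match is None:
        return None
    return LogErrorCode(
        project=match["project"],
        component=match["component"],
        number=int(match["number"]),
        criticality=int(match["criticality"]),
    )
```

With these assumptions, parse_log_error_code("GLA-0A-0051-3") yields project GLA, component 0A, number 51 and criticality 3, and anything that does not fit the pattern comes back as None.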
Re: [openstack-dev] [Openstack-operators] [all][log] Openstack HTTP error codes
On Fri, Jan 30, 2015 at 3:08 PM, Everett Toews wrote:

> I like the idea of the log error codes being aligned with the API error codes but I have some thoughts/concerns.
>
> Project: A client dealing with the API already knows what project (service) they’re dealing with. Including this in an API error message would be redundant. That’s not necessarily so bad and it could actually be convenient for client logging purposes to have this there.

Agreed that this is not necessary, but it is not objectionable if that simplifies coding the server side.

> Vendor/Component: Including any vendor information at all would be leaking implementation details. This absolutely cannot be exposed in an API error message. Even including the component would be leaking too much.

++

> Error Catalog Number: If there could be alignment around this, that would be great.

I think the important alignment here is being able to trace a client-side API error back to the service log for further research. This might not be a high-volume use, but I have to do this all the time for chasing down client-side dev issues. It's easy in DevStack, but in a deployed cloud of any size, not so much.

A timestamp and _anything_ that can map the user-visible error into a log file is all that is really needed. We often can't even do that today.

dt

--
Dean Troyer
dtro...@gmail.com
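
One way to picture the "timestamp and anything that maps the user-visible error into a log file" idea is to stamp the same token on both the log line and the API error body. The sketch below uses only the Python standard library; report_error, find_log_lines and the trace_id field are hypothetical names chosen for illustration, not anything defined by the API WG guideline or by Oslo.log.

```python
import logging
import uuid
from datetime import datetime, timezone


def report_error(logger: logging.Logger, log_code: str, detail: str) -> dict:
    """Log an error and build a client-facing body sharing a trace id with the log."""
    trace_id = uuid.uuid4().hex  # hypothetical correlation token
    timestamp = datetime.now(timezone.utc).isoformat()
    # Operator-facing log line: carries the full log code, vendor/component included.
    logger.error("%s trace_id=%s code=%s %s", timestamp, trace_id, log_code, detail)
    # Client-facing body: only what is needed to find the log line again.
    return {"error": {"trace_id": trace_id, "timestamp": timestamp}}


def find_log_lines(log_text: str, trace_id: str) -> list:
    """Return the log lines that mention the given trace id."""
    needle = f"trace_id={trace_id}"
    return [line for line in log_text.splitlines() if needle in line]
```

The client only ever sees the trace id and timestamp, while an operator can search the service log for that id to recover the full code and detail.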
Re: [openstack-dev] [Openstack-operators] [all][log] Openstack HTTP error codes
On Jan 30, 2015, at 3:17 PM, Jesse Keating wrote:

> On 1/30/15 1:08 PM, Everett Toews wrote:
>> Project: A client dealing with the API already knows what project (service) they’re dealing with. Including this in an API error message would be redundant. That’s not necessarily so bad and it could actually be convenient for client logging purposes to have this there.
>
> Is this really true though? When your interaction with nova is being thwarted by a problem with keystone, wouldn't the end user want to see the keystone name in there as a helpful breadcrumb as to where the problem actually lies?

Once I have the token from Keystone, I’ll be talking directly to the services. So either something goes wrong with Keystone and I get no token or I get a token and talk directly to a service. Either way a client knows who it's talking to.

I suppose one possible case outside of that is token revocation. If I’m talking to a service and the token gets revoked, does the error originate in Keystone? I’m not really sure.

Everett
Re: [openstack-dev] [Openstack-operators] [all][log] Openstack HTTP error codes
On Fri, 2015-01-30 at 21:08 +0000, Everett Toews wrote:

> Project: A client dealing with the API already knows what project (service) they’re dealing with. Including this in an API error message would be redundant. That’s not necessarily so bad and it could actually be convenient for client logging purposes to have this there.

Do they? We boot a server and interact with Cinder and Neutron, right? What if the nova API is simply forwarding an error that originally came from Cinder?

> Vendor/Component: Including any vendor information at all would be leaking implementation details. This absolutely cannot be exposed in an API error message. Even including the component would be leaking too much.

While I agree with you from a security standpoint, this is probably coming in due to a desire to namespace the errors. Ideally, we'd have a set of common error codes to cover conditions that the API user could rectify ("You picked a nic type we don't support" or something like that), but I fear there may always be errors that are things the API user could rectify but which don't fit into any of those buckets…

> Error Catalog Number: If there could be alignment around this, that would be great.
[snip]
> Criticality: This might be useful to clients? I don’t know. I don’t feel too strongly about it.

I feel this part of the code needs more thought to properly round out. Is it intended to convey information similar to the distinction between 4xx and 5xx errors in HTTP? ("You made an error" vs. "The server messed up".) Is it intended to convey a retryable condition? ("If you retry this, it may succeed.") If it's intended to convey that the server messed up spectacularly and that everything's broken now, well… :)

--
Kevin L. Mitchell
Rackspace
Re: [openstack-dev] [Openstack-operators] [all][log] Openstack HTTP error codes
On Jan 29, 2015, at 7:34 PM, Rochelle Grober <rochelle.gro...@huawei.com> wrote:

> Hi folks!
>
> Changed the tags a bit because this is a discussion for all projects and dovetails with logging rationalization/standards.
>
> At the Paris summit, we had a number of sessions on logging that kept circling back to error codes. But these codes would not be HTTP codes; rather, as others have pointed out, they would be codes related to the calling entities and referring entities and the actions that happened or didn’t. Format suggestions were gathered from the operators and from some senior developers. The Logging Working Group is planning to put forth a spec for discussion on formats and standards before the Ops mid-cycle meetup.
>
> Working from a Glance proposal on error codes (https://review.openstack.org/#/c/127482/) and discussions with operators and devs, we have a strawman to propose. We also have a number of requirements from Ops and some devs.
>
> Here is the basic idea. Code for logs would have four segments:
>
>      Project            Vendor/Component      Error Catalog number   Criticality
> Def  [A-Z][A-Z][A-Z] -  [{0-9}|{A-Z}][A-Z] -  [-] -                  [0-9]
> Ex.  CIN-               NA-                   0001-                  2
>      Cinder             NetApp driver         error no               Criticality
> Ex.  GLA-               0A-                   0051-                  3
>      Glance             API                   error no               Criticality
>
> Three letters for the project; either a two-letter vendor code, or a digit and a letter (0 + letter) for an internal component of the project (like API=0A, Controller=0C, etc.); a four-digit error number, which could be subsetted for even finer granularity; and a criticality number.
>
> This is for logging purposes and for tracking down root cause faster for operators, but if an error is generated, why can’t the same codes be used internally for the code as externally for the logs?

I like the idea of the log error codes being aligned with the API error codes but I have some thoughts/concerns.

Project: A client dealing with the API already knows what project (service) they’re dealing with. Including this in an API error message would be redundant. That’s not necessarily so bad and it could actually be convenient for client logging purposes to have this there.

Vendor/Component: Including any vendor information at all would be leaking implementation details. This absolutely cannot be exposed in an API error message. Even including the component would be leaking too much.

Error Catalog Number: If there could be alignment around this, that would be great.

Criticality: This might be useful to clients? I don’t know. I don’t feel too strongly about it.

Thanks,
Everett
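
To make the concerns above concrete, here is a rough sketch of how a service might derive an API-facing code from the full log code by dropping the vendor/component segment. It assumes a hyphen-separated log code with no spaces (e.g. CIN-NA-0001-2); the function name to_api_code and the choice to also drop criticality are purely illustrative assumptions, not part of any agreed guideline.

```python
import re

# Assumed log code form: PPP-VV-NNNN-C (project, vendor/component,
# four-digit catalog number, criticality), hyphen-separated.
_LOG_CODE = re.compile(r"([A-Z]{3})-([0-9A-Z][A-Z])-([0-9]{4})-([0-9])")


def to_api_code(log_code: str, include_project: bool = True) -> str:
    """Build a client-safe code that omits the vendor/component segment.

    Criticality is dropped as well; that is an assumption of this sketch,
    since the thread leaves open whether clients need it.
    """
    match = _LOG_CODE.fullmatch(log_code)
    if match is None:
        raise ValueError(f"not a recognized log error code: {log_code!r}")
    project, _component, number, _criticality = match.groups()
    return f"{project}-{number}" if include_project else number
```

Under these assumptions, "CIN-NA-0001-2" in the log would surface to the client as "CIN-0001" (or just "0001" if the project is considered redundant), so the catalog number stays aligned between the log and the API without leaking the NetApp detail.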