> On 14 May 2017, at 13:04, Sean Dague <s...@dague.net> wrote:
>
> One of the things that came up in a logging Forum session is how much effort
> operators are having to put into reconstructing flows for things like server
> boot when they go wrong, as every time we jump a service barrier the
> request-id is reset to something new. The back and forth between Nova /
> Neutron and Nova / Glance would definitely be well served by this, especially
> if this is something that's easy to query in Elasticsearch.
>
> The last time this came up, some people were concerned about trusting
> request-id on the wire, because it's coming from random users. We're going to
> assume that's still a concern for some. However, since the last time that
> came up, we've introduced the concept of "service users", which are a set of
> higher-privilege services that we are using to wrap user requests between
> services so that long running request chains (like image snapshot) can
> complete. We trust these service users enough to keep on trucking even after
> the user token has expired for these long running operations. We could use
> this same trust path for request-id chaining.
>
> So, the basic idea is, services will optionally take an inbound
> X-OpenStack-Request-ID which will be strongly validated to the format
> (req-$uuid). They will continue to always generate one as well. When the
> context is built (which is typically about 3 more steps down the paste
> pipeline), we'll check that the service user was involved, and if not, reset
> the request_id to the locally generated one. We'll log both the global and
> local request ids. All of these changes happen in oslo.middleware,
> oslo.context, oslo.log, and most projects won't need anything to get this
> infrastructure.
>
> The Python clients, and callers, will then need to be augmented to pass the
> request-id in on requests. Servers will effectively decide when they want to
> opt into calling other services this way.
>
> This only ends up logging the top line global request id as well as the last
> leaf for each call. This does mean that full tree construction will take more
> work if you are bouncing through 3 or more servers, but it's a step which I
> think can be completed this cycle.
>
> I've got some more detailed notes, but before going through the process of
> putting this into an oslo spec I wanted more general feedback on it so that
> any objections we didn't think about yet can be raised before going through
> the detailed design.
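
To make the proposed flow concrete, here is a rough sketch of how I read the validation and fallback steps in the quoted text. It is not the oslo.middleware or oslo.context implementation: the function names, the regular expression and the `is_service_user` flag are all hypothetical, and a plain dict stands in for the request headers.

```python
import re
import uuid

HEADER = "X-OpenStack-Request-ID"

# Assumed reading of "req-$uuid": the literal prefix "req-" followed by a
# lowercase, hyphenated UUID. The real validation rule may differ.
_REQ_ID_RE = re.compile(
    r"req-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def generate_request_id():
    """Generate a fresh local request id in the req-$uuid format."""
    return "req-" + str(uuid.uuid4())


def is_valid_request_id(value):
    """Strictly validate a candidate id; anything else is rejected."""
    return isinstance(value, str) and _REQ_ID_RE.fullmatch(value) is not None


def resolve_request_ids(headers, is_service_user):
    """Return (global_id, local_id) for one inbound request.

    A local id is always generated. The inbound id is kept as the global
    id only if it is well formed and the caller is a trusted service user
    (hypothetical flag); otherwise the global id is reset to the local one.
    """
    local_id = generate_request_id()
    inbound = headers.get(HEADER)  # plain dict: exact-case lookup only
    if is_service_user and is_valid_request_id(inbound):
        return inbound, local_id
    return local_id, local_id
```

Both returned values would then be logged, so that a query on the global id finds every service's log lines for the same user operation.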

This is very consistent with what I had understood during the forum session.
Having a single request id across multiple services as the end user operation
is performed would be a great help in operations, where we are often using a
solution like ElasticSearch/Kibana to show logs and interactively query the
timing and results of a given request id.

It would also improve traceability during investigations where we are aiming
to determine who the initial requesting user was.

Tim

>
> -Sean
>
> --
> Sean Dague
> http://dague.net
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev