Hello everyone,

I have added a blueprint proposing tail-based sampling as a sampling option for continuous tracing in OSProfiler. It would be really helpful to have thoughts, ideas, and comments on this from the community.
Continuous tracing provides good insight into how various transactions behave across a distributed system. Currently, OpenStack doesn't have a defined solution for continuous tracing. Although OSProfiler does generate selective traces, it may not capture the occurrence we care about. Even if we have OSProfiler running continuously [1], we need to sample the traces to cut down the data generated while still keeping the useful information.

Head-based sampling, which decides at the start of a transaction whether a trace should be saved, can be applied. However, it may miss some useful traces. I propose a tail-based sampling mechanism [2] that makes the decision at the end of the transaction and tends to keep all the useful traces.

This may require a lot of changes, depending on what type of information is required and on the solution we pick to implement it [2]. It should not affect the current working of any OpenStack service, as it will be off the critical path [3].

Please share your thoughts on this and on which solution should be preferred from a broader OpenStack perspective. This is a step toward an automated diagnostic solution for OpenStack clusters.

[1] https://blueprints.launchpad.net/osprofiler/+spec/osprofiler-overhead-control
[2] https://blueprints.launchpad.net/osprofiler/+spec/tail-based-coherent-sampling
[3] https://blueprints.launchpad.net/osprofiler/+spec/asynchronous-trace-collection

Thanks,
Rajul Kumar
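
P.S. To make the head-based vs. tail-based distinction concrete, here is a rough sketch in plain Python. It is not OSProfiler code and does not use any OSProfiler API: the `Span` and `Trace` classes, the function names, the 500 ms latency threshold, and the 1% baseline rate are all hypothetical, chosen only for illustration. The idea of what counts as a "useful" trace (an error or a slow span) is also an assumption here; the blueprint may settle on different criteria.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical span record, not the OSProfiler data model.
    name: str
    duration_ms: float
    error: bool = False


@dataclass
class Trace:
    # Hypothetical container for all spans of one transaction.
    trace_id: str
    spans: list = field(default_factory=list)


def head_sample(rate, rng=random):
    """Head-based: decide before any span exists.

    The outcome of the transaction is unknown at this point, so an
    interesting trace is kept only with probability `rate`.
    """
    return rng.random() < rate


def tail_sample(trace, latency_threshold_ms=500.0, baseline_rate=0.01, rng=random):
    """Tail-based: decide after the whole trace has been buffered."""
    # Keep every trace that contains a failed span.
    if any(span.error for span in trace.spans):
        return True
    # Keep every trace that contains an unusually slow span.
    if any(span.duration_ms > latency_threshold_ms for span in trace.spans):
        return True
    # Otherwise keep a small random share of ordinary traces.
    return rng.random() < baseline_rate
```

The trade-off this is meant to show: `tail_sample` can look at the finished trace, so it never drops an erroring or slow one, but all spans have to be buffered until the transaction ends. That buffering is where asynchronous, off-critical-path collection [3] would come in.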
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev