I'll open a ticket on this soon, but I'd like to start a discussion first. We're working on a distributed tracing system, whose design is somewhat inspired by the Google Dapper paper [1]. We have instrumented a bunch of our internal services through our custom networking stack [2].
In a nutshell, each request is given a trace id, which is passed along to every service involved in handling that request. Each hop in that tree is given a span id. Each node logs its data to a local agent (we use scribe for this). An aggregator can then pull the pieces back together so you can do analysis.

I'd like to add the ability to plug tracers into Cassandra. As with many parts of Cassandra, I think we should make this an extension point with a good default implementation in place.

Here's what I propose:

1. Update the thrift server to allow clients to pass in tracing details. I'll have docs soon on how we're doing this internally.
2. Add the necessary metadata to each message passed between Cassandra nodes. This should be easy to add to Message.java and thread through to the places we need it.
3. Implement a universally useful version of this, one that's not dependent on our system, since ours may never get open-sourced. Perhaps writing to local files?

Thoughts? Opinions?

-ryan

[1] http://research.google.com/pubs/pub36356.html
[2] https://github.com/twitter/finagle/tree/master/finagle-b3
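
P.S. To make the trace id / span id idea and the "pluggable tracer" idea a bit more concrete, here is a rough Java sketch. Every name in it (TracingSketch, TraceContext, Tracer, InMemoryTracer) is made up for illustration; none of it exists in Cassandra or reflects how our internal system is written. The in-memory tracer is only a stand-in for whatever default we settle on (local files, say).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch only: these types are not part of Cassandra.
class TracingSketch {

    /** Ids carried with a request: one trace id overall, one span id per hop. */
    static final class TraceContext {
        final long traceId;
        final long spanId;
        final Long parentSpanId; // null for the root span

        TraceContext(long traceId, long spanId, Long parentSpanId) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentSpanId = parentSpanId;
        }

        /** Starts a new trace, e.g. when a client request first arrives. */
        static TraceContext newRoot() {
            ThreadLocalRandom r = ThreadLocalRandom.current();
            return new TraceContext(r.nextLong(), r.nextLong(), null);
        }

        /** Context for the next hop: same trace id, fresh span id, parent set. */
        TraceContext child() {
            return new TraceContext(traceId, ThreadLocalRandom.current().nextLong(), spanId);
        }
    }

    /** The extension point: implementations decide where trace data goes. */
    interface Tracer {
        void record(TraceContext ctx, String event);
    }

    /** Placeholder default that keeps formatted lines in memory. */
    static final class InMemoryTracer implements Tracer {
        final List<String> lines = new ArrayList<>();

        @Override
        public void record(TraceContext ctx, String event) {
            String parent = ctx.parentSpanId == null
                    ? "-"
                    : String.format("%016x", ctx.parentSpanId);
            lines.add(String.format("trace=%016x span=%016x parent=%s event=%s",
                    ctx.traceId, ctx.spanId, parent, event));
        }
    }

    public static void main(String[] args) {
        InMemoryTracer tracer = new InMemoryTracer();
        TraceContext root = TraceContext.newRoot(); // request enters at the coordinator
        tracer.record(root, "request-received");
        TraceContext hop = root.child();            // message forwarded to another node
        tracer.record(hop, "replica-read");
        tracer.lines.forEach(System.out::println);
    }
}
```

The point of keeping Tracer as an interface is that a scribe-backed implementation and a file-backed one could be swapped without touching the code that threads TraceContext through the messages.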