I have some experience with this feature in MySQL. Its main good use is to identify queries that are obviously bad (full scans on an OLTP system) and need optimization. You can't infer anything from it about the system as a whole, because it lacks context: it tells you nothing about what the rest of the system was doing at the same time.

I'd like to hear how you see yourself using it in Apache Kafka, to better understand its usefulness. Can you share some details about how you would have used it in the recent issue you mentioned?

What I see as helpful:
1. Ability to enable/disable trace/debug level logging of request handling for specific request types and clients without restarting the broker (i.e. through JMX, protocol or ZK)
2. Publish histograms of the existing request time metrics
3. Capture detailed timing of a random sample of the requests and log it (i.e. sampled metrics rather than averages). Note that clients that send more requests and longer requests are more likely to get sampled. I've found this super useful in the past.

Gwen

On Wed, Oct 14, 2015 at 3:39 PM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:

> Hey everyone,
>
> We were recently discussing a small logging improvement for Kafka.
> Basically, add a request log for queries that took longer than a certain
> configurable time to execute. This can be quite useful for debugging
> purposes, in fact it would have proven handy while investigating a recent
> issue during one of our deployments at LinkedIn.
>
> There is also supported in several other projects. For example: MySQL and
> Postgres both have slow request logs.
> https://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html
> https://wiki.postgresql.org/wiki/Logging_Difficult_Queries
>
> Thoughts?
>
> Thanks,
> Aditya
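
P.S. To make point 3 of my list concrete, here is a minimal sketch of per-request random sampling. It is not Kafka code: `RequestSampler`, `maybe_record`, and the client and request names are all hypothetical, and I'm assuming a simple scheme where each request is sampled independently with a fixed probability. Under that assumption a client that sends more requests shows up in the sample more often; making longer requests more likely to be sampled would need time-based sampling instead, which this sketch does not do.

```python
import random


class RequestSampler:
    """Hypothetical helper: keeps detailed timing for a random sample of requests."""

    def __init__(self, sample_rate, rng=None):
        if not 0.0 <= sample_rate <= 1.0:
            raise ValueError("sample_rate must be in [0, 1]")
        self.sample_rate = sample_rate
        # An injectable RNG keeps the sampler reproducible when needed.
        self.rng = rng if rng is not None else random.Random()
        self.samples = []

    def maybe_record(self, client_id, request_type, elapsed_ms):
        """Record the request with probability sample_rate; return True if recorded."""
        if self.rng.random() < self.sample_rate:
            self.samples.append((client_id, request_type, elapsed_ms))
            return True
        return False


def demo():
    """Illustrative only: a chatty client dominates the sample."""
    sampler = RequestSampler(sample_rate=0.01, rng=random.Random(42))
    # Assumed workload: one client sends 9x as many requests as the other.
    for _ in range(9000):
        sampler.maybe_record("chatty-client", "Produce", 2.0)
    for _ in range(1000):
        sampler.maybe_record("quiet-client", "Fetch", 15.0)
    counts = {}
    for client_id, _, _ in sampler.samples:
        counts[client_id] = counts.get(client_id, 0) + 1
    return counts


if __name__ == "__main__":
    print(demo())
```

In a real broker the sampled entries would go to a log with the full per-stage timing breakdown rather than into an in-memory list; the list is only there to keep the sketch self-contained.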