Repository: kafka
Updated Branches:
  refs/heads/0.10.0 22f82abb5 -> aaa52996b

KAFKA-3479: Add new consumer metrics documentation

added new consumer metrics section
refactored common metrics into new section
updated TOC

Author: Kaufman Ng <[email protected]>

Reviewers: Jason Gustafson <[email protected]>, Ewen Cheslack-Postava <[email protected]>

Closes #1361 from coughman/KAFKA-3479-consumer-metrics-doc

(cherry picked from commit 6b2564811a6137f1fe639dee236f2538bb7160b1)
Signed-off-by: Ewen Cheslack-Postava <[email protected]>

Project: http://git-wip-us.apache.org/repos/asf/kafka/repo
Commit: http://git-wip-us.apache.org/repos/asf/kafka/commit/aaa52996
Tree: http://git-wip-us.apache.org/repos/asf/kafka/tree/aaa52996
Diff: http://git-wip-us.apache.org/repos/asf/kafka/diff/aaa52996

Branch: refs/heads/0.10.0
Commit: aaa52996b9bc531fe8222a11fe732565c90388fb
Parents: 22f82ab
Author: Kaufman Ng <[email protected]>
Authored: Sun Aug 7 14:29:03 2016 -0700
Committer: Ewen Cheslack-Postava <[email protected]>
Committed: Sun Aug 7 14:29:17 2016 -0700

----------------------------------------------------------------------
 .gitignore              |   1 +
 docs/documentation.html |   5 +
 docs/ops.html           | 399 +++++++++++++++++++++++++++++++++----------
 3 files changed, 314 insertions(+), 91 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/kafka/blob/aaa52996/.gitignore
----------------------------------------------------------------------
diff --git a/.gitignore b/.gitignore
index 73972e6..b54fcf3 100644
--- a/.gitignore
+++ b/.gitignore
@@ -27,6 +27,7 @@ kafka.iws
 .vagrant
 Vagrantfile.local
 /logs
+.DS_Store
 config/server-*
 config/zookeeper-*

http://git-wip-us.apache.org/repos/asf/kafka/blob/aaa52996/docs/documentation.html
----------------------------------------------------------------------
diff --git a/docs/documentation.html b/docs/documentation.html
index 31dc039..c3425c0 100644
--- a/docs/documentation.html
+++ b/docs/documentation.html
@@ -110,6 +110,11 @@ Prior releases: <a href="/07/documentation.html">0.7.x</a>, <a href="/08/documen
 <li><a href="#ext4">Ext4 Notes</a>
 </ul>
 <li><a href="#monitoring">6.6 Monitoring</a>
+ <ul>
+ <li><a href="#selector_monitoring">Common monitoring metrics for producer/consumer/connect</a></li>
+ <li><a href="#new_producer_monitoring">New producer monitoring</a></li>
+ <li><a href="#new_consumer_monitoring">New consumer monitoring</a></li>
+ </ul>
 <li><a href="#zk">6.7 ZooKeeper</a>
 <ul>
 <li><a href="#zkversion">Stable Version</a>

http://git-wip-us.apache.org/repos/asf/kafka/blob/aaa52996/docs/ops.html
----------------------------------------------------------------------
diff --git a/docs/ops.html b/docs/ops.html
index d7b87e1..98ce0c3 100644
--- a/docs/ops.html
+++ b/docs/ops.html
@@ -689,6 +689,149 @@ We do graphing and alerting on the following metrics:
 </tr>
 </tbody></table>
+<h4><a id="selector_monitoring" href="#selector_monitoring">Common monitoring metrics for producer/consumer/connect</a></h4>
+
+The following metrics are available on producer/consumer/connector instances. For specific metrics, please see following sections.
+ +<table class="data-table"> + <tbody> + <tr> + <th>Metric/Attribute name</th> + <th>Description</th> + <th>Mbean name</th> + </tr> + <tr> + <td>connection-close-rate</td> + <td>Connections closed per second in the window.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>connection-creation-rate</td> + <td>New connections established per second in the window.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>network-io-rate</td> + <td>The average number of network operations (reads or writes) on all connections per second.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>outgoing-byte-rate</td> + <td>The average number of outgoing bytes sent per second to all servers.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>request-rate</td> + <td>The average number of requests sent per second.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>request-size-avg</td> + <td>The average size of all requests in the window.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>request-size-max</td> + <td>The maximum size of any request sent in the window.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>incoming-byte-rate</td> + <td>Bytes/second read off all sockets.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>response-rate</td> + <td>Responses received sent per second.</td> + 
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>select-rate</td> + <td>Number of times the I/O layer checked for new I/O to perform per second.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>io-wait-time-ns-avg</td> + <td>The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>io-wait-ratio</td> + <td>The fraction of time the I/O thread spent waiting.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>io-time-ns-avg</td> + <td>The average length of time for I/O per select call in nanoseconds.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>io-ratio</td> + <td>The fraction of time the I/O thread spent doing I/O.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>connection-count</td> + <td>The current number of active connections.</td> + <td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td> + </tr> + </tbody> +</table> + +<h4><a id="common_node_monitoring" href="#common_node_monitoring">Common Per-broker metrics for producer/consumer/connect</a></h4> + +The following metrics are available on producer/consumer/connector instances. For specific metrics, please see following sections. 
+ +<table class="data-table"> + <tbody> + <tr> + <th>Metric/Attribute name</th> + <th>Description</th> + <th>Mbean name</th> + </tr> + <tr> + <td>outgoing-byte-rate</td> + <td>The average number of outgoing bytes sent per second for a node.</td> + <td>kafka.producer:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + </tr> + <tr> + <td>request-rate</td> + <td>The average number of requests sent per second for a node.</td> + <td>kafka.producer:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + </tr> + <tr> + <td>request-size-avg</td> + <td>The average size of all requests in the window for a node.</td> + <td>kafka.producer:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + </tr> + <tr> + <td>request-size-max</td> + <td>The maximum size of any request sent in the window for a node.</td> + <td>kafka.producer:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + </tr> + <tr> + <td>incoming-byte-rate</td> + <td>The average number of responses received per second for a node.</td> + <td>kafka.producer:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + </tr> + <tr> + <td>request-latency-avg</td> + <td>The average request latency in ms for a node.</td> + <td>kafka.producer:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + </tr> + <tr> + <td>request-latency-max</td> + <td>The maximum request latency in ms for a node.</td> + <td>kafka.producer:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + </tr> + <tr> + <td>response-rate</td> + <td>Responses received sent per second for a node.</td> + <td>kafka.producer:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + </tr> + </tbody> +</table> + <h4><a id="new_producer_monitoring" href="#new_producer_monitoring">New producer 
monitoring</a></h4> The following metrics are available on new producer instances. @@ -794,157 +937,231 @@ The following metrics are available on new producer instances. <td>The age in seconds of the current producer metadata being used.</td> <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> </tr> + <tr> - <td>connection-close-rate</td> - <td>Connections closed per second in the window.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>record-send-rate</td> + <td>The average number of records sent per second for a topic.</td> + <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> </tr> <tr> - <td>connection-creation-rate</td> - <td>New connections established per second in the window.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>byte-rate</td> + <td>The average number of bytes sent per second for a topic.</td> + <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> </tr> <tr> - <td>network-io-rate</td> - <td>The average number of network operations (reads or writes) on all connections per second.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>compression-rate</td> + <td>The average compression rate of record batches for a topic.</td> + <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> </tr> <tr> - <td>outgoing-byte-rate</td> - <td>The average number of outgoing bytes sent per second to all servers.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>record-retry-rate</td> + <td>The average per-second number of retried record sends for a topic.</td> + <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> </tr> <tr> - <td>request-rate</td> - <td>The average number of requests sent per second.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>record-error-rate</td> 
+ <td>The average per-second number of record sends that resulted in errors for a topic.</td> + <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> </tr> <tr> - <td>request-size-avg</td> - <td>The average size of all requests in the window.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>produce-throttle-time-max</td> + <td>The maximum time in ms a request was throttled by a broker.</td> + <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>request-size-max</td> - <td>The maximum size of any request sent in the window.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>produce-throttle-time-avg</td> + <td>The average time in ms a request was throttled by a broker.</td> + <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+)</td> </tr> +</tbody></table> + + +<h4><a id="new_consumer_monitoring" href="#new_consumer_monitoring">New consumer monitoring</a></h4> + +The following metrics are available on new consumer instances. 
+ +<h5><a id="new_consumer_group_monitoring" href="#new_consumer_group_monitoring">Consumer Group Metrics</a></h5> +<table class="data-table"> + <tbody> <tr> - <td>incoming-byte-rate</td> - <td>Bytes/second read off all sockets.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <th>Metric/Attribute name</th> + <th>Description</th> + <th>Mbean name</th> </tr> <tr> - <td>response-rate</td> - <td>Responses received sent per second.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>commit-latency-avg</td> + <td>The average time taken for a commit request</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>select-rate</td> - <td>Number of times the I/O layer checked for new I/O to perform per second.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>commit-latency-max</td> + <td>The max time taken for a commit request</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>io-wait-time-ns-avg</td> - <td>The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>commit-rate</td> + <td>The number of commit calls per second</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>io-wait-ratio</td> - <td>The fraction of time the I/O thread spent waiting.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>assigned-partitions</td> + <td>The number of partitions currently assigned to this consumer</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>io-time-ns-avg</td> - <td>The average length of time for I/O per select call in nanoseconds.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>heartbeat-response-time-max</td> + 
<td>The max time taken to receive a response to a heartbeat request</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>io-ratio</td> - <td>The fraction of time the I/O thread spent doing I/O.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>heartbeat-rate</td> + <td>The average number of heartbeats per second</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>connection-count</td> - <td>The current number of active connections.</td> - <td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td> + <td>join-time-avg</td> + <td>The average time taken for a group rejoin</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>outgoing-byte-rate</td> - <td>The average number of outgoing bytes sent per second for a node.</td> - <td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + <td>join-time-max</td> + <td>The max time taken for a group rejoin</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>request-rate</td> - <td>The average number of requests sent per second for a node.</td> - <td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + <td>join-rate</td> + <td>The number of group joins per second</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>request-size-avg</td> - <td>The average size of all requests in the window for a node.</td> - <td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + <td>sync-time-avg</td> + <td>The average time taken for a group sync</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>request-size-max</td> - <td>The maximum size of any request sent in the window for a node.</td> - 
<td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + <td>sync-time-max</td> + <td>The max time taken for a group sync</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>incoming-byte-rate</td> - <td>The average number of responses received per second for a node.</td> - <td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + <td>sync-rate</td> + <td>The number of group syncs per second</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>request-latency-avg</td> - <td>The average request latency in ms for a node.</td> - <td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + <td>last-heartbeat-seconds-ago</td> + <td>The number of seconds since the last controller heartbeat</td> + <td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td> </tr> + </tbody> +</table> + +<h5><a id="new_consumer_fetch_monitoring" href="#new_consumer_fetch_monitoring">Consumer Fetch Metrics</a></h5> + +<table class="data-table"> + <tbody> <tr> - <td>request-latency-max</td> - <td>The maximum request latency in ms for a node.</td> - <td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + <th>Metric/Attribute name</th> + <th>Description</th> + <th>Mbean name</th> </tr> <tr> - <td>response-rate</td> - <td>Responses received sent per second for a node.</td> - <td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td> + <td>fetch-size-avg</td> + <td>The average number of bytes fetched per request</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>record-send-rate</td> - <td>The average number of records sent per second for a topic.</td> - <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> + <td>fetch-size-max</td> + <td>The 
maximum number of bytes fetched per request</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>byte-rate</td> - <td>The average number of bytes sent per second for a topic.</td> - <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> + <td>bytes-consumed-rate</td> + <td>The average number of bytes consumed per second</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>compression-rate</td> - <td>The average compression rate of record batches for a topic.</td> - <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> + <td>records-per-request-avg</td> + <td>The average number of records in each request</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>record-retry-rate</td> - <td>The average per-second number of retried record sends for a topic.</td> - <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> + <td>records-consumed-rate</td> + <td>The average number of records consumed per second</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>record-error-rate</td> - <td>The average per-second number of record sends that resulted in errors for a topic.</td> - <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> + <td>fetch-latency-avg</td> + <td>The average time taken for a fetch request</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> </tr> <tr> - <td>produce-throttle-time-max</td> - <td>The maximum time in ms a request was throttled by a broker.</td> - <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+)</td> + <td>fetch-latency-max</td> + <td>The max time taken for a fetch request</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> </tr> 
<tr> - <td>produce-throttle-time-avg</td> - <td>The average time in ms a request was throttled by a broker.</td> - <td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+)</td> + <td>fetch-rate</td> + <td>The number of fetch requests per second</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> </tr> -</tbody></table> + <tr> + <td>records-lag-max</td> + <td>The maximum lag in terms of number of records for any partition in this window</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>fetch-throttle-time-avg</td> + <td>The average throttle time in ms</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> + </tr> + <tr> + <td>fetch-throttle-time-max</td> + <td>The maximum throttle time in ms</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td> + </tr> + </tbody> +</table> + + +<h5><a id="topic_fetch_monitoring" href="#topic_fetch_monitoring">Topic-level Fetch Metrics</a></h5> + +<table class="data-table"> + <tbody> + <tr> + <th>Metric/Attribute name</th> + <th>Description</th> + <th>Mbean name</th> + </tr> + <tr> + <td>fetch-size-avg</td> + <td>The average number of bytes fetched per request for a specific topic.</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> + </tr> + <tr> + <td>fetch-size-max</td> + <td>The maximum number of bytes fetched per request for a specific topic.</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> + </tr> + <tr> + <td>bytes-consumed-rate</td> + <td>The average number of bytes consumed per second for a specific topic.</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> + </tr> + <tr> + <td>records-per-request-avg</td> + <td>The average number of records in each request for a specific topic.</td> + 
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> + </tr> + <tr> + <td>records-consumed-rate</td> + <td>The average number of records consumed per second for a specific topic.</td> + <td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td> + </tr> + </tbody> +</table> + +<h5><a id="others_monitoring" href="#others_monitoring">Others</a></h5> We recommend monitoring GC time and other stats and various server stats such as CPU utilization, I/O service time, etc.
