[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers
[ https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13562861#comment-13562861 ]

Joey Echeverria commented on HBASE-7633:

I still think a direct metric would be useful here. The issue where I saw this was a slowly dying disk that caused a few region servers to slow way, way down. The client application was hammering HBase with new threads trying to write, with no back pressure. The writers eventually exhausted the IPC threads on the region servers, which blocked incoming reads. This situation would have been a bit more graceful if we could have alerted on the IPC threads getting exhausted.

Add a metric that tracks the current number of used RPC threads on the regionservers

Key: HBASE-7633
URL: https://issues.apache.org/jira/browse/HBASE-7633
Project: HBase
Issue Type: Improvement
Components: metrics
Reporter: Joey Echeverria
Assignee: Elliott Clark

One way to detect that you're hitting a John Wayne disk[1] would be if we could see when region servers exhausted their RPC handlers. This would also be useful when tuning the cluster for your workload to make sure that reads or writes were not starving the other operations out.

[1] http://hbase.apache.org/book.html#bad.disk

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers
[ https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563195#comment-13563195 ]

Elliott Clark commented on HBASE-7633:

That's exactly what call queue length would have shown. It would normally show 0, and then as things get slower the queue length would grow as it approaches the max of ~500.

The way I see it, having all IPC threads full isn't a bad thing. If the threads are answering requests at the same rate as they are coming in, then having all the threads answering something is just fine. The bad part was that they were all full and the number of requests waiting to be answered was growing. Hence callQueueLength was what I would look at.
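A minimal sketch of the interpretation described in the comment above: a queue length that sits at 0 is healthy, while a queue that keeps growing between samples suggests the handlers are not keeping up. The function names, the sampling approach, and the `min_growth_steps` threshold are illustrative assumptions, not part of HBase; only the rough maximum of ~500 comes from the comment.

```python
from typing import Sequence

# Approximate call queue maximum mentioned in the comment ("~500").
APPROX_MAX_QUEUE = 500


def queue_backing_up(samples: Sequence[int], min_growth_steps: int = 3) -> bool:
    """Return True if the queue length grew on each of the last
    `min_growth_steps` consecutive sampling intervals.

    `samples` is a hypothetical series of call queue length readings,
    oldest first, collected by whatever monitoring is already in place.
    """
    if len(samples) < min_growth_steps + 1:
        return False  # not enough history to judge a trend
    recent = samples[-(min_growth_steps + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


def queue_fill_ratio(current: int, max_queue: int = APPROX_MAX_QUEUE) -> float:
    """Fraction of the (approximate) queue capacity currently in use."""
    return min(current / max_queue, 1.0)
```

An alert built on this would fire on sustained growth rather than on all handlers being busy, which matches the point that busy handlers alone are not a problem.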
[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers
[ https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563231#comment-13563231 ]

stack commented on HBASE-7633:

Looks like we don't currently doc callQueueLen in the refguide. If we added it, Joey, would that do to close this issue? (Elliott, what is the metric's full name? I'll add the doc.)
[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers
[ https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563260#comment-13563260 ]

Joey Echeverria commented on HBASE-7633:

Yes, if we added the docs for that, with the info that Elliott just provided on how to interpret the results, that'd be perfect.
[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers
[ https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13559080#comment-13559080 ]

Elliott Clark commented on HBASE-7633:

In 0.94 there's:
* callQueueLen

In trunk there are a few more metrics:
* numCallsInGeneralQueue
* numCallsInPriorityQueue
* numCallsInReplicationQueue

While they don't tell you how many threads are currently running, they do hint at whether things are backing up.
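As a rough illustration of using the metrics listed above as a single backlog signal, the sketch below reads whichever metric names are present in a snapshot. The dictionary input and the function name are hypothetical, and summing the three trunk queues into one number is a simplification made here for illustration, not something the thread states is equivalent to callQueueLen.

```python
from typing import Mapping

# Metric names as listed in the comment above.
TRUNK_QUEUE_METRICS = (
    "numCallsInGeneralQueue",
    "numCallsInPriorityQueue",
    "numCallsInReplicationQueue",
)


def total_queued_calls(metrics: Mapping[str, int]) -> int:
    """Return one backlog number from a snapshot of region server metrics.

    `metrics` is a hypothetical name -> value mapping, however it was
    collected. Prefers the 0.94-style callQueueLen when present and
    otherwise sums the per-queue trunk metrics (an assumption).
    """
    if "callQueueLen" in metrics:
        return int(metrics["callQueueLen"])
    return sum(int(metrics.get(name, 0)) for name in TRUNK_QUEUE_METRICS)
```

Keeping the per-queue values separate would be more informative when the goal is to see which class of calls is backing up.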
[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers
[ https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13559103#comment-13559103 ]

stack commented on HBASE-7633:

Given what Elliott says, can we close this [~fwiffo]?
[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers
[ https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13559106#comment-13559106 ]

Joey Echeverria commented on HBASE-7633:

callQueueLen is close I think, but I'm not sure how that translates into tuning hbase.regionserver.handler.count. Do we have a good example of interpreting that value?
[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers
[ https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13559114#comment-13559114 ]

stack commented on HBASE-7633:

bq. ...translates into tuning hbase.regionserver.handler.count. Do we have a good example of interpreting that value?

Not really, other than: if they are backed up frequently and we're not blocked on CPU or I/O, then bump them up. (Is this for another issue, [~fwiffo]?) Thanks boss.
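The rule of thumb in the comment above can be written down as a small check. This is only a sketch: the function name, the boolean inputs for CPU and I/O saturation, and the 50% cutoff for "frequently" are assumptions chosen for illustration; the thread gives no numeric threshold.

```python
from typing import Sequence


def should_raise_handler_count(
    queue_samples: Sequence[int],
    cpu_bound: bool,
    io_bound: bool,
    backed_up_fraction: float = 0.5,
) -> bool:
    """Suggest raising hbase.regionserver.handler.count only when the
    call queue is frequently non-empty and the server is not already
    blocked on CPU or I/O.

    `queue_samples` are hypothetical call queue length readings;
    `backed_up_fraction` is an assumed cutoff for "frequently".
    """
    if not queue_samples:
        return False
    backed_up = sum(1 for length in queue_samples if length > 0)
    frequently = backed_up / len(queue_samples) >= backed_up_fraction
    return frequently and not cpu_bound and not io_bound
```

If the server is already CPU or I/O bound, as in the dying-disk case described earlier in the thread, adding handlers would not help, so the check returns False.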