[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers

2013-01-25 Thread Joey Echeverria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562861#comment-13562861
 ] 

Joey Echeverria commented on HBASE-7633:


I still think a direct metric would be useful here. The issue where I saw this 
was a slowly dying disk that caused a few region servers to slow way, way down. 
The client application was hammering HBase with new threads trying to write, 
with no back pressure. The writers eventually exhausted the IPC threads on the 
region servers, which blocked incoming reads. This situation would have been a 
bit more graceful if we could have alerted on the IPC threads getting exhausted.

 Add a metric that tracks the current number of used RPC threads on the 
 regionservers
 

 Key: HBASE-7633
 URL: https://issues.apache.org/jira/browse/HBASE-7633
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Reporter: Joey Echeverria
Assignee: Elliott Clark

 One way to detect that you're hitting a John Wayne disk[1] would be if we 
 could see when region servers exhausted their RPC handlers. This would also 
 be useful when tuning the cluster for your workload to make sure that reads 
 or writes were not starving the other operations out.
 [1] http://hbase.apache.org/book.html#bad.disk

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers

2013-01-25 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563195#comment-13563195
 ] 

Elliott Clark commented on HBASE-7633:
--

That's exactly what call queue length would have shown.  It normally shows 0, 
and as things get slower the queue length grows toward the max of ~500.

The way I see it, having all IPC threads full isn't a bad thing.  If the threads 
are answering requests at the same rate as they are coming in, then having all 
the threads answering something is just fine.  The bad part was that they were 
all full and the number of requests waiting to be answered was growing.  Hence 
callQueueLen is what I would look at. 
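The distinction above (busy threads are fine, a growing queue is not) can be sketched as a simple check over sampled queue lengths. This is a minimal illustration, not HBase code: the function name, the window size, and the 0.8 "near max" fraction are hypothetical, and the ceiling of 500 is only the approximate figure quoted in the comment.

```python
from typing import Sequence

# Approximate queue ceiling quoted in the comment above ("~500"); an
# assumption for this sketch, not a value read from HBase.
APPROX_MAX_QUEUE_LEN = 500


def is_backing_up(samples: Sequence[int], window: int = 3,
                  near_max_fraction: float = 0.8) -> bool:
    """Return True when recent callQueueLen samples suggest a backlog.

    A backlog is flagged when the latest sample is close to the approximate
    maximum, or when the last `window` samples are strictly increasing.
    """
    if not samples:
        return False
    if samples[-1] >= near_max_fraction * APPROX_MAX_QUEUE_LEN:
        return True
    recent = samples[-window:]
    if len(recent) < window:
        # Not enough history yet to call it a trend.
        return False
    return all(a < b for a, b in zip(recent, recent[1:]))
```

A flat series such as `[0, 0, 0, 0]` is not flagged, while a series such as `[0, 5, 40, 120]` is, regardless of how many handler threads happen to be busy.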



[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers

2013-01-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563231#comment-13563231
 ] 

stack commented on HBASE-7633:
--

Looks like we don't currently document callQueueLen in the refguide.  If we 
added it, Joey, would that be enough to close this issue?  (Elliott, what is 
the metric's full name and I'll add the doc.)



[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers

2013-01-25 Thread Joey Echeverria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563260#comment-13563260
 ] 

Joey Echeverria commented on HBASE-7633:


Yes, if we added the docs for that, with the info that Elliott just provided on 
how to interpret the results, that'd be perfect. 



[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers

2013-01-21 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559080#comment-13559080
 ] 

Elliott Clark commented on HBASE-7633:
--

In 0.94 there's: 
* callQueueLen

In trunk there are a few more metrics:
* numCallsInGeneralQueue
* numCallsInPriorityQueue
* numCallsInReplicationQueue

While they don't tell you how many threads are currently running, they do hint 
at whether things are backing up.
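As a rough sketch of how a monitoring script might read these metrics across versions: the helper below returns a single "queued calls" figure from a dictionary of metric values. The dictionary itself, the function name, and the idea of summing the three trunk queues into one number comparable to callQueueLen are assumptions made for illustration; only the metric names come from the comment above.

```python
from typing import Mapping

# Per-queue metric names listed for trunk in the comment above.
TRUNK_QUEUE_METRICS = (
    "numCallsInGeneralQueue",
    "numCallsInPriorityQueue",
    "numCallsInReplicationQueue",
)


def queued_calls(metrics: Mapping[str, int]) -> int:
    """Return the number of calls waiting in RPC queues.

    Uses the single 0.94 metric when present; otherwise sums the trunk
    per-queue metrics, treating any missing queue as 0 (an assumption).
    """
    if "callQueueLen" in metrics:
        return metrics["callQueueLen"]
    return sum(metrics.get(name, 0) for name in TRUNK_QUEUE_METRICS)
```

How the `metrics` mapping gets populated (JMX, a metrics sink, and so on) is left out on purpose, since the thread does not say.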



[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers

2013-01-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559103#comment-13559103
 ] 

stack commented on HBASE-7633:
--

Given what Elliott says, can we close this [~fwiffo]?



[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers

2013-01-21 Thread Joey Echeverria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559106#comment-13559106
 ] 

Joey Echeverria commented on HBASE-7633:


callQueueLen is close I think, but I'm not sure how that translates into tuning 
hbase.regionserver.handler.count. Do we have a good example of interpreting 
that value?



[jira] [Commented] (HBASE-7633) Add a metric that tracks the current number of used RPC threads on the regionservers

2013-01-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559114#comment-13559114
 ] 

stack commented on HBASE-7633:
--

bq. ...translates into tuning hbase.regionserver.handler.count.  Do we have a 
good example of interpreting that value?

Not really, other than: if the call queues are backed up frequently and we're 
not blocked on CPU or IO, then bump the handler count up.  (Is this for another 
issue, [~fwiffo]?)  Thanks boss.
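The rule of thumb above could be written down roughly as follows. This is only a sketch of the heuristic as stated in the comment: the function name and every threshold (what counts as "frequently", "blocked on cpu", or "blocked on io") are hypothetical placeholders that would need tuning per cluster.

```python
from typing import Sequence


def should_raise_handler_count(queue_samples: Sequence[int],
                               cpu_busy: float,
                               io_wait: float,
                               backed_up_fraction: float = 0.5,
                               cpu_limit: float = 0.8,
                               io_wait_limit: float = 0.2) -> bool:
    """Suggest raising hbase.regionserver.handler.count.

    Returns True only when the call queue was non-empty in at least
    `backed_up_fraction` of the samples AND neither CPU nor IO looks like
    the bottleneck. All thresholds are illustrative assumptions.
    """
    if not queue_samples:
        return False
    non_empty = sum(1 for q in queue_samples if q > 0)
    frequently_backed_up = non_empty / len(queue_samples) >= backed_up_fraction
    resource_bound = cpu_busy >= cpu_limit or io_wait >= io_wait_limit
    return frequently_backed_up and not resource_bound
```

With a dying disk like the one described in this issue, `io_wait` would presumably be high, so the check would return False: more handlers would not help there.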


