[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2012-06-26 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401683#comment-13401683
 ] 

Jean-Daniel Cryans commented on HBASE-4145:
---

I just stumbled upon this code, and it seems there's an issue in 
{{TableRecordReaderImpl}}. Calling restart() does this:

{code}
public void restart(byte[] firstRow) throws IOException {
  currentScan = new Scan(scan);
{code}

That by itself is fine, since the metrics will be copied from *scan* to 
*currentScan*, except that it is *currentScan*, not *scan*, that has the 
updated metrics.

In other words, *currentScan* is the object that is used for scanning, so it 
contains the metrics. If restart() is called, that object is overwritten by a 
copy of the original definition of the {{Scan}}, and the metrics collected so 
far are lost. I think to fix this we could grab the metrics from 
*currentScan* first, then set them back on the new object.
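
A minimal sketch of the suggested fix, using stand-in classes rather than the real HBase ones. {{SketchScan}}, {{SketchRecordReader}}, the attribute map and the "metrics" key are all hypothetical and only illustrate the idea: read the accumulated metrics from the scan that was actually used, rebuild it from the original definition, then put the metrics back.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a scan definition that carries metrics as an attribute.
class SketchScan {
  static final String METRICS_KEY = "metrics"; // assumed attribute name
  private final Map<String, byte[]> attributes = new HashMap<>();

  SketchScan() {}

  // Copy constructor: the new scan starts with the attributes of the given one.
  SketchScan(SketchScan other) {
    attributes.putAll(other.attributes);
  }

  void setAttribute(String name, byte[] value) { attributes.put(name, value); }

  byte[] getAttribute(String name) { return attributes.get(name); }
}

// Hypothetical stand-in for the record reader described in the comment.
class SketchRecordReader {
  private final SketchScan scan;   // original definition, never updated
  private SketchScan currentScan;  // object actually used for scanning

  SketchRecordReader(SketchScan scan) {
    this.scan = scan;
    this.currentScan = new SketchScan(scan);
  }

  SketchScan getCurrentScan() { return currentScan; }

  // Rebuild currentScan from the original, carrying the accumulated metrics over.
  void restart() {
    byte[] metrics = currentScan.getAttribute(SketchScan.METRICS_KEY);
    currentScan = new SketchScan(scan);
    if (metrics != null) {
      currentScan.setAttribute(SketchScan.METRICS_KEY, metrics);
    }
  }
}
```

Without the two lines that save and restore the attribute, restart() would hand back a scan that only knows what the original definition knew.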

 Provide metrics for hbase client
 

 Key: HBASE-4145
 URL: https://issues.apache.org/jira/browse/HBASE-4145
 Project: HBase
  Issue Type: Improvement
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.94.0

 Attachments: HBaseClientSideMetrics.jpg


 Sometimes it is useful to get metrics from the HBase client's point of view. 
 This will help in understanding the metrics for the scan/TableInputFormat 
 map job scenario.
 What to capture, for example, for each ResultScanner object:
 1. The number of RPC calls to RSs.
 2. The delta time between consecutive RPC calls in the current serialized 
 scan implementation.
 3. The number of RPC retries to RSs.
 4. The number of NotServingRegionExceptions received.
 5. The number of remote RPC calls. This excludes calls that the HBase client 
 makes to an RS on the same machine.
 6. The number of regions accessed.
 How to capture:
 1. The metrics framework works for a fixed number of metrics. It doesn't fit 
 this scenario.
 2. Use some TBD solution in HBase to capture such dynamic metrics. If we 
 assume there is a solution in HBase that the HBase client can use to log 
 such metrics, TableInputFormat can pass the mapreduce task ID as the 
 application scan ID to the HBase client, as a small addition to the existing 
 scan API, and the HBase client can log metrics with that ID. That will allow 
 later query and analysis of the metrics data for a specific map reduce job.
 3. Expose via MapReduce counters. This lacks certain features: for example, 
 there is no good way to access the metrics per map instance, and the 
 MapReduce framework only sums the counter values, so it is tricky to find 
 the max of a metric across all mapper instances. However, it might be good 
 enough for now. With this approach, the metric values will be available via 
 MapReduce counters.
 a) Have ResultScanner return a new ResultScannerMetrics interface.
 b) TableInputFormat will access data from ResultScannerMetrics and populate 
 MapReduce counters accordingly.
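
Approach 3 might look roughly like the sketch below. Only the name ResultScannerMetrics comes from the proposal; its single method, the CounterSink class standing in for MapReduce counters, and the metric name RPC_CALLS are assumptions made for illustration. The sketch also shows the limitation mentioned above: once values are added into a counter, only the sum is left.

```java
import java.util.HashMap;
import java.util.Map;

// Interface name taken from the proposal; the method is an assumption.
interface ResultScannerMetrics {
  Map<String, Long> getMetrics();
}

// Hypothetical stand-in for a MapReduce counter group: it can only add.
class CounterSink {
  private final Map<String, Long> totals = new HashMap<>();

  void increment(String name, long amount) {
    totals.merge(name, amount, Long::sum);
  }

  long get(String name) {
    return totals.getOrDefault(name, 0L);
  }
}

class MetricsPublisher {
  // Copies every metric of one scanner into the shared counters.
  static void publish(ResultScannerMetrics metrics, CounterSink counters) {
    for (Map.Entry<String, Long> e : metrics.getMetrics().entrySet()) {
      counters.increment(e.getKey(), e.getValue());
    }
  }

  public static void main(String[] args) {
    CounterSink counters = new CounterSink();
    Map<String, Long> mapper1 = new HashMap<>();
    mapper1.put("RPC_CALLS", 10L);
    Map<String, Long> mapper2 = new HashMap<>();
    mapper2.put("RPC_CALLS", 25L);

    publish(() -> mapper1, counters);
    publish(() -> mapper2, counters);

    // Only the sum survives; the per-mapper maximum (25) cannot be recovered.
    System.out.println(counters.get("RPC_CALLS")); // prints 35
  }
}
```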

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2012-06-26 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401734#comment-13401734
 ] 

Zhihong Ted Yu commented on HBASE-4145:
---

@J-D:
HBASE-6277 has been created to address your finding.





[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-30 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118134#comment-13118134
 ] 

Hudson commented on HBASE-4145:
---

Integrated in HBase-TRUNK #2272 (See 
[https://builds.apache.org/job/HBase-TRUNK/2272/])
HBASE-4145 Provide metrics for hbase client, add ScanMetrics.java
HBASE-4145  Provide metrics for hbase client (Ming Ma)

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReader.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java






[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-29 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117295#comment-13117295
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/#review2150
---



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment4994

I think we need not give copyright information.


- ramkrishna


On 2011-09-28 23:03:54, Ming Ma wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1674/
bq.  ---
bq.  
bq.  (Updated 2011-09-28 23:03:54)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  1. Collect client-side scan-related metrics during the scan operation. It is 
turned off by default.
bq.  2. TableInputFormat enables metrics collection on the scan and passes the 
data to the mapreduce framework. It only works with the new mapreduce APIs that 
allow TableInputFormat to get access to the mapreduce Counter.
bq.  3. Clean up some minor issues in TableInputFormat as well as test code.
bq.  
bq.  
bq.  This addresses bug hbase-4145.
bq.  https://issues.apache.org/jira/browse/hbase-4145
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReader.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java PRE-CREATION 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java 1176942 
bq.  
bq.  Diff: https://reviews.apache.org/r/1674/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  1. Verified on a small cluster.
bq.  2. Existing unit tests.
bq.  3. Added new tests.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Ming
bq.  
bq.




[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-29 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117608#comment-13117608
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--



bq.  On 2011-09-29 13:33:05, ramkrishna vasudevan wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java,
 line 2
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46377#file46377line2
bq.  
bq.   I think we need not give copyright information.

Fixed.


- Ming


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/#review2150
---


On 2011-09-29 21:00:18, Ming Ma wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1674/
bq.  ---
bq.  
bq.  (Updated 2011-09-29 21:00:18)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  1. Collect client-side scan-related metrics during the scan operation. It is 
turned off by default.
bq.  2. TableInputFormat enables metrics collection on the scan and passes the 
data to the mapreduce framework. It only works with the new mapreduce APIs that 
allow TableInputFormat to get access to the mapreduce Counter.
bq.  3. Clean up some minor issues in TableInputFormat as well as test code.
bq.  
bq.  
bq.  This addresses bug hbase-4145.
bq.  https://issues.apache.org/jira/browse/hbase-4145
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java PRE-CREATION 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReader.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1176942 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java 1176942 
bq.  
bq.  Diff: https://reviews.apache.org/r/1674/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  1. Verified on a small cluster.
bq.  2. Existing unit tests.
bq.  3. Added new tests.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Ming
bq.  
bq.




[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-29 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117611#comment-13117611
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--



bq.  On 2011-09-29 03:48:24, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java,
 line 43
bq.   https://reviews.apache.org/r/1674/diff/3/?file=46268#file46268line43
bq.  
bq.   Should read 'can be easily'

Fixed.


- Ming


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/#review2148
---






[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-29 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117609#comment-13117609
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--



bq.  On 2011-09-29 04:18:54, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java,
 line 48
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46377#file46377line48
bq.  
bq.   Should be declared as implementing VersionedWritable.

The issue with VersionedWritable is it throws VersionMismatchException if the 
version doesn't match. 

  public void readFields(DataInput in) throws IOException {
    byte version = in.readByte(); // read version
    if (version != getVersion())
      throw new VersionMismatchException(getVersion(), version);
  }

I want to make it backward compatible, that is, to support version <= 
getVersion(). The program could catch VersionMismatchException; however, there 
is no way to find out the expectedVersion and foundVersion, given that they are 
private members.

public class VersionMismatchException extends IOException {

  private byte expectedVersion;
  private byte foundVersion;
...
}

Any other suggestions? Or is this something that needs to be fixed in 
VersionedWritable and VersionMismatchException?
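
One way to get the backward compatibility described above is to read the version byte by hand and accept anything up to the current version. The sketch below uses only java.io; the class VersionedMetricsSketch, its two fields and the version numbers are hypothetical and are not the actual ScanMetrics code.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical metrics holder with hand-rolled version handling.
class VersionedMetricsSketch {
  static final byte CURRENT_VERSION = 2;

  long rpcCalls; // assumed to exist since version 1
  long regions;  // assumed to be added in version 2

  void write(DataOutput out) throws IOException {
    out.writeByte(CURRENT_VERSION);
    out.writeLong(rpcCalls);
    out.writeLong(regions);
  }

  void readFields(DataInput in) throws IOException {
    byte version = in.readByte();
    if (version < 1 || version > CURRENT_VERSION) {
      // Include both versions in the message so the caller can see them.
      throw new IOException("Unsupported version " + version
          + ", expected at most " + CURRENT_VERSION);
    }
    rpcCalls = in.readLong();
    // Older payloads lack the newer field; fall back to a default.
    regions = (version >= 2) ? in.readLong() : 0L;
  }

  // Convenience helper for reading from an in-memory payload.
  static VersionedMetricsSketch fromBytes(byte[] data) throws IOException {
    VersionedMetricsSketch metrics = new VersionedMetricsSketch();
    metrics.readFields(new DataInputStream(new ByteArrayInputStream(data)));
    return metrics;
  }
}
```

The trade-off is that every new field needs its own version check in readFields, but an older payload no longer fails outright.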


bq.  On 2011-09-29 04:18:54, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java,
 line 76
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46377#file46377line76
bq.  
bq.   It is a bit hard to read this counter.

Fixed.


bq.  On 2011-09-29 04:18:54, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java,
 line 94
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46377#file46377line94
bq.  
bq.   This is count of regions scanned, right ?
bq.   If so, please name it that way.

Todd suggested renaming it from COUNT_OF_REGIONS to REGIONS, since the fact 
that it is a counter is implicit in the mapreduce framework.


bq.  On 2011-09-29 04:18:54, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java,
 line 127
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46377#file46377line127
bq.  
bq.   mb should be included in the exception.

Fixed.


bq.  On 2011-09-29 04:18:54, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java,
 line 133
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46377#file46377line133
bq.  
bq.   Why do we need this check again ?

Fixed.


bq.  On 2011-09-29 04:18:54, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java,
 line 143
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46377#file46377line143
bq.  
bq.   Value of version should be included here.

Fixed.


bq.  On 2011-09-29 04:18:54, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java,
 line 151
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46377#file46377line151
bq.  
bq.   I think we should have else block where the unsupported mb is logged.

Fixed.


bq.  On 2011-09-29 04:18:54, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java,
 line 52
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46380#file46380line52
bq.  
bq.   This name doesn't really match the constant above. I think HBase 
mapreduce Counters would be better.

The name shows up in the mapreduce UI and reports. Other group names don't 
contain "mapreduce", so I'll keep it as HBase Counters and rename the variable.


bq.  On 2011-09-29 04:18:54, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java,
 line 83
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46380#file46380line83
bq.  
bq.   This should not be a tongue twister.
bq.   How about naming it retrieveGetCounterWithStrings ?

Fixed.


bq.  On 2011-09-29 04:18:54, Ted Yu wrote:
bq.   
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java,
 line 232
bq.   https://reviews.apache.org/r/1674/diff/4/?file=46380#file46380line232
bq.  
bq.   Shall we create the Object array outside the for loop and only fill 
in Metric name here ?

Fixed. No Object array is created at all; the parameters are passed directly.
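
For readers without the diff at hand, the reflection pattern under discussion might look like the sketch below. Only the method name retrieveGetCounterWithStrings and the group name "HBase Counters" come from this thread; SketchContext, SketchCounter and updateCounters are hypothetical stand-ins, not the actual TableRecordReaderImpl code. The arguments go straight into Method.invoke (which takes varargs), so no Object array is built by hand.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical counter type.
class SketchCounter {
  private long value;

  void increment(long amount) { value += amount; }

  long getValue() { return value; }
}

// Hypothetical task context that offers getCounter(String, String).
class SketchContext {
  private final Map<String, SketchCounter> counters = new HashMap<>();

  public SketchCounter getCounter(String group, String name) {
    return counters.computeIfAbsent(group + ":" + name, k -> new SketchCounter());
  }
}

class CounterReflectionSketch {
  // Returns the getCounter(String, String) method, or null if the context lacks it.
  static Method retrieveGetCounterWithStrings(Object context) {
    try {
      return context.getClass().getMethod("getCounter", String.class, String.class);
    } catch (NoSuchMethodException e) {
      return null;
    }
  }

  // Adds each metric value to the counter of the same name in the given group.
  static void updateCounters(Object context, Method getCounter, String group,
      Map<String, Long> metrics) throws ReflectiveOperationException {
    for (Map.Entry<String, Long> e : metrics.entrySet()) {
      // Parameters are passed directly; invoke() packs them itself.
      SketchCounter counter = (SketchCounter) getCounter.invoke(context, group, e.getKey());
      counter.increment(e.getValue());
    }
  }
}
```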


- Ming



[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-29 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117612#comment-13117612
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/
---

(Updated 2011-09-29 21:00:18.525989)


Review request for hbase.


Changes
---

Thanks for the review, Ted, Ram. Most are fixed. Please find comments inline.


Summary
---

1. Collect client-side scan-related metrics during the scan operation. It is 
turned off by default.
2. TableInputFormat enables metrics collection on the scan and passes the data 
to the mapreduce framework. It only works with the new mapreduce APIs that 
allow TableInputFormat to get access to the mapreduce Counter.
3. Clean up some minor issues in TableInputFormat as well as test code.


This addresses bug hbase-4145.
https://issues.apache.org/jira/browse/hbase-4145


Diffs (updated)
-

  
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java 1176942 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 1176942 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1176942 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java 1176942 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java PRE-CREATION 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java 1176942 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReader.java 1176942 
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java 1176942 
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1176942 
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java 1176942 

Diff: https://reviews.apache.org/r/1674/diff


Testing
---

1. Verified on a small cluster.
2. Existing unit tests.
3. Added new tests.


Thanks,

Ming



 Provide metrics for hbase client
 

 Key: HBASE-4145
 URL: https://issues.apache.org/jira/browse/HBASE-4145
 Project: HBase
  Issue Type: Improvement
Reporter: Ming Ma
Assignee: Ming Ma
 Attachments: HBaseClientSideMetrics.jpg


 Sometimes it is useful to get metrics from the HBase client's point of view. 
 This will help in understanding the scan/TableInputFormat map job scenario.
 What to capture, for example, for each ResultScanner object:
 1. The number of RPC calls to RSs.
 2. The delta time between consecutive RPC calls in the current serialized 
 scan implementation.
 3. The number of RPC retries to RSs.
 4. The number of NotServingRegionExceptions received.
 5. The number of remote RPC calls. This excludes calls that the HBase client 
 makes to an RS on the same machine.
 6. The number of regions accessed.
 How to capture:
 1. The metrics framework works for a fixed number of metrics. It doesn't fit 
 this scenario.
 2. Use some TBD solution in HBase to capture such dynamic metrics. If we 
 assume there is a solution in HBase that the HBase client can use to log such 
 metrics, TableInputFormat can pass the mapreduce task ID as an 
 application scan ID to the HBase client, as a small addition to the existing 
 scan API, and the HBase client can log metrics under that ID. That will allow 
 later query and analysis of the metrics data for a specific map reduce job.
 3. Expose via MapReduce counters. This lacks certain features: for example, 
 there is no good way to access the metrics per map instance, and the MapReduce 
 framework only sums the counter values, so it is tricky to find the 
 max of a given metric across all mapper instances. However, it might be good 
 enough for now. With this approach, the metric values will be available via 
 MapReduce counters.
 a) Have ResultScanner return a new ResultScannerMetrics interface.
 b) TableInputFormat will access data from ResultScannerMetrics and populate 
 MapReduce counters accordingly.
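
To illustrate the sum-only limitation described in option 3, here is a minimal sketch in plain Java. It does not use the Hadoop Counter API; the class CounterSumSketch, its methods, and the per-task values are hypothetical and exist only for illustration. Each map stands in for the counters reported by one mapper.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for per-task MapReduce counters.
public class CounterSumSketch {

    // Sums each named counter across tasks, the aggregation described above.
    static Map<String, Long> sumAcrossTasks(List<Map<String, Long>> perTask) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Map<String, Long> task : perTask) {
            for (Map.Entry<String, Long> e : task.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    // The per-task maximum needs the individual values; it cannot be derived from the sum.
    static long maxAcrossTasks(List<Map<String, Long>> perTask, String name) {
        long max = Long.MIN_VALUE;
        for (Map<String, Long> task : perTask) {
            max = Math.max(max, task.getOrDefault(name, 0L));
        }
        return max;
    }

    public static void main(String[] args) {
        List<Map<String, Long>> perTask = List.of(
            Map.of("RPC_CALLS", 1L),
            Map.of("RPC_CALLS", 9L),
            Map.of("RPC_CALLS", 2L));
        System.out.println(sumAcrossTasks(perTask));              // {RPC_CALLS=12}
        System.out.println(maxAcrossTasks(perTask, "RPC_CALLS")); // 9
    }
}
```

The job-level total (12) hides the fact that one task made 9 of the calls, which is why finding the max of a metric across mappers is tricky with summed counters alone.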

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-29 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117621#comment-13117621
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/#review2175
---


Only one minor comment left; see below.
If the test suite passes, I would vote +1.


http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment5068

VersionedWritable from Hadoop leaves something to be desired.
That discussion came up during HBASE-2195.
We can address this elsewhere.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment5069

I meant that this counter can be named REGIONS_SCANNED because its value 
may be lower than the total number of regions in the table(s).
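
As a rough illustration of why the counter value can be lower than the table's region count, the sketch below counts the regions that overlap a scan's row range. It is plain Java under simplifying assumptions: row keys are Strings rather than byte arrays, regions are identified only by their sorted start keys, the first region starts at the empty key, and startRow sorts before stopRow. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch: regions are modeled by their sorted start keys.
public class RegionsScannedSketch {

    // Counts regions whose key range overlaps [startRow, stopRow).
    static int regionsScanned(TreeSet<String> regionStartKeys, String startRow, String stopRow) {
        // The region holding startRow has the greatest start key <= startRow.
        String first = regionStartKeys.floor(startRow);
        // Every region whose start key lies in [first, stopRow) overlaps the scan range.
        return regionStartKeys.subSet(first, true, stopRow, false).size();
    }

    public static void main(String[] args) {
        TreeSet<String> startKeys = new TreeSet<>(List.of("", "g", "n", "t"));
        System.out.println(startKeys.size());                     // 4 regions in the table
        System.out.println(regionsScanned(startKeys, "h", "p"));  // 2 regions scanned
    }
}
```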


- Ted


On 2011-09-29 21:00:18, Ming Ma wrote:
bq.  Review request for hbase: https://reviews.apache.org/r/1674/
bq.  (Updated 2011-09-29 21:00:18)




[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-29 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117634#comment-13117634
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/#review2178
---

Ship it!


- Ted


On 2011-09-29 21:36:19, Ming Ma wrote:
bq.  Review request for hbase: https://reviews.apache.org/r/1674/
bq.  (Updated 2011-09-29 21:36:19)




[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-29 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117871#comment-13117871
 ] 

Ted Yu commented on HBASE-4145:
---

Integrated to TRUNK.

Thanks for the patch, Ming.

 Provide metrics for hbase client
 

 Key: HBASE-4145
 URL: https://issues.apache.org/jira/browse/HBASE-4145
 Project: HBase
  Issue Type: Improvement
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.94.0

 Attachments: HBaseClientSideMetrics.jpg






[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-29 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117870#comment-13117870
 ] 

Ted Yu commented on HBASE-4145:
---

I ran the test suite.
I got an intermittent failure in testMasterFailoverWithMockedRITOnDeadRS but 
couldn't reproduce it in standalone mode.





[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-28 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116588#comment-13116588
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/
---

(Updated 2011-09-28 16:35:57.691899)


Review request for hbase.


Changes
---

Merged with latest trunk.
Ran the unit tests a couple more times.


Summary
---

1. Collect client-side scan-related metrics during scan operations. Collection 
is turned off by default.
2. TableInputFormat enables metrics collection on the scan and passes the data 
to the mapreduce framework. This only works with the new mapreduce APIs, which 
allow TableInputFormat to access mapreduce Counters.
3. Clean up some minor issues in TableInputFormat as well as test code.
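
The flow in items 1 and 2 can be sketched roughly as follows. This is plain Java and not the patch itself: SketchScan, enableMetrics, recordRpcCall, and publish are hypothetical names, and a plain map stands in for the mapreduce counters. It only shows the idea that collection is opt-in and that the collected values are copied into job-level counters afterwards.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of opt-in scan metrics that are published as counters.
public class OptInScanMetricsSketch {

    // Stand-in for a scan: metrics stay off unless explicitly enabled.
    static final class SketchScan {
        private boolean metricsEnabled = false; // off by default
        private final Map<String, Long> metrics = new LinkedHashMap<>();

        void enableMetrics() { metricsEnabled = true; }

        // Called once per simulated RPC; records nothing when metrics are disabled.
        void recordRpcCall() {
            if (metricsEnabled) {
                metrics.merge("RPC_CALLS", 1L, Long::sum);
            }
        }

        Map<String, Long> getMetrics() { return metrics; }
    }

    // Copies the collected metrics into job-level counters (a plain map here).
    static void publish(SketchScan scan, Map<String, Long> jobCounters) {
        scan.getMetrics().forEach((name, value) -> jobCounters.merge(name, value, Long::sum));
    }

    // Simulates a record reader that optionally enables metrics and then scans.
    static Map<String, Long> run(boolean enable, int rpcCalls) {
        SketchScan scan = new SketchScan();
        if (enable) {
            scan.enableMetrics();
        }
        for (int i = 0; i < rpcCalls; i++) {
            scan.recordRpcCall();
        }
        Map<String, Long> counters = new LinkedHashMap<>();
        publish(scan, counters);
        return counters;
    }

    public static void main(String[] args) {
        System.out.println(run(false, 3)); // {}
        System.out.println(run(true, 3));  // {RPC_CALLS=3}
    }
}
```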


This addresses bug hbase-4145.
https://issues.apache.org/jira/browse/hbase-4145


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableInputFormatScan.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReader.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java
 1176942 

Diff: https://reviews.apache.org/r/1674/diff


Testing
---

1. Verified on a small cluster.
2. Existing unit tests.
3. Added new tests.


Thanks,

Ming




[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-28 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116687#comment-13116687
 ] 

Todd Lipcon commented on HBASE-4145:


This is nice stuff. I haven't looked at the code yet, but the feature seems 
very useful. One small nit from the screenshot: I think we can rename the 
counters from COUNT_OF_FOO to just FOOS. The fact that it's a COUNT_OF 
or SUM_OF is implicit in it being a counter. E.g., we had HDFS_BYTES_READ, not 
COUNT_OF_HDFS_BYTES_READ.
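
A small sketch of the naming convention suggested above, assuming the only redundant prefixes are COUNT_OF_ and SUM_OF_. The helper class and method are hypothetical and are not part of the patch; they just make the rule concrete.

```java
// Hypothetical helper illustrating the suggested counter naming convention.
public class CounterNameSketch {

    // Drops a leading COUNT_OF_ or SUM_OF_ prefix; other names are returned unchanged.
    static String shorten(String name) {
        for (String prefix : new String[] {"COUNT_OF_", "SUM_OF_"}) {
            if (name.startsWith(prefix)) {
                return name.substring(prefix.length());
            }
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(shorten("COUNT_OF_RPC_CALLS")); // RPC_CALLS
        System.out.println(shorten("HDFS_BYTES_READ"));    // HDFS_BYTES_READ
    }
}
```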





[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-28 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116884#comment-13116884
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/
---

(Updated 2011-09-28 23:03:54.523337)


Review request for hbase.


Changes
---

Thanks, Todd.

Renamed counters from COUNT_OF_RPC_CALLS to RPC_CALLS, etc.


Summary
---

1. Collect client-side scan-related metrics during scan operations. Collection 
is turned off by default.
2. TableInputFormat enables metrics collection on the scan and passes the data 
to the mapreduce framework. This only works with the new mapreduce APIs, which 
allow TableInputFormat to access mapreduce Counters.
3. Clean up some minor issues in TableInputFormat as well as test code.


This addresses bug hbase-4145.
https://issues.apache.org/jira/browse/hbase-4145


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReader.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java
 1176942 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java
 1176942 

Diff: https://reviews.apache.org/r/1674/diff


Testing
---

1. Verified on a small cluster.
2. Existing unit tests.
3. Added new tests.


Thanks,

Ming







[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-28 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116985#comment-13116985
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/#review2148
---



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment4983

Should read 'can be easily'


- Ted


On 2011-09-28 23:03:54, Ming Ma wrote:
bq.  Review request for hbase: https://reviews.apache.org/r/1674/
bq.  (Updated 2011-09-28 23:03:54)




[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-09-28 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13116989#comment-13116989
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/#review2149
---


Nice work.


http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment4986

Should be declared as implementing VersionedWritable.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment4984

It is a bit hard to read this counter.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment4985

This is the count of regions scanned, right?
If so, please name it that way.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment4987

mb should be included in the exception.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment4988

Why do we need this check again?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment4989

Value of version should be included here.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java
https://reviews.apache.org/r/1674/#comment4990

I think we should have an else block where the unsupported mb is logged.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
https://reviews.apache.org/r/1674/#comment4991

This name doesn't really match the constant above. I think HBase mapreduce 
Counters would be better.



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
https://reviews.apache.org/r/1674/#comment4992

This should not be a tongue twister.
How about naming it retrieveGetCounterWithStrings?



http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java
https://reviews.apache.org/r/1674/#comment4993

Shall we create the Object array outside the for loop and only fill in the 
metric name here?


- Ted
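
A minimal sketch of the last suggestion above, under stated assumptions: the 
patch itself is not quoted in this thread, so the reflective 
getCounter(String, String) call, the FakeContext class and the names 
ReusedArgsSketch and lookupAll are all hypothetical stand-ins. The only point 
illustrated is allocating the argument array once and overwriting the metric 
name slot on each iteration.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class ReusedArgsSketch {
  // Hypothetical stand-in for a task context exposing getCounter(String, String).
  public static class FakeContext {
    public String getCounter(String group, String name) {
      return group + ":" + name;
    }
  }

  // Looks up one counter per metric name through reflection, reusing a single
  // argument array instead of building a new one inside the loop.
  static List<String> lookupAll(FakeContext context, String group, Iterable<String> metricNames) {
    try {
      Method getCounter = FakeContext.class.getMethod("getCounter", String.class, String.class);
      // Allocated once: the group name is the same for every call.
      Object[] params = new Object[] {group, null};
      List<String> results = new ArrayList<>();
      for (String name : metricNames) {
        params[1] = name; // only the metric name changes per iteration
        results.add((String) getCounter.invoke(context, params));
      }
      return results;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("reflective getCounter lookup failed", e);
    }
  }
}
```

Reusing the array is safe here because Method.invoke does not keep a reference 
to its argument array after the call returns.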


On 2011-09-28 23:03:54, Ming Ma wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1674/
bq.  ---
bq.  
bq.  (Updated 2011-09-28 23:03:54)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  1. Collect client-side scan-related metrics during scan operations. 
bq.  Collection is turned off by default.
bq.  2. TableInputFormat enables metrics collection on the scan and passes the 
bq.  data to the mapreduce framework. It only works with the new mapreduce APIs, 
bq.  which allow TableInputFormat to access the mapreduce Counter.
bq.  3. Clean up some minor issues in TableInputFormat as well as test code.
bq.  
bq.  
bq.  This addresses bug hbase-4145.
bq.  https://issues.apache.org/jira/browse/hbase-4145
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapred/TestTableInputFormat.java 1176942
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1176942
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java 1176942
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReader.java 1176942
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java 1176942
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java PRE-CREATION
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 1176942
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1176942
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java 1176942

[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-08-28 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13092591#comment-13092591
 ] 

jirapos...@reviews.apache.org commented on HBASE-4145:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1674/
---

Review request for hbase.


Summary
---

1. Collect client-side scan-related metrics during scan operations. Collection 
is turned off by default.
2. TableInputFormat enables metrics collection on the scan and passes the data 
to the mapreduce framework. It only works with the new mapreduce APIs, which 
allow TableInputFormat to access the mapreduce Counter.
3. Clean up some minor issues in TableInputFormat as well as test code.


This addresses bug hbase-4145.
https://issues.apache.org/jira/browse/hbase-4145


Diffs
-

  
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java 1162612
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 1162612
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java 1162612
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java 1162612
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/metrics/ScanMetrics.java PRE-CREATION
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java 1162612
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReader.java 1162612
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReaderImpl.java 1162612
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1162612
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableInputFormatScan.java 1162612

Diff: https://reviews.apache.org/r/1674/diff


Testing
---

1. Verified on a small cluster.
2. Existing unit tests.
3. Added new tests.


Thanks,

Ming



 Provide metrics for hbase client
 

 Key: HBASE-4145
 URL: https://issues.apache.org/jira/browse/HBASE-4145
 Project: HBase
  Issue Type: Improvement
Reporter: Ming Ma
Assignee: Ming Ma

 Sometimes it is useful to get metrics from the HBase client's point of view. 
 This will help in understanding the metrics for the scan/TableInputFormat map 
 job scenario.
 What to capture, for example, for each ResultScanner object:
 1. The number of RPC calls to RSs.
 2. The delta time between consecutive RPC calls in the current serialized 
 scan implementation.
 3. The number of RPC retries to RSs.
 4. The number of NotServingRegionExceptions received.
 5. The number of remote RPC calls. This excludes calls that the HBase client 
 makes to an RS on the same machine.
 6. The number of regions accessed.
 How to capture:
 1. The metrics framework works for a fixed number of metrics. It doesn't fit 
 this scenario.
 2. Use some TBD solution in HBase to capture such dynamic metrics. If we 
 assume there is a solution in HBase that the HBase client can use to log 
 this kind of metric, TableInputFormat can pass the mapreduce task ID as an 
 application scan ID to the HBase client as a small addition to the existing 
 scan API, and the HBase client can log metrics with that ID. That will allow 
 later querying and analysis of the metrics data for a specific MapReduce job.
 3. Expose via MapReduce counters. This lacks certain features; for example, 
 there is no good way to access the metrics per map instance, and the 
 MapReduce framework only sums counter values, so it is tricky to find the 
 max of a given metric across all mapper instances. However, it might be good 
 enough for now. With this approach, the metric values will be available via 
 MapReduce counters.
 a) Have ResultScanner return a new ResultScannerMetrics interface.
 b) TableInputFormat will access data from ResultScannerMetrics and populate 
 MapReduce counters accordingly.
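
 To make the limitation in option 3 concrete, here is a small sketch in plain 
 Java. It does not use the MapReduce API; the class name and the per-mapper 
 retry counts are made up for illustration. It shows two jobs whose summed 
 counter is identical even though the per-mapper maximum differs.

```java
import java.util.Arrays;
import java.util.List;

class CounterAggregationSketch {
  // What a sum-only counter reports for one metric across all mappers.
  static long sum(List<Long> perMapperValues) {
    long total = 0;
    for (long v : perMapperValues) {
      total += v;
    }
    return total;
  }

  // What one would like to know but cannot recover from the sum alone.
  static long max(List<Long> perMapperValues) {
    long best = Long.MIN_VALUE;
    for (long v : perMapperValues) {
      best = Math.max(best, v);
    }
    return best;
  }

  public static void main(String[] args) {
    // Hypothetical RPC retry counts from four mappers in two different jobs.
    List<Long> evenJob = Arrays.asList(5L, 5L, 5L, 5L);
    List<Long> skewedJob = Arrays.asList(0L, 0L, 0L, 20L);
    System.out.println(sum(evenJob) + " " + max(evenJob));     // prints "20 5"
    System.out.println(sum(skewedJob) + " " + max(skewedJob)); // prints "20 20"
  }
}
```

 Both jobs report a total of 20, so a single hot mapper is indistinguishable 
 from evenly spread retries when only the sum is available.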

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4145) Provide metrics for hbase client

2011-08-18 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087418#comment-13087418
 ] 

Ming Ma commented on HBASE-4145:


Ah, thanks for pointing this out, Stack. We can use this for #3. The 
ClientScanner will call scan.setAttribute with well-defined metrics property 
names. TableInputFormat will call scan.getAttribute to access the metric 
values and pass them on to the MapReduce framework as counters.
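
A rough sketch of that flow, under stated assumptions: AttributeHolder is a 
hypothetical stand-in for the setAttribute(String, byte[]) / 
getAttribute(String) pair on Scan so the example runs without HBase on the 
classpath, and the property names and the 8-byte long encoding are invented 
for illustration rather than taken from the patch.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the attribute methods on Scan; not the HBase class.
class AttributeHolder {
  private final Map<String, byte[]> attributes = new HashMap<>();

  void setAttribute(String name, byte[] value) {
    attributes.put(name, value);
  }

  byte[] getAttribute(String name) {
    return attributes.get(name);
  }
}

class ScanAttributeMetricsSketch {
  // Hypothetical property names; the real ones are whatever the patch defines.
  static final String RPC_CALLS = "scan.metrics.rpcCalls";
  static final String REGIONS = "scan.metrics.regions";

  // Scanner side: publish a counter value under a well-known attribute name.
  static void publish(AttributeHolder scan, String name, long value) {
    scan.setAttribute(name, ByteBuffer.allocate(Long.BYTES).putLong(value).array());
  }

  // Reader side: decode a counter, treating a missing attribute as zero.
  static long read(AttributeHolder scan, String name) {
    byte[] raw = scan.getAttribute(name);
    return raw == null ? 0L : ByteBuffer.wrap(raw).getLong();
  }

  // Reader side: gather the known metrics as name -> value pairs that a
  // record reader could then add to MapReduce counters.
  static Map<String, Long> collect(AttributeHolder scan) {
    Map<String, Long> counters = new HashMap<>();
    for (String name : new String[] {RPC_CALLS, REGIONS}) {
      counters.put(name, read(scan, name));
    }
    return counters;
  }

  public static void main(String[] args) {
    AttributeHolder scan = new AttributeHolder();
    publish(scan, RPC_CALLS, 12L);
    publish(scan, REGIONS, 3L);
    System.out.println(collect(scan));
  }
}
```

The attribute names act as the contract between the two sides: the scanner 
and the input format only need to agree on the names and the value encoding.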
