[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-05 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13490649#comment-13490649
 ] 

Cheng Hao commented on HBASE-6852:
--

@Lars, thank you for committing the patch.
A snapshot of the 0.94 branch code improves scan performance by about 17.7% in 
my case, and HBASE-6032 certainly helps a lot.
Here are the new hotspots for the RegionServer, collected via OProfile:

{code:title=Hotspots|borderStyle=solid}
CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
samples  %        image name  symbol name
183371   17.1144  4465.jo     int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
63267     5.9049  4465.jo     org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue)
59762     5.5777  4465.jo     byte[] org.apache.hadoop.hbase.KeyValue.createByteArray(byte[], int, int, byte[], int, int, byte[], int, int, long, org.apache.hadoop.hbase.KeyValue$Type, byte[], int, int)
50975     4.7576  4465.jo     int org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.blockSeek(byte[], int, int, boolean)
50891     4.7498  4465.jo     void org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek()
38257     3.5706  4465.jo     jbyte_disjoint_arraycopy
37973     3.5441  4465.jo     boolean org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(boolean, org.apache.hadoop.hbase.KeyValue, boolean, boolean)~1
33978     3.1712  4465.jo     void org.apache.hadoop.util.PureJavaCrc32C.update(byte[], int, int)
{code}
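
For readers who want to sanity-check the profile above, here is a minimal Java sketch of the arithmetic behind the `samples` and `%` columns. The class and method names (`HotspotSummary`, `cumulativePercent`, `estimateTotalSamples`) are hypothetical helpers made up for illustration; they are not part of HBase or OProfile. The numbers are copied from the listing above, and the sketch assumes the `%` column is each symbol's share of all samples in the run.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for reading an OProfile listing; not part of HBase or OProfile.
final class HotspotSummary {

    // Sum the '%' column of the listed rows to see how much of the run they cover.
    static double cumulativePercent(Map<String, Double> percentBySymbol) {
        double sum = 0.0;
        for (double pct : percentBySymbol.values()) {
            sum += pct;
        }
        return sum;
    }

    // Estimate the total sample count from one row: samples / (percent / 100).
    static long estimateTotalSamples(long samples, double percent) {
        return Math.round(samples / (percent / 100.0));
    }

    public static void main(String[] args) {
        // Shortened symbol names mapped to the '%' values from the listing above.
        Map<String, Double> rows = new LinkedHashMap<>();
        rows.put("KeyValue$KeyComparator.compare", 17.1144);
        rows.put("ScanQueryMatcher.match", 5.9049);
        rows.put("KeyValue.createByteArray", 5.5777);
        rows.put("HFileReaderV2$ScannerV2.blockSeek", 4.7576);
        rows.put("StoreFileScanner.enforceSeek", 4.7498);
        rows.put("jbyte_disjoint_arraycopy", 3.5706);
        rows.put("KeyValueHeap.generalizedSeek", 3.5441);
        rows.put("PureJavaCrc32C.update", 3.1712);

        System.out.printf("listed hotspots cover %.4f%% of samples%n",
                cumulativePercent(rows));
        System.out.println("estimated total samples: "
                + estimateTotalSamples(183371, 17.1144));
    }
}
```

Under that assumption the eight listed symbols account for a bit under half of all samples, and the run contains roughly 1.07 million samples in total.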

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94_2.patch, 6852-0.94_3.patch, 6852-0.94.txt, 
 metrics_hotspots.png, onhitcache-trunk.patch


 SchemaMetrics.updateOnCacheHit costs too much while I am doing a full 
 table scan.
 Here are the top 5 hotspots within the regionserver while full scanning a table: 
 (Sorry for the poor formatting)
 CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
 Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
 samples  %        image name  symbol name
 ---
 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
 ---
 45814     6.2510  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
 ---
 43523     5.9384  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
 ---
 42548     5.8054  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
 ---
 40572     5.5358  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, 

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13490780#comment-13490780
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Thanks Cheng, this is very helpful!

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94_2.patch, 6852-0.94_3.patch, 6852-0.94.txt, 
 metrics_hotspots.png, onhitcache-trunk.patch


 SchemaMetrics.updateOnCacheHit costs too much while I am doing a full 
 table scan.
 Here are the top 5 hotspots within the regionserver while full scanning a table: 
 (Sorry for the poor formatting)
 CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
 Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
 samples  %        image name  symbol name
 ---
 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
 ---
 45814     6.2510  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
 ---
 43523     5.9384  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
 ---
 42548     5.8054  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
 ---
 40572     5.5358  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
   40572  100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13490347#comment-13490347
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Yeah... Looks good! Thanks again Cheng.

Hey, I was also wondering whether there is a chance you could do your profiling 
one more time with HBASE-6032 applied. In your last profiling run here (Oct 7th), 
HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex takes the top spot 
(after this patch). HBASE-6032 was applied on Oct 17th, and I'm wondering whether 
that helped.

If you're busy, that's fine too... You already spent a lot of time on this 
issue.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13490392#comment-13490392
 ] 

Hudson commented on HBASE-6852:
---

Integrated in HBase-0.94-security-on-Hadoop-23 #9 (See 
[https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/9/])
HBASE-6852 RE-REAPPLY, Cheng worked tirelessly to fix the issues. (Revision 
1405083)
HBASE-6852, REVERT again, due to unexplained test failures that only occur on 
the jenkins machines (Revision 1404691)
HBASE-6852 SchemaMetrics.updateOnCacheHit costs too much while full scanning a 
table with all of its fields (Cheng Hao and LarsH) - REAPPLY (Revision 1404464)
HBASE-6852 REVERT due to test failures. (Revision 1402588)
HBASE-6852 SchemaMetrics.updateOnCacheHit costs too much while full scanning a 
table with all of its fields (Cheng Hao and LarsH) (Revision 1402392)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java

larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java

larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java

larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java

larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489521#comment-13489521
 ] 

Ted Yu commented on HBASE-6852:
---

@Cheng:
Thanks for your persistence.
I will run patch v3 on Linux.

Can you tell us more about the bug you found?

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.4

 Attachments: 6852-0.94_2.patch, 6852-0.94_3.patch, 6852-0.94.txt, 
 metrics_hotspots.png, onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489532#comment-13489532
 ] 

Ted Yu commented on HBASE-6852:
---

I looped TestScannerSelectionUsingTTL 14 times on Linux and it passed every time.

Looking at patch v3, updateOnCacheHit() and flushOnCacheHitMetrics() are 
checking this SchemaMetrics against ALL_SCHEMA_METRICS.
I think patch v3 should be good to go.

I am running patch v3 through the test suite. Will report back if I see any anomaly.
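
For readers following along, here is a minimal Java sketch of the general pattern under discussion: count cache hits locally on the hot path, flush them to shared counters in one batch, and compare the instance against an aggregate instance before also updating the aggregate. This is not the code from patch v3. Every name in it (`BatchedMetricsSketch`, `CacheHitMetrics`, `ALL`, `flushCacheHits`, `BatchingScanner`) is hypothetical, and reading the ALL_SCHEMA_METRICS check as a guard against double-counting is an assumption rather than something stated in this thread.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only; none of these types exist in HBase.
final class BatchedMetricsSketch {

    // Hypothetical stand-in for a per-table/column-family metrics object.
    static final class CacheHitMetrics {
        // Hypothetical aggregate instance that all other instances also report into.
        static final CacheHitMetrics ALL = new CacheHitMetrics();

        private final AtomicLong cacheHits = new AtomicLong();

        // Publish a whole batch of hits with one atomic update per counter.
        void flushCacheHits(long delta) {
            if (delta == 0) {
                return;
            }
            cacheHits.addAndGet(delta);
            // Assumed purpose of the identity check: when this instance IS the
            // aggregate, updating ALL again would count the same hits twice.
            if (this != ALL) {
                ALL.cacheHits.addAndGet(delta);
            }
        }

        long cacheHits() {
            return cacheHits.get();
        }
    }

    // Hypothetical scanner: a plain long on the hot path, one flush on close.
    static final class BatchingScanner implements AutoCloseable {
        private final CacheHitMetrics metrics;
        private long pendingHits; // touched by a single thread, so no atomics needed

        BatchingScanner(CacheHitMetrics metrics) {
            this.metrics = metrics;
        }

        void onCacheHit() {
            pendingHits++;
        }

        @Override
        public void close() {
            metrics.flushCacheHits(pendingHits);
            pendingHits = 0;
        }
    }

    public static void main(String[] args) {
        CacheHitMetrics perFamily = new CacheHitMetrics();
        try (BatchingScanner scanner = new BatchingScanner(perFamily)) {
            for (int i = 0; i < 1000; i++) {
                scanner.onCacheHit(); // no shared-memory traffic here
            }
        }
        System.out.println("per-family hits: " + perFamily.cacheHits());
        System.out.println("aggregate hits:  " + CacheHitMetrics.ALL.cacheHits());
    }
}
```

The design choice in the sketch is to trade metric freshness for throughput: shared counters lag until the flush, but a scan that hits the cache for every block pays for two atomic updates in total instead of two per block.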



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489584#comment-13489584
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Thanks Cheng and thanks Ted!



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489596#comment-13489596
 ] 

Ted Yu commented on HBASE-6852:
---

0.94 test suite passed with patch v3:
{code}
Tests run: 1071, Failures: 0, Errors: 0, Skipped: 12

[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 45:45.503s
[INFO] Finished at: Fri Nov 02 10:30:09 PDT 2012
{code}
@Lars:
Are you going to commit?



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489607#comment-13489607
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Awesome... 3rd time's a charm :)



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489773#comment-13489773
 ] 

Hudson commented on HBASE-6852:
---

Integrated in HBase-0.94 #567 (See 
[https://builds.apache.org/job/HBase-0.94/567/])
HBASE-6852 RE-REAPPLY, Cheng worked tirelessly to fix the issues. (Revision 
1405083)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java


 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94_2.patch, 6852-0.94_3.patch, 6852-0.94.txt, 
 metrics_hotspots.png, onhitcache-trunk.patch


 SchemaMetrics.updateOnCacheHit costs too much while I am doing a full 
 table scan.
 Here are the top 5 hotspots within the regionserver during a full table scan 
 (sorry for the poor formatting):
 CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
 Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
 mask of 0x00 (No unit mask) count 500
 samples  %        image name   symbol name
 ---
 98447  13.4324  14033.jo void 
 org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory,
  boolean)
   98447  100.000  14033.jo void 
 org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory,
  boolean) [self]
 ---
 45814  6.2510  14033.jo int 
 org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, 
 byte[], int, int)
   45814  100.000  14033.jo int 
 org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, 
 byte[], int, int) [self]
 ---
 43523  5.9384  14033.jo boolean 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
   43523  100.000  14033.jo boolean 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
  [self]
 ---
 42548  5.8054  14033.jo int 
 org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, 
 byte[], int, int)
   42548  100.000  14033.jo int 
 org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, 
 byte[], int, int) [self]
 ---
 40572  5.5358  14033.jo int 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[],
  int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
   40572  100.000  14033.jo int 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[],
  int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
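
 The profile above shows a metrics update that runs on every block-cache hit
 dominating a full scan. One common way to cut that kind of per-call cost is
 to count hits in a cheap local counter and publish them to the shared
 metric only once a threshold is reached. The class below is a minimal sketch
 of that idea, not the code from the attached patches: the name
 BatchedHitCounter, the flushThreshold parameter, and the published field
 (standing in for the shared metrics store) are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical illustration of batching metric updates.
 * Not taken from the HBASE-6852 patches.
 */
public class BatchedHitCounter {
    // Number of locally counted hits that triggers a publish (assumed tunable).
    private final int flushThreshold;
    // Hits counted since the last publish; the only state touched per hit.
    private final AtomicLong pending = new AtomicLong();
    // Stand-in for the shared, more expensive metrics store.
    private final AtomicLong published = new AtomicLong();

    public BatchedHitCounter(int flushThreshold) {
        if (flushThreshold <= 0) {
            throw new IllegalArgumentException("flushThreshold must be positive");
        }
        this.flushThreshold = flushThreshold;
    }

    /** Hot path: one atomic increment, plus a publish once per threshold. */
    public void onCacheHit() {
        if (pending.incrementAndGet() >= flushThreshold) {
            flush();
        }
    }

    /** Moves all pending hits to the shared counter; safe to call at any time. */
    public void flush() {
        long n = pending.getAndSet(0);
        if (n > 0) {
            published.addAndGet(n);
        }
    }

    /** Hits that have been published so far (excludes pending ones). */
    public long published() {
        return published.get();
    }
}
```

 The trade-off in a design like this is that the published value can lag the
 true count by up to the threshold until flush() is called, for example when
 a scanner is closed.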



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489861#comment-13489861
 ] 

Cheng Hao commented on HBASE-6852:
--

Ouch! It still failed, and I still can't access the build server.
Is there a problem with the build server?



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489865#comment-13489865
 ] 

Ted Yu commented on HBASE-6852:
---

@Cheng:
The build failure might be due to other reasons.
Check back in a day or two.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489922#comment-13489922
 ] 

Ted Yu commented on HBASE-6852:
---

There was no related test failure in 
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.94/567/ where patch 
v3 went in.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488516#comment-13488516
 ] 

Hudson commented on HBASE-6852:
---

Integrated in HBase-0.94 #562 (See 
[https://builds.apache.org/job/HBase-0.94/562/])
HBASE-6852 SchemaMetrics.updateOnCacheHit costs too much while full 
scanning a table with all of its fields (Cheng Hao and LarsH) - REAPPLY 
(Revision 1404464)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488549#comment-13488549
 ] 

Cheng Hao commented on HBASE-6852:
--

It still failed, and I cannot open the URL 
https://builds.apache.org/job/HBase-0.94/562/. I'm not sure whether there is 
a problem with the build server.





[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488653#comment-13488653
 ] 

Ted Yu commented on HBASE-6852:
---

Test failure in hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL is 
reproducible. 



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488739#comment-13488739
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Not on my machine... Weird.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488740#comment-13488740
 ] 

Lars Hofhansl commented on HBASE-6852:
--

---
 T E S T S
---
Running org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.642 sec

Results :

Tests run: 6, Failures: 0, Errors: 0, Skipped: 0




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488746#comment-13488746
 ] 

Ted Yu commented on HBASE-6852:
---

Here is the environment where the test failed:
{code}
$ uname -a
Linux s0 2.6.38-11-generic #48-Ubuntu SMP Fri Jul 29 19:02:55 UTC 2011 x86_64 
x86_64 x86_64 GNU/Linux

$ java -version
java version 1.6.0_26
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
{code}



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488763#comment-13488763
 ] 

Lars Hofhansl commented on HBASE-6852:
--

I triggered a new build. If that fails again, I am not sure what to do.
I ran the test a lot of times locally and it always passes.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488791#comment-13488791
 ] 

Ted Yu commented on HBASE-6852:
---

The new build failed again.
After reverting patch v2, TestScannerSelectionUsingTTL passed on the 
above-mentioned platform.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488794#comment-13488794
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Does it fail locally on your machine, Ted?
I'm going to run the test on a different machine so that I can debug.
If that is not fruitful, I'll revert the change again... Sigh :(

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488795#comment-13488795
 ] 

Ted Yu commented on HBASE-6852:
---

I wasn't able to reproduce the test failure on a MacBook.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488808#comment-13488808
 ] 

Lars Hofhansl commented on HBASE-6852:
--

I also tried on some other machines (JDK7 and JDK6); it passes every time.
This is extremely disconcerting.


 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488824#comment-13488824
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Sorry, Cheng, I am probably going to have to roll this back again.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489033#comment-13489033
 ] 

Hudson commented on HBASE-6852:
---

Integrated in HBase-0.94 #564 (See 
[https://builds.apache.org/job/HBase-0.94/564/])
HBASE-6852, REVERT again, due to unexplained test failures that only occur 
on the jenkins machines (Revision 1404691)

 Result = FAILURE
larsh :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java


 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.4

 Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13489176#comment-13489176
 ] 

Cheng Hao commented on HBASE-6852:
--

Thanks, Lars and Ted. I will first try to reproduce the failure locally, and 
then check whether there is a logic bug in the schema metrics flushing.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.4

 Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-29 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486068#comment-13486068
 ] 

Cheng Hao commented on HBASE-6852:
--

Oh, sorry about that; I will resolve it ASAP.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486086#comment-13486086
 ] 

Lars Hofhansl commented on HBASE-6852:
--

No problem. Thanks for providing a patch. :)

The problem seems to be that the ALL_SCHEMA_METRIC is not always updated.
(The flushing is definitely not correct: a call to flush won't flush the 
ALL_SCHEMA_METRIC. But even when I fixed that, the tests still failed.)
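
To make the flushing hazard concrete, here is a minimal sketch, assuming the optimization batches cache-hit counts in cheap pending counters and only periodically publishes them to the totals that tests and reporters read. This is not the code from the attached patches: the class BatchedHitMetrics, its methods, the ALL key, and the threshold parameter are all hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration only; not HBase's SchemaMetrics. */
final class BatchedHitMetrics {
    /** Hypothetical key for the aggregate across all column families. */
    static final String ALL = "ALL";

    private final int flushThreshold;
    // Counts accumulated since the last publish; invisible to readers.
    private final Map<String, AtomicLong> pending = new ConcurrentHashMap<>();
    // Totals that a test or metrics reporter would read.
    private final Map<String, AtomicLong> published = new ConcurrentHashMap<>();

    BatchedHitMetrics(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    /** Hot path: bump the per-family counter and the aggregate counter. */
    void onCacheHit(String family) {
        bump(family);
        bump(ALL);
    }

    private void bump(String key) {
        AtomicLong counter = pending.computeIfAbsent(key, k -> new AtomicLong());
        // Publish only once a full batch has accumulated.
        if (counter.incrementAndGet() >= flushThreshold) {
            publish(key, counter);
        }
    }

    /** Publishes every pending counter, including the ALL aggregate. */
    void flush() {
        pending.forEach(this::publish);
    }

    private void publish(String key, AtomicLong counter) {
        long delta = counter.getAndSet(0);
        if (delta != 0) {
            published.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(delta);
        }
    }

    /** Returns the published total for a key, or 0 if nothing was published. */
    long published(String key) {
        AtomicLong total = published.get(key);
        return total == null ? 0 : total.get();
    }
}
```

In a design like this, a reader that checks the totals before flush() runs sees stale values, and a flush() that skipped the ALL entry would leave the aggregate permanently behind the per-family totals, which is the kind of mismatch a metrics test would catch.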


 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484742#comment-13484742
 ] 

Hudson commented on HBASE-6852:
---

Integrated in HBase-0.94 #556 (See 
[https://builds.apache.org/job/HBase-0.94/556/])
HBASE-6852 SchemaMetrics.updateOnCacheHit costs too much while full 
scanning a table with all of its fields (Cheng Hao and LarsH) (Revision 1402392)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-26 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484919#comment-13484919
 ] 

Ted Yu commented on HBASE-6852:
---

There were 10 test failures in build 556 which might be related to this JIRA.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484976#comment-13484976
 ] 

Lars Hofhansl commented on HBASE-6852:
--

will check it out






[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485062#comment-13485062
 ] 

Lars Hofhansl commented on HBASE-6852:
--

These are the failing tests (in case we do not get to this before jenkins 
removes the old run):

org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL.testScannerSelection[2]
org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL.testScannerSelection[3]
org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL.testScannerSelection[4]
org.apache.hadoop.hbase.io.hfile.TestScannerSelectionUsingTTL.testScannerSelection[5]
org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScanned
org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose
org.apache.hadoop.hbase.regionserver.TestStoreFile.testBloomFilter
org.apache.hadoop.hbase.regionserver.TestStoreFile.testDeleteFamilyBloomFilter
org.apache.hadoop.hbase.regionserver.TestStoreFile.testBloomTypes
org.apache.hadoop.hbase.regionserver.TestStoreFile.testBloomEdgeCases




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485064#comment-13485064
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Reverted for now. I think I know what is happening (the metrics are just not 
flushed right away), but I have no time to look into this.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485127#comment-13485127
 ] 

Hudson commented on HBASE-6852:
---

Integrated in HBase-0.94 #557 (See 
[https://builds.apache.org/job/HBase-0.94/557/])
HBASE-6852 REVERT due to test failures. (Revision 1402588)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485345#comment-13485345
 ] 

Lars Hofhansl commented on HBASE-6852:
--

There's a bug in how the ALL_SCHEMA_METRIC is updated/flushed, which causes this.
[~hcheng] If you could have a look, that'd be cool :)



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484399#comment-13484399
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Did a microbenchmark too.
ConcurrentHashMap.get/putIfAbsent plus updating an AtomicLong takes about twice 
as long as just updating an AtomicLong (tested with 1 thread and with 100 threads 
on a dual-core machine).
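
For readers who want to try a comparison of this kind, here is a rough single-threaded sketch. It is not the benchmark that was actually run, and the class name, method names, and the "cacheHit" key are made up for illustration. It only contrasts a get/putIfAbsent lookup followed by an increment with an increment on a counter whose reference is already held. Timings depend heavily on the JVM and hardware, so treat the printed numbers as indicative only.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: class, method, and key names are illustrative, not HBase code.
public class CounterLookupSketch {
    // Counters that have to be looked up by name on every update.
    private static final ConcurrentHashMap<String, AtomicLong> COUNTERS =
        new ConcurrentHashMap<String, AtomicLong>();

    // A counter whose reference is already held, so no lookup is needed.
    private static final AtomicLong DIRECT = new AtomicLong();

    // get, then putIfAbsent on a miss, then the atomic increment.
    public static long incrementViaMap(String name) {
        AtomicLong counter = COUNTERS.get(name);
        if (counter == null) {
            AtomicLong fresh = new AtomicLong();
            AtomicLong existing = COUNTERS.putIfAbsent(name, fresh);
            counter = (existing != null) ? existing : fresh;
        }
        return counter.incrementAndGet();
    }

    // Only the atomic increment.
    public static long incrementDirect() {
        return DIRECT.incrementAndGet();
    }

    public static void main(String[] args) {
        int iterations = 10000000;

        // Warm up both paths so the JIT has compiled them before timing.
        for (int i = 0; i < iterations; i++) {
            incrementViaMap("cacheHit");
            incrementDirect();
        }

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            incrementViaMap("cacheHit");
        }
        long viaMapNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            incrementDirect();
        }
        long directNanos = System.nanoTime() - start;

        System.out.println("map lookup + increment: " + viaMapNanos / 1000000 + " ms");
        System.out.println("direct increment:       " + directNanos / 1000000 + " ms");
    }
}
```

A multi-threaded variant would run the same two loops from several threads; a harness such as JMH would give more trustworthy numbers than this hand-rolled timing.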


 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: metrics_hotspots.png, onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-25 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484491#comment-13484491
 ] 

Elliott Clark commented on HBASE-6852:
--

HBASE-6410 removes most of the calls to concurrent hash maps, and it starts 
using the high-scalability counter class.
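
As a loose illustration of the idea (not the HBASE-6410 code), the sketch below holds a direct reference to a striped counter instead of looking one up in a map on each update. It uses java.util.concurrent.atomic.LongAdder from the JDK as a stand-in; that choice is an assumption made for the example and is not the counter class referred to above. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: LongAdder stands in for a high-scalability counter;
// class and method names are illustrative, not HBase code.
public class StripedCounterSketch {
    // Held directly as a field, so an update needs no map lookup.
    private final LongAdder cacheHits = new LongAdder();

    public void onCacheHit() {
        // LongAdder spreads contended updates over internal cells.
        cacheHits.increment();
    }

    public long cacheHitCount() {
        // sum() adds up the cells; it is exact once all writers have finished.
        return cacheHits.sum();
    }

    // Runs `threads` workers that each record `perThread` hits and returns the total.
    public static long countWithThreads(int threads, final int perThread) {
        final StripedCounterSketch metrics = new StripedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        metrics.onCacheHit();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return metrics.cacheHitCount();
    }

    public static void main(String[] args) {
        System.out.println("total hits: " + countWithThreads(4, 100000));
    }
}
```

The trade-off with this kind of counter is that reads are more expensive than writes, which suits metrics that are updated on every block access but only read when they are reported.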



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484543#comment-13484543
 ] 

Lars Hofhansl commented on HBASE-6852:
--

So we should probably not consider this for 0.94 then. And for 0.96 this is a 
non-issue, right?


 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: metrics_hotspots.png, onhitcache-trunk.patch


 The SchemaMetrics.updateOnCacheHit costs too much while I am doing the full 
 table scanning.
 Here is the top 5 hotspots within regionserver while full scanning a table: 
 (Sorry for the less-well-format)
 CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
 Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
 samples  %        image name  symbol name
 ---
 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
 ---
 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
 ---
 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
 ---
 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
 ---
 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
   40572  100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-25 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484551#comment-13484551
 ] 

Elliott Clark commented on HBASE-6852:
--

For 0.96 it's (hopefully) a non-issue.
For 0.94 I think the perf gain might be worth applying this patch.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: metrics_hotspots.png, onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13484705#comment-13484705
 ] 

stack commented on HBASE-6852:
--

Patch looks good to me for 0.94

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3

 Attachments: 6852-0.94.txt, metrics_hotspots.png, 
 onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13482954#comment-13482954
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Thanks for all your work here Cheng.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: metrics_hotspots.png, onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13476686#comment-13476686
 ] 

Hadoop QA commented on HBASE-6852:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12549247/metrics_hotspots.png
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3055//console

This message is automatically generated.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: metrics_hotspots.png, onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-15 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13476695#comment-13476695
 ] 

Cheng Hao commented on HBASE-6852:
--

Sorry, I just read an article: the self time may not be accurate in the 
sampling results, because a modern JVM may inline function calls.

But the sampling call graph suggests that ConcurrentHashMap.get() is one of 
the hotspots, which may also explain why the patch reduces the overhead.
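
To make the suspected cost concrete, here is a minimal sketch of the difference between looking a counter up by name on every cache hit and resolving it once before the hot loop. It assumes, purely for illustration, that counters live in a ConcurrentHashMap keyed by metric name; MetricRegistrySketch and HotLoopSketch are hypothetical names, and this is not necessarily what the attached patch does.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical registry: counters stored in a concurrent map by name.
final class MetricRegistrySketch {
    private final ConcurrentHashMap<String, AtomicLong> counters =
            new ConcurrentHashMap<>();

    // Returns the counter for a name, creating it on first use.
    AtomicLong counter(String name) {
        return counters.computeIfAbsent(name, k -> new AtomicLong());
    }

    // Pays for a hash lookup on every single increment.
    void incrementByName(String name) {
        counter(name).incrementAndGet();
    }
}

final class HotLoopSketch {
    // One map lookup per block read.
    static void scanWithLookup(MetricRegistrySketch registry, String name, int blocks) {
        for (int i = 0; i < blocks; i++) {
            registry.incrementByName(name);
        }
    }

    // One map lookup in total; the loop only touches the AtomicLong.
    static void scanWithCachedRef(MetricRegistrySketch registry, String name, int blocks) {
        AtomicLong hits = registry.counter(name);
        for (int i = 0; i < blocks; i++) {
            hits.incrementAndGet();
        }
    }
}
```

Both variants end with the same counter value; only the amount of work per cache hit differs.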

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: metrics_hotspots.png, onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-08 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471764#comment-13471764
 ] 

Todd Lipcon commented on HBASE-6852:


bq. @Todd: Re: ThreadLocal. We had a bunch of incidents a few years back at 
Salesforce where it turned out that accessing threadlocals is not free.

Agreed, it involves a lookup in a hashmap. But we could do that lookup once, 
and pass it through the whole scanner stack, etc, in some kind of ScanContext 
parameter. That would be helpful for a bunch of places where we currently use 
threadlocals (metrics, rpc call cancellation checks, tracing, etc)
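
A rough sketch of that idea follows: the ThreadLocal is consulted once when the scan starts, and the result travels down the call stack as an ordinary parameter. ScanContextSketch, ScanMetricsSketch and BlockReaderSketch are hypothetical names used only for illustration; none of them is an existing HBase type.

```java
import java.util.concurrent.atomic.AtomicLong;

// All names here are hypothetical; none of them is an existing HBase type.
final class ScanMetricsSketch {
    private final AtomicLong cacheHits = new AtomicLong();

    void recordCacheHit() {
        cacheHits.incrementAndGet();
    }

    long cacheHits() {
        return cacheHits.get();
    }
}

final class ScanContextSketch {
    // The per-thread state that would otherwise be fetched on every call.
    private static final ThreadLocal<ScanMetricsSketch> PER_THREAD =
            ThreadLocal.withInitial(ScanMetricsSketch::new);

    final ScanMetricsSketch metrics;

    private ScanContextSketch(ScanMetricsSketch metrics) {
        this.metrics = metrics;
    }

    // One ThreadLocal lookup per scan, done at the outermost layer.
    static ScanContextSketch open() {
        return new ScanContextSketch(PER_THREAD.get());
    }
}

final class BlockReaderSketch {
    // Inner layers receive the context as a parameter instead of
    // consulting the ThreadLocal themselves.
    static void readBlocks(int blocks, ScanContextSketch ctx) {
        for (int i = 0; i < blocks; i++) {
            ctx.metrics.recordCacheHit();
        }
    }
}
```

The same context object could carry the other per-request state mentioned above (cancellation flags, trace spans), at the price of one more parameter on every method in the scanner stack.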

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: AtomicTest.java, onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471390#comment-13471390
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Thanks Cheng. That microbenchmark might not cover the actual cost of memory 
barriers when many different threads are running on different cores.

It looks like the patch will be an improvement.

It would still be great to know why updateOnCacheHit causes such a performance 
hit. If it is the ConcurrentMap access we should fix it there (with the 
lock-free array theme I mentioned above - maybe with the padding as Todd 
suggests, if needed). That would be a more general fix.
What do you think, Cheng?
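
For illustration, a lock-free array of counters with padding could look roughly like the sketch below. The enum is a stand-in for BlockType.BlockCategory, the boolean is assumed to be a compaction flag, the 64-byte cache line is an assumption, and all class names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Stand-in for BlockType.BlockCategory; the real enum may differ.
enum CategorySketch { DATA, INDEX, BLOOM, META }

// Counters stored in one AtomicLongArray, spaced so that two counters
// never share a cache line (64 bytes assumed, i.e. 8 longs).
final class PaddedCounters {
    private static final int STRIDE = 8;
    private final AtomicLongArray slots;

    PaddedCounters(int count) {
        slots = new AtomicLongArray(count * STRIDE);
    }

    void increment(int index) {
        slots.incrementAndGet(index * STRIDE);
    }

    long get(int index) {
        return slots.get(index * STRIDE);
    }
}

final class CacheHitCountersSketch {
    // Two counters per category: one per value of the boolean flag.
    private final PaddedCounters counters =
            new PaddedCounters(CategorySketch.values().length * 2);

    // Array index computed from the enum ordinal: no hashing, no map.
    private static int slot(CategorySketch category, boolean isCompaction) {
        return category.ordinal() * 2 + (isCompaction ? 1 : 0);
    }

    void updateOnCacheHit(CategorySketch category, boolean isCompaction) {
        counters.increment(slot(category, isCompaction));
    }

    long hits(CategorySketch category, boolean isCompaction) {
        return counters.get(slot(category, isCompaction));
    }
}
```

The padding only keeps the slots apart from each other; whether it pays off depends on how many threads update neighbouring counters at the same time.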

@Todd: Re: ThreadLocal. We had a bunch of incidents a few years back at 
Salesforce where it turned out that accessing threadlocals is not free.


 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: AtomicTest.java, onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471149#comment-13471149
 ] 

Hadoop QA commented on HBASE-6852:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12548145/AtomicTest.java
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3018//console

This message is automatically generated.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: AtomicTest.java, onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-06 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471150#comment-13471150
 ] 

Cheng Hao commented on HBASE-6852:
--

I re-ran the scanning tests with and without the attached patch; the patched 
version still took about 10% less total running time.
The oprofile result for the unpatched version (top 4):
samples  %        image name  symbol name
---
54182    14.6977  23960.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.incrNumericMetric(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean, org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics$BlockMetricType)
  54182  100.000  23960.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.incrNumericMetric(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean, org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics$BlockMetricType) [self]
---
43949    11.9219  23960.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
  43949  100.000  23960.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
---
20725    5.6220   23960.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)
  20725  100.000  23960.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator) [self]
---
17554    4.7618   23960.jo    org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(long, long, boolean, boolean, boolean, org.apache.hadoop.hbase.io.hfile.BlockType)
  17554  100.000  23960.jo    org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(long, long, boolean, boolean, boolean, org.apache.hadoop.hbase.io.hfile.BlockType) [self]

And the oprofile result for the patched version (top 4):
samples  %        image name  symbol name
---
53716    11.9679  3683.jo     int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)
  53716  100.000  3683.jo     int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator) [self]
---
34921    7.7804   3683.jo     int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
  34921  100.000  3683.jo     int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
---
31446    7.0061   3683.jo     org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(long, long, boolean, boolean, boolean, org.apache.hadoop.hbase.io.hfile.BlockType)
  31446  100.000  3683.jo     org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(long, long, boolean, boolean, boolean, org.apache.hadoop.hbase.io.hfile.BlockType) [self]
---
20126    4.4841   3683.jo     org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue)
  20126  100.000  3683.jo     org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue) [self]

Perhaps the function call itself costs too much (stack pushing/popping, etc.), 
and the patch just reduces the unnecessary function calls.


[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-06 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471151#comment-13471151
 ] 

Cheng Hao commented on HBASE-6852:
--

Sorry, please check the attached AtomicTest.java, which compares the 
performance of AtomicLong, Counter, and a normal function call.
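
For readers without the attachment, here is a rough sketch of what such a comparison could look like. It is not the attached AtomicTest.java. The class and method names are made up for illustration, and java.util.concurrent.atomic.LongAdder (JDK 8+) is used only as a stand-in for a striped "Counter" class, which is an assumption. Single-threaded timings like these are indicative at best.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical micro-benchmark sketch; NOT the AtomicTest.java from the issue.
class CounterCostSketch {
    // Plain, non-thread-safe counter bumped through an ordinary method call.
    private static long plain;

    static void plainIncrement() { plain++; }

    static long plainCount() { return plain; }

    // Each timeXxx method performs n increments and returns elapsed nanoseconds.
    static long timeAtomicLong(int n, AtomicLong counter) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            counter.incrementAndGet();
        }
        return System.nanoTime() - start;
    }

    // LongAdder is an assumed stand-in for a striped counter implementation.
    static long timeLongAdder(int n, LongAdder counter) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            counter.increment();
        }
        return System.nanoTime() - start;
    }

    static long timePlainCall(int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            plainIncrement();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        // Warm-up pass so the JIT has compiled the loops before we measure.
        timeAtomicLong(n, new AtomicLong());
        timeLongAdder(n, new LongAdder());
        timePlainCall(n);

        System.out.printf("AtomicLong: %d ms%n", timeAtomicLong(n, new AtomicLong()) / 1_000_000);
        System.out.printf("LongAdder:  %d ms%n", timeLongAdder(n, new LongAdder()) / 1_000_000);
        System.out.printf("plain call: %d ms%n", timePlainCall(n) / 1_000_000);
    }
}
```

A proper harness such as JMH would give more trustworthy numbers, since the JIT may optimize the plain loop far more aggressively than the atomic ones.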

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: AtomicTest.java, onhitcache-trunk.patch


 SchemaMetrics.updateOnCacheHit costs too much while I am doing a full 
 table scan.
 Here are the top 5 hotspots within the regionserver while full scanning a 
 table: (Sorry for the poor formatting)
 CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
 Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
 mask of 0x00 (No unit mask) count 500
 samples  %        image name  symbol name
 ---
 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
 ---
 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
 ---
 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
 ---
 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
 ---
 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
   40572  100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13465393#comment-13465393
 ] 

Lars Hofhansl commented on HBASE-6852:
--

I still find it strange that still using AtomicLongs gives an improvement 
(because all that's different then is an access into a concurrent map) and that 
your test was in fact CPU-bound. A 10% improvement seems almost unbelievable; it 
makes me think there is something else at play.
There's probably no harm in committing it (it's only slightly more complicated).




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13465399#comment-13465399
 ] 

stack commented on HBASE-6852:
--

OK. Will leave it for now.  Will commit it if Elliott doesn't subsume this w/ 
his cliffclick counter.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-27 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13465319#comment-13465319
 ] 

stack commented on HBASE-6852:
--

So what is the feeling here?  This is an improvement.  It's a sketch of what 
we'd like to do long term.  It improves your performance [~chenghao]?  I'm 
inclined to commit it.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-27 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13465357#comment-13465357
 ] 

Cheng Hao commented on HBASE-6852:
--

Hi, stack, the patch does improve the performance in my case, and for the 
AtomicLong stuff, maybe we could wait for the next generation of Metrics 
framework.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-23 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13461559#comment-13461559
 ] 

Cheng Hao commented on HBASE-6852:
--

{quote} Cheng Hao: you said that your dataset size was 600GB, and the total 
amount of block cache was presumably much smaller than that, which makes me 
think the workload should have been I/O-bound. What was the CPU utilization on 
your test? What was the disk throughput?
{quote}
Actually it is CPU-bound, and the utilization is more than 80%.

I have 4 machines, and each machine has 12 disks and 24 CPU cores.
Besides, to make the test more effective, I split the regions twice and then 
ran a major compaction to ensure data locality. After that, I ran the data 
scanning tests based on a Hive query like select count(*) from xxx;

I am also curious whether there is any overhead from thread/syscall switching 
(such as during the IPC). PS: I did set hbase.client.scanner.caching to 1000.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460253#comment-13460253
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Interesting. Thanks Cheng. I wonder what causes the performance problem then. 
Is it the get/putIfAbsent of the ConcurrentMap we store the metrics in?

I'd probably feel better if you set the threshold to 100 (instead of 2000) - 
you'd still reduce the time used there by 99%.

Also looking at the places where updateOnCacheHit is called... We also 
increment an AtomicLong (cacheHits), which is never read (WTF). We should 
remove that counter while we're at it (even when AtomicLongs are not the 
problem).
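
To illustrate the threshold idea discussed above, here is a minimal sketch. It assumes the patch accumulates hits in a cheap counter and only touches the ConcurrentMap-backed metric once per threshold hits. That reading of the patch is an assumption, and the class, field, and metric names are invented for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of threshold-based flushing; not the code from the patch.
class BatchedHitCounter {
    private final String metricName;
    private final int threshold;
    // Hits accumulated since the last flush.
    private final AtomicLong pending = new AtomicLong();
    // Number of times the shared map was actually updated.
    private final AtomicLong flushes = new AtomicLong();
    // Stand-in for the metrics registry (a map from metric name to total).
    private final ConcurrentMap<String, AtomicLong> registry = new ConcurrentHashMap<>();

    BatchedHitCounter(String metricName, int threshold) {
        this.metricName = metricName;
        this.threshold = threshold;
    }

    // Hot path: one atomic increment, with a map access only every `threshold` hits.
    void hit() {
        if (pending.incrementAndGet() >= threshold) {
            flush();
        }
    }

    // Moves whatever has accumulated into the shared registry.
    void flush() {
        long delta = pending.getAndSet(0);
        if (delta == 0) {
            return;
        }
        AtomicLong total = registry.get(metricName);
        if (total == null) {
            AtomicLong fresh = new AtomicLong();
            total = registry.putIfAbsent(metricName, fresh);
            if (total == null) {
                total = fresh;
            }
        }
        total.addAndGet(delta);
        flushes.incrementAndGet();
    }

    long total() {
        AtomicLong total = registry.get(metricName);
        return total == null ? 0L : total.get();
    }

    long flushCount() {
        return flushes.get();
    }

    public static void main(String[] args) {
        BatchedHitCounter counter = new BatchedHitCounter("blockCacheHit", 100);
        for (int i = 0; i < 1000; i++) {
            counter.hit();
        }
        counter.flush(); // pick up any remainder below the threshold
        System.out.println(counter.total() + " hits, " + counter.flushCount() + " map updates");
    }
}
```

With a threshold of 100, 1000 hits cause only 10 map updates, which is the 99% reduction mentioned above. The trade-off is that the published metric can lag the true count by up to one threshold's worth of hits.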


 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.2, 0.96.0

 Attachments: onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460275#comment-13460275
 ] 

liang xie commented on HBASE-6852:
--

Hi Cheng, for the running time, could you rule out the system resource factor? 
E.g. you ran the original version with many physical I/Os, but reran the 
patched version without similar physical I/O requests because it hit the OS 
page cache. In other words, can the reduced running time always be reproduced, 
even if you run the patched version first and then rerun the original version? 
It would be better if you issued echo 1 > /proc/sys/vm/drop_caches to free the 
page cache between each test.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.2, 0.96.0

 Attachments: onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460280#comment-13460280
 ] 

Cheng Hao commented on HBASE-6852:
--

Lars, the only place that uses a ConcurrentMap in SchemaMetrics is 
tableAndFamilyToMetrics. In this patch, I pre-create an array of AtomicLong for 
all of the possible on-cache-hit metrics items, which avoids the concurrency 
issue and is easy to index while accessing.
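
A minimal sketch of the pre-created array idea follows, assuming one AtomicLong per combination of block category and compaction flag. The enum below is a simplified, hypothetical stand-in for BlockType.BlockCategory, and the class name and indexing scheme are illustrative rather than copied from the patch.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names and layout are assumptions, not the patch itself.
class CacheHitCounters {
    // Simplified stand-in for org.apache.hadoop.hbase.io.hfile.BlockType.BlockCategory.
    enum BlockCategory { DATA, META, INDEX, BLOOM }

    private static final int NUM_CATEGORIES = BlockCategory.values().length;

    // One counter per (category, isCompaction) pair, all allocated up front.
    private final AtomicLong[] counters = new AtomicLong[NUM_CATEGORIES * 2];

    CacheHitCounters() {
        for (int i = 0; i < counters.length; i++) {
            counters[i] = new AtomicLong();
        }
    }

    // Maps a (category, isCompaction) pair to a fixed slot in the array.
    static int indexOf(BlockCategory category, boolean isCompaction) {
        return category.ordinal() * 2 + (isCompaction ? 1 : 0);
    }

    // Hot path: plain array indexing plus one atomic increment, no map lookup.
    void updateOnCacheHit(BlockCategory category, boolean isCompaction) {
        counters[indexOf(category, isCompaction)].incrementAndGet();
    }

    long get(BlockCategory category, boolean isCompaction) {
        return counters[indexOf(category, isCompaction)].get();
    }

    public static void main(String[] args) {
        CacheHitCounters metrics = new CacheHitCounters();
        metrics.updateOnCacheHit(BlockCategory.DATA, false);
        metrics.updateOnCacheHit(BlockCategory.DATA, false);
        metrics.updateOnCacheHit(BlockCategory.INDEX, true);
        System.out.println("DATA/read hits: " + metrics.get(BlockCategory.DATA, false));
    }
}
```

Because every slot exists before the first hit, the hot path never needs a get/putIfAbsent on a concurrent map.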

Thanks stack and Lars for the suggestions, I will create another patch file 
instead.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.2, 0.96.0

 Attachments: onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460297#comment-13460297
 ] 

Cheng Hao commented on HBASE-6852:
--

Hi Liang, that's a really good suggestion. Actually I didn't free the OS page 
cache before each launch, but I can try that later.

In my tests, the table data was about 600GB across 4 machines, so I guess the 
system cache may not impact the overall performance that much for a full table 
scan.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.2, 0.96.0

 Attachments: onhitcache-trunk.patch


 The SchemaMetrics.updateOnCacheHit costs too much while I am doing the full 
 table scanning.
 Here is the top 5 hotspots within regionserver while full scanning a table: 
 (Sorry for the less-well-format)
 CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
 Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
 mask of 0x00 (No unit mask) count 500
 samples  %        image name  symbol name
 -------------------------------------------------------------------------------
 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
 -------------------------------------------------------------------------------
 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
 -------------------------------------------------------------------------------
 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
 -------------------------------------------------------------------------------
 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
 -------------------------------------------------------------------------------
 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
   40572  100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460316#comment-13460316
 ] 

Cheng Hao commented on HBASE-6852:
--

I didn't remove the cacheHits in HFileReaderV1 & V2; I hope this is a good 
start toward designing a lower-overhead metrics framework.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460318#comment-13460318
 ] 

Hadoop QA commented on HBASE-6852:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12546009/onhitcache-trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2913//console

This message is automatically generated.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460615#comment-13460615
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Patch looks good. I remain sceptical about the real-life impact, though. The 
expensive part is taking the memory barriers. As long as we use AtomicLong (or 
volatiles, or synchronized, or ConcurrentMap) this is still going to happen.

Lemme move this out to 0.94.3, so that we can performance test this a bit more.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Mikhail Bautin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460683#comment-13460683
 ] 

Mikhail Bautin commented on HBASE-6852:
---

[~lhofhansl]: what are the other cases when metrics came up as performance 
issues?

[~chenghao_sh]: you said that your dataset size was 600GB, and the total amount 
of block cache was presumably much smaller than that, which makes me think the 
workload should have been I/O-bound. What was the CPU utilization on your test? 
What was the disk throughput?



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460689#comment-13460689
 ] 

Todd Lipcon commented on HBASE-6852:


I have a "full table scan in isolation" benchmark I've been working on. My 
benchmark currently disables metrics, so I haven't seen this, but I'll add a 
flag to it to enable metrics and see if I can reproduce. Since it runs in 
isolation, it's easy to run under perf stat and get cycle counts, etc., out of 
it. Will report back next week.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460695#comment-13460695
 ] 

Lars Hofhansl commented on HBASE-6852:
--

HBASE-6603 was the other one. Turns out this is the 2nd time (not the 3rd). The 
other issues I found through profiling were not metric related.

So I was thinking about what we should generally do about this. The idea in 
this patch (using an array indexed by metric) is a good one. Can we generally 
do that? I.e.:
# We know the metrics we wish to collect ahead of time
# Assign an index to each of them, and collect the values in an array
# Simply use long (not volatile, not AtomicLong, just long)
# Upon update or read we access the metric array by index

That would eliminate the cost of the ConcurrentMap and of the AtomicXYZ, with 
the caveat that the metrics are only an approximation, which at the very least 
will make testing much harder.
Maybe we could have exact and fuzzy metrics and only use the fuzzy ones on the 
hot code-paths.
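
The four steps above can be sketched roughly as follows. This is only an illustration of the idea, not the attached patch: the Metric enum, its constants, and the FuzzyMetrics class are hypothetical names made up for this example.

```java
/** Hypothetical metric identifiers, known ahead of time; the ordinal is the array index. */
enum Metric { DATA_BLOCK_CACHE_HIT, INDEX_BLOCK_CACHE_HIT, BLOOM_BLOCK_CACHE_HIT }

/** Approximate ("fuzzy") counters: plain longs, no volatile, no AtomicLong, no map lookup. */
final class FuzzyMetrics {
  private final long[] values = new long[Metric.values().length];

  /** Unsynchronized update; concurrent increments from several threads may be lost. */
  void increment(Metric m) {
    values[m.ordinal()]++;
  }

  /** Unsynchronized read; another thread may observe a stale value. */
  long get(Metric m) {
    return values[m.ordinal()];
  }
}
```

With a single writer the counts are exact; with several writers they are only an approximation, which is the caveat raised in the comment.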



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460698#comment-13460698
 ] 

Todd Lipcon commented on HBASE-6852:


If using an array of longs, we'd get a ton of cache contention effects. 
Whatever we do should be cache-line padded to avoid this perf hole.

Having a per-thread (ThreadLocal) metrics array isn't a bad way to go: no 
contention, can use non-volatile types, and can be stale-read during metrics 
snapshots by just iterating over all the threads.
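
A minimal sketch of the per-thread idea, assuming Java 8 or later: each thread increments its own plain long[] and a snapshot sums over all per-thread arrays. The PerThreadMetrics class and its method names are hypothetical and do not come from HBase.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical per-thread counters: uncontended plain writes, stale-tolerant reads. */
final class PerThreadMetrics {
  // Every per-thread array ever created, so a snapshot can iterate over them.
  // Arrays of finished threads stay registered, so their counts are not lost.
  private final Queue<long[]> allSlots = new ConcurrentLinkedQueue<>();
  private final ThreadLocal<long[]> local;

  PerThreadMetrics(final int numMetrics) {
    this.local = ThreadLocal.withInitial(() -> {
      long[] slot = new long[numMetrics];
      allSlots.add(slot);
      return slot;
    });
  }

  /** Hot path: a non-volatile increment on the calling thread's own array. */
  void increment(int metricIndex) {
    local.get()[metricIndex]++;
  }

  /** Snapshot: sums all threads' values; may miss very recent increments. */
  long snapshot(int metricIndex) {
    long sum = 0;
    for (long[] slot : allSlots) {
      sum += slot[metricIndex];
    }
    return sum;
  }
}
```

Because each thread allocates its own array, two threads do not write to the same array. A snapshot taken while writers are still running may lag behind; after the writer threads have been joined it is exact.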



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460699#comment-13460699
 ] 

stack commented on HBASE-6852:
--

Perhaps use the cliffclick counter (if its cost < volatile) and not have to do 
fuzzy?
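
The "cliffclick counter" presumably refers to the striped Counter class in Cliff Click's high-scale-lib. As a stand-in for illustration only, the sketch below uses java.util.concurrent.atomic.LongAdder (JDK 8+), which also spreads updates across several cells to reduce contention while still giving exact totals. The StripedCacheHitCounter class is a hypothetical name, and LongAdder is a swapped-in substitute, not what this thread used.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical exact counter backed by a striped adder instead of a single AtomicLong. */
final class StripedCacheHitCounter {
  private final LongAdder hits = new LongAdder();

  /** Under contention, updates are spread over internal cells rather than one hot word. */
  void onCacheHit() {
    hits.increment();
  }

  /** Sums the cells; exact when no updates are in flight. */
  long total() {
    return hits.sum();
  }
}
```

Whether such a counter is actually cheaper than a volatile write on the scan path would still need to be measured.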



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460702#comment-13460702
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Oh yeah, you mentioned cliffclick... Need to look at that.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460717#comment-13460717
 ] 

stack commented on HBASE-6852:
--

[~lhofhansl] I made my comment before I saw Todd's suggestion

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.3, 0.96.0

 Attachments: onhitcache-trunk.patch


 SchemaMetrics.updateOnCacheHit costs too much during a full table scan.
 Here are the top 5 hotspots within the regionserver while full scanning a table: 
 (Sorry for the poor formatting)
 CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
 Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
 mask of 0x00 (No unit mask) count 500
 samples  %        image name  symbol name
 ---
 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
 ---
 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
 ---
 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
 ---
 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
 ---
 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
   40572  100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460725#comment-13460725
 ] 

Elliott Clark commented on HBASE-6852:
--

I think we should start doing more of what this patch does: collect the values 
locally and then use a single call into the metrics sources to push the 
collected metrics.  In addition, I think that we should remove some of the 
lesser-used dynamic metrics, and for others stop using the time-varying rate.

For the most part I think that will keep the cost of metrics from getting out 
of control.  However, I don't think that we should stop using 
AtomicLong/AtomicInteger. From my understanding, on most architectures the JVM 
will turn getAndIncrement into just one CPU instruction, rather than using 
compare and swap.  So there's very little gained by sacrificing correctness.
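
As a rough illustration of the "collect locally, push once" idea described above, here is a minimal sketch. The class name `BatchedHitCounter`, its methods, and the threshold value are hypothetical and are not taken from the patch or from any HBase API. Each scanner owns a plain `long` and touches the shared `AtomicLong` only when a batch fills up or the scanner closes.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration; names are not taken from the HBase patch. */
final class BatchedHitCounter {
    // Shared, potentially contended counter that the metrics system would read.
    private final AtomicLong shared;
    // How many local hits to accumulate before one push (assumed value).
    private final int flushThreshold;
    // Owned by a single scanner thread, so a plain field is enough.
    private long pending;

    public BatchedHitCounter(AtomicLong shared, int flushThreshold) {
        this.shared = shared;
        this.flushThreshold = flushThreshold;
    }

    /** Record one cache hit without touching shared state in the common case. */
    public void onCacheHit() {
        if (++pending >= flushThreshold) {
            flush();
        }
    }

    /** Push whatever has accumulated; call this when the scanner closes. */
    public void flush() {
        if (pending > 0) {
            shared.addAndGet(pending);
            pending = 0;
        }
    }

    public static void main(String[] args) {
        AtomicLong total = new AtomicLong();
        BatchedHitCounter counter = new BatchedHitCounter(total, 1000);
        for (int i = 0; i < 2500; i++) {
            counter.onCacheHit();
        }
        System.out.println(total.get()); // only full batches have been pushed
        counter.flush();
        System.out.println(total.get()); // remainder pushed on close
    }
}
```

The trade-off in a sketch like this is that the shared value lags behind by up to one batch until `flush()` runs, so the threshold bounds how stale the published metric can be.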



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460736#comment-13460736
 ] 

Todd Lipcon commented on HBASE-6852:


bq. getAndIncrement into just one cpu instruction

True, but it's a pretty expensive instruction, since it has to steal that cache 
line from whichever other core used it previously, and I believe it acts as a 
full memory barrier as well (e.g. flushing write-combining buffers).

The Cliff Click counter is effective but uses more memory. Aggregating 
stuff locally and pushing to metrics seems ideal, but if we can't do that 
easily, then keeping the metrics per-thread and occasionally grabbing them 
would work too. Memcached metrics work like that.
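
The per-thread variant mentioned above could look roughly like the following sketch. `PerThreadCounter` and its methods are hypothetical names, not an HBase, high-scale-lib, or memcached API. The assumption is that every worker thread writes only to its own cell and a metrics poller sums the cells now and then.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-thread counter; not an HBase or memcached API. */
final class PerThreadCounter {
    // Every thread's cell, kept so a reader can sum them later.
    private final List<AtomicLong> cells = new CopyOnWriteArrayList<>();

    // Each thread lazily gets its own cell, so increments do not contend
    // on one shared variable.
    private final ThreadLocal<AtomicLong> local = ThreadLocal.withInitial(() -> {
        AtomicLong cell = new AtomicLong();
        cells.add(cell);
        return cell;
    });

    /** Hot path: only the calling thread ever writes to its cell. */
    public void increment() {
        AtomicLong cell = local.get();
        // Single writer per cell, so read-then-ordered-write is safe here
        // and avoids an atomic read-modify-write instruction.
        cell.lazySet(cell.get() + 1);
    }

    /** Cold path: the metrics poller occasionally sums every thread's cell. */
    public long sum() {
        long total = 0;
        for (AtomicLong cell : cells) {
            total += cell.get();
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        PerThreadCounter hits = new PerThreadCounter();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    hits.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(hits.sum());
    }
}
```

Note that this sketch holds a strong reference to every cell, so cells of finished threads are never reclaimed, and neighbouring cells may still share a cache line unless they are padded. On Java 8 and later, `java.util.concurrent.atomic.LongAdder` is a JDK class built on a similar striping idea.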



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460786#comment-13460786
 ] 

Elliott Clark commented on HBASE-6852:
--

bq. Aggregating stuff locally and pushing to metrics seems ideal
With that comes a lot of bookkeeping and potential places to leak memory (if we 
use strong references) or to lose metrics data (if we use weak references). I'm 
not sure that the perf gain will be high enough to justify that. 

Since we already shim a lot to the metrics2 classes, it seems like using the 
high-scale-lib counters to create concurrent versions of the 
MetricMutableCounter{Long|Int} would stop most cache contention pretty easily.  
For me this seems like the order of cost vs. benefit:
# Aggregating metrics locally before pushing to the metrics system whenever 
possible
# Using the hashmap less (this is already happening in the metrics2 move-over; 
see 
[MasterMetricsSourceImpl|https://github.com/apache/hbase/blob/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java]
 for how known metrics are staying away from the hashmap)
# Changing metrics to use counters rather than time-varying rate wherever 
possible (a lot less locking if we don't need to keep min/max)
# Creating Cliff Click versions of counters and using them whenever there's 
concurrent access
# Looking at ThreadLocal cached versions of metrics
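
To make the counter versus time-varying rate item in the list above concrete, here is a small sketch. `SimpleCounter` and `SimpleRate` are hypothetical stand-ins written for this illustration, not the Hadoop metrics2 classes. The point it tries to show is that a counter is a single number that one atomic add can update, while a rate with min/max has several fields that must change together and therefore needs a lock (or something equivalent).

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-ins; these are not the Hadoop metrics2 classes. */
final class CounterVsRate {

    /** A counter is one number, so one atomic add is enough. */
    static final class SimpleCounter {
        private final AtomicLong value = new AtomicLong();

        public void add(long delta) {
            value.addAndGet(delta);
        }

        public long get() {
            return value.get();
        }
    }

    /** A rate with min/max has four fields that must change together, hence the lock. */
    static final class SimpleRate {
        private long ops;
        private long totalTime;
        private long min = Long.MAX_VALUE;
        private long max = Long.MIN_VALUE;

        public synchronized void record(long elapsed) {
            ops++;
            totalTime += elapsed;
            min = Math.min(min, elapsed);
            max = Math.max(max, elapsed);
        }

        public synchronized double average() {
            return ops == 0 ? 0.0 : (double) totalTime / ops;
        }

        public synchronized long min() {
            return min;
        }

        public synchronized long max() {
            return max;
        }
    }

    public static void main(String[] args) {
        SimpleCounter hits = new SimpleCounter();
        SimpleRate latency = new SimpleRate();
        for (long elapsed : new long[] {3, 7, 5}) {
            hits.add(1);
            latency.record(elapsed);
        }
        System.out.println(hits.get() + " ops, avg " + latency.average()
            + ", min " + latency.min() + ", max " + latency.max());
    }
}
```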



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460161#comment-13460161
 ] 

Hadoop QA commented on HBASE-6852:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12545995/onhitcache-trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2911//console

This message is automatically generated.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.96.0

 Attachments: onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460169#comment-13460169
 ] 

stack commented on HBASE-6852:
--

[~chenghao_sh] Is it 0.94.0 that you are running?

[~lhofhansl] Did we fix these in later 0.94s?



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460171#comment-13460171
 ] 

stack commented on HBASE-6852:
--

It doesn't look like it (after taking a look).



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460172#comment-13460172
 ] 

Cheng Hao commented on HBASE-6852:
--

Yes, I ran the profiling on 0.94.0, but the patch is based on trunk. It 
should also work for the later 0.94s.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460174#comment-13460174
 ] 

Cheng Hao commented on HBASE-6852:
--

It's quite similar to https://issues.apache.org/jira/browse/HBASE-6603, but 
per my testing HBASE-6603 doesn't improve things much in my case (a full table 
scan), whereas this fix improved performance considerably (total scan time is 
about 10% shorter).



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460176#comment-13460176
 ] 

Cheng Hao commented on HBASE-6852:
--

stack, do you mean I should submit the patch for 0.94 as well?



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460182#comment-13460182
 ] 

stack commented on HBASE-6852:
--

Patch looks good as does the change in the character of the pasted oprofile 
output.

Did you look at adding a close to AbstractHFileReader that hfile v1 and v2 
reader close could share?  Would that make sense here?

The THRESHOLD_METRICS_FLUSH = 2k seems arbitrary.  Any reason why this number 
in particular?

Nit is that the param name isCompaction is the name of a method that returns a 
boolean result.

+1 on patch.

[~eclark] Mr. Metrics, want to take a look see at this one?








[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460189#comment-13460189
 ] 

Lars Hofhansl commented on HBASE-6852:
--

@Stack: No, this is a different issue. Didn't come up in my profiling since I 
only did cache path (so far).

Good one Cheng.

 SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
 with all of its fields
 

 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
  Labels: performance
 Fix For: 0.94.2, 0.96.0

 Attachments: onhitcache-trunk.patch




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460193#comment-13460193
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Wait. This is the cache hit path we're talking about. Didn't come up in my 
profiling at all.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460198#comment-13460198
 ] 

Lars Hofhansl commented on HBASE-6852:
--

This is the third time that metrics have come up as a performance issue.
Do we have to think about this generally? How perfect do these metrics have to 
be?

(Assuming a 64-bit architecture) we *could* just use plain (not even volatile) 
longs and accept the fact that we'll miss some updates or overwrite others; the 
values would still be in the right ballpark.
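
The trade-off raised here can be illustrated with a small sketch. It is not HBase code: the class names (ExactHitCounter, ApproximateHitCounter, CounterComparison) are hypothetical and exist only to contrast an AtomicLong with a plain long field. On a 64-bit JVM the plain counter should end up at or below the exact count, because racing increments can overwrite each other.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Exact counter: every increment is an atomic read-modify-write. */
class ExactHitCounter {
    private final AtomicLong hits = new AtomicLong();

    void increment() { hits.incrementAndGet(); }

    long get() { return hits.get(); }
}

/** Approximate counter: a plain long, so racing increments may be lost. */
class ApproximateHitCounter {
    private long hits; // deliberately neither volatile nor atomic

    void increment() { hits++; }

    long get() { return hits; }
}

/** Hypothetical harness: several threads bump both counters. */
class CounterComparison {
    /** Returns {exact count, approximate count} after all threads finish. */
    static long[] run(int threads, int hitsPerThread) {
        final ExactHitCounter exact = new ExactHitCounter();
        final ApproximateHitCounter approx = new ApproximateHitCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    exact.increment();
                    approx.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // join() makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { exact.get(), approx.get() };
    }
}
```

Calling CounterComparison.run(4, 100000) always yields 400000 for the exact counter; how far the approximate counter falls short depends on the machine and the JIT, so no particular shortfall should be expected.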




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460199#comment-13460199
 ] 

Lars Hofhansl commented on HBASE-6852:
--

@Cheng: Even with this patch we're still updating an AtomicLong each time we 
get a cache hit, right? I had assumed that that was the slow part. Is it not?




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460224#comment-13460224
 ] 

Cheng Hao commented on HBASE-6852:
--

@stack: It would make more sense to put close() into AbstractHFileReader, but 
I'm not sure whether there are other concerns, since AbstractHFileReader 
doesn't currently have it.

As for THRESHOLD_METRICS_FLUSH = 2k, that is the value I used during my 
testing. I hope it is big enough to reduce the overhead while having little 
impact on getting the metrics snapshot in a timely manner. Sorry, I may not be 
able to give a good empirical number for it.

@Lars: Yes, that's right, we're still updating an AtomicLong each time, but in 
the profiling results I didn't see AtomicLong become a new hotspot, and the 
testing also showed about 10% saved in running time, which may mean the 
overhead of AtomicLong can be ignored.
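
One way to read the batching idea discussed in this comment is sketched below. This is an assumption about the general shape, not the attached patch: only the constant name THRESHOLD_METRICS_FLUSH and its value of 2000 come from the thread, while MetricsRegistrySketch, BatchedCacheHitCounter, and their methods are hypothetical. A cheap local AtomicLong is still bumped on every hit, and the more expensive name-keyed update happens only once per threshold and once more on close().

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a global, name-keyed metrics registry. */
class MetricsRegistrySketch {
    private final ConcurrentHashMap<String, AtomicLong> metrics = new ConcurrentHashMap<>();

    void add(String name, long delta) {
        metrics.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    long get(String name) {
        AtomicLong value = metrics.get(name);
        return value == null ? 0L : value.get();
    }
}

/** Accumulates cache hits locally and flushes them to the registry in batches. */
class BatchedCacheHitCounter {
    static final int THRESHOLD_METRICS_FLUSH = 2000; // value mentioned in the thread

    private final MetricsRegistrySketch registry;
    private final String metricName;
    private final AtomicLong pending = new AtomicLong();

    BatchedCacheHitCounter(MetricsRegistrySketch registry, String metricName) {
        this.registry = registry;
        this.metricName = metricName;
    }

    /** Hot path: one atomic increment, and a keyed update only at the threshold. */
    void onCacheHit() {
        if (pending.incrementAndGet() >= THRESHOLD_METRICS_FLUSH) {
            flush();
        }
    }

    /** Moves whatever has accumulated into the shared registry. */
    void flush() {
        long delta = pending.getAndSet(0);
        if (delta != 0) {
            registry.add(metricName, delta);
        }
    }

    /** Flush the remainder so hits below the threshold are not dropped. */
    void close() {
        flush();
    }
}
```

The cost of this design is staleness: a metrics snapshot can lag by up to THRESHOLD_METRICS_FLUSH - 1 hits per counter until the next flush or close(), which is the timeliness concern mentioned in the comment.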



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460244#comment-13460244
 ] 

stack commented on HBASE-6852:
--

bq. Do we have to think about this generally? How perfect do these metrics have 
to be?

In 0.94 we started recording way more than previously.

I like your question on how perfect they need to be.  For metrics that are 
frequently incremented by 1, my guess is we could miss a few.

Why are we using atomic longs anyway and not Cliff Click's high-scale-lib... 
it's in our CLASSPATH...
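
For readers without high-scale-lib at hand, the general idea of a contention-friendly counter can be sketched with java.util.concurrent.atomic.LongAdder from the JDK. This is a swapped-in technique: LongAdder only arrived in Java 8, well after this thread, and it is not the high-scale-lib API. The class name StripedHitCounterDemo is hypothetical. A LongAdder spreads increments across internal cells under contention and adds them up on read.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical demo: many threads incrementing one striped counter. */
class StripedHitCounterDemo {
    /** Returns the total number of hits recorded by all threads. */
    static long countHits(int threads, int hitsPerThread) {
        final LongAdder hits = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    hits.increment(); // contended updates spread over cells
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return hits.sum(); // exact once all writers have finished
    }
}
```

Unlike the plain-long idea, no updates are lost here; the trade-off is that sum() is only a consistent total when no increments are in flight.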
