[jira] [Updated] (HBASE-7381) Lightweight data transfer for Class Result
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-7381:
-----------------------------
    Attachment: result_lightweight_copy_v2.patch

Thanks, Lars. I didn't notice that in trunk. It would be helpful if the change could be applied to 0.94 as well. I created a new patch with slightly different changes compared to the trunk one, but kept the same method name.

> Lightweight data transfer for Class Result
> ------------------------------------------
>
>                 Key: HBASE-7381
>                 URL: https://issues.apache.org/jira/browse/HBASE-7381
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client
>            Reporter: Cheng Hao
>            Priority: Trivial
>             Fix For: 0.94.4
>         Attachments: result_lightweight_copy_v2.patch
>
> Currently, transferring data between two Result objects in the same process causes additional, unnecessary parsing and copying: we have to go through Writables.copyWritable(result1, result2), which internally performs serialization, data copying, and deserialization.
> This use case is quite common when integrating with Hadoop jobs. The org.apache.hadoop.mapred.RecordReader protocol defined in Hadoop provides three methods:
> 1) K createKey();
> 2) V createValue();
> 3) boolean next(K key, V value) throws IOException;
> An implementation of the third method typically needs to fill the caller-supplied value (a Result object) from the Result object returned by HBase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
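For context, the difference between the two copy paths can be sketched with a toy example. This is only a sketch: LiteResult, raw(), and the copy(...) method below are simplified stand-ins invented for illustration, not HBase's actual Result API, and the contents of result_lightweight_copy_v2.patch may differ. The point is that a same-process copy can simply share the backing cell array instead of round-tripping every byte through serialization the way Writables.copyWritable does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Simplified, hypothetical stand-in for org.apache.hadoop.hbase.client.Result.
final class LiteResult {
    private byte[][] kvs; // backing cells

    LiteResult(byte[][] kvs) { this.kvs = kvs; }

    byte[][] raw() { return kvs; }

    // Heavyweight path, analogous to Writables.copyWritable:
    // serialize the source, then deserialize into the target.
    static void copyViaSerialization(LiteResult from, LiteResult to) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeInt(from.kvs.length);
            for (byte[] kv : from.kvs) {
                out.writeInt(kv.length);
                out.write(kv);
            }
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(bos.toByteArray()));
            byte[][] copy = new byte[in.readInt()][];
            for (int i = 0; i < copy.length; i++) {
                copy[i] = new byte[in.readInt()];
                in.readFully(copy[i]);
            }
            to.kvs = copy; // every byte was written, re-read, and re-allocated
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Lightweight path: inside one process the target can adopt the source's
    // backing array directly -- no serialization, no per-cell allocation.
    static void copy(LiteResult from, LiteResult to) {
        to.kvs = from.kvs;
    }
}
```

The lightweight path is safe only because both objects live in the same JVM and the backing cells are treated as immutable after a scan, which is exactly the RecordReader.next(K, V) situation described above.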
[jira] [Updated] (HBASE-7381) Lightweight data transfer for Class Result
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-7381:
-----------------------------
    Attachment: (was: result_lightweight_copy.patch)
[jira] [Commented] (HBASE-7381) Lightweight data transfer for Class Result
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535676#comment-13535676 ]

Cheng Hao commented on HBASE-7381:
----------------------------------
Oh, sorry, I will take care of that next time. Thanks, Lars.
[jira] [Created] (HBASE-7381) Lightweight data transfer for Class Result
Cheng Hao created HBASE-7381:
-----------------------------

             Summary: Lightweight data transfer for Class Result
                 Key: HBASE-7381
                 URL: https://issues.apache.org/jira/browse/HBASE-7381
             Project: HBase
          Issue Type: Improvement
          Components: Client
            Reporter: Cheng Hao
            Priority: Trivial
[jira] [Updated] (HBASE-7381) Lightweight data transfer for Class Result
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-7381:
-----------------------------
    Attachment: result_lightweight_copy.patch

Provides a new API in the Result class.
[jira] [Updated] (HBASE-7381) Lightweight data transfer for Class Result
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-7381:
-----------------------------
    Fix Version/s: 0.94.4
           Status: Patch Available  (was: Open)
[jira] [Commented] (HBASE-7381) Lightweight data transfer for Class Result
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534679#comment-13534679 ]

Cheng Hao commented on HBASE-7381:
----------------------------------
@Yu The following code is from the Hive HBaseHandler; note the part I commented out.

{code:title=HiveHBaseTableInputFormat.java|borderStyle=solid}
@Override
public boolean next(ImmutableBytesWritable rowKey, Result value) throws IOException {
  boolean next = false;
  try {
    next = recordReader.nextKeyValue();
    if (next) {
      rowKey.set(recordReader.getCurrentValue().getRow());
      Writables.copyWritable(recordReader.getCurrentValue(), value);
      // Result.copy(recordReader.getCurrentValue(), value);
    }
  } catch (InterruptedException e) {
    throw new IOException(e);
  }
  return next;
}
{code}
[jira] [Commented] (HBASE-7381) Lightweight data transfer for Class Result
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534688#comment-13534688 ]

Cheng Hao commented on HBASE-7381:
----------------------------------
Thanks, @Yu. Once this patch is applied, I will create another JIRA issue for Hive.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490649#comment-13490649 ]

Cheng Hao commented on HBASE-6852:
----------------------------------
@Lars, thank you for committing. The snapshot of the 0.94 branch code improves scanning by about 17.7% in my case, and HBASE-6032 certainly helps a lot. Here are the new hotspots for the RegionServer via OProfile:

{code:title=Hotspots|borderStyle=solid}
CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
samples  %        image name  symbol name
183371   17.1144  4465.jo     int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
63267     5.9049  4465.jo     org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue)
59762     5.5777  4465.jo     byte[] org.apache.hadoop.hbase.KeyValue.createByteArray(byte[], int, int, byte[], int, int, byte[], int, int, long, org.apache.hadoop.hbase.KeyValue$Type, byte[], int, int)
50975     4.7576  4465.jo     int org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.blockSeek(byte[], int, int, boolean)
50891     4.7498  4465.jo     void org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek()
38257     3.5706  4465.jo     jbyte_disjoint_arraycopy
37973     3.5441  4465.jo     boolean org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(boolean, org.apache.hadoop.hbase.KeyValue, boolean, boolean)~1
33978     3.1712  4465.jo     void org.apache.hadoop.util.PureJavaCrc32C.update(byte[], int, int)
{code}

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6852
>                 URL: https://issues.apache.org/jira/browse/HBASE-6852
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 0.94.0
>            Reporter: Cheng Hao
>            Assignee: Cheng Hao
>            Priority: Minor
>              Labels: performance
>             Fix For: 0.94.3
>         Attachments: 6852-0.94_2.patch, 6852-0.94_3.patch, 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch
>
> The SchemaMetrics.updateOnCacheHit costs too much while I am doing a full table scan. Here are the top 5 hotspots within the RegionServer during the scan (sorry for the rough formatting):
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
> samples  %        image name  symbol name
> -------------------------------------------------------------------------------
> 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447  100.000  14033.jo      void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> -------------------------------------------------------------------------------
> 45814     6.2510  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814  100.000  14033.jo      int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 43523     5.9384  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523  100.000  14033.jo      boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> -------------------------------------------------------------------------------
> 42548     5.8054  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548  100.000  14033.jo      int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 40572     5.5358  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572  100.000  14033.jo      int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6852:
-----------------------------
    Attachment: 6852-0.94_3.patch

Lars, Ted: there was indeed a bug in the v2 patch; please take 6852-0.94_3.patch instead. It passed all of the metrics unit tests locally. Hopefully the weird failure won't happen again. Thanks.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489861#comment-13489861 ]

Cheng Hao commented on HBASE-6852:
----------------------------------
Ouch! Still failed, and I still couldn't access the build server. Is there a problem with the build server?
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488549#comment-13488549 ]

Cheng Hao commented on HBASE-6852:
----------------------------------
Still failed, and I cannot open the URL https://builds.apache.org/job/HBase-0.94/562/ - not sure if there is any problem with the build server.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489176#comment-13489176 ]

Cheng Hao commented on HBASE-6852:
----------------------------------
Thanks, Lars and Ted. I will try to reproduce the failure locally first, and then see whether there is a logical bug in the schema metrics flushing.
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6852:
-----------------------------
    Attachment: 6852-0.94_2.patch

Please take the 6852-0.94_2.patch. I found a small bug while updating the ALL_CATEGORY metrics in SchemaMetrics.incrNumericMetric(BlockCategory blockCategory, boolean isCompaction, BlockMetricType metricType, long amount), and I also added flushMetrics() to SchemaMetrics.getMetricsSnapshot(). It now passes the unit tests locally. Thank you.
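The ALL_CATEGORY fix described above is about an invariant between per-category counters and the aggregate bucket: every increment to a specific category must also reach the aggregate, or the aggregate stops equaling the sum of the categories. A toy sketch of that invariant follows; the class and enum names are invented for illustration and are not the actual SchemaMetrics code.

```java
import java.util.EnumMap;
import java.util.Map;

// Toy model: per-category counters plus an ALL aggregate that must stay equal
// to the sum over the specific categories.
final class CategoryMetrics {
    enum Category { DATA, INDEX, BLOOM, ALL }

    private final Map<Category, Long> counts = new EnumMap<>(Category.class);

    CategoryMetrics() {
        for (Category c : Category.values()) counts.put(c, 0L);
    }

    void incr(Category category, long amount) {
        counts.merge(category, amount, Long::sum);
        if (category != Category.ALL) {
            // The easy line to forget: keep the aggregate in step.
            counts.merge(Category.ALL, amount, Long::sum);
        }
    }

    long get(Category category) { return counts.get(category); }
}
```

A unit test for such code naturally asserts get(ALL) against the sum of the per-category values, which is how this kind of omission shows up as a metrics test failure.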
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486068#comment-13486068 ]

Cheng Hao commented on HBASE-6852:
----------------------------------
Oh, sorry for that, I will resolve it as soon as possible.
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6852:
-----------------------------
    Attachment: (was: AtomicTest.java)
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6852:
-----------------------------
    Attachment: metrics_hotspots.png

Attached a sample call graph captured via VisualVM. It seems the bottleneck is SchemaMetrics.incrNumericMetric() itself, and the Map lookup is another hotspot. Do recursive calls really carry that much overhead? Interesting.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13476695#comment-13476695 ]

Cheng Hao commented on HBASE-6852:
----------------------------------

Sorry, I just read an article: the self time may not be accurate in sampling results, since a modern JVM may inline function calls. Still, the sampling call graph suggests that ConcurrentHashMap.get() is one of the hotspots, which may also explain why the patch reduced the overhead.
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6852:
-----------------------------
    Attachment: AtomicTest.java

I tested AtomicLong, Counter, and a normal function call; the results are:

On my laptop (Windows 7, 64-bit JDK 1.7, Core i5-2540M CPU @ 2.60GHz):
AtomicTest: 1429ms,1
AtomicTest: 1433ms,1
AtomicTest: 1445ms,1
CounterTest: 6659ms,1
CounterTest: 6609ms,1
CounterTest: 6486ms,1
NormalTest(Function): 238ms,1
NormalTest(Function): 237ms,1
NormalTest(Function): 230ms,1

On my server (Linux, 64-bit JDK 1.7, Intel(R) Xeon(R) CPU L5640 @ 2.27GHz):
AtomicTest: 1344ms,1
AtomicTest: 1220ms,1
AtomicTest: 1085ms,1
CounterTest: 1518ms,1
CounterTest: 1438ms,1
CounterTest: 1815ms,1
NormalTest(Function): 94ms,1
NormalTest(Function): 89ms,1
NormalTest(Function): 89ms,1

In both environments, Counter is slower than AtomicLong.
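The AtomicTest.java attachment itself is not inlined in this thread. As a hedged reconstruction (only the class name comes from the attachment; the iteration count, loop structure, and the synchronized stand-in for the Counter class are my assumptions, since HBase's Counter is presumably the high-scale-lib class and not on a plain JDK classpath), a single-threaded microbenchmark of the three increment styles might look like:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical reconstruction of the AtomicTest.java attachment: time three
// ways of bumping a counter in a tight single-threaded loop.
public class AtomicTest {
    static final long N = 10_000_000L;                 // iteration count assumed
    static final AtomicLong atomic = new AtomicLong(); // CAS-based counter
    static long plain;                                 // plain field increment
    static long synced;                                // stand-in for Counter

    static synchronized void incrSynced() { synced++; }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        for (long i = 0; i < N; i++) atomic.incrementAndGet();
        System.out.println("AtomicTest: " + (System.nanoTime() - t0) / 1_000_000 + "ms");

        t0 = System.nanoTime();
        for (long i = 0; i < N; i++) incrSynced();
        System.out.println("CounterTest(stand-in): " + (System.nanoTime() - t0) / 1_000_000 + "ms");

        t0 = System.nanoTime();
        for (long i = 0; i < N; i++) plain++;
        System.out.println("NormalTest(Function): " + (System.nanoTime() - t0) / 1_000_000 + "ms");
    }
}
```

As the later comments in this thread note, timings from a loop like this are dominated by JIT decisions (the plain increment can even be hoisted away entirely), so the numbers are indicative at best.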
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471150#comment-13471150 ]

Cheng Hao commented on HBASE-6852:
----------------------------------

I re-ran the scanning tests with and without the patch; the patched version's total running time is still about 10% shorter.

The oprofile result of the un-patched version (top 4):

samples  %        image name  symbol name
54182    14.6977  23960.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.incrNumericMetric(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean, org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics$BlockMetricType) [self]
43949    11.9219  23960.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
20725     5.6220  23960.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator) [self]
17554     4.7618  23960.jo    org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(long, long, boolean, boolean, boolean, org.apache.hadoop.hbase.io.hfile.BlockType) [self]

And the oprofile result of the patched version (top 4):

samples  %        image name  symbol name
53716    11.9679  3683.jo     int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator) [self]
34921     7.7804  3683.jo     int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
31446     7.0061  3683.jo     org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(long, long, boolean, boolean, boolean, org.apache.hadoop.hbase.io.hfile.BlockType) [self]
20126     4.4841  3683.jo     org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue) [self]

Perhaps the function call itself costs too much (stack pushing/popping, etc.), and the patch simply removes unnecessary function calls.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471151#comment-13471151 ]

Cheng Hao commented on HBASE-6852:
----------------------------------

Sorry, please check the attached AtomicTest.java, which compares the performance of AtomicLong / Counter / a normal function call.
[jira] [Updated] (HBASE-6805) Extend co-processor framework to provide observers for filter operations
[ https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6805:
-----------------------------
    Attachment: (was: extend_coprocessor.patch)

Extend co-processor framework to provide observers for filter operations
-------------------------------------------------------------------------

                 Key: HBASE-6805
                 URL: https://issues.apache.org/jira/browse/HBASE-6805
             Project: HBase
          Issue Type: Sub-task
          Components: Coprocessors
    Affects Versions: 0.96.0
            Reporter: Jason Dai
         Attachments: extend_coprocessor.patch

There are several filter operations (e.g., filterKeyValue, filterRow, transform) on the region server side that either exclude KVs from the returned results or transform the returned KV. We need to provide observers (e.g., preFilterKeyValue and postFilterKeyValue) for these operations, in the same way as the observers for other data-access operations (e.g., preGet and postGet). This extension is needed to support DOT (e.g., extracting individual fields from the document in the observers before passing them to the related filter operations).

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6805) Extend co-processor framework to provide observers for filter operations
[ https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6805:
-----------------------------
    Attachment: extend_coprocessor.patch
[jira] [Commented] (HBASE-6805) Extend co-processor framework to provide observers for filter operations
[ https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13467428#comment-13467428 ]

Cheng Hao commented on HBASE-6805:
----------------------------------

Thank you, Andrew, for the clarification. I added unit tests (examples) in the new patch file; I hope they help explain the motivation for adding the interface. I will provide a performance test report later.
[jira] [Updated] (HBASE-6805) Extend co-processor framework to provide observers for filter operations
[ https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6805:
-----------------------------
    Attachment: extend_coprocessor.patch

Please check the attached patch; I hope it makes more sense.
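The extend_coprocessor.patch itself is not inlined in this thread. As a rough illustration of the pre/post hook pattern the description proposes (only the names preFilterKeyValue and postFilterKeyValue come from the issue; every type and signature below is an assumption, not HBase's actual coprocessor API), an observer wrapped around a filter decision might look like:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of filter observers; NOT the real HBase coprocessor API.
public class FilterObserverSketch {
    enum ReturnCode { INCLUDE, SKIP }

    interface Filter { ReturnCode filterKeyValue(byte[] cell); }

    interface FilterObserver {
        // Runs before the wrapped filter sees the cell; may rewrite it (e.g.
        // extract an individual field from a document value, the DOT use case).
        byte[] preFilterKeyValue(byte[] cell);
        // Runs after the filter decides; may override the decision.
        ReturnCode postFilterKeyValue(byte[] cell, ReturnCode decision);
    }

    // Region-side logic would invoke the observer chain around each filter call.
    static ReturnCode applyWithObservers(Filter f, List<FilterObserver> obs, byte[] cell) {
        for (FilterObserver o : obs) cell = o.preFilterKeyValue(cell);
        ReturnCode rc = f.filterKeyValue(cell);
        for (FilterObserver o : obs) rc = o.postFilterKeyValue(cell, rc);
        return rc;
    }

    public static void main(String[] args) {
        Filter dropEmpty = c -> c.length == 0 ? ReturnCode.SKIP : ReturnCode.INCLUDE;
        List<FilterObserver> obs = new ArrayList<>();
        obs.add(new FilterObserver() {      // pass-through observer
            public byte[] preFilterKeyValue(byte[] c) { return c; }
            public ReturnCode postFilterKeyValue(byte[] c, ReturnCode rc) { return rc; }
        });
        System.out.println(applyWithObservers(dropEmpty, obs, new byte[0]));
        System.out.println(applyWithObservers(dropEmpty, obs, new byte[]{1}));
    }
}
```

The design mirrors the existing preGet/postGet convention: observers see the cell before the filter and can veto or amend the filter's verdict afterward.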
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13465357#comment-13465357 ]

Cheng Hao commented on HBASE-6852:
----------------------------------

Hi stack, the patch does improve performance in my case; as for the AtomicLong question, maybe we can wait for the next generation of the metrics framework.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13461559#comment-13461559 ]

Cheng Hao commented on HBASE-6852:
----------------------------------

{quote}
Cheng Hao: you said that your dataset size was 600GB, and the total amount of block cache was presumably much smaller than that, which makes me think the workload should have been I/O-bound. What was the CPU utilization on your test? What was the disk throughput?
{quote}

Actually it's CPU-bound, and the utilization is more than 80%. I have 4 machines, each with 12 disks and 24 CPU cores. Besides, to make the test more representative, I split the regions twice and then ran a major compaction to ensure data locality. After that, I ran the scan tests via a Hive query like select count() from xxx; I am also curious whether there is any overhead from thread/syscall switching (e.g., during the IPC).

PS: I did set hbase.client.scanner.caching to 1000.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460280#comment-13460280 ]

Cheng Hao commented on HBASE-6852:
--

Lars, the only place that uses the ConcurrentMap in SchemaMetrics is tableAndFamilyToMetrics. In this patch, I pre-create an array of AtomicLong for all of the possible on-cache-hit metrics items, which avoids the concurrency issue and is easy to index on access. Thanks, stack and Lars, for the suggestions; I will create another patch file instead.
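The pre-created AtomicLong array described in the comment above can be sketched roughly as follows. This is a minimal illustration of the idea, not the actual patch; the class and enum names (CacheHitCounters, BlockCategory slots) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: instead of looking up metrics in a ConcurrentMap on every cache
// hit, pre-create one AtomicLong per (block category, hit/miss) combination
// and index the array directly on the hot path.
class CacheHitCounters {
    enum BlockCategory { DATA, INDEX, BLOOM, META }

    private static final int N = BlockCategory.values().length;
    // Two slots per category: even index = hit, odd index = miss.
    private static final AtomicLong[] COUNTERS = new AtomicLong[N * 2];
    static {
        for (int i = 0; i < COUNTERS.length; i++) {
            COUNTERS[i] = new AtomicLong();
        }
    }

    // Plain array indexing: no map lookup and no key-object allocation
    // on the hot path.
    static void updateOnCacheHit(BlockCategory cat, boolean hit) {
        COUNTERS[cat.ordinal() * 2 + (hit ? 0 : 1)].incrementAndGet();
    }

    static long get(BlockCategory cat, boolean hit) {
        return COUNTERS[cat.ordinal() * 2 + (hit ? 0 : 1)].get();
    }
}
```

The point of the array layout is that the index is computed from an enum ordinal and a boolean, so the hot path never constructs a map key or traverses a ConcurrentMap.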
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460297#comment-13460297 ]

Cheng Hao commented on HBASE-6852:
--

Hi Liang, it's a really good suggestion. Actually, I didn't drop the OS page cache before each launch, but I can try that later. In my tests, the table data was about 600GB across 4 machines; I guess the system cache may not impact the overall performance that much for a full table scan.
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6852:
--
Attachment: onhitcache-trunk.patch

Changed THRESHOLD_METRICS_FLUSH from 2000 to 100, per Lars' suggestion.
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6852:
--
Attachment: (was: onhitcache-trunk.patch)
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460316#comment-13460316 ]

Cheng Hao commented on HBASE-6852:
--

I didn't remove the cacheHits in HFileReaderV1/V2; I hope it's a good start toward designing a lower-overhead metrics framework.
[jira] [Created] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
Cheng Hao created HBASE-6852:
--
Summary: SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
Key: HBASE-6852
URL: https://issues.apache.org/jira/browse/HBASE-6852
Project: HBase
Issue Type: Improvement
Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6852:
--
Attachment: onhitcache-trunk.patch

The fix caches the metrics and flushes every 2000 calls, or when the HFileReader is closed.
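The flush-every-2000-calls approach just described can be sketched as below. This is an illustrative snippet under assumed names (BatchedMetric is not the actual patch class); it only shows the batching idea.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: each reader keeps a plain local counter and only touches the
// shared AtomicLong once every THRESHOLD_METRICS_FLUSH calls, or when the
// reader is closed, so the hot path is a plain long increment.
class BatchedMetric {
    static final int THRESHOLD_METRICS_FLUSH = 2000;

    private final AtomicLong shared; // the global metric counter
    private long pending = 0;        // per-reader, updated without synchronization

    BatchedMetric(AtomicLong shared) {
        this.shared = shared;
    }

    // Hot path: increments a local long; the shared counter is only
    // updated once per THRESHOLD_METRICS_FLUSH calls.
    void onCacheHit() {
        if (++pending >= THRESHOLD_METRICS_FLUSH) {
            flush();
        }
    }

    // Also called from the reader's close(), so no counts are lost.
    void flush() {
        if (pending > 0) {
            shared.addAndGet(pending);
            pending = 0;
        }
    }
}
```

The trade-off is that a metrics snapshot can lag behind by up to THRESHOLD_METRICS_FLUSH events per open reader, which is why the threshold value matters.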
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated HBASE-6852:
--
Fix Version/s: 0.96.0
Status: Patch Available (was: Open)

After applying the fix, oprofile shows the top 8 hotspots as:

samples  %       image name  app name  symbol name
59829    7.9422  17779.jo    java      int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
28571    3.7927  17779.jo    java      int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator) [self]
19331    2.5662  17779.jo    java      org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue) [self]
19063    2.5306  17779.jo    java      void org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek() [self]
18646    2.4752  libjvm.so   java      SpinPause [self] (callers: Monitor::ILock(Thread*), ObjectMonitor::enter(Thread*), VMThread::loop(), and StealTask::do_it(GCTaskManager*, unsigned int) at 99.9785%)
15860    2.1054  17779.jo    java      byte[] org.apache.hadoop.hbase.KeyValue.createByteArray(byte[], int, int, byte[], int, int, byte[], int, int, long, org.apache.hadoop.hbase.KeyValue$Type, byte[], int, int) [self]
14754    1.9586  17779.jo    java      org.apache.hadoop.hbase.io.hfile.Cacheable org.apache.hadoop.hbase.io.hfile.LruBlockCache.getBlock(org.apache.hadoop.hbase.io.hfile.BlockCacheKey, boolean) [self]
13068    1.7348  17779.jo    java      org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(byte[], int, int, org.apache.hadoop.hbase.io.hfile.HFileBlock, boolean, boolean, boolean)~2 [self]
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460172#comment-13460172 ]

Cheng Hao commented on HBASE-6852:
--

Yes, I ran the profiling on 0.94.0, but the patch is based on trunk. It should also work for the later 0.94 releases.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460174#comment-13460174 ]

Cheng Hao commented on HBASE-6852:
--

It's quite similar to https://issues.apache.org/jira/browse/HBASE-6603, but per my testing, HBASE-6603 doesn't improve things that much in my case (a full table scan), while this fix did improve the performance a lot (about 10% shorter total running time).
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460176#comment-13460176 ]

Cheng Hao commented on HBASE-6852:
--

stack, do you mean I should submit the patch for 0.94 as well?
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13460224#comment-13460224 ] Cheng Hao commented on HBASE-6852: -- @stack: it would make more sense to put close() into AbstractHFileReader, but I am not sure whether there is any other concern, since AbstractHFileReader doesn't have it today. As for THRESHOLD_METRICS_FLUSH = 2k, which I used during my testing, I hope it is big enough to reduce the overhead while still keeping metrics snapshots reasonably timely; sorry, I may not be able to give a well-grounded empirical number for it. @Lars: Yes, that's right, we're still updating an AtomicLong each time, but in the profiling result I didn't see the AtomicLong become a new hotspot, and the test also showed a 10% saving in running time, which may mean the overhead of the AtomicLong can be ignored. SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields Key: HBASE-6852 URL: https://issues.apache.org/jira/browse/HBASE-6852 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.0 Reporter: Cheng Hao Priority: Minor Labels: performance Fix For: 0.94.2, 0.96.0 Attachments: onhitcache-trunk.patch The SchemaMetrics.updateOnCacheHit costs too much while I am doing the full table scanning.
-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
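The threshold approach discussed in the comment above can be sketched as follows. This is a minimal illustration of the idea, not the actual onhitcache-trunk.patch: each thread accumulates cache hits in an uncontended thread-local counter and only touches the shared AtomicLong once per batch (here sized like the THRESHOLD_METRICS_FLUSH = 2k mentioned in the comment). All class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Sketch of batched metric updates: instead of hitting a shared AtomicLong
 * on every cache hit, each thread buffers hits locally and flushes to the
 * shared counter once per FLUSH_THRESHOLD hits. Names are hypothetical and
 * do not reflect the actual HBASE-6852 patch.
 */
public class BatchedCacheHitMetric {
    // Comparable to the THRESHOLD_METRICS_FLUSH = 2k used in testing.
    static final int FLUSH_THRESHOLD = 2000;

    private final AtomicLong sharedHits = new AtomicLong();

    // Per-thread pending count: a plain array cell, so incrementing it
    // involves no atomic instruction and no cross-thread contention.
    private final ThreadLocal<long[]> pending =
        ThreadLocal.withInitial(() -> new long[1]);

    public void onCacheHit() {
        long[] local = pending.get();
        if (++local[0] >= FLUSH_THRESHOLD) {
            sharedHits.addAndGet(local[0]); // one contended update per batch
            local[0] = 0;
        }
    }

    /** Flush this thread's remainder, e.g. when a scanner closes. */
    public void flush() {
        long[] local = pending.get();
        if (local[0] > 0) {
            sharedHits.addAndGet(local[0]);
            local[0] = 0;
        }
    }

    public long get() {
        return sharedHits.get();
    }

    public static void main(String[] args) {
        BatchedCacheHitMetric metric = new BatchedCacheHitMetric();
        for (int i = 0; i < 5000; i++) {
            metric.onCacheHit();
        }
        // Two full batches of 2000 were flushed; 1000 hits are still pending.
        System.out.println("before flush: " + metric.get());
        metric.flush();
        System.out.println("after flush:  " + metric.get());
    }
}
```

This also shows the trade-off raised in the comment: between flushes, up to FLUSH_THRESHOLD hits per thread are invisible to a metrics snapshot, which is why the threshold should be large enough to amortize the AtomicLong update but small enough to keep snapshots timely.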