[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460786#comment-13460786 ]
Elliott Clark commented on HBASE-6852:
--------------------------------------

bq. Aggregating stuff locally and pushing to metrics seems ideal

With that comes a lot of bookkeeping and potential places to leak memory (if we use strong references) or to lose metrics data (if we use weak references). I'm not sure that the perf gain will be high enough to justify that.

Since we already shim a lot to the metrics2 classes, it seems like using the high-scale-lib counters to create concurrent versions of the MetricMutableCounter{Long|Int} would stop most cache contention pretty easily.

For me, this seems like the order of cost vs. benefit:
# Aggregating metrics locally before pushing to the metrics system whenever possible
# Using the hashmap less (This is already happening in the metrics2 move over. See [MasterMetricsSourceImpl|https://github.com/apache/hbase/blob/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java] for how known metrics are staying away from the hashmap)
# Changing metrics to use counters rather than time varying rate wherever possible (Lots less locking if we don't need to keep min/max)
# Creating Cliff Click versions of Counters and using them whenever there's concurrent access
# Looking at ThreadLocal cached versions of metrics.

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6852
>                 URL: https://issues.apache.org/jira/browse/HBASE-6852
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 0.94.0
>            Reporter: Cheng Hao
>            Priority: Minor
>              Labels: performance
>             Fix For: 0.94.3, 0.96.0
>
>         Attachments: onhitcache-trunk.patch
>
>
> The SchemaMetrics.updateOnCacheHit costs too much while I am doing the full table scanning.
> Here is the top 5 hotspots within regionserver while full scanning a table: (Sorry for the less-well-format)
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 5000000
> samples  %        image name  symbol name
> -------------------------------------------------------------------------------
> 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447    100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> -------------------------------------------------------------------------------
> 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814    100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523    100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> -------------------------------------------------------------------------------
> 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548    100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572    100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
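
For reference, the "concurrent counter" idea from the comment could look roughly like the sketch below. It is only an illustration under a few assumptions: it uses the JDK's java.util.concurrent.atomic.LongAdder (Java 8+) as a stand-in for the high-scale-lib counters mentioned in the comment, and the class name ConcurrentCounterMetric and its methods are hypothetical, not part of HBase or the metrics2 API.

```java
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical counter metric. The name and methods are illustrative only
 * and are not part of HBase or the Hadoop metrics2 API.
 */
final class ConcurrentCounterMetric {
    // LongAdder spreads updates over several internal cells, so threads that
    // increment concurrently rarely contend on the same memory location.
    // (Stand-in for a high-scale-lib counter.)
    private final LongAdder value = new LongAdder();

    /** Hot path: called by handler threads, e.g. on every cache hit. */
    void incr() {
        value.increment();
    }

    /** Hot path: add an arbitrary delta. */
    void incr(long delta) {
        value.add(delta);
    }

    /** Cold path: called when the metrics system takes a snapshot. */
    long snapshot() {
        return value.sum();
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentCounterMetric cacheHits = new ConcurrentCounterMetric();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 100_000; n++) {
                    cacheHits.incr();
                }
            });
            workers[i].start();
        }
        // join() guarantees all increments are visible before we read the sum.
        for (Thread t : workers) {
            t.join();
        }
        System.out.println("cache hits: " + cacheHits.snapshot());
    }
}
```

The trade-off is that reads (sum) are more expensive than writes, which fits metrics well: updates happen on every request, while snapshots are taken only periodically.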