[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461559#comment-13461559 ]
Cheng Hao commented on HBASE-6852:
----------------------------------
{quote} Cheng Hao: you said that your dataset size was 600GB, and the total
amount of block cache was presumably much smaller than that, which makes me
think the workload should have been I/O-bound. What was the CPU utilization on
your test? What was the disk throughput?
{quote}
Actually, it is CPU-bound; CPU utilization is more than 80%.
I have 4 machines, each with 12 disks and 24 CPU cores.
In addition, to make the test more effective, I split the regions twice and then
ran a major compaction to ensure data locality. After that, I ran the data
scanning tests using a Hive query like "select count() from xxx".
I am also curious whether there is any overhead from thread or syscall switching
(for example, during IPC). PS: I did set "hbase.client.scanner.caching" to 1000.
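
To put the scanner caching setting in perspective, here is a rough sketch of how
the number of scanner round trips shrinks as the caching value grows. It uses no
HBase classes; the class name, the method name, and the row count are
hypothetical, and the estimate ignores the scanner open/close calls and the
final empty fetch.

```java
// Illustrative only: estimates scanner round trips; no HBase classes are used.
final class ScannerRpcEstimate {

    /** Round trips needed to fetch `rows` rows at `caching` rows per call. */
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching > 0");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }

    public static void main(String[] args) {
        long rows = 1_000_000_000L; // hypothetical row count, not taken from the test
        System.out.println("caching=1:    " + roundTrips(rows, 1));
        System.out.println("caching=1000: " + roundTrips(rows, 1000));
    }
}
```

With a caching value of 1000, the per-call IPC cost is spread over up to 1000
rows, so any remaining thread or syscall switching overhead would have to come
from somewhere other than the raw number of scanner calls.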
> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table
> with all of its fields
> ------------------------------------------------------------------------------------------------
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 0.94.0
> Reporter: Cheng Hao
> Priority: Minor
> Labels: performance
> Fix For: 0.94.3, 0.96.0
>
> Attachments: onhitcache-trunk.patch
>
>
> SchemaMetrics.updateOnCacheHit costs too much during a full table scan.
> Here are the top 5 hotspots within the regionserver while fully scanning a
> table (sorry for the poor formatting):
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> mask of 0x00 (No unit mask) count 5000000
> samples % image name symbol name
> -------------------------------------------------------------------------------
> 98447 13.4324 14033.jo void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
> 98447 100.000 14033.jo void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> -------------------------------------------------------------------------------
> 45814 6.2510 14033.jo int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
> 45814 100.000 14033.jo int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 43523 5.9384 14033.jo boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
> 43523 100.000 14033.jo boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> -------------------------------------------------------------------------------
> 42548 5.8054 14033.jo int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
> 42548 100.000 14033.jo int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 40572 5.5358 14033.jo int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
> 40572 100.000 14033.jo int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
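
The profile above attributes the largest share of samples to the metric update
itself. One general way to make a per-event metric update cheaper is to count
locally and publish to the shared metric in batches. The sketch below shows that
idea only; it is not claimed to be what the attached patch does, and
BatchedHitCounter, its threshold, and the AtomicLong standing in for the shared
metric are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch; BatchedHitCounter is a hypothetical class, not HBase code.
final class BatchedHitCounter {
    private final AtomicLong shared;  // stands in for the shared metric
    private final int flushThreshold; // hypothetical batch size
    private int pending;              // hits not yet published; one thread per instance

    BatchedHitCounter(AtomicLong shared, int flushThreshold) {
        if (flushThreshold <= 0) {
            throw new IllegalArgumentException("flushThreshold must be > 0");
        }
        this.shared = shared;
        this.flushThreshold = flushThreshold;
    }

    /** Records one hit; the shared counter is touched once per flushThreshold hits. */
    void onHit() {
        if (++pending >= flushThreshold) {
            flush();
        }
    }

    /** Publishes any pending hits to the shared counter. */
    void flush() {
        if (pending > 0) {
            shared.addAndGet(pending);
            pending = 0;
        }
    }

    public static void main(String[] args) {
        AtomicLong total = new AtomicLong();
        BatchedHitCounter counter = new BatchedHitCounter(total, 2000);
        for (int i = 0; i < 5000; i++) {
            counter.onHit();
        }
        System.out.println("published before final flush: " + total.get());
        counter.flush(); // publish the remainder, e.g. when a scanner closes
        System.out.println("published after final flush:  " + total.get());
    }
}
```

The trade-off is that the shared value lags behind by up to one batch until
flush() is called, which is usually acceptable for monitoring counters.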
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira