I've been profiling HBase with my own benchmark but I'm having trouble interpreting some of the results. Maybe someone can help shed some light.

The benchmark:

* Test table consists of 10 columns all of the same family. All values are random selection from 10 city names.
* My read method scans all rows on 3 of those columns using HClient.
* I run HMaster and one HRegion in separate JVM's (for more controlled profiling), both using local FS, not HDFS, all on a single machine. * Each test is cold: run right after HRegion startup with nothing in HMemCache.
* hbase-site.xml is attached
* profile *.cpu files are large, i'll send them individually if requested

Profiling:
I'm using YourKit 6.0.15 to profile just the CPU time of HRegionServer running in its own JVM.

With a table of size 10k:
   - Scan took 5220ms.

With a table of size 100k:
   - Scan took 29366ms.

Obervations:

Within the combined RPC$server.call() of all Handler threads, HRegionServer.next() took about 90% in both the 10K and 100K cases. This is as expected since it's called 10K/100K times. The biggest bottleneck bellow that is HStoreKey.extractFamily() which is 33%/39% of HRegionServer.next() which is more expensive than all the IO operations in HStoreScanner.getNext() (31%/31%). Couldn't there be a cheaper way to extract family from a col key?

Outside the RPC$server.call(),

Server$Connection.readAndProcess() takes about 17%/22% of the time that HRegionServer.next() takes suggesting a heavy overhead in RPC IO and unmarshalling. Overhead in the other direction, ObjectWritable.write() + DataOutputStream.flush() take 63%/57% of the time that HRegionServer.next() takes. The ratio of input/output makes sense since more data is returned than sent. But this seems like a lot of RPC overhead.

I may be incorrect, but my understanding of YourKit (latest build) is that it does not count the time that threads spend blocked. So it seems quite a mystery to me that the following methods consume so much cpu time (not counting methods called from within them).

Server$Handler.run()  68%
SelectorImpl.select() 7%

Relative to RPC$Server.call which is only 2%

In previous profiles i've observed that Server$Listener.run() consumed most of the time instead of Server$Handler.run(). Both those thread synchronize on the callQueue and I'm sure that synchronization has something to do with this result. Unless i'm wrong about my YourKit assumption, i don't see why sync blocking should show up like this. Does anyone know?

As for SelectorImpl.select(), I looked it up but don't completely understand what's happening under the hood. I know that it is blocking too while waiting for a connection but I don't see why it should be consuming so much CPU.

If anyone else has done YourKit profiling on HBase, please let me know if you've found similar results or can help me clarify my results.

Right now, if i assume that YourKit is counting block time and I ignore those numbers, I come to the following conclusions:

It looks like HStoreKey.extractFamily() is a bottleneck. For 100k rows and 10 cols, it will be called over 1,000,000 times during a full table scan (even if just retrieving one col) and each call requires multiple byte[] <-> string conversions.

RPC overhead can be quite high during a scan. Could batched/read-ahead scanning be useful to minimize RPC calls?






<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hbase.master</name>
        <value>172.30.20.15:60000</value>
        <description>The host and port that the HBase master runs at.
            TODO: Support 'local' (All running in single context).
        </description>
    </property>
    <property>
        <name>hbase.regionserver</name>
        <value>172.30.20.15:60010</value>
        <description>The host and port a HBase region server runs at.
        </description>
    </property>
    <property>
        <name>hbase.regiondir</name>
        <value>hbase</value>
        <description>The directory shared by region servers.
        </description>
    </property>
    <property>
        <name>hbase.hregion.max.filesize</name>
        <value>134217728</value>
        <description>
            Maximum desired file size for an HRegion.  If filesize exceeds
            value + (value / 2), the HRegion is split in two.  Default: 128M (134217728).
        </description>
    </property>
    <property>
        <name>hbase.hregion.compactionThreshold</name>
        <value>10</value>
        <description>
            By default, we compact the region if an HStore has more than 10 map files.
        </description>
    </property>
    <property>
        <name>hbase.regionserver.thread.splitcompactcheckfrequency</name>
        <value>40000</value>
        <description>
            Default: 60 * 1000
        </description>
    </property>
    
    <property>
        <name>hbase.client.pause</name>
        <value>2000</value>
    </property>
    <property>
        <name>hbase.client.timeout.number</name>
        <value>30000</value>
        <description>Try this many timeouts before giving up.
        </description>
    </property>
    <property>
        <name>hbase.client.retries.number</name>
        <value>15</value>
        <description>Count of maximum retries fetching the root region from root
            region server.
        </description>
    </property>
    <property>
        <name>hbase.master.meta.thread.rescanfrequency</name>
        <value>60000</value>
        <description>How long the HMaster sleeps (in milliseconds) between scans of
            the root and meta tables.
        </description>
    </property>
    <property>
        <name>hbase.master.lease.period</name>
        <value>30000</value>
        <description>HMaster server lease period in milliseconds. Default is
            30 seconds.</description>
    </property>
    <property>
        <name>hbase.server.thread.wakefrequency</name>
        <value>5000</value>
        <description>Time to sleep in between searches for work (in milliseconds).
            Used as sleep interval by service threads such as META scanner and log roller.
        </description>
    </property>
    <property>
        <name>hbase.regionserver.lease.period</name>
        <value>30000</value>
        <description>HRegion server lease period in milliseconds. Default is
            30 seconds.</description>
    </property>
    <property>
        <name>hbase.regionserver.handler.count</name>
        <value>10</value>
        <description>Count of RPC Server instances spun up on RegionServers
            Same property is used by the HMaster for count of master handlers.
            Default is 10.
        </description>
    </property>
    <property>
        <name>hbase.regionserver.msginterval</name>
        <value>500</value>
        <description>Interval between messages from the RegionServer to HMaster
            in milliseconds.  Default is 15 sec. Set this value low if you want unit
            tests to be responsive.
        </description>
    </property>
    
</configuration>

Reply via email to