Btw, I have tried different numbers of rows with similar symptoms on the bad RS.
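
To take Phoenix out of the picture while comparing the good and bad RS, timing a raw scan with the plain 0.94 client is probably the simplest test. A minimal sketch (t_96 is the table from the thread below; the caching value of 1000 is just a guess, the 0.94 default is 1):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class RawScanTimer {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "t_96");
        Scan scan = new Scan();
        scan.setCaching(1000);       // rows fetched per RPC; the 0.94 default is 1
        scan.setCacheBlocks(false);  // don't pollute the block cache for a one-off scan
        long start = System.currentTimeMillis();
        long rows = 0;
        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result r : scanner) {
                rows++;
            }
        } finally {
            scanner.close();
            table.close();
        }
        System.out.println(rows + " rows in " + (System.currentTimeMillis() - start) + " ms");
    }
}

If the same RegionServers are 2-3x slower on a raw scan as well, the problem is below Phoenix.
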
On Sat, Dec 21, 2013 at 10:28 PM, Kristoffer Sjögren <[email protected]> wrote:

> @pradeep scanner caching should not be an issue since the data transferred to
> the client is tiny.
>
> @lars Yes, the data might be small for this particular case :-)
>
> I have checked everything I can think of on the RS (CPU, network, HBase
> console, uptime etc.) and nothing stands out, except for the pings (network
> pings).
> There are 5 regions on 7, 18, 19, and 23; the others have 4.
> hdfsBlocksLocalityIndex=100 on all RS (was that the correct metric?)
>
> -Kristoffer
>
>
> On Sat, Dec 21, 2013 at 9:44 PM, lars hofhansl <[email protected]> wrote:
>
>> Hi Kristoffer,
>> For this particular problem: are many regions on the same RegionServers?
>> Did you profile those RegionServers? Anything weird on that box?
>> Slower pings might well be an issue. How's the data locality? (You can
>> check on a RegionServer's overview page.)
>> If needed, you can issue a major compaction to reestablish local data on
>> all RegionServers.
>>
>> 32 cores matched with only 4G of RAM is a bit weird, but with your tiny
>> dataset it doesn't matter anyway.
>>
>> 10m rows across 96 regions is just about 100k rows per region. You won't
>> see many of the nice properties of HBase.
>> Try with 100m (or better, 1bn) rows. Then we're talking. For anything
>> below this you wouldn't want to use HBase anyway.
>> (100k rows I could scan on my phone with a Perl script in less than 1s.)
>>
>> With "ping" do you mean an actual network ping, or some operation on top of
>> HBase?
>>
>> -- Lars
>>
>>
>> ________________________________
>> From: Kristoffer Sjögren <[email protected]>
>> To: [email protected]
>> Sent: Saturday, December 21, 2013 11:17 AM
>> Subject: Performance tuning
>>
>> Hi
>>
>> I have been performance tuning HBase 0.94.6 running Phoenix 2.2.0 for the
>> last couple of days and need some help.
>>
>> Background:
>>
>> - 23-machine cluster, 32 cores, 4GB heap per RS.
>> - Table t_24 has 24 online regions (24 salt buckets).
>> - Table t_96 has 96 online regions (96 salt buckets).
>> - 10.5 million rows per table.
>> - Count query: select count(*) from ...
>> - Group by query: select A, B, C, sum(D) from ... where (A = 1 and T >= 0
>>   and T <= 2147482800) group by A, B, C;
>>
>> What I found ultimately is that region servers 19, 20, 21, 22 and 23 are
>> consistently 2-3x slower than the others. This hurts overall latency pretty
>> badly since queries are executed in parallel on the RS and then aggregated
>> at the client (through Phoenix). In Hannibal, regions are spread out evenly
>> over region servers according to salt buckets (a Phoenix feature that
>> pre-creates regions and adds a rowkey prefix).
>>
>> As far as I can tell, there is no network or hardware configuration
>> divergence between the machines. No CPU, network or other notable
>> divergence in Ganglia. No RS metric differences in the HBase master console.
>>
>> The only thing that may be of interest is that pings (within the cluster)
>> to the bad RS are about 2-3x slower, around 0.050ms vs 0.130ms. Not sure if
>> this is significant, but I get a bad feeling about it since it matches
>> exactly with the RS that stood out in my performance tests.
>>
>> Any ideas of how I might find the source of this problem?
>>
>> Cheers,
>> -Kristoffer
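
For reference, the major compaction lars suggests above can be issued from the shell (major_compact 't_96') or from the Java client. A minimal sketch with the 0.94 HBaseAdmin API, again assuming table t_96:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CompactTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            // Asynchronous request; each RS rewrites its store files locally,
            // which is what restores HDFS block locality.
            admin.majorCompact("t_96");
        } finally {
            admin.close();
        }
    }
}

The request is asynchronous, so hdfsBlocksLocalityIndex only recovers once the compactions actually finish on each RS.
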
