RE: HBase: Paralel Query

Job Thomas Wed, 27 Nov 2013 00:42:49 -0800

Here are the descriptions of the two tables I created:
************************************************************************************************************************************************************************
FIRST TABLE 
************************************************************************************************************************************************************************
hbase(main):013:0> describe 'TEST5MILLION8KB'
DESCRIPTION                                                          ENABLED
 'TEST5MILLION8KB', {METHOD => 'table_att',                          true
 coprocessor$1 => '|com.salesforce.phoenix.coprocessor.ScanRegionObserver|1|',
 coprocessor$2 => '|com.salesforce.phoenix.coprocessor.UngroupedAggregateRegionObserver|1|',
 coprocessor$3 => '|com.salesforce.phoenix.coprocessor.GroupedAggregateRegionObserver|1|',
 coprocessor$4 => '|com.salesforce.phoenix.join.HashJoiningRegionObserver|1|',
 coprocessor$5 => '|com.salesforce.phoenix.coprocessor.ServerCachingEndpointImpl|1|',
 coprocessor$6 => '|com.salesforce.hbase.index.Indexer|1073741823|com.salesforce.hbase.index.codec.class=com.salesforce.phoenix.index.PhoenixIndexCodec,index.builder=com.salesforce.phoenix.index.PhoenixIndexBuilder'},
 {NAME => 'M', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW',
 REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE',
 MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'true',
 BLOCKSIZE => '8192', IN_MEMORY => 'false', ENCODE_ON_DISK => 'true',
 BLOCKCACHE => 'true'}
************************************************************************************************************************************************************************
After creating the first table, I got very good performance.
Then I created the second table:
************************************************************************************************************************************************************************
hbase(main):014:0> describe 'TEST5MILLION8KB2'
DESCRIPTION                                                          ENABLED
 'TEST5MILLION8KB2', {METHOD => 'table_att',                         true
 coprocessor$1 => '|com.salesforce.phoenix.coprocessor.ScanRegionObserver|1|',
 coprocessor$2 => '|com.salesforce.phoenix.coprocessor.UngroupedAggregateRegionObserver|1|',
 coprocessor$3 => '|com.salesforce.phoenix.coprocessor.GroupedAggregateRegionObserver|1|',
 coprocessor$4 => '|com.salesforce.phoenix.join.HashJoiningRegionObserver|1|',
 coprocessor$5 => '|com.salesforce.phoenix.coprocessor.ServerCachingEndpointImpl|1|',
 coprocessor$6 => '|com.salesforce.hbase.index.Indexer|1073741823|com.salesforce.hbase.index.codec.class=com.salesforce.phoenix.index.PhoenixIndexCodec,index.builder=com.salesforce.phoenix.index.PhoenixIndexBuilder'},
 {NAME => 'M', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW',
 REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE',
 MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'true',
 BLOCKSIZE => '8192', IN_MEMORY => 'false', ENCODE_ON_DISK => 'true',
 BLOCKCACHE => 'true'}
1 row(s) in 0.0600 seconds
************************************************************************************************************************************************************************
The performance of the first table has decreased dramatically, while that of
the second table is very good.
************************************************************************************************************************************************************************
 
Please give your suggestions.
With Thanks,
 
Best Regards,
Job M Thomas


________________________________

From: Job Thomas [mailto:[email protected]]
Sent: Wed 11/27/2013 1:52 PM
To: [email protected]; [email protected]
Subject: RE: HBase: Paralel Query




Hi Ted, all,

I have set:

hfile.block.cache.size to 0.6
hbase.regionserver.handler.count to 60
DATA_BLOCK_ENCODING => 'FAST_DIFF'
BLOOMFILTER => 'ROW'
BLOCKSIZE => '8192'
BLOCKCACHE => 'true'

Performance has improved.

But after I created another table with the same size and configuration, the
performance of the previous table dropped, while I am getting good
performance for the newly created table.

While querying, I see that HBase is using only usedHeapMB=72 out of
maxHeapMB=15983.
Why is HBase not using the heap space, even though I have set
BLOCKSIZE => '8192' (in order to keep more index entries in memory)?

I have read that reducing the HFile block size slows down sequential
access, but I have not experienced this even though my
BLOCKSIZE is '8192'.
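
A rough sizing sketch for the heap question above, in Python. It only restates the figures quoted in this message (maxHeapMB=15983, hfile.block.cache.size=0.6, which is a fraction of the maximum heap). The row count and row size are assumptions read off the table name TEST5MILLION8KB (5 million rows of about 8 KB each), and the estimate ignores FAST_DIFF encoding, key overhead, and index and bloom filter blocks.

```python
# Figures quoted in the message above.
max_heap_mb = 15983            # maxHeapMB reported by the region server
block_cache_fraction = 0.6     # hfile.block.cache.size

# ASSUMPTION: the table name is read as 5 million rows of roughly 8 KB each.
assumed_rows = 5_000_000
assumed_row_bytes = 8 * 1024

# Upper bound on the block cache, in MB.
cache_mb = max_heap_mb * block_cache_fraction

# Raw data size of one table under the assumption above, in MB.
table_mb = assumed_rows * assumed_row_bytes / (1024 * 1024)

# Share of one table, and of two equally sized tables, that could be cached.
share_one_table = cache_mb / table_mb
share_two_tables = cache_mb / (2 * table_mb)

print(f"block cache capacity : {cache_mb:.1f} MB")
print(f"one table (assumed)  : {table_mb:.1f} MB")
print(f"cacheable share, 1 table : {share_one_table:.0%}")
print(f"cacheable share, 2 tables: {share_two_tables:.0%}")
```

Under those assumptions a single table is roughly four times larger than the block cache, so two equally sized tables would compete for the same cache space. This is an estimate to guide further measurement, not a diagnosis.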

Best Regards,
Job M Thomas

________________________________

From: Ted Yu [mailto:[email protected]]
Sent: Wed 11/27/2013 11:48 AM
To: [email protected]
Subject: Re: HBase: Paralel Query



bq. I didn't enable blockcache

What if you enable blockcache ?

Cheers


On Tue, Nov 26, 2013 at 8:45 PM, Job Thomas <[email protected]> wrote:

> Hello Lars,
>
> Here are the answers:
>
> -> I have only one region server (I am testing HBase via Phoenix, with
> HBase on a single server).
> -> All queries are fired through Phoenix only (select Lastname from
> tablename where Id=?, where Id is the primary key).
> -> hbase.regionserver.handler.count=30 (default value).
> -> Hardware: Cores = 8
>              RAM   = 8 GB
> -> I didn't enable blockcache.
> -> Are the clients in multiple threads in the process or multiple
> processes? - I am not sure.
>
>
> Best Regards,
> Job M Thomas
>
> ________________________________
>
> From: lars hofhansl [mailto:[email protected]]
> Sent: Tue 11/26/2013 11:16 PM
> To: [email protected]
> Subject: Re: HBase: Paralel Query
>
>
>
> Hi Job,
>
> First off, some questions :)
> How many regions are you accessing?
> What type of query is this (get or scan)?
> How many handlers have you configured?
> What does your hardware look like (how many cores, etc.)?
> Is the data all in the blockcache?
> If not, what does the disk IO look like?
> Are the clients in multiple threads in the process or multiple processes?
>
>
> Sorry for all the questions, but we need a bit more data.
>
>
> -- Lars
>
>
>
> ________________________________
>  From: Job Thomas <[email protected]>
> To: [email protected]
> Sent: Tuesday, November 26, 2013 12:26 AM
> Subject: HBase: Paralel Query
>
>
>
>
> Hi All,
>
> How can we configure HBase so that multithreaded/parallel queries
> run faster?
>
> These are some bits from my analysis:
>
> Each thread runs 10 random queries.
>
> Threads   H2 (ms)   Phoenix (ms)
>    1         34         215
>    2         63         222
>    4        120         324
>    6        200         340
>    8        250         460
>   10        350         560
>   12        410         592
>
> I need to find the points where the lines intersect in the graph plotted
> from these values.
> So I need HBase to perform well under multithreaded load.
>
>
> Best Regards,
> Job M Thomas
>
>
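
One way to look for the intersection asked about in the original post is to fit a straight line to each series and solve for where the two fits meet. The Python sketch below uses only the seven measurements from the table in that post. The helper names fit_line and crossover are illustrative, and treating latency as linear in the thread count is an assumption.

```python
# Measurements copied from the table in the original post (milliseconds).
threads = [1, 2, 4, 6, 8, 10, 12]
h2_ms = [34, 63, 120, 200, 250, 350, 410]
phoenix_ms = [215, 222, 324, 340, 460, 560, 592]


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def crossover(line_a, line_b):
    """x at which two fitted lines meet, or None if they are parallel."""
    (slope_a, icpt_a), (slope_b, icpt_b) = line_a, line_b
    if slope_a == slope_b:
        return None
    return (icpt_b - icpt_a) / (slope_a - slope_b)


h2_fit = fit_line(threads, h2_ms)
phoenix_fit = fit_line(threads, phoenix_ms)
x_cross = crossover(h2_fit, phoenix_fit)

print("H2 fit      (ms/thread, offset):", h2_fit)
print("Phoenix fit (ms/thread, offset):", phoenix_fit)
print("fitted lines meet at thread count:", x_cross)

# Per-row gap, useful for checking whether the curves are converging at all.
for t, a, b in zip(threads, h2_ms, phoenix_ms):
    print(f"threads={t:2d}  gap={b - a:4d} ms  ratio={b / a:.2f}")
```

For these seven points the two fitted slopes are close (about 35 ms per thread for H2 and about 37 ms per thread for Phoenix), so the fitted lines meet at a negative thread count. In other words, a linear reading of this data shows no crossover in the measured range; the Phoenix/H2 ratio shrinks, but the absolute gap stays roughly constant.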
