Re: Re: Any fast way to random access hbase data?

Esteban Gutierrez Wed, 13 Aug 2014 10:06:42 -0700

Hi Lei,

Any chance you could provide the value of hfile.block.cache.size from one
of the region servers? The HBase master disables the block cache (that's why
it shows 'programatically' as the source of the config).


cheers,
esteban.


--
Cloudera, Inc.



On Wed, Aug 13, 2014 at 6:41 AM, Jean-Marc Spaggiari <
[email protected]> wrote:

> Like what Esteban said.
>
> Try to use more threads to query HBase. Start with 10 clients, each with 1K
> gets per batch, and adjust those numbers to see the impact on
> performance.
>
> Any reason why your block cache is disabled? (hfile.block.cache.size = 0)
>
> JM
>
>
> 2014-08-13 5:23 GMT-04:00 [email protected] <[email protected]>:
>
> >
> > Haven't tried a larger batch size yet.
> > Only one thread.
> > 10 region servers, 2555 regions in total.
> > I am new to HBase and not sure exactly what the block cache means;
> > here's the configuration I can see from the CDH HBase master UI:
> > <name>hbase.rs.cacheblocksonwrite</name>
> > <value>false</value>
> > <source>hbase-default.xml</source>
> >
> > <name>hbase.offheapcache.percentage</name>
> > <value>0</value>
> > <source>hbase-default.xml</source>
> >
> > <name>hfile.block.cache.size</name>
> > <value>0.0</value>
> > <source>programatically</source>
> > Table description:
> >  {NAME => 'userdigest',
> >   coprocessor$3 => 'hdfs://agrant/user/tracking/userdigest/coprocessor/endpoint_0.0.17.jar|com.agrantsem.data.userdigest.endpoint.UserdigestEndPoint|1001|',
> >   coprocessor$2 => '|org.apache.hadoop.hbase.coprocessor.AggregateImplementation||',
> >   FAMILIES => [{NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE',
> >   BLOOMFILTER => 'ROWCOL', REPLICATION_SCOPE => '0', VERSIONS => '1',
> >   COMPRESSION => 'LZ4', MIN_VERSIONS => '0', TTL => '2147483647',
> >   KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536',
> >   IN_MEMORY => 'false', ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true'}]}
> >
> >
> >
> > [email protected]
> >
> > From: Esteban Gutierrez
> > Date: 2014-08-13 15:59
> > To: [email protected]
> > Subject: Re: Any fast way to random access hbase data?
> > Hello Lei,
> >
> > Have you tried a larger batch size? How many threads or tasks are you
> > using to fetch data? Could you please describe your HBase cluster in a
> > little more detail, e.g. how many region servers, and how many regions
> > per RS? What's the hit ratio of the block cache? Any chance you could
> > share the table schema?
> >
> > cheers,
> > esteban.
> >
> >
> >
> > --
> > Cloudera, Inc.
> >
> >
> >
> > On Wed, Aug 13, 2014 at 12:34 AM, [email protected] <[email protected]> wrote:
> >
> > >
> > > I have an HBase table with more than 2G rows.
> > > Every hour, 5M~10M row ids arrive and I must get all the row info for
> > > them from the HBase table.
> > > But even when I use the batch call (1000 row ids as a list) as
> > > described here:
> > >
> > > http://stackoverflow.com/questions/13310434/hbase-api-get-data-rows-information-by-list-of-row-ids
> > >
> > > it takes about 1 hour.
> > > Is there any other way to do this more quickly?
> > >
> > > Thanks,
> > > Lei
> > >
> > >
> > > [email protected]
> > >
> >
>
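
For reference, the suggestion in the thread (several clients, each issuing
batches of about 1K gets) might look roughly like the sketch below. As a
baseline, 5M~10M rows in about an hour works out to roughly 1,400 to 2,800
rows per second from a single thread. Everything named here is an assumption
for illustration: `chunked`, `parallel_multi_get` and `fake_fetch_batch` are
hypothetical helpers, and `fake_fetch_batch` only stands in for a real HBase
multi-get; it does not talk to a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Iterator, List


def chunked(row_ids: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most batch_size row ids."""
    batch: List[str] = []
    for row_id in row_ids:
        batch.append(row_id)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter, batch
        yield batch


def parallel_multi_get(
    row_ids: Iterable[str],
    fetch_batch: Callable[[List[str]], Dict[str, dict]],
    workers: int = 10,       # "10 clients" from the thread, a starting point only
    batch_size: int = 1000,  # "1K gets per batch" from the thread
) -> Dict[str, dict]:
    """Split row ids into batches and fetch the batches on a thread pool."""
    results: Dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() submits every batch up front and yields results in batch
        # order; an exception raised in a worker is re-raised here.
        for partial in pool.map(fetch_batch, chunked(row_ids, batch_size)):
            results.update(partial)
    return results


def fake_fetch_batch(batch: List[str]) -> Dict[str, dict]:
    """Hypothetical stand-in for one batched get against HBase."""
    return {row_id: {"cf:len": len(row_id)} for row_id in batch}
```

Calling `parallel_multi_get(ids, fake_fetch_batch, workers=4)` returns one
entry per distinct row id. In a real client, `fetch_batch` would wrap the
batched get call, probably with one table handle per worker thread, and
`workers` and `batch_size` are the two numbers to adjust while watching
throughput.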
