RE: Retrieving large rows from Hbase

Gautham Acharya Sat, 14 Sep 2019 09:32:23 -0700

The 3.5 seconds is the time taken to fetch the data from Hbase.

-----Original Message-----
From: Stack [mailto:[email protected]] 
Sent: Saturday, September 14, 2019 9:16 AM
To: Hbase-User <[email protected]>
Subject: Re: Retrieving large rows from Hbase



On Thu, Sep 12, 2019 at 6:14 PM Gautham Acharya <[email protected]>
wrote:

> Hi,
>
> I'm new to this distribution list and to Hbase in general, so I 
> apologize if I'm asking a basic question.
>
> I'm running an Apache Hbase Cluster on AWS EMR. I have a table that is 
> a single column family, 75,000 columns and 50,000 rows. I'm trying to 
> get all the column values for a single row, and when the row is not 
> sparse and has 75,000 values, the return time is extremely slow - it 
> takes almost 3.5 
> seconds for me to fetch the data from the DB. I'm querying the table 
> from a Lambda function running Happybase.
>

Can you figure out where the time is being spent -- in hbase or in the
happybase processing? Happybase means an extra hop recasting 75k items in
python.

Thanks,
S
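
One way to approach the question above is to profile the fetch on the client. This is a minimal sketch, not a definitive method: `profile_call` and `fetch_row` are hypothetical helper names, and the host, table name, and row key are placeholders that do not come from this thread. It assumes happybase is installed and an HBase Thrift server is reachable.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile and return (result, text report).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def fetch_row(host, table_name, row_key):
    """Fetch one full row through happybase (hypothetical helper)."""
    # Imported here so the profiling helper above works without happybase.
    import happybase

    connection = happybase.Connection(host)
    try:
        return connection.table(table_name).row(row_key)
    finally:
        connection.close()


# Example usage (placeholder values, assumed rather than taken from the thread):
# row, report = profile_call(fetch_row, "thrift-host", "my_table", b"row-key")
# print(report)
```

Roughly speaking, if most of the cumulative time sits in socket reads, the client is waiting on the Thrift server and HBase; if it sits in Thrift decoding or dictionary building, the cost is in the Python layer. Opening the connection is included in the profile, so it may be worth reusing one connection when comparing runs.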


>
> What can I do to make this faster? This seems incredibly slow - the 
> return payload is 75,000 value pairs, and is only ~2MB. It should be 
> much faster than 3 seconds. I'm looking for millisecond return time.
>
> I have a BLOCKCACHE size of 8194kb, a BLOOMFILTER of type ROW, and 
> SNAPPY compression enabled on this table.
>




> --gautham
>
>
