The 3.5 seconds is the time taken to fetch data from Hbase

-----Original Message-----
From: Stack [mailto:[email protected]]
Sent: Saturday, September 14, 2019 9:16 AM
To: Hbase-User <[email protected]>
Subject: Re: Retrieving large rows from Hbase

On Thu, Sep 12, 2019 at 6:14 PM Gautham Acharya <[email protected]> wrote:

> Hi,
>
> I'm new to this distribution list and to Hbase in general, so I
> apologize if I'm asking a basic question.
>
> I'm running an Apache Hbase Cluster on AWS EMR. I have a table that is
> a single column family, 75,000 columns and 50,000 rows. I'm trying to
> get all the column values for a single row, and when the row is not
> sparse, and has 75,000 values, the return time is extremely slow - it
> takes almost 3.5 seconds for me to fetch the data from the DB. I'm
> querying the table from a Lambda function running Happybase.
>

Can you figure where the time is being spent -- in hbase or in the
happybase processing? Happybase means an extra hop recasting 75k items
in python.

Thanks,
S

>
> What can I do to make this faster? This seems incredibly slow - the
> return payload is 75,000 value pairs, and is only ~2MB. It should be
> much faster than 3 seconds. I'm looking for millisecond return time.
>
> I have a BLOCKCACHE size of 8194kb, a BLOOMFILTER of type ROW, and
> SNAPPY compression enabled on this table.
>
> --gautham
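
One way to answer the question above about where the time goes is to time the fetch call and the Python-side processing separately. The following is only a minimal sketch, not code from this thread: the helper names, the UTF-8 decoding step, and the host/table/row arguments are all assumptions for illustration. Note that the "fetch" figure still includes the Thrift hop and Happybase's own deserialization, so it is an upper bound on the time spent inside HBase itself.

```python
import time


def profile_row_fetch(table, row_key):
    """Time the row fetch and the client-side decoding separately.

    `table` is any object with a happybase-style `row(row_key)` method
    returning a dict of {column: value} byte strings.
    """
    t0 = time.perf_counter()
    raw = table.row(row_key)  # Thrift round trip plus client deserialization
    t1 = time.perf_counter()
    # Hypothetical post-processing step: decode column names to str.
    decoded = {col.decode("utf-8"): val for col, val in raw.items()}
    t2 = time.perf_counter()
    stats = {"fetch_s": t1 - t0, "decode_s": t2 - t1, "columns": len(raw)}
    return decoded, stats


def profile_against_cluster(host, table_name, row_key):
    """Run the profile against a real cluster (all arguments are placeholders)."""
    # Imported lazily so profile_row_fetch can be tried without happybase installed.
    import happybase

    connection = happybase.Connection(host)
    try:
        return profile_row_fetch(connection.table(table_name), row_key)
    finally:
        connection.close()
```

If `fetch_s` dominates, comparing it with the time for the same row read through a client that does not go through Thrift would help narrow things down further; if `decode_s` dominates, the cost is in the Python processing rather than in HBase.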
