That's a good explanation, Kevin! It's also good to keep in mind that the ResultScanner implementation does not read data in parallel. You may have many servers to read data from, but the client only communicates with one at a time.
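
To put rough numbers on the round-trip arithmetic Kevin describes below, here is a minimal sketch in plain Java. It is a toy cost model, not HBase code: the class name, the per-trip overhead, and the per-row cost are all made-up assumptions, and the table sizes are treated as the number of rows the scan returns, which may not match Lydia's filtered scans. In the HBase Java client, the rows-per-trip value is the one set with Scan.setCaching(int).

```java
// Toy cost model for a sequential scan. This is plain arithmetic, not HBase
// code; the overhead and per-row costs below are hypothetical numbers.
final class ScanCostModel {

    // Number of client/server round trips needed to fetch `rows` rows when
    // each trip returns at most `caching` rows (ceiling division).
    static long roundTrips(long rows, int caching) {
        return (rows + caching - 1) / caching;
    }

    // Estimated scan time: a fixed overhead per round trip plus a cost per row.
    // The trips are summed, never overlapped, because the scanner talks to
    // one server at a time.
    static double estimatedSeconds(long rows, int caching,
                                   double overheadPerTripSec, double costPerRowSec) {
        return roundTrips(rows, caching) * overheadPerTripSec + rows * costPerRowSec;
    }

    public static void main(String[] args) {
        // Table sizes from the original question, assumed here to be the
        // number of rows returned by the scan.
        long[] tableSizes = {5_000_000L, 15_000_000L, 90_000_000L};
        for (int caching : new int[] {100, 1000}) {
            for (long rows : tableSizes) {
                System.out.printf("caching=%d rows=%d trips=%d est=%.1fs%n",
                        caching, rows, roundTrips(rows, caching),
                        estimatedSeconds(rows, caching, 0.001, 0.0000005));
            }
        }
    }
}
```

Under these assumptions, raising the caching value shrinks the per-trip overhead term, but the per-row term stays, so the total still grows linearly with the number of rows.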

Also, remember that HBase stores its data in sorted order by row key. The best use cases are the ones that exploit this natural ordering to limit how much data your application has to read.

For example, pruning your "search space" from the entire table to a small range of rows in that table wouldn't make a big difference with 100MB of data, but it is monumental when we're talking about 10TB.
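
As a rough illustration of what pruning the search space means, the sketch below uses a java.util.TreeMap as a stand-in for a sorted table. It is not HBase code: the class, the helper names, and the geohash-like keys are hypothetical. In the HBase Java client, the analogous step is to bound the Scan with start and stop rows (for example via Scan.setRowPrefixFilter) instead of relying on a filter alone; whether that applies to your scan is an assumption on my part.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Stand-in for a table sorted by row key. A TreeMap<String, ...> keeps its
// keys in lexicographic order, which matches byte order for ASCII keys.
final class SortedPrefixScan {

    // Exclusive upper bound for all keys that start with `prefix`.
    // Assumes a non-empty prefix whose last character can be incremented.
    static String stopKeyFor(String prefix) {
        int last = prefix.length() - 1;
        return prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
    }

    // Range scan: jumps straight to the first matching key and stops at the
    // stop key, so only rows inside the prefix range are touched.
    static NavigableMap<String, String> prefixRange(
            NavigableMap<String, String> table, String prefix) {
        return table.subMap(prefix, true, stopKeyFor(prefix), false);
    }

    // Full scan with a filter: every row is visited, matching or not.
    static int fullScanMatches(NavigableMap<String, String> table, String prefix) {
        int matches = 0;
        for (String key : table.keySet()) {
            if (key.startsWith(prefix)) {
                matches++;
            }
        }
        return matches;
    }

    // Hypothetical sample data with geohash-like row keys.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("9q8yy", "a");
        table.put("u33d1", "b");
        table.put("u33d2", "c");
        table.put("u33e0", "d");
        table.put("u4pr0", "e");
        return table;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = sampleTable();
        System.out.println("range scan keys: " + prefixRange(table, "u33").keySet());
        System.out.println("full scan visited " + table.size() + " rows, matched "
                + fullScanMatches(table, "u33"));
    }
}
```

Both approaches return the same rows, but the range scan's cost depends on the size of the range, while the filtered full scan's cost depends on the size of the table.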

Kevin O'Dell wrote:
Hi Lydia,

   Welcome to the wonderful world of HBase! I don't think it is wrong that
you are seeing linear results from a scan. During a scan, HBase collects X
rows per round trip to return to the client, X being the value of your scan
caching. If each round trip grabs 100 rows and takes 1 second, then it is
safe to assume the total time will grow linearly with the number of rows.
The good news is that HBase is much faster than the example I gave. I would
recommend checking how many rows you are caching and raising that value,
though I am not surprised your scan times grow linearly, as the scan
function is rather linear itself. Does this make sense?

Also I may be completely wrong so I will defer to anyone else's expert
information.

On Wed, May 3, 2017 at 6:51 AM, Lydia <[email protected]> wrote:

Hi,

I would like to know if my query times seem appropriate since I do not
have a lot of experience with HBase.

I have three tables - stored in HDFS, on one machine:
         table1: 5 million rows
         table2: 15 million rows
         table3: 90 million rows

I do a scan using the Java API including a prefix-filter and some column
filter.
My rowkeys are encoded with geohashes.

Execution Times:
         table1: ~   3.072 s
         table2: ~ 10.117 s
         table3: ~ 60.00 s

It seems really odd to me that the execution time is increasing linearly
with the number of rows!
Am I doing something terribly wrong?

Thanks in advance!
Best regards,
Lydia



