Lydia: bq. including a prefix-filter and some column filter.
Are you expecting roughly the same rate of filtering across the tables?
Which hbase / hadoop release are you using?

Cheers

On Wed, May 3, 2017 at 6:32 AM, Kevin O'Dell <[email protected]> wrote:

> Hi Lydia,
>
> Welcome to the wonderful world of HBase! I don't think it is wrong that
> you are seeing linear results from doing a scan. When doing a scan HBase
> will collect X amount of rows to return to the client. X being the value of
> your scan cache. If each round trip grabs 100 rows and takes 1 second to do
> it, then it is safe to assume time will grow in a linear nature. The good
> news is HBase is much faster than the example I gave. I would recommend
> looking at how much you are caching and raising that value, though I am not
> surprised your scans are growing in a linear nature as the scan function is
> rather linear itself. Does this make sense?
>
> Also I may be completely wrong so I will defer to anyone else's expert
> information.
>
> On Wed, May 3, 2017 at 6:51 AM, Lydia <[email protected]> wrote:
>
> > Hi,
> >
> > I would like to know if my query times seem appropriate since I do not
> > have a lot of experience with HBase.
> >
> > I have three tables - stored in HDFS, on one machine:
> > table1: 5 million rows
> > table2: 15 million rows
> > table3: 90 million rows
> >
> > I do a scan using the Java API including a prefix-filter and some column
> > filter.
> > My rowkeys are encoded with geohashes.
> >
> > Execution Times:
> > table1: ~ 3.072 s
> > table2: ~ 10.117 s
> > table3: ~ 60.00 s
> >
> > It seems really odd to me that the execution time is increasing linearly
> > with the amount of rows!
> > Am I doing something terribly wrong?
> >
> > Thanks in advance!
> > Best regards,
> > Lydia
> >
>
>
> --
> Kevin O'Dell
> Field Engineer
> 850-496-1298 | [email protected]
> @kevinrodell
> <http://www.rocana.com>
>
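To put numbers on the round-trip reasoning quoted above, here is a rough sketch in plain Java. It does not call HBase at all; it only models the arithmetic from Kevin's example (100 rows per round trip, 1 second per round trip). The class name `ScanCostModel`, its methods, and the row counts in `main` are made up for illustration, and a fixed cost per round trip is a simplifying assumption that a real cluster will not match exactly.

```java
// Illustrative cost model only: it does not talk to HBase.
// ScanCostModel and its methods are hypothetical names for this sketch.
final class ScanCostModel {

    // Round trips needed to return `rows` rows when each round trip
    // carries at most `caching` rows (ceiling division).
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching > 0");
        }
        return (rows + caching - 1) / caching;
    }

    // Estimated scan time, assuming every round trip costs the same
    // fixed number of seconds (a simplification).
    static double estimatedSeconds(long rows, int caching, double secondsPerRoundTrip) {
        return roundTrips(rows, caching) * secondsPerRoundTrip;
    }

    public static void main(String[] args) {
        // Hypothetical row counts; the thread does not say how many rows
        // actually pass the filters in each table.
        long[] rowCounts = {1_000L, 3_000L, 18_000L};
        int[] cachingValues = {100, 1_000};
        double secondsPerRoundTrip = 1.0; // figure from the example above

        for (int caching : cachingValues) {
            for (long rows : rowCounts) {
                System.out.printf("caching=%d rows=%d roundTrips=%d estSeconds=%.1f%n",
                        caching, rows, roundTrips(rows, caching),
                        estimatedSeconds(rows, caching, secondsPerRoundTrip));
            }
        }
    }
}
```

For a fixed caching value the estimate grows in step with the number of rows returned, which is the linear behaviour described above. Raising the caching value lowers the number of round trips, but in this model the time still scales with the row count.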
