Re: Scan time increasing linearly

Ted Yu Wed, 03 May 2017 06:51:04 -0700

Lydia:

bq. including a prefix-filter and some column filter.


Are you expecting roughly the same rate of filtering across the tables?

Which HBase / Hadoop release are you using?

Cheers

On Wed, May 3, 2017 at 6:32 AM, Kevin O'Dell <[email protected]> wrote:

> Hi Lydia,
>
>   Welcome to the wonderful world of HBase! I don't think it is wrong that
> you are seeing linear results from a scan. When doing a scan, HBase
> collects X rows per round trip to return to the client, X being the value
> of your scanner caching setting. If each round trip grabs 100 rows and
> takes 1 second, then it is safe to assume the total time will grow
> linearly with the number of rows. The good news is that HBase is much
> faster than the example I gave. I would recommend checking how many rows
> you are caching and raising that value, though I am not surprised your
> scan times grow linearly, as a scan is inherently a linear operation.
> Does this make sense?
>
> Also I may be completely wrong so I will defer to anyone else's expert
> information.
>
> On Wed, May 3, 2017 at 6:51 AM, Lydia <[email protected]> wrote:
>
> > Hi,
> >
> > I would like to know if my query times seem appropriate, since I do not
> > have a lot of experience with HBase.
> >
> > I have three tables - stored in HDFS, on one machine:
> >         table1: 5 million rows
> >         table2: 15 million rows
> >         table3: 90 million rows
> >
> > I do a scan using the Java API including a prefix-filter and some column
> > filter.
> > My rowkeys are encoded with geohashes.
> >
> > Execution Times:
> >         table1: ~   3.072 s
> >         table2: ~ 10.117 s
> >         table3: ~ 60.00 s
> >
> > It seems really odd to me that the execution time is increasing linearly
> > with the number of rows!
> > Am I doing something terribly wrong?
> >
> > Thanks in advance!
> > Best regards,
> > Lydia
>
>
>
>
> --
> Kevin O'Dell
> Field Engineer
> 850-496-1298 | [email protected]
> @kevinrodell
> <http://www.rocana.com>
>
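To make the round-trip reasoning above concrete, here is a rough sketch in plain Java of the cost model Kevin describes: rows are returned in batches of the scanner caching size, and each batch costs one round trip. The class name ScanCostModel, its methods, and the 1 ms per-trip figure are all hypothetical and chosen only for illustration; nothing here calls HBase itself, and it assumes every row is returned to the client, which overstates a filtered scan. In the real client the batch size is what Scan.setCaching(int) controls.

```java
// Hypothetical back-of-the-envelope model, not part of the HBase API.
final class ScanCostModel {
    private ScanCostModel() {}

    /** Round trips needed to return `rows` rows when each trip carries up to `caching` rows. */
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching must be > 0");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }

    /** Estimated scan time if every round trip costs the same fixed amount. */
    static double estimatedSeconds(long rows, int caching, double secondsPerTrip) {
        return roundTrips(rows, caching) * secondsPerTrip;
    }

    public static void main(String[] args) {
        // Table sizes from the original question.
        long[] tableRows = {5_000_000L, 15_000_000L, 90_000_000L};
        // Assumed caching values to compare; pick whatever your client uses.
        int[] cachingValues = {100, 1000};
        // Assumed cost of one round trip; measure this on your own cluster.
        double secondsPerTrip = 0.001;

        for (int caching : cachingValues) {
            for (long rows : tableRows) {
                System.out.printf("caching=%d rows=%d trips=%d est=%.1f s%n",
                        caching, rows, roundTrips(rows, caching),
                        estimatedSeconds(rows, caching, secondsPerTrip));
            }
        }
    }
}
```

Under this model a larger caching value lowers the cost per row, but the total still grows in proportion to the number of rows, which matches the linear timings reported in the question.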
