Re: sum, avg, count, etc...

Rita Wed, 26 Oct 2011 17:28:30 -0700

Thanks for all of your responses.

The original file is a text file, and searching it with grep takes
minutes. So 7 seconds isn't too bad.


Thanks again for your time and advice.

On Wed, Oct 26, 2011 at 2:49 PM, Gary Helmling <[email protected]> wrote:

> Also, make sure that you're either setting a stop row on the scan, or
> if you're using a filter, try wrapping it in a WhileMatchFilter.  This
> tells the scanner it can stop as soon as the filter starts rejecting
> rows.  Otherwise you can wind up getting back just the data you
> expect, but still scanning all the way to the end of the table, just
> filtering out all the remaining rows.
>
> On Wed, Oct 26, 2011 at 6:18 AM, Doug Meil
> <[email protected]> wrote:
> > Hi there-
> >
> > First, make sure you aren't tripping on any of these issues:
> >
> > http://hbase.apache.org/book.html#perf.reading
> >
> >
> >
> >
> >
> > On 10/26/11 6:21 AM, "Rita" <[email protected]> wrote:
> >
> >>I am trying to do some simple statistics with my data, but it's taking
> >>longer than expected.
> >>
> >>
> >>
> >>Here is how my data is structured in HBase.
> >>
> >>keys (symbol#epoch timestamp#exchange)
> >>msft#1319562974#NASDAQ
> >>t#1319562974#NYSE
> >>yhoo#1319562974#NASDAQ
> >>msft#1319562975#NASDAQ
> >>
> >>The values look like this (for instance, Microsoft):
> >>...
> >>price=26.81
> >>open=
> >>close=
> >>...
> >>
> >>There are about 300 values per key.
> >>
> >>
> >>So, for instance, if I want to calculate the average price of msft, I set
> >>up a start and stop filter, and it is able to calculate it by tick. But
> >>it's taking about 7 seconds to go through 500 keys. Is that normal? Is
> >>there a faster way to calculate sum, avg, count in HBase? Would I need to
> >>redo my schema?
> >>
> >>tia
> >>
> >>
> >>
> >>
> >>
> >>--
> >>--- Get your facts first, then you can distort them as you please.--
> >
> >
>
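
As a rough illustration of the stop-row advice quoted above, here is a
minimal Python sketch. It does not use the HBase client API at all: a sorted
in-memory list stands in for the table, and the names ROWS, stop_key, scan
and aggregate are hypothetical. Only the 26.81 price comes from the thread;
the other prices are made up. The sketch assumes keys of the form
symbol#timestamp#exchange and derives an exclusive stop key by incrementing
the last character of the prefix ("msft#" becomes "msft$").

```python
from bisect import bisect_left

# Hypothetical stand-in for a table sorted by row key.
# Only 26.81 appears in the thread; the other prices are invented.
ROWS = sorted([
    ("msft#1319562974#NASDAQ", 26.81),
    ("msft#1319562975#NASDAQ", 26.83),
    ("t#1319562974#NYSE", 28.90),
    ("yhoo#1319562974#NASDAQ", 16.20),
])


def stop_key(prefix):
    """Return the smallest key that sorts after every key with this prefix.

    Assumes the last character of the prefix is not the maximum code point.
    """
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


def scan(rows, start, stop=None):
    """Yield (key, price) pairs from start onward.

    If stop is given, halt at the first key >= stop (exclusive bound).
    If stop is None, keep going to the end of the table.
    """
    keys = [key for key, _ in rows]
    first = bisect_left(keys, start)
    for key, price in rows[first:]:
        if stop is not None and key >= stop:
            return
        yield key, price


def aggregate(rows, symbol, bounded=True):
    """Compute count, sum and avg of prices for one symbol in a single pass.

    'examined' counts every row the scan touched, matching or not.
    """
    prefix = symbol + "#"
    stop = stop_key(prefix) if bounded else None
    examined = 0
    count = 0
    total = 0.0
    for key, price in scan(rows, prefix, stop):
        examined += 1
        if key.startswith(prefix):  # plays the role of the row filter
            count += 1
            total += price
    avg = total / count if count else None
    return {"count": count, "sum": total, "avg": avg, "examined": examined}


if __name__ == "__main__":
    print(aggregate(ROWS, "msft", bounded=True))
    print(aggregate(ROWS, "msft", bounded=False))
```

Both calls return the same count, sum and avg, but the unbounded one touches
every row after the start key and merely filters the rest out, which is the
behaviour the stop row (or a WhileMatchFilter) is meant to avoid.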



-- 
--- Get your facts first, then you can distort them as you please.--
