As far as I understand, sequential keys with a timerange scan give the best
possible read performance, because of the HFile metadata, just as N
indicates. Maybe adding Bloom filters could improve performance further.
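(For reference, the Java equivalent of the shell TIMERANGE scan I'm talking
about would be roughly the minimal sketch below; the table name is made up
and I haven't benchmarked this exact snippet. The idea is what N describes:
the server can skip any HFile whose metadata timerange doesn't overlap the
requested range.)

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;

    public class TimeRangeScanExample {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");       // made-up table name
        try {
          long x = 3600 * 1000L;                          // one-hour window
          long t = System.currentTimeMillis() - x;

          Scan scan = new Scan();
          scan.setTimeRange(t, t + x);  // same as the shell's TIMERANGE => [t, t+x]

          ResultScanner scanner = table.getScanner(scan);
          try {
            for (Result result : scanner) {
              System.out.println(result);  // process each row with a cell in range
            }
          } finally {
            scanner.close();
          }
        } finally {
          table.close();
        }
      }
    }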

Still, in my case with random keys I get a quick (sub-second) response from my
earlier scan example. Does HBase keep all the HFile metadata in memory? I
can't imagine it would start hitting hundreds, potentially thousands, of
HFiles, reading their metadata, fully scanning the files, and returning
rows. Does it?

I'm trying to see whether I can use this type of scan as a polling mechanism
for returning all rows written since time X, given that the keys are random.
Before a major compaction I can see this working quite well, but I don't know
whether it will work at scale once major compactions have taken place.
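Concretely, the polling I have in mind would look something like this rough
sketch (made-up table name and poll interval, untested at scale): keep the
timestamp of the last poll and scan only the timerange since then.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;

    public class TimeRangePoller {
      public static void main(String[] args) throws IOException, InterruptedException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "events");        // made-up table name
        long lastPoll = System.currentTimeMillis();
        try {
          while (true) {
            long now = System.currentTimeMillis();

            Scan scan = new Scan();
            scan.setTimeRange(lastPoll, now);             // only rows written since the last poll

            ResultScanner scanner = table.getScanner(scan);
            try {
              for (Result result : scanner) {
                // hand the new row to whatever consumes the poll
              }
            } finally {
              scanner.close();
            }

            lastPoll = now;
            Thread.sleep(10 * 1000L);                     // poll interval, arbitrary
          }
        } finally {
          table.close();
        }
      }
    }

(In practice I'd probably overlap the ranges a bit so a write that races the
poll isn't missed.)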

On Sat, Apr 14, 2012 at 8:04 PM, Doug Meil <[email protected]> wrote:

>
> Thanks N!  That's a good point.  I'll update the RefGuide with that.
>
> So if the data is evenly distributed (and evenly old per HFile) you still
> have the same problem, but it's conceivable that this might not be the case.
> This is a case where monotonically increasing keys would actually help you.
>
>
>
>
>
> On 4/14/12 11:57 AM, "N Keywal" <[email protected]> wrote:
>
> >Hi,
> >
> >For the filtering part, every HFile is associated with a set of metadata.
> >This metadata includes the timerange. So if there is no overlap between
> >the time range you want and the time range of the store, the HFile is
> >skipped entirely.
> >
> >This work is done in StoreScanner#selectScannersFrom
> >
> >Cheers,
> >
> >N.
> >
> >
> >On Sat, Apr 14, 2012 at 5:11 PM, Doug Meil
> ><[email protected]> wrote:
> >
> >> Hi there-
> >>
> >> With respect to:
> >>
> >> "* Does it need to hit every memstore and HFile to determine if there
> >> isdata available? And if so does it need to do a full scan of that file
> >>to
> >> determine the records qualifying to the timerange, since keys are stored
> >> lexicographically?"
> >>
> >> And...
> >>
> >> "Using "scan 'table', {TIMERANGE => [t, t+x]}" :"
> >> See...
> >>
> >>
> >> http://hbase.apache.org/book.html#regions.arch
> >> 8.7.5.4. KeyValue
> >>
> >>
> >>
> >> The timestamp is an attribute of the KeyValue, but unless you apply a
> >> restriction using start/stop rows, it has to process every row.
> >>
> >> Major compactions don't change this fact; they just change the number of
> >> HFiles that have to be processed.
> >>
> >>
> >>
> >> On 4/14/12 10:38 AM, "Rob Verkuylen" <[email protected]> wrote:
> >>
> >> >I'm trying to find a definitive answer to the question of whether scans
> >> >on a timerange alone will scale when you use uniformly distributed keys
> >> >like UUIDs.
> >> >
> >> >Since the keys are randomly generated, they will be spread out over all
> >> >RegionServers, Regions, and HFiles. In theory, assuming enough writes,
> >> >that would mean every HFile will contain the entire timerange of writes.
> >> >
> >> >Now, before a major compaction, data is in the memstores and the
> >> >(non-max.filesize) flushed and merged HFiles. I can imagine that a scan
> >> >using a TIMERANGE can serve quickly from the memstores and the smaller
> >> >files, but how does it perform after a major compaction?
> >> >
> >> >Using "scan 'table', {TIMERANGE => [t, t+x]}" :
> >> >* How does HBase handle this query in this case(UUIDs)?
> >> >* Does it need to hit every memstore and HFile to determine if there is
> >> >data available? And if so does it need to do a full scan of that file
> >>to
> >> >determine the records qualifying to the timerange, since keys are
> >>stored
> >> >lexicographically?
> >> >
> >> >I've run some tests on 300+ region tables, on month-old data (so after
> >> >major compaction), and performance/response seems fairly quick. But I'm
> >> >trying to understand why that is, because hitting every HFile in every
> >> >region seems inefficient. Lars' book, figure 9-3, seems to indicate this
> >> >as well, but I can't seem to get the answer from the book or anywhere
> >> >else.
> >> >
> >> >Thnx, Rob
> >>
> >>
> >>
>
>
>
