Ryan, thanks. I think a full scan will be fine, since it's a one-time event
on startup/recovery, and I am curious either way.

On Fri, Feb 18, 2011 at 10:08 AM, Ryan Rawson <[email protected]> wrote:
> There is minimal/no underlying efficiency. It's basically a full
> table/region scan with a filter to discard the uninteresting values.
> We have various timestamp filtering techniques to avoid reading from
> files, e.g. if you specify a time range [100,200) and an HFile only
> contains [0,50), we won't include the file.  So perhaps in your case
> this might help.  Compactions will merge files and thus timestamp
> ranges together, and you'll lose some efficiency, assuming you COULD
> have done a query involving only the most recent HFiles.
>
>
>
> On Fri, Feb 18, 2011 at 10:02 AM, Jason Rutherglen
> <[email protected]> wrote:
>> Thanks Ted!  Is there some underlying efficiency to this, or will it
>> be scanning all of the rows underneath?
>>
>> On Fri, Feb 18, 2011 at 7:47 AM, Ted Yu <[email protected]> wrote:
>>> From Scan.java:
>>>  * To only retrieve columns within a specific range of version timestamps,
>>>  * execute {@link #setTimeRange(long, long) setTimeRange}.
>>>
>>> On Fri, Feb 18, 2011 at 6:48 AM, Jason Rutherglen <
>>> [email protected]> wrote:
>>>
>>>> For search integration we need, on server reboot, to scan over the key
>>>> values written since the last Lucene commit and add them to the index.
>>>> Is there an efficient way to do this?
>>>>
>>>
>>
>
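
To illustrate the file-skipping behaviour Ryan describes above, here is a
minimal sketch in plain Java. It is not HBase code: the class
TimeRangePruning, the FileRange type and the file names are hypothetical
stand-ins, and it assumes each store file records a non-empty, half-open
timestamp range [min, max), following the notation used in the thread. It
shows only the overlap test, and why a compaction that merges ranges can
leave the same query with nothing to skip.

```java
import java.util.ArrayList;
import java.util.List;

public class TimeRangePruning {

    /** Hypothetical stand-in for the timestamp metadata of one store file. */
    public static final class FileRange {
        final String name;
        final long minStamp; // inclusive
        final long maxStamp; // exclusive

        public FileRange(String name, long minStamp, long maxStamp) {
            this.name = name;
            this.minStamp = minStamp;
            this.maxStamp = maxStamp;
        }
    }

    /**
     * True when the non-empty half-open ranges [aMin, aMax) and
     * [bMin, bMax) share at least one timestamp.
     */
    public static boolean overlaps(long aMin, long aMax, long bMin, long bMax) {
        return aMin < bMax && bMin < aMax;
    }

    /** Names of the files whose range overlaps the query range. */
    public static List<String> filesToRead(List<FileRange> files,
                                           long queryMin, long queryMax) {
        List<String> selected = new ArrayList<>();
        for (FileRange f : files) {
            if (overlaps(queryMin, queryMax, f.minStamp, f.maxStamp)) {
                selected.add(f.name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Before compaction: the file covering [0,50) can be skipped.
        List<FileRange> files = new ArrayList<>();
        files.add(new FileRange("old", 0, 50));
        files.add(new FileRange("recent", 150, 250));
        System.out.println(filesToRead(files, 100, 200));

        // After compaction: one merged file covers [0,250), so it must be read.
        List<FileRange> compacted = new ArrayList<>();
        compacted.add(new FileRange("merged", 0, 250));
        System.out.println(filesToRead(compacted, 100, 200));
    }
}
```

In this sketch the query [100,200) reads only "recent" before the merge and
has to read "merged" afterwards, which is the loss of efficiency mentioned
above. In HBase itself the query range would be supplied through
Scan.setTimeRange, as Ted points out.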
