Re: Scanning over key values > timestamp?

Ryan Rawson Fri, 18 Feb 2011 10:08:38 -0800

There is minimal to no underlying efficiency. It's basically a full
table/region scan with a filter that discards the uninteresting values.
We do have various timestamp filtering techniques to avoid reading from
files, e.g. if you specify a time range [100,200) and an HFile only
contains [0,50), we won't include that file.  So perhaps in your case
this might help.  Compactions will merge files, and thus their timestamp
ranges, together, so you'll lose some efficiency, assuming you COULD
otherwise have done a query involving only the most recent HFiles.
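
To make the file-skipping idea concrete, here is a rough sketch in plain
Java of the half-open range check described above. It is only a model and
not HBase's actual implementation: FileTimeRange, TimeRangePruning,
overlaps, mergeWith and filesToRead are hypothetical names invented for
illustration. On the client side the query range would be supplied through
the Scan.setTimeRange(long, long) call Ted quoted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a file's timestamp span; not an HBase class. */
final class FileTimeRange {
    final long min; // inclusive lower bound
    final long max; // exclusive upper bound

    FileTimeRange(long min, long max) {
        if (min >= max) throw new IllegalArgumentException("empty range");
        this.min = min;
        this.max = max;
    }

    /** True when the two half-open ranges [min, max) share at least one timestamp. */
    boolean overlaps(FileTimeRange other) {
        return this.min < other.max && other.min < this.max;
    }

    /** Smallest range covering both inputs, modelling two files merged by a compaction. */
    FileTimeRange mergeWith(FileTimeRange other) {
        return new FileTimeRange(Math.min(min, other.min), Math.max(max, other.max));
    }
}

final class TimeRangePruning {
    /** Keeps only the files whose timestamp span could hold cells in the query range. */
    static List<FileTimeRange> filesToRead(List<FileTimeRange> files, FileTimeRange query) {
        List<FileTimeRange> selected = new ArrayList<>();
        for (FileTimeRange f : files) {
            if (f.overlaps(query)) selected.add(f);
        }
        return selected;
    }

    public static void main(String[] args) {
        FileTimeRange query = new FileTimeRange(100, 200);
        FileTimeRange old = new FileTimeRange(0, 50);
        FileTimeRange recent = new FileTimeRange(120, 180);

        // Before compaction: the [0,50) file does not overlap the query and is skipped.
        System.out.println(filesToRead(List.of(old, recent), query).size());

        // After compaction: the merged file spans [0,180), so it overlaps the query
        // and its older cells have to be read and filtered out as well.
        System.out.println(filesToRead(List.of(old.mergeWith(recent)), query).size());
    }
}
```

In this model the merged file always overlaps the query, which is the loss
of efficiency after compaction mentioned above: the old data can no longer
be skipped at the file level.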




On Fri, Feb 18, 2011 at 10:02 AM, Jason Rutherglen
<[email protected]> wrote:
> Thanks Ted!  Is there some underlying efficiency to this, or will it
> be scanning all of the rows underneath?
>
> On Fri, Feb 18, 2011 at 7:47 AM, Ted Yu <[email protected]> wrote:
>> From Scan.java:
>>  * To only retrieve columns within a specific range of version timestamps,
>>  * execute {@link #setTimeRange(long, long) setTimeRange}.
>>
>> On Fri, Feb 18, 2011 at 6:48 AM, Jason Rutherglen <
>> [email protected]> wrote:
>>
>>> For search integration we need to, on server reboot, scan over key
>>> values since the last Lucene commit, and add them to the index.  Is
>>> there an efficient way to do this?
>>>
>>
>
