Re: Question about HFile seeking

Varun Sharma Thu, 16 May 2013 14:56:22 -0700

Sorry, I may have misunderstood what you meant.

When you look for "row1c" in the HFile index, is it going to also match
"row1,col1" or only match "row1c"? It all depends on how the index is
organized. If it is built only on the raw HFile keys, the lookup could also
match row1,col1 unless we use some demarcator b/w row1 and col1 in our
HFile keys. So I am just wondering whether we will skip row1,col1 entirely
in this case and jump straight to row1c, or not. The other option is that
we would actually hit row1,col1, since the prefix of the concatenated HFile
key matches row1c, and then we use the row length to extract the real row
portion from the concatenated HFile key and discard all row1 entries.
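
To make the two possibilities concrete, here is a minimal Python sketch. It
is not HBase code: the key layout (a 2-byte row length followed by the row
and qualifier bytes) is a simplification assumed for illustration, and
naive_key, length_aware_key, split_key and compare_keys are hypothetical
names.

```python
import struct


def naive_key(row: bytes, qualifier: bytes) -> bytes:
    # Assumed layout A: row and qualifier concatenated with no lengths.
    return row + qualifier


def length_aware_key(row: bytes, qualifier: bytes) -> bytes:
    # Assumed layout B: 2-byte big-endian row length, then row, then qualifier.
    return struct.pack(">H", len(row)) + row + qualifier


def split_key(key: bytes):
    # Use the stored row length to recover the row and qualifier portions.
    (row_len,) = struct.unpack_from(">H", key, 0)
    return key[2:2 + row_len], key[2 + row_len:]


def compare_keys(a: bytes, b: bytes) -> int:
    # Compare only the pertinent bytes: row first, then qualifier.
    row_a, qual_a = split_key(a)
    row_b, qual_b = split_key(b)
    if row_a != row_b:
        return -1 if row_a < row_b else 1
    if qual_a != qual_b:
        return -1 if qual_a < qual_b else 1
    return 0


# Layout A: the concatenated bytes of (row1, col1) start with "row1c".
flat = naive_key(b"row1", b"col1")
print(flat.startswith(b"row1c"))

# Layout B: the row is recovered exactly, so it is not mistaken for row1c.
row, qualifier = split_key(length_aware_key(b"row1", b"col1"))
print(row.startswith(b"row1c"))
```

With layout A, a prefix match on "row1c" cannot tell the two rows apart;
with layout B, the comparison sees row1 and row1c as distinct rows and
sorts every row1 cell before any row1c cell.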


Does that make my query clearer?


On Thu, May 16, 2013 at 2:42 PM, Varun Sharma <va...@pinterest.com> wrote:

> Nothing, I am just curious...
>
> So, we will do a bunch of wasteful scanning. That is, let's say row1 has
> col1 - col100000, basically 100K columns; we will scan all those key
> values even though we are going to discard them. Is that correct?
>
>
> On Thu, May 16, 2013 at 2:30 PM, Stack <st...@duboce.net> wrote:
>
>> What are you seeing, Varun (or think you are seeing)?
>> St.Ack
>>
>>
>> On Thu, May 16, 2013 at 2:30 PM, Stack <st...@duboce.net> wrote:
>>
>> > On Thu, May 16, 2013 at 2:03 PM, Varun Sharma <va...@pinterest.com> wrote:
>> >
>> >> Or do we use some kind of demarcator b/w rows and columns and timestamps
>> >> when building the HFile keys and the indices?
>> >>
>> >
>> > No demarcation, but in KeyValue we keep row, column family name, column
>> > family qualifier, etc., lengths and offsets so the comparators only
>> > compare pertinent bytes.
>> >
>> > If you are doing a prefix scan w/ row1c, we should be starting the scan at
>> > row1c, not row1 (or more correctly at the row that starts the block we
>> > believe has a row1c row in it...).
>> >
>> > St.Ack
>> >
>>
>
>
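
As a rough illustration of the seek Stack describes above (start at the
block believed to contain row1c rather than at row1), here is a small
Python sketch. It is only a model under stated assumptions, not the HFile
implementation: cells are (row, qualifier) tuples, the index holds the
first cell of each block, and build_blocks, build_index and seek are
hypothetical names.

```python
from bisect import bisect_right


def build_blocks(cells, block_size):
    # Split the sorted cells into fixed-size blocks (assumed for simplicity).
    return [cells[i:i + block_size] for i in range(0, len(cells), block_size)]


def build_index(blocks):
    # The index keeps only the first cell of every block.
    return [block[0] for block in blocks]


def seek(blocks, index, start_row):
    # Find the first cell whose row is >= start_row, counting discarded cells.
    target = (start_row, b"")
    # Last block whose first cell sorts at or before the target.
    i = max(bisect_right(index, target) - 1, 0)
    skipped = 0
    for block in blocks[i:]:
        for cell in block:
            if cell >= target:
                return cell, skipped
            skipped += 1
    return None, skipped


# row1 has 1000 columns, followed by a single row1c cell.
cells = [(b"row1", b"col%05d" % n) for n in range(1000)]
cells.append((b"row1c", b"col1"))
blocks = build_blocks(cells, 64)
index = build_index(blocks)

cell, skipped = seek(blocks, index, b"row1c")
print(cell, skipped)
```

In this model the index lookup jumps past the earlier row1 blocks, so the
cells read and discarded before reaching row1c amount to at most one
block's worth, however many columns row1 has.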
