Re: Some questions about secondary index

James Taylor Thu, 18 Aug 2016 16:55:24 -0700

Hi William,
I think those classes demonstrate how to use mutable secondary indexes
directly with HBase (i.e. outside of Phoenix). I agree, they could be moved
into the IT directory.


You might take a look at this presentation [1] (also linked near the
bottom of our secondary index page), starting from slide 31. It has some
examples of out-of-order handling. It's not an easy problem.
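
As a rough illustration of the out-of-order case, here is a toy sketch in
plain Java. It is not Phoenix or HBase code: the class
OutOfOrderIndexSketch, its methods, and the fixUpFuture flag are all made
up for this example, and it models only one data row with one indexed
column. It assumes that a delete marker at timestamp T hides index puts
with a timestamp <= T. A Put with ts = 20 is indexed first, then a Put
with ts = 10 for the same row arrives late. The late value gets an index
entry at ts = 10 plus a delete at ts = 20 (the time of the newer row), so
that it stops being visible once the newer value takes over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: one data row, one indexed column, index cells kept in a list. */
class OutOfOrderIndexSketch {
    /** A Put or a Delete marker on an index row at a given timestamp. */
    static final class IndexCell {
        final String indexRow;
        final long ts;
        final boolean isDelete;

        IndexCell(String indexRow, long ts, boolean isDelete) {
            this.indexRow = indexRow;
            this.ts = ts;
            this.isDelete = isDelete;
        }
    }

    private final boolean fixUpFuture; // hypothetical switch, for comparison only
    private final TreeMap<Long, String> dataVersions = new TreeMap<>();
    private final List<IndexCell> indexCells = new ArrayList<>();

    OutOfOrderIndexSketch(boolean fixUpFuture) {
        this.fixUpFuture = fixUpFuture;
    }

    /** Applies a data Put and derives the index cells for it. */
    void put(long ts, String value) {
        Map.Entry<Long, String> older = dataVersions.lowerEntry(ts);
        Map.Entry<Long, String> newer = dataVersions.higherEntry(ts);
        dataVersions.put(ts, value);

        // Index the new value at its own timestamp (index row = value here;
        // a real index row key would also contain the data row key).
        indexCells.add(new IndexCell(value, ts, false));

        // Normal case: hide the previous value's index entry from ts onward.
        if (older != null && !older.getValue().equals(value)) {
            indexCells.add(new IndexCell(older.getValue(), ts, true));
        }
        // Out-of-order case: a newer version already exists, so this value
        // is superseded at the newer row's timestamp. Delete it there.
        if (fixUpFuture && newer != null && !newer.getValue().equals(value)) {
            indexCells.add(new IndexCell(value, newer.getKey(), true));
        }
    }

    /** Index rows visible to a reader at readTs, in sorted order. */
    List<String> visibleAt(long readTs) {
        Map<String, Long> latestPut = new TreeMap<>();
        Map<String, Long> latestDelete = new TreeMap<>();
        for (IndexCell c : indexCells) {
            if (c.ts > readTs) {
                continue; // cell is in the reader's future
            }
            (c.isDelete ? latestDelete : latestPut).merge(c.indexRow, c.ts, Long::max);
        }
        List<String> visible = new ArrayList<>();
        for (Map.Entry<String, Long> e : latestPut.entrySet()) {
            long deletedAt = latestDelete.getOrDefault(e.getKey(), Long.MIN_VALUE);
            if (e.getValue() > deletedAt) {
                visible.add(e.getKey());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        for (boolean fix : new boolean[] {false, true}) {
            OutOfOrderIndexSketch s = new OutOfOrderIndexSketch(fix);
            s.put(20, "b"); // arrives first
            s.put(10, "a"); // arrives late, i.e. out of order
            System.out.println("fixUpFuture=" + fix
                    + " latest=" + s.visibleAt(Long.MAX_VALUE)
                    + " at15=" + s.visibleAt(15));
        }
    }
}
```

In this toy model, with fixUpFuture = true a read at the latest time sees
only [b] and a read at ts = 15 sees [a]. With fixUpFuture = false the
latest read sees [a, b], i.e. a stale index row for the late value.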

Thanks,
James

[1]
http://www.slideshare.net/jesse_yates/phoenix-secondary-indexing-la-hug-sept-9th-2013

On Tue, Aug 16, 2016 at 10:15 PM, William <[email protected]> wrote:

> Hi all,
>   I've been reading the secondary index code recently and I found it
> very hard to understand. Here are some questions:
>   1. There are 5 classes defined in package
> 'org.apache.phoenix.hbase.index.covered.example',
> but it seems that these classes are only referenced in tests.
>       If that's true, then why not put them into the IT/test directory?
>       If not, then what are they used for?
>
>
>   2. Class IndexMemStore.
>       I read the comment at the header of this class many times but I
> still cannot get the point. What is the 'out-of-order' scenario?
>       I read the comment of CoveredColumnIndexer too; it might have shown
> me an 'example' of this scenario. The comment:
>
>
>  Taking the simple case, assume we do a single column in a group. Then if
> we get an out of order
>  update, we need to check the current state of that column in the current
> row. If the current row
>  is older, we can issue a delete as normal. If the current row is newer,
> however, we then have to
>  issue a delete for the index update at the time of the current row. This
> ensures that the index
>  update made for the 'future' time still covers the existing row.
>
>
>       So, if I delete an existing row of the data table with ts = 10,
> while the existing row has a ts of 20, which is 'newer' than the current
> operation, do we call the current Delete operation 'back-in-time' or
> 'out-of-order'? What confuses me is the solution for this scenario:
> just issue the delete with the ts of the existing row, which means issuing
> a Delete with ts = 20? Am I right?
>      In my opinion, if a Delete is back in time, we can just ignore it or
> issue an index Delete simply with the same ts. Why are we using such a
> complex way to generate the index update?
>      As for the 'roll back' operation in NonTxIndexBuilder and
> IndexUpdateManager#fixUpCurrentUpdates(), I cannot see the purpose of
> these facilities. I think I must have missed something very important,
> which might be some core concept or design. Could someone provide me an
> easier way to understand this code?
>
>
> Thanks.
> William
