[ 
https://issues.apache.org/jira/browse/PHOENIX-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875213#comment-16875213
 ] 

Kadir OZDEMIR commented on PHOENIX-5373:
----------------------------------------

The upgrade process requires loading the new coprocessors on the existing 
tables and initiating an index rebuild on them. We are implementing a tool to 
do this, and it will be checked in soon. If a scan reaches rows that have not 
yet been rebuilt, GlobalIndexChecker will rebuild them as part of the scan. I 
have tested this with TOP N queries: TOP 50 queries are served with around 
100ms latency when all the rows are unverified. So, during the rebuild 
process, which is done table by table and on demand, scan latencies will 
increase because of the read-repair operations. Phoenix index tables are 
usually built within hours, so I think this will be an acceptable performance 
impact. 
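
To make the read-repair behavior concrete, here is a minimal Java sketch of 
the idea. It is not the actual GlobalIndexChecker code: the class 
ReadRepairSketch, the IndexRow type, the Status enum, and the in-memory maps 
are all hypothetical stand-ins, and the "repair" is simplified to a lookup in 
the data table. The point it illustrates is that a row with no verified 
marker (a row from the previous design) is treated as unverified, checked 
against the data table during the scan, and dropped if it is stale.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; none of these types exist in Phoenix.
final class ReadRepairSketch {

    enum Status { VERIFIED, UNVERIFIED }

    static final class IndexRow {
        final String dataKey;      // key of the data table row this index row points to
        final String indexedValue; // indexed column value stored in the index row
        final Status status;       // null models a row written by the previous design

        IndexRow(String dataKey, String indexedValue, Status status) {
            this.dataKey = dataKey;
            this.indexedValue = indexedValue;
            this.status = status;
        }
    }

    // Anything not explicitly VERIFIED, including legacy rows, needs repair.
    static boolean needsRepair(IndexRow row) {
        return row.status != Status.VERIFIED;
    }

    // Scans the index rows, repairing unverified ones against the data table.
    static List<IndexRow> scan(List<IndexRow> indexRows, Map<String, String> dataTable) {
        List<IndexRow> result = new ArrayList<>();
        for (IndexRow row : indexRows) {
            if (!needsRepair(row)) {
                result.add(row); // verified rows are served as-is
                continue;
            }
            String current = dataTable.get(row.dataKey);
            if (current != null && current.equals(row.indexedValue)) {
                // The data row confirms the index row: return it as verified.
                result.add(new IndexRow(row.dataKey, row.indexedValue, Status.VERIFIED));
            }
            // Otherwise the index row is stale and is left out of the result.
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> dataTable = new HashMap<>();
        dataTable.put("k1", "a");

        List<IndexRow> indexRows = new ArrayList<>();
        indexRows.add(new IndexRow("k1", "a", null)); // legacy row, data row exists
        indexRows.add(new IndexRow("k2", "b", null)); // legacy row, data row deleted

        System.out.println(scan(indexRows, dataTable).size()); // prints 1
    }
}
```

In this model the extra data table lookup per unverified row is what adds 
scan latency while the rebuild is in progress; once a row is verified, the 
scan returns it without the lookup.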

> GlobalIndexChecker should treat the rows created by the previous design as 
> unverified 
> --------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5373
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5373
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 4.14.2
>            Reporter: Kadir OZDEMIR
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>             Fix For: 4.15.0, 5.1.0
>
>         Attachments: PHOENIX-5373.4.x-HBase-1.4.001.patch, 
> PHOENIX-5373.master.001.patch
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> To ease the transition from the old global secondary index design to the 
> new one (without a read performance impact), GlobalIndexChecker treats 
> existing index rows (i.e., the rows created by the previous design) as 
> verified. We have discovered that this leads to keeping stale index rows 
> around forever and including them in query results. A stale index row 
> is a row for which we do not have the corresponding data table row. The 
> data table row is missing either because it was deleted (but the 
> corresponding index rows were not), or because the data table and index 
> rows were written with different timestamps. The assumption was that such 
> rows would be fixed by index rebuild. Unfortunately, without dropping or 
> truncating index tables, these stale rows may not be fixed by index rebuild. 
> Thus, GlobalIndexChecker should treat the rows created by the previous design 
> as unverified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
