[ 
https://issues.apache.org/jira/browse/PHOENIX-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113128#comment-16113128
 ] 

James Taylor edited comment on PHOENIX-3525 at 8/4/17 5:02 PM:
---------------------------------------------------------------

Current plan is to eliminate simultaneous writes from the rebuilder and clients 
to prevent any race conditions by:
# introducing an INDEX_ACTIVATE_TIMESTAMP column that determines when 
incremental indexing will begin again. This timestamp will be set by the 
rebuilder to a time in the future (by a configurable delta) after all index 
regions are online.
# the INDEX_ACTIVATE_TIMESTAMP will be cleared when the 
INDEX_DISABLED_TIMESTAMP is set in the MetaDataEndPointImpl.setIndexState call. 
The rebuilder would then reset it according to the logic in (1), moving it out 
to a later time.
# the INDEX_ACTIVATE_TIMESTAMP will act as an upper bound on the rebuilder scan 
that replays mutations. Only after this timestamp plus some delta passes (and 
the replaying is complete) will an index be marked as ACTIVE and the 
INDEX_ACTIVATE_TIMESTAMP and INDEX_DISABLED_TIMESTAMP be cleared.
# index maintenance will be prevented while server-based timestamp < 
INDEX_ACTIVATE_TIMESTAMP by having the clients not send the IndexMaintainer. 
The INDEX_ACTIVATE_TIMESTAMP will be included in PTable so that it makes its 
way to the clients.
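
The four steps above can be read as a small piece of timestamp bookkeeping. 
The following is only a rough sketch of that reading, not Phoenix code: the 
class IndexActivationSketch, its method names, the UNSET sentinel of 0, and 
keeping the earliest disable time are all assumptions made for illustration. 
Only the two column names and the ordering rules come from the plan itself.

```java
// Hypothetical model of the plan above; none of these names exist in Phoenix.
final class IndexActivationSketch {
    static final long UNSET = 0L; // assumed sentinel for "column is cleared"

    long indexDisabledTimestamp = UNSET; // models INDEX_DISABLED_TIMESTAMP
    long indexActivateTimestamp = UNSET; // models INDEX_ACTIVATE_TIMESTAMP
    boolean active = true;

    // Step 2: setting the disabled timestamp clears any pending activation.
    // Assumption: the earliest disable time is kept if one is already set.
    void onIndexWriteFailure(long serverNow) {
        if (indexDisabledTimestamp == UNSET) {
            indexDisabledTimestamp = serverNow;
        }
        indexActivateTimestamp = UNSET;
        active = false;
    }

    // Step 1: once all index regions are online, the rebuilder schedules
    // activation a configurable delta into the future.
    void onAllIndexRegionsOnline(long serverNow, long activateDeltaMs) {
        indexActivateTimestamp = serverNow + activateDeltaMs;
    }

    // Step 3: the replay scan is bounded below by the disabled timestamp
    // and above by the activate timestamp.
    long[] rebuildScanRange() {
        if (indexDisabledTimestamp == UNSET || indexActivateTimestamp == UNSET) {
            throw new IllegalStateException("both timestamps must be set");
        }
        return new long[] { indexDisabledTimestamp, indexActivateTimestamp };
    }

    // Step 4: clients do not send the IndexMaintainer while the server-based
    // timestamp is still below the activate timestamp.
    boolean clientSendsIndexMaintainer(long serverNow) {
        return indexActivateTimestamp == UNSET
                || serverNow >= indexActivateTimestamp;
    }

    // Step 3, second half: mark ACTIVE and clear both columns only after the
    // activate timestamp plus a grace delta has passed and replay is complete.
    boolean tryMarkActive(long serverNow, long graceDeltaMs, boolean replayComplete) {
        if (indexActivateTimestamp == UNSET || !replayComplete) {
            return false;
        }
        if (serverNow <= indexActivateTimestamp + graceDeltaMs) {
            return false;
        }
        indexDisabledTimestamp = UNSET;
        indexActivateTimestamp = UNSET;
        active = true;
        return true;
    }
}
```

In this sketch the rebuilder only ever replays mutations older than the 
activate timestamp, and clients only resume index maintenance at or after it, 
so the two writers never cover the same time range.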



was (Author: jamestaylor):
Current plan is to eliminate simultaneous writes from the rebuilder and clients 
to prevent any race conditions by:
* introducing a PENDING_ACTIVE index state. When in PENDING_ACTIVE state, an 
index will not be used by queries until the server-based timestamp >= 
INDEX_ACTIVATE_TIMESTAMP.
* introducing an INDEX_ACTIVATE_TIMESTAMP column that determines when an index 
will be reactivated. This timestamp will be set by the rebuilder to a time in 
the future (by a configurable amount of time) after all index regions are 
online. The index will be either left in an ACTIVE state (depending on 
config) or moved to a PENDING_ACTIVE state.
* prevent index maintenance by not sending IndexMaintainer until server-based 
timestamp >= INDEX_ACTIVATE_TIMESTAMP.
* include INDEX_ACTIVATE_TIMESTAMP in PTable so that clients can use it to 
control whether index maintenance is performed.


> Cap automatic index rebuilding to inactive timestamp.
> -----------------------------------------------------
>
>                 Key: PHOENIX-3525
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3525
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Ankit Singhal
>            Assignee: James Taylor
>         Attachments: PHOENIX-3525_wip2.patch, PHOENIX-3525_wip.patch
>
>
> From [[email protected]] review comment on 
> https://github.com/apache/phoenix/pull/210
> For automatic rebuilding, DISABLED_TIMESTAMP is the lower bound, but there is 
> no upper bound, so we are going to rebuild all the new writes written after 
> DISABLED_TIMESTAMP even though the indexes were updated properly. So we can 
> introduce an upper bound of time where we are going to start a rebuild thread 
> so we can limit the data to rebuild. If there are frequent writes, then we 
> can increment the rebuild period exponentially.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
