[
https://issues.apache.org/jira/browse/PHOENIX-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113128#comment-16113128
]
James Taylor edited comment on PHOENIX-3525 at 8/4/17 5:02 PM:
---------------------------------------------------------------
Current plan is to eliminate simultaneous writes from the rebuilder and clients
to prevent any race conditions by:
# introducing an INDEX_ACTIVATE_TIMESTAMP column that determines when
incremental indexing will begin again. This timestamp will be set by the
rebuilder to a time in the future (by a configurable delta) after all index
regions are online.
# the INDEX_ACTIVATE_TIMESTAMP will be cleared when the
INDEX_DISABLED_TIMESTAMP is set in the MetaDataEndPointImpl.setIndexState call.
The rebuilder would then reset it according to the logic in (1), moving it out
to a later time.
# the INDEX_ACTIVATE_TIMESTAMP will act as an upper bound on the rebuilder scan
that replays mutations. Only after this timestamp plus some delta passes (and
the replaying is complete) will an index be marked as ACTIVE and the
INDEX_ACTIVATE_TIMESTAMP and INDEX_DISABLED_TIMESTAMP be cleared.
# index maintenance will be prevented while the server-based timestamp <
INDEX_ACTIVATE_TIMESTAMP by having the clients not send the IndexMaintainer.
The INDEX_ACTIVATE_TIMESTAMP will be included in PTable so that it makes its
way to the clients.
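
To make the four steps above concrete, here is a rough in-memory sketch of the
timestamp handling. It is not Phoenix code: the class IndexActivationSketch,
its method names, and the activateDelta and settleDelta parameters are all
hypothetical. It also assumes that an already-set INDEX_DISABLED_TIMESTAMP is
kept on repeated failures and that "passes" means strictly greater than;
neither is stated in the plan.

```java
import java.util.OptionalLong;

/** Hypothetical model of the index metadata columns; illustration only. */
final class IndexActivationSketch {
    enum State { ACTIVE, DISABLED }

    State state = State.ACTIVE;
    OptionalLong disabledTs = OptionalLong.empty(); // INDEX_DISABLED_TIMESTAMP
    OptionalLong activateTs = OptionalLong.empty(); // INDEX_ACTIVATE_TIMESTAMP

    /** Step 2: setting the disabled timestamp clears the activate timestamp. */
    void onIndexWriteFailure(long serverTs) {
        state = State.DISABLED;
        if (!disabledTs.isPresent()) {
            // Assumption: the first failure time is kept as the lower bound.
            disabledTs = OptionalLong.of(serverTs);
        }
        activateTs = OptionalLong.empty();
    }

    /** Step 1: once all index regions are online, the rebuilder picks a future time. */
    void scheduleActivation(long serverTs, long activateDelta) {
        if (state == State.DISABLED) {
            activateTs = OptionalLong.of(serverTs + activateDelta);
        }
    }

    /**
     * Step 3: the index becomes ACTIVE only after the activate timestamp plus a
     * delta has passed and the replay is complete; both timestamps are then cleared.
     */
    boolean tryMarkActive(long serverTs, long settleDelta, boolean replayComplete) {
        if (!activateTs.isPresent() || !replayComplete) {
            return false;
        }
        if (serverTs <= activateTs.getAsLong() + settleDelta) {
            return false;
        }
        state = State.ACTIVE;
        disabledTs = OptionalLong.empty();
        activateTs = OptionalLong.empty();
        return true;
    }

    /** Step 4: clients skip index maintenance while serverTs < activate timestamp. */
    boolean shouldSendIndexMaintainer(long serverTs) {
        return !(activateTs.isPresent() && serverTs < activateTs.getAsLong());
    }
}
```

What clients should do while the index is disabled and no
INDEX_ACTIVATE_TIMESTAMP has been set yet is not covered by the steps above,
so the sketch leaves that case unconstrained.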
was (Author: jamestaylor):
Current plan is to eliminate simultaneous writes from the rebuilder and clients
to prevent any race conditions by:
* introducing a PENDING_ACTIVE index state. When in PENDING_ACTIVE state, an
index will not be used by queries until the server-based timestamp >=
INDEX_ACTIVATE_TIMESTAMP.
* introducing an INDEX_ACTIVATE_TIMESTAMP column that determines when an index
will be reactivated. This timestamp will be set by the rebuilder to a time in
the future (by a configurable amount of time) after all index regions are
online. The index will either be left in an ACTIVE state or moved to a
PENDING_ACTIVE state, depending on config.
* prevent index maintenance by not sending IndexMaintainer until server-based
timestamp >= INDEX_ACTIVATE_TIMESTAMP.
* include INDEX_ACTIVATE_TIMESTAMP in PTable so that clients can use it to
control whether index maintenance is performed.
> Cap automatic index rebuilding to inactive timestamp.
> -----------------------------------------------------
>
> Key: PHOENIX-3525
> URL: https://issues.apache.org/jira/browse/PHOENIX-3525
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Ankit Singhal
> Assignee: James Taylor
> Attachments: PHOENIX-3525_wip2.patch, PHOENIX-3525_wip.patch
>
>
> From [[email protected]] review comment on
> https://github.com/apache/phoenix/pull/210
> For automatic rebuilding, DISABLED_TIMESTAMP is the lower bound, but there is
> no upper bound, so we are going to rebuild all the new writes written after
> DISABLED_TIMESTAMP even though the indexes were updated properly. So we can
> introduce an upper bound at the time when we start a rebuild thread, so that
> we can limit the data to rebuild. If there are frequent writes, then we can
> increment the rebuild period exponentially.
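
A minimal sketch of the idea in the description above: cap the rebuild scan
at the time the rebuild thread starts, and grow the rebuild period
exponentially across consecutive attempts. The class RebuildWindowSketch, its
method names, the doubling factor, and the maxPeriodMs cap are assumptions
made for illustration; the description does not specify them.

```java
/** Hypothetical helper; illustration only, not Phoenix code. */
final class RebuildWindowSketch {

    /**
     * Rebuild period for the given consecutive attempt (0-based).
     * The period doubles per attempt and is capped at maxPeriodMs.
     */
    static long rebuildPeriodMs(long basePeriodMs, int attempt, long maxPeriodMs) {
        long period = Math.min(basePeriodMs, maxPeriodMs);
        for (int i = 0; i < attempt && period < maxPeriodMs; i++) {
            // Guard against overflow before doubling.
            period = (period > maxPeriodMs / 2) ? maxPeriodMs : period * 2;
        }
        return period;
    }

    /**
     * Scan range as {lowerInclusive, upperExclusive}: from DISABLED_TIMESTAMP
     * up to the time the rebuild thread started.
     */
    static long[] scanRange(long disabledTs, long rebuildStartTs) {
        return new long[] { disabledTs, Math.max(disabledTs, rebuildStartTs) };
    }
}
```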