[
https://issues.apache.org/jira/browse/PHOENIX-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185791#comment-14185791
]
rajeshbabu commented on PHOENIX-1170:
-------------------------------------
[~jamestaylor] Thanks for the review. Here is the patch addressing the review
comments.
bq. Instead of creating a new LocalIndexCompactionObserver coprocessor,
override the preCompactScannerOpen and postCompact methods in the existing
LocalIndexSplitter coprocessor
Currently the preCompactScannerOpen and postCompact methods are required only
for local index regions, so I added them to the
IndexHalfStoreFileReaderGenerator coprocessor, which is attached only to the
local index table. The LocalIndexSplitter coprocessor is attached only to the
data table descriptor.
bq. Instead of creating a new PIndexState enum, just use INACTIVE instead of
SPLIT.
Correct, it is better to use INACTIVE only.
bq. Does the preSplitBeforePONR method get called in both cases (i.e. after
preCompactScannerOpen and postCompact) so that the index will be marked ACTIVE
again in all cases?
In the preSplitBeforePONR method we mark the local indexes' state as INACTIVE;
this method is always called before the preCompactScannerOpen and postCompact
hooks. In the postCompact hook we mark the indexes' state as ACTIVE once all
the files have been separated into the daughter regions.
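The ordering described above can be sketched as a small state holder. This is only an illustration under stated assumptions: the class, enum, and method names are hypothetical stand-ins, not the patch's code or the HBase/Phoenix coprocessor API. It models just the two transitions mentioned here (INACTIVE at preSplitBeforePONR, ACTIVE at postCompact).

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of the state transitions described in this comment.
// It does not use or imitate the real coprocessor signatures.
public class LocalIndexStateSketch {
    // Only the two states discussed here; a stand-in, not Phoenix's PIndexState.
    enum IndexState { ACTIVE, INACTIVE }

    private final AtomicReference<IndexState> state =
            new AtomicReference<>(IndexState.ACTIVE);

    // Models the step taken in preSplitBeforePONR: stop using the index for queries.
    void onPreSplitBeforePONR() {
        state.set(IndexState.INACTIVE);
    }

    // Models the step taken in postCompact on a daughter region.
    void onPostCompact() {
        state.set(IndexState.ACTIVE);
    }

    // A query planner would consult something like this before using the index.
    boolean usableForQueries() {
        return state.get() == IndexState.ACTIVE;
    }

    IndexState current() {
        return state.get();
    }

    public static void main(String[] args) {
        LocalIndexStateSketch idx = new LocalIndexStateSketch();
        idx.onPreSplitBeforePONR();
        System.out.println("during split: " + idx.current());
        idx.onPostCompact();
        System.out.println("after a daughter compaction: " + idx.current());
    }
}
```

Because the split hook always runs first, a query that checks the state in between sees INACTIVE and falls back to the data table.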
Ideally we should change the state to ACTIVE only after both daughter regions
complete their compaction after the split. With the patch we set the indexes'
state to ACTIVE as soon as at least one region's compaction has completed, so
there is still a chance that the other daughter region's compaction is in
progress when we set ACTIVE. This is only an approximation. What do you say,
[~jamestaylor]? Is it OK? We could check whether both regions have completed
compaction, but that risks a race condition that could leave the indexes
stuck in the INACTIVE state.
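For comparison, the stricter "both daughters" check could look roughly like the sketch below. It is a single-process illustration only, and every name in it is hypothetical. The daughter regions may end up hosted on different region servers, in which case the pending set would have to live in shared, durable state, and that is where the race mentioned above would need to be handled.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical tracker: flips the index back to ACTIVE only after both
// daughter regions have reported a completed compaction.
public class DaughterCompactionTracker {
    private final Set<String> pending = new HashSet<>();
    private final Runnable markActive;

    DaughterCompactionTracker(String daughterA, String daughterB, Runnable markActive) {
        pending.add(daughterA);
        pending.add(daughterB);
        this.markActive = markActive;
    }

    // Returns true only for the call that completes the pair.
    // synchronized so two concurrent reports cannot both trigger markActive.
    synchronized boolean compactionCompleted(String daughter) {
        // remove() is false for unknown or duplicate reports, so those are ignored.
        if (pending.remove(daughter) && pending.isEmpty()) {
            markActive.run();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        DaughterCompactionTracker tracker = new DaughterCompactionTracker(
                "daughterA", "daughterB",
                () -> System.out.println("index state -> ACTIVE"));
        System.out.println(tracker.compactionCompleted("daughterA"));
        System.out.println(tracker.compactionCompleted("daughterB"));
    }
}
```

If one daughter never reports (for example, its compaction fails or the report is lost), the callback never runs and the index stays INACTIVE, which is the failure mode the simpler approximation in the patch avoids.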
> Change status of local index during splitting to prevent usage when slower
> than query through data table
> --------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-1170
> URL: https://issues.apache.org/jira/browse/PHOENIX-1170
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: James Taylor
> Assignee: rajeshbabu
> Fix For: 5.0.0, 4.2
>
> Attachments: PHOENIX-1170.patch, PHOENIX-1170_v2.patch
>
>
> Without pre-split, queries to the table take 9x more time (i.e. 1 sec versus
> 9 sec) for a count(*). If we can't bring the time down to be less than a full
> scan over the data table, we should update the local index status as INACTIVE
> while it's splitting, then it wouldn't be used for queries, but it would
> continue to be maintained. Then when the split is done, we could move it back
> to ACTIVE. Alternatively, we could invent a new status, like SPLITTING, and
> only use the local index for point lookups until the status is back to ACTIVE.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)