Lokesh Khurana created PHOENIX-7915:
---------------------------------------
Summary: Dual-maintain secondary indexes through the transform window
Key: PHOENIX-7915
URL: https://issues.apache.org/jira/browse/PHOENIX-7915
Project: Phoenix
Issue Type: Sub-task
Reporter: Lokesh Khurana
A transform on a data table that has secondary indexes must keep the
indexes consistent with the new physical table. A drop-and-recreate
approach (drop indexes pre-cutover, rebuild post-cutover via
{{CREATE INDEX}} MR) costs N × M full-table scans (one per index per view)
and produces a customer-visible window where index-using queries fall
back to base-table scans during the rebuild. For rowkey-changing
transforms (e.g., a salt-bucket change that reshapes every rowkey on the
data table), this rebuild cost dominates the transform's wall-clock time and
makes the feature impractical at the cluster sizes Phoenix is deployed
at today.
This sub-task replaces drop-and-recreate with a dual-maintain approach.
Each secondary index participates in the same dual-write protocol as
the data table for the duration of the transform; cutover swaps the
index physical pointers atomically alongside the data-table pointer; no
post-cutover index rebuild is needed.
*Approach:*
{{IndexRegionObserver}} already supports N {{IndexMaintainer}} instances per
mutation. This sub-task extends that to 2N during a transform: each
existing {{IndexMaintainer}} is paired with a new
{{IndexTransformMaintainer}} that targets the new index physical. Each
pair follows the same verified/unverified protocol that the data-table
dual-write uses.
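A minimal sketch of the N-to-2N pairing, for illustration only. The class, the {{Maintainer}} stand-in, and {{effectiveMaintainers}} are hypothetical and do not reflect Phoenix's actual {{IndexMaintainer}} API; the sketch also assumes that an index with no new physical (for example, a local index) is simply not paired.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not Phoenix code: all names here are illustrative.
class DualMaintainSketch {

    /** Stand-in for a maintainer: records which physical table it writes to. */
    static final class Maintainer {
        final String indexName;
        final String targetPhysical;
        final boolean transformTarget; // true for the paired transform maintainer

        Maintainer(String indexName, String targetPhysical, boolean transformTarget) {
            this.indexName = indexName;
            this.targetPhysical = targetPhysical;
            this.transformTarget = transformTarget;
        }
    }

    /**
     * Returns the maintainers to run for one mutation. With no transform in
     * flight (empty map) this is the existing N. During a transform, every
     * index that has a new physical contributes a second, paired maintainer.
     */
    static List<Maintainer> effectiveMaintainers(List<Maintainer> existing,
                                                 Map<String, String> newPhysicalByIndex) {
        List<Maintainer> out = new ArrayList<>();
        for (Maintainer m : existing) {
            out.add(m); // the old index physical keeps receiving writes
            String newPhysical = newPhysicalByIndex.get(m.indexName);
            if (newPhysical != null) {
                out.add(new Maintainer(m.indexName, newPhysical, true));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Maintainer> existing = new ArrayList<>();
        existing.add(new Maintainer("IDX_A", "IDX_A", false));
        existing.add(new Maintainer("IDX_B", "IDX_B", false));

        Map<String, String> newPhysicals = new LinkedHashMap<>();
        newPhysicals.put("IDX_A", "IDX_A_NEW");
        newPhysicals.put("IDX_B", "IDX_B_NEW");

        // Two existing maintainers, each paired once: prints 4
        System.out.println(effectiveMaintainers(existing, newPhysicals).size());
    }
}
```

Keeping the paired maintainer adjacent to its original in the returned list is a choice made for this sketch so that per-index writes stay grouped; the description does not specify an ordering.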
*Mechanics:*
* *At {{addTransform}}:* create a new HBase physical for each global /
uncovered index on the data table and each view-index. Local indexes
share the data-table physical, so no new physical is created for them;
they ride along with the data-table swap. A per-index OLD_METADATA
snapshot is stored alongside the data-table snapshot for use by the
partial-pass.
* *During transform window:* each existing {{IndexMaintainer}} is paired
with an {{IndexTransformMaintainer}} targeting the new index physical.
Dual-writes flow per-index in lock-step with data-table dual-writes.
The bulk-copy MR re-projects index entries onto the new index
physicals.
* *At cutover:* a single SYSCAT commit swaps {{PHYSICAL_TABLE_NAME}} on
the data-table row, each index row, each view row, and each
view-index row, and DELETEs the {{TRANSFORMING_NEW_TABLE}} link rows.
This means one client-cache invalidation cycle propagates both the
pointer swap and the dual-write shutoff for every participating
table.
* *Partial-pass:* the data-table partial-pass re-runs all maintainers
(including {{IndexMaintainer}}) on UNVERIFIED rows on the new data
physical. This produces correct index entries on the new index
physicals. No separate per-index partial-pass is needed.
* *At {{completeTransform}}:* disable the old data physical and each
old index physical; schedule each for deferred-drop after the
retention window.
* *On abort:* schedule deferred-drop for the new data physical and
each new index physical; same cache-cycle wait pattern as the
data-table abort path.
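
The cutover step above can be sketched as an all-or-nothing swap over an in-memory catalog. This is an illustrative model only: {{CutoverSketch}} and its methods are hypothetical, the maps stand in for SYSCAT rows and {{TRANSFORMING_NEW_TABLE}} links, and real atomicity would come from the single SYSCAT commit rather than from anything shown here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of the catalog; not Phoenix's SYSCAT API.
class CutoverSketch {
    // catalog row (data table, index, view, view-index) -> physical table name
    private final Map<String, String> physicalByRow = new HashMap<>();
    // rows that currently carry a transforming-new-table link
    private final Set<String> transformLinks = new HashSet<>();
    // new physical staged for each participating row
    private final Map<String, String> pendingNewPhysical = new HashMap<>();

    void register(String row, String physical) {
        physicalByRow.put(row, physical);
    }

    /** Stages a new physical for a row and marks it as dual-writing. */
    void addTransform(String row, String newPhysical) {
        pendingNewPhysical.put(row, newPhysical);
        transformLinks.add(row);
    }

    boolean dualWriteActive(String row) {
        return transformLinks.contains(row);
    }

    String physicalOf(String row) {
        return physicalByRow.get(row);
    }

    /** Swaps every staged pointer and deletes every link in one step. */
    synchronized void cutover() {
        // Validate first so that a failure leaves the catalog untouched.
        for (String row : pendingNewPhysical.keySet()) {
            if (!physicalByRow.containsKey(row)) {
                throw new IllegalStateException("unknown catalog row: " + row);
            }
        }
        physicalByRow.putAll(pendingNewPhysical);               // pointer swap
        transformLinks.removeAll(pendingNewPhysical.keySet());  // dual-write shutoff
        pendingNewPhysical.clear();
    }
}
```

Because the pointer swap and the link deletion happen in the same step, a reader of this model never observes a row that points at the new physical while still flagged for dual-write, which mirrors the single cache-invalidation cycle described in the cutover bullet.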