Re: Provide an option to infinite retry when updating index failed

James Taylor Wed, 09 Aug 2017 11:05:12 -0700

We've been doing a ton of work to stabilize mutable secondary indexing
recently (see recently resolved JIRAs). This will appear in 4.12.0 or
4.11.1. Also, a lot of work was done for local indexing in 4.11. We're
still testing these at scale, so there may be more to come in 4.12 for
local indexing too.


One other potentially viable option if you're ok with eventual consistency
is to leave the index active even when data table writes fail (and let the
partial index rebuilder catch it up for you). We likely need PHOENIX-3949
to make that option viable.

So my thinking would be to wait until 4.12 and to check out local indexing
too.

Thanks,
James

On Tue, Aug 8, 2017 at 8:09 PM, William <[email protected]> wrote:

> Hi all,
> To keep a data table consistent with its index tables, we would need a
> transactional update across regions on different region servers. For
> non-transactional tables, we cannot guarantee this consistency for mutable
> global secondary indexes. Here are the problems with the existing solutions:
> 1. Disable index writes
>   a) Updating system.catalog to change the index status and set the
> timestamp may lead to chain failures
>   b) A partial index rebuild may not be a good solution for production
> environments, because:
>      b1) it may run for a long time on a large table (several TBs)
>      b2) only a few rows may be inconsistent and need to be caught up,
> but we still have to do a time-ranged scan over the whole data table
>      b3) if there were deletes/updates and a major compaction took place,
> it will leave dirty data in the index tables
>   c) Selects that hit the disabled index degenerate to full table scans
> against the data table, which may quickly exhaust the read capacity of the
> whole cluster
>
>
> 2. Disable data table writes
>   a) Selects that hit the index still work
>   b) Data table writes are not actually disabled; they raise an exception.
> So the index tables still need to be rebuilt when the index regions are
> back online, which has the same issues as 1.b
>   c) Since an index rebuild is needed, system.catalog still needs to be
> updated, so chain failures may still happen.
>
>
> What should be guaranteed:
> 1. absolutely no chain failure
> 2. absolutely no inconsistency, no matter what happens
> 3. selects that hit the index will not degenerate
>
>
> New solution:
> 1. When an index update fails, retry forever until it succeeds
> 2. Do the same retry when replaying the WAL
> 3. No catalog table update is needed, which avoids potential chain failures
> 4. This index failure policy is an option that can be switched on/off
>
>
> About this solution:
> 1. Simple
> 2. When an index update fails, we give up write availability in order to
> preserve consistency and read availability. This is acceptable for a
> mutable global index, since its read availability is more important.
> 3. No need to rebuild the index afterwards: once the pending retries
> complete, the indexes will be in sync.
> 4. In the worst case, some or all of the region servers will be unable to
> accept writes.
> 5. We cannot handle index update failures elegantly because we are not
> doing real transactions. So this solution is a simple but effective way to
> achieve consistency without transactions, though there is a price.
>
>
> What does everybody think?
>
>
> Thanks
> William
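
For illustration only, the retry-forever policy proposed in the quoted
message could be sketched roughly as below. This is not Phoenix code:
write_index_with_infinite_retry, FlakyIndexRegion, and the capped
exponential backoff are all hypothetical, and the backoff in particular
is an assumption (the proposal only says "retry forever").

```python
import time
from typing import Callable, List


def write_index_with_infinite_retry(
    write_index: Callable[[], None],
    initial_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call write_index until it succeeds; return how many attempts failed.

    The caller stays blocked while the index is unavailable, which is the
    trade-off described in the proposal: write availability is given up so
    that the index never falls out of sync with the data table.
    """
    delay = initial_delay_s
    failures = 0
    while True:
        try:
            write_index()
            return failures
        except IOError:
            failures += 1
            sleep(delay)
            # Assumed policy: double the wait each time, up to a cap.
            delay = min(delay * 2, max_delay_s)


class FlakyIndexRegion:
    """Hypothetical stand-in for an index region that is offline for a while."""

    def __init__(self, failures_before_recovery: int) -> None:
        self.remaining = failures_before_recovery
        self.rows: List[str] = []

    def put(self, row: str) -> None:
        if self.remaining > 0:
            self.remaining -= 1
            raise IOError("index region unavailable")
        self.rows.append(row)


# Usage: the region rejects three writes, then recovers. The sleep function
# is replaced by a recorder so the example runs instantly.
region = FlakyIndexRegion(failures_before_recovery=3)
waits: List[float] = []
failed = write_index_with_infinite_retry(
    lambda: region.put("row1"), sleep=waits.append
)
print(failed, waits, region.rows)
```

Because the loop never gives up, no index state change in system.catalog
and no later rebuild are needed in this sketch; the cost is that the
writer is stalled for as long as the index region stays unavailable.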
