Hi all,
To keep a data table consistent with its index tables, we would have to do
a transactional update across regions on different region servers. For
non-transactional tables, we cannot guarantee this consistency for mutable
global secondary indexes. Here are the problems with the existing solutions:
1. disable index write
a) updating system.catalog to change the index status and set the timestamp
may lead to chain failures
b) partially rebuilding the index may not be a good solution for production
environments, because:
b1) it may run for a long time on a large table (several TBs)
b2) there may be only a few inconsistent rows that need to be caught up, but
we have to do a full time-ranged scan over the data table
b3) if there were deletes/updates and a major compaction took place, it will
leave dirty data in the index tables
c) selects that hit the disabled index will degenerate to full table scans
against the data table, which may quickly exhaust the read capacity of the
whole cluster
2. disable data table write
a) selects that hit the index still work
b) data table writes are not actually disabled; they raise an exception
instead. So the index tables still need to be rebuilt when the index regions
are back online, which has the same issues as 1.b
c) since an index rebuild is needed, system.catalog still needs to be
updated, so chain failures may still happen.
What should be guaranteed:
1. absolutely no chain failure
2. absolutely no inconsistency, no matter what happens
3. selects that hit the index will not degenerate
New solution:
1. When an index update fails, retry forever until it succeeds
2. Do the same retry when replaying the WAL
3. Do not update the catalog table, which avoids potential chain failures
4. This index failure policy is an option that can be switched on/off
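A rough Java sketch of items 1, 2 and 4, for illustration only. The names
IndexWrite and RetryForeverPolicy, the enabled flag, and the capped
exponential backoff are all assumptions made for this example; they are not
existing Phoenix classes or settings, and the proposal itself only requires
"retry until success".

```java
import java.io.IOException;

/** Hypothetical stand-in for one batch of index mutations sent to an index region. */
interface IndexWrite {
    void apply() throws IOException;
}

/** Hypothetical policy sketch; not Phoenix's actual failure-policy API. */
final class RetryForeverPolicy {
    private final boolean enabled;      // item 4: the policy can be switched on/off
    private final long initialDelayMs;  // assumed backoff parameter
    private final long maxDelayMs;      // assumed cap on the backoff

    RetryForeverPolicy(boolean enabled, long initialDelayMs, long maxDelayMs) {
        this.enabled = enabled;
        this.initialDelayMs = initialDelayMs;
        this.maxDelayMs = maxDelayMs;
    }

    /**
     * Applies the index write. When the policy is enabled, this blocks and
     * retries until the write succeeds; when disabled, the first failure is
     * rethrown to the caller. Returns the number of attempts made.
     */
    int write(IndexWrite write) throws IOException, InterruptedException {
        long delayMs = initialDelayMs;
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                write.apply();
                return attempts;
            } catch (IOException e) {
                if (!enabled) {
                    throw e; // policy switched off: surface the failure
                }
                // Never give up and never touch the catalog table; just wait
                // and try again with a capped, doubling delay.
                Thread.sleep(delayMs);
                delayMs = Math.min(maxDelayMs, delayMs * 2);
            }
        }
    }
}
```

The same write(...) call would be used from both the normal write path and
the WAL replay path (item 2), so a region that is replaying edits also blocks
until its index updates go through.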
About this solution:
1. Simple
2. When an index update fails, we give up write availability to preserve
consistency and read availability. This is acceptable for a mutable global
index, as its read availability is more important.
3. No need to rebuild the index afterwards; as long as the pending retries
complete, the indexes will be in sync.
4. In the worst case, some or all of the RSs will not be able to accept
writes.
5. We cannot handle index update failures elegantly because we are not doing
real transactions. So this solution is a simple but effective way to achieve
consistency without transactions, though there is a price.
What does everybody think?
Thanks
William