[ 
https://issues.apache.org/jira/browse/PHOENIX-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geoffrey Jacoby updated PHOENIX-5604:
-------------------------------------
    Summary: Index rebuilds and read repairs should not skip WAL (was: Index 
rebuilds should not skip WAL)

> Index rebuilds and read repairs should not skip WAL
> ---------------------------------------------------
>
>                 Key: PHOENIX-5604
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5604
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>
> Currently both Index read repairs and IndexTool build/rebuilds in the new 
> design continue to skip the WAL, following the same pattern the old Indexer 
> used. However, there are key differences between the old and new logic that 
> make this no longer the correct choice.
> First, recall that all HBase replication is based on tailing the WAL, and 
> that any transaction that skips the WAL doesn't get replicated. 
> In the old logic, the data table write (and WAL append) would be accompanied 
> by an IndexedKeyValue which would contain enough information to reconstitute 
> the index edit in the event of a failure before the index edit could be 
> committed. So skipping the WAL during recovery was _potentially_ OK, because 
> writing to the WAL would be redundant locally. (That still seems wrong to me 
> when replication is in use: I don't believe IndexedKeyValues are replicated, 
> because they use the "magic" METAFAMILY cf.)
> In the new logic, a normal write goes to the index first (into a WAL), then 
> to the data table (into a potentially different RS's WAL), and lastly flips 
> the verified flag in the index, which goes into the same WAL as the original 
> index write. If something goes wrong with stage 2 or 3, read repair will fix 
> it, but if the repair action (whether a put or a delete) doesn't go into the 
> WAL, a DR buddy of the index will be out of sync.
> This is even more important on an async initial build of an index, where, if I 
> understand correctly, there is no WAL append for the index write at all in the 
> current UngroupedAggregateRegionObserver rebuild logic. The same would be true 
> of a rebuild of a new-style index after corruption unrelated to Phoenix 
> (such as at the HDFS or raw HBase level).
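
The replication concern in the description can be illustrated with a toy model. This is only a sketch, not Phoenix or HBase code: the Cluster class, the replicate and peerValueAfterRepair methods, and the row key are all hypothetical, and only the names SKIP_WAL and SYNC_WAL are borrowed from HBase's Durability enum. It assumes, as the description states, that the peer only ever sees edits that were appended to the source WAL.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WalReplicationSketch {

    /** Two durability levels; the names echo HBase's Durability enum, the rest is hypothetical. */
    public enum Durability { SKIP_WAL, SYNC_WAL }

    /** Toy cluster: an in-memory store plus an append-only WAL of {row, value} edits. */
    static final class Cluster {
        final Map<String, String> store = new HashMap<>();
        final List<String[]> wal = new ArrayList<>();

        void put(String row, String value, Durability durability) {
            store.put(row, value); // the local write always lands
            if (durability != Durability.SKIP_WAL) {
                wal.add(new String[] {row, value}); // only logged edits are visible to replication
            }
        }
    }

    /** Replication tails the source WAL; edits that skipped it never reach the peer. */
    static void replicate(Cluster source, Cluster peer) {
        for (String[] edit : source.wal) {
            peer.store.put(edit[0], edit[1]);
        }
    }

    /** Writes an unverified index row, repairs it with the given durability, returns the peer's view. */
    public static String peerValueAfterRepair(Durability repairDurability) {
        Cluster source = new Cluster();
        Cluster peer = new Cluster();
        source.put("idx-row-1", "unverified", Durability.SYNC_WAL); // first phase of a normal write
        source.put("idx-row-1", "verified", repairDurability);      // read repair fixes the row
        replicate(source, peer);
        return peer.store.get("idx-row-1");
    }

    public static void main(String[] args) {
        for (Durability d : Durability.values()) {
            System.out.println(d + " repair -> peer sees " + peerValueAfterRepair(d));
        }
    }
}
```

In this model a repair written with SKIP_WAL leaves the peer holding the stale "unverified" row while the source holds "verified", whereas the same repair written through the WAL keeps the two in sync.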



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
