James Taylor created PHOENIX-3847:
-------------------------------------
Summary: Handle out of order rows during index maintenance
Key: PHOENIX-3847
URL: https://issues.apache.org/jira/browse/PHOENIX-3847
Project: Phoenix
Issue Type: Bug
Reporter: James Taylor
Based on the investigation and work done in PHOENIX-3825 plus the existence of
the ignoreNewerMutations flag, it seems that out of order rows are not handled
correctly during index maintenance. Regardless of the order in which the server
processes data table mutations, the resulting index rows should be the same and
should be based purely on the cell timestamps of the data rows. Ideally, we
shouldn't need the ignoreNewerMutations flag at all. Perhaps that was the
intent with IndexUpdateManager.fixUpCurrentUpdates(), but it doesn't seem to be
working.
Would it work to simply generate all the index rows for the mutating data rows
for all versions? We should walk through a series of examples to see if this
would work. For example, with the following data table:
| Type | RowKey | Value | Timestamp |
| Put | 1 | A | 1000 |
| Put | 1 | C | 3000 |
the index table would look like this:
| Type | RowKey | Timestamp |
| Put | A,1 | 1000 |
| Del | A,1 | 3000 |
| Put | C,1 | 3000 |
Then if a Put comes in out of order at 2000, the data table would look like
this:
| Type | RowKey | Value | Timestamp |
| Put | 1 | A | 1000 |
| Put | 1 | B | 2000 |
| Put | 1 | C | 3000 |
and the index table should look like this:
| Type | RowKey | Timestamp |
| Put | A,1 | 1000 |
| Del | A,1 | 2000 |
| Put | B,1 | 2000 |
| Del | B,1 | 3000 |
| Put | C,1 | 3000 |
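To make the "generate index rows for all versions" idea concrete, here is a
minimal sketch in Python. It is not Phoenix code: index_mutations, the tuple
layouts, and the rule for skipping unchanged values are all assumptions made
for illustration. It sorts every version of one data row by cell timestamp and
emits the Put/Del pairs from the tables above, so the result does not depend
on arrival order.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, str]        # (cell timestamp, indexed value) of one data row
Mutation = Tuple[str, str, int]  # (type, index row key, timestamp)


def index_mutations(row_key: str, versions: List[Version]) -> List[Mutation]:
    """Derive index mutations from every version of a single data row.

    Hypothetical helper for illustration only; not part of Phoenix.
    """
    mutations: List[Mutation] = []
    previous_value: Optional[str] = None
    # Sort by cell timestamp so the order in which puts arrived is irrelevant.
    for timestamp, value in sorted(versions):
        if value == previous_value:
            # Assumption of this sketch: an unchanged indexed value
            # leaves the existing index row alone.
            continue
        if previous_value is not None:
            # The old index row stops being valid at this timestamp.
            mutations.append(("Del", f"{previous_value},{row_key}", timestamp))
        mutations.append(("Put", f"{value},{row_key}", timestamp))
        previous_value = value
    return mutations


# Versions listed in arrival order: the put at 2000 shows up last.
for mutation in index_mutations("1", [(1000, "A"), (3000, "C"), (2000, "B")]):
    print(mutation)
```

Note that this computes the desired end state from scratch. It says nothing
about how to get there from an index table that already holds markers written
before the out-of-order put arrived.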
Given that we can't reverse Delete markers, I'm not sure we can get there
completely. We'd still have a Delete of A,1 @ 3000. But perhaps this is not a
problem? We'd need to play this out further and include scenarios with row
deletes as well.
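One way to start playing this out is a small simulation of what a reader would
see at different timestamps, with and without the leftover Delete of A,1 @
3000. The sketch below is Python and uses a deliberately simplified model that
is an assumption, not HBase's full semantics: a Del at timestamp T hides every
Put of the same index row key with timestamp <= T, and a read at read_ts
ignores anything newer than read_ts. visible_rows, ideal, and actual are
hypothetical names.

```python
from typing import Iterable, Set, Tuple

Mutation = Tuple[str, str, int]  # (type, index row key, timestamp)


def visible_rows(mutations: Iterable[Mutation], read_ts: int) -> Set[str]:
    """Index row keys visible to a read as of read_ts (simplified model)."""
    # Only mutations at or before the read timestamp take part.
    in_range = [m for m in mutations if m[2] <= read_ts]
    visible: Set[str] = set()
    for kind, key, ts in in_range:
        if kind != "Put":
            continue
        # A Put is hidden by any Del of the same key at an equal or later timestamp.
        masked = any(
            other_kind == "Del" and other_key == key and other_ts >= ts
            for other_kind, other_key, other_ts in in_range
        )
        if not masked:
            visible.add(key)
    return visible


# The index table we would like to end up with.
ideal = [
    ("Put", "A,1", 1000),
    ("Del", "A,1", 2000),
    ("Put", "B,1", 2000),
    ("Del", "B,1", 3000),
    ("Put", "C,1", 3000),
]
# The same table plus the Delete marker that cannot be reversed.
actual = ideal + [("Del", "A,1", 3000)]

for read_ts in (1000, 1500, 2000, 2500, 3000, 3500):
    assert visible_rows(ideal, read_ts) == visible_rows(actual, read_ts)
```

Under this model the extra marker does not change what is visible for this one
example. That is not a general argument: it does not cover row deletes or
other arrival orders, which would need their own cases.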