wangweiming800 commented on a change in pull request #741: PHOENIX-5791
Eliminate false invalid row detection due to concurrent …
URL: https://github.com/apache/phoenix/pull/741#discussion_r398921503
##########
File path:
phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -614,38 +606,132 @@ private boolean isDeleteFamilyVersion(Mutation mutation) {
return getMutationsWithSameTS(put, del);
}
+ private void repairActualMutationList(List<Mutation> actualMutationList, List<Mutation> expectedMutationList)
+ throws IOException {
+ // find the first (latest) actual unverified put mutation
+ Mutation actual = null;
+ for (Mutation mutation : actualMutationList) {
+ if (mutation instanceof Put && !isVerified((Put) mutation)) {
+ actual = mutation;
+ break;
+ }
+ }
+ if (actual == null) {
+ return;
+ }
+ long ts = getTimestamp(actual);
+ int expectedIndex;
+ int expectedListSize = expectedMutationList.size();
+ for (expectedIndex = 0; expectedIndex < expectedListSize; expectedIndex++) {
+ if (getTimestamp(expectedMutationList.get(expectedIndex)) <= ts) {
+ if (expectedIndex > 0) {
+ expectedIndex--;
+ }
+ break;
+ }
+ }
+ if (expectedIndex == expectedListSize) {
+ return;
+ }
+ for (; expectedIndex < expectedListSize; expectedIndex++) {
+ Mutation mutation = expectedMutationList.get(expectedIndex);
+ if (mutation instanceof Put) {
+ mutation = new Put((Put) mutation);
+ } else {
+ mutation = new Delete((Delete) mutation);
+ }
+ actualMutationList.add(mutation);
+ }
+ Collections.sort(actualMutationList, MUTATION_TS_DESC_COMPARATOR);
+ }
+
+ private void cleanUpActualMutationList(List<Mutation> actualMutationList)
+ throws IOException {
+ Iterator<Mutation> iterator = actualMutationList.iterator();
+ Mutation previous = null;
+ while (iterator.hasNext()) {
+ Mutation mutation = iterator.next();
+ if ((mutation instanceof Put && !isVerified((Put) mutation)) ||
+ (mutation instanceof Delete && isDeleteFamilyVersion(mutation))) {
+ iterator.remove();
Review comment:
I am wondering whether this will cause a false alarm when the last data mutation
is a put that fails in the third phase and no read repair fixes the unverified
row before inline validation. In that case, the unverified index row will be
included in the expected mutation list but will not be found in the actual
mutation list.
I put together a scenario that simulates this case at the link below. We can
discuss it offline.
https://docs.google.com/spreadsheets/d/1FDKkcGrJobR5Jtd73vNSVfC1yDDrisyuqUdYWLde5rw/edit?usp=sharing
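To make the concern concrete, here is a minimal sketch in plain Java. It is not the Phoenix code: `IndexPut`, `dropUnverified`, and `allExpectedFound` are hypothetical stand-ins, and timestamps 10 and 20 are made up. It models only the unverified-put branch of `cleanUpActualMutationList` and deliberately leaves out `repairActualMutationList`, which may already cover the simple case; the exact sequence I have in mind is the one in the spreadsheet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class FalseAlarmSketch {
    // Hypothetical stand-in for an index put: a timestamp plus a verified flag.
    static final class IndexPut {
        final long ts;
        final boolean verified;

        IndexPut(long ts, boolean verified) {
            this.ts = ts;
            this.verified = verified;
        }
    }

    // Models only the "remove unverified puts" branch of the clean-up step.
    static void dropUnverified(List<IndexPut> actual) {
        Iterator<IndexPut> it = actual.iterator();
        while (it.hasNext()) {
            if (!it.next().verified) {
                it.remove();
            }
        }
    }

    // Hypothetical validation: every expected timestamp needs a matching actual put.
    static boolean allExpectedFound(List<Long> expectedTs, List<IndexPut> actual) {
        for (long ts : expectedTs) {
            boolean found = false;
            for (IndexPut put : actual) {
                if (put.ts == ts) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Expected list built from the data table: puts at ts 20 and ts 10.
        List<Long> expectedTs = Arrays.asList(20L, 10L);

        // Actual index rows: the put at ts 20 failed in the third phase,
        // so it is still unverified and no read repair has touched it.
        List<IndexPut> actual = new ArrayList<>();
        actual.add(new IndexPut(20L, false));
        actual.add(new IndexPut(10L, true));

        dropUnverified(actual);

        // The ts 20 row is expected but was removed from the actual list.
        System.out.println("all expected found: " + allExpectedFound(expectedTs, actual));
    }
}
```

In this simplified model the check fails for the row at timestamp 20 even though the data table and the index agree on its content, which is the false alarm I am asking about.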