[ https://issues.apache.org/jira/browse/PHOENIX-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204203#comment-17204203 ]

ASF GitHub Bot commented on PHOENIX-6160:
-----------------------------------------

gjacoby126 commented on a change in pull request #897:
URL: https://github.com/apache/phoenix/pull/897#discussion_r496949729



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/hbase/index/IndexRegionObserver.java
##########
@@ -166,12 +175,36 @@ public static void setFailDataTableUpdatesForTesting(boolean fail) {
       private HashSet<ImmutableBytesPtr> rowsToLock = new HashSet<>();
       // The current and next states of the data rows corresponding to the pending mutations
       private HashMap<ImmutableBytesPtr, Pair<Put, Put>> dataRowStates;
-      // Data table pending mutations
+      // The previous concurrent batch contexts
+      private HashMap<ImmutableBytesPtr, BatchMutateContext> lastConcurrentBatchContext = null;
+      // The latches of the threads waiting for this batch to complete
+      private List<CountDownLatch> waitList = null;
       private Map<ImmutableBytesPtr, MultiMutation> multiMutationMap;
 
       private BatchMutateContext(int clientVersion) {
           this.clientVersion = clientVersion;
       }
+
+      public BatchMutatePhase getCurrentPhase() {
+          return currentPhase;
+      }
+
+      public Put getNextDataRowState(ImmutableBytesPtr rowKeyPtr) {
+          Pair<Put, Put> rowState = dataRowStates.get(rowKeyPtr);
+          if (rowState != null) {
+              return dataRowStates.get(rowKeyPtr).getSecond();

Review comment:
       nit: use rowState.getSecond() here. There is no need to look the row up 
in the hashmap a second time. 
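
A minimal, self-contained sketch of the single-lookup form suggested above. The types here are hypothetical stand-ins (String keys and a small RowState class instead of ImmutableBytesPtr and Pair<Put, Put>), and returning null for a missing row is an assumption, since the diff is cut off before the else branch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Pair<Put, Put>; not the Phoenix class.
final class RowState {
    final String current;
    final String next;

    RowState(String current, String next) {
        this.current = current;
        this.next = next;
    }
}

// Hypothetical holder that mirrors only the dataRowStates lookup.
final class RowStateLookup {
    private final Map<String, RowState> dataRowStates = new HashMap<>();

    void put(String rowKey, RowState state) {
        dataRowStates.put(rowKey, state);
    }

    // Looks the row up once and reuses the result instead of calling get() again.
    // Returning null for an unknown row is an assumption made for this sketch.
    String getNextDataRowState(String rowKey) {
        RowState rowState = dataRowStates.get(rowKey);
        return rowState != null ? rowState.next : null;
    }
}
```

Besides saving a hash lookup, reusing the local variable avoids depending on the map returning the same entry on both calls.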

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/hbase/index/IndexRegionObserver.java
##########
@@ -211,9 +244,11 @@ private BatchMutateContext(int clientVersion) {
   private long slowIndexPrepareThreshold;
   private long slowPreIncrementThreshold;
   private int rowLockWaitDuration;
+  private int concurrentMutationWaitDuration;
   private String dataTableName;
 
   private static final int DEFAULT_ROWLOCK_WAIT_DURATION = 30000;
+  private static final int DEFAULT_CONCURRENT_MUTATION_WAIT_DURATION_IN_MS = 1000;

Review comment:
       1s seems long

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/hbase/index/IndexRegionObserver.java
##########
@@ -150,9 +154,14 @@ public static void setFailDataTableUpdatesForTesting(boolean fail) {
       failDataTableUpdatesForTesting = fail;
   }
 
+  public enum BatchMutatePhase {
+      PRE, POST, FAILED
+  }
   // Hack to get around not being able to save any state between
   // coprocessor calls. TODO: remove after HBASE-18127 when available
+
   private static class BatchMutateContext {
+      private BatchMutatePhase currentPhase = BatchMutatePhase.PRE;

Review comment:
       This variable gets accessed from multiple threads -- should it be 
atomic? Since only one thread writes to it at a time, the only benefit today 
would be occasionally avoiding extra passes through the wait loop. It might 
also prevent correctness issues in the future if these assumptions change. 
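
One possible shape for this, as a sketch only: it uses a volatile field rather than an atomic wrapper, which is enough for visibility when there is a single writer. The class PhaseHolder, the method names, and the single latch are hypothetical and are not taken from the patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical holder; only the enum values come from the diff above.
final class PhaseHolder {
    enum BatchMutatePhase { PRE, POST, FAILED }

    // volatile: a write by the owning thread is visible to later reads
    // from other threads, even if they never touch the latch.
    private volatile BatchMutatePhase currentPhase = BatchMutatePhase.PRE;
    private final CountDownLatch done = new CountDownLatch(1);

    BatchMutatePhase getCurrentPhase() {
        return currentPhase;
    }

    // Called by the single writer thread when the batch finishes or fails.
    void complete(BatchMutatePhase phase) {
        currentPhase = phase;
        done.countDown();
    }

    // Waits up to timeoutMs for the batch to leave PRE and returns the
    // phase observed afterwards (still PRE if the wait timed out).
    BatchMutatePhase awaitCompletion(long timeoutMs) {
        try {
            done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return currentPhase;
    }
}
```

If more than one thread could ever write the phase, an AtomicReference with compareAndSet would be the safer choice, since volatile alone does not make check-then-set sequences atomic.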




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


> Simplifying concurrent mutation handling for global Indexes
> -----------------------------------------------------------
>
>                 Key: PHOENIX-6160
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6160
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.0.0, 4.15.0
>            Reporter: Kadir OZDEMIR
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>         Attachments: PHOENIX-6160.4.x.001.patch
>
>
> Please see the attached design document for the proposed simplification. The 
> proposed design is simpler to understand and does not require special 
> handling of partial concurrent updates without indexed columns.
> One of the desired features for global indexes is support for atomic 
> operations (ON_DUPLICATE_KEY statements). We have found that it is quite 
> difficult to build such a feature on the current design, as we would need to 
> add more case handling to deal with data table update ordering 
> issues. The proposed design does not require us to make changes to concurrent 
> mutation handling for such features.
> The proposed design almost eliminates unverified index rows due to concurrent 
> mutations. The index rows are left unverified only when batches fail to 
> complete the data table updates. This leads to read performance improvement 
> as repairing unverified rows is costly and each row repair adds several tens 
> of milliseconds to the overall scan latency.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
