brfrn169 commented on a change in pull request #2498:
URL: https://github.com/apache/hbase/pull/2498#discussion_r513236879



##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3069,27 +3073,33 @@ private RegionScannerImpl getScanner(Scan scan, 
List<KeyValueScanner> additional
           checkFamily(family);
         }
       }
-      return instantiateRegionScanner(scan, additionalScanners, nonceGroup, 
nonce);
+      return instantiateRegionScanner(scan, additionalScanners, 
additionalScannersForStores,

Review comment:
       > Here we know all the action before actually executing them, so if 
there are multiple actions operation on the same row, we could merge them first.
   
   I thought the same thing first, but it is not such a simple thing because we 
can specify timestamp to mutations. So we can't always know the latest column 
value without merging the existing values (in memstore and hfiles). For 
example, if we have one existing Put with timestamp 2, and if we perform Put 
with timestamp 1 and CAS for the same column in a batch, we can't know the 
latest column value without merging the existing values. That's why I needed to 
add the new additional Scanner. However, I know it will make things complicated.
   
   > Or maybe even make a simple statement that, please do not operate on the 
same row in a batch, we will not consider the previous operation in the same 
batch when performing the operations in the batch.
   
   Sounds this is a good idea. I will modify the patch this way. Thank you for 
discussing this.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to