brfrn169 commented on a change in pull request #2498: URL: https://github.com/apache/hbase/pull/2498#discussion_r513236879
########## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java ########## @@ -3069,27 +3073,33 @@ private RegionScannerImpl getScanner(Scan scan, List<KeyValueScanner> additional checkFamily(family); } } - return instantiateRegionScanner(scan, additionalScanners, nonceGroup, nonce); + return instantiateRegionScanner(scan, additionalScanners, additionalScannersForStores, Review comment: > Here we know all the action before actually executing them, so if there are multiple actions operation on the same row, we could merge them first. I thought the same thing first, but it is not such a simple thing because we can specify timestamp to mutations. So we can't always know the latest column value without merging the existing values (in memstore and hfiles). For example, if we have one existing Put with timestamp 2, and if we perform Put with timestamp 1 and CAS for the same column in a batch, we can't know the latest column value without merging the existing values. That's why I needed to add the new additional Scanner. However, I know it will make things complicated. > Or maybe even make a simple statement that, please do not operate on the same row in a batch, we will not consider the previous operation in the same batch when performing the operations in the batch. Sounds this is a good idea. I will modify the patch this way. Thank you for discussing this. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org