[jira] [Updated] (PHOENIX-2903) Handle split during scan for row key ordered aggregations

James Taylor (JIRA) Sun, 22 May 2016 12:27:34 -0700

     [ 
https://issues.apache.org/jira/browse/PHOENIX-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


James Taylor updated PHOENIX-2903:
----------------------------------
    Attachment: PHOENIX-2903_v1.patch

Here's a patch that handles splits for all cases, including aggregation that is 
occurring in-place. Please review, [~rajeshbabu]. Here's a bit of info on it:
- Performs a second check for the scan start/stop being within the region while 
we have the region lock. This covers the case of a split occurring after the 
preScannerOpen, but before the postScannerOpen. That's theoretically possible, 
right?
- Sets a SCAN_START_AFTER_ROW attribute on the scan based on the previous tuple 
and ensures in BaseRegionScannerObserver that we ignore rows at or before that 
row key. We do this instead of trying to increment the row key because that 
won't work for the aggregation key as we don't have a complete row key.
- Consolidates the skip code in BaseRegionScannerObserver, as we don't need 
two versions of it. Also, we don't need to call replaceArrayIndexElement on 
the rows being skipped.
- Moves code from ScanUtil that's only called in BaseRegionScannerObserver and 
makes it private.
- Removes duplicate code between the two nextRaw() methods; the one-argument 
version now just calls the two-argument nextRaw() method with the 
defaultScannerContext.
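
To make the first two bullets concrete, here is a rough, standalone sketch of 
the two checks. It is not code from the patch: the class and method names 
(SplitSafeScanSketch, scanWithinRegion, skipAtOrBefore) are made up for 
illustration, row keys are plain byte arrays compared in unsigned 
lexicographic order, and the empty-key-means-unbounded convention is assumed 
to match HBase's.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, not the code in PHOENIX-2903_v1.patch.
class SplitSafeScanSketch {

    // Bounds check: is the scan range [scanStart, scanStop) inside the
    // region range [regionStart, regionEnd)? Assumed convention: an empty
    // start key means "from the beginning", an empty stop/end key means
    // "to the end".
    static boolean scanWithinRegion(byte[] scanStart, byte[] scanStop,
                                    byte[] regionStart, byte[] regionEnd) {
        if (Arrays.compareUnsigned(scanStart, regionStart) < 0) {
            return false; // scan begins before the region does
        }
        if (regionEnd.length == 0) {
            return true; // last region: there is no upper bound to violate
        }
        if (Arrays.compareUnsigned(scanStart, regionEnd) >= 0) {
            return false; // scan begins at or past the end of the region
        }
        // An unbounded scan stop cannot fit inside a bounded region.
        return scanStop.length != 0
                && Arrays.compareUnsigned(scanStop, regionEnd) <= 0;
    }

    // Skip check: keep only keys that sort strictly after startAfterRow,
    // i.e. ignore rows at or before the last key already returned.
    // A null startAfterRow means nothing has been returned yet.
    static List<byte[]> skipAtOrBefore(List<byte[]> sortedKeys, byte[] startAfterRow) {
        List<byte[]> kept = new ArrayList<>();
        for (byte[] key : sortedKeys) {
            if (startAfterRow == null || Arrays.compareUnsigned(key, startAfterRow) > 0) {
                kept.add(key);
            }
        }
        return kept;
    }

    // Small helper to build keys from strings for the demo.
    static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<byte[]> rows = Arrays.asList(b("a"), b("b"), b("c"), b("d"));
        // The previous tuple was "b", so only "c" and "d" are returned.
        for (byte[] key : skipAtOrBefore(rows, b("b"))) {
            System.out.println(new String(key, StandardCharsets.UTF_8));
        }
        // Region [a, m) still contains the scan [c, k): prints true.
        System.out.println(scanWithinRegion(b("c"), b("k"), b("a"), b("m")));
        // After a split the region is [a, g), so the scan no longer fits: prints false.
        System.out.println(scanWithinRegion(b("c"), b("k"), b("a"), b("g")));
    }
}
```

Comparing against the last returned key, rather than computing a "next" key, 
is what lets the same check work when the key in hand is only an aggregation 
key and not a complete row key.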

I believe there's still one issue with this technique, though, when the scan is 
over a local index. Before the split occurs, there will be a merge sort 
occurring among all scanners across all regions. After the split occurs, the 
original merge sort will continue and there'll be a new merge sort, again 
across all regions. We really need the original merge sort to continue only for 
the result iterator in which the split was detected. Otherwise, we'll get 
duplicate rows across the new and old iterators.

WDYT, [~rajeshbabu]? I can attempt to fix that in my next version of the patch, 
but would appreciate you reviewing what I've got so far. A test around this 
scenario would be useful too. 

> Handle split during scan for row key ordered aggregations
> ---------------------------------------------------------
>
>                 Key: PHOENIX-2903
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2903
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>         Attachments: PHOENIX-2903_v1.patch, PHOENIX-2903_wip.patch
>
>
> Currently a hole in our split detection code



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
