stoty commented on code in PR #1607:
URL: https://github.com/apache/phoenix/pull/1607#discussion_r1198572887
##########
phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java:
##########
@@ -1119,12 +1124,21 @@ private List<List<Scan>> getParallelScans(byte[]
startKey, byte[] stopKey) throw
int gpsComparedToEndKey = -1;
boolean everNotDelayed = false;
while (intersectWithGuidePosts && (endKey.length == 0 ||
(gpsComparedToEndKey=currentGuidePost.compareTo(endKey)) <= 0)) {
- Scan newScan = scanRanges.intersectScan(scan,
currentKeyBytes, currentGuidePostBytes, keyOffset,
- false);
- if (newScan != null) {
- ScanUtil.setLocalIndexAttributes(newScan, keyOffset,
- regionInfo.getStartKey(), regionInfo.getEndKey(),
- newScan.getStartRow(), newScan.getStopRow());
+ List<Scan> newScans =
+ scanRanges.intersectScan(scan, currentKeyBytes,
currentGuidePostBytes,
+ keyOffset, false, splitPostfix,
getTable().getBucketNum());
+ for (Scan newScan : newScans) {
+ if (newScan != null) {
+ ScanUtil.setLocalIndexAttributes(newScan,
keyOffset,
+ regionInfo.getStartKey(),
regionInfo.getEndKey(),
+ newScan.getStartRow(), newScan.getStopRow());
+ //TODO why are we passing endKey instead of
startKey here ?
+ scans =
+ addNewScan(parallelScans, scans, newScan,
currentGuidePostBytes,
+ false, regionLocation);
Review Comment:
Yes, it could be less than the region's endKey.
I have looked into what the purpose of
ParallelScanGrouper.shouldStartNewScan() is.
If we need the scan results to be ordered, then shouldStartNewScan() returns
whether the results of the new scan can be concatenated to the results previous
scans and keep them ordered (per phoenix SQL semantics)
For salted queries, this checks if the passed in "startKey" is in the same
bucket (has the same leading byte) as the startKey of the last scan in the
group.
If the salted table is properly pre-split, then it always is, so it doesn't
really do anything.
If the salted table is not properly pre-slit (the very case we are trying to
fix), then this makes sure that we put the results for different buckets into
different groups.
In either case, it is both an assumption of ParallelScanGrouper and
requirement for correct sorting that scans do not pass bucket boundaries, so it
doesn't matter whether we check the start or the end key. If the scan passes a
bucket boundary, then we have bigger problems than bad result ordering. (i.e.,
PHOENIX-4096/6910)
I'm not even sure why we even pass key separately, we could just use the
startKey of the new region.
##########
phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java:
##########
@@ -1119,12 +1124,21 @@ private List<List<Scan>> getParallelScans(byte[]
startKey, byte[] stopKey) throw
int gpsComparedToEndKey = -1;
boolean everNotDelayed = false;
while (intersectWithGuidePosts && (endKey.length == 0 ||
(gpsComparedToEndKey=currentGuidePost.compareTo(endKey)) <= 0)) {
- Scan newScan = scanRanges.intersectScan(scan,
currentKeyBytes, currentGuidePostBytes, keyOffset,
- false);
- if (newScan != null) {
- ScanUtil.setLocalIndexAttributes(newScan, keyOffset,
- regionInfo.getStartKey(), regionInfo.getEndKey(),
- newScan.getStartRow(), newScan.getStopRow());
+ List<Scan> newScans =
+ scanRanges.intersectScan(scan, currentKeyBytes,
currentGuidePostBytes,
+ keyOffset, false, splitPostfix,
getTable().getBucketNum());
+ for (Scan newScan : newScans) {
+ if (newScan != null) {
+ ScanUtil.setLocalIndexAttributes(newScan,
keyOffset,
+ regionInfo.getStartKey(),
regionInfo.getEndKey(),
+ newScan.getStartRow(), newScan.getStopRow());
+ //TODO why are we passing endKey instead of
startKey here ?
+ scans =
+ addNewScan(parallelScans, scans, newScan,
currentGuidePostBytes,
+ false, regionLocation);
Review Comment:
Thanks for pointing this out, @virajjasani .
This was actually a bug in my patch:
Even if the new code split a scan because of crossing bucket boundaries, we
still used the un-split region's key, which meant that we may have grouped
different buckets together, which would have broken ordering.
I have removed the endKey parameter, and now use the new scan's end key.
I have also removed some logic that used the current region's start key as
the end key, as it was unneccessary, and incorrect in the merged region case.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]