maedhroz commented on code in PR #3054:
URL: https://github.com/apache/cassandra/pull/3054#discussion_r1459669699


##########
src/java/org/apache/cassandra/index/sai/iterators/KeyRangeIntersectionIterator.java:
##########
@@ -77,7 +77,19 @@ protected PrimaryKey computeNext()
                 if (index != alreadyAvanced)
                 {
                     KeyRangeIterator range = ranges.get(index);
-                    PrimaryKey nextKey = nextOrNull(range, highestKey);
+                    PrimaryKey nextKey = range.getCurrent();
+
+                    // Note that we will either have a data model that produces SKINNY primary keys or a data model
+                    // that produces some combination of WIDE and STATIC primary keys.
+                    if (nextKey.kind() == PrimaryKey.Kind.WIDE || nextKey.kind() == highestKey.kind())
+                        // We can always skip if the target is of the same kind or this range is non-static.
+                        nextKey = nextOrNull(range, highestKey);
+                    else if (nextKey.kind() == PrimaryKey.Kind.STATIC && nextKey.compareTo(highestKey) < 0)
+                        // For a range of static keys, only skip if we've advanced to a new partition, and when we
+                        // do, skip to an actual static key. We may otherwise skip too far, as static row IDs always
+                        // precede non-static ones in on-disk postings lists.
+                        nextKey = nextOrNull(range, highestKey.toStatic());

Review Comment:
   This is the core of the fix. Two problems:
   
   1.) We were advancing streams of STATIC `PrimaryKey` objects from static
   column indexes before keys from the same partition in non-static column
   indexes had a chance to match. Harry discovered this on in-memory indexes.
   
   2.) When we did skip the STATIC key streams, we could skip with a non-static
   `PrimaryKey` object. That key maps to a posting row ID that comes after the
   row ID of the partition's static row, so we could skip past the static
   `PrimaryKey` we wanted. (This affected on-disk indexes only.)
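
   To make problem 2 concrete, here is a minimal sketch. It is not the SAI
   implementation. It assumes a toy layout in which every partition stores one
   static row followed by two regular rows, so row IDs are assigned
   contiguously. `StaticSkipSketch`, `rowIdOf`, `staticRowIdOf`, and `skipTo`
   are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

public class StaticSkipSketch
{
    // Assumption for illustration: one static row plus two regular rows per partition.
    static final int ROWS_PER_PARTITION = 3;

    // Row ID of a row within a partition; offset 0 is the static row, 1 and 2 are regular rows.
    static long rowIdOf(int partition, int offset)
    {
        return (long) partition * ROWS_PER_PARTITION + offset;
    }

    // Row ID of the static row of a partition (the toy analogue of skipping with a static key).
    static long staticRowIdOf(int partition)
    {
        return rowIdOf(partition, 0);
    }

    // Returns the first posting >= targetRowId, or -1 if the postings are exhausted.
    static long skipTo(long[] sortedPostings, long targetRowId)
    {
        int i = Arrays.binarySearch(sortedPostings, targetRowId);
        if (i < 0)
            i = -i - 1; // convert "not found" result to the insertion point
        return i < sortedPostings.length ? sortedPostings[i] : -1;
    }

    public static void main(String[] args)
    {
        // Postings of a static column index: only the static rows of partitions 0, 1 and 2.
        long[] staticPostings = { 0, 3, 6 };

        // Skipping with the row ID of a regular row in partition 1 overshoots to partition 2.
        long overshoot = skipTo(staticPostings, rowIdOf(1, 2));

        // Skipping with the static row ID of partition 1 lands on the row we wanted.
        long match = skipTo(staticPostings, staticRowIdOf(1));

        System.out.println("skip with non-static target: " + overshoot);
        System.out.println("skip with static target:     " + match);
    }
}
```

   In this toy model the first skip returns the static row of the next
   partition, so the partition the intersection is looking for is never
   matched. The second skip plays the role of `highestKey.toStatic()` in the
   diff above.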



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

