maedhroz commented on code in PR #3054:
URL: https://github.com/apache/cassandra/pull/3054#discussion_r1466954075
##########
src/java/org/apache/cassandra/index/sai/iterators/KeyRangeIntersectionIterator.java:
##########
@@ -77,7 +77,19 @@ protected PrimaryKey computeNext()
if (index != alreadyAvanced)
{
KeyRangeIterator range = ranges.get(index);
- PrimaryKey nextKey = nextOrNull(range, highestKey);
+ PrimaryKey nextKey = range.getCurrent();
+
+ // Note that we will either have a data model that produces SKINNY primary keys or a data model
+ // that produces some combination of WIDE and STATIC primary keys.
+ if (nextKey.kind() == PrimaryKey.Kind.WIDE || nextKey.kind() == highestKey.kind())
+ // We can always skip if the target is of the same kind or this range is non-static.
+ nextKey = nextOrNull(range, highestKey);
+ else if (nextKey.kind() == PrimaryKey.Kind.STATIC && nextKey.compareTo(highestKey) < 0)
+ // For a range of static keys, only skip if we've advanced to a new partition, and when we
+ // do, skip to an actual static key. We may otherwise skip too far, as static row IDs always
+ // precede non-static ones in on-disk postings lists.
+ nextKey = nextOrNull(range, highestKey.toStatic());
+
if (nextKey == null || nextKey.compareTo(highestKey) > 0)
Review Comment:
Would it break end-to-end, or would it just return a `STATIC` `PrimaryKey`
from the intersection, which would potentially cause more post-filtering than
we want in `ResultRetriever`? In any case, I think that where we have a
`STATIC` `highestKey`, a `WIDE` `PrimaryKey` can be treated as higher, and we
could probably make that change without touching any of the existing
`compareTo()` behavior, which seems like it would open up a can of worms.
In short, I think we could fix this by doing something like...
```
if (nextKey == null || nextKey.compareTo(highestKey) > 0 ||
(nextKey.compareTo(highestKey) == 0 && nextKey.kind() == WIDE &&
highestKey.kind() == STATIC))
```
...or just add a `compareToStrict()` to `PrimaryKey` that encapsulates this.
We **DO** want to avoid a case where we post-filter a whole partition when 2
`WIDE` key iterators don't match anything from each other in the context of a
match from a `STATIC` key iterator. Let me work on this a bit...
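
To make the `compareToStrict()` idea concrete, here is a rough, self-contained sketch. It is not the real `PrimaryKey`: `KeySketch`, its `token` and `clustering` fields, and the simplified `compareTo()` are all hypothetical stand-ins, and it assumes (as the condition above implies) that a `WIDE` key compares equal to a `STATIC` key in the same partition. The mirrored `STATIC`-vs-`WIDE` branch is also an assumption, added only to keep the comparison antisymmetric.

```java
/** Hypothetical, simplified stand-in for PrimaryKey; not the Cassandra class. */
final class KeySketch implements Comparable<KeySketch>
{
    enum Kind { SKINNY, STATIC, WIDE }

    final Kind kind;
    final long token;     // stands in for the partition position (assumption)
    final int clustering; // only meaningful for WIDE keys in this sketch

    KeySketch(Kind kind, long token, int clustering)
    {
        this.kind = kind;
        this.token = token;
        this.clustering = clustering;
    }

    /**
     * Assumed existing behavior: keys order by partition first, and clustering
     * is only compared when both keys are WIDE. A STATIC key therefore compares
     * equal to any key in the same partition.
     */
    @Override
    public int compareTo(KeySketch other)
    {
        int cmp = Long.compare(token, other.token);
        if (cmp != 0 || kind != Kind.WIDE || other.kind != Kind.WIDE)
            return cmp;
        return Integer.compare(clustering, other.clustering);
    }

    /**
     * Hypothetical strict variant: leaves compareTo() alone, but breaks the
     * WIDE-vs-STATIC tie within a partition so that the WIDE key sorts higher.
     */
    int compareToStrict(KeySketch other)
    {
        int cmp = compareTo(other);
        if (cmp != 0 || kind == other.kind)
            return cmp;
        if (kind == Kind.WIDE && other.kind == Kind.STATIC)
            return 1;
        if (kind == Kind.STATIC && other.kind == Kind.WIDE)
            return -1; // mirrored case, added for antisymmetry (assumption)
        return cmp;
    }
}
```

If something along these lines held up, the check in `computeNext()` could presumably shrink to `nextKey == null || nextKey.compareToStrict(highestKey) > 0`, with the kind-specific tie-break kept out of the iterator.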
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]