thomasmueller commented on code in PR #2724:
URL: https://github.com/apache/jackrabbit-oak/pull/2724#discussion_r2772612720
##########
oak-core/src/main/java/org/apache/jackrabbit/oak/query/QueryImpl.java:
##########
@@ -1117,12 +1115,6 @@ private SelectorExecutionPlan getBestSelectorExecutionPlan(
 if (p.getSupportsPathRestriction()) {
     entryCount = scaleEntryCount(rootState, filter, entryCount);
 }
-if (sortOrder == null || p.getSortOrder() != null) {
-    // if the query is unordered, or
-    // if the query contains "order by" and the index can sort on that,
-    // then we don't need to read all entries from the index
-    entryCount = Math.min(maxEntryCount, entryCount);
-}
Review Comment:
> But maybe, couldn't we always favor indices which support sorting
> systematically rather than only if there is a limit?
Most relational databases account for this by adding a "cost to sort". In
our case, the indexes return the same cost (which is fine), and report whether
they can sort or not. So the query engine's task is to add a "cost to sort" (on
top of the cost returned by the index) for indexes that can _not_ sort. This
almost exactly matches the description "favor indices which support sorting
systematically", but not 100%: an index that can _not_ sort can still be
cheaper in total than an index that _can_ sort. It just depends on the cost of
sorting in the query engine.
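To make the idea concrete, here is a minimal sketch of how a "cost to sort" could be added on top of the cost an index reports. This is not the Oak implementation: the `Plan` class, the `n * log2(n)` sort model, and the `costPerComparison` factor are all assumptions made up for illustration.

```java
import java.util.List;

// Hypothetical sketch, not Oak code: adds an engine-side "cost to sort"
// to plans whose index cannot deliver the requested order.
class SortCostSketch {

    // Assumed stand-in for what an index reports to the query engine.
    static final class Plan {
        final String name;
        final double indexCost;   // cost reported by the index
        final long entryCount;    // estimated number of entries returned
        final boolean canSort;    // whether the index can sort as requested

        Plan(String name, double indexCost, long entryCount, boolean canSort) {
            this.name = name;
            this.indexCost = indexCost;
            this.entryCount = entryCount;
            this.canSort = canSort;
        }
    }

    // Assumed model: sorting n entries in the engine costs about n * log2(n)
    // comparisons, each weighted by a made-up costPerComparison factor.
    static double costToSort(long entryCount, double costPerComparison) {
        if (entryCount < 2) {
            return 0;
        }
        double log2 = Math.log(entryCount) / Math.log(2);
        return costPerComparison * entryCount * log2;
    }

    // The sort cost is only added if the query is ordered and the index
    // can not sort; otherwise the index cost is used unchanged.
    static double totalCost(Plan p, boolean queryIsOrdered, double costPerComparison) {
        if (!queryIsOrdered || p.canSort) {
            return p.indexCost;
        }
        return p.indexCost + costToSort(p.entryCount, costPerComparison);
    }

    // Picks the plan with the lowest total cost (first one wins on a tie).
    static Plan best(List<Plan> plans, boolean queryIsOrdered, double costPerComparison) {
        Plan best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Plan p : plans) {
            double cost = totalCost(p, queryIsOrdered, costPerComparison);
            if (cost < bestCost) {
                bestCost = cost;
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Plan sorting = new Plan("sorting", 1000, 100, true);
        Plan nonSorting = new Plan("nonSorting", 100, 100, false);
        List<Plan> plans = List.of(sorting, nonSorting);
        // Cheap engine-side sort: the non-sorting index can still win.
        System.out.println(best(plans, true, 0.01).name);
        // Expensive engine-side sort: the sorting index wins.
        System.out.println(best(plans, true, 2.0).name);
    }
}
```

With these made-up numbers, the index that can not sort is chosen while sorting in the engine is cheap, and the index that can sort is chosen once sorting becomes expensive. So sorting indexes are favored, but not unconditionally.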
(In Jackrabbit Oak, the Lucene and Elastic indexes often report wildly
inaccurate numbers (orders of magnitude wrong), because we do not have accurate
statistics at the moment. So such improvements will likely not have a big
impact for now.)