[
https://issues.apache.org/jira/browse/OAK-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tommaso Teofili resolved OAK-3129.
----------------------------------
Resolution: Fixed
Fixed: the first request uses the 'rows' setting, e.g. fetching the first 10k
rows. If the query matches fewer than 30k entries, it then fetches 10k at a
time while traversing the cursor; otherwise it makes 2 further requests and
fetches 'numFound' / 2 docs per request.
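
The paging rule described above can be sketched roughly as follows. This is an illustrative reconstruction, not the committed Oak code: the class name `AdaptivePaging`, the method `plan` and the constant `THRESHOLD_FACTOR` are hypothetical, and it assumes the 30k threshold is simply 3 x 'rows'.

```java
import java.util.ArrayList;
import java.util.List;

public class AdaptivePaging {

    // Assumption: the 30k threshold from the comment equals 3 x 'rows' (rows = 10k).
    static final int THRESHOLD_FACTOR = 3;

    /** Returns the {start, rows} pairs needed to read all numFound documents. */
    static List<long[]> plan(long rows, long numFound) {
        if (rows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        List<long[]> pages = new ArrayList<>();
        // The first request always uses the configured 'rows' setting.
        pages.add(new long[] {0, rows});
        // Small result sets keep the fixed batch size; large ones switch to numFound / 2.
        long batch = numFound < THRESHOLD_FACTOR * rows ? rows : numFound / 2;
        for (long start = rows; start < numFound; start += batch) {
            pages.add(new long[] {start, batch});
        }
        return pages;
    }

    public static void main(String[] args) {
        for (long numFound : new long[] {25_000, 100_000}) {
            System.out.println("numFound=" + numFound + " -> "
                    + plan(10_000, numFound).size() + " requests");
        }
    }
}
```

With rows = 10k, both a 25k and a 100k result set are covered by 3 requests in this sketch. In the real index the follow-up requests would only be issued lazily, as the cursor is traversed past the documents already fetched.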
> SolrQueryIndex making too many Solr requests per JCR query
> ----------------------------------------------------------
>
> Key: OAK-3129
> URL: https://issues.apache.org/jira/browse/OAK-3129
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: solr
> Affects Versions: 1.2.2, 1.3.2, 1.0.17
> Reporter: Tommaso Teofili
> Assignee: Tommaso Teofili
> Fix For: 1.2.4, 1.3.3, 1.0.18
>
>
> {{SolrQueryIndex}} and {{FilterQueryParser}} use the
> {{OakSolrConfiguration#getRows}} setting in order to set the number of
> documents that should be fetched in batches while iterating the {{Cursor}}
> resulting from a certain query.
> While this is an optimization that avoids loading all the results in memory
> in cases where only e.g. the first 10 results of the {{Cursor}} are visited,
> it tends to perform really badly when the result set's cardinality is 10 or
> more times larger than the 'rows' setting, because for each JCR query, 10 or
> more Solr queries are performed (each adding network, Solr call and other
> latencies).
> In order to avoid that, we could use the 'rows' setting to perform the first
> request to Solr and then adapt the subsequent paged requests (controlled by
> the start and rows Solr HTTP parameters) to cover the rest of the result set
> in no more than 2 Solr queries. This can be done by looking at the
> _numFound_ value in Solr's response (from the first query) and setting the
> start/rows parameters accordingly.
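
To put numbers on the cost described in the issue, here is a small sketch of how many Solr requests fixed-size paging needs for a given result set. The class and method names are hypothetical, and it assumes the cursor stops as soon as numFound documents have been read.

```java
public class FixedPagingCost {

    /** Number of Solr requests needed to read numFound docs in fixed batches of 'rows'. */
    static long requests(long numFound, long rows) {
        if (rows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        // At least one request is always made, even for an empty result set.
        return Math.max(1, (numFound + rows - 1) / rows);
    }

    public static void main(String[] args) {
        long rows = 10_000;
        for (long numFound : new long[] {10, 100_000, 1_000_000}) {
            System.out.println(numFound + " results -> "
                    + requests(numFound, rows) + " requests");
        }
    }
}
```

With rows = 10k, a 100k result set costs 10 requests and a 1M result set costs 100, whereas the proposal caps the total at 3 (the first request plus at most 2 follow-ups).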