[jira] [Resolved] (OAK-3129) SolrQueryIndex making too many Solr requests per JCR query

Tommaso Teofili (JIRA) Tue, 21 Jul 2015 08:25:02 -0700

     [ 
https://issues.apache.org/jira/browse/OAK-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Tommaso Teofili resolved OAK-3129.
----------------------------------
    Resolution: Fixed

Fixed: the first request uses the 'rows' setting (e.g. fetching the first 10k 
rows). If the query matches fewer than 30k entries, it then fetches 10k at a 
time while traversing the cursor; otherwise it makes 2 further requests, 
fetching 'numFound' / 2 docs per request.
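
A minimal sketch of the paging plan described above, not the actual Oak code. 
The class and method names are hypothetical, and the 30k threshold is assumed 
to be 3 x 'rows' (which matches the 10k example). 'numFound' is taken as known 
from the first response. Each entry is a (start, rows) pair for one Solr 
request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Oak.
class PagingPlan {

    /**
     * Returns the {start, rows} pairs of the Solr requests needed to
     * traverse a result set of numFound documents.
     * Assumption: the "small result set" threshold is 3 * rows.
     */
    static List<long[]> plan(long numFound, long rows) {
        List<long[]> pages = new ArrayList<>();
        // The first request always uses the configured 'rows' setting.
        pages.add(new long[] {0, rows});
        if (numFound <= rows) {
            return pages; // everything fit into the first page
        }
        if (numFound < 3 * rows) {
            // Small result set: keep fetching 'rows' documents at a time.
            for (long start = rows; start < numFound; start += rows) {
                pages.add(new long[] {start, rows});
            }
        } else {
            // Large result set: exactly 2 further requests of numFound / 2
            // documents each. Together with the first page they cover at
            // least numFound documents, since rows + 2 * (numFound / 2)
            // is never smaller than numFound when rows >= 1.
            long half = numFound / 2;
            pages.add(new long[] {rows, half});
            pages.add(new long[] {rows + half, half});
        }
        return pages;
    }

    public static void main(String[] args) {
        for (long[] page : plan(100_000, 10_000)) {
            System.out.println("start=" + page[0] + " rows=" + page[1]);
        }
    }
}
```

With numFound = 100,000 and rows = 10,000 this yields three requests: 
(0, 10000), (10000, 50000) and (60000, 50000). The last page asks for more 
documents than remain, which is harmless because Solr returns only the 
documents that exist.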

> SolrQueryIndex making too many Solr requests per JCR query
> ----------------------------------------------------------
>
>                 Key: OAK-3129
>                 URL: https://issues.apache.org/jira/browse/OAK-3129
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: solr
>    Affects Versions: 1.2.2, 1.3.2, 1.0.17
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>             Fix For: 1.2.4, 1.3.3, 1.0.18
>
>
> {{SolrQueryIndex}} and {{FilterQueryParser}} use the 
> {{OakSolrConfiguration#getRows}} setting to set the number of documents 
> fetched per batch while iterating the {{Cursor}} resulting from a given 
> query.
> While this is an optimization that avoids loading all the results in memory 
> in cases where only e.g. the first 10 results of the {{Cursor}} are visited, 
> it tends to perform really badly when the result set's cardinality is 10 or 
> more times larger than the 'rows' setting, because for each JCR query, 10 or 
> more Solr queries are performed (each with the additional latency of the 
> network, the Solr call, etc.).
> To avoid that, we could use the 'rows' setting for the first request to Solr 
> and then adapt the subsequent paged requests (controlled by the start and 
> rows Solr HTTP parameters) so that the rest of the result set is fetched in 
> no more than 2 Solr queries. This can be done by looking at the _numFound_ 
> value in Solr's response (from the first query) and setting the start/rows 
> parameters accordingly.
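
To put a number on the problem described in the issue: with fixed-size paging, 
a full traversal of the cursor costs roughly ceil(numFound / rows) Solr 
requests. The sketch below is an idealized count under that assumption (the 
whole result set is visited, no extra probing request); the class and method 
names are hypothetical.

```java
// Hypothetical helper, for illustration only; not part of Oak.
class FixedPagingCost {

    /**
     * Number of Solr requests needed to traverse numFound documents when
     * every request fetches exactly 'rows' documents.
     */
    static long requests(long numFound, long rows) {
        if (numFound <= 0) {
            return 1; // the first request is sent even for an empty result
        }
        return (numFound + rows - 1) / rows; // integer ceiling division
    }

    public static void main(String[] args) {
        System.out.println(requests(100_000, 10_000));
    }
}
```

For numFound = 100,000 and rows = 10,000 this gives 10 requests, versus the 
upper bound of 3 (the first request plus at most 2 more) proposed above.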



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
