We've found that if your data has a natural range (a date or a numeric
ID, for example), you can do a lot better by paging through it with a
filter query on that range instead of with start and rows.
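
A minimal sketch of that idea, not a definitive implementation. It
assumes a hypothetical integer field named `id` with unique values, so
that a window of `step` consecutive values can never match more than
`step` documents; neither the field name nor that property comes from
the setup described in this thread. It only builds the query strings and
does not contact Solr.

```python
from urllib.parse import urlencode


def range_pages(field, lo, hi, step):
    """Yield Solr query strings covering [lo, hi] in windows of `step` values.

    Assumption (hypothetical): `field` is an integer field with unique
    values, so each window matches at most `step` documents.
    """
    for start in range(lo, hi + 1, step):
        end = min(start + step - 1, hi)
        params = {
            "q": "*:*",
            # Inclusive range filter; windows do not overlap because
            # each one ends at start + step - 1.
            "fq": f"{field}:[{start} TO {end}]",
            # start stays at 0 on every request: the range filter does the paging.
            "start": 0,
            "rows": step,
            "sort": f"{field} asc",
        }
        yield urlencode(params)


if __name__ == "__main__":
    for query_string in range_pages("id", 0, 9999, 4000):
        print("/solr/select?" + query_string)
```

Because every request keeps start=0, Solr only has to collect `rows`
hits per request instead of start + rows, which is where the time goes
in the stack traces quoted below.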

Michael Della Bitta

------------------------------------------------
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Mon, Apr 29, 2013 at 6:44 AM, Dmitry Kan <solrexp...@gmail.com> wrote:
> Abhishek,
>
> There is a wiki regarding this:
>
> http://wiki.apache.org/solr/CommonQueryParameters
>
> Search for "pageDoc and pageScore".
>
>
> On Mon, Apr 29, 2013 at 1:17 PM, Abhishek Sanoujam
> <abhi.sanou...@gmail.com> wrote:
>
>> We have a single shard, and all the data is in a single box only.
>> Definitely looks like "deep-paging" is having problems.
>>
>> Just to understand, is the searcher looping over the result set every time
>> and skipping the first "start" count? This will definitely take a toll when
>> we reach higher "start" values.
>>
>>
>>
>>
>> On 4/29/13 2:28 PM, Jan Høydahl wrote:
>>
>>> Hi,
>>>
>>> How many shards do you have? This is a known issue with deep paging
>>> across multiple shards, see
>>> https://issues.apache.org/jira/browse/SOLR-1726
>>>
>>> You may be more successful in going to each shard, one at a time (with
>>> &distrib=false) to avoid this issue.
>>>
>>> --
>>> Jan Høydahl, search solution architect
>>> Cominvent AS - www.cominvent.com
>>> Solr Training - www.solrtraining.com
>>>
>>> On 29 Apr 2013, at 09:17, Abhishek Sanoujam <abhi.sanou...@gmail.com>
>>> wrote:
>>>
>>>> We have a Solr core with about 115 million documents. We are trying to
>>>> migrate data by running a simple *:* query with the start and rows
>>>> params.
>>>> Performance is becoming too slow in Solr: it's taking almost 2 minutes
>>>> to get 4000 rows, and the migration is just too slow. Log snippet below:
>>>>
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55438000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=168308
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55446000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=122771
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55454000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=137615
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55450000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141223
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55462000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=97474
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55458000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=98115
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55466000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=143822
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55474000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118066
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55470000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=121498
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55482000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=164062
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55478000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=165518
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55486000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118163
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55494000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141642
>>>> INFO: [coreName] webapp=/solr path=/select params={start=55490000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=145037
>>>>
>>>>
>>>> I've taken some thread dumps on the Solr server, and most of the time the
>>>> threads seem to be busy in the stacks shown below.
>>>> Is there anything that can be done to improve the performance? Is it a
>>>> known issue? It's very surprising that querying for just a few rows
>>>> starting at some offset takes on the order of minutes.
>>>>
>>>>
>>>> "395883378@qtp-162198005-7" prio=10 tid=0x00007f4aa0636000 nid=0x295a runnable [0x00007f42865dd000]
>>>>    java.lang.Thread.State: RUNNABLE
>>>>         at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
>>>>         at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:184)
>>>>         at org.apache.lucene.search.TopDocsCollector.populateResults(TopDocsCollector.java:61)
>>>>         at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
>>>>         at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1499)
>>>>         at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1366)
>>>>         at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:457)
>>>>         at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:410)
>>>>         at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
>>>>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>>>>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
>>>>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
>>>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
>>>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
>>>>
>>>>
>>>> "1154127582@qtp-162198005-3" prio=10 tid=0x00007f4aa0613800 nid=0x2956 runnable [0x00007f42869e1000]
>>>>    java.lang.Thread.State: RUNNABLE
>>>>         at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
>>>>         at org.apache.lucene.util.PriorityQueue.updateTop(PriorityQueue.java:210)
>>>>         at org.apache.lucene.search.TopScoreDocCollector$InOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:62)
>>>>         at org.apache.lucene.search.Scorer.score(Scorer.java:64)
>>>>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:605)
>>>>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
>>>>         at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1491)
>>>>         at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1366)
>>>>         at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:457)
>>>>         at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:410)
>>>>         at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
>>>>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>>>>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
>>>>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
>>>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
>>>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
>>>>
>>>>
>>>> --
>>>> ---------
>>>> Cheers,
>>>> Abhishek
>>>>
>>>>
>>
>> --
>> ---------
>> Cheers,
>> Abhishek
>>
>>
