[ 
https://issues.apache.org/jira/browse/SOLR-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831369#comment-13831369
 ] 

Manuel Lenormand commented on SOLR-5478:
----------------------------------------

Here are the patches. They are still preliminary and are meant only to test 
the idea for this use case.

The first one is the patch for BinaryResponseWriter, the way it could look if 
the concept works.
For testing I'd recommend using the second patch, which is a class that extends 
BinaryResponseWriter and is enabled simply by referencing it in 
solrconfig.xml. You can use the jar directly with the following configuration:

Add this to the schema:
<field name="id" type="string_onmemory" indexed="true" stored="true" required="true" docValues="true"/>

And the field type that this field refers to:
<fieldType name="string_onmemory" class="solr.StrField" docValuesFormat="Memory"/>

And this to solrconfig.xml:
<queryResponseWriter name="javabin" class="test.solr.DocValuesBinaryResponseWriter"/>

There are two options you can test, each enabled by a query parameter:
1. accelerateSearch=true - tells the distributed search (phase I) to read from 
docValues; it should have no effect on the results returned.
2. accelerateIdSearch=true - tells both the distributed search and the id 
lookup (phase I + II) to read from docValues. In this mode the search returns 
uniqueKeys only.
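To make the two options concrete, here is a minimal sketch of how the test requests could be built over plain HTTP. The base URL, collection name, the /select handler and the q=*:* query are assumptions for illustration only; accelerateSearch and accelerateIdSearch are the parameters introduced by the patch, not stock Solr parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class AccelerateQueryExample {

    // Builds a /select URL carrying one of the two patch-specific parameters.
    // idsOnly == false -> accelerateSearch=true   (phase I only)
    // idsOnly == true  -> accelerateIdSearch=true (phase I + II, uniqueKeys only)
    static String acceleratedUrl(String baseUrl, String query, int rows, boolean idsOnly) {
        String flag = idsOnly ? "accelerateIdSearch" : "accelerateSearch";
        return baseUrl + "/select"
                + "?q=" + encode(query)
                + "&rows=" + rows
                + "&" + flag + "=true";
    }

    // URL-encodes a single parameter value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical local instance and collection name.
        String base = "http://localhost:8983/solr/collection1";
        System.out.println(acceleratedUrl(base, "*:*", 1000, false));
        System.out.println(acceleratedUrl(base, "*:*", 1000, true));
    }
}
```

Running the same query with and without the parameter against the same index is a simple way to check that accelerateSearch leaves the results unchanged.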

Assuming you have 10 shards on your instance and request rows=1000 (with 
start=0 and more than 1000 hits per shard), accelerateSearch would avoid 
10*1000 id reads and lazy document loads from the stored fields. 
accelerateIdSearch would avoid a further 1000 id reads and lazy document 
loads from the stored fields.
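The arithmetic in this example can be written out as a small sketch. The class and method names are hypothetical, and the counts assume, as above, that every shard has at least `rows` hits and start=0.

```java
final class AvoidedReads {

    // Phase I: each shard returns its top `rows` ids to the coordinator,
    // so shards * rows stored-field reads are avoided.
    static long phaseOneReads(int shards, int rows) {
        return (long) shards * rows;
    }

    // Phase II: only the merged top `rows` documents are fetched,
    // so a further `rows` stored-field reads are avoided.
    static long phaseTwoReads(int rows) {
        return rows;
    }

    public static void main(String[] args) {
        int shards = 10;
        int rows = 1000;
        System.out.println("accelerateSearch avoids:   " + phaseOneReads(shards, rows));
        System.out.println("accelerateIdSearch avoids: "
                + (phaseOneReads(shards, rows) + phaseTwoReads(rows)));
    }
}
```

With 10 shards and rows=1000 this gives 10000 avoided reads for accelerateSearch and 11000 in total for accelerateIdSearch.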

Feel free to add any opinions about the idea and the implementation.


Hope this rocks,
Manuel

> Speed-up distributed search with high rows param or deep paging by 
> transforming docId's to uniqueKey via memory docValues
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-5478
>                 URL: https://issues.apache.org/jira/browse/SOLR-5478
>             Project: Solr
>          Issue Type: Improvement
>          Components: Response Writers
>    Affects Versions: 4.5
>            Reporter: Manuel Lenormand
>             Fix For: 4.6
>
>



