[jira] [Comment Edited] (SOLR-8344) Decide default when requested fields are both column and row stored.

Cao Manh Dat (JIRA) Thu, 07 Sep 2017 01:55:22 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156395#comment-16156395
 ]


Cao Manh Dat edited comment on SOLR-8344 at 9/7/17 8:54 AM:
------------------------------------------------------------

Here is my patch for this ticket. The work is simple: I created a new 
class called {{RetrieveFieldsOptimizer}}. It takes docFetcher, returnFields 
and docList as input, and returns the storedFields and docValuesFields that 
need to be retrieved.

The optimization is very simple (we can optimize further in the future if 
we want). When a field has both stored and docValues, we always use 
docValues, unless numDocs is small and at least one of the fields to be 
returned is stored only.
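
The rule above can be sketched roughly as follows. This is only an 
illustration, not the code in the patch: {{FieldRetrievalSketch}}, 
{{FieldInfo}}, {{Plan}} and the {{SMALL_DOC_COUNT}} threshold are hypothetical 
names and values chosen for the example, and the real 
{{RetrieveFieldsOptimizer}} works on docFetcher, returnFields and docList 
rather than on a plain list.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the decision rule; not the class from the patch.
class FieldRetrievalSketch {

    /** Hypothetical description of one requested field. */
    static final class FieldInfo {
        final String name;
        final boolean stored;
        final boolean docValues;

        FieldInfo(String name, boolean stored, boolean docValues) {
            this.name = name;
            this.stored = stored;
            this.docValues = docValues;
        }
    }

    /** Result: which fields to read from stored fields, which from docValues. */
    static final class Plan {
        final Set<String> storedFields = new LinkedHashSet<>();
        final Set<String> docValuesFields = new LinkedHashSet<>();
    }

    // Assumed cutoff for "numDocs is small"; the comment does not give a value.
    static final int SMALL_DOC_COUNT = 100;

    static Plan plan(List<FieldInfo> requested, int numDocs) {
        // Does any requested field exist only as a stored field?
        boolean hasStoredOnly = false;
        for (FieldInfo f : requested) {
            if (f.stored && !f.docValues) {
                hasStoredOnly = true;
            }
        }
        // Stored fields win for dual fields only when the stored document
        // must be read anyway and few documents are returned.
        boolean preferStored = hasStoredOnly && numDocs < SMALL_DOC_COUNT;

        Plan plan = new Plan();
        for (FieldInfo f : requested) {
            if (f.stored && f.docValues) {
                if (preferStored) {
                    plan.storedFields.add(f.name);
                } else {
                    plan.docValuesFields.add(f.name);
                }
            } else if (f.docValues) {
                plan.docValuesFields.add(f.name);
            } else if (f.stored) {
                plan.storedFields.add(f.name);
            }
            // Fields that are neither stored nor docValues cannot be returned.
        }
        return plan;
    }

    public static void main(String[] args) {
        List<FieldInfo> fields = Arrays.asList(
                new FieldInfo("id", true, true),
                new FieldInfo("revision_text_str_stored", true, false));
        Plan deepPage = plan(fields, 10000);
        System.out.println("stored=" + deepPage.storedFields
                + " docValues=" + deepPage.docValuesFields);
    }
}
```

With these assumptions, a request for many documents reads dual fields such 
as id from docValues, while a small request that already needs a stored-only 
field reads everything from the stored document.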

Therefore, in the first pass (id and score fields only), we will always use 
docValues to retrieve the id field.

Here are some benchmark results (3 shards, 1 replica each, SSD drive, 18000 
docs; each query run 500 times and averaged):
||Query||AVG QTime with optimizer (ms)||AVG QTime without optimizer (ms)||
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_dv_stored&start=10000|54|172|
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_dv_stored&start=10|2|2|
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_dv_stored&rows=1000|40|64|
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_stored&start=10000|53|175|
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_stored&start=10|2|2|
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_stored&rows=1000|56|64|


was (Author: caomanhdat):
Here is my patch for this ticket. The work is simple: I created a new 
class called {{RetrieveFieldsOptimizer}}. It takes docFetcher, returnFields 
and docList as input, and returns the storedFields and docValuesFields that 
need to be retrieved.

The optimization is very simple (we can optimize further in the future if 
we want). When a field has both stored and docValues, we always use 
docValues, unless numDocs is small and at least one of the fields to be 
returned is stored only.

Therefore, in the first pass (id and score fields only), we will always use 
docValues to retrieve the id field.

Here are some benchmark results (3 shards, 1 replica each, SSD drive, 18000 
docs):
||Query||AVG QTime with optimizer (ms)||AVG QTime without optimizer (ms)||
|q=*:*&fl=TITLE_str,id,REVISION_TEXT_str&start=10000|49|267|
|q=*:*&fl=TITLE_str,id,REVISION_TEXT_str&start=10|3|3|

> Decide default when requested fields are both column and row stored.
> --------------------------------------------------------------------
>
>                 Key: SOLR-8344
>                 URL: https://issues.apache.org/jira/browse/SOLR-8344
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ishan Chattopadhyaya
>         Attachments: SOLR-8344.patch
>
>
> This issue was discussed in the comments at SOLR-8220. Splitting it out to a 
> separate issue so that we can have a focused discussion on whether/how to do 
> this.
> If a given set of requested fields are all stored and have docValues (column 
> stored), we can retrieve the values from either place.  What should the 
> default be?


