[
https://issues.apache.org/jira/browse/SOLR-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156395#comment-16156395
]
Cao Manh Dat edited comment on SOLR-8344 at 9/7/17 8:54 AM:
------------------------------------------------------------
Here is my patch for this ticket. The work is simple: I created a new
class called {{RetrieveFieldsOptimizer}}. It takes docFetcher, returnFields
and docList as input, then returns the storedFields and docValuesFields that
need to be retrieved.
The optimization is very simple (we can optimize further in the future if we
want). When a field is both stored and has docValues, we always use docValues,
unless numDocs is small and at least one field that needs to be returned is
stored only.
Therefore, in the first pass (id and score fields only) we will always use
docValues to retrieve the id field.
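The rule described above can be sketched roughly as follows. This is only an illustration, not the code in the attached patch: the class {{FieldSourceSketch}}, the helper types {{FieldInfo}} and {{Plan}}, and the cutoff {{SMALL_DOC_COUNT}} are all hypothetical names, and the value chosen for "small" is an assumption because the comment does not state one.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the decision rule; not the actual RetrieveFieldsOptimizer.
class FieldSourceSketch {

    // Minimal stand-in for a schema field (assumption, not Solr's SchemaField).
    static final class FieldInfo {
        final String name;
        final boolean stored;
        final boolean docValues;

        FieldInfo(String name, boolean stored, boolean docValues) {
            this.name = name;
            this.stored = stored;
            this.docValues = docValues;
        }
    }

    // Which fields to read from stored fields and which from docValues.
    static final class Plan {
        final Set<String> storedFields = new LinkedHashSet<>();
        final Set<String> docValuesFields = new LinkedHashSet<>();
    }

    // Assumed threshold for "numDocs is small"; the real value is not given in the comment.
    static final int SMALL_DOC_COUNT = 100;

    static Plan plan(List<FieldInfo> requested, int numDocs) {
        // Is there a requested field that can only come from stored fields?
        boolean hasStoredOnly = false;
        for (FieldInfo f : requested) {
            if (f.stored && !f.docValues) {
                hasStoredOnly = true;
            }
        }
        // Stored fields must be read anyway, and there are few docs:
        // in that case read the dual fields from stored as well.
        boolean preferStored = hasStoredOnly && numDocs < SMALL_DOC_COUNT;

        Plan plan = new Plan();
        for (FieldInfo f : requested) {
            if (f.stored && f.docValues) {
                (preferStored ? plan.storedFields : plan.docValuesFields).add(f.name);
            } else if (f.stored) {
                plan.storedFields.add(f.name);
            } else if (f.docValues) {
                plan.docValuesFields.add(f.name);
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        // Field names and their flags are made up for the example.
        List<FieldInfo> requested = Arrays.asList(
                new FieldInfo("id", true, true),
                new FieldInfo("title", true, true),
                new FieldInfo("body", true, false));
        Plan p = plan(requested, 10000);
        System.out.println("stored=" + p.storedFields + " docValues=" + p.docValuesFields);
    }
}
```

With only id requested (as in the first pass), no stored-only field is present, so the sketch always picks docValues regardless of numDocs, which matches the behaviour described above.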
Here are some benchmark results (3 shards, 1 replica each, SSD drive, 18000
docs; each query was run 500 times and the QTime averaged):
||Query||AVG QTime with optimizer (ms)||AVG QTime without optimizer (ms)||
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_dv_stored&start=10000|54|172|
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_dv_stored&start=10|2|2|
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_dv_stored&rows=1000|40|64|
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_stored&start=10000|53|175|
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_stored&start=10|2|2|
|q=*:*&fl=title_str_dv_stored,id,revision_text_str_stored&rows=1000|56|64|
was (Author: caomanhdat):
Here is my patch for this ticket. The work is simple: I created a new
class called {{RetrieveFieldsOptimizer}}. It takes docFetcher, returnFields
and docList as input, then returns the storedFields and docValuesFields that
need to be retrieved.
The optimization is very simple (we can optimize further in the future if we
want). When a field is both stored and has docValues, we always use docValues,
unless numDocs is small and at least one field that needs to be returned is
stored only.
Therefore, in the first pass (id and score fields only) we will always use
docValues to retrieve the id field.
Here are some benchmark results (3 shards, 1 replica each, SSD drive, 18000
docs):
||Query||AVG QTime with optimizer (ms)||AVG QTime without optimizer (ms)||
|q=*:*&fl=TITLE_str,id,REVISION_TEXT_str&start=10000|49|267|
|q=*:*&fl=TITLE_str,id,REVISION_TEXT_str&start=10|3|3|
> Decide default when requested fields are both column and row stored.
> --------------------------------------------------------------------
>
> Key: SOLR-8344
> URL: https://issues.apache.org/jira/browse/SOLR-8344
> Project: Solr
> Issue Type: New Feature
> Reporter: Ishan Chattopadhyaya
> Attachments: SOLR-8344.patch
>
>
> This issue was discussed in the comments at SOLR-8220. Splitting it out to a
> separate issue so that we can have a focused discussion on whether/how to do
> this.
> If a given set of requested fields are all stored and have docValues (column
> stored), we can retrieve the values from either place. What should the
> default be?