[ 
https://issues.apache.org/jira/browse/SOLR-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487562#comment-13487562
 ] 

Robert Muir commented on SOLR-3855:
-----------------------------------

Warning: I have only skimmed the patch.

{quote}
    configured on a per-field-type basis (docValueType=...),
    enabled on a per-field basis (docValues=true/false)
{quote}

Could we combine these, e.g. with a docValueType of "none" or something 
similar? This would parallel the Lucene APIs and maybe make things a bit 
simpler.
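
A minimal sketch of what the combined setting could look like, assuming a 
single attribute whose absence (or the value "none") means "no doc values". 
The enum name, its constants, and parse() are all hypothetical; they are 
neither the patch's names nor Lucene's.

```java
import java.util.Locale;

// Hypothetical enum: NUMERIC and BYTES are illustrative placeholders,
// not the actual set of doc value types in the patch or in Lucene.
enum DocValueType {
    NONE,     // field has no doc values; replaces a separate docValues=false flag
    NUMERIC,
    BYTES;

    /** Parses the schema attribute; a missing or empty attribute means NONE. */
    static DocValueType parse(String attr) {
        if (attr == null || attr.trim().isEmpty()) {
            return NONE;
        }
        // valueOf throws IllegalArgumentException for unknown names.
        return valueOf(attr.trim().toUpperCase(Locale.ROOT));
    }

    /** The boolean flag becomes a derived property instead of a second setting. */
    boolean enabled() {
        return this != NONE;
    }
}
```

With this shape, "enabled" is derived from the type, so the two settings 
cannot disagree.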

{quote}
When doc values are enabled, they have precedence over the field cache for 
getValueSource and getSortField, however faceting and stats cannot use doc 
values yet (I would like to do this as a separate issue).
{quote}

Ultimately it would be really great if fieldcache and docvalues had the same 
API. I worry about the fact that it's not this way currently. This shouldn't 
block this patch, it's just a semi-related discussion... It seems like 
fieldcache should be presented as "build docvalues on the fly for the field".
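
To make the "build docvalues on the fly" idea concrete, here is a rough 
sketch with one read interface and two sources behind it. Every name here 
(NumericValues, PrecomputedValues, OnTheFlyValues, the uninvert loader) is 
hypothetical and only illustrates the shape; it is not the Lucene or Solr API.

```java
import java.util.function.IntToLongFunction;

// Hypothetical common read API: callers do not care where values come from.
interface NumericValues {
    long get(int docId);
}

// Stand-in for real doc values: the per-document array already exists.
final class PrecomputedValues implements NumericValues {
    private final long[] values;

    PrecomputedValues(long[] values) {
        this.values = values.clone();
    }

    @Override
    public long get(int docId) {
        return values[docId];
    }
}

// Stand-in for the field cache: builds the same array lazily on first access.
final class OnTheFlyValues implements NumericValues {
    private final int maxDoc;
    private final IntToLongFunction uninvert; // hypothetical per-document loader
    private long[] cache;
    int builds = 0; // counts full builds, for demonstration only

    OnTheFlyValues(int maxDoc, IntToLongFunction uninvert) {
        this.maxDoc = maxDoc;
        this.uninvert = uninvert;
    }

    @Override
    public synchronized long get(int docId) {
        if (cache == null) {
            // The expensive step that precomputed doc values avoid at search time.
            long[] built = new long[maxDoc];
            for (int i = 0; i < maxDoc; i++) {
                built[i] = uninvert.applyAsLong(i);
            }
            cache = built;
            builds++;
        }
        return cache[docId];
    }
}
```

Code written against NumericValues (sorting, function queries) would then 
work unchanged with either source; only the warmup cost differs.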

It would be awesome if faceting etc. could use docvalues, though I think there 
is likely some work for the multivalued case. E.g. we would have to encode 
multiple tokens at a level above into the single-valued StraightBytes or 
whatever, a la DocTermOrds. Or maybe we should think about an actual type for 
this that would allow for more efficient impls?
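
As a sketch of the "encode at a level above" option: a document's sorted term 
ordinals can be packed into one byte[] as variable-length deltas, which is 
roughly the spirit of DocTermOrds. The class and method names are 
hypothetical, and this is not the actual DocTermOrds format.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: packs many ordinals into a single-valued bytes field.
final class MultiValuedOrds {
    private MultiValuedOrds() {}

    /** Encodes ascending ordinals as vInt deltas (7 data bits per byte). */
    static byte[] encode(int[] sortedOrds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int ord : sortedOrds) {
            int delta = ord - prev;
            if (delta < 0) {
                throw new IllegalArgumentException("ordinals must be sorted ascending");
            }
            writeVInt(out, delta);
            prev = ord;
        }
        return out.toByteArray();
    }

    /** Reverses encode(): reads vInt deltas and accumulates them into ordinals. */
    static int[] decode(byte[] bytes) {
        List<Integer> ords = new ArrayList<>();
        int pos = 0;
        int prev = 0;
        while (pos < bytes.length) {
            int delta = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0); // high bit set means more bytes follow
            prev += delta;
            ords.add(prev);
        }
        int[] result = new int[ords.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ords.get(i);
        }
        return result;
    }

    private static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }
}
```

This works, but every consumer has to know the private encoding, which is the 
argument for a dedicated multivalued type instead.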

{quote}
I also modified a lot of code (ReturnFields especially) to make DocValues 
behave like stored fields. I think this would be great for ID fields. In a 
cluster that has numShards shards, it would help decrease the number of disk 
seeks in the .fdt file (which is often too big to fit entirely in the OS cache) 
per request from (numShards * (start + rows) + rows) to rows.
{quote}

I didn't look at this part, but is this really true? It's numFields * rows, 
right? If it's some special case for ID fields where #idfields=1 for 
distributed search or whatever, I think that's a good optimization for that 
use-case. But in general, if docvalues are presented like stored fields for 
general purposes, I think that's not a great illusion to give to the user in 
case they have a lot of fields?
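
For reference, the three counts being compared can be written out as plain 
arithmetic. This is only a back-of-the-envelope model of the formulas in this 
thread, not a measurement; the class and method names are hypothetical, and 
the sample numbers (10 shards, start=100, rows=10, 20 fields) are made up.

```java
// Hypothetical helper that just spells out the formulas from the discussion.
final class SeekEstimates {
    private SeekEstimates() {}

    /** Quoted claim: .fdt seeks per request when IDs are read from stored fields. */
    static long storedFieldIdSeeks(int numShards, int start, int rows) {
        return (long) numShards * (start + rows) + rows;
    }

    /** Quoted claim: .fdt seeks per request when IDs come from doc values instead. */
    static long seeksWithDocValueIds(int rows) {
        return rows;
    }

    /** The concern: one lookup per field per row if doc values act as stored fields. */
    static long docValueLookups(int numFields, int rows) {
        return (long) numFields * rows;
    }
}
```

With the made-up numbers, storedFieldIdSeeks(10, 100, 10) is 1110 and 
seeksWithDocValueIds(10) is 10, so the ID-only case looks good, while 
docValueLookups(20, 10) is already 200 for a document with 20 such fields.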

Thanks for getting this started!
                
> DocValues support
> -----------------
>
>                 Key: SOLR-3855
>                 URL: https://issues.apache.org/jira/browse/SOLR-3855
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.1, 5.0
>
>         Attachments: SOLR-3855.patch
>
>
> It would be nice if Solr supported DocValues:
>  - for ID fields (fewer disk seeks when running distributed search),
>  - for sorting/faceting/function queries (faster warmup time than fieldcache),
>  - better on-disk and in-memory efficiency (you can use packed impls).

