[ 
https://issues.apache.org/jira/browse/SOLR-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491065#comment-13491065
 ] 

Adrien Grand commented on SOLR-3855:
------------------------------------

bq. Are you sure only one thing makes sense? What if I need integers that are 
larger than a short, but the range of values (max-min) is actually small? Then 
a Packed impl could make more sense. So we should think about this...

I understand your point. I am a big supporter of packed ints myself and will 
probably use them more often than fixed ints, but I still think that fixed_ints 
would be a good default (no one would be surprised if the doc values of a field 
which is an int in their schema require 4 bytes per value).

But if Lucene were able to switch automatically from packed ints to fixed_ints 
when the latter have less than x% overhead, this would be great!
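
To make the trade-off concrete, here is a rough sketch of such a heuristic. It 
is not Lucene's API: the class name, the method names and the 20% threshold 
used in main are all hypothetical, and it assumes packed storage encodes each 
value as an offset from the minimum.

```java
// Hypothetical sketch, not Lucene code: decides between packed and fixed
// 32-bit storage for an int field based on the relative overhead of fixed.
final class DocValuesSizing {

    /** Bits needed to represent any value in [0, range]; at least 1. */
    static int bitsRequired(long range) {
        if (range < 0) {
            throw new IllegalArgumentException("range must be >= 0");
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(range));
    }

    /** Bits per value when each value is stored as (value - min). */
    static int packedBitsPerValue(int[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (int v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return bitsRequired(max - min);
    }

    /** True if fixed 32-bit storage costs less than maxOverheadPct percent more than packed. */
    static boolean preferFixed(int[] values, double maxOverheadPct) {
        int packed = packedBitsPerValue(values);
        double overheadPct = 100.0 * (32 - packed) / packed;
        return overheadPct < maxOverheadPct;
    }

    public static void main(String[] args) {
        // Large values, small range (max-min = 7): packed needs only 3 bits.
        int[] narrow = {100000, 100003, 100007};
        // Wide range: packed needs 28 bits, so fixed ints cost about 14% more.
        int[] wide = {0, 1 << 27};
        System.out.println(preferFixed(narrow, 20.0));
        System.out.println(preferFixed(wide, 20.0));
    }
}
```

With a 20% threshold, the first array stays packed while the second would be 
stored as fixed ints.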

bq. Well I don't think there should be so many types

If you want to sort on a String field, there are 6 available types. And I think 
it should be easy for people getting started with Solr to do simple things such 
as sorting data without having to understand the different trade-offs of these 
doc values types in order to choose one. Otherwise the risk is that they keep 
using the field cache instead because they find it more convenient.

(I hate this argument because some people will certainly have trouble with 
SORTED doc values on a unique field of a very large index, but isn't it still 
better than the field cache anyway?)

bq. In my opinion instead of IndexWriter streaming docvalues to the codec 
directly, only to have the codec buffer up in RAM and use Counter for 
accounting, IndexWriter should buffer and things like STRAIGHT/VAR would just 
be optimizations...

+1

{quote} I'm still worried about this case: I don't like them treated as stored 
fields. It's only going to be more seeks if people have disk-enabled dvs that 
we must fetch in addition to the stored fields.
I haven't looked at the relevant bits, but is it possible we could treat "*" as 
just meaning the stored fields still? Basically, if you CHOOSE to request them, 
you get them, but we don't do anything trappy.{quote}

If we allow for direct doc values, it makes sense not to load them by default, 
but I think we should add documentation to the example schema.xml so that 
people know that it is wasteful to store fields if doc values are enabled and 
in memory, and that doc values can be added very easily to the response by 
adding the field name to the fl parameter.
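
As an illustration of what such a request might look like, the sketch below 
builds a select URL whose fl parameter asks for the stored fields ("*") plus 
one explicitly named field. The host, core name and the "popularity" field are 
made up, and whether a doc-values-only field is actually returned this way 
depends on the behavior being discussed in this issue.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: only builds the request URL, it does not call Solr.
final class FlExample {

    /** URL-encodes a parameter value as UTF-8. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported by the JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Builds a /select URL with the given query and comma-separated fl list. */
    static String selectUrl(String baseUrl, String query, String... fields) {
        String fl = String.join(",", fields);
        return baseUrl + "/select?q=" + encode(query) + "&fl=" + encode(fl);
    }

    public static void main(String[] args) {
        // "*" for the stored fields, plus an explicitly requested field.
        System.out.println(selectUrl(
            "http://localhost:8983/solr/collection1", "*:*", "*", "popularity"));
    }
}
```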

In case the unique key has doc values and is not stored, maybe it still makes 
sense to fetch it when fl=*?


                
> DocValues support
> -----------------
>
>                 Key: SOLR-3855
>                 URL: https://issues.apache.org/jira/browse/SOLR-3855
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.1, 5.0
>
>         Attachments: SOLR-3855.patch
>
>
> It would be nice if Solr supported DocValues:
>  - for ID fields (fewer disk seeks when running distributed search),
>  - for sorting/faceting/function queries (faster warmup time than fieldcache),
>  - better on-disk and in-memory efficiency (you can use packed impls).
