Ok, makes sense.
Thanks for your answer Erick.

Gaël

________________________________
From: Erick Erickson <erickerick...@gmail.com>
Sent: Wednesday, July 15, 2020 22:53
To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
Subject: Re: Disk usage with useDocValuesAsStored

You’re off track a bit. useDocValuesAsStored has no effect on the size on disk. 
It’s purely a runtime option that pulls the data to return from either the 
stored or docValues parts of the index. If you change the field definition
itself (stored, docValues) and reindex, you should see significant differences
in the size of your index,
particularly the “*.fdt/*.fdx” and “*.dvd/*.dvm” files, where stored and 
docValues are kept respectively.
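
To compare before and after a reindex, something like the sketch below could 
total the bytes per file type in a core's index directory. The function name 
and the example path are made up for illustration, and it assumes the segment 
files are not packed into compound (*.cfs) files, in which case the individual 
extensions would not be visible on disk.

```python
from collections import defaultdict
from pathlib import Path

# Extensions named above: stored fields vs. docValues.
STORED_EXTS = {".fdt", ".fdx"}
DOCVALUES_EXTS = {".dvd", ".dvm"}


def index_size_by_kind(index_dir):
    """Sum file sizes (in bytes) in index_dir, grouped by kind of data."""
    totals = defaultdict(int)
    for path in Path(index_dir).iterdir():
        if not path.is_file():
            continue
        if path.suffix in STORED_EXTS:
            kind = "stored"
        elif path.suffix in DOCVALUES_EXTS:
            kind = "docValues"
        else:
            kind = "other"
        totals[kind] += path.stat().st_size
    return dict(totals)


# Hypothetical usage; the path depends on your installation:
# print(index_size_by_kind("/var/solr/data/mycore/data/index"))
```

Run it once against the old index and once against the reindexed one, and 
compare the "stored" and "docValues" totals.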

However, it’s also apples and oranges. Specifically, using docValues as stored 
will _not_ necessarily return the fields the same way they were sent in the 
multiValued case. The docValues data is kept as a SORTED_SET, which means it’s 
both lexically sorted and deduplicated. So input like “a” “z” “h” “a” will 
return “a” “h” “z”.
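
As a rough model of that behavior (plain Python, not Solr code, and the 
function name is made up), the multiValued round trip looks like this:

```python
def returned_from_docvalues(values):
    """Model a SORTED_SET: duplicates are dropped and values come back sorted."""
    return sorted(set(values))


print(returned_from_docvalues(["a", "z", "h", "a"]))  # ['a', 'h', 'z']
```

So if the original order or the duplicates matter to your application, 
docValues alone will not give them back.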

Best,
Erick

> On Jul 15, 2020, at 1:35 PM, Gael Jourdan-Weil 
> <gael.jourdan-w...@kelkoogroup.com> wrote:
>
> Hello,
>
> I was wondering if we can expect significant disk usage reduction (index 
> size) if we move from fields defined as "docValues=true + stored=true" to 
> "docValues=true + stored=false" (with useDocValuesAsStored=true as default in 
> both cases)?
>
> Considering the use case we are targeting is only Streaming Expression with 
> /export handler, I also understand that we could set 
> useDocValuesAsStored=false from what is described at 
> https://lucene.apache.org/solr/guide/8_4/docvalues.html.
> If so, would setting useDocValuesAsStored=false help reduce the index size as 
> well?
>
> We will obviously try it and see the results for ourselves, but I was 
> wondering if you already have an idea about it.
> Also, if you have any good links on how data is physically stored depending 
> on the field options (indexed/stored/docValues), that would be really 
> interesting.
>
> Thanks,
> Gaël
