On 1/31/2014 2:15 PM, Tom Burton-West wrote:
> When trying to facet on 200 million documents with a facet field that
> has a very large number of unique values, we are running into OOM's.
>  See this thread for background:
> http://lucene.472066.n3.nabble.com/Estimating-peak-memory-use-for-UnInvertedField-faceting-tt4100044.html
> 
> Otis suggested that using DocValues might solve the memory issues.
> 
> There seem to be several options for setting the DocValuesFormat.  Can
> someone please clarify what the choices are for Solr 4.6 and what the
> trade-offs are in terms of memory use and faceting performance?

To minimize the amount of heap memory required, you should use the disk
format.  There is one caveat, though -- only the default format remains
compatible when an index built with one Solr version is opened with a
newer Solr version.  If you set it to disk, there's a very good chance
that you'll need to wipe out your index and rebuild it from scratch when
you upgrade Solr.  Rebuilding on upgrade is of course always recommended,
but your index is not typical, and at 200 million documents a full
rebuild is a much bigger job than it is for most people.
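
For reference, here is a rough sketch of what the disk setting looks
like in a 4.x config.  The fieldType and field names are hypothetical,
chosen only for illustration, and the attribute spelling should be
checked against the documentation for your exact version.  As far as I
know, the per-field docValuesFormat is only honored when the
schema-aware codec factory is enabled in solrconfig.xml.

```xml
<!-- solrconfig.xml: let the schema select per-field formats -->
<codecFactory class="solr.SchemaCodecFactory"/>

<!-- schema.xml: names below are hypothetical, for illustration only -->
<fieldType name="string_dv_disk" class="solr.StrField"
           docValuesFormat="Disk"/>
<field name="facet_topic" type="string_dv_disk"
       indexed="true" stored="false" docValues="true"/>
```

Leaving docValuesFormat off the fieldType gives you the default format,
which is the one that stays compatible across upgrades.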

I've heard that if you change the docValues format back to default and
optimize your index, you can then upgrade safely, go back to disk, and
optimize again, but I've never actually tried this.  I always rebuild
from scratch when I upgrade.  I would imagine that until each optimize
finishes, anything that actually uses the docValues won't work right.

Side anecdote: Because I maintain and update two completely separate
production copies of my main Solr index (rather than use replication),
upgrades are not terribly painful for us, even though a complete rebuild
is always done whenever there's an upgrade or a significant config change.

Thanks,
Shawn

