I sure don't see how this can work given the constraints. Just to hold the
values, assuming each doc has a value in 150 of these fields, you need 150
fields * 4 bytes * 14,000,000 docs, or about 8.4 GB, and you just don't
have that much memory to play around with.
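
Back-of-the-envelope, in case it helps (a sketch only; it assumes the
fieldCache keeps one uninverted 4-byte int per document for every field you
sort on, and it ignores JVM object overhead and the rest of the heap):

    # fieldCache sizing sketch for sorting on TrieIntFields
    NUM_DOCS = 14_000_000
    BYTES_PER_VALUE = 4  # one int per doc per sorted field

    per_field = NUM_DOCS * BYTES_PER_VALUE
    print(f"per sorted field: {per_field / 1e6:.0f} MB")   # ~56 MB
    print(f"150 fields: {150 * per_field / 1e9:.1f} GB")   # ~8.4 GB
    print(f"45 fields:  {45 * per_field / 1e9:.1f} GB")    # ~2.5 GB

That ~2.5 GB at 45 fields squares with you hitting OOM on a 3.6 GB heap
once the rest of the index machinery has taken its share.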

Sharding seems silly for 14M docs, but that might be what's necessary. Or
get hardware with lots of memory.

Or redefine the problem so you don't have to sort on so many fields. Not
quite sure how to do that off the top of my head, but...
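
One thing that might help you confirm what's eating the heap: the 3.x admin
stats page exposes the Lucene fieldCache stats, so you can watch the entry
count climb as you sort on each new field. Rough sketch (the URL and the
markup scan are guesses for a stock Tomcat install; adjust both for your
setup):

    # crude fieldCache growth check against the Solr 3.x admin stats page
    import re
    import urllib.request

    # placeholder URL -- point this at your Tomcat instance and core
    url = "http://localhost:8080/solr/admin/stats.jsp"
    page = urllib.request.urlopen(url).read().decode("utf-8")

    # stats.jsp reports an entries_count stat for the fieldCache bean;
    # this is a rough string scan, not a proper parse of the page
    m = re.search(r'"entries_count">([^<]+)', page)
    print("fieldCache entries:", m.group(1) if m else "not found")

Run it after each new sort and you should see the count grow by one per
field until the heap gives out.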

Best
Erick


On Tue, Nov 27, 2012 at 9:25 PM, Arun Rangarajan
<arunrangara...@gmail.com> wrote:

> We have a Solr 3.6 core with about 250 TrieIntFields (declared via
> dynamicField). There are about 14M docs in our Solr index, and many
> documents have values in many of these fields. Over time, we need to be
> able to sort on all 250 of these fields.
>
> The issue we are facing is that the underlying Lucene fieldCache fills up
> very quickly. We have a 4 GB box and the index size is 18 GB. After
> sorting on 40 or 45 of these dynamic fields, memory consumption is around
> 90% (Tomcat is set up with a max heap size of 3.6 GB) and we start
> getting OutOfMemory errors.
>
> For now, we have a cron job running every minute that restarts Tomcat
> whenever total memory consumption exceeds 80%.
>
> We thought that if we used boosting instead of sorting, the values
> wouldn't go through the fieldCache. So instead of issuing a query like
>
> select?q=name:alba&sort=relevance_11 desc
>
> we tried
>
> select?q={!boost relevance_11}name:alba
>
> but unfortunately boosting also populates the fieldCache, presumably
> because the boost function reads per-document values for relevance_11
> through the same cache that sorting uses.
>
> From what I have read, I understand that restricting the number of
> distinct values in sortable Solr fields will reduce the fieldCache's
> memory footprint. The values in these sortable fields can be any integer
> from 0 to 33,000, and they are quite widely distributed. We have a few
> scaling solutions in mind, but what is the best way to handle this whole
> issue?
>
> thanks.
>
