On Tue, Dec 21, 2010 at 4:02 PM, Rok Rejc <rokrej...@gmail.com> wrote:
> Dear all,
>
> I have created an index with approx. 1.1 billion documents (around 500GB)
> running on Solr 1.4.1 (64-bit JVM).
>
> I want to enable faceted navigation on an int field, which contains around
> 250 unique values. According to the wiki there are two methods:
>
> facet.method=fc, which uses the field cache. This method should use
> MaxDoc*4 bytes of memory, which is around 4.1GB.
facet.method=fc uses the FieldCache, but it currently uses the StringIndex for
all field types, so you need to add in space for the string representation of
all the unique values. Since there are only 250 of them against this large
number of docs, your estimate should still be close.

> facet.method=enum, which creates a bitset for each unique value. This method
> should use NumberOfUniqueValues * SizeOfBitSet, which is around 32GB.

A more efficient representation is used for a set when the set size is less
than maxDoc/64. This set type uses an int per doc in the set, so it should use
roughly the same amount of memory as a numeric fieldcache entry.

> Are my calculations correct?
>
> My memory settings in Tomcat (Windows) are:
> Initial memory pool: 4096 MB
> Maximum memory pool: 8192 MB (total 12GB in my test machine)
>
> I have tried to run a query
> (...&facet=true&facet.field=PublisherId&facet.method=fc) but I am still
> getting OOM:
>
> HTTP Status 500 - Java heap space
> java.lang.OutOfMemoryError: Java heap space
>     at org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:703)
>     at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:224)
>     at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:692)
>     at org.apache.solr.request.SimpleFacets.getFieldCacheCounts(SimpleFacets.java:350)
>     at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:255)
>     at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:283)
>     at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:166)
>     at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
>     at ...
>
> Any idea what am I doing wrong, or have I miscalculated the memory
> requirements?

Perhaps you are already sorting or faceting on another field that is causing a
lot of memory to already be in use, and this pushes it over the edge? Or perhaps
the JVM simply can't find a contiguous area of memory this large?

Line 703 is this:

    final int[] retArray = new int[reader.maxDoc()];

so it's failing to create the first array. The line after it is even more
troublesome, though:

    String[] mterms = new String[reader.maxDoc()+1];

Although you only need an array of 250 to contain all the unique terms,
FieldCacheImpl starts out with maxDoc. I think trunk will be far better in this
regard. You should also try facet.method=enum, though.

-Yonik
http://www.lucidimagination.com
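For reference, a back-of-the-envelope sketch of the numbers discussed in this
thread, assuming maxDoc of roughly 1.1 billion, about 250 unique values, and 8
bytes per object reference on a 64-bit JVM without compressed oops (the class
and method names are made up for illustration):

    public class FacetMemoryEstimate {
        public static void main(String[] args) {
            long maxDoc = 1100000000L;   // ~1.1 billion documents
            long uniqueValues = 250L;    // unique values in the faceted field

            // facet.method=fc: the StringIndex entry allocates an int ord per
            // document plus a term array sized to maxDoc+1 up front, even
            // though only ~250 slots will ever hold a term.
            long ordArrayBytes  = maxDoc * 4L;        // new int[reader.maxDoc()]
            long termArrayBytes = (maxDoc + 1) * 8L;  // new String[reader.maxDoc()+1]
            System.out.println("fc ord array:  ~" + toGB(ordArrayBytes) + " GB");
            System.out.println("fc term array: ~" + toGB(termArrayBytes) + " GB (references alone)");

            // facet.method=enum, worst case: one full bitset (one bit per doc)
            // per unique value. Sets smaller than maxDoc/64 use an int per
            // matching doc instead, so the real cost depends on how many docs
            // each value actually matches.
            long bitsetBytes = uniqueValues * (maxDoc / 8L);
            System.out.println("enum bitsets:  ~" + toGB(bitsetBytes) + " GB worst case");
        }

        private static long toGB(long bytes) {
            return bytes / (1024L * 1024L * 1024L);
        }
    }

This prints roughly 4, 8 and 32 GB respectively, which lines up with the 4.1GB
and 32GB estimates above and suggests why the fc method alone can exhaust an
8GB heap on an index of this size.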