Hello, 

We are working with a large Lucene 4.3.0 index, using 
SortedSetDocValuesFacetFields to create facets and 
SortedSetDocValuesAccumulator for facet accumulation. We couldn't use a 
taxonomy-based facet implementation: we search through a MultiReader, and our 
index is composed of multiple physical Lucene indices, so we cannot have a 
single taxonomy index. We have two million categories and expect to have 
another two million in the near future. As the current implementation of 
SortedSetDocValuesAccumulator does not support ReusingFacetArrays, we are 
concerned about potential garbage-collection performance issues in our 
high-traffic application. Will a future Lucene release support using 
ReusingFacetArrays in SortedSetDocValuesAccumulator?
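
To make the concern concrete, here is a rough back-of-the-envelope sketch. It 
assumes the accumulator allocates one 4-byte int counter per category for 
every request (our reading of the 4.3.0 source, not a confirmed guarantee). 
The class name, method names, and the rate of 100 requests per second are 
hypothetical and used only for illustration.

```java
// Rough estimate of per-request counts-array allocation.
// Assumption: one int (4 bytes) per category, allocated fresh on each request.
public class FacetCountsFootprint {

    // Size of a Java int in bytes.
    static final long BYTES_PER_INT = 4L;

    // Bytes needed for one counts array covering every category.
    static long countsArrayBytes(long numCategories) {
        return numCategories * BYTES_PER_INT;
    }

    // Bytes allocated per second if each request allocates a new array.
    static long allocatedBytesPerSecond(long numCategories, long requestsPerSecond) {
        return countsArrayBytes(numCategories) * requestsPerSecond;
    }

    public static void main(String[] args) {
        long today = 2000000L;   // current number of categories
        long future = 4000000L;  // expected number of categories
        long rate = 100L;        // hypothetical request rate

        System.out.println("Per request today:  " + countsArrayBytes(today) + " bytes");
        System.out.println("Per request future: " + countsArrayBytes(future) + " bytes");
        System.out.println("Per second future:  "
                + allocatedBytesPerSecond(future, rate) + " bytes");
    }
}
```

Under these assumptions each request allocates roughly 8 MB today and 16 MB 
once the category count doubles, ignoring array object headers.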

As an alternative, we are considering subclassing FacetIndexingParams to 
provide dimension-specific CategoryListParams at indexing time. This would 
help reduce the size of the FacetArray per facet request. We realize this 
approach will not support multiple FacetRequests in a single 
SortedSetDocValuesAccumulator, because the SortedSetDocValuesReaderState 
constructor hardcodes the category to null when it calls 
FacetIndexingParams.getCategoryListParams(null). 

Are there better approaches to this problem?


Thanks in advance for any help. 

Kannan
Cengage Learning