Re: Too many values for UnInvertedField faceting on field topic

2012-03-02 Thread Michael Jakl
Hi!

On Thu, Mar 1, 2012 at 23:54, Yonik Seeley yo...@lucidimagination.com wrote:
 On Thu, Mar 1, 2012 at 3:34 AM, Michael Jakl jakl.mich...@gmail.com wrote:
 The topic field holds roughly 5
 values per doc, but I wasn't able to compute the correct number right
 now.

 How many unique values for that field in the whole index?
 If you have log output (or output from the stats page for
 fieldValueCache) that should tell you exactly.

I'm sorry, I've already reduced the size of the index and I'm in the
process of splitting it into a few shards. Solr couldn't build the
fieldValueCache for this particular field (that's where the exception
came from).

Thanks,
Michael


Re: Too many values for UnInvertedField faceting on field topic

2012-03-01 Thread Michael Jakl
Hi!

On Wed, Feb 29, 2012 at 22:21, Emmanuel Espina espinaemman...@gmail.com wrote:
 No. But probably we can find another way to do what you want. Please
 describe the problem and include some numbers to give us an idea of
 the sizes that you are handling. Number of documents, size of the
 index, etc.

Thank you! Our Solr currently holds about 168 million documents. From each
of these documents we extract the most important keywords and store
them in a multivalued field (topic). Our goal is to provide faceted
navigation through these topics. The topic field holds roughly 5
values per doc, but I wasn't able to compute the correct number right
now.

The use cases require the facets to be computed within a reasonable
time (1-2 seconds), which we were able to achieve with a 192GB RAM
machine and regular warming.

Splitting the index into a few smaller Solr instances (even on the same
machine) seems to be the most promising way, but I've been shying away
from it for several reasons: higher complexity, a huge reimport (though
I could split the current index), and some components didn't support it
when we were starting (Grouping was only introduced with 3.5 IIRC). I've
tested the sharding approach and it was a bit slower than the single
huge index approach.
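
For readers following along, a sharded facet request of the kind described above might be built roughly like this. It is only a sketch: the host, port and the core names shard1 and shard2 are hypothetical, and the only Solr-specific parts are the standard facet parameters plus the `shards` parameter used for distributed search.

```python
from urllib.parse import urlencode


def distributed_facet_url(base_url, shards, field, limit=20):
    """Build a distributed facet request URL.

    base_url: the core that coordinates the request (hypothetical here).
    shards:   host:port/path entries without the http:// prefix.
    """
    params = [
        ("q", "*:*"),
        ("rows", 0),                   # only the facet counts are wanted
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", limit),
        ("shards", ",".join(shards)),  # comma-separated list of shards
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical two-core layout on a single machine.
url = distributed_facet_url(
    "http://localhost:8983/solr/shard1",
    ["localhost:8983/solr/shard1", "localhost:8983/solr/shard2"],
    "topic",
)
print(url)
```

Each shard then builds its own, smaller structure for the topic field, and the coordinating core merges the per-shard counts.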

I'd be happy to hear some suggestions,
Michael


Re: Too many values for UnInvertedField faceting on field topic

2012-03-01 Thread Yonik Seeley
On Thu, Mar 1, 2012 at 3:34 AM, Michael Jakl jakl.mich...@gmail.com wrote:
 The topic field holds roughly 5
 values per doc, but I wasn't able to compute the correct number right
 now.

How many unique values for that field in the whole index?
If you have log output (or output from the stats page for
fieldValueCache) that should tell you exactly.
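
If neither the log output nor the stats page is at hand, one other way to get at the number of unique values is the Luke request handler. The sketch below assumes the handler is registered at /admin/luke, that the response is requested as JSON, and that the per-field entry carries a "distinct" count; the base URL is hypothetical, and on an index of this size the request may take a long time.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def luke_url(base_url, field):
    """URL asking the Luke handler about a single field, without top terms."""
    params = [("fl", field), ("numTerms", 0), ("wt", "json")]
    return base_url.rstrip("/") + "/admin/luke?" + urlencode(params)


def distinct_terms(luke_response, field):
    """Pull the 'distinct' count out of a parsed Luke response.

    Returns None if the field or the count is missing (the response
    layout is an assumption, see above).
    """
    return luke_response.get("fields", {}).get(field, {}).get("distinct")


def fetch_distinct_terms(base_url, field):
    """Query a running Solr instance and return the unique-term count."""
    with urlopen(luke_url(base_url, field)) as resp:
        return distinct_terms(json.load(resp), field)
```

Usage would be something like fetch_distinct_terms("http://localhost:8983/solr", "topic") against a live instance.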

-Yonik


Too many values for UnInvertedField faceting on field topic

2012-02-29 Thread Michael Jakl
Our Solr started to throw the following exception when requesting the
facets of a multivalued field holding a lot of terms.

SEVERE: org.apache.solr.common.SolrException: Too many values for
UnInvertedField faceting on field topic
        at org.apache.solr.request.UnInvertedField.uninvert(UnInvertedField.java:390)
        at org.apache.solr.request.UnInvertedField.init(UnInvertedField.java:180)
        at org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:871)
        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:287)
        at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:319)
        at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:193)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1373)
        at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:54)
        at org.apache.solr.core.SolrCore$4.call(SolrCore.java:1198)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:139)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:909)
        at java.lang.Thread.run(Thread.java:662)

Is there a way around it, maybe a setting to increase the limit?
Using facet.method=enum, as suggested in a thread in 2009, is far too
slow, at least in the experiments I did.
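
For reference, the facet.method=enum variant mentioned above can be requested roughly as follows. This is only a sketch of the request parameters: facet.enum.cache.minDf is a standard Solr parameter that keeps rare terms out of the filterCache, but the threshold of 1000 is an arbitrary illustration, not a tuned value, and nothing here suggests it makes the enum method fast enough for this index.

```python
from urllib.parse import urlencode


def enum_facet_query(field, min_df, limit=20):
    """Query string for faceting on `field` with the term-enumeration method."""
    params = [
        ("q", "*:*"),
        ("rows", 0),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", "enum"),            # walk the terms, intersect doc sets
        ("facet.enum.cache.minDf", min_df),  # skip filterCache for rarer terms
        ("facet.limit", limit),
    ]
    return urlencode(params)


# min_df=1000 is a made-up example value.
query = enum_facet_query("topic", min_df=1000)
print(query)
```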

I'm using Solr 3.5.0 on Linux (192GB RAM), so faceting was pretty fast
after an initial cache warming.

Cheers,
Michael


Re: Too many values for UnInvertedField faceting on field topic

2012-02-29 Thread Emmanuel Espina
No. But probably we can find another way to do what you want. Please
describe the problem and include some numbers to give us an idea of
the sizes that you are handling. Number of documents, size of the
index, etc.

Thanks
Emmanuel

2012/2/29 Michael Jakl jakl.mich...@gmail.com:
 Our Solr started to throw the following exception when requesting the
 facets of a multivalued field holding a lot of terms.

 SEVERE: org.apache.solr.common.SolrException: Too many values for
 UnInvertedField faceting on field topic
        at org.apache.solr.request.UnInvertedField.uninvert(UnInvertedField.java:390)
        at org.apache.solr.request.UnInvertedField.init(UnInvertedField.java:180)
        at org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:871)
        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:287)
        at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:319)
        at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:193)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1373)
        at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:54)
        at org.apache.solr.core.SolrCore$4.call(SolrCore.java:1198)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:139)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:909)
        at java.lang.Thread.run(Thread.java:662)

 Is there a way around it, maybe a setting to increase the limit?
 Using facet.method=enum, as suggested in a thread in 2009, is far too
 slow, at least in the experiments I did.

 I'm using Solr 3.5.0 on Linux (192GB RAM), so faceting was pretty fast
 after an initial cache warming.

 Cheers,
 Michael