Re: Multiple Blocked threads on UnInvertedField.getUnInvertedField() SegmentReader$CoreReaders.getTermsReader

2011-03-17 Thread Rachita Choudhary
Hi Yonik,

I have another question related to fieldValueCache.
When we uninvert a facet field, the entry gets added to the fieldValueCache
even if termInstances = 0 for that field.
What is the reason for caching facet fields with termInstances=0?

In our case, a lot of time is being spent in the 'uninvert' process. From
the 'time' values, I see that it goes up to 20 secs for certain facet fields.

Example:
UnInverted multi-valued field
{field=product_brands_61936,memSize=4224,tindexSize=32,time=20202,phase1=20202,nTerms=0,bigTerms=0,termInstances=0,uses=0}

Also, for the same facet field, the time and phase1 values vary from 3 msec
to 20 secs.
What is the reason for this variation?
Also, what does nTerms represent?

Thanks,
Rachita
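
For anyone comparing many such cache entries, a small helper can pull the numbers out of the stats string and flag entries that were slow to build but ended up empty. This is only a sketch: the function names and the 1000 ms threshold are made up for illustration and are not part of Solr.

```python
def parse_uninverted_stats(line):
    """Parse '{field=...,memSize=...,...}' into a dict; numeric values become ints."""
    body = line.strip().strip("{}")
    stats = {}
    for pair in body.split(","):
        key, _, value = pair.partition("=")
        stats[key] = int(value) if value.isdigit() else value
    return stats


def is_slow_empty_entry(stats, slow_ms=1000):
    """True when uninverting took at least slow_ms yet found no term instances.

    slow_ms is an arbitrary threshold chosen for this sketch.
    """
    return stats["termInstances"] == 0 and stats["time"] >= slow_ms


# The entry quoted in the message above.
entry = parse_uninverted_stats(
    "{field=product_brands_61936,memSize=4224,tindexSize=32,time=20202,"
    "phase1=20202,nTerms=0,bigTerms=0,termInstances=0,uses=0}"
)
print(entry["field"], entry["time"], is_slow_empty_entry(entry))
```

Run over all entries shown on the cache statistics page, this would list the fields that cost seconds to uninvert while contributing no terms.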

On Mon, Mar 7, 2011 at 8:22 PM, Yonik Seeley yo...@lucidimagination.com wrote:

 On Mon, Mar 7, 2011 at 9:44 AM, Rachita Choudhary
 rachita.choudh...@burrp.com wrote:
  As the enum method will create a bitset for all the unique values

 It's more complex than that.
  - small sets will use a sorted int set... not a bitset
  - you can control what gets cached via facet.enum.cache.minDf parameter

 -Yonik
 http://lucidimagination.com
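
To put rough numbers on the bitset versus sorted int set point, here is a back-of-the-envelope sketch. It counts payload bytes only (no Java object overhead) and uses the 1.1 million documents and 19,000 unique values reported elsewhere in this thread. The `small_set_limit` cutoff and all function names are assumptions made for illustration; they are not values or APIs taken from Solr.

```python
def bitset_bytes(max_doc):
    """Bytes for one bitset holding one bit per document in the index."""
    return (max_doc + 7) // 8


def sorted_int_set_bytes(doc_freq):
    """Bytes for a sorted set of doc_freq 32-bit document ids."""
    return 4 * doc_freq


def cached_bytes(doc_freq, max_doc, small_set_limit, min_df=0):
    """Rough bytes cached for one term under the enum method."""
    if doc_freq < min_df:
        return 0  # below facet.enum.cache.minDf: the term's set is not cached
    if doc_freq < small_set_limit:
        return sorted_int_set_bytes(doc_freq)
    return bitset_bytes(max_doc)


MAX_DOC = 1_100_000               # about 1.1 million documents (from this thread)
SMALL_SET_LIMIT = MAX_DOC // 64   # assumed cutoff, for this sketch only

per_bitset = bitset_bytes(MAX_DOC)
worst_case = 19_000 * per_bitset  # if all 19,000 terms were cached as bitsets
small_term = cached_bytes(100, MAX_DOC, SMALL_SET_LIMIT)  # term matching 100 docs

print(per_bitset, worst_case, small_term)
```

Under these assumptions one bitset is about 137,500 bytes, so 19,000 of them would come to roughly 2.6 GB, whereas a term matching 100 documents costs about 400 bytes as an int set. The worst case therefore only applies if most terms match a large share of the index.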



Re: Multiple Blocked threads on UnInvertedField.getUnInvertedField() SegmentReader$CoreReaders.getTermsReader

2011-03-07 Thread Rachita Choudhary
Hi Yonik,

Thanks for the information, but we are still facing issues related to
slowness and high memory usage.

As per my understanding, the default 'fc' method suits our use case, as we
have about 1.1 million documents in total and the number of unique values for
the facet fields is quite high.
We facet on 5 fields, and the numbers of unique values are:
Field 1 : 19,000
Field 2 : 19,000
Field 3 : 55,000
Field 4 : 474
Field 5 : 27 (the alphabetical faceting)

All the facet fields are of type string and multivalued.

As the enum method will create a bitset for all the unique values, it would
consume more memory than the fc method.
Also, even with a field value cache size of 100, the heap (max 6 GB)
is getting consumed pretty fast.

With about 60 parallel requests, contributing about 4 million queries in
total, about 25% of our queries have a QTime above 1 sec.
The max QTime shoots up to 55 sec.

Debugging deeper into the Solr and Lucene code, the particular method that
slows us down is SolrIndexSearcher.numDocs, which internally gets the terms by
loading them from the index.
I have not been able to determine the root cause of this.

Any other pointers/suggestions in this regard will be helpful.

Thanks,
Rachita

On Tue, Feb 22, 2011 at 10:42 PM, Yonik Seeley
yo...@lucidimagination.com wrote:

 On Tue, Feb 22, 2011 at 9:13 AM, Rachita Choudhary
 rachita.choudh...@burrp.com wrote:
  Hi Solr Users,
 
  We are upgrading from Solr 1.3 to Solr 1.4.1.
  While using Solr 1.3, we were seeing multiple blocked threads on
  org.apache.lucene.store.FSDirectory$FSIndexInput.readInternal().
 
  After upgrading to Solr 1.4.1 to utilize the benefits of NIO, we see another
  kind of blocked thread, on
  org.apache.solr.request.UnInvertedField.getUnInvertedField() and
  SegmentReader$CoreReaders.getTermsReader.
  Due to this, the QTime shoots up from a few hundred to thousands of
  msec, even going up to 30-40 secs for a single query.
 
  - The blocked threads show up after a few thousand queries.
  - We do not have faceting and sorting on the same fields.
  - Our facet fields are multivalued text fields, but no large text values
    are present.
  - Index size: around 10 GB
  - We have not specified any method for faceting in our schema.xml.
  - Our field value cache settings are:
    <fieldValueCache
      class="solr.FastLRUCache"
      size="175"
      autowarmCount="0"
      showItems="10"
    />
 
  Can someone please tell us why we are seeing these blocked threads?
  Also, if they are related to our field value cache, a cache of size 175
  will be filled up by the first few queries, so shouldn't we see the blocked
  threads right after that?
  What difference will it make if we use facet.method=enum?

 fc method on a multivalued field instantiates an UnInvertedField (like
 a multi-valued field cache) which can take some time.
 Just like sorting, you may want to use some warming faceting queries
 to make sure that real queries don't pay the cost of the initial entry
 construction.

 From your fieldValueCache statistics, it looks like the number of
 terms is low enough that the enum method may be fine here.

 -Yonik
 http://lucidimagination.com
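
As an illustration of the warming suggestion, here is a sketch that builds a facet-only request which could be sent once after each new searcher opens, so that the UnInvertedField entry is built before real traffic arrives. The host, port, and path are hypothetical, and `warming_url` is a name invented for this example. In a real deployment the same parameters would more commonly be placed in a newSearcher or firstSearcher listener in solrconfig.xml.

```python
from urllib.parse import urlencode


def warming_url(base_url, facet_fields, method=None):
    """Build a facet-only query URL (no rows returned) for cache warming."""
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    params += [("facet.field", field) for field in facet_fields]
    if method is not None:
        # "fc" or "enum"; leave unset to use Solr's default for the field
        params.append(("facet.method", method))
    return base_url + "?" + urlencode(params)


# Hypothetical Solr endpoint; facetField2 is the field name used in this thread.
url = warming_url("http://localhost:8983/solr/select", ["facetField2"], method="enum")
print(url)
```

Passing `method="enum"` makes it easy to time the same warming request under both faceting methods before choosing one.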


  Is this all related to fieldValueCache, or is there some other configuration
  which we need to set to avoid these blocked threads?
 
  Thanks,
  Rachita
 
  *Cache values example:*
  facetField1_27443:
  {field=facet1_27443,memSize=4214884,tindexSize=52,time=22,phase1=15,nTerms=4,bigTerms=0,termInstances=6,uses=1}
 
  facetField1_70:
  {field=facetField1_70,memSize=4223310,tindexSize=308,time=28,phase1=21,nTerms=636,bigTerms=0,termInstances=14404,uses=1}
 
  facetField2:
  {field=facetField2,memSize=4262644,tindexSize=3156,time=273,phase1=267,nTerms=12188,bigTerms=0,termInstances=1255522,uses=7031}



Multiple Blocked threads on UnInvertedField.getUnInvertedField() SegmentReader$CoreReaders.getTermsReader

2011-02-22 Thread Rachita Choudhary
Hi Solr Users,

We are upgrading from Solr 1.3 to Solr 1.4.1.
While using Solr 1.3, we were seeing multiple blocked threads on
org.apache.lucene.store.FSDirectory$FSIndexInput.readInternal().

After upgrading to Solr 1.4.1 to utilize the benefits of NIO, we see another
kind of blocked thread, on
org.apache.solr.request.UnInvertedField.getUnInvertedField() and
SegmentReader$CoreReaders.getTermsReader.
Due to this, the QTime shoots up from a few hundred to thousands of
msec, even going up to 30-40 secs for a single query.

- The blocked threads show up after a few thousand queries.
- We do not have faceting and sorting on the same fields.
- Our facet fields are multivalued text fields, but no large text values are
  present.
- Index size: around 10 GB
- We have not specified any method for faceting in our schema.xml.
- Our field value cache settings are:
  <fieldValueCache
    class="solr.FastLRUCache"
    size="175"
    autowarmCount="0"
    showItems="10"
  />

Can someone please tell us why we are seeing these blocked threads?
Also, if they are related to our field value cache, a cache of size 175
will be filled up by the first few queries, so shouldn't we see the blocked
threads right after that?
What difference will it make if we use facet.method=enum?
Is this all related to fieldValueCache, or is there some other configuration
which we need to set to avoid these blocked threads?

Thanks,
Rachita

*Cache values example:*
facetField1_27443:
{field=facet1_27443,memSize=4214884,tindexSize=52,time=22,phase1=15,nTerms=4,bigTerms=0,termInstances=6,uses=1}

facetField1_70:
{field=facetField1_70,memSize=4223310,tindexSize=308,time=28,phase1=21,nTerms=636,bigTerms=0,termInstances=14404,uses=1}

facetField2:
{field=facetField2,memSize=4262644,tindexSize=3156,time=273,phase1=267,nTerms=12188,bigTerms=0,termInstances=1255522,uses=7031}

*Stack trace for org.apache.solr.request.UnInvertedField.getUnInvertedField() - BLOCKED*

at org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:837)
at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:250)
at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:283)
at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:166)
at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:87)
at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:187)
at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:266)
at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:270)
at com.caucho.server.port.TcpConnection.run(TcpConnection.java:678)
at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:721)
at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:643)
at java.lang.Thread.run(Thread.java:595)


*org.apache.lucene.index.SegmentReader$CoreReaders.getTermsReader() - BLOCKED*

at org.apache.lucene.index.SegmentReader$CoreReaders.getTermsReader(SegmentReader.java:170)
at org.apache.lucene.index.SegmentTermDocs.<init>(SegmentTermDocs.java:52)
at org.apache.lucene.index.SegmentReader.termDocs(SegmentReader.java:987)
at org.apache.lucene.index.IndexReader.termDocs(IndexReader.java:1102)
at org.apache.lucene.index.SegmentReader.termDocs(SegmentReader.java:981)
at org.apache.solr.search.SolrIndexReader.termDocs(SolrIndexReader.java:320)
at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:640)
at org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:563)
at org.apache.solr.search.SolrIndexSearcher.numDocs(SolrIndexSearcher.java:1422)
at com.askme.solrenhancements.facet.ExtendedFacet.getCustomFacetCount(ExtendedFacet.java:132)
at com.askme.solrenhancements.facet.ExtendedFacet.getCustomFacetCount(ExtendedFacet.java:92)
at com.askme.solrenhancements.facet.ExtendedFacet.getFacetAdditionalInfo(ExtendedFacet.java:69)
at com.askme.solrenhancements.facet.ExtendedFacet.getFacetInfo(ExtendedFacet.java:56)
at com.askme.solrenhancements.facet.CustomFacetComponent.process(CustomFacetComponent.java:43)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
 at 
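
When a full thread dump is available, counting BLOCKED threads by their top stack frame shows quickly which call most threads are stuck in. The sketch below assumes a jstack-style dump that contains `java.lang.Thread.State:` lines, which may not match the tool that produced the traces above. The thread names and ids in the sample are invented; only the frame names come from this thread.

```python
from collections import Counter


def blocked_top_frames(dump_text):
    """Count BLOCKED threads by the first 'at ...' frame of their stack."""
    counts = Counter()
    waiting_for_frame = False
    for raw in dump_text.splitlines():
        line = raw.strip()
        if line.startswith("java.lang.Thread.State:"):
            # Only the frame right after a BLOCKED state line is counted.
            waiting_for_frame = "BLOCKED" in line
        elif waiting_for_frame and line.startswith("at "):
            frame = line[3:].split("(", 1)[0]  # drop "(File.java:123)"
            counts[frame] += 1
            waiting_for_frame = False
    return counts


# Invented sample dump, for illustration only.
SAMPLE = """
"http-1" prio=10 tid=0x01 nid=0x11 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:837)
    at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:250)

"http-2" prio=10 tid=0x02 nid=0x12 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:837)

"http-3" prio=10 tid=0x03 nid=0x13 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.lucene.index.SegmentReader$CoreReaders.getTermsReader(SegmentReader.java:170)

"http-4" prio=10 tid=0x04 nid=0x14 runnable
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
"""

counts = blocked_top_frames(SAMPLE)
for frame, n in counts.most_common():
    print(n, frame)
```

Comparing these counts across several dumps taken a few seconds apart helps separate a single long-held lock from general slowness.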

Solr 1.4.1 using more memory than Solr 1.3

2011-02-09 Thread Rachita Choudhary
Hi Solr Users,

We are in the process of upgrading from Solr 1.3 to Solr 1.4.1.
While stress testing Solr 1.4.1 to measure the improvement in query times
(QTime) and to confirm that threads no longer block, we ran into memory
issues.

Test Setup details:
- 2 identical hosts, one running Solr 1.3 and the other Solr 1.4.1.
- 3 cores with index sizes: 10 GB, 2 GB, 1 GB.
- JVM max heap: 3 GB (-Xmx3072m); total RAM: 4 GB
- No other application/service running on the servers.
- For querying the Solr servers, we are using wget from a standalone host.

For the same index data and the same set of queries, Solr 1.3 hovers
between 1.5 and 2.2 GB, whereas after about 20K requests Solr 1.4.1
reaches its 3 GB limit and performs a full GC after almost every query. The
full GC does not free up any memory either.

Has anyone else faced similar issues with Solr 1.4.1?

Also, why is Solr 1.4.1 using more memory than Solr 1.3 for the same amount
of processing?

Is there any particular configuration that needs to be done to avoid this
high memory usage?

Thanks,
Rachita