Analysis tool vs search query
Hi,

I've run into an issue that I can't resolve, because the analysis tool doesn't show me any error. I copy the exact field value into the analysis tool, type in the exact query I'm issuing, and the tool reports a match. However, running the query with that exact same request doesn't return the item. I know the item is there, since I can find it through another field. The problem appears to occur when I add a second word to my query. I also tried replacing all whitespace with _, just to check whether there's a mismatch there, but there isn't.

Here is my field type definition, in case I'm missing something.

Thanks,
Tony

<fieldType name="prefix_search" class="solr.TextField" positionIncrementGap="1">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[\-.,()]" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="\s+" replacement="_" replace="all"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="40"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[\-.,()]" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="\s+" replacement="_" replace="all"/>
  </analyzer>
</fieldType>

Example inputs for analysis:
Index value: Banana, Veggie
Query value: banana veggie

--
View this message in context: http://old.nabble.com/Analysis-tool-vs-search-query-tp27316047p27316047.html
Sent from the Solr - User mailing list archive at Nabble.com.
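One way to reason about the mismatch is to mimic the two analyzer chains by hand. The Python sketch below is only an approximation of the field type in the post: it leaves out accent folding, the stop filter and the synonym filter (assumed to have no effect on this input), and the function names are made up for illustration. It compares the analysis-tool view, where the whole query string goes through the chain as a single token, with a hypothetical case where the query text is split on whitespace before it reaches the analyzer. That split is an assumption about the query parser, not something the post confirms, but it would be consistent with the problem only showing up once a second word is added.

```python
import re

def normalize(text):
    """Steps shared by both chains: lowercase, strip punctuation, whitespace -> '_'."""
    text = text.lower()
    text = re.sub(r"[\-.,()]", "", text)
    return re.sub(r"\s+", "_", text)

def index_terms(value, min_gram=1, max_gram=40):
    """Index chain: the whole value is one token, then front edge n-grams of it."""
    token = normalize(value)
    return {token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)}

def query_terms(query, split_on_whitespace):
    """Query chain; optionally pre-split the raw query on whitespace (assumed parser behaviour)."""
    parts = query.split() if split_on_whitespace else [query]
    return [normalize(part) for part in parts]

indexed = index_terms("Banana, Veggie")

# Analysis-tool view: the full query string is analyzed as one token.
whole = query_terms("banana veggie", split_on_whitespace=False)
# Hypothetical query-parser view: each word is analyzed on its own.
split = query_terms("banana veggie", split_on_whitespace=True)

print(whole, [term in indexed for term in whole])
print(split, [term in indexed for term in split])
```

In this sketch the single token banana_veggie is one of the indexed edge n-grams, while a separately analyzed veggie is not, because it is not a prefix of the indexed value. If that is what happens, a query that requires every term would miss the document even though the analysis tool reports a match.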
Re: Solr/Lucene keeps eating up memory while idling
> Did I read that right? 330K docs == 12 GB index.

Oops, I missed the dot: it's 1.2 GB. I don't think that should really make a difference in this case, though. Even if it were 12 GB, it would just have some really juicy documents, right? :)

> Can you share the Solr logs and/or your config? Is this happening around a commit or some warming process? After startup, with no requests hitting it and no warming/commits/indexing, I don't see why it would be growing. Do you have custom code?

There is custom code around the SolrJ API; however, it does not explain this behaviour, because no requests are coming through it. No indexing, commits or queries are sent to the server after it has started up, except for the initial 2 warming queries (can those be to blame for this, even with no caches present?). Here they are in the log (it's at its default verbosity, so I'll refrain from posting the whole startup unless necessary). After the initial startup, what you see in the log is a GC every 2.5 min and a Full GC every 30 min. No actual activity is present.

Oct 15, 2009 1:13:36 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={start=0&q=fast_warm&rows=10} hits=0 status=0 QTime=16853
Oct 15, 2009 1:13:36 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Oct 15, 2009 1:13:36 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={q=static+firstSearcher+warming+query+from+solrconfig.xml} hits=0 status=0 QTime=204
Oct 15, 2009 1:13:36 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done

Here is the config on it:

<config>
  <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>
  <dataDir>/r9/flare1.data/solr/data</dataDir>
  <indexDefaults>
    <useCompoundFile>false</useCompoundFile>
    <mergeFactor>10</mergeFactor>
    <ramBufferSizeMB>32</ramBufferSizeMB>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>1</maxFieldLength>
    <writeLockTimeout>1000</writeLockTimeout>
    <commitLockTimeout>1</commitLockTimeout>
    <lockType>single</lockType>
  </indexDefaults>
  <mainIndex>
    <useCompoundFile>false</useCompoundFile>
    <ramBufferSizeMB>32</ramBufferSizeMB>
    <mergeFactor>10</mergeFactor>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>1</maxFieldLength>
    <unlockOnStartup>false</unlockOnStartup>
  </mainIndex>
  <jmx/>
  <updateHandler class="solr.DirectUpdateHandler2">
  </updateHandler>
  <query>
    <maxBooleanClauses>1024</maxBooleanClauses>
    <queryResultWindowSize>50</queryResultWindowSize>
    <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
    <HashDocSet maxSize="3000" loadFactor="0.75"/>
    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">10</str> </lst>
        <lst> <str name="q">rocks</str> <str name="start">0</str> <str name="rows">10</str> </lst>
        <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst>
      </arr>
    </listener>
    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst> <str name="q">fast_warm</str> <str name="start">0</str> <str name="rows">10</str> </lst>
        <lst><str name="q">static firstSearcher warming query from solrconfig.xml</str></lst>
      </arr>
    </listener>
    <useColdSearcher>false</useColdSearcher>
    <maxWarmingSearchers>2</maxWarmingSearchers>
  </query>
  <requestDispatcher handleSelect="true">
    <requestParsers enableRemoteStreaming="false" multipartUploadLimitInKB="2048"/>
  </requestDispatcher>
  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
    </lst>
  </requestHandler>
  <requestHandler name="dismax" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="echoParams">explicit</str>
      <float name="tie">0.01</float>
      <str name="qf">text^0.5 address_t^2.0 name^1.5 brand^1.1 airport_name_t^1.0</str>
      <str name="pf">text^0.2 address_t^1.1 name^1.5 brand^1.4 brand_exact^1.9 airport_name_t^1.0</str>
      <str name="fl">id,name,price,score</str>
      <int name="ps">100</int>
      <str name="q.alt">*:*</str>
      <str name="hl.fl">text features name</str>
      <str name="f.name.hl.fragsize">0</str>
      <str name="f.name.hl.alternateField">name</str>
      <str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
      <str name="spellcheck">true</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.count">5</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>
  <requestHandler name="partitioned" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="echoParams">explicit</str>
      <str name="qf">text^0.5 features^1.0 name^1.2 id^10.0</str>
      <str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str>
      <str name="bq">incubationdate_dt:[* TO NOW/DAY-1MONTH]^2.2</str>
    </lst>
    <lst name="appends">
      <str name="fq">inStock:true</str>
Re: Solr/Lucene keeps eating up memory while idling
Here is exactly half an hour from roughly the beginning of logging. There's nothing to see really, because no requests are sent; you just see the GC behaviour:

[Full GC 211987K->208493K(432448K), 0.6273480 secs]
[GC 276333K->212269K(438720K), 0.0929710 secs]
[GC 289133K->216269K(439936K), 0.1019780 secs]
[GC 293133K->220205K(436672K), 0.1128410 secs]
[GC 304301K->224429K(441472K), 0.1358250 secs]
[GC 308525K->228685K(431744K), 0.1559950 secs]
[GC 317197K->233069K(437312K), 0.1642160 secs]
[GC 321581K->237613K(432832K), 0.1772830 secs]
[GC 329197K->242093K(435136K), 0.1896270 secs]
[GC 333677K->246701K(436352K), 0.2039880 secs]
[GC 274165K->247917K(437760K), 0.2022640 secs]
[Full GC 247917K->208726K(437760K), 0.7195200 secs]

The heap is set to 1400m, so it'll take a while to hit the roof. I also haven't tested whether it stabilises, but I'll leave it running now and see what happens to it overnight. I assume that when (if) it reaches the heap limit, it'll just do Full GCs more often.

Grant Ingersoll-6 wrote:
Please send a log covering at least the 2.5 minutes you discuss, but upwards of 5 minutes would be good.

--
View this message in context: http://www.nabble.com/Solr-Lucene-keeps-eating-up-memory-while-idling-tp25894357p25916348.html
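To read the numbers rather than eyeball them, the GC lines above can be parsed with a few lines of Python. This is a minimal sketch: the regular expression is written for the line layout shown in this message only, the variable names are illustrative, and half an hour of data cannot rule out a slow leak. It compares the heap occupancy left after each Full GC with the occupancy left after the minor collections in between.

```python
import re

GC_LOG = """\
[Full GC 211987K->208493K(432448K), 0.6273480 secs]
[GC 276333K->212269K(438720K), 0.0929710 secs]
[GC 289133K->216269K(439936K), 0.1019780 secs]
[GC 293133K->220205K(436672K), 0.1128410 secs]
[GC 304301K->224429K(441472K), 0.1358250 secs]
[GC 308525K->228685K(431744K), 0.1559950 secs]
[GC 317197K->233069K(437312K), 0.1642160 secs]
[GC 321581K->237613K(432832K), 0.1772830 secs]
[GC 329197K->242093K(435136K), 0.1896270 secs]
[GC 333677K->246701K(436352K), 0.2039880 secs]
[GC 274165K->247917K(437760K), 0.2022640 secs]
[Full GC 247917K->208726K(437760K), 0.7195200 secs]
"""

# Accept "before->after" as well as a "before-after" form with the '>' lost in archiving.
LINE = re.compile(r"\[(Full GC|GC) (\d+)K->?(\d+)K\((\d+)K\), ([\d.]+) secs\]")

def parse(log):
    """Return (kind, before_K, after_K, committed_K, seconds) for each GC line."""
    return [
        (kind, int(before), int(after), int(total), float(secs))
        for kind, before, after, total, secs in LINE.findall(log)
    ]

events = parse(GC_LOG)
full_after = [after for kind, _, after, _, _ in events if kind == "Full GC"]
minor_after = [after for kind, _, after, _, _ in events if kind == "GC"]

print("occupancy after each Full GC (K):", full_after)
print("change between the two Full GCs (K):", full_after[-1] - full_after[0])
print("occupancy after minor GCs (K):", minor_after[0], "to", minor_after[-1])
```

For this sample, occupancy after the minor collections creeps up from 212269K to 247917K, but the second Full GC brings it back to 208726K, only 233K above the first. On this evidence most of the growth looks like collectable garbage rather than retained data, although a longer log would be needed to say so with confidence.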
Solr/Lucene keeps eating up memory while idling
I'm curious why this is occurring and whether I can prevent it. This is my scenario:

Locally I have an idle Solr 1.3 service running on Lucene 2.4.1, with an index of ~330K documents containing ~10 fields each (total size ~12GB). Currently I've turned off all caching and lazy field loading; however, I do have facet fields set for some request handlers.

What I'm seeing is heap space usage increasing by ~1.2MB per 2 sec (in java.lang.String objects). I'm assuming they're being used by Lucene, but I may be wrong about that, since I have no actual data to confirm it.

Why exactly is this happening, considering that no requests are being serviced? Shouldn't the memory usage stabilise with a certain set of information and only be affected by requests? Additionally, there is a Full GC every half hour, which seems very unreasonable on a machine that isn't actually being used as a service.

I really hope there's just a setting that I've overlooked, or a concept I'm not understanding, because otherwise this behaviour seems very unreasonable...

Thanks in advance,
Tony

--
View this message in context: http://www.nabble.com/Solr-Lucene-keeps-eating-up-memory-while-idling-tp25894357p25894357.html