Analysis tool vs search query

2010-01-25 Thread nonrenewable

Hi,

I've run into an issue that I can't resolve, because the analysis tool
doesn't show any error. I copy the exact field value into the analysis tool,
type in the exact query I'm issuing, and the tool reports a match. However,
running the query with that exact same request doesn't return the item.

I know the item is there, since I can find it through another field. The
problem appears when I add a second word to my query. I also tried replacing
all whitespace with _, just to rule out a mismatch there, but it made no
difference. Here is my field type definition, in case I'm missing something.
Thanks,
Tony

<fieldType name="prefix_search" class="solr.TextField" positionIncrementGap="1">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[\-.,()]" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\s+" replacement="_" replace="all"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="40"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[\-.,()]" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\s+" replacement="_" replace="all"/>
  </analyzer>
</fieldType>

Example inputs for analysis:
Index value: Banana, Veggie
Query value: banana veggie
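
To make the example inputs concrete, here is a rough Python sketch of the two analyzer chains. It is not Solr code: normalize, edge_ngrams, index_terms and query_terms are hypothetical helpers, the accent folding is only an approximation of ISOLatin1AccentFilterFactory, and the stop word and synonym filters are left out because their files are not shown. It also explores one possible explanation that the thread does not confirm: the analysis tool passes the whole string to the analyzer, while a query parser that splits on whitespace first would analyze each word separately.

```python
import re
import unicodedata


def fold_accents(text):
    # Approximation of accent folding: decompose and drop combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(text):
    # KeywordTokenizer keeps the whole input as a single token, so the
    # filters are applied to the full string, in the order of the field type.
    token = fold_accents(text.lower())
    token = re.sub(r"[\-.,()]", "", token)  # first PatternReplaceFilter
    token = re.sub(r"\s+", "_", token)      # second PatternReplaceFilter
    return token


def edge_ngrams(token, min_gram=1, max_gram=40):
    # Front-edge n-grams, as produced on the index side only.
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(value):
    return set(edge_ngrams(normalize(value)))


def query_terms(query, split_on_whitespace):
    # split_on_whitespace=True models the ASSUMED query-parser behaviour;
    # False models pasting the whole string into the analysis tool.
    chunks = query.split() if split_on_whitespace else [query]
    return [normalize(chunk) for chunk in chunks]


indexed = index_terms("Banana, Veggie")
for split in (False, True):
    terms = query_terms("banana veggie", split_on_whitespace=split)
    print(split, terms, [term in indexed for term in terms])
```

Under these assumptions the unsplit query yields the single term banana_veggie, which is one of the indexed grams, while the split query yields banana (indexed) and veggie (not indexed, since only front-edge grams are stored). Whether that is enough to drop the document would depend on the default operator or mm setting in use.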

-- 
View this message in context: 
http://old.nabble.com/Analysis-tool-vs-search-query-tp27316047p27316047.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr/Lucene keeps eating up memory while idling

2009-10-15 Thread nonrenewable

> Did I read that right?  330K docs == 12 GB index.

Oops, missed the dot: it's 1.2GB, but I don't think that really makes a
difference in this case. Even if it were 12GB, it would just have some really
juicy documents, right? :)

> Can you share the Solr logs and/or your config?  Is this happening
> around a commit or some warming process?  After startup, with no
> requests hitting it and no warming/commits/indexing, I don't see why
> it would be growing.  Do you have custom code?

There is custom code around the SolrJ API; however, it doesn't explain this
behaviour, because no requests are coming through it. No indexing, commits or
queries are sent to the server after it has started up, except for the
initial 2 warming queries (can those be to blame for this even with no caches
present?). Here they are in the log (it's on its default verbosity, so I'll
refrain from posting the whole startup unless necessary). After the initial
startup, all you see in the log is a GC every 2.5 min and a Full GC every
30 min. No actual activity is present.

Oct 15, 2009 1:13:36 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={start=0&q=fast_warm&rows=10} hits=0
status=0 QTime=16853
Oct 15, 2009 1:13:36 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Oct 15, 2009 1:13:36 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null
params={q=static+firstSearcher+warming+query+from+solrconfig.xml} hits=0
status=0 QTime=204 
Oct 15, 2009 1:13:36 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done


Here is its config:

<config>

  <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>
  <dataDir>/r9/flare1.data/solr/data</dataDir>
  <indexDefaults>
    <useCompoundFile>false</useCompoundFile>
    <mergeFactor>10</mergeFactor>
    <ramBufferSizeMB>32</ramBufferSizeMB>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>1</maxFieldLength>
    <writeLockTimeout>1000</writeLockTimeout>
    <commitLockTimeout>1</commitLockTimeout>
    <lockType>single</lockType>
  </indexDefaults>

  <mainIndex>
    <useCompoundFile>false</useCompoundFile>
    <ramBufferSizeMB>32</ramBufferSizeMB>
    <mergeFactor>10</mergeFactor>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>1</maxFieldLength>
    <unlockOnStartup>false</unlockOnStartup>
  </mainIndex>
  <jmx />

  <updateHandler class="solr.DirectUpdateHandler2">
  </updateHandler>


  <query>
    <maxBooleanClauses>1024</maxBooleanClauses>
    <queryResultWindowSize>50</queryResultWindowSize>
    <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
    <HashDocSet maxSize="3000" loadFactor="0.75"/>
    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">10</str> </lst>
        <lst> <str name="q">rocks</str> <str name="start">0</str> <str name="rows">10</str> </lst>
        <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst>
      </arr>
    </listener>
    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst> <str name="q">fast_warm</str> <str name="start">0</str> <str name="rows">10</str> </lst>
        <lst><str name="q">static firstSearcher warming query from solrconfig.xml</str></lst>
      </arr>
    </listener>
    <useColdSearcher>false</useColdSearcher>
    <maxWarmingSearchers>2</maxWarmingSearchers>
  </query>

  <requestDispatcher handleSelect="true">
    <requestParsers enableRemoteStreaming="false"
                    multipartUploadLimitInKB="2048" />
  </requestDispatcher>

  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
    </lst>
  </requestHandler>

  <requestHandler name="dismax" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="echoParams">explicit</str>
      <float name="tie">0.01</float>
      <str name="qf">
        text^0.5 address_t^2.0 name^1.5 brand^1.1 airport_name_t^1.0
      </str>
      <str name="pf">
        text^0.2 address_t^1.1 name^1.5 brand^1.4 brand_exact^1.9
        airport_name_t^1.0
      </str>
      <str name="fl">
        id,name,price,score
      </str>
      <int name="ps">100</int>
      <str name="q.alt">*:*</str>
      <str name="hl.fl">text features name</str>
      <str name="f.name.hl.fragsize">0</str>
      <str name="f.name.hl.alternateField">name</str>
      <str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
      <str name="spellcheck">true</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.count">5</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>
  <requestHandler name="partitioned" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="echoParams">explicit</str>
      <str name="qf">text^0.5 features^1.0 name^1.2 id^10.0</str>
      <str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str>
      <str name="bq">incubationdate_dt:[* TO NOW/DAY-1MONTH]^2.2</str>
    </lst>
    <lst name="appends">
      <str name="fq">inStock:true</str>

Re: Solr/Lucene keeps eating up memory while idling

2009-10-15 Thread nonrenewable

Here is exactly half an hour of output from roughly the beginning of logging.
There's nothing to see, really, because no requests are sent; you just see
the GC behaviour:
[Full GC 211987K->208493K(432448K), 0.6273480 secs]
[GC 276333K->212269K(438720K), 0.0929710 secs]
[GC 289133K->216269K(439936K), 0.1019780 secs]
[GC 293133K->220205K(436672K), 0.1128410 secs]
[GC 304301K->224429K(441472K), 0.1358250 secs]
[GC 308525K->228685K(431744K), 0.1559950 secs]
[GC 317197K->233069K(437312K), 0.1642160 secs]
[GC 321581K->237613K(432832K), 0.1772830 secs]
[GC 329197K->242093K(435136K), 0.1896270 secs]
[GC 333677K->246701K(436352K), 0.2039880 secs]
[GC 274165K->247917K(437760K), 0.2022640 secs]
[Full GC 247917K->208726K(437760K), 0.7195200 secs]

The heap is set to 1400m, so it will take a while to hit the roof. I also
haven't tested whether it stabilises, but I'll leave it running now and see
what happens to it overnight. I assume that when (if) it reaches the heap
limit, it will just run full GCs more often.
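
For anyone who wants to read the numbers out of a log like the one above, here is a small Python sketch. The parse helper and the GC_LOG constant are made up for illustration, and the pattern assumes the usual "before->after(total)" layout of these lines (it also tolerates a missing ">", since some archives strip it). It separates two things: how much heap is still in use after each full GC, and how much was allocated between collections.

```python
import re

# The twelve log lines from this message, in order.
GC_LOG = """\
[Full GC 211987K->208493K(432448K), 0.6273480 secs]
[GC 276333K->212269K(438720K), 0.0929710 secs]
[GC 289133K->216269K(439936K), 0.1019780 secs]
[GC 293133K->220205K(436672K), 0.1128410 secs]
[GC 304301K->224429K(441472K), 0.1358250 secs]
[GC 308525K->228685K(431744K), 0.1559950 secs]
[GC 317197K->233069K(437312K), 0.1642160 secs]
[GC 321581K->237613K(432832K), 0.1772830 secs]
[GC 329197K->242093K(435136K), 0.1896270 secs]
[GC 333677K->246701K(436352K), 0.2039880 secs]
[GC 274165K->247917K(437760K), 0.2022640 secs]
[Full GC 247917K->208726K(437760K), 0.7195200 secs]
"""

LINE = re.compile(r"\[(Full GC|GC) (\d+)K->?(\d+)K\((\d+)K\), ([\d.]+) secs\]")


def parse(log):
    # Returns (kind, used_before_K, used_after_K, heap_size_K, pause_secs).
    events = []
    for match in LINE.finditer(log):
        kind, before, after, total, secs = match.groups()
        events.append((kind, int(before), int(after), int(total), float(secs)))
    return events


events = parse(GC_LOG)

# Heap still in use right after each full collection.
full_floor = [after for kind, _, after, _, _ in events if kind == "Full GC"]

# Memory allocated between one collection ending and the next one starting.
allocated = [events[i][1] - events[i - 1][2] for i in range(1, len(events))]

print("in use after each full GC (K):", full_floor)
print("growth between full GCs (K):", full_floor[-1] - full_floor[0])
print("allocated between collections (K):", sum(allocated))
```

On these twelve lines the heap in use after the two full GCs differs by only 233K, while roughly 760MB was allocated in between. That would suggest most of the growth is short-lived garbage that the full GC reclaims rather than retained memory, although one half-hour window is too little to be sure.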


Grant Ingersoll-6 wrote:
 
> Please send a log covering at least the 2.5 minutes you discuss, but
> upwards of 5 minutes would be good.
 

-- 
View this message in context: 
http://www.nabble.com/Solr-Lucene-keeps-eating-up-memory-while-idling-tp25894357p25916348.html



Solr/Lucene keeps eating up memory while idling

2009-10-14 Thread nonrenewable

I'm curious why this is occurring and whether I can prevent it. This is my
scenario:

Locally I have an idle Solr 1.3 service running on Lucene 2.4.1, with an
index of ~330K documents containing ~10 fields each (total size ~12GB).
I've currently turned off all caching and lazy field loading; however, I do
have facet fields set for some request handlers.

What I'm seeing is heap usage increasing by ~1.2MB every 2 sec (in
java.lang.String objects). I'm assuming they're being used by Lucene, but I
may be wrong about that, since I have no actual data to confirm it. Why
exactly is this happening, considering no requests are being serviced?
Shouldn't the memory usage stabilise once a certain set of information is
loaded, and only change on requests? Additionally, there is a full GC every
half hour, which seems very unreasonable on a machine that isn't actually
being used as a service.

I really hope there's just a setting that I've overlooked, or a concept I'm
not understanding, because otherwise this behaviour seems very
unreasonable...

Thanks in advance,
Tony
-- 
View this message in context: 
http://www.nabble.com/Solr-Lucene-keeps-eating-up-memory-while-idling-tp25894357p25894357.html