Re: Solr memory use, jmap and TermInfos/tii

2010-09-11 Thread Simon Willnauer
On Sun, Sep 12, 2010 at 1:51 AM, Michael McCandless wrote: > On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom wrote: >>  Is there an example of how to set up the divisor parameter in >> solrconfig.xml somewhere? > > Alas I don't know how to configure terms index divisor from Solr... You can s

Re: Solr and jvm Garbage Collection tuning

2010-09-11 Thread Dennis Gearon
Thanks for the real life examples. You would have to do a LOT of sharding to get that to work better. Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Fri, 9/10/10, K

Re: Solr memory use, jmap and TermInfos/tii

2010-09-11 Thread Michael McCandless
On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom wrote: >  Is there an example of how to set up the divisor parameter in solrconfig.xml > somewhere? Alas I don't know how to configure terms index divisor from Solr... >>>In 4.0, w/ flex indexing, the RAM efficiency is much better -- we use lar

Re: Solr memory use, jmap and TermInfos/tii

2010-09-11 Thread Lance Norskog
There is a trick: facets with only one occurrence tend to be misspellings or dirt. You write a program to fetch the terms (Lucene's CheckIndex is a great starting point) and create a stopwords file. Here's a data mining project: which languages are more vulnerable to dirty OCR? Burton-West, Tom w
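
A minimal sketch of the single-occurrence trick described in this message, assuming the term counts have already been fetched from the index into a dictionary. The `term_counts` data and both helper functions are hypothetical illustrations, not a Lucene or Solr API:

```python
def singleton_terms(term_counts):
    """Return, in sorted order, the terms that occur exactly once."""
    return sorted(term for term, count in term_counts.items() if count == 1)


def write_stopwords(terms, path):
    """Write one term per line, the usual layout of a stopwords file."""
    with open(path, "w", encoding="utf-8") as handle:
        for term in terms:
            handle.write(term + "\n")


# Made-up counts for illustration only; real counts would come from the index.
term_counts = {"library": 42, "libnary": 1, "history": 17, "hist0ry": 1}
candidates = singleton_terms(term_counts)
```

Calling `write_stopwords(candidates, "stopwords.txt")` would then produce a file to review by hand before use, since a term that occurs once is not necessarily an error.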

mm=0?

2010-09-11 Thread Satish Kumar
Hi, We have a requirement to show at least one result every time -- i.e., even if the user-entered term is not found in any of the documents. I was hoping setting mm to 0 would return results in all cases, but it does not. For example, if the user entered the term "alpha" and it is *not* in any of the documents

RE: Solr memory use, jmap and TermInfos/tii

2010-09-11 Thread Burton-West, Tom
Thanks Mike, >>Do you use a terms index divisor? Setting that to 2 would halve the >>amount of RAM required but double (on average) the seek time to locate >>a given term (but, depending on your queries, that seek time may still >>be a negligible part of overall query time, ie the tradeoff could

RE: multivalued fields in result

2010-09-11 Thread Markus Jelsma
Yes, you'll get what is stored and asked for.   -Original message- From: Jason Chaffee Sent: Sat 11-09-2010 05:27 To: solr-user@lucene.apache.org; Subject: multivalued fields in result Is it possible to return multivalued fields in the result?   I would like to have a multivalued field

Re: Autocomplete with Filter Query

2010-09-11 Thread Ingo Renner
Am 10.09.2010 um 17:14 schrieb David Yang: Hi David, > Is there any way to provide autocomplete while filtering results? yes, you can use facets to achieve that. best Ingo -- Ingo Renner TYPO3 Core Developer, Release Manager TYPO3 4.2, Admin Google Summer of Code TYPO3 - Open Source Enterp

Re: Solr memory use, jmap and TermInfos/tii

2010-09-11 Thread Michael McCandless
Unfortunately, the terms index (before 4.0) is not RAM efficient -- I wrote about this here: http://chbits.blogspot.com/2010/07/lucenes-ram-usage-for-searching.html Every indexed term that's loaded into RAM creates 4 objects (TermInfo, Term, String, char[]), as you see in your profiler output
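
A rough back-of-the-envelope sketch of the tradeoff discussed in this thread: the terms index keeps only every Nth term in RAM, a divisor thins that further, and each loaded term costs four objects (TermInfo, Term, String, char[]) as stated above. The function names and the sample term count are hypothetical, and the interval of 128 is assumed to be the default in use:

```python
OBJECTS_PER_LOADED_TERM = 4  # TermInfo, Term, String, char[] (per the message above)


def loaded_index_terms(total_terms, index_interval=128, divisor=1):
    """Approximate number of terms from the terms index held in RAM."""
    return total_terms // (index_interval * divisor)


def objects_in_ram(total_terms, index_interval=128, divisor=1):
    """Approximate object count created by the loaded terms index."""
    return OBJECTS_PER_LOADED_TERM * loaded_index_terms(
        total_terms, index_interval, divisor
    )


# Made-up index size for illustration only.
total_terms = 2_560_000
without_divisor = objects_in_ram(total_terms, divisor=1)
with_divisor_2 = objects_in_ram(total_terms, divisor=2)
```

With these assumed numbers, a divisor of 2 halves the object count, which matches the "halve the RAM, double the average seek time" description quoted earlier in the thread. It says nothing about actual bytes per object, which depend on term length and the JVM.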