Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-13 Thread Sanyi
- leave the current implementation, raising an exception; - handle the exception and limit the boolean query to the first 1024 (or whatever the limit is) terms; - select, between the possible terms, only the first 1024 (or whatever the limit is) more meaningful ones, leaving out all the

Anyone implemented custom hit ranking?

2004-11-13 Thread Sanyi
Hi! I have problems with short text ranking. I've read about the same ranking problems in the list archives, but found only hints and thoughts (adjust DefaultSimilarity, Similarity, etc...), not complete solutions with source code. Has anyone implemented a good solution for this problem? (example: my

How to efficiently get # of search results, per attribute

2004-11-13 Thread Chris Lamprecht
I'd like to implement a search across several types of entities, let's say, classes, professors, and departments. I want the user to be able to enter a simple, single query and not have to specify what they're looking for. Then I want the search results to be something like this: Search results

Re: How to efficiently get # of search results, per attribute

2004-11-13 Thread Nader Henein
It depends on how many results they're looking through. Here are two scenarios I see: 1] If you don't have that many records, you can fetch all the results and then do a post-parsing step to determine totals. 2] If you have a lot of entries in each category and you're worried about fetching
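A minimal sketch of the post-parsing step from scenario 1, assuming each fetched result exposes its entity type as a plain string. The class name `CategoryTotals` and the method `countByType` are hypothetical, chosen for illustration; this is plain Java and does not use any Lucene API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CategoryTotals {
    // Hypothetical helper: tallies how many fetched results fall into each
    // entity type (e.g. "class", "professor", "department") in a single pass.
    public static Map<String, Integer> countByType(List<String> resultTypes) {
        // LinkedHashMap keeps the order in which types were first seen.
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (String type : resultTypes) {
            totals.merge(type, 1, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Assumed input: the type of every hit returned for one query.
        List<String> types = Arrays.asList("class", "professor", "class", "department");
        System.out.println(countByType(types));
    }
}
```

This only suits small result sets, since every hit has to be fetched before the totals are known.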

Re: lucene Scorers

2004-11-13 Thread Paul Elschot
On Friday 12 November 2004 22:56, Chuck Williams wrote: I had a similar need and wrote MaxDisjunctionQuery and MaxDisjunctionScorer. Unfortunately these are not available as a patch but I've included the original message below that has the code (modulo line breaks added by simple text email

about Stemming

2004-11-13 Thread Miguel Angel
Hi, I have used the Lucene demos and I want to know how stemming can be added to my applications. -- Miguel Angel Angeles R. Asesoria en Conectividad y Servidores Telf. 97451277

Re: about Stemming

2004-11-13 Thread Bernhard Messer
Miguel Angel wrote: Hi, I have used the Lucene demos and I want to know how stemming can be added to my applications. Have a look at the lucene-sandbox. Under contributions there are stemmers for many different languages.

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-13 Thread Paul Elschot
On Saturday 13 November 2004 09:16, Sanyi wrote: - leave the current implementation, raising an exception; - handle the exception and limit the boolean query to the first 1024 (or whatever the limit is) terms; - select, between the possible terms, only the first 1024 (or whatever the

RE: How to efficiently get # of search results, per attribute

2004-11-13 Thread Chuck Williams
My Lucene application includes multi-faceted navigation that does a more complex version of the below. I've got 5 different taxonomies into which every indexed item is classified. The largest of the taxonomies has over 15,000 entries while the other 4 are much smaller. For every search query, I

Re: How to efficiently get # of search results, per attribute

2004-11-13 Thread Chris Lamprecht
Nader and Chuck, thanks for the responses; they're both helpful. My index sizes will begin on the order of 200,000 classes and 20,000 instructors (and far fewer departments), and grow over time to maybe a few million classes. Compared to some of the numbers I've seen on this mailing list, my

RE: Anyone implemented custom hit ranking?

2004-11-13 Thread Chuck Williams
I've done some customization of scoring/ranking and plan to do more. A good place to start is with your own Similarity, extending Lucene's DefaultSimilarity. Like you, I found the default length normalization to not work well with my dataset. I separately weight each indexed field according to

Mozilla Desktop Search

2004-11-13 Thread Kevin A. Burton
http://www.peerfear.org/rss/permalink/2004/11/13/MozillaDesktopSearch/ The Mozilla foundation may be considering a desktop search implementation http://computerworld.com/developmenttopics/websitemgmt/story/0,10801,97396,00.html?f=x10 : Having launched the much-awaited Version 1.0 of