[Resent] Document boosting based on .. semantics?

2008-02-19 Thread Markus Fischer
Hi, [Resent: guess I sent the first before I completed my subscription, just in case it comes up twice ...] the subject may be a bit weird but I couldn't find a better way to describe a problem I'm trying to solve. If I'm not mistaken, one factor of scoring is the distance of the word with

Re: FieldSortedHitQueue rise in memory

2008-02-19 Thread Brian Doyle
In our custom sort code we had many issues. We fixed most of them by implementing hashCode and equals, and then needed to hold a WeakReference to the IndexReader. Once we did this our memory problems mostly went away. Thanks. On Feb 19, 2008 7:37 AM, Peter Keegan <[EMAIL PROTECTED]> wrote:

Re: Rails and lucene

2008-02-19 Thread Briggs
I agree with using Solr. Solr can output Ruby code so it can be immediately evaluated. http://wiki.apache.org/solr/SolRuby?highlight=%28CategoryQueryResponseWriter%29%7C%28%28CategoryQueryResponseWriter%29%29 Solr is located at: http://lucene.apache.org/solr/ On Feb 19, 2008 3:25 PM, Kyle Max

Re: Rails and lucene

2008-02-19 Thread Kyle Maxwell
> Hi guys, > Now an idea knock my brain, which I want to integrate the lucene into my > ruby application. And the newest lucene api owns the interface to join the > ruby application. Unfortunately I have no experience about it. Let us talk > about it. Use Solr, or integrate Lucene via JRuby. I

RE: query question

2008-02-19 Thread Steven A Rowe
Hi C.B., Yonik is referring to a Solr class: You should theoretically be able to use this filter with straight Lucene code, as long as it's on the classpath. (I'm guessing Yo

RE: How to index word-pairs and phrases

2008-02-19 Thread Steven A Rowe
Hi Ghinwa, o.a.l.analysis.ngram.NGramTokenizer is a *character*-level n-gram tokenizer - from text "word1 word2 word3" you get tokens "wo", "or", "rd", "d1", etc. The ShingleFilter gives you "word1 word2" and "word2 word3". Steve On 02/19/2008 at 11:28 AM, Ghinwa Choueiter wrote: > What about > \
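The difference Steve describes between character n-grams and word shingles can be sketched in plain Java. This is only an illustration of the two windowing schemes, not the Lucene classes themselves: the class name `NGramVsShingle` and both method names are hypothetical, and the real tokenizer and filter have options (such as minimum and maximum gram size) that are not modelled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; it does not use or reproduce any Lucene API.
final class NGramVsShingle {

    /** Character n-grams: slide a window of n characters over the raw text. */
    static List<String> charNGrams(String text, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    /** Word shingles: slide a window of n whitespace-separated words. */
    static List<String> wordShingles(String text, int n) {
        String[] words = text.trim().split("\\s+");
        List<String> shingles = new ArrayList<>();
        for (int i = 0; i + n <= words.length; i++) {
            // Join the n consecutive words back together with single spaces.
            shingles.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return shingles;
    }

    public static void main(String[] args) {
        String text = "word1 word2 word3";
        System.out.println(charNGrams(text, 2));
        System.out.println(wordShingles(text, 2));
    }
}
```

For indexing word pairs, the second scheme is the relevant one: each shingle is a candidate term made of adjacent words, whereas character n-grams cut across word boundaries.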

RE: How to index word-pairs and phrases

2008-02-19 Thread Ghinwa Choueiter
What about \contrib\analyzers\src\java\org\apache\lucene\analysis\ngram ?? Does this tokenizer do what I need? thank you, -Ghinwa On Tue, 19 Feb 2008, Steven A Rowe wrote: Mark, The ShingleFilter contrib has not been committed yet - it's still here: https://issues.apache.org/jira/browse/

RE: How to index word-pairs and phrases

2008-02-19 Thread Steven A Rowe
Mark, The ShingleFilter contrib has not been committed yet - it's still here: https://issues.apache.org/jira/browse/LUCENE-400 Steve On 02/19/2008 at 2:33 AM, markharw00d wrote: > Further to Grant's useful background - there is an analyzer specifically > for multi-word terms in "contrib". Se

Re: query question

2008-02-19 Thread Yonik Seeley
One way is to use WordDelimiterFilter in the analyzer. The example schema has it in the fieldType "text"... also check out http://localhost:8983/solr/admin/analysis.jsp -Yonik On Feb 19, 2008 7:21 AM, Cam Bazz <[EMAIL PROTECTED]> wrote: > Hello, > > I have a tokenized field where I store some inf
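The idea behind this suggestion can be sketched without Solr: if tokens are split wherever letters meet digits, the query "abc1234" and the indexed text "abc 1234" both reduce to the same parts. The following is a minimal plain-Java sketch of that splitting rule only; `LetterDigitSplitter` is a hypothetical name, and the real WordDelimiterFilter has further behaviour and configuration that is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not Solr's WordDelimiterFilter.
final class LetterDigitSplitter {

    /** Splits on non-alphanumeric characters and on letter/digit transitions. */
    static List<String> split(String text) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                flush(current, parts);          // delimiter: end the current part
                continue;
            }
            if (current.length() > 0) {
                char prev = current.charAt(current.length() - 1);
                if (Character.isLetter(prev) != Character.isLetter(c)) {
                    flush(current, parts);      // letter<->digit boundary
                }
            }
            current.append(c);
        }
        flush(current, parts);
        return parts;
    }

    private static void flush(StringBuilder current, List<String> parts) {
        if (current.length() > 0) {
            parts.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(split("abc1234"));   // query as typed by the user
        System.out.println(split("abc 1234"));  // text as stored in the field
    }
}
```

Both inputs yield the same two parts, which is why applying such a rule in the analyzer lets the concatenated query find the spaced value.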

RE: Rails and lucene

2008-02-19 Thread Steven A Rowe
Hi Cooper Geng, Ferret is a Lucene-inspired Ruby search engine for Ruby - maybe that would be useful for you?: Steve On 02/19/2008 at 2:25 AM, coolgeng coolgeng wrote: > Hi guys, > Now an idea knock my brain, which I want to integrate the > lucene into

Re: FieldSortedHitQueue rise in memory

2008-02-19 Thread Peter Keegan
Hi Brian, I ran into something similar a long time ago. My custom sort objects were being cached by Lucene, but there were too many of them because each one had different 'reference values' for different queries. So, I changed the equals and hashCode methods to NOT use any instance data, thus avoi
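A minimal sketch of the equals/hashCode change Peter describes, under stated assumptions: `StatelessSortKey` and its `referenceValue` field are hypothetical stand-ins for a custom sort object, and a plain `HashSet` stands in for whatever cache holds those objects. It shows only why ignoring instance data keeps a hash-based cache from growing with every query.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a custom sort object; not a Lucene class.
final class StatelessSortKey {
    // Per-query data, deliberately excluded from equals() and hashCode().
    private final double referenceValue;

    StatelessSortKey(double referenceValue) {
        this.referenceValue = referenceValue;
    }

    double referenceValue() {
        return referenceValue;
    }

    @Override
    public boolean equals(Object other) {
        // All instances of this class are considered equal.
        return other != null && other.getClass() == getClass();
    }

    @Override
    public int hashCode() {
        // Consistent with equals(): depends only on the class.
        return getClass().hashCode();
    }

    /** Counts how many distinct entries a hash-based cache would hold. */
    static int distinctKeys(double... referenceValues) {
        Set<StatelessSortKey> cache = new HashSet<>();
        for (double v : referenceValues) {
            cache.add(new StatelessSortKey(v));
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctKeys(1.0, 2.0, 3.0));
    }
}
```

The trade-off is that anything cached under such a key is shared by all instances, so this only makes sense when the cached value does not itself depend on the per-query data.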

query question

2008-02-19 Thread Cam Bazz
Hello, I have a tokenized field where I store some info. Let's say I have "abc 1234" and "abc 678". When the user searches for "abc1234", how can I find "abc 1234"? Best. -C.B.

Re: Searching multiple indexes

2008-02-19 Thread Cedric Ho
> > I have some questions about searching multiple indexes. > > > > 1. IndexSearcher with a MultiReader will search the indexes > > sequentially? I think need to use either MultiSearcher or ParallelMultiSearcher > > > > 2. ParallelMultiSearcher searches in parallel. How is this > > done? One thre

RE: Searching multiple indexes

2008-02-19 Thread spring
No ideas? :( > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > Sent: Saturday, February 16, 2008 15:42 > To: java-user@lucene.apache.org > Subject: Searching multiple indexes > > Hi, > > I have some questions about searching multiple indexes. > > 1. IndexSearche

Re: Lucene in EJB environment

2008-02-19 Thread techkatta
Yes, it is true that the Lucene performance guidelines recommend using only one IndexSearcher instance. But I am using Lucene with Berkeley DB JE as a datastore in the EJB. So when exiting from the Session Bean I close the Berkeley DB JE connections as per the EJB standards, which internally closes the J

Re: How to index word-pairs and phrases

2008-02-19 Thread markharw00d
Further to Grant's useful background - there is an analyzer specifically for multi-word terms in "contrib". See Lucene\contrib\analyzers\src\java\org\apache\lucene\analysis\shingle Cheers Mark Hi Ghinwa, A Term is simply a unit of tokenization that has been indexed for a Field, produced by a