Re: High frequency term for the searched query

2010-11-06 Thread starz10de
Hi Mike, I tried it like this: String indexName = "path"; IndexReader r = IndexReader.open(indexName); MoreLikeThis mlt = new MoreLikeThis(r); . . . . . . . . BooleanQuery result = (BooleanQuery) mlt.like(docNum); result.add(query, BooleanClause.Occur.MUST_NOT); how I can print t

Re: High frequency term for the searched query

2010-11-06 Thread Michael McCandless
Looks like maybe if you use MoreLikeThis directly, you can call its retrieveInterestingTerms(Reader) method? Or, MoreLikeThisQuery.rewrite will return a BooleanQuery whose clauses are the interesting terms? Mike On Fri, Nov 5, 2010 at 11:00 AM, starz10de wrote: > > HI Mike, > > I implemented M

Re: High frequency term for the searched query

2010-11-05 Thread starz10de
Hi Mike, I implemented MoreLikeThis but I couldn't figure out where or how to print the terms related to the given query. All I got is the documents relevant to the query, with their scores. Any idea how to get the related terms? -- View this message in context: http://lucene.472066.n3.nab

RE: High frequency term for the searched query

2010-11-05 Thread starz10de
Hi, I did as explained on the website: final Set<Term> terms = new HashSet<Term>(); query = searcher.rewrite(query); query.extractTerms(terms); for (Term t : terms) { int frequency = searcher.docFreq(t); } however I can't understa

Re: High frequency term for the searched query

2010-11-05 Thread Michael McCandless
Maybe MoreLikeThisQuery (under contrib/queries) will do what you want? Mike On Fri, Nov 5, 2010 at 3:33 AM, starz10de wrote: > > Hi, > > I need to expand the query with the most terms occurred with it in > documents. For example:  the word credits, tax, withdraw have high appearing > with Bank.

RE: High frequency term for the searched query

2010-11-05 Thread Uwe Schindler
to:farag_ah...@yahoo.com] > Sent: Friday, November 05, 2010 8:28 AM > To: java-user@lucene.apache.org > Subject: Re: High frequency term for the searched query > > > HI Chris, > > I tried your solution and got one problem "the method > extractterms(Set) is undefined f

Re: High frequency term for the searched query

2010-11-05 Thread starz10de
Hi Chris, I tried your solution and ran into one problem: "the method extractterms(Set) is undefined for the type Query". This is the code: Query query = QueryParser.parse(line, "contents", analyzer); //System.out.println("Searching for: " + query.toString("contents")); Hits hits = s

RE: High frequency term for the searched query

2010-11-05 Thread starz10de
Hi, I need to expand the query with the terms that most frequently co-occur with it in documents. For example, the words credits, tax, and withdraw frequently appear with Bank. So my query is “Bank” and the result should be a ranked list of the terms that occur most frequently with "Bank". I could do that as I explained but not
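The goal described above (rank the terms that most often appear in the same documents as a query term) can be sketched in plain Java without any Lucene calls. This is only an illustration under stated assumptions: the class name CooccurrenceSketch, the method rankCooccurring, the whitespace/punctuation tokenization, and the sample documents are all hypothetical, and a real index would use an analyzer and would likely filter stop words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: plain Java collections, no Lucene API.
final class CooccurrenceSketch {

    /**
     * For every term other than queryTerm, counts the documents that contain
     * both that term and queryTerm, and returns the terms ordered by that
     * count (highest first, ties broken alphabetically).
     */
    static Map<String, Integer> rankCooccurring(List<String> docs, String queryTerm) {
        String q = queryTerm.toLowerCase(Locale.ROOT);
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : docs) {
            // Assumed tokenization: lowercase, split on non-word characters.
            Set<String> terms = new HashSet<>(
                    Arrays.asList(doc.toLowerCase(Locale.ROOT).split("\\W+")));
            terms.remove("");
            if (!terms.contains(q)) {
                continue; // only documents matching the query term count
            }
            for (String t : terms) {
                if (!t.equals(q)) {
                    counts.merge(t, 1, Integer::sum);
                }
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Map<String, Integer> ranked = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            ranked.put(e.getKey(), e.getValue());
        }
        return ranked;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "bank tax credits",
                "bank credits withdraw",
                "Bank tax credits",
                "river water");
        // prints {credits=3, tax=2, withdraw=1}
        System.out.println(rankCooccurring(docs, "Bank"));
    }
}
```

The sketch counts documents rather than raw occurrences, so a term repeated many times inside one document is not over-weighted; counting raw occurrences instead would be an equally plausible reading of the question.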

Re: High frequency term for the searched query

2010-11-04 Thread Chris Lu
After you get the query object, you can use IndexSearcher's function docFreq(), like this: final Set<Term> terms = new HashSet<Term>(); query = searcher.rewrite(query); query.extractTerms(terms); for (Term t : terms) { int frequency = searcher.docFreq(t); } -- -- Chris Lu - Instant

Re: High frequency term for the searched query

2010-11-04 Thread Chris Lu
After you get the query object, you can use IndexSearcher's function docFreq(), like this: final Set<Term> terms = new HashSet<Term>(); query = searcher.rewrite(query); query.extractTerms(terms); for (Term t : terms) { int frequency = irs.getSearcher().docFreq(t); } -- -- Chris Lu -

RE: High frequency term for the searched query

2010-11-04 Thread Burton-West, Tom
Can you give more details about what you want? Perhaps with an example? Do you want the number of documents containing the query term, the number of occurrences of the query term within a document, or the number of occurrences of the term in the entire index? You can use an explain query to get
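The three counts distinguished in the message above are easy to confuse, so here is a minimal plain-Java sketch of each, computed over an in-memory list of tokenized documents. It does not use Lucene; the class TermCountsSketch and its method names are hypothetical and chosen only to mirror the three questions asked.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: documents are pre-tokenized lists of terms.
final class TermCountsSketch {

    /** Number of documents that contain the term at least once. */
    static int docFreq(List<List<String>> docs, String term) {
        int n = 0;
        for (List<String> doc : docs) {
            if (doc.contains(term)) {
                n++;
            }
        }
        return n;
    }

    /** Number of occurrences of the term within a single document. */
    static int termFreq(List<String> doc, String term) {
        return Collections.frequency(doc, term);
    }

    /** Number of occurrences of the term across all documents. */
    static int totalTermFreq(List<List<String>> docs, String term) {
        int n = 0;
        for (List<String> doc : docs) {
            n += termFreq(doc, term);
        }
        return n;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("bank", "tax", "bank"),
                Arrays.asList("tax"),
                Arrays.asList("bank"));
        System.out.println(docFreq(docs, "bank"));          // prints 2
        System.out.println(termFreq(docs.get(0), "bank"));  // prints 2
        System.out.println(totalTermFreq(docs, "bank"));    // prints 3
    }
}
```

The first count is the kind of number the docFreq() call suggested elsewhere in this thread reports; the other two need per-document term statistics.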

Re: High frequency term for the searched query

2010-11-04 Thread Seth Rosen
You might want to take a look at this tutorial on how Lucene calculates scoring [1]. If all you are interested in is the term frequency and you want to ignore the other calculations, you can override the others and have them return 1. Hope this helps! Seth Rosen s...@architexa.com www.architexa.com