Hi,

The existing code retrieved a TermPositionVector with IndexReader.getTermFreqVector(docId, field). It then extracted the terms from the query and stored them in two different arrays: one containing the single-word terms, the other containing the phrases.

For each single-word term, it loops over the array of terms and increments the frequency this way:

freq += tpv.getTermFrequencies()[tpv.indexOf(currentTerm.text())];

For phrases it works the same way, but of course it searches for the entire set of terms in the correct order.

"Fast enough" means that for the search query "on going for", if I have 3000 results consisting of documents with an average of 1000 words each, it must be able to return the frequencies in under 50 ms on a dual Xeon machine. With the TermPositionVector, my best results with no load on the server were around 3000 ms.

I am still an amateur with Lucene. I have to migrate an application that used a customized version of Lucene 1.3 to 2.1. I would really like to be able to use an unmodified version of Lucene, since that would make it a lot easier to keep up to date with new releases.

I'll give TermDocs a try.

Thanks

-----Original Message-----
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: February 23, 2007 7:18 PM
To: Lucene Users
Subject: Re: ConstantScoreQuery and MatchAllDocsQuery

: I ask this because I need to return the frequency of the search terms
: with each of my results, I tried using the TermFreqVector object but
: unfortunately it was not fast enough, so I decided to modify lucene to
: be able to return the frequency the same way the score is returned by
: org.apache.lucene.search.Hits.
...
: I started by adding public abstract int freq(); in package
: org.apache.lucene.search.Scorer abstract class, and then modified
: every implementation of Scorer to be able to get the frequency.

can you elaborate on:
* how you were trying to use TermFreqVector
* how you define "fast enough"
* how you are now getting the freq() value in all of the Scorer classes?

If all you need to know is the frequency of each term in your query (and not the frequency of all terms in the document) did you try using the freq() method in the TermDocs iterator instead of the TermFreqVector class? using Query.extractTerms, and then getting a TermDocs instance and iterating over those terms using seek and over the docids from your results using skipTo should be an extremely fast way to get the freq() info.

: It works well and fast, the only problem I have is that I did not find a
: way to compute the frequency in both ConstantScoreQuery.java and
: MatchAllDocsQuery.java internal scorers.

neither of those queries involve any terms, so i'm not sure what freq() would even make sense ... "1" or "0" i would imagine.

-Hoss

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
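
As a rough illustration of the seek/skipTo idea from the quoted reply, here is a minimal, self-contained Java sketch. It does not call Lucene at all: the Postings class, the sumFreqs method and the sample data are hypothetical stand-ins for a term's TermDocs stream (doc ids in ascending order, one frequency per doc). With the real classes, the terms would presumably come from Query.extractTerms and the postings from a TermDocs instance, and the exact skipTo contract should be checked against the TermDocs javadoc before relying on it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TermFreqSketch {

    /** Hypothetical stand-in for one term's postings: ascending doc ids with parallel frequencies. */
    static final class Postings {
        final int[] docs;
        final int[] freqs;

        Postings(int[] docs, int[] freqs) {
            this.docs = docs;
            this.freqs = freqs;
        }
    }

    /**
     * For each result doc id (which must be sorted ascending), sums the
     * frequencies of all query terms in that document. Each postings list is
     * walked forward only once, which is the point of the skip-style iteration.
     */
    static int[] sumFreqs(Map<String, Postings> index, List<String> queryTerms, int[] sortedDocIds) {
        int[] totals = new int[sortedDocIds.length];
        for (String term : queryTerms) {
            Postings p = index.get(term);
            if (p == null) {
                continue; // term does not occur in this (toy) index
            }
            int pos = 0; // forward-only cursor into this term's postings
            for (int i = 0; i < sortedDocIds.length; i++) {
                int target = sortedDocIds[i];
                // Skip postings for documents that come before the current result.
                while (pos < p.docs.length && p.docs[pos] < target) {
                    pos++;
                }
                if (pos == p.docs.length) {
                    break; // postings exhausted: the term is in no later result
                }
                if (p.docs[pos] == target) {
                    totals[i] += p.freqs[pos];
                }
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        Map<String, Postings> index = new HashMap<>();
        index.put("on", new Postings(new int[] {1, 4, 7}, new int[] {2, 1, 3}));
        index.put("going", new Postings(new int[] {4, 9}, new int[] {5, 1}));

        int[] totals = sumFreqs(index, Arrays.asList("on", "going", "for"), new int[] {1, 4, 9});
        System.out.println(Arrays.toString(totals)); // prints [2, 6, 1]
    }
}
```

The result doc ids have to be sorted first so that each postings list is only ever read forward; Hits returns documents in score order, so a sort by doc id would be needed before a loop like this.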