Re: How to get the tokens for a given document

2010-04-12 Thread Herbert L Roitblat

Thanks, David. I neglected to say that I am using PyLucene 2.4.0. Your suggestion is almost what we're doing:

    indexReader.getTermFreqVector(ID, fieldName)
    self.hits = list(self.lSearcher.search(self.query))
    if self.hits:
        self.hit = lucene.Hit.cast_(self.hi

Re: How to get the tokens for a given document

2010-04-12 Thread David Causse

Hi, are you walking indexReader.terms(), then calling indexReader.termDocs(Term t) for each term, and matching your docID against the TermDocs enumeration? If so, you are walking the whole index. You need a forward index, whereas Lucene is inverted, but IMHO there are two solutions with Lucene (sadly, they both require re-ind
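
To make the inverted-versus-forward distinction concrete, here is a minimal pure-Python sketch. It does not use Lucene or PyLucene at all; the toy corpus, the function names (build_inverted, tokens_by_walking, build_forward) and the dict-based layout are all assumptions made for illustration, and it is not meant to represent either of the two solutions mentioned above, which the excerpt does not show.

```python
from collections import defaultdict

# Hypothetical toy corpus: doc ID -> already-tokenized text (not from the thread).
docs = {
    0: ["lucene", "is", "an", "inverted", "index"],
    1: ["forward", "index", "maps", "doc", "to", "terms"],
}


def build_inverted(docs):
    """Build term -> {doc_id: frequency}, i.e. an inverted index."""
    inverted = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for tok in tokens:
            inverted[tok][doc_id] = inverted[tok].get(doc_id, 0) + 1
    return inverted


def tokens_by_walking(inverted, doc_id):
    """Recover one document's terms by visiting every term's postings.

    This mirrors the terms()/termDocs() walk described above: the cost
    grows with the number of distinct terms in the whole index.
    """
    return {
        term: postings[doc_id]
        for term, postings in inverted.items()
        if doc_id in postings
    }


def build_forward(docs):
    """Build doc_id -> {term: frequency}, i.e. a forward index."""
    forward = {}
    for doc_id, tokens in docs.items():
        freqs = {}
        for tok in tokens:
            freqs[tok] = freqs.get(tok, 0) + 1
        forward[doc_id] = freqs
    return forward


inverted = build_inverted(docs)
forward = build_forward(docs)

# Both routes give the same answer; only the amount of work differs.
print(tokens_by_walking(inverted, 1) == forward[1])
```

With a forward index the lookup for one document is a single dictionary access, whereas the walk touches every term in the index. The catch, as noted above, is that the forward structure has to be built when the documents are indexed.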