Re: How to get the tokens for a given document

2010-04-12 Thread Herbert L Roitblat
Thanks, David. I think I neglected to say that I am using PyLucene 2.4.0. Your suggestion is almost what we're doing:

    indexReader.getTermFreqVector(ID, fieldName)
    self.hits = list(self.lSearcher.search(self.query))
    if self.hits:
        self.hit = lucene.Hit.cast_(self.hi

Re: How to get the tokens for a given document

2010-04-12 Thread David Causse
Hi, so you are walking indexReader.terms(), then calling indexReader.termDocs(Term t) for each term, and then matching your docID against the termDocs enum? So you walk the whole index? You need a forward index, and Lucene is inverted, but IMHO you have 2 solutions with Lucene (sadly, they both require re-ind
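
To illustrate the inverted versus forward distinction raised here, the following is a minimal pure-Python sketch. It is a toy model, not Lucene or PyLucene code: the function names, the whitespace tokenizer, and the sample documents are all assumptions made for illustration. It shows why recovering one document's tokens from an inverted index means visiting every term, while a forward index (roughly what a stored term vector provides) answers the same question with a single lookup.

```python
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Toy inverted index: term -> set of doc IDs containing that term."""
    inverted = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():  # naive whitespace tokenizer (assumption)
            inverted[token].add(doc_id)
    return inverted


def tokens_by_walking_terms(inverted, doc_id):
    """Recover a document's terms by scanning the postings of EVERY term.

    Cost grows with the size of the whole vocabulary, not with the document.
    """
    return sorted(term for term, postings in inverted.items() if doc_id in postings)


def build_forward_index(docs):
    """Toy forward index: doc ID -> term frequencies for that document."""
    return {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}


# Hypothetical sample data, for illustration only.
docs = {0: "the quick brown fox", 1: "the lazy dog", 2: "quick quick dog"}

inverted = build_inverted_index(docs)
forward = build_forward_index(docs)

slow = tokens_by_walking_terms(inverted, 1)  # touches every term in the index
fast = sorted(forward[1])                    # one dictionary lookup
```

Both routes return the same terms for a document; the difference is only in how much of the index has to be visited to get them.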