When I run your code as is, except for using RAMDirectory and setting up
an IndexWriter with StandardAnalyzer:

  RAMDirectory dir = new RAMDirectory();
  Analyzer anl = new StandardAnalyzer(Version.LUCENE_40);
  IndexWriterConfig iwcfg = new IndexWriterConfig(Version.LUCENE_40, anl);
  IndexWriter iw = new IndexWriter(dir, iwcfg);
  ...
  iw.addDocument(doc);
  iw.close();

it prints

  doc 0 had 1 terms

If I change the text to e.g. "this is foobar gibberish" it says there are
2 terms. So it looks OK to me. "this" and "is" are presumably in the
default list of stop words.

Not relevant, but why are you using SlowCompositeReaderWrapper rather than
just IndexReader rdr = DirectoryReader.open(dir)? I get the same results
either way.

--
Ian.

On Thu, Jan 17, 2013 at 5:52 AM, Jon Stewart <j...@lightboxtechnologies.com> wrote:
> Hello,
>
> I cannot extract document term vectors from an index, and have not
> turned up much in some determined googling. In short, when I call
> IndexReader.getTermVector(docID, field) or
> IndexReader.getTermVectors(docID) and then navigate down to the Terms
> for the specified field, I get a null result.
>
> // Indexing:
> String bodyText = "this is foobar";
> final FieldType BodyOptions = new FieldType();
> BodyOptions.setIndexed(true);
> BodyOptions.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
> BodyOptions.setStored(true);
> BodyOptions.setStoreTermVectors(true);
> BodyOptions.setTokenized(true);
> Document doc = new Document();
> doc.add(new Field("body", bodyText, BodyOptions));
>
> When I examine docs in Luke, I can see the term vectors.
>
> // Retrieving (at a later time)
> DirectoryReader dirRdr = DirectoryReader.open(FSDirectory.open(new File(path)));
> SlowCompositeReaderWrapper rdr = new SlowCompositeReaderWrapper(dirRdr);
> for (int i = 0; i < rdr.maxDoc(); ++i) {
>   int numTerms = 0;
>   Terms terms = rdr.getTermVector(i, "body");
>   if (terms != null) {
>     TermsEnum term = terms.iterator(null);
>     while (term.next() != null) {
>       ++numTerms;
>     }
>     System.out.println("doc " + i + " had " + numTerms + " terms");
>   }
>   else {
>     System.err.println("null term vector on doc " + i);
>   }
> }
>
> On every doc, the Terms object I get back from getTermVector(i, "body") is
> null.
>
>
> Jon
> --
> Jon Stewart, Principal
> (646) 719-0317 | j...@lightboxtechnologies.com | Arlington, VA
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
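
The term counts reported in the reply (1 term for "this is foobar", 2 for "this is foobar gibberish") can be illustrated without Lucene. What follows is a plain-Java sketch, not Lucene code: it assumes that StandardAnalyzer lowercases tokens and drops a default English stop word list, and the list below is reproduced from memory, so it may not match every Lucene version. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: approximates which terms would end up in a
// per-document term vector after stop word removal. Not part of Lucene.
class StopWordTermCount {

    // Assumption: this mirrors Lucene's default English stop word set.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
        "a", "an", "and", "are", "as", "at", "be", "but", "by", "for",
        "if", "in", "into", "is", "it", "no", "not", "of", "on", "or",
        "such", "that", "the", "their", "then", "there", "these", "they",
        "this", "to", "was", "will", "with"));

    // Lowercase the text, split on anything that is not a letter or digit,
    // drop stop words, and return the distinct remaining terms in sorted order.
    static Set<String> uniqueTerms(String text) {
        Set<String> terms = new TreeSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Both "this" and "is" are filtered out, leaving only "foobar".
        System.out.println(uniqueTerms("this is foobar"));
        // Leaves "foobar" and "gibberish".
        System.out.println(uniqueTerms("this is foobar gibberish"));
    }
}
```

This only models the stop word step. It says nothing about why getTermVector would return null against the on-disk index in the original question.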