I created an index with Lucene 3.6.0 in which a certain text field in each
document is stored, indexed and analyzed with no norms, and given term vectors
with positions and offsets. When I later opened that index in Luke, it reported
that term vectors had been stored for this field, but that offsets and
positions had not. The code I used for indexing couldn't be simpler. It looks
like this for the relevant field:
    doc.add(new Field("ReportText", reportTextContents, Field.Store.YES,
        Field.Index.ANALYZED_NO_NORMS, Field.TermVector.WITH_POSITIONS_OFFSETS));
The indexer adds these documents to the index and commits them. I ran the
indexer in a debugger and watched the Lucene code set the Field instance
variables called storeTermVector, storeOffsetWithTermVector and
storePositionWithTermVector to true for this field.
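The setup can be reproduced in isolation and checked against the term vector itself rather than the flags on a Field object. What follows is only a sketch, assuming Lucene 3.6 core on the classpath. The class name, the in-memory RAMDirectory, the StandardAnalyzer and the sample text are placeholders chosen for illustration and are not taken from the real indexer. It indexes one document with the same field options, then asks the reader for that document's term vector and reports whether positions and offsets come back.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TermFreqVector;
import org.apache.lucene.index.TermPositionVector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Hypothetical stand-alone check; not the actual indexer or XML dumper.
public class TermVectorCheck {
    public static void main(String[] args) throws Exception {
        // Assumption: an in-memory index and StandardAnalyzer are good enough for the test.
        Directory dir = new RAMDirectory();
        IndexWriterConfig cfg = new IndexWriterConfig(
                Version.LUCENE_36, new StandardAnalyzer(Version.LUCENE_36));
        IndexWriter writer = new IndexWriter(dir, cfg);

        // Same field options as in the question, with placeholder text.
        Document doc = new Document();
        doc.add(new Field("ReportText", "sample report text", Field.Store.YES,
                Field.Index.ANALYZED_NO_NORMS,
                Field.TermVector.WITH_POSITIONS_OFFSETS));
        writer.addDocument(doc);
        writer.close(); // close() also commits the pending document

        IndexReader reader = IndexReader.open(dir);
        try {
            // Read the term vector of document 0 directly from the index.
            TermFreqVector tfv = reader.getTermFreqVector(0, "ReportText");
            if (tfv == null) {
                System.out.println("no term vector stored");
            } else if (tfv instanceof TermPositionVector) {
                TermPositionVector tpv = (TermPositionVector) tfv;
                // Each getter returns null when that kind of data was not stored.
                System.out.println("positions stored: " + (tpv.getTermPositions(0) != null));
                System.out.println("offsets stored: " + (tpv.getOffsets(0) != null));
            } else {
                System.out.println("term vector stored without positions or offsets");
            }
        } finally {
            reader.close();
        }
    }
}
```

Pointing the same read-back code at the real index directory (for example through FSDirectory.open) instead of the RAMDirectory would show what was actually written, independently of what Luke or a reloaded Field object reports.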
When the indexing was done, I ran a simple program in a debugger that opens an
index, reads each document and writes out its information as XML. In the
ReportText Field objects it read back, storeOffsetWithTermVector and
storePositionWithTermVector were both false.
Is there something other than specifying
Field.TermVector.WITH_POSITIONS_OFFSETS when constructing a Field that needs to
be done in order for offsets and positions to be saved in the index? Or are
there circumstances under which the Field.TermVector setting for a Field object
is ignored? This doesn't make sense to me, and I could swear that offsets and
positions were being saved in some older indexes I created, which I
unfortunately no longer have around for comparison. I'm sure that I am just
overlooking something or have made some kind of mistake, but I can't see what
it is at the moment. Thanks for any help or advice you can give me.
Mike