I need to read the TokenStream at least twice.
I used the horribly hacky but quick-for-me method of adding a
method to MemoryIndex that accepts a List of Tokens. Any ideas?

I'm not sure about modifying MemoryIndex. It should be easy enough
to create a subclass of TokenStream ("CachedTokenStream",
perhaps?) that takes a real TokenStream in its constructor,
delegates all "next" calls to it, and records the returned tokens
in a List during the first use. It can then be "rewound" and
re-used to run through the same set of tokens held in the list
from the first run.
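
A rough sketch of that idea, kept free of Lucene dependencies so it
runs on its own. SimpleTokenStream is a hypothetical stand-in for
TokenStream (tokens are plain Strings and next() returns null at the
end of the stream); the names CachedTokenStream, rewind(), fromList()
and drain() are illustrative assumptions, not existing Lucene API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class CachedTokenStreamSketch {

    /** Hypothetical stand-in for TokenStream: next() returns null when exhausted. */
    interface SimpleTokenStream {
        String next();
    }

    /** Delegates to a real stream on the first pass and records every token. */
    static final class CachedTokenStream implements SimpleTokenStream {
        private final SimpleTokenStream input;
        private final List<String> cache = new ArrayList<>();
        private boolean exhausted = false; // true once the delegate returned null
        private int pos = 0;               // read position within the cache

        CachedTokenStream(SimpleTokenStream input) {
            this.input = input;
        }

        @Override
        public String next() {
            if (pos < cache.size()) {
                return cache.get(pos++);   // replay a recorded token
            }
            if (exhausted) {
                return null;               // nothing left in cache or delegate
            }
            String token = input.next();   // first use: pull from the real stream
            if (token == null) {
                exhausted = true;
                return null;
            }
            cache.add(token);
            pos++;
            return token;
        }

        /** Resets the read position so the recorded tokens can be read again. */
        void rewind() {
            pos = 0;
        }
    }

    /** Test helper: a one-shot stream over a fixed list of tokens. */
    static SimpleTokenStream fromList(List<String> tokens) {
        Iterator<String> it = tokens.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Test helper: reads a stream to the end and returns its tokens. */
    static List<String> drain(SimpleTokenStream stream) {
        List<String> out = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            out.add(t);
        }
        return out;
    }
}
```

Calling rewind() before the first pass has finished is harmless in
this sketch: the recorded tokens are replayed first and reading then
continues from the underlying stream.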

Yes, as Mark points out, this can be done without an API change via
the existing MemoryIndex.addField(String fieldName, TokenStream stream).
The TokenStream could be constructed along similar lines to
MemoryIndex.keywordTokenStream(Collection), or perhaps along similar
lines to
org.apache.lucene.index.memory.AnalyzerUtil.getTokenCachingAnalyzer(Analyzer).
If needed, an IndexReader can be created from a MemoryIndex via
MemoryIndex.createSearcher().getIndexReader(), again without API change.
Wolfgang.