I need to read the TokenStream at least twice.
I used the horribly hacky but quick-for-me method of adding a method to MemoryIndex that accepts a List of Tokens. Any ideas?

I'm not sure about modifying MemoryIndex. It should be easy enough to create a subclass of TokenStream ("CachedTokenStream" perhaps?) which takes a real TokenStream in its constructor, delegates all "next" calls to it, and records the returned tokens in a List during the first use. It can then be "rewound" and re-used to run through the same set of tokens held in the list from the first run.
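
A minimal sketch of that idea follows. It is not tested against Lucene: to keep it self-contained, TokenStream here is a stand-in with a single next() method that returns a String (the real class returns Token objects), and ListTokenStream and rewind() are hypothetical names made up for the example. The only assumption carried over from Lucene is that next() returns null at the end of the stream.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in for Lucene's TokenStream, reduced to the one method the sketch needs.
// Assumption: next() returns null once the stream is exhausted.
abstract class TokenStream {
    abstract String next() throws IOException;
}

// Hypothetical class, named after the suggestion above; not part of Lucene.
class CachedTokenStream extends TokenStream {
    private final TokenStream delegate;
    private final List<String> cache = new ArrayList<>();
    private Iterator<String> replay; // null while the first pass is still running

    CachedTokenStream(TokenStream delegate) {
        this.delegate = delegate;
    }

    @Override
    String next() throws IOException {
        if (replay != null) {
            // Later passes: serve tokens from the recorded list.
            return replay.hasNext() ? replay.next() : null;
        }
        // First pass: delegate to the real stream and record each token.
        String token = delegate.next();
        if (token != null) {
            cache.add(token);
        }
        return token;
    }

    // Restart from the first recorded token. If the first pass was cut short,
    // the rest of the delegate is drained first so the cache is complete.
    void rewind() throws IOException {
        if (replay == null) {
            while (next() != null) {
                // keep recording until the delegate is exhausted
            }
        }
        replay = cache.iterator();
    }
}

// Hypothetical source stream over a fixed list, used only to try the sketch out.
class ListTokenStream extends TokenStream {
    private final Iterator<String> tokens;

    ListTokenStream(List<String> tokens) {
        this.tokens = tokens.iterator();
    }

    @Override
    String next() {
        return tokens.hasNext() ? tokens.next() : null;
    }
}
```

With this shape, the caller reads the stream once, calls rewind(), and reads it again; every pass after the first is served from the list, so the underlying analyzer only runs once.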


Yes, as Mark points out, this can be done without an API change via the existing MemoryIndex.addField(String fieldName, TokenStream stream).

The TokenStream could be constructed along similar lines to MemoryIndex.keywordTokenStream(Collection), or perhaps to org.apache.lucene.index.memory.AnalyzerUtil.getTokenCachingAnalyzer(Analyzer).

If needed, an IndexReader can be created from a MemoryIndex via MemoryIndex.createSearcher().getIndexReader(), again without an API change.

Wolfgang.