On 08/15/2012 02:34 PM, Ahmet Arslan wrote:
Is there an easy way to figure out
the most common tokens and then remove those tokens from the
documents?
Probably this:
http://lucene.apache.org/core/3_6_1/api/all/org/apache/lucene/misc/HighFreqTerms.html
Ah, that's a good part 1. The question then becomes how to modify the
index without reindexing all documents.
My gut says it should be possible (Luke seems to do it), but I have never
gone deep into the Document object beyond adding fields.
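
For illustration, here is a minimal sketch of the two steps (find the most
common tokens, then drop them) in plain Java. It does not use the Lucene API
and does not touch an index: it assumes the documents are already available as
token lists, and the class and method names (CommonTokens, mostCommon,
removeTokens) are made up for this example. "Most common" is taken to mean
highest document frequency, which is also an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: works on already-tokenized documents.
class CommonTokens {

    // Step 1: return the n tokens that occur in the most documents.
    static Set<String> mostCommon(List<List<String>> docs, int n) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (List<String> doc : docs) {
            // Count each token at most once per document.
            for (String token : new HashSet<>(doc)) {
                docFreq.merge(token, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(docFreq.entrySet());
        // Highest document frequency first; ties are broken alphabetically.
        entries.sort((a, b) -> {
            int byFreq = Integer.compare(b.getValue(), a.getValue());
            return byFreq != 0 ? byFreq : a.getKey().compareTo(b.getKey());
        });
        Set<String> top = new HashSet<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    // Step 2: drop the given tokens from one document, keeping the order of the rest.
    static List<String> removeTokens(List<String> doc, Set<String> toRemove) {
        List<String> kept = new ArrayList<>();
        for (String token : doc) {
            if (!toRemove.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("the", "quick", "fox"),
            Arrays.asList("the", "lazy", "dog"),
            Arrays.asList("the", "quick", "dog"));
        Set<String> common = mostCommon(docs, 1);
        for (List<String> doc : docs) {
            System.out.println(removeTokens(doc, common));
        }
    }
}
```

In a Lucene setup the resulting set would more likely be handed to a stop-word
filter in the analyzer. As far as I know, that only affects documents analyzed
after the change, so it does not by itself answer the question of modifying an
existing index.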