OK, I have no problem with filtering/copying to a new index, and that seems
like a good starting point. I would need to figure out how to extend that
class correctly, but at least it gives me somewhere to begin.
On 08/15/2012 02:48 PM, Uwe Schindler wrote:
You cannot modify the term dictionary of an index; see my other eMail. You
have to filter it by copying to a new index, or reindex. Document
modifications are not supported in Lucene and other inverted indexes.
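
A minimal sketch of the filter-by-copy idea, in plain Java with no Lucene
calls. It assumes a toy in-memory inverted index (a map from term to the set
of document ids containing it) standing in for a real index; the class name
FilterCommonTerms and its methods are hypothetical and exist only for
illustration. The original structure is never modified: the most frequent
terms are found first, then everything else is copied into a new structure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only: a toy inverted index, not Lucene's API.
public class FilterCommonTerms {

    // Build term -> set of doc ids by splitting each document on whitespace.
    public static Map<String, Set<Integer>> buildIndex(List<String> docs) {
        Map<String, Set<Integer>> index = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String token : docs.get(docId).toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty()) {
                    index.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
                }
            }
        }
        return index;
    }

    // Return up to n terms with the highest document frequency
    // (ties broken alphabetically so the result is deterministic).
    public static List<String> topTermsByDocFreq(Map<String, Set<Integer>> index, int n) {
        List<String> terms = new ArrayList<>(index.keySet());
        terms.sort((a, b) -> {
            int byFreq = Integer.compare(index.get(b).size(), index.get(a).size());
            return byFreq != 0 ? byFreq : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(n, terms.size())));
    }

    // Copy every posting list except the dropped terms into a NEW index;
    // the source index is left untouched.
    public static Map<String, Set<Integer>> copyWithout(Map<String, Set<Integer>> index,
                                                        Collection<String> dropped) {
        Set<String> drop = new HashSet<>(dropped);
        Map<String, Set<Integer>> filtered = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> e : index.entrySet()) {
            if (!drop.contains(e.getKey())) {
                filtered.put(e.getKey(), new TreeSet<>(e.getValue()));
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("the quick fox", "the lazy dog", "the fox and the dog");
        Map<String, Set<Integer>> index = buildIndex(docs);
        List<String> common = topTermsByDocFreq(index, 1);
        Map<String, Set<Integer>> filtered = copyWithout(index, common);
        System.out.println("dropped: " + common);
        System.out.println("remaining terms: " + filtered.keySet());
    }
}
```

With a real Lucene index the copy step would go through Lucene's own reader
and writer classes rather than a map; that part is not shown here.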
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-----Original Message-----
From: Shaya Potter [mailto:spot...@gmail.com]
Sent: Wednesday, August 15, 2012 8:44 PM
To: java-user@lucene.apache.org
Subject: Re: easy way to figure out most common tokens?
On 08/15/2012 02:34 PM, Ahmet Arslan wrote:
Is there an easy way to figure out the most common tokens and then
remove those tokens from the documents?
Probably this:
http://lucene.apache.org/core/3_6_1/api/all/org/apache/lucene/misc/HighFreqTerms.html
Ah, that's a good part 1. The question then is how to modify the index
without reindexing all documents.
My gut is that it should be possible (it seems Luke does it), but I have
never gone deep into the document object beyond adding fields.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org