On 08/19/2012 08:07 PM, Shaya Potter wrote:
On 08/15/2012 02:34 PM, Ahmet Arslan wrote:
Is there an easy way to figure out
the most common tokens and then remove those tokens from the
documents?
Probably this:
http://lucene.apache.org/core/3_6_1/api/all/org/apache/lucene/misc/HighFreqTerms.html
I'm unsure how to use this.
As far as I can tell, org.apache.lucene.misc.TermStats doesn't exist in
Lucene 3.6.1 (there seems to be a class like that in 4.x, but that
doesn't help me).
I was wrong, it's there, but Eclipse isn't seeing it (I haven't tried javac
by itself), even though it sees HighFreqTerms just fine.
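
For what it's worth, the underlying idea (count tokens, pick the most
frequent ones, strip them from the documents) can be sketched in plain Java
without touching Lucene at all. This is only an illustration under some
assumptions: tokens are split on whitespace and compared case-insensitively,
which is not what a Lucene analyzer would do, and the class and method names
(CommonTokens, mostCommon, removeTokens) are made up for this example. It
does not use HighFreqTerms or TermStats.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: works on raw strings,
// splitting on whitespace instead of using an Analyzer.
final class CommonTokens {

    // Returns up to n tokens with the highest total count across all docs,
    // most frequent first. Ties are broken alphabetically so the result
    // is deterministic.
    static List<String> mostCommon(List<String> docs, int n) {
        final Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String doc : docs) {
            for (String tok : doc.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (tok.isEmpty()) {
                    continue; // split() yields "" for empty or leading-space input
                }
                Integer seen = counts.get(tok);
                counts.put(tok, seen == null ? 1 : seen + 1);
            }
        }
        List<String> tokens = new ArrayList<String>(counts.keySet());
        Collections.sort(tokens, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                int byCount = counts.get(b).compareTo(counts.get(a));
                return byCount != 0 ? byCount : a.compareTo(b);
            }
        });
        int limit = Math.max(0, Math.min(n, tokens.size()));
        return new ArrayList<String>(tokens.subList(0, limit));
    }

    // Returns a copy of docs with every token in toRemove dropped.
    // toRemove is expected to hold lowercase tokens; the remaining tokens
    // keep their original case and are re-joined with single spaces.
    static List<String> removeTokens(List<String> docs, Set<String> toRemove) {
        List<String> cleaned = new ArrayList<String>();
        for (String doc : docs) {
            StringBuilder sb = new StringBuilder();
            for (String tok : doc.split("\\s+")) {
                if (tok.isEmpty() || toRemove.contains(tok.toLowerCase(Locale.ROOT))) {
                    continue;
                }
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(tok);
            }
            cleaned.add(sb.toString());
        }
        return cleaned;
    }
}
```

Usage would be something like calling mostCommon(docs, 100), putting the
result into a HashSet, and passing that set to removeTokens(docs, set). For
an existing index, the same list of tokens could presumably come from the
HighFreqTerms class linked above instead of from mostCommon.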