Jens Grivolla wrote:
> Hello,
>
> I'm looking to extract significant terms characterizing a set of
> documents (which in turn relate to a topic).
>
> This basically comes down to functionality similar to determining the
> terms with the greatest offer weight (as used for blind relevance
> feedback), or maximizing tf.idf (as is done in MoreLikeThis).
>
> Is there anything like this already implemented, or do I need to
> iterate through all documents in the set "manually", re-tokenize each
> one (or maybe use TermVectors), and then calculate the weight for each
> term? 
Carrot2 (http://project.carrot2.org/index.html) may be your friend: it clusters search results and labels each cluster with descriptive terms.
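
If you do end up computing the weights by hand, here is a rough sketch of the tf.idf variant you describe. It is plain Java with no Lucene calls. The class name SignificantTerms, its methods, the naive tokenizer and the smoothed idf formula are all my own assumptions for illustration, not anything taken from MoreLikeThis. In a real setup you would probably take term and document frequencies from the index (or from TermVectors) instead of re-tokenizing the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper: ranks the terms of a document subset by tf.idf
// against a larger background collection.
class SignificantTerms {

    // Naive tokenizer (assumption): lowercase, split on anything that is
    // not a letter or digit. An index Analyzer would normally do this.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns term -> tf * idf, where tf is summed over the subset and
    // idf = ln((N + 1) / (df + 1)) is computed over the whole collection.
    static Map<String, Double> score(List<String> subset, List<String> collection) {
        Map<String, Integer> df = new HashMap<>();
        for (String doc : collection) {
            // Count each term once per document.
            for (String term : new HashSet<>(tokenize(doc))) {
                df.merge(term, 1, Integer::sum);
            }
        }
        Map<String, Integer> tf = new HashMap<>();
        for (String doc : subset) {
            for (String term : tokenize(doc)) {
                tf.merge(term, 1, Integer::sum);
            }
        }
        int n = collection.size();
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            int d = df.getOrDefault(e.getKey(), 0);
            double idf = Math.log((n + 1.0) / (d + 1.0));
            scores.put(e.getKey(), e.getValue() * idf);
        }
        return scores;
    }

    // The k highest-scoring terms; ties are broken alphabetically.
    static List<String> topTerms(List<String> subset, List<String> collection, int k) {
        Map<String, Double> scores = score(subset, collection);
        List<String> terms = new ArrayList<>(scores.keySet());
        terms.sort((a, b) -> {
            int c = Double.compare(scores.get(b), scores.get(a));
            return c != 0 ? c : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(k, terms.size())));
    }

    public static void main(String[] args) {
        List<String> collection = Arrays.asList(
                "the cat sat", "the dog ran", "the cat purred", "the stock market fell");
        List<String> subset = Arrays.asList("the cat sat", "the cat purred");
        System.out.println(topTerms(subset, collection, 3));
    }
}
```

Terms that occur in every document of the collection get an idf of zero with this formula, so they fall to the bottom of the ranking. Swapping in the offer weight would only change the line that computes the score.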

M.
