Sorry if this is a stupid question.
Is there a best practice for getting the term vector of a document?
Does anyone have experience with feature selection for dimensionality reduction, for example cutting terms based on Zipf's law, or computing the tf/idf of a term over the complete corpus?
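
To make the question concrete, here is a rough sketch of what I mean, not a claim about how any particular library does it. The whitespace tokenizer, the function names, and the cut-off values (min_df, max_ratio) are all assumptions made up for illustration.

```python
import math
from collections import Counter

def term_vector(text):
    """Term vector of one document: term -> raw frequency.
    Assumes a naive lowercase/whitespace tokenizer."""
    return Counter(text.lower().split())

def document_frequencies(vectors):
    """Count, for each term, the number of documents it occurs in."""
    df = Counter()
    for vec in vectors:
        df.update(vec.keys())
    return df

def tfidf(vec, df, n_docs):
    """Weight each term of one document by tf * log(N / df)."""
    return {t: tf * math.log(n_docs / df[t]) for t, tf in vec.items()}

def prune_by_df(df, n_docs, min_df=2, max_ratio=0.5):
    """Zipf-style cut: drop the very rare and the very common terms.
    Both thresholds are arbitrary illustrative values."""
    return {t for t, n in df.items()
            if n >= min_df and n / n_docs <= max_ratio}

# Tiny hypothetical corpus, only to show the calls.
docs = ["the cat sat", "the dog sat", "the bird flew", "a cat flew"]
vectors = [term_vector(d) for d in docs]
df = document_frequencies(vectors)
kept_terms = prune_by_df(df, len(docs))
weights = tfidf(vectors[0], df, len(docs))
```

With this toy corpus, "the" is dropped as too common and "dog", "bird" and "a" as too rare, so only the mid-frequency terms remain as features. What I am unsure about is whether this kind of cut is good practice on a real corpus.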
Thanks for any hints. Stefan
---------------------------------------------------------------
enterprise information technology consulting
open technology: http://www.media-style.com
open source: http://www.weta-group.net
open discussion: http://www.text-mining.org
