[EMAIL PROTECTED] (Tom Fawcett) wrote:
>Just a casual comment on this. There has been a fair amount of work on text
>classification in the past few years, comparing different representations and
>algorithms. I wouldn't take any individual study's conclusions as definitive,
>since various papers have conflicting conclusions. As one example, most
>people think stopword elimination and stemming are effective, but Riloff makes
>a case against doing them:
>
>http://citeseer.nj.nec.com/riloff97little.html
>
>I have no reason to question Yang's results; I'm just pointing out that text
>classification is a big ball of wax.

Point taken. =) The other main reason I started with Document Frequency
as the measure of feature quality is that it's easy to understand and
easy to do. I still do want to evaluate the other methods, if for no
other reason than to learn their particularities.
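
To show how little there is to it, here is a minimal sketch of Document Frequency thresholding. It is in Python purely for illustration; the function names, the `min_df` cutoff, and the toy documents are all assumptions made up for this example, not taken from any particular library or from Yang's experiments.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it at least once."""
    df = Counter()
    for tokens in docs:
        # set() so that a term is counted once per document,
        # no matter how often it occurs inside that document.
        df.update(set(tokens))
    return df


def select_by_df(docs, min_df=2):
    """Keep only the terms that appear in at least min_df documents."""
    df = document_frequency(docs)
    return {term for term, count in df.items() if count >= min_df}


# Toy corpus (hypothetical), already tokenized on whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "a cat chased a dog".split(),
]

features = select_by_df(docs, min_df=2)
```

With this toy corpus, terms that occur in only one document ("mat", "log", "a", "chased") are dropped and the rest are kept as features. Where to put the cutoff is a judgment call that would have to be tuned per collection.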
-------------------                  -------------------------
Ken Williams                         Last Bastion of Euclidity
[EMAIL PROTECTED]                    The Math Forum