Thanks for all the good feedback; I'll certainly be following up on it.

I did find one reason why I wasn't getting good matches: when I
looked more carefully at the Perl data structure, I found that the
'features' hash contained only alphabetic characters. So, for example,
in the string 'WARRIOR 14-160 14-160', only the 'warrior' part was being
used. Also, with 'BMW 318i' and 'BMW 525i', the numbers were being
ignored, and with something like 'A/T', two separate features 'a' and
't' were there.

So my further question is how to get NaiveBayes to use whitespace-
separated words as features ('318i', 'a/t') rather than just runs of
alphabetic characters. Is there a simple option for this when calling
new AI::Categorizer?
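
To make the difference concrete, here is a minimal sketch in plain Perl of
the two behaviours. It does not use AI::Categorizer at all: alpha_tokens()
and whitespace_tokens() are hypothetical helpers I made up for illustration,
and the first one only approximates what I see in the 'features' hash (I am
assuming the features are lowercased runs of letters; I have not checked how
the module actually tokenizes).

```perl
use strict;
use warnings;

# Hypothetical helper: roughly what I observe now, where only runs of
# letters survive as features (an assumption, not the module's code).
sub alpha_tokens {
    my ($text) = @_;
    return [ map { lc $_ } $text =~ /([A-Za-z]+)/g ];
}

# Hypothetical helper: what I would like instead, one lowercased
# feature per whitespace-separated word.
sub whitespace_tokens {
    my ($text) = @_;
    return [ map { lc $_ } split ' ', $text ];
}

# Print both tokenizations side by side for the strings from this post.
for my $text ('WARRIOR 14-160 14-160', 'BMW 318i', 'A/T') {
    printf "%-22s alpha: [%s]  whitespace: [%s]\n",
        $text,
        join(', ', @{ alpha_tokens($text) }),
        join(', ', @{ whitespace_tokens($text) });
}
```

The second helper is the behaviour I am after; what I don't know is whether
there is a supported way to get AI::Categorizer to do the equivalent.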

--
Jason Armstrong
