Hi, I read that the score reported by the cbayes classifier is not a probability and is only useful for relative ranking, but is there a way to compare or normalize scores across classifications?
Basically I'm looking for a way to weed out the low-probability matches. For instance, if I get the following classifications:

    "apple, red" --> Fruit, Score == 10.39
    "apple, white" --> Laptop, Score == 12.33
    "red" --> Fruit, Score == 3.444

I want to be able to weed out the last "red" --> Fruit classification, because the score is "too low".

Hope my question makes sense. (First post here. Wonderful work by the Mahout team!)

Thanks!
~sumedh
(Mahout 0.4; 4.5 million documents; 200+ labels)
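
P.S. To make the ask concrete, here is a minimal sketch of the kind of filter I have in mind. It is plain Python rather than anything from Mahout; the `filter_low_scores` helper and the `MIN_SCORE` cutoff of 5.0 are hypothetical and hand-picked for these three examples. It assumes raw scores can be compared against one fixed cutoff across documents, which is exactly the part I am unsure about.

```python
# (text, label, score) tuples copied from the three examples above.
classifications = [
    ("apple, red", "Fruit", 10.39),
    ("apple, white", "Laptop", 12.33),
    ("red", "Fruit", 3.444),
]

# Assumed cutoff, chosen by hand for illustration only.
MIN_SCORE = 5.0


def filter_low_scores(results, min_score):
    """Keep only the classifications whose score is at least min_score."""
    return [r for r in results if r[2] >= min_score]


kept = filter_low_scores(classifications, MIN_SCORE)
for text, label, score in kept:
    print(f"{text!r} --> {label}, score = {score}")
```

With these numbers the "red" --> Fruit result is dropped and the other two are kept. What I don't know is whether a fixed cutoff like this is meaningful for cbayes scores, or whether the scores need to be normalized first.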
