No, this is just on the training data. It's just a sanity check that the classifier works.
With ms = 5 and mdf = 5

INFO:
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances          :       1816       90.8%
Incorrectly Classified Instances        :        184        9.2%
Total Classified Instances              :       2000

=======================================================
Confusion Matrix
-------------------------------------------------------
a       b       <--Classified as
818     182      |  1000        a     = pos
2       998      |  1000        b     = neg
Default Category: unknown: 2

On Mon, Oct 18, 2010 at 10:49 PM, Ted Dunning <[email protected]> wrote:

> Is this on the training data? Or held-out test data?
>
> If on test data, this is much, much too accurate to be believed.
>
> On Mon, Oct 18, 2010 at 10:14 AM, Robin Anil <[email protected]> wrote:
>
> > Just pushed a bug fix for ngrams. Update your copy. Here is the result with
> > ngram = 2
> >
> > Correctly Classified Instances          :       1995       99.75%
> > Incorrectly Classified Instances        :          5        0.25%
> > Total Classified Instances              :       2000
> >
> > =======================================================
> > Confusion Matrix
> > -------------------------------------------------------
> > a       b       <--Classified as
> > 995     5        |  1000        a     = pos
> > 0       1000     |  1000        b     = neg
> > Default Category: unknown: 2
> >
> >
> > With some pruning, you will have a decent enough classifier for sentiments
> >
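
For anyone cross-checking the numbers: the summary figures in the ms = 5, mdf = 5 report follow directly from its confusion matrix. Below is a minimal sketch in Python, not the classifier's own code. The function name `summarize` is hypothetical, and it assumes rows are the true labels (pos, neg) and columns are the labels the classifier assigned.

```python
def summarize(matrix):
    """Return (correct, incorrect, total, accuracy %) for a square confusion matrix.

    Assumption: matrix[i][j] is the number of instances whose true class is i
    and which were classified as class j.
    """
    total = sum(sum(row) for row in matrix)
    # Correct classifications sit on the diagonal.
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct, total - correct, total, 100.0 * correct / total


# Confusion matrix from the ms = 5, mdf = 5 run (rows: pos, neg).
confusion = [[818, 182], [2, 998]]
correct, incorrect, total, accuracy = summarize(confusion)
print(correct, incorrect, total, f"{accuracy:.1f}%")
```

Nearly all of the errors in that run are pos documents classified as neg (182 of 184), which the overall accuracy figure alone does not show.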
