No, this is just on the training data. It's only a sanity check that the
classifier works.

With ms = 5 and mdf = 5:

INFO: =======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances          :       1816      90.8%
Incorrectly Classified Instances        :        184       9.2%
Total Classified Instances              :       2000

=======================================================
Confusion Matrix
-------------------------------------------------------
a     b     <--Classified as
818   182   |  1000   a     = pos
2     998   |  1000   b     = neg
Default Category: unknown: 2
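
As a quick cross-check of the summary figures, here is a minimal sketch that
recomputes the accuracy and per-class recall from the confusion matrix counts.
It assumes each row is the actual class and each column is the class the
instance was classified as; the summarize function is a hypothetical helper
written for this illustration, not part of Mahout.

```python
def summarize(matrix, labels):
    """Return overall accuracy and per-class recall for a square confusion matrix.

    Assumes rows are actual classes and columns are predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    # Correct predictions sit on the diagonal.
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    recall = {
        label: matrix[i][i] / sum(matrix[i])
        for i, label in enumerate(labels)
    }
    return correct / total, recall


# Counts copied from the ms = 5, mdf = 5 run above.
accuracy, recall = summarize([[818, 182], [2, 998]], ["pos", "neg"])
print(f"accuracy = {accuracy:.3f}")
for label, value in recall.items():
    print(f"recall({label}) = {value:.3f}")
```

Under that reading of the matrix, 182 of the 184 errors are positive
instances classified as negative, so the drop from the ngram = 2 run is
almost entirely on the pos class.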



On Mon, Oct 18, 2010 at 10:49 PM, Ted Dunning <[email protected]> wrote:

> Is this on the training data?  Or held-out test data?
>
> If on test data, this is much, much too accurate to be believed.
>
> On Mon, Oct 18, 2010 at 10:14 AM, Robin Anil <[email protected]> wrote:
>
> > Just pushed a bug fix for ngrams. Update your copy. Here is the result
> > with ngram = 2
> >
> > Correctly Classified Instances          :       1995     99.75%
> > Incorrectly Classified Instances        :          5      0.25%
> > Total Classified Instances              :       2000
> >
> > =======================================================
> > Confusion Matrix
> > -------------------------------------------------------
> > a     b     <--Classified as
> > 995   5     |  1000   a     = pos
> > 0     1000  |  1000   b     = neg
> > Default Category: unknown: 2
> >
> >
> > With some pruning, you will have a decent enough classifier for
> > sentiments
> >
>
