2011/10/17 Robert Layton <[email protected]>:
> In the formula for the expected value for mutual information [1], the third
> summation uses n_{i,j}.
> Is this a new value, or do I use the value from the contingency matrix?
In Vinh, Epps and Bailey (2010) [1], n_{i,j} denotes an entry of the
contingency table. The Wikipedia page seems rather sloppy, so you might
want to refer to the original paper. (We had bugs in tf-idf because
Wikipedia had errors in its formulas...)
[1] http://jmlr.csail.mit.edu/papers/volume11/vinh10a/vinh10a.pdf
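For what it's worth, here is a minimal sketch of how I read the expected
mutual information formula in [1]; please check it against the paper
before relying on it. The function name and the arguments `a` and `b`
(row and column sums of the contingency table) are my own choices, not
anything from scikit-learn. Note that in this reading the innermost sum
treats n_{i,j} as a summation variable running over the values that entry
can take given the marginals, rather than the single observed count.

```python
from math import exp, lgamma, log


def expected_mutual_information(a, b):
    """Expected MI (in nats) under fixed marginals.

    Hypothetical helper: `a` and `b` are the row and column sums of the
    contingency table; both must add up to the same number of samples N.
    """
    n = sum(a)
    if n != sum(b):
        raise ValueError("row and column sums must have the same total")

    def log_fact(k):
        # log(k!) via the log-gamma function, to avoid huge factorials
        return lgamma(k + 1)

    emi = 0.0
    for a_i in a:
        for b_j in b:
            # n_ij = 0 contributes nothing, so the sum can start at 1
            lo = max(1, a_i + b_j - n)
            hi = min(a_i, b_j)
            for n_ij in range(lo, hi + 1):
                # log of the hypergeometric probability of seeing n_ij
                log_p = (log_fact(a_i) + log_fact(b_j)
                         + log_fact(n - a_i) + log_fact(n - b_j)
                         - log_fact(n) - log_fact(n_ij)
                         - log_fact(a_i - n_ij) - log_fact(b_j - n_ij)
                         - log_fact(n - a_i - b_j + n_ij))
                emi += (n_ij / n) * log(n * n_ij / (a_i * b_j)) * exp(log_p)
    return emi
```

The marginals come from the observed contingency table, e.g.
`expected_mutual_information([1, 1], [1, 1])` for two samples that each
clustering puts in separate clusters.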
HTH,
--
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general