I am trying to implement Adjusted Mutual Information (AMI) in a numerically
stable way. Unfortunately, the third term of the Expected Mutual Information
(EMI), the ratio of factorials, is not stable and overflows with only a
moderate number of samples (e.g. N=1000 fails). See here:
http://en.wikipedia.org/wiki/Adjusted_mutual_information
I think I've reduced the equation to a more stable form:
https://github.com/robertlayton/scikit-learn/wiki/Reducing-EMI
I would appreciate it if someone could look through this and check:
1) That I did this correctly
2) That there isn't a better way (a better identity, or a more efficient way
to reduce the factorials)
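
For comparison, here is a minimal sketch of one common alternative: evaluating
the factorial ratio in log space with the log-gamma function and
exponentiating only at the end. This is not the reduction on the wiki page
above. The function names (log_factorial, hypergeom_term,
expected_mutual_information) are hypothetical, and the formula is the EMI sum
as I read it on the Wikipedia page, with a and b the cluster sizes of the two
labelings, so please treat it as something to check against rather than a
reference implementation.

```python
from math import exp, lgamma, log


def log_factorial(n):
    # log(n!) via the log-gamma function, so n! itself is never formed.
    return lgamma(n + 1)


def hypergeom_term(n_ij, a_i, b_j, N):
    # The "third term" of the EMI sum: a ratio of factorials.
    # Summing logs and exponentiating once avoids overflow for large N.
    log_num = (log_factorial(a_i) + log_factorial(b_j)
               + log_factorial(N - a_i) + log_factorial(N - b_j))
    log_den = (log_factorial(N) + log_factorial(n_ij)
               + log_factorial(a_i - n_ij) + log_factorial(b_j - n_ij)
               + log_factorial(N - a_i - b_j + n_ij))
    return exp(log_num - log_den)


def expected_mutual_information(a, b):
    # a, b: lists of cluster sizes (row and column sums of the
    # contingency table); both are assumed to sum to N.
    N = sum(a)
    emi = 0.0
    for a_i in a:
        for b_j in b:
            # The n_ij = 0 term contributes nothing, so start at 1.
            start = max(1, a_i + b_j - N)
            for n_ij in range(start, min(a_i, b_j) + 1):
                emi += ((n_ij / N)
                        * log(N * n_ij / (a_i * b_j))
                        * hypergeom_term(n_ij, a_i, b_j, N))
    return emi
```

With N=1000 the individual factorials are far beyond floating-point range,
but their logs are only a few thousand, so the difference log_num - log_den
stays well behaved. The triple loop is still slow for many clusters; this
sketch only addresses the overflow.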
Thanks,
Robert
--
My public key can be found at: http://pgp.mit.edu/
Search for this email address and select the key from "2011-08-19" (key id:
54BA8735)
Older keys can be used, but please inform me beforehand (and update when
possible!)
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general