On Wed, Oct 26, 2011 at 22:27, Alexandre Passos <[email protected]> wrote:
> On Wed, Oct 26, 2011 at 22:15, Robert Layton <[email protected]> wrote:
>> I am trying to implement the Adjusted Mutual Information in a stable way.
>> Unfortunately, the third term for the Expected Mutual Information is not
>> stable and can result in overflow issues with only a moderate number of
>> samples (e.g. N=1000 fails). See
>> here: http://en.wikipedia.org/wiki/Adjusted_mutual_information
>> I think I've reduced the equation to a more stable
>> format: https://github.com/robertlayton/scikit-learn/wiki/Reducing-EMI
>> I would appreciate it if someone could look through this and check:
>> 1) That I did this correctly
>> 2) That there isn't a better way (a better identity or efficient way to
>> reduce factorials)
>
> Have you tried using scipy.special.gammaln, doing all the
> multiplications and divisions with additions and subtractions in
> logspace, and then exponentiating?

And if this turns out to be too expensive, you can probably get away
with Stirling's approximation for log n!:
http://en.wikipedia.org/wiki/Stirling%27s_approximation
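
To make both suggestions concrete, here is a rough sketch, not the actual
EMI code. The helper names (log_factorial, factorial_ratio,
stirling_log_factorial) are made up for illustration; the only library
call assumed is scipy.special.gammaln, using n! = Gamma(n + 1). The idea
is to sum log-factorials for the numerator, subtract those for the
denominator, and exponentiate only at the very end.

```python
import numpy as np
from scipy.special import gammaln


def log_factorial(n):
    """Return log(n!) using the identity n! = Gamma(n + 1)."""
    return gammaln(np.asarray(n, dtype=float) + 1.0)


def factorial_ratio(numerators, denominators):
    """Return prod(a!) / prod(b!) without forming any factorial directly.

    Products and quotients become sums and differences in log space;
    the single exp() at the end is the only step that leaves log space.
    """
    log_value = (log_factorial(numerators).sum()
                 - log_factorial(denominators).sum())
    return float(np.exp(log_value))


def stirling_log_factorial(n):
    """Approximate log(n!) as n*log(n) - n + 0.5*log(2*pi*n).

    Hypothetical cheaper replacement for log_factorial; only valid for
    n > 0, and the error shrinks roughly like 1 / (12 * n).
    """
    n = np.asarray(n, dtype=float)
    return n * np.log(n) - n + 0.5 * np.log(2.0 * np.pi * n)


if __name__ == "__main__":
    # 1000! overflows a double, but this ratio is just 1000 / 999.
    print(factorial_ratio([1000, 998], [999, 999]))
    # Compare the exact log-factorial with Stirling's approximation.
    print(log_factorial(1000), stirling_log_factorial(1000))
```

If the final ratio itself is too large or too small for a double, it is
probably safer to keep the term in log space and fold it into the rest
of the sum there, rather than calling exp() on each term.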


-- 
 - Alexandre

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
