Re: [Scikit-learn-general] Are you good at algebra with permutations and factorials?

Alexandre Passos Wed, 26 Oct 2011 19:43:35 -0700

On Wed, Oct 26, 2011 at 22:38, Robert Layton <[email protected]> wrote:
> On 27 October 2011 13:29, Alexandre Passos <[email protected]> wrote:
>>
>> On Wed, Oct 26, 2011 at 22:27, Alexandre Passos <[email protected]>
>> wrote:
>> > On Wed, Oct 26, 2011 at 22:15, Robert Layton <[email protected]>
>> > wrote:
>> >> I am trying to implement the Adjusted Mutual Information in a stable
>> >> way.
>> >> Unfortunately, the third term for the Expected Mutual Information is
>> >> not stable and can overflow with only a moderate number of
>> >> samples (e.g. N=1000 fails). See
>> >> here: http://en.wikipedia.org/wiki/Adjusted_mutual_information
>> >> I think I've reduced the equation to a more stable
>> >> format: https://github.com/robertlayton/scikit-learn/wiki/Reducing-EMI
>> >> I would appreciate it if someone could look through this and check:
>> >> 1) That I did this correctly
>> >> 2) That there isn't a better way (a better identity or efficient way to
>> >> reduce factorials)
>> >
>> > Have you tried using scipy.special.gammaln, doing all the
>> > multiplications and divisions with additions and subtractions in
>> > logspace, and then exponentiating?
>>
>> And if this turns out to be too expensive you can probably get away
>> with Stirling's approximation for log n!
>> http://en.wikipedia.org/wiki/Stirling%27s_approximation
>>
>>
>> --
>>  - Alexandre
>>
>>
>> ------------------------------------------------------------------------------
>> The demand for IT networking professionals continues to grow, and the
>> demand for specialized networking skills is growing even more rapidly.
>> Take a complimentary Learning@Cisco Self-Assessment and learn
>> about Cisco certifications, training, and career opportunities.
>> http://p.sf.net/sfu/cisco-dev2dev
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>
> That is an option. I wasn't sure how to use it though -> calculating the
> factorial isn't the issue, it's working with the really large numbers that
> is. That is why I went with permutations, as the numbers should be lower.


Correct me if I'm wrong, but I'd say the problem is that your
computation, which should produce a reasonably small number, has
intermediate steps that involve very big numbers, which get
multiplied and divided with each other until something reasonable is
left. Working in logspace will "squash" these numbers into
manageable sizes, and after all the multiplications and divisions
(which become additions and subtractions) exponentiating gives you a
reasonable number again. Most of your simplifications can still apply
in logspace, I think, and they could make it faster.
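
A minimal sketch of the logspace idea, assuming the factorial ratio is
the one in the EMI term on the Wikipedia page linked above,
a! b! (N-a)! (N-b)! / (N! n! (a-n)! (b-n)! (N-a-b+n)!). The function
names here are made up for illustration; only scipy.special.gammaln is
a real library call, and log(n!) is gammaln(n + 1):

```python
import numpy as np
from scipy.special import gammaln


def log_factorial(n):
    # log(n!) == gammaln(n + 1); stays small even when n! would overflow.
    return gammaln(n + 1)


def factorial_ratio(N, a, b, n):
    # Hypothetical helper: the factorial ratio from the EMI term,
    # assumed to match the formula on the Wikipedia page.
    # Products become sums of logs, quotients become differences.
    log_num = (log_factorial(a) + log_factorial(b)
               + log_factorial(N - a) + log_factorial(N - b))
    log_den = (log_factorial(N) + log_factorial(n)
               + log_factorial(a - n) + log_factorial(b - n)
               + log_factorial(N - a - b + n))
    # Exponentiate only once, at the very end.
    return np.exp(log_num - log_den)


# N=1000 is the case reported to overflow with plain factorials.
value = factorial_ratio(1000, 400, 300, 120)
print(value)
```

The ratio is only defined for max(0, a+b-N) <= n <= min(a, b), so the
caller has to keep n inside that range.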

-- 
 - Alexandre

