On Sat, Oct 15, 2011 at 3:57 PM, Pietro Berkes <[email protected]> wrote:
> I wish there was a native numpy function for this case, which is
> fairly common in information theory quantities.
> As a workaround, I sometimes use these reasonably efficient utility functions:
>
> def log0(x):
>    """Robust 'entropy' logarithm: log(0.) = 0."""
>    return np.where(x==0., 0., np.log(x))
>
>
> def log0_no_warning(x):
>    """Robust 'entropy' logarithm: log(0.) = 0.
>
>    This version does not raise any warning when values of x=0. are first
>    encountered. However, it is slightly less efficient."""
>    with np.errstate(divide='ignore'):
>        res = np.where(x==0., 0., np.log(x))
>    return res
>

I think the function is quite dangerous if you take it out of the
context of information measures:

>>> np.log(0)
-inf
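
A small sketch of one way this can bite, assuming the log0 helper quoted
above (rewritten here with its import so it runs on its own): with the
log(0.) = 0 convention the function is no longer monotonic, so the value
at 0 is larger than the value at 0.5.

```python
import numpy as np

def log0(x):
    """log with the convention log(0.) = 0, as in the quoted helper."""
    x = np.asarray(x, dtype=float)
    # np.log(0.) triggers a 'divide' warning; silence it locally
    with np.errstate(divide='ignore'):
        return np.where(x == 0., 0., np.log(x))

p = np.array([0.0, 0.5, 1.0])
vals = log0(p)
# vals[0] is 0.0 while vals[1] is log(0.5) < 0, so log0 is not increasing
# on [0, 1]; code that relies on log ordering would be misled.
```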

The equivalent functions that I used were all for xlogy:

res = np.where(x==0., 0., x*np.log(y))
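
As a self-contained sketch of the line above: the name xlogy0 is made up
for illustration, and silencing both the 'divide' and 'invalid' warnings
is my assumption about what one usually wants here (0 * log(0) evaluates
to nan before np.where discards it).

```python
import numpy as np

def xlogy0(x, y):
    """Hypothetical helper: x * log(y), defined as 0 wherever x == 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # log(0) raises a 'divide' warning and 0 * -inf an 'invalid' one;
    # both results are thrown away by np.where, so silence them locally.
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(x == 0., 0., x * np.log(y))

# Usage: entropy of a distribution that contains a zero probability.
p = np.array([0.5, 0.5, 0.0])
entropy = -xlogy0(p, p).sum()
```

The zero is tied to the x factor rather than to the logarithm itself, so
a plain log(y) with y == 0 and x != 0 still gives -inf as expected. Newer
SciPy releases also provide scipy.special.xlogy with the same convention.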


Just my 2c from other packages.

Josef

>
>
> On Fri, Oct 14, 2011 at 10:31 AM, Olivier Grisel
> <[email protected]> wrote:
>> 2011/10/14 Robert Layton <[email protected]>:
>>> I'm working on adding Adjusted Mutual Information, and need to calculate the
>>> Mutual Information.
>>> I think I have the algorithm itself correct, except for the fact that
>>> whenever the contingency matrix is 0, a nan happens and propagates through
>>> the code.
>>>
>>> Sample code on the net [1] uses an eps=np.finfo(float).eps. Should I do
>>> this, adding eps to anything that is a denominator or parameter to log?
>>> Is there a better way?
>>
>> I would rather filter out any entry that has a 0.0 in the denominator
>> before the final sum using array masking.
>>
>> BTW, thanks for tackling this.
>>
>> --
>> Olivier
>> http://twitter.com/ogrisel - http://github.com/ogrisel
>>
>> ------------------------------------------------------------------------------
>> All the data continuously generated in your IT infrastructure contains a
>> definitive record of customers, application performance, security
>> threats, fraudulent activity and more. Splunk takes this data and makes
>> sense of it. Business sense. IT sense. Common sense.
>> http://p.sf.net/sfu/splunk-d2d-oct
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>
