On 07/23/2012 11:38 AM, Gael Varoquaux wrote:
> Thanks a lot for your investigation work. It's very useful.
>
> On Mon, Jul 23, 2012 at 11:32:24AM +0200, Emanuele Olivetti wrote:
>> As you can see the solution is very simple and just based on
>> np.logaddexp.reduce() instead of np.exp().sum(), plus np.nan_to_num()
>> and a little rearrangement.
> As far as I know, logaddexp.reduce is never the best option, as it can
> lead to overflow. The best option that I know of is implemented in
> sklearn.utils.extmath.logsumexp

I can confirm that sklearn.utils.extmath.logsumexp works at least
as well as np.logaddexp.reduce on my data.
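For anyone following along, the key idea behind logsumexp-style functions is the max-shift trick: subtracting the maximum before exponentiating keeps every exponent non-positive, so nothing overflows. A minimal sketch (the helper below is my own illustration, not sklearn's implementation):

```python
import numpy as np

def logsumexp(a):
    """Compute log(sum(exp(a))) with the max-shift trick to avoid overflow."""
    a = np.asarray(a)
    m = a.max()
    # exp(a - m) <= 1 for every element, so the sum cannot overflow
    return m + np.log(np.sum(np.exp(a - m)))

a = np.array([1000.0, 1000.0])
with np.errstate(over='ignore'):
    naive = np.log(np.exp(a).sum())  # exp(1000) overflows, so this is inf
stable = logsumexp(a)                # 1000 + log(2), computed exactly
```

The naive form loses the answer entirely (inf), while the shifted form returns 1000 + log(2) to full precision.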

Summation is a tricky beast, numerically. There are several
approaches (logsumexp, Kahan summation, logaddexp) that attack
the problem from different directions. It might be good to merge
them into a single solution eventually.
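To make the contrast concrete: Kahan summation works in the linear domain rather than the log domain, compensating for the low-order bits that plain left-to-right addition throws away. A sketch of the classic algorithm (example values are mine):

```python
def kahan_sum(xs):
    """Compensated (Kahan) summation: recovers low-order bits that
    plain left-to-right addition discards."""
    total = 0.0
    c = 0.0  # running compensation for lost low-order bits
    for x in xs:
        y = x - c            # apply the correction from the previous step
        t = total + y        # low-order bits of y may be lost here...
        c = (t - total) - y  # ...recover them into c for the next step
        total = t
    return total

# Adding 1e5 tiny terms to 1.0: plain sum() stays stuck at exactly 1.0
# (each 1e-16 is below half an ulp of 1.0), while Kahan recovers the
# accumulated 1e-11.
xs = [1.0] + [1e-16] * 100000
plain = sum(xs)
compensated = kahan_sum(xs)
```

This addresses round-off accumulation, whereas logsumexp addresses overflow/underflow; they are complementary rather than competing.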

I wrote my own logaddexp together with logsubexp, logmean and
logvar here:
https://github.com/emanuele/inference_with_classifiers/blob/master/logvar.py
which may be of interest to someone. But SciPy is most probably
the right place to discuss this.
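The linked file has the actual implementations; purely as an illustration of the same factoring idea applied to differences, a logsubexp can be sketched like this (my own sketch, not the code from that repository):

```python
import numpy as np

def logsubexp(a, b):
    """log(exp(a) - exp(b)), computed stably; requires a > b."""
    if b >= a:
        raise ValueError("logsubexp requires a > b so the difference is positive")
    # Factor out exp(a): log(exp(a) * (1 - exp(b - a))) = a + log1p(-exp(b - a)).
    # log1p keeps precision when exp(b - a) is close to 0 or to 1.
    return a + np.log1p(-np.exp(b - a))

# log(3) "minus" log(1) in the log domain gives log(3 - 1) = log(2)
result = logsubexp(np.log(3.0), np.log(1.0))
```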

>
> Apart from that minor remark, a pull request implementing the fix would
> be awesome.
>
>

I'll get on it soon. The only open issue is providing a meaningful test/use case.

Emanuele

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
