Ryan, did you try passing the arrays, as they are, to np.random.choice? Do you 
get what you expect?
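
If the rows are rejected as they are, one possible workaround is to promote only the row being sampled to float64 and renormalize it just before the call, so the large matrix can stay in float32. This is a minimal sketch, not a tested fix for your data: the helper name `to_choice_probs` and the random stand-in matrix are assumptions made for illustration.

```python
import numpy as np

def to_choice_probs(row):
    """Hypothetical helper: copy one row to float64 and renormalize it."""
    p = np.asarray(row, dtype=np.float64)  # promote this row only
    return p / p.sum()                     # renormalize in double precision

rng = np.random.RandomState(0)

# Stand-in for the large float32 matrix, L1-normalized row by row.
term = rng.rand(4, 50).astype(np.float32)
term /= term.sum(axis=1, keepdims=True)

p = to_choice_probs(term[0])
idx = rng.choice(len(p), p=p)  # draw one column index from row 0
```

Only one row is copied at a time, so the extra memory is small compared with converting the whole matrix to float64.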

Dale Smith, Ph.D.
Data Scientist

d. 404.495.7220 x 4008   f. 404.795.7221
Nexidia Corporate | 3565 Piedmont Road, Building Two, Suite 400 | Atlanta, GA 
30305

    


-----Original Message-----
From: Matthieu Brucher [mailto:matthieu.bruc...@gmail.com] 
Sent: Thursday, December 17, 2015 7:56 AM
To: scikit-learn-general@lists.sourceforge.net
Subject: Re: [Scikit-learn-general] sklearn.preprocessing.normalize does not 
sum to 1

The thing is that even if you compute the sum and divide by it yourself, 
summing the results again may not give exactly 1.0. This is always the "issue" 
with floating-point computation.
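
The effect can be reproduced with plain NumPy, without scikit-learn. The sketch below assumes a random float32 matrix as a stand-in for the real data; it divides each row by its own sum and then counts how many rows no longer sum to exactly 1.

```python
import numpy as np

rng = np.random.RandomState(0)

# Stand-in data: 1000 rows of 50 positive float32 values.
term = rng.rand(1000, 50).astype(np.float32)

# Manual L1 normalization: divide every row by its own sum, in place.
term /= term.sum(axis=1, keepdims=True)

# Sum the normalized rows again and compare against exactly 1.
sums = term.sum(axis=1)
n_off = int(np.count_nonzero(sums != 1))   # rows that are not exactly 1
max_dev = float(np.abs(sums - 1).max())    # largest deviation from 1

print(n_off, max_dev)
```

Each division is rounded to the nearest float32 value, so the rounded entries of a row generally do not add back up to exactly 1; the deviation stays within a few units of float32 precision.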

Cheers,

Matthieu

2015-12-17 8:26 GMT+01:00 Ryan R. Rosario <r...@bytemining.com>:
> Hi,
>
> I have a very large dense numpy matrix. To avoid running out of RAM, I use 
> np.float32 as the dtype instead of the default np.float64 on my system.
>
> When I do an L1 normalization of the rows (axis=1) in my matrix in-place 
> (copy=False), I frequently get rows that do not sum to 1. Since these are 
> probability distributions that I pass to np.random.choice, these must sum to 
> exactly 1.0.
>
> pp.normalize(term, norm='l1', axis=1, copy=False)
> sums = term.sum(axis=1)
> sums[np.where(sums != 1)]
>
> array([ 0.99999994,  0.99999994,  1.00000012, ...,  0.99999994,
>       0.99999994,  0.99999994], dtype=float32)
>
> I wrote some code to manually add/subtract the small difference from 1 to 
> each row, and I made some progress, but still not all the rows sum to 1.
>
> Is there a way to avoid this problem?
>
> — Ryan
> ------------------------------------------------------------------------------
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general



--
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher

------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
