Hm, since you already have memory problems, np.longdouble probably isn't an 
option. However, what about using numpy.around to reduce the precision by a 
few decimals?
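
Something along these lines (just a rough, untested sketch; the toy matrix 
shape and the choice of 4 decimals are placeholders): round, and then push 
whatever each row is still off from 1.0 into its largest entry so the sum 
comes out exact.

import numpy as np

# Toy stand-in for the real matrix (hypothetical shape and values)
term = np.random.rand(1000, 50).astype(np.float32)
term /= term.sum(axis=1, keepdims=True)          # L1-normalize the rows

term = np.around(term, decimals=4)               # drop the noisy low-order digits
residual = 1.0 - term.sum(axis=1)                # per-row gap to 1.0
rows = np.arange(term.shape[0])
term[rows, term.argmax(axis=1)] += residual      # fold each row's gap into its largest entry

print(np.count_nonzero(term.sum(axis=1) != 1))   # ideally 0; float32 can still miss a few

Even then I'd double-check term.sum(axis=1) before handing a row to 
np.random.choice, since float32 arithmetic can still leave a row off by an ulp.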



Sent from my iPhone
> On Dec 17, 2015, at 8:26 AM, Ryan R. Rosario <r...@bytemining.com> wrote:
> 
> Hi,
> 
> I have a very large dense numpy matrix. To avoid running out of RAM, I use 
> np.float32 as the dtype instead of the default np.float64 on my system. 
> 
> When I do an L1 normalization of the rows (axis=1) of my matrix in-place 
> (copy=False), I frequently get rows that do not sum to 1. Since these are 
> probability distributions that I pass to np.random.choice, they must sum to 
> exactly 1.0.
> 
> import numpy as np
> from sklearn import preprocessing as pp   # presumably what pp refers to here
> 
> pp.normalize(term, norm='l1', axis=1, copy=False)  # L1-normalize the rows of term in place
> sums = term.sum(axis=1)
> sums[np.where(sums != 1)]                           # rows whose float32 sum is not exactly 1
> 
> array([ 0.99999994,  0.99999994,  1.00000012, ...,  0.99999994,
>      0.99999994,  0.99999994], dtype=float32)
> 
> I wrote some code to manually add/subtract each row's small difference from 1, 
> and that makes some progress, but still not all of the rows sum to 1.
> 
> Is there a way to avoid this problem?
> 
> — Ryan
