Hi,

I have a very large dense numpy matrix. To avoid running out of RAM, I use 
np.float32 as the dtype instead of numpy's default np.float64. 

When I L1-normalize the rows of the matrix in place (axis=1, copy=False), I 
frequently get rows that do not sum to 1. Since these rows are probability 
distributions that I pass to np.random.choice, they must sum to exactly 1.0.

import numpy as np
from sklearn import preprocessing as pp

pp.normalize(term, norm='l1', axis=1, copy=False)  # term is my float32 matrix
sums = term.sum(axis=1)
sums[sums != 1]

array([ 0.99999994,  0.99999994,  1.00000012, ...,  0.99999994,
        0.99999994,  0.99999994], dtype=float32)
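
For reference, here is a self-contained version of what I am seeing. The 
random matrix is just a stand-in for my real data, so the exact counts will 
vary, but the effect is the same:

import numpy as np

rng = np.random.RandomState(0)
term = rng.rand(1000, 50).astype(np.float32)  # stand-in for my real matrix
term /= term.sum(axis=1, keepdims=True)  # same as the l1 normalize above
                                         # (entries are nonnegative)

sums = term.sum(axis=1)
print(np.count_nonzero(sums != 1))  # typically hundreds of rows are off

# np.random.choice checks that p sums to 1 within roughly sqrt(machine
# eps) in float64 (~1.5e-8), so a row off by one float32 ulp is rejected:
bad = term[sums != 1][0]
np.random.choice(bad.size, p=bad)  # ValueError: probabilities do not sum to 1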

I wrote some code that manually adds or subtracts each row's small difference 
from 1 back into the row. That fixes most rows, but some still do not sum to 
exactly 1.
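
The correction I tried looks roughly like this (just a sketch, with term as 
above; it folds each row's residual into that row's largest entry):

# add each row's deficit (or subtract its excess) at the largest entry
residual = np.float32(1) - term.sum(axis=1)
rows = np.arange(term.shape[0])
term[rows, term.argmax(axis=1)] += residual

# the float32 additions themselves round, so some rows can still end
# up an ulp away from 1.0
print(np.count_nonzero(term.sum(axis=1) != 1))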

Is there a way to avoid this problem?

— Ryan