Hm, since you're already running into memory problems, longdouble probably isn't an option. However, what about using numpy.around to reduce the precision by a few decimals?
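Something along these lines is what I have in mind (a rough, untested sketch with made-up names; `term` just stands in for your L1-normalized float32 matrix):

    import numpy as np

    # Stand-in for the large float32 matrix after L1 normalization.
    term = np.random.rand(5, 4).astype(np.float32)
    term /= term.sum(axis=1, keepdims=True)   # rows sum to ~1 in float32

    # Round to a few decimals, then renormalize once more so the rounding
    # error gets folded back into each row.
    term = np.around(term, decimals=4)
    term /= term.sum(axis=1, keepdims=True)

    # Another workaround (not something I've needed myself): do the final
    # division in float64 right before sampling; that is usually close
    # enough to 1.0 for np.random.choice.
    row = term[0].astype(np.float64)
    p = row / row.sum()
    sample = np.random.choice(len(p), p=p)

No guarantee every row comes out bit-exact at 1.0 in float32, but renormalizing after the rounding (or dividing in float64 at the last step) should keep np.random.choice happy.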
Sent from my iPhone

> On Dec 17, 2015, at 8:26 AM, Ryan R. Rosario <r...@bytemining.com> wrote:
>
> Hi,
>
> I have a very large dense numpy matrix. To avoid running out of RAM, I use
> np.float32 as the dtype instead of the default np.float64 on my system.
>
> When I do an L1 normalization of the rows (axis=1) in my matrix in-place
> (copy=False), I frequently get rows that do not sum to 1. Since these are
> probability distributions that I pass to np.random.choice, these must sum to
> exactly 1.0.
>
> pp.normalize(term, norm='l1', axis=1, copy=False)
> sums = term.sum(axis=1)
> sums[np.where(sums != 1)]
>
> array([ 0.99999994,  0.99999994,  1.00000012, ...,  0.99999994,
>         0.99999994,  0.99999994], dtype=float32)
>
> I wrote some code to manually add/subtract the small difference from 1 to
> each row, and I make some progress, but still all the rows do not sum to 1.
>
> Is there a way to avoid this problem?
>
> — Ryan