[math] DiscreteEmpiricalDistribution

2013-01-07 Thread Phil Steitz
The EmpiricalDistribution class in the random package is designed to support large samples. It does not store all of the data points in memory, but instead bins the data and uses smoothing kernels within the bins. I have recently had the need for a discrete empirical distribution - i.e., an

Re: [math] DiscreteEmpiricalDistribution

2013-01-07 Thread Ted Dunning
This will be very useful. Sampling from discrete ECDFs is also closely related to generating samples from a multinomial distribution. I did a bit of work on the latter problem. The result of that work is in org.apache.mahout.math.random.Multinomial. The major difference that you will have is
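
For readers following the thread, here is a minimal sketch of what a discrete empirical distribution might look like and why sampling from it resembles multinomial sampling. Nothing here is the Commons Math or Mahout API: the class name DiscreteEmpiricalSketch and its methods are hypothetical, and the sketch assumes the whole sample fits in memory and contains no NaN values. It keeps the distinct observed values with cumulative counts, evaluates the ECDF by binary search, and samples by drawing a uniform rank and looking up the value that owns it.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration only; not part of Commons Math or Mahout. */
final class DiscreteEmpiricalSketch {
    private final double[] values;  // distinct observed values, ascending
    private final int[] cumCounts;  // cumCounts[i] = number of observations <= values[i]
    private final int total;        // sample size

    DiscreteEmpiricalSketch(double[] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("data must not be empty");
        }
        double[] sorted = data.clone();
        Arrays.sort(sorted);
        double[] v = new double[sorted.length];
        int[] c = new int[sorted.length];
        int n = 0;
        for (double x : sorted) {
            if (n > 0 && x == v[n - 1]) {
                c[n - 1]++;                          // repeat of the current value
            } else {
                v[n] = x;                            // new distinct value
                c[n] = (n == 0 ? 0 : c[n - 1]) + 1;  // carry the running count forward
                n++;
            }
        }
        values = Arrays.copyOf(v, n);
        cumCounts = Arrays.copyOf(c, n);
        total = sorted.length;
    }

    /** Empirical CDF: fraction of observations less than or equal to x. */
    double cumulativeProbability(double x) {
        int idx = Arrays.binarySearch(values, x);
        if (idx < 0) {
            int insertionPoint = -idx - 1;
            if (insertionPoint == 0) {
                return 0.0;                          // x is below every observation
            }
            idx = insertionPoint - 1;                // largest observed value below x
        }
        return cumCounts[idx] / (double) total;
    }

    /** Draws one observed value with probability proportional to its count. */
    double sample(Random rng) {
        int rank = rng.nextInt(total) + 1;           // uniform rank in [1, total]
        int idx = Arrays.binarySearch(cumCounts, rank);
        if (idx < 0) {
            idx = -idx - 1;                          // first cumulative count >= rank
        }
        return values[idx];
    }

    public static void main(String[] args) {
        DiscreteEmpiricalSketch d =
            new DiscreteEmpiricalSketch(new double[] {1, 2, 2, 5});
        System.out.println(d.cumulativeProbability(2.0));
        System.out.println(d.sample(new Random(42L)));
    }
}
```

The sample method is the multinomial connection: the counts act as unnormalized category weights, and a binary search over their running sums picks the category. This is only one possible design; an alias table or a tree of weights would trade setup cost for different sampling and update costs.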