I would say that you need core sets.
We don't have them in scikit-learn, but I'd love to get them. Worst case, do a
data reduction using a streaming k-means.
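
A minimal sketch of that worst-case reduction, assuming MiniBatchKMeans with
partial_fit stands in for the "streaming k-means". The helper name
kmeans_reduce and its parameters are made up for illustration. The counts
are only approximate, because the centers keep moving while the stream is
being read.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def kmeans_reduce(chunks, n_centers=50, random_state=0):
    """Reduce a stream of data chunks to (centers, weights).

    Hypothetical helper, not a scikit-learn API. Each chunk is an array of
    shape (n_samples, n_features); the first chunk must hold at least
    n_centers samples so the centers can be initialised.
    """
    km = MiniBatchKMeans(n_clusters=n_centers, n_init=3,
                         random_state=random_state)
    weights = np.zeros(n_centers)
    for chunk in chunks:
        km.partial_fit(chunk)  # update the centers with this chunk only
        # Count the points by their nearest *current* center. Centers drift
        # between chunks, so these counts are an approximation.
        labels = km.predict(chunk)
        weights += np.bincount(labels, minlength=n_centers)
    return km.cluster_centers_, weights
```

The centers and weights can then be treated as a small weighted data set,
provided the downstream estimator accepts per-sample weights.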

Histograms are really an ugly way of doing data reduction. They don't adapt to
the data distribution.

Gaël

-------- Original message --------
From: Charles Greenberg <great...@gmail.com>
Date: 18/02/2014 19:48 (GMT+01:00)
To: scikit-learn-general@lists.sourceforge.net
Subject: Re: [Scikit-learn-general] fit a GMM to a histogram (or weighted data points) ?

To give some more info, my application is fitting to processed data.
Basically, it's too expensive to keep all individual data points so we keep a 
(3D) histogram, but we still want to fit a GMM to the data. Additionally, we 
perform processing on the histogram, averaging from multiple experiments. My 
current solution (re-sampling from the histogram) does work but 1) it's kind of 
annoying and 2) I have to sample a LOT of points so it slows things down.
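
For reference, fitting to the histogram directly (bin centers as points, bin
counts as weights) instead of re-sampling could look roughly like the sketch
below. It is a plain NumPy EM loop, not scikit-learn's GMM; the function
names weighted_gmm_em and histogram_to_weighted_points are invented here,
and it does no convergence check or special handling of collapsed
components.

```python
import numpy as np


def histogram_to_weighted_points(H, edges):
    """Turn np.histogramdd output into (bin centers, counts), dropping empty bins."""
    centers = [(e[:-1] + e[1:]) / 2.0 for e in edges]
    grid = np.stack(np.meshgrid(*centers, indexing="ij"), axis=-1)
    X = grid.reshape(-1, len(edges))
    w = np.asarray(H, dtype=float).ravel()
    keep = w > 0
    return X[keep], w[keep]


def weighted_gmm_em(X, w, n_components, n_iter=100, reg=1e-6, seed=0):
    """EM for a full-covariance Gaussian mixture where X[i] has weight w[i]."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    n, d = X.shape
    # Initial means: distinct points drawn with probability proportional to
    # weight (needs at least n_components points with non-zero weight).
    means = X[rng.choice(n, size=n_components, replace=False, p=w / w.sum())]
    cov0 = np.cov(X.T, aweights=w) + reg * np.eye(d)
    covs = np.array([cov0] * n_components)
    pis = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: log responsibilities, computed per component.
        log_r = np.empty((n, n_components))
        for k in range(n_components):
            L = np.linalg.cholesky(covs[k])
            z = np.linalg.solve(L, (X - means[k]).T)  # shape (d, n)
            log_det = 2.0 * np.log(np.diag(L)).sum()
            log_r[:, k] = np.log(pis[k]) - 0.5 * (
                d * np.log(2.0 * np.pi) + log_det + (z ** 2).sum(axis=0))
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: every responsibility is scaled by the sample weight.
        rw = r * w[:, None]
        nk = rw.sum(axis=0) + np.finfo(float).eps
        pis = nk / nk.sum()
        means = (rw.T @ X) / nk[:, None]
        for k in range(n_components):
            diff = X - means[k]
            covs[k] = (rw[:, k, None] * diff).T @ diff / nk[k] + reg * np.eye(d)
    return pis, means, covs
```

With H, edges = np.histogramdd(data, bins=...), the fit would be
weighted_gmm_em(*histogram_to_weighted_points(H, edges), n_components=...).
The only change from ordinary EM is the multiplication by w in the M-step, so
the cost scales with the number of non-empty bins rather than the number of
original samples.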

--Charles


On Sun, Feb 16, 2014 at 11:02 AM, Sturla Molden <sturla.mol...@gmail.com> wrote:
On 16/02/14 19:47, Gael Varoquaux wrote:
> On Sun, Feb 16, 2014 at 07:45:18PM +0100, Sturla Molden wrote:
>> The main use case is to use the histogram as a form of down-sampling.
>
> In this case, I'd love to see core sets, which would be a more optimal
> way of down sampling. And actually, that answers my question above, I
> think that core sets would be a usecase for weighted samples.

Yes, core-sets are possible too

http://people.csail.mit.edu/dannyf/nips11.pdf



Sturla


_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
