Re: [Scikit-learn-general] weighted mean shift

2012-04-11 Thread Alexandre Gramfort
A sample_weight param seems reasonable to me. Alex On Wed, Apr 11, 2012 at 5:10 PM, Olivier Grisel wrote: > On 11 April 2012 at 16:59, Michael Selik wrote: >> Certainly. It looks like a good approach would be to break out line 121 in >> mean_shift_.py: >>> my_mean = np.mean(points_within, axis=

[Scikit-learn-general] Fwd: [ML tea] Matt Johnson's writeup for his talk today on spectral learning of HMM's

2012-04-11 Thread Satrajit Ghosh
of potential interest to the hmm module. cheers, satra -- Forwarded message -- From: George H. Chen Date: Mon, Apr 9, 2012 at 9:10 PM Subject: [ML tea] Matt Johnson's writeup for his talk today on spectral learning of HMM's Hello all. To those who missed it and for those who

Re: [Scikit-learn-general] weighted mean shift

2012-04-11 Thread Olivier Grisel
On 11 April 2012 at 16:59, Michael Selik wrote: > Certainly. It looks like a good approach would be to break out line 121 in > mean_shift_.py: >> my_mean = np.mean(points_within, axis=0) > > And provide a function instead that allows several methods of mean > calculation -- flat kernel (current

Re: [Scikit-learn-general] weighted mean shift

2012-04-11 Thread Michael Selik
Certainly. It looks like a good approach would be to break out line 121 in mean_shift_.py: > my_mean = np.mean(points_within, axis=0) And provide a function instead that allows several methods of mean calculation -- flat kernel (current method), gaussian kernel, and/or accuracy-weighted kernel.
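A minimal sketch of the idea in the message above, not the scikit-learn implementation: a helper that could stand in for `np.mean(points_within, axis=0)` and switch between a flat and a Gaussian kernel, optionally combined with per-sample weights. The function name `kernel_mean`, its parameters, and the way `sample_weight` is multiplied into the kernel weights are all assumptions made for illustration.

```python
import numpy as np


def kernel_mean(points_within, center, kernel="flat", bandwidth=1.0,
                sample_weight=None):
    """Hypothetical replacement for np.mean(points_within, axis=0).

    points_within : (n_points, n_features) points inside the current window
    center        : (n_features,) current window center
    kernel        : "flat" (plain mean) or "gaussian" (distance-weighted)
    sample_weight : optional (n_points,) per-point weights, e.g. accuracy
    """
    points_within = np.asarray(points_within, dtype=float)
    n_points = points_within.shape[0]

    if kernel == "flat":
        # Every point in the window counts equally (the current behaviour).
        weights = np.ones(n_points)
    elif kernel == "gaussian":
        # Points closer to the center get a larger weight.
        sq_dist = np.sum((points_within - center) ** 2, axis=1)
        weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    else:
        raise ValueError("unknown kernel: %r" % kernel)

    if sample_weight is not None:
        # Assumption: per-sample weights simply scale the kernel weights.
        weights = weights * np.asarray(sample_weight, dtype=float)

    # Weighted average over the points (axis 0).
    return np.average(points_within, axis=0, weights=weights)
```

With `kernel="flat"` and no `sample_weight`, this reduces to the plain mean, so existing behaviour would be unchanged by default.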

Re: [Scikit-learn-general] Sub sampling large datasets

2012-04-11 Thread Gael Varoquaux
On Wed, Apr 11, 2012 at 10:55:34AM +0200, Jean-Louis Durrieu wrote: > I was thinking it would be a good idea to include in gmm.py such a > mechanism. Core sets are a very beautiful idea from the theoretical standpoint and I'd love to have them in the scikit. We had even added them in the list of i

Re: [Scikit-learn-general] Non-Negative Garotte

2012-04-11 Thread Jaques Grobler
haha okay, I'll give it a swing! J 2012/4/11 Alexandre Gramfort > > The simulations that are done in the last mentioned paper by Ming Yuan > and > > Yi Lin, > > http://www2.isye.gatech.edu/statistics/papers/05-25.pdf, they find > that the > > NG seems to do > > generally better than the LASSO

Re: [Scikit-learn-general] Non-Negative Garotte

2012-04-11 Thread Alexandre Gramfort
> The simulations that are done in the last mentioned paper by Ming Yuan and > Yi Lin, > http://www2.isye.gatech.edu/statistics/papers/05-25.pdf,  they find that the > NG seems to do > generally better than the LASSO (figure 1) if you can reproduce this figure using my gist I pay you a beer :) I

Re: [Scikit-learn-general] Non-Negative Garotte

2012-04-11 Thread Jaques Grobler
@Olivier - no problem :) @Alex Apart from trying it ourselves to see how it fares, here are some findings: In the simulations in the last-mentioned paper by Ming Yuan and Yi Lin, http://www2.isye.gatech.edu/statistics/papers/05-25.pdf, they find that the NG seems to do generally bette

Re: [Scikit-learn-general] Non-Negative Garotte

2012-04-11 Thread Olivier Grisel
On 11 April 2012 at 11:53, Alexandre Gramfort wrote: > > it seems to me that it might be a good addition to the scikit if we can > convince ourselves with examples that it does better than a Lasso. I agree. Thanks for the write-up Jaques. -- Olivier http://twitter.com/ogrisel - http://github.com

Re: [Scikit-learn-general] Non-Negative Garotte

2012-04-11 Thread Alexandre Gramfort
> The algorithm proposed in this paper is rather similar to that of the Lars > LASSO, but with a complicating > factor being a non-negative constraint on the shrinkage factor. (See eq. (2) > in this paper) > Once you've computed your shrinkage factor, you basically have your > regression coeffic

Re: [Scikit-learn-general] Sub sampling large datasets

2012-04-11 Thread Olivier Grisel
On 11 April 2012 at 10:55, Jean-Louis Durrieu wrote: > Hi all, > > On Feb 7, 2012, at 8:47 AM, Olivier Grisel wrote: > >> 2012/2/6 Shishir Pandey : >>> >>> I am working with a dataset which is too big to fit in the memory. Is there a >>> way in scikits-learn to sub sample the existing dataset maintai

Re: [Scikit-learn-general] Non-Negative Garotte

2012-04-11 Thread Jaques Grobler
Here's a wee summary on the non-negative garrote (NG) I pieced together: The original non-negative garrote from Breiman (1995) is basically a scaled version of the least-squares estimate. Basically, take an OLS estimator and then shrink that estimator to obtain a sparser representation. The shrin
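The two steps described in the summary above (fit OLS, then shrink each coefficient by a non-negative factor) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code from the thread or from scikit-learn: it uses the penalized form of the problem, minimizing 0.5 * ||y - sum_j c_j * beta_ols_j * x_j||^2 + lam * sum_j c_j subject to c_j >= 0, and solves it with plain coordinate descent. The function name `non_negative_garrote`, the parameters `lam` and `n_iter`, and the assumption that `X` and `y` are already centered (no intercept) are all choices made here for illustration.

```python
import numpy as np


def non_negative_garrote(X, y, lam, n_iter=200):
    """Hypothetical sketch: shrink OLS coefficients by factors c_j >= 0."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 1: ordinary least-squares estimate.
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Step 2: scale column j of X by beta_ols[j]; the shrinkage factors c
    # are then the (non-negative) regression coefficients on Z.
    Z = X * beta_ols
    col_sq = np.sum(Z ** 2, axis=0)

    c = np.ones(X.shape[1])  # c = 1 reproduces the OLS fit
    residual = y - Z @ c
    for _ in range(n_iter):
        for j in range(Z.shape[1]):
            if col_sq[j] == 0.0:
                c[j] = 0.0  # column is all zeros, factor is irrelevant
                continue
            # Remove feature j's contribution, update c_j, put it back.
            residual += Z[:, j] * c[j]
            c[j] = max(0.0, (Z[:, j] @ residual - lam) / col_sq[j])
            residual -= Z[:, j] * c[j]

    # Final coefficients are the shrunken OLS coefficients.
    return c * beta_ols, c
```

With `lam=0` the factors stay at 1 and the OLS solution is returned unchanged; larger values of `lam` push more factors to exactly zero, which is where the sparsity comes from.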

Re: [Scikit-learn-general] Sub sampling large datasets

2012-04-11 Thread Jean-Louis Durrieu
Hi all, On Feb 7, 2012, at 8:47 AM, Olivier Grisel wrote: > 2012/2/6 Shishir Pandey : >> >> I am working with a dataset which is too big to fit in the memory. Is there a >> way in scikits-learn to sub sample the existing dataset maintaining its >> properties so that I can load it in my RAM? > > We