Re: [Scikit-learn-general] python 3.x

2014-02-18 Thread Joel Nothman
I should emphasise it will not affect the results: the example will still work fine. On 19 February 2014 13:29, Joel Nothman wrote: > Hi Tommy, > > I'm pretty sure this isn't related to Python 3. As far as I understand > it's a deprecation warning from numpy. It may well be fixed now, and > cer

Re: [Scikit-learn-general] python 3.x

2014-02-18 Thread Joel Nothman
Hi Tommy, I'm pretty sure this isn't related to Python 3. As far as I understand it's a deprecation warning from numpy. It may well be fixed now, and certainly the warning won't show with default warning settings anymore. Thanks for the report, - Joel On 19 February 2014 07:23, Tommy Carstense

[Scikit-learn-general] python 3.x

2014-02-18 Thread Tommy Carstensen
To scikit-learn-general, I was under the impression that scikit-learn is fully compatible with Python 3 after reading this thread: http://stackoverflow.com/questions/11910481/best-machine-learning-package-for-python-3x However, I get the warning below, when I run the first test example on this p

Re: [Scikit-learn-general] fit a GMM to a histogram (or weighted data points) ?

2014-02-18 Thread Gael Varoquaux
Agreed.  Gaël Original message From: Sturla Molden Date:18/02/2014 20:00 (GMT+01:00) To: [email protected] Subject: Re: [Scikit-learn-general] fit a GMM to a histogram (or weighted data points) ? Whether the initial data reduction is core s

Re: [Scikit-learn-general] fit a GMM to a histogram (or weighted data points) ?

2014-02-18 Thread Gael Varoquaux
I would say that you need core sets. We don't have them in scikit-learn but I'd love to get them. Worst case, do a data reduction using a streaming k-means. Histograms are really an ugly way of doing data reduction. They don't adapt to the data distribution. Gaël Original message --

Re: [Scikit-learn-general] fit a GMM to a histogram (or weighted data points) ?

2014-02-18 Thread Sturla Molden
Whether the initial data reduction is core sets or histograms, the procedure for fitting a weighted GMM would still be the same. Sturla Charles Greenberg wrote: > To give some more info, my application is fitting to processed data. > Basically, it's too expensive to keep all individual data poi
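The point above, that the fitting procedure is the same whatever produced the weighted points, can be illustrated with a weighted EM loop. The following is only a minimal one-dimensional sketch using NumPy, not scikit-learn's GMM implementation: the function name `fit_weighted_gmm_1d`, its parameters, and the synthetic histogram are all assumptions made for illustration. Each point `x[i]` (for example a bin centre) carries a weight `w[i]` (for example a bin count), and the weights simply multiply the responsibilities in the M-step.

```python
import numpy as np


def fit_weighted_gmm_1d(x, w, n_components=2, n_iter=200, var_floor=1e-6):
    """Hypothetical helper: EM for a 1-D Gaussian mixture on weighted points."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    total = w.sum()

    # Deterministic initialisation: means spread over the data range,
    # every component starts with the overall weighted variance.
    mean_all = np.sum(w * x) / total
    var_all = np.sum(w * (x - mean_all) ** 2) / total
    means = np.linspace(x.min(), x.max(), n_components + 2)[1:-1]
    variances = np.full(n_components, var_all)
    mix = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # E-step: responsibilities of each component for each point,
        # computed in log space for numerical stability.
        diff = x[:, None] - means[None, :]
        log_pdf = -0.5 * (np.log(2.0 * np.pi * variances) + diff ** 2 / variances)
        log_r = np.log(mix) + log_pdf
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: identical to the unweighted case, except that every
        # responsibility is multiplied by the weight of its point.
        wr = w[:, None] * resp
        nk = wr.sum(axis=0) + 1e-12  # guard against an empty component
        means = (wr * x[:, None]).sum(axis=0) / nk
        diff = x[:, None] - means[None, :]
        variances = np.maximum((wr * diff ** 2).sum(axis=0) / nk, var_floor)
        mix = nk / nk.sum()

    return mix, means, variances


# Synthetic example: reduce 10,000 samples to a 60-bin histogram, then fit
# the mixture to bin centres weighted by bin counts.
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(-3.0, 1.0, 5000), rng.normal(3.0, 1.0, 5000)])
counts, edges = np.histogram(samples, bins=60)
centers = 0.5 * (edges[:-1] + edges[1:])

mix, means, variances = fit_weighted_gmm_1d(centers, counts, n_components=2)
```

The same loop would apply to weights from core sets or a streaming k-means; only the arrays passed as `x` and `w` change. Extending it to the 3D histogram mentioned in the thread would require full covariance matrices, which this sketch does not cover.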

Re: [Scikit-learn-general] fit a GMM to a histogram (or weighted data points) ?

2014-02-18 Thread Charles Greenberg
To give some more info, my application is fitting to processed data. Basically, it's too expensive to keep all individual data points so we keep a (3D) histogram, but we still want to fit a GMM to the data. Additionally, we perform processing on the histogram, averaging from multiple experiments. M

Re: [Scikit-learn-general] Installation on OS X Mavericks

2014-02-18 Thread Sturla Molden
So this is now fixed in Cython, just get the most recent master from github. :-) Sturla Sturla Molden wrote: > On 16/02/14 18:36, Sturla Molden wrote: >> On 16/02/14 17:59, Sturla Molden wrote: >> > Eh, I am not even sure it should be legal to call an iterator like this: >> > >> > typ

Re: [Scikit-learn-general] Shrinkage LDA

2014-02-18 Thread Alexandre Gramfort
https://github.com/scikit-learn/scikit-learn/issues/1649 have a look at the implementation from @ksemb https://gist.github.com/ksemb/4c3c86c6b62e7a99989b and the discussion on the issue HTH A

Re: [Scikit-learn-general] Shrinkage LDA

2014-02-18 Thread Brunner, Clemens
I didn't find anything on the issue tracker related to this topic. Could you point me to the relevant entries please? Clemens On Feb 17, 2014, at 21:42, Alexandre Gramfort wrote: > hi Clemens, > > you might want to look at the open issues on the topic. > > A > > On Mon, Feb 17, 2014 at 1

Re: [Scikit-learn-general] Removing confounding factors before clustering

2014-02-18 Thread federico vaggi
Hey Juan, For scaling data, sklearn provides the preprocessing module. You can use one scaler object per batch, and that will normalize each batch independently. Depending on what the angles represent, you should probably rescale them in a way that they end up on a linear scale. Naively - maybe
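The "one scaler object per batch" suggestion might look roughly like the sketch below. It assumes the batch membership of each sample is available as an array of labels; the helper name `scale_per_batch`, the `batch_ids` array, and the synthetic data are hypothetical and not taken from the thread. `StandardScaler` is used as one possible scaler from `sklearn.preprocessing`; the message does not say which scaler was meant.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler


def scale_per_batch(X, batch_ids):
    """Hypothetical helper: standardise each batch with its own scaler."""
    X = np.asarray(X, dtype=float)
    batch_ids = np.asarray(batch_ids)
    X_scaled = np.empty_like(X)
    for batch in np.unique(batch_ids):
        rows = batch_ids == batch
        # A fresh scaler per batch, so each batch ends up with zero mean
        # and unit variance per feature, independently of the others.
        X_scaled[rows] = StandardScaler().fit_transform(X[rows])
    return X_scaled


# Synthetic example: two batches with very different offsets and spreads.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 3)), rng.normal(10.0, 5.0, (50, 3))])
batch_ids = np.repeat([0, 1], 50)

X_scaled = scale_per_batch(X, batch_ids)
```

Scaling per batch removes batch-level shifts in location and scale, but it also removes any genuine between-batch differences, so it is only appropriate when those differences are considered a confound.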

[Scikit-learn-general] Removing confounding factors before clustering

2014-02-18 Thread Juan Nunez-Iglesias
Hi All, I have a "biggish" dataset (to use Gaël's terminology ;), 45K samples x 300 features, that I want to cluster. I have very heterogeneous features -- some are continuous, others are quasi-continuous (high counts), others are discrete (counts of rare events), others are angles (uniformly dist