Re: [Scikit-learn-general] Starting to contribute to Scikit-learn.

2014-09-06 Thread Kyle Kastner
Shubham, There are many open improvements on the GitHub issues list (https://github.com/scikit-learn/scikit-learn/issues?q=is%3Aopen+is%3Aissue+label%3AEasy). I recommend starting with a few of the Easy or Documentation tasks - it helps get the workflow down and is also very valuable to the projec

[Scikit-learn-general] Starting to contribute to Scikit-learn.

2014-09-06 Thread Shubham Tomar
Hi, I would like to start contributing to scikit-learn. What are some tasks a machine learning beginner can start with? I am familiar with fundamental machine learning algorithms like Naive Bayes, SVMs, etc. Thanks and regards, Shubham ---

Re: [Scikit-learn-general] on scaling and grid search

2014-09-06 Thread Joel Nothman
Scaling (or the same scaling procedure) is not always beneficial, but you can certainly do exactly what you are saying by making a pipeline of a StandardScaler and your estimator. See the documentation for Pipeline at http://scikit-learn.org/dev/modules/pipeline.html and http://scikit-learn.org/de
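A minimal sketch of the pipeline approach described above, assuming a recent scikit-learn release where GridSearchCV lives in sklearn.model_selection (at the time of this thread it was in sklearn.grid_search). The iris data, the SVC estimator, the step names "scale" and "svc", and the grid of C values are illustrative choices, not taken from the message.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative data set; any (X, y) pair would do.
X, y = load_iris(return_X_y=True)

# The scaler is a pipeline step, so it is re-fitted on the training
# folds of every cross-validation split and only applied to the
# held-out fold.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("svc", SVC()),
])

# Parameters of a step are addressed as <step name>__<parameter>.
param_grid = {"svc__C": [0.1, 1.0, 10.0]}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
```

Because the scaler sits inside the estimator passed to the grid search, the held-out fold never contributes to the scaling statistics.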

Re: [Scikit-learn-general] one-class SVM with limited number of samples

2014-09-06 Thread Pagliari, Roberto
So basically you would not do grid search, and just look at the proportion of outliers in the training and test datasets. But what if you have so few examples of anomalies that you want to keep them all in the test dataset? In that case, how would you go about finding the best value of nu? Thank

[Scikit-learn-general] on scaling and grid search

2014-09-06 Thread Pagliari, Roberto
Typically one should scale the training data and then scale the test data using the parameters obtained when scaling the training data. When performing grid search, shouldn't scaling occur every time (k-1) folds are selected as training data, and then be applied to the k-th fold? So, in practice, every time the training/tes

Re: [Scikit-learn-general] Update website "Comparison of LDA and PCA 2D projection of Iris dataset"

2014-09-06 Thread Lars Buitinck
2014-09-06 10:12 GMT+02:00 Gael Varoquaux : > On Fri, Sep 05, 2014 at 10:12:14PM -0400, Sebastian Raschka wrote: >> just saw that scikit-learn 15.2.0 is out and the LDA was fixed. That's great >> :). > > I missed that reviewing the patches that made it into the 0.15.2. > > The change in LDA should

Re: [Scikit-learn-general] Update website "Comparison of LDA and PCA 2D projection of Iris dataset"

2014-09-06 Thread Olivier Grisel
Alright, I could not see the left panel with the version info on the mobile version of the site when I answered that email. I now understand what happened: I built the doc from the correct source checkout (tag 0.15.2) but using the wrong version of sklearn in my Python site-packages (I had master there be

Re: [Scikit-learn-general] scikit-learn 0.15.2 is out!

2014-09-06 Thread Olivier Grisel
Kyle Kelley reported that the source tarball that I uploaded yesterday night on PyPI had a typo in its setup.py causing an import error. This was caused by a mistake on my side when doing the release. I regenerated the correct tarball from the 0.15.2 git tag and reuploaded it to PyPI. Sorry for a

Re: [Scikit-learn-general] Update website "Comparison of LDA and PCA 2D projection of Iris dataset"

2014-09-06 Thread Gael Varoquaux
On Sat, Sep 06, 2014 at 04:13:32PM +0200, Olivier Grisel wrote: > > Looking at this page, however, there is a problem with the version > > number: the page says that the docs are for 0.16, while they are for > > 0.15.2. > Where in the page on which URL ? http://scikit-learn.org/stable/auto_exampl

Re: [Scikit-learn-general] Update website "Comparison of LDA and PCA 2D projection of Iris dataset"

2014-09-06 Thread Olivier Grisel
Le 6 sept. 2014 10:12, "Gael Varoquaux" a écrit : > > On Fri, Sep 05, 2014 at 10:12:14PM -0400, Sebastian Raschka wrote: > > just saw that scikit-learn 15.2.0 is out and the LDA was fixed. That's great :). > > I missed that reviewing the patches that made it into the 0.15.2. > > The change in LDA

Re: [Scikit-learn-general] Update website "Comparison of LDA and PCA 2D projection of Iris dataset"

2014-09-06 Thread Olivier Grisel
Weird. I don't have access to my computer right now. Will need to check what happened with the website when home. I thought I had synced the 0.15 / stable root from the 0.15.2 tag.

Re: [Scikit-learn-general] one-class SVM with limited number of samples

2014-09-06 Thread Albert Thomas
Hi Roberto, One possible way to tune the hyperparameters of the One Class SVM is to split the data set into training and test sets, train the One Class SVM with the training set and a pre-specified nu, and see if you get a similar proportion of outliers (a number close to nu) on the test s
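A rough sketch of the check described above, under several assumptions that are not in the message: the data are synthetic Gaussian samples, the kernel is RBF, nu is fixed at 0.1, and gamma is the hyperparameter being compared. It only illustrates comparing the test-set outlier fraction with nu and is not a definitive tuning procedure.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import OneClassSVM

# Synthetic stand-in data (assumption): 400 points from a 2-D Gaussian.
rng = np.random.RandomState(0)
X = rng.normal(size=(400, 2))

X_train, X_test = train_test_split(X, test_size=0.5, random_state=0)

nu = 0.1  # pre-specified nu (illustrative value)
results = {}
for gamma in [0.01, 0.1, 1.0]:  # hypothetical candidate values
    clf = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(X_train)
    # predict() returns -1 for points flagged as outliers.
    test_outlier_fraction = float(np.mean(clf.predict(X_test) == -1))
    results[gamma] = test_outlier_fraction
    print(f"gamma={gamma}: test outlier fraction={test_outlier_fraction:.3f}")

# Keep the candidate whose test-set outlier fraction is closest to nu.
best_gamma = min(results, key=lambda g: abs(results[g] - nu))
print("closest to nu:", best_gamma)
```

A candidate whose test-set outlier fraction drifts far from nu suggests the fitted boundary does not generalize beyond the training set.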

Re: [Scikit-learn-general] Update website "Comparison of LDA and PCA 2D projection of Iris dataset"

2014-09-06 Thread Gael Varoquaux
On Fri, Sep 05, 2014 at 10:12:14PM -0400, Sebastian Raschka wrote: > just saw that scikit-learn 15.2.0 is out and the LDA was fixed. That's great > :). I missed that when reviewing the patches that made it into 0.15.2. The change in LDA should _not_ have made it into a minor bugfix release. A chang

Re: [Scikit-learn-general] scikit-learn 0.15.2 is out!

2014-09-06 Thread Gael Varoquaux
On Sat, Sep 06, 2014 at 02:26:31AM +0200, Olivier Grisel wrote: > I just released 0.15.2. The source and binary packages for this > release are on PyPi as usual: > https://pypi.python.org/pypi/scikit-learn/0.15.2 Congratulations! And thank you in the name of our users. This provides plenty of va