Re: [Scikit-learn-general] Calculating standard deviation for k-fold cross validation estimate

2015-02-05 Thread Sebastian Raschka
Thanks for all your answers! Jason, I think you could be right, but the author wrote in the line above the code: "The mean score and the standard deviation of the score estimate are hence given by:" So I assume he literally meant the standard deviation, to show how the scores vary rather than

Re: [Scikit-learn-general] Calculating standard deviation for k-fold cross validation estimate

2015-02-05 Thread Jason Sanchez
This is a very common calculation; you will find it in all of these places (but only with one standard deviation): http://scikit-learn.org/stable/auto_examples/randomized_search.html

[Scikit-learn-general] fail to compile docs Could not import extension sphinx.ext.linkcode

2015-02-05 Thread Nancy Ouyang
make html fails with *Could not import extension sphinx.ext.linkcode (exception: No module named linkcode)*. How can I get the docs to compile? A quick Google search did not help. IRC had friendly people, but did not answer my question either. Maybe someone can help here? Steps: git clone
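One likely fix, assuming the installed Sphinx predates 1.2 (sphinx.ext.linkcode was added in Sphinx 1.2, so older installs raise "No module named linkcode"):

```shell
# Upgrade Sphinx so that sphinx.ext.linkcode is available,
# then rebuild the docs from the doc/ directory.
pip install --upgrade sphinx
make html
```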

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Lee Zamparo
With respect to Gaussian processes, there are some good packages in Python already (https://github.com/SheffieldML/GPy, https://github.com/dfm/george, probably others). In particular, GPy does not require any dependencies over and above those already required by sklearn. Maybe a reasonable

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Kyle Kastner
I think most of the GP-related work is deciding what the sklearn-compatible interface should be :) specifically how to handle kernels and share them with the core codebase. The HODLR solver from George could be very nice for scalability, but the algorithm is not easy. There are a few other options on that

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Gael Varoquaux
I have the same feeling. On Thu, Feb 05, 2015 at 03:56:12PM +, Thomas Johnson wrote: So I don't really have a 'deep' understanding of deep learning, but aren't things like Gaussian RBMs becoming obsolete? I thought I read that Hinton said that the current state-of-the-art is Really Big

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Thomas Johnson
So I don't really have a 'deep' understanding of deep learning, but aren't things like Gaussian RBMs becoming obsolete? I thought I read that Hinton said that the current state-of-the-art is Really Big networks that just use standard backprop (plus tricks like dropout). Is that not correct, or is

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Akshay Narasimha
Is online low-rank factorisation still a valid idea for this year? It was on last year's idea list. On Thu, Feb 5, 2015 at 9:49 PM, Alexandre Gramfort alexandre.gramf...@telecom-paristech.fr wrote: I just looked at the list from last year, and what seems most relevant still is GMMs,

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Joel Nothman
"I think adding partial_fit functions in general to as many algorithms as possible would be nice." That could be a project in itself, for someone open to breadth rather than depth. On 6 February 2015 at 06:43, Kyle Kastner kastnerk...@gmail.com wrote: IncrementalPCA is done (have to add
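For readers unfamiliar with the API being discussed: a minimal sketch of the partial_fit contract, using a toy running-mean estimator (a hypothetical class for illustration, not a scikit-learn one). Each call consumes one mini-batch and updates internal state, so the full dataset never needs to fit in memory:

```python
class RunningMean:
    """Toy estimator illustrating the partial_fit contract."""

    def __init__(self):
        self.n_ = 0        # samples seen so far
        self.mean_ = 0.0   # running estimate

    def partial_fit(self, batch):
        # Incremental (Welford-style) mean update, one sample at a time.
        for x in batch:
            self.n_ += 1
            self.mean_ += (x - self.mean_) / self.n_
        return self

est = RunningMean()
for batch in ([1.0, 2.0], [3.0], [4.0, 5.0]):  # a stream of mini-batches
    est.partial_fit(batch)
print(est.mean_)  # 3.0
```

A real scikit-learn estimator would take 2-D arrays and keep the same fitted-attribute naming (trailing underscore), but the batch-by-batch update shown here is the essence of what a partial_fit project would add to more algorithms.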

Re: [Scikit-learn-general] Calculating standard deviation for k-fold cross validation estimate

2015-02-05 Thread Kyle Kastner
Could it also be accounting for +/-? Standard deviation is one-sided, right? On Thu, Feb 5, 2015 at 4:54 PM, Joel Nothman joel.noth...@gmail.com wrote: With cv=5, only the training sets should overlap. Is this adjustment still appropriate? On 6 February 2015 at 06:44, Michael Eickenberg

Re: [Scikit-learn-general] Calculating standard deviation for k-fold cross validation estimate

2015-02-05 Thread Joel Nothman
With cv=5, only the training sets should overlap. Is this adjustment still appropriate? On 6 February 2015 at 06:44, Michael Eickenberg michael.eickenb...@gmail.com wrote: this is most probably due to the fact that 2 = sqrt(5 - 1), a correction to variance reduction incurred by the

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Alexandre Gramfort
I just looked at the list from last year, and what seems most relevant still is GMMs, and possibly the coordinate descent solvers (Alex maybe you can say what is left there or if with the SAG we are happy now?) There is work coming in coordinate descent, and SAG is almost done. I don't think

[Scikit-learn-general] Call for code nominations for Elegant SciPy!

2015-02-05 Thread Juan Nunez-Iglesias

Re: [Scikit-learn-general] Call for code nominations for Elegant SciPy!

2015-02-05 Thread Juan Nunez-Iglesias
Hmm, not sure why this didn't render: Hi all, Sorry for cross posting but we are trying to get as many great submissions as possible! I'll keep things short with Raniere Silva's summary: *Long version:* http://ilovesymposia.com/2015/02/04/call-for-code-nominations-for-elegant-scipy/ . *Short

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Andy
On 02/05/2015 01:03 PM, Daniel Sullivan wrote: "I'm still in the process of polishing up SAG, hopefully I can get something commit-able soon." Sure, no hurry. My question was more: do we want anything more that is not covered by your work on SAG? ;)

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Andy
Hi Christof. Good question. I don't think we came up with a list yet. I just looked at the list from last year, and what seems most relevant still is GMMs, and possibly the coordinate descent solvers (Alex maybe you can say what is left there or if with the SAG we are happy now?) There is still

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Daniel Sullivan
I'm still in the process of polishing up SAG, hopefully I can get something commit-able soon On Thu, Feb 5, 2015 at 12:52 PM, Andy t3k...@gmail.com wrote: Hi Christof. Good question. I don't think we came up with a list yet. I just looked at the list from last year, and what seems most

Re: [Scikit-learn-general] Call for code nominations for Elegant SciPy!

2015-02-05 Thread Adam Hughes
Congrats Stefan, btw. Not sure where you find the time, man. On Thu, Feb 5, 2015 at 8:28 PM, Juan Nunez-Iglesias jni.s...@gmail.com wrote: Hmm, not sure why this didn't render: Hi all, Sorry for cross posting but we are trying to get as many great submissions as possible! I'll keep things

[Scikit-learn-general] Calculating standard deviation for k-fold cross validation estimate

2015-02-05 Thread Sebastian Raschka
Hi, I am wondering why the standard deviation of the accuracy estimate is multiplied by 2 in the example on http://scikit-learn.org/stable/modules/cross_validation.html; it would be nice if someone could explain it to me. The relevant excerpt from the page linked above: clf =
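The reporting pattern being asked about can be reproduced with the standard library alone (the five fold scores below are made up for illustration; the docs use numpy's scores.mean() and scores.std(), which statistics.mean and statistics.pstdev mirror):

```python
import statistics

# Hypothetical accuracies from 5-fold cross-validation.
scores = [0.96, 0.92, 0.95, 0.93, 0.94]

mean = statistics.mean(scores)    # 0.94
std = statistics.pstdev(scores)   # population std, like numpy's default ddof=0

# The factor of 2 under discussion: the doc reports mean +/- 2 * std.
print("Accuracy: %0.2f (+/- %0.2f)" % (mean, 2 * std))  # Accuracy: 0.94 (+/- 0.03)
```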

Re: [Scikit-learn-general] Calculating standard deviation for k-fold cross validation estimate

2015-02-05 Thread Michael Eickenberg
This is most probably due to the fact that 2 = sqrt(5 - 1), a correction to the variance reduction incurred by the overlapping nature of the folds. The bootstrap book contains more info on how to calculate these for different cases of splitting. hth, michael On Thursday, February 5, 2015, Sebastian
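The arithmetic behind this conjecture, for the k = 5 folds in the doc example, is a one-liner (whether the doc's factor of 2 is actually this correction, rather than a plain two-standard-deviation interval, is what the thread is debating):

```python
import math

k = 5  # number of folds in the doc's cross_val_score example
# If the factor were the proposed sqrt(k - 1) correction, it would be:
print(math.sqrt(k - 1))  # 2.0
```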

Re: [Scikit-learn-general] GSoC2015 topics

2015-02-05 Thread Kyle Kastner
IncrementalPCA is done (I have to add a randomized SVD solver, but that should be simple), but I am sure there are other low-rank methods which need a partial_fit. I think adding partial_fit functions in general to as many algorithms as possible would be nice Kyle On Thu, Feb 5, 2015 at 2:12 PM,