Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Olivier Grisel
2013/11/7 Mathieu Blondel math...@mblondel.org: On Fri, Nov 8, 2013 at 12:28 AM, Vlad Niculae zephy...@gmail.com wrote: I feel like this would go against explicit is better than implicit, but without it grid search would indeed be awkward. Maybe: if self.alpha_coef == 'same':

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Olivier Grisel
About the L-BFGS-B residuals (non-)issue: I was probably confused by the overlapping curves on the plot and misinterpreted the location of the PG-l1 and PG-l2 curves. -- Olivier

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Vlad Niculae
Re: the discussion we had at PyCon.fr, I noticed that the internal elastic net coordinate descent functions are parametrized with `l1_reg` and `l2_reg`, but the exposed classes and functions have `alpha` and `l1_ratio`. Only yesterday there was somebody on IRC who couldn't match Ridge with
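A minimal sketch of how the two parameterizations mentioned above relate, assuming the ElasticNet penalty is written as `alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2`. The helper names `to_l1_l2` and `to_alpha_ratio` are hypothetical and not part of scikit-learn, and the internal functions may apply additional scaling (for example by the number of samples) that is not shown here.

```python
def to_l1_l2(alpha, l1_ratio):
    """Split (alpha, l1_ratio) into separate L1 and L2 penalty weights.

    Hypothetical helper: assumes the penalty
    alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2.
    """
    l1_reg = alpha * l1_ratio
    l2_reg = alpha * (1.0 - l1_ratio)
    return l1_reg, l2_reg


def to_alpha_ratio(l1_reg, l2_reg):
    """Inverse mapping: recover (alpha, l1_ratio) from the two weights."""
    alpha = l1_reg + l2_reg
    if alpha == 0:
        # No regularization at all: the ratio is undefined, so report 0.0.
        return 0.0, 0.0
    return alpha, l1_reg / alpha


if __name__ == "__main__":
    print(to_l1_l2(1.0, 0.5))
    print(to_alpha_ratio(0.3, 0.1))
```

Under this assumption a grid over `alpha` with a fixed `l1_ratio` scales both penalties together, whereas a grid over (`l1_reg`, `l2_reg`) varies them independently.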

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Mathieu Blondel
And lambda is a reserved keyword in Python ;-) On Fri, Nov 8, 2013 at 4:59 PM, Olivier Grisel olivier.gri...@ensta.org wrote: 2013/11/7 Mathieu Blondel math...@mblondel.org: On Fri, Nov 8, 2013 at 12:28 AM, Vlad Niculae zephy...@gmail.com wrote: I feel like this would go against

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Thomas Unterthiner
Just my $0.02 as a user: I was also confused/put off by `alpha` and `l1_ratio` when I first explored SGDClassifier; I found those names to be pretty inconsistent --- plus I tend to call my regularization parameters `lambda` and use `alpha` for learning rates. I'm sure other people associate

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Peter Prettenhofer
SGDClassifier adopted the parameter names of ElasticNet (which has been around in sklearn for longer) for consistency reasons. I agree that we should strive for concise and intuitive parameter names such as ``l1_ratio``. Naming in sklearn is actually quite unfortunate since the popular R package

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Olivier Grisel
We cannot use lambda as a parameter name because it is a reserved keyword of the Python language (for defining anonymous functions). This is why we used alpha instead of lambda for the ElasticNet / Lasso model initially, and then this notation was reused in more recently implemented estimators such as
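To illustrate the keyword constraint, here is a small sketch using only the standard library. The helper names `is_valid_parameter_name` and `compiles` are made up for this example; `fit` is just a placeholder function name in the compiled snippets.

```python
import keyword


def is_valid_parameter_name(name):
    """True if `name` is an identifier and not a reserved Python keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


def compiles(source):
    """True if `source` is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("lambda", "alpha", "l2_reg"):
        print(candidate, is_valid_parameter_name(candidate))
    # A signature using `lambda` as a keyword argument does not even parse.
    print(compiles("def fit(lambda=1.0): pass"))
    print(compiles("def fit(alpha=1.0): pass"))
```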

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Alexandre Gramfort
just a remark in LogisticRegression you can use L1 and L2 reg and there is a single param that is alpha. It's not trivial to have a consistent naming for regularization param. In SVC it is C as it's the common naming... but it corresponds to 1/l2_reg with what you suggest... Alex

Re: [Scikit-learn-general] Benchmarking non-negative least squares solvers, work in progress

2013-11-08 Thread Gael Varoquaux
On Fri, Nov 08, 2013 at 11:56:24AM +0100, Olivier Grisel wrote: In retrospect I would have preferred it named something explicit like regularization or l2_reg instead of alpha. Agreed. Still I like the (alpha, l1_ratio) parameterization better than the (l2_reg, l1_reg) parameter set

Re: [Scikit-learn-general] Random forest with zero features

2013-11-08 Thread Michal Romaniuk
Did anyone work on this problem (exceptions raised by classifiers in grid search) since? I would be happy to do some work to fix this problem, but would need some advice. It seems to me like the easiest way around the issue is to wrap the call to clf.fit() in a try statement and catch the
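A rough sketch of the try/except idea described above, written without any scikit-learn dependency. The function `fit_and_score`, its `failure_score` argument, and the two toy estimator classes are hypothetical names introduced for illustration; which exception types to catch is an assumption and would need to be decided for the real grid search code.

```python
def fit_and_score(estimator, X, y, failure_score=float("-inf")):
    """Fit and score one candidate; return failure_score if fitting raises.

    Returns a (score, error) pair so the caller can log what went wrong.
    """
    try:
        estimator.fit(X, y)
    except ValueError as exc:
        # Record the failure instead of aborting the whole search.
        return failure_score, exc
    return estimator.score(X, y), None


class AlwaysFails:
    """Toy estimator whose fit always raises, standing in for a bad setting."""

    def fit(self, X, y):
        raise ValueError("no usable features")

    def score(self, X, y):
        return 0.0


class ConstantScore:
    """Toy estimator that fits trivially and returns a fixed score."""

    def fit(self, X, y):
        return self

    def score(self, X, y):
        return 0.5


if __name__ == "__main__":
    X, y = [[0.0], [1.0]], [0, 1]
    for candidate in (AlwaysFails(), ConstantScore()):
        print(type(candidate).__name__, fit_and_score(candidate, X, y))
```

Returning a sentinel score keeps failing parameter settings in the results while ensuring they are never selected as the best candidate.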

[Scikit-learn-general] PR #2391, Implemented Determinant ECOC

2013-11-08 Thread Karol Pysniak
Hi All, I added some new ECOC code some time ago. Would it be possible to have some review and feedback? Also, would you recommend any datasets that could be used for verification? I am especially concerned about what type and size of datasets I should use. I would appreciate any help and

[Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Karol Pysniak
Hi All, Has there been any discussion on adding some automated benchmarks for both speed and accuracy of the algorithms we have? I think it would be very interesting if such a script could be automatically executed after every commit so that we could follow the performance of scikit-learn or, at

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Skipper Seabold
On Fri, Nov 8, 2013 at 6:30 PM, Karol Pysniak kpysn...@gmail.com wrote: Hi All, Has there been any discussion on adding some automated benchmarks for both speed and accuracy of the algorithms we have? I think it would be very interesting if such a script could be automatically executed after

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Vlad Niculae
We have an instance of vbench continuously running [1] that I did as a GSoC project last year. For some reason it seems that the links don't generate properly now, but it still works (though all data got lost in a Jenkins setup incident this summer). Here are some linear model benchmarks for

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Karol Pysniak
Looks good, but I was more interested in whether we want to have a single script or set of scripts that would produce a single number that could be used to compare changes. What do you think? Thanks, Karol 2013/11/8 Skipper Seabold jsseab...@gmail.com On Fri, Nov 8, 2013 at 6:30 PM, Karol Pysniak

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Vlad Niculae
It should be written in such a way so that you can add more benchmarks with a PR to that repo (the master branch) and it should just work. Many parts of the framework are still hackish though. Yours, Vlad On Fri, Nov 8, 2013 at 7:53 PM, Karol Pysniak kpysn...@gmail.com wrote: Awesome, thanks

Re: [Scikit-learn-general] Automated benchmarking

2013-11-08 Thread Karol Pysniak
Yes, it would be perfect. Do you have any work planned or scheduled for this problem? Let me know if there is something I could help with. Thanks, Karol 2013/11/8 Vlad Niculae zephy...@gmail.com It should be written in such a way so that you can add more benchmarks with a PR to that repo (the