Re: [Scikit-learn-general] metrics.adjusted_mutual_info_score method and metrics.silhouette_score don't exist.

2011-12-02 Thread Gael Varoquaux
On Thu, Dec 01, 2011 at 03:48:34PM -0700, María Helena Mejía Salazar wrote: > I am running plot_dbscan.py example. Lines 45-48 have problems. > metrics.adjusted_mutual_info_score method and metrics.silhouette_score > don't exist. Chances are that the examples work for the scikit-learn versio
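A quick way to confirm this kind of version mismatch is to ask the installed module which names it actually exposes. The sketch below is only illustrative: `missing_attributes` is a hypothetical helper (not part of scikit-learn), it assumes the package is importable as `sklearn.metrics`, and it relies only on `importlib` from the standard library.

```python
import importlib


def missing_attributes(module_name, names):
    """Return the names from `names` that the installed module does not expose."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module not installed at all: every requested name is missing.
        return list(names)
    return [name for name in names if not hasattr(module, name)]


# The two functions the example expects; an empty list means both are available.
print(missing_attributes(
    "sklearn.metrics",
    ["adjusted_mutual_info_score", "silhouette_score"],
))
```

If either name is reported as missing, the installed release is probably older than the one the example was written for.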

Re: [Scikit-learn-general] Hyperparameter optimization

2011-12-02 Thread Gael Varoquaux
On Sat, Nov 19, 2011 at 09:15:43PM -0500, James Bergstra wrote: > 2. Gaussian process w. Expected Improvement global optimization. > This is an established technique for global optimization that has > about the right scaling properties to be good for hyper-parameter > optimization. Without knowin
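For readers unfamiliar with the Expected Improvement criterion mentioned above, here is a minimal sketch of the standard closed form for a Gaussian predictive distribution, written for minimization. It is not code from the thread: the function name, the candidate values, and the choice of minimization are all assumptions made for illustration.

```python
import math


def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` for a minimization problem.

    `mu` and `sigma` are the model's predictive mean and standard deviation
    at a candidate point; `best` is the lowest objective value seen so far.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


# Hypothetical candidates as (predictive mean, predictive std) pairs.
candidates = [(0.9, 0.05), (1.1, 0.50), (1.0, 0.00)]
scores = [expected_improvement(m, s, best=1.0) for m, s in candidates]
print(scores)
```

The next hyper-parameter setting to evaluate would be the candidate with the largest score, which trades off a low predicted mean against high predictive uncertainty.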

Re: [Scikit-learn-general] Builtbot out of service?

2011-12-02 Thread Gael Varoquaux
On Thu, Nov 24, 2011 at 09:15:08PM +0100, Nelle Varoquaux wrote: > The buildbot is back online! Thanks Nelle. It's really great to now have two people on the project who can handle issues on the Afpy server. The bus factor of scikit-learn@bbafpy is going up! G

Re: [Scikit-learn-general] A new jenkins integration server is online for scikit-learn

2011-12-02 Thread Gael Varoquaux
On Mon, Nov 28, 2011 at 11:28:51AM +0100, Olivier Grisel wrote: > > btw many are in joblib. I guess it makes no sense to fix them here? > I think joblib should be fixed upstream (if Gael wants to use pep8 as > a coding style convention for this project). I can grep them out of > this report if th

[Scikit-learn-general] motivation for the lib, why re-implement existing stuff

2011-12-02 Thread Denis Kochedykov
Hi all, I'm looking for an ML library for Python for our research team. I found a quite comprehensive one - Orange - and a relatively new one - scikits.learn. Orange definitely looks good given the number of methods implemented in it, its maturity, and its GUI as a bonus. But I'm a bit confused - if

Re: [Scikit-learn-general] predicted variance API [was: Issue with gaussian processes]

2011-12-02 Thread Alexandre Gramfort
> As a nitpick I'd say compute_variance instead of return_variance > because the mean is still returned. fair enough :) Alex

Re: [Scikit-learn-general] predicted variance API [was: Issue with gaussian processes]

2011-12-02 Thread Alexandre Passos
On Fri, Dec 2, 2011 at 16:15, Alexandre Gramfort wrote: >> On the name though --- "eval_MSE" is a nonstandard term for "variance" >> no? MSE usually refers to a loss criterion, for comparing predictions >> with targets. > > return_variance As a nitpick I'd say compute_variance instead of return_v

Re: [Scikit-learn-general] predicted variance API [was: Issue with gaussian processes]

2011-12-02 Thread Alexandre Gramfort
> On the name though --- "eval_MSE" is a nonstandard term for "variance" > no? MSE usually refers to a loss criterion, for comparing predictions > with targets. return_variance would work for me instead of eval_MSE (which should have been eval_mse anyway) Alex

Re: [Scikit-learn-general] predicted variance API [was: Issue with gaussian processes]

2011-12-02 Thread Vincent Dubourg
> Cool, good to know! > > On the name though --- "eval_MSE" is a nonstandard term for "variance" > no? MSE usually refers to a loss criterion, for comparing predictions > with targets. > MSE stands for "mean squared error". The GP predictor indeed ensures minimum prediction variance (aka the mean

[Scikit-learn-general] Demo DBSCAN

2011-12-02 Thread María Helena Mejía Salazar
Hi, I slightly modified the DBSCAN demo program (plot_dbscan.py). I am using just distances (no similarities) and I am getting bad results. There are just 5 points; I set eps to the minimum distance between the points, and the minimum number of points is 2 since this is what I req
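To make the roles of eps and the minimum number of points concrete, here is a minimal pure-Python sketch of DBSCAN run on a precomputed distance matrix. It is not the code from plot_dbscan.py: the function `dbscan_precomputed`, the five 1-D points, and the convention that a point counts as its own neighbor are all assumptions made for illustration.

```python
def dbscan_precomputed(dist, eps, min_samples):
    """Label points from a full pairwise distance matrix; -1 means noise."""
    n = len(dist)
    # Neighborhoods include the point itself (its distance 0 is <= eps).
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    is_core = [len(nb) >= min_samples for nb in neighbors]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster from this unlabeled core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not is_core[p]:
                continue  # border points join a cluster but do not expand it
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels


points = [0.0, 1.0, 2.0, 10.0, 11.0]  # hypothetical 1-D data, five points
dist = [[abs(a - b) for b in points] for a in points]
labels = dbscan_precomputed(dist, eps=1.0, min_samples=2)
print(labels)
```

Under these assumptions, an eps smaller than the minimum pairwise distance leaves every point with only itself as a neighbor, so nothing reaches the core-point threshold and every point is labeled as noise.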

Re: [Scikit-learn-general] predicted variance API [was: Issue with gaussian processes]

2011-12-02 Thread James Bergstra
On Fri, Dec 2, 2011 at 1:00 PM, Vincent Dubourg wrote: > On 02/12/2011 18:19, Alexandre Passos wrote: >> On Fri, Dec 2, 2011 at 12:02, James Bergstra wrote: >>> On Tue, Nov 29, 2011 at 5:24 PM, Olivier Grisel wrote: >>>> That makes sense. Fortunately we don't have an API to compute the

Re: [Scikit-learn-general] predicted variance API [was: Issue with gaussian processes]

2011-12-02 Thread Vincent Dubourg
On 02/12/2011 18:19, Alexandre Passos wrote: > On Fri, Dec 2, 2011 at 12:02, James Bergstra wrote: >> On Tue, Nov 29, 2011 at 5:24 PM, Olivier Grisel >> wrote: >>> That makes sense. Fortunately we don't have an API to compute the >>> expected variance of a prediction :) > So what does the eval_M

Re: [Scikit-learn-general] pruning trees

2011-12-02 Thread Peter Prettenhofer
2011/12/2 James Bergstra: > I'm looking at the decision tree code and I'm not seeing any pruning > logic, or other logic to prevent over-fitting (other than requiring > that leaf nodes be sufficiently populated). Decision trees are not my > specialty, but pruning / early stopping seem often to be

Re: [Scikit-learn-general] predicted variance API [was: Issue with gaussian processes]

2011-12-02 Thread Alexandre Passos
On Fri, Dec 2, 2011 at 12:02, James Bergstra wrote: > On Tue, Nov 29, 2011 at 5:24 PM, Olivier Grisel wrote: >> That makes sense. Fortunately we don't have an API to compute the >> expected variance of a prediction :) So what does the eval_MSE option do? -- Alexandre

[Scikit-learn-general] pruning trees

2011-12-02 Thread James Bergstra
I'm looking at the decision tree code and I'm not seeing any pruning logic, or other logic to prevent over-fitting (other than requiring that leaf nodes be sufficiently populated). Decision trees are not my specialty, but pruning / early stopping seem often to be mentioned in connection with trees

[Scikit-learn-general] predicted variance API [was: Issue with gaussian processes]

2011-12-02 Thread James Bergstra
On Tue, Nov 29, 2011 at 5:24 PM, Olivier Grisel wrote: > That makes sense. Fortunately we don't have an API to compute the > expected variance of a prediction :) Slightly off-topic, but this is exactly what's necessary to use existing regression algorithms for Bayesian optimization, even internal

Re: [Scikit-learn-general] Memory consumption of LinearSVC.fit

2011-12-02 Thread Olivier Grisel
2011/12/2 Ian Goodfellow: > On Fri, Oct 7, 2011 at 5:14 AM, Olivier Grisel wrote: >> 2011/10/7 Ian Goodfellow: >>> Thanks. Yes, it does appear that liblinear uses only a 64-bit dense format, >>> so this memory usage is normal/caused by the implementation of liblinear. >>> >>> You may want to u