Re: [Scikit-learn-general] OvR, Logistic Regression and SGD

2012-11-05 Thread Mathieu Blondel
On Tue, Nov 6, 2012 at 9:33 AM, Abhi wrote: > Hello, > I have been reading and testing examples around the sklearn > documentation and > am not too clear on a few things and would appreciate any help regarding the > following questions: > 1) What would be the advantage of training LogisticRegr

Re: [Scikit-learn-general] OvR, Logistic Regression and SGD

2012-11-05 Thread Gael Varoquaux
On Tue, Nov 06, 2012 at 12:33:06AM +, Abhi wrote: > 1) What would be the advantage of training LogisticRegression vs > OneVsRestClassifier(LogisticRegression()) for multiclass. (I understand > the latter would basically train n_classes classifiers). Different decision boundaries. Depends on y

Re: [Scikit-learn-general] preprocessing.scaler uses population standard deviation

2012-11-05 Thread Gael Varoquaux
I am actually -1 on this, because the consequence would be that np.std(X, axis=-1) would no longer be one. I am afraid that it would confuse the users. I believe that the n/(n - 1) difference is completely irrelevant for machine learning purposes. If a quantity is relevant, it is the norm of the fe
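To put the n/(n - 1) difference in perspective, here is a minimal sketch in plain Python. It relies on the fact that the ratio between the sample standard deviation (divide by n - 1) and the population standard deviation (divide by n) is sqrt(n / (n - 1)) regardless of the data. The helper name `std_ratio` is hypothetical and only used for illustration.

```python
import math


def std_ratio(n):
    """Ratio of sample std (divide by n - 1) to population std (divide by n)."""
    if n < 2:
        raise ValueError("need at least two samples")
    return math.sqrt(n / (n - 1))


# The ratio approaches 1 as the number of samples grows.
for n in (10, 100, 10_000):
    print(n, std_ratio(n))
```

For the sample counts typical of a training set, the two conventions rescale a feature by nearly the same constant.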

[Scikit-learn-general] OvR, Logistic Regression and SGD

2012-11-05 Thread Abhi
Hello, I have been reading and testing examples around the sklearn documentation and am not too clear on a few things and would appreciate any help regarding the following questions: 1) What would be the advantage of training LogisticRegression vs OneVsRestClassifier(LogisticRegression()) for mu

Re: [Scikit-learn-general] preprocessing.scaler uses population standard deviation

2012-11-05 Thread Lars Buitinck
2012/11/5 Doug Coleman : > It seems this is rarely the case in machine learning, so perhaps it would be > better to scale using the sample standard deviation, which numpy already > supports, or to make it a flag. +1 Since we renamed Scaler since the last release (?), we can make population stdev

Re: [Scikit-learn-general] Current HEAD test failure

2012-11-05 Thread Lars Buitinck
2012/11/5 Stéfan van der Walt : > I noticed on two different machines that scikit-learn "make" no longer > completes due to the following test failure: [snip] > File > "/home/stefan/akad/postdoc/ext/scikit-learn/sklearn/svm/tests/test_sparse.py", > line 71, in > kfunc = lambda x, y: np.do

[Scikit-learn-general] preprocessing.scaler uses population standard deviation

2012-11-05 Thread Doug Coleman
preprocessing.scaler calls numpy's default standard deviation, which is the population standard deviation (delta-degrees-of-freedom is 0). This is usually reserved for when you have the entire set of data. It seems this is rarely the case in machine learning, so perhaps it would be better to scale
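The two conventions under discussion can be sketched with NumPy's `ddof` argument to `np.std`. This is a sketch only: the array `x` is made-up toy data, and the manual scaling below stands in for what a scaler does rather than reproducing scikit-learn's implementation.

```python
import numpy as np

# Hypothetical toy data: one feature, five samples.
x = np.array([2.0, 4.0, 4.0, 5.0, 10.0])

pop_std = np.std(x)             # ddof=0, NumPy's default: divide by n
sample_std = np.std(x, ddof=1)  # ddof=1: divide by n - 1

# Center the feature, then scale it with each convention.
centered = x - x.mean()
scaled_pop = centered / pop_std
scaled_sample = centered / sample_std

# Only the population-scaled feature has np.std(...) equal to one
# under NumPy's default settings.
print(np.std(scaled_pop), np.std(scaled_sample))
```

Scaling with `ddof=1` leaves the default `np.std` of the result slightly below one, which is the user-visible consequence raised elsewhere in this thread.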

[Scikit-learn-general] Current HEAD test failure

2012-11-05 Thread Stéfan van der Walt
Hi all, I noticed on two different machines that scikit-learn "make" no longer completes due to the following test failure: == ERROR: sklearn.svm.tests.test_sparse.test_svc_with_custom_kernel -

Re: [Scikit-learn-general] List of Parameters and Attributes for Scikit-learn estimators

2012-11-05 Thread Gael Varoquaux
On Mon, Nov 05, 2012 at 06:02:51PM +0100, Jaques Grobler wrote: > I've been trying to figure out a way to effectively get the param/attrib > descriptions so as to add them to the list OK, don't lose time on this, it seems like it is hard. I think that it is time to cut our losses on that. G -

Re: [Scikit-learn-general] List of Parameters and Attributes for Scikit-learn estimators

2012-11-05 Thread Jaques Grobler
I've been trying to figure out a way to effectively get the param/attrib descriptions so as to add them to the list but the solution is evading me. I've uploaded a more recent version of the script. *slightly* more tidy :) It'd be nice if one could just see if there's a badly- or un-documented attri