On 1 April 2012 at 16:38, Andreas <[email protected]> wrote:
> On 04/01/2012 04:34 PM, Gael Varoquaux wrote:
>> On Sun, Apr 01, 2012 at 04:23:36PM +0200, Andreas wrote:
>>
>>> @Alex, could you maybe give the setting again where you had
>>> issues doing grid search without scale_C?
>>>
>> Afaik, it was with an l1-penalized logistic. In my experience,
>> l2-penalized models are less sensitive to the choice of the penalty
>> parameter, and the hinge loss (aka SVM) is less sensitive than the
>> l2 or logistic loss.
>>
> I also tried L1-penalized logistic regression models.
> It doesn't seem to make much difference on digits, usps
> or madelon.
> It would be good if you could provide a data set where
> changing the fraction of training data by a factor of, say,
> 100, leads to a similar model and that is sensitive
> to C.
>
> I probably looked at the wrong kind of data sets to see
> the effect.
I think you need a dataset with n_features >> n_samples and many noisy
features, maybe generated with make_classification using
n_informative == 0.1 * n_features, for instance:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/samples_generator.py#L17

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
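A minimal sketch of the experiment being discussed: build a wide,
mostly-noisy dataset with make_classification, then grid-search C for an
l1-penalized logistic regression at two training set sizes and compare the
selected values. This is hypothetical illustration, not code from the
thread; it uses present-day scikit-learn module paths
(sklearn.model_selection, the solver parameter), which postdate the 2012
sklearn.grid_search API, and a size factor of 10 rather than the 100
suggested above, to keep the run cheap.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

n_features = 1000
X, y = make_classification(
    n_samples=600,                         # n_features >> n_samples
    n_features=n_features,
    n_informative=int(0.1 * n_features),   # 10% informative features
    n_redundant=0,                         # the remaining 90% are pure noise
    random_state=0,
)

# Grid-search C at two training set sizes (a factor of 10 apart; the
# thread suggests up to 100). If the optimal C depends on the number of
# training samples, the selected values should differ noticeably.
for n_train in (50, 500):
    search = GridSearchCV(
        LogisticRegression(penalty="l1", solver="liblinear"),
        param_grid={"C": np.logspace(-3, 3, 13)},
        cv=3,
    )
    search.fit(X[:n_train], y[:n_train])
    print(n_train, search.best_params_)

If the best C shifts roughly in proportion to n_train, that is the
size-sensitivity that scaling C by the number of samples (the scale_C
behavior under discussion) is meant to absorb.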
