On Tue, Dec 6, 2011 at 4:25 PM, Alexandre Gramfort <[email protected]> wrote:
> regarding the scaling by n_samples in estimators: I am convinced it is the
> right thing to do; cf. my current PR to do this also on SVM models.

I am not going to get involved in the discussion of whether to normalize the
coefficient or not, but either way the objective function should be clearly
documented.
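To make the scaling concrete, here is a minimal sketch of the convention as I
understand it (assuming the estimator minimizes
||y - Xw||^2 / (2 * n_samples) + alpha * ||w||_1, which matches Olivier's
explanation below but is, as noted, not yet in the docstring): an alpha chosen
for the unnormalized objective 0.5 * ||y - Xw||^2 + lam * ||w||_1 has to be
divided by n_samples before being passed in.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
X = rng.normal(size=(50, 20))
y = rng.normal(size=50)
n_samples = X.shape[0]

# Penalty intended for the unnormalized objective
#     0.5 * ||y - Xw||^2 + lam * ||w||_1
lam = 0.1

# If Lasso minimizes ||y - Xw||^2 / (2 * n_samples) + alpha * ||w||_1,
# the two problems coincide when alpha = lam / n_samples:
lasso = Lasso(alpha=lam / n_samples, fit_intercept=False,
              max_iter=1000000, tol=1e-10)
lasso.fit(X, y)
w = lasso.coef_

# Sanity check: at an optimum, |X.T (y - Xw)| <= lam componentwise,
# with near-equality on the nonzero coefficients.
corr = np.dot(X.T, y - np.dot(X, w))
print(np.abs(corr).max() / lam)  # should be <= 1 (up to the tolerance)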
> regarding the convergence problem and potential error, can you put a gist
> on github to make the problem more easily reproducible.
>
> Alex

Problem #1 seems to depend on the dataset and/or the weights. Is there a good
way to transfer some .npy files to you?

Problem #2 is part of the definition of the interface itself; I can't really
write code to demonstrate that the interface is missing a feature. (A possible
workaround is sketched below.)
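One possible workaround in the meantime, assuming the 'cd' path of
sparse_encode amounts to one Lasso fit per sample with the transposed
dictionary as the design matrix (my reading of dict_learning.py, not a
documented guarantee): bypass sparse_encode and call Lasso directly, which
does expose max_iter and tol. A hypothetical helper along those lines:

import numpy as np
from sklearn.linear_model import Lasso

def sparse_encode_cd(X, dictionary, alpha, max_iter=10000, tol=1e-8):
    # X          : (n_samples, n_features) signals to encode
    # dictionary : (n_components, n_features) atoms, one per row
    # returns    : (n_samples, n_components) sparse codes
    #
    # Each code is the solution of a Lasso problem whose design matrix
    # is dictionary.T; note that alpha is then implicitly rescaled by
    # the number of rows of that design matrix (n_features), following
    # the same convention discussed above.
    codes = np.empty((X.shape[0], dictionary.shape[0]))
    for i in range(X.shape[0]):
        lasso = Lasso(alpha=alpha, fit_intercept=False,
                      max_iter=max_iter, tol=tol)
        lasso.fit(dictionary.T, X[i])
        codes[i] = lasso.coef_
    return codes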
> On Tue, Dec 6, 2011 at 9:17 PM, Ian Goodfellow <[email protected]> wrote:
>> ok, decreasing alpha by a factor of n_samples (5000 in my case) makes
>> sparse_encode behave much more reasonably.
>>
>> However, I still have two bugs to report:
>>
>> 1. The default algorithm raises this error:
>>
>> Traceback (most recent call last):
>>   File "s3c_sparsity_scale_plot.py", line 86, in <module>
>>     HS = sparse_encode( model.W.get_value(), X.T, alpha = 1./5000.).T
>>   File "/u/goodfeli/python_modules/lib/python2.7/site-packages/scikit_learn-0.9-py2.7-linux-x86_64.egg/sklearn/decomposition/dict_learning.py", line 117, in sparse_encode
>>     method='lasso')
>>   File "/u/goodfeli/python_modules/lib/python2.7/site-packages/scikit_learn-0.9-py2.7-linux-x86_64.egg/sklearn/linear_model/least_angle.py", line 249, in lars_path
>>     arrayfuncs.cholesky_delete(L[:n_active, :n_active], idx)
>>   File "arrayfuncs.pyx", line 104, in sklearn.utils.arrayfuncs.cholesky_delete (sklearn/utils/arrayfuncs.c:1516)
>> TypeError: only length-1 arrays can be converted to Python scalars
>>
>> 2. The lasso_cd algorithm tells me I am not using enough iterations,
>> but as far as I can tell the sparse_encode interface does not expose
>> any way for me to increase the number of iterations that cd uses:
>>
>> /u/goodfeli/python_modules/lib/python2.7/site-packages/scikit_learn-0.9-py2.7-linux-x86_64.egg/sklearn/linear_model/coordinate_descent.py:173:
>> UserWarning: Objective did not converge, you might want to increase
>> the number of iterations
>>   warnings.warn('Objective did not converge, you might want'
>>
>> On Tue, Dec 6, 2011 at 2:43 PM, Olivier Grisel <[email protected]> wrote:
>>> 2011/12/6 David Warde-Farley <[email protected]>:
>>>> On Tue, Dec 06, 2011 at 09:04:22AM +0100, Alexandre Gramfort wrote:
>>>>> > This actually gets at something I've been meaning to fiddle with and
>>>>> > report but haven't had time: I'm not sure I completely trust the
>>>>> > coordinate descent implementation in scikit-learn, because it seems to
>>>>> > give me bogus answers a lot (i.e., the optimality conditions necessary
>>>>> > for it to be an actual solution are not even approximately satisfied).
>>>>> > Are you guys using something weird for the termination condition?
>>>>>
>>>>> can you give us a sample X and y that shows the problem?
>>>>>
>>>>> it should ultimately use the duality gap to stop the iterations, but
>>>>> there might be a corner case …
>>>>
>>>> In [34]: rng = np.random.RandomState(0)
>>>>
>>>> In [35]: dictionary = rng.normal(size=(100, 500)) / 1000; dictionary /= np.sqrt((dictionary ** 2).sum(axis=0))
>>>>
>>>> In [36]: signal = rng.normal(size=100) / 1000
>>>>
>>>> In [37]: from sklearn.linear_model import Lasso
>>>>
>>>> In [38]: lasso = Lasso(alpha=0.0001, max_iter=1e6, fit_intercept=False, tol=1e-8)
>>>>
>>>> In [39]: lasso.fit(dictionary, signal)
>>>> Out[39]:
>>>> Lasso(alpha=0.0001, copy_X=True, fit_intercept=False, max_iter=1000000.0,
>>>>    normalize=False, precompute='auto', tol=1e-08)
>>>>
>>>> In [40]: max(abs(lasso.coef_))
>>>> Out[40]: 0.0
>>>>
>>>> In [41]: from pylearn2.optimization.feature_sign import feature_sign_search
>>>>
>>>> In [42]: coef = feature_sign_search(dictionary, signal, 0.0001)
>>>>
>>>> In [43]: max(abs(coef))
>>>> Out[43]: 0.0027295761244725018
>>>>
>>>> And I'm pretty sure the latter result is the right one, since
>>>>
>>>> In [45]: def gradient(coefs):
>>>>    ....:     gram = np.dot(dictionary.T, dictionary)
>>>>    ....:     corr = np.dot(dictionary.T, signal)
>>>>    ....:     return -2 * corr + 2 * np.dot(gram, coefs) + 0.0001 * np.sign(coefs)
>>>>    ....:
>>>
>>> Actually, alpha in scikit-learn is multiplied by n_samples. I agree
>>> this is misleading and not documented in the docstring.
>>>
>>> >>> lasso = Lasso(alpha=0.0001 / dictionary.shape[0], max_iter=1e6,
>>> ...               fit_intercept=False, tol=1e-8).fit(dictionary, signal)
>>> >>> max(abs(lasso.coef_))
>>> 0.0027627270397484554
>>> >>> max(abs(gradient(lasso.coef_)))
>>> 0.00019687294269977963
>>>
>>> --
>>> Olivier
>>> http://twitter.com/ogrisel - http://github.com/ogrisel
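As a follow-up on David's gradient check, a slightly more complete test of
whether a returned coef really is a Lasso solution is to verify the
subgradient (KKT) conditions. A sketch, written against the objective
||y - Xw||^2 + lam * ||w||_1 that David's gradient function corresponds to:

import numpy as np

def check_lasso_kkt(X, y, coef, lam, rtol=1e-4):
    # KKT conditions for  min_w ||y - Xw||^2 + lam * ||w||_1:
    # with g = 2 * X.T.dot(y - X.dot(coef)), an optimum must satisfy
    #   |g_j| <= lam                for every j, and
    #   g_j == lam * sign(coef_j)   wherever coef_j != 0.
    g = 2 * np.dot(X.T, y - np.dot(X, coef))
    active = coef != 0
    inactive_ok = np.all(np.abs(g[~active]) <= lam * (1 + rtol))
    active_ok = np.allclose(g[active], lam * np.sign(coef[active]),
                            atol=rtol * lam)
    return inactive_ok and active_ok

One caveat on the numbers above: max(abs(gradient(lasso.coef_))) ~= 2e-4 is
exactly what you would see if the coordinate-descent solution were optimal for
a penalty of 2 * 0.0001 in this parametrization (equivalently, for
0.5 * ||y - Xw||^2 + 0.0001 * ||w||_1). So the small difference between
0.00276 and 0.00273 may just be a factor-of-two convention in the penalty
between the two solvers; it would be worth double-checking which objective
feature_sign_search minimizes.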
