On Tue, Dec 6, 2011 at 4:25 PM, Alexandre Gramfort
<[email protected]> wrote:
> regarding the scaling by n_samples in estimators: I am convinced it is the
> right thing to do; cf. my current PR to do this also on the SVM models.

I am not going to get involved in the discussion of whether or not to
normalize the coefficient, but in any case the objective function should
be clearly documented.
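
For reference, here is my reading of the two conventions in play, written
out so there is something concrete to put in a docstring (inferred from
this thread rather than from the code, so treat it as a sketch):

    import numpy as np

    def lasso_objective_textbook(X, y, w, alpha):
        # (1/2) * ||y - Xw||^2_2 + alpha * ||w||_1
        r = y - np.dot(X, w)
        return 0.5 * np.dot(r, r) + alpha * np.abs(w).sum()

    def lasso_objective_sklearn(X, y, w, alpha):
        # what Lasso appears to minimize: the same objective with alpha
        # multiplied by n_samples, i.e.
        # (1/2) * ||y - Xw||^2_2 + alpha * n_samples * ||w||_1
        return lasso_objective_textbook(X, y, w, alpha * X.shape[0])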

>
> regarding the convergence problem and the potential error, can you put a
> gist on GitHub to make the problem more easily reproducible?

Problem #1 seems to depend on the dataset and/or the weights. Is there a
good way to transfer some .npy files to you?
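
If it is easier, I could bundle everything into a single archive, e.g.
(file and variable names below are placeholders, not my actual script):

    import numpy as np

    # sender side: pack the offending arrays into one file
    D = np.load('dictionary.npy')
    X = np.load('signals.npy')
    np.savez('cd_repro.npz', dictionary=D, signals=X)

    # receiver side: unpack them again
    data = np.load('cd_repro.npz')
    D, X = data['dictionary'], data['signals']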

Problem #2 is just part of the definition of the interface; I can't
really write code to demonstrate that the interface is missing a
feature.
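
As a stopgap for problem #2, what I would try is to bypass sparse_encode
and run the coordinate-descent Lasso directly, since Lasso itself does
expose max_iter and tol (a sketch only; sparse_encode_cd is my own helper,
not scikit-learn API, and alpha here follows the n_samples-scaled
convention discussed above):

    import numpy as np
    from sklearn.linear_model import Lasso

    def sparse_encode_cd(D, X, alpha, max_iter=100000, tol=1e-8):
        # D: (n_features, n_atoms) dictionary, X: (n_features, n_signals).
        # Encode each signal column against D with an explicit iteration cap.
        codes = np.empty((D.shape[1], X.shape[1]))
        for i in range(X.shape[1]):
            lasso = Lasso(alpha=alpha, max_iter=max_iter, tol=tol,
                          fit_intercept=False)
            lasso.fit(D, X[:, i])
            codes[:, i] = lasso.coef_
        return codes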

>
> Alex
>
> On Tue, Dec 6, 2011 at 9:17 PM, Ian Goodfellow <[email protected]> 
> wrote:
>> ok, decreasing alpha by a factor of n_samples (5000 in my case) makes
>> sparse_encode behave much more reasonably.
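>>
>> Concretely, the rescaled call (model.W and X come from my own script):
>>
>>     HS = sparse_encode(model.W.get_value(), X.T, alpha=1. / 5000.).T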
>>
>> However I still have two bugs to report:
>>
>> 1. The default algorithm raises this error:
>>
>> Traceback (most recent call last):
>>  File "s3c_sparsity_scale_plot.py", line 86, in <module>
>>    HS = sparse_encode( model.W.get_value(), X.T, alpha = 1./5000.).T
>>  File "/u/goodfeli/python_modules/lib/python2.7/site-packages/scikit_learn-0.9-py2.7-linux-x86_64.egg/sklearn/decomposition/dict_learning.py", line 117, in sparse_encode
>>    method='lasso')
>>  File "/u/goodfeli/python_modules/lib/python2.7/site-packages/scikit_learn-0.9-py2.7-linux-x86_64.egg/sklearn/linear_model/least_angle.py", line 249, in lars_path
>>    arrayfuncs.cholesky_delete(L[:n_active, :n_active], idx)
>>  File "arrayfuncs.pyx", line 104, in
>> sklearn.utils.arrayfuncs.cholesky_delete
>> (sklearn/utils/arrayfuncs.c:1516)
>> TypeError: only length-1 arrays can be converted to Python scalars
>>
>>
>> 2. The lasso_cd algorithm warns me that I am not using enough
>> iterations, but as far as I can tell the sparse_encode interface does
>> not expose any way for me to increase the number of iterations that
>> coordinate descent uses.
>>
>> /u/goodfeli/python_modules/lib/python2.7/site-packages/scikit_learn-0.9-py2.7-linux-x86_64.egg/sklearn/linear_model/coordinate_descent.py:173: UserWarning: Objective did not converge, you might want to increase the number of iterations
>>  warnings.warn('Objective did not converge, you might want'
>>
>> On Tue, Dec 6, 2011 at 2:43 PM, Olivier Grisel <[email protected]> 
>> wrote:
>>> 2011/12/6 David Warde-Farley <[email protected]>:
>>>> On Tue, Dec 06, 2011 at 09:04:22AM +0100, Alexandre Gramfort wrote:
>>>>> > This actually gets at something I've been meaning to fiddle with and 
>>>>> > report but haven't had time: I'm not sure I completely trust the 
>>>>> > coordinate descent implementation in scikit-learn, because it seems to 
>>>>> > give me bogus answers a lot (i.e., the optimality conditions necessary 
>>>>> > for it to be an actual solution are not even approximately satisfied). 
>>>>> > Are you guys using something weird for the termination condition?
>>>>>
>>>>> can you give us a sample X and y that shows the problem?
>>>>>
>>>>> it should ultimately use the duality gap to stop the iterations but
>>>>> there might be a corner case …
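>>>>>
>>>>> for reference, the kind of check I mean, written for the textbook
>>>>> objective 0.5 * ||y - Xw||^2 + alpha * ||w||_1 (a quick sketch,
>>>>> untested):
>>>>>
>>>>>     import numpy as np
>>>>>
>>>>>     def lasso_duality_gap(X, y, w, alpha):
>>>>>         r = y - np.dot(X, w)
>>>>>         dual_norm = np.max(np.abs(np.dot(X.T, r)))
>>>>>         # scale the residual so theta is dual feasible:
>>>>>         # |X^T theta|_inf <= alpha
>>>>>         s = min(1.0, alpha / dual_norm) if dual_norm > 0 else 1.0
>>>>>         theta = s * r
>>>>>         primal = 0.5 * np.dot(r, r) + alpha * np.abs(w).sum()
>>>>>         dual = np.dot(theta, y) - 0.5 * np.dot(theta, theta)
>>>>>         return primal - dual  # zero at the optimum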
>>>>
>>>> In [34]: rng = np.random.RandomState(0)
>>>>
>>>> In [35]: dictionary = rng.normal(size=(100, 500)) / 1000; dictionary /= np.sqrt((dictionary ** 2).sum(axis=0))
>>>>
>>>> In [36]: signal = rng.normal(size=100) / 1000
>>>>
>>>> In [37]: from sklearn.linear_model import Lasso
>>>>
>>>> In [38]: lasso = Lasso(alpha=0.0001, max_iter=1e6, fit_intercept=False, tol=1e-8)
>>>>
>>>> In [39]: lasso.fit(dictionary, signal)
>>>> Out[39]:
>>>> Lasso(alpha=0.0001, copy_X=True, fit_intercept=False, max_iter=1000000.0,
>>>>   normalize=False, precompute='auto', tol=1e-08)
>>>>
>>>> In [40]: max(abs(lasso.coef_))
>>>> Out[40]: 0.0
>>>>
>>>> In [41]: from pylearn2.optimization.feature_sign import feature_sign_search
>>>>
>>>> In [42]: coef = feature_sign_search(dictionary, signal, 0.0001)
>>>>
>>>> In [43]: max(abs(coef))
>>>> Out[43]: 0.0027295761244725018
>>>>
>>>> And I'm pretty sure the latter result is the right one, since
>>>>
>>>> In [45]: def gradient(coefs):
>>>>   ....:     gram = np.dot(dictionary.T, dictionary)
>>>>   ....:     corr = np.dot(dictionary.T, signal)
>>>>   ....:     return - 2 * corr + 2 * np.dot(gram, coefs) + 0.0001 * np.sign(coefs)
>>>>   ....:
>>>
>>> Actually, alpha in scikit-learn is multiplied by n_samples. I agree
>>> this is misleading and not documented in the docstring.
>>>
>>>>>> lasso = Lasso(alpha=0.0001 / dictionary.shape[0], max_iter=1e6, fit_intercept=False, tol=1e-8).fit(dictionary, signal)
>>>>>> max(abs(lasso.coef_))
>>> 0.0027627270397484554
>>>>>> max(abs(gradient(lasso.coef_)))
>>> 0.00019687294269977963
>>>
>>> --
>>> Olivier
>>> http://twitter.com/ogrisel - http://github.com/ogrisel
>>>
>>
>
