I was initially confused by how sparse_encode specifies the shape of the
dictionary. It makes sense if you think of it as solving multiple lasso
problems, but as Vlad said, that differs from the dictionary learning
setup. There may be no right or wrong here, but personally I find it
confusing to use lasso terminology to describe the shapes of the
arguments when the function is named from the dictionary learning point
of view ("sparse_encode").

As for my code, the line that fails is:

from sklearn.decomposition import sparse_encode
HS = sparse_encode(model.W.get_value(), X.T, alpha=0.01).T

model and X are both fairly complicated, but basically X is a 5000x108
design matrix consisting of whitened 6x6 image patches from the STL-10
dataset, and model.W.get_value() is a 108x1600 dictionary matrix. Both
arguments are two-dimensional float32 NumPy ndarrays.
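
To make the two layouts concrete, here is a rough shape-only sketch using the sizes above. It never calls sparse_encode; the code arrays are zero-filled placeholders (my own stand-ins, not real output), and the names n_samples, n_features, n_components, codes_lasso, codes_dl and D are only for illustration.

```python
import numpy as np

n_samples, n_features, n_components = 5000, 108, 1600

# Stand-ins with the shapes described above (contents are irrelevant here).
X = np.zeros((n_samples, n_features), dtype=np.float32)     # design matrix
W = np.zeros((n_features, n_components), dtype=np.float32)  # dictionary as stored in the model

# Lasso-style layout, as in the call above: atoms are the columns of W,
# the data is passed as X.T, and the codes come back as
# (n_components, n_samples), hence the trailing .T.
codes_lasso = np.zeros((n_components, n_samples), dtype=np.float32)  # placeholder codes
recon_lasso = W @ codes_lasso   # (n_features, n_samples), same shape as X.T
HS = codes_lasso.T              # (n_samples, n_components)

# Dictionary-learning layout: atoms are the rows of the dictionary and
# the codes are (n_samples, n_components), so no transposes are needed.
D = W.T                         # (n_components, n_features)
codes_dl = np.zeros((n_samples, n_components), dtype=np.float32)     # placeholder codes
recon_dl = codes_dl @ D         # (n_samples, n_features), same shape as X

print(recon_lasso.shape, HS.shape, recon_dl.shape)
```

Either way the final codes are 5000x1600; the only difference is which side the transposes end up on.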

I agree with David that it seems like the optimizer is broken, but I
disagree that the problem is the termination criterion. There should
not be any NaNs anywhere in the course of optimization.
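
For anyone trying to reproduce this, a first sanity check is to confirm that the inputs themselves are finite before they reach the solver. The sketch below is only a debugging aid under my own assumptions: count_non_finite is a hypothetical helper, X_demo and W_demo are small random stand-ins for the real X and model.W.get_value(), and the float64 upcast is just one thing to try, not a diagnosis.

```python
import numpy as np

def count_non_finite(name, arr):
    """Print dtype and shape, and return the number of NaN/inf entries."""
    arr = np.asarray(arr)
    n_bad = int(np.count_nonzero(~np.isfinite(arr)))
    print(f"{name}: dtype={arr.dtype}, shape={arr.shape}, non-finite={n_bad}")
    return n_bad

# Small random stand-ins; substitute the real arrays when debugging.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((50, 108)).astype(np.float32)
W_demo = rng.standard_normal((108, 160)).astype(np.float32)

# If either count is non-zero, the NaNs predate the optimizer.
assert count_non_finite("X", X_demo) == 0
assert count_non_finite("W", W_demo) == 0

# Upcasting copies to float64 lets one test whether precision plays a role.
X64 = X_demo.astype(np.float64)
W64 = W_demo.astype(np.float64)
```

If both inputs are clean and the NaNs still appear, that points at the solver rather than the data.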


On Tue, Dec 6, 2011 at 5:59 AM, Olivier Grisel <[email protected]> wrote:
> 2011/12/6 Vlad Niculae <[email protected]>:
>> On Tue, Dec 6, 2011 at 12:07 PM, Olivier Grisel
>> <[email protected]> wrote:
>>> 2011/12/6 Vlad Niculae <[email protected]>:
>>>>
>>>> On Dec 6, 2011, at 11:04 , Gael Varoquaux wrote:
>>>>
>>>>> On Tue, Dec 06, 2011 at 09:41:56AM +0200, Vlad Niculae wrote:
>>>>>> This is actually exactly how the module is designed.
>>>>>
>>>>> Great design! I should have looked at it closer before writing my mail.
>>>>>
>>>>>> We have BaseDictionaryLearning which only implements transforms. I
>>>>>> didn't try but you should be able to instantiate a
>>>>>> BaseDictionaryLearning object, set its components_ manually, and use
>>>>>> its transform.
>>>>>
>>>>> Maybe we need a subclass of this object, for instance 'sparse_coder' that
>>>>> takes as __init__ argument the dictionary to be used.
>>>> Sounds good, this way it can be used in pipelines. I'll make a pull 
>>>> request.
>>>
>>> Also vlad can you check the shape of the output of:
>>>
>>> http://scikit-learn.org/dev/modules/generated/sklearn.decomposition.sparse_encode.html
>>> (and its parallel variant)?
>>>
>>> It looks wrong to me. I would have expected `(n_samples, n_components)` 
>>> instead.
>>
>> There's no real right or wrong here. It is indeed backwards relative
>> to the dictionary learning framework, but it's shaped like the linear
>> estimators.
>
> Ok I think I understand: the dictionary X is currently documented as shape:
>
>  (n_samples, n_components)
>
> Whereas, IMHO it should be (n_components, n_features) : each atom (row
> of the dictionary) should have the dimensionality of the input data
> space which it is supposed to be a summary of.
>
> In that case the output of the sparse encoder for a new dataset of
> shape (n_samples, n_features) would be shaped (n_samples,
> n_components).
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
> ------------------------------------------------------------------------------
> Cloud Services Checklist: Pricing and Packaging Optimization
> This white paper is intended to serve as a reference, checklist and point of
> discussion for anyone considering optimizing the pricing and packaging model
> of a cloud services business. Read Now!
> http://www.accelacomm.com/jaw/sfnl/114/51491232/
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
