On Tue, Dec 6, 2011 at 10:31 AM, Vlad Niculae <[email protected]> wrote:
> On Tue, Dec 6, 2011 at 5:26 PM, Ian Goodfellow <[email protected]>
> wrote:
>> I was initially confused by the specification of the dictionary size
>> for sparse_encode. It makes sense if you think of it as solving
>> multiple lasso problems, but as Vlad said it is different from the
>> dictionary learning setup. As Vlad said there is no right or wrong,
>> but personally I think it is confusing to use the lasso terminology to
>> describe the shape of the arguments when the method has a name from
>> the dictionary learning point of view ("sparse_encode").
>
> I think we can agree to switch around the dimensions so that
> sparse_encode fits with dictionary learning, right?
>
>> As for my code the line that fails is:
> I was specifically referring to your attempt to manually plug in
> the components_ attribute, because I feel that should work. Was
> the problem invalid output, or an actual exception?
from sklearn.decomposition import MiniBatchDictionaryLearning

m = MiniBatchDictionaryLearning(n_atoms=model.nhid,
                                fit_algorithm='lars',
                                transform_algorithm='lasso_lars',
                                dict_init=model.W.get_value().T)
m.components_ = model.W.get_value().T
HS = m.transform(X)
Before I patched m.components_ there was an AttributeError.
After I patched it, I just got incorrect output (all zeros or NaNs,
depending on alpha). I thought it was not working because the
object was not set up properly, but now that I have tried
sparse_encode it seems more likely that the underlying optimizer just
doesn't work.
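For what it's worth, an all-zero output is not necessarily an optimizer bug: if alpha is large enough, the exact lasso solution *is* the zero vector. A small self-contained illustration (not the scikit-learn code path, just the closed-form lasso solution for an orthonormal dictionary, where the code is plain soft-thresholding of the correlations):

```python
import numpy as np

def soft_threshold(z, alpha):
    # closed-form lasso code for orthonormal atoms:
    # c_i = sign(<d_i, x>) * max(|<d_i, x>| - alpha, 0)
    return np.sign(z) * np.maximum(np.abs(z) - alpha, 0.0)

rng = np.random.RandomState(0)
Q, _ = np.linalg.qr(rng.randn(8, 8))
D = Q[:5]                  # 5 orthonormal atoms, 8 features each
x = rng.randn(8)           # one sample
corr = D @ x

code_small = soft_threshold(corr, 0.01)                     # mostly active
code_big = soft_threshold(corr, np.abs(corr).max() + 1.0)   # all zeros
```

So any alpha above max_i |<d_i, x>| zeroes every coefficient; the NaNs are a separate question.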
>
> Vlad
>
>>
>> On Tue, Dec 6, 2011 at 5:59 AM, Olivier Grisel <[email protected]>
>> wrote:
>>> 2011/12/6 Vlad Niculae <[email protected]>:
>>>> On Tue, Dec 6, 2011 at 12:07 PM, Olivier Grisel
>>>> <[email protected]> wrote:
>>>>> 2011/12/6 Vlad Niculae <[email protected]>:
>>>>>>
>>>>>> On Dec 6, 2011, at 11:04 , Gael Varoquaux wrote:
>>>>>>
>>>>>>> On Tue, Dec 06, 2011 at 09:41:56AM +0200, Vlad Niculae wrote:
>>>>>>>> This is actually exactly how the module is designed.
>>>>>>>
>>>>>>> Great design! I should have looked at it closer before writing my mail.
>>>>>>>
>>>>>>>> We have BaseDictionaryLearning which only implements transforms. I
>>>>>>>> didn't try but you should be able to instantiate a
>>>>>>>> BaseDictionaryLearning object, set its components_ manually, and use
>>>>>>>> its transform.
>>>>>>>
>>>>>>> Maybe we need a subclass of this object, for instance 'sparse_coder',
>>>>>>> that takes the dictionary to be used as an __init__ argument.
>>>>>> Sounds good, this way it can be used in pipelines. I'll make a pull
>>>>>> request.
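A rough sketch of what such an estimator could look like (all names hypothetical; the transform here is just a thresholded projection standing in for a real lasso/OMP solver), mainly to show why a fixed dictionary at __init__ makes it pipeline-friendly:

```python
import numpy as np

class SparseCoderSketch:
    """Hypothetical 'sparse_coder' estimator: the dictionary is fixed
    at construction time, so fit() is a no-op and the object can slot
    into a Pipeline like any other transformer."""

    def __init__(self, dictionary, alpha=1.0):
        self.components_ = dictionary   # shape (n_components, n_features)
        self.alpha = alpha

    def fit(self, X, y=None):
        return self                     # nothing to learn

    def transform(self, X):
        # placeholder coder: correlate with the atoms, then soft-threshold;
        # a real implementation would dispatch to a lasso/lars/OMP solver
        corr = X @ self.components_.T
        return np.sign(corr) * np.maximum(np.abs(corr) - self.alpha, 0.0)

rng = np.random.RandomState(0)
D = rng.randn(3, 6)                     # 3 atoms, 6 features
X = rng.randn(4, 6)                     # 4 samples
codes = SparseCoderSketch(D, alpha=0.5).fit(X).transform(X)
# codes has shape (n_samples, n_components) = (4, 3)
```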
>>>>>
>>>>> Also vlad can you check the shape of the output of:
>>>>>
>>>>> http://scikit-learn.org/dev/modules/generated/sklearn.decomposition.sparse_encode.html
>>>>> (and its parallel variant)?
>>>>>
>>>>> It looks wrong to me. I would have expected `(n_samples, n_components)`
>>>>> instead.
>>>>
>>>> There's not really a right or wrong here, but indeed it's backwards
>>>> relative to the dictionary learning framework; it's shaped like the
>>>> linear estimators instead.
>>>
>>> Ok I think I understand: the dictionary X is currently documented as shape:
>>>
>>> (n_samples, n_components)
>>>
>>> Whereas, IMHO, it should be (n_components, n_features): each atom (a row
>>> of the dictionary) should have the dimensionality of the input data
>>> space it is meant to summarize.
>>>
>>> In that case the output of the sparse encoder for a new dataset of
>>> shape (n_samples, n_features) would be shaped (n_samples,
>>> n_components).
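To make the proposed convention concrete, here is a minimal numpy-only toy coder (a plain ISTA loop, not the scikit-learn implementation) showing the shapes involved: dictionary (n_components, n_features) in, codes (n_samples, n_components) out:

```python
import numpy as np

def toy_sparse_encode(X, D, alpha=0.1, n_iter=200):
    """Toy ISTA sparse coder for min_C 0.5*||X - C D||^2 + alpha*||C||_1.
    X: (n_samples, n_features), D: (n_components, n_features);
    returns C of shape (n_samples, n_components)."""
    codes = np.zeros((X.shape[0], D.shape[0]))
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = (codes @ D - X) @ D.T     # gradient of the quadratic term
        codes = codes - grad / L
        # proximal step for the l1 penalty: soft-thresholding
        codes = np.sign(codes) * np.maximum(np.abs(codes) - alpha / L, 0.0)
    return codes

rng = np.random.RandomState(0)
D = rng.randn(5, 8)      # 5 atoms living in the 8-d input space
X = rng.randn(10, 8)     # 10 samples
codes = toy_sparse_encode(X, D)
# codes.shape == (10, 5), i.e. (n_samples, n_components)
```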
>>>
>>> --
>>> Olivier
>>> http://twitter.com/ogrisel - http://github.com/ogrisel
>>>
>>> ------------------------------------------------------------------------------
>>> Cloud Services Checklist: Pricing and Packaging Optimization
>>> This white paper is intended to serve as a reference, checklist and point of
>>> discussion for anyone considering optimizing the pricing and packaging model
>>> of a cloud services business. Read Now!
>>> http://www.accelacomm.com/jaw/sfnl/114/51491232/
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>