Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

Vlad Niculae Tue, 06 Dec 2011 07:32:05 -0800

On Tue, Dec 6, 2011 at 5:26 PM, Ian Goodfellow <[email protected]> wrote:
> I was initially confused by the specification of the dictionary size
> for sparse_encode. It makes sense if you think of it as solving
> multiple lasso problems, but as Vlad said it is different from the
> dictionary learning setup. As Vlad said there is no right or wrong,
> but personally I think it is confusing to use the lasso terminology to
> describe the shape of the arguments when the method has a name from
> the dictionary learning point of view ("sparse_encode").


I think we can agree to switch around the dimensions so that
sparse_encode fits with dictionary learning, right?
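
To make the proposed convention concrete, here is a minimal NumPy sketch of the shapes involved. It is not scikit-learn's implementation: `sparse_encode_sketch` is a hypothetical helper that solves the lasso problems with plain ISTA (iterative soft thresholding), and `alpha` and `n_iter` are illustrative parameters. The point is only that data of shape (n_samples, n_features) and a dictionary of shape (n_components, n_features) give a code of shape (n_samples, n_components).

```python
import numpy as np


def sparse_encode_sketch(X, dictionary, alpha=0.1, n_iter=200):
    """Hypothetical helper: lasso codes via ISTA, dictionary-learning shapes.

    X:          (n_samples, n_features)
    dictionary: (n_components, n_features), one atom per row
    returns:    (n_samples, n_components)
    """
    # Lipschitz constant of the gradient of 0.5 * ||X - code @ dictionary||^2.
    lipschitz = np.linalg.norm(dictionary, ord=2) ** 2
    code = np.zeros((X.shape[0], dictionary.shape[0]))
    for _ in range(n_iter):
        residual = code @ dictionary - X      # (n_samples, n_features)
        grad = residual @ dictionary.T        # (n_samples, n_components)
        z = code - grad / lipschitz
        # Soft thresholding is the proximal step for the L1 penalty.
        code = np.sign(z) * np.maximum(np.abs(z) - alpha / lipschitz, 0.0)
    return code


rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))   # 5 samples, 8 features
D = rng.standard_normal((3, 8))   # 3 atoms, each with 8 features
code = sparse_encode_sketch(X, D)
print(code.shape)                 # (5, 3)
```

With this layout `code @ D` has the same shape as `X`, so the reconstruction reads the same way as in the dictionary learning estimators.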

> As for my code the line that fails is:
I was specifically referring to when you attempted to manually plug in
the components_ attribute, because I feel that should work. Did it
produce invalid output, or raise an actual error?
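
For reference, a sketch of coding against a fixed, user-supplied dictionary, assuming a recent scikit-learn release that ships `sklearn.decomposition.SparseCoder` (it may not match the API available at the time of this thread). The random data, the dictionary, and the choice of OMP with two nonzero coefficients are illustrative assumptions only.

```python
import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))                  # (n_samples, n_features)

# A hand-specified dictionary, one atom per row: (n_components, n_features).
D = rng.standard_normal((3, 8))
D /= np.linalg.norm(D, axis=1, keepdims=True)    # unit-norm atoms

# No dictionary is learned here: the one passed in is used as-is.
coder = SparseCoder(
    dictionary=D,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=2,
)
code = coder.transform(X)
print(code.shape)                                # (5, 3)
```

Because the dictionary is a constructor argument, an object like this can sit inside a pipeline like any other transformer.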

Vlad

>
> On Tue, Dec 6, 2011 at 5:59 AM, Olivier Grisel <[email protected]> wrote:
>> 2011/12/6 Vlad Niculae <[email protected]>:
>>> On Tue, Dec 6, 2011 at 12:07 PM, Olivier Grisel
>>> <[email protected]> wrote:
>>>> 2011/12/6 Vlad Niculae <[email protected]>:
>>>>>
>>>>> On Dec 6, 2011, at 11:04 , Gael Varoquaux wrote:
>>>>>
>>>>>> On Tue, Dec 06, 2011 at 09:41:56AM +0200, Vlad Niculae wrote:
>>>>>>> This is actually exactly how the module is designed.
>>>>>>
>>>>>> Great design! I should have looked at it closer before writing my mail.
>>>>>>
>>>>>>> We have BaseDictionaryLearning which only implements transforms. I
>>>>>>> didn't try but you should be able to instantiate a
>>>>>>> BaseDictionaryLearning object, set its components_ manually, and use
>>>>>>> its transform.
>>>>>>
>>>>>> Maybe we need a subclass of this object, for instance 'sparse_coder', that
>>>>>> takes the dictionary to be used as an __init__ argument.
>>>>> Sounds good, this way it can be used in pipelines. I'll make a pull request.
>>>>
>>>> Also vlad can you check the shape of the output of:
>>>>
>>>> http://scikit-learn.org/dev/modules/generated/sklearn.decomposition.sparse_encode.html
>>>> (and its parallel variant)?
>>>>
>>>> It looks wrong to me. I would have expected `(n_samples, n_components)` instead.
>>>
>>> There's no real right or wrong here. It is indeed backwards compared to
>>> the dictionary learning framework, but it's shaped like the linear
>>> estimators.
>>
>> Ok I think I understand: the dictionary X is currently documented as shape:
>>
>>  (n_samples, n_components)
>>
>> Whereas, IMHO it should be (n_components, n_features) : each atom (row
>> of the dictionary) should have the dimensionality of the input data
>> space which it is supposed to be a summary of.
>>
>> In that case the output of the sparse encoder for a new dataset of
>> shape (n_samples, n_features) would be shaped (n_samples,
>> n_components).
>>
>> --
>> Olivier
>> http://twitter.com/ogrisel - http://github.com/ogrisel
>>
>> ------------------------------------------------------------------------------
>> Cloud Services Checklist: Pricing and Packaging Optimization
>> This white paper is intended to serve as a reference, checklist and point of
>> discussion for anyone considering optimizing the pricing and packaging model
>> of a cloud services business. Read Now!
>> http://www.accelacomm.com/jaw/sfnl/114/51491232/
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>

