Additionally, why is 'gram = np.dot(dictionary, dictionary.T)' used for the
OMP? According to the docstring on orthogonal_mp_gram it should be X.T * X.
This would also make gram (n_features, n_features) and make the size checks
work...
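For concreteness, here is the shape mismatch I am seeing - a minimal sketch,
nothing library-specific beyond numpy, assuming the dictionary has shape
(n_atoms, n_features) as in my example further down:

import numpy as np

dictionary = np.random.randn(2, 64)  # (n_atoms, n_features), as in my example

gram_used = np.dot(dictionary, dictionary.T)  # (2, 2)   -> (n_atoms, n_atoms)
gram_docs = np.dot(dictionary.T, dictionary)  # (64, 64) -> (n_features, n_features)

print(gram_used.shape, gram_docs.shape)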


On Thu, May 15, 2014 at 10:01 AM, Kyle Kastner <kastnerk...@gmail.com> wrote:

> That is very interesting, and sheds some more light on the problem!
> Looking at the sparse encoder, I don't think n_nonzero_coefs can ever
> become zero - it should be 'max(n_features / 10, 1)'. The 2 in the
> dictionary (D.shape[0]) should be n_atoms, which can't be greater than
> n_features (64) according to the error message and the checks in both
> orthogonal_mp functions. I think this second case should still work, since
> it should pass the check in orthogonal_mp, but it breaks for some reason
> due to the use of orthogonal_mp_gram (presumably there to make the OMP faster?).
>
> The second example has n_features = 64, n_samples = 100, and n_atoms =
> 2... which fits the criterion of n_nonzero_coefs less than n_features, and
> should always fit it when auto-choosing. 10% of the features is always <=
> the total number of features (as long as n_features is >= 0, I suppose).
> This is why failing the check when 'auto-choosing' n_nonzero_coefs is
> confusing me.
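>
> To spell out that arithmetic (a minimal sketch of the 'max(n_features /
> 10, 1)' rule I quoted above; the int() is mine, just so it runs as plain
> Python):
>
> n_features = 64
> n_nonzero_coefs = max(int(n_features / 10), 1)  # auto choice, per the rule above
> print(n_nonzero_coefs)  # 6, which is less than n_features (64)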
>
> orthogonal_mp has 'if tol is None and n_nonzero_coefs > X.shape[1]'
> while
> orthogonal_mp_gram has 'if tol is None and n_nonzero_coefs > len(Gram)'
>
> Are these two functions meant to be equivalent? It seems like they are
> checking different things, since X.shape[1] is n_features (64 here at the
> input to sparse_encode), while Gram has shape (2, 2), which is (n_atoms,
> n_atoms) since 'Gram = np.dot(dictionary, dictionary.T)'.
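>
> A small sketch of why the two checks see different numbers, using the
> shapes from my example (this is just the shape arithmetic, not code from
> the library itself):
>
> import numpy as np
>
> X = np.random.randn(100, 64)         # (n_samples, n_features)
> dictionary = np.random.randn(2, 64)  # (n_atoms, n_features)
> Gram = np.dot(dictionary, dictionary.T)
>
> print(X.shape[1])  # 64 -> what the orthogonal_mp check compares n_nonzero_coefs to
> print(len(Gram))   # 2  -> what the orthogonal_mp_gram check compares it to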
>
> When I change D.shape[0], it ends up breaking the check in
> orthogonal_mp_gram - but n_nonzero_coefs (auto-chosen as 10% of 64 = 6) in
> both cases is still less than n_features, which hasn't changed. It looks
> like the Gram check is really checking n_nonzero_coefs < n_atoms, which is
> very different, though maybe the correct thing for that computation?
>
> I don't even see a way to do the equivalent check in orthogonal_mp_gram -
> it is only passed cov (n_atoms, n_samples) and gram (n_atoms, n_atoms) to
> do the computation, along with a few other settings. There doesn't seem to
> be any n_features info at all.
>
> If they aren't meant to be equivalent, maybe the error message should be
> different for orthogonal_mp_gram? If they *are* meant to be the same, then
> I am not sure how that is currently the case, though there are tests in
> test_omp.py that check equivalence and they pass.
>
> I was using the defaults for n_nonzero_coefs while I try to understand the
> mexOMP referenced by a MATLAB KSVD implementation seen here -
> http://www.ux.uis.no/~karlsk/dle/#ssec32 . 'tnz' ('s' in the link) uses a
> default of 5 in the MATLAB code, which seems very arbitrary, and not very
> different from the auto-chosen value of 6 in this particular case. I hit
> this problem when I started testing on a different dataset - I got the
> initial implementation working on images, and am now trying to use it for
> signals.
>
> Thanks for looking into this - hopefully it is just confusion on my part.
>
>
> On Thu, May 15, 2014 at 1:55 AM, Gael Varoquaux <
> gael.varoqu...@normalesup.org> wrote:
>
>> On Thu, May 15, 2014 at 12:56:09AM -0500, Kyle Kastner wrote:
>> > # Broken?
>> > import numpy as np
>> > from sklearn.decomposition import sparse_encode
>> >
>> > X = np.random.randn(100, 64)
>> > D = np.random.randn(2, 64)
>> > sparse_encode(X, D, algorithm='omp', n_nonzero_coefs=None)
>>
>> After a quick glance, could it be that with a dimension of 2,
>> n_nonzero_coefs, which defaults to int(.1 * n_features), is zero? When I
>> set it manually to 1, this works.
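>>
>> Concretely, this is the manual setting I mean, using the same X and D as
>> in the snippet above:
>>
>> sparse_encode(X, D, algorithm='omp', n_nonzero_coefs=1)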
>>
>> This parameter is what I believe would control 'k' in a K-SVD. I am a bit
>> surprised that you are leaving it to its default value.
>>
>> HTH,
>>
>> Gaƫl
>>
>>
>
>
