Alejandro,
It looks like the problem can be traced back to the ARPACK eigensolver.
If you run the code with eigen_solver='dense', it works as expected.
Sometimes ARPACK does not converge to all of the requested eigenvalues, and
as far as I can tell no error is reported when that happens.
I tried performing the eigenvalue decomposition using the ARPACK wrapper
from the scipy development version, and it gives 3 dimensions as expected.
It may be that we can fix this by updating the ARPACK wrapper from scipy.
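
In case it helps, here is a minimal sketch of the workaround. It assumes a
recent scikit-learn, where the output dimension is passed as n_components
rather than out_dim as in your script. The helper isomap_embed is a
hypothetical name, not part of scikit-learn, and the random data only
stands in for X.npy. It forces the dense solver and raises if the embedding
comes back with fewer columns than requested, since that case is not
reported otherwise.

```python
import numpy as np
from sklearn import manifold


def isomap_embed(X, n_neighbors, n_components, eigen_solver="dense"):
    """Fit Isomap and fail loudly if the embedding has too few columns.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    model = manifold.Isomap(
        n_neighbors=n_neighbors,
        n_components=n_components,
        eigen_solver=eigen_solver,  # 'dense' avoids ARPACK entirely
    )
    Y = model.fit_transform(X)
    if Y.shape[1] != n_components:
        raise RuntimeError(
            "requested %d components, got %d" % (n_components, Y.shape[1])
        )
    return Y


# Random stand-in data; the sizes and n_neighbors are arbitrary choices.
rng = np.random.RandomState(0)
X = rng.rand(300, 10)
Y = isomap_embed(X, n_neighbors=10, n_components=3)
print("X shape: %s, Y shape: %s" % (X.shape, Y.shape))
```

Passing eigen_solver="arpack" to the same helper on your data should show
whether the missing dimension still appears with your installed version.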
Jake
Alejandro Weinstein wrote:
> Hi:
>
> I am observing an unexpected behavior of Isomap, related to the
> dimensions of the transformed data. If I generate random data, say
> 1000 points each with dimension 10, and fit a transform using as a
> parameter out_dim=3, the fitted data has dimension (1000, 3), as
> expected. However, when I repeat the same steps but this time using my
> data set consisting of 427 points, each of dimension 400, the fitted
> data has dimension (427, 2), i.e., the output dimension is 1 less than
> out_dim. Using LLE with the same data set and parameters, the fitted
> data has the expected dimension (427, 3).
>
> The following code illustrates the phenomenon:
>
> #############################################
> import numpy as np
> from sklearn import manifold
>
> n = 1000
> m = 10
> X = np.random.rand(n, m)
> n_neighbors = 5
> out_dim = 3
>
> Y = manifold.Isomap(n_neighbors, out_dim).fit_transform(X)
> print 'Using random data and Isomap'
> print 'X shape:%s, out_dim:%d, Y shape: %s' % (X.shape, out_dim, Y.shape)
>
> X = np.load('X.npy')
> Y = manifold.Isomap(n_neighbors, out_dim).fit_transform(X)
> print
> print 'Using the data X.npy and Isomap'
> print 'X shape:%s, out_dim:%d, Y shape: %s' % (X.shape, out_dim, Y.shape)
>
> Y = manifold.LocallyLinearEmbedding(n_neighbors, out_dim).fit_transform(X)
> print
> print 'Using the data X.npy and LLE'
> print 'X shape:%s, out_dim:%d, Y shape: %s' % (X.shape, out_dim, Y.shape)
> ##################################################################
>
> And this is the output:
>
> Using random data and Isomap
> X shape:(1000, 10), out_dim:3, Y shape: (1000, 3)
>
> Using the data X.npy and Isomap
> X shape:(427, 400), out_dim:3, Y shape: (427, 2)
>
> Using the data X.npy and LLE
> X shape:(427, 400), out_dim:3, Y shape: (427, 3)
>
> The code and the data set are available at
> https://github.com/aweinstein/scrapcode
>
> In case it is relevant, the data set consists of documents represented
> in the Latent Semantic Analysis space.
>
> Is this the expected behavior of Isomap, or is there something wrong?
>
> Alejandro.
>
> ------------------------------------------------------------------------------
> RSA(R) Conference 2012
> Save $700 by Nov 18
> Register now
> http://p.sf.net/sfu/rsa-sfdev2dev1
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>