Re: [Scikit-learn-general] Unexpected behavior of Isomap

2011-11-10 Thread bthirion
In your case, the rank of the Gram matrix is much smaller than the number of samples. All this means that we need to add some kind of regularization to it. Bertrand On 11/10/2011 04:00 AM, Alejandro Weinstein wrote: > On Mon, Nov 7, 2011 at 12:32 PM, Jacob VanderPlas > wrote: >> I think, base

Re: [Scikit-learn-general] Unexpected behavior of Isomap

2011-11-09 Thread Matthieu Brucher
Hi Jacob, Indeed, Isomap is a metric MDS, so you have the same hypothesis. A negative eigenvalue should not happen, but one never knows. As the eigenvalue only acts as a scaling factor, it is not too weird to use a negative one in the embedding construction. Cheers, Matthieu 2011/11/7 Jacob V

Re: [Scikit-learn-general] Unexpected behavior of Isomap

2011-11-09 Thread Alejandro Weinstein
On Mon, Nov 7, 2011 at 12:32 PM, Jacob VanderPlas wrote: > I think, based on this, that KernelPCA is correct as written, except > that the arpack method should use which='LA' rather than which='LM' > (thus ignoring any negative eigenvalues).  This would fix Alejandro's > problem.  I'll make the ch

Re: [Scikit-learn-general] Unexpected behavior of Isomap

2011-11-07 Thread Jacob VanderPlas
I dug around a bit, and found some info about kernel form in this document: http://people.kyb.tuebingen.mpg.de/lcayton/resexam.pdf MDS (on which Isomap is based) assumes a Euclidean distance matrix, which can be shown to always yield a positive semidefinite kernel. In the case of Isomap, the di
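The claim about Euclidean distance matrices can be checked numerically. The following is a minimal sketch, not code from the thread: the helper name `mds_kernel` and both test matrices are illustrative assumptions. It builds the classical MDS kernel by double-centering the squared distances, first for true Euclidean distances and then for arc-length distances between four equally spaced points on a circle, which stand in for a non-Euclidean (graph-like) distance matrix.

```python
import numpy as np


def mds_kernel(D):
    """Classical MDS kernel: K = -1/2 * J * D**2 * J, with J the centering matrix.

    `mds_kernel` is a hypothetical helper name used only for this sketch.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * J @ (D ** 2) @ J


# Case 1: genuine Euclidean distances between random points in 3-D.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
D_euclid = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
min_eig_euclid = np.linalg.eigvalsh(mds_kernel(D_euclid)).min()

# Case 2: arc-length distances between 4 equally spaced points on a circle
# of circumference 4 (neighbours are 1 apart, opposite points are 2 apart).
# These distances cannot be realised exactly in any Euclidean space.
D_ring = np.array([
    [0.0, 1.0, 2.0, 1.0],
    [1.0, 0.0, 1.0, 2.0],
    [2.0, 1.0, 0.0, 1.0],
    [1.0, 2.0, 1.0, 0.0],
])
min_eig_ring = np.linalg.eigvalsh(mds_kernel(D_ring)).min()

print("smallest eigenvalue, Euclidean distances:", min_eig_euclid)
print("smallest eigenvalue, arc-length distances:", min_eig_ring)
```

In the Euclidean case the smallest eigenvalue is zero up to rounding error, while the arc-length matrix produces a strictly negative eigenvalue. This is consistent with the point that positive semidefiniteness is only guaranteed when the distances are Euclidean.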

Re: [Scikit-learn-general] Unexpected behavior of Isomap

2011-11-06 Thread Jacob VanderPlas
I looked closer: turns out arpack is actually up-to-date. I think the bug is in the kernel pca code: eigsh should be called with keyword which='LA' rather than which='LM'. The fit_transform routine was finding three vectors, and then removing the one with a negative eigenvalue. Before making
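The difference between the two `which` settings can be illustrated with a small sketch. It is not the KernelPCA code discussed in the thread; the matrix `K` and its eigenvalues are made up for the example. `scipy.sparse.linalg.eigsh` with `which='LM'` returns the eigenvalues of largest magnitude, so a large negative eigenvalue is selected, whereas `which='LA'` returns the largest algebraic eigenvalues and skips it.

```python
import numpy as np
from scipy.sparse.linalg import eigsh

# Build a symmetric matrix with known eigenvalues, one of them large and
# negative (illustrative values, not taken from the thread).
rng = np.random.default_rng(0)
eigenvalues = np.array([-10.0, 0.1, 0.2, 0.5, 1.0, 2.0, 3.0, 4.0])
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # random orthogonal basis
K = Q @ np.diag(eigenvalues) @ Q.T
K = (K + K.T) / 2  # remove rounding asymmetry

# 'LM': largest magnitude, so -10 is among the three returned values.
vals_lm, _ = eigsh(K, k=3, which="LM")

# 'LA': largest algebraic, so only the three largest positive values are returned.
vals_la, _ = eigsh(K, k=3, which="LA")

print("which='LM':", np.sort(vals_lm))
print("which='LA':", np.sort(vals_la))
```

If a caller asks for three components with `which='LM'` and then discards components with negative eigenvalues, only two remain, which matches the behavior described for `fit_transform` above.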

Re: [Scikit-learn-general] Unexpected behavior of Isomap

2011-11-06 Thread Jacob VanderPlas
Alejandro, It looks like the problem can be traced back to the ARPACK eigensolver. If you run the code with eigen_solver='dense', it works as expected. Sometimes arpack does not converge to all the requested eigenvalues, and I guess there's no error reported when that happens. I tried perform

[Scikit-learn-general] Unexpected behavior of Isomap

2011-11-06 Thread Alejandro Weinstein
Hi: I am observing an unexpected behavior of Isomap, related to the dimensions of the transformed data. If I generate random data, say 1000 points each with dimension 10, and fit a transform using as a parameter out_dim=3, the fitted data has dimension (1000, 3), as expected. However, when I repea