[Numpy-discussion] Re: Dear Robert

Robert Kern Sat, 25 Feb 2023 15:45:13 -0800

On Sat, Feb 25, 2023 at 5:33 PM Louis Petingi <louis.peti...@csi.cuny.edu>
wrote:


> As you mentioned this is a generalization of the Fiedler eigenvector. When
> applying spectral clustering, and you want to find the two clusters
> then the Fiedler eigenvector tells you how to partition the vertices
> (bipartition) so the normalized cut is minimized. The concept can be
> generalized to
> k clusters by applying K-mean to the first k eigenvectors. The two
> partitions can be determined by grouping the vertices of the corresponding
> negative values of the eigenvector
> for one cluster and the other cluster are vertices corresponding to the
> non-negative values.
>
> I can use perhaps the normalized eigenvectors, but I am not sure if this
> is correct thus, I prefer using the magnitudes. In theory it may work as
> for example the entries of the Fielder eigenvector are divided by the norm.
>

My understanding of this family of techniques is that the K-means step is
basically heuristic (see section 8.4 of [1]). You have a lot of choices
here, and playing around with the magnitudes might be among them. But there
is nothing that comes _from_ the eigenvector calculation that can help
determine what this should be. There's nothing special about the way that
`np.linalg.eig()` computes eigenvectors before normalization that will
improve the K-means step. Rather, you have the freedom to scale the
eigenvectors freely to help out the K-means calculation (or any other
clustering that you might want to do there). I suspect that the
unit-normalization is probably one of the better ones.

[1]
http://people.csail.mit.edu/dsontag/courses/ml14/notes/Luxburg07_tutorial_spectral_clustering.pdf


> The eigenvalues and eigenvector calculators (in internet) find the
> magnitude  of the vector.
>

No, they do not find "the magnitude" because there is no such thing. They
just give _an_ arbitrary magnitude that happened to be the raw output of
their _particular_ algorithm without the following unit-normalization step.
There are lots of algorithms to compute eigenvectors. The magnitudes of the
eigenvector solutions before normalization are not all the same for all of
the different algorithms. There is absolutely no guarantee (and very little
probability) that if we modified `np.linalg.eig()` to not apply
normalization that the unnormalized eigenvectors would match the
unnormalized eigenvectors of any other implementation.

-- 
Robert Kern

_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com

[Numpy-discussion] Re: Dear Robert

Reply via email to