Re: How can I implement eigenvalue decomposition in Spark?
The SVD does not in general give you the eigenvalues of its input. Are you just trying to access the U and V matrices? They are also returned in the API, but they are not the eigenvectors of M, as you note. I don't think MLlib has anything to help with the general eigenvector problem. Maybe you can implement a sort of power iteration algorithm using GraphX to find the largest eigenvector?
Re: How can I implement eigenvalue decomposition in Spark?
@Miles, eigen-decomposition of an asymmetric matrix doesn't always give real-valued solutions, and it doesn't have the nice properties that a symmetric matrix holds. Usually you want to symmetrize your asymmetric matrix in some way, e.g. see http://machinelearning.wustl.edu/mlpapers/paper_files/icml2005_ZhouHS05.pdf. But as Sean mentioned, you can always compute the largest eigenpair with the power method or a variation like PageRank, which is already implemented in GraphX.

-- Li @vrilleup
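To make the symmetrization step concrete, here is a minimal plain-Python sketch (not Spark code; the 3x3 matrix and the simple (A + A^T)/2 scheme are illustrative assumptions — the Zhou et al. paper linked above develops a more principled symmetrization for directed graphs):

```python
# Hypothetical asymmetric matrix (e.g. a broadcast-likelihood matrix).
A = [[0.0, 0.9, 0.2],
     [0.1, 0.0, 0.8],
     [0.5, 0.3, 0.0]]

def symmetrize(A):
    """Simplest symmetrization: S = (A + A^T) / 2."""
    n = len(A)
    return [[(A[i][j] + A[j][i]) / 2.0 for j in range(n)] for i in range(n)]

S = symmetrize(A)
# S is symmetric, so its eigendecomposition is real and its SVD relates
# directly to its eigenpairs (see the other replies in this thread).
print(S[0][1], S[1][0])  # both 0.5
```

Once symmetrized, the matrix can be fed to the SVD-based approaches discussed elsewhere in the thread.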
Re: How can I implement eigenvalue decomposition in Spark?
Generally an adjacency matrix on a social network is undirected (symmetric), so you can get eigenvectors from the computed SVD result, A = U D V^t. The first column of U is the eigenvector corresponding to the biggest eigenvalue, which is the first value of D.

xj @ Tokyo
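This claim can be checked by hand on a tiny example. The sketch below (plain Python with a hard-coded 2x2 matrix, not Spark) uses the closed-form eigenvalues of a 2x2 symmetric matrix to show that for a symmetric positive semi-definite A, the top singular value — the square root of the top eigenvalue of A^T A — equals the top eigenvalue of A itself. One caveat: if A has negative eigenvalues, the singular values are their absolute values, so the correspondence needs care.

```python
import math

# Symmetric positive semi-definite 2x2 example: A = [[a, b], [b, c]]
a, b, c = 3.0, 1.0, 3.0

# Closed-form eigenvalues of a 2x2 symmetric matrix
mean = (a + c) / 2.0
half = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
lam_max = mean + half              # 4.0 for this A

# For symmetric A, A^T A = A^2, so its top eigenvalue is lam_max^2 and
# the top singular value sqrt(lam_max^2) equals lam_max.
aa = [[a * a + b * b, a * b + b * c],
      [a * b + b * c, b * b + c * c]]
m2 = (aa[0][0] + aa[1][1]) / 2.0
h2 = math.sqrt(((aa[0][0] - aa[1][1]) / 2.0) ** 2 + aa[0][1] ** 2)
sigma_max = math.sqrt(m2 + h2)

print(lam_max, sigma_max)  # both 4.0
```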
How can I implement eigenvalue decomposition in Spark?
Our lab needs to do some simulation on online social networks. We need to handle a 5000*5000 adjacency matrix, namely, to get its largest eigenvalue and the corresponding eigenvector. Matlab can be used, but it is time-consuming. Is Spark effective in linear algebra calculations and transformations? Later we will have a 500*500 matrix to process, so it seems urgent that we find some distributed computation platform. I see SVD has been implemented, and I can get eigenvalues of a matrix through this API. But when I want to get both eigenvalues and eigenvectors, or at least the biggest eigenvalue and the corresponding eigenvector, it seems that current Spark doesn't have such an API. Should I write the eigenvalue decomposition from scratch? What should I do? Thanks a lot!

Miles Yao

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-implement-eigenvalue-decomposition-in-Spark-tp11646.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: How can I implement eigenvalue decomposition in Spark?
(-incubator, +user) If your matrix is symmetric (and real, I presume), and if my linear algebra isn't too rusty, then its SVD is its eigendecomposition. The SingularValueDecomposition object you get back has U and V, both of which have columns that are the eigenvectors. There are a few SVDs in the Spark code. The one in mllib is not distributed (right?) and is probably not an efficient means of computing eigenvectors if you really just want a decomposition of a symmetric matrix. The one I see in graphx is distributed? I haven't used it, though. Maybe it could be part of a solution.
Re: How can I implement eigenvalue decomposition in Spark?
Reza Zadeh has contributed the distributed implementation of (Tall/Skinny) SVD (http://spark.apache.org/docs/latest/mllib-dimensionality-reduction.html), which is in MLlib (Spark 1.0), with a distributed sparse SVD coming in Spark 1.1 (https://issues.apache.org/jira/browse/SPARK-1782). If your data is sparse (which it often is in social networks), you may have better luck with this. I haven't tried the GraphX implementation, but those algorithms are often well-suited for power-law distributed graphs such as you might see in social networks. FWIW, I believe you need to square the elements of the sigma matrix from the SVD to get the eigenvalues.
Re: How can I implement eigenvalue decomposition in Spark?
@Miles, the latest SVD implementation in mllib is partially distributed. Matrix-vector multiplication is computed among all workers, but the right singular vectors are all stored in the driver. If your symmetric matrix is n x n and you want the first k eigenvalues, you will need to fit n x k doubles in the driver's memory. Behind the scenes, it calls ARPACK to compute the eigen-decomposition of A^T A. You can look into the source code for the details. @Sean, the SVD++ implementation in graphx is not the canonical definition of SVD; it doesn't have the orthogonality that SVD holds. But we might want to use graphx as the underlying matrix representation for mllib.SVD to address the problem of skewed entry distribution.

-- Li @vrilleup
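The "n x k doubles in the driver's memory" estimate above is easy to make concrete (plain Python; the million-by-million size is a hypothetical illustration, not a figure from this thread):

```python
def driver_bytes(n, k):
    # k dense singular vectors of length n, stored as 8-byte doubles
    return n * k * 8

# The 5000 x 5000 matrix from the original question, top eigenpair only:
print(driver_bytes(5000, 1))        # 40000 bytes, about 40 KB -- trivial
# A hypothetical million-node graph keeping 100 singular vectors:
print(driver_bytes(1000000, 100))   # 800000000 bytes, i.e. 800 MB on the driver
```

At the 5000 x 5000 scale asked about here, the driver-side storage is clearly not the bottleneck.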
Re: How can I implement eigenvalue decomposition in Spark?
If you just want to find the top eigenvalue / eigenvector, you can do something like the Lanczos method. There is a description of a MapReduce-based algorithm in Section 4.2 of [1].

[1] http://www.cs.cmu.edu/~ukang/papers/HeigenPAKDD2011.pdf
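As a rough illustration of what the Lanczos recurrence computes, here is an in-memory toy sketch in plain Python (not the MapReduce formulation from the paper, and all names are mine): it builds the tridiagonal matrix T from the three-term recurrence and then extracts T's largest eigenvalue, which for a symmetric input matches the largest eigenvalue of A.

```python
import math

# Toy symmetric matrix standing in for the (distributed) adjacency matrix.
# Its eigenvalues are 1, 2 and 4.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lanczos(A, m):
    """m steps of the Lanczos three-term recurrence; returns the diagonal
    (alphas) and off-diagonal (betas) of the tridiagonal matrix T."""
    n = len(A)
    q_prev = [0.0] * n
    q = [1.0 / math.sqrt(n)] * n          # normalized start vector
    alphas, betas = [], []
    beta = 0.0
    for _ in range(m):
        w = mat_vec(A, q)
        alpha = sum(wi * qi for wi, qi in zip(w, q))
        alphas.append(alpha)
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        beta = math.sqrt(sum(x * x for x in w))
        if beta < 1e-12:                  # invariant subspace reached
            break
        betas.append(beta)
        q_prev, q = q, [x / beta for x in w]
    return alphas, betas

def top_eig_tridiag(alphas, betas, iters=500):
    """Largest-magnitude eigenvalue of tridiag(betas, alphas, betas) via
    power iteration -- fine at this toy size."""
    m = len(alphas)
    T = [[0.0] * m for _ in range(m)]
    for i in range(m):
        T[i][i] = alphas[i]
        if i + 1 < m:
            T[i][i + 1] = T[i + 1][i] = betas[i]
    v = [1.0] * m
    lam = 0.0
    for _ in range(iters):
        w = mat_vec(T, v)
        lam = math.sqrt(sum(x * x for x in w))
        v = [x / lam for x in w]
    return lam

alphas, betas = lanczos(A, len(A))
lam_top = top_eig_tridiag(alphas, betas)
print(lam_top)  # ~4.0, the largest eigenvalue of A
```

In the distributed setting of the paper, only the matrix-vector product inside `lanczos` needs to run on the cluster; the small tridiagonal problem fits on one machine.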
Re: How can I implement eigenvalue decomposition in Spark?
The SVD computed result already contains the singular values in descending order, so you can get the biggest eigenvalue.
---
val svd = matrix.computeSVD(matrix.numCols().toInt, computeU = true)
val U: RowMatrix = svd.U
val s: Vector = svd.s
val V: Matrix = svd.V
U.rows.toArray.take(1).foreach(println)
println(s.toArray(0) * s.toArray(0))
V.toArray.take(s.size).foreach(println)
---
xj @ Tokyo
Re: How can I implement eigenvalue decomposition in Spark?
Hi there, what you've suggested are all meaningful. But to make myself clearer, my essential problems are:

1. My matrix is asymmetric: it is a probabilistic adjacency matrix whose entries a_ij represent the likelihood that user j will broadcast the information generated by user i. Apparently a_ij and a_ji are different, because "I love you" doesn't necessarily mean "you love me" (what a sad story~). All entries are real.

2. I know I can get eigenvalues through SVD. My problem is that I can't get the corresponding eigenvectors, which requires solving equations, and I also need the eigenvectors in my calculation. In my simulation of this paper, I only need the biggest eigenvalue and the corresponding eigenvector. The paper posted by Shivaram Venkataraman is also concerned with symmetric matrices.

Could anyone help me out?
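On the asymmetric case specifically: when all entries are nonnegative probabilities, Perron-Frobenius theory (for an irreducible matrix) guarantees a real dominant eigenvalue with a nonnegative eigenvector, so plain power iteration still finds the top eigenpair without symmetrizing. A toy plain-Python sketch (the 3x3 matrix is hypothetical, not the actual data; in Spark, the matrix-vector product in the loop is the part one would distribute, e.g. with GraphX):

```python
import math

# Hypothetical asymmetric "broadcast likelihood" matrix: P[i][j] is the
# chance user j rebroadcasts content from user i (a_ij != a_ji).
P = [[0.0, 0.9, 0.2],
     [0.1, 0.0, 0.8],
     [0.5, 0.3, 0.0]]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

v = [1.0, 1.0, 1.0]
lam = 0.0
for _ in range(500):
    w = mat_vec(P, v)
    lam = math.sqrt(sum(x * x for x in w))   # growth factor -> dominant eigenvalue
    v = [x / lam for x in w]

# Residual check: P v should be close to lam * v at convergence.
resid = max(abs(pv - lam * vi) for pv, vi in zip(mat_vec(P, v), v))
print(lam, resid)
```

This converges for this matrix because the underlying directed graph is strongly connected; for a general asymmetric matrix with complex dominant eigenvalues, power iteration may fail to settle, which is one more reason the symmetrization suggested earlier in the thread is attractive.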