Re: How can I implement eigenvalue decomposition in Spark?

2014-08-08 Thread Chitturi Padma
 There are a few SVDs in the Spark code. The one in mllib is not
 distributed (right?) and is probably not an efficient means of
 computing eigenvectors if you really just want a decomposition of a
 symmetric matrix.

 The one I see in graphx is distributed? I haven't used it though.
 Maybe it could be part of a solution.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-implement-eigenvalue-decomposition-in-Spark-tp11646p11778.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: How can I implement eigenvalue decomposition in Spark?

2014-08-08 Thread Sean Owen
The SVD does not in general give you eigenvalues of its input.

Are you just trying to access the U and V matrices? They are also
returned in the API, but they are not the eigenvectors of M, as you
note.

I don't think MLlib has anything to help with the general eigenvector problem.
Maybe you can implement a sort of power iteration algorithm using
GraphX to find the largest eigenvector?
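The power iteration Sean suggests can be sketched outside Spark first. Here is a plain-Python toy (the function name and the 2x2 test matrix are mine, not from the thread); the `A*v` product inside the loop is the only step a GraphX/RDD version would need to distribute:

```python
def power_iteration(A, iters=200):
    """Estimate the dominant eigenpair of square matrix A (list of rows).

    Returns (eigenvalue, unit eigenvector). In a distributed setting the
    matrix-vector product below would be a map/reduce over the rows of A;
    here it is a plain loop for illustration.
    """
    n = len(A)
    v = [1.0] + [0.0] * (n - 1)   # any start vector not orthogonal to the answer
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]  # w = A v
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]  # renormalize every step
    # Rayleigh quotient v^T (A v) gives the eigenvalue estimate
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * Av[i] for i in range(n))
    return lam, v

# Symmetric test matrix with known largest eigenvalue 3, eigenvector [1,1]/sqrt(2)
lam, v = power_iteration([[2.0, 1.0], [1.0, 2.0]])
```

Convergence is geometric in the ratio of the two largest eigenvalue magnitudes, so for well-separated spectra a few hundred distributed matrix-vector products suffice.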


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: How can I implement eigenvalue decomposition in Spark?

2014-08-08 Thread Li Pu
@Miles, eigen-decomposition of an asymmetric matrix doesn't always give
real-valued solutions, and it doesn't have the nice properties that a
symmetric matrix holds. Usually you want to symmetrize your asymmetric
matrix in some way, e.g. see
http://machinelearning.wustl.edu/mlpapers/paper_files/icml2005_ZhouHS05.pdf
but as Sean mentioned, you can always compute the largest eigenpair with
the power method or some variation like PageRank, which is already
implemented in graphx.
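The simplest symmetrization (my toy illustration, not code from the thread, and not the scheme of the linked paper) is S = (A + A^T)/2, which is symmetric by construction and therefore has real eigenvalues and orthogonal eigenvectors:

```python
def symmetrize(A):
    """Return (A + A^T) / 2 for a square matrix given as a list of rows."""
    n = len(A)
    return [[(A[i][j] + A[j][i]) / 2.0 for j in range(n)] for i in range(n)]

# An asymmetric 'probabilistic adjacency' toy matrix: a_ij != a_ji in general.
A = [[0.0, 0.9, 0.1],
     [0.2, 0.0, 0.5],
     [0.7, 0.3, 0.0]]
S = symmetrize(A)
# S[i][j] == S[j][i] for all i, j, so eigen-decomposition of S is well behaved.
```

Whether the symmetrized matrix still answers the original modeling question is a separate issue; it averages the two directed likelihoods.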






-- 
Li
@vrilleup


Re: How can I implement eigenvalue decomposition in Spark?

2014-08-08 Thread x
Generally an adjacency matrix on a social network is undirected
(symmetric), so you can read the eigenvectors off the computed SVD
result.

A = UDV^t

The first column of U is the eigenvector corresponding to the largest
singular value, the first entry of D.
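One caveat worth adding: this identification is only safe when the symmetric matrix is also positive semi-definite, since singular values are the absolute values of the eigenvalues. A small local numpy check of the claim (my illustration, not Spark code):

```python
import numpy as np

# Symmetric positive semi-definite matrix: singular values equal eigenvalues.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # eigenvalues 3 and 1

U, s, Vt = np.linalg.svd(A)         # s is sorted in descending order
w, Q = np.linalg.eigh(A)            # eigh returns eigenvalues in ascending order

top_sv = s[0]                       # largest singular value
top_ev = w[-1]                      # largest eigenvalue
u0 = U[:, 0]                        # first column of U
q_top = Q[:, -1]                    # eigenvector of the largest eigenvalue
# SVD and eigh each fix signs arbitrarily, so align before comparing
aligned = u0 if float(np.dot(u0, q_top)) > 0 else -u0
```

For an indefinite symmetric matrix a negative eigenvalue of large magnitude shows up as a large positive singular value, so the first column of U would not be the "biggest" eigenvector in the signed sense.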

xj @ Tokyo





How can I implement eigenvalue decomposition in Spark?

2014-08-07 Thread yaochunnan
Our lab needs to run some simulations on online social networks. We need to
handle a 5000*5000 adjacency matrix, namely, to get its largest eigenvalue
and the corresponding eigenvector. Matlab can be used, but it is time-consuming.
Is Spark effective for linear algebra calculations and transformations?
Later we would have a 500*500 matrix to process. It seems urgent
that we find some distributed computation platform.

I see SVD has been implemented, and I can get the eigenvalues of a matrix
through this API. But when I want both the eigenvalues and the eigenvectors,
or at least the biggest eigenvalue and the corresponding eigenvector, it
seems that current Spark doesn't have such an API. Is it possible for me to
write eigenvalue decomposition from scratch? What should I do? Thanks a lot!


Miles Yao




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-implement-eigenvalue-decomposition-in-Spark-tp11646.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: How can I implement eigenvalue decomposition in Spark?

2014-08-07 Thread Sean Owen
(-incubator, +user)

If your matrix is symmetric (and real, I presume), and if my linear
algebra isn't too rusty, then its SVD is its eigendecomposition, up to
the signs of the eigenvalues. The SingularValueDecomposition object you
get back has U and V, both of which have columns that are the
eigenvectors.

There are a few SVDs in the Spark code. The one in mllib is not
distributed (right?) and is probably not an efficient means of
computing eigenvectors if you really just want a decomposition of a
symmetric matrix.

The one I see in graphx is distributed? I haven't used it though.
Maybe it could be part of a solution.







Re: How can I implement eigenvalue decomposition in Spark?

2014-08-07 Thread Evan R. Sparks
Reza Zadeh has contributed the distributed implementation of (Tall/Skinny)
SVD (http://spark.apache.org/docs/latest/mllib-dimensionality-reduction.html),
which is in MLlib (Spark 1.0) and a distributed sparse SVD coming in Spark
1.1. (https://issues.apache.org/jira/browse/SPARK-1782). If your data is
sparse (which it often is in social networks), you may have better luck
with this.

I haven't tried the GraphX implementation, but those algorithms are often
well-suited for power-law distributed graphs as you might see in social
networks.

FWIW, I believe you need to square the elements of the sigma matrix from
the SVD to get the eigenvalues (they are the eigenvalues of A^T A).
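That squaring relationship can be checked locally (a numpy illustration, not Spark code; the test matrix is mine): the singular values of any real A are the square roots of the eigenvalues of A^T A.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])          # any real matrix, symmetric or not

sigma = np.linalg.svd(A, compute_uv=False)            # singular values, descending
evals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]    # eigenvalues of A^T A, descending
# sigma_i ** 2 equals the i-th eigenvalue of A^T A, term by term
```

So from MLlib's returned sigma you get the spectrum of A^T A directly; recovering signed eigenvalues of A itself needs the symmetry assumptions discussed elsewhere in this thread.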








Re: How can I implement eigenvalue decomposition in Spark?

2014-08-07 Thread Li Pu
@Miles, the latest SVD implementation in mllib is partially distributed.
Matrix-vector multiplication is computed among all workers, but the right
singular vectors are all stored in the driver. If your symmetric matrix is
n x n and you want the first k eigenvalues, you will need to fit n x k
doubles in the driver's memory. Behind the scenes, it calls ARPACK to
compute the eigen-decomposition of A^T A. You can look into the source
code for the details.

@Sean, the SVD++ implementation in graphx is not the canonical definition
of SVD. It doesn't have the orthogonality that SVD holds. But we might want
to use graphx as the underlying matrix representation for mllib.SVD to
address the problem of skewed entry distribution.







-- 
Li
@vrilleup


Re: How can I implement eigenvalue decomposition in Spark?

2014-08-07 Thread Shivaram Venkataraman
If you just want to find the top eigenvalue / eigenvector, you can do
something like the Lanczos method. There is a description of a
MapReduce-based algorithm in Section 4.2 of [1].

[1] http://www.cs.cmu.edu/~ukang/papers/HeigenPAKDD2011.pdf
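A single-machine sketch of the idea (my toy numpy version of plain Lanczos, not the MapReduce variant from the paper): build a small tridiagonal matrix T whose extreme eigenvalues approximate those of the symmetric input A. Each step needs only one A @ v product, which is exactly the operation a MapReduce/Spark implementation would distribute.

```python
import numpy as np

def lanczos_top_eig(A, k=20, seed=0):
    """Approximate the largest eigenvalue of symmetric A with up to k Lanczos steps."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)              # unit starting vector
    v_prev = np.zeros(n)
    beta = 0.0
    alphas, betas = [], []
    for _ in range(min(k, n)):
        w = A @ v - beta * v_prev       # the only matrix-vector product per step
        alpha = float(v @ w)
        w = w - alpha * v               # orthogonalize against the current vector
        alphas.append(alpha)
        beta = float(np.linalg.norm(w))
        if beta < 1e-12:                # Krylov space exhausted
            break
        betas.append(beta)
        v_prev, v = v, w / beta
    m = len(alphas)
    T = np.diag(alphas)                 # small m x m tridiagonal matrix
    if m > 1:
        off = np.array(betas[:m - 1])
        T += np.diag(off, 1) + np.diag(off, -1)
    return float(np.linalg.eigvalsh(T).max())

# Tridiagonal symmetric test matrix; its largest eigenvalue is 2 + sqrt(2).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
top = lanczos_top_eig(A)
```

This plain version skips the reorthogonalization that a production implementation (or the paper's HEigen) needs for numerical stability at scale.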





Re: How can I implement eigenvalue decomposition in Spark?

2014-08-07 Thread x
 The SVD result already contains the singular values in descending
order, so you can get the biggest one from the first entry.

---

  // matrix is assumed to be a RowMatrix; computeSVD keeps U distributed
  val svd = matrix.computeSVD(matrix.numCols().toInt, computeU = true)
  val U: RowMatrix = svd.U   // left singular vectors, stored as distributed rows
  val s: Vector = svd.s      // singular values, in descending order
  val V: Matrix = svd.V      // right singular vectors, column-major local matrix

  // first row of U (the first component of each left singular vector)
  U.rows.toArray.take(1).foreach(println)

  // square of the top singular value = top eigenvalue of A^T A
  println(s.toArray(0) * s.toArray(0))

  // first column of V = top right singular vector (the top eigenvector
  // when the matrix is symmetric positive semi-definite)
  V.toArray.take(s.size).foreach(println)

---

xj @ Tokyo






Re: How can I implement eigenvalue decomposition in Spark?

2014-08-07 Thread Chunnan Yao
Hi there, what you've suggested is all meaningful. But to make myself
clearer, my essential problems are:
1. My matrix is asymmetric, and it is a probabilistic adjacency matrix,
whose entries (a_ij) represent the likelihood that user j will broadcast
the information generated by user i. Apparently, a_ij and a_ji are
different, 'cause "I love you" doesn't necessarily mean "you love me"
(what a sad story~). All entries are real.
2. I know I can get eigenvalues through SVD. My problem is that I can't
get the corresponding eigenvectors, which requires solving equations, and
I also need the eigenvectors in my calculation. In my simulation of this
paper, I only need the biggest eigenvalue and the corresponding
eigenvector.
The paper posted by Shivaram Venkataraman is also concerned with
symmetric matrices. Could anyone help me out?



 @vrilleup