Hi there, what you've all suggested is meaningful. But to make myself clearer, my essential problems are:

1. My matrix is asymmetric: it is a probabilistic adjacency matrix whose entry a_ij represents the likelihood that user j will broadcast the information generated by user i. Clearly a_ij and a_ji are generally different, because "I love you" doesn't necessarily mean you love me (what a sad story~). All entries are real.

2. I know I can get eigenvalues through SVD. My problem is that I can't get the corresponding eigenvectors, which requires solving equations, and I also need the eigenvectors in my calculation. In my simulation of this paper, I only need the biggest eigenvalue and the corresponding eigenvector.

The paper posted by Shivaram Venkataraman is also concerned with symmetric matrices. Could anyone help me out?
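Since the matrix in question is asymmetric but has nonnegative real entries (they are probabilities), the Perron-Frobenius theorem says its largest-magnitude eigenvalue is real and nonnegative (and simple, if the matrix is irreducible), so plain power iteration converges to exactly the top eigenpair the poster needs. This is a minimal local NumPy sketch of that idea, not Spark code and not what mllib does internally; the function name and the tiny example matrix are made up for illustration:

```python
import numpy as np

def top_eigenpair(a, iters=1000, tol=1e-12):
    """Power iteration: dominant eigenvalue/eigenvector of a square matrix.

    Assumes the dominant eigenvalue is real, simple, and strictly larger
    in magnitude than all others (holds e.g. for irreducible nonnegative
    matrices by Perron-Frobenius).
    """
    n = a.shape[0]
    v = np.ones(n) / np.sqrt(n)        # arbitrary nonzero start vector
    lam = 0.0
    for _ in range(iters):
        w = a @ v                      # one matrix-vector product per step
        v_new = w / np.linalg.norm(w)
        lam_new = v_new @ a @ v_new    # Rayleigh-quotient eigenvalue estimate
        if abs(lam_new - lam) < tol:
            lam, v = lam_new, v_new
            break
        lam, v = lam_new, v_new
    return lam, v

# Tiny asymmetric "who rebroadcasts whom" matrix (made-up numbers):
# A[i, j] = probability that user j rebroadcasts user i's post.
A = np.array([[0.0, 0.9, 0.1],
              [0.2, 0.0, 0.7],
              [0.5, 0.3, 0.0]])
lam, v = top_eigenpair(A)
```

Each step only needs one matrix-vector product, which is the part that distributes naturally (Spark's RowMatrix can multiply by a broadcast vector), so the same idea scales to the 5000000 x 5000000 case far more cheaply than a full decomposition.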
2014-08-08 9:41 GMT+08:00 x <wasedax...@gmail.com>:
> The SVD result already contains the singular values in descending order,
> so you can get the biggest one directly.
>
> ---
>
> val svd = matrix.computeSVD(matrix.numCols().toInt, computeU = true)
> val U: RowMatrix = svd.U
> val s: Vector = svd.s
> val V: Matrix = svd.V
>
> U.rows.toArray.take(1).foreach(println)
>
> println(s.toArray(0) * s.toArray(0))
>
> V.toArray.take(s.size).foreach(println)
>
> ---
>
> xj @ Tokyo
>
> On Fri, Aug 8, 2014 at 3:06 AM, Shivaram Venkataraman <
> shiva...@eecs.berkeley.edu> wrote:
>
>> If you just want to find the top eigenvalue / eigenvector you can do
>> something like the Lanczos method. There is a description of a MapReduce
>> based algorithm in Section 4.2 of [1]
>>
>> [1] http://www.cs.cmu.edu/~ukang/papers/HeigenPAKDD2011.pdf
>>
>> On Thu, Aug 7, 2014 at 10:54 AM, Li Pu <l...@twitter.com.invalid> wrote:
>>
>>> @Miles, the latest SVD implementation in mllib is partially distributed.
>>> Matrix-vector multiplication is computed among all workers, but the right
>>> singular vectors are all stored in the driver. If your symmetric matrix is
>>> n x n and you want the first k eigenvalues, you will need to fit n x k
>>> doubles in the driver's memory. Behind the scenes, it calls ARPACK to
>>> compute the eigendecomposition of A^T A. You can look into the source
>>> code for the details.
>>>
>>> @Sean, the SVD++ implementation in graphx is not the canonical
>>> definition of SVD. It doesn't have the orthogonality that SVD holds. But
>>> we might want to use graphx as the underlying matrix representation for
>>> mllib.SVD to address the problem of skewed entry distribution.
>>>
>>> On Thu, Aug 7, 2014 at 10:51 AM, Evan R. Sparks <evan.spa...@gmail.com>
>>> wrote:
>>>
>>>> Reza Zadeh has contributed the distributed implementation of
>>>> (Tall/Skinny) SVD
>>>> (http://spark.apache.org/docs/latest/mllib-dimensionality-reduction.html),
>>>> which is in MLlib (Spark 1.0), and a distributed sparse SVD is coming in
>>>> Spark 1.1 (https://issues.apache.org/jira/browse/SPARK-1782). If your
>>>> data is sparse (which it often is in social networks), you may have
>>>> better luck with this.
>>>>
>>>> I haven't tried the GraphX implementation, but those algorithms are
>>>> often well-suited for power-law distributed graphs as you might see in
>>>> social networks.
>>>>
>>>> FWIW, I believe you need to square the elements of the sigma matrix
>>>> from the SVD to get the eigenvalues.
>>>>
>>>> On Thu, Aug 7, 2014 at 10:20 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>
>>>>> (-incubator, +user)
>>>>>
>>>>> If your matrix is symmetric (and real, I presume), and if my linear
>>>>> algebra isn't too rusty, then its SVD is its eigendecomposition. The
>>>>> SingularValueDecomposition object you get back has U and V, both of
>>>>> which have columns that are the eigenvectors.
>>>>>
>>>>> There are a few SVDs in the Spark code. The one in mllib is not
>>>>> distributed (right?) and is probably not an efficient means of
>>>>> computing eigenvectors if you really just want a decomposition of a
>>>>> symmetric matrix.
>>>>>
>>>>> The one I see in graphx is distributed? I haven't used it, though.
>>>>> Maybe it could be part of a solution.
>>>>>
>>>>> On Thu, Aug 7, 2014 at 2:21 PM, yaochunnan <yaochun...@gmail.com>
>>>>> wrote:
>>>>> > Our lab needs to do some simulation on online social networks. We
>>>>> > need to handle a 5000*5000 adjacency matrix, namely, to get its
>>>>> > largest eigenvalue and the corresponding eigenvector. Matlab can be
>>>>> > used, but it is time-consuming. Is Spark effective for linear
>>>>> > algebra calculations and transformations? Later we would have a
>>>>> > 5000000*5000000 matrix to process. It seems urgent that we find
>>>>> > some distributed computation platform.
>>>>> >
>>>>> > I see SVD has been implemented, and I can get the eigenvalues of a
>>>>> > matrix through this API. But when I want to get both eigenvalues and
>>>>> > eigenvectors, or at least the biggest eigenvalue and the
>>>>> > corresponding eigenvector, it seems that current Spark doesn't have
>>>>> > such an API. Is it possible for me to write eigenvalue decomposition
>>>>> > from scratch? What should I do? Thanks a lot!
>>>>> >
>>>>> > Miles Yao
>>>>> >
>>>>> > ________________________________
>>>>> > View this message in context: How can I implement eigenvalue
>>>>> > decomposition in Spark?
>>>>> > Sent from the Apache Spark User List mailing list archive at
>>>>> > Nabble.com.
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>
>>> --
>>> Li
>>> @vrilleup
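To make the two SVD points in the thread concrete: for any real matrix A, the squared singular values of A are exactly the eigenvalues of the symmetric matrix A^T A (which, per Li Pu, is what mllib's computeSVD hands to ARPACK). A quick NumPy check, independent of Spark, with a randomly generated matrix just for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 5))               # asymmetric real matrix

# Singular values of A, returned in descending order...
sigma = np.linalg.svd(A, compute_uv=False)

# ...squared, they equal the eigenvalues of A^T A
# (eigvalsh returns eigenvalues of a symmetric matrix in ascending order).
evals = np.linalg.eigvalsh(A.T @ A)

print(np.allclose(sigma**2, evals[::-1]))   # prints True
```

Note the caveat this implies for the original question: the singular vectors are eigenvectors of A^T A (and A A^T), not of A itself. For an asymmetric A they are generally not eigenvectors of A, which is why the SVD route alone doesn't solve the poster's problem and an iterative method on A itself (power iteration, or Arnoldi-style methods for nonsymmetric matrices) is still needed for the top eigenpair.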