[ https://issues.apache.org/jira/browse/SPARK-7594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-7594.
------------------------------
Resolution: Invalid
Please ask questions at user@
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
I think the issue is that the resulting Gramian would then need more entries
in an internal array than a single JVM array can hold (2^31 - 1). At this
scale you'd also be passing around arrays of tens of gigabytes, which is
probably well beyond what's practical for this implementation.
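
For a rough sense of scale, here is a small self-contained sketch (plain
Scala, not Spark code; the object name and the sample column counts are made
up for illustration) of how the Gramian's entry count and driver-side memory
grow with the number of columns n:

{code:scala}
// Rough sizing sketch: the Gramian of an n-column RowMatrix has n x n
// entries, or n * (n + 1) / 2 when stored as a packed upper triangle,
// and that packed array must fit in a single Int-indexed JVM array.
object GramianSize extends App {
  val maxArrayElems = Int.MaxValue.toLong // a JVM array holds at most 2^31 - 1 elements

  for (n <- Seq(10000L, 65535L, 65536L, 1000000L)) {
    val dense = n * n                       // entries in the full n x n Gramian
    val packed = n * (n + 1) / 2            // entries in the packed upper triangle
    val denseGiB = dense * 8.0 / (1L << 30) // Doubles are 8 bytes each
    println(s"n=$n: dense=$dense (~${denseGiB.round} GiB), packed=$packed, " +
      s"packed fits in one array: ${packed <= maxArrayElems}")
  }
}
{code}

The packed entry count crosses the 2^31 - 1 array-index limit between
n = 65535 and n = 65536, which matches the cap in RowMatrix, and the dense
form at that size is already roughly 32 GiB of Doubles, i.e. the tens of
gigabytes mentioned above.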
> Increase maximum number of columns for covariance matrix for principal
> components
> ---------------------------------------------------------------------------------
>
> Key: SPARK-7594
> URL: https://issues.apache.org/jira/browse/SPARK-7594
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: Sebastian Alfers
> Priority: Minor
>
> In order to compute a huge dataset, the number of columns allowed when
> calculating the covariance matrix is limited:
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala#L129
> What is the reason behind this limitation, and can it be extended?
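
For reference, the check at the line linked above amounts to a guard of
roughly this shape (a paraphrased sketch, not the verbatim Spark source):

{code:scala}
object ColumnLimit {
  // Paraphrased sketch of RowMatrix.checkNumColumns, not the verbatim
  // source. The Gramian is held on the driver as a packed upper triangle
  // of n * (n + 1) / 2 Doubles, which must fit in one Int-indexed array;
  // for n = 65535 that is 2,147,450,880 entries, just under Int.MaxValue.
  def checkNumColumns(cols: Int): Unit = {
    if (cols > 65535) {
      throw new IllegalArgumentException(
        s"Argument with more than 65535 cols: $cols")
    }
  }
}
{code}

Raising the cap would presumably require a blocked or Long-indexed
representation of the Gramian rather than a single flat driver-side array.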