[ 
https://issues.apache.org/jira/browse/SYSTEMML-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Baunsgaard resolved SYSTEMML-775.
-------------------------------------------
    Fix Version/s: Not Applicable
       Resolution: Done

> Distribute Data for spark
> -------------------------
>
>                 Key: SYSTEMML-775
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-775
>             Project: SystemDS
>          Issue Type: Question
>          Components: Algorithms
>    Affects Versions: SystemML 0.10
>            Reporter: Johannes Wilke
>            Priority: Minor
>             Fix For: Not Applicable
>
>
> Hi!
> I need to run computations in parallel on a Spark cluster with SystemML.
> The program runs fine on the cluster, but not in parallel, because I don't
> know how to distribute my data across the cluster so that SystemML can
> use it.
> In Scala I have tried the following:
>  val sysMlMatrix = RDDConverterUtils.dataFrameToBinaryBlock(sc, dff, mc, false)
>  sysMlMatrix.saveAsObjectFile("/home/hduser/test.obj")
>  val sysMlMatrix2 = sc.sequenceFile[MatrixIndexes, MatrixBlock]("/home/hduser/test.obj", 1000)
>  val sysMlMatrix3 = JavaPairRDD.fromRDD(sysMlMatrix2)
>  ml.reset()
>  ml.registerInput("X", sysMlMatrix3, numRows, numCols)
> However, I get a ClassCastException when I try to load the object file.
> My matrix has 1000 rows, and I want to process these rows in parallel.
> How can I achieve this? I hope you can help me!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
