Hi, I am trying to multiply a matrix of size 67584 x 67584 in a loop. In the first iteration the multiplication goes through, but in the second iteration it fails with a Java heap out-of-memory error. I'm using PySpark, and below is the configuration.

Setup: 70 nodes (1 driver + 69 workers) with SPARK_DRIVER_MEMORY=32g, SPARK_WORKER_CORES=16, SPARK_WORKER_MEMORY=20g, SPARK_EXECUTOR_MEMORY=5g, spark.executor.cores=5
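For reference, here is a quick back-of-the-envelope check of the sizes involved, using only the numbers above (this assumes dense float64 blocks; a sparse representation would be smaller):

```python
# Rough arithmetic for the setup above (pure Python, no Spark needed).
# Assumption: blocks are dense float64, 8 bytes per entry.
n = 67584           # matrix dimension
block = 1024        # block size
bytes_per_entry = 8

blocks_per_dim = n // block                      # 66 blocks along each dimension
total_blocks = blocks_per_dim ** 2               # 4356 blocks in the grid
block_bytes = block * block * bytes_per_entry    # 8 MiB per block
matrix_gib = total_blocks * block_bytes / 2**30  # ~34 GiB per dense copy

print(f"{blocks_per_dim} x {blocks_per_dim} block grid, {total_blocks} blocks")
print(f"{block_bytes / 2**20:.0f} MiB per block, {matrix_gib:.1f} GiB per dense matrix copy")

# Total worker memory available in the cluster, from the configuration above:
workers, worker_gb = 69, 20
print(f"cluster worker memory: {workers * worker_gb} GB")
```

So each dense copy of the matrix is on the order of 34 GiB, and during `multiply` the inputs, the result, and the shuffled intermediate blocks are all live at once, against 5 GB per executor.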
Data: matrix size 67584 x 67584, block size 1024.

I load a number of MATLAB .mat files using textFile, form a block RDD with each file read becoming one block, and create a BlockMatrix (A). Then I multiply the matrix with itself in a loop, basically to get its powers (A^2, A^4, ...), using the multiply method from BlockMatrix:

    for i in range(3):
        A = A.multiply(A)

But somehow the multiplication always fails with out-of-memory errors after the second iteration.

What am I missing? What is the correct way to load a big matrix file (.mat) from the local filesystem into an RDD, create a BlockMatrix, and do repeated multiplication?

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/BlockMatrix-Multiplication-fails-with-Out-of-Memory-tp18869.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.