Imran Younus created SYSTEMML-1172:
--------------------------------------
Summary: Matrix Multiply (cpmm) fails in spark mode.
Key: SYSTEMML-1172
URL: https://issues.apache.org/jira/browse/SYSTEMML-1172
Project: SystemML
Issue Type: Bug
Environment: spark 1.6.1
Three-node cluster with 512 GB RAM and 48 cores per node.
Reporter: Imran Younus
Attachments: output_ATA_50k.log
I'm running this simple DML script on a {{50k x 50k}} matrix:
{code}
N = $N
X = Rand(rows=N, cols=N, max=1, min=-1, pdf="uniform")
A = t(X) %*% X
fn = sum(A * A)
print(fn)
{code}
I'm running this with spark 1.6.1:
{code}
/opt/spark-1.6.2-bin-hadoop2.6/bin/spark-submit
--master=spark://rr-ram4.softlayer.com:7077 --executor-memory=40g
--driver-memory=40g sysml/target/SystemML.jar -f genDataForCholeskey.dml
-explain -stats -nvargs N=50000 output=/user/iyounus/data/PDmatrix_50k.csv >&
output_ATA_50k_no_writing.log
{code}
When this code runs, the executors start dying with Java heap
OutOfMemoryError. After multiple retries, the job fails.
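For scale, here is a rough back-of-the-envelope estimate of the size of one dense {{50k x 50k}} matrix. This is only a sketch: it assumes dense storage with 8-byte doubles, ignores JVM object overhead and any blocking or partitioning SystemML applies, and makes no claim about the actual cause of the failure. The helper name dense_matrix_bytes is made up for illustration.

```python
def dense_matrix_bytes(rows, cols, bytes_per_value=8):
    """Raw size of a dense matrix, assuming 8-byte doubles (an assumption)."""
    return rows * cols * bytes_per_value

n = 50_000
# Decimal gigabytes (1 GB = 1e9 bytes), raw values only, no overhead.
size_gb = dense_matrix_bytes(n, n) / 1e9
print(f"{n} x {n} dense double matrix: {size_gb:.0f} GB")  # prints 20 GB
```

Under these assumptions, X and A = t(X) %*% X are each about 20 GB of raw values, compared with the 40g given to each executor and to the driver.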
The exact same computation in Python using NumPy takes 7 minutes.
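A minimal sketch of what the NumPy version of this computation might look like. It is not the script that was actually timed: the function name gram_squared_sum, the seed, and the use of float64 are assumptions, and the demo call uses a small N so it runs quickly.

```python
import numpy as np

def gram_squared_sum(n, seed=0):
    """Return sum(A * A) for A = X^T X, with X a random n x n matrix.

    Mirrors the DML script: entries of X are drawn uniformly from [-1, 1).
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, n))  # Rand(rows=N, cols=N, min=-1, max=1)
    A = X.T @ X                              # t(X) %*% X
    return float(np.sum(A * A))              # sum(A * A), element-wise square

if __name__ == "__main__":
    # Small N for a quick check; the report above uses N = 50000.
    print(gram_squared_sum(500))
```

The printed value is the squared Frobenius norm of A, which gives a single number to compare between the two implementations for a fixed input matrix.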
The log file is attached.