[
https://issues.apache.org/jira/browse/SYSTEMML-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874001#comment-15874001
]
Matthias Boehm commented on SYSTEMML-1044:
------------------------------------------
The issue was caused by the following left indexing operation (169). We have a
memory budget of 14,157MB and apply a mapleftindex because the right-hand-side
input of size 6866MB fits twice into this local budget (pinned + partitioned
matrix): 13,732MB < 14,157MB. However, the "main" but sparse input of size
1,263MB needs to be parallelized from the driver, which keeps references to it,
so in total we exceed the local budget.
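The arithmetic behind this decision can be sketched roughly as follows. This is a simplified illustration, not the actual SystemML check: the function name `fits_map_leftindex` is hypothetical, and counting the parallelized sparse input exactly once on the driver is an assumption based on the description above.

```python
# Sizes in MB, taken from the numbers quoted in this comment.
LOCAL_BUDGET_MB = 14157
RHS_MB = 6866         # right-hand-side input, hop (168)
LHS_SPARSE_MB = 1263  # sparse "main" input B, hop (151)


def fits_map_leftindex(broadcast_mb, budget_mb):
    """Hypothetical check: the broadcast input is held twice
    (pinned + partitioned matrix) and must fit the local budget."""
    return 2 * broadcast_mb < budget_mb


# What the check accounts for: only the broadcast right-hand side.
planned_mb = 2 * RHS_MB

# Assumption: the driver additionally holds the parallelized sparse
# input once, which the check above does not account for.
actual_mb = planned_mb + LHS_SPARSE_MB

print(fits_map_leftindex(RHS_MB, LOCAL_BUDGET_MB))  # check passes
print(actual_mb > LOCAL_BUDGET_MB)                  # yet the budget is exceeded
```

Under these assumptions the check passes on its own terms, while the combined footprint still exceeds the local budget.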
{code}
GENERIC (lines 88-122):
--(151) TRead B [1000000,1000,1000,1000,100000000] [0,0,1263 -> 1263MB], CP
--(150) TRead A [1000000,1000,1000,1000,1000000000] [0,0,7629 -> 7629MB], CP
--(168) rix (150) [1000000,900,1000,1000,-1] [7629,0,6866 -> 14496MB], SPARK
--(157) u(nrow) (151) [-1,-1,-1,-1,-1] [1263,0,0 -> 0MB]
--(169) lix (151,168,157) [1000000,1000,1000,1000,-1] [8129,0,7629 -> 15759MB], SPARK
{code}
Fundamentally, this requires proper handling of parallelized RDDs as targeted
in SYSTEMML-1314, but this is involved and will not make the cut for the 0.13
release.
However, for this particular case, we should generalize our mapleftindex anyway
to broadcast either the left or the right input, which would allow us to
broadcast the sparse left input here. This would significantly reduce the memory
pressure to 6866MB + 2*1263MB = 9392MB here.
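A minimal sketch of how such a generalized choice could look, assuming the footprint model implied by the formula above (broadcast input counted twice, the other input once). The function names `driver_footprint_mb` and `choose_broadcast_side` are hypothetical and do not correspond to existing SystemML code.

```python
def driver_footprint_mb(broadcast_mb, other_mb):
    """Assumed model: the broadcast input is held twice
    (pinned + partitioned), the other input once."""
    return other_mb + 2 * broadcast_mb


def choose_broadcast_side(lhs_mb, rhs_mb):
    """Return the side whose broadcast yields the smaller footprint,
    together with that footprint in MB."""
    left_mb = driver_footprint_mb(lhs_mb, rhs_mb)
    right_mb = driver_footprint_mb(rhs_mb, lhs_mb)
    if left_mb <= right_mb:
        return "left", left_mb
    return "right", right_mb


# Sizes from this issue: sparse left input 1263MB, right input 6866MB.
side, footprint_mb = choose_broadcast_side(1263, 6866)
print(side, footprint_mb)
```

For the sizes in this issue, the sketch picks the sparse left input, and the resulting footprint stays below the 14,157MB local budget.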
> Perftest: Data generator descriptive stats w/ OOM on M, dense
> -------------------------------------------------------------
>
> Key: SYSTEMML-1044
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1044
> Project: SystemML
> Issue Type: Bug
> Reporter: Matthias Boehm
>