Matthias Boehm created SYSTEMML-1019:
----------------------------------------
Summary: Perftest: MSVM, L sparse, unnecessary under-utilization
(parfor dop)
Key: SYSTEMML-1019
URL: https://issues.apache.org/jira/browse/SYSTEMML-1019
Project: SystemML
Issue Type: Bug
Reporter: Matthias Boehm
In execution modes spark and hybrid_spark, the parfor optimizer uses a reduced
memory budget (50% of the normal memory budget) when deciding on the local parfor
degree of parallelism, in order to avoid unnecessary "guarded collect" over
HDFS. However, "guarded collect" can only happen if there are Spark
instructions; otherwise, the reduced budget leads to unnecessary under-utilization
whenever the memory requirement limits the degree of parallelism.
We should determine whether there are Spark instructions in the parfor body and
adapt the memory budget the parfor optimizer works with accordingly. On this
MSVM use case, the performance differences are as follows:
{code}
#before the change
MSVM train ict=0 on mbperftest/multinomial/X10M_1k_sparse_k150: 144
MSVM train ict=1 on mbperftest/multinomial/X10M_1k_sparse_k150: 156
#after the change
MSVM train ict=0 on mbperftest/multinomial/X10M_1k_sparse_k150: 99
MSVM train ict=1 on mbperftest/multinomial/X10M_1k_sparse_k150: 117
{code}
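The proposed budget decision could be sketched roughly as follows. This is only an illustration of the idea, not the actual SystemML optimizer code: the class, enum, and method names are hypothetical, and the only value taken from the description above is the 50% factor. The sketch assumes the local degree of parallelism is bounded by both the number of cores and the number of workers that fit into the memory budget.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names are SystemML APIs.
final class ParForBudgetSketch {
    // Simplified stand-in for the execution type of an instruction in the parfor body.
    enum ExecType { CP, SPARK }

    // Reduced budget factor from the issue description (50% of the normal budget).
    static final double GUARDED_COLLECT_FACTOR = 0.5;

    // True if at least one instruction in the parfor body runs on Spark.
    static boolean containsSparkInst(List<ExecType> body) {
        return body.contains(ExecType.SPARK);
    }

    // Only shrink the budget if "guarded collect" can actually occur,
    // i.e., if the body contains Spark instructions.
    static double parforMemBudget(double normalBudget, List<ExecType> body) {
        return containsSparkInst(body)
            ? GUARDED_COLLECT_FACTOR * normalBudget
            : normalBudget;
    }

    // Assumed model: dop is limited by cores and by how many workers fit in memory.
    static int localDop(double budget, double memPerWorker, int cores) {
        if (memPerWorker <= 0)
            return cores;
        int byMem = (int) Math.floor(budget / memPerWorker);
        return Math.max(1, Math.min(cores, byMem));
    }

    public static void main(String[] args) {
        List<ExecType> cpOnly = Arrays.asList(ExecType.CP, ExecType.CP);
        List<ExecType> mixed = Arrays.asList(ExecType.CP, ExecType.SPARK);
        // Illustrative numbers only: 1000 MB budget, 100 MB per worker, 16 cores.
        System.out.println(localDop(parforMemBudget(1000, cpOnly), 100, 16));
        System.out.println(localDop(parforMemBudget(1000, mixed), 100, 16));
    }
}
```

With a body that has no Spark instructions, the full budget is used, so a memory-bound parfor can run with a higher local degree of parallelism than under the halved budget.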