[ https://issues.apache.org/jira/browse/SYSTEMML-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Boehm updated SYSTEMML-1390:
-------------------------------------
    Description: This task aims to avoid unnecessary input caching for parfor 
spark datapartition-execute jobs (with grouping) in order to reduce the memory 
pressure and thus garbage collection overhead during shuffle and subsequent 
execution. We only apply this for the general case with grouping and if the 
input is a persisted rdd which has not been cached yet.  (was: This task aims 
to avoid unnecessary input caching for parfor spark datapartition-execute jobs 
(with grouping) in order to reduce the memory pressure and thus garbage 
collection overhead during shuffle and subsequent execution. We only apply this 
for the general case with grouping and if the input is a persisted rdd which 
has not yet been cached.)

> Avoid unnecessary caching of parfor spark datapartition-execute input
> ---------------------------------------------------------------------
>
>                 Key: SYSTEMML-1390
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1390
>             Project: SystemML
>          Issue Type: Sub-task
>          Components: APIs, Runtime
>            Reporter: Matthias Boehm
>             Fix For: SystemML 1.0
>
>
> This task aims to avoid unnecessary input caching for parfor spark 
> datapartition-execute jobs (with grouping) in order to reduce the memory 
> pressure and thus garbage collection overhead during shuffle and subsequent 
> execution. We only apply this for the general case with grouping and if the 
> input is a persisted rdd which has not been cached yet.
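
The condition in the description can be read as a simple predicate. The following is only a minimal sketch of that reading, not SystemML's actual code: `InputInfo`, its fields, and `skip_input_caching` are hypothetical names introduced for illustration, and the interpretation of "persisted rdd which has not been cached yet" as two independent flags is an assumption.

```python
from dataclasses import dataclass


@dataclass
class InputInfo:
    """Hypothetical summary of a parfor input (not a SystemML class)."""
    is_persisted_rdd: bool  # assumption: input is backed by a persisted RDD
    is_cached: bool         # assumption: that RDD has already been cached


def skip_input_caching(inp: InputInfo, with_grouping: bool) -> bool:
    """Return True if input caching should be avoided for the job.

    Mirrors the description: applies only to the general case with
    grouping, and only if the input is a persisted RDD that has not
    been cached yet.
    """
    return with_grouping and inp.is_persisted_rdd and not inp.is_cached


# Usage: an uncached persisted RDD feeding a job with grouping.
print(skip_input_caching(InputInfo(is_persisted_rdd=True, is_cached=False), with_grouping=True))
```

Under this reading, an input that is already cached keeps its cached form, since skipping the cache would save no memory in that case.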



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
