[ 
https://issues.apache.org/jira/browse/SYSTEMML-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564750#comment-16564750
 ] 

Matthias Boehm commented on SYSTEMML-2476:
------------------------------------------

Thanks for catching this, [~Guobao]. Let me demystify this by explaining the 
three overlapping issues here:
* You see MR instead of SPARK jobs because the tests did not set the hybrid 
SPARK mode, and hence we're running in plain hybrid mode (i.e., CP and MR).
* These distributed operations are caused by a missing literal replacement for 
scalar lookups into lists, which makes C unknown. Because the output sizes of 
operations in the same DAG depend on C, we conservatively compile distributed 
operations. I have an extension of the recompiler that fixes these unnecessary 
distributed operations.
* However, there is a remaining issue: C comes out of the list with value 
type STRING. I made the runtime robust enough to handle this, but we should 
also fix the root cause. I can look into this remaining issue tomorrow. Until 
then, please leave the JIRA open.
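
To make the second and third points more concrete, here is a toy Java sketch. It is not SystemML code: the class, the methods, and the 1 GB budget are all hypothetical and only model the idea that an unknown dimension forces a conservative distributed plan, while replacing the list lookup with its literal value (even when stored as a STRING) makes the size known.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Toy model only: all names and the budget below are illustrative assumptions.
class ExecTypeSketch {
    enum ExecType { CP, DISTRIBUTED }

    // Hypothetical memory budget for single-node (CP) execution, in bytes.
    static final long CP_BUDGET_BYTES = 1L << 30;

    // Size of a dense rows x cols matrix of doubles; empty if cols is unknown.
    static OptionalLong estimate(long rows, OptionalLong cols) {
        if (!cols.isPresent()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(rows * cols.getAsLong() * 8L);
    }

    // Unknown size means we cannot prove the operation fits in memory,
    // so the conservative choice is a distributed operation.
    static ExecType select(OptionalLong memBytes) {
        if (!memBytes.isPresent()) {
            return ExecType.DISTRIBUTED;
        }
        return memBytes.getAsLong() <= CP_BUDGET_BYTES ? ExecType.CP : ExecType.DISTRIBUTED;
    }

    // Toy "literal replacement": resolve a scalar lookup into a list.
    // Without it, the value stays unknown. Values of type String are parsed too.
    static OptionalLong resolve(Map<String, Object> list, String key, boolean literalReplacement) {
        if (!literalReplacement || !list.containsKey(key)) {
            return OptionalLong.empty();
        }
        Object v = list.get(key);
        if (v instanceof Number) {
            return OptionalLong.of(((Number) v).longValue());
        }
        try {
            return OptionalLong.of(Long.parseLong(String.valueOf(v).trim()));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        Map<String, Object> hyperparams = new HashMap<>();
        hyperparams.put("C", "1"); // stored with value type STRING
        long rows = 1024;
        // No literal replacement: C is unknown, so the plan is distributed.
        System.out.println(select(estimate(rows, resolve(hyperparams, "C", false)))); // DISTRIBUTED
        // With literal replacement: the size is known and small, so CP suffices.
        System.out.println(select(estimate(rows, resolve(hyperparams, "C", true)))); // CP
    }
}
```

In this sketch the only difference between the two calls is whether the lookup is resolved before the execution type is selected, which mirrors why the missing literal replacement alone was enough to trigger distributed jobs.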

> Unexpected mapreduce task
> -------------------------
>
>                 Key: SYSTEMML-2476
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2476
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: LI Guobao
>            Priority: Major
>
> When using scalar casting to get an element from a list, unexpected 
> mapreduce tasks are launched instead of running in CP mode. The scenario is 
> to replace *C = 1* with *C = as.scalar(hyperparams["C"])* inside the {{_gradient 
> function_}} found in 
> {{_src/test/scripts/functions/paramserv/mnist_lenet_paramserv.dml_}}. The 
> problem can then be reproduced by running the method 
> {{_testParamservBSPBatchDisjointContiguous_}} in class 
> _{{org.apache.sysml.test.integration.functions.paramserv.ParamservLocalNNTest}}_.
> Here is the log output:
> {code:java}
> 18/07/31 22:10:27 INFO mapred.MapTask: numReduceTasks: 1
> 18/07/31 22:10:27 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
> 18/07/31 22:10:27 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
> 18/07/31 22:10:27 INFO mapred.MapTask: soft limit at 83886080
> 18/07/31 22:10:27 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
> 18/07/31 22:10:27 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
> 18/07/31 22:10:27 INFO mapreduce.Job: The url to track the job: 
> http://localhost:8080/
> 18/07/31 22:10:27 INFO mapreduce.Job: Running job: job_local792652629_0008
> {code}
> [~mboehm7], if possible, could you take a look at this? I've double 
> checked the creation of the execution context in 
> {{ParamservBuiltinCPInstruction}}, and it is an instance of ExecutionContext, 
> not SparkExecutionContext.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
