[ 
https://issues.apache.org/jira/browse/SYSTEMML-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15524778#comment-15524778
 ] 

Imran Younus commented on SYSTEMML-831:
---------------------------------------

I've tried everything I can to make this code run on Spark, without success. 
The code has two functions. Right now I'm only trying to run the first of 
these: {{x2p}}.

I'm using MNIST data. If I use only 10k points from the MNIST data, the code 
runs on the driver node only and it works fine. Once I use all 60k points, it 
doesn't work. Here is how I'm running the code:

{{> spark-submit --master spark://rr-ram11:7077 --conf spark.driver.memory=20g 
--conf spark.executor.memory=20g --conf spark.executor.cores=4 --class 
org.apache.sysml.api.DMLScript target/SystemML.jar -f scripts/staging/tSNE.dml 
-stats -explain -exec spark}}

I'm using a 10-node cluster with 512GB of RAM on each node. I've tried many 
different configurations for memory and cores, and nothing seems to work. 
Here is the main problem:

org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
block generated from statement block between lines 153 and 153 -- Error 
evaluating instruction:
 CP°extfunct°.defaultNS°x2p°2°1°X·MATRIX·DOUBLE°30·SCALAR·INT·true°P
        at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:152)
        at org.apache.sysml.api.DMLScript.execute(DMLScript.java:698)
        at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:364)
        at org.apache.sysml.api.DMLScript.main(DMLScript.java:199)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at 
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
in program block generated from statement block between lines 153 and 153 -- 
Error evaluating instruction: 
CP°extfunct°.defaultNS°x2p°2°1°X·MATRIX·DOUBLE°30·SCALAR·INT·true°P
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:335)
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:224)
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
        at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:145)
        ... 12 more
Caused by: org.apache.sysml.runtime.DMLRuntimeException: error executing 
function .defaultNS::x2p
        at 
org.apache.sysml.runtime.instructions.cp.FunctionCallCPInstruction.processInstruction(FunctionCallCPInstruction.java:184)
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:305)
        ... 15 more
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
in function program block generated from function statement block between lines 
51 and 103 -- Error evaluating function program block
        at 
org.apache.sysml.runtime.controlprogram.FunctionProgramBlock.execute(FunctionProgramBlock.java:121)
        at 
org.apache.sysml.runtime.instructions.cp.FunctionCallCPInstruction.processInstruction(FunctionCallCPInstruction.java:177)
        ... 16 more
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
in while program block generated from while statement block between lines 69 
and 98 -- Error evaluating while program block
        at 
org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:181)
        at 
org.apache.sysml.runtime.controlprogram.FunctionProgramBlock.execute(FunctionProgramBlock.java:114)
        ... 17 more
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
in program block generated from statement block between lines 70 and 77 -- 
Error evaluating instruction: 
SPARK°map/°_mVar277·MATRIX·DOUBLE°_mVar278·MATRIX·DOUBLE°_mVar280·MATRIX·DOUBLE°RIGHT°COL_VECTOR
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:335)
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:224)
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
        at 
org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:169)
        ... 18 more
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 12.0 failed 4 times, most recent failure: Lost task 0.3 in 
stage 12.0 (TID 15, rr-ram8.softlayer.com): 
org.apache.spark.storage.BlockFetchException: Failed to fetch block from 1 
locations. Most recent failure cause:
        at 
org.apache.spark.storage.BlockManager$$anonfun$doGetRemote$2.apply(BlockManager.scala:605)
        at 
org.apache.spark.storage.BlockManager$$anonfun$doGetRemote$2.apply(BlockManager.scala:595)
        at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at 
org.apache.spark.storage.BlockManager.doGetRemote(BlockManager.scala:595)
        at 
org.apache.spark.storage.BlockManager.getRemote(BlockManager.scala:580)
        at org.apache.spark.storage.BlockManager.get(BlockManager.scala:640)
        at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:44)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Size 
exceeds Integer.MAX_VALUE

Attached is the complete log file.
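
A rough size estimate may help with reading the {{Size exceeds Integer.MAX_VALUE}} error at the bottom of the trace. The sketch below assumes (this is not confirmed by the log) that {{x2p}} materializes a dense N x N matrix of doubles, and that the Spark version in use cannot handle a single block larger than Integer.MAX_VALUE bytes. The helper names {{dense_bytes}} and {{min_partitions}} are made up for illustration and are not part of SystemML or Spark.

```python
# Back-of-the-envelope size of a dense N x N matrix of doubles.
# Assumption: x2p builds an N x N pairwise matrix in double precision.
# Assumption: a single Spark block must stay below Integer.MAX_VALUE bytes.

INT_MAX = 2**31 - 1        # Java's Integer.MAX_VALUE, in bytes here
BYTES_PER_DOUBLE = 8


def dense_bytes(n_rows, n_cols):
    """Bytes needed for the raw values of a dense double matrix."""
    return n_rows * n_cols * BYTES_PER_DOUBLE


def min_partitions(total_bytes, limit=INT_MAX):
    """Smallest number of equal-sized pieces that each fit under the limit."""
    return -(-total_bytes // limit)  # ceiling division on integers


if __name__ == "__main__":
    for n in (10_000, 60_000):
        size = dense_bytes(n, n)
        print(f"N={n}: {size / 1e9:.1f} GB, "
              f"fits in one block: {size <= INT_MAX}, "
              f"minimum pieces: {min_partitions(size)}")
```

Under these assumptions the 10k case needs about 0.8 GB, which fits in the 20g driver, while the 60k case needs about 28.8 GB, which exceeds the driver memory and has to be spread over at least 14 pieces to keep each one under the limit. This is only an estimate of raw values; it ignores block headers and serialization overhead.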

[~nakul02] [~mboehm7] [~niketanpansare]

> Implement t-SNE algorithm
> -------------------------
>
>                 Key: SYSTEMML-831
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-831
>             Project: SystemML
>          Issue Type: Improvement
>          Components: Algorithms
>            Reporter: Imran Younus
>            Assignee: Imran Younus
>
> This jira implements the t-distributed Stochastic Neighbor Embedding 
> algorithm for dimensionality reduction presented in this paper:
> Visualizing Data using t-SNE 
> by Laurens van der Maaten, Geoffrey Hinton
> http://www.jmlr.org/papers/v9/vandermaaten08a.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
