Hello All,
I have some questions about running SystemML scripts on HDFS (with the
hybrid_spark execution mode).

My Current Configuration:
Standalone HDFS on OSX (version 2.8)
and Spark Pre-Built for hadoop 2.7 (version 2.1.0)

*jps* output from my system:
[image: Inline image 1]


Both of them have been installed separately.
As far as I understand, to enable HDFS support we need to run Spark with the
master set to yarn-client or yarn-cluster. (Is this understanding correct?)

My question:
I don't have access to a cluster. Is there a way to set up yarn-client /
yarn-cluster on my local system so that I can run SystemML scripts in
hybrid_spark mode with HDFS? If yes, could you please point me to some
documentation?

Thank you so much,
Krishna


PS: The sysout of what I have already tried is attached below.
# Standalone System-ML jar 
SCRIPT_DIR=$SYSTEMML_HOME/scripts/*
BUILD_DIR=$SYSTEMML_HOME/target/*
LIB_DIR=$SYSTEMML_HOME/target/lib/*
HADOOP_HOME=$SYSTEMML_HOME/target/lib/hadoop/*
SYSTEMML_JAR=$SYSTEMML_HOME/target/systemml-1.0.0-SNAPSHOT.jar

FORMAT="csv" 
ALGO=/Users/krishna/open-source/incubator-systemml/scripts/datagen/genRandData4Kmeans.dml
 
java -cp $SCRIPT_DIR:$BUILD_DIR:$LIB_DIR:$HADOOP_HOME \
org.apache.sysml.api.DMLScript \
-Dlog4j.configuration=file:'$SYSTEMML_HOME/conf/log4j.properties' -f $ALGO \
-exec hybrid_spark -nvargs nr=10000 nf=1000 nc=50 dc=10.0 dr=1.0 fbf=100.0 \
cbf=100.0 X=hdfs:///data/X.data C=hdfs:///data/C.data Y=hdfs:///data/Y.data \
YbyC=hdfs:///data/YbyC.data fmt=$FORMAT

#### Logs
krishna@Krishna:~/open-source/scripts$ java -cp 
$SCRIPT_DIR:$BUILD_DIR:$LIB_DIR:$HADOOP_HOME org.apache.sysml.api.DMLScript 
-Dlog4j.configuration=file:'$SYSTEMML_HOME/conf/log4j.properties' -f $ALGO 
-exec hybrid_spark -nvargs nr=10000 nf=1000 nc=50 dc=10.0 dr=1.0 fbf=100.0 
cbf=100.0 X=hdfs:///data/X.data C=hdfs:///data/C.data Y=hdfs:///data/Y.data 
YbyC=hdfs:///data/YbyC.data fmt=$FORMAT


log4j:WARN No appenders could be found for logger 
(org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
BEGIN K-MEANS GENERATOR SCRIPT
Generating cluster distribution (mixture) centroids...
Generating record-to-cluster assignments...
Generating within-cluster random shifts...
Generating records by shifting from centroids...
Computing record-to-cluster assignments by minimum centroid distance...
Computing useful statistics...
Writing out the resulting dataset...
Exception in thread "main" org.apache.sysml.api.DMLException: 
org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
block generated from statement block between lines 80 and 119 -- Error 
evaluating instruction: 
CP°write°C·MATRIX·DOUBLE°hdfs:///data/C.data·SCALAR·STRING·true°csv·SCALAR·STRING·true°false°,°false°·SCALAR·STRING·true
        at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
        at org.apache.sysml.api.DMLScript.main(DMLScript.java:207)
Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
block generated from statement block between lines 80 and 119 -- Error 
evaluating instruction: 
CP°write°C·MATRIX·DOUBLE°hdfs:///data/C.data·SCALAR·STRING·true°csv·SCALAR·STRING·true°false°,°false°·SCALAR·STRING·true
        at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
        at org.apache.sysml.api.DMLScript.execute(DMLScript.java:665)
        at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:346)
        ... 1 more
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
in program block generated from statement block between lines 80 and 119 -- 
Error evaluating instruction: 
CP°write°C·MATRIX·DOUBLE°hdfs:///data/C.data·SCALAR·STRING·true°csv·SCALAR·STRING·true°false°,°false°·SCALAR·STRING·true
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
        at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
        ... 3 more
Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
Export to hdfs:///data/C.data failed.
        at 
org.apache.sysml.runtime.controlprogram.caching.CacheableData.exportData(CacheableData.java:779)
        at 
org.apache.sysml.runtime.controlprogram.caching.CacheableData.exportData(CacheableData.java:694)
        at 
org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.writeCSVFile(VariableCPInstruction.java:826)
        at 
org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processWriteInstruction(VariableCPInstruction.java:773)
        at 
org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processInstruction(VariableCPInstruction.java:642)
        at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
        ... 6 more
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs:/data, expected: 
file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
        at 
org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
        at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:423)
        at 
org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:590)
        at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:441)
        at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:428)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
        at 
org.apache.sysml.runtime.util.MapReduceTool.writeMetaDataFile(MapReduceTool.java:390)
        at 
org.apache.sysml.runtime.controlprogram.caching.CacheableData.writeMetaData(CacheableData.java:960)
        at 
org.apache.sysml.runtime.controlprogram.caching.CacheableData.exportData(CacheableData.java:772)
        ... 11 more
