[ 
https://issues.apache.org/jira/browse/SYSTEMML-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998785#comment-15998785
 ] 

Nakul Jindal commented on SYSTEMML-1451:
----------------------------------------

Judging from the hs_err {{pid25327.log}} file, you are trying to allocate 18GB 
of memory. Clearly your system does not have that much.
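
As a rough pre-flight check, the requested heap can be compared with the machine's physical memory before the JVM is launched. The sketch below assumes the 18GB comes from a heap setting such as {{-Xmx}}; {{parse_heap_size}}, {{physical_memory_bytes}} and {{fits_in_memory}} are hypothetical helper names, and the {{os.sysconf}} lookup is assumed to run on a Unix-like system.

```python
import os
import re

# Binary multipliers for JVM-style size suffixes (k, m, g).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_heap_size(value):
    """Convert a JVM-style size such as '18g' or '512m' to bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized heap size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.lower()]


def physical_memory_bytes():
    """Total physical memory in bytes (assumption: Unix-like system)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def fits_in_memory(requested_bytes, total_bytes):
    """True if the requested heap is smaller than total physical memory."""
    return requested_bytes < total_bytes
```

A launcher script could call {{fits_in_memory(parse_heap_size("18g"), physical_memory_bytes())}} and refuse to start, or pick a smaller heap, when the result is false.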

The provided {{sparkDML.sh}} script, which {{genBinomialData.sh}} calls, is 
neither robust nor general purpose. It expects YARN to be present and the 
cluster to be of a certain size. 
To get around these obstacles and to build a more general purpose robust 
script, please take the time to understand each of the options passed to 
{{java}} and to {{SystemML}}.

To get started, I would ask that you try to first run SystemML from the command 
line without using any of the provided scripts 
([systemml|https://github.com/apache/incubator-systemml/blob/master/bin/systemml]
 or 
[sparkDML.sh|https://github.com/apache/incubator-systemml/blob/master/scripts/sparkDML.sh]).
 Try running in both standalone and Spark modes.
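For standalone, your command should look like
{{java -cp ... 
-Dlog4j.configuration=file:'$PROJECT_ROOT_DIR/conf/log4j.properties' 
org.apache.sysml.api.DMLScript -f .. 
-stats -config conf/SystemML-config.xml -exec singlenode -nvargs ..}}
(note that the {{-D}} option has to come before the class name, otherwise 
{{java}} passes it to the program as an ordinary argument).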
For spark, it should look like

{{$SPARK_HOME/bin/spark-submit --master ..  --driver-class-path ..}}
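
To make the two invocations easier to experiment with, here is a minimal Python sketch that assembles them as argument lists. Only the flags quoted above are taken from this comment; the function names, parameters and the {{extra_args}} placeholder are hypothetical, and everything elided above (classpath, master, script path, remaining spark-submit arguments) is left for the caller to supply. The JVM option {{-D}} is placed before the main class, since {{java}} treats anything after the class name as a program argument.

```python
MAIN_CLASS = "org.apache.sysml.api.DMLScript"


def standalone_cmd(classpath, dml_script, project_root, nvargs=None):
    """Build the standalone `java` command as a list of arguments.

    All parameters are caller-supplied; nothing is guessed here.
    """
    cmd = [
        "java",
        "-cp", classpath,
        # JVM options must precede the main class.
        f"-Dlog4j.configuration=file:{project_root}/conf/log4j.properties",
        MAIN_CLASS,
        "-f", dml_script,
        "-stats",
        "-config", f"{project_root}/conf/SystemML-config.xml",
        "-exec", "singlenode",
    ]
    if nvargs:
        # Named arguments are passed as NAME=value pairs after -nvargs.
        cmd.append("-nvargs")
        cmd.extend(f"{name}={value}" for name, value in nvargs.items())
    return cmd


def spark_cmd(spark_home, master, driver_classpath, extra_args=()):
    """Build the start of a spark-submit command.

    `extra_args` stands in for whatever the elided part of the command
    needs (application jar, script, options); it is not filled in here.
    """
    return [
        f"{spark_home}/bin/spark-submit",
        "--master", master,
        "--driver-class-path", driver_classpath,
        *extra_args,
    ]
```

Passing the resulting list to {{subprocess.run}} without {{shell=True}} avoids shell quoting problems, which is one of the things that tends to differ between Linux, MacOS and Windows.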

You will likely run into issues along the way, such as Java complaining that 
it cannot find a class, or the output not showing up correctly. 
As you figure out the exact parameters needed to run, you will learn what it 
takes to write a robust SystemML invocation script and, consequently, what 
you'd need to do to automate the performance testing.

Here is a small Python script I wrote that I hope will eventually replace 
{{bin/systemml}} - 
[systemml-standalone.py|https://github.com/apache/incubator-systemml/blob/master/bin/systemml-standalone.py].
 It works for all platforms (MacOS, Windows, Windows+Cygwin, Linux). Read 
through it; it may be useful. We also want to create a 
{{systemml-spark-submit.py}} script which is a lot more robust than the 
{{scripts/sparkDML.sh}} script we currently have and then have the performance 
tests call into that.
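
As an illustration of how a performance test could call into such a script, here is a minimal sketch that runs one command, times it and records whether it failed. The function name {{run_and_time}} and the shape of the returned record are assumptions for illustration, not part of any existing SystemML script.

```python
import subprocess
import sys
import time


def run_and_time(cmd, timeout=None):
    """Run `cmd` (a list of arguments) and return a small result record."""
    start = time.perf_counter()
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    elapsed = time.perf_counter() - start
    return {
        "cmd": list(cmd),
        "returncode": proc.returncode,   # non-zero signals a failed run
        "seconds": elapsed,              # wall-clock time of the whole process
        "stdout": proc.stdout,
        "stderr_tail": proc.stderr[-2000:],  # keep only the end for reports
    }


if __name__ == "__main__":
    # Stand-in command; a real test would pass a SystemML invocation here.
    result = run_and_time([sys.executable, "-c", "print('ok')"])
    print(result["returncode"], round(result["seconds"], 3))
```

Records like this can be appended to a CSV file per run, which would give later comparisons against a previous version something to work from.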

> Automate performance testing and reporting
> ------------------------------------------
>
>                 Key: SYSTEMML-1451
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1451
>             Project: SystemML
>          Issue Type: Improvement
>          Components: Infrastructure, Test
>            Reporter: Nakul Jindal
>              Labels: gsoc2017, mentor, performance, reporting, testing
>
> As part of a release (and in general), performance tests are run for SystemML.
> Currently, running and reporting on these performance tests is a manual 
> process. There are helper scripts, but largely the process is manual.
> The aim of this GSoC 2017 project is to automate performance testing and its 
> reporting.
> These are the tasks that this entails:
> 1. Automate running of the performance tests, including generation of test 
> data
> 2. Detect errors and report if any
> 3. Record performance benchmarking information
> 4. Automatically compare this performance to previous version to check for 
> performance regressions
> 5. Automatically compare to Spark MLLib, R?, Julia?
> 6. Prepare report with all the information about failed jobs, performance 
> information, perf info against other comparable projects/algorithms 
> (plotted/in plain text in CSV, PDF or other common format)
> 7. Create scripts to automatically run this process on a cloud provider that 
> spins up machines, runs the test, saves the reports and spins down the 
> machines.
> 8. Create a web application to do this interactively without dropping down 
> into a shell.
> As part of this project, the student will need to know scripting (in Bash, 
> Python, etc). It may also involve changing error reporting and performance 
> reporting code in SystemML. 
> Rating - Medium (for the amount of work)
> Mentor - [~nakul02] (Other co-mentors will join in)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
