Hi Deron,

Please see my answers below:

Are all the property name/values in this file the recommended SystemML
configuration settings when running on a Hadoop cluster?
Yes, but some depend on the size of the cluster (for example, the number of
reducers), so users may need to adjust them accordingly.

Are any of these properties of particular
relevance when increasing performance for the cluster?
Yes. Going back to the number-of-reducers example: on a 100-node cluster,
keeping the default of 10 reducers would underutilize the cluster.

For example, I have a 4-node cluster with 3 data nodes. Should I change
<numreducers> to be 2x the number of data nodes, so change from 10 to 6?
2x the number of nodes is a good rule of thumb for the number of reducers
when using the "MR" backend. I verified this in performance experiments.
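
As a rough sketch, the change for your 3-data-node cluster might look like
the fragment below. The element name and the default of 10 come from your
question; I am assuming the element sits in SystemML-config.xml alongside
the other properties, which are omitted here.

```xml
<!-- Sketch only: 2 reducers per data node, 3 data nodes -> 6 (default is 10). -->
<!-- Other properties in SystemML-config.xml are omitted and left unchanged. -->
<numreducers>6</numreducers>
```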

Also, with regards to <optlevel>, what is being optimized and how does this
affect performance?
<optlevel> is a tuning flag for SystemML's runtime optimizer. I would
recommend using the default optlevel. Here is the documentation:
 * Optimization Types for Compilation
 *
 *  O0 STATIC - Decisions for scheduling operations on CP/MR are based on
 *  predefined set of rules, which check if the dimensions are below a
 *  fixed/static threshold (OLD Method of choosing between CP and MR).
 *  The optimization scope is LOCAL, i.e., per statement block.
 *  Advanced rewrites like constant folding, common subexpression
 *  elimination, or inter procedural analysis are NOT applied.
 *
 *  O1 MEMORY_BASED - Every operation is scheduled on CP or MR, solely
 *  based on the amount of memory required to perform that operation.
 *  It does NOT take the execution time into account.
 *  The optimization scope is LOCAL, i.e., per statement block.
 *  Advanced rewrites like constant folding, common subexpression
 *  elimination, or inter procedural analysis are NOT applied.
 *
 *  O2 MEMORY_BASED - Every operation is scheduled on CP or MR, solely
 *  based on the amount of memory required to perform that operation.
 *  It does NOT take the execution time into account.
 *  The optimization scope is LOCAL, i.e., per statement block.
 *  All advanced rewrites are applied. This is the default optimization
 *  level of SystemML.
 *
 *  O3 GLOBAL TIME_MEMORY_BASED - Operation scheduling on CP or MR as well
 *  as many other rewrites of data flow properties such as block size,
 *  partitioning, replication, vectorization, etc. are done with the
 *  optimization objective of minimizing execution time under hard memory
 *  constraints per operation and execution context. The optimization scope
 *  is GLOBAL, i.e., program-wide.
 *  All advanced rewrites are applied. This optimization level requires
 *  more optimization time but has higher optimization potential.
 *
 *  O4 DEBUG MODE - All optimizations, global and local, which interfere
 *  with breakpoints are NOT applied. This optimization level is REQUIRED
 *  for the compiler running in debug mode.
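
To keep the default explicitly, the setting might look like the fragment
below. This is only a sketch: I am assuming the value is written as the
level's number (2 for O2), and the other properties in the file are omitted.

```xml
<!-- Sketch only: assumes the value is the numeric level; 2 = O2, the default. -->
<optlevel>2</optlevel>
```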

Thanks,

Niketan Pansare
IBM Almaden Research Center
Phone (office): (408) 927 1740
E-mail: [email protected]
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar



From:   Deron Eriksson <[email protected]>
To:     [email protected]
Date:   11/17/2015 07:31 PM
Subject:        SystemML-config.xml in distributed Hadoop environment



Hello,

The SystemML binary release comes with a SystemML configuration file
(SystemML-config.xml) in its root directory. Are all the property
name/values in this file the recommended SystemML configuration settings
when running on a Hadoop cluster? Are any of these properties of particular
relevance when increasing performance for the cluster?

For example, I have a 4-node cluster with 3 data nodes. Should I change
<numreducers> to be 2x the number of data nodes, so change from 10 to 6?

Also, with regards to <optlevel>, what is being optimized and how does this
affect performance?

Thanks!
Deron
