Repository: incubator-zeppelin
Updated Branches:
  refs/heads/master cc24227bf -> 893b49b5c
[ZEPPELIN-688] Giving an option to hide REPL output in Spark interpreter

### What is this PR for?
When a user runs the Spark interpreter, the result comes out together with the REPL's echo message, such as:
```
res0: Int = 250
```
Some users may want to print this REPL output along with their result, but others may want to see the result only, since this output can be too verbose. So, I just want to give those users an option to hide the REPL output.

The default value of `zeppelin.spark.printREPLOutput` is `true`, which keeps the current behavior as-is. The REPL output is hidden only when users change this property from `true` to `false`. In that case, they can still check a result explicitly, e.g. with `print(some_variable)`.

### What type of PR is it?
Improvement

### Todos
* [x] Add a property `zeppelin.spark.printREPLOutput`
* [x] Add a Spark interpreter property table to `docs/spark.md`

### What is the Jira issue?
[ZEPPELIN-688](https://issues.apache.org/jira/browse/ZEPPELIN-688#)

### How should this be tested?
After applying this PR:

1. Create a Spark interpreter for this test and change the `zeppelin.spark.printREPLOutput` property value from `true` to `false`.
2. Create a notebook and bind the interpreter you made.
3. Write `val a = 250` and run the paragraph. No output is shown even though the paragraph status is **FINISHED** (this is the effect of this PR). See the sketch below.
4. Run `print(a)` in the next paragraph. You then finally get the result `250`.

### Screenshots (if appropriate)

### Questions:
* Do the license files need to be updated? No
* Are there breaking changes for older versions? No
* Does this need documentation? I added a Spark interpreter property table to `docs/spark.md`.
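To make the expected behavior concrete, here is a short sketch of what a paragraph shows under each setting. The echoed line follows the Scala REPL's usual `name: Type = value` format; this is illustrative, not output captured from the patch:

```scala
// zeppelin.spark.printREPLOutput = true (the default): the paragraph result
// includes the REPL's echo for each statement.
val a = 250
// paragraph output: a: Int = 250

// zeppelin.spark.printREPLOutput = false: the same paragraph finishes with
// no output at all, so the value must be printed explicitly.
print(a)
// paragraph output: 250
```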
Author: AhyoungRyu <fbdkdu...@hanmail.net>

Closes #764 from AhyoungRyu/ZEPPELIN-688 and squashes the following commits:

c4bbe33 [AhyoungRyu] Add an additional sentence to docs/spark.md
f1621f6 [AhyoungRyu] Add Spark interpreter property table to docs/spark.md
2036e09 [AhyoungRyu] ZEPPELIN-688: Giving an option to hide REPL output in spark interpreter

Project: http://git-wip-us.apache.org/repos/asf/incubator-zeppelin/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-zeppelin/commit/893b49b5
Tree: http://git-wip-us.apache.org/repos/asf/incubator-zeppelin/tree/893b49b5
Diff: http://git-wip-us.apache.org/repos/asf/incubator-zeppelin/diff/893b49b5

Branch: refs/heads/master
Commit: 893b49b5c065fc8da6c8f4d9ff0a79cfa177ba12
Parents: cc24227
Author: AhyoungRyu <fbdkdu...@hanmail.net>
Authored: Wed Mar 9 10:56:09 2016 +0900
Committer: Lee moon soo <m...@apache.org>
Committed: Thu Mar 10 13:21:20 2016 -0800

----------------------------------------------------------------------
 docs/interpreter/spark.md                       | 70 +++++++++++++++++++-
 .../apache/zeppelin/spark/SparkInterpreter.java | 63 ++++++++++--------
 2 files changed, 106 insertions(+), 27 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-zeppelin/blob/893b49b5/docs/interpreter/spark.md
----------------------------------------------------------------------
diff --git a/docs/interpreter/spark.md b/docs/interpreter/spark.md
index 027d4b6..21c3df5 100644
--- a/docs/interpreter/spark.md
+++ b/docs/interpreter/spark.md
@@ -40,6 +40,74 @@ Spark Interpreter group, which consisted of 4 interpreters.
 </table>
 
 ## Configuration
+Zeppelin provides the below properties for the Spark interpreter.
+You can also set other Spark properties which are not listed in the table. If so, please refer to [Spark Available Properties](http://spark.apache.org/docs/latest/configuration.html#available-properties).
+<table class="table-configuration">
+  <tr>
+    <th>Property</th>
+    <th>Default</th>
+    <th>Description</th>
+  </tr>
+  <tr>
+    <td>args</td>
+    <td></td>
+    <td>Spark commandline args</td>
+  </tr>
+  <tr>
+    <td>master</td>
+    <td>local[*]</td>
+    <td>Spark master uri. <br/> ex) spark://masterhost:7077</td>
+  </tr>
+  <tr>
+    <td>spark.app.name</td>
+    <td>Zeppelin</td>
+    <td>The name of spark application.</td>
+  </tr>
+  <tr>
+    <td>spark.cores.max</td>
+    <td></td>
+    <td>Total number of cores to use. <br/> Empty value uses all available cores.</td>
+  </tr>
+  <tr>
+    <td>spark.executor.memory</td>
+    <td>512m</td>
+    <td>Executor memory per worker instance. <br/> ex) 512m, 32g</td>
+  </tr>
+  <tr>
+    <td>zeppelin.dep.additionalRemoteRepository</td>
+    <td>spark-packages, <br/> http://dl.bintray.com/spark-packages/maven, <br/> false;</td>
+    <td>A list of `id,remote-repository-URL,is-snapshot;` <br/> for each remote repository.</td>
+  </tr>
+  <tr>
+    <td>zeppelin.dep.localrepo</td>
+    <td>local-repo</td>
+    <td>Local repository for dependency loader</td>
+  </tr>
+  <tr>
+    <td>zeppelin.pyspark.python</td>
+    <td>python</td>
+    <td>Python command to run pyspark with</td>
+  </tr>
+  <tr>
+    <td>zeppelin.spark.concurrentSQL</td>
+    <td>false</td>
+    <td>Execute multiple SQL statements concurrently if set true.</td>
+  </tr>
+  <tr>
+    <td>zeppelin.spark.maxResult</td>
+    <td>1000</td>
+    <td>Max number of SparkSQL results to display.</td>
+  </tr>
+  <tr>
+    <td>zeppelin.spark.printREPLOutput</td>
+    <td>true</td>
+    <td>Print REPL output</td>
+  </tr>
+  <tr>
+    <td>zeppelin.spark.useHiveContext</td>
+    <td>true</td>
+    <td>Use HiveContext instead of SQLContext if it is true.</td>
+  </tr>
+</table>
+
 Without any configuration, Spark interpreter works out of box in local mode.
 But if you want to connect to your Spark cluster, you'll need to follow below two simple steps.
 
 ### 1. Export SPARK_HOME
@@ -269,7 +337,7 @@ To learn more about dynamic form, checkout [Dynamic Form](../manual/dynamicform
 In 'Separate Interpreter for each note' mode, SparkInterpreter creates scala compiler per each notebook. However it still shares the single SparkContext.
 
 ## Setting up Zeppelin with Kerberos
-Logical setup with Zeppelin, Kerberos Distribution Center (KDC), and Spark on YARN:
+Logical setup with Zeppelin, Kerberos Key Distribution Center (KDC), and Spark on YARN:
 
 <img src="../assets/themes/zeppelin/img/docs-img/kdc_zeppelin.png">
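The registration code in the Java diff below seeds each property's default through `getSystemDefault(envName, propertyName, fallback)`. As a rough sketch of that resolution order (an assumption based on the call sites, not a quote of the method body): an environment variable wins, then a Java system property, then the hard-coded fallback.

```scala
// Sketch of the assumed resolution order behind getSystemDefault:
// environment variable, then Java system property, then the fallback.
def getSystemDefault(envName: String, propertyName: String, defaultValue: String): String = {
  val fromEnv  = Option(envName).filter(_.nonEmpty).flatMap(n => Option(System.getenv(n)))
  val fromProp = Option(propertyName).filter(_.nonEmpty).flatMap(n => Option(System.getProperty(n)))
  fromEnv.orElse(fromProp).getOrElse(defaultValue)
}

// e.g. getSystemDefault("MASTER", "spark.master", "local[*]") falls back to
// "local[*]" when neither the MASTER env var nor -Dspark.master is set.
```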
ex) spark://masterhost:7077") - .add("spark.executor.memory", - getSystemDefault(null, "spark.executor.memory", "512m"), - "Executor memory per worker instance. ex) 512m, 32g") - .add("spark.cores.max", - getSystemDefault(null, "spark.cores.max", ""), - "Total number of cores to use. Empty value uses all available core.") - .add("zeppelin.spark.useHiveContext", - getSystemDefault("ZEPPELIN_SPARK_USEHIVECONTEXT", - "zeppelin.spark.useHiveContext", "true"), - "Use HiveContext instead of SQLContext if it is true.") - .add("zeppelin.spark.maxResult", - getSystemDefault("ZEPPELIN_SPARK_MAXRESULT", "zeppelin.spark.maxResult", "1000"), - "Max number of SparkSQL result to display.") - .add("args", "", "spark commandline args").build()); - + "spark", + "spark", + SparkInterpreter.class.getName(), + new InterpreterPropertyBuilder() + .add("spark.app.name", + getSystemDefault("SPARK_APP_NAME", "spark.app.name", "Zeppelin"), + "The name of spark application.") + .add("master", + getSystemDefault("MASTER", "spark.master", "local[*]"), + "Spark master uri. ex) spark://masterhost:7077") + .add("spark.executor.memory", + getSystemDefault(null, "spark.executor.memory", "512m"), + "Executor memory per worker instance. ex) 512m, 32g") + .add("spark.cores.max", + getSystemDefault(null, "spark.cores.max", ""), + "Total number of cores to use. Empty value uses all available core.") + .add("zeppelin.spark.useHiveContext", + getSystemDefault("ZEPPELIN_SPARK_USEHIVECONTEXT", + "zeppelin.spark.useHiveContext", "true"), + "Use HiveContext instead of SQLContext if it is true.") + .add("zeppelin.spark.maxResult", + getSystemDefault("ZEPPELIN_SPARK_MAXRESULT", "zeppelin.spark.maxResult", "1000"), + "Max number of SparkSQL result to display.") + .add("args", "", "spark commandline args") + .add("zeppelin.spark.printREPLOutput", "true", + "Print REPL output") + .build() + ); } private ZeppelinContext z; @@ -383,6 +386,10 @@ public class SparkInterpreter extends Interpreter { return defaultValue; } + public boolean printREPLOutput() { + return java.lang.Boolean.parseBoolean(getProperty("zeppelin.spark.printREPLOutput")); + } + @Override public void open() { URL[] urls = getClassloaderUrls(); @@ -483,7 +490,11 @@ public class SparkInterpreter extends Interpreter { synchronized (sharedInterpreterLock) { /* create scala repl */ - this.interpreter = new SparkILoop(null, new PrintWriter(out)); + if (printREPLOutput()) { + this.interpreter = new SparkILoop(null, new PrintWriter(out)); + } else { + this.interpreter = new SparkILoop(null, new PrintWriter(Console.out(), false)); + } interpreter.settings_$eq(settings);