[jira] [Created] (ZEPPELIN-3989) Configure IPython Interpreter in Docker image

2019-02-03 Thread Lee moon soo (JIRA)
Lee moon soo created ZEPPELIN-3989:
--

 Summary: Configure IPython Interpreter in Docker image
 Key: ZEPPELIN-3989
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3989
 Project: Zeppelin
  Issue Type: Task
Affects Versions: 0.8.0
Reporter: Lee moon soo
Assignee: Lee moon soo
 Fix For: 0.8.1


It would be nice if the Zeppelin Docker image installed the libraries needed 
to enable the IPython interpreter.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Leemoonsoo commented on issue #3299: [ZEPPELIN-3920] [FOLLOWUP] get correct hyperlink address when num job url is one

2019-02-03 Thread GitBox
Leemoonsoo commented on issue #3299: [ZEPPELIN-3920] [FOLLOWUP] get correct 
hyperlink address when num job url is one
URL: https://github.com/apache/zeppelin/pull/3299#issuecomment-460127666
 
 
   @liuxunorg No problem :) Thanks for the review and confirming fix!
   
   Merging to master


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Leemoonsoo commented on issue #3298: [ZEPPELIN-3982] add how to setup development mode for Kubernetes support

2019-02-03 Thread GitBox
Leemoonsoo commented on issue #3298: [ZEPPELIN-3982] add how to setup 
development mode for Kubernetes support
URL: https://github.com/apache/zeppelin/pull/3298#issuecomment-460127410
 
 
   Will merge to master if there are no more comments.




[GitHub] Leemoonsoo closed pull request #3298: [ZEPPELIN-3982] add how to setup development mode for Kubernetes support

2019-02-03 Thread GitBox
Leemoonsoo closed pull request #3298: [ZEPPELIN-3982] add how to setup 
development mode for Kubernetes support
URL: https://github.com/apache/zeppelin/pull/3298
 
 
   




[GitHub] Leemoonsoo opened a new pull request #3298: [ZEPPELIN-3982] add how to setup development mode for Kubernetes support

2019-02-03 Thread GitBox
Leemoonsoo opened a new pull request #3298: [ZEPPELIN-3982] add how to setup 
development mode for Kubernetes support
URL: https://github.com/apache/zeppelin/pull/3298
 
 
   ### What is this PR for?
   This PR adds docs that explain development mode for Kubernetes support.
   
   
   ### What type of PR is it?
   Documentation
   
   ### What is the Jira issue?
   https://issues.apache.org/jira/browse/ZEPPELIN-3982
   
   ### Questions:
   * Do the license files need an update? no
   * Are there breaking changes for older versions? no
   * Does this need documentation? no
   




[GitHub] Leemoonsoo commented on a change in pull request #3302: [ZEPPELIN-3988] Paragraph Text output includes \r\n is not displayed correctly.

2019-02-03 Thread GitBox
Leemoonsoo commented on a change in pull request #3302: [ZEPPELIN-3988] 
Paragraph Text output includes \r\n is not displayed correctly.
URL: https://github.com/apache/zeppelin/pull/3302#discussion_r253344719
 
 

 ##
 File path: zeppelin-web/src/app/notebook/paragraph/result/result.js
 ##
 @@ -0,0 +1,9 @@
+export default class Result {
+  constructor(data) {
+    this.data = data;
+  }
+
+  checkAndReplaceCarriageReturn() {
+    return this.data.replace(/(\r\n)/gm, '\n');
 
 Review comment:
   @zjffdu I checked https://github.com/apache/zeppelin/pull/2729/, and I'm not 
sure whether just replacing `\r\n` with `\n` will be okay. Could you review?
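
For reference, here is a small sketch of what the new regex does and does not touch. It only rewrites `\r\n` pairs; a bare `\r` passes through unchanged. The function name `replaceCrLf` and the sample strings are illustrative assumptions, not taken from the PR.

```javascript
// Same replacement as in result.js, wrapped in a plain function for illustration.
function replaceCrLf(data) {
  return data.replace(/(\r\n)/gm, '\n');
}

// A Windows-style line ending is normalized to '\n'.
const crlf = replaceCrLf('Hello world\r\n');

// A bare '\r' in the middle of a line is left as-is.
const bareCr = replaceCrLf('10%\r100%\n');

console.log(JSON.stringify(crlf));   // "Hello world\n"
console.log(JSON.stringify(bareCr)); // "10%\r100%\n"
```

Whether leaving a bare `\r` in the output is acceptable depends on how the result view renders it, which this sketch does not cover.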




[GitHub] Leemoonsoo commented on a change in pull request #3302: [ZEPPELIN-3988] Paragraph Text output includes \r\n is not displayed correctly.

2019-02-03 Thread GitBox
Leemoonsoo commented on a change in pull request #3302: [ZEPPELIN-3988] 
Paragraph Text output includes \r\n is not displayed correctly.
URL: https://github.com/apache/zeppelin/pull/3302#discussion_r253344639
 
 

 ##
 File path: zeppelin-web/src/app/notebook/paragraph/result/result.controller.js
 ##
 @@ -488,22 +489,7 @@ function ResultCtrl($scope, $rootScope, $route, $window, 
$routeParams, $location
   };
 
   const checkAndReplaceCarriageReturn = function(str) {
-    if (/\r/.test(str)) {
-      let newGenerated = '';
-      let strArr = str.split('\n');
-      for (let str of strArr) {
-        if (/\r/.test(str)) {
-          let splitCR = str.split('\r');
-          newGenerated += splitCR[splitCR.length - 1] + '\n';
 
 Review comment:
   It looks like this line causes the problem.
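
A minimal standalone sketch of the removed logic shows the effect, assuming it is copied into a plain function (the name `oldCheckAndReplaceCarriageReturn` is made up for illustration). For each line containing `\r` it keeps only the text after the last `\r`, so a line that ends in `\r\n` collapses to an empty string.

```javascript
// Standalone copy of the removed controller logic, for illustration only.
function oldCheckAndReplaceCarriageReturn(str) {
  if (/\r/.test(str)) {
    let newGenerated = '';
    const strArr = str.split('\n');
    for (const line of strArr) {
      if (/\r/.test(line)) {
        // Keep only the text after the last '\r' on this line.
        const splitCR = line.split('\r');
        newGenerated += splitCR[splitCR.length - 1] + '\n';
      } else {
        newGenerated += line + '\n';
      }
    }
    // Remove the trailing '\n' added by the loop.
    return newGenerated.slice(0, -1);
  }
  return str;
}

// 'Hello world\r' splits into ['Hello world', ''], and the last element is ''.
const oldResult = oldCheckAndReplaceCarriageReturn('Hello world\r\n');
console.log(JSON.stringify(oldResult)); // "\n"
```

The visible text is dropped because the segment after the final `\r` is empty, which matches the empty result described in the PR.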




[GitHub] Leemoonsoo commented on a change in pull request #3302: [ZEPPELIN-3988] Paragraph Text output includes \r\n is not displayed correctly.

2019-02-03 Thread GitBox
Leemoonsoo commented on a change in pull request #3302: [ZEPPELIN-3988] 
Paragraph Text output includes \r\n is not displayed correctly.
URL: https://github.com/apache/zeppelin/pull/3302#discussion_r253344467
 
 

 ##
 File path: 
zeppelin-web/src/app/notebook/paragraph/result/result.controller.test.js
 ##
 @@ -0,0 +1,41 @@
+describe('Controller: ResultCtrl', function() {
+  beforeEach(angular.mock.module('zeppelinWebApp'));
+
+  let scope;
+  let controller;
+  let resultMock = {
+  };
+  let configMock = {
+  };
+  let paragraphMock = {
+    id: 'p1',
+    results: {
+      msg: [],
+    },
+  };
+  let route = {
+    current: {
+      pathParams: {
+        noteId: 'noteId',
+      },
+    },
+  };
+
+  beforeEach(inject(function($controller, $rootScope) {
+    scope = $rootScope.$new();
+    scope.$parent = $rootScope.$new(true, $rootScope);
+    scope.$parent.paragraph = paragraphMock;
+
+    controller = $controller('ResultCtrl', {
+      $scope: scope,
+      $route: route,
+    });
+
+    scope.init(resultMock, configMock, paragraphMock, 1);
+  }));
+
+  it('scope should be initialized', function() {
+    expect(scope).toBeDefined();
+    expect(controller).toBeDefined();
 
 Review comment:
   This does not contain a meaningful test yet, but tests for 
`result.controller.js` can easily be added later when necessary.
   Note that a `controller` method such as `const checkAndReplaceCarriageReturn 
= ...` is not accessible here, whereas `scope` is accessible and testable.




[GitHub] Leemoonsoo commented on a change in pull request #3302: [ZEPPELIN-3988] Paragraph Text output includes \r\n is not displayed correctly.

2019-02-03 Thread GitBox
Leemoonsoo commented on a change in pull request #3302: [ZEPPELIN-3988] 
Paragraph Text output includes \r\n is not displayed correctly.
URL: https://github.com/apache/zeppelin/pull/3302#discussion_r253344252
 
 

 ##
 File path: zeppelin-web/src/app/notebook/paragraph/result/result.controller.js
 ##
 @@ -488,22 +489,7 @@ function ResultCtrl($scope, $rootScope, $route, $window, 
$routeParams, $location
   };
 
   const checkAndReplaceCarriageReturn = function(str) {
-    if (/\r/.test(str)) {
-      let newGenerated = '';
-      let strArr = str.split('\n');
-      for (let str of strArr) {
-        if (/\r/.test(str)) {
-          let splitCR = str.split('\r');
-          newGenerated += splitCR[splitCR.length - 1] + '\n';
-        } else {
-          newGenerated += str + '\n';
-        }
-      }
-      // remove last "\n" character
-      return newGenerated.slice(0, -1);
-    } else {
-      return str;
-    }
 
 Review comment:
   The implementation of `checkAndReplaceCarriageReturn` was moved to a new 
class for testability.
   A controller method is hard to test unless it is referenced by `$scope`.
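
As a rough illustration of the testability point, a plain class can be exercised with nothing but `new` and a few assertions, with no Angular injector or `$scope` involved. This sketch mirrors the `Result` class from the diff (minus the ES module export); the `check` helper is a hypothetical stand-in for a real test framework's assertions.

```javascript
// Mirrors the Result class from the diff, without the module export.
class Result {
  constructor(data) {
    this.data = data;
  }

  checkAndReplaceCarriageReturn() {
    return this.data.replace(/(\r\n)/gm, '\n');
  }
}

// Hypothetical stand-in for a test framework assertion.
function check(actual, expected) {
  if (actual !== expected) {
    throw new Error(`expected ${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`);
  }
}

// No controller, injector, or scope is needed to reach the method.
check(new Result('Hello world\r\n').checkAndReplaceCarriageReturn(), 'Hello world\n');
check(new Result('no carriage return').checkAndReplaceCarriageReturn(), 'no carriage return');
```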




[GitHub] Leemoonsoo opened a new pull request #3302: [ZEPPELIN-3988] Paragraph Text output includes \r\n is not displayed correctly.

2019-02-03 Thread GitBox
Leemoonsoo opened a new pull request #3302: [ZEPPELIN-3988] Paragraph Text 
output includes \r\n is not displayed correctly.
URL: https://github.com/apache/zeppelin/pull/3302
 
 
   ### What is this PR for?
   When a paragraph's text output includes `\r\n`, it is not displayed correctly 
in the paragraph result.
   
   Expected result
   
![image](https://user-images.githubusercontent.com/1540981/52189818-da9e8680-27ef-11e9-8d17-790101677a9b.png)
   
   Current behavior (displays empty result)
   
![image](https://user-images.githubusercontent.com/1540981/52189831-e8540c00-27ef-11e9-8f35-bc79b21669e3.png)
   
   
   ### What type of PR is it?
   Bug Fix
   
   ### What is the Jira issue?
   https://issues.apache.org/jira/browse/ZEPPELIN-3988
   
   ### How should this be tested?
   run
   
   ```
   %python
   print("Hello world\r\n")
   ```
   
   and see if output is displayed correctly.
   
   ### Questions:
   * Do the license files need an update? no
   * Are there breaking changes for older versions? no
   * Does this need documentation? no
   




[jira] [Created] (ZEPPELIN-3988) Text output include '\r' char does not displayed correctly

2019-02-03 Thread Lee moon soo (JIRA)
Lee moon soo created ZEPPELIN-3988:
--

 Summary: Text output include '\r' char does not displayed correctly
 Key: ZEPPELIN-3988
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3988
 Project: Zeppelin
  Issue Type: Task
Affects Versions: 0.8.0
Reporter: Lee moon soo
Assignee: Lee moon soo
 Fix For: 0.8.1


Text output containing '\r' is not displayed correctly.





[GitHub] felixcheung commented on issue #3300: ZEPPELIN-3983. Travis fails due to downloading spark takes a lot of time

2019-02-03 Thread GitBox
felixcheung commented on issue #3300: ZEPPELIN-3983. Travis fails due to 
downloading spark takes a lot of time
URL: https://github.com/apache/zeppelin/pull/3300#issuecomment-460106676
 
 
   LGTM




[GitHub] felixcheung commented on a change in pull request #3300: ZEPPELIN-3983. Travis fails due to downloading spark takes a lot of time

2019-02-03 Thread GitBox
felixcheung commented on a change in pull request #3300: ZEPPELIN-3983. Travis 
fails due to downloading spark takes a lot of time
URL: https://github.com/apache/zeppelin/pull/3300#discussion_r253330574
 
 

 ##
 File path: 
zeppelin-interpreter-integration/src/main/test/org/apache/zeppelin/integration/DownloadUtils.java
 ##
 @@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zeppelin.integration;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.commons.io.IOUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.InputStreamReader;
+import java.net.URL;
+
+/**
+ * Utility class for downloading spark/flink. This is used for spark/flink 
integration test.
+ */
+public class DownloadUtils {
+  private static Logger LOGGER = LoggerFactory.getLogger(DownloadUtils.class);
+
+  private static String downloadFolder = System.getProperty("user.home") + 
"/.cache";
+
+  static {
+try {
+  FileUtils.forceMkdir(new File(downloadFolder));
+} catch (IOException e) {
+  throw new RuntimeException("Fail to create download folder: " + 
downloadFolder, e);
+}
+  }
+
+
+  public static String downloadSpark(String version) {
+String sparkDownloadFolder = downloadFolder + "/spark";
+File targetSparkHomeFolder = new File(sparkDownloadFolder + "/spark-" + 
version + "-bin-hadoop2.6");
+if (targetSparkHomeFolder.exists()) {
+  LOGGER.info("Skip to download spark as it is already downloaded.");
+  return targetSparkHomeFolder.getAbsolutePath();
+}
+download("spark", version, "-bin-hadoop2.6.tgz");
+return targetSparkHomeFolder.getAbsolutePath();
+  }
+
+  public static String downloadFlink(String version) {
+String flinkDownloadFolder = downloadFolder + "/flink";
+File targetFlinkHomeFolder = new File(flinkDownloadFolder + "/flink-" + 
version);
+if (targetFlinkHomeFolder.exists()) {
+  LOGGER.info("Skip to download flink as it is already downloaded.");
+  return targetFlinkHomeFolder.getAbsolutePath();
+}
+download("flink", version, "-bin-hadoop27-scala_2.11.tgz");
+return targetFlinkHomeFolder.getAbsolutePath();
+  }
+
+  // Try mirrors first, if fails fallback to apache archive
+  private static void download(String project, String version, String postFix) 
{
+String projectDownloadFolder = downloadFolder + "/" + project;
+try {
+  String preferredMirror = IOUtils.toString(new 
URL("https://www.apache.org/dyn/closer.lua?preferred=true"));
+  File downloadFile = new File(projectDownloadFolder + "/" + project + "-" 
+ version + postFix);
+  String downloadURL = preferredMirror + "/" + project + "/" + project + 
"-" + version + "/" + project + "-" + version + postFix;
+  runShellCommand(new String[]{"wget", downloadURL, "-P", 
projectDownloadFolder});
+  runShellCommand(new String[]{"tar", "-xvf", 
downloadFile.getAbsolutePath(), "-C", projectDownloadFolder});
+} catch (Exception e) {
+  LOGGER.warn("Failed to download " + project + " from mirror site, 
fallback to use apache archive", e);
+  File downloadFile = new File(projectDownloadFolder + "/" + project + "-" 
+ version + postFix);
+  String downloadURL =
+  "https://archive.apache.org/dist/" + project + "/" + project +"-"
+  + version
+  + "/" + project + "-"
+  + version
+  + postFix;
+  try {
+runShellCommand(new String[]{"wget", downloadURL, "-P", 
projectDownloadFolder});
+runShellCommand(
+new String[]{"tar", "-xvf", downloadFile.getAbsolutePath(), 
"-C", projectDownloadFolder});
+  } catch (Exception ex) {
+throw new RuntimeException("Fail to download " + project + " " + 
version, ex);
+  }
+}
+  }
+
+  private static void runShellCommand(String[] commands) throws IOException, 
InterruptedException {
+LOGGER.info("Starting shell commands: " + StringUtils.join(commands, " "));
+Process 

[GitHub] maziyarpanahi commented on a change in pull request #3189: [ZEPPELIN-3758]. Convert old note file note.json to new style

2019-02-03 Thread GitBox
maziyarpanahi commented on a change in pull request #3189: [ZEPPELIN-3758]. 
Convert old note file note.json to new style
URL: https://github.com/apache/zeppelin/pull/3189#discussion_r253314125
 
 

 ##
 File path: 
zeppelin-interpreter/src/main/java/org/apache/zeppelin/conf/ZeppelinConfiguration.java
 ##
 @@ -732,6 +732,8 @@ public String getZeppelinSearchTempPath() {
 ZEPPELIN_NOTEBOOK_STORAGE("zeppelin.notebook.storage",
 "org.apache.zeppelin.notebook.repo.GitNotebookRepo"),
 ZEPPELIN_NOTEBOOK_ONE_WAY_SYNC("zeppelin.notebook.one.way.sync", false),
+
ZEPPELIN_NOTEBOOK_NEW_FORMAT_CONVERT("zeppelin.notebook.new_format.convert", 
false),
+
ZEPPELIN_NOTEBOOK_NEW_FORMAT_DELETE_OLD("zeppelin.notebook.new_format.delete_old",
 false),
 
 Review comment:
   If the user forgets to change it back, every restart would convert the 
old-format (json) version of a note again and overwrite the newer, new-format 
version of that note. I would go with the script, but it's not easy when you 
have 100 notes in HDFS :( 




[GitHub] maziyarpanahi commented on issue #3189: [ZEPPELIN-3758]. Convert old note file note.json to new style

2019-02-03 Thread GitBox
maziyarpanahi commented on issue #3189: [ZEPPELIN-3758]. Convert old note file 
note.json to new style
URL: https://github.com/apache/zeppelin/pull/3189#issuecomment-460076991
 
 
   Does this work out of the box if the notes are in HDFS? I am facing a 
similar situation which I am guessing is related to this issue:
   https://jira.apache.org/jira/projects/ZEPPELIN/issues/ZEPPELIN-3987




[jira] [Created] (ZEPPELIN-3987) Zeppelin 0.9.0 fail to access Notebooks from HDFS

2019-02-03 Thread Maziyar PANAHI (JIRA)
Maziyar PANAHI created ZEPPELIN-3987:


 Summary: Zeppelin 0.9.0 fail to access Notebooks from HDFS
 Key: ZEPPELIN-3987
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3987
 Project: Zeppelin
  Issue Type: Bug
Affects Versions: 0.9.0
 Environment: Cloudera 6.1

Spark 2.4

Hadoop 3.0

Shiro, LDAP
Reporter: Maziyar PANAHI
 Attachments: Screenshot 2019-02-03 17.59.35.png

Hi,

I have built Zeppelin-0.9.0-SNAPSHOT and copied my configs from the previous 
version, 0.8.2, into this new directory. Usually, every version after 0.8.0 
(0.8.1, 0.8.2) fetches all the notebooks from HDFS immediately after start. 
However, in 0.9.0 the UI is empty, and the logs also indicate that the 
notebooks were not read. 
{code:java}
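<property>
  <name>zeppelin.notebook.storage</name>
  <value>org.apache.zeppelin.notebook.repo.FileSystemNotebookRepo</value>
  <description>hadoop compatible file system notebook persistence layer implementation</description>
</property>
<property>
  <name>zeppelin.notebook.dir</name>
  <value>hdfs://hadoop-master-1:8020/user/zeppelin/notebook</value>
  <description>path or URI for notebook persist</description>
</property>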
{code}
The startup logs:

 
{code:java}
INFO [2019-02-03 17:55:41,797] ({main} ZeppelinConfiguration.java[create]:127) 
- Load configuration from 
file:/opt/zeppelin-0.9.0-SNAPSHOT/conf/zeppelin-site.xml
INFO [2019-02-03 17:55:41,856] ({main} ZeppelinConfiguration.java[create]:135) 
- Server Host: 0.0.0.0
INFO [2019-02-03 17:55:41,857] ({main} ZeppelinConfiguration.java[create]:137) 
- Server Port: 8080
INFO [2019-02-03 17:55:41,857] ({main} ZeppelinConfiguration.java[create]:141) 
- Context Path: /
INFO [2019-02-03 17:55:41,857] ({main} ZeppelinConfiguration.java[create]:142) 
- Zeppelin Version: 0.9.0-SNAPSHOT
INFO [2019-02-03 17:55:41,876] ({main} Log.java[initialized]:193) - Logging 
initialized @440ms to org.eclipse.jetty.util.log.Slf4jLog
WARN [2019-02-03 17:55:41,994] ({main} 
ServerConnector.java[setSoLingerTime]:458) - Ignoring deprecated socket close 
linger time
INFO [2019-02-03 17:55:42,064] ({main} 
ZeppelinServer.java[setupWebAppContext]:403) - ZeppelinServer Webapp path: 
/opt/zeppelin-0.9.0-SNAPSHOT/webapps
WARN [2019-02-03 17:55:42,223] ({main} 
NotebookAuthorization.java[getInstance]:79) - Notebook authorization module was 
called without initialization, initializing with default configuration
WARN [2019-02-03 17:55:42,225] ({main} 
ZeppelinConfiguration.java[getConfigFSDir]:545) - zeppelin.config.fs.dir is not 
specified, fall back to local conf directory zeppelin.conf.dir
WARN [2019-02-03 17:55:42,225] ({main} 
ZeppelinConfiguration.java[getConfigFSDir]:545) - zeppelin.config.fs.dir is not 
specified, fall back to local conf directory zeppelin.conf.dir
INFO [2019-02-03 17:55:42,225] ({main} 
LocalConfigStorage.java[loadNotebookAuthorization]:84) - Load notebook 
authorization from file: 
/opt/zeppelin-0.9.0-SNAPSHOT/conf/notebook-authorization.json
INFO [2019-02-03 17:55:42,279] ({main} Credentials.java[loadFromFile]:121) - 
/opt/zeppelin-0.9.0-SNAPSHOT/conf/credentials.json
INFO [2019-02-03 17:55:42,350] ({main} NotebookServer.java[]:145) - 
NotebookServer instantiated: org.apache.zeppelin.socket.NotebookServer@ae13544
INFO [2019-02-03 17:55:42,350] ({main} 
NotebookServer.java[setServiceLocator]:150) - Injected ServiceLocator: 
ServiceLocatorImpl(shared-locator,0,1089504328)
INFO [2019-02-03 17:55:42,351] ({main} NotebookServer.java[setNotebook]:156) - 
Injected NotebookProvider
INFO [2019-02-03 17:55:42,353] ({main} 
NotebookServer.java[setNotebookService]:163) - Injected NotebookServiceProvider
INFO [2019-02-03 17:55:42,359] ({main} ZeppelinServer.java[main]:233) - 
Starting zeppelin server
INFO [2019-02-03 17:55:42,361] ({main} Server.java[doStart]:370) - 
jetty-9.4.14.v20181114; built: 2018-11-14T21:20:31.478Z; git: 
c4550056e785fb5665914545889f21dc136ad9e6; jvm 1.8.0_201-b09
INFO [2019-02-03 17:55:44,696] ({main} 
StandardDescriptorProcessor.java[visitServlet]:283) - NO JSP Support for /, did 
not find org.eclipse.jetty.jsp.JettyJspServlet
INFO [2019-02-03 17:55:44,711] ({main} 
DefaultSessionIdManager.java[doStart]:365) - DefaultSessionIdManager 
workerName=node0
INFO [2019-02-03 17:55:44,711] ({main} 
DefaultSessionIdManager.java[doStart]:370) - No SessionScavenger set, using 
defaults
INFO [2019-02-03 17:55:44,713] ({main} HouseKeeper.java[startScavenging]:149) - 
node0 Scavenging every 66ms
INFO [2019-02-03 17:55:44,720] ({main} ContextHandler.java[log]:2345) - 
Initializing Shiro environment
INFO [2019-02-03 17:55:44,720] ({main} 
EnvironmentLoader.java[initEnvironment]:133) - Starting Shiro environment 
initialization.
INFO [2019-02-03 17:55:45,078] ({main} IniRealm.java[processDefinitions]:188) - 
IniRealm defined, but there is no [users] section defined. This realm will not 
be populated with any users and it is assumed that they will be populated 
programatically. Users must be defined for this Realm instance to be useful.
INFO [2019-02-03 17:55:45,078] 

[jira] [Created] (ZEPPELIN-3986) Cannot access any JAR in yarn cluster mode

2019-02-03 Thread Maziyar PANAHI (JIRA)
Maziyar PANAHI created ZEPPELIN-3986:


 Summary: Cannot access any JAR in yarn cluster mode
 Key: ZEPPELIN-3986
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3986
 Project: Zeppelin
  Issue Type: Bug
  Components: Interpreters
Affects Versions: 0.8.1, 0.8.2
 Environment: Cloudera/CDH 6.1

Spark 2.4

Hadoop 3.0

Zeppelin 0.8.2 (built from the latest merged pull request)
Reporter: Maziyar PANAHI


Hello,

YARN cluster mode was introduced in `0.8.0`, and the failure to find 
ZeppelinContext was fixed in `0.8.1`. However, I am having difficulty accessing 
any JAR in order to `import` it inside my notebook.

I have a CDH cluster where everything works in deployMode `client`, but the 
moment I switch to `cluster` and the driver is no longer on the same machine as 
the Zeppelin server, it can't find the packages.

Working configs:

Inside interpreter:

master: yarn

spark.submit.deployMode: client

Inside `zeppelin-env.sh`:

 
{code:java}
export SPARK_SUBMIT_OPTIONS="--jars 
hdfs:///user/maziyar/jars/zeppelin/graphframes/graphframes-assembly-0.7.0-spark2.3-SNAPSHOT.jar"
{code}
 

Since the JAR is already on HDFS, switching to `cluster` should be as simple as 
changing `spark.submit.deployMode` to `cluster`. However, doing that results 
in:

 
{code:java}
import org.graphframes._

<console>:23: error: object graphframes is not a member of package org
import org.graphframes._
{code}
I can see my JAR in Spark UI in `spark.yarn.dist.jars` and 
`spark.yarn.secondary.jars` in both cluster and client mode.

 

In client mode, `sc.jars` returns:

 
{code:java}
res2: Seq[String] = 
List(file:/opt/zeppelin-0.8.2-new/interpreter/spark/spark-interpreter-0.8.2-SNAPSHOT.jar){code}
 

However, in `cluster` mode the same command returns an empty result. I thought 
maybe there is something extra or missing in the Zeppelin Spark interpreter 
that does not allow the JAR to be used in cluster mode.

 

This is how Spark UI reports my JAR in `client` mode:

 

 

 

 
|spark.repl.local.jars |file:/tmp/spark-3aadfe3c-8821-4dfe-875b-744c2e35a95a/graphframes-assembly-0.7.0-spark2.3-SNAPSHOT.jar|
|spark.yarn.dist.jars |hdfs://hadoop-master-1:8020/user/mpanahi/jars/zeppelin/graphframes/graphframes-assembly-0.7.0-spark2.3-SNAPSHOT.jar|
|spark.yarn.secondary.jars|graphframes-assembly-0.7.0-spark2.3-SNAPSHOT.jar|
|sun.java.command|org.apache.spark.deploy.SparkSubmit --master yarn --conf spark.executor.memory=5g --conf spark.driver.memory=8g --conf spark.driver.cores=4 --conf spark.yarn.isPython=true --conf spark.driver.extraClassPath=:/opt/zeppelin-0.8.2-new/interpreter/spark/*:/opt/zeppelin-0.8.2-new/zeppelin-interpreter/target/lib/*::/opt/zeppelin-0.8.2-new/zeppelin-interpreter/target/classes:/opt/zeppelin-0.8.2-new/zeppelin-interpreter/target/test-classes:/opt/zeppelin-0.8.2-new/zeppelin-zengine/target/test-classes:/opt/zeppelin-0.8.2-new/interpreter/spark/spark-interpreter-0.8.2-SNAPSHOT.jar --conf spark.useHiveContext=true --conf spark.app.name=Zeppelin --conf spark.executor.cores=5 --conf spark.submit.deployMode=client --conf spark.dynamicAllocation.maxExecutors=50 --conf spark.dynamicAllocation.initialExecutors=1 --conf spark.dynamicAllocation.enabled=true --conf spark.driver.extraJavaOptions= -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///opt/zeppelin-0.8.2-new/conf/log4j.properties -Dzeppelin.log.file=/var/log/zeppelin/zeppelin-interpreter-spark-mpanahi-zeppelin-hadoop-gateway.log --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer --jars hdfs:///user/mpanahi/jars/zeppelin/graphframes/graphframes-assembly-0.7.0-spark2.3-SNAPSHOT.jar,|

 

This is how Spark UI reports my JAR in `cluster` mode (same configs as I 
mentioned above):

  
|spark.repl.local.jars |This field does not exist in cluster mode|
|spark.yarn.dist.jars |hdfs://hadoop-master-1:8020/user/mpanahi/jars/zeppelin/graphframes/graphframes-assembly-0.7.0-spark2.3-SNAPSHOT.jar|
|spark.yarn.secondary.jars|graphframes-assembly-0.7.0-spark2.3-SNAPSHOT.jar|
|sun.java.command|org.apache.spark.deploy.yarn.ApplicationMaster --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer --jar file:/opt/zeppelin-0.8.2-new/interpreter/spark/spark-interpreter-0.8.2-SNAPSHOT.jar --arg 134.158.74.122 --arg 46130 --arg : --properties-file /yarn/nm/usercache/mpanahi/appcache/application_1547731772080_0077/container_1547731772080_0077_01_01/__spark_conf__/__spark_conf__.properties|

 

Thank you.

 


