[GitHub] [zeppelin] felixcheung commented on a change in pull request #3337: [ZEPPELIN-4078] Ipython queue performance

2019-03-20 Thread GitBox
felixcheung commented on a change in pull request #3337: [ZEPPELIN-4078] 
Ipython queue performance
URL: https://github.com/apache/zeppelin/pull/3337#discussion_r267638063
 
 

 ##
 File path: python/src/main/resources/grpc/python/ipython_server.py
 ##
 @@ -52,24 +52,19 @@ def execute(self, request, context):
 print("execute code:\n")
 print(request.code.encode('utf-8'))
 sys.stdout.flush()
-stdout_queue = queue.Queue(maxsize = 10)
-stderr_queue = queue.Queue(maxsize = 10)
-image_queue = queue.Queue(maxsize = 5)
-
+stream_reply_queue = queue.Queue(maxsize = 20)
 
 Review comment:
   should maxsize be a bit more?
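
For context on the question above, here is a minimal sketch of what `maxsize` means on a standard-library `queue.Queue`. Once the queue holds `maxsize` items, `put` blocks (and `put_nowait` raises `queue.Full`) until a consumer removes something. The size of 2 and the helper name `fill_until_full` are illustrative assumptions, not taken from the PR.

```python
import queue


def fill_until_full(q):
    """Put integers into q without blocking; return how many fit."""
    count = 0
    while True:
        try:
            q.put_nowait(count)
        except queue.Full:
            return count
        count += 1


small = queue.Queue(maxsize=2)
accepted = fill_until_full(small)   # a blocking producer would stall after this many items
small.get_nowait()                  # a consumer frees one slot
small.put_nowait("next")            # so the next put succeeds again
```

A larger `maxsize` lets the producer run further ahead of a slow consumer, at the cost of more buffered output in memory.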


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [zeppelin] felixcheung commented on a change in pull request #3337: [ZEPPELIN-4078] Ipython queue performance

2019-03-20 Thread GitBox
felixcheung commented on a change in pull request #3337: [ZEPPELIN-4078] 
Ipython queue performance
URL: https://github.com/apache/zeppelin/pull/3337#discussion_r267637879
 
 

 ##
 File path: python/src/main/resources/grpc/python/ipython_server.py
 ##
 @@ -52,24 +52,19 @@ def execute(self, request, context):
 print("execute code:\n")
 print(request.code.encode('utf-8'))
 sys.stdout.flush()
-stdout_queue = queue.Queue(maxsize = 10)
-stderr_queue = queue.Queue(maxsize = 10)
-image_queue = queue.Queue(maxsize = 5)
-
+stream_reply_queue = queue.Queue(maxsize = 20)
+payload_reply = []
 def _output_hook(msg):
 msg_type = msg['header']['msg_type']
 content = msg['content']
 if msg_type == 'stream':
-stdout_queue.put(content['text'])
+stream_reply_queue.put(ipython_pb2.ExecuteResponse(status=ipython_pb2.SUCCESS, type=ipython_pb2.TEXT, output=content['text']))
+elif msg_type == 'error':
+stream_reply_queue.put(ipython_pb2.ExecuteResponse(status=ipython_pb2.ERROR, type=ipython_pb2.TEXT, output='\n'.join(content['traceback'])))
 elif msg_type in ('display_data', 'execute_result'):
-stdout_queue.put(content['data'].get('text/plain', ''))
+stream_reply_queue.put(ipython_pb2.ExecuteResponse(status=ipython_pb2.SUCCESS, type=ipython_pb2.TEXT, output=content['data'].get('text/plain', '')))
 if 'image/png' in content['data']:
-image_queue.put(content['data']['image/png'])
-elif msg_type == 'error':
-stderr_queue.put('\n'.join(content['traceback']))
-
-
-payload_reply = []
+stream_reply_queue.put(ipython_pb2.ExecuteResponse(status=ipython_pb2.SUCCESS, type=ipython_pb2.IMAGE, output=content['data']['image/png']))
 
 Review comment:
   This is a bit long and hard to read. Consider refactoring `stream_reply_queue.put` into a separate line.




[jira] [Created] (ZEPPELIN-4082) Error occured when using UDF with scoped notebook

2019-03-20 Thread Jin-Hyeok, Cha (JIRA)
Jin-Hyeok, Cha created ZEPPELIN-4082:


 Summary: Error occured when using UDF with scoped notebook
 Key: ZEPPELIN-4082
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-4082
 Project: Zeppelin
  Issue Type: Bug
  Components: Interpreters
Affects Versions: 0.8.1
 Environment: * Zeppelin v0.8.1
 * Spark v2.4.0 (1 Master, N Workers)
 * Hadoop (Embedded, Maybe v2.7.x)
 * The interpreter will be instantiated *Per Note* in *scoped* process
Reporter: Jin-Hyeok, Cha


When I defined my own function with the UDF (User-Defined Function) feature, I 
got an error message like this:

 
{code:java}
java.lang.ClassCastException: cannot assign instance of 
scala.collection.immutable.List$SerializationProxy to field 
org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type 
scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
{code}
 

 

I just defined the simple function:

 
{code:java}
import java.text.SimpleDateFormat

def diffHour(s1: String, s2: String): Long = {
  var hour = 0L
  try {
    val sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
    val d1 = sdf.parse(s1)
    val d2 = sdf.parse(s2)
    hour = d2.getTime - d1.getTime
    hour /= 1000 * 60 * 60
  } catch {
    case e: Exception => hour = -1
  }
  hour
}
{code}
 

And registered my function to Spark SQL Context:
{code:java}
sqlContext.udf.register("diffHour", diffHour _)
{code}
I then expected to be able to use my function in SQL.

 
{code:java}
%sql
SELECT
  id,
  time,
  diffHour(time, '2019-01-01 00:00:00') as hour
FROM users{code}
But the error I mentioned at first occurred.

 

 

I used the *Per Note* and *scoped* settings for the Spark interpreter.
When I changed the interpreter setting to *Globally*, the error no longer occurred.

 

How can I fix it?

Please help me.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [zeppelin] zjffdu opened a new pull request #3338: ZEPPELIN-4081. when the python process is killed, the task state is still running

2019-03-20 Thread GitBox
zjffdu opened a new pull request #3338: ZEPPELIN-4081. when the python process 
is killed,the task state is still running
URL: https://github.com/apache/zeppelin/pull/3338
 
 
   ### What is this PR for?
   This PR aborts Python code execution if the Python process has exited. 
Besides that, I also improved the error message for the IPython interpreter, 
although it doesn't have this issue.
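
The general idea can be sketched as follows, as an illustration rather than the code in this PR: while waiting for a result, also check whether the child process is still alive, and stop waiting with an error if it is not. The function name `wait_for_result`, the `is_done` callback, and the error string are hypothetical; only `subprocess.Popen.poll()` and related calls are standard-library APIs.

```python
import subprocess
import sys
import time


def wait_for_result(proc, is_done, poll_interval=0.05):
    """Wait until is_done() is true, or fail fast if proc has exited."""
    while not is_done():
        if proc.poll() is not None:      # child is gone: stop waiting
            return "ERROR: python process exited with code %s" % proc.returncode
        time.sleep(poll_interval)
    return "SUCCESS"


# Simulate the scenario from the Jira issue: a long sleep, then a kill.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1000)"])
child.kill()
child.wait()
state = wait_for_result(child, is_done=lambda: False)
```

Without the `poll()` check, the loop above would wait forever after the kill, which matches the "still running" state described in the issue.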
   
   
   ### What type of PR is it?
   [Bug Fix]
   
   ### Todos
   * [ ] - Task
   
   ### What is the Jira issue?
   * https://issues.apache.org/jira/browse/ZEPPELIN-4081
   
   ### How should this be tested?
   * Unit test is added
   
   ### Screenshots (if appropriate)
   
   ### Questions:
   * Does the licenses files need update? No
   * Is there breaking changes for older versions? No
   * Does this needs documentation? No
   




[jira] [Created] (ZEPPELIN-4081) when the python process is killed,the task state is still running

2019-03-20 Thread MOBIN (JIRA)
MOBIN created ZEPPELIN-4081:
---

 Summary: when the python process is killed,the task state is still 
running
 Key: ZEPPELIN-4081
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-4081
 Project: Zeppelin
  Issue Type: Bug
  Components: python-interpreter
Affects Versions: 0.8.0
 Environment: centOS-6.5

java version "1.7.0_75"
Reporter: MOBIN


When I execute the following Python code and then kill the Python process, the 
task state is still running:
{code:java}
import time
print("start")
time.sleep(1000)
print("end")
{code}
 

 





[GitHub] [zeppelin] AyWa commented on issue #3336: [ZEPPELIN-4078] Fix concurrent autocomplete and execute for Ipython

2019-03-20 Thread GitBox
AyWa commented on issue #3336: [ZEPPELIN-4078] Fix concurrent autocomplete and 
execute for Ipython
URL: https://github.com/apache/zeppelin/pull/3336#issuecomment-475073644
 
 
   @Leemoonsoo Thank you for the info. I guess it was because of `new 
Properties()`. I pushed a change to use `initIntpProperties()` in the test. 
Let's hope it will pass 🤞 




[GitHub] [zeppelin] jongyoul commented on a change in pull request #3316: [ZEPPELIN-3985]. Move note permission from notebook-authorization.json to note file

2019-03-20 Thread GitBox
jongyoul commented on a change in pull request #3316: [ZEPPELIN-3985]. Move 
note permission from notebook-authorization.json to note file
URL: https://github.com/apache/zeppelin/pull/3316#discussion_r267533084
 
 

 ##
 File path: 
zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/AuthorizationService.java
 ##
 @@ -0,0 +1,249 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zeppelin.notebook;
+
+import com.google.common.base.Predicate;
+import com.google.common.collect.FluentIterable;
+import com.google.common.collect.Sets;
+import org.apache.commons.lang.StringUtils;
+import org.apache.zeppelin.conf.ZeppelinConfiguration;
+import org.apache.zeppelin.user.AuthenticationInfo;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import javax.inject.Inject;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+/**
+ * This class is responsible for maintain notes authorization info. And provide api for
+ * setting and querying note authorization info.
+ */
+public class AuthorizationService {
+
+  private static final Logger LOGGER = LoggerFactory.getLogger(AuthorizationService.class);
+  private static final Set<String> EMPTY_SET = new HashSet<>();
+
+  private ZeppelinConfiguration conf;
+  private Notebook notebook;
+  /*
+   * contains roles for each user
+   */
+  private Map<String, Set<String>> userRoles = new HashMap<>();
+
+  @Inject
+  public AuthorizationService(Notebook notebook) {
 
 Review comment:
   It would be better to have `ZeppelinConfiguration` injected as a parameter 
of this constructor.




[GitHub] [zeppelin] Leemoonsoo edited a comment on issue #3336: [ZEPPELIN-4078] Fix concurrent autocomplete and execute for Ipython

2019-03-20 Thread GitBox
Leemoonsoo edited a comment on issue #3336: [ZEPPELIN-4078] Fix concurrent 
autocomplete and execute for Ipython
URL: https://github.com/apache/zeppelin/pull/3336#issuecomment-474924351
 
 
   Thanks @AyWa for the contribution. A test is failing with
   
   ```
   13:09:18,331  INFO org.apache.zeppelin.spark.OldSparkInterpreter:338 - -- Create new SparkContext local ---
   13:09:18,335  INFO org.apache.spark.SparkContext:58 - Running Spark version 1.6.3
   13:09:18,338 ERROR org.apache.spark.SparkContext:95 - Error initializing SparkContext.
   org.apache.spark.SparkException: An application name must be set in your configuration
       at org.apache.spark.SparkContext.<init>(SparkContext.scala:404)
       at org.apache.zeppelin.spark.OldSparkInterpreter.createSparkContext_1(OldSparkInterpreter.java:426)
       at org.apache.zeppelin.spark.OldSparkInterpreter.createSparkContext(OldSparkInterpreter.java:321)
       at org.apache.zeppelin.spark.OldSparkInterpreter.getSparkContext(OldSparkInterpreter.java:139)
       at org.apache.zeppelin.spark.OldSparkInterpreter.open(OldSparkInterpreter.java:696)
       at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:66)
       at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
       at org.apache.zeppelin.interpreter.Interpreter.getInterpreterInTheSameSessionByClassName(Interpreter.java:354)
       at org.apache.zeppelin.interpreter.Interpreter.getInterpreterInTheSameSessionByClassName(Interpreter.java:365)
       at org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:52)
       at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
       at org.apache.zeppelin.spark.IPySparkInterpreterTest.startInterpreter(IPySparkInterpreterTest.java:93)
       at org.apache.zeppelin.python.IPythonInterpreterTest.testIpython_shouldNotHang_whenCallingAutoCompleteAndInterpretConcurrently(IPythonInterpreterTest.java:250)
   
   ```
   
   Looks like https://github.com/apache/zeppelin/blob/master/spark/interpreter/src/test/java/org/apache/zeppelin/spark/IPySparkInterpreterTest.java#L59 is somehow not applied in the test.




[GitHub] [zeppelin] zjffdu commented on issue #3316: [ZEPPELIN-3985]. Move note permission from notebook-authorization.json to note file

2019-03-20 Thread GitBox
zjffdu commented on issue #3316: [ZEPPELIN-3985]. Move note permission from 
notebook-authorization.json to note file
URL: https://github.com/apache/zeppelin/pull/3316#issuecomment-474852742
 
 
   @felixcheung Could you help review it ? Thanks




[GitHub] [zeppelin] zjffdu commented on issue #3336: [ZEPPELIN-4078] Fix concurrent autocomplete and execute for Ipython

2019-03-20 Thread GitBox
zjffdu commented on issue #3336: [ZEPPELIN-4078] Fix concurrent autocomplete 
and execute for Ipython
URL: https://github.com/apache/zeppelin/pull/3336#issuecomment-474766783
 
 
   Thanks @AyWa LGTM




[GitHub] [zeppelin] AyWa opened a new pull request #3337: [ZEPPELIN-4078] Ipython queue performance

2019-03-20 Thread GitBox
AyWa opened a new pull request #3337: [ZEPPELIN-4078] Ipython queue performance
URL: https://github.com/apache/zeppelin/pull/3337
 
 
   ### What is this PR for?
   
   This PR fixes a bug that makes the **ipython** queue listener overuse CPU. 
After this fix, CPU usage should be much lower. 
   There is also a small refactor to use a single queue, so that messages stay 
in order even with a sleep.
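
As an illustration of the two points above (not necessarily the exact change in this PR): a listener that calls a blocking `get` with a timeout sleeps while the queue is empty instead of spinning, and routing every message type through one queue preserves production order. The names `producer`, `consume`, and `_DONE`, and the message tuples, are assumptions made for this sketch.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def producer(q):
    # stdout and stderr messages share one queue, so the consumer
    # sees them in exactly the order they were produced.
    q.put(("stdout", "step 1"))
    q.put(("stderr", "warning"))
    q.put(("stdout", "step 2"))
    q.put(_DONE)


def consume(q):
    """Collect messages using a blocking get instead of a busy loop."""
    received = []
    while True:
        try:
            item = q.get(timeout=0.5)   # thread sleeps until data arrives or the timeout expires
        except queue.Empty:
            continue                    # nothing yet; a real loop could check liveness here
        if item is _DONE:
            return received
        received.append(item)


q = queue.Queue(maxsize=20)
t = threading.Thread(target=producer, args=(q,))
t.start()
messages = consume(q)
t.join()
```

With separate queues per stream, the consumer would have to poll each one in turn and could interleave output differently from how it was emitted.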
   
   ### What type of PR is it?
   Bug Fix / performance improvement
   
   ### Todos
   * [x] - Performance improvement
   * [ ] - Need to add some performance test ? or other test ?
   
   ### What is the Jira issue?
   It is one part of the jira issue.
   https://issues.apache.org/jira/browse/ZEPPELIN-4078
   
   ### How should this be tested?
   * First time? Setup Travis CI as described on 
https://zeppelin.apache.org/contribution/contributions.html#continuous-integration
   * Strongly recommended: add automated unit tests for any new or changed 
behavior
   * Outline any manual steps to test the PR here.
   
   ### Screenshots (if appropriate)
   
   ### Questions:
   * Does the licenses files need update? no
   * Is there breaking changes for older versions? no
   * Does this needs documentation? no
   




[GitHub] [zeppelin] AyWa opened a new pull request #3336: [ZEPPELIN-4078] Fix concurrent autocomplete and execute for Ipython

2019-03-20 Thread GitBox
AyWa opened a new pull request #3336: [ZEPPELIN-4078] Fix concurrent 
autocomplete and execute for Ipython
URL: https://github.com/apache/zeppelin/pull/3336
 
 
   ### What is this PR for?
   
   This PR fixes a bug that makes the **ipython** `execute_interactive` call 
hang forever if an auto `complete` call is made at the same time (see the unit 
test, which fails on master, for an example).
   
   For now the fix is to synchronize the two methods, `execute` and `complete`. 
This should not introduce a regression because the kernel does not support 
concurrent execute and autocomplete anyway (see 
https://github.com/jupyter/notebook/issues/3763).
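
A minimal sketch of this kind of synchronization, assuming a single `threading.Lock` shared by both methods. The class `KernelFacade` and its bookkeeping fields are hypothetical and only stand in for the real server class; the actual kernel calls are omitted.

```python
import threading


class KernelFacade:
    """Hypothetical wrapper that serializes execute and complete requests."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.max_active = 0     # highest number of requests seen inside the lock at once
        self.calls = []         # order in which requests actually ran

    def _run(self, name):
        with self._lock:        # only one request reaches the kernel at a time
            self._active += 1
            self.max_active = max(self.max_active, self._active)
            self.calls.append(name)
            self._active -= 1

    def execute(self, code):
        self._run("execute")

    def complete(self, code, cursor_pos):
        self._run("complete")


facade = KernelFacade()
threads = [threading.Thread(target=facade.execute, args=("1 + 1",)) for _ in range(5)]
threads += [threading.Thread(target=facade.complete, args=("pri", 3)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The trade-off is that a completion request waits for a long-running `execute` to finish, which is acceptable if the kernel cannot serve both at once anyway.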
   
   ### What type of PR is it?
   Bug Fix
   
   ### Todos
   * [x] - unit test failing in master / succeed on this branch
   * [x] - fix with lock
   
   ### What is the Jira issue?
   It is one part of the Jira issue. Other fixes will come soon.
   https://issues.apache.org/jira/browse/ZEPPELIN-4078
   
   ### How should this be tested?
   * First time? Setup Travis CI as described on 
https://zeppelin.apache.org/contribution/contributions.html#continuous-integration
   * Strongly recommended: add automated unit tests for any new or changed 
behavior
   * Outline any manual steps to test the PR here.
   
   ### Screenshots (if appropriate)
   
   ### Questions:
   * Does the licenses files need update? no
   * Is there breaking changes for older versions? no
   * Does this needs documentation? no
   

