[jira] [Created] (ZEPPELIN-2208) docs suggest put -Dspark.executor.memory in ZEPPELIN_JAVA_OPTS
Egor Pahomov created ZEPPELIN-2208:
--

Summary: docs suggest put -Dspark.executor.memory in ZEPPELIN_JAVA_OPTS
Key: ZEPPELIN-2208
URL: https://issues.apache.org/jira/browse/ZEPPELIN-2208
Project: Zeppelin
Issue Type: Bug
Components: documentation
Affects Versions: 0.7.0
Reporter: Egor Pahomov
Priority: Minor

Piece of zeppelin-env.sh.template:
{code}
# export ZEPPELIN_JAVA_OPTS    # Additional jvm options. for example, export ZEPPELIN_JAVA_OPTS="-Dspark.executor.memory=8g -Dspark.cores.max=16"
{code}
Zeppelin doc at https://zeppelin.apache.org/docs/0.7.0/install/upgrade.html:
{code}
From 0.7, we don't use ZEPPELIN_JAVA_OPTS as default value of ZEPPELIN_INTP_JAVA_OPTS and also the same for ZEPPELIN_MEM/ZEPPELIN_INTP_MEM. If user want to configure the jvm opts of interpreter process, please set ZEPPELIN_INTP_JAVA_OPTS and ZEPPELIN_INTP_MEM explicitly. If you don't set ZEPPELIN_INTP_MEM, Zeppelin will set it to -Xms1024m -Xmx1024m -XX:MaxPermSize=512m by default.
{code}
But if that is true, what is the purpose of specifying -Dspark.executor.memory in ZEPPELIN_JAVA_OPTS? It would not go into ZEPPELIN_INTP_JAVA_OPTS, yet the interpreter process is where this setting belongs.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
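Following the quoted upgrade note, a zeppelin-env.sh sketch for 0.7+ would move the spark properties to the interpreter-side variables; the 8g/16 values below are just the template's example figures:

```shell
# Sketch for zeppelin-env.sh on Zeppelin 0.7+, per the upgrade note above.
# JVM options for the interpreter process (where spark.executor.memory belongs):
export ZEPPELIN_INTP_JAVA_OPTS="-Dspark.executor.memory=8g -Dspark.cores.max=16"
# Interpreter process memory; this value is also the documented default:
export ZEPPELIN_INTP_MEM="-Xms1024m -Xmx1024m -XX:MaxPermSize=512m"
```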
[jira] [Created] (ZEPPELIN-1923) I can build project with scala 2.10, but not scala 2.11.
Egor Pahomov created ZEPPELIN-1923:
--

Summary: I can build project with scala 2.10, but not scala 2.11.
Key: ZEPPELIN-1923
URL: https://issues.apache.org/jira/browse/ZEPPELIN-1923
Project: Zeppelin
Issue Type: Bug
Components: build
Affects Versions: 0.6.2
Reporter: Egor Pahomov

When I build SNAPSHOT with:
{code}
mvn -Pscala-2.11 -Pspark-2.0 -Phadoop-2.6 -Psparkr -Pyarn -Ppyspark -DskipTests -Pvendor-repo clean package
{code}
it fails, but the same command with scala-2.10 works. I could not figure out from the travis file how the scala-2.11 build is even supposed to work.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (ZEPPELIN-1922) Wrong version of jackson fails spark code
Egor Pahomov created ZEPPELIN-1922:
--

Summary: Wrong version of jackson fails spark code
Key: ZEPPELIN-1922
URL: https://issues.apache.org/jira/browse/ZEPPELIN-1922
Project: Zeppelin
Issue Type: Bug
Affects Versions: 0.6.2
Reporter: Egor Pahomov
Priority: Blocker

With spark 2.0.2 and yarn turned on (really, everything turned on), a version conflict prevents spark code from executing. I'm getting:
{code}
import org.apache.commons.io.IOUtils
import java.net.URL
import java.nio.charset.Charset
java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.rdd.RDDOperationScope$
  at org.apache.spark.SparkContext.withScope(SparkContext.scala:679)
  at org.apache.spark.SparkContext.parallelize(SparkContext.scala:693)
{code}
and sometimes some sane information about a jackson issue. It seems I'm not the only one: http://stackoverflow.com/questions/40329663/zeppelin-java-lang-noclassdeffounderror-could-not-initialize-class-org-apache-s
The way I've built the docker image: https://github.com/epahomov/docker-zeppelin/blob/master/Dockerfile

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (ZEPPELIN-1885) Add retry functionality to JDBC interpreter
Egor Pahomov created ZEPPELIN-1885:
--

Summary: Add retry functionality to JDBC interpreter
Key: ZEPPELIN-1885
URL: https://issues.apache.org/jira/browse/ZEPPELIN-1885
Project: Zeppelin
Issue Type: New Feature
Components: Interpreters
Reporter: Egor Pahomov

I use the JDBC interpreter to query against Impala, and the connection sometimes gets lost. I want to add retry (reconnect) functionality for such cases.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
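A minimal sketch of the proposed retry behavior, in Python for brevity (the actual interpreter is Java; `with_retry`, `run_query`, and the retry parameters here are illustrative, not Zeppelin APIs):

```python
import time

def with_retry(query_fn, retries=3, delay=0.0, retriable=(ConnectionError,)):
    # Re-run query_fn up to `retries` times, swallowing transient
    # connection failures; re-raise the last error if all attempts fail.
    last_error = None
    for attempt in range(retries):
        try:
            return query_fn()
        except retriable as e:
            last_error = e
            time.sleep(delay)  # a real implementation would back off and reconnect here
    raise last_error

# Simulated flaky Impala query: drops the connection twice, then succeeds.
calls = {"n": 0}
def run_query():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection lost")
    return "42 rows"

result = with_retry(run_query)
print(result)  # → 42 rows
```

The point of the wrapper is that only errors listed in `retriable` trigger a retry; query errors such as bad SQL still fail immediately.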
[jira] [Created] (ZEPPELIN-1715) Add button "execute all blocks, starting from this one"
Egor Pahomov created ZEPPELIN-1715:
--

Summary: Add button "execute all blocks, starting from this one"
Key: ZEPPELIN-1715
URL: https://issues.apache.org/jira/browse/ZEPPELIN-1715
Project: Zeppelin
Issue Type: Improvement
Components: GUI
Reporter: Egor Pahomov
Priority: Minor

I have a long notebook. I've changed something in the middle and want to run all blocks below that one. Right now I have to do it manually; I would love a dedicated button for that.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (ZEPPELIN-1442) Pyspark in Zeppelin does not support UDF
Egor Pahomov created ZEPPELIN-1442:
--

Summary: Pyspark in Zeppelin does not support UDF
Key: ZEPPELIN-1442
URL: https://issues.apache.org/jira/browse/ZEPPELIN-1442
Project: Zeppelin
Issue Type: Bug
Components: pySpark, python-interpreter
Affects Versions: 0.6.1
Reporter: Egor Pahomov

It worked in 0.5.6 with spark 1.6.2. On 2.0 I've checked that when I run plain pySpark everything works fine. I've built zeppelin with:
{code}
mvn clean package -Pspark-2.0 -Phadoop-2.6 -Pyarn -Ppyspark -DskipTests
{code}
{code}
%pyspark
from pyspark.sql import SQLContext, Row
from pyspark.sql.types import StringType, IntegerType, StructType, StructField, MapType, FloatType, ArrayType
from pyspark.sql.functions import udf

sqlContext.registerFunction("stringLengthString", lambda x: len(x))
{code}
Returns:
{code}
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-3201872850141735060.py", line 266, in
    raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-3201872850141735060.py", line 264, in
    exec(code)
  File "", line 7, in
  File "/home/egor/spark/python/pyspark/sql/context.py", line 203, in registerFunction
    self.sparkSession.catalog.registerFunction(name, f, returnType)
AttributeError: 'JavaMember' object has no attribute 'registerFunction'
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)