Re: Optimizing Spark interpreter startup

2023-05-03 Thread Jeff Zhang
Hi Vladimir,

Have you compared it with the Spark shell? I think it is similar to the Spark shell.

On Wed, May 3, 2023 at 10:12 PM Vladimir Prus 
wrote:

> Hi,
>
> I was profiling the startup time of Spark Interpreter in our environment,
> and it looks like
> a total of 5 seconds is spent at this line in
> SparkScala212Interpreter.scala:
>
> sparkILoop.initializeSynchronous()
>
> That line, eventually, calls nsc.Global constructor, which spends 5
> seconds creating mirrors
> of every class on the classpath. Obviously, most users will never care
> about most of those
> classes.
>
> Any ideas on how this can be sped up, maybe by only looking at key spark
> classes?
>
> [image: image.png]
>
> --
> Vladimir Prus
> http://vladimirprus.com
>


-- 
Best Regards

Jeff Zhang


Optimizing Spark interpreter startup

2023-05-03 Thread Vladimir Prus
Hi,

I was profiling the startup time of the Spark interpreter in our environment,
and it looks like
a total of 5 seconds is spent at this line in
SparkScala212Interpreter.scala:

sparkILoop.initializeSynchronous()

That line eventually calls the nsc.Global constructor, which spends 5 seconds
creating mirrors
of every class on the classpath. Obviously, most users will never care
about most of those
classes.

Any ideas on how this can be sped up, maybe by only looking at key spark
classes?

[image: image.png]
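
One idea I was considering, as a rough sketch only (the filtering below is an
assumption, not something SparkScala212Interpreter.scala does today): instead of
relying on -usejavacp, which hands nsc.Global the entire JVM classpath to mirror,
pass an explicitly trimmed classpath to the REPL Settings so that only the
Spark/Scala/Zeppelin jars are scanned at startup.

import java.io.File
import scala.tools.nsc.Settings

def trimmedReplSettings(outputDir: File): Settings = {
  val settings = new Settings()
  settings.processArguments(List("-Yrepl-class-based",
    "-Yrepl-outdir", outputDir.getAbsolutePath), true)

  // Keep only the jars notebook users typically need in the REPL; user jars added
  // via spark.jars would still reach the interpreter later (assumption).
  val fullCp = System.getProperty("java.class.path").split(File.pathSeparator)
  val keep = fullCp.filter { p =>
    p.contains("spark") || p.contains("scala-library") || p.contains("zeppelin")
  }
  settings.classpath.value = keep.mkString(File.pathSeparator)
  settings
}

Whether this actually removes the 5 seconds would need measuring, and it may break
completion for classes that are no longer mirrored, so treat it as a starting point
rather than a fix.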

-- 
Vladimir Prus
http://vladimirprus.com


Running spark interpreter in 0.10.0 docker image fails to delete files

2021-11-02 Thread brian
Hi,

I have set up Zeppelin to use docker to run interpreters using

<property>
  <name>zeppelin.run.mode</name>
  <value>docker</value>
  <description>'auto|local|k8s|docker'</description>
</property>

<property>
  <name>zeppelin.docker.container.image</name>
  <value>apache/zeppelin:0.10.0</value>
  <description>Docker image for interpreters</description>
</property>


This works so far, as interpreter images are being launched and used;
however, I run into an issue with the Spark interpreter.
In the 0.10.0 docker image, the DockerInterpreterProcess tries to delete all
interpreters that are not relevant for Spark, and there I get a permission
denied message. The Dockerfile uses USER 1000 while the Zeppelin files in the
image are owned by root, so the delete fails.

This happens in
org/apache/zeppelin/interpreter/launcher/DockerInterpreterProcess.java
private void rmInContainer(String containerId, String path)

Please advise.
Thanks.

Cheers
Brian


Log snippet:
[...]
rm: cannot remove
'/opt/zeppelin/interpreter/angular/zeppelin-angular-0.10.0.jar'
 INFO [2021-11-02 09:29:29,030] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:591) - : Permission denied

 INFO [2021-11-02 09:29:29,030] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:591) - rm: cannot remove
'/opt/zeppelin/interpreter/angular/META-INF/LICENSE': Permission denied
rm: cannot remove '/opt/zeppelin/interpreter/angular/META-INF/DEPENDENCIES':
Permission denied

 INFO [2021-11-02 09:29:29,030] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:591) - rm: cannot remove
'/opt/zeppelin/interpreter/angular/META-INF/NOTICE': Permission denied

 INFO [2021-11-02 09:29:29,030] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:591) - rm: cannot remove
'/opt/zeppelin/interpreter/flink-cmd/interpreter-setting.json': Permissi
on denied
rm: cannot remove
'/opt/zeppelin/interpreter/flink-cmd/zeppelin-flink-cmd-0.10.0.jar'
 INFO [2021-11-02 09:29:29,030] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:591) - : Permission denied

 INFO [2021-11-02 09:29:29,030] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:591) - rm: cannot remove
'/opt/zeppelin/interpreter/flink-cmd/META-INF/LICENSE': Permission denied
rm: cannot remove
'/opt/zeppelin/interpreter/flink-cmd/META-INF/DEPENDENCIES': Permission
denied
INFO [2021-11-02 09:29:29,030] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:591) - rm: cannot remove
'/opt/zeppelin/interpreter/flink-cmd/META-INF/NOTICE': Permission denied

 INFO [2021-11-02 09:29:29,031] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:581) - exec container
commmand: mkdir /opt/zeppelin -p
 INFO [2021-11-02 09:29:29,082] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:581) - exec container
commmand: mkdir /opt/zeppelin/conf -p
 WARN [2021-11-02 09:29:29,114] ({SchedulerFactory3}
DockerInterpreterProcess.java[copyRunFileToContainer]:452) - /etc/krb5.conf
file not found, Did not upload the krb5.conf to the container!
 INFO [2021-11-02 09:29:29,114] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:581) - exec container
commmand: mkdir /opt/zeppelin/bin -p
 INFO [2021-11-02 09:29:29,177] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:581) - exec container
commmand: mkdir /opt/zeppelin/interpreter/spark -p
 INFO [2021-11-02 09:29:40,185] ({SchedulerFactory3}
DockerInterpreterProcess.java[execInContainer]:581) - exec container
commmand: mkdir /tmp/zeppelin-tar -p
 WARN [2021-11-02 09:29:40,404] ({SchedulerFactory3}
NotebookServer.java[onStatusChange]:1986) - Job 20180530-101750_1491737301
is finished, status: ERROR, exception: null, result: %text
java.lang.IllegalStateException: Container
8d64e1b1c58327771aa08678f555f95b3d0bbca37cb7b7662220594a4f39f2da is not
running.
at
com.spotify.docker.client.DefaultDockerClient.execCreate(DefaultDockerClient
.java:1650)
at
org.apache.zeppelin.interpreter.launcher.DockerInterpreterProcess.execInCont
ainer(DockerInterpreterProcess.java:584)
at
org.apache.zeppelin.interpreter.launcher.DockerInterpreterProcess.mkdirInCon
tainer(DockerInterpreterProcess.java:563)
at
org.apache.zeppelin.interpreter.launcher.DockerInterpreterProcess.deployToCo
ntainer(DockerInterpreterProcess.java:539)
at
org.apache.zeppelin.interpreter.launcher.DockerInterpreterProcess.copyRunFil
eToContainer(DockerInterpreterProcess.java:533)
at
org.apache.zeppelin.interpreter.launcher.DockerInterpreterProcess.start(Dock
erInterpreterProcess.java:237)
at
org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpret
erProcess(ManagedInterpreterGroup.java:68)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpre
terProcess(RemoteInterpreter.java:104)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(Rem
oteInterpreter.java:154)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpre
ter.java:126

Re: Scala 2.12 version mismatch for Spark Interpreter

2021-10-28 Thread Mich Talebzadeh
Apologies, that should say the docker image should be on 3.1.1.



   view my Linkedin profile




*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Thu, 28 Oct 2021 at 14:34, Mich Talebzadeh 
wrote:

> you should go for Spark 3.1.1 for k8s. That is the tried and tested one
> for Kubernetes in Spark 3 series, meaning the docker image should be on
> .1.1 and your client which I think is used to submit spark-submit on k8s
> should also be on 3.1.1
>
> HTH
>
>
>view my Linkedin profile
> 
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Thu, 28 Oct 2021 at 13:13, Jeff Zhang  wrote:
>
>> Hi Fabrizio,
>>
>> Spark 3.2.0 is supported recently in this PR
>> https://github.com/apache/zeppelin/pull/4257
>> The problem you mentioned is solved.
>>
>> Fabrizio Fab  于2021年10月28日周四 下午7:43写道:
>>
>>> I am aware that Spark 3.20 is not officially released, but I am trying
>>> to put it to work.
>>>
>>> The first thing that I noticed is the following:
>>>
>>> the SparkInterpreter is compiled for Scala 2.12.7
>>>
>>> Spark 3.2 is compiled for Scala 2.12.15
>>>
>>> Unfortunately there are some breaking changes between the two versions
>>> (even if only the minor version has changed... W.T.F. ??)  that requires a
>>> recompiling (I hope no code update)..
>>>
>>> The first incompatibily I run into is at line 66 of
>>> SparkScala212Interpreter.scala
>>> val settings = new Settings()
>>> settings.processArguments(List("-Yrepl-class-based",
>>>   "-Yrepl-outdir", s"${outputDir.getAbsolutePath}"), true)
>>> settings.embeddedDefaults(sparkInterpreterClassLoader)
>>>
>>> -->settings.usejavacp.value = true  <--
>>>
>>> scala.tools.nsc.Settings.usejavacp was moved since 2.12.13 from
>>> AbsSettings to MutableSettings, so you  get a runtime error.
>>>
>>>
>>> I'll make you know if I'll resolve all problems.
>>>
>>>
>>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>


Re: Scala 2.12 version mismatch for Spark Interpreter

2021-10-28 Thread Mich Talebzadeh
You should go for Spark 3.1.1 for k8s. That is the tried and tested one for
Kubernetes in the Spark 3 series, meaning the docker image should be on .1.1
and your client, which I think is used to run spark-submit on k8s, should
also be on 3.1.1.

HTH


   view my Linkedin profile




*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Thu, 28 Oct 2021 at 13:13, Jeff Zhang  wrote:

> Hi Fabrizio,
>
> Spark 3.2.0 is supported recently in this PR
> https://github.com/apache/zeppelin/pull/4257
> The problem you mentioned is solved.
>
> Fabrizio Fab  于2021年10月28日周四 下午7:43写道:
>
>> I am aware that Spark 3.20 is not officially released, but I am trying to
>> put it to work.
>>
>> The first thing that I noticed is the following:
>>
>> the SparkInterpreter is compiled for Scala 2.12.7
>>
>> Spark 3.2 is compiled for Scala 2.12.15
>>
>> Unfortunately there are some breaking changes between the two versions
>> (even if only the minor version has changed... W.T.F. ??)  that requires a
>> recompiling (I hope no code update)..
>>
>> The first incompatibily I run into is at line 66 of
>> SparkScala212Interpreter.scala
>> val settings = new Settings()
>> settings.processArguments(List("-Yrepl-class-based",
>>   "-Yrepl-outdir", s"${outputDir.getAbsolutePath}"), true)
>> settings.embeddedDefaults(sparkInterpreterClassLoader)
>>
>> -->settings.usejavacp.value = true  <--
>>
>> scala.tools.nsc.Settings.usejavacp was moved since 2.12.13 from
>> AbsSettings to MutableSettings, so you  get a runtime error.
>>
>>
>> I'll make you know if I'll resolve all problems.
>>
>>
>>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: Scala 2.12 version mismatch for Spark Interpreter

2021-10-28 Thread Jeff Zhang
Hi Fabrizio,

Spark 3.2.0 support was added recently in this PR:
https://github.com/apache/zeppelin/pull/4257
The problem you mentioned is solved there.

Fabrizio Fab  于2021年10月28日周四 下午7:43写道:

> I am aware that Spark 3.20 is not officially released, but I am trying to
> put it to work.
>
> The first thing that I noticed is the following:
>
> the SparkInterpreter is compiled for Scala 2.12.7
>
> Spark 3.2 is compiled for Scala 2.12.15
>
> Unfortunately there are some breaking changes between the two versions
> (even if only the minor version has changed... W.T.F. ??)  that requires a
> recompiling (I hope no code update)..
>
> The first incompatibily I run into is at line 66 of
> SparkScala212Interpreter.scala
> val settings = new Settings()
> settings.processArguments(List("-Yrepl-class-based",
>   "-Yrepl-outdir", s"${outputDir.getAbsolutePath}"), true)
> settings.embeddedDefaults(sparkInterpreterClassLoader)
>
> -->settings.usejavacp.value = true  <--
>
> scala.tools.nsc.Settings.usejavacp was moved since 2.12.13 from
> AbsSettings to MutableSettings, so you  get a runtime error.
>
>
> I'll make you know if I'll resolve all problems.
>
>
>

-- 
Best Regards

Jeff Zhang


Scala 2.12 version mismatch for Spark Interpreter

2021-10-28 Thread Fabrizio Fab
I am aware that Spark 3.2.0 is not officially released, but I am trying to get
it to work.

The first thing that I noticed is the following:

the SparkInterpreter is compiled for Scala 2.12.7

Spark 3.2 is compiled for Scala 2.12.15

Unfortunately there are some breaking changes between the two versions (even if
only the minor version has changed... W.T.F. ??) that require recompiling
(hopefully with no code changes).

The first incompatibility I ran into is at line 66 of
SparkScala212Interpreter.scala:
val settings = new Settings()
settings.processArguments(List("-Yrepl-class-based",
  "-Yrepl-outdir", s"${outputDir.getAbsolutePath}"), true)
settings.embeddedDefaults(sparkInterpreterClassLoader)

-->settings.usejavacp.value = true  <--

scala.tools.nsc.Settings.usejavacp was moved in 2.12.13 from AbsSettings to
MutableSettings, so you get a runtime error.
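
A rough sketch of a workaround idea, in case it helps anyone hitting the same error
before a proper rebuild (this is only an assumption on my side, not the official
fix, which is to compile the interpreter against the matching Scala version): avoid
linking against the Settings.usejavacp accessor altogether by passing -usejavacp
through processArguments, which is parsed at runtime instead of being resolved at
compile time.

import scala.tools.nsc.Settings

val outputDir = java.nio.file.Files.createTempDirectory("spark-repl-out").toFile
// stand-in for the sparkInterpreterClassLoader used in the real code
val replClassLoader = Thread.currentThread().getContextClassLoader

val settings = new Settings()
settings.processArguments(List("-Yrepl-class-based",
  "-Yrepl-outdir", outputDir.getAbsolutePath,
  "-usejavacp"), true)  // replaces settings.usejavacp.value = true
settings.embeddedDefaults(replClassLoader)

I have not verified this against the 2.12.15 binaries, so treat it as untested.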


I'll let you know if I resolve all the problems.




Re: CVE-2019-10095: Apache Zeppelin: bash command injection in spark interpreter

2021-09-28 Thread Michiel Haisma
Hi Jeff, others,

Can you please provide additional information regarding this vulnerability?
Please include the following information:

 * A technical description of the vulnerability and how users can determine
whether they are impacted. Maybe this is satisfied by one of the following items:
 * Relevant issue in Zeppelin Jira issue tracker.
 * Link to pull request or commit containing the fix.
 * List of released versions containing the fix.

I would also highly suggest providing these additional details in one of the 
vulnerability databases (e.g. https://nvd.nist.gov/vuln/detail/CVE-2019-10095) 
so that users have a better understanding of the impact and solutions.

Many thanks,

Michiel

On 2021/09/02 15:56:50, Jeff Zhang  wrote:
> Description:>
>
> bash command injection vulnerability in Apache Zeppelin allows an attacker to 
> inject system commands into Spark interpreter settings. This issue affects 
> Apache Zeppelin Apache Zeppelin version 0.9.0 and prior versions.>
>
> Credit:>
>
> Apache Zeppelin would like to thank HERE Security team for reporting this 
> issue >
>
>


CVE-2019-10095: Apache Zeppelin: bash command injection in spark interpreter

2021-09-02 Thread Jeff Zhang
Description:

A bash command injection vulnerability in Apache Zeppelin allows an attacker to
inject system commands into Spark interpreter settings. This issue affects
Apache Zeppelin version 0.9.0 and prior versions.

Credit:

Apache Zeppelin would like to thank the HERE Security team for reporting this issue.



Re: Local spark interpreter with extra java options

2021-07-25 Thread Lior Chaga
Awesome, thanks Jeff.

On Sun, Jul 25, 2021 at 11:24 AM Jeff Zhang  wrote:

> Hi Lior,
>
> It would be fixed in https://github.com/apache/zeppelin/pull/4127
>
>
> Lior Chaga  于2021年7月25日周日 下午3:58写道:
>
>> After a couple of attempts of code fixes, when every time I seemed to
>> make things work just to find out the next step in the process breaks, I've
>> found the most simple solution - put them extraJavaOptions in
>> spark-defaults.conf (instead of keeping them in interpreter settings)
>>
>>
>>
>> On Sun, Jul 11, 2021 at 1:30 PM Lior Chaga  wrote:
>>
>>> Thanks Jeff,
>>> So I should escape the whitespaces? Is there a ticket for it? couldn't
>>> find one
>>>
>>> On Sun, Jul 11, 2021 at 1:10 PM Jeff Zhang  wrote:
>>>
>>>> I believe this is due to SparkInterpreterLauncher doesn't support
>>>> parameters with whitespace. (It would use whitespace as delimiter to
>>>> separate parameters), this is a known issue
>>>>
>>>> Lior Chaga  于2021年7月11日周日 下午4:14写道:
>>>>
>>>>> So after adding the quotes in both SparkInterpreterLauncher
>>>>> and interpreter.sh, interpreter is still failing with same error of
>>>>> Unrecognized option.
>>>>> But the weird thing is that if I copy the command supposedly executed
>>>>> from zeppelin (as it is printed to log) and run it directly in shell, the
>>>>> interpreter process is properly running. So my guess is that the forked
>>>>> process command that is created, is not really identical to the one that 
>>>>> is
>>>>> logged.
>>>>>
>>>>> This is how my cmd looks like (censored a bit):
>>>>>
>>>>> /usr/local/spark/bin/spark-submit
>>>>> --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
>>>>> --driver-class-path
>>>>> :/zeppelin/local-repo/spark/*:/zeppelin/interpreter/spark/*:::/zeppelin/inter
>>>>> preter/zeppelin-interpreter-shaded-0.10.0-SNAPSHOT.jar:/zeppelin/interpreter/spark/spark-interpreter-0.10.0-SNAPSHOT.jar:/etc/hadoop/conf
>>>>>
>>>>> *--driver-java-options " -DSERVICENAME=zeppelin_docker
>>>>> -Dfile.encoding=UTF-8
>>>>> -Dlog4j.configuration=file:///zeppelin/conf/log4j.properties
>>>>> -Dlog4j.configurationFile=file:///zeppelin/conf/log4j2.properties
>>>>> -Dzeppelin.log.file=/var/log/zeppelin/zeppelin-interpreter-spark-shared_process--zeppelin-test-spark3-7d74d5df4-2g8x5.log"
>>>>> *
>>>>> --conf spark.driver.host=10.135.120.245
>>>>> --conf "spark.dynamicAllocation.minExecutors=1"
>>>>> --conf "spark.shuffle.service.enabled=true"
>>>>> --conf "spark.sql.parquet.int96AsTimestamp=true"
>>>>> --conf "spark.ui.retainedTasks=1"
>>>>> --conf "spark.executor.heartbeatInterval=600s"
>>>>> --conf "spark.ui.retainedJobs=100"
>>>>> --conf "spark.sql.ui.retainedExecutions=10"
>>>>> --conf "spark.hadoop.cloneConf=true"
>>>>> --conf "spark.debug.maxToStringFields=20"
>>>>> --conf "spark.executor.memory=70g"
>>>>> --conf
>>>>> "spark.executor.extraClassPath=../mysql-connector-java-8.0.18.jar:../guava-19.0.jar"
>>>>>
>>>>> --conf "spark.hadoop.fs.permissions.umask-mode=000"
>>>>> --conf "spark.memory.storageFraction=0.1"
>>>>> --conf "spark.scheduler.mode=FAIR"
>>>>> --conf "spark.sql.adaptive.enabled=true"
>>>>> --conf
>>>>> "spark.master=mesos://zk://zk003:2181,zk004:2181,zk006:2181,/mesos-zeppelin"
>>>>>
>>>>> --conf "spark.driver.memory=15g"
>>>>> --conf "spark.io.compression.codec=lz4"
>>>>> --conf "spark.executor.uri=
>>>>> https://artifactory.company.com/artifactory/static/spark/spark-dist/spark-3.1.2.2-hadoop-2.7-zulu;
>>>>> -
>>>>> -conf "spark.ui.retainedStages=500"
>>>>> --conf "spark.mesos.uris=
>>>>> https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/mysql-connector-java-8.0.18.jar,https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/guava-19.0.jar;
>>>>>
>>>>> --conf &q

Re: Local spark interpreter with extra java options

2021-07-25 Thread Jeff Zhang
Hi Lior,

It will be fixed in https://github.com/apache/zeppelin/pull/4127


Lior Chaga  于2021年7月25日周日 下午3:58写道:

> After a couple of attempts of code fixes, when every time I seemed to make
> things work just to find out the next step in the process breaks, I've
> found the most simple solution - put them extraJavaOptions in
> spark-defaults.conf (instead of keeping them in interpreter settings)
>
>
>
> On Sun, Jul 11, 2021 at 1:30 PM Lior Chaga  wrote:
>
>> Thanks Jeff,
>> So I should escape the whitespaces? Is there a ticket for it? couldn't
>> find one
>>
>> On Sun, Jul 11, 2021 at 1:10 PM Jeff Zhang  wrote:
>>
>>> I believe this is due to SparkInterpreterLauncher doesn't support
>>> parameters with whitespace. (It would use whitespace as delimiter to
>>> separate parameters), this is a known issue
>>>
>>> Lior Chaga  于2021年7月11日周日 下午4:14写道:
>>>
>>>> So after adding the quotes in both SparkInterpreterLauncher
>>>> and interpreter.sh, interpreter is still failing with same error of
>>>> Unrecognized option.
>>>> But the weird thing is that if I copy the command supposedly executed
>>>> from zeppelin (as it is printed to log) and run it directly in shell, the
>>>> interpreter process is properly running. So my guess is that the forked
>>>> process command that is created, is not really identical to the one that is
>>>> logged.
>>>>
>>>> This is how my cmd looks like (censored a bit):
>>>>
>>>> /usr/local/spark/bin/spark-submit
>>>> --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
>>>> --driver-class-path
>>>> :/zeppelin/local-repo/spark/*:/zeppelin/interpreter/spark/*:::/zeppelin/inter
>>>> preter/zeppelin-interpreter-shaded-0.10.0-SNAPSHOT.jar:/zeppelin/interpreter/spark/spark-interpreter-0.10.0-SNAPSHOT.jar:/etc/hadoop/conf
>>>>
>>>> *--driver-java-options " -DSERVICENAME=zeppelin_docker
>>>> -Dfile.encoding=UTF-8
>>>> -Dlog4j.configuration=file:///zeppelin/conf/log4j.properties
>>>> -Dlog4j.configurationFile=file:///zeppelin/conf/log4j2.properties
>>>> -Dzeppelin.log.file=/var/log/zeppelin/zeppelin-interpreter-spark-shared_process--zeppelin-test-spark3-7d74d5df4-2g8x5.log"
>>>> *
>>>> --conf spark.driver.host=10.135.120.245
>>>> --conf "spark.dynamicAllocation.minExecutors=1"
>>>> --conf "spark.shuffle.service.enabled=true"
>>>> --conf "spark.sql.parquet.int96AsTimestamp=true"
>>>> --conf "spark.ui.retainedTasks=1"
>>>> --conf "spark.executor.heartbeatInterval=600s"
>>>> --conf "spark.ui.retainedJobs=100"
>>>> --conf "spark.sql.ui.retainedExecutions=10"
>>>> --conf "spark.hadoop.cloneConf=true"
>>>> --conf "spark.debug.maxToStringFields=20"
>>>> --conf "spark.executor.memory=70g"
>>>> --conf
>>>> "spark.executor.extraClassPath=../mysql-connector-java-8.0.18.jar:../guava-19.0.jar"
>>>>
>>>> --conf "spark.hadoop.fs.permissions.umask-mode=000"
>>>> --conf "spark.memory.storageFraction=0.1"
>>>> --conf "spark.scheduler.mode=FAIR"
>>>> --conf "spark.sql.adaptive.enabled=true"
>>>> --conf
>>>> "spark.master=mesos://zk://zk003:2181,zk004:2181,zk006:2181,/mesos-zeppelin"
>>>>
>>>> --conf "spark.driver.memory=15g"
>>>> --conf "spark.io.compression.codec=lz4"
>>>> --conf "spark.executor.uri=
>>>> https://artifactory.company.com/artifactory/static/spark/spark-dist/spark-3.1.2.2-hadoop-2.7-zulu;
>>>> -
>>>> -conf "spark.ui.retainedStages=500"
>>>> --conf "spark.mesos.uris=
>>>> https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/mysql-connector-java-8.0.18.jar,https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/guava-19.0.jar;
>>>>
>>>> --conf "spark.driver.maxResultSize=8g"
>>>> *--conf "spark.executor.extraJavaOptions=-DSERVICENAME=Zeppelin
>>>> -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=2015
>>>> -XX:-OmitStackTraceInFastThrow -Dcom.sun.management.jmxremote
>>>> -Dcom.sun.management.jmxremote.port=55745
>>>> -Dcom.sun.management.jmxremote.authenticate=false
>>>>

Re: Local spark interpreter with extra java options

2021-07-25 Thread Lior Chaga
After a couple of attempts at code fixes, where every time I seemed to make
things work only to find that the next step in the process breaks, I found the
simplest solution: put the extraJavaOptions in spark-defaults.conf (instead of
keeping them in the interpreter settings).



On Sun, Jul 11, 2021 at 1:30 PM Lior Chaga  wrote:

> Thanks Jeff,
> So I should escape the whitespaces? Is there a ticket for it? couldn't
> find one
>
> On Sun, Jul 11, 2021 at 1:10 PM Jeff Zhang  wrote:
>
>> I believe this is due to SparkInterpreterLauncher doesn't support
>> parameters with whitespace. (It would use whitespace as delimiter to
>> separate parameters), this is a known issue
>>
>> Lior Chaga  于2021年7月11日周日 下午4:14写道:
>>
>>> So after adding the quotes in both SparkInterpreterLauncher
>>> and interpreter.sh, interpreter is still failing with same error of
>>> Unrecognized option.
>>> But the weird thing is that if I copy the command supposedly executed
>>> from zeppelin (as it is printed to log) and run it directly in shell, the
>>> interpreter process is properly running. So my guess is that the forked
>>> process command that is created, is not really identical to the one that is
>>> logged.
>>>
>>> This is how my cmd looks like (censored a bit):
>>>
>>> /usr/local/spark/bin/spark-submit
>>> --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
>>> --driver-class-path
>>> :/zeppelin/local-repo/spark/*:/zeppelin/interpreter/spark/*:::/zeppelin/inter
>>> preter/zeppelin-interpreter-shaded-0.10.0-SNAPSHOT.jar:/zeppelin/interpreter/spark/spark-interpreter-0.10.0-SNAPSHOT.jar:/etc/hadoop/conf
>>>
>>> *--driver-java-options " -DSERVICENAME=zeppelin_docker
>>> -Dfile.encoding=UTF-8
>>> -Dlog4j.configuration=file:///zeppelin/conf/log4j.properties
>>> -Dlog4j.configurationFile=file:///zeppelin/conf/log4j2.properties
>>> -Dzeppelin.log.file=/var/log/zeppelin/zeppelin-interpreter-spark-shared_process--zeppelin-test-spark3-7d74d5df4-2g8x5.log"
>>> *
>>> --conf spark.driver.host=10.135.120.245
>>> --conf "spark.dynamicAllocation.minExecutors=1"
>>> --conf "spark.shuffle.service.enabled=true"
>>> --conf "spark.sql.parquet.int96AsTimestamp=true"
>>> --conf "spark.ui.retainedTasks=1"
>>> --conf "spark.executor.heartbeatInterval=600s"
>>> --conf "spark.ui.retainedJobs=100"
>>> --conf "spark.sql.ui.retainedExecutions=10"
>>> --conf "spark.hadoop.cloneConf=true"
>>> --conf "spark.debug.maxToStringFields=20"
>>> --conf "spark.executor.memory=70g"
>>> --conf
>>> "spark.executor.extraClassPath=../mysql-connector-java-8.0.18.jar:../guava-19.0.jar"
>>>
>>> --conf "spark.hadoop.fs.permissions.umask-mode=000"
>>> --conf "spark.memory.storageFraction=0.1"
>>> --conf "spark.scheduler.mode=FAIR"
>>> --conf "spark.sql.adaptive.enabled=true"
>>> --conf
>>> "spark.master=mesos://zk://zk003:2181,zk004:2181,zk006:2181,/mesos-zeppelin"
>>>
>>> --conf "spark.driver.memory=15g"
>>> --conf "spark.io.compression.codec=lz4"
>>> --conf "spark.executor.uri=
>>> https://artifactory.company.com/artifactory/static/spark/spark-dist/spark-3.1.2.2-hadoop-2.7-zulu;
>>> -
>>> -conf "spark.ui.retainedStages=500"
>>> --conf "spark.mesos.uris=
>>> https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/mysql-connector-java-8.0.18.jar,https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/guava-19.0.jar;
>>>
>>> --conf "spark.driver.maxResultSize=8g"
>>> *--conf "spark.executor.extraJavaOptions=-DSERVICENAME=Zeppelin
>>> -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=2015
>>> -XX:-OmitStackTraceInFastThrow -Dcom.sun.management.jmxremote
>>> -Dcom.sun.management.jmxremote.port=55745
>>> -Dcom.sun.management.jmxremote.authenticate=false
>>> -Dcom.sun.management.jmxremote.ssl=false -verbose:gc
>>> -Dlog4j.configurationFile=/etc/config/log4j2-executor-config.xml
>>> -XX:+UseG1GC -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps
>>> -XX:+PrintGCTimeStamps -XX:+PrintFlagsFinal -XX:+PrintReferenceGC
>>> -XX:+PrintGCDetails -XX:+PrintAdaptiveSizePolicy
>>> -XX:+UnlockDiagnosticVMOptions -XX:+G1Summariz

Re: Local spark interpreter with extra java options

2021-07-11 Thread Lior Chaga
Thanks Jeff,
So I should escape the whitespace? Is there a ticket for it? I couldn't find
one.

On Sun, Jul 11, 2021 at 1:10 PM Jeff Zhang  wrote:

> I believe this is due to SparkInterpreterLauncher doesn't support
> parameters with whitespace. (It would use whitespace as delimiter to
> separate parameters), this is a known issue
>
> Lior Chaga  于2021年7月11日周日 下午4:14写道:
>
>> So after adding the quotes in both SparkInterpreterLauncher
>> and interpreter.sh, interpreter is still failing with same error of
>> Unrecognized option.
>> But the weird thing is that if I copy the command supposedly executed
>> from zeppelin (as it is printed to log) and run it directly in shell, the
>> interpreter process is properly running. So my guess is that the forked
>> process command that is created, is not really identical to the one that is
>> logged.
>>
>> This is how my cmd looks like (censored a bit):
>>
>> /usr/local/spark/bin/spark-submit
>> --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
>> --driver-class-path
>> :/zeppelin/local-repo/spark/*:/zeppelin/interpreter/spark/*:::/zeppelin/inter
>> preter/zeppelin-interpreter-shaded-0.10.0-SNAPSHOT.jar:/zeppelin/interpreter/spark/spark-interpreter-0.10.0-SNAPSHOT.jar:/etc/hadoop/conf
>>
>> *--driver-java-options " -DSERVICENAME=zeppelin_docker
>> -Dfile.encoding=UTF-8
>> -Dlog4j.configuration=file:///zeppelin/conf/log4j.properties
>> -Dlog4j.configurationFile=file:///zeppelin/conf/log4j2.properties
>> -Dzeppelin.log.file=/var/log/zeppelin/zeppelin-interpreter-spark-shared_process--zeppelin-test-spark3-7d74d5df4-2g8x5.log"
>> *
>> --conf spark.driver.host=10.135.120.245
>> --conf "spark.dynamicAllocation.minExecutors=1"
>> --conf "spark.shuffle.service.enabled=true"
>> --conf "spark.sql.parquet.int96AsTimestamp=true"
>> --conf "spark.ui.retainedTasks=1"
>> --conf "spark.executor.heartbeatInterval=600s"
>> --conf "spark.ui.retainedJobs=100"
>> --conf "spark.sql.ui.retainedExecutions=10"
>> --conf "spark.hadoop.cloneConf=true"
>> --conf "spark.debug.maxToStringFields=20"
>> --conf "spark.executor.memory=70g"
>> --conf
>> "spark.executor.extraClassPath=../mysql-connector-java-8.0.18.jar:../guava-19.0.jar"
>>
>> --conf "spark.hadoop.fs.permissions.umask-mode=000"
>> --conf "spark.memory.storageFraction=0.1"
>> --conf "spark.scheduler.mode=FAIR"
>> --conf "spark.sql.adaptive.enabled=true"
>> --conf
>> "spark.master=mesos://zk://zk003:2181,zk004:2181,zk006:2181,/mesos-zeppelin"
>>
>> --conf "spark.driver.memory=15g"
>> --conf "spark.io.compression.codec=lz4"
>> --conf "spark.executor.uri=
>> https://artifactory.company.com/artifactory/static/spark/spark-dist/spark-3.1.2.2-hadoop-2.7-zulu;
>> -
>> -conf "spark.ui.retainedStages=500"
>> --conf "spark.mesos.uris=
>> https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/mysql-connector-java-8.0.18.jar,https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/guava-19.0.jar;
>>
>> --conf "spark.driver.maxResultSize=8g"
>> *--conf "spark.executor.extraJavaOptions=-DSERVICENAME=Zeppelin
>> -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=2015
>> -XX:-OmitStackTraceInFastThrow -Dcom.sun.management.jmxremote
>> -Dcom.sun.management.jmxremote.port=55745
>> -Dcom.sun.management.jmxremote.authenticate=false
>> -Dcom.sun.management.jmxremote.ssl=false -verbose:gc
>> -Dlog4j.configurationFile=/etc/config/log4j2-executor-config.xml
>> -XX:+UseG1GC -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps
>> -XX:+PrintGCTimeStamps -XX:+PrintFlagsFinal -XX:+PrintReferenceGC
>> -XX:+PrintGCDetails -XX:+PrintAdaptiveSizePolicy
>> -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark
>> -XX:+PrintStringDeduplicationStatistics -XX:+UseStringDeduplication
>> -XX:InitiatingHeapOccupancyPercent=35
>> -Dhttps.proxyHost=proxy.service.consul -Dhttps.proxyPort=3128" *
>> --conf "spark.dynamicAllocation.enabled=true"
>> --conf "spark.default.parallelism=1200"
>> --conf "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2"
>> --conf
>> "spark.hadoop.fs.AbstractFileSystem.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"
>>
>> --conf "spark.app.name=zeppelin_docker_spark3"
>> --conf "spark.

Re: Local spark interpreter with extra java options

2021-07-11 Thread Jeff Zhang
I believe this is because SparkInterpreterLauncher doesn't support
parameters with whitespace (it uses whitespace as the delimiter to
separate parameters); this is a known issue.
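
For what it's worth, here is a minimal sketch of the underlying design point, not
the actual Zeppelin launcher code: if the spark-submit command is built as an
argument vector and started without going through a shell, a --conf value that
contains spaces stays a single argument and needs no quoting at all. The quoting
problem only appears once the whole command is flattened into one string and
re-split on whitespace.

import scala.sys.process._

// hypothetical values, for illustration only
val extraJavaOptions = "-DmyParam=1 -DmyOtherParam=2"

val cmd = Seq(
  "/usr/local/spark/bin/spark-submit",
  "--class", "org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer",
  // one list element per argument: the embedded space is preserved
  "--conf", s"spark.driver.extraJavaOptions=$extraJavaOptions")

val exitCode = cmd.!  // scala.sys.process hands the Seq to the OS as-is, no shell re-splitting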

Lior Chaga  于2021年7月11日周日 下午4:14写道:

> So after adding the quotes in both SparkInterpreterLauncher
> and interpreter.sh, interpreter is still failing with same error of
> Unrecognized option.
> But the weird thing is that if I copy the command supposedly executed from
> zeppelin (as it is printed to log) and run it directly in shell, the
> interpreter process is properly running. So my guess is that the forked
> process command that is created, is not really identical to the one that is
> logged.
>
> This is how my cmd looks like (censored a bit):
>
> /usr/local/spark/bin/spark-submit
> --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
> --driver-class-path
> :/zeppelin/local-repo/spark/*:/zeppelin/interpreter/spark/*:::/zeppelin/inter
> preter/zeppelin-interpreter-shaded-0.10.0-SNAPSHOT.jar:/zeppelin/interpreter/spark/spark-interpreter-0.10.0-SNAPSHOT.jar:/etc/hadoop/conf
>
> *--driver-java-options " -DSERVICENAME=zeppelin_docker
> -Dfile.encoding=UTF-8
> -Dlog4j.configuration=file:///zeppelin/conf/log4j.properties
> -Dlog4j.configurationFile=file:///zeppelin/conf/log4j2.properties
> -Dzeppelin.log.file=/var/log/zeppelin/zeppelin-interpreter-spark-shared_process--zeppelin-test-spark3-7d74d5df4-2g8x5.log"
> *
> --conf spark.driver.host=10.135.120.245
> --conf "spark.dynamicAllocation.minExecutors=1"
> --conf "spark.shuffle.service.enabled=true"
> --conf "spark.sql.parquet.int96AsTimestamp=true"
> --conf "spark.ui.retainedTasks=1"
> --conf "spark.executor.heartbeatInterval=600s"
> --conf "spark.ui.retainedJobs=100"
> --conf "spark.sql.ui.retainedExecutions=10"
> --conf "spark.hadoop.cloneConf=true"
> --conf "spark.debug.maxToStringFields=20"
> --conf "spark.executor.memory=70g"
> --conf
> "spark.executor.extraClassPath=../mysql-connector-java-8.0.18.jar:../guava-19.0.jar"
>
> --conf "spark.hadoop.fs.permissions.umask-mode=000"
> --conf "spark.memory.storageFraction=0.1"
> --conf "spark.scheduler.mode=FAIR"
> --conf "spark.sql.adaptive.enabled=true"
> --conf
> "spark.master=mesos://zk://zk003:2181,zk004:2181,zk006:2181,/mesos-zeppelin"
>
> --conf "spark.driver.memory=15g"
> --conf "spark.io.compression.codec=lz4"
> --conf "spark.executor.uri=
> https://artifactory.company.com/artifactory/static/spark/spark-dist/spark-3.1.2.2-hadoop-2.7-zulu;
> -
> -conf "spark.ui.retainedStages=500"
> --conf "spark.mesos.uris=
> https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/mysql-connector-java-8.0.18.jar,https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/guava-19.0.jar;
>
> --conf "spark.driver.maxResultSize=8g"
> *--conf "spark.executor.extraJavaOptions=-DSERVICENAME=Zeppelin
> -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=2015
> -XX:-OmitStackTraceInFastThrow -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=55745
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.ssl=false -verbose:gc
> -Dlog4j.configurationFile=/etc/config/log4j2-executor-config.xml
> -XX:+UseG1GC -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps
> -XX:+PrintGCTimeStamps -XX:+PrintFlagsFinal -XX:+PrintReferenceGC
> -XX:+PrintGCDetails -XX:+PrintAdaptiveSizePolicy
> -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark
> -XX:+PrintStringDeduplicationStatistics -XX:+UseStringDeduplication
> -XX:InitiatingHeapOccupancyPercent=35
> -Dhttps.proxyHost=proxy.service.consul -Dhttps.proxyPort=3128" *
> --conf "spark.dynamicAllocation.enabled=true"
> --conf "spark.default.parallelism=1200"
> --conf "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2"
> --conf
> "spark.hadoop.fs.AbstractFileSystem.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"
>
> --conf "spark.app.name=zeppelin_docker_spark3"
> --conf "spark.shuffle.service.port=7337"
> --conf "spark.memory.fraction=0.75"
> --conf "spark.mesos.coarse=true"
> --conf "spark.ui.port=4041"
> --conf "spark.dynamicAllocation.executorIdleTimeout=60s"
> --conf "spark.sql.shuffle.partitions=1200"
> --conf "spark.sql.parquet.outputTimestampType=TIMESTAMP_MILLIS"
> --conf "spark.dynamicAllocation.cachedExecutorIdleTimeout=120s"
> --conf "spark.networ

Re: Local spark interpreter with extra java options

2021-07-11 Thread Lior Chaga
So after adding the quotes in both SparkInterpreterLauncher
and interpreter.sh, the interpreter is still failing with the same
"Unrecognized option" error.
But the weird thing is that if I copy the command supposedly executed from
zeppelin (as it is printed to the log) and run it directly in a shell, the
interpreter process runs properly. So my guess is that the forked
process command that is created is not really identical to the one that is
logged.

This is what my cmd looks like (censored a bit):

/usr/local/spark/bin/spark-submit
--class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
--driver-class-path
:/zeppelin/local-repo/spark/*:/zeppelin/interpreter/spark/*:::/zeppelin/inter
preter/zeppelin-interpreter-shaded-0.10.0-SNAPSHOT.jar:/zeppelin/interpreter/spark/spark-interpreter-0.10.0-SNAPSHOT.jar:/etc/hadoop/conf

*--driver-java-options " -DSERVICENAME=zeppelin_docker
-Dfile.encoding=UTF-8
-Dlog4j.configuration=file:///zeppelin/conf/log4j.properties
-Dlog4j.configurationFile=file:///zeppelin/conf/log4j2.properties
-Dzeppelin.log.file=/var/log/zeppelin/zeppelin-interpreter-spark-shared_process--zeppelin-test-spark3-7d74d5df4-2g8x5.log"
*
--conf spark.driver.host=10.135.120.245
--conf "spark.dynamicAllocation.minExecutors=1"
--conf "spark.shuffle.service.enabled=true"
--conf "spark.sql.parquet.int96AsTimestamp=true"
--conf "spark.ui.retainedTasks=1"
--conf "spark.executor.heartbeatInterval=600s"
--conf "spark.ui.retainedJobs=100"
--conf "spark.sql.ui.retainedExecutions=10"
--conf "spark.hadoop.cloneConf=true"
--conf "spark.debug.maxToStringFields=20"
--conf "spark.executor.memory=70g"
--conf
"spark.executor.extraClassPath=../mysql-connector-java-8.0.18.jar:../guava-19.0.jar"

--conf "spark.hadoop.fs.permissions.umask-mode=000"
--conf "spark.memory.storageFraction=0.1"
--conf "spark.scheduler.mode=FAIR"
--conf "spark.sql.adaptive.enabled=true"
--conf
"spark.master=mesos://zk://zk003:2181,zk004:2181,zk006:2181,/mesos-zeppelin"

--conf "spark.driver.memory=15g"
--conf "spark.io.compression.codec=lz4"
--conf "spark.executor.uri=
https://artifactory.company.com/artifactory/static/spark/spark-dist/spark-3.1.2.2-hadoop-2.7-zulu;
-
-conf "spark.ui.retainedStages=500"
--conf "spark.mesos.uris=
https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/mysql-connector-java-8.0.18.jar,https://artifactory.company.com/artifactory/static/spark/spark-executor/jars/guava-19.0.jar;

--conf "spark.driver.maxResultSize=8g"
*--conf "spark.executor.extraJavaOptions=-DSERVICENAME=Zeppelin
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=2015
-XX:-OmitStackTraceInFastThrow -Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=55745
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false -verbose:gc
-Dlog4j.configurationFile=/etc/config/log4j2-executor-config.xml
-XX:+UseG1GC -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps
-XX:+PrintGCTimeStamps -XX:+PrintFlagsFinal -XX:+PrintReferenceGC
-XX:+PrintGCDetails -XX:+PrintAdaptiveSizePolicy
-XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark
-XX:+PrintStringDeduplicationStatistics -XX:+UseStringDeduplication
-XX:InitiatingHeapOccupancyPercent=35
-Dhttps.proxyHost=proxy.service.consul -Dhttps.proxyPort=3128" *
--conf "spark.dynamicAllocation.enabled=true"
--conf "spark.default.parallelism=1200"
--conf "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2"
--conf
"spark.hadoop.fs.AbstractFileSystem.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"

--conf "spark.app.name=zeppelin_docker_spark3"
--conf "spark.shuffle.service.port=7337"
--conf "spark.memory.fraction=0.75"
--conf "spark.mesos.coarse=true"
--conf "spark.ui.port=4041"
--conf "spark.dynamicAllocation.executorIdleTimeout=60s"
--conf "spark.sql.shuffle.partitions=1200"
--conf "spark.sql.parquet.outputTimestampType=TIMESTAMP_MILLIS"
--conf "spark.dynamicAllocation.cachedExecutorIdleTimeout=120s"
--conf "spark.network.timeout=1200s"
--conf "spark.cores.max=600"
--conf
"spark.hadoop.fs.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"

--conf "spark.worker.timeout=15"
*--conf
"spark.driver.extraJavaOptions=-Dhttps.proxyHost=proxy.service.consul
-Dhttps.proxyPort=3128
-Dlog4j.configuration=file:/usr/local/spark/conf/log4j.properties
-Djavax.jdo.option.ConnectionDriverName=com.mysql.cj.jdbc.Driver
-Djavax.jdo.option.ConnectionPassword=2eebb22277
-Djavax.jdo.option.ConnectionURL=jdbc:mysql://proxysql-backend.service.consul.company.com:6033/hms?useSSL=false=SCHEMA=true
<http://proxysql-backen

Re: Local spark interpreter with extra java options

2021-07-08 Thread Jeff Zhang
Thanks Lior for the investigation.


Lior Chaga  于2021年7月8日周四 下午8:31写道:

> Ok, I think I found the issue. It's not only that the quotations are
> missing from the --conf param, they are also missing from
> the --driver-java-options, which is concatenated to
> the INTERPRETER_RUN_COMMAND in interpreter.sh
>
> I will fix it in my build, but would like a confirmation that this is
> indeed the issue (and I'm not missing anything), so I'd open a pull
> request.
>
> On Thu, Jul 8, 2021 at 3:05 PM Lior Chaga  wrote:
>
>> I'm trying to run zeppelin using local spark interpreter.
>> Basically everything works, but if I try to set
>> `spark.driver.extraJavaOptions` or `spark.executor.extraJavaOptions`
>> containing several arguments, I get an exception.
>> For instance, for providing `-DmyParam=1 -DmyOtherParam=2`, I'd get:
>> Error: Unrecognized option: -DmyOtherParam=2
>>
>> I noticed that the spark submit looks as follow:
>>
>> spark-submit --class
>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 
>> --driver-class-path
>>    *--conf spark.driver.extraJavaOptions=-DmyParam=1
>> -DmyOtherParam=2*
>>
>> So I tried to patch SparkInterpreterLauncher to add quotation marks (like
>> in the example from spark documentation -
>> https://spark.apache.org/docs/latest/configuration.html#dynamically-loading-spark-properties
>> )
>>
>> I see that the quotation marks were added: *--conf
>> "spark.driver.extraJavaOptions=-DmyParam=1 -DmyOtherParam=2"*
>> But I still get the same error.
>>
>> Any idea how I can make it work?
>>
>

-- 
Best Regards

Jeff Zhang


Re: Local spark interpreter with extra java options

2021-07-08 Thread Lior Chaga
Ok, I think I found the issue. It's not only that the quotation marks are
missing from the --conf param; they are also missing from
the --driver-java-options value, which is concatenated to
the INTERPRETER_RUN_COMMAND in interpreter.sh.

I will fix it in my build, but I would like confirmation that this is
indeed the issue (and that I'm not missing anything) before I open a pull
request.

On Thu, Jul 8, 2021 at 3:05 PM Lior Chaga  wrote:

> I'm trying to run zeppelin using local spark interpreter.
> Basically everything works, but if I try to set
> `spark.driver.extraJavaOptions` or `spark.executor.extraJavaOptions`
> containing several arguments, I get an exception.
> For instance, for providing `-DmyParam=1 -DmyOtherParam=2`, I'd get:
> Error: Unrecognized option: -DmyOtherParam=2
>
> I noticed that the spark submit looks as follow:
>
> spark-submit --class
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 
> --driver-class-path
>    *--conf spark.driver.extraJavaOptions=-DmyParam=1 -DmyOtherParam=2*
>
> So I tried to patch SparkInterpreterLauncher to add quotation marks (like
> in the example from spark documentation -
> https://spark.apache.org/docs/latest/configuration.html#dynamically-loading-spark-properties
> )
>
> I see that the quotation marks were added: *--conf
> "spark.driver.extraJavaOptions=-DmyParam=1 -DmyOtherParam=2"*
> But I still get the same error.
>
> Any idea how I can make it work?
>


Local spark interpreter with extra java options

2021-07-08 Thread Lior Chaga
I'm trying to run zeppelin using the local spark interpreter.
Basically everything works, but if I try to set
`spark.driver.extraJavaOptions` or `spark.executor.extraJavaOptions`
to a value containing several arguments, I get an exception.
For instance, for providing `-DmyParam=1 -DmyOtherParam=2`, I'd get:
Error: Unrecognized option: -DmyOtherParam=2

I noticed that the spark submit looks as follows:

spark-submit --class
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
--driver-class-path
   *--conf spark.driver.extraJavaOptions=-DmyParam=1 -DmyOtherParam=2*

So I tried to patch SparkInterpreterLauncher to add quotation marks (like
in the example from the spark documentation -
https://spark.apache.org/docs/latest/configuration.html#dynamically-loading-spark-properties
)

I see that the quotation marks were added: *--conf
"spark.driver.extraJavaOptions=-DmyParam=1 -DmyOtherParam=2"*
But I still get the same error.

Any idea how I can make it work?


Re: Running spark interpreter with zeppelin on k8s

2021-06-23 Thread Lior Chaga
Ok, I've tried it.
Indeed it doesn't look for a spark pod. There are other issues, though; if I
can't overcome them I'll open a new thread.
Thanks Jeff!

On Wed, Jun 23, 2021 at 11:40 AM Jeff Zhang  wrote:

> set zeppelin.run.mode in zeppelin-site.xml to be local
>
> Lior Chaga  于2021年6月23日周三 下午4:35写道:
>
>> I'm trying to deploy zeppelin 0.10 on k8s, using following manual build:
>>
>> mvn clean package -DskipTests -Pspark-scala-2.12 -Pinclude-hadoop 
>> -Pspark-3.0 -Phadoop2  -Pbuild-distr  -pl 
>> zeppelin-interpreter,zeppelin-zengine,spark/interpreter,spark/spark-dependencies,zeppelin-web,zeppelin-server,zeppelin-distribion,jdbc,zeppelin-plugins/notebookrepo/filesystem,zeppelin-plugins/launcher/k8s-standard
>>  -am
>>
>>
>> Spark itself is configured to use mesos as resource manager.
>> It seems as if when trying to start the spark
>> interpreter, K8sRemoteInterpreterProcess tries to find a sidecar pod for
>> spark interpreter:
>>
>> Pod pod = client.pods().inNamespace(namespace).withName(podName).get();
>>
>> Is there any option not to have spark interpreter as a separate pod, and
>> instead just create the spark context within the zeppelin process? I'm
>> trying to understand if I could make zeppelin
>> use K8sStandardInterpreterLauncher instead (I assume it's an alternative to
>> the remote interpreter?)
>>
>> Thanks,
>> Lior
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: Running spark interpreter with zeppelin on k8s

2021-06-23 Thread Jeff Zhang
set zeppelin.run.mode in zeppelin-site.xml to be local

Lior Chaga  于2021年6月23日周三 下午4:35写道:

> I'm trying to deploy zeppelin 0.10 on k8s, using following manual build:
>
> mvn clean package -DskipTests -Pspark-scala-2.12 -Pinclude-hadoop -Pspark-3.0 
> -Phadoop2  -Pbuild-distr  -pl 
> zeppelin-interpreter,zeppelin-zengine,spark/interpreter,spark/spark-dependencies,zeppelin-web,zeppelin-server,zeppelin-distribion,jdbc,zeppelin-plugins/notebookrepo/filesystem,zeppelin-plugins/launcher/k8s-standard
>  -am
>
>
> Spark itself is configured to use mesos as resource manager.
> It seems as if when trying to start the spark
> interpreter, K8sRemoteInterpreterProcess tries to find a sidecar pod for
> spark interpreter:
>
> Pod pod = client.pods().inNamespace(namespace).withName(podName).get();
>
> Is there any option not to have spark interpreter as a separate pod, and
> instead just create the spark context within the zeppelin process? I'm
> trying to understand if I could make zeppelin
> use K8sStandardInterpreterLauncher instead (I assume it's an alternative to
> the remote interpreter?)
>
> Thanks,
> Lior
>


-- 
Best Regards

Jeff Zhang


Running spark interpreter with zeppelin on k8s

2021-06-23 Thread Lior Chaga
I'm trying to deploy zeppelin 0.10 on k8s, using the following manual build:

mvn clean package -DskipTests -Pspark-scala-2.12 -Pinclude-hadoop
-Pspark-3.0 -Phadoop2  -Pbuild-distr  -pl
zeppelin-interpreter,zeppelin-zengine,spark/interpreter,spark/spark-dependencies,zeppelin-web,zeppelin-server,zeppelin-distribion,jdbc,zeppelin-plugins/notebookrepo/filesystem,zeppelin-plugins/launcher/k8s-standard
-am


Spark itself is configured to use mesos as the resource manager.
It seems that when trying to start the spark
interpreter, K8sRemoteInterpreterProcess tries to find a sidecar pod for
the spark interpreter:

Pod pod = client.pods().inNamespace(namespace).withName(podName).get();

Is there any option not to have the spark interpreter as a separate pod, and
instead just create the spark context within the zeppelin process? I'm
trying to understand if I could make zeppelin
use K8sStandardInterpreterLauncher instead (I assume it's an alternative to
the remote interpreter?)

Thanks,
Lior


Re: Custom init for Spark interpreter

2021-05-21 Thread Jeff Zhang
Thanks Vladimir

Vladimir Prus  于2021年5月21日周五 下午5:35写道:

> Jeff,
>
> thanks for the response. I've created
> https://issues.apache.org/jira/browse/ZEPPELIN-5386 and will see whether
> I can make a generic patch for this.
>
> On Fri, May 21, 2021 at 11:21 AM Jeff Zhang  wrote:
>
>> Right,we have hooks for each paragraph execution, but no interpreter
>> process level hook. Could you create a ticket for that ? And welcome to
>> contribute.
>>
>> Vladimir Prus  于2021年5月21日周五 下午4:16写道:
>>
>>>
>>> Hi,
>>>
>>> is there a way, when using Spark interpreter, to always run additional
>>> Scala code after startup? E.g. I want to automatically execute
>>>
>>>  import com.joom.whatever._
>>>
>>> so that users don't have to do it all the time. I see that
>>> BaseSparkScalaInterpreter.spark2CreateContext imports a few thing, but the
>>> code does not appear to support customizing this sequence.
>>>
>>> --
>>> Vladimir Prus
>>> http://vladimirprus.com
>>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>
> --
> Vladimir Prus
> http://vladimirprus.com
>


-- 
Best Regards

Jeff Zhang


Re: Custom init for Spark interpreter

2021-05-21 Thread Vladimir Prus
Jeff,

thanks for the response. I've created
https://issues.apache.org/jira/browse/ZEPPELIN-5386 and will see whether I
can make a generic patch for this.

On Fri, May 21, 2021 at 11:21 AM Jeff Zhang  wrote:

> Right,we have hooks for each paragraph execution, but no interpreter
> process level hook. Could you create a ticket for that ? And welcome to
> contribute.
>
> Vladimir Prus  于2021年5月21日周五 下午4:16写道:
>
>>
>> Hi,
>>
>> is there a way, when using Spark interpreter, to always run additional
>> Scala code after startup? E.g. I want to automatically execute
>>
>>  import com.joom.whatever._
>>
>> so that users don't have to do it all the time. I see that
>> BaseSparkScalaInterpreter.spark2CreateContext imports a few thing, but the
>> code does not appear to support customizing this sequence.
>>
>> --
>> Vladimir Prus
>> http://vladimirprus.com
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


-- 
Vladimir Prus
http://vladimirprus.com


Re: Custom init for Spark interpreter

2021-05-21 Thread Jeff Zhang
Right, we have hooks for each paragraph execution, but no interpreter
process level hook. Could you create a ticket for that? You are welcome to
contribute.
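
As a stopgap until a process-level hook exists, the per-paragraph hooks might
already cover the import use case. If I remember the experimental ZeppelinContext
hook API correctly (treat the exact event name and signature as assumptions), a
note could register the import once and have it re-run before every paragraph:

%spark
// re-runs the import before each paragraph in this note; redundant but cheap
// compared to a real process-level init
z.registerHook("pre_exec", "import com.joom.whatever._")

The downside is that every user/note still has to run this once, which is exactly
what a process-level init would avoid.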

Vladimir Prus  于2021年5月21日周五 下午4:16写道:

>
> Hi,
>
> is there a way, when using Spark interpreter, to always run additional
> Scala code after startup? E.g. I want to automatically execute
>
>  import com.joom.whatever._
>
> so that users don't have to do it all the time. I see that
> BaseSparkScalaInterpreter.spark2CreateContext imports a few thing, but the
> code does not appear to support customizing this sequence.
>
> --
> Vladimir Prus
> http://vladimirprus.com
>


-- 
Best Regards

Jeff Zhang


Custom init for Spark interpreter

2021-05-21 Thread Vladimir Prus
Hi,

is there a way, when using the Spark interpreter, to always run additional
Scala code after startup? E.g. I want to automatically execute

 import com.joom.whatever._

so that users don't have to do it all the time. I see that
BaseSparkScalaInterpreter.spark2CreateContext imports a few things, but the
code does not appear to support customizing this sequence.
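
Roughly, what I have in mind is something like the following sketch inside the
Scala interpreter startup path (the property name is made up, and the arguments
stand in for fields the real interpreter already has):

import java.util.Properties
import scala.tools.nsc.interpreter.{ILoop, IR}

// Hypothetical: run once, right after spark2CreateContext / the ILoop is ready.
def runInitCode(sparkILoop: ILoop, properties: Properties): Unit = {
  val initCode = properties.getProperty("zeppelin.spark.scala.initCode", "")  // assumed name
  if (initCode.nonEmpty) {
    val result = sparkILoop.intp.interpret(initCode)
    if (result != IR.Success)
      System.err.println(s"Spark interpreter init code failed: $result")
  }
}

With something like that in place, the common imports could live once in the
interpreter setting instead of being repeated at the top of every note.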

-- 
Vladimir Prus
http://vladimirprus.com


Re: Zeppelin 0.9 / Kubernetes / Spark interpreter

2021-05-03 Thread Jeff Zhang
It is fixed here https://github.com/apache/zeppelin/pull/4105



Sylvain Gibier  于2021年5月1日周六 下午2:37写道:

> Hi,
>
> Cf. ZEPPELIN-5337.
>
> Switching to isolated mode is not really an option - as it means one spark
> interpreter per note. per user -- which consumes a lot of resources, as
> there is no mechanism to clean k8s pods created afterwards. The scope mode
> allows us to share the spark interpreter along with our 100+ analysts.
>
>
> On Fri, Apr 30, 2021 at 5:05 PM moon soo Lee  wrote:
>
>> Hi,
>>
>> Thanks for sharing the issue.
>>
>> I tried zeppelin 0.9+ on k8s with per note scoped, scala 2.12, spark 3.0+.
>> And I could reproduce the problem. But isolated mode works without
>> problem.
>> Does isolated mode work for your use case?
>>
>> Best,
>> moon
>>
>>
>>
>> On Tue, Apr 27, 2021 at 12:39 PM Sylvain Gibier 
>> wrote:
>>
>>> Any idea?
>>>
>>> Actually anyone using zeppelin 0.9+ on k8s, with spark interpreter scope
>>> per note ?
>>>
>>>
>>> On 2021/04/24 10:46:06, Sylvain Gibier  wrote:
>>> > Hi,
>>> >
>>> > we have an issue with our current deployment of zeppelin on k8s, and
>>> more
>>> > precisely with spark interpreter.
>>> >
>>> > For reference - the spark context is: scala 2.12.10 / spark 2.4.7
>>> >
>>> > We have a weird behaviour, running the spark interpreter in per note,
>>> scoped
>>> >
>>> > To reproduce currently - we restart the spark interpreter in scoped per
>>> > note, and create two notebooks (A & B) with the same following code:
>>> >
>>> > %spark
>>> > > import spark.implicits._
>>> > >
>>> > > List(1, 2, 3).toDS.map(_ + 1).show
>>> > >
>>> >
>>> > 1- we run notebook A successfully
>>> > 2 - we run notebook B  - it fails with class cast exception
>>> >
>>> > org.apache.spark.SparkException: Job aborted due to stage failure:
>>> Task 0
>>> > > in stage 24.0 failed 4 times, most recent failure: Lost task 0.3 in
>>> stage
>>> > > 24.0 (TID 161, 10.11.18.133, executor 2):
>>> java.lang.ClassCastException:
>>> > > cannot assign instance of java.lang.invoke.SerializedLambda to field
>>> > > org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in
>>> instance
>>> > > of org.apache.spark.rdd.MapPartitionsRDD at
>>> > >
>>> java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2287)
>>> > > at
>>> java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1417)
>>> > > at
>>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2293)
>>> > >
>>> >
>>> > Anyone having a working zeppelin deployment with k8s / spark 2.4 -
>>> scala
>>> > 2.12 ?
>>> >
>>> > or let anyone interested to make some $$$ to help us fix the issue?
>>> >
>>> > cheers
>>> >
>>>
>>

-- 
Best Regards

Jeff Zhang


Re: Zeppelin 0.9 / Kubernetes / Spark interpreter

2021-05-01 Thread Sylvain Gibier
Hi,

Cf. ZEPPELIN-5337.

Switching to isolated mode is not really an option, as it means one spark
interpreter per note, per user, which consumes a lot of resources, as
there is no mechanism to clean up the k8s pods created afterwards. The scoped mode
allows us to share the spark interpreter among our 100+ analysts.


On Fri, Apr 30, 2021 at 5:05 PM moon soo Lee  wrote:

> Hi,
>
> Thanks for sharing the issue.
>
> I tried zeppelin 0.9+ on k8s with per note scoped, scala 2.12, spark 3.0+.
> And I could reproduce the problem. But isolated mode works without problem.
> Does isolated mode work for your use case?
>
> Best,
> moon
>
>
>
> On Tue, Apr 27, 2021 at 12:39 PM Sylvain Gibier 
> wrote:
>
>> Any idea?
>>
>> Actually anyone using zeppelin 0.9+ on k8s, with spark interpreter scope
>> per note ?
>>
>>
>> On 2021/04/24 10:46:06, Sylvain Gibier  wrote:
>> > Hi,
>> >
>> > we have an issue with our current deployment of zeppelin on k8s, and
>> more
>> > precisely with spark interpreter.
>> >
>> > For reference - the spark context is: scala 2.12.10 / spark 2.4.7
>> >
>> > We have a weird behaviour, running the spark interpreter in per note,
>> scoped
>> >
>> > To reproduce currently - we restart the spark interpreter in scoped per
>> > note, and create two notebooks (A & B) with the same following code:
>> >
>> > %spark
>> > > import spark.implicits._
>> > >
>> > > List(1, 2, 3).toDS.map(_ + 1).show
>> > >
>> >
>> > 1- we run notebook A successfully
>> > 2 - we run notebook B  - it fails with class cast exception
>> >
>> > org.apache.spark.SparkException: Job aborted due to stage failure: Task
>> 0
>> > > in stage 24.0 failed 4 times, most recent failure: Lost task 0.3 in
>> stage
>> > > 24.0 (TID 161, 10.11.18.133, executor 2):
>> java.lang.ClassCastException:
>> > > cannot assign instance of java.lang.invoke.SerializedLambda to field
>> > > org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in
>> instance
>> > > of org.apache.spark.rdd.MapPartitionsRDD at
>> > >
>> java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2287)
>> > > at
>> java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1417)
>> > > at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2293)
>> > >
>> >
>> > Anyone having a working zeppelin deployment with k8s / spark 2.4 - scala
>> > 2.12 ?
>> >
>> > or let anyone interested to make some $$$ to help us fix the issue?
>> >
>> > cheers
>> >
>>
>


Re: Zeppelin 0.9 / Kubernetes / Spark interpreter

2021-04-30 Thread moon soo Lee
Hi,

Thanks for sharing the issue.

I tried zeppelin 0.9+ on k8s with per-note scoped mode, scala 2.12, spark 3.0+,
and I could reproduce the problem. But isolated mode works without problems.
Does isolated mode work for your use case?

Best,
moon



On Tue, Apr 27, 2021 at 12:39 PM Sylvain Gibier 
wrote:

> Any idea?
>
> Actually anyone using zeppelin 0.9+ on k8s, with spark interpreter scope
> per note ?
>
>
> On 2021/04/24 10:46:06, Sylvain Gibier  wrote:
> > Hi,
> >
> > we have an issue with our current deployment of zeppelin on k8s, and more
> > precisely with spark interpreter.
> >
> > For reference - the spark context is: scala 2.12.10 / spark 2.4.7
> >
> > We have a weird behaviour, running the spark interpreter in per note,
> scoped
> >
> > To reproduce currently - we restart the spark interpreter in scoped per
> > note, and create two notebooks (A & B) with the same following code:
> >
> > %spark
> > > import spark.implicits._
> > >
> > > List(1, 2, 3).toDS.map(_ + 1).show
> > >
> >
> > 1- we run notebook A successfully
> > 2 - we run notebook B  - it fails with class cast exception
> >
> > org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
> > > in stage 24.0 failed 4 times, most recent failure: Lost task 0.3 in
> stage
> > > 24.0 (TID 161, 10.11.18.133, executor 2): java.lang.ClassCastException:
> > > cannot assign instance of java.lang.invoke.SerializedLambda to field
> > > org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in
> instance
> > > of org.apache.spark.rdd.MapPartitionsRDD at
> > >
> java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2287)
> > > at
> java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1417)
> > > at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2293)
> > >
> >
> > Anyone having a working zeppelin deployment with k8s / spark 2.4 - scala
> > 2.12 ?
> >
> > or let anyone interested to make some $$$ to help us fix the issue?
> >
> > cheers
> >
>


Re: Zeppelin 0.9 / Kubernetes / Spark interpreter

2021-04-27 Thread Sylvain Gibier
Any idea?

Is anyone actually using zeppelin 0.9+ on k8s with the spark interpreter scoped
per note?


On 2021/04/24 10:46:06, Sylvain Gibier  wrote: 
> Hi,
> 
> we have an issue with our current deployment of zeppelin on k8s, and more
> precisely with spark interpreter.
> 
> For reference - the spark context is: scala 2.12.10 / spark 2.4.7
> 
> We have a weird behaviour, running the spark interpreter in per note, scoped
> 
> To reproduce currently - we restart the spark interpreter in scoped per
> note, and create two notebooks (A & B) with the same following code:
> 
> %spark
> > import spark.implicits._
> >
> > List(1, 2, 3).toDS.map(_ + 1).show
> >
> 
> 1- we run notebook A successfully
> 2 - we run notebook B  - it fails with class cast exception
> 
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
> > in stage 24.0 failed 4 times, most recent failure: Lost task 0.3 in stage
> > 24.0 (TID 161, 10.11.18.133, executor 2): java.lang.ClassCastException:
> > cannot assign instance of java.lang.invoke.SerializedLambda to field
> > org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
> > of org.apache.spark.rdd.MapPartitionsRDD at
> > java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2287)
> > at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1417)
> > at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2293)
> >
> 
> Anyone having a working zeppelin deployment with k8s / spark 2.4 - scala
> 2.12 ?
> 
> or let anyone interested to make some $$$ to help us fix the issue?
> 
> cheers
> 


Zeppelin 0.9 / Kubernetes / Spark interpreter

2021-04-24 Thread Sylvain Gibier
Hi,

we have an issue with our current deployment of zeppelin on k8s, and more
precisely with spark interpreter.

For reference - the spark context is: scala 2.12.10 / spark 2.4.7

We have a weird behaviour when running the spark interpreter in per-note, scoped mode.

To reproduce it, we restart the spark interpreter in scoped per-note mode and
create two notebooks (A & B), each with the following code:

%spark
> import spark.implicits._
>
> List(1, 2, 3).toDS.map(_ + 1).show
>

1 - we run notebook A successfully
2 - we run notebook B - it fails with a class cast exception

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
> in stage 24.0 failed 4 times, most recent failure: Lost task 0.3 in stage
> 24.0 (TID 161, 10.11.18.133, executor 2): java.lang.ClassCastException:
> cannot assign instance of java.lang.invoke.SerializedLambda to field
> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
> of org.apache.spark.rdd.MapPartitionsRDD at
> java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2287)
> at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1417)
> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2293)
>

Does anyone have a working zeppelin deployment with k8s / spark 2.4 - scala
2.12?

Or is anyone interested in making some $$$ to help us fix the issue?

cheers


Re: Spark interpreter Repl injection

2021-03-09 Thread Carlos Diogo
Thanks
I created the issue
Regards
Carlos

On Tue 9. Mar 2021 at 19:02, moon soo Lee  wrote:

> Pyspark interpreter have 'intp' variable exposed in its repl environment
> (for internal use). And we can resolve reference to Spark interpreter from
> the 'intp' variable. However, scala repl environment in Spark Interpreter
> doesn't expose any variables that is useful for finding Spark Interpreter
> itself. So had to find a way from pyspark interpreter.
>
> z.interpret() doesn't look like it can bring some problem, in my opinion.
>
> Thanks,
> moon
>
>
>
>
> On Tue, Mar 9, 2021 at 8:54 AM Carlos Diogo  wrote:
>
>> Looks good Moon
>> Is there a specific reason why you needed the pyspark interpreter  to
>> access the spark interpreter? Could not the spark interpreter
>> programmatically access itself (and the same for the pyspark interpreter)
>>
>> Would the issue be to expose the z.interpret() method?
>>
>> Best regards
>> Carlos
>>
>> On Tue, Mar 9, 2021 at 5:10 PM moon soo Lee  wrote:
>>
>>> I see. If you want to specify a file, precode might not the best option.
>>> I found a hacky way to do it. Accessing SparkInterpreter instance object
>>> from PysparkInterpreter.
>>>
>>> %pyspark
>>> sparkIntpField = intp.getClass().getDeclaredField("sparkInterpreter")
>>> sparkIntpField.setAccessible(True)
>>> sparkIntp = sparkIntpField.get(intp)
>>> # run my scala code
>>> sparkIntp.interpret("val a=10", z.getInterpreterContext())
>>>
>>>
>>> See attached screenshot.
>>>
>>> [image: image.png]
>>>
>>> This is accessing internal variables outside the official API. So it may
>>> break at any time.
>>>
>>> I think it's better to expose interpret() method through
>>> 'ZeppelinContext'. So inside Note,
>>>
>>> z.interpret(any_string)
>>>
>>> can work without accessing this method in a hacky way.
>>> Please feel free to file an issue.
>>>
>>> Thanks,
>>> moon
>>>
>>>
>>>
>>>
>>> On Mon, Mar 8, 2021 at 10:23 PM Carlos Diogo  wrote:
>>>
>>>> Are you able to specify a file on the precode?
>>>> For now my work around is from within the note and with the rest api ,
>>>> to add a paragraph with the code I want to inject ( which can come from a
>>>> file )
>>>> It works ok , but with run all or schedule the code gets updated in the
>>>> note , but the old Code still executes . Only on the next run it will take
>>>> effect
>>>>
>>>> On Mon 8. Mar 2021 at 22:48, moon soo Lee  wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> How about precode
>>>>> <http://zeppelin.apache.org/docs/0.9.0/usage/interpreter/overview.html#precode>?
>>>>>  "zeppelin.SparkInterpreter.precode"
>>>>> can run scala code.
>>>>>
>>>>> Thanks,
>>>>> moon
>>>>>
>>>>>
>>>>> On Sat, Mar 6, 2021 at 4:51 AM Carlos Diogo  wrote:
>>>>>
>>>>>> That does not work if you want to have Scala code in a file ( common
>>>>>> functions) which you want to invoke in the note
>>>>>> The alternative is to compile the code and then add the jar which
>>>>>> would be normal for an application.
>>>>>> But zeppelin is about scripting so this is a request I get very often
>>>>>> from the users.
>>>>>> Specially because the z.run does not work properly most of the times
>>>>>> Carlos
>>>>>>
>>>>>> On Sat 6. Mar 2021 at 11:36, Jeff Zhang  wrote:
>>>>>>
>>>>>>> Why not copying scala code in zeppelin and run the notebook directly
>>>>>>> ?
>>>>>>>
>>>>>>> Carlos Diogo  于2021年3月6日周六 下午3:51写道:
>>>>>>>
>>>>>>>> Dear all
>>>>>>>> I have been  trying  to find a was to inject scala Code ( from
>>>>>>>> String) into the spark interpreter
>>>>>>>> In pyspark is easy with the exec function
>>>>>>>> It should not be very difficult  to access from the Note scala repl
>>>>>>>> interpreter but i could not find a way . I was even able to create a 
>>>>>>>> new
>>>>>>>> repl session but then I could not bind the objects
>>>>>>>> Any tips ?
>>>>>>>> Thanks
>>>>>>>> --
>>>>>>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>>>>>>> Carlos Diogo
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>>
>>>>>>> Jeff Zhang
>>>>>>>
>>>>>> --
>>>>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>>>>> Carlos Diogo
>>>>>>
>>>>> --
>>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>>> Carlos Diogo
>>>>
>>>
>>
>> --
>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>> Carlos Diogo
>>
> --
Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
Carlos Diogo


Re: Spark interpreter Repl injection

2021-03-09 Thread moon soo Lee
The PySpark interpreter has an 'intp' variable exposed in its repl environment
(for internal use), and we can resolve a reference to the Spark interpreter from
that 'intp' variable. However, the scala repl environment in the Spark interpreter
doesn't expose any variable that is useful for finding the Spark interpreter
itself, so I had to find a way from the pyspark interpreter.

z.interpret() doesn't look like it would cause any problems, in my opinion.

Thanks,
moon




On Tue, Mar 9, 2021 at 8:54 AM Carlos Diogo  wrote:

> Looks good Moon
> Is there a specific reason why you needed the pyspark interpreter  to
> access the spark interpreter? Could not the spark interpreter
> programmatically access itself (and the same for the pyspark interpreter)
>
> Would the issue be to expose the z.interpret() method?
>
> Best regards
> Carlos
>
> On Tue, Mar 9, 2021 at 5:10 PM moon soo Lee  wrote:
>
>> I see. If you want to specify a file, precode might not the best option.
>> I found a hacky way to do it. Accessing SparkInterpreter instance object
>> from PysparkInterpreter.
>>
>> %pyspark
>> sparkIntpField = intp.getClass().getDeclaredField("sparkInterpreter")
>> sparkIntpField.setAccessible(True)
>> sparkIntp = sparkIntpField.get(intp)
>> # run my scala code
>> sparkIntp.interpret("val a=10", z.getInterpreterContext())
>>
>>
>> See attached screenshot.
>>
>> [image: image.png]
>>
>> This is accessing internal variables outside the official API. So it may
>> break at any time.
>>
>> I think it's better to expose interpret() method through
>> 'ZeppelinContext'. So inside Note,
>>
>> z.interpret(any_string)
>>
>> can work without accessing this method in a hacky way.
>> Please feel free to file an issue.
>>
>> Thanks,
>> moon
>>
>>
>>
>>
>> On Mon, Mar 8, 2021 at 10:23 PM Carlos Diogo  wrote:
>>
>>> Are you able to specify a file on the precode?
>>> For now my work around is from within the note and with the rest api ,
>>> to add a paragraph with the code I want to inject ( which can come from a
>>> file )
>>> It works ok , but with run all or schedule the code gets updated in the
>>> note , but the old Code still executes . Only on the next run it will take
>>> effect
>>>
>>> On Mon 8. Mar 2021 at 22:48, moon soo Lee  wrote:
>>>
>>>> Hi,
>>>>
>>>> How about precode
>>>> <http://zeppelin.apache.org/docs/0.9.0/usage/interpreter/overview.html#precode>?
>>>>  "zeppelin.SparkInterpreter.precode"
>>>> can run scala code.
>>>>
>>>> Thanks,
>>>> moon
>>>>
>>>>
>>>> On Sat, Mar 6, 2021 at 4:51 AM Carlos Diogo  wrote:
>>>>
>>>>> That does not work if you want to have Scala code in a file ( common
>>>>> functions) which you want to invoke in the note
>>>>> The alternative is to compile the code and then add the jar which
>>>>> would be normal for an application.
>>>>> But zeppelin is about scripting so this is a request I get very often
>>>>> from the users.
>>>>> Specially because the z.run does not work properly most of the times
>>>>> Carlos
>>>>>
>>>>> On Sat 6. Mar 2021 at 11:36, Jeff Zhang  wrote:
>>>>>
>>>>>> Why not copying scala code in zeppelin and run the notebook directly ?
>>>>>>
>>>>>> Carlos Diogo  于2021年3月6日周六 下午3:51写道:
>>>>>>
>>>>>>> Dear all
>>>>>>> I have been  trying  to find a was to inject scala Code ( from
>>>>>>> String) into the spark interpreter
>>>>>>> In pyspark is easy with the exec function
>>>>>>> It should not be very difficult  to access from the Note scala repl
>>>>>>> interpreter but i could not find a way . I was even able to create a new
>>>>>>> repl session but then I could not bind the objects
>>>>>>> Any tips ?
>>>>>>> Thanks
>>>>>>> --
>>>>>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>>>>>> Carlos Diogo
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards
>>>>>>
>>>>>> Jeff Zhang
>>>>>>
>>>>> --
>>>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>>>> Carlos Diogo
>>>>>
>>>> --
>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>> Carlos Diogo
>>>
>>
>
> --
> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
> Carlos Diogo
>


Re: Spark interpreter Repl injection

2021-03-09 Thread Carlos Diogo
Looks good, Moon.
Is there a specific reason why you needed the pyspark interpreter to
access the spark interpreter? Could the spark interpreter not
programmatically access itself (and the same for the pyspark interpreter)?

Would the issue be to expose the z.interpret() method?

Best regards
Carlos

On Tue, Mar 9, 2021 at 5:10 PM moon soo Lee  wrote:

> I see. If you want to specify a file, precode might not the best option.
> I found a hacky way to do it. Accessing SparkInterpreter instance object
> from PysparkInterpreter.
>
> %pyspark
> sparkIntpField = intp.getClass().getDeclaredField("sparkInterpreter")
> sparkIntpField.setAccessible(True)
> sparkIntp = sparkIntpField.get(intp)
> # run my scala code
> sparkIntp.interpret("val a=10", z.getInterpreterContext())
>
>
> See attached screenshot.
>
> [image: image.png]
>
> This is accessing internal variables outside the official API. So it may
> break at any time.
>
> I think it's better to expose interpret() method through
> 'ZeppelinContext'. So inside Note,
>
> z.interpret(any_string)
>
> can work without accessing this method in a hacky way.
> Please feel free to file an issue.
>
> Thanks,
> moon
>
>
>
>
> On Mon, Mar 8, 2021 at 10:23 PM Carlos Diogo  wrote:
>
>> Are you able to specify a file on the precode?
>> For now my work around is from within the note and with the rest api , to
>> add a paragraph with the code I want to inject ( which can come from a file
>> )
>> It works ok , but with run all or schedule the code gets updated in the
>> note , but the old Code still executes . Only on the next run it will take
>> effect
>>
>> On Mon 8. Mar 2021 at 22:48, moon soo Lee  wrote:
>>
>>> Hi,
>>>
>>> How about precode
>>> <http://zeppelin.apache.org/docs/0.9.0/usage/interpreter/overview.html#precode>?
>>>  "zeppelin.SparkInterpreter.precode"
>>> can run scala code.
>>>
>>> Thanks,
>>> moon
>>>
>>>
>>> On Sat, Mar 6, 2021 at 4:51 AM Carlos Diogo  wrote:
>>>
>>>> That does not work if you want to have Scala code in a file ( common
>>>> functions) which you want to invoke in the note
>>>> The alternative is to compile the code and then add the jar which would
>>>> be normal for an application.
>>>> But zeppelin is about scripting so this is a request I get very often
>>>> from the users.
>>>> Specially because the z.run does not work properly most of the times
>>>> Carlos
>>>>
>>>> On Sat 6. Mar 2021 at 11:36, Jeff Zhang  wrote:
>>>>
>>>>> Why not copying scala code in zeppelin and run the notebook directly ?
>>>>>
>>>>> Carlos Diogo  于2021年3月6日周六 下午3:51写道:
>>>>>
>>>>>> Dear all
>>>>>> I have been  trying  to find a was to inject scala Code ( from
>>>>>> String) into the spark interpreter
>>>>>> In pyspark is easy with the exec function
>>>>>> It should not be very difficult  to access from the Note scala repl
>>>>>> interpreter but i could not find a way . I was even able to create a new
>>>>>> repl session but then I could not bind the objects
>>>>>> Any tips ?
>>>>>> Thanks
>>>>>> --
>>>>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>>>>> Carlos Diogo
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards
>>>>>
>>>>> Jeff Zhang
>>>>>
>>>> --
>>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>>> Carlos Diogo
>>>>
>>> --
>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>> Carlos Diogo
>>
>

-- 
Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
Carlos Diogo


Re: Spark interpreter Repl injection

2021-03-09 Thread moon soo Lee
I see. If you want to specify a file, precode might not be the best option.
I found a hacky way to do it: accessing the SparkInterpreter instance object
from the PySparkInterpreter.

%pyspark
sparkIntpField = intp.getClass().getDeclaredField("sparkInterpreter")
sparkIntpField.setAccessible(True)
sparkIntp = sparkIntpField.get(intp)
# run my scala code
sparkIntp.interpret("val a=10", z.getInterpreterContext())


See attached screenshot.

[image: image.png]

This is accessing internal variables outside the official API. So it may
break at any time.

I think it's better to expose an interpret() method through 'ZeppelinContext',
so that inside a note,

z.interpret(any_string)

can work without accessing this method in a hacky way.
Please feel free to file an issue.
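As a rough sketch of how that could serve the file-based use case in this
thread (z.interpret is only the proposed API here, not something that exists
today, and the file path is a placeholder):

%spark
val commonCode = scala.io.Source.fromFile("/path/to/common-functions.scala").mkString
z.interpret(commonCode)  // hypothetical method, pending the outcome of the issue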

Thanks,
moon




On Mon, Mar 8, 2021 at 10:23 PM Carlos Diogo  wrote:

> Are you able to specify a file on the precode?
> For now my work around is from within the note and with the rest api , to
> add a paragraph with the code I want to inject ( which can come from a file
> )
> It works ok , but with run all or schedule the code gets updated in the
> note , but the old Code still executes . Only on the next run it will take
> effect
>
> On Mon 8. Mar 2021 at 22:48, moon soo Lee  wrote:
>
>> Hi,
>>
>> How about precode
>> <http://zeppelin.apache.org/docs/0.9.0/usage/interpreter/overview.html#precode>?
>>  "zeppelin.SparkInterpreter.precode"
>> can run scala code.
>>
>> Thanks,
>> moon
>>
>>
>> On Sat, Mar 6, 2021 at 4:51 AM Carlos Diogo  wrote:
>>
>>> That does not work if you want to have Scala code in a file ( common
>>> functions) which you want to invoke in the note
>>> The alternative is to compile the code and then add the jar which would
>>> be normal for an application.
>>> But zeppelin is about scripting so this is a request I get very often
>>> from the users.
>>> Specially because the z.run does not work properly most of the times
>>> Carlos
>>>
>>> On Sat 6. Mar 2021 at 11:36, Jeff Zhang  wrote:
>>>
>>>> Why not copying scala code in zeppelin and run the notebook directly ?
>>>>
>>>> Carlos Diogo  于2021年3月6日周六 下午3:51写道:
>>>>
>>>>> Dear all
>>>>> I have been  trying  to find a was to inject scala Code ( from String)
>>>>> into the spark interpreter
>>>>> In pyspark is easy with the exec function
>>>>> It should not be very difficult  to access from the Note scala repl
>>>>> interpreter but i could not find a way . I was even able to create a new
>>>>> repl session but then I could not bind the objects
>>>>> Any tips ?
>>>>> Thanks
>>>>> --
>>>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>>>> Carlos Diogo
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards
>>>>
>>>> Jeff Zhang
>>>>
>>> --
>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>> Carlos Diogo
>>>
>> --
> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
> Carlos Diogo
>


Re: Spark interpreter Repl injection

2021-03-08 Thread Carlos Diogo
Are you able to specify a file in the precode?
For now my workaround is, from within the note and via the REST API, to
add a paragraph with the code I want to inject (which can come from a
file); a rough sketch of the call is below.
It works OK, but with "run all" or a schedule the code gets updated in the
note while the old code still executes; only on the next run does it take
effect.
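Roughly like this (host, note id and snippet are placeholders, and the exact
payload shape may differ between Zeppelin versions):

curl -X POST http://<zeppelin-host>:8080/api/notebook/<noteId>/paragraph \
  -H 'Content-Type: application/json' \
  -d '{"title": "injected", "text": "%spark\nval a = 10"}'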

On Mon 8. Mar 2021 at 22:48, moon soo Lee  wrote:

> Hi,
>
> How about precode
> <http://zeppelin.apache.org/docs/0.9.0/usage/interpreter/overview.html#precode>?
>  "zeppelin.SparkInterpreter.precode"
> can run scala code.
>
> Thanks,
> moon
>
>
> On Sat, Mar 6, 2021 at 4:51 AM Carlos Diogo  wrote:
>
>> That does not work if you want to have Scala code in a file ( common
>> functions) which you want to invoke in the note
>> The alternative is to compile the code and then add the jar which would
>> be normal for an application.
>> But zeppelin is about scripting so this is a request I get very often
>> from the users.
>> Specially because the z.run does not work properly most of the times
>> Carlos
>>
>> On Sat 6. Mar 2021 at 11:36, Jeff Zhang  wrote:
>>
>>> Why not copying scala code in zeppelin and run the notebook directly ?
>>>
>>> Carlos Diogo  于2021年3月6日周六 下午3:51写道:
>>>
>>>> Dear all
>>>> I have been  trying  to find a was to inject scala Code ( from String)
>>>> into the spark interpreter
>>>> In pyspark is easy with the exec function
>>>> It should not be very difficult  to access from the Note scala repl
>>>> interpreter but i could not find a way . I was even able to create a new
>>>> repl session but then I could not bind the objects
>>>> Any tips ?
>>>> Thanks
>>>> --
>>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>>> Carlos Diogo
>>>>
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>> --
>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>> Carlos Diogo
>>
> --
Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
Carlos Diogo


Re: Spark interpreter Repl injection

2021-03-08 Thread moon soo Lee
Hi,

How about precode
<http://zeppelin.apache.org/docs/0.9.0/usage/interpreter/overview.html#precode>?
The "zeppelin.SparkInterpreter.precode" property can run scala code.
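For illustration, a minimal sketch of what the property could hold (the
import and helper below are made up, not taken from the docs):

zeppelin.SparkInterpreter.precode = import java.time.LocalDate; def today() = LocalDate.now()

Every %spark paragraph in the note can then call today() without redefining it.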

Thanks,
moon


On Sat, Mar 6, 2021 at 4:51 AM Carlos Diogo  wrote:

> That does not work if you want to have Scala code in a file ( common
> functions) which you want to invoke in the note
> The alternative is to compile the code and then add the jar which would be
> normal for an application.
> But zeppelin is about scripting so this is a request I get very often from
> the users.
> Specially because the z.run does not work properly most of the times
> Carlos
>
> On Sat 6. Mar 2021 at 11:36, Jeff Zhang  wrote:
>
>> Why not copying scala code in zeppelin and run the notebook directly ?
>>
>> Carlos Diogo  于2021年3月6日周六 下午3:51写道:
>>
>>> Dear all
>>> I have been  trying  to find a was to inject scala Code ( from String)
>>> into the spark interpreter
>>> In pyspark is easy with the exec function
>>> It should not be very difficult  to access from the Note scala repl
>>> interpreter but i could not find a way . I was even able to create a new
>>> repl session but then I could not bind the objects
>>> Any tips ?
>>> Thanks
>>> --
>>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>>> Carlos Diogo
>>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
> --
> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
> Carlos Diogo
>


Re: Spark interpreter Repl injection

2021-03-06 Thread Carlos Diogo
That does not work if you want to have Scala code in a file (common
functions) which you want to invoke in the note.
The alternative is to compile the code and then add the jar, which would be
normal for an application.
But Zeppelin is about scripting, so this is a request I get very often from
the users, especially because z.run does not work properly most of the time.
Carlos

On Sat 6. Mar 2021 at 11:36, Jeff Zhang  wrote:

> Why not copying scala code in zeppelin and run the notebook directly ?
>
> Carlos Diogo  于2021年3月6日周六 下午3:51写道:
>
>> Dear all
>> I have been  trying  to find a was to inject scala Code ( from String)
>> into the spark interpreter
>> In pyspark is easy with the exec function
>> It should not be very difficult  to access from the Note scala repl
>> interpreter but i could not find a way . I was even able to create a new
>> repl session but then I could not bind the objects
>> Any tips ?
>> Thanks
>> --
>> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
>> Carlos Diogo
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
-- 
Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
Carlos Diogo


Re: Spark interpreter Repl injection

2021-03-06 Thread Jeff Zhang
Why not copy the scala code into Zeppelin and run the notebook directly?

On Sat, Mar 6, 2021 at 3:51 PM Carlos Diogo  wrote:

> Dear all
> I have been  trying  to find a was to inject scala Code ( from String)
> into the spark interpreter
> In pyspark is easy with the exec function
> It should not be very difficult  to access from the Note scala repl
> interpreter but i could not find a way . I was even able to create a new
> repl session but then I could not bind the objects
> Any tips ?
> Thanks
> --
> Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
> Carlos Diogo
>


-- 
Best Regards

Jeff Zhang


Spark interpreter Repl injection

2021-03-05 Thread Carlos Diogo
Dear all
I have been trying to find a way to inject Scala code (from a String) into
the Spark interpreter.
In pyspark this is easy with the exec function.
It should not be very difficult to access the note's Scala repl
interpreter, but I could not find a way. I was even able to create a new
repl session, but then I could not bind the objects.
Any tips?
Thanks
-- 
Os meus cumprimentos / Best regards /  Mit freundlichen Grüße
Carlos Diogo


spark.jars.packages not working in spark interpreter tutorial

2020-07-02 Thread David Boyd

All:

   Trying to run the Spark Interpreter tutorial note.

The spark.conf paragraph which specifies spark.jars.packages runs clean,
but the next paragraph, which tries to use the avro jar, fails with a
class not found for


org.apache.spark.sql.avro.AvroFileFormat.DefaultSource

Spark is set to run Per Note in scoped process.

There are no errors in the 
zeppelin-interpreter-spark-shared_process-zeppelin-dspcnode11.dspc.incadencecorp.com.log


Any thoughts would be appreciated.

Note: the Spark Basic Features tutorial works fine.
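For reference, the pattern in question is roughly the following (my own
sketch, not the exact tutorial paragraphs; the spark-avro coordinates are an
assumption for Spark 2.4 / Scala 2.11):

%spark.conf
spark.jars.packages org.apache.spark:spark-avro_2.11:2.4.3

%spark
val df = spark.read.format("avro").load("/path/to/episodes.avro")
df.show()

The tutorial paragraph uses the full org.apache.spark.sql.avro.AvroFileFormat
class name as the format string (as the error above shows); either form
should resolve once the package actually ends up on the interpreter's
classpath.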


The base zeppelin log has this error:

 INFO [2020-07-02 15:57:58,339] ({qtp923219673-142} 
VFSNotebookRepo.java[save]:145) - Saving note 2F8KN6TKK to Spark 
Tutorial/1. Spark Interpreter Introduction_2F8KN6TKK.zpln
 INFO [2020-07-02 15:57:58,343] ({SchedulerFactory3} 
AbstractScheduler.java[runJob]:125) - Job 20180530-222838_1995256600 
started by scheduler RemoteInterpreter-spark-shared_process-2F8KN6TKK
 INFO [2020-07-02 15:57:58,343] ({SchedulerFactory3} 
Paragraph.java[jobRun]:388) - Run paragraph [paragraph_id: 
20180530-222838_1995256600, interpreter: 
org.apache.zeppelin.spark.SparkInterpreter, note_id: 2F8KN6TKK, user: 
dspc_demo]
 INFO [2020-07-02 15:57:58,444] 
({JobStatusPoller-20180530-222838_1995256600} 
NotebookServer.java[onStatusChange]:1927) - Job 
20180530-222838_1995256600 starts to RUNNING
 INFO [2020-07-02 15:57:58,445] 
({JobStatusPoller-20180530-222838_1995256600} 
VFSNotebookRepo.java[save]:145) - Saving note 2F8KN6TKK to Spark 
Tutorial/1. Spark Interpreter Introduction_2F8KN6TKK.zpln
 WARN [2020-07-02 15:57:58,734] ({SchedulerFactory3} 
NotebookServer.java[onStatusChange]:1924) - Job 
20180530-222838_1995256600 is finished, status: ERROR, exception: 
null, result: %text java.lang.ClassNotFoundException: Failed to find 
data source: org.apache.spark.sql.avro.AvroFileFormat. Please find 
packages at http://spark.apache.org/third-party-projects.html
  at 
org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:657)

  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:194)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
  ... 45 elided
Caused by: java.lang.ClassNotFoundException: 
org.apache.spark.sql.avro.AvroFileFormat.DefaultSource
  at 
scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)

  at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
  at 
org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:634)
  at 
org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:634)

  at scala.util.Try$.apply(Try.scala:192)
  at 
org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:634)
  at 
org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:634)

  at scala.util.Try.orElse(Try.scala:84)
  at 
org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:634)

  ... 47 more

 INFO [2020-07-02 15:57:58,735] ({SchedulerFactory3} 
VFSNotebookRepo.java[save]:145) - Saving note 2F8KN6TKK to Spark 
Tutorial/1. Spark Interpreter Introduction_2F8KN6TKK.zpln



--
= mailto:db...@incadencecorp.com 
David W. Boyd
VP,  Data Solutions
10432 Balls Ford, Suite 240
Manassas, VA 20109
office:   +1-703-552-2862
cell: +1-703-402-7908
== http://www.incadencecorp.com/ 
ISO/IEC JTC1 SC42/WG2, editor ISO/IEC 20546, ISO/IEC 20547-1
Chair INCITS TG Big Data
Co-chair NIST Big Data Public Working Group Reference Architecture
First Robotic Mentor - FRC, FTC - www.iliterobotics.org
Board Member- USSTEM Foundation - www.usstem.org

The information contained in this message may be privileged
and/or confidential and protected from disclosure.
If the reader of this message is not the intended recipient
or an employee or agent responsible for delivering this message
to the intended recipient, you are hereby notified that any
dissemination, distribution or copying of this communication
is strictly prohibited.  If you have received this communication
in error, please notify the sender immediately by replying to
this message and deleting the material from any computer.



Re: Error starting spark interpreter with 0.9.0

2020-06-30 Thread Jeff Zhang
Which spark version do you use? And could you check the spark interpreter
log file? It is in ZEPPELIN_HOME/logs/zeppelin-interpreter-spark-*.log

David Boyd  于2020年6月30日周二 下午11:11写道:

> All:
>
> Just trying to get 0.9.0 to work and running into all sorts of issues.
> Previously I had set SPARK_MASTER to be yarn-client   so it would use my
> existing yarn cluster.
> That threw an error about yarn-client being deprecated in 2.0.
> So I switched it to local.
> I now get the error about the interpreter not starting and the following
> output in the note:
>
> > org.apache.zeppelin.interpreter.InterpreterException:
> > java.io.IOException: Fail to launch interpreter process: Interpreter
> > launch command: /opt/spark/spark-current/bin/spark-submit --class
> > org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
> > --driver-class-path
> >
> ":/opt/zeppelin/zeppelin-current/interpreter/spark/*::/opt/hadoop/hadoop-current/share/hadoop/common/sources/:/opt/hadoop/hadoop-current/share/hadoop/common/sources/:/opt/zeppelin/zeppelin-current/interpreter/zeppelin-interpreter-shaded-0.9.0-SNAPSHOT-shaded.jar
>
> >
> /opt/zeppelin/zeppelin-current/interpreter/zeppelin-interpreter-shaded-0.9.0-SNAPSHOT.jar:/opt/zeppelin/zeppelin-current/interpreter/spark/spark-interpreter-0.9.0-SNAPSHOT.jar:/opt/hadoop/hadoop-current/etc/hadoop"
>
> > --driver-java-options " -Dfile.encoding=UTF-8
> >
> -Dlog4j.configuration='file:///opt/zeppelin/zeppelin-current/conf/log4j.properties'
>
> >
> -Dlog4j.configurationFile='file:///opt/zeppelin/zeppelin-current/conf/log4j2.properties'
>
> >
> -Dzeppelin.log.file='/opt/zeppelin/zeppelin-current/logs/zeppelin-interpreter-spark-dspc_demo-zeppelin-dspcnode11.dspc.incadencecorp.com.log'"
>
> > --driver-memory 4G --executor-memory 6G --conf
> > spark\.serializer\=org\.apache\.spark\.serializer\.KryoSerializer
> > --conf spark\.executor\.memory\=1G --conf spark\.app\.name\=Zeppelin
> > --conf spark\.executor\.instances\=5 --conf spark\.master\=local\[\*\]
> > --conf spark\.sql\.crossJoin\.enabled\=true --conf
> > spark\.cores\.max\=10
> >
> /opt/zeppelin/zeppelin-current/interpreter/spark/spark-interpreter-0.9.0-SNAPSHOT.jar
>
> > 10.1.50.111 33591 "spark-dspc_demo" : SLF4J: Class path contains
> > multiple SLF4J bindings. SLF4J: Found binding in
> >
> [jar:file:/opt/zeppelin/zeppelin-0.9.0-SNAPSHOT/interpreter/spark/spark-interpreter-0.9.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J:
>
> > Found binding in
> >
> [jar:file:/opt/spark/spark-2.4.3.bdp-1-bin-hadoop2.7/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J:
>
> > See http://www.slf4j.org/codes.html#multiple_bindings for an
> > explanation. SLF4J: Actual binding is of type
> > [org.slf4j.impl.Log4jLoggerFactory] at
> >
> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:134)
>
> > at
> >
> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:281)
>
> > at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:412)
> > at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:72) at
> > org.apache.zeppelin.scheduler.Job.run(Job.java:172) at
> >
> org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:130)
>
> > at
> >
> org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:180)
>
> > at
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> > at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
> >
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>
> > at
> >
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>
> > at
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
> > at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> > at java.lang.Thread.run(Thread.java:748) Caused by:
> > java.io.IOException: Fail to launch interpreter process: Interpreter
> > launch command: /opt/spark/spark-current/bin/spark-submit --class
> > org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
> > --driver-class-path
> >
> ":/opt/zeppelin/zeppelin-current/interpreter/spark/*::/opt/hadoop/hadoop-current/share/hadoop/common/sources/:/opt/hadoop/hadoop-current/share/hadoop/common/sources/:/opt/zeppelin/zeppelin-current/interpreter/zeppelin-interpreter-shaded-0.9.0

Error starting spark interpreter with 0.9.0

2020-06-30 Thread David Boyd

All:

   Just trying to get 0.9.0 to work and running into all sorts of issues.
Previously I had set SPARK_MASTER to be yarn-client so it would use my
existing yarn cluster.
That threw an error about yarn-client being deprecated in 2.0.
So I switched it to local.
I now get the error about the interpreter not starting and the following 
output in the note:


org.apache.zeppelin.interpreter.InterpreterException: 
java.io.IOException: Fail to launch interpreter process: Interpreter 
launch command: /opt/spark/spark-current/bin/spark-submit --class 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 
--driver-class-path 
":/opt/zeppelin/zeppelin-current/interpreter/spark/*::/opt/hadoop/hadoop-current/share/hadoop/common/sources/:/opt/hadoop/hadoop-current/share/hadoop/common/sources/:/opt/zeppelin/zeppelin-current/interpreter/zeppelin-interpreter-shaded-0.9.0-SNAPSHOT-shaded.jar 
/opt/zeppelin/zeppelin-current/interpreter/zeppelin-interpreter-shaded-0.9.0-SNAPSHOT.jar:/opt/zeppelin/zeppelin-current/interpreter/spark/spark-interpreter-0.9.0-SNAPSHOT.jar:/opt/hadoop/hadoop-current/etc/hadoop" 
--driver-java-options " -Dfile.encoding=UTF-8 
-Dlog4j.configuration='file:///opt/zeppelin/zeppelin-current/conf/log4j.properties' 
-Dlog4j.configurationFile='file:///opt/zeppelin/zeppelin-current/conf/log4j2.properties' 
-Dzeppelin.log.file='/opt/zeppelin/zeppelin-current/logs/zeppelin-interpreter-spark-dspc_demo-zeppelin-dspcnode11.dspc.incadencecorp.com.log'" 
--driver-memory 4G --executor-memory 6G --conf 
spark\.serializer\=org\.apache\.spark\.serializer\.KryoSerializer 
--conf spark\.executor\.memory\=1G --conf spark\.app\.name\=Zeppelin 
--conf spark\.executor\.instances\=5 --conf spark\.master\=local\[\*\] 
--conf spark\.sql\.crossJoin\.enabled\=true --conf 
spark\.cores\.max\=10 
/opt/zeppelin/zeppelin-current/interpreter/spark/spark-interpreter-0.9.0-SNAPSHOT.jar 
10.1.50.111 33591 "spark-dspc_demo" : SLF4J: Class path contains 
multiple SLF4J bindings. SLF4J: Found binding in 
[jar:file:/opt/zeppelin/zeppelin-0.9.0-SNAPSHOT/interpreter/spark/spark-interpreter-0.9.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: 
Found binding in 
[jar:file:/opt/spark/spark-2.4.3.bdp-1-bin-hadoop2.7/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: 
See http://www.slf4j.org/codes.html#multiple_bindings for an 
explanation. SLF4J: Actual binding is of type 
[org.slf4j.impl.Log4jLoggerFactory] at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:134) 
at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:281) 
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:412) 
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:72) at 
org.apache.zeppelin.scheduler.Job.run(Job.java:172) at 
org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:130) 
at 
org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:180) 
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) 
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) 
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748) Caused by: 
java.io.IOException: Fail to launch interpreter process: Interpreter 
launch command: /opt/spark/spark-current/bin/spark-submit --class 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 
--driver-class-path 
":/opt/zeppelin/zeppelin-current/interpreter/spark/*::/opt/hadoop/hadoop-current/share/hadoop/common/sources/:/opt/hadoop/hadoop-current/share/hadoop/common/sources/:/opt/zeppelin/zeppelin-current/interpreter/zeppelin-interpreter-shaded-0.9.0-SNAPSHOT-shaded.jar 
/opt/zeppelin/zeppelin-current/interpreter/zeppelin-interpreter-shaded-0.9.0-SNAPSHOT.jar:/opt/zeppelin/zeppelin-current/interpreter/spark/spark-interpreter-0.9.0-SNAPSHOT.jar:/opt/hadoop/hadoop-current/etc/hadoop" 
--driver-java-options " -Dfile.encoding=UTF-8 
-Dlog4j.configuration='file:///opt/zeppelin/zeppelin-current/conf/log4j.properties' 
-Dlog4j.configurationFile='file:///opt/zeppelin/zeppelin-current/conf/log4j2.properties' 
-Dzeppelin.log.file='/opt/zeppelin/zeppelin-current/logs/zeppelin-interpreter-spark-dspc_demo-zeppelin-dspcnode11.dspc.incadencecorp.com.log'" 
--driver-memory 4G --executor-memory 6G --conf 
spark\.serializer\=org\.apache\.spark\.serializer\.KryoSerializer 
--conf spark\.executor\.memory\=1G --conf spark\.app\.name\=Zeppelin 
--conf spark\.executor\.instances\=5 --conf spark\

Re: Question: Adding Dependencies in with the Spark Interpreter with Kubernetes

2020-05-14 Thread Sebastian Albrecht
On Wed, May 13, 2020 at 9:59 PM Hetul Patel  wrote:

>
> Are dependency downloads supported with zeppelin and spark over
> kubernetes? Or am I required to add the dependency jars directly to my
> spark docker image and add them to the classpath?
>
>
Hi Hetu,
I don't use docker, but to connect to my Cassandra cluster from the Spark
cluster I have to set SPARK_SUBMIT_OPTIONS='--packages
com.datastax.spark:spark-cassandra-connector_2.11:2.4.3'
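In your setup that would presumably be combined with the ivy conf you already
set, e.g. (the combination is just my guess):

export SPARK_SUBMIT_OPTIONS='--packages com.datastax.spark:spark-cassandra-connector_2.11:2.4.3 --conf spark.jars.ivy=/tmp/.ivy'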

HTH,
Sebastian.


> Thanks,
> Hetu
>


Question: Adding Dependencies in with the Spark Interpreter with Kubernetes

2020-05-13 Thread Hetul Patel
Hi all,

I've been trying the 0.9.0-preview1 build on minikube with the spark
interpreter. It's working, but I'm unable to work with any dependencies
that I've added to the spark interpreter.

(Note: I had to add `SPARK_SUBMIT_OPTIONS=--conf spark.jars.ivy=/tmp/.ivy`
and `SPARK_USER=root` to the default interpreter options.)

I'm trying to connect spark to Cassandra, and I've added the following
dependency to the spark interpreter: `
com.datastax.cassandra:cassandra-driver-core:3.9.0`.

In the `zeppelin-server` pod logs, I see this:

```
 INFO [2020-05-13 02:52:15,840] ({Thread-18}
InterpreterSetting.java[run]:953) - Start to download dependencies for
interpreter: spark
 INFO [2020-05-13 02:52:21,565] ({Thread-18}
InterpreterSetting.java[run]:966) - Finish downloading dependencies for
interpreter: spark
 INFO [2020-05-13 02:52:21,565] ({Thread-18}
InterpreterSetting.java[setStatus]:740) - Set interpreter spark status to
READY
```

However, when I run a cell, I don't see any sign of dependencies being
downloaded to the actual spark pod, and I get this error:

```
:23: error: object datastax is not a member of package com import
com.datastax.driver.core.Cluster
```

Are dependency downloads supported with zeppelin and spark over kubernetes?
Or am I required to add the dependency jars directly to my spark docker
image and add them to the classpath?

Thanks,
Hetu


Re: Zeppelin 0.8.2 New Spark Interpreter

2019-11-08 Thread Mark Bidewell
Number 2 under http://zeppelin.apache.org/docs/0.8.2/interpreter/spark.html is
the best guide: spark.jars.packages can be set on the interpreter. I had
to add export SPARK_SUBMIT_OPTIONS="--repositories " to
zeppelin-env.sh to add my repo to the mix (rough sketch below).
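For the bintray example in the question, that ends up looking roughly like
this (adapted from what worked for me, so treat the exact combination as an
assumption):

# zeppelin-env.sh
export SPARK_SUBMIT_OPTIONS="--repositories https://dl.bintray.com/comp-bio-aging/main"

# Spark interpreter property
spark.jars.packages = group.research.aging:spark-extensions_2.11:0.0.7.2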

On Fri, Nov 8, 2019 at 5:11 AM Anton Kulaga  wrote:

> Are there clear instructions how to use spark.jars.packages properties?
> For instance, if I want to depend on bintray repo
> https://dl.bintray.com/comp-bio-aging/main with
> "group.research.aging:spark-extensions_2.11:0.0.7.2" as a dependency, what
> should I do with newintepreter?
>
> On 2019/10/12 01:18:09, Jeff Zhang  wrote:
> > Glad to hear that.
> >
> > Mark Bidewell  于2019年10月12日周六 上午1:30写道:
> >
> > > Just wanted to say "thanks"!  Using spark.jars.packages, etc worked
> great!
> > >
> > > On Fri, Oct 11, 2019 at 9:45 AM Jeff Zhang  wrote:
> > >
> > >> That's right, document should also be updated
> > >>
> > >> Mark Bidewell  于2019年10月11日周五 下午9:28写道:
> > >>
> > >>> Also the interpreter setting UI is still listed as the first way to
> > >>> handle dependencies in the documentation - Maybe it should be marked
> as
> > >>> deprecated?
> > >>>
> > >>> http://zeppelin.apache.org/docs/0.8.2/interpreter/spark.html
> > >>>
> > >>>
> > >>> On Thu, Oct 10, 2019 at 9:58 PM Jeff Zhang  wrote:
> > >>>
> > >>>> It looks like many users still get used to specify spark
> dependencies
> > >>>> in interpreter setting UI, spark.jars and spark.jars.packages seems
> too
> > >>>> difficult to understand and not transparent, so I create ticket
> > >>>> https://issues.apache.org/jira/browse/ZEPPELIN-4374 that user can
> > >>>> still set dependencies in interpreter setting UI.
> > >>>>
> > >>>> Jeff Zhang  于2019年10月11日周五 上午9:54写道:
> > >>>>
> > >>>>> Like I said above, try to set them via spark.jars and
> > >>>>> spark.jars.packages.
> > >>>>>
> > >>>>> Don't set them here
> > >>>>>
> > >>>>> [image: image.png]
> > >>>>>
> > >>>>>
> > >>>>> Mark Bidewell  于2019年10月11日周五 上午9:35写道:
> > >>>>>
> > >>>>>> I was specifying them in the interpreter settings in the UI.
> > >>>>>>
> > >>>>>> On Thu, Oct 10, 2019 at 9:30 PM Jeff Zhang 
> wrote:
> > >>>>>>
> > >>>>>>> How do you specify your spark interpreter dependencies ? You
> need to
> > >>>>>>> specify it via property spark.jars or spark.jars.packages for
> non-local
> > >>>>>>> model.
> > >>>>>>>
> > >>>>>>> Mark Bidewell  于2019年10月11日周五 上午3:45写道:
> > >>>>>>>
> > >>>>>>>> I am running some initial tests of Zeppelin 0.8.2 and I am
> seeing
> > >>>>>>>> some weird issues with dependencies.  When I use the old
> interpreter,
> > >>>>>>>> everything works as expected.  When I use the new interpreter,
> classes in
> > >>>>>>>> my interpreter dependencies cannot be resolved when connecting
> to a master
> > >>>>>>>> that is not local[*],  I did not encounter issues with either
> interpreter
> > >>>>>>>> on 0.8.1.
> > >>>>>>>>
> > >>>>>>>> Has anyone else seen this?
> > >>>>>>>>
> > >>>>>>>> Thanks!
> > >>>>>>>>
> > >>>>>>>> --
> > >>>>>>>> Mark Bidewell
> > >>>>>>>> http://www.linkedin.com/in/markbidewell
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> Best Regards
> > >>>>>>>
> > >>>>>>> Jeff Zhang
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>> Mark Bidewell
> > >>>>>> http://www.linkedin.com/in/markbidewell
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Best Regards
> > >>>>>
> > >>>>> Jeff Zhang
> > >>>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Best Regards
> > >>>>
> > >>>> Jeff Zhang
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>> Mark Bidewell
> > >>> http://www.linkedin.com/in/markbidewell
> > >>>
> > >>
> > >>
> > >> --
> > >> Best Regards
> > >>
> > >> Jeff Zhang
> > >>
> > >
> > >
> > > --
> > > Mark Bidewell
> > > http://www.linkedin.com/in/markbidewell
> > >
> >
> >
> > --
> > Best Regards
> >
> > Jeff Zhang
> >
>


-- 
Mark Bidewell
http://www.linkedin.com/in/markbidewell


Re: Zeppelin 0.8.2 New Spark Interpreter

2019-11-08 Thread Anton Kulaga
Are there clear instructions on how to use the spark.jars.packages property?
For instance, if I want to depend on the bintray repo
https://dl.bintray.com/comp-bio-aging/main with
"group.research.aging:spark-extensions_2.11:0.0.7.2" as a dependency, what
should I do with the new interpreter?

On 2019/10/12 01:18:09, Jeff Zhang  wrote: 
> Glad to hear that.
> 
> Mark Bidewell  于2019年10月12日周六 上午1:30写道:
> 
> > Just wanted to say "thanks"!  Using spark.jars.packages, etc worked great!
> >
> > On Fri, Oct 11, 2019 at 9:45 AM Jeff Zhang  wrote:
> >
> >> That's right, document should also be updated
> >>
> >> Mark Bidewell  于2019年10月11日周五 下午9:28写道:
> >>
> >>> Also the interpreter setting UI is still listed as the first way to
> >>> handle dependencies in the documentation - Maybe it should be marked as
> >>> deprecated?
> >>>
> >>> http://zeppelin.apache.org/docs/0.8.2/interpreter/spark.html
> >>>
> >>>
> >>> On Thu, Oct 10, 2019 at 9:58 PM Jeff Zhang  wrote:
> >>>
> >>>> It looks like many users still get used to specify spark dependencies
> >>>> in interpreter setting UI, spark.jars and spark.jars.packages seems too
> >>>> difficult to understand and not transparent, so I create ticket
> >>>> https://issues.apache.org/jira/browse/ZEPPELIN-4374 that user can
> >>>> still set dependencies in interpreter setting UI.
> >>>>
> >>>> Jeff Zhang  于2019年10月11日周五 上午9:54写道:
> >>>>
> >>>>> Like I said above, try to set them via spark.jars and
> >>>>> spark.jars.packages.
> >>>>>
> >>>>> Don't set them here
> >>>>>
> >>>>> [image: image.png]
> >>>>>
> >>>>>
> >>>>> Mark Bidewell  于2019年10月11日周五 上午9:35写道:
> >>>>>
> >>>>>> I was specifying them in the interpreter settings in the UI.
> >>>>>>
> >>>>>> On Thu, Oct 10, 2019 at 9:30 PM Jeff Zhang  wrote:
> >>>>>>
> >>>>>>> How do you specify your spark interpreter dependencies ? You need to
> >>>>>>> specify it via property spark.jars or spark.jars.packages for 
> >>>>>>> non-local
> >>>>>>> model.
> >>>>>>>
> >>>>>>> Mark Bidewell  于2019年10月11日周五 上午3:45写道:
> >>>>>>>
> >>>>>>>> I am running some initial tests of Zeppelin 0.8.2 and I am seeing
> >>>>>>>> some weird issues with dependencies.  When I use the old interpreter,
> >>>>>>>> everything works as expected.  When I use the new interpreter, 
> >>>>>>>> classes in
> >>>>>>>> my interpreter dependencies cannot be resolved when connecting to a 
> >>>>>>>> master
> >>>>>>>> that is not local[*],  I did not encounter issues with either 
> >>>>>>>> interpreter
> >>>>>>>> on 0.8.1.
> >>>>>>>>
> >>>>>>>> Has anyone else seen this?
> >>>>>>>>
> >>>>>>>> Thanks!
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Mark Bidewell
> >>>>>>>> http://www.linkedin.com/in/markbidewell
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards
> >>>>>>>
> >>>>>>> Jeff Zhang
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Mark Bidewell
> >>>>>> http://www.linkedin.com/in/markbidewell
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Best Regards
> >>>>>
> >>>>> Jeff Zhang
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Best Regards
> >>>>
> >>>> Jeff Zhang
> >>>>
> >>>
> >>>
> >>> --
> >>> Mark Bidewell
> >>> http://www.linkedin.com/in/markbidewell
> >>>
> >>
> >>
> >> --
> >> Best Regards
> >>
> >> Jeff Zhang
> >>
> >
> >
> > --
> > Mark Bidewell
> > http://www.linkedin.com/in/markbidewell
> >
> 
> 
> -- 
> Best Regards
> 
> Jeff Zhang
> 


Re: Zeppelin 0.8.2 New Spark Interpreter

2019-10-11 Thread Jeff Zhang
Glad to hear that.

On Sat, Oct 12, 2019 at 1:30 AM Mark Bidewell  wrote:

> Just wanted to say "thanks"!  Using spark.jars.packages, etc worked great!
>
> On Fri, Oct 11, 2019 at 9:45 AM Jeff Zhang  wrote:
>
>> That's right, document should also be updated
>>
>> Mark Bidewell  于2019年10月11日周五 下午9:28写道:
>>
>>> Also the interpreter setting UI is still listed as the first way to
>>> handle dependencies in the documentation - Maybe it should be marked as
>>> deprecated?
>>>
>>> http://zeppelin.apache.org/docs/0.8.2/interpreter/spark.html
>>>
>>>
>>> On Thu, Oct 10, 2019 at 9:58 PM Jeff Zhang  wrote:
>>>
>>>> It looks like many users still get used to specify spark dependencies
>>>> in interpreter setting UI, spark.jars and spark.jars.packages seems too
>>>> difficult to understand and not transparent, so I create ticket
>>>> https://issues.apache.org/jira/browse/ZEPPELIN-4374 that user can
>>>> still set dependencies in interpreter setting UI.
>>>>
>>>> Jeff Zhang  于2019年10月11日周五 上午9:54写道:
>>>>
>>>>> Like I said above, try to set them via spark.jars and
>>>>> spark.jars.packages.
>>>>>
>>>>> Don't set them here
>>>>>
>>>>> [image: image.png]
>>>>>
>>>>>
>>>>> Mark Bidewell  于2019年10月11日周五 上午9:35写道:
>>>>>
>>>>>> I was specifying them in the interpreter settings in the UI.
>>>>>>
>>>>>> On Thu, Oct 10, 2019 at 9:30 PM Jeff Zhang  wrote:
>>>>>>
>>>>>>> How do you specify your spark interpreter dependencies ? You need to
>>>>>>> specify it via property spark.jars or spark.jars.packages for non-local
>>>>>>> model.
>>>>>>>
>>>>>>> Mark Bidewell  于2019年10月11日周五 上午3:45写道:
>>>>>>>
>>>>>>>> I am running some initial tests of Zeppelin 0.8.2 and I am seeing
>>>>>>>> some weird issues with dependencies.  When I use the old interpreter,
>>>>>>>> everything works as expected.  When I use the new interpreter, classes 
>>>>>>>> in
>>>>>>>> my interpreter dependencies cannot be resolved when connecting to a 
>>>>>>>> master
>>>>>>>> that is not local[*],  I did not encounter issues with either 
>>>>>>>> interpreter
>>>>>>>> on 0.8.1.
>>>>>>>>
>>>>>>>> Has anyone else seen this?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> --
>>>>>>>> Mark Bidewell
>>>>>>>> http://www.linkedin.com/in/markbidewell
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>>
>>>>>>> Jeff Zhang
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Mark Bidewell
>>>>>> http://www.linkedin.com/in/markbidewell
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards
>>>>>
>>>>> Jeff Zhang
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards
>>>>
>>>> Jeff Zhang
>>>>
>>>
>>>
>>> --
>>> Mark Bidewell
>>> http://www.linkedin.com/in/markbidewell
>>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>
> --
> Mark Bidewell
> http://www.linkedin.com/in/markbidewell
>


-- 
Best Regards

Jeff Zhang


Re: Zeppelin 0.8.2 New Spark Interpreter

2019-10-11 Thread Mark Bidewell
Just wanted to say "thanks"!  Using spark.jars.packages, etc worked great!

On Fri, Oct 11, 2019 at 9:45 AM Jeff Zhang  wrote:

> That's right, document should also be updated
>
> Mark Bidewell  于2019年10月11日周五 下午9:28写道:
>
>> Also the interpreter setting UI is still listed as the first way to
>> handle dependencies in the documentation - Maybe it should be marked as
>> deprecated?
>>
>> http://zeppelin.apache.org/docs/0.8.2/interpreter/spark.html
>>
>>
>> On Thu, Oct 10, 2019 at 9:58 PM Jeff Zhang  wrote:
>>
>>> It looks like many users still get used to specify spark dependencies in
>>> interpreter setting UI, spark.jars and spark.jars.packages seems too
>>> difficult to understand and not transparent, so I create ticket
>>> https://issues.apache.org/jira/browse/ZEPPELIN-4374 that user can still
>>> set dependencies in interpreter setting UI.
>>>
>>> Jeff Zhang  于2019年10月11日周五 上午9:54写道:
>>>
>>>> Like I said above, try to set them via spark.jars and
>>>> spark.jars.packages.
>>>>
>>>> Don't set them here
>>>>
>>>> [image: image.png]
>>>>
>>>>
>>>> Mark Bidewell  于2019年10月11日周五 上午9:35写道:
>>>>
>>>>> I was specifying them in the interpreter settings in the UI.
>>>>>
>>>>> On Thu, Oct 10, 2019 at 9:30 PM Jeff Zhang  wrote:
>>>>>
>>>>>> How do you specify your spark interpreter dependencies ? You need to
>>>>>> specify it via property spark.jars or spark.jars.packages for non-local
>>>>>> model.
>>>>>>
>>>>>> Mark Bidewell  于2019年10月11日周五 上午3:45写道:
>>>>>>
>>>>>>> I am running some initial tests of Zeppelin 0.8.2 and I am seeing
>>>>>>> some weird issues with dependencies.  When I use the old interpreter,
>>>>>>> everything works as expected.  When I use the new interpreter, classes 
>>>>>>> in
>>>>>>> my interpreter dependencies cannot be resolved when connecting to a 
>>>>>>> master
>>>>>>> that is not local[*],  I did not encounter issues with either 
>>>>>>> interpreter
>>>>>>> on 0.8.1.
>>>>>>>
>>>>>>> Has anyone else seen this?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> --
>>>>>>> Mark Bidewell
>>>>>>> http://www.linkedin.com/in/markbidewell
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards
>>>>>>
>>>>>> Jeff Zhang
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Mark Bidewell
>>>>> http://www.linkedin.com/in/markbidewell
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards
>>>>
>>>> Jeff Zhang
>>>>
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>>
>> --
>> Mark Bidewell
>> http://www.linkedin.com/in/markbidewell
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


-- 
Mark Bidewell
http://www.linkedin.com/in/markbidewell


Re: Zeppelin 0.8.2 New Spark Interpreter

2019-10-10 Thread Jeff Zhang
It looks like many users are still used to specifying spark dependencies in
the interpreter setting UI, and spark.jars and spark.jars.packages seem too
difficult to understand and not transparent, so I created ticket
https://issues.apache.org/jira/browse/ZEPPELIN-4374 so that users can still set
dependencies in the interpreter setting UI.

On Fri, Oct 11, 2019 at 9:54 AM Jeff Zhang  wrote:

> Like I said above, try to set them via spark.jars and spark.jars.packages.
>
> Don't set them here
>
> [image: image.png]
>
>
> Mark Bidewell  于2019年10月11日周五 上午9:35写道:
>
>> I was specifying them in the interpreter settings in the UI.
>>
>> On Thu, Oct 10, 2019 at 9:30 PM Jeff Zhang  wrote:
>>
>>> How do you specify your spark interpreter dependencies ? You need to
>>> specify it via property spark.jars or spark.jars.packages for non-local
>>> model.
>>>
>>> Mark Bidewell  于2019年10月11日周五 上午3:45写道:
>>>
>>>> I am running some initial tests of Zeppelin 0.8.2 and I am seeing some
>>>> weird issues with dependencies.  When I use the old interpreter, everything
>>>> works as expected.  When I use the new interpreter, classes in my
>>>> interpreter dependencies cannot be resolved when connecting to a master
>>>> that is not local[*],  I did not encounter issues with either interpreter
>>>> on 0.8.1.
>>>>
>>>> Has anyone else seen this?
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> Mark Bidewell
>>>> http://www.linkedin.com/in/markbidewell
>>>>
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>>
>> --
>> Mark Bidewell
>> http://www.linkedin.com/in/markbidewell
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


-- 
Best Regards

Jeff Zhang


Re: Zeppelin 0.8.2 New Spark Interpreter

2019-10-10 Thread Jeff Zhang
Like I said above, try to set them via spark.jars and spark.jars.packages.

Don't set them here

[image: image.png]


On Fri, Oct 11, 2019 at 9:35 AM, Mark Bidewell wrote:

> I was specifying them in the interpreter settings in the UI.
>
> On Thu, Oct 10, 2019 at 9:30 PM Jeff Zhang  wrote:
>
>> How do you specify your spark interpreter dependencies ? You need to
>> specify it via property spark.jars or spark.jars.packages for non-local
>> model.
>>
>> Mark Bidewell  于2019年10月11日周五 上午3:45写道:
>>
>>> I am running some initial tests of Zeppelin 0.8.2 and I am seeing some
>>> weird issues with dependencies.  When I use the old interpreter, everything
>>> works as expected.  When I use the new interpreter, classes in my
>>> interpreter dependencies cannot be resolved when connecting to a master
>>> that is not local[*],  I did not encounter issues with either interpreter
>>> on 0.8.1.
>>>
>>> Has anyone else seen this?
>>>
>>> Thanks!
>>>
>>> --
>>> Mark Bidewell
>>> http://www.linkedin.com/in/markbidewell
>>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>
> --
> Mark Bidewell
> http://www.linkedin.com/in/markbidewell
>


-- 
Best Regards

Jeff Zhang


Re: Zeppelin 0.8.2 New Spark Interpreter

2019-10-10 Thread Mark Bidewell
I was specifying them in the interpreter settings in the UI.

On Thu, Oct 10, 2019 at 9:30 PM Jeff Zhang  wrote:

> How do you specify your spark interpreter dependencies ? You need to
> specify it via property spark.jars or spark.jars.packages for non-local
> model.
>
> Mark Bidewell  于2019年10月11日周五 上午3:45写道:
>
>> I am running some initial tests of Zeppelin 0.8.2 and I am seeing some
>> weird issues with dependencies.  When I use the old interpreter, everything
>> works as expected.  When I use the new interpreter, classes in my
>> interpreter dependencies cannot be resolved when connecting to a master
>> that is not local[*],  I did not encounter issues with either interpreter
>> on 0.8.1.
>>
>> Has anyone else seen this?
>>
>> Thanks!
>>
>> --
>> Mark Bidewell
>> http://www.linkedin.com/in/markbidewell
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


-- 
Mark Bidewell
http://www.linkedin.com/in/markbidewell


Re: Zeppelin 0.8.2 New Spark Interpreter

2019-10-10 Thread Jeff Zhang
How do you specify your Spark interpreter dependencies? You need to specify
them via the property spark.jars or spark.jars.packages for non-local mode.
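For example, on the Spark interpreter setting page you could add properties like
these (the jar path and Maven coordinate below are only illustrative):

    spark.jars             /path/to/my-udfs.jar
    spark.jars.packages    org.apache.commons:commons-math3:3.6.1

Both are standard Spark properties: spark.jars takes a comma-separated list of
local jar paths, and spark.jars.packages takes comma-separated Maven coordinates
in groupId:artifactId:version form, so they also work when the master is a YARN
or standalone cluster rather than local[*].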

On Fri, Oct 11, 2019 at 3:45 AM, Mark Bidewell wrote:

> I am running some initial tests of Zeppelin 0.8.2 and I am seeing some
> weird issues with dependencies.  When I use the old interpreter, everything
> works as expected.  When I use the new interpreter, classes in my
> interpreter dependencies cannot be resolved when connecting to a master
> that is not local[*],  I did not encounter issues with either interpreter
> on 0.8.1.
>
> Has anyone else seen this?
>
> Thanks!
>
> --
> Mark Bidewell
> http://www.linkedin.com/in/markbidewell
>


-- 
Best Regards

Jeff Zhang


Zeppelin 0.8.2 New Spark Interpreter

2019-10-10 Thread Mark Bidewell
I am running some initial tests of Zeppelin 0.8.2 and I am seeing some
weird issues with dependencies.  When I use the old interpreter, everything
works as expected.  When I use the new interpreter, classes in my
interpreter dependencies cannot be resolved when connecting to a master
that is not local[*],  I did not encounter issues with either interpreter
on 0.8.1.

Has anyone else seen this?

Thanks!

-- 
Mark Bidewell
http://www.linkedin.com/in/markbidewell


Re: spark interpreter "master" parameter always resets to yarn-client after restart zeppelin

2019-08-19 Thread Jeff Zhang
Do you mean that you manually change master to yarn after the Zeppelin service
starts, and it resets to yarn-client after you restart Zeppelin?

On Tue, Aug 20, 2019 at 8:01 AM, Manuel Sopena Ballesteros wrote:

> Dear Zeppelin user community,
>
>
>
> I would like I a zeppelin installation with spark integration and the
> “master” parameter in the spark interpreter configuration always resets its
> value from “yarn” to “yarn-client” after zeppelin service reboot.
>
>
>
> How can I stop that?
>
>
>
> Thank you
>
>
>


-- 
Best Regards

Jeff Zhang


Re: python virtual environment on spark interpreter

2019-08-19 Thread Jeff Zhang
Use PYSPARK_PYTHON or spark.pyspark.python

Here's more details
https://www.zepl.com/viewer/notebooks/bm90ZTovL3pqZmZkdS9lNDYzY2NjMmRkODM0NTcwYjFiZTgwMzViMTBmNTUxZi9ub3RlLmpzb24
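For example (the path below is illustrative and assumes the virtualenv exists at
the same location on every node that runs executors), you can export it in
conf/zeppelin-env.sh:

    export PYSPARK_PYTHON=/home/alice/python_virt_env/bin/python

or set spark.pyspark.python to the same path in the Spark interpreter settings.
As far as I know, placeholders like {user_home} are not interpolated in
interpreter properties, so with impersonation the environment either needs to
live at a path that is valid for every user or be handled with a separate
interpreter setting.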


On Tue, Aug 20, 2019 at 7:57 AM, Manuel Sopena Ballesteros wrote:

> Dear Zeppelin user community,
>
>
>
> I have a zeppelin installation connected to a Spark cluster. I setup
> Zeppelin to submit jobs in yarn cluster mode and also impersonation is
> enabled. Now I would like to be able to use a python virtual environment
> instead of system one.
>
> Is there a way I could specify the python parameter in the spark
> interpreter settings so is can point to specific folder use home folder (eg
> /home/{user_home}/python_virt_env/python) instead of a system one?
>
>
>
> If not how should I achieve what I want?
>
>
>
> Thank you
>
>
>
> Manuel
>


-- 
Best Regards

Jeff Zhang


spark interpreter "master" parameter always resets to yarn-client after restart zeppelin

2019-08-19 Thread Manuel Sopena Ballesteros
Dear Zeppelin user community,

I have a Zeppelin installation with Spark integration, and the "master"
parameter in the Spark interpreter configuration always resets its value from
"yarn" to "yarn-client" after a Zeppelin service reboot.

How can I stop that?

Thank you



python virtual environment on spark interpreter

2019-08-19 Thread Manuel Sopena Ballesteros
Dear Zeppelin user community,

I have a Zeppelin installation connected to a Spark cluster. I set up Zeppelin
to submit jobs in yarn-cluster mode, and impersonation is also enabled. Now I
would like to be able to use a Python virtual environment instead of the system one.
Is there a way I could specify the Python executable in the Spark interpreter
settings so that it points to a folder under the user's home directory (e.g.
/home/{user_home}/python_virt_env/python) instead of the system one?

If not how should I achieve what I want?

Thank you

Manuel


Spark Interpreter failing to start: NumberFormat exception

2019-04-18 Thread Krentz
All -


I am having an issue with a build I forked from master that is compiled as
0.9. We have another build running 0.8 that works just fine. The Spark
interpreter is failing to start, and giving a NumberFormatException. It
looks like when Zeppelin runs interpreter.sh, the
RemoteInterpreterServer.java main method is pulling the IP address instead
of the port number.


Here is the command it tries running:


INFO [2019-04-17 19:33:17,507] ({SchedulerFactory2}
RemoteInterpreterManagedProcess.java[start]:136) - Run interpreter
process *[/opt/zeppelin/bin/interpreter.sh,
-d, /opt/zeppelin/interpreter/spark, -c, 11.3.64.129, -p, 38675, -r, :, -i,
spark-shared_process, -l, /opt/zeppelin/local-repo/spark, -g, spark]*


and here is the code from RemoteInterpreterServer.java starting at line 270:

if (args.length > 0) {

  zeppelinServerHost = args[0];

  *port = Integer.parseInt(args[1]);*

  interpreterGroupId = args[2];

  if (args.length > 3) {

portRange = args[3];

  }

}

It gets a NumberFormatException because it tries to do Integer.parseInt()
on an IP address, the second arg passed into the interpreter.sh


Here is the error:


Exception in thread "main" java.lang.NumberFormatException: For input
string: "11.3.64.129"

at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)

at java.lang.Integer.parseInt(Integer.java:580)

at java.lang.Integer.parseInt(Integer.java:615)

at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.main(RemoteInterpreterServer.java:272)

...


Why is RemoteInterpreterServer pulling out the wrong arg index? Or,
alternatively, why is Zeppelin attempting to run interpreter.sh with the
wrong arguments? Has this issue been fixed somewhere that I missed? Am I on
a bad snapshot? I believe I am up-to-date with Master. My personal code
changes were focused on the front-end and realms so I haven't touched any
of the code in zeppelin-zengine or zeppelin-interpreter. Any help figuring
out why I am running into this is appreciated!


Thanks,

Chris Krentz


Re: Multi-line scripts in spark interpreter

2018-07-12 Thread Sanjay Dasgupta
Jeff Zhang's comment here
<https://issues.apache.org/jira/browse/ZEPPELIN-3547?focusedCommentId=16542368=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16542368>
may be useful.

Regards,
Sanjay

On Fri, Jul 13, 2018 at 1:01 AM, Paul Brenner  wrote:

> This behavior is coming from the new spark interpreter. Jeff opened 
> ZEPPELIN-3587
> to fix it. In the mean time you can use the old spark interpreter (set 
> zeppelin.spark.useNew
> to false) to get around this. Hopefully you aren't dependent on the new
> spark interpreter.
>
>
> *Paul Brenner*
> SR. DATA SCIENTIST
> (217) 390-3033

Re: Multi-line scripts in spark interpreter

2018-07-12 Thread Paul Brenner
This behavior is coming from the new Spark interpreter. Jeff opened
ZEPPELIN-3587 to fix it. In the meantime you can use the old Spark interpreter
(set zeppelin.spark.useNew to false) to get around this. Hopefully you aren't
dependent on the new Spark interpreter.
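
If you want to keep chained calls on separate lines while staying on the new
interpreter, two plain-Scala workarounds are sketched below (not verified against
every 0.8.0 build; Something/someMethod are just the placeholder names from the
original message in this thread):

    // Option 1: wrap the whole expression in parentheses, so the REPL keeps
    // reading until the closing parenthesis
    val a = (new Something()
      .someMethod()
      .someMethod2())

    // Option 2: put the dot at the end of each line instead of the start
    val a = new Something().
      someMethod().
      someMethod2()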

*Paul Brenner*
SR. DATA SCIENTIST
(217) 390-3033

Multi-line scripts in spark interpreter

2018-07-12 Thread Christopher Piggott
Hi,

This used to work:

val a = new Something()
 .someMethod()
 .someMethod2()


in 0.7.3 but it doesn't in 0.8.0 ... it says the .someMethod(), etc. are an
illegal start of expression.

Some of these setups I have are fluently expressed but would be
unmanageable in a single long line.  Is there something I need to do to
re-enable this so that my existing pool of zeppelin notebooks still all
work as they did before?

--Chris


Re: illegal start of definition with new spark interpreter

2018-07-05 Thread Jeff Zhang
This is due to the different behavior of the new Spark interpreter. I have
created ZEPPELIN-3587 and will fix it ASAP.
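
Until that fix lands, wrapping the whole chained expression in parentheses keeps
it on multiple lines in both the old and new interpreters (a sketch based on the
snippet quoted below; not guaranteed for every build):

    val filtered = (df.groupBy("x").count()
      .filter($"count" >= 2))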



On Fri, Jul 6, 2018 at 1:11 AM, Paul Brenner wrote:

> Hi all,
>
> When I try switching over to the new spark interpreter it seems there is a
> fundamental difference in how code is interpreted? Maybe that shouldn't be
> a surprise, but I'm wondering if other people have experienced it and if
> there is any work around or hope for a change in the future.
>
> Specifically, if I write some very normal code that looks like the
> following:
>
> df.groupBy("x").count()
>   .filter($"count" >= 2)
>
>
> everything works fine with the old interpreter, but the new interpreter
> complains:
> :1: error: illegal start of definition
> .filter($"count" >= 2)
>
> I realize that I can work around this by ending each line with a dot, but
> then
>
>1. I'm coding like a psychopath and
>2. I would have to go back and change every line of code in old
>notebooks
>
> Is this actually a bug/feature of the new spark interpreter or do I have
> some configuration problem. If it is a property of the new interpreter, is
> it always going to be this way? For now we are just telling our users not
> to use the new spark interpreter.
>
>
>
> *Paul Brenner*
> SR. DATA SCIENTIST
> (217) 390-3033

illegal start of definition with new spark interpreter

2018-07-05 Thread Paul Brenner
Hi all,

When I try switching over to the new Spark interpreter, it seems there is a
fundamental difference in how code is interpreted. Maybe that shouldn't be a
surprise, but I'm wondering if other people have experienced it and if there is
any workaround or hope for a change in the future.

Specifically, if I write some very normal code that looks like the following:

df. groupBy ( "x" ). count (). filter ( $ "count" >= 2 )

everything works fine with the old interpreter, but the new interpreter 
complains:
:1: error: illegal start of definition

.filter($"count" >= 2)

I realize that I can work around this by ending each line with a dot, but then
* I'm coding like a psychopath and 
* I would have to go back and change every line of code in old notebooks
Is this actually a bug/feature of the new Spark interpreter, or do I have some
configuration problem? If it is a property of the new interpreter, is it always
going to be this way? For now we are just telling our users not to use the new
Spark interpreter.

*Paul Brenner*
SR. DATA SCIENTIST
(217) 390-3033

Re: Where Spark home is picked up in the new Spark interpreter

2018-06-06 Thread Jeff Zhang
It is picked up from the interpreter setting. You can define SPARK_HOME on the
Spark interpreter setting page.
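
For example, add a property like this on the interpreter setting page (the path
is only illustrative):

    SPARK_HOME    /opt/spark-2.3.1-bin-hadoop2.7

If SPARK_HOME is not set anywhere, Zeppelin falls back to the embedded Spark it
ships with, which is usually not what you want against a real cluster.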



On Thu, Jun 7, 2018 at 11:50 AM, Anthony Corbacho wrote:

> Hi,
>
> I am a bit confused where spark home is pick up in the new Spark
> interpreter in the 0.8 branch?
>
> Regards,
> Anthony
>


Where Spark home is picked up in the new Spark interpreter

2018-06-06 Thread Anthony Corbacho
Hi,

I am a bit confused about where the Spark home is picked up in the new Spark
interpreter in the 0.8 branch.

Regards,
Anthony


Spark Interpreter Tutorial in Apache Zeppelin

2018-05-30 Thread Jeff Zhang
Hi Folks,


I often see users asking on the mailing list how to use the Spark interpreter,
especially how to configure it. So I wrote this article about how to use the
Spark interpreter in Apache Zeppelin (it is based on Zeppelin 0.8.0). It is not
complete yet; I will continue to add more content to it, so any feedback and
comments are welcome.

https://medium.com/@zjffdu/spark-interpreter-tutorial-in-apache-zeppelin-a7e18b557a9c


Re: Spark Interpreter error: 'not found: type'

2018-03-19 Thread Jeff Zhang
I tried it on the master branch; it looks like it fails to download the
dependencies, and it also fails when I use spark-submit directly. It
should not be a Zeppelin issue; please check these 2 dependencies.

Exception in thread "main" java.lang.RuntimeException: problem during
retrieve of org.apache.spark#spark-submit-parent:
java.lang.RuntimeException: Multiple artifacts of the module
org.bytedeco.javacpp-presets#openblas;0.2.19-1.3 are retrieved to the same
file! Update the retrieve pattern to fix this error. at
org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:249)
at
org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:83)
at org.apache.ivy.Ivy.retrieve(Ivy.java:551) at
org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1200)
at
org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:304)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153) at
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119) at
org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by:
java.lang.RuntimeException: Multiple artifacts of the module
org.bytedeco.javacpp-presets#openblas;0.2.19-1.3 are retrieved to the same
file! Update the retrieve pattern to fix this error. at
org.apache.ivy.core.retrieve.RetrieveEngine.determineArtifactsToCopy(RetrieveEngine.java:417)
at
org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:118)
... 7 more at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:205)
at
org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpreterProcess(ManagedInterpreterGroup.java:65)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpreterProcess(RemoteInterpreter.java:105)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:158)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:126)

On Tue, Mar 20, 2018 at 8:01 AM, Marcus <marcus.hun...@gmail.com> wrote:

> Hi Karan,
>
> thanks for your hint, and sorry for the late response. I've tried the
> import using _root_ as suggested on stackoverflow, but it didn't change
> anything. Also, the import statement runs. The error occurs when using the
> classname.
>
> As for datavec-api, it is a transient dependency of deeplearning4j-core,
> which is loaded using %spark.dep. I also added it to the
> interpreter-settings as a dependency, with no different effect.
>
> Regards, Marcus
>
> On Wed, Mar 14, 2018 at 1:56 PM, Karan Sewani <karan.sew...@guenstiger.de>
> wrote:
>
>> Hello Marcus
>>
>>
>> Maybe it has something to do with
>>
>>
>> https://stackoverflow.com/questions/13008792/how-to-import-class-using-fully-qualified-name
>>
>> I have implemented user defined functions in spark and used them in my code
>> with jar being loaded in classpath and i didn't have any issues with import.
>>
>>
>> Can you give me idea of how you are loading this jar datavec-api for
>> zeppelin or spark-submit to access?
>>
>>
>> Best
>>
>> Karan
>> --
>> *From:* Marcus <marcus.hun...@gmail.com>
>> *Sent:* Saturday, March 10, 2018 10:43:25 AM
>> *To:* users@zeppelin.apache.org
>> *Subject:* Spark Interpreter error: 'not found: type'
>>
>> Hi,
>>
>> I am new to Zeppelin and encountered a strange behavior. When copying my
>> running scala-code to a notebook, I've got errors from the spark
>> interpreter, saying it could not find some types. Strangely the code
>> worked, when I used the fqcn instead of the simple name.
>> But since I want the create a workflow for me, where I use my IDE to
>> write scala and transfer it to a notebook, I'd prefer to not be forced to
>> using fqcn.
>>
>> Here's an example:
>>
>>
>> | %spark.dep
>> | z.reset()
>> | z.load("org.deeplearning4j:deeplearning4j-core:0.9.1")
>> | z.load("org.nd4j:nd4j-native-platform:0.9.1")
>>
>> res0: org.apache.zeppelin.dep.Dependency =
>> org.apache.zeppelin.dep.Dependency@2e10d1e4
>>
>> | import org.datavec.api.records.reader.impl.FileRecordReader
>> |
>> | class Test extends FileRecordReader {
>> | }
>> |
>> | val t = new Test()
>>
>> import org.datavec.api.records.reader.impl.FileRecordReader
>> :12: error: not found: type FileRecordReader
>> class Test extends FileRecordReader {
>>
>> Thanks, Marcus
>>
>
>


Re: Spark Interpreter error: 'not found: type'

2018-03-19 Thread Marcus
Hi Karan,

thanks for your hint, and sorry for the late response. I've tried the
import using _root_ as suggested on Stack Overflow, but it didn't change
anything. Also, the import statement runs; the error occurs when using the
class name.

As for datavec-api, it is a transitive dependency of deeplearning4j-core,
which is loaded using %spark.dep. I also added it to the
interpreter settings as a dependency, with no different effect.

Regards, Marcus

On Wed, Mar 14, 2018 at 1:56 PM, Karan Sewani <karan.sew...@guenstiger.de>
wrote:

> Hello Marcus
>
>
> Maybe it has something to do with
>
> https://stackoverflow.com/questions/13008792/how-to-import-class-using-fully-qualified-name
>
> I have implemented user defined functions in spark and used them in my code
> with jar being loaded in classpath and i didn't have any issues with import.
>
>
> Can you give me idea of how you are loading this jar datavec-api for
> zeppelin or spark-submit to access?
>
>
> Best
>
> Karan
> --
> *From:* Marcus <marcus.hun...@gmail.com>
> *Sent:* Saturday, March 10, 2018 10:43:25 AM
> *To:* users@zeppelin.apache.org
> *Subject:* Spark Interpreter error: 'not found: type'
>
> Hi,
>
> I am new to Zeppelin and encountered a strange behavior. When copying my
> running scala-code to a notebook, I've got errors from the spark
> interpreter, saying it could not find some types. Strangely the code
> worked, when I used the fqcn instead of the simple name.
> But since I want the create a workflow for me, where I use my IDE to write
> scala and transfer it to a notebook, I'd prefer to not be forced to using
> fqcn.
>
> Here's an example:
>
>
> | %spark.dep
> | z.reset()
> | z.load("org.deeplearning4j:deeplearning4j-core:0.9.1")
> | z.load("org.nd4j:nd4j-native-platform:0.9.1")
>
> res0: org.apache.zeppelin.dep.Dependency = org.apache.zeppelin.dep.
> Dependency@2e10d1e4
>
> | import org.datavec.api.records.reader.impl.FileRecordReader
> |
> | class Test extends FileRecordReader {
> | }
> |
> | val t = new Test()
>
> import org.datavec.api.records.reader.impl.FileRecordReader
> :12: error: not found: type FileRecordReader
> class Test extends FileRecordReader {
>
> Thanks, Marcus
>


Re: Spark Interpreter error: 'not found: type'

2018-03-14 Thread Karan Sewani
Hello Marcus


Maybe it has something to do with

https://stackoverflow.com/questions/13008792/how-to-import-class-using-fully-qualified-name

I have implemented user-defined functions in Spark and used them in my code with
the jar loaded in the classpath, and I didn't have any issues with imports.


Can you give me an idea of how you are loading this datavec-api jar for Zeppelin
or spark-submit to access?


Best

Karan


From: Marcus <marcus.hun...@gmail.com>
Sent: Saturday, March 10, 2018 10:43:25 AM
To: users@zeppelin.apache.org
Subject: Spark Interpreter error: 'not found: type'

Hi,

I am new to Zeppelin and encountered a strange behavior. When copying my 
running scala-code to a notebook, I've got errors from the spark interpreter, 
saying it could not find some types. Strangely the code worked, when I used the 
fqcn instead of the simple name.
But since I want the create a workflow for me, where I use my IDE to write 
scala and transfer it to a notebook, I'd prefer to not be forced to using fqcn.

Here's an example:


| %spark.dep
| z.reset()
| z.load("org.deeplearning4j:deeplearning4j-core:0.9.1")
| z.load("org.nd4j:nd4j-native-platform:0.9.1")

res0: org.apache.zeppelin.dep.Dependency = 
org.apache.zeppelin.dep.Dependency@2e10d1e4

| import org.datavec.api.records.reader.impl.FileRecordReader
|
| class Test extends FileRecordReader {
| }
|
| val t = new Test()

import org.datavec.api.records.reader.impl.FileRecordReader
:12: error: not found: type FileRecordReader
class Test extends FileRecordReader {

Thanks, Marcus


Spark Interpreter error: 'not found: type'

2018-03-09 Thread Marcus
Hi,

I am new to Zeppelin and encountered a strange behavior. When copying my
running Scala code to a notebook, I got errors from the Spark interpreter
saying it could not find some types. Strangely, the code worked when I used
the fully qualified class name (FQCN) instead of the simple name.
But since I want to create a workflow where I use my IDE to write Scala and
transfer it to a notebook, I'd prefer not to be forced to use the FQCN.

Here's an example:


| %spark.dep
| z.reset()
| z.load("org.deeplearning4j:deeplearning4j-core:0.9.1")
| z.load("org.nd4j:nd4j-native-platform:0.9.1")

res0: org.apache.zeppelin.dep.Dependency =
org.apache.zeppelin.dep.Dependency@2e10d1e4

| import org.datavec.api.records.reader.impl.FileRecordReader
|
| class Test extends FileRecordReader {
| }
|
| val t = new Test()

import org.datavec.api.records.reader.impl.FileRecordReader
:12: error: not found: type FileRecordReader
class Test extends FileRecordReader {

Thanks, Marcus
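
For reference, the workaround Marcus describes (spelling out the fully qualified
name at the use site) would look like this in a notebook paragraph; the class
body itself is unchanged:

    class Test extends org.datavec.api.records.reader.impl.FileRecordReader {
    }

    val t = new Test()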


Re: Cannot define UDAF in %spark interpreter

2018-02-27 Thread Vannson, Raphael
Hello Paul,

Many thanks for your quick answer. This did the trick!
Fantastic!

Best,
Raphael



***PARAGRAPH INPUT:***
val AggregatedChangepointAnalyzer = new UserDefinedAggregateFunction {
…
}

***PARAGRAPH OUTPUT:***
AggregatedChangepointAnalyzer: 
org.apache.spark.sql.expressions.UserDefinedAggregateFunction{def 
evaluate(buffer: org.apache.spark.sql.Row): String} = 
79b2515edf74bd80cfc9d8ac1ba563c6anon$1@3b65afbc



I was then able to use the UDAF easily:
***PARAGRAPH INPUT:***
val cpt_df = df.groupBy("foo", "bar ", "baz", 
"bok").agg(AggregatedChangepointAnalyzer(col("y")).as("cpt"))
cpt_df.show

cpt_df: org.apache.spark.sql.DataFrame = [foo: string, bar: string ... 3 more 
fields]
+------+--------+-------+------+-----+
| foo  | bar    | baz   | bok  | cpt |
+------+--------+-------+------+-----+
| some | secret | thing | here | 40  |
+------+--------+-------+------+-----+




From: Paul Brenner <pbren...@placeiq.com>
Date: Tuesday, February 27, 2018 at 3:31 PM
To: Raphael Vannson <raphael.vann...@thinkbiganalytics.com>, 
"users@zeppelin.apache.org" <users@zeppelin.apache.org>
Subject: Cannot define UDAF in %spark interpreter

Unfortunately, I don’t know why code that is working for you in spark shell 
isn’t working in Zeppelin. But if you are looking for a quick fix perhaps this 
could help?

I’ve had luck defining my UDAFs in zeppelin like:
val myUDAF = new UserDefinedAggregateFunction {}



So for example the following code compiles fine for me in zeppelin:

val FractionOfDayCoverage = new UserDefinedAggregateFunction {

  // Input Data Type Schema
  def inputSchema: StructType = StructType(Array(StructField("seconds", LongType)))

  // Intermediate Schema
  def bufferSchema = StructType(Array(StructField("times", ArrayType(LongType))))

  // Returned Data Type
  def dataType = DoubleType

  // Self-explaining
  def deterministic = true

  // This function is called whenever key changes
  def initialize(buffer: MutableAggregationBuffer) = {
    var timeArray = new ListBuffer[Long]()
    buffer.update(0, timeArray)
  }

  // Iterate over each entry of a group
  def update(buffer: MutableAggregationBuffer, input: Row) = {
    if (!(input.isNullAt(0))) {
      var timeArray = new ListBuffer[Long]()
      timeArray ++= buffer.getAs[List[Long]](0)
      timeArray += input.getLong(0)
      buffer.update(0, timeArray)
    }
  }

  // Merge two partial aggregates
  def merge(buffer1: MutableAggregationBuffer, buffer2: Row) = {
    var timeArray = new ListBuffer[Long]()
    timeArray ++= buffer1.getAs[List[Long]](0)
    timeArray ++= buffer2.getAs[List[Long]](0)
    buffer1.update(0, timeArray)
  }

  // Called after all the entries are exhausted.
  def evaluate(buffer: Row) = {
    var timeArray = new ListBuffer[Long]()
    timeArray ++= buffer.getAs[List[Long]](0).filter(x => x != null)
    val times = timeArray.toArray
    scala.util.Sorting.quickSort(times)
    var intStart = times(0) - 30*60
    var intEnd = times(0) + 30*60
    var seen = 0L
    for (t <- times) {
      if (t > intEnd + 30*60) {
        seen += (intEnd - intStart)
        intStart = t - 30*60
        intEnd = t + 30*60
      } else {
        intEnd = t + 30*60
      }
    }
    seen += intEnd - intStart
    math.min(seen.toDouble/(24*60*60), 1)
  }
}


I’m using zeppelin 0.7.2 and spark 2.0.1 (I think) so perhaps there is a 
version issue somewhere?

Paul Brenner
DATA SCIENTIST
(217) 390-3033


On Tue, Feb 27, 2018 at 6:19 PM, Vannson Raphael <raphael.vann...@thinkbiganalytics.com> wrote:

Cannot define UDAF in %spark interpreter

2018-02-27 Thread Paul Brenner
..@thinkbiganalytics.com> ) > wrote:

> 
> 
> 
> Hello,
> 
> I am having trouble defining a UDAF, using the same code in spark-shell in
> :paste mode works fine.
> 
> Environment:
> - Amazon EMR
> - Apache Zeppelin Version 0.7.3
> - Spark version 2.2.1
> - Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_161)
> 
> 1) Is there a way to configure the zeppelin %spark interpreter to do the
> equivalent of spark-shell's :paste mode?
> 2) If not, is there a workaround to be able to define UDAFs in Zeppelin's
> %spark interpreter?
> 
> Thanks!
> Raphael
> 
> 
> 
> 
> ***PARAGRAPH INPUT:***
> %spark
> 
> import org.apache.spark.sql.functions._
> import org.apache.spark.sql.types._
> import org.apache.spark.sql.expressions.{MutableAggregationBuffer,
> UserDefinedAggregateFunction}
> import org.apache.spark.sql.Row
> import scala.collection.mutable.WrappedArray
> import scala.collection.mutable.ListBuffer
> 
> class AggregatedChangepointAnalyzer extends UserDefinedAggregateFunction {
> 
> // Input schema
> override def inputSchema: StructType = StructType(StructField("y",
> DoubleType) :: Nil)
> 
> // Intermediate buffer schema
> override def bufferSchema: StructType =
> StructType(StructField("observations", ArrayType(DoubleType)) :: Nil)
> 
> //Output schema
> override def dataType: DataType = StringType
> 
> // Deterministic UDAF
> override def deterministic: Boolean = true
> 
> 
> 
> // How to initialize the intermediate processing buffer for each group:
> // We simply create a List[Double] which will hold the observations (y)
> // of each group
> override def initialize(buffer: MutableAggregationBuffer): Unit = {
> buffer(0) = Array.emptyDoubleArray
> }
> 
> // What to do with each new row within the group:
> // Here we append each new observation of the group
> // in a List[Double]
> override def update(buffer: MutableAggregationBuffer, input: Row): Unit =
> {
> // Put the observations collected into a List
> var values = new ListBuffer[Double]()
> values.appendAll(buffer.getAs[List[Double]](0))
> 
> // Get the new value for the current row
> val newValue = input.getDouble(0)
> 
> // Append the new value to the buffer and return it
> values.append(newValue)
> buffer.update(0, values)
> }
> 
> 
> // How to merge 2 buffers located on 2 separate executor hosts or JVMs:
> // Simply append one List at the end of another
> override def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit
> = {
> var values = new ListBuffer[Double]()
> values ++= buffer1.getAs[List[Double]](0)
> values ++= buffer2.getAs[List[Double]](0)
> buffer1.update(0, values)
> }
> 
> 
> 
> override def evaluate(buffer: Row): String = {
> val observations = buffer.getSeq[Double](0)
> observations.size.toString
> }
> }
> 
> 
> 
> ***PARAGRAPH OUTPUT:***
> import org.apache.spark.sql.functions._
> import org.apache.spark.sql.types._
> import org.apache.spark.sql.expressions.{MutableAggregationBuffer,
> UserDefinedAggregateFunction}
> import org.apache.spark.sql.Row
> import scala.collection.mutable.WrappedArray
> import scala.collection.mutable.ListBuffer
> :12: error: not found: type UserDefinedAggregateFunction
> class AggregatedChangepointAnalyzer extends UserDefinedAggregateFunction {
> 
> ^
> :14: error: not found: type StructType
> override def inputSchema: StructType = StructType(StructField("y",
> DoubleType) :: Nil)
> ^
> :14: error: not found: value StructType
> override def inputSchema: StructType = StructType(StructField("y",
> DoubleType) :: Nil)
> ^
> :14: error: not found: value StructField
> override def inputSchema: StructType = StructType(StructField("y",
> DoubleType) :: Nil)
> ^
> :14: error: not found: value DoubleType
> override def inputSchema: StructType = StructType(StructField("y",
> DoubleType) :: Nil)
> ^
> :17: error: not found: type StructType
> override def bufferSchema: StructType =
> StructType(StructField("observations", ArrayType(DoubleType)) :: Nil)
> ^
> :17: error: not found: value StructType
> override def bufferSchema: StructType =
> StructType(StructField("observations", ArrayType(DoubleType)) :: Nil)
> ^
> :17: error: not found: value StructField
> override def bufferSchema: StructType =
> StructType(StructField("observations", ArrayType(DoubleType)) :: Nil)
> :17: error: not found: value ArrayType
> override def bufferSchema: StructType =
> StructType(StructField("observations", ArrayType(DoubleType)) :: Nil)
> ^
> :17: error: not found: value DoubleType
> overri

Jar dependencies are not reloaded when Spark interpreter is restarted?

2018-02-22 Thread Partridge, Lucas (GE Aviation)
I only change the content of the jar, not the name or version of the jar 
(otherwise I’d have to re-add it as a dependency anyway).  Or do you mean 
something else by ’version’?

This dependency is a local file. Zeppelin and Spark are all running on the same 
machine. So I’m just specifying the file system path of the jar; it’s not even 
prefixed with file:///.

From: Jhon Anderson Cardenas Diaz [mailto:jhonderson2...@gmail.com]
Sent: 22 February 2018 12:18
To: users@zeppelin.apache.org
Subject: EXT: Re: Jar dependencies are not reloaded when Spark interpreter is 
restarted?

When you say you change the dependency, is only about the content? Or content 
and version. I think the dependency should be reloaded only if its version 
change.

I do not think it's optimal to re-download the dependencies every time the 
interpreter reboots.

On 22 Feb 2018 at 05:22, "Partridge, Lucas (GE Aviation)" <lucas.partri...@ge.com> wrote:
I’m using Zeppelin 0.7.3 against a local standalone Spark ‘cluster’. I’ve added 
a Scala jar dependency to my Spark interpreter using Zeppelin’s UI. I thought 
if I changed my Scala code and updated the jar (using sbt outside of Zeppelin) 
then all I’d have to do is restart the interpreter for the new code to be 
picked up in Zeppelin in a regular scala paragraph.  However restarting the 
interpreter appears to have no effect – the new code is not detected. Is that 
expected behaviour or a bug?

The workaround I’m using at the moment is to edit the spark interpreter, remove 
the jar, re-add it, save the changes and then restart the interpreter. Clumsy 
but that’s better than restarting Zeppelin altogether.

Also, if anyone knows of a better way to reload code without restarting the 
interpreter then I’m open to suggestions:). Having to re-run lots of paragraphs 
after a restart is pretty tedious.

Thanks, Lucas.



Re: Jar dependencies are not reloaded when Spark interpreter is restarted?

2018-02-22 Thread Jhon Anderson Cardenas Diaz
When you say you change the dependency, do you mean only the content, or the
content and the version? I think the dependency should be reloaded only if its
version changes.

I do not think it's optimal to re-download the dependencies every time the
interpreter restarts.

On 22 Feb 2018 at 05:22, "Partridge, Lucas (GE Aviation)" <lucas.partri...@ge.com> wrote:

> I’m using Zeppelin 0.7.3 against a local standalone Spark ‘cluster’. I’ve
> added a Scala jar dependency to my Spark interpreter using Zeppelin’s UI. I
> thought if I changed my Scala code and updated the jar (using sbt outside
> of Zeppelin) then all I’d have to do is restart the interpreter for the new
> code to be picked up in Zeppelin in a regular scala paragraph.  However
> restarting the interpreter appears to have no effect – the new code is not
> detected. Is that expected behaviour or a bug?
>
>
>
> The workaround I’m using at the moment is to edit the spark interpreter,
> remove the jar, re-add it, save the changes and then restart the
> interpreter. Clumsy but that’s better than restarting Zeppelin altogether.
>
>
>
> Also, if anyone knows of a better way to reload code without restarting
> the interpreter then I’m open to suggestions:). Having to re-run lots of
> paragraphs after a restart is pretty tedious.
>
>
>
> Thanks, Lucas.
>
>
>


Jar dependencies are not reloaded when Spark interpreter is restarted?

2018-02-22 Thread Partridge, Lucas (GE Aviation)
I'm using Zeppelin 0.7.3 against a local standalone Spark 'cluster'. I've added 
a Scala jar dependency to my Spark interpreter using Zeppelin's UI. I thought 
if I changed my Scala code and updated the jar (using sbt outside of Zeppelin) 
then all I'd have to do is restart the interpreter for the new code to be 
picked up in Zeppelin in a regular scala paragraph.  However restarting the 
interpreter appears to have no effect - the new code is not detected. Is that 
expected behaviour or a bug?

The workaround I'm using at the moment is to edit the spark interpreter, remove 
the jar, re-add it, save the changes and then restart the interpreter. Clumsy 
but that's better than restarting Zeppelin altogether.

Also, if anyone knows of a better way to reload code without restarting the 
interpreter then I'm open to suggestions:). Having to re-run lots of paragraphs 
after a restart is pretty tedious.

Thanks, Lucas.
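
If your build still ships the dynamic dependency loader, one alternative worth
trying after an interpreter restart is loading the rebuilt jar through %spark.dep
(a sketch only: the path is illustrative, %spark.dep has to run before the first
%spark paragraph of the session, and the dep loader is deprecated in favour of
spark.jars / spark.jars.packages in newer releases):

    %spark.dep
    z.reset()                            // drop previously loaded artifacts
    z.load("/path/to/my-scala-lib.jar")  // local path to the freshly built jar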



Re: Custom Spark Interpreter?

2018-01-25 Thread Nick Moeckel
I am beginning work on extending the SparkInterpreter class right now- I
would be interested to hear more details about why this idea is not
straightforward. 

Thanks,
Nick



--
Sent from: 
http://apache-zeppelin-users-incubating-mailing-list.75479.x6.nabble.com/


Re: Custom Spark Interpreter?

2018-01-25 Thread ankit jain
Don't think that works, it just loads a blank page.

On Wed, Jan 24, 2018 at 11:06 PM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> But if you don't set it in interpreter setting, it would get spark ui url
> dynamically.
>
>
>
> ankit jain <ankitjain@gmail.com>于2018年1月25日周四 下午3:03写道:
>
>> That method is just reading it from a config defined in interpreter
>> settings called "uiWebUrl" which makes it configurable but still static.
>>
>> On Wed, Jan 24, 2018 at 10:58 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>>>
>>> IIRC, spark interpreter can get web ui url at runtime instead of static
>>> url.
>>>
>>> https://github.com/apache/zeppelin/blob/master/spark/
>>> src/main/java/org/apache/zeppelin/spark/SparkInterpreter.java#L940
>>>
>>>
>>> ankit jain <ankitjain@gmail.com>于2018年1月25日周四 下午2:55写道:
>>>
>>>> Issue with Spark UI when running on AWS EMR is it requires ssh
>>>> tunneling to be setup which requires private aws keys.
>>>>
>>>> Our team is building an analytic platform on zeppelin for end-users who
>>>> we obviously can't hand out these keys.
>>>>
>>>> Another issue is setting up correct port - Zeppelin tries to use 4040
>>>> for spark but during an interpreter restart 4040 could be used by an old
>>>> still stuck paragraph. In that case Zeppelin simply tries the next port and
>>>> so on.
>>>>
>>>> Static url for Spark can't handle this and hence requires some dynamic
>>>> implementation.
>>>>
>>>> PS - As I write this a lightbulb goes on in my head. I guess we could
>>>> also modify Zeppelin restart script to kill those rogue processes and make
>>>> sure 4040 is always available?
>>>>
>>>> Thanks
>>>> Ankit
>>>>
>>>> On Wed, Jan 24, 2018 at 6:10 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>>>
>>>>>
>>>>> If Spark interpreter didn't give you the correct spark UI, this should
>>>>> be a bug, you can file a ticket to fix it. Although you can make a custom
>>>>> interpreter by extending the current spark interpreter, it is not a 
>>>>> trivial
>>>>> work.
>>>>>
>>>>>
>>>>> ankit jain <ankitjain@gmail.com>于2018年1月25日周四 上午8:07写道:
>>>>>
>>>>>> Hi fellow Zeppelin users,
>>>>>> Has anyone tried to write a custom Spark Interpreter perhaps
>>>>>> extending from the one that ships currently with zeppelin -
>>>>>> spark/src/main/java/org/apache/zeppelin/spark/
>>>>>> *SparkInterpreter.java?*
>>>>>>
>>>>>> We are coming across cases where we need the interpreter to do
>>>>>> "more", eg change getSparkUIUrl() to directly load Yarn
>>>>>> ResourceManager/proxy/application_id123 rather than a fixed web ui.
>>>>>>
>>>>>> If we directly modify Zeppelin source code, upgrading to new zeppelin
>>>>>> versions will be a mess.
>>>>>>
>>>>>> Before we get too deep into it, wanted to get thoughts of the
>>>>>> community.
>>>>>>
>>>>>> What is a "clean" way to do such changes?
>>>>>>
>>>>>> --
>>>>>> Thanks & Regards,
>>>>>> Ankit.
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Ankit.
>>>>
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Ankit.
>>
>


-- 
Thanks & Regards,
Ankit.
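
On the port question raised in this thread: below is a rough sketch of the
kind of pre-restart cleanup Ankit mentions. It assumes a Linux host with lsof
available and 4040 as the port to free; both are placeholders to adapt, and
killing whatever holds the port is only safe if it really is a stuck
interpreter process.

#!/bin/sh
# Free Spark's default UI port before restarting the interpreter, so the new
# SparkContext does not fall back to 4041, 4042, ...
PORT=4040
PID=$(lsof -t -i TCP:"$PORT" -s TCP:LISTEN)
if [ -n "$PID" ]; then
  echo "Killing process $PID holding port $PORT"
  kill "$PID"
fi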


Re: Custom Spark Interpreter?

2018-01-24 Thread Jeff Zhang
But if you don't set it in the interpreter setting, it will get the Spark UI
URL dynamically.



ankit jain <ankitjain@gmail.com>于2018年1月25日周四 下午3:03写道:

> That method is just reading it from a config defined in interpreter
> settings called "uiWebUrl" which makes it configurable but still static.
>
> On Wed, Jan 24, 2018 at 10:58 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
>>
>> IIRC, spark interpreter can get web ui url at runtime instead of static
>> url.
>>
>>
>> https://github.com/apache/zeppelin/blob/master/spark/src/main/java/org/apache/zeppelin/spark/SparkInterpreter.java#L940
>>
>>
>> ankit jain <ankitjain@gmail.com>于2018年1月25日周四 下午2:55写道:
>>
>>> Issue with Spark UI when running on AWS EMR is it requires ssh tunneling
>>> to be setup which requires private aws keys.
>>>
>>> Our team is building an analytic platform on zeppelin for end-users who
>>> we obviously can't hand out these keys.
>>>
>>> Another issue is setting up correct port - Zeppelin tries to use 4040
>>> for spark but during an interpreter restart 4040 could be used by an old
>>> still stuck paragraph. In that case Zeppelin simply tries the next port and
>>> so on.
>>>
>>> Static url for Spark can't handle this and hence requires some dynamic
>>> implementation.
>>>
>>> PS - As I write this a lightbulb goes on in my head. I guess we could
>>> also modify Zeppelin restart script to kill those rogue processes and make
>>> sure 4040 is always available?
>>>
>>> Thanks
>>> Ankit
>>>
>>> On Wed, Jan 24, 2018 at 6:10 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>>
>>>>
>>>> If Spark interpreter didn't give you the correct spark UI, this should
>>>> be a bug, you can file a ticket to fix it. Although you can make a custom
>>>> interpreter by extending the current spark interpreter, it is not a trivial
>>>> work.
>>>>
>>>>
>>>> ankit jain <ankitjain@gmail.com>于2018年1月25日周四 上午8:07写道:
>>>>
>>>>> Hi fellow Zeppelin users,
>>>>> Has anyone tried to write a custom Spark Interpreter perhaps extending
>>>>> from the one that ships currently with zeppelin -
>>>>> spark/src/main/java/org/apache/zeppelin/spark/*SparkInterpreter.java?*
>>>>>
>>>>> We are coming across cases where we need the interpreter to do "more",
>>>>> eg change getSparkUIUrl() to directly load Yarn
>>>>> ResourceManager/proxy/application_id123 rather than a fixed web ui.
>>>>>
>>>>> If we directly modify Zeppelin source code, upgrading to new zeppelin
>>>>> versions will be a mess.
>>>>>
>>>>> Before we get too deep into it, wanted to get thoughts of the
>>>>> community.
>>>>>
>>>>> What is a "clean" way to do such changes?
>>>>>
>>>>> --
>>>>> Thanks & Regards,
>>>>> Ankit.
>>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks & Regards,
>>> Ankit.
>>>
>>
>
>
> --
> Thanks & Regards,
> Ankit.
>


Re: Custom Spark Interpreter?

2018-01-24 Thread ankit jain
That method is just reading it from a config defined in interpreter
settings called "uiWebUrl" which makes it configurable but still static.

On Wed, Jan 24, 2018 at 10:58 PM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> IIRC, spark interpreter can get web ui url at runtime instead of static
> url.
>
> https://github.com/apache/zeppelin/blob/master/spark/
> src/main/java/org/apache/zeppelin/spark/SparkInterpreter.java#L940
>
>
> ankit jain <ankitjain@gmail.com>于2018年1月25日周四 下午2:55写道:
>
>> Issue with Spark UI when running on AWS EMR is it requires ssh tunneling
>> to be setup which requires private aws keys.
>>
>> Our team is building an analytic platform on zeppelin for end-users who we
>> obviously can't hand out these keys.
>>
>> Another issue is setting up correct port - Zeppelin tries to use 4040 for
>> spark but during an interpreter restart 4040 could be used by an old still
>> stuck paragraph. In that case Zeppelin simply tries the next port and so on.
>>
>> Static url for Spark can't handle this and hence requires some dynamic
>> implementation.
>>
>> PS - As I write this a lightbulb goes on in my head. I guess we could
>> also modify Zeppelin restart script to kill those rogue processes and make
>> sure 4040 is always available?
>>
>> Thanks
>> Ankit
>>
>> On Wed, Jan 24, 2018 at 6:10 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>>>
>>> If Spark interpreter didn't give you the correct spark UI, this should
>>> be a bug, you can file a ticket to fix it. Although you can make a custom
>>> interpreter by extending the current spark interpreter, it is not a trivial
>>> work.
>>>
>>>
>>> ankit jain <ankitjain@gmail.com>于2018年1月25日周四 上午8:07写道:
>>>
>>>> Hi fellow Zeppelin users,
>>>> Has anyone tried to write a custom Spark Interpreter perhaps extending
>>>> from the one that ships currently with zeppelin -
>>>> spark/src/main/java/org/apache/zeppelin/spark/*SparkInterpreter.java?*
>>>>
>>>> We are coming across cases where we need the interpreter to do "more",
>>>> eg change getSparkUIUrl() to directly load Yarn 
>>>> ResourceManager/proxy/application_id123
>>>> rather than a fixed web ui.
>>>>
>>>> If we directly modify Zeppelin source code, upgrading to new zeppelin
>>>> versions will be a mess.
>>>>
>>>> Before we get too deep into it, wanted to get thoughts of the community.
>>>>
>>>> What is a "clean" way to do such changes?
>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Ankit.
>>>>
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Ankit.
>>
>


-- 
Thanks & Regards,
Ankit.


Re: How are user jar conflicts resolved in the Spark interpreter?

2017-11-15 Thread Jeff Zhang
You can try the 'isolated per user' mode, which would create one JVM for each
user.
Check this link for details.
https://zeppelin.apache.org/docs/0.8.0-SNAPSHOT/usage/interpreter/interpreter_binding_mode.html



Serega Sheypak <serega.shey...@gmail.com>于2017年11月16日周四 上午5:42写道:

> Hi zeppelin users!
> I have the question about dependencies users are using while running
> notebooks using spark interpreter.
>
> Imagine I have configured spark interpreter.
>
> Two users write their spark notebooks.
> the first user does
>
> z.load("com:best-it-company:0.1")
>
>
> the second one user adds to his notebook:
>
> z.load("com:best-it-company:0.2")
>
> Then they start to execute two notebooks concurrently.
> What will happen to dependencies?
> They have same classes... will spark isolate 0.1 version from 0.2 version
> somehow?
>


How are user jar conflicts resolved in the Spark interpreter?

2017-11-15 Thread Serega Sheypak
Hi zeppelin users!
I have a question about the dependencies users are using while running
notebooks with the Spark interpreter.

Imagine I have configured the Spark interpreter.

Two users write their spark notebooks.
the first user does

z.load("com:best-it-company:0.1")


the second user adds to his notebook:

z.load("com:best-it-company:0.2")

Then they start to execute two notebooks concurrently.
What will happen to dependencies?
They have the same classes... will Spark isolate version 0.1 from version 0.2
somehow?


Re: Configure spark interpreter setting from environment variables

2017-09-27 Thread benoitdr
That is working.
Thanks a lot


Re: Configure spark interpreter setting from environment variables

2017-09-27 Thread Jeff Zhang
Unfortunately it is packaged in the Spark interpreter jar, but you can get
it from the source code.


Benoit Drooghaag <benoit.droogh...@skynet.be>于2017年9月27日周三 下午5:11写道:

> Thanks for your quick feedback.
> There is no "interpreter-setting.json" in zeppelin-0.7.3-bin-all.tgz.
> Can you tell me more ?
> I'm currently building a docker image for Zeppelin to be integrated with a
> spark cluster.
> I cannot rely on users to set interpreter settings.
>
> Thanks
>
> On 27 September 2017 at 10:56, Jeff Zhang <zjf...@gmail.com> wrote:
>
>>
>> Set interpreter setting is one time effort, it should not be inconvenient
>> for users. But if you are a zeppelin vendor and want to customize zeppelin,
>> you can edit interpreter-setting.json of spark interpreter and copy it into
>> $ZEPPELIN_HOME/interpreter/spark
>>
>>
>> benoit.droogh...@gmail.com <benoit.droogh...@gmail.com>于2017年9月27日周三
>> 下午4:49写道:
>>
>>> Hi all,
>>>
>>> Is there a way to configure arbitrary spark interpreter settings via
>>> environment variables ?
>>> For example, I'd like to set the "spark.ui.reverseProxy" setting to
>>> "true".
>>> For the moment, I can only do it manually via the Zeppelin UI, that is
>>> working as expected.
>>> But I'd like to have this property set automatically at startup.
>>>
>>> Can anyone advise on how to do that, or guide me to the relevant doc?
>>>
>>> Thanks,
>>> Benoit
>>>
>>
>


Re: Configure spark interpreter setting from environment variables

2017-09-27 Thread Benoit Drooghaag
Thanks for your quick feedback.
There is no "interpreter-setting.json" in zeppelin-0.7.3-bin-all.tgz.
Can you tell me more?
I'm currently building a docker image for Zeppelin to be integrated with a
spark cluster.
I cannot rely on users to set interpreter settings.

Thanks

On 27 September 2017 at 10:56, Jeff Zhang <zjf...@gmail.com> wrote:

>
> Set interpreter setting is one time effort, it should not be inconvenient
> for users. But if you are a zeppelin vendor and want to customize zeppelin,
> you can edit interpreter-setting.json of spark interpreter and copy it into
> $ZEPPELIN_HOME/interpreter/spark
>
>
> benoit.droogh...@gmail.com <benoit.droogh...@gmail.com>于2017年9月27日周三
> 下午4:49写道:
>
>> Hi all,
>>
>> Is there a way to configure arbitrary spark interpreter settings via
>> environment variables ?
>> For example, I'd like to set the "spark.ui.reverseProxy" setting to
>> "true".
>> For the moment, I can only do it manually via the Zeppelin UI, that is
>> working as expected.
>> But I'd like to have this property set automatically at startup.
>>
>> Can anyone advise on how to do that, or guide me to the relevant doc?
>>
>> Thanks,
>> Benoit
>>
>


Re: Configure spark interpreter setting from environment variables

2017-09-27 Thread Jeff Zhang
Setting up an interpreter is a one-time effort, so it should not be inconvenient
for users. But if you are a Zeppelin vendor and want to customize Zeppelin,
you can edit the interpreter-setting.json of the Spark interpreter and copy it into
$ZEPPELIN_HOME/interpreter/spark


benoit.droogh...@gmail.com <benoit.droogh...@gmail.com>于2017年9月27日周三
下午4:49写道:

> Hi all,
>
> Is there a way to configure arbitrary spark interpreter settings via
> environment variables ?
> For example, I'd like to set the "spark.ui.reverseProxy" setting to "true".
> For the moment, I can only do it manually via the Zeppelin UI, that is
> working as expected.
> But I'd like to have this property set automatically at startup.
>
> Can anyone advise on how to do that, or guide me to the relevant doc?
>
> Thanks,
> Benoit
>
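
For Spark properties that spark-submit understands, there is also a route that
stays entirely in environment variables, which may be closer to what the
original question asks for. A minimal sketch of conf/zeppelin-env.sh, assuming
SPARK_HOME is set so the Spark interpreter is launched through spark-submit;
the path is a placeholder and spark.ui.reverseProxy is simply the property
discussed in this thread:

export SPARK_HOME=/opt/spark
# Extra options passed to spark-submit when the Spark interpreter starts
export SPARK_SUBMIT_OPTIONS="--conf spark.ui.reverseProxy=true"

Zeppelin-specific properties (e.g. zeppelin.spark.*) are interpreter
properties rather than spark-submit options, so for those the
interpreter-setting.json route described above is still the one to use.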


Re: Configuring Zeppelin spark interpreter to work with different hadoop clusters

2017-06-30 Thread Jeff Zhang
HADOOP_CONF_DIR in zeppelin-env.sh affects the whole Zeppelin
instance, while defining it in an interpreter setting affects only that
interpreter.

Jeff Zhang <zjf...@gmail.com>于2017年7月1日周六 上午7:26写道:

>
> HADOOP_CONF_DIR would affect the whole zeppelin instance. and define it
> interpreter setting would affect that interpreter.
>
> And all the capitalized property name would be taken as env variable.
>
> Serega Sheypak <serega.shey...@gmail.com>于2017年7月1日周六 上午3:20写道:
>
>> hi, thanks for your reply. How should I set this variable?
>> I'm looking at Spark interpreter config UI. It doesn't allow me to set
>> env variable.
>>
>> https://zeppelin.apache.org/docs/latest/interpreter/spark.html#1-export-spark_home
>> tells that HADOOP_CONF_DIR should be set once per whole Zeppelin
>> instance.
>>
>> What do I miss?
>> Thanks!
>>
>> 2017-06-30 16:43 GMT+02:00 Jeff Zhang <zjf...@gmail.com>:
>>
>>>
>>> Right, create three spark interpreters for your 3 yarn cluster.
>>>
>>>
>>>
>>> Serega Sheypak <serega.shey...@gmail.com>于2017年6月30日周五 下午10:33写道:
>>>
>>>> Hi, thanks for your reply!
>>>> What do you mean by that?
>>>> I can have only one env variable HADOOP_CONF_DIR...
>>>> And how can user pick which env to run?
>>>>
>>>> Or you mean I have to create three Spark interpreters and each of them
>>>> would have it's own HADOOP_CONF_DIR pointed to single cluster config?
>>>>
>>>> 2017-06-30 16:21 GMT+02:00 Jeff Zhang <zjf...@gmail.com>:
>>>>
>>>>>
>>>>> Try set HADOOP_CONF_DIR for each yarn conf in interpreter setting.
>>>>>
>>>>> Serega Sheypak <serega.shey...@gmail.com>于2017年6月30日周五 下午10:11写道:
>>>>>
>>>>>> Hi I have several different hadoop clusters, each of them has it's
>>>>>> own YARN.
>>>>>> Is it possible to configure single Zeppelin instance to work with
>>>>>> different clusters?
>>>>>> I want to run spark on cluster A if data is there. Right now my
>>>>>> Zeppelin runs on single cluster and it sucks data from remote clusters
>>>>>> which is inefficient. Zeppelin can easily access any HDFS cluster, but 
>>>>>> what
>>>>>> about YARN?
>>>>>>
>>>>>> What are the correct approaches to solve the problem?
>>>>>>
>>>>>
>>>>
>>
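
To make this concrete, one possible layout is sketched below; interpreter
names and paths are placeholders. The point from this thread is that a
capitalized property defined in an interpreter setting is exported as an
environment variable for that interpreter's process, so each Spark interpreter
can point at a different cluster's configuration directory.

  Interpreter "spark_cluster_a" (group: spark)
    master            yarn-client
    HADOOP_CONF_DIR   /etc/hadoop/conf.cluster-a

  Interpreter "spark_cluster_b" (group: spark)
    master            yarn-client
    HADOOP_CONF_DIR   /etc/hadoop/conf.cluster-b

A notebook is then bound to spark_cluster_a or spark_cluster_b depending on
where its data lives.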


Re: Configuring Zeppelin spark interpreter to work with different hadoop clusters

2017-06-30 Thread Serega Sheypak
Hi, thanks for your reply. How should I set this variable?
I'm looking at the Spark interpreter config UI. It doesn't allow me to set an env
variable.
https://zeppelin.apache.org/docs/latest/interpreter/spark.html#1-export-spark_home
says that HADOOP_CONF_DIR should be set once for the whole Zeppelin instance.

What do I miss?
Thanks!

2017-06-30 16:43 GMT+02:00 Jeff Zhang <zjf...@gmail.com>:

>
> Right, create three spark interpreters for your 3 yarn cluster.
>
>
>
> Serega Sheypak <serega.shey...@gmail.com>于2017年6月30日周五 下午10:33写道:
>
>> Hi, thanks for your reply!
>> What do you mean by that?
>> I can have only one env variable HADOOP_CONF_DIR...
>> And how can user pick which env to run?
>>
>> Or you mean I have to create three Spark interpreters and each of them
>> would have it's own HADOOP_CONF_DIR pointed to single cluster config?
>>
>> 2017-06-30 16:21 GMT+02:00 Jeff Zhang <zjf...@gmail.com>:
>>
>>>
>>> Try set HADOOP_CONF_DIR for each yarn conf in interpreter setting.
>>>
>>> Serega Sheypak <serega.shey...@gmail.com>于2017年6月30日周五 下午10:11写道:
>>>
>>>> Hi I have several different hadoop clusters, each of them has it's own
>>>> YARN.
>>>> Is it possible to configure single Zeppelin instance to work with
>>>> different clusters?
>>>> I want to run spark on cluster A if data is there. Right now my
>>>> Zeppelin runs on single cluster and it sucks data from remote clusters
>>>> which is inefficient. Zeppelin can easily access any HDFS cluster, but what
>>>> about YARN?
>>>>
>>>> What are the correct approaches to solve the problem?
>>>>
>>>
>>


Re: Configuring Zeppelin spark interpreter to work with different hadoop clusters

2017-06-30 Thread Jeff Zhang
Right, create three Spark interpreters for your 3 YARN clusters.



Serega Sheypak 于2017年6月30日周五 下午10:33写道:

> Hi, thanks for your reply!
> What do you mean by that?
> I can have only one env variable HADOOP_CONF_DIR...
> And how can user pick which env to run?
>
> Or you mean I have to create three Spark interpreters and each of them
> would have it's own HADOOP_CONF_DIR pointed to single cluster config?
>
> 2017-06-30 16:21 GMT+02:00 Jeff Zhang :
>
>>
>> Try set HADOOP_CONF_DIR for each yarn conf in interpreter setting.
>>
>> Serega Sheypak 于2017年6月30日周五 下午10:11写道:
>>
>>> Hi I have several different hadoop clusters, each of them has it's own
>>> YARN.
>>> Is it possible to configure single Zeppelin instance to work with
>>> different clusters?
>>> I want to run spark on cluster A if data is there. Right now my Zeppelin
>>> runs on single cluster and it sucks data from remote clusters which is
>>> inefficient. Zeppelin can easily access any HDFS cluster, but what about
>>> YARN?
>>>
>>> What are the correct approaches to solve the problem?
>>>
>>
>


Re: Configuring Zeppelin spark interpreter to work with different hadoop clusters

2017-06-30 Thread Serega Sheypak
Hi, thanks for your reply!
What do you mean by that?
I can have only one env variable HADOOP_CONF_DIR...
And how can a user pick which env to run?

Or do you mean I have to create three Spark interpreters, and each of them
would have its own HADOOP_CONF_DIR pointed to a single cluster config?

2017-06-30 16:21 GMT+02:00 Jeff Zhang :

>
> Try set HADOOP_CONF_DIR for each yarn conf in interpreter setting.
>
> Serega Sheypak 于2017年6月30日周五 下午10:11写道:
>
>> Hi I have several different hadoop clusters, each of them has it's own
>> YARN.
>> Is it possible to configure single Zeppelin instance to work with
>> different clusters?
>> I want to run spark on cluster A if data is there. Right now my Zeppelin
>> runs on single cluster and it sucks data from remote clusters which is
>> inefficient. Zeppelin can easily access any HDFS cluster, but what about
>> YARN?
>>
>> What are the correct approaches to solve the problem?
>>
>


Re: java.lang.NullPointerException on adding local jar as dependency to the spark interpreter

2017-05-09 Thread Jongyoul Lee
Can you add your spark interpreter's log file?

On Sat, May 6, 2017 at 12:53 AM, shyla deshpande 
wrote:

> Also, my local jar file that I want to add as dependency is a fat jar with
> dependencies.  Nothing works after I add my local fat jar, I get 
> *java.lang.NullPointerException
> for everything. Please help*
>
> On Thu, May 4, 2017 at 10:18 PM, shyla deshpande  > wrote:
>
>> Adding the dependency by filling groupId:artifactId:version works good.
> But when I add a local jar file as the artifact, I get
>> *ERROR java.lang.NullPointerException*. I see the local jar file being
>> added to local-repo, but I get the ERROR.
>>
>> Please help.
>>
>>
>


-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net


Re: java.lang.NullPointerException on adding local jar as dependency to the spark interpreter

2017-05-05 Thread shyla deshpande
Also, my local jar file that I want to add as a dependency is a fat jar with
dependencies. Nothing works after I add my local fat jar, I get
*java.lang.NullPointerException
for everything. Please help*

On Thu, May 4, 2017 at 10:18 PM, shyla deshpande 
wrote:

> Adding the dependency by filling groupId:artifactId:version works good.
> But when I add a local jar file as the artifact, I get
> *ERROR java.lang.NullPointerException*. I see the local jar file being
> added to local-repo, but I get the ERROR.
>
> Please help.
>
>
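
While this is being debugged, one workaround that sometimes helps is to
install the fat jar into the local Maven repository and reference it by
coordinates, since the groupId:artifactId:version route is reported to work
above. A sketch, with the path and all coordinates as placeholders:

mvn install:install-file \
  -Dfile=/path/to/my-fat-assembly.jar \
  -DgroupId=com.example \
  -DartifactId=my-fat-assembly \
  -Dversion=0.1.0 \
  -Dpackaging=jar

Afterwards, com.example:my-fat-assembly:0.1.0 can be added as the artifact in
the Spark interpreter's dependency list instead of the raw file path.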


Re: Preconfigure Spark interpreter

2017-04-22 Thread Paul Brenner
Whenever I’ve wanted to do this (preconfigure an interpreter using 
interpreter.json instead of one by one adding each config into the webui) my 
process was

First create an interpreter in the webUI and enter all my configs into that 
interpreter via the webUI

In the webUI again,  create the next interpreter that I want to preconfigure so 
that zeppelin puts some skeleton code in the interpreter.json, creates an 
interpreter ID, and I assume creates anything else that might be relevant?

Stop Zeppelin

Open interpreter.json and carefully copy the relevant contents of the first 
interpreter section into the second interpreter section.

Restart zeppelin

Not sure which of those steps are necessary or might be excessive but it works 
for me. 
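
For illustration, the part of conf/interpreter.json these steps end up editing
looks roughly like the fragment below. This reflects a 0.7.x-era layout; field
names and the setting ID vary between Zeppelin versions, and every value here
is a placeholder rather than a copy of a real configuration. Repositories
added through the UI land in a separate repositories section of the same file.

"2CXXXXXXX": {
  "id": "2CXXXXXXX",
  "name": "spark_custom",
  "group": "spark",
  "properties": {
    "master": "yarn-client",
    "spark.executor.memory": "4g"
  },
  "dependencies": [
    {
      "groupArtifactVersion": "com.example:my-lib:0.1.0",
      "local": false
    }
  ]
}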

Paul Brenner
Data Scientist
http://www.placeiq.com/

On Sat, Apr 22, 2017 at 3:13 PM Serega Sheypak <serega.shey...@gmail.com> wrote:


Aha, thanks. I'm building Zeppelin from source, so I can put my custom settings 
directly? 

BTW, why doesn't the interpreter-list file contain the spark interpreter?

2017-04-22 13:33 GMT+02:00 Fabian Böhnlein <fabian.boehnl...@gmail.com>:

Do it via the Ui once and you'll see how interpreter.json of the Zeppelin 
installation will be changed.

On Sat, Apr 22, 2017, 11:35 Serega Sheypak <serega.shey...@gmail.com> wrote:

Hi, I need to pre-configure spark interpreter with my own artifacts and 
internal repositories. How can I do it?

Re: Preconfigure Spark interpreter

2017-04-22 Thread Serega Sheypak
Aha, thanks. I'm building Zeppelin from source, so I can put my custom
settings directly?

BTW, why doesn't the interpreter-list file contain the spark interpreter?

2017-04-22 13:33 GMT+02:00 Fabian Böhnlein <fabian.boehnl...@gmail.com>:

> Do it via the Ui once and you'll see how interpreter.json of the Zeppelin
> installation will be changed.
>
> On Sat, Apr 22, 2017, 11:35 Serega Sheypak <serega.shey...@gmail.com>
> wrote:
>
>> Hi, I need to pre-configure spark interpreter with my own artifacts and
>> internal repositories. How can I do it?
>>
>


Re: Preconfigure Spark interpreter

2017-04-22 Thread Fabian Böhnlein
Do it via the Ui once and you'll see how interpreter.json of the Zeppelin
installation will be changed.

On Sat, Apr 22, 2017, 11:35 Serega Sheypak <serega.shey...@gmail.com> wrote:

> Hi, I need to pre-configure spark interpreter with my own artifacts and
> internal repositories. How can I do it?
>


Preconfigure Spark interpreter

2017-04-22 Thread Serega Sheypak
Hi, I need to pre-configure spark interpreter with my own artifacts and
internal repositories. How can I do it?


Re: Spark Interpreter: Change default scheduler pool

2017-04-17 Thread Fabian Böhnlein
Hi moon,

exactly, thanks for the pointer.

Added the issue: https://issues.apache.org/jira/browse/ZEPPELIN-2413

Best,
Fabian


On Tue, 28 Mar 2017 at 15:48 moon soo Lee  wrote:

> Hi Fabian,
>
> Thanks for sharing the issue.
> SparkSqlInterpreter sets the scheduler to "fair" depending on an interpreter
> property [1]. I think we can do something similar for SparkInterpreter.
> Do you mind filing a new JIRA issue for it?
>
> Regards,
> moon
>
> [1]
> https://github.com/apache/zeppelin/blob/0e1964877654c56c72473ad07dac1de6f9646816/spark/src/main/java/org/apache/zeppelin/spark/SparkSqlInterpreter.java#L98
>
>
> On Tue, Mar 28, 2017 at 5:24 AM Fabian Böhnlein <
> fabian.boehnl...@gmail.com> wrote:
>
>> Hi all,
>>
>> how can I change (globally, for Zeppelin) the default scheduler pool
>> which SparkInterpreter submits jobs to. Currently all jobs go into the pool
>> 'default' but I want them to go into the pool 'fair'.
>> We use "Per Note" and "scoped" processes for best resource sharing.
>>
>> "spark.scheduler.pool"="fair" in Interpreter Settings does not work,
>> should it?
>>
>> What works is
>> sc.setLocalProperty("spark.scheduler.pool","fair")
>> but it's required in every *note* (not just notebook) since it's on
>> thread level.
>>
>> Is there a possibility to globally/per notebook set the 'fair' pool as
>> the default pool?
>>
>> Zeppelin brings two (hardcoded?) scheduler pools 'default' and 'fair'.
>> Between them, the scheduling is FAIR. 'default' is FIFO, 'fair' is FAIR.
>>
>> This is awesome and together with dynamicAllocation allows for super
>> flexible usage for multiple users but above behavior is a bit complicated.
>>
>> Thanks,
>> Fabian
>>
>>
>>
>>
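
One Spark-level detail that can help when reasoning about this: pool behaviour
itself is driven by Spark's fair-scheduler allocation file. Rather than moving
every job into the 'fair' pool, the pool jobs land in by default can itself be
given FAIR scheduling. A sketch, assuming spark.scheduler.mode is FAIR and
spark.scheduler.allocation.file points at this file in the interpreter
settings; note this changes how jobs inside the default pool are scheduled,
not which pool Zeppelin submits to:

<?xml version="1.0"?>
<allocations>
  <!-- give the default pool FAIR scheduling between the jobs inside it -->
  <pool name="default">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
</allocations>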


Re: "spark ui" button in spark interpreter does not show Spark web-ui

2017-03-13 Thread Hyung Sung Shim
Hello.
Thank you for sharing the problem.
Could you file a jira issue for this?

2017년 3월 13일 (월) 오후 3:18, Meethu Mathew 님이 작성:

> Hi,
>
> I have noticed the same problem
>
> Regards,
>
>
> Meethu Mathew
>
>
> On Mon, Mar 13, 2017 at 9:56 AM, Xiaohui Liu  wrote:
>
> Hi,
>
> We used 0.7.1-snapshot with our Mesos cluster, almost all our needed
> features (ldap login, notebook acl control, livy/pyspark/rspark/scala,
> etc.) work pretty well.
>
> But one thing does not work for us is the 'spark ui' button does not
> respond to user clicks. No errors on the browser side.
>
> Anyone has met similar issues? Any suggestions about where I should check?
>
> Regards
> Xiaohui
>
>
>


Re: "spark ui" button in spark interpreter does not show Spark web-ui

2017-03-13 Thread Meethu Mathew
Hi,

I have noticed the same problem

Regards,
Meethu Mathew


On Mon, Mar 13, 2017 at 9:56 AM, Xiaohui Liu  wrote:

> Hi,
>
> We used 0.7.1-snapshot with our Mesos cluster, almost all our needed
> features (ldap login, notebook acl control, livy/pyspark/rspark/scala,
> etc.) work pretty well.
>
> But one thing does not work for us is the 'spark ui' button does not
> respond to user clicks. No errors on the browser side.
>
> Anyone has met similar issues? Any suggestions about where I should check?
>
> Regards
> Xiaohui
>


"spark ui" button in spark interpreter does not show Spark web-ui

2017-03-12 Thread Xiaohui Liu
Hi,

We used 0.7.1-snapshot with our Mesos cluster, almost all our needed
features (ldap login, notebook acl control, livy/pyspark/rspark/scala,
etc.) work pretty well.

But one thing that does not work for us is that the 'spark ui' button does not
respond to user clicks. No errors on the browser side.

Has anyone met similar issues? Any suggestions about where I should check?

Regards
Xiaohui


Re: Unable to connect with Spark Interpreter

2016-11-29 Thread Felix Cheung
Hmm, possibly with the classpath. These might be Windows-specific issues. We
probably need to debug to fix these.



From: Jan Botorek <jan.boto...@infor.com>
Sent: Tuesday, November 29, 2016 4:01:43 AM
To: users@zeppelin.apache.org
Subject: RE: Unable to connect with Spark Interpreter

Your last advice helped me to progress a little bit:

- I started the Spark interpreter manually:
  c:\zepp\\bin\interpreter.cmd, -d, c:\zepp\interpreter\spark\, -p, 61176, -l, c:\zepp\/local-repo/2C2ZNEH5W
  (I needed to add a '\' to the -d attribute and make the path shorter --> moved to c:\zepp)

- Then, in the Zeppelin web environment, I set up the Spark interpreter to
  "connect to existing process" (localhost/61176)

- After that, when I execute any command, this exception appears in the
  interpreter cmd window:

Exception in thread "pool-1-thread-2" java.lang.NoClassDefFoundError: scala/Option
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.createInterpreter(RemoteInterpreterServer.java:148)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$createInterpreter.getResult(RemoteInterpreterService.java:1409)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$createInterpreter.getResult(RemoteInterpreterService.java:1394)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: scala.Option
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 11 more

Is this of any help, please?

Regards,
Jan



From: Jan Botorek [mailto:jan.boto...@infor.com]
Sent: Tuesday, November 29, 2016 12:13 PM
To: users@zeppelin.apache.org
Subject: RE: Unable to connect with Spark Interpreter

I am sorry, but the local-repo directory is not present in the zeppelin
folder. I use the newest binary version from
https://zeppelin.apache.org/download.html.

Unfortunately, the local-repo folder doesn't exist in the 0.6 version
downloaded and built from GitHub either.


From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Tuesday, November 29, 2016 10:45 AM
To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>
Subject: Re: Unable to connect with Spark Interpreter

I still don't see much useful info. Could you try running the following
interpreter command directly?

c:\_libs\zeppelin-0.6.2-bin-all\\bin\interpreter.cmd  -d 
c:\_libs\zeppelin-0.6.2-bin-all\interpreter\spark -p 53099 -l 
c:\_libs\zeppelin-0.6.2-bin-all\/local-repo/2C2ZNEH5W


Jan Botorek <jan.boto...@infor.com<mailto:jan.boto...@infor.com>>于2016年11月29日周二 
下午5:26写道:
I attach the log file after debugging turned on.

From: Jeff Zhang [mailto:zjf...@gmail.com<mailto:zjf...@gmail.com>]
Sent: Tuesday, November 29, 2016 10:04 AM

To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>
Subject: Re: Unable to connect with Spark Interpreter

Then I guess the Spark process failed to start, so there are no logs for the
Spark interpreter.

Can you use the following log4j.properties? This log4j properties file prints
more error info for further diagnosis.

log4j.rootLogger = INFO, dailyfile

log4j.appender.stdout = org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n

log4j.appender.dailyfile.DatePattern=.yyyy-MM-dd
log4j.appender.dailyfile.Threshold = DEBUG
log4j.appender.dailyfile = org.apache.log4j.DailyRollingFileAppender
log4j.appender.dailyfile.File = ${zeppelin.log.file}
log4j.appender.dailyfile.layout = org.apache.log4j.PatternLayout
log4j.appender.dailyfile.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - 
%m%n


log4j.logger.org.apache.zeppelin.notebook.Paragraph=DEBUG
log4j.logger.org.apache.zeppelin.scheduler=DEBUG
log4j.logger.org.apache.zeppelin.livy=DEBUG
log4j.logger.org.apache.zeppelin.flink=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.remote=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer=DEBUG



Jan Botorek <jan.boto...@infor.com<mailt
