Zeppelin future / Roadmap

2018-11-05 Thread Ruslan Dautkhanov
With the Hortonworks and Cloudera merger, can somebody give insight into the
Zeppelin roadmap?

For well over a year, it seems Hortonworks has been the main driver in
committing Zeppelin changes (Jeff Zhang primarily - kudos for the great set of
improvements).

Will Cloudera+HW's Unity platform have both Zeppelin and CDSW? Or just one
of them?

Or will it be set aside, so that the Zeppelin community will have to step in
and take over streamlining the Zeppelin roadmap, maintaining commits, etc.?

Thank you for any insights!

We use open-source Zeppelin and want to understand this better to align our
own roadmaps for our internal Zeppelin users.


Thank you,
Ruslan Dautkhanov


Re: switching between python2 and 3 for %pyspark

2018-10-26 Thread Ruslan Dautkhanov
Thanks Jeff - yep, that's what we have in [1]; those are our current
interpreter settings. It doesn't work for some reason.
We're running a Zeppelin snapshot from ~May 2018 - has anything changed since
then?


Ruslan




[1]

LD_LIBRARY_PATH  /opt/cloudera/parcels/Anaconda3/lib
PATH
/usr/java/latest/bin:/opt/cloudera/parcels/Anaconda3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/rdautkha/bin
PYSPARK_DRIVER_PYTHON  /opt/cloudera/parcels/Anaconda3/bin/python
PYSPARK_PYTHON  /opt/cloudera/parcels/Anaconda3/bin/python
PYTHONHOME  /opt/cloudera/parcels/Anaconda3

spark.executorEnv.LD_LIBRARY_PATH  /opt/cloudera/parcels/Anaconda3/lib
spark.executorEnv.PYSPARK_PYTHON  /opt/cloudera/parcels/Anaconda3/bin/python
spark.pyspark.driver.python  /opt/cloudera/parcels/Anaconda3/bin/python
spark.pyspark.python  /opt/cloudera/parcels/Anaconda3/bin/python
spark.yarn.appMasterEnv.PYSPARK_PYTHON
/opt/cloudera/parcels/Anaconda3/bin/python


-- 
Ruslan Dautkhanov


On Fri, Oct 26, 2018 at 9:10 PM Jeff Zhang  wrote:

> Hi Ruslan,
>
> I believe you can just set PYSPARK_PYTHON in spark interpreter setting to
> switch between python2 and python3
>
>
>
> On Sat, Oct 27, 2018 at 2:26 AM, Ruslan Dautkhanov wrote:
>
>> I'd like to give users ability to switch between Python2 and Python3 for
>> their PySpark jobs.
>> Was somebody able to set up something like this, so they can switch
>> between python2 and python3 pyspark interpreters?
>>
>> For this experiment, created a new %py3spark interpreter, assigned to
>> spark interpreter group.
>>
>> Added following options there for %py3spark: [1]
>> /opt/cloudera/parcels/Anaconda3 is our Anaconda python3 home that's
>> available on all worker nodes and on zeppelin server too.
>>
>> For default %pyspark interpreter it's very similar to [1], except all
>> paths have "/opt/cloudera/parcels/Anaconda" instead of "
>> /opt/cloudera/parcels/Anaconda3".
>>
>> Nevertheless, zeppelin_ipythonxxx/ipython_server.py
>> seems catching environment variable from zeppelin-env.sh and not from
>> interpreter settings.
>>
>> Zeppelin documentation reads that all uppercase variables will be
>> treated as environment variables, so I assume it should overwrite what's
>> in zeppelin-env.sh, no?
>>
>> It seems environment variables at interpreter level are broken - notice
>> "pyspark" paragraph has "Anaconda3" and not "Anaconda" in PATH
>> (highlighted).
>>
>> [image: image.png]
>>
>>
>>
>> [1]
>>
>> LD_LIBRARY_PATH  /opt/cloudera/parcels/Anaconda3/lib
>> PATH
>> /usr/java/latest/bin:/opt/cloudera/parcels/Anaconda3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/rdautkha/bin
>> PYSPARK_DRIVER_PYTHON  /opt/cloudera/parcels/Anaconda3/bin/python
>> PYSPARK_PYTHON  /opt/cloudera/parcels/Anaconda3/bin/python
>> PYTHONHOME  /opt/cloudera/parcels/Anaconda3
>>
>> spark.executorEnv.LD_LIBRARY_PATH  /opt/cloudera/parcels/Anaconda3/lib
>> spark.executorEnv.PYSPARK_PYTHON
>> /opt/cloudera/parcels/Anaconda3/bin/python
>> spark.pyspark.driver.python  /opt/cloudera/parcels/Anaconda3/bin/python
>> spark.pyspark.python  /opt/cloudera/parcels/Anaconda3/bin/python
>> spark.yarn.appMasterEnv.PYSPARK_PYTHON
>> /opt/cloudera/parcels/Anaconda3/bin/python
>>
>> --
>> Ruslan Dautkhanov
>>
>


switching between python2 and 3 for %pyspark

2018-10-26 Thread Ruslan Dautkhanov
I'd like to give users the ability to switch between Python2 and Python3 for
their PySpark jobs.
Has anybody been able to set up something like this, so users can switch
between python2 and python3 pyspark interpreters?

For this experiment, I created a new %py3spark interpreter and assigned it to
the spark interpreter group.

I added the following options there for %py3spark: [1]
/opt/cloudera/parcels/Anaconda3 is our Anaconda python3 home, available on all
worker nodes and on the zeppelin server too.

For the default %pyspark interpreter it's very similar to [1], except all
paths have "/opt/cloudera/parcels/Anaconda" instead of
"/opt/cloudera/parcels/Anaconda3".

Nevertheless, zeppelin_ipythonxxx/ipython_server.py seems to pick up the
environment variables from zeppelin-env.sh and not from the interpreter
settings.

The Zeppelin documentation says that all-uppercase variables are treated as
environment variables, so I assume they should override what's in
zeppelin-env.sh, no?

It seems environment variables at the interpreter level are broken - notice
that the "pyspark" paragraph has "Anaconda3" and not "Anaconda" in PATH
(highlighted).

[image: image.png]
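
A quick way to double-check which python a paragraph actually picked up is to
run something like this in both interpreters and compare the output:

%py3spark
import sys, os
print(sys.executable)
print(os.environ.get('PYSPARK_PYTHON'))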



[1]

LD_LIBRARY_PATH  /opt/cloudera/parcels/Anaconda3/lib
PATH
/usr/java/latest/bin:/opt/cloudera/parcels/Anaconda3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/rdautkha/bin
PYSPARK_DRIVER_PYTHON  /opt/cloudera/parcels/Anaconda3/bin/python
PYSPARK_PYTHON  /opt/cloudera/parcels/Anaconda3/bin/python
PYTHONHOME  /opt/cloudera/parcels/Anaconda3

spark.executorEnv.LD_LIBRARY_PATH  /opt/cloudera/parcels/Anaconda3/lib
spark.executorEnv.PYSPARK_PYTHON  /opt/cloudera/parcels/Anaconda3/bin/python
spark.pyspark.driver.python  /opt/cloudera/parcels/Anaconda3/bin/python
spark.pyspark.python  /opt/cloudera/parcels/Anaconda3/bin/python
spark.yarn.appMasterEnv.PYSPARK_PYTHON
/opt/cloudera/parcels/Anaconda3/bin/python

-- 
Ruslan Dautkhanov


Re: How to make livy2.spark find jar

2018-10-26 Thread Ruslan Dautkhanov
Try adding ZEPPELIN_INTP_CLASSPATH_OVERRIDES, for example,

export ZEPPELIN_INTP_CLASSPATH_OVERRIDES=/etc/hive/conf:/var/lib/sqoop/ojdbc7.jar
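
If the jar also has to be visible to the Spark driver that Livy launches (and
not just to the Zeppelin interpreter JVM), another thing worth trying - I
haven't tested this myself - is setting it in the %livy interpreter settings,
something like:

livy.spark.jars  /path/readable/by/the/livy/server/ojdbc8.jar

(or an hdfs:// path, so the YARN containers can pick it up).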


-- 
Ruslan Dautkhanov


On Tue, Oct 23, 2018 at 9:40 PM Lian Jiang  wrote:

> Hi,
>
> I am trying to use oracle jdbc to read oracle database table. I have added
> below property in custom zeppelin-env:
>
> SPARK_SUBMIT_OPTIONS="--jars /my/path/to/ojdbc8.jar"
>
> But
>
> val df = spark.read.format("jdbc").option("url", "jdbc:oracle:thin:@
> (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.9.44.99)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=
> myservice.mydns.com)))").option("user","myuser").option("password","mypassword").option("driver",
> "oracle.jdbc.driver.OracleDriver").option("dbtable",
> "myuser.mytable").load()
>
> throws:
>
> java.lang.ClassNotFoundException: oracle.jdbc.driver.OracleDriver at
> scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at
> java.lang.ClassLoader.loadClass(ClassLoader.java:357) at
> org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:45)
> at
> org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:79)
> at
> org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:79)
> at scala.Option.foreach(Option.scala:257) at
> org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.(JDBCOptions.scala:79)
> at
> org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.(JDBCOptions.scala:35)
> at
> org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34)
> at
> org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)
> at
> org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227) at
> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
>
> How to make livy2.spark interpreter find ojdbc8.jar? Thanks.
>
>


Re: Build failure on 0.8.0 when using CDH hadoop

2018-10-15 Thread Ruslan Dautkhanov
Michael,

This is my build command for Cloudera:

mvn clean package -DskipTests -Pspark-2.2 -Phadoop-2.6 -Pvendor-repo
-Pscala-2.10 -Psparkr -pl
'!alluxio,!flink,!ignite,!lens,!cassandra,!bigquery,!scio' -e

It works okay with CDH. Including CDH 5.14 you mentioned.

We used to add -Dhadoop.version=2.6.0-cdh5.12.1 a while back, but it caused
some issues with more recent Zeppelin upstream.

I guess what Jeff was saying is that Zeppelin now shades all its dependencies.
See https://github.com/apache/zeppelin/pull/3170, which was committed last
month.
So I think you no longer even need to provide the hadoop-2.6 profile at all,
but I haven't tested that.
You can test it and let us know which way works for you.
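
That would be the same command as above with the -Phadoop-2.6 flag dropped,
i.e. (again, untested):

mvn clean package -DskipTests -Pspark-2.2 -Pvendor-repo
-Pscala-2.10 -Psparkr -pl
'!alluxio,!flink,!ignite,!lens,!cassandra,!bigquery,!scio' -e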

-- 
Ruslan Dautkhanov


On Mon, Oct 15, 2018 at 11:48 AM Michael Williams 
wrote:

> I understand it's possible to build and run Zeppelin using plain Hadoop,
> but we are always running on Cloudera clusters so it makes sense for us to
> build against Cloudera's Hadoop dist. Or would you recommend building using
> plain Hadoop, not as a workaround but for some other reason?
>
> Since we are running on Cloudera, it would be nice to just have one
> version of the Hadoop jars floating around.
>
> On Fri, Oct 12, 2018 at 5:33 PM Jeff Zhang  wrote:
>
>> You don't need to build with CDH to run zeppelin under CDH, you can just
>> run the following command to run zeppelin under CDH
>>
>> mvn clean package -DskipTests
>>
>>
>> On Sat, Oct 13, 2018 at 7:38 AM, Michael Williams wrote:
>>
>>> Hey all,
>>>
>>> I'm hitting some dependency issues when trying to build Zeppelin 0.8.0
>>> with CDH-5.14.4. Here's the maven command I'm using:
>>>
>>> mvn clean install -DskipTests -Pbuild-distr -Pvendor-repo
>>>> -Dhadoop.version=2.6.0-cdh5.14.4 -Dcheckstyle.skip=true -Pr -Pspark-2.2
>>>> -Dspark.version=2.2.0 -Pscala-2.11
>>>>
>>>
>>> With that I run into some Jackson and Zookeeper issues, mainly with
>>> hadoop-client and hadoop-azure. I see that this has been an issue before
>>> and there was a PR to fix CDH compatibility issues for 0.8.0
>>> <https://github.com/apache/zeppelin/pull/2723>, but it looks like those
>>> changes have been overwritten at some point.
>>>
>>> Wondering if this is a known issue or are CDH builds supposed to be
>>> working? Was that commit to fix this overwritten by accident or not? In the
>>> case that is should be working, is there something that anyone can see
>>> wrong with what I am doing?
>>>
>>> Thanks,
>>> Michael
>>>
>>


Re: Zeppelin with JDBC interpreter, need queryBuilder help

2018-10-08 Thread Ruslan Dautkhanov
It wasn't committed. It seems it got closed due to lack of reviews...

You can ping @mebelousov on github to reopen it, and @zjffdu or somebody else
from the committers to have it reviewed.

I see some other very good Zeppelin improvements / PRs lose committer
attention and eventually vanish.

-- 
Ruslan Dautkhanov


On Fri, Oct 5, 2018 at 3:50 PM anirban chatterjee <
anirban.chatter...@gmail.com> wrote:

> this is exactly what I am looking for!
> When will this be PR be committed?
> also, is the autosense triggered by some shortcut command or always on?
> Thanks,
> Anirban
>
> On Fri, Oct 5, 2018 at 2:01 PM Ruslan Dautkhanov 
> wrote:
>
>> Something like this is available on master, I think.
>>
>> How this works you could see on
>> https://github.com/apache/zeppelin/pull/2972
>> (not sure why that particular PR wasn't committed though)
>>
>> --
>> Ruslan Dautkhanov
>>
>>
>> On Fri, Oct 5, 2018 at 2:43 PM anirban chatterjee <
>> anirban.chatter...@gmail.com> wrote:
>>
>>> Hi, Can one get query builder type support (intellisense  capability
>>> where one gets suggestions on available tables, columns, fields etc
>>> contextually) with the JDBC interpreter? We are using Zeppelin with Presto,
>>> and would like to get suggestions of table's fields (not just keywords that
>>> one gets through Ctrl+.). Any suggestions?
>>> Thanks,
>>> Anirban
>>>
>>>


Re: Zeppelin with JDBC interpreter, need queryBuilder help

2018-10-05 Thread Ruslan Dautkhanov
Something like this is available on master, I think.

You can see how this works at
https://github.com/apache/zeppelin/pull/2972
(not sure why that particular PR wasn't committed though)

-- 
Ruslan Dautkhanov


On Fri, Oct 5, 2018 at 2:43 PM anirban chatterjee <
anirban.chatter...@gmail.com> wrote:

> Hi, Can one get query builder type support (intellisense  capability where
> one gets suggestions on available tables, columns, fields etc contextually)
> with the JDBC interpreter? We are using Zeppelin with Presto, and would
> like to get suggestions of table's fields (not just keywords that one gets
> through Ctrl+.). Any suggestions?
> Thanks,
> Anirban
>
>


Re: [DISCUSS] ZEPPELIN-2619. Save note in [Title].zpln instead of [NOTEID]/note.json

2018-08-13 Thread Ruslan Dautkhanov
Thanks for bringing this up for discussion. My 2 cents below.

I am with Maksim and Felix on the concerns about special characters now being
allowed in notebook names, and also about different charsets. Russian, for
example, most commonly uses the iso-8859-5, koi8-r/u, and windows-1251
charsets. This seems like it will bring a whole new set of localization
issues.

If I understand correctly, this is being done solely to speed up loading the
list of notebooks? What if the list of notebook names, their ids, folder
structure, etc. were *cached* in a separate small json file? Or perhaps in a
small embedded key-value store, like www.mapdb.org? Just thinking out loud.
This would require a way to lazily re-sync the cache.
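
Even a single small index file might be enough - something like this (a
made-up format, just to illustrate the idea):

{
  "version": 1,
  "notes": [
    { "id": "2ABCDEFGH", "name": "folder/my note", "lastModified": 1534150000000 }
  ]
}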

Another way to speed up json reads would be to force the "name" attribute to
the top of the json document that's written to disk, then re-implement the
json file reader to read just the header of the file and do a partial json
parse (or, lacking other options, grab the "name" attribute from the file
header with a regex, for example).
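
A rough sketch of that fallback (illustration only, and in python rather than
the java the actual reader would use):

import re

def read_note_name(path, head_bytes=1024):
    # Read only the beginning of the note file and grab the "name" attribute,
    # assuming it was serialized near the top of the json document.
    with open(path, 'rb') as f:
        head = f.read(head_bytes).decode('utf-8', errors='replace')
    m = re.search(r'"name"\s*:\s*"([^"]*)"', head)
    return m.group(1) if m else None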

Back to filenames and charsets, I think the issue may be more complicated if
you store notebooks on a remote filesystem (nfs/samba etc.) - what if the
remote server and the local nfs client differ in their default filesystem
charsets?

Ideally all filesystems would use UTF-8, for example, but I am not certain
that's a good assumption to make. Also, exposing notebook names can bring some
other issues - I know some users occasionally add trailing/leading spaces,
etc.


On Mon, Aug 13, 2018 at 10:38 AM Belousov Maksim Eduardovich <
m.belou...@tinkoff.ru> wrote:

> The use of Russian and other specific letters in the note name is big
> advantage of Zeppelin. I would not like to give up this functionality.
>
> I support the idea about `zpln` file extension.
> The folder structure also sounds good.
>
> I'm afraid about non-latin symbols in folder and note name. And what about
> hieroglyphs?
>
> Apache Zeppelin may be the first to use Russian letters in file system in
> our company.
> I see a lot of risks to use non-latin symbols and a lot of issues to make
> new folder structure stable.
>
>
>
> --
> *From:* Jeff Zhang 
> *Sent:* 13 August 2018, 12:50
> *To:* users@zeppelin.apache.org
> *Subject:* Re: [DISCUSS] ZEPPELIN-2619. Save note in [Title].zpln instead of
> [NOTEID]/note.json
>
> >>> Do we need the note id in the file name at all? What’s wrong with
> just note_name.zpln?
> The reason I keep note id is because currently we use noteId to identify
> one note. e.g. we use note id in both websocket api and rest api. It is
> almost impossible to remove noteId for the current architecture. If we put
> note id into file content of note_name.zpln, then we have to read the note
> file every time, then we meet the issues I mentioned above again.
>
> >>> If the file content is json then why not use note_name.json instead
> of .zpln? That would make it easier for editors to know how to
> load/highlight the file contents.
> I am not strongly biased on *.zpln. But I think one purpose is to help
> third parties to identify zeppelin note properly. e.g. github can identify
> jupyter notebook (*.ipynb) and render it properly.
>
> >>> Is there any reason for not using *real* folders or directories for
> organising the notebooks rather than embedding the folder hierarchy in the
> names of the notebooks?  If someone wants to ‘move’ the notebooks to
> another folder they’d have to manually rename all the files/notebooks at
> present.  That’s not very user-friendly.
>
> Actually my proposal is to use real folders. What user see in zeppelin
> note menu is the actual notes folder structure. If they want to move the
> notebooks to another folder, they can change the folder name just like what
> user did in file system.
>
>
>
>
>
> On Mon, Aug 13, 2018 at 4:43 PM, Partridge, Lucas (GE Aviation) wrote:
>
>> Hi Jeff,
>>
>> I have some questions about this proposal (I can’t edit the design doc):
>>
>>
>>
>>1. Do we need the note id in the file name at all? What’s wrong with
>>just note_name.zpln?
>>
>>2. If the file content is json then why not use note_name.json
>>instead of .zpln? That would make it easier for editors to know how to
>>load/highlight the file contents.
>>
>>3. Is there any reason for not using *real* folders or directories
>>for organising the notebooks rather than embedding the folder hierarchy in
>>the names of the notebooks?  If someone wants to ‘move’ the notebooks to
>>another folder they’d have to manually rename all the files/notebooks at
>>present.  That’s not very user-friendly.
>>
>>
>>
>> Thanks, Lucas.
>>
>> *From:* Jeff Zhang 
>> *Sent:* 13 August 2018 09:06
>> *To:* users@zeppelin.apache.org
>> *Cc:* dev 
>> *Subject:* EXT: Re: [DISCUSS] ZEPPELIN-2619. Save note in [Title].zpln
>> instead of [NOTEID]/note.json
>>
>>
>>
>> In that case, zeppelin should fail to create note.
>>
>>
>>
>> Felix Cheung 

Re: [ANNOUNCE] Apache Zeppelin 0.8.0 released

2018-08-08 Thread Ruslan Dautkhanov
Thanks Jeff! Should there be a Zeppelin 0.8.1 release sometime soon with
all the fixes for issues that the users have faced in 0.8.0?

-- 
Ruslan Dautkhanov


On Mon, Jul 23, 2018 at 12:24 AM Jeff Zhang  wrote:

>
> Thanks Ruslan, I will fix it.
>
> On Mon, Jul 23, 2018 at 1:46 PM, Ruslan Dautkhanov wrote:
>
>> https://zeppelin.apache.org/ home page still reads
>> "WHAT'S NEW IN
>>  Apache Zeppelin 0.7"
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>>
>> On Fri, Jun 29, 2018 at 4:56 AM Spico Florin 
>> wrote:
>>
>>> Hi!
>>>   I tried to get the docker image for this version 0.8.0, but it seems
>>> that is not in the official docker hub repository:
>>> https://hub.docker.com/r/apache/zeppelin/tags/ there is no such as
>>> version 0.8.0
>>> Also, the commands
>>>  docker pull apache/zeppelin:0.8.0
>>> or
>>>
>>> docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:0.8.0
>>>
>>>
>>> fails with
>>> Error response from daemon: manifest for apache/zeppelin:0.8.0 not found
>>>
>>> Can you please check? Or how should I get this version for docker
>>> (please instruct).
>>>
>>> Thanks.
>>> Regards,
>>>  Florin
>>>
>>>
>>>
>>> On Fri, Jun 29, 2018 at 6:13 AM, Jongyoul Lee 
>>> wrote:
>>>
>>>> Great work!!
>>>>
>>>> On Fri, Jun 29, 2018 at 9:49 AM, Jeff Zhang  wrote:
>>>>
>>>>> Thanks Patrick, I have fixed the broken link.
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 29, 2018 at 7:13 AM, Patrick Maroney wrote:
>>>>>
>>>>> > Install guides:
>>>>> >
>>>>> > http://zeppelin.apache.org/docs/0.8.0/install/install.html
>>>>> > Not Found
>>>>> >
>>>>> > The requested URL /docs/0.8.0/install/install.html was not found on
>>>>> this
>>>>> > server.
>>>>> >
>>>>> >
>>>>> http://zeppelin.apache.org/docs/0.8.0/manual/interpreterinstallation.html
>>>>> >
>>>>> > Not Found
>>>>> >
>>>>> > The requested URL /docs/0.8.0/manual/interpreterinstallation.html
>>>>> was not
>>>>> > found on this server.
>>>>> >
>>>>> >
>>>>> > Patrick Maroney
>>>>> > Principal Engineer - Data Science & Analytics
>>>>> > Wapack Labs
>>>>> >
>>>>> >
>>>>> > On Jun 28, 2018, at 6:59 PM, Jianfeng (Jeff) Zhang <
>>>>> jzh...@hortonworks.com>
>>>>> > wrote:
>>>>> >
>>>>> > Hi Patrick,
>>>>> >
>>>>> > Which link is broken ? I can access all the links.
>>>>> >
>>>>> > Best Regard,
>>>>> > Jeff Zhang
>>>>> >
>>>>> >
>>>>> > From: Patrick Maroney 
>>>>> > Reply-To: 
>>>>> > Date: Friday, June 29, 2018 at 4:59 AM
>>>>> > To: 
>>>>> > Cc: dev 
>>>>> > Subject: Re: [ANNOUNCE] Apache Zeppelin 0.8.0 released
>>>>> >
>>>>> > Great work Team/Community!
>>>>> >
>>>>> > Links on the main download page are broken:
>>>>> >
>>>>> > http://zeppelin.apache.org/download.html
>>>>> >
>>>>> > ...at least the ones I need ;-)
>>>>> >
>>>>> > *Patrick Maroney*
>>>>>
>>>>> > Principal Engineer - Data Science & Analytics
>>>>> > Wapack Labs LLC
>>>>> >
>>>>> >
>>>>> > Public Key:
>>>>> http://pgp.mit.edu/pks/lookup?op=get=0x7C810C9769BD29AF
>>>>> >
>>>>> > On Jun 27, 2018, at 11:21 PM, Prabhjyot Singh <
>>>>> prabhjyotsi...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > Awesome! congratulations team.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu 28 Jun, 2018, 8:39 AM Taejun Kim,  wrote:
>>>>> >
>>>>> >> Awesome! Thanks for your great work :)
>>>>> >>
>>>>> >> On Thu, Jun 28, 2018 at 12:07 PM, Jeff Zhang wrote:
>>>>> >>
>>>>> >>> The Apache Zeppelin community is pleased to announce the
>>>>> availability of
>>>>> >>> the 0.8.0 release.
>>>>> >>>
>>>>> >>> Zeppelin is a collaborative data analytics and visualization tool
>>>>> for
>>>>> >>> distributed, general-purpose data processing system such as Apache
>>>>> Spark,
>>>>> >>> Apache Flink, etc.
>>>>> >>>
>>>>> >>> This is another major release after the last minor release 0.7.3.
>>>>> >>> The community put significant effort into improving Apache
>>>>> Zeppelin since
>>>>> >>> the last release. 122 contributors fixed totally 602 issues. Lots
>>>>> of
>>>>> >>> new features are introduced, such as inline configuration, ipython
>>>>> >>> interpreter, yarn-cluster mode support , interpreter lifecycle
>>>>> manager
>>>>> >>> and etc.
>>>>> >>>
>>>>> >>> We encourage you to download the latest release
>>>>> >>> fromhttp://zeppelin.apache.org/download.html
>>>>> >>>
>>>>> >>> Release note is available
>>>>> >>> athttp://zeppelin.apache.org/releases/zeppelin-release-0.8.0.html
>>>>> >>>
>>>>> >>> We welcome your help and feedback. For more information on the
>>>>> project
>>>>> >>> and
>>>>> >>> how to get involved, visit our website at
>>>>> http://zeppelin.apache.org/
>>>>> >>>
>>>>> >>> Thank you all users and contributors who have helped to improve
>>>>> Apache
>>>>> >>> Zeppelin.
>>>>> >>>
>>>>> >>> Regards,
>>>>> >>> The Apache Zeppelin community
>>>>> >>>
>>>>> >> --
>>>>> >> Taejun Kim
>>>>> >>
>>>>> >> Data Mining Lab.
>>>>> >> School of Electrical and Computer Engineering
>>>>> >> University of Seoul
>>>>> >>
>>>>> >
>>>>> >
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> 이종열, Jongyoul Lee, 李宗烈
>>>> http://madeng.net
>>>>
>>>
>>>


Re: Sometimes tab deletes text

2018-08-08 Thread Ruslan Dautkhanov
It was broken at the end of last year by this PR:
https://github.com/apache/zeppelin/pull/2624

I already filed a jira for this issue:
https://issues.apache.org/jira/browse/ZEPPELIN-3253

The Zeppelin build I made for our users has that change rolled back; see, for
example, https://github.com/apache/zeppelin/pull/2812/files


--
Ruslan Dautkhanov


On Wed, Aug 8, 2018 at 10:01 AM Paul Brenner  wrote:

> ok, I went ahead and opened
> https://issues.apache.org/jira/browse/ZEPPELIN-3692
>
> If I find a better way to reproduce, more details, or if anyone else
> chimes in, I'll add to the ticket. Sorry I'm not better able to contribute
> to fixing this.
>
>
> *Paul Brenner*
> SR. DATA SCIENTIST
> *(217) 390-3033 *

Re: [ANNOUNCE] Apache Zeppelin 0.8.0 released

2018-07-22 Thread Ruslan Dautkhanov
https://zeppelin.apache.org/ home page still reads
"WHAT'S NEW IN
 Apache Zeppelin 0.7"


-- 
Ruslan Dautkhanov


On Fri, Jun 29, 2018 at 4:56 AM Spico Florin  wrote:

> Hi!
>   I tried to get the docker image for this version 0.8.0, but it seems
> that is not in the official docker hub repository:
> https://hub.docker.com/r/apache/zeppelin/tags/ there is no such as
> version 0.8.0
> Also, the commands
>  docker pull apache/zeppelin:0.8.0
> or
>
> docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:0.8.0
>
>
> fails with
> Error response from daemon: manifest for apache/zeppelin:0.8.0 not found
>
> Can you please check? Or how should I get this version for docker (please
> instruct).
>
> Thanks.
> Regards,
>  Florin
>
>
>
> On Fri, Jun 29, 2018 at 6:13 AM, Jongyoul Lee  wrote:
>
>> Great work!!
>>
>> On Fri, Jun 29, 2018 at 9:49 AM, Jeff Zhang  wrote:
>>
>>> Thanks Patrick, I have fixed the broken link.
>>>
>>>
>>>
>>> On Fri, Jun 29, 2018 at 7:13 AM, Patrick Maroney wrote:
>>>
>>> > Install guides:
>>> >
>>> > http://zeppelin.apache.org/docs/0.8.0/install/install.html
>>> > Not Found
>>> >
>>> > The requested URL /docs/0.8.0/install/install.html was not found on
>>> this
>>> > server.
>>> >
>>> >
>>> http://zeppelin.apache.org/docs/0.8.0/manual/interpreterinstallation.html
>>> >
>>> > Not Found
>>> >
>>> > The requested URL /docs/0.8.0/manual/interpreterinstallation.html was
>>> not
>>> > found on this server.
>>> >
>>> >
>>> > Patrick Maroney
>>> > Principal Engineer - Data Science & Analytics
>>> > Wapack Labs
>>> >
>>> >
>>> > On Jun 28, 2018, at 6:59 PM, Jianfeng (Jeff) Zhang <
>>> jzh...@hortonworks.com>
>>> > wrote:
>>> >
>>> > Hi Patrick,
>>> >
>>> > Which link is broken ? I can access all the links.
>>> >
>>> > Best Regard,
>>> > Jeff Zhang
>>> >
>>> >
>>> > From: Patrick Maroney 
>>> > Reply-To: 
>>> > Date: Friday, June 29, 2018 at 4:59 AM
>>> > To: 
>>> > Cc: dev 
>>> > Subject: Re: [ANNOUNCE] Apache Zeppelin 0.8.0 released
>>> >
>>> > Great work Team/Community!
>>> >
>>> > Links on the main download page are broken:
>>> >
>>> > http://zeppelin.apache.org/download.html
>>> >
>>> > ...at least the ones I need ;-)
>>> >
>>> > *Patrick Maroney*
>>>
>>> > Principal Engineer - Data Science & Analytics
>>> > Wapack Labs LLC
>>> >
>>> >
>>> > Public Key:
>>> http://pgp.mit.edu/pks/lookup?op=get=0x7C810C9769BD29AF
>>> >
>>> > On Jun 27, 2018, at 11:21 PM, Prabhjyot Singh <
>>> prabhjyotsi...@gmail.com>
>>> > wrote:
>>> >
>>> > Awesome! congratulations team.
>>> >
>>> >
>>> >
>>> > On Thu 28 Jun, 2018, 8:39 AM Taejun Kim,  wrote:
>>> >
>>> >> Awesome! Thanks for your great work :)
>>> >>
>>> >> On Thu, Jun 28, 2018 at 12:07 PM, Jeff Zhang wrote:
>>> >>
>>> >>> The Apache Zeppelin community is pleased to announce the
>>> availability of
>>> >>> the 0.8.0 release.
>>> >>>
>>> >>> Zeppelin is a collaborative data analytics and visualization tool for
>>> >>> distributed, general-purpose data processing system such as Apache
>>> Spark,
>>> >>> Apache Flink, etc.
>>> >>>
>>> >>> This is another major release after the last minor release 0.7.3.
>>> >>> The community put significant effort into improving Apache Zeppelin
>>> since
>>> >>> the last release. 122 contributors fixed totally 602 issues. Lots of
>>> >>> new features are introduced, such as inline configuration, ipython
>>> >>> interpreter, yarn-cluster mode support , interpreter lifecycle
>>> manager
>>> >>> and etc.
>>> >>>
>>> >>> We encourage you to download the latest release
>>> >>> fromhttp://zeppelin.apache.org/download.html
>>> >>>
>>> >>> Release note is available
>>> >>> athttp://zeppelin.apache.org/releases/zeppelin-release-0.8.0.html
>>> >>>
>>> >>> We welcome your help and feedback. For more information on the
>>> project
>>> >>> and
>>> >>> how to get involved, visit our website at
>>> http://zeppelin.apache.org/
>>> >>>
>>> >>> Thank you all users and contributors who have helped to improve
>>> Apache
>>> >>> Zeppelin.
>>> >>>
>>> >>> Regards,
>>> >>> The Apache Zeppelin community
>>> >>>
>>> >> --
>>> >> Taejun Kim
>>> >>
>>> >> Data Mining Lab.
>>> >> School of Electrical and Computer Engineering
>>> >> University of Seoul
>>> >>
>>> >
>>> >
>>>
>>
>>
>>
>> --
>> 이종열, Jongyoul Lee, 李宗烈
>> http://madeng.net
>>
>
>


Re: Zeppelin distributed architecture design

2018-07-18 Thread Ruslan Dautkhanov
Thank you liuxun,

I left a couple of comments in that google document.

-- 
Ruslan Dautkhanov


On Tue, Jul 17, 2018 at 11:30 PM liuxun  wrote:

> hi,Ruslan Dautkhanov
>
> Thank you very much for your question. according to your advice, I added 3
> schematics to illustrate.
> 1. Distributed Zeppelin Deployment architecture diagram.
> 2. Distributed zeppelin Server fault tolerance diagram.
> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>
>
> The email attachment exceeded the size limit, so I reorganized the
> document and updated it with Google Docs.
>
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit?usp=sharing
>
>
> On Jul 18, 2018, at 1:03 PM, liuxun wrote:
>
> hi,Ruslan Dautkhanov
>
> Thank you very much for your question. according to your advice, I added 3
> schematics to illustrate.
> 1. Zeppelin Cluster architecture diagram.
> 2. Distributed zeppelin Server fault tolerance diagram.
> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>
> Later, I will merge the schematic into the system design document.
>
>
> On Jul 18, 2018, at 1:16 AM, Ruslan Dautkhanov wrote:
>
> Nice.
>
> Thanks for sharing.
>
> Can you explain how are users routed into a particular zeppelin server
> instance? I've seen nginx on top of them, but I don't think the document
> covers details? If one zeppelin server goes down or unhealthy, is nginx
> supposed to detect (if so, how?) that and reroute users to a survived
> instance?
>
> Thanks,
> Ruslan Dautkhanov
>
>
> On Tue, Jul 17, 2018 at 2:46 AM liuxun  wrote:
>
> hi:
>
> Our company installed and deployed a lot of zeppelin for data analysis.
> The single server version of zeppelin could not meet our application
> scenarios, so we transformed zeppelin into a clustered service that
> supports distributed deployment, Have a unified entrance, high
> availability, and High server resource usage.  the email attachment is the
> entire design document, I am very happy to feedback our modified code back
> to the community.
>
>
> this is the JIRA I submitted in the community,
>
> https://issues.apache.org/jira/browse/ZEPPELIN-3471
>
>
> Since the design document size exceeds the mail attachment size limit, the
> document link address has to be sent.
>
>
> https://issues.apache.org/jira/secure/attachment/12931896/Zeppelin%20distributed%20architecture%20design.pdf
>
>
> https://issues.apache.org/jira/secure/attachment/12931895/zepplin%20Cluster%20Sequence%20Diagram.png
>
>
> liuxun
>
>
>
>


Re: Zeppelin distributed architecture design

2018-07-17 Thread Ruslan Dautkhanov
Nice.

Thanks for sharing.

Can you explain how are users routed into a particular zeppelin server
instance? I've seen nginx on top of them, but I don't think the document
covers details? If one zeppelin server goes down or unhealthy, is nginx
supposed to detect (if so, how?) that and reroute users to a survived
instance?

Thanks,
Ruslan Dautkhanov


On Tue, Jul 17, 2018 at 2:46 AM liuxun  wrote:

> hi:
>
> Our company installed and deployed a lot of zeppelin for data analysis.
> The single server version of zeppelin could not meet our application
> scenarios, so we transformed zeppelin into a clustered service that
> supports distributed deployment, Have a unified entrance, high
> availability, and High server resource usage.  the email attachment is the
> entire design document, I am very happy to feedback our modified code back
> to the community.
>
>
> this is the JIRA I submitted in the community,
>
> https://issues.apache.org/jira/browse/ZEPPELIN-3471
>
>
> Since the design document size exceeds the mail attachment size limit, the
> document link address has to be sent.
>
> https://issues.apache.org/jira/secure/attachment/12931896/Zeppelin%20distributed%20architecture%20design.pdf
>
> https://issues.apache.org/jira/secure/attachment/12931895/zepplin%20Cluster%20Sequence%20Diagram.png
>
>
> liuxun
>


Re: Paragraphs collapsing

2018-07-11 Thread Ruslan Dautkhanov
Not sure how to reproduce it either, but I think it happened when I clicked to
run all paragraphs in a notebook, then disconnected and reconnected while the
notebook was still running the rest of the paragraphs.
But it was a while ago, so I'm not sure that's sufficient to reproduce the
issue.

-- 
Ruslan Dautkhanov


On Wed, Jul 11, 2018 at 8:34 AM Paul Brenner  wrote:

> I created https://issues.apache.org/jira/browse/ZEPPELIN-3616
> However we don’t yet know how to reliably reproduce. Will add more detail
> to the ticket if we come to better understand how to reproduce, but if
> anyone else experiencing this can contribute anything helpful (when this
> happens? how long it takes? do multiple users need to be on zeppelin? do
> other cells in the notebook need to be run?) that would be appreciated.
>
>
> *Paul Brenner*
> SR. DATA SCIENTIST
> *(217) 390-3033 *

Re: Paragraphs collapsing

2018-07-10 Thread Ruslan Dautkhanov
I've seen this a couple of times..

-- 
Ruslan Dautkhanov


On Tue, Jul 10, 2018 at 2:34 PM Paul Brenner  wrote:

> We are using 0.8 release and noticed that the editor section of paragraphs
> will randomly collapse when you leave a notebook open for a while. Clicking
> "hide editor" followed by "show editor" will bring them back but it is
> quite annoying and wasn't there in the pre-release version we were using. I
> looked through jira but didn't immediately find an open issue for this.
> Anyone know whats up?
>
>
> *Paul Brenner*
> SR. DATA SCIENTIST
> *(217) 390-3033 *

Re: Shiro configuration - roles and limiting searchbase

2018-07-09 Thread Ruslan Dautkhanov
These two committed fixes aren't in 0.8.0
https://github.com/apache/zeppelin/pull/3045
https://github.com/apache/zeppelin/pull/3037
See if one of them is relevant to your issue.

-- 
Ruslan Dautkhanov


On Mon, Jul 9, 2018 at 9:24 AM András Kolbert 
wrote:

> The latest, 0.8
>
> On Mon, 9 Jul 2018, 17:21 Ruslan Dautkhanov,  wrote:
>
>> Which version of Zeppelin you're using?
>> If it's 0.7, try 0.8 I remember seeing some issues were fixed in 0.8 and
>> in master regarding this AD/LDAP groups...
>>
>> --
>> Ruslan Dautkhanov
>>
>>
>> On Mon, Jul 9, 2018 at 3:23 AM kolbertand...@gmail.com <
>> kolbertand...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> We've been trying to add the right shiro configuration to ensure that a
>>> specific AD group can only log in, and also differentiate roles. We got two
>>> working solutions, but the first let's in everyone within the active
>>> directory (but the roles work fine), the second does not let in everyone
>>> but the roles do not work.
>>>
>>> 1)
>>> This version works for the adding roles to the specific CNs but allows
>>> everyone to login.
>>>
>>> activeDirectoryRealm =
>>> org.apache.zeppelin.realm.ActiveDirectoryGroupRealm
>>> activeDirectoryRealm.systemUsername = aduser
>>> activeDirectoryRealm.hadoopSecurityCredentialPath =
>>> jceks://file/user/zeppelin/conf/zeppelin.jceks
>>> activeDirectoryRealm.searchBase = OU=User Accounts,DC=domain,DC=local
>>> activeDirectoryRealm.url = ldap://AD.domain.local:389
>>> activeDirectoryRealm.groupRolesMap = "CN=admins,OU=User
>>> Accounts,DC=domain,DC=local":"admin"
>>> activeDirectoryRealm.authorizationCachingEnabled = false
>>> activeDirectoryRealm.principalSuffix = @domain.local
>>> securityManager.realms = $activeDirectoryRealm
>>>
>>> 2)
>>> This version limits down the login to the specified AD group, but does
>>> not associates roles with the group.
>>> ldapADGCRealm = org.apache.zeppelin.realm.LdapRealm
>>> ldapADGCRealm.contextFactory.systemUsername = aduser@domain.local
>>> ldapADGCRealm.hadoopSecurityCredentialPath =
>>> jceks://file/user/zeppelin/conf/zeppelinldap.jceks
>>> ldapADGCRealm.searchBase = "OU=User Accounts,DC=domain,DC=local"
>>> ldapADGCRealm.userSearchBase = "OU=User Accounts,DC=domain,DC=local"
>>> ldapADGCRealm.groupSearchBase = "OU=User Accounts,DC=domain,DC=local"
>>> ldapADGCRealm.groupObjectClass = group
>>> ldapADGCRealm.memberAttribute = memberUid
>>> ldapADGCRealm.groupIdAttribute = cn
>>> ldapADGCRealm.groupSearchEnableMatchingRuleInChain = true
>>> ldapADGCRealm.rolesByGroup = users: admin
>>> ldapADGCRealm.userSearchFilter =
>>> (&(objectclass=user)(sAMAccountName={0})(memberOf=CN=users,OU=User
>>> Accounts,DC=domain,DC=local))
>>> ldapADGCRealm.contextFactory.url = ldap://AD.domain.local:389 (edited)
>>>
>>>
>>>
>>> Related posts:
>>>
>>> https://community.hortonworks.com/questions/54896/zeppelin-ad-users-not-binded-to-groups.html
>>>
>>> https://community.hortonworks.com/questions/82135/how-to-limit-access-to-zeppelin-webui-based-for-sp.html
>>>
>>> Any ideas where we go wrong?
>>>
>>> Thanks,
>>> Andras
>>>
>>


Re: Shiro configuration - roles and limiting searchbase

2018-07-09 Thread Ruslan Dautkhanov
Which version of Zeppelin are you using?
If it's 0.7, try 0.8 - I remember seeing some issues around AD/LDAP groups
fixed in 0.8 and in master...
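
One more thing worth double-checking in config 2) - I haven't verified this
against your AD, but Active Directory groups normally list their members in
the "member" attribute (as full DNs), while memberUid is a posixGroup
attribute, so the group-to-role mapping may behave better with:

ldapADGCRealm.memberAttribute = member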

-- 
Ruslan Dautkhanov


On Mon, Jul 9, 2018 at 3:23 AM kolbertand...@gmail.com <
kolbertand...@gmail.com> wrote:

> Hi,
>
> We've been trying to add the right shiro configuration to ensure that a
> specific AD group can only log in, and also differentiate roles. We got two
> working solutions, but the first let's in everyone within the active
> directory (but the roles work fine), the second does not let in everyone
> but the roles do not work.
>
> 1)
> This version works for the adding roles to the specific CNs but allows
> everyone to login.
>
> activeDirectoryRealm = org.apache.zeppelin.realm.ActiveDirectoryGroupRealm
> activeDirectoryRealm.systemUsername = aduser
> activeDirectoryRealm.hadoopSecurityCredentialPath =
> jceks://file/user/zeppelin/conf/zeppelin.jceks
> activeDirectoryRealm.searchBase = OU=User Accounts,DC=domain,DC=local
> activeDirectoryRealm.url = ldap://AD.domain.local:389
> activeDirectoryRealm.groupRolesMap = "CN=admins,OU=User
> Accounts,DC=domain,DC=local":"admin"
> activeDirectoryRealm.authorizationCachingEnabled = false
> activeDirectoryRealm.principalSuffix = @domain.local
> securityManager.realms = $activeDirectoryRealm
>
> 2)
> This version limits down the login to the specified AD group, but does not
> associates roles with the group.
> ldapADGCRealm = org.apache.zeppelin.realm.LdapRealm
> ldapADGCRealm.contextFactory.systemUsername = aduser@domain.local
> ldapADGCRealm.hadoopSecurityCredentialPath =
> jceks://file/user/zeppelin/conf/zeppelinldap.jceks
> ldapADGCRealm.searchBase = "OU=User Accounts,DC=domain,DC=local"
> ldapADGCRealm.userSearchBase = "OU=User Accounts,DC=domain,DC=local"
> ldapADGCRealm.groupSearchBase = "OU=User Accounts,DC=domain,DC=local"
> ldapADGCRealm.groupObjectClass = group
> ldapADGCRealm.memberAttribute = memberUid
> ldapADGCRealm.groupIdAttribute = cn
> ldapADGCRealm.groupSearchEnableMatchingRuleInChain = true
> ldapADGCRealm.rolesByGroup = users: admin
> ldapADGCRealm.userSearchFilter =
> (&(objectclass=user)(sAMAccountName={0})(memberOf=CN=users,OU=User
> Accounts,DC=domain,DC=local))
> ldapADGCRealm.contextFactory.url = ldap://AD.domain.local:389 (edited)
>
>
>
> Related posts:
>
> https://community.hortonworks.com/questions/54896/zeppelin-ad-users-not-binded-to-groups.html
>
> https://community.hortonworks.com/questions/82135/how-to-limit-access-to-zeppelin-webui-based-for-sp.html
>
> Any ideas where we go wrong?
>
> Thanks,
> Andras
>


hadoop.security.auth_to_local

2018-07-06 Thread Ruslan Dautkhanov
I assume some users are connecting to Spark in Zeppelin through Livy.

It seems Livy doesn't support `hadoop.security.auth_to_local` - filed
https://issues.apache.org/jira/browse/LIVY-481

Has anyone run into this issue?

It seems Livy's `livy.server.auth.kerberos.name-rules` config was meant to
work around this (just guessing based on the config name),
but I can't find any documentation on it either.

Thanks!
Ruslan


Re: [DISCUSS] Is interpreter binding necessary ?

2018-07-06 Thread Ruslan Dautkhanov
+1 to remove it

Setting the default interpreter is not very useful anyway (for example, we
can't make %pyspark the default without manually editing xml files in the
Zeppelin distro). https://issues.apache.org/jira/browse/ZEPPELIN-3282

-- 
Ruslan Dautkhanov


On Fri, Jul 6, 2018 at 7:27 AM Paul Brenner  wrote:

> I agree with Partridge. We have different interpreters defined with
> different queues and settings. So we need a way to quickly change the
> default interpreter and can’t rely on typing the desired interpreter at the
> start of each paragraph.
>
>
> *Paul Brenner*
> SR. DATA SCIENTIST
> *(217) 390-3033 *

Re: [ANNOUNCE] Apache Zeppelin 0.8.0 released

2018-06-28 Thread Ruslan Dautkhanov
Great job. Congrats everyone involved.

-- 
Ruslan Dautkhanov


On Thu, Jun 28, 2018 at 9:47 AM Felix Cheung 
wrote:

> Congrats and thanks for putting together the release
>
> --
> *From:* Miquel Angel Andreu Febrer 
> *Sent:* Wednesday, June 27, 2018 11:02:20 PM
> *To:* d...@zeppelin.apache.org
> *Cc:* users@zeppelin.apache.org
> *Subject:* Re: [ANNOUNCE] Apache Zeppelin 0.8.0 released
>
> Great news,
>
> It has been hard work getting this release out
>
> Thank you very much Jeff for your work and your patience
>
>
>
>
>
> On Thu, Jun 28, 2018 at 6:05, Sanjay Dasgupta
> wrote:
>
> > This is really a great milestone.
> >
> > Thanks to those behind the grand effort.
> >
> > On Thu, Jun 28, 2018 at 8:51 AM, Prabhjyot Singh <
> prabhjyotsi...@gmail.com
> > >
> > wrote:
> >
> > > Awesome! congratulations team.
> > >
> > >
> > >
> > > On Thu 28 Jun, 2018, 8:39 AM Taejun Kim,  wrote:
> > >
> > >> Awesome! Thanks for your great work :)
> > >>
> > >> On Thu, Jun 28, 2018 at 12:07 PM, Jeff Zhang wrote:
> > >>
> > >>> The Apache Zeppelin community is pleased to announce the availability of
> > >>> the 0.8.0 release.
> > >>>
> > >>> Zeppelin is a collaborative data analytics and visualization tool for
> > >>> distributed, general-purpose data processing systems such as Apache Spark,
> > >>> Apache Flink, etc.
> > >>>
> > >>> This is another major release after the last minor release, 0.7.3.
> > >>> The community put significant effort into improving Apache Zeppelin since
> > >>> the last release. 122 contributors fixed a total of 602 issues. Lots of
> > >>> new features are introduced, such as inline configuration, the ipython
> > >>> interpreter, yarn-cluster mode support, an interpreter lifecycle manager,
> > >>> etc.
> > >>>
> > >>> We encourage you to download the latest release
> > >>> from http://zeppelin.apache.org/download.html
> > >>>
> > >>> The release note is available
> > >>> at http://zeppelin.apache.org/releases/zeppelin-release-0.8.0.html
> > >>>
> > >>> We welcome your help and feedback. For more information on the project
> > >>> and how to get involved, visit our website at
> > >>> http://zeppelin.apache.org/
> > >>>
> > >>> Thank you to all users and contributors who have helped to improve Apache
> > >>> Zeppelin.
> > >>>
> > >>> Regards,
> > >>> The Apache Zeppelin community
> > >>>
> > >> --
> > >> Taejun Kim
> > >>
> > >> Data Mining Lab.
> > >> School of Electrical and Computer Engineering
> > >> University of Seoul
> > >>
> > >
> >
>


Re: Stdout both to file and console

2018-06-20 Thread Ruslan Dautkhanov
Something like this for log4j should do

> log4j.rootLogger = INFO, stdout, dailyfile
> log4j.appender.stdout = org.apache.log4j.ConsoleAppender
> log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
> log4j.appender.stdout.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n
> log4j.appender.dailyfile.DatePattern=.yyyy-MM-dd
> log4j.appender.dailyfile.Threshold = DEBUG
> log4j.appender.dailyfile = org.apache.log4j.DailyRollingFileAppender
> log4j.appender.dailyfile.File = ${zeppelin.log.file}
> log4j.appender.dailyfile.layout = org.apache.log4j.PatternLayout
> log4j.appender.dailyfile.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n
>
> log4j.logger.org.apache.zeppelin.interpreter.InterpreterFactory=DEBUG
> log4j.logger.org.apache.zeppelin.notebook.Paragraph=DEBUG
> log4j.logger.org.apache.zeppelin.scheduler=DEBUG
> log4j.logger.org.apache.zeppelin.spark=DEBUG
> log4j.logger.org.apache.zeppelin.python=DEBUG
> log4j.logger.org.apache.zeppelin.interpreter.util=DEBUG
> log4j.logger.org.apache.zeppelin.interpreter.remote=DEBUG
> log4j.logger.org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer=DEBUG
> log4j.logger.org.glassfish.jersey.internal.inject.Providers=SEVERE




-- 
Ruslan Dautkhanov


On Wed, Jun 20, 2018 at 3:01 AM Alessandro Liparoti <
alessandro.l...@gmail.com> wrote:

> Hi,
> yes, the Spark UI is a tool I already use for it, but as Ruslan mentioned it
> would be good to have this functionality.
> Ruslan: which verbosity level allows me to have stdout in the log files? Is
> there an attribute to add to the appender for this?
>
> *Alessandro Liparoti*
>
> 2018-06-19 19:52 GMT+02:00 Ruslan Dautkhanov :
>
>> If you set pretty verbose level in log4j then you can see output in log
>> files. I've seen it there.
>> Then you can use regexps to strip out paragraph outputs from rest of
>> debugging messages.
>> May work as a one off effort. Might be a good idea to file an enhancement
>> request - this can be also useful
>> for scheduled notebook runs - would be great to go back and review each
>> scheduled note executions etc.
>>
>>
>>
>> On Tue, Jun 19, 2018 at 2:56 AM Alessandro Liparoti <
>> alessandro.l...@gmail.com> wrote:
>>
>>> I am comparing performance between different implementations of a spark
>>> job and I am testing a chunk of code which prints partial results and info
>>> to stdout. I can surely replace all the prints with logger calls and
>>> collect them. I just wanted to know if there was a way to avoid this or if
>>> this functionality was easy to implement.
>>>
>>> *Alessandro Liparoti*
>>>
>>> 2018-06-19 10:52 GMT+02:00 Jeff Zhang :
>>>
>>>>
>>>> Not sure what kind of analysis you want to do, is the logging info in
>>>> the interpreter log file enough for you ? (You can update the log level in
>>>> log4j.properties to get more logs)
>>>>
>>>> Alessandro Liparoti wrote on Tue, Jun 19, 2018 at 4:47 PM:
>>>>
>>>>> I would like to post-analyze the output of verbose jobs in the
>>>>> notebook and save them, avoiding to relaunch the jobs again. It would be
>>>>> also good to have the stderr logged to file.
>>>>>
>>>>> Thanks
>>>>>
>>>>> *Alessandro Liparoti*
>>>>>
>>>>> 2018-06-19 10:43 GMT+02:00 Jeff Zhang :
>>>>>
>>>>>>
>>>>>> I am afraid it is not possible now. The stdout of notebooks is
>>>>>> not based on log4j. If you want it output to a file as well, you might need
>>>>>> to change the code of the interpreter itself.
>>>>>> Usually it is not necessary to log it to a log file as well; could you
>>>>>> tell us why you want that? Thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>> alessandro.l...@gmail.com wrote on Tue, Jun 19, 2018
>>>>>> at 3:52 PM:
>>>>>>
>>>>>>> Good morning,
>>>>>>> I would like to have stdout of notebooks both printed out to console
>>>>>>> and file. How can I achieve that? I tried to play around with log4j but
>>>>>>> without any success; it seems it requires a custom appender 
>>>>>>> implementation.
>>>>>>> Any other simpler idea?
>>>>>>>
>>>>>>
>>>>>
>>>
>


Re: Stdout both to file and console

2018-06-19 Thread Ruslan Dautkhanov
If you set a pretty verbose level in log4j, then you can see the output in the log
files. I've seen it there.
Then you can use regexps to strip the paragraph outputs out of the rest of the
debugging messages.
That may work as a one-off effort. It might be a good idea to file an enhancement
request - this could also be useful
for scheduled notebook runs - it would be great to go back and review each
scheduled note execution, etc.



On Tue, Jun 19, 2018 at 2:56 AM Alessandro Liparoti <
alessandro.l...@gmail.com> wrote:

> I am comparing performance between different implementations of a spark
> job and I am testing a chunk of code which prints partial results and info
> to stdout. I can surely replace all the prints with logger calls and
> collect them. I just wanted to know if there was a way to avoid this or if
> this functionality was easy to implement.
>
> *Alessandro Liparoti*
>
> 2018-06-19 10:52 GMT+02:00 Jeff Zhang :
>
>>
>> Not sure what kind of analysis you want to do, is the logging info in the
>> interpreter log file enough for you ? (You can update the log level in
>> log4j.properties to get more logs)
>>
>> Alessandro Liparoti wrote on Tue, Jun 19, 2018 at 4:47 PM:
>>
>>> I would like to post-analyze the output of verbose jobs in the notebook
>>> and save it, avoiding relaunching the jobs. It would also be good
>>> to have stderr logged to a file.
>>>
>>> Thanks
>>>
>>> *Alessandro Liparoti*
>>>
>>> 2018-06-19 10:43 GMT+02:00 Jeff Zhang :
>>>

 I am afraid it is not possible now. The stdout of notebooks is not
 based on log4j. If you want it output to a file as well, you might need to
 change the code of the interpreter itself.
 Usually it is not necessary to log it to a log file as well; could you
 tell us why you want that? Thanks



 alessandro.l...@gmail.com wrote on Tue, Jun 19, 2018
 at 3:52 PM:

> Good morning,
> I would like to have stdout of notebooks both printed out to console
> and file. How can I achieve that? I tried to play around with log4j but
> without any success; it seems it requires a custom appender 
> implementation.
> Any other simpler idea?
>

>>>
>


Re: Brain dead question on my part...

2018-06-04 Thread Ruslan Dautkhanov
Can you send a screenshot with the error and complete exception stack?




-- 
Ruslan Dautkhanov

On Mon, Jun 4, 2018 at 10:40 AM, Michael Segel 
wrote:

> Hmmm. Still not working.
> Added it to the interpreter setting and restarted the interpreter.
>
> The issue is that I need to use the MapR version of spark since I’m
> running this on the cluster.
>
> Should I restart Zeppelin itself?
>
> On Jun 4, 2018, at 11:32 AM, Ruslan Dautkhanov 
> wrote:
>
> zeppelin.spark.enableSupportedVersionCheck
>
>
>


Re: Brain dead question on my part...

2018-06-04 Thread Ruslan Dautkhanov
Nope add that as a spark interpreter setting.
0.7.2 should work fine with Spark 2.2 afaik.
You may want to go with Zeppelin 0.8 when you upgrade to Spark 2.3.
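For reference, the check can be relaxed with an interpreter property along these
lines (the property name is the one discussed in this thread; treating false as
the value is an assumption for illustration):

zeppelin.spark.enableSupportedVersionCheck = false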



-- 
Ruslan Dautkhanov

On Mon, Jun 4, 2018 at 10:29 AM, Michael Segel 
wrote:

> I’m assuming that I want to set this in ./conf/zeppelin-site.xml …
>
> Didn’t have any impact. Still getting the same error.
>
>
> On Jun 4, 2018, at 11:17 AM, Michael Segel 
> wrote:
>
> Hmmm…. did not know that option existed.
> Are there any downsides to doing this?
>
> Thx
>
> -Mike
>
>
> On Jun 4, 2018, at 11:10 AM, Ruslan Dautkhanov 
> wrote:
>
> Should you try to set  zeppelin.spark.enableSupportedVersionCheck to
> false at spark interpreter level ?
>
>
>
> --
> Ruslan Dautkhanov
>
> On Mon, Jun 4, 2018 at 9:05 AM, Michael Segel 
> wrote:
>
>> Hi,
>>
>> I’m trying to use Zeppelin to connect to a MapR Cluster…
>>
>> Yes, I know that MapR has their own supported release but I also want to
>> use the same set up to also run stand alone too…
>>
>> My issue is that I’m running Zeppelin 0.7.2 and when I try to connect to
>> spark, I get the following error….
>>
>>  Spark 2.2.1-mapr-1803 is not supported
>>
>> Ok… so its been a while, I’m trying to see what would cause this and if
>> there was an easy fix… (other than going w MapR’s release and running it in
>> a container.)
>>
>>
>> Thx
>>
>> -Mike
>>
>>
>
>
>


Re: Silly question...

2018-05-25 Thread Ruslan Dautkhanov
You may want to check if %spark.dep
https://zeppelin.apache.org/docs/latest/interpreter/spark.html#3-dynamic-dependency-loading-via-sparkdep-interpreter
helps here.
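
As a rough sketch of what that looks like in practice - the paragraph has to run
before the Spark context starts, and the coordinate/path below are placeholders,
not recommendations:

%spark.dep
z.reset()                               // clear previously loaded artifacts
z.load("groupId:artifactId:version")    // a Maven coordinate
z.load("/path/to/custom-classes.jar")   // or a local jar with your classes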



-- 
Ruslan Dautkhanov

On Fri, May 25, 2018 at 12:46 PM, Michael Segel <msegel_had...@hotmail.com>
wrote:

> What’s the best way to set up a class path for a specific notebook?
>
> I have some custom classes that I may want to include.
>
> Is there a way to specify this in the specific note?
> Would it be better to add the jars to an existing lib folder?
>
> Thx


note imports broken?

2018-05-23 Thread Ruslan Dautkhanov
Was anybody able to import notes on 0.8 RC or a recent master snapshot?
Notes import seems to be broken
Filed https://issues.apache.org/jira/browse/ZEPPELIN-3485
This looks serious to me.


-- 
Ruslan Dautkhanov


WARNING: A provider org.apache.zeppelin.rest.LoginRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime

2018-05-22 Thread Ruslan Dautkhanov
Testing Zeppelin from a master snapshot, I'm getting the new errors/warnings below
[1].
I've never seen these before.
Is this something we should be concerned about?
If not, how do we disable them?
They show up on stdout when launching Zeppelin, not in a log file.


[1]


> WARNING: A provider org.apache.zeppelin.rest.SecurityRestApi registered in
> SERVER runtime does not implement any provider interfaces applicable in the
> SERVER runtime. Due to constraint configuration problems the provider
> org.apache.zeppelin.rest.SecurityRestApi will be ignored.
>


> May 22, 2018 11:21:57 AM org.glassfish.jersey.internal.inject.Providers
> checkProviderRuntime
> WARNING: A provider org.apache.zeppelin.rest.InterpreterRestApi registered
> in SERVER runtime does not implement any provider interfaces applicable in
> the SERVER runtime. Due to constraint configuration problems the provider
> org.apache.zeppelin.rest.InterpreterRestApi will be ignored.
>


> May 22, 2018 11:21:57 AM org.glassfish.jersey.internal.inject.Providers
> checkProviderRuntime
> WARNING: A provider org.apache.zeppelin.rest.LoginRestApi registered in
> SERVER runtime does not implement any provider interfaces applicable in the
> SERVER runtime. Due to constraint configuration problems the provider
> org.apache.zeppelin.rest.LoginRestApi will be ignored.
>


> May 22, 2018 11:21:57 AM org.glassfish.jersey.internal.inject.Providers
> checkProviderRuntime
> WARNING: A provider org.apache.zeppelin.rest.NotebookRepoRestApi
> registered in SERVER runtime does not implement any provider interfaces
> applicable in the SERVER runtime. Due to constraint configuration problems
> the provider org.apache.zeppelin.rest.NotebookRepoRestApi will be ignored.
>


> May 22, 2018 11:21:57 AM org.glassfish.jersey.internal.inject.Providers
> checkProviderRuntime
> WARNING: A provider org.apache.zeppelin.rest.HeliumRestApi registered in
> SERVER runtime does not implement any provider interfaces applicable in the
> SERVER runtime. Due to constraint configuration problems the provider
> org.apache.zeppelin.rest.HeliumRestApi will be ignored.
>


> May 22, 2018 11:21:57 AM org.glassfish.jersey.internal.inject.Providers
> checkProviderRuntime
> WARNING: A provider org.apache.zeppelin.rest.NotebookRestApi registered in
> SERVER runtime does not implement any provider interfaces applicable in the
> SERVER runtime. Due to constraint configuration problems the provider
> org.apache.zeppelin.rest.NotebookRestApi will be ignored.
>


> May 22, 2018 11:21:57 AM org.glassfish.jersey.internal.inject.Providers
> checkProviderRuntime
> WARNING: A provider org.apache.zeppelin.rest.ConfigurationsRestApi
> registered in SERVER runtime does not implement any provider interfaces
> applicable in the SERVER runtime. Due to constraint configuration problems
> the provider org.apache.zeppelin.rest.ConfigurationsRestApi will be ignored.
>


> May 22, 2018 11:21:57 AM org.glassfish.jersey.internal.inject.Providers
> checkProviderRuntime
> WARNING: A provider org.apache.zeppelin.rest.CredentialRestApi registered
> in SERVER runtime does not implement any provider interfaces applicable in
> the SERVER runtime. Due to constraint configuration problems the provider
> org.apache.zeppelin.rest.CredentialRestApi will be ignored.
>


> May 22, 2018 11:21:57 AM org.glassfish.jersey.internal.inject.Providers
> checkProviderRuntime
> WARNING: A provider org.apache.zeppelin.rest.ZeppelinRestApi registered in
> SERVER runtime does not implement any provider interfaces applicable in the
> SERVER runtime. Due to constraint configuration problems the provider
> org.apache.zeppelin.rest.ZeppelinRestApi will be ignored.
>


> May 22, 2018 11:21:57 AM org.glassfish.jersey.internal.Errors logErrors
> WARNING: The following warnings have been detected: WARNING: A HTTP GET
> method, public javax.ws.rs.core.Response
> org.apache.zeppelin.rest.InterpreterRestApi.listInterpreter(java.lang.String),
> should not consume any entity.
>


> WARNING: A HTTP GET method, public javax.ws.rs.core.Response
> org.apache.zeppelin.rest.CredentialRestApi.getCredentials(java.lang.String)
> throws java.io.IOException,java.lang.IllegalArgumentException, should not
> consume any entity.



-- 
Ruslan Dautkhanov


Re: Zeppelin 0.8 rc1

2018-05-21 Thread Ruslan Dautkhanov
Thank you Jeff.



-- 
Ruslan Dautkhanov

On Wed, May 16, 2018 at 6:19 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> Yes, the voting thread is on dev mail list.
>
> https://lists.apache.org/thread.html/c6435f3fcfab4c516e2ef90f436575
> 3268546293afa1ae2c50cc54f9@%3Cdev.zeppelin.apache.org%3E
>
>
> Ruslan Dautkhanov <dautkha...@gmail.com> wrote on Thu, May 17, 2018 at 1:50 AM:
>
>> I didn't know 0.8 rc1/rc2 were out. Was it advertised on the dev list?
>>
>> Thanks for sharing this.
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Sun, May 13, 2018 at 1:23 AM, Rotem Herzberg <
>> rotem.herzb...@gigaspaces.com> wrote:
>>
>>> Hello all,
>>>
>>> I've downloaded and built the zeppelin v0.8.0-rc1
>>> <https://github.com/apache/zeppelin/releases/tag/v0.8.0-rc1> build but
>>> when I try to run zeppelin the zeppelin process dies and I can't access the
>>> notebooks.
>>> [image: Screenshot from 2018-05-13 10-18-52.png]
>>> How can I solve this problem? Other zeppelin builds work fine on my
>>> machine.
>>>
>>> I noticed that rc2 was out and now doesn't appear in the releases. When
>>> will it be available?
>>>
>>> Thanks in advance,
>>>
>>>
>>> --
>>> <http://www.gigaspaces.com/?utm_source=Signature_medium=Email>
>>> *Rotem Herzberg*
>>> SW Engineer | GigaSpaces Technologies
>>>
>>> rotem.herzb...@gigaspaces.com   | M +972547718880
>>>
>>>   <https://twitter.com/gigaspaces>
>>> <https://www.linkedin.com/company/gigaspaces>
>>> <https://www.facebook.com/gigaspaces>
>>>
>>
>>


Re: Zeppelin 0.8 rc1

2018-05-16 Thread Ruslan Dautkhanov
I didn't know 0.8 rc1/rc2 were out. Was it advertised on the dev list?

Thanks for sharing this.



-- 
Ruslan Dautkhanov

On Sun, May 13, 2018 at 1:23 AM, Rotem Herzberg <
rotem.herzb...@gigaspaces.com> wrote:

> Hello all,
>
> I've downloaded and built the zeppelin v0.8.0-rc1
> <https://github.com/apache/zeppelin/releases/tag/v0.8.0-rc1> build but
> when I try to run zeppelin the zeppelin process dies and I can't access the
> notebooks.
>
> How can I solve this problem? Other zeppelin builds work fine on my
> machine.
>
> I noticed that rc2 was out and now doesn't appear in the releases. When
> will it be available?
>
> Thanks in advance,
>
>
> --
> <http://www.gigaspaces.com/?utm_source=Signature_medium=Email>
> *Rotem Herzberg*
> SW Engineer | GigaSpaces Technologies
>
> rotem.herzb...@gigaspaces.com   | M +972547718880
>
>   <https://twitter.com/gigaspaces>
> <https://www.linkedin.com/company/gigaspaces>
> <https://www.facebook.com/gigaspaces>
>


nightly builds?

2018-05-09 Thread Ruslan Dautkhanov
This probably should have gone to the dev group instead - would it be possible to
get
automated nightly/weekly builds published too?
Something like "bleeding edge" builds from a master snapshot.

It would help folks who cannot build themselves, but have to use
some features / fixes that aren't available in the latest official release.
It would also give new features exposure to more testing, so it should be a
win-win for users and developers.

Some other open source projects employ nightly builds.


Thanks!
Ruslan Dautkhanov


Re: save data in a notebook to use in subsequent scripts

2018-04-30 Thread Ruslan Dautkhanov
Not sure if Spark-Cassandra connector would be helpful?

https://github.com/datastax/spark-cassandra-connector
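
As a rough %pyspark sketch of the idea (this assumes the connector jar is on the
Spark interpreter's classpath, and the keyspace/table names are placeholders):
read the Cassandra table into a Spark DataFrame once, register it as a temp view,
and reuse it from later paragraphs.

df = (spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(keyspace="my_keyspace", table="my_table")
      .load())
df.createOrReplaceTempView("my_table")   # later paragraphs can query this view, e.g. via %sql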




-- 
Ruslan Dautkhanov

On Mon, Apr 30, 2018 at 7:38 AM, Soheil Pourbafrani <soheil.i...@gmail.com>
wrote:

> Is it possible to save a Cassandra query result in a variable to use in
> subsequent scripts?
>


Re: Error while loading shared libraries: libstdc++.so.6

2018-04-03 Thread Ruslan Dautkhanov
>> [ERROR] /home/monster/zeppelin/zeppelin-web/node/node: error while loading
>> shared libraries: libstdc++.so.6: cannot open shared object file: No such
>> file or directory

$ sudo yum install libstdc++.x86_64

would do?



On Tue, Apr 3, 2018 at 3:09 PM, Joaquín Silva <
joaquin.silva.vigen...@gmail.com> wrote:

> Hello,
> I'm trying to build Zeppelin from source, but it keeps throwing this error:
>
> [ERROR] /home/monster/zeppelin/zeppelin-web/node/node: error while
> loading shared libraries: libstdc++.so.6: cannot open shared object file:
> No such file or directory
>
> This is the process that I followed:
> # git clone https://github.com/apache/zeppelin.git
> # cd zeppelin
> # mvn clean package -DskipTests -Pspark-2.2 -Phadoop-2.6 -Pyarn -Ppyspark
> -Psparkr -Pr -Pscala-2.11
>
> This is the other error that pops up at the end.
>
> [ERROR] Failed to execute goal 
> com.github.eirslett:frontend-maven-plugin:1.3:npm
> (npm install) on project zeppelin-web: Failed to run task: 'npm install
> --no-lockfile' failed. (error code 127) -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute
> goal com.github.eirslett:frontend-maven-plugin:1.3:npm (npm install) on
> project zeppelin-web: Failed to run task
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> (MojoExecutor.java:213)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> (MojoExecutor.java:154)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> (MojoExecutor.java:146)
> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject
> (LifecycleModuleBuilder.java:117)
> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject
> (LifecycleModuleBuilder.java:81)
> at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
> (SingleThreadedBuilder.java:56)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.execute
> (LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
> at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
> at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
> at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
> at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
> at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke
> (NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke
> (DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke (Method.java:498)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced
> (Launcher.java:289)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launch
> (Launcher.java:229)
> at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode
> (Launcher.java:415)
> at org.codehaus.plexus.classworlds.launcher.Launcher.main
> (Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoFailureException: Failed to run
> task
> at 
> com.github.eirslett.maven.plugins.frontend.mojo.AbstractFrontendMojo.execute
> (AbstractFrontendMojo.java:95)
> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo
> (DefaultBuildPluginManager.java:137)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> (MojoExecutor.java:208)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> (MojoExecutor.java:154)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> (MojoExecutor.java:146)
> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject
> (LifecycleModuleBuilder.java:117)
> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject
> (LifecycleModuleBuilder.java:81)
> at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
> (SingleThreadedBuilder.java:56)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.execute
> (LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
> at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
> at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
> at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
> at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
> at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke
> (NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke
> (DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke (Method.java:498)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced
> (Launcher.java:289)
> at 

%spark.dep for %pyspark

2018-03-29 Thread Ruslan Dautkhanov
Were you guys able to use %spark.dep for %pyspark?

According to documentation this should work:
https://zeppelin.apache.org/docs/0.7.2/interpreter/spark.html#dependency-management
" Note: %spark.dep interpreter loads libraries to %spark and %spark.pyspark but
not to %spark.sql interpreter.  "

In real life, for some reason, it doesn't work (on a recent master).

(As a workaround I add a local jar to the --jars option in the
spark_submit_options, but using %spark.dep would be so much nicer.)
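
For reference, that workaround is roughly the following in zeppelin-env.sh (a
sketch only; the jar path is a placeholder, and adjust the variable name if your
setup differs):

export SPARK_SUBMIT_OPTIONS="--jars /path/to/local.jar"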


Thanks,
Ruslan


"IPython is available, use IPython for PySparkInterpreter"

2018-03-19 Thread Ruslan Dautkhanov
We're getting " IPython is available, use IPython for PySparkInterpreter "
warning each time we start %pyspark notebooks.

Although there is no difference between %pyspark and %ipyspark afaik.
At least we can use all ipython magic commands etc.
(maybe because we have zeppelin.pyspark.useIPython=true?)

If that's the case, how we can disable "IPython is available, use IPython
for PySparkInterpreter" warning ?


-- 
Ruslan Dautkhanov


Re: multiple users sharing single Spark context

2018-03-15 Thread Ruslan Dautkhanov
Thanks Jeff.

Yep, that was helpful.

Btw, the (i) icon has a broken link (see the highlighted part below):



- it leads to a broken link
https://zeppelin.apache.org/docs//usage/interpreter/interpreter_binding_mode.html


What do you think about https://issues.apache.org/jira/browse/ZEPPELIN-3334
"Set spark.scheduler.pool to authenticated user name" ?
I still think it makes sense ..




-- 
Ruslan Dautkhanov

On Wed, Mar 14, 2018 at 6:32 PM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> Globally shared mode means all the users share the sparkcontext and also
> the same spark interpreter. That's why in this mode code is executed
> sequentially; concurrency is not allowed here, as there may be dependencies
> between paragraphs. Concurrency cannot guarantee the execution order.
>
> For your scenario, I think you can use scoped per-user mode, where all the
> users share the same sparkcontext but use different spark interpreters.
>
>
>> ankit jain <ankitjain@gmail.com> wrote on Thu, Mar 15, 2018 at 7:25 AM:
>
>> We are seeing the same PENDING behavior despite running Spark Interpreter
>> in "Isolated per User" - we expected one SparkContext to be created per
>> user and indeed did see multiple SparkSubmit processes spun up on Zeppelin
>> pod.
>>
>> But why go to PENDING if there are multiple contexts that can be run in
>> parallel? Is assumption of multiple SparkSubmit = multiple SparkContext
>> correct?
>>
>> Thanks
>> Ankit
>>
>> On Wed, Mar 14, 2018 at 4:12 PM, Ruslan Dautkhanov <dautkha...@gmail.com>
>> wrote:
>>
>>> Looked at the code.. the only place Zeppelin handles
>>> spark.scheduler.pool is here -
>>>
>>> https://github.com/apache/zeppelin/blob/d762b5288536201d8a2964891c556e
>>> faa1bae867/spark/interpreter/src/main/java/org/apache/zeppelin/spark/
>>> SparkSqlInterpreter.java#L103
>>>
>>> I don't think it matches Spark documentation description that would
>>> allow multiple concurrent users to submit jobs independently.
>>> (each user's *thread* has to have different value for  *spark.scheduler.pool
>>> *)
>>>
>>> Filed https://issues.apache.org/jira/browse/ZEPPELIN-3334 to set
>>> *spark.scheduler.pool* to an authenticated user name.
>>>
>>> Other ideas?
>>>
>>>
>>>
>>>
>>> --
>>> Ruslan Dautkhanov
>>>
>>> On Wed, Mar 14, 2018 at 4:57 PM, Ruslan Dautkhanov <dautkha...@gmail.com
>>> > wrote:
>>>
>>>> Let's say we have a Spark interpreter set up as
>>>> " The interpreter will be instantiated *Globally *in *shared *process"
>>>>
>>>> When one user is using Spark interpreter,
>>>> another users that are trying to use the same interpreter,
>>>> getting PENDING until another user's code completes.
>>>>
>>>> Per Spark documentation, https://spark.apache.org/docs/
>>>> latest/job-scheduling.html
>>>>
>>>> " *within* each Spark application, multiple “jobs” (Spark actions) may
>>>>> be running concurrently if they were submitted by different threads
>>>>> ... /skip/
>>>>> threads. By “job”, in this section, we mean a Spark action (e.g. save,
>>>>>  collect) and any tasks that need to run to evaluate that action.
>>>>> Spark’s scheduler is fully thread-safe and supports this use case to 
>>>>> enable
>>>>> applications that serve multiple requests (e.g. queries for multiple 
>>>>> users).
>>>>> ... /skip/
>>>>> Without any intervention, newly submitted jobs go into a *default
>>>>> pool*, but jobs’ pools can be set by adding the *spark.scheduler.pool*
>>>>>  “local property” to the SparkContext in the thread that’s submitting
>>>>> them."
>>>>
>>>>
>>>> So Spark allows multiple users to use the same shared spark context..
>>>>
>>>> Two quick questions:
>>>> 1. Why concurrent users are getting PENDING in Zeppelin?
>>>> 2. Does Zeppelin set *spark.scheduler.pool* accordingly as described
>>>> above?
>>>>
>>>> PS.
>>>> We have set following Spark interpreter settings:
>>>> - zeppelin.spark.concurrentSQL= true
>>>> - spark.scheduler.mode = FAIR
>>>>
>>>>
>>>> Thank you,
>>>> Ruslan Dautkhanov
>>>>
>>>>
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Ankit.
>>
>


Re: multiple users sharing single Spark context

2018-03-14 Thread Ruslan Dautkhanov
Looked at the code.. the only place Zeppelin handles spark.scheduler.pool
is here -

https://github.com/apache/zeppelin/blob/d762b5288536201d8a2964891c556efaa1bae867/spark/interpreter/src/main/java/org/apache/zeppelin/spark/SparkSqlInterpreter.java#L103

I don't think it matches Spark documentation description that would allow
multiple concurrent users to submit jobs independently.
(each user's *thread* has to have different value for  *spark.scheduler.pool
*)

Filed https://issues.apache.org/jira/browse/ZEPPELIN-3334 to set
*spark.scheduler.pool* to an authenticated user name.

Other ideas?
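
To illustrate what ZEPPELIN-3334 is asking for (a hypothetical %pyspark sketch,
not what Zeppelin does today; the user name is a placeholder):

# run in the thread that submits this user's paragraphs
sc.setLocalProperty("spark.scheduler.pool", "some_user")
# ... Spark actions submitted here land in that user's fair-scheduler pool ...
sc.setLocalProperty("spark.scheduler.pool", None)   # reset afterwards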




-- 
Ruslan Dautkhanov

On Wed, Mar 14, 2018 at 4:57 PM, Ruslan Dautkhanov <dautkha...@gmail.com>
wrote:

> Let's say we have a Spark interpreter set up as
> " The interpreter will be instantiated *Globally *in *shared *process"
>
> When one user is using Spark interpreter,
> another users that are trying to use the same interpreter,
> getting PENDING until another user's code completes.
>
> Per Spark documentation, https://spark.apache.org/docs/
> latest/job-scheduling.html
>
> " *within* each Spark application, multiple “jobs” (Spark actions) may be
>> running concurrently if they were submitted by different threads
>> ... /skip/
>> threads. By “job”, in this section, we mean a Spark action (e.g. save,
>> collect) and any tasks that need to run to evaluate that action. Spark’s
>> scheduler is fully thread-safe and supports this use case to enable
>> applications that serve multiple requests (e.g. queries for multiple users).
>> ... /skip/
>> Without any intervention, newly submitted jobs go into a *default pool*,
>> but jobs’ pools can be set by adding the *spark.scheduler.pool* “local
>> property” to the SparkContext in the thread that’s submitting them."
>
>
> So Spark allows multiple users to use the same shared spark context..
>
> Two quick questions:
> 1. Why concurrent users are getting PENDING in Zeppelin?
> 2. Does Zeppelin set *spark.scheduler.pool* accordingly as described
> above?
>
> PS.
> We have set following Spark interpreter settings:
> - zeppelin.spark.concurrentSQL= true
> - spark.scheduler.mode = FAIR
>
>
> Thank you,
> Ruslan Dautkhanov
>
>


multiple users sharing single Spark context

2018-03-14 Thread Ruslan Dautkhanov
Let's say we have a Spark interpreter set up as
" The interpreter will be instantiated *Globally *in *shared *process"

When one user is using the Spark interpreter,
other users that are trying to use the same interpreter
get PENDING until the first user's code completes.

Per Spark documentation,
https://spark.apache.org/docs/latest/job-scheduling.html

" *within* each Spark application, multiple “jobs” (Spark actions) may be
> running concurrently if they were submitted by different threads
> ... /skip/
> threads. By “job”, in this section, we mean a Spark action (e.g. save,
> collect) and any tasks that need to run to evaluate that action. Spark’s
> scheduler is fully thread-safe and supports this use case to enable
> applications that serve multiple requests (e.g. queries for multiple users).
> ... /skip/
> Without any intervention, newly submitted jobs go into a *default pool*,
> but jobs’ pools can be set by adding the *spark.scheduler.pool* “local
> property” to the SparkContext in the thread that’s submitting them."


So Spark allows multiple users to use the same shared spark context..

Two quick questions:
1. Why are concurrent users getting PENDING in Zeppelin?
2. Does Zeppelin set *spark.scheduler.pool* accordingly as described above?

PS.
We have set following Spark interpreter settings:
- zeppelin.spark.concurrentSQL= true
- spark.scheduler.mode = FAIR


Thank you,
Ruslan Dautkhanov


Re: Zeppelin - Spark Driver location

2018-03-13 Thread Ruslan Dautkhanov
https://github.com/apache/zeppelin/pull/2577 mentions yarn-cluster in
its title, so I assume it's yarn-cluster only.
I've never used standalone-cluster myself.

Which distro of Hadoop do you use?
Cloudera deprecated Spark standalone mode in CDH 5.5 and will remove it in CDH 6.
https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
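
If you do end up on YARN, the yarn-cluster setup sketched in that write-up boils
down to interpreter properties roughly like these (illustrative values, not a
verified config):

master                     yarn
spark.submit.deployMode    cluster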



-- 
Ruslan Dautkhanov

On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz <
jhonderson2...@gmail.com> wrote:

> Does this new feature work only for yarn-cluster? Or for spark
> standalone too?
>
> On Tue, Mar 13, 2018 at 18:34, Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> > Zeppelin version: 0.8.0 (merged at September 2017 version)
>>
>> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged end of
>> September so not sure if you have that.
>>
>> Check out https://medium.com/@zjffdu/zeppelin-0-8-0-new-
>> features-ea53e8810235 how to set this up.
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
>> jhonderson2...@gmail.com> wrote:
>>
>>> Hi zeppelin users !
>>>
>>> I am working with zeppelin pointing to a standalone Spark cluster. I am
>>> trying to figure out a way to make zeppelin run the spark driver outside
>>> of the client process that submits the application.
>>>
>>> According with the documentation (http://spark.apache.org/docs/
>>> 2.1.1/spark-standalone.html):
>>>
>>> *For standalone clusters, Spark currently supports two deploy modes.
>>> In client mode, the driver is launched in the same process as the client
>>> that submits the application. In cluster mode, however, the driver is
>>> launched from one of the Worker processes inside the cluster, and the
>>> client process exits as soon as it fulfills its responsibility of
>>> submitting the application without waiting for the application to finish.*
>>>
>>> The problem is that, even when I set the properties for the spark-standalone
>>> cluster and the deploy mode to cluster, the driver still runs inside the zeppelin
>>> machine (according to the spark UI/executors page). These are the properties that
>>> I am setting for the spark interpreter:
>>>
>>> master: spark://:7077
>>> spark.submit.deployMode: cluster
>>> spark.executor.memory: 16g
>>>
>>> Any ideas would be appreciated.
>>>
>>> Thank you
>>>
>>> Details:
>>> Spark version: 2.1.1
>>> Zeppelin version: 0.8.0 (merged at September 2017 version)
>>>
>>
>>


Re: [DISCUSS] Roadmap 0.9 and future

2018-03-13 Thread Ruslan Dautkhanov
Thanks for sharing this moon !

Those are great ideas.



-- 
Ruslan Dautkhanov

On Wed, Mar 7, 2018 at 11:21 AM, moon soo Lee <m...@apache.org> wrote:

> Hi forks,
>
> There was an offline meeting yesterday in Palo Alto with contributors and
> users. We shared ideas about the current state of the project and the future
> project roadmap and wishlists (meeting note [1]). It was a really inspiring and
> exciting time. Let me try to summarize and move this discussion online.
>
> There were many ideas related to interpreters. In particular, there was
> consensus that Spark support is one of the biggest strengths of Zeppelin and
> that we need to make further improvements to keep that strength.
>
>- Spark
>- Immediate support of new spark release
>   - Ramp up support of current Spark feature (e.g. Display job
>   progress correctly)
>   - Spark streaming support
>   - Handling Livy timeout
>   - Other interpreters
>- Better Hive support (e.g. configuration)
>   - Latest version PrestoDB support (pass property correctly)
>   - Run interpreter in containerized environment
>- Let individual user upload custom library from user's machine
>directly
>- Interpreter documentation is not detail enough
>
> And people in the meeting were excited about ConfInterpreter (ZEPPELIN-3085 [2])
> in the upcoming release, regarding dynamic/inline configuration of interpreters.
>
> And there were ideas on other areas, too. like
>
>- Separate Admin role and user role
>- Sidebar with plugin widget
>- Better integration with emerging framework like Tensorflow/MXNet/Ray
>- Sharing data
>- Schedule notebook from external scheduler
>
> Regarding scheduling notebook, Luciano shared his project NotebookTools[3]
> and it made people really excited.
>
> Also, there were inspiring discussions about the community/project:
> its current status and how we can make the community/project healthier. And
> here are some ideas around the topic
>
>- Need more frequent release
>- More attention to code review to speed up
>- Publishing roadmap beforehand to help contribution
>- 'Newbie', 'low hanging fruit' tag helps contribution
>- Enterprise friendliness is another big strength of Zeppelin (in
>addition to Spark support); we need to keep improving it.
>
>
> I probably missed many ideas shared yesterday. Please feel free to
> add to/correct the summary. I hope more people on the mailing list join and
> develop the ideas together. And I think this discussion can lead the community to
> shape 0.9 and future versions of Zeppelin, and to update and publish the future
> roadmap [4].
>
> Best,
> moon
>
> Special thanks to ZEPL <https://www.zepl.com> for the swag and dinner.
>
> [1] https://docs.google.com/document/d/18Wc3pEFx3qm9XoME_
> V_B9k_LlAd1PLyKQQEveR1on1o/edit?usp=sharing
> [2] https://issues.apache.org/jira/browse/ZEPPELIN-3085
> [3] https://github.com/SparkTC/notebook-exporter/
> tree/master/notebook-exporter
> [4] https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Roadmap
>
>
>
>


Re: Zeppelin - Spark Driver location

2018-03-13 Thread Ruslan Dautkhanov
 > Zeppelin version: 0.8.0 (merged at September 2017 version)

https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged end of
September so not sure if you have that.

Check out
https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235 how to
set this up.



-- 
Ruslan Dautkhanov

On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
jhonderson2...@gmail.com> wrote:

> Hi zeppelin users !
>
> I am working with zeppelin pointing to a standalone Spark cluster. I am trying
> to figure out a way to make zeppelin run the spark driver outside of the
> client process that submits the application.
>
> According with the documentation (http://spark.apache.org/docs/
> 2.1.1/spark-standalone.html):
>
> *For standalone clusters, Spark currently supports two deploy modes.
> In client mode, the driver is launched in the same process as the client
> that submits the application. In cluster mode, however, the driver is
> launched from one of the Worker processes inside the cluster, and the
> client process exits as soon as it fulfills its responsibility of
> submitting the application without waiting for the application to finish.*
>
> The problem is that, even when I set the properties for the spark-standalone
> cluster and the deploy mode to cluster, the driver still runs inside the zeppelin
> machine (according to the spark UI/executors page). These are the properties that
> I am setting for the spark interpreter:
>
> master: spark://:7077
> spark.submit.deployMode: cluster
> spark.executor.memory: 16g
>
> Any ideas would be appreciated.
>
> Thank you
>
> Details:
> Spark version: 2.1.1
> Zeppelin version: 0.8.0 (merged at September 2017 version)
>


Re: Highlight Zeppelin 0.8 New Features

2018-03-13 Thread Ruslan Dautkhanov
Thanks Jeff!

That's great - our users were asking what are the highlights of the new
release.



-- 
Ruslan Dautkhanov

On Tue, Mar 13, 2018 at 10:07 AM, moon soo Lee <m...@apache.org> wrote:

> Looks great. I think online registry (helium) for visualization and spell
> is another important feature.
>
> Thanks,
> moon
>
> On Tue, Mar 13, 2018 at 12:41 AM Jeff Zhang <zjf...@gmail.com> wrote:
>
>>
>> I planned to publish this article after the 0.8 release, but I think it would
>> be helpful for users to experience and verify these features before the 0.8
>> release, so I sent it out early. I would really appreciate it if users could
>> try these features via branch-0.8. This is not a full list of 0.8's new
>> features; feel free to add any important features I missed.
>>
>> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235
>>
>>
>>


Re: Zeppelin use survey

2018-02-28 Thread Ruslan Dautkhanov
Thank you Maxim and Moon.

It was interesting to see that most users are using official releases and
not builds from master,
and to see some other insights too.






-- 
Ruslan Dautkhanov

On Wed, Feb 28, 2018 at 10:46 AM, moon soo Lee <m...@apache.org> wrote:

> Thanks for having survey and sharing the result.
>
> I made a notebook with the result and published to online viewer, for
> someone who want to download/import the notebook and play around.
> https://www.zepl.com/viewer/notebooks/bm90ZTovL21vb24vZTU0NWExYmFhZT
> U1NGQ0MDhlNDM0MjExMmM2YzRlMmQvbm90ZS5qc29u
>
> Thanks,
> moon
>
> On Mon, Feb 26, 2018 at 1:44 AM Belousov Maksim Eduardovich <
> m.belou...@tinkoff.ru> wrote:
>
>> Hello again!
>>
>> I come back with the results.
>>
>>
>>
>> You can find the source data in  https://gist.github.com/mebelousov/
>> d672b71f70fff75c9851a9f3a6e5a2be
>>
>>
>>
Unexpectedly, there are only 25 active readers.
>>
>>
>>
Diagrams were built in Zeppelin:
>>
>> [image: image005.jpg]
>>
>>
>>
>>
>>
>> Diagrams from Google:
>>
>>
>>
>> [image: image003.png]
>>
>>
>>
>> [image: image004.png]
>>
>>
>>
>> Thanks a lot fo all.
>>
>>
>>
>>
>>
>> Regards,
>>
>>
>>
>>
>> *Maksim Belousov *
>>
>>
>>
>> *From:* Jeff Zhang [mailto:zjf...@gmail.com]
>> *Sent:* Friday, February 16, 2018 11:43 AM
>>
>>
>> *To:* users@zeppelin.apache.org
>> *Subject:* Re: Zeppelin use survey
>>
>>
>>
>>
>>
>> Thanks Maksim,  It is super helpful for the zeppelin community.
>>
>>
>>
>>
>>
>>
>>
>> Belousov Maksim Eduardovich <m.belou...@tinkoff.ru> wrote on Fri, Feb 16, 2018
>> at 4:32 PM:
>>
>> Hello users!
>>
>>
>>
>> Apache Zeppelin has wide functionality. It would be good to know how
>> Zeppelin is used, most popular features and wishes.
>>
>>
>>
>> I prepared the survey with 11 questions [1]. Please fill it.
>>
>>
>>
>> After a while I will share source data and results.
>>
>> I plan to make the survey every year.
>>
>>
>>
>> 1. https://goo.gl/forms/cnypeaT0lhGfEMld2
>>
>>
>>
>>
>>
>> Regards,
>>
>>
>> *Maksim Belousov*
>>
>>
>>
>>


Fwd: [grpc/grpc] Unicode support in Python 2? (#14446)

2018-02-19 Thread Ruslan Dautkhanov
New IPython backend for Spark interpreter breaks Python 2 compatibility
because of grpc.

Basically, if you have a unicode character in a static string or even in a
comment, it'll break the Spark interpreter.

https://issues.apache.org/jira/browse/ZEPPELIN-3239

The update below on gRPC issue https://github.com/grpc/grpc/issues/14446 says
that in Python 2
we should explicitly encode data before sending it over to grpc.
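
As a rough Python 2 illustration of what that means (the message and stub names
below are hypothetical placeholders, not Zeppelin's actual generated gRPC classes):

# -*- coding: utf-8 -*-
code = u"print('héllo')"         # unicode object in Python 2
payload = code.encode("utf-8")   # now str (bytes), which Python 2 gRPC accepts
# request = ExecuteRequest(code=payload)   # hypothetical message type
# response = stub.execute(request)         # hypothetical stub call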


PS.
If unicode data is present, in the frontend it just looks as if the paragraph is
stuck in the "RUNNING" state with no ability
to cancel it. The only way to "unhang" the spark interpreter is to restart it.

It seems the spark interpreter --> grpc --> ipython backend path is currently
somewhat brittle, as any exception
stops the grpc stream [1].
Would it be possible to adjust the ipython logic to restart the grpc stream
for each paragraph run?
As explained in https://github.com/grpc/grpc-java/issues/4086

Filed https://issues.apache.org/jira/browse/ZEPPELIN-3247 to consider
implementing this.




[1]

INFO [2018-02-14 10:39:10,923] ({grpc-default-worker-ELG-1-2}
AbstractClientStream2.java[inboundDataReceived]:249) - Received data on
closed stream
INFO [2018-02-14 10:39:10,924] ({grpc-default-worker-ELG-1-2}
AbstractClientStream2.java[inboundDataReceived]:249) - Received data on
closed stream
INFO [2018-02-14 10:39:10,925] ({grpc-default-worker-ELG-1-2}
AbstractClientStream2.java[inboundDataReceived]:249) - Received data on
closed stream







-- Forwarded message --
From: kpayson64 <notificati...@github.com>
Date: Mon, Feb 19, 2018 at 2:47 PM
Subject: Re: [grpc/grpc] Unicode support in Python 2? (#14446)
To: grpc/grpc <g...@noreply.github.com>
Cc: Ruslan Dautkhanov <dautkha...@gmail.com>, Author <
aut...@noreply.github.com>


I can confirm that Python 2 gRPC doesn't accept unicode characters. The
Python gRPC API accepts the string type, which in Python 2 is equivalent to
the byte type. Applications should do any encoding if they are using
unicode characters.

—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<https://github.com/grpc/grpc/issues/14446#issuecomment-366810413>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AC37KkH9LbYbClw5ZZ-T48i4GX54m15Rks5tWev7gaJpZM4SH7mi>
.


z.show() compatibility and new UI grid

2018-02-14 Thread Ruslan Dautkhanov
We've noticed two major issues with z.show() after upgrading Zeppelin

1)
z.show(df) used to work directly on a Spark DataFrame object;
now it produces TypeError: object of type 'DataFrame' has no len().
Full exception stack in [1].

We tried disabling ipython and it seems to be a workaround.
Is there a way to keep compatibility with the previous Zeppelin release
on z.show()
without disabling ipython altogether?

https://issues.apache.org/jira/browse/ZEPPELIN-3234
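
One rough workaround sketch while keeping ipython enabled (untested; the 1000-row
cap is arbitrary): convert the Spark DataFrame to pandas before handing it to
z.show(), since the failing code path clearly expects an object that supports len().

pdf = spark.sql('select * from disc_mrt.unified_fact').limit(1000).toPandas()
z.show(pdf)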


2)
The new UI grid displays just an empty box when output is cut with a
message like

> Output is truncated to 102400 bytes. Learn more about
> ZEPPELIN_INTERPRETER_OUTPUT_LIMIT

It doesn't happen every time; I think it depends on how the interpreter has cut
the table?

https://issues.apache.org/jira/browse/ZEPPELIN-3235


More minor: the new data grid visualization seems much slower on wider
datasets.
Not sure if there is a way to fall back to the older table data grid UI option?



[1]

> TypeError Traceback (most recent call last)
>  in ()
> ----> 1 z.show(spark.sql('select * from disc_mrt.unified_fact'))
>
>  in show(self, p, **kwargs)
>      73             # `isinstance(p, DataFrame)` would req `import pandas.core.frame.DataFrame`
>      74             # and so a dependency on pandas
> ---> 75             self.show_dataframe(p, **kwargs)
>      76         elif hasattr(p, '__call__'):
>      77             p()  # error reporting
>
>  in show_dataframe(self, df, show_index, **kwargs)
>      80         """Pretty prints DF using Table Display System
>      81         """
> ---> 82         limit = len(df) > self.max_result
>      83         header_buf = StringIO("")
>      84         if show_index:
>
> TypeError: object of type 'DataFrame' has no len()
>

-- 
Ruslan Dautkhanov


ipython/grpc issues

2018-02-14 Thread Ruslan Dautkhanov
I've seen several cases where the new ipython interpreter can't be stopped
using the "Cancel"
button.

The interpreter logs show the following errors.

The paragraph then stops accepting Cancel commands and shows its status as
"running" when it's actually not.

Python 2.7.13.
Zeppelin from a few days old master snapshot.

$ pip freeze | egrep "ipython|grpc|jupyter"
grpcio==1.9.1
ipython==5.1.0
ipython-genutils==0.2.0
jupyter==1.0.0
jupyter-client==5.2.2
jupyter-console==5.0.0
jupyter-core==4.4.0



[1]

>
> ERROR [2018-02-14 10:39:10,922] ({grpc-default-executor-3}
> IPythonClient.java[onError]:138) - Fail to call IPython grpc
> io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED:
> io.grpc.netty.NettyClientTransport$3: Frame size 216695976 exceeds maximum:
> 4194304.
> at io.grpc.Status.asRuntimeException(Status.java:543)
> at
> io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:395)
> at
> io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:426)
> at
> io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76)
> at
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:512)
> at
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:429)
> at
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:544)
> at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52)
> at
> io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:117)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)


and

>  INFO [2018-02-14 10:39:10,923] ({grpc-default-worker-ELG-1-2}
> AbstractClientStream2.java[inboundDataReceived]:249) - Received data on
> closed stream
>  INFO [2018-02-14 10:39:10,924] ({grpc-default-worker-ELG-1-2} 
> AbstractClientStream2.java[inboundDataReceived]:249)
> - Received data on closed stream
>  INFO [2018-02-14 10:39:10,925] ({grpc-default-worker-ELG-1-2}
> AbstractClientStream2.java[inboundDataReceived]:249) - Received data on
> closed stream
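
For context on the RESOURCE_EXHAUSTED error above: 4194304 bytes is gRPC's default
maximum inbound message size, so any reply larger than 4 MB is rejected unless the
receiving side raises the limit. Zeppelin's IPython client is Java, but the same
knob in Python gRPC looks roughly like this (an illustrative sketch, not Zeppelin
code; the address and the 256 MB cap are placeholders):

import grpc

channel = grpc.insecure_channel(
    "localhost:50051",
    options=[("grpc.max_receive_message_length", 256 * 1024 * 1024)],
)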







-- 
Ruslan Dautkhanov


TimeoutLifecycleManager

2018-02-12 Thread Ruslan Dautkhanov
Trying out the settings below for the new TimeoutLifecycleManager:


> zeppelin.interpreter.lifecyclemanager.class
> org.apache.zeppelin.interpreter.lifecycle.TimeoutLifecycleManager
> zeppelin.interpreter.lifecyclemanager.timeout.checkinterval
> 6
> zeppelin.interpreter.lifecyclemanager.timeout.threshold
> 60
>

My understanding is that this should have killed my Spark interpreter after 10 minutes
of idleness (given  zeppelin.interpreter.lifecyclemanager.timeout.threshold
= 60 ms above),
but for some reason it keeps running. Am I missing something?

There are no messages about attempts to time out the interpreter in the logs, even at
DEBUG level.


Thanks,
Ruslan Dautkhanov


Re: zeppelin build fails with DependencyConvergence error

2018-01-11 Thread Ruslan Dautkhanov
Thank you Jeff


-- 
Ruslan Dautkhanov

On Thu, Jan 11, 2018 at 1:57 AM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> ZEPPELIN-3119 will fix this. Will update this thread once it is done
>
>
>
>
> Ruslan Dautkhanov <dautkha...@gmail.com> wrote on Fri, Dec 29, 2017 at 6:04 AM:
>
>> The build failure messages all point to the zeppelin-zengine module in
>> the source code.  In this commit: https://github.com/apache/
>> zeppelin/commit/30bfcae0c0c9650aff3ed1f8fe41eee9c4e93cb1#diff-
>> 98784f3ef76c2907324fa9e48e66cf47 , a dependency change was made to add
>> the org.apache.hadoop:hadoop-client which points to both
>> org.apache.hadoop:hadoop-common and org.apache.hadoop:hadoop-hdfs.
>> These two have dependencies upon different versions (at least for the
>> Cloudera version of them)  of the org.codehaus.jackson:jackson-mapper-asl
>> library, 1.8.8 and 1.9.13 respectively.
>>
>> Was anyone able to build zeppelin with cloudera repo after
>> ZEPPELIN-1515. Notebook: HDFS as a backend storage (Use hadoop client
>> jar) PR #2455
>> was committed ?
>>
>>
>> On Mon, Dec 18, 2017 at 4:20 PM, Ruslan Dautkhanov <dautkha...@gmail.com>
>> wrote:
>>
>>> We're now looking at shading option.
>>>
>>> Talking to Cloudera Support even minor upgrades to jackson known have
>>> caused issues in the past.
>>> They also said they're planning to upgrade CDH6 to jackson 2.*7*.8 -
>>> but this will be released mid-next year.
>>> So we're not waiting for that to happen.
>>>
>>> Yes, we will contribute back to the project when we find solution.
>>> Thanks for the suggestion Felix. Is this known if Zeppelin can work fine
>>> with jasckson 2.*2*.3?
>>> (certain dependencies currently list jackson 2.*5*.3)
>>>
>>>
>>>
>>> --
>>> Ruslan Dautkhanov
>>>
>>> On Sat, Dec 16, 2017 at 3:03 AM, Felix Cheung <felixcheun...@hotmail.com
>>> > wrote:
>>>
>>>> Instead of exclusion, would it be better to use the version in the
>>>> cloudera repo?
>>>>
>>>> Please do consider contributing these changes back to Zeppelin source.
>>>> Thanks!
>>>>
>>>> _
>>>> From: Ruslan Dautkhanov <dautkha...@gmail.com>
>>>> Sent: Monday, December 11, 2017 3:42 PM
>>>> Subject: Re: zeppelin build fails with DependencyConvergence error
>>>> To: Zeppelin Users <us...@zeppelin.incubator.apache.org>
>>>>
>>>>
>>>>
>>>> Looks like master branch of Zeppelin still has compatibility issue with
>>>> Cloudera dependencies.
>>>>
>>>> When built using
>>>>
>>>> mvn clean package -DskipTests -Pspark-2.2 -Dhadoop.version=2.6.0-cdh5.12.1
>>>> -Phadoop-2.6 -Pvendor-repo -pl '!...list of excluded packages' -e
>>>>
>>>> maven fails on jackson convergence error - see below email for more
>>>> details.
>>>> Looks like there was a change in Zeppelin that upgraded Jackson's
>>>> version?
>>>> So now it conflicts with older jackson library as referenced by
>>>> cloudera repo.
>>>>
>>>> workaround: Zeppelin builds fine with pom change [1] - the question is
>>>> now
>>>> would somebody expect Zeppelin would still be functioning correctly
>>>> with these exclusions?
>>>>
>>>>
>>>>
>>>> [1]
>>>>
>>>>> --- a/zeppelin-zengine/pom.xml
>>>>> +++ b/zeppelin-zengine/pom.xml
>>>>> @@ -364,6 +364,30 @@
>>>>>            <groupId>com.google.guava</groupId>
>>>>>            <artifactId>guava</artifactId>
>>>>>          </exclusion>
>>>>> +        <exclusion>
>>>>> +          <groupId>com.fasterxml.jackson.core</groupId>
>>>>> +          <artifactId>jackson-core</artifactId>
>>>>> +        </exclusion>
>>>>> +        <exclusion>
>>>>> +          <groupId>com.fasterxml.jackson.core</groupId>
>>>>> +          <artifactId>jackson-annotations</artifactId>
>>>>> +        </exclusion>
>>>>> +        <exclusion>
>>>>> +          <groupId>com.fasterxml.jackson.core</groupId>
>>>>> +          <artifactId>jackson-databind</artifactId>
>>>>> +        </exclusion>
>>>>> +        <exclusion>
>>>>> +          <groupId>org.codehaus.jackson</groupId>
>>>>> +          <artifactId>jackson-mapper-asl</artifactId>
>>>>> +        </exclusion>
>>>>> +        <exclusion>
>>>>> +          <groupId>org.codehaus.jackson</groupId>
>>>>> +          <artifactId>jackson-core-asl</artifactId>
>>>>>

Re: zeppelin build fails with DependencyConvergence error

2017-12-18 Thread Ruslan Dautkhanov
We're now looking at the shading option.

According to Cloudera Support, even minor upgrades to jackson have been known to
cause issues in the past.
They also said they're planning to upgrade CDH6 to jackson 2.*7*.8 - but
this will be released mid-next year.
So we're not waiting for that to happen.

Yes, we will contribute back to the project when we find a solution.
Thanks for the suggestion, Felix. Is it known whether Zeppelin can work fine
with jackson 2.*2*.3?
(certain dependencies currently list jackson 2.*5*.3)
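
For completeness, the usual Maven way to satisfy the enforcer's convergence rule,
instead of per-dependency exclusions or shading, is to pin a single version in
dependencyManagement. A hypothetical sketch only (the version shown is a
placeholder, not a recommendation):

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.2.3</version>  <!-- placeholder: whichever version your CDH stack expects -->
    </dependency>
  </dependencies>
</dependencyManagement>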



-- 
Ruslan Dautkhanov

On Sat, Dec 16, 2017 at 3:03 AM, Felix Cheung <felixcheun...@hotmail.com>
wrote:

> Instead of exclusion, would it be better to use the version in the
> cloudera repo?
>
> Please do consider contributing these changes back to Zeppelin source.
> Thanks!
>
> _________
> From: Ruslan Dautkhanov <dautkha...@gmail.com>
> Sent: Monday, December 11, 2017 3:42 PM
> Subject: Re: zeppelin build fails with DependencyConvergence error
> To: Zeppelin Users <us...@zeppelin.incubator.apache.org>
>
>
>
> Looks like master branch of Zeppelin still has compatibility issue with
> Cloudera dependencies.
>
> When built using
>
> mvn clean package -DskipTests -Pspark-2.2 -Dhadoop.version=2.6.0-cdh5.12.1
> -Phadoop-2.6 -Pvendor-repo -pl '!...list of excluded packages' -e
>
> maven fails on jackson convergence error - see below email for more
> details.
> Looks like there was a change in Zeppelin that upgraded Jackson's version?
> So now it conflicts with older jackson library as referenced by cloudera
> repo.
>
> workaround: Zeppelin builds fine with pom change [1] - the question is now
> would somebody expect Zeppelin would still be functioning correctly with
> these exclusions?
>
>
>
> [1]
>
>> --- a/zeppelin-zengine/pom.xml
>> +++ b/zeppelin-zengine/pom.xml
>> @@ -364,6 +364,30 @@
>>            <groupId>com.google.guava</groupId>
>>            <artifactId>guava</artifactId>
>>          </exclusion>
>> +        <exclusion>
>> +          <groupId>com.fasterxml.jackson.core</groupId>
>> +          <artifactId>jackson-core</artifactId>
>> +        </exclusion>
>> +        <exclusion>
>> +          <groupId>com.fasterxml.jackson.core</groupId>
>> +          <artifactId>jackson-annotations</artifactId>
>> +        </exclusion>
>> +        <exclusion>
>> +          <groupId>com.fasterxml.jackson.core</groupId>
>> +          <artifactId>jackson-databind</artifactId>
>> +        </exclusion>
>> +        <exclusion>
>> +          <groupId>org.codehaus.jackson</groupId>
>> +          <artifactId>jackson-mapper-asl</artifactId>
>> +        </exclusion>
>> +        <exclusion>
>> +          <groupId>org.codehaus.jackson</groupId>
>> +          <artifactId>jackson-core-asl</artifactId>
>> +        </exclusion>
>> +        <exclusion>
>> +          <groupId>org.apache.zookeeper</groupId>
>> +          <artifactId>zookeeper</artifactId>
>> +        </exclusion>
>>
>>        </exclusions>
>>
>
>
>
> On Sun, Aug 27, 2017 at 2:25 PM, Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> Building from a current Zeppelin snapshot fails with
>> zeppelin build fails with org.apache.maven.plugins.enfor
>> cer.DependencyConvergence
>> see details below.
>>
>> Build command
>> /opt/maven/maven-latest/bin/mvn clean package -DskipTests -Pspark-2.2
>> -Dhadoop.version=2.6.0-cdh5.12.0 -Phadoop-2.6 -Pvendor-repo -Pscala-2.10
>> -Psparkr -pl '!*..excluded certain modules..*' -e
>>
>> maven 3.5.0
>>> jdk 1.8.0_141
>>> RHEL 7.3
>>> npm.x86_64   1:3.10.10-1.6.11.1.1.el7
>>> nodejs.x86_641:6.11.1-1.el7 @epel
>>> latest zeppelin snapshot
>>
>>
>> Any ideas? It's my first attempt to build on rhel7/jdk8 .. never seen
>> this problem before.
>>
>> Thanks,
>> Ruslan
>>
>>
>>
>> [INFO] Scanning for projects...
>> [WARNING]
>> [WARNING] Some problems were encountered while building the effective
>> model for org.apache.zeppelin:zeppelin-spark-dependencies_2.10:jar:0.8
>> .0-SNAPSHOT
>> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
>> found duplicate declaration of plugin 
>> com.googlecode.maven-download-plugin:download-maven-plugin
>> @ line 940, column 15
>> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
>> found duplicate declaration of plugin 
>> com.googlecode.maven-download-plugin:download-maven-plugin
>> @ line 997, column 15
>> [WARNING]
>> [WARNING] Some problems were encountered while building the effective
>> model for org.apache.zeppelin:zeppelin-spark_2.10:jar:0.8.0-SNAPSHOT
>> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
>> found duplicate declaration of plugin org.scala-tools:maven-scala-plugin
>> @ line 467, column 15
>> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
>> found dup

Re: zeppelin build fails with DependencyConvergence error

2017-12-11 Thread Ruslan Dautkhanov
Looks like the master branch of Zeppelin still has a compatibility issue with
Cloudera dependencies.

When built using

mvn clean package -DskipTests -Pspark-2.2 -Dhadoop.version=2.6.0-cdh5.12.1
-Phadoop-2.6 -Pvendor-repo -pl '!...list of excluded packages' -e

Maven fails on a Jackson dependency convergence error - see the email below for
more details.
It looks like a change in Zeppelin upgraded the Jackson version, so it now
conflicts with the older Jackson library referenced by the Cloudera repo.

Workaround: Zeppelin builds fine with the pom change [1] - the question now is
whether Zeppelin can still be expected to function correctly with these
exclusions.



[1]

--- a/zeppelin-zengine/pom.xml
> +++ b/zeppelin-zengine/pom.xml
> @@ -364,6 +364,30 @@
>          <groupId>com.google.guava</groupId>
>          <artifactId>guava</artifactId>
>        </exclusion>
> +      <exclusion>
> +        <groupId>com.fasterxml.jackson.core</groupId>
> +        <artifactId>jackson-core</artifactId>
> +      </exclusion>
> +      <exclusion>
> +        <groupId>com.fasterxml.jackson.core</groupId>
> +        <artifactId>jackson-annotations</artifactId>
> +      </exclusion>
> +      <exclusion>
> +        <groupId>com.fasterxml.jackson.core</groupId>
> +        <artifactId>jackson-databind</artifactId>
> +      </exclusion>
> +      <exclusion>
> +        <groupId>org.codehaus.jackson</groupId>
> +        <artifactId>jackson-mapper-asl</artifactId>
> +      </exclusion>
> +      <exclusion>
> +        <groupId>org.codehaus.jackson</groupId>
> +        <artifactId>jackson-core-asl</artifactId>
> +      </exclusion>
> +      <exclusion>
> +        <groupId>org.apache.zookeeper</groupId>
> +        <artifactId>zookeeper</artifactId>
> +      </exclusion>
>
>  
>



On Sun, Aug 27, 2017 at 2:25 PM, Ruslan Dautkhanov <dautkha...@gmail.com>
wrote:

> Building from a current Zeppelin snapshot fails with
> zeppelin build fails with org.apache.maven.plugins.
> enforcer.DependencyConvergence
> see details below.
>
> Build command
> /opt/maven/maven-latest/bin/mvn clean package -DskipTests -Pspark-2.2
> -Dhadoop.version=2.6.0-cdh5.12.0 -Phadoop-2.6 -Pvendor-repo -Pscala-2.10
> -Psparkr -pl '!*..excluded certain modules..*' -e
>
> maven 3.5.0
>> jdk 1.8.0_141
>> RHEL 7.3
>> npm.x86_64   1:3.10.10-1.6.11.1.1.el7
>> nodejs.x86_641:6.11.1-1.el7 @epel
>> latest zeppelin snapshot
>
>
> Any ideas? It's my first attempt to build on rhel7/jdk8 .. never seen this
> problem before.
>
> Thanks,
> Ruslan
>
>
>
> [INFO] Scanning for projects...
> [WARNING]
> [WARNING] Some problems were encountered while building the effective
> model for org.apache.zeppelin:zeppelin-spark-dependencies_2.10:jar:0.
> 8.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
> found duplicate declaration of plugin 
> com.googlecode.maven-download-plugin:download-maven-plugin
> @ line 940, column 15
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
> found duplicate declaration of plugin 
> com.googlecode.maven-download-plugin:download-maven-plugin
> @ line 997, column 15
> [WARNING]
> [WARNING] Some problems were encountered while building the effective
> model for org.apache.zeppelin:zeppelin-spark_2.10:jar:0.8.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
> found duplicate declaration of plugin org.scala-tools:maven-scala-plugin
> @ line 467, column 15
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
> found duplicate declaration of plugin 
> org.apache.maven.plugins:maven-surefire-plugin
> @ line 475, column 15
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
> found duplicate declaration of plugin 
> org.apache.maven.plugins:maven-compiler-plugin
> @ line 486, column 15
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
> found duplicate declaration of plugin org.scala-tools:maven-scala-plugin
> @ line 496, column 15
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
> found duplicate declaration of plugin 
> org.apache.maven.plugins:maven-surefire-plugin
> @ line 504, column 15
> [WARNING]
> [WARNING] It is highly recommended to fix these problems because they
> threaten the stability of your build.
> [WARNING]
> [WARNING] For this reason, future Maven versions might no longer support
> building such malformed projects.
> [WARNING]
> [WARNING] The project org.apache.zeppelin:zeppelin-web:war:0.8.0-SNAPSHOT
> uses prerequisites which is only intended for maven-plugin projects but not
> for non maven-plugin projects. For such purposes you should use the
> maven-enforcer-plugin. See https://maven.apache.org/
> enforcer/enforcer-rules/requireMavenVersion.html
>
>
> ... [skip]
>
> [INFO] 
> 
> [INFO] Building Zeppelin: Zengine 0.8.0-SNAPSHOT
> [INFO] 

Re: IPython is available, use IPython for PySparkInterpreter

2017-12-11 Thread Ruslan Dautkhanov
Makes sense

Thank you Jeff !

On Sun, Dec 10, 2017 at 11:24 PM Jeff Zhang <zjf...@gmail.com> wrote:

>
> I am afraid currently there's no way to make ipython as default of
> %pyspark, but you can use %ipyspark to use ipython without this warning
> message.
>
> But making ipython as default is on my plan, for now I try to keep
> backward compatibility as much as possible, so only use ipython when it is
> available, otherwise still use the old python interpreter implementation.
> I will change ipython as default and the original python implementation as
> fallback when ipython interpreter become much more mature.
>
>
>
>
> Ruslan Dautkhanov <dautkha...@gmail.com>于2017年12月11日周一 下午1:20写道:
>
>> Getting "IPython is available, use IPython for PySparkInterpreter"
>> warning after starting pyspark interpreter.
>>
>> How do I default %pyspark to ipython?
>>
>> Tried to change to
>> "class": "org.apache.zeppelin.spark.PySparkInterpreter",
>> to
>> "class": "org.apache.zeppelin.spark.IPySparkInterpreter",
>> in interpreter.json but this gets overwritten back to PySparkInterpreter.
>>
>> Also tried to change to zeppelin.pyspark.python to ipython with no luck
>> too.
>>
>> Is there is a documented way to default pyspark interpreter to ipython?
>> Glanced over PR-2474 but can't quickly get what I am missing.
>>
>>
>> Thanks.
>>
>>
>>


IPython is available, use IPython for PySparkInterpreter

2017-12-10 Thread Ruslan Dautkhanov
Getting "IPython is available, use IPython for PySparkInterpreter" warning
after starting pyspark interpreter.

How do I default %pyspark to ipython?

Tried to change
"class": "org.apache.zeppelin.spark.PySparkInterpreter",
to
"class": "org.apache.zeppelin.spark.IPySparkInterpreter",
in interpreter.json, but this gets overwritten back to PySparkInterpreter.

Also tried to change zeppelin.pyspark.python to ipython, with no luck either.

Is there a documented way to default the pyspark interpreter to ipython?
Glanced over PR-2474 but can't quickly see what I am missing.


Thanks.


Re: [DISCUSS] Change some default settings for avoiding unintended usages

2017-11-29 Thread Ruslan Dautkhanov
It would be nice if each user's interpreter were started in its own Docker
container, a la Cloudera Data Science Workbench.
Then each user's shell interpreter would be pretty well isolated.
Actually, from a CDSW session you can pop up a terminal session to your
container, which I found pretty neat.



-- 
Ruslan Dautkhanov

On Wed, Nov 29, 2017 at 5:00 PM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> Shell interpreter is a black hole for security, usually we don't recommend
> or allow user to use shell.
>
> We may need to refactor the shell interpreter, running under zeppelin user
> is too dangerous.
>
>
>
>
>
> Jongyoul Lee <jongy...@gmail.com>于2017年11月29日周三 下午11:44写道:
>
>> Hi, users and dev,
>>
>> Recently, I've got an issue about the abnormal usage of some interpreters.
>> Zeppelin's users can access shell by shell and python interpreters. It
>> means all users can run or execute what they want even if it harms the
>> system. Thus I agree that we need to change some default settings to
>> prevent this kind of abusing situation. Before we proceed to do it, I want
>> to listen to others' opinions.
>>
>> Feel free to reply this email
>>
>> Regards,
>> Jongyoul
>>
>> --
>> 이종열, Jongyoul Lee, 李宗烈
>> http://madeng.net
>>
>


Re: I need data download button in the Reports mode

2017-11-09 Thread Ruslan Dautkhanov
Chrome can print to PDF. In the print dialog, change the Destination "printer" to "Save as PDF".



-- 
Ruslan Dautkhanov

On Thu, Nov 9, 2017 at 10:31 AM, shyla deshpande <deshpandesh...@gmail.com>
wrote:

> Hello all,
>
> I want the users to be able to download the data in report mode.  Is there
> a way, some kind of tricks? please help
>
> Thanks
>


Re: Re: Roadmap for 0.8.0

2017-10-23 Thread Ruslan Dautkhanov
Sorry for bringing up an older topic .. I agree "latest" / "stable" makes a
lot of sense.

Also, what was *not* discussed in this thread is a release cadence target.
IMHO, 2-3 releases a year would give a quicker turnaround for getting the latest
fixes and improvements out, and quicker feedback from the users.

Would be great to see the 0.8.0 release soon..
I see on GitHub there have been a lot of awesome additions/commits since the
last release.

Thoughts?



On Tue, Mar 21, 2017 at 10:57 AM, moon soo Lee  wrote:

> Thanks for the opinion. Yes we can think about proper label. These are
> labels mentioned in this threads so far.
>
> 'Unstable', 'Stable', 'Old'
> 'Latest', 'Stable', 'Old'
> 'Beta', 'Stable', 'Old'
>
> Intention is not that we want to release 'unstable' version, i think.
> The intention is give user proper expectation that latest release may(and
> may not) include bug which we couldn't discovered in verification process
> like what happened in our previous release 0.7.0 and 0.6.0.
>
> These are how other apache projects describe their releases.
>
> Kafka - x.x.z is the latest release. current stable version is x.y.z.
> Flink - x.y.z is our latest stable release
> Cassandra - even-numbered contains new features, odd-numbered contains bug
> fixes only
> Spark - available 2.1.0, 2.0.2, 2.0.1, 2.0.0  1.4.0 as a 'stable'
> release, others are available as 'archived' releases.
> Mesos - most recent stable release: x.y.z
> Hadoop - 'x.y.z-alpha' or 'x.y.z' or 'x.y.z (stable)'
> Hbase - 1.2.x series is current stable release. (while 1.3.x series does
> not have a label)
>
> As you can see, it's difficult to find common rule what 'latest' should
> mean in Apache projects.
>
> Considering the intention that we're not intentionally releasing
> 'unstable' version, i prefer 'latest / stable' tiny bit more than 'beta /
> stable'.
>
> I'd like hear more opinions.
>
> Thanks,
> moon
>
> On Tue, Mar 21, 2017 at 9:16 AM Jan Rasehorn  wrote:
>
>> Hi moon,
>>
>> I think assuming the latest release would be unstable is confusing and
>> not in line with other Apache projects. If you want to have a instable
>> prerelease version, I would suggest to call it a beta version and once the
>> major bugs are removed, a new stable release could be provided.
>>
>> BR, Jan
>> --
>> Diese Nachricht wurde von meinem Android Mobiltelefon mit GMX Mail
>> gesendet.
>> Am 21.03.17, 16:41, moon soo Lee  schrieb:
>>
>> And if i suggest simplest way for us to set quality expectation to user,
>> which will be labeling release in download page.
>>
>> Currently releases are divided into 2 categories in download page.
>> 'Latest release' and 'Old releases'. I think we can treat 'Latest' as
>> unstable and add one more category 'Stable release'.
>>
>> For example, once 0.8.0 is released,
>>
>> Latest release : 0.8.0
>> Stable release : 0.7.1
>> Old release : 0.6.2, 0.6.1 
>>
>> Once we feel confident about the stability of latest release, we can just
>> change label from latest to stable in the download page. (and previous
>> stable goes to old releases)
>> We can even include formal vote for moving release from 'latest' to
>> 'stable' in our release process, if it is necessary.
>>
>> Thanks,
>> moon
>>
>> On Tue, Mar 21, 2017 at 6:59 AM moon soo Lee  wrote:
>>
>> Yes, having longer RC period will help.
>>
>> But if i recall 0.7.0 release, although 21 people participated verifying
>> through 4 RC for 15days, it wasn't enough to catch all critical problems
>> during the release process. After the release, we've got much more number
>> of bug reports, in next few days.
>>
>> Basically, verifying RC is limited to people who subscribe mailing list +
>> willing to contribute time to verify RC, which is much smaller number of
>> people who download release from download page. So having longer RC period
>> will definitely help and i think we should do, but I think it's still not
>> enough to make sure the quality, considering past history.
>>
>> AFAIK, releasing 0.8.0-preview, calling it unstable is up to the project.
>> ASF release process defines how to release source code, but it does not
>> really restrict what kind of 'version' the project should have releases.
>> For example, spark released spark-2.0.0-preview[1] before spark-2.0.0.
>>
>> Thanks,
>> moon
>>
>> [1] http://spark.apache.org/news/spark-2.0.0-preview.html
>>
>>
>> On Mon, Mar 20, 2017 at 11:31 PM Jongyoul Lee  wrote:
>>
>> I agree that it will help prolong RC period and use it actually. And also
>> we need code freeze for the new features and spend time to stabilize RC.
>>
>> On Tue, Mar 21, 2017 at 1:25 PM, Felix Cheung 
>> wrote:
>>
>> +1 on quality and stabilization.
>>
>> I'm not sure if releasing as preview or calling it unstable fits with the
>> ASF release process though.
>>
>> Other projects have code freeze, RC (and longer RC iteration time) etc. -
>> do we think those 

Re: [ANNOUNCE] Apache Zeppelin 0.7.3 released

2017-09-22 Thread Ruslan Dautkhanov
That's awesome. Congrats everyone!

Hope to see 0.8.0 release soon too - it has nice new features we would love
to see.



-- 
Ruslan Dautkhanov

On Fri, Sep 22, 2017 at 1:36 AM, Mina Lee <mina...@apache.org> wrote:

> The Apache Zeppelin community is pleased to announce the availability of
> the 0.7.3 release.
>
> Zeppelin is a collaborative data analytics and visualization tool for
> distributed, general-purpose data processing system such as Apache Spark,
> Apache Flink, etc.
>
> The community put significant effort into improving Apache Zeppelin since
> the last release. 20 contributors provided 30+ patches
> for improvements and bug fixes. More than 20+ issues have been resolved.
>
> We encourage you to download the latest release from
> http://zeppelin.apache.org/download.html
>
> Release note is available at
> http://zeppelin.apache.org/releases/zeppelin-release-0.7.3.html
>
> We welcome your help and feedback. For more information on the project and
> how to get involved, visit our website at http://zeppelin.apache.org/
>
> Thank you all users and contributors who have helped to improve Apache
> Zeppelin.
>
> Regards,
> The Apache Zeppelin community
>


Re: Cloudera Data Science Workbench and Zeppelin

2017-09-11 Thread Ruslan Dautkhanov
I like the idea of using containers / kubernetes to run users' own
instances of data science workbench.

Although with YARN adding Docker container support in Hadoop 2.8, we could have
similar functionality without k8s:
https://issues.apache.org/jira/browse/YARN-3611 ?
Assuming we will see https://issues.apache.org/jira/browse/ZEPPELIN-2040
completed ..

Thanks




-- 
Ruslan Dautkhanov

On Sun, Sep 10, 2017 at 9:13 PM, Yeshwanth Jagini <y...@yotabitesllc.com>
wrote:

> Cloudera Data Science workbench is totally a different product. Cloudera
> acquired it from https://sense.io/
>
>
>
> On Sun, Sep 10, 2017 at 2:13 AM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> Hi,
>>
>> As I understood Cloudera Data Science Workbench was based on Zeppelin.
>> Zeppelin as open source was supported by Hortonworks and Cloudera took
>> Zeppelin and created the Workbench.
>>
>> As anyone within Zeppelin community come across this Workbench and how do
>> you rate it?
>>
>> Thanks
>>
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn * 
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>
>
>
> --
> Thanks,
> Yeshwanth Jagini
>


Re: note.json or interpreter.json format change?

2017-08-28 Thread Ruslan Dautkhanov
This problem only happens when upgrading Zeppelin (yes, built from yesterday's
snapshot) and then switching back
to the May-old Zeppelin.
The date format has changed in all date fields
- "dateCreated", "dateStarted", "dateFinished".
The impact is limited to notebooks modified in the newer version of Zeppelin
- they will not show up in the list of available notebooks.



-- 
Ruslan Dautkhanov

On Mon, Aug 28, 2017 at 6:07 PM, Jianfeng (Jeff) Zhang <
jzh...@hortonworks.com> wrote:

>
> Do you use the latest zeppelin master branch ? I see this issue before,
> but believe it has been fixed.
>
>
> Best Regard,
> Jeff Zhang
>
>
> From: Ruslan Dautkhanov <dautkha...@gmail.com>
> Reply-To: "users@zeppelin.apache.org" <users@zeppelin.apache.org>
> Date: Tuesday, August 29, 2017 at 1:50 AM
> To: Zeppelin Users <us...@zeppelin.incubator.apache.org>
> Subject: Re: note.json or interpreter.json format change?
>
> There is a date format change
>
> current zeppelin snapshot note.json date format example
>   "dateUpdated": "2017-08-27 19:56:22.229",
>
> ~May-1st zeppelin note.json date format example
>   "dateUpdated": "May 5, 2017 2:55:34 PM",
>
> This breaks note.json compatibility.
> Can somebody please point me to PR / jria for this change?
> Any workarounds that would make an upgrade easier?
> Also, this change makes reverting zeppelin upgrades impossible.
>
>
>
>
> --
> Ruslan Dautkhanov
>
> On Mon, Aug 28, 2017 at 11:35 AM, Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> I guess exception in the log might be why- see [1].
>> Any fixes/ workarounds for this issue?
>>
>>
>> [1]
>>
>> ERROR [2017-08-28 11:25:52,628] ({main} VFSNotebookRepo.java[list]:151)
>> - Can't read note file:///home/rdautkha/zeppelin/notebooks/2CJW01020
>> com.google.gson.JsonSyntaxException: 2017-08-27 19:56:22.229
>> at com.google.gson.internal.bind.DateTypeAdapter.deserializeToD
>> ate(DateTypeAdapter.java:81)
>> at com.google.gson.internal.bind.DateTypeAdapter.read(DateTypeA
>> dapter.java:66)
>> at com.google.gson.internal.bind.DateTypeAdapter.read(DateTypeA
>> dapter.java:41)
>> at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1
>> .read(ReflectiveTypeAdapterFactory.java:93)
>> at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$A
>> dapter.read(ReflectiveTypeAdapterFactory.java:172)
>> at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.
>> read(TypeAdapterRuntimeTypeWrapper.java:40)
>> at com.google.gson.internal.bind.CollectionTypeAdapterFactory$A
>> dapter.read(CollectionTypeAdapterFactory.java:81)
>> at com.google.gson.internal.bind.CollectionTypeAdapterFactory$A
>> dapter.read(CollectionTypeAdapterFactory.java:60)
>> at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1
>> .read(ReflectiveTypeAdapterFactory.java:93)
>> at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$A
>> dapter.read(ReflectiveTypeAdapterFactory.java:172)
>> at com.google.gson.Gson.fromJson(Gson.java:791)
>> at com.google.gson.Gson.fromJson(Gson.java:757)
>> at com.google.gson.Gson.fromJson(Gson.java:706)
>> at com.google.gson.Gson.fromJson(Gson.java:678)
>> at org.apache.zeppelin.notebook.Note.fromJson(Note.java:898)
>> at org.apache.zeppelin.notebook.repo.VFSNotebookRepo.getNote(VF
>> SNotebookRepo.java:178)
>> at org.apache.zeppelin.notebook.repo.VFSNotebookRepo.getNoteInf
>> o(VFSNotebookRepo.java:201)
>> at org.apache.zeppelin.notebook.repo.VFSNotebookRepo.list(VFSNo
>> tebookRepo.java:146)
>> at org.apache.zeppelin.notebook.repo.NotebookRepoSync.list(Note
>> bookRepoSync.java:158)
>> at org.apache.zeppelin.notebook.Notebook.loadAllNotes(Notebook.
>> java:553)
>> at org.apache.zeppelin.notebook.Notebook.(Notebook.java:1
>> 24)
>> at org.apache.zeppelin.server.ZeppelinServer.(ZeppelinSer
>> ver.java:158)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>> Method)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance(Native
>> ConstructorAccessorImpl.java:57)
>> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(De
>> legatingConstructorAccessorImpl.java:45)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:5
>> 26)
>> at org.apache.cxf.jaxrs.serv

Re: note.json or interpreter.json format change?

2017-08-28 Thread Ruslan Dautkhanov
There is a date format change

current zeppelin snapshot note.json date format example
  "dateUpdated": "2017-08-27 19:56:22.229",

~May-1st zeppelin note.json date format example
  "dateUpdated": "May 5, 2017 2:55:34 PM",

This breaks note.json compatibility.
Can somebody please point me to the PR / JIRA for this change?
Are there any workarounds that would make an upgrade easier?
Also, this change makes reverting a Zeppelin upgrade impossible.
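For what it's worth, a minimal downgrade sketch (my own assumption, not an
official tool): rewrite the paragraph date fields in each note.json back to the
old format before starting the older Zeppelin. This assumes the old Gson parser
accepts zero-padded day/hour values and that the field names match the examples
above - back up the notebook directory first; the notebook path below is a
placeholder.

#!/usr/bin/env python
# Hypothetical helper: convert new-style note.json dates back to the old format.
import glob
import json
import datetime
from collections import OrderedDict

NEW_FMT = "%Y-%m-%d %H:%M:%S.%f"    # e.g. "2017-08-27 19:56:22.229"
OLD_FMT = "%b %d, %Y %I:%M:%S %p"   # e.g. "May 05, 2017 02:55:34 PM"

def convert(value):
    try:
        return datetime.datetime.strptime(value, NEW_FMT).strftime(OLD_FMT)
    except (ValueError, TypeError):
        return value                # already in the old format, or empty

for path in glob.glob("/path/to/zeppelin/notebook/*/note.json"):
    with open(path) as f:
        note = json.load(f, object_pairs_hook=OrderedDict)
    for paragraph in note.get("paragraphs", []):
        for key in ("dateCreated", "dateStarted", "dateFinished", "dateUpdated"):
            if paragraph.get(key):
                paragraph[key] = convert(paragraph[key])
    with open(path, "w") as f:
        json.dump(note, f, indent=2)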




-- 
Ruslan Dautkhanov

On Mon, Aug 28, 2017 at 11:35 AM, Ruslan Dautkhanov <dautkha...@gmail.com>
wrote:

> I guess exception in the log might be why- see [1].
> Any fixes/ workarounds for this issue?
>
>
> [1]
>
> ERROR [2017-08-28 11:25:52,628] ({main} VFSNotebookRepo.java[list]:151) -
> Can't read note file:///home/rdautkha/zeppelin/notebooks/2CJW01020
> com.google.gson.JsonSyntaxException: 2017-08-27 19:56:22.229
> at com.google.gson.internal.bind.DateTypeAdapter.
> deserializeToDate(DateTypeAdapter.java:81)
> at com.google.gson.internal.bind.DateTypeAdapter.read(
> DateTypeAdapter.java:66)
> at com.google.gson.internal.bind.DateTypeAdapter.read(
> DateTypeAdapter.java:41)
> at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$
> 1.read(ReflectiveTypeAdapterFactory.java:93)
> at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$
> Adapter.read(ReflectiveTypeAdapterFactory.java:172)
> at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.
> read(TypeAdapterRuntimeTypeWrapper.java:40)
> at com.google.gson.internal.bind.CollectionTypeAdapterFactory$
> Adapter.read(CollectionTypeAdapterFactory.java:81)
> at com.google.gson.internal.bind.CollectionTypeAdapterFactory$
> Adapter.read(CollectionTypeAdapterFactory.java:60)
> at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$
> 1.read(ReflectiveTypeAdapterFactory.java:93)
> at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$
> Adapter.read(ReflectiveTypeAdapterFactory.java:172)
> at com.google.gson.Gson.fromJson(Gson.java:791)
> at com.google.gson.Gson.fromJson(Gson.java:757)
> at com.google.gson.Gson.fromJson(Gson.java:706)
> at com.google.gson.Gson.fromJson(Gson.java:678)
> at org.apache.zeppelin.notebook.Note.fromJson(Note.java:898)
> at org.apache.zeppelin.notebook.repo.VFSNotebookRepo.getNote(
> VFSNotebookRepo.java:178)
> at org.apache.zeppelin.notebook.repo.VFSNotebookRepo.
> getNoteInfo(VFSNotebookRepo.java:201)
> at org.apache.zeppelin.notebook.repo.VFSNotebookRepo.list(
> VFSNotebookRepo.java:146)
> at org.apache.zeppelin.notebook.repo.NotebookRepoSync.list(
> NotebookRepoSync.java:158)
> at org.apache.zeppelin.notebook.Notebook.loadAllNotes(
> Notebook.java:553)
> at org.apache.zeppelin.notebook.Notebook.(Notebook.java:124)
> at org.apache.zeppelin.server.ZeppelinServer.(
> ZeppelinServer.java:158)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(
> NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
> DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.cxf.jaxrs.servlet.CXFNonSpringJaxrsServlet.
> createSingletonInstance(CXFNonSpringJaxrsServlet.java:382)
> at org.apache.cxf.jaxrs.servlet.CXFNonSpringJaxrsServlet.
> createApplicationInstance(CXFNonSpringJaxrsServlet.java:454)
> at org.apache.cxf.jaxrs.servlet.CXFNonSpringJaxrsServlet.
> createServerFromApplication(CXFNonSpringJaxrsServlet.java:432)
> at org.apache.cxf.jaxrs.servlet.CXFNonSpringJaxrsServlet.init(
> CXFNonSpringJaxrsServlet.java:93)
> at org.eclipse.jetty.servlet.ServletHolder.initServlet(
> ServletHolder.java:616)
> at org.eclipse.jetty.servlet.ServletHolder.initialize(
> ServletHolder.java:396)
> at org.eclipse.jetty.servlet.ServletHandler.initialize(
> ServletHandler.java:871)
> at org.eclipse.jetty.servlet.ServletContextHandler.startContext(
> ServletContextHandler.java:298)
> at org.eclipse.jetty.webapp.WebAppContext.startWebapp(
> WebAppContext.java:1349)
> at org.eclipse.jetty.webapp.WebAppContext.startContext(
> WebAppContext.java:1342)
> at org.eclipse.jetty.server.handler.ContextHandler.
> doStart(ContextHandler.java:741)
> at org.eclipse.jetty.webapp.WebAppContext.doStart(
> WebAppContext.java:505)
> at org.eclipse.jetty.util.component.AbstractLifeCycle.
> start(AbstractLifeCycle.jav

Re: note.json or interpreter.json format change?

2017-08-28 Thread Ruslan Dautkhanov
)
at
org.apache.zeppelin.server.ZeppelinServer.main(ZeppelinServer.java:195)
Caused by: java.text.ParseException: Unparseable date: "2017-08-27
19:56:22.229"
at java.text.DateFormat.parse(DateFormat.java:357)
at
com.google.gson.internal.bind.DateTypeAdapter.deserializeToDate(DateTypeAdapter.java:79)
... 50 more




-- 
Ruslan Dautkhanov

On Mon, Aug 28, 2017 at 11:32 AM, Ruslan Dautkhanov <dautkha...@gmail.com>
wrote:

> Testing a newer Zeppelin version from yesterday's zeppelin snapshot.
> Noticed it doesn't see majority of my notebooks.
> Our previous Zeppelin version is ~end of April snapshot of Zeppelin.
> Did the format of note.json or interpreter.json expected format change
> in a way that made those note.json notebooks not show up?
>
>
> Thanks,
> Ruslan Dautkhanov
>


note.json or interpreter.json format change?

2017-08-28 Thread Ruslan Dautkhanov
Testing a newer Zeppelin version from yesterday's zeppelin snapshot.
Noticed it doesn't see majority of my notebooks.
Our previous Zeppelin version is ~end of April snapshot of Zeppelin.
Did the format of note.json or interpreter.json expected format change
in a way that made those note.json notebooks not show up?


Thanks,
Ruslan Dautkhanov


Re: zeppelin build fails with DependencyConvergence error

2017-08-28 Thread Ruslan Dautkhanov
Like$class.flatMap(TraversableLike.scala:241)
at
scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at
org.apache.spark.deploy.yarn.security.ConfigurableCredentialManager.obtainCredentials(ConfigurableCredentialManager.scala:80)
at
org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:371)
at
org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:816)
at
org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:169)
at
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at
org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.(SparkContext.scala:509)
at
org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at
org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at
org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at
org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:40)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:35)
at
org.apache.zeppelin.spark.SparkInterpreter.createSparkSession(SparkInterpreter.java:399)
at
org.apache.zeppelin.spark.SparkInterpreter.getSparkSession(SparkInterpreter.java:277)
at
org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:870)
at
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at
org.apache.zeppelin.spark.PySparkInterpreter.getSparkInterpreter(PySparkInterpreter.java:586)
at
org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:218)
at
org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:163)
at
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:499)
at org.apache.zeppelin.scheduler.Job.run(Job.java:181)
at
org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)






-- 
Ruslan Dautkhanov

On Sun, Aug 27, 2017 at 6:34 PM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> Can you run this command before ?
>
> It looks like CDH issue, I can build it successfully without specifying
> hadoop.version but just only -Phadoop-2.6
>
>
> Ruslan Dautkhanov <dautkha...@gmail.com>于2017年8月28日周一 上午4:26写道:
>
>> Building from a current Zeppelin snapshot fails with
>> zeppelin build fails with org.apache.maven.plugins.
>> enforcer.DependencyConvergence
>> see details below.
>>
>> Build command
>> /opt/maven/maven-latest/bin/mvn clean package -DskipTests -Pspark-2.2
>> -Dhadoop.version=2.6.0-cdh5.12.0 -Phadoop-2.6 -Pvendor-repo -Pscala-2.10
>> -Psparkr -pl '!*..excluded certain modules..*' -e
>>
>> maven 3.5.0
>>> jdk 1.8.0_141
>>> RHEL 7.3
>>> npm.x86_64   1:3.10.10-1.6.11.1.1.el7
>>> nodejs.x86_641:6.11.1-1.el7 @epel
>>> latest zeppelin snapshot
>>
>>
>> Any ideas? It's my first attempt to build on rhel7/jdk8 .. never seen
>> this problem before.
>>
>> Thanks,
>> Ruslan
>>
>>
>>
>> [INFO] Scanning for projects...
>> [WARNING]
>> [WARNING] Some problems were encountered while building the effective
>> model for org.apache.zeppelin:zeppelin-spark-dependencies_2.10:jar:0.
>> 8.0-SNAPSHOT
>> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
>> found duplicate declaration of plugi

zeppelin build fails with DependencyConvergence error

2017-08-27 Thread Ruslan Dautkhanov
Building from a current Zeppelin snapshot fails with an
org.apache.maven.plugins.enforcer.DependencyConvergence error -
see details below.

Build command
/opt/maven/maven-latest/bin/mvn clean package -DskipTests -Pspark-2.2
-Dhadoop.version=2.6.0-cdh5.12.0 -Phadoop-2.6 -Pvendor-repo -Pscala-2.10
-Psparkr -pl '!*..excluded certain modules..*' -e

maven 3.5.0
> jdk 1.8.0_141
> RHEL 7.3
> npm.x86_64   1:3.10.10-1.6.11.1.1.el7
> nodejs.x86_641:6.11.1-1.el7 @epel
> latest zeppelin snapshot


Any ideas? It's my first attempt to build on rhel7/jdk8 .. never seen this
problem before.

Thanks,
Ruslan



[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model
for org.apache.zeppelin:zeppelin-spark-dependencies_2.10:jar:0.8.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
found duplicate declaration of plugin
com.googlecode.maven-download-plugin:download-maven-plugin @ line 940,
column 15
[WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
found duplicate declaration of plugin
com.googlecode.maven-download-plugin:download-maven-plugin @ line 997,
column 15
[WARNING]
[WARNING] Some problems were encountered while building the effective model
for org.apache.zeppelin:zeppelin-spark_2.10:jar:0.8.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
found duplicate declaration of plugin org.scala-tools:maven-scala-plugin @
line 467, column 15
[WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
found duplicate declaration of plugin
org.apache.maven.plugins:maven-surefire-plugin @ line 475, column 15
[WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
found duplicate declaration of plugin
org.apache.maven.plugins:maven-compiler-plugin @ line 486, column 15
[WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
found duplicate declaration of plugin org.scala-tools:maven-scala-plugin @
line 496, column 15
[WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
found duplicate declaration of plugin
org.apache.maven.plugins:maven-surefire-plugin @ line 504, column 15
[WARNING]
[WARNING] It is highly recommended to fix these problems because they
threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support
building such malformed projects.
[WARNING]
[WARNING] The project org.apache.zeppelin:zeppelin-web:war:0.8.0-SNAPSHOT
uses prerequisites which is only intended for maven-plugin projects but not
for non maven-plugin projects. For such purposes you should use the
maven-enforcer-plugin. See
https://maven.apache.org/enforcer/enforcer-rules/requireMavenVersion.html


... [skip]

[INFO]

[INFO] Building Zeppelin: Zengine 0.8.0-SNAPSHOT
[INFO]

[INFO]
[INFO] --- maven-clean-plugin:2.6.1:clean (default-clean) @
zeppelin-zengine ---
[INFO]
[INFO] --- flatten-maven-plugin:1.0.0:clean (flatten.clean) @
zeppelin-zengine ---
[INFO]
[INFO] --- maven-checkstyle-plugin:2.13:check (checkstyle-fail-build) @
zeppelin-zengine ---
[INFO]
[INFO]
[INFO] --- maven-resources-plugin:2.7:copy-resources (copy-resources) @
zeppelin-zengine ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 17 resources
[INFO]
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce) @ zeppelin-zengine
---
[WARNING]
Dependency convergence error for
com.fasterxml.jackson.core:jackson-core:2.5.3 paths to dependency are:
+-org.apache.zeppelin:zeppelin-zengine:0.8.0-SNAPSHOT
  +-com.amazonaws:aws-java-sdk-s3:1.10.62
+-com.amazonaws:aws-java-sdk-core:1.10.62
  +-com.fasterxml.jackson.core:jackson-databind:2.5.3
+-com.fasterxml.jackson.core:jackson-core:2.5.3
and
+-org.apache.zeppelin:zeppelin-zengine:0.8.0-SNAPSHOT
  +-org.apache.hadoop:hadoop-client:2.6.0-cdh5.12.0
+-org.apache.hadoop:hadoop-aws:2.6.0-cdh5.12.0
  +-com.fasterxml.jackson.core:jackson-core:2.2.3

[WARNING]
Dependency convergence error for
org.codehaus.jackson:jackson-mapper-asl:1.9.13 paths to dependency are:
+-org.apache.zeppelin:zeppelin-zengine:0.8.0-SNAPSHOT
  +-com.github.eirslett:frontend-maven-plugin:1.3
+-com.github.eirslett:frontend-plugin-core:1.3
  +-org.codehaus.jackson:jackson-mapper-asl:1.9.13
and
+-org.apache.zeppelin:zeppelin-zengine:0.8.0-SNAPSHOT
  +-org.apache.hadoop:hadoop-client:2.6.0-cdh5.12.0
+-org.apache.hadoop:hadoop-common:2.6.0-cdh5.12.0
  +-org.codehaus.jackson:jackson-mapper-asl:1.8.8
and
+-org.apache.zeppelin:zeppelin-zengine:0.8.0-SNAPSHOT
  +-org.apache.hadoop:hadoop-client:2.6.0-cdh5.12.0
+-org.apache.hadoop:hadoop-hdfs:2.6.0-cdh5.12.0
  +-org.codehaus.jackson:jackson-mapper-asl:1.9.13

... [skipped a number of other 

Re: Cloudera Spark 2.2

2017-08-04 Thread Ruslan Dautkhanov
This should do:


> export SPARK_HOME=/opt/cloudera/parcels/SPARK2/lib/spark2
> export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
> export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
> export HADOOP_CONF_DIR=/etc/hadoop/conf
> export HIVE_CONF_DIR=/etc/hive/conf



>
> mvn clean package -DskipTests -Pspark-2.1 -Dhadoop.version=2.6.0-cdh5.10.1
> -Phadoop-2.6 -Pvendor-repo -Pscala-2.10 -Psparkr -pl
> '!alluxio,!flink,!ignite,!lens,!cassandra,!bigquery,!scio' -e


You may need additional steps depending on which interpreters you use (like R,
etc).
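
A quick sanity check after restarting the Spark interpreter - just a suggestion,
assuming the default sc object is available in a %pyspark paragraph:

%pyspark
import os
# Confirm the interpreter picked up the Spark 2 parcel pointed to by SPARK_HOME
print(sc.version)                    # should print 2.x, not 1.6.0
print(os.environ.get("SPARK_HOME"))  # should point at .../SPARK2/lib/spark2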


-- 
Ruslan Dautkhanov

On Fri, Aug 4, 2017 at 8:31 AM, Benjamin Kim <bbuil...@gmail.com> wrote:

> Hi Ruslan,
>
> Can you send me the steps you used to build it, especially the Maven
> command with the arguments? I will try to build it also.
>
> I do believe that the binaries are for official releases.
>
> Cheers,
> Ben
>
>
> On Wed, Aug 2, 2017 at 3:44 PM Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> It was built. I think binaries are only available for official releases?
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Wed, Aug 2, 2017 at 4:41 PM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>
>>> Did you build Zeppelin or download the binary?
>>>
>>> On Wed, Aug 2, 2017 at 3:40 PM Ruslan Dautkhanov <dautkha...@gmail.com>
>>> wrote:
>>>
>>>> We're using an ~April snapshot of Zeppelin, so not sure about 0.7.1.
>>>>
>>>> Yes, we have that spark home in zeppelin-env.sh
>>>>
>>>>
>>>>
>>>> --
>>>> Ruslan Dautkhanov
>>>>
>>>> On Wed, Aug 2, 2017 at 4:31 PM, Benjamin Kim <bbuil...@gmail.com>
>>>> wrote:
>>>>
>>>>> Does this work with Zeppelin 0.7.1? We an error when setting
>>>>> SPARK_HOME in zeppelin-env.sh to what you have below.
>>>>>
>>>>> On Wed, Aug 2, 2017 at 3:24 PM Ruslan Dautkhanov <dautkha...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> You don't have to use spark2-shell and spark2-submit to use Spark 2.
>>>>>> That can be controled by setting SPARK_HOME using regular
>>>>>> spark-submit/spark-shell.
>>>>>>
>>>>>> $ which spark-submit
>>>>>> /usr/bin/spark-submit
>>>>>> $ which spark-shell
>>>>>> /usr/bin/spark-shell
>>>>>>
>>>>>> $ spark-shell
>>>>>> Welcome to
>>>>>>     __
>>>>>>  / __/__  ___ _/ /__
>>>>>> _\ \/ _ \/ _ `/ __/  '_/
>>>>>>/___/ .__/\_,_/_/ /_/\_\   version 1.6.0
>>>>>>   /_/
>>>>>>
>>>>>>
>>>>>>
>>>>>> $ export SPARK_HOME=/opt/cloudera/parcels/SPARK2/lib/spark2
>>>>>>
>>>>>> $ spark-shell
>>>>>> Welcome to
>>>>>>     __
>>>>>>  / __/__  ___ _/ /__
>>>>>> _\ \/ _ \/ _ `/ __/  '_/
>>>>>>/___/ .__/\_,_/_/ /_/\_\   version 2.1.0.cloudera1
>>>>>>   /_/
>>>>>>
>>>>>>
>>>>>> spark-submit and spark-shell are just shell script wrappers.
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ruslan Dautkhanov
>>>>>>
>>>>>> On Wed, Aug 2, 2017 at 10:22 AM, Benjamin Kim <bbuil...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> According to the Zeppelin documentation, Zeppelin 0.7.1 supports
>>>>>>> Spark 2.1. But, I don't know if it supports Spark 2.2 or even 2.1 from
>>>>>>> Cloudera. For some reason, Cloudera defaults to Spark 1.6 and so does 
>>>>>>> the
>>>>>>> calls to spark-shell and spark-submit. To force the use of Spark 2.x, 
>>>>>>> the
>>>>>>> calls need to be spark2-shell and spark2-submit. I wonder if this is
>>>>>>> causing the problem. By the way, we are using Java8 corporate wide, and
>>>>>>> there seems to be no problems using Zeppelin.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Ben
>>>>>>>
>>>>>>> On Tue, Aug 1, 2017 

Re: Cloudera Spark 2.2

2017-08-02 Thread Ruslan Dautkhanov
It was built. I think binaries are only available for official releases?



-- 
Ruslan Dautkhanov

On Wed, Aug 2, 2017 at 4:41 PM, Benjamin Kim <bbuil...@gmail.com> wrote:

> Did you build Zeppelin or download the binary?
>
> On Wed, Aug 2, 2017 at 3:40 PM Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> We're using an ~April snapshot of Zeppelin, so not sure about 0.7.1.
>>
>> Yes, we have that spark home in zeppelin-env.sh
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Wed, Aug 2, 2017 at 4:31 PM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>
>>> Does this work with Zeppelin 0.7.1? We an error when setting SPARK_HOME
>>> in zeppelin-env.sh to what you have below.
>>>
>>> On Wed, Aug 2, 2017 at 3:24 PM Ruslan Dautkhanov <dautkha...@gmail.com>
>>> wrote:
>>>
>>>> You don't have to use spark2-shell and spark2-submit to use Spark 2.
>>>> That can be controled by setting SPARK_HOME using regular
>>>> spark-submit/spark-shell.
>>>>
>>>> $ which spark-submit
>>>> /usr/bin/spark-submit
>>>> $ which spark-shell
>>>> /usr/bin/spark-shell
>>>>
>>>> $ spark-shell
>>>> Welcome to
>>>>     __
>>>>  / __/__  ___ _/ /__
>>>> _\ \/ _ \/ _ `/ __/  '_/
>>>>/___/ .__/\_,_/_/ /_/\_\   version 1.6.0
>>>>   /_/
>>>>
>>>>
>>>>
>>>> $ export SPARK_HOME=/opt/cloudera/parcels/SPARK2/lib/spark2
>>>>
>>>> $ spark-shell
>>>> Welcome to
>>>>     __
>>>>  / __/__  ___ _/ /__
>>>> _\ \/ _ \/ _ `/ __/  '_/
>>>>/___/ .__/\_,_/_/ /_/\_\   version 2.1.0.cloudera1
>>>>   /_/
>>>>
>>>>
>>>> spark-submit and spark-shell are just shell script wrappers.
>>>>
>>>>
>>>>
>>>> --
>>>> Ruslan Dautkhanov
>>>>
>>>> On Wed, Aug 2, 2017 at 10:22 AM, Benjamin Kim <bbuil...@gmail.com>
>>>> wrote:
>>>>
>>>>> According to the Zeppelin documentation, Zeppelin 0.7.1 supports Spark
>>>>> 2.1. But, I don't know if it supports Spark 2.2 or even 2.1 from Cloudera.
>>>>> For some reason, Cloudera defaults to Spark 1.6 and so does the calls to
>>>>> spark-shell and spark-submit. To force the use of Spark 2.x, the calls 
>>>>> need
>>>>> to be spark2-shell and spark2-submit. I wonder if this is causing the
>>>>> problem. By the way, we are using Java8 corporate wide, and there seems to
>>>>> be no problems using Zeppelin.
>>>>>
>>>>> Cheers,
>>>>> Ben
>>>>>
>>>>> On Tue, Aug 1, 2017 at 7:05 PM Ruslan Dautkhanov <dautkha...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Might need to recompile Zeppelin with Scala 2.11?
>>>>>> Also Spark 2.2 now requires JDK8 I believe.
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ruslan Dautkhanov
>>>>>>
>>>>>> On Tue, Aug 1, 2017 at 6:26 PM, Benjamin Kim <bbuil...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Here is more.
>>>>>>>
>>>>>>> org.apache.zeppelin.interpreter.InterpreterException: WARNING:
>>>>>>> User-defined SPARK_HOME (/opt/cloudera/parcels/SPARK2-
>>>>>>> 2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2) overrides
>>>>>>> detected (/opt/cloudera/parcels/SPARK2/lib/spark2).
>>>>>>> WARNING: Running spark-class from user-defined location.
>>>>>>> Exception in thread "main" java.lang.NoSuchMethodError:
>>>>>>> scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
>>>>>>> at org.apache.spark.util.Utils$.getDefaultPropertiesFile(
>>>>>>> Utils.scala:2103)
>>>>>>> at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$
>>>>>>> mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:124)
>>>>>>> at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$
>>>>>>> mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:124)
>>>>>>> at scala.Option.getO

Re: Cloudera Spark 2.2

2017-08-01 Thread Ruslan Dautkhanov
Might need to recompile Zeppelin with Scala 2.11?
Also Spark 2.2 now requires JDK8 I believe.



-- 
Ruslan Dautkhanov

On Tue, Aug 1, 2017 at 6:26 PM, Benjamin Kim <bbuil...@gmail.com> wrote:

> Here is more.
>
> org.apache.zeppelin.interpreter.InterpreterException: WARNING:
> User-defined SPARK_HOME (/opt/cloudera/parcels/SPARK2-
> 2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2) overrides detected
> (/opt/cloudera/parcels/SPARK2/lib/spark2).
> WARNING: Running spark-class from user-defined location.
> Exception in thread "main" java.lang.NoSuchMethodError:
> scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
> at org.apache.spark.util.Utils$.getDefaultPropertiesFile(Utils.scala:2103)
> at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$
> mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:124)
> at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$
> mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:124)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.deploy.SparkSubmitArguments.
> mergeDefaultSparkProperties(SparkSubmitArguments.scala:124)
> at org.apache.spark.deploy.SparkSubmitArguments.(
> SparkSubmitArguments.scala:110)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> Cheers,
> Ben
>
>
> On Tue, Aug 1, 2017 at 5:24 PM Jeff Zhang <zjf...@gmail.com> wrote:
>
>>
>> Then it is due to some classpath issue. I am not sure familiar with CDH,
>> please check whether spark of CDH include hadoop jar with it.
>>
>>
>> Benjamin Kim <bbuil...@gmail.com>于2017年8月2日周三 上午8:22写道:
>>
>>> Here is the error that was sent to me.
>>>
>>> org.apache.zeppelin.interpreter.InterpreterException: Exception in
>>> thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/
>>> FSDataInputStream
>>> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.
>>> FSDataInputStream
>>>
>>> Cheers,
>>> Ben
>>>
>>>
>>> On Tue, Aug 1, 2017 at 5:20 PM Jeff Zhang <zjf...@gmail.com> wrote:
>>>
>>>>
>>>> By default, 0.7.1 doesn't support spark 2.2. But you can set
>>>> zeppelin.spark.enableSupportedVersionCheck in interpreter setting to
>>>> disable the supported version check.
>>>>
>>>>
>>>> Jeff Zhang <zjf...@gmail.com>于2017年8月2日周三 上午8:18写道:
>>>>
>>>>>
>>>>> What's the error you see in log ?
>>>>>
>>>>>
>>>>> Benjamin Kim <bbuil...@gmail.com>于2017年8月2日周三 上午8:18写道:
>>>>>
>>>>>> Has anyone configured Zeppelin 0.7.1 for Cloudera's release of Spark
>>>>>> 2.2? I can't get it to work. I downloaded the binary and set SPARK_HOME 
>>>>>> to
>>>>>> /opt/cloudera/parcels/SPARK2/lib/spark2. I must be missing something.
>>>>>>
>>>>>> Cheers,
>>>>>> Ben
>>>>>>
>>>>>


Re: Showing pandas dataframe with utf8 strings

2017-07-11 Thread Ruslan Dautkhanov
Your example works fine for me too.

We're on Zeppelin snapshot ~2 months old.
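
For anyone who still hits the UnicodeEncodeError from str(cell) in
show_dataframe (as in the traceback quoted further down in this thread), one
possible Python 2 workaround - just a sketch on my side, not something Zeppelin
provides - is to UTF-8-encode the unicode cells before handing the frame to
z.show():

%pyspark
import pandas

df = pandas.DataFrame([u'Jalape\xf1os.'], [1], ['Menu'])
# Encode unicode cells to UTF-8 byte strings so show_dataframe's str(cell) succeeds
safe = df.applymap(lambda c: c.encode('utf-8') if isinstance(c, unicode) else c)
z.show(safe)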



-- 
Ruslan Dautkhanov

On Tue, Jul 11, 2017 at 3:11 PM, Ben Vogan <b...@shopkick.com> wrote:

> Here is the specific example that is failing:
>
> import pandas
> z.show(pandas.DataFrame([u'Jalape\xf1os.'],[1],['Menu']))
>
> On Tue, Jul 11, 2017 at 2:32 PM, Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> Hi Ben,
>>
>> I can't reproduce this
>>
>> from pyspark.sql.types import *
>>> rdd = sc.parallelize([[u'El Niño']])
>>> df = sqlc.createDataFrame(
>>>   rdd, schema=StructType([StructField("unicode data",
>>> StringType(), True)])
>>> )
>>> df.show()
>>> z.show(df)
>>
>>
>> shows unicode character fine.
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Jul 11, 2017 at 11:37 AM, Ben Vogan <b...@shopkick.com> wrote:
>>
>>> Hi Ruslan,
>>>
>>> I tried adding:
>>>
>>>  export LC_ALL="en_US.utf8"
>>>
>>> To my zeppelin-env.sh script and restarted Zeppelin, but I still have
>>> the same problem.  The print statement:
>>>
>>> python -c "print (u'\xf1')"
>>>
>>> works from the note.  I think the problem is the use of the str
>>> function.  Looking at the stack you can see that the zeppelin code is
>>> calling body_buf.write(str(cell)).  If you call str(u'\xf1') you will get
>>> the error.
>>>
>>> --Ben
>>>
>>> On Tue, Jul 11, 2017 at 10:19 AM, Ruslan Dautkhanov <
>>> dautkha...@gmail.com> wrote:
>>>
>>>> $ env | grep LC
>>>>> $
>>>>> $ python -c "print (u'\xf1')"
>>>>> ñ
>>>>>
>>>>
>>>>
>>>>> $ export LC_ALL="C"
>>>>> $ python -c "print (u'\xf1')"
>>>>> Traceback (most recent call last):
>>>>>   File "", line 1, in 
>>>>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in
>>>>> position 0: ordinal not in range(128)
>>>>>
>>>>
>>>>
>>>>> $ export LC_ALL="en_US.utf8"
>>>>> $ python -c "print (u'\xf1')"
>>>>> ñ
>>>>>
>>>>
>>>>
>>>>> $ unset LC_ALL
>>>>> $ env | grep LC
>>>>> $
>>>>> $ python -c "print (u'El Ni\xf1o')"
>>>>> El Niño
>>>>
>>>>
>>>> You could add LC_ALL export to your zeppelin-env.sh script.
>>>>
>>>>
>>>>
>>>> --
>>>> Ruslan Dautkhanov
>>>>
>>>> On Tue, Jul 11, 2017 at 9:35 AM, Ben Vogan <b...@shopkick.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I am trying to use the zeppelin context to show the contents of a
>>>>> pandas DataFrame and getting the following error:
>>>>>
>>>>> Traceback (most recent call last):
>>>>>   File "/tmp/zeppelin_python-7554503996532642522.py", line 278, in
>>>>> 
>>>>> raise Exception(traceback.format_exc())
>>>>> Exception: Traceback (most recent call last):
>>>>>   File "/tmp/zeppelin_python-7554503996532642522.py", line 271, in
>>>>> 
>>>>> exec(code)
>>>>>   File "", line 2, in 
>>>>>   File "/tmp/zeppelin_python-7554503996532642522.py", line 93, in show
>>>>> self.show_dataframe(p, **kwargs)
>>>>>   File "/tmp/zeppelin_python-7554503996532642522.py", line 121, in
>>>>> show_dataframe
>>>>> body_buf.write(str(cell))
>>>>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in
>>>>> position 79: ordinal not in range(128)
>>>>>
>>>>> How do I go about resolving this?
>>>>>
>>>>> I'm running version 0.7.1 with python 2.7.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> --
>>>>> *BENJAMIN VOGAN* | Data Platform Team Lead
>>>>>
>>>>> <http://www.shopkick.com/>
>>>>> <https://www.facebook.com/shopkick>
>>>>> <https://www.instagram.com/shopkick/>
>>>>> <https://www.pinterest.com/shopkick/>
>>>>> <https://twitter.com/shopkickbiz>
>>>>> <https://www.linkedin.com/company-beta/831240/?pathWildcard=831240>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> *BENJAMIN VOGAN* | Data Platform Team Lead
>>>
>>> <http://www.shopkick.com/>
>>> <https://www.facebook.com/shopkick>
>>> <https://www.instagram.com/shopkick/>
>>> <https://www.pinterest.com/shopkick/> <https://twitter.com/shopkickbiz>
>>> <https://www.linkedin.com/company-beta/831240/?pathWildcard=831240>
>>>
>>
>>
>
>
> --
> *BENJAMIN VOGAN* | Data Platform Team Lead
>
> <http://www.shopkick.com/>
> <https://www.facebook.com/shopkick> <https://www.instagram.com/shopkick/>
> <https://www.pinterest.com/shopkick/> <https://twitter.com/shopkickbiz>
> <https://www.linkedin.com/company-beta/831240/?pathWildcard=831240>
>


Re: JDBC use with zeppelin

2017-07-10 Thread Ruslan Dautkhanov
For the Oracle JDBC driver we had to feed ojdbc7.jar
into SPARK_SUBMIT_OPTIONS through the --jars parameter
and into ZEPPELIN_INTP_CLASSPATH_OVERRIDES, like:

zeppelin-env.sh:

export SPARK_SUBMIT_OPTIONS=". . . --jars /var/lib/sqoop/ojdbc7.jar"
> export
> ZEPPELIN_INTP_CLASSPATH_OVERRIDES=/etc/hive/conf:/var/lib/sqoop/ojdbc7.jar
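
With the jar on both classpaths, reading into a DataFrame from %pyspark looks
roughly like the sketch below - the connection details and table name are made
up, and oracle.jdbc.OracleDriver is the usual Oracle driver class:

%pyspark
# Read an Oracle table into a Spark DataFrame over JDBC
df = (sqlContext.read.format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")  # hypothetical host/service
      .option("dbtable", "SOME_SCHEMA.SOME_TABLE")            # hypothetical table
      .option("user", "some_user")
      .option("password", "some_password")
      .option("driver", "oracle.jdbc.OracleDriver")
      .load())
df.show(5)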





-- 
Ruslan Dautkhanov

On Mon, Jul 10, 2017 at 12:10 PM, <dar...@ontrenet.com> wrote:

> Hi
>
> We want to use a jdbc driver with pyspark through Zeppelin. Not the custom
> interpreter but from sqlContext where we can read into dataframe.
>
> I added the jdbc driver jar to zeppelin spark submit options "--jars" but
> it still says driver class not found.
>
> Does it have to reside somewhere else?
>
> Thanks in advance!
>
>
>
> Get Outlook for Android <https://aka.ms/ghei36>
>
>


Re: Query about the high availability of Zeppelin

2017-06-30 Thread Ruslan Dautkhanov
I think if you have a shared storage for notebooks (for example, NFS
mounted from a third server),
and a load-balancer that supports sticky sessions (like F5) on top, it
should be possible to have HA without
any code change in Zeppelin. Am I missing something?



-- 
Ruslan Dautkhanov

On Fri, Jun 30, 2017 at 5:54 PM, Alexander Filipchik <afilipc...@gmail.com>
wrote:

> Honestly,  HA requires more than just active stand by.
> It should be able to scale without major surgeries, which is not possible
> right now. For example, if you start too many interpreters, zeppelin box
> will simply run out of memory.
>
> Alex
>
> On Thu, Jun 29, 2017 at 10:59 PM, wenxing zheng <wenxing.zh...@gmail.com>
> wrote:
>
>> at first, I would think GIT storage is a good option and we can push and
>> pull the changes regularly.
>>
>> With multiple zeppelin instances, maybe we need a new component or
>> service to act as a distributed scheduler: dispatch the Job to and manage
>> the Jobs on the Zeppelin instances.
>>
>> On Fri, Jun 30, 2017 at 1:26 PM, Vinay Shukla <vinayshu...@gmail.com>
>> wrote:
>>
>>> Here is what I think should be part of HA consideration:
>>>
>>>1. Have multiple Zeppelin Instances
>>>2. Have the notebooks storage backed by something like an NFS so all
>>>notebooks are visible across all Zeppelin instances
>>>3. Put multiple load balancers infront of Zeppelin to route requests.
>>>
>>> Consider that HA needs scalability, which depends on which interpreter
>>> you plan to use. So you might need to consider HA at both Zeppelin and
>>> interpreter level. For example if you were using Z + Livy + Spark, you will
>>> need to consider scalability + HA needs of Z + Livy interpreter + Livy
>>> Server + Spark (on Cluster manager).
>>>
>>> On Thu, Jun 29, 2017 at 10:04 PM, wenxing zheng <wenxing.zh...@gmail.com
>>> > wrote:
>>>
>>>> and do we have any architecture doc for reference? Because we need to
>>>> add the HA capability as soon as possible, hope we can figure it out.
>>>>
>>>> On Fri, Jun 30, 2017 at 12:33 PM, wenxing zheng <
>>>> wenxing.zh...@gmail.com> wrote:
>>>>
>>>>> Thanks to Jeff and Moon.
>>>>>
>>>>> So currently the active-active model doesn't work on GIT storage, am I
>>>>> right?
>>>>>
>>>>> On Fri, Jun 30, 2017 at 12:16 PM, moon soo Lee <m...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Basically active-(hot)standby model would work.
>>>>>> Two or more Zeppelin instance can be started and pointing the same
>>>>>> notebook storage, if only one Zeppelin instance (active) change notebook 
>>>>>> at
>>>>>> any given time.
>>>>>>
>>>>>> In case of the active instance fails, one of rest instance can take
>>>>>> over the role by refreshing notebook list and start make change.
>>>>>>
>>>>>> But all these fail over is not provided by Zeppelin and need to
>>>>>> depends on external script or HA software (like Heartbeat).
>>>>>>
>>>>>> Like Jeff mentioned, community does not have concrete plan for having
>>>>>> HA built-in at this moment.
>>>>>>
>>>>>> Hope this helps,
>>>>>>
>>>>>> Thanks,
>>>>>> moon
>>>>>>
>>>>>> On Fri, Jun 30, 2017 at 1:01 PM Jeff Zhang <zjf...@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> No concrete plan for that. There're other higher priority things
>>>>>>> need to be done. At least it would not be available in 0.8, maybe after 
>>>>>>> 1.0
>>>>>>>
>>>>>>>
>>>>>>> wenxing zheng <wenxing.zh...@gmail.com>于2017年6月30日周五 上午11:47写道:
>>>>>>>
>>>>>>>> Thanks to Jianfeng.
>>>>>>>>
>>>>>>>> Do you  know any plan on this?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jun 30, 2017 at 11:32 AM, Jianfeng (Jeff) Zhang <
>>>>>>>> jzh...@hortonworks.com> wrote:
>>>>>>>>
>>>>>>>>> HA is not supported, there’s still  lots of configuration files
>>>>>>>>> stored in local file system.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best Regard,
>>>>>>>>> Jeff Zhang
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: wenxing zheng <wenxing.zh...@gmail.com>
>>>>>>>>> Reply-To: "users@zeppelin.apache.org" <users@zeppelin.apache.org>
>>>>>>>>> Date: Friday, June 30, 2017 at 9:40 AM
>>>>>>>>> To: "users@zeppelin.apache.org" <users@zeppelin.apache.org>
>>>>>>>>> Subject: Query about the high availability of Zeppelin
>>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> I still didn't find any docs on this topic? Appreciated if anyone
>>>>>>>>> can shed some lights on how to get the Zeppelin into a cluster with
>>>>>>>>> shared/centralized storage
>>>>>>>>>
>>>>>>>>> Regards, Wenxing
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>
>>>>
>>>
>>
>


Re: Integrating with Airflow

2017-05-19 Thread Ruslan Dautkhanov
Thanks for sharing this Ben.

I agree Zeppelin is a better fit with tighter integration with Spark and
built-in visualizations.

We have pretty much standardized on pySpark, so here's one of the scripts
we use internally
to extract %pyspark, %sql and %md paragraphs into a standalone script (that
can be scheduled in Airflow for example)
https://github.com/Tagar/stuff/blob/master/znote.py (patches are welcome :-)

Hope this helps.
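
For anyone who can't reuse that script directly, the core of it is small
enough to sketch inline. This is a simplified, illustrative variant only: it
assumes the note-JSON layout Zeppelin writes today (a top-level "paragraphs"
list whose items carry the paragraph source in "text"), and error handling is
stripped down:

import json
import sys

# usage: python extract_note.py /path/to/note.json > job.py
with open(sys.argv[1]) as f:
    note = json.load(f)

for p in note.get("paragraphs", []):
    text = p.get("text") or ""
    body = text.split("\n", 1)[1] if "\n" in text else ""
    if text.startswith("%pyspark"):
        print(body)                                 # keep pyspark code as-is
    elif text.startswith("%sql"):
        # wrap SQL so it still runs inside a standalone pyspark job
        print('sqlContext.sql("""%s""").show()' % body)
    elif text.startswith("%md"):
        # keep markdown as comments, for context
        print("\n".join("# " + line for line in body.splitlines()))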

ps. In my opinion, adding dependencies between paragraphs wouldn't be that
hard for simple cases, and it could be a first step toward defining a DAG in
Zeppelin directly. It would be really awesome to see that kind of integration
in the future.

Otherwise I don't see much value if a whole note / whole workflow runs as a
single task in Airflow.
In my opinion, each paragraph has to be a task... then it'll be very useful.
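
To make that paragraph-as-task idea concrete, here is a rough, untested sketch
of what the Airflow side could look like, driving two paragraphs of one note
through Zeppelin's synchronous run-paragraph REST endpoint (as I read the
0.7/0.8 REST docs). The host, note id and paragraph ids are placeholders, and
authentication is ignored:

from datetime import datetime

import requests
from airflow import DAG
from airflow.operators.python_operator import PythonOperator

ZEPPELIN = "http://zeppelin-host:8080"          # placeholder

def run_paragraph(note_id, paragraph_id):
    # POST /api/notebook/run/{noteId}/{paragraphId} blocks until the paragraph finishes
    url = "%s/api/notebook/run/%s/%s" % (ZEPPELIN, note_id, paragraph_id)
    requests.post(url).raise_for_status()

dag = DAG("zeppelin_note", start_date=datetime(2017, 5, 1), schedule_interval=None)

extract = PythonOperator(task_id="extract", python_callable=run_paragraph,
                         op_args=["2CABCDEF1", "20170519-000000_11111"], dag=dag)
report = PythonOperator(task_id="report", python_callable=run_paragraph,
                        op_args=["2CABCDEF1", "20170519-000000_22222"], dag=dag)

extract >> report   # the paragraph-level dependency lives in Airflow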


Thanks,
Ruslan


On Fri, May 19, 2017 at 4:55 PM, Ben Vogan <b...@shopkick.com> wrote:

> I do not expect the relationship between DAGs to be described in Zeppelin
> - that would be done in Airflow.  It just seems that Zeppelin is such a
> great tool for a data scientists workflow that it would be nice if once
> they are done with the work the note could be productionized directly.  I
> could envision a couple of scenarios:
>
> 1. Using a zeppelin instance to run the note via the REST API.  The
> instance could be containerized and spun up specifically for a DAG or it
> could be a permanently available one.
> 2. A note could be pulled from git and some part of the Zeppelin engine
> could execute the note without the web UI at all.
>
> I would expect on the airflow side there to be some special operators for
> executing these.
>
> If the scheduler is pluggable then it should be possible to create a plug
> in that talks to the Airflow REST API.
>
> I happen to prefer Zeppelin to Jupyter - although I get your point about
> both being python.  I don't really view that as a problem - most of the big
> data platforms I'm talking to are implemented on the JVM after all.  The
> python part of Airflow is really just describing what gets run and it isn't
> hard to run something that isn't written in python.
>
> On Fri, May 19, 2017 at 2:52 PM, Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> We also use both Zeppelin and Airflow.
>>
>> I'm interested in hearing what others are doing here too.
>>
>> Although honestly there might be some challenges
>> - Airflow expects a DAG structure, while a notebook has pretty linear
>> structure;
>> - Airflow is Python-based; Zeppelin is all Java (REST API might be of
>> help?).
>> Jupyter+Airflow might be a more natural fit to integrate?
>>
>> On top of that, the way we use Zeppelin is a lot of ad-hoc queries,
>> while Airflow is for more finalized workflows I guess?
>>
>> Thanks for bringing this up.
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Fri, May 19, 2017 at 2:20 PM, Ben Vogan <b...@shopkick.com> wrote:
>>
>>> Hi all,
>>>
>>> We are really enjoying the workflow of interacting with our data via
>>> Zeppelin, but are not sold on using the built in cron scheduling
>>> capability.  We would like to be able to create more complex DAGs that are
>>> better suited for something like Airflow.  I was curious as to whether
>>> anyone has done an integration of Zeppelin with Airflow.
>>>
>>> Either directly from within Zeppelin, or from the Airflow side.
>>>
>>> Thanks,
>>> --
>>> *BENJAMIN VOGAN* | Data Platform Team Lead
>>>
>>>
>>
>>
>
>
> --
> *BENJAMIN VOGAN* | Data Platform Team Lead
>
>


zeppelin static web reource names - id dups

2017-05-11 Thread Ruslan Dautkhanov
Maven generates some of the web resource names, for example, css files.

- What are those hex ids in file names?
- Why do those ids get duplicated in file names up to 5 times? (see the
example below in *bold*)


$ find . -name "main*css"



> ./spark-dependencies/target/spark-2.1.0/docs/css/main.css



>
> ./zeppelin-web/dist/styles/main.a8972425cabfc433.a8972425cabfc433.1e4d9898f11b1363.css



> ./zeppelin-web/dist/styles/main.a8972425cabfc433.css



> ./zeppelin-web/dist/styles/main.*a8972425cabfc433.a8972425cabfc433.a8972425cabfc433.a8972425cabfc433.a8972425cabfc433*.css


We slightly customize the css, and I hadn't noticed this duplication behavior
in previous Zeppelin versions.


Ruslan


NullPointerException at org.apache.zeppelin.spark.Utils.buildJobGroupId

2017-05-10 Thread Ruslan Dautkhanov
Has anyone experienced below exception?
It started happening intermittently after upgrading to last week's master
snapshot of Zeppelin.
We have multiple users reported the same issue.

java.lang.NullPointerException at
org.apache.zeppelin.spark.Utils.buildJobGroupId(Utils.java:112) at
org.apache.zeppelin.spark.SparkZeppelinContext.showData(SparkZeppelinContext.java:100)
at
org.apache.zeppelin.spark.SparkSqlInterpreter.interpret(SparkSqlInterpreter.java:129)
at
org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:101)
at
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:500)
at org.apache.zeppelin.scheduler.Job.run(Job.java:181) at
org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262) at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)



Thanks,
Ruslan


Re: ZeppelinContext textbox for passwords

2017-05-09 Thread Ruslan Dautkhanov
In pyspark with Jupyter we used to do
getpass.getpass "Prompt the user for a password without echoing"
https://docs.python.org/2/library/getpass.html
but Zeppelin's Spark interpreter doesn't pass the interactive request through
to the pyspark REPL - it actually makes the paragraph hang.

I was thinking of submitting a jira for this a while back.
Not sure how hard it would be to add in Zeppelin.

Here's how it looks in Jupyter (Jupyer actually displays an interactive
prompt under the paragraph):

[image: Inline image 1]

Here's how it looks in Zeppelin (after canceling execution as it gets
stuck):

[image: Inline image 2]

I think if Zeppelin could understand that there is an interactive prompt,
this will be helpful not only with password prompts but any other cases
(including shell interpreter).
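
As a stopgap until interactive prompts are supported, the prompt can be
avoided entirely by reading the secret from a non-interactive source inside
%pyspark. A minimal sketch, assuming the password is staged either in an
environment variable exported in zeppelin-env.sh or in a file readable only by
the service account (both names below are placeholders):

import os

# option 1: environment variable exported to the interpreter process
password = os.environ.get("MY_DB_PASSWORD")

# option 2: a permission-restricted file owned by the Zeppelin service account
if password is None:
    with open(os.path.expanduser("~/.secrets/db_password")) as f:
        password = f.read().strip()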



-- 
Ruslan Dautkhanov

On Tue, May 9, 2017 at 4:59 PM, Ben Vogan <b...@shopkick.com> wrote:

> Hi there,
>
> Is it possible to create a textbox for accepting passwords via the
> ZeppelinContext (i.e. one that masks input)?  I do not see any way to do
> so, but I hope I'm missing something.
>
> Thanks,
>
> --
> *BENJAMIN VOGAN* | Data Platform Team Lead
>
>


Re: Recent Improvements Apache Zeppelin Livy Integration

2017-05-04 Thread Ruslan Dautkhanov
Thanks for sharing this Jeff!

Once Zeppelin supports yarn-cluster, what would be the main benefits of using
the Livy Spark interpreters instead of the plain Spark interpreters?



-- 
Ruslan Dautkhanov

On Thu, May 4, 2017 at 10:51 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> For anyone that is using or interested in livy interpreter
>
> https://hortonworks.com/blog/recent-improvements-apache-
> zeppelin-livy-integration/
>
>
>


Re: Export Zeppelin notebook with built-int visualization

2017-05-03 Thread Ruslan Dautkhanov
Hope to see this as implemented one day
https://issues.apache.org/jira/browse/ZEPPELIN-1774


On Wed, May 3, 2017 at 5:05 AM Petr Knez  wrote:

> I know about feature (link to paragraph) but it not works if Zeppelin has
> enabled Shiro authorization.
> It works only for me (if I'm logged in) but not for anyone else. See in
> enclosed picture.
> P.
>
> On Wed, May 3, 2017 at 11:16 AM, Jeff Zhang  wrote:
>
>>
>> You can export paragraph
>>
>> https://zeppelin.apache.org/docs/0.8.0-SNAPSHOT/manual/publish.html
>>
>>
>>
>> Petr Knez 于2017年5月3日周三 下午4:59写道:
>>
>>> I there any option to export notebook with built-in visualization
>>> (z.show) outside zeppelin for users who don't have access Zeppelin, like
>>> export whole notebook or note (not only code) into html (not .json)?
>>> Thanks
>>> Petr
>>>
>>
>


Re: Can't download moderately large data or number of rows to csv

2017-05-02 Thread Ruslan Dautkhanov
Good idea to introduce in Zeppelin a way to download full datasets without
actually visualizing them.

Not sure if this helps, we taught our users to use %sh hadoop fs -getmerge
/hadoop/path/dir/ /some/nfs/mount/
for large files (they sometimes have to download datasets with millions of
records).
They run Zeppelin on edge nodes that have NFS mounts to a drop zone.
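
For reference, the whole pattern is just two paragraphs. This is only a
sketch: the paths are placeholders, df is assumed to be the DataFrame to
export, and the %sh step assumes the NFS drop zone mentioned above:

%pyspark
# write the full result set to HDFS as CSV; coalesce(1) yields a single part file
df.coalesce(1).write.csv("/tmp/export/my_result", header=True, mode="overwrite")

%sh
# merge the part file(s) onto the NFS drop zone so users can grab the file
hadoop fs -getmerge /tmp/export/my_result /some/nfs/mount/my_result.csv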

ps. Hue has a limit too, by default 100k rows
https://github.com/cloudera/hue/blob/release-3.12.0/desktop/conf.dist/hue.ini#L905

Not sure how much it scales up.



-- 
Ruslan Dautkhanov

On Tue, May 2, 2017 at 10:41 AM, Paul Brenner <pbren...@placeiq.com> wrote:

> There are limits to how much data the download to csv button will download
> (1.5MB? 3500 rows?) which limit zeppelin’s usefulness for our BI teams.
> This limit comes up far before we run into issues with showing too many
> rows of data in zeppelin.
>
> Unfortunately (fortunately?) Hue is the other tool the BI team has been
> using and there they have no problem downloading much larger datasets to
> csv. This is definitely not a requirement I’ve ever run into in the way I
> use zeppelin since I would just use spark to write the data out. However,
> the BI team is not allowed to run spark jobs (they use hive via jdbc) so
> that download to csv button is pretty important to them.
>
> Would it be possible to significantly increase the limit? Even better
> would it be possible to download more data than is shown? I assume this is
> the type of thing I would need to open a ticket for, but I wanted to ask
> here first.
>
>


Re: How do I configure R interpreter in Zeppelin?

2017-04-26 Thread Ruslan Dautkhanov
Thanks for feedback.

%spark.r
print("Hello World!")
 throws exception [2].

Understood - I'll try to remove -Pr and rebuild Zeppelin. Yep, I used a
fresh master snapshot.
(I haven't seen anything in the maven build logs that could indicate a problem
around the R interpreter.)
Will update this email thread with the result after rebuilding Zeppelin
without -Pr.


[2]

spark.r interpreter not found
org.apache.zeppelin.interpreter.InterpreterException: spark.r interpreter
not found at
org.apache.zeppelin.interpreter.InterpreterFactory.getInterpreter(InterpreterFactory.java:417)
at org.apache.zeppelin.notebook.Note.run(Note.java:620) at
org.apache.zeppelin.socket.NotebookServer.persistAndExecuteSingleParagraph(NotebookServer.java:1781)
at
org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:1741)
at
org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:288)
at
org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:59)
at
org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
at
org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
at
org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
at
org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)




-- 
Ruslan Dautkhanov

On Wed, Apr 26, 2017 at 2:13 PM, moon soo Lee <m...@apache.org> wrote:

> Zeppelin includes two R interpreter implementations.
>
> One used to activated by -Psparkr the other -Pr.
> Since https://github.com/apache/zeppelin/pull/2215, -Psparkr is activated
> by default. And if you're trying to use sparkR, -Psparkr (activated by
> default in master branch) is implementation you might be more interested.
>
> So you can just try use with %spark.r prefix.
> Let me know if it works for you.
>
> Thanks,
> moon
>
> On Wed, Apr 26, 2017 at 12:11 AM Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> Hi moon soo Lee,
>>
>> Cloudera's Spark doesn't have $SPARK_HOME/bin/sparkR
>> Would Zeppelin still enable its sparkR interpreter then?
>>
>> Built Zeppelin using
>>
>> $ mvn clean package -DskipTests -Pspark-2.1 -Ppyspark
>>> -Dhadoop.version=2.6.0-cdh5.10.1 -Phadoop-2.6 -Pyarn *-Pr*
>>> -Pvendor-repo -Pscala-2.10 -pl '!...,!...' -e
>>
>>
>> . . .
>>> [INFO] Zeppelin: *R Interpreter*  SUCCESS
>>> [01:01 min]
>>> [INFO] 
>>> 
>>> [INFO] BUILD SUCCESS
>>> [INFO] 
>>> 
>>> [INFO] Total time: 11:28 min
>>
>>
>> None of the R-related interpreters show up nevertheless.
>>
>> This is including latest Zeppelin snapshot and was the same on previous
>> releases of Zeppelin.
>> So something is missing on our side.
>>
>> R and R packages mentioned in http://zeppelin.apache.org/
>> docs/0.8.0-SNAPSHOT/interpreter/r.html
>> are installed on the servers that runs Zeppelin (and Spark driver as it
>> is yarn-client).
>>
>> I guess either above build options are wrong or there is another
>> dependency I missed.
>> conf/zeppelin-site.xml has R related interpreters mentioned - [1] but
>> none of them
>> show up once Zeppelin starts up.
>>
>> Any ideas?
>>
>>
>> Thank you,
>> Ruslan
>>
>>
>> [1]
>>
>> 
>>>   zeppelin.interpreters
>>>   org.apache.zeppelin.spark.PySparkInterpreter,org.
>>> apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.
>>> *rinterpreter.RRepl*,org.apache.zeppelin.rinterpreter.*KnitR*
>>> ,org.apache.zeppelin.spark.*SparkRInterpreter*
>>> ,org.apache.zeppelin.spark.SparkSqlInterpreter,org.
>>> apache.zeppelin.spark.DepInterpreter,org.apache.
>>> zeppelin.markdown.Markdown,org.apache.zeppelin.angular.
>>> AngularInterpreter,org.apache.zeppelin.shell.
>>> ShellInterpreter,org.apache.zeppelin.file.HDFSFileInterpreter,org.
>>> apache.zeppelin.flink.FlinkInterpreter,,org.apache.zeppelin.python.
>>> PythonInterpreter,org.apache.zeppelin.lens.LensInterpreter,
>>> org.apache.zeppelin.ignite.IgniteInterpreter,org.apache.zeppelin.ignite.
>>> IgniteSqlInterpreter,org.apache.zeppelin.cassandra.
>>> CassandraInterpreter,org.apache.zeppelin.geode.GeodeOqlInterpreter,org.
>>> apache.zeppelin.postgresql.PostgreSqlInterpreter,org.
>>> apache.zeppelin.jdbc.JDBCInterpreter,org.a

Re: How do I configure R interpreter in Zeppelin?

2017-04-26 Thread Ruslan Dautkhanov
Hi moon soo Lee,

Cloudera's Spark doesn't have $SPARK_HOME/bin/sparkR
Would Zeppelin still enable its sparkR interpreter then?

Built Zeppelin using

$ mvn clean package -DskipTests -Pspark-2.1 -Ppyspark
> -Dhadoop.version=2.6.0-cdh5.10.1 -Phadoop-2.6 -Pyarn *-Pr* -Pvendor-repo
> -Pscala-2.10 -pl '!...,!...' -e


. . .
> [INFO] Zeppelin: *R Interpreter*  SUCCESS
> [01:01 min]
> [INFO]
> 
> [INFO] BUILD SUCCESS
> [INFO]
> 
> [INFO] Total time: 11:28 min


None of the R-related interpreters show up nevertheless.

This is including latest Zeppelin snapshot and was the same on previous
releases of Zeppelin.
So something is missing on our side.

R and the R packages mentioned in
http://zeppelin.apache.org/docs/0.8.0-SNAPSHOT/interpreter/r.html
are installed on the servers that run Zeppelin (and the Spark driver, as it is
yarn-client).

I guess either the build options above are wrong or there is another
dependency I missed.
conf/zeppelin-site.xml has the R-related interpreters listed [1], but none of
them show up once Zeppelin starts.

Any ideas?


Thank you,
Ruslan


[1]


>   zeppelin.interpreters
>
> org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.
> *rinterpreter.RRepl*,org.apache.zeppelin.rinterpreter.*KnitR*
> ,org.apache.zeppelin.spark.*SparkRInterpreter*
> ,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.angular.AngularInterpreter,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.file.HDFSFileInterpreter,org.apache.zeppelin.flink.FlinkInterpreter,,org.apache.zeppelin.python.PythonInterpreter,org.apache.zeppelin.lens.LensInterpreter,org.apache.zeppelin.ignite.IgniteInterpreter,org.apache.zeppelin.ignite.IgniteSqlInterpreter,org.apache.zeppelin.cassandra.CassandraInterpreter,org.apache.zeppelin.geode.GeodeOqlInterpreter,org.apache.zeppelin.postgresql.PostgreSqlInterpreter,org.apache.zeppelin.jdbc.JDBCInterpreter,org.apache.zeppelin.kylin.KylinInterpreter,org.apache.zeppelin.elasticsearch.ElasticsearchInterpreter,org.apache.zeppelin.scalding.ScaldingInterpreter,org.apache.zeppelin.alluxio.AlluxioInterpreter,org.apache.zeppelin.hbase.HbaseInterpreter,org.apache.zeppelin.livy.LivySparkInterpreter,org.apache.zeppelin.livy.LivyPySparkInterpreter,org.apache.zeppelin.livy.LivySparkRInterpreter,org.apache.zeppelin.livy.LivySparkSQLInterpreter,org.apache.zeppelin.bigquery.BigQueryInterpreter
>   Comma separated interpreter configurations. First
> interpreter become a default
> 





-- 
Ruslan Dautkhanov

On Sun, Mar 19, 2017 at 1:07 PM, moon soo Lee <m...@apache.org> wrote:

> Easiest way to figure out what your environment needs is,
>
> 1. run SPARK_HOME/bin/sparkR in your shell and make sure it works in the
> same host where Zeppelin going to run.
> 2. try use %spark.r in Zeppelin with SPARK_HOME configured. Normally it
> should work when 1) works without problem, otherwise take a look error
> message and error log to get more informations.
>
> Thanks,
> moon
>
>
> On Sat, Mar 18, 2017 at 8:47 PM Shanmukha Sreenivas Potti <
> shanmu...@utexas.edu> wrote:
>
> I'm not 100% sure as I haven't set it up but it looks like I'm using
>> Zeppelin preconfigured with Spark and I've also taken a snapshot of the
>> Spark Interpreter configuration that I have access to/using in Zeppelin.
>> This interpreter comes with SQL and Python integration and I'm figuring out
>> how do I get to use R.
>>
>> On Sat, Mar 18, 2017 at 8:06 PM, moon soo Lee <m...@apache.org> wrote:
>>
>> AFAIK, Amazon EMR service has an option that launches Zeppelin
>> (preconfigured) with Spark. Do you use Zeppelin provided by EMR or are you
>> setting up Zeppelin separately?
>>
>> Thanks,
>> moon
>>
>> On Sat, Mar 18, 2017 at 4:13 PM Shanmukha Sreenivas Potti <
>> shanmu...@utexas.edu> wrote:
>>
>> ​​
>> Hi Moon,
>>
>> Thanks for responding. Exporting Spark_home is exactly where I have a
>> problem. I'm using Zeppelin notebook with Spark on EMR clusters from an AWS
>> account on cloud. I'm not the master account holder for that AWS account
>> but I'm guessing I'm a client account with limited access probably. Can I
>> still do it?
>>
>> If yes, can you explain where and how should I do that shell scripting to
>> export the variable? Can I do this in the notebook itself by starting the
>> paragraph with sh% or do I need to do something else?
>> If you can share any video that would be great. I would like to let you
>

Re: multiple instances of the same interpreter type

2017-04-07 Thread Ruslan Dautkhanov
We have each user running their own Zeppelin instances,
so everyone has Spark interpreter group defined as

  "option": {
..
"perNote": "shared",
"perUser": "shared",
..
  }

which translates to "interpreter will be instantiated Globally in shared
process."



-- 
Ruslan Dautkhanov

On Thu, Apr 6, 2017 at 6:34 PM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> What mode do you use ?
>
>
>
> Ruslan Dautkhanov <dautkha...@gmail.com>于2017年4月7日周五 上午12:49写道:
>
>> A user managed somehow to launch multiple instances of spark interpreter
>> under the same Zeppelin server.
>>
>> See a snippet of `pstree` output:
>>
>>   |-java,6360,wabramov -Dfile.encoding=UTF-8 -Xms1024m -Xmx2048m
>> -XX:MaxPermSize=512m-Dlog4j.configuration=file:///home/wabramov/
>>   |   |-interpreter.sh,4510 /opt/zeppelin/zeppelin-active/bin/interpreter.sh
>> -d /opt/zeppelin/zeppelin-active/interpreter/spark -p 45986 -l/opt/zeppe
>>   |   |   `-interpreter.sh,4523 
>> /opt/zeppelin/zeppelin-active/bin/interpreter.sh
>> -d /opt/zeppelin/zeppelin-active/interpreter/spark -p 45986 -l/opt/zeppe
>>   |   |   `-java,4524 -cp/etc/hive/conf/:/opt/
>> zeppelin/zeppelin-active/interpreter/spark/*:/opt/
>> zeppelin/zeppelin-active/zeppeli
>>   |   |-interpreter.sh,5097 /opt/zeppelin/zeppelin-active/bin/interpreter.sh
>> -d /opt/zeppelin/zeppelin-active/interpreter/spark -p 39752 -l/opt/zeppe
>>   |   |   `-interpreter.sh,5110 
>> /opt/zeppelin/zeppelin-active/bin/interpreter.sh
>> -d /opt/zeppelin/zeppelin-active/interpreter/spark -p 39752 -l/opt/zeppe
>>   |   |   `-java,5111 -cp/etc/hive/conf/:/opt/
>> zeppelin/zeppelin-active/interpreter/spark/*:/opt/
>> zeppelin/zeppelin-active/zeppeli
>>
>>
>> I see another user has three (3) instances running of %sh interpreter.
>>
>> Is this a known issue?
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>


Re: Other paragraphs do not wait for %sh paragraphs to finish.

2017-04-06 Thread Ruslan Dautkhanov
Apart from introducing a full-blown graph of DAG dependencies, a simpler
solution might be a paragraph-level boolean property "depends on previous
paragraph", so that during a run-all-paragraphs run such a paragraph wouldn't
be scheduled until the previous one completes (without errors).

It would be a compromise between a completely sequential run and having a full
way to define a DAG.
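
In the meantime, a strictly sequential "run all" can be scripted against the
REST API, since the run-paragraph endpoint is synchronous (at least as the
current REST docs describe it). A rough sketch, with host and note id as
placeholders and authentication ignored:

import requests

ZEPPELIN = "http://localhost:8080"      # placeholder
NOTE_ID = "2CABCDEF1"                   # placeholder

# GET /api/notebook/{noteId} returns the note; paragraphs sit under body.paragraphs
note = requests.get("%s/api/notebook/%s" % (ZEPPELIN, NOTE_ID)).json()

for p in note["body"]["paragraphs"]:
    # POST /api/notebook/run/{noteId}/{paragraphId} blocks until the paragraph is done
    r = requests.post("%s/api/notebook/run/%s/%s" % (ZEPPELIN, NOTE_ID, p["id"]))
    r.raise_for_status()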



-- 
Ruslan Dautkhanov

On Thu, Apr 6, 2017 at 1:32 AM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> That's correct, it needs define dependency between paragraphs, e.g.
>  %spark(deps=p1), so that we can build DAG for the whole pipeline.
>
>
>
>
>
> Rick Moritz <rah...@gmail.com>于2017年4月6日周四 下午3:28写道:
>
>> This actually calls for a dependency definition of notes within a
>> notebook, so the scheduler can decide which tasks to run simultaneously.
>> I suggest a simple counter of dependency levels, which by default
>> increases with every new note and can be decremented to allow notes to run
>> simultaneously. Run-all then submits each level into the target
>> interpreters for this level, awaits termination of all results, and then
>> starts the next level's note.
>>
>>
>> On Thu, Apr 6, 2017 at 12:57 AM, moon soo Lee <m...@apache.org> wrote:
>>
>> Hi,
>>
>> That's expected behavior at the moment. The reason is
>>
>> Each interpreter has it's own scheduler (either FIFO, Parallel), and
>> run-all just submit all paragraphs into target interpreter's scheduler.
>>
>> I think we can add feature such as run-all-sequentially.
>> Do you mind file a JIRA issue?
>>
>> Thanks,
>> moon
>>
>> On Thu, Apr 6, 2017 at 5:35 AM <murexconsult...@googlemail.com> wrote:
>>
>> I often have notebooks that have a %sh as the 1st paragraph. This scps
>> some file from another server, and then a number of spark or sparksql
>> paragraphs are after that.
>>
>> If I click on the run-all paragraphs at the top of the notebook the 1st
>> %sh paragraph kicks off as expected, but the 2nd %spark notebook starts too
>> at the same time. The others go into pending state and then start once the
>> spark one has completed.
>>
>> Is this a bug? Or am I doing something wrong?
>>
>> Thanks
>>
>>
>>


Re: Other paragraphs do not wait for %sh paragraphs to finish.

2017-04-06 Thread Ruslan Dautkhanov
Filed https://issues.apache.org/jira/browse/ZEPPELIN-2368

We have had users asking for the same thing - it forces them to run
paragraphs one by one manually.




-- 
Ruslan Dautkhanov

On Wed, Apr 5, 2017 at 4:57 PM, moon soo Lee <m...@apache.org> wrote:

> Hi,
>
> That's expected behavior at the moment. The reason is
>
> Each interpreter has it's own scheduler (either FIFO, Parallel), and
> run-all just submit all paragraphs into target interpreter's scheduler.
>
> I think we can add feature such as run-all-sequentially.
> Do you mind file a JIRA issue?
>
> Thanks,
> moon
>
> On Thu, Apr 6, 2017 at 5:35 AM <murexconsult...@googlemail.com> wrote:
>
>> I often have notebooks that have a %sh as the 1st paragraph. This scps
>> some file from another server, and then a number of spark or sparksql
>> paragraphs are after that.
>>
>> If I click on the run-all paragraphs at the top of the notebook the 1st
>> %sh paragraph kicks off as expected, but the 2nd %spark notebook starts too
>> at the same time. The others go into pending state and then start once the
>> spark one has completed.
>>
>> Is this a bug? Or am I doing something wrong?
>>
>> Thanks
>>
>>


multiple instances of the same interpreter type

2017-04-06 Thread Ruslan Dautkhanov
A user somehow managed to launch multiple instances of the Spark interpreter
under the same Zeppelin server.

See a snippet of `pstree` output:

  |-java,6360,wabramov -Dfile.encoding=UTF-8 -Xms1024m -Xmx2048m
-XX:MaxPermSize=512m-Dlog4j.configuration=file:///home/wabramov/
  |   |-interpreter.sh,4510 /opt/zeppelin/zeppelin-active/bin/interpreter.sh
-d /opt/zeppelin/zeppelin-active/interpreter/spark -p 45986 -l/opt/zeppe
  |   |   `-interpreter.sh,4523
/opt/zeppelin/zeppelin-active/bin/interpreter.sh -d
/opt/zeppelin/zeppelin-active/interpreter/spark -p 45986 -l/opt/zeppe
  |   |   `-java,4524
-cp/etc/hive/conf/:/opt/zeppelin/zeppelin-active/interpreter/spark/*:/opt/zeppelin/zeppelin-active/zeppeli
  |   |-interpreter.sh,5097 /opt/zeppelin/zeppelin-active/bin/interpreter.sh
-d /opt/zeppelin/zeppelin-active/interpreter/spark -p 39752 -l/opt/zeppe
  |   |   `-interpreter.sh,5110
/opt/zeppelin/zeppelin-active/bin/interpreter.sh -d
/opt/zeppelin/zeppelin-active/interpreter/spark -p 39752 -l/opt/zeppe
  |   |   `-java,5111
-cp/etc/hive/conf/:/opt/zeppelin/zeppelin-active/interpreter/spark/*:/opt/zeppelin/zeppelin-active/zeppeli


I see another user has three (3) instances of the %sh interpreter running.

Is this a known issue?


-- 
Ruslan Dautkhanov


Re: Roadmap for 0.8.0

2017-04-05 Thread Ruslan Dautkhanov
Hi Jeff,

I looked at the PR for ZEPPELIN-1595
<https://issues.apache.org/jira/browse/ZEPPELIN-1595>

It does not look like it covers the %sh interpreter.

The %sh and %sql interpreters are somewhat unique in that they don't have
access to the Zeppelin API (please correct me if I'm wrong).

So what https://issues.apache.org/jira/browse/ZEPPELIN-1967 suggests is to
introduce the syntax used in Jupyter notebooks, i.e. {*var1*} would be implied
as z.get('var1'), for example:

%sh
/path/to/script --param8={*var1*} --param9={*var2*}

where var1 and var2 would be implied to be fetched as z.get('var1')
and z.get('var2') respectively.


Or similarly for %sql :

%sql
create table dwh.table_{*year*} stored as parquet
as
select * from spark_df1 where year = {*year*}

We really miss global variables in %sql and %sh; with them, a Zeppelin note
could be used as a single parametrized orchestration of a whole workflow.


Thank you,
Ruslan Dautkhanov

On Wed, Apr 5, 2017 at 12:01 AM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> Hi Ruslan,
>
> Regarding 'make zeppelinContext available in shell interpreter', you may
> want to check https://issues.apache.org/jira/browse/ZEPPELIN-1595
>
>
> Ruslan Dautkhanov <dautkha...@gmail.com>于2017年4月3日周一 下午12:05写道:
>
>> That's exciting to see plans for 0.8.0 on the horizon.
>>
>> Here's my top list for 0.8 :
>>
>> - https://issues.apache.org/jira/browse/ZEPPELIN-2197 "Interpreter Idle
>> timeout"
>>  This is a most-wanted feature by our Zeppelin admins. It was mentioned
>> at least once on this email chain.
>>
>> - https://issues.apache.org/jira/browse/ZEPPELIN-1967 "Passing Z
>> variables to Shell Interpreter"
>>  We had several of our users asking about this functionality. %sh and
>> some other interpreters can't be
>>  parametrized by global variables. ZEPPELIN-1967 is one way of how this
>> can be solved.
>>
>> - https://issues.apache.org/jira/browse/ZEPPELIN-1660 "Home directory
>> references (i.e. ~/zeppelin/) in zeppelin-env.sh don't work as expected"
>>   Less of a critical compared to the above two, but it could complement
>> the multi-tenancy feature very well.
>>
>>
>> Best regards,
>> Ruslan Dautkhanov
>>
>> On Wed, Mar 22, 2017 at 11:29 AM, Felix Cheung <felixcheun...@hotmail.com
>> > wrote:
>>
>> +1 with latest/stable.
>>
>>
>>
>>
>> --
>> *From:* moon soo Lee <m...@apache.org>
>> *Sent:* Tuesday, March 21, 2017 8:41:58 AM
>> *To:* users@zeppelin.apache.org
>> *Cc:* d...@zeppelin.apache.org
>>
>> *Subject:* Re: Roadmap for 0.8.0
>>
>>
>> And if i suggest simplest way for us to set quality expectation to user,
>> which will be labeling release in download page.
>>
>> Currently releases are divided into 2 categories in download page.
>> 'Latest release' and 'Old releases'. I think we can treat 'Latest' as
>> unstable and add one more category 'Stable release'.
>>
>> For example, once 0.8.0 is released,
>>
>> Latest release : 0.8.0
>> Stable release : 0.7.1
>> Old release : 0.6.2, 0.6.1 
>>
>> Once we feel confident about the stability of latest release, we can just
>> change label from latest to stable in the download page. (and previous
>> stable goes to old releases)
>> We can even include formal vote for moving release from 'latest' to
>> 'stable' in our release process, if it is necessary.
>>
>> Thanks,
>> moon
>>
>> On Tue, Mar 21, 2017 at 6:59 AM moon soo Lee <m...@apache.org> wrote:
>>
>> Yes, having longer RC period will help.
>>
>> But if i recall 0.7.0 release, although 21 people participated verifying
>> through 4 RC for 15days, it wasn't enough to catch all critical problems
>> during the release process. After the release, we've got much more number
>> of bug reports, in next few days.
>>
>> Basically, verifying RC is limited to people who subscribe mailing list +
>> willing to contribute time to verify RC, which is much smaller number of
>> people who download release from download page. So having longer RC period
>> will definitely help and i think we should do, but I think it's still not
>> enough to make sure the quality, considering past history.
>>
>> AFAIK, releasing 0.8.0-preview, calling it unstable is up to the project.
>> ASF release process defines how to release source code, but it does not
>> really restrict what kind of 'version' the project should have releases.
>> For example, spark released spark-2.0.0-preview[1] before spark-2.0.0.
>>
>> Thanks

Re: Roadmap for 0.8.0

2017-04-02 Thread Ruslan Dautkhanov
That's exciting to see plans for 0.8.0 on the horizon.

Here's my top list for 0.8 :

- https://issues.apache.org/jira/browse/ZEPPELIN-2197 "Interpreter Idle
timeout"
 This is a most-wanted feature by our Zeppelin admins. It was mentioned at
least once on this email chain.

- https://issues.apache.org/jira/browse/ZEPPELIN-1967 "Passing Z variables
to Shell Interpreter"
 We had several of our users asking about this functionality. %sh and some
other interpreters can't be
 parametrized by global variables. ZEPPELIN-1967 is one way of how this can
be solved.

- https://issues.apache.org/jira/browse/ZEPPELIN-1660 "Home directory
references (i.e. ~/zeppelin/) in zeppelin-env.sh don't work as expected"
  Less critical than the above two, but it could complement the
multi-tenancy feature very well.


Best regards,
Ruslan Dautkhanov

On Wed, Mar 22, 2017 at 11:29 AM, Felix Cheung <felixcheun...@hotmail.com>
wrote:

> +1 with latest/stable.
>
>
>
>
> --
> *From:* moon soo Lee <m...@apache.org>
> *Sent:* Tuesday, March 21, 2017 8:41:58 AM
> *To:* users@zeppelin.apache.org
> *Cc:* d...@zeppelin.apache.org
>
> *Subject:* Re: Roadmap for 0.8.0
>
> And if i suggest simplest way for us to set quality expectation to user,
> which will be labeling release in download page.
>
> Currently releases are divided into 2 categories in download page. 'Latest
> release' and 'Old releases'. I think we can treat 'Latest' as unstable and
> add one more category 'Stable release'.
>
> For example, once 0.8.0 is released,
>
> Latest release : 0.8.0
> Stable release : 0.7.1
> Old release : 0.6.2, 0.6.1 
>
> Once we feel confident about the stability of latest release, we can just
> change label from latest to stable in the download page. (and previous
> stable goes to old releases)
> We can even include formal vote for moving release from 'latest' to
> 'stable' in our release process, if it is necessary.
>
> Thanks,
> moon
>
> On Tue, Mar 21, 2017 at 6:59 AM moon soo Lee <m...@apache.org> wrote:
>
>> Yes, having longer RC period will help.
>>
>> But if i recall 0.7.0 release, although 21 people participated verifying
>> through 4 RC for 15days, it wasn't enough to catch all critical problems
>> during the release process. After the release, we've got much more number
>> of bug reports, in next few days.
>>
>> Basically, verifying RC is limited to people who subscribe mailing list +
>> willing to contribute time to verify RC, which is much smaller number of
>> people who download release from download page. So having longer RC period
>> will definitely help and i think we should do, but I think it's still not
>> enough to make sure the quality, considering past history.
>>
>> AFAIK, releasing 0.8.0-preview, calling it unstable is up to the project.
>> ASF release process defines how to release source code, but it does not
>> really restrict what kind of 'version' the project should have releases.
>> For example, spark released spark-2.0.0-preview[1] before spark-2.0.0.
>>
>> Thanks,
>> moon
>>
>> [1] http://spark.apache.org/news/spark-2.0.0-preview.html
>>
>>
>> On Mon, Mar 20, 2017 at 11:31 PM Jongyoul Lee <jongy...@gmail.com> wrote:
>>
>> I agree that it will help prolong RC period and use it actually. And also
>> we need code freeze for the new features and spend time to stabilize RC.
>>
>> On Tue, Mar 21, 2017 at 1:25 PM, Felix Cheung <felixcheun...@hotmail.com>
>> wrote:
>>
>> +1 on quality and stabilization.
>>
>> I'm not sure if releasing as preview or calling it unstable fits with the
>> ASF release process though.
>>
>> Other projects have code freeze, RC (and longer RC iteration time) etc. -
>> do we think those will help improve quality when the release is finally cut?
>>
>>
>> _
>> From: Jianfeng (Jeff) Zhang <jzh...@hortonworks.com>
>> Sent: Monday, March 20, 2017 6:13 PM
>> Subject: Re: Roadmap for 0.8.0
>> To: <users@zeppelin.apache.org>, dev <d...@zeppelin.apache.org>
>>
>>
>>
>> Strongly +1 for adding system test for different interpreter modes and
>> focus on bug fixing than new features. I do heard from some users complain
>> about the bugs of zeppelin major release. A stabilized release is very
>> necessary for community.
>>
>>
>>
>>
>> Best Regard,
>> Jeff Zhang
>>
>>
>> From: moon soo Lee <m...@apache.org<mailto:m...@apache.org
>> <m...@apache.org>>>
>> Reply-T

Re: Should zeppelin.pyspark.python be used on the worker nodes ?

2017-03-20 Thread Ruslan Dautkhanov
> from pyspark.conf import SparkConf
> ImportError: No module named *pyspark.conf*


William, you probably meant

from pyspark import SparkConf


?


-- 
Ruslan Dautkhanov

On Mon, Mar 20, 2017 at 2:12 PM, William Markito Oliveira <
william.mark...@gmail.com> wrote:

> Ah! Thanks Ruslan! I'm still using 0.7.0 - Let me update to 0.8.0 and
> I'll come back update this thread with the results.
>
> On Mon, Mar 20, 2017 at 3:10 PM, William Markito Oliveira <
> william.mark...@gmail.com> wrote:
>
>> Hi moon, thanks for the tip. Here to summarize my current settings are
>> the following
>>
>> conf/zeppelin-env.sh has only SPARK_HOME setting:
>>
>> export SPARK_HOME=/opt/spark-2.1.0-bin-hadoop2.7/
>>
>> Then on the configuration of the interpreter through the web interface I
>> have:
>>
>> PYSPARK_PYTHON=/opt/miniconda2/envs/myenv/bin/python
>> zeppelin.pyspark.python=python
>>
>> But when I submit from the notebook I'm receiving:  pyspark is not
>> responding
>>
>> And the log file outputs:
>>
>> Traceback (most recent call last): File 
>> "/tmp/zeppelin_pyspark-6480867511995958556.py",
>> line 22, in  from pyspark.conf import SparkConf ImportError: No
>> module named pyspark.conf
>>
>> Any thoughts ?  Thanks a lot!
>>
>> On Mon, Mar 20, 2017 at 2:27 PM, moon soo Lee <m...@apache.org> wrote:
>>
>>> When property key in interpreter configuration screen matches certain
>>> condition [1], it'll be treated as a environment variable.
>>>
>>> You can remove PYSPARK_PYTHON from conf/zeppelin-env.sh and place it in
>>> interpreter configuration.
>>>
>>> Thanks,
>>> moon
>>>
>>> [1] https://github.com/apache/zeppelin/blob/master/zeppelin-
>>> interpreter/src/main/java/org/apache/zeppelin/interpreter/re
>>> mote/RemoteInterpreter.java#L152
>>>
>>>
>>> On Mon, Mar 20, 2017 at 12:21 PM William Markito Oliveira <
>>> william.mark...@gmail.com> wrote:
>>>
>>>> Thanks for the quick response Ruslan.
>>>>
>>>> But given that it's an environment variable, I can't quickly change
>>>> that value and point to a different python environment without restarting
>>>> the Zeppelin process, can I ? I mean is there a way to set the value for
>>>> PYSPARK_PYTHON from the Interpreter configuration screen ?
>>>>
>>>> Thanks,
>>>>
>>>>
>>>> On Mon, Mar 20, 2017 at 2:15 PM, Ruslan Dautkhanov <
>>>> dautkha...@gmail.com> wrote:
>>>>
>>>> You can set PYSPARK_PYTHON environment variable for that.
>>>>
>>>> Not sure about zeppelin.pyspark.python. I think it does not work
>>>> See comments in https://issues.apache.org/jira/browse/ZEPPELIN-1265
>>>>
>>>> Eventually, i think we can remove zeppelin.pyspark.python and use only
>>>> PYSPARK_PYTHON instead to avoid confusion.
>>>>
>>>>
>>>> --
>>>> Ruslan Dautkhanov
>>>>
>>>> On Mon, Mar 20, 2017 at 12:59 PM, William Markito Oliveira <
>>>> mark...@apache.org> wrote:
>>>>
>>>> I'm trying to use zeppelin.pyspark.python as the variable to set the
>>>> python that Spark worker nodes should use for my job, but it doesn't seem
>>>> to be working.
>>>>
>>>> Am I missing something or this variable does not do that ?
>>>>
>>>> My goal is to change that variable to point to different conda
>>>> environments.  These environments are available in all worker nodes since
>>>> it's on a shared location and ideally all nodes then would have access to
>>>> the same libraries and dependencies.
>>>>
>>>> Thanks,
>>>>
>>>> ~/William
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> ~/William
>>>>
>>>
>>
>>
>> --
>> ~/William
>>
>
>
>
> --
> ~/William
>


Re: Should zeppelin.pyspark.python be used on the worker nodes ?

2017-03-20 Thread Ruslan Dautkhanov
You're right - it will not be dynamic.

You may want to check
https://issues.apache.org/jira/browse/ZEPPELIN-2195
https://github.com/apache/zeppelin/pull/2079
it seems it is fixed in the current snapshot of Zeppelin (committed 3 weeks
ago).






-- 
Ruslan Dautkhanov

On Mon, Mar 20, 2017 at 1:21 PM, William Markito Oliveira <
william.mark...@gmail.com> wrote:

> Thanks for the quick response Ruslan.
>
> But given that it's an environment variable, I can't quickly change that
> value and point to a different python environment without restarting the
> Zeppelin process, can I ? I mean is there a way to set the value for
> PYSPARK_PYTHON from the Interpreter configuration screen ?
>
> Thanks,
>
>
> On Mon, Mar 20, 2017 at 2:15 PM, Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> You can set PYSPARK_PYTHON environment variable for that.
>>
>> Not sure about zeppelin.pyspark.python. I think it does not work
>> See comments in https://issues.apache.org/jira/browse/ZEPPELIN-1265
>>
>> Eventually, i think we can remove zeppelin.pyspark.python and use only
>> PYSPARK_PYTHON instead to avoid confusion.
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Mon, Mar 20, 2017 at 12:59 PM, William Markito Oliveira <
>> mark...@apache.org> wrote:
>>
>>> I'm trying to use zeppelin.pyspark.python as the variable to set the
>>> python that Spark worker nodes should use for my job, but it doesn't seem
>>> to be working.
>>>
>>> Am I missing something or this variable does not do that ?
>>>
>>> My goal is to change that variable to point to different conda
>>> environments.  These environments are available in all worker nodes since
>>> it's on a shared location and ideally all nodes then would have access to
>>> the same libraries and dependencies.
>>>
>>> Thanks,
>>>
>>> ~/William
>>>
>>
>>
>
>
> --
> ~/William
>


Re: Should zeppelin.pyspark.python be used on the worker nodes ?

2017-03-20 Thread Ruslan Dautkhanov
You can set PYSPARK_PYTHON environment variable for that.

Not sure about zeppelin.pyspark.python - I think it does not work.
See the comments in https://issues.apache.org/jira/browse/ZEPPELIN-1265

Eventually, I think we can remove zeppelin.pyspark.python and use only
PYSPARK_PYTHON instead, to avoid confusion.
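
Whatever combination of settings ends up winning, it is easy to verify which
python the driver and the executors actually picked up. A small %pyspark
sanity check (sketch, using the sc provided by the interpreter):

%pyspark
import sys

print(sys.executable)                       # python used by the driver

def worker_python(_):
    import sys
    return sys.executable

# runs the function on an executor and reports its python
print(sc.parallelize([0]).map(worker_python).first())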


-- 
Ruslan Dautkhanov

On Mon, Mar 20, 2017 at 12:59 PM, William Markito Oliveira <
mark...@apache.org> wrote:

> I'm trying to use zeppelin.pyspark.python as the variable to set the
> python that Spark worker nodes should use for my job, but it doesn't seem
> to be working.
>
> Am I missing something or this variable does not do that ?
>
> My goal is to change that variable to point to different conda
> environments.  These environments are available in all worker nodes since
> it's on a shared location and ideally all nodes then would have access to
> the same libraries and dependencies.
>
> Thanks,
>
> ~/William
>


Re: Is there a way to close interpreter after inactivity

2017-03-01 Thread Ruslan Dautkhanov
https://issues.apache.org/jira/browse/ZEPPELIN-2197

This was created just yesterday :-)

On Wed, Mar 1, 2017 at 12:54 PM Alexander Filipchik 
wrote:

> Hi,
>
> Is there any way to close an isolated interpreter after some timeout?
> Let's say set an inactivity timeout of 30 mins (user input or job output)
> and then return all the resources and close everything.
>
> Thank you,
> Alex
>


Re: "output disappears" after running a paragraph - on recent master snapshot

2017-01-30 Thread Ruslan Dautkhanov
Thanks for the follow up Moon.

I just happened to witness this on his PC - the paragraph output shows up
briefly and then disappears.
It's interesting that it only happens in Chrome for him.
(I use the same version of Chrome as him, and can't reproduce it.)

We also checked that it happens for different interpreters - %sh and %spark
both show output, then it disappears very quickly.

It doesn't happen in IE for that user. (IE has other issues like it doesn't
show "Took x min y sec. Last updated by z at ..." at the bottom of a
paragraph).

ps. We also checked that user doesn't have ad blocking extensions in Chrome
or anything like that.



-- 
Ruslan Dautkhanov

On Sun, Jan 29, 2017 at 2:43 PM, moon soo Lee <m...@apache.org> wrote:

> Hi,
>
> I'm not sure which action can possibly make output blinks and disappears.
> But
>
> ERROR [2017-01-28 11:13:53,338] ({pool-4-thread-1}
> AppendOutputRunner.java[run]:68) - Wait for OutputBuffer queue
> interrupted: null
>
> can occur when interpreter process is terminating (e.g. user click
> interpreter restart).
>
> Thanks,
> moon
>
>
> On Sun, Jan 29, 2017 at 6:18 AM Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> We upgraded our Zeppelin to yesterday's master snapshot.
>>
>> One of our users complains that none of his notes produce output
>> after the upgrade. The output "blinks" and then disappears.
>>
>> Here's quote from his email:
>>
>> My notebooks are having problems printing output.  I can see the output
>> flash on the screen but then it disappears.
>>
>>
>> See that user's log attached. It has exceptions, including:
>>
>> ERROR [2017-01-28 11:13:53,338] ({pool-4-thread-1}
>> AppendOutputRunner.java[run]:68) - Wait for OutputBuffer queue
>> interrupted: null
>> ERROR [2017-01-28 11:41:53,272] ({qtp691893263-18} 
>> ResourcePoolUtils.java[getAllResourcesExcept]:64)
>> -
>>
>>
>> I can't reproduce this issue, but we're running Zeppelin instances out of
>> the same
>> Zeppelin installation (running under different users though), so wanted
>> to bounce
>> this at other users. Have you seen this error before? Is this a known
>> issue?
>>
>>
>> Thanks,
>> Ruslan Dautkhanov
>>
>


Re: 'File size limit Exceeded' when importing notes - even for small files

2017-01-17 Thread Ruslan Dautkhanov
From the screenshot: "JSON file size cannot exceed MB".
Notice there is no number between "exceed" and "MB".
Not sure if we're missing a setting or an environment variable to define
the limit?
It now prevents us from importing any notebooks.



-- 
Ruslan Dautkhanov

On Tue, Jan 17, 2017 at 11:54 AM, Ruslan Dautkhanov <dautkha...@gmail.com>
wrote:

> 'File size limit Exceeded' when importing notes - even for small files
>
> This happens even for tiny files - a few Kb.
>
> Is this a known issue?
>
> Running Zeppelin 0.7.0 from a few weeks old snapshot.
>
> See attached screenshot.
>
>
> --
> Ruslan Dautkhanov
>


Re: Passing variables from %pyspark to %sh

2017-01-13 Thread Ruslan Dautkhanov
Created https://issues.apache.org/jira/browse/ZEPPELIN-1967

(JIRA had some issues.. https://twitter.com/infrabot  - had to wait a
couple of days.)

Great ideas. Thank you everyone.




-- 
Ruslan Dautkhanov

On Thu, Jan 12, 2017 at 8:55 AM, t p <tauis2...@gmail.com> wrote:

> Is something like feasible from the front end perspective - i.e the web UI
> (Angular?) - i.e. not matter which process/JVM runs the interpreter, I’d
> assume that a book is executed in the context of a we browser which unifies
> all the pages of the book...
>
> On Jan 12, 2017, at 9:56 AM, Jeff Zhang <zjf...@gmail.com> wrote:
>
>
> Agree to share variables between interpreters. Currently zeppelin launch
> one JVM for each interpreter group. So it is not possible to share
> variables between spark and sh. But for some interpreters like sh, md, it
> is not necessary to create separate JVM for them. We can embed them in
> spark interpreter JVM.  But we could not do it for all interpreters,
> because it would cause potential jar conflicts.
>
>
>
> Jongyoul Lee <jongy...@gmail.com>于2017年1月12日周四 下午10:18写道:
>
>> Yes, many users suggest that feature to share results between paragraphs
>> and different interpreters. I think this would be one of major features in
>> a next release.
>>
>> On Thu, Jan 12, 2017 at 10:30 PM, t p <tauis2...@gmail.com> wrote:
>>
>> Is it possible to have similar support to exchange  checkbox/dropdown
>> variables and can variables be exchanged with other interpreters like PSQL
>> (e.g. variable set by spark/pyspark and accessible in another para which is
>> running PSQL interpreter).
>>
>> I’m interested in doing this and I’d like to know if there is a way to
>> accomplish this:
>> https://lists.apache.org/thread.html/a1b3530e5a20f983acd70f8fca029f
>> 90b6bfe8d0d999597342447e6f@%3Cusers.zeppelin.apache.org%3E
>>
>>
>> On Jan 12, 2017, at 2:16 AM, Jongyoul Lee <jongy...@gmail.com> wrote:
>>
>> There's no way to communicate between spark and sh intepreter. It need to
>> implement it but it doesn't yet. But I agree that it would be helpful for
>> some cases. Can you create issue?
>>
>> On Thu, Jan 12, 2017 at 3:32 PM, Ruslan Dautkhanov <dautkha...@gmail.com>
>> wrote:
>>
>> It's possible to exchange variables between Scala and Spark
>> through z.put and z.get.
>>
>> How to pass a variable to %sh?
>>
>> In Jupyter it's possible to do for example as
>>
>>   ! hadoop fs -put {localfile} {hdfsfile}
>>
>>
>> where localfile and and hdfsfile are Python variables.
>>
>> Can't find any references for something similar in Shell Interpreter
>> https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/interpreter/shell.html
>>
>> In many notebooks we have to pass small variabels
>> from Zeppelin notes to external scripts as parameters.
>>
>> It would be awesome to have something like
>>
>> %sh
>> /path/to/script --param8={var1} --param9={var2}
>>
>>
>> where var1 and var2 would be implied to be fetched as z.get('var1')
>> and z.get('var2') respectively.
>>
>> Other thoughts?
>>
>>
>> Thank you,
>> Ruslan Dautkhanov
>>
>>
>>
>>
>> --
>> 이종열, Jongyoul Lee, 李宗烈
>> http://madeng.net
>>
>>
>>
>>
>>
>> --
>> 이종열, Jongyoul Lee, 李宗烈
>> http://madeng.net
>>
>
>


Passing variables from %pyspark to %sh

2017-01-11 Thread Ruslan Dautkhanov
It's possible to exchange variables between Scala and PySpark paragraphs
through z.put and z.get.

How to pass a variable to %sh?

In Jupyter it's possible to do for example as

>   ! hadoop fs -put {localfile} {hdfsfile}


where localfile and and hdfsfile are Python variables.

Can't find any references for something similar in Shell Interpreter
https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/interpreter/shell.html

In many notebooks we have to pass small variables
from Zeppelin notes to external scripts as parameters.

It would be awesome to have something like

%sh
> /path/to/script --param8={var1} --param9={var2}


where var1 and var2 would be implied to be fetched as z.get('var1')
and z.get('var2') respectively.
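
Until something like that exists, the closest workaround seems to be staying
in %pyspark and shelling out from there, so that ordinary Python variables can
be interpolated. A sketch (the paths are placeholders):

%pyspark
import subprocess

localfile = "/tmp/export.csv"               # placeholder values,
hdfsfile = "/user/me/export.csv"            # e.g. coming from z.get(...)

# rough equivalent of Jupyter's:  ! hadoop fs -put {localfile} {hdfsfile}
subprocess.check_call(["hadoop", "fs", "-put", localfile, hdfsfile])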

Other thoughts?


Thank you,
Ruslan Dautkhanov


Re: Interpreter zombie processes

2016-12-16 Thread Ruslan Dautkhanov
Thank you everyone for confirming this issue.

Created https://issues.apache.org/jira/browse/ZEPPELIN-1832

Thanks again.



-- 
Ruslan Dautkhanov

On Fri, Dec 16, 2016 at 2:48 AM, blaubaer <rene.pfitz...@nzz.ch> wrote:

> We are seeing this problem as well, regularly actually. Especially in
> situations when we have many concurrent interpreters running.
>
>
>
> --
> View this message in context: http://apache-zeppelin-users-
> incubating-mailing-list.75479.x6.nabble.com/Interpreter-
> zombie-processes-tp4738p4746.html
> Sent from the Apache Zeppelin Users (incubating) mailing list mailing list
> archive at Nabble.com.
>


code generation: paragraph generates another paragraph

2016-12-12 Thread Ruslan Dautkhanov
We'd like to have a paragraph's code generated by a preceding paragraph.

For example, one of the use cases we have is when %pyspark generates Hive DDLs
(we can't run those in Spark in some cases).
Is there any chance the output of a paragraph can be redirected into a
following paragraph?

I was thinking something like this could be used
https://zeppelin.apache.org/docs/latest/rest-api/rest-notebook.html#create-a-new-paragraph
http://[zeppelin-server]:[zeppelin-port]/api/notebook/[notebookId]/paragraph
But I'm not sure if there is an easy way to call the Zeppelin API directly
through the "z" variable?

Something like z.addParagraph(...)

In most cases a paragraph generates SQL code that can't be run directly as
Spark SQL and has to be run by a different engine, for example, by Hive.
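
There is no z.addParagraph() that I know of, but the REST call above can be
made from inside %pyspark with plain requests. A hedged sketch: the host, note
id and target interpreter are placeholders, authentication is ignored, and the
payload shape follows the create-a-new-paragraph doc linked above:

%pyspark
import requests

ZEPPELIN = "http://localhost:8080"          # placeholder
NOTE_ID = "2C1ABCDEF"                       # placeholder: the target note

# DDL generated earlier in this paragraph; %jdbc(hive) is just an example target
ddl = "%jdbc(hive)\ncreate table dwh.t1 (id int) stored as parquet"

# POST /api/notebook/{noteId}/paragraph appends a new paragraph to the note
r = requests.post("%s/api/notebook/%s/paragraph" % (ZEPPELIN, NOTE_ID),
                  json={"title": "generated DDL", "text": ddl})
r.raise_for_status()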


Thanks,
Ruslan


Re: 0.7.0 zeppelin.interpreters change: can't make pyspark be default Spark interperter

2016-12-08 Thread Ruslan Dautkhanov
I got a lucky jira number :-)

https://issues.apache.org/jira/browse/ZEPPELIN-1777

Thank you Jeff.



-- 
Ruslan Dautkhanov

On Thu, Dec 8, 2016 at 10:50 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> hmm, I think so, please file a ticket for it.
>
>
>
> Ruslan Dautkhanov <dautkha...@gmail.com>于2016年12月9日周五 下午1:49写道:
>
>> Hi Jeff,
>>
>> When I made pySpark as default - it works as expected;
>> except Setting UI. See screenshot below.
>>
>> Notice it shows %spark twice.
>> First time as default. 2nd one is not.
>> It should have been %pyspark (default), %spark, ..
>> as I made pyspark default.
>>
>> Is this a new bug in 0.7?
>>
>> [image: Inline image 1]
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Wed, Nov 30, 2016 at 7:34 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>> Hi Ruslan,
>>
>> I miss another thing, You also need to delete file conf/interpreter.json
>> which store the original setting. Otherwise the original setting is always
>> loaded.
>>
>>
>> Ruslan Dautkhanov <dautkha...@gmail.com>于2016年12月1日周四 上午1:03写道:
>>
>> Got it. Thanks Jeff.
>>
>> I've downloaded
>> https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/
>> interpreter-setting.json
>> and saved to $ZEPPELIN_HOME/interpreter/spark/
>> Then Moved  "defaultInterpreter": true,
>> from json section
>> "className": "org.apache.zeppelin.spark.SparkInterpreter",
>> to section
>> "className": "org.apache.zeppelin.spark.PySparkInterpreter",
>>
>> pySpark is still not default.
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Nov 29, 2016 at 10:36 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>> No, you don't need to create that directory, it should be in
>> $ZEPPELIN_HOME/interpreter/spark
>>
>>
>>
>>
>> Ruslan Dautkhanov <dautkha...@gmail.com>于2016年11月30日周三 下午12:12写道:
>>
>> Thank you Jeff.
>>
>> Do I have to create interpreter/spark directory in $ZEPPELIN_HOME/conf
>> or in $ZEPPELIN_HOME directory?
>> So zeppelin.interpreters in zeppelin-site.xml is deprecated in 0.7?
>>
>> Thanks!
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Nov 29, 2016 at 6:54 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>> The default interpreter is now defined in interpreter-setting.json
>>
>> You can update the following file to make pyspark as the default
>> interpreter and then copy it to folder interpreter/spark
>>
>> https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/
>> interpreter-setting.json
>>
>>
>>
>> Ruslan Dautkhanov <dautkha...@gmail.com>于2016年11月30日周三 上午8:49写道:
>>
>> After 0.6.2 -> 0.7 upgrade, pySpark isn't a default Spark interpreter;
>> despite we have org.apache.zeppelin.spark.*PySparkInterpreter*
>> listed first in zeppelin.interpreters.
>>
>> zeppelin.interpreters in zeppelin-site.xml:
>>
>> 
>>   zeppelin.interpreters
>>   org.apache.zeppelin.spark.PySparkInterpreter,org.
>> apache.zeppelin.spark.SparkInterpreter
>> ...
>> 
>>
>>
>>
>> Any ideas how to fix this?
>>
>>
>> Thanks,
>> Ruslan
>>
>>
>>
>>
>>


Re: 0.7.0 zeppelin.interpreters change: can't make pyspark be default Spark interperter

2016-12-08 Thread Ruslan Dautkhanov
Hi Jeff,

When I made pySpark the default, it works as expected - except for the
Settings UI. See the screenshot below.

Notice it shows %spark twice: the first one as default, the second one not.
Since I made pyspark the default, it should have been %pyspark (default),
%spark, ..

Is this a new bug in 0.7?

[image: Inline image 1]


-- 
Ruslan Dautkhanov

On Wed, Nov 30, 2016 at 7:34 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> Hi Ruslan,
>
> I missed another thing: you also need to delete the file conf/interpreter.json,
> which stores the original setting. Otherwise the original setting is always
> loaded.
>
>
> Ruslan Dautkhanov <dautkha...@gmail.com>于2016年12月1日周四 上午1:03写道:
>
>> Got it. Thanks Jeff.
>>
>> I've downloaded
>> https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/
>> interpreter-setting.json
>> and saved to $ZEPPELIN_HOME/interpreter/spark/
>> Then moved "defaultInterpreter": true,
>> from the JSON section
>> "className": "org.apache.zeppelin.spark.SparkInterpreter",
>> to the section
>> "className": "org.apache.zeppelin.spark.PySparkInterpreter",
>>
>> pySpark is still not default.
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Nov 29, 2016 at 10:36 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>> No, you don't need to create that directory, it should be in
>> $ZEPPELIN_HOME/interpreter/spark
>>
>>
>>
>>
>> Ruslan Dautkhanov <dautkha...@gmail.com>于2016年11月30日周三 下午12:12写道:
>>
>> Thank you Jeff.
>>
>> Do I have to create interpreter/spark directory in $ZEPPELIN_HOME/conf
>> or in $ZEPPELIN_HOME directory?
>> So zeppelin.interpreters in zeppelin-site.xml is deprecated in 0.7?
>>
>> Thanks!
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Nov 29, 2016 at 6:54 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>> The default interpreter is now defined in interpreter-setting.json
>>
>> You can update the following file to make pyspark the default
>> interpreter and then copy it to the folder interpreter/spark
>>
>> https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/
>> interpreter-setting.json
>>
>>
>>
>> Ruslan Dautkhanov <dautkha...@gmail.com>于2016年11月30日周三 上午8:49写道:
>>
>> After the 0.6.2 -> 0.7 upgrade, pySpark isn't the default Spark interpreter,
>> despite having org.apache.zeppelin.spark.*PySparkInterpreter*
>> listed first in zeppelin.interpreters.
>>
>> zeppelin.interpreters in zeppelin-site.xml:
>>
>> <property>
>>   <name>zeppelin.interpreters</name>
>>   <value>org.apache.zeppelin.spark.PySparkInterpreter,
>>          org.apache.zeppelin.spark.SparkInterpreter,
>>          ...</value>
>> </property>
>>
>>
>>
>> Any ideas how to fix this?
>>
>>
>> Thanks,
>> Ruslan
>>
>>
>>
>>


Re: Export note as a PDF

2016-12-08 Thread Ruslan Dautkhanov
Thank you Hyunsung.

For various reasons we can't use ZeppelinHub.
One of them is that we have to run Zeppelin on-prem and can't depend on
any external resources.

I've created
https://issues.apache.org/jira/browse/ZEPPELIN-1774
"Export notebook as a pixel-perfect printable document, i.e. export as a
PDF"
Please vote up if you would find that useful too.
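In the meantime, one possible on-prem workaround is printing the note URL to
PDF with headless Chrome. A rough sketch only: the note URL is a placeholder,
it assumes the note is reachable without an interactive login (e.g. anonymous
access), and the Angular UI may need the virtual-time budget to finish
rendering before the PDF is captured:

import subprocess

note_url = "http://zeppelin-host:8080/#/notebook/2C3XXXXXX"   # placeholder note URL

# print the rendered page to PDF with headless Chrome (Chrome 59+)
subprocess.check_call([
    "google-chrome", "--headless", "--disable-gpu",
    "--virtual-time-budget=15000",           # give the Angular UI time to render
    "--print-to-pdf=/tmp/zeppelin_note.pdf",
    note_url,
])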

Thank you.



-- 
Ruslan Dautkhanov

On Wed, Dec 7, 2016 at 10:32 PM, Hyunsung Jo <hyunsung...@gmail.com> wrote:

> Hi Ruslan,
>
> Not aware of Zeppelin's roadmap, but perhaps the tag line of the
> ZeppelinHub website (www.zeppelinhub.com) hints at its stance on
> PDF:
> "ANALYZE, SHARE, AND REPEAT.
> Share your graphs and reports from Apache Zeppelin with anyone.
> Never send a graph in a PDF or Powerpoint again."
>
> Regards,
> Jo
>
>
> On Thu, Dec 8, 2016 at 2:00 PM Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> Our users are looking for functionality similar to Jupyter's "save
>> notebook as PDF".
>> Is this in Zeppelin's roadmap somewhere?
>> Could not find any related JIRAs.
>>
>>
>> Thanks,
>> Ruslan Dautkhanov
>>
>


sparkContext to get Spark Driver's URL

2016-11-30 Thread Ruslan Dautkhanov
Is there an easy way to get the Spark driver's URL (i.e. from the SparkContext)?
I always have to go to CM -> YARN applications -> choose my Spark job ->
click Application Master, etc., to get the Spark driver UI.

Is there any way we could derive the driver's URL programmatically from the
SparkContext variable?


P.S. Longer term, it would be super awesome to get a link straight in the
Zeppelin notebook (when the SparkContext is instantiated).
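A minimal sketch of getting at this programmatically, assuming Spark 2.1+
where sc.uiWebUrl is exposed to PySpark (the ResourceManager address below
is a placeholder):

%pyspark
# direct driver UI (e.g. http://<driver-host>:4040); on YARN it may only be
# reachable from inside the cluster network
driver_ui = sc.uiWebUrl

# the YARN RM proxy URL can usually be built from the application id
app_id   = sc.applicationId
rm_addr  = "http://your-resourcemanager:8088"    # placeholder RM address
proxy_ui = "%s/proxy/%s/" % (rm_addr, app_id)

print("Driver UI : %s" % driver_ui)
print("YARN proxy: %s" % proxy_ui)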


Thank you,
Ruslan


shiro.ini [urls] authorization: lock Zeppelin to one user

2016-11-30 Thread Ruslan Dautkhanov
Until we have good multitenancy support in Zeppelin, we have to run
individual Zeppelin instances for each user.

We were trying to use following shiro.ini configurations:

> [urls]
> /api/version = anon
> /** = user["rdautkhanov@CORP.DOMAIN"]


Also tried

> /** = authc, user["rdautkhanov@CORP.DOMAIN"]


Neither works, in the sense that other users, after successful LDAP authentication,
can still create their own notebooks in another user's Zeppelin instance.

The [users] and [roles] sections in shiro.ini are empty.

[main] section configures LDAP authentication backend which works as
expected.

rdautkhanov@CORP.DOMAIN is actual user name which is used in LDAP
authentication.

How to make [urls] section let only one specific user in?
Again, neither

> /** = user["rdautkhanov@CORP.DOMAIN"]

nor

> /** = authc, user["rdautkhanov@CORP.DOMAIN"]

works as we expect.

LDAP authentication works as expected; we're struggling with authorization:
locking Zeppelin down in [urls] to one user (or a few users).
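One alternative that may work is a role-based lock instead of a per-user
filter. A rough sketch, assuming the group-to-role mapping that Zeppelin's
shiro.ini.template shows for org.apache.zeppelin.realm.LdapRealm (other
realms expose different properties); the group and role names below are
placeholders:

[main]
# ... existing LDAP realm configuration ...
# map an LDAP group to a Shiro role
ldapRealm.rolesByGroup = zeppelin_owner_grp: instance_owner

[urls]
/api/version = anon
# only members of the mapped role get past this point
/** = authc, roles[instance_owner]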


Thank you,
Ruslan


Re: 0.7.0 zeppelin.interpreters change: can't make pyspark the default Spark interpreter

2016-11-30 Thread Ruslan Dautkhanov
Got it. Thanks Jeff.

I've downloaded
https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/
interpreter-setting.json
and saved to $ZEPPELIN_HOME/interpreter/spark/
Then moved "defaultInterpreter": true,
from the JSON section
"className": "org.apache.zeppelin.spark.SparkInterpreter",
to the section
"className": "org.apache.zeppelin.spark.PySparkInterpreter",

pySpark is still not default.
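For reference, the relevant interpreter-setting.json fragment after the move
should look roughly like this (heavily abridged; all other keys of each entry
are left out):

[
  {
    "group": "spark",
    "name": "spark",
    "className": "org.apache.zeppelin.spark.SparkInterpreter"
  },
  {
    "group": "spark",
    "name": "pyspark",
    "className": "org.apache.zeppelin.spark.PySparkInterpreter",
    "defaultInterpreter": true
  }
]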



-- 
Ruslan Dautkhanov

On Tue, Nov 29, 2016 at 10:36 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> No, you don't need to create that directory, it should be in
> $ZEPPELIN_HOME/interpreter/spark
>
>
>
>
> Ruslan Dautkhanov <dautkha...@gmail.com>于2016年11月30日周三 下午12:12写道:
>
>> Thank you Jeff.
>>
>> Do I have to create interpreter/spark directory in $ZEPPELIN_HOME/conf
>> or in $ZEPPELIN_HOME directory?
>> So zeppelin.interpreters in zeppelin-site.xml is deprecated in 0.7?
>>
>> Thanks!
>>
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Nov 29, 2016 at 6:54 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>> The default interpreter is now defined in interpreter-setting.json
>>
>> You can update the following file to make pyspark the default
>> interpreter and then copy it to the folder interpreter/spark
>>
>> https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/
>> interpreter-setting.json
>>
>>
>>
>> Ruslan Dautkhanov <dautkha...@gmail.com>于2016年11月30日周三 上午8:49写道:
>>
>> After the 0.6.2 -> 0.7 upgrade, pySpark isn't the default Spark interpreter,
>> despite having org.apache.zeppelin.spark.*PySparkInterpreter*
>> listed first in zeppelin.interpreters.
>>
>> zeppelin.interpreters in zeppelin-site.xml:
>>
>> <property>
>>   <name>zeppelin.interpreters</name>
>>   <value>org.apache.zeppelin.spark.PySparkInterpreter,
>>          org.apache.zeppelin.spark.SparkInterpreter,
>>          ...</value>
>> </property>
>>
>>
>>
>> Any ideas how to fix this?
>>
>>
>> Thanks,
>> Ruslan
>>
>>
>>


Re: 0.7.0 zeppelin.interpreters change: can't make pyspark the default Spark interpreter

2016-11-29 Thread Ruslan Dautkhanov
Thank you Jeff.

Do I have to create interpreter/spark directory in $ZEPPELIN_HOME/conf
or in $ZEPPELIN_HOME directory?
So zeppelin.interpreters in zeppelin-site.xml is deprecated in 0.7?

Thanks!



-- 
Ruslan Dautkhanov

On Tue, Nov 29, 2016 at 6:54 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> The default interpreter is now defined in interpreter-setting.json
>
> You can update the following file to make pyspark the default
> interpreter and then copy it to the folder interpreter/spark
>
> https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/
> interpreter-setting.json
>
>
>
> Ruslan Dautkhanov <dautkha...@gmail.com>于2016年11月30日周三 上午8:49写道:
>
>> After the 0.6.2 -> 0.7 upgrade, pySpark isn't the default Spark interpreter,
>> despite having org.apache.zeppelin.spark.*PySparkInterpreter*
>> listed first in zeppelin.interpreters.
>>
>> zeppelin.interpreters in zeppelin-site.xml:
>>
>> <property>
>>   <name>zeppelin.interpreters</name>
>>   <value>org.apache.zeppelin.spark.PySparkInterpreter,
>>          org.apache.zeppelin.spark.SparkInterpreter,
>>          ...</value>
>> </property>
>>
>>
>>
>> Any ideas how to fix this?
>>
>>
>> Thanks,
>> Ruslan
>>
>

