Re: Clean up PRs

2018-07-05 Thread Jongyoul Lee
I'll remove three more PRs below:

Closes #2713 How to create an universal filter for a dashboard using date
filter? By Manituti
Closes #2742 [ZEPPELIN-3168] Interpreter Settings Authorization By necosta
Closes #3040 Branch 0.7 By niuguidong


On Wed, Jul 4, 2018 at 4:51 PM, Jeff Zhang  wrote:

> Thanks Jongyoul
>
>
>
> Jongyoul Lee wrote on Wednesday, July 4, 2018 at 2:41 PM:
>
> > Hi,
> >
> > To keep the PR page clean, I will soon remove PRs that have been outdated
> > since the end of 2017. Here is the list of PRs:
> >
> > Closes #1896 implementation test case for connect to exists interpreter
> > process By cloverhearts
> > Closes #1928 ZEPPELIN-598 ] Dynamic loading for Interpreter and API By
> > cloverhearts
> > Closes #1930 [ZEPPELIN-1886] implementation z.getZeppelinJobStatus  By
> > cloverhearts
> > Closes #2271 [ZEPPELIN-2414] Memory leak under scoped mode of
> > SparkInterpreter caused by inapproprately setting
> Thread.contextClassLoader
> > By hammertank
> > Closes #2473 [fix] Check runtimeInfos is not null By ebuildy
> > Closes #2399 [ZEPPELIN-2620] fa-area chart cannot be resized smaller By
> > 0q
> > Closes #2041 ZEPPELIN-2137 Changed "show line chart with focus" to "zoom"
> > so that … By samthebest
> > Closes #2282 [ZEPPELIN-2447] Fix python interpreter as to use max result
> > setting By paularmand
> > Closes #2859 [ZEPPELIN-3320] upgrade the flink interpreter from 1.1.3 to
> > 1.4.0 By jianran
> > Closes #1873 [ZEPPELIN-1922] Exclude jackson-core and jackson-databind
> > from aws dep By epahomov
> > Closes #1893 [ZEPPELIN-451] Save codes and messages as multi-line By
> sixmen
> > Closes #1897 Can't remove interpreter repository. By astroshim
> > Closes #1907 [MINOR] Add enough padding at the bottom of notebook By
> > cuspymd
> > Closes #2019 [ZEPPELIN-2117] jdbc autocomplete for exasol database By
> > sotnich
> > Closes #2151 [ZEPPELIN-465] Be able to run paragraph and the following
> > ones By Remilito
> > Closes #2195 [ZEPPELIN-2319] new methods for ZeppelinContext By meniluca
> > Closes #2249 ZEPPELIN-2356. Improvement for z.angularBind and
> > z.angularWatch By zjffdu
> > Closes #2250 [ZEPPELIN-2085] Interpret scala code in paste mode By DrIgor
> > Closes #2227 [ZEPPELIN-2359] Support Spell as Display By echarles
> > Closes #2321 [ZEPPELIN-2498] add more info and config for Ldap
> > authentication  By khalidhuseynov
> > Closes #2286 Jenkins CI test By Leemoonsoo
> > Closes #2294 [ZEPPELIN-2438]: Add local winutils.exe in build step and
> use
> > if required … By cfries
> > Closes #2301 [ZEPPELIN-1625] Override Interpreter settings at User level
> > By benoyantony
> > Closes #2345 [MINOR] Hide pagination button if there is only 1 page in
> > Helium menu By AhyoungRyu
> > Closes #2372 [ZEPPELIN-2581] Shell ansi codes By DrIgor
> > Closes #2374 [ZEPPELIN-2593] Add storage settings to persist on run and
> > commit By khalidhuseynov
> > Closes #2396 [ZEPPELIN-2451]: Add JDBC config option for calling
> > connection.commit after paragraph execution By randerzander
> > Closes #2402 [ZEPPELIN-2636] User role lookup via interfaces By volumeint
> > Closes #2427 Rinterpreter By test2016new
> > Closes #2389 [ZEPPELIN-2612] Remove duplicates from ZeppelinConfiguration
> > By DrIgor
> > Closes #2434 ZEPPELIN-2687: Show code in description section of Spark UI
> > By karuppayya
> > Closes #2439 [ZEPPELIN-2680] allow opening notebook as a reader By herval
> > Closes #2440 [ZEPPELIN-2587] allow logging in if you're anonymous By
> herval
> > Closes #2448 [ZEPPELIN-2702] save notes in reader-friendly format with
> > zpln extension for VFS and Git repo By khalidhuseynov
> > Closes #2479 [ZEPPELIN-2762] Use regex to retrieve interpreter/repl name,
> > and to retrieve property key; add and fix tests By dabaitu
> > Closes #2508 [ZEPPELIN-2817] Support default interpreter setting in
> create
> > note re… By reminia
> > Closes #2484 [ZEPPELIN-2711] basic metrics for paragraphs & notebook
> > view/create/run By herval
> > Closes #2489 [Refactoring] some opportunities to use diamond operator By
> > desmorto
> > Closes #2503 [ZEPPELIN-2808] remember me support By herval
> > Closes #2523 [ZEPPELIN-2764] update packages and documentation for
> > ZeppelinHubRepo storage By khalidhuseynov
> > Closes #2528 ZEPPELIN-2834 Show Interpreter list as expand collapse
> blocks
> > By malayhm
> > Closes #2548 fix tiny typos By kepricon
> > Closes #2558 sc.setLocalProperty(...) should be more deterministic By
> > ruseel
> > Closes #2568 ZEPPELIN-2904 Show Remove Paragraph button upfront By
> malayhm
> > Closes #2573 [ZEPPELIN-2920] move commonly used error handlers to util
> > (FRONT) By 1ambda
> > Closes #2555 [ZEPPELIN-2885] Have Logger subclass StringIO and override
> > write method By sctincman
> > Closes #2602 [ZEPPELIN-2961] Remove unnecessary
> 'interpreter-setting.json'
> > file in the spark interpreter. By Leemoonsoo
> > Closes #2605 [ZEPPELIN-2963] Fix paragraph aborting on next run after
> > cancel By namanmishra91
> > Closes #2614 Add support for 

Re: illegal start of definition with new spark interpreter

2018-07-05 Thread Jeff Zhang
This is due to the different behavior of the new spark interpreter. I have
created ZEPPELIN-3587 and will fix it ASAP.



Paul Brenner wrote on Friday, July 6, 2018 at 1:11 AM:

> Hi all,
>
> When I try switching over to the new spark interpreter it seems there is a
> fundamental difference in how code is interpreted? Maybe that shouldn't be
> a surprise, but I'm wondering if other people have experienced it and if
> there is any workaround or hope for a change in the future.
>
> Specifically, if I write some very normal code that looks like the
> following:
>
> df.groupBy("x").count()
>   .filter($"count" >= 2)
>
>
> everything works fine with the old interpreter, but the new interpreter
> complains:
> <console>:1: error: illegal start of definition
> .filter($"count" >= 2)
>
> I realize that I can work around this by ending each line with a dot, but
> then
>
>1. I'm coding like a psychopath and
>2. I would have to go back and change every line of code in old
>notebooks
>
> Is this actually a bug/feature of the new spark interpreter or do I have
> some configuration problem? If it is a property of the new interpreter, is
> it always going to be this way? For now we are just telling our users not
> to use the new spark interpreter.
>
>
>
> *Paul Brenner*
> SR. DATA SCIENTIST
> (217) 390-3033

illegal start of definition with new spark interpreter

2018-07-05 Thread Paul Brenner
Hi all,

When I try switching over to the new spark interpreter it seems there is a 
fundamental difference in how code is interpreted? Maybe that shouldn't be a 
surprise, but I'm wondering if other people have experienced it and if there is 
any workaround or hope for a change in the future.

Specifically, if I write some very normal code that looks like the following:

df.groupBy("x").count()
  .filter($"count" >= 2)

everything works fine with the old interpreter, but the new interpreter 
complains:
<console>:1: error: illegal start of definition

.filter($"count" >= 2)

I realize that I can work around this by ending each line with a dot (see the sketch at the end of this message), but then
* I'm coding like a psychopath and 
* I would have to go back and change every line of code in old notebooks
Is this actually a bug/feature of the new spark interpreter or do I have some 
configuration problem? If it is a property of the new interpreter, is it always 
going to be this way? For now we are just telling our users not to use the new 
spark interpreter.
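
For reference, a sketch of the two usual ways to keep a method chain
multi-line in a line-by-line Scala REPL (this assumes a DataFrame `df` and the
interpreter's default `spark.implicits._` import; whether the new interpreter
accepts both styles is not confirmed in this thread):

// Trailing-dot style: each line ends with ".", so the parser knows the
// statement continues on the next line.
df.groupBy("x").count().
  filter($"count" >= 2)

// Parenthesized style: the expression is incomplete until the closing
// ")", so the leading-dot layout still parses as a single statement.
(df.groupBy("x").count()
  .filter($"count" >= 2))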

*Paul Brenner*
SR. DATA SCIENTIST
(217) 390-3033

RE: Partial code lost when multiple people work in same note

2018-07-05 Thread Belousov Maksim Eduardovich
PR 2848 [1] fixed this behavior, but it has not been merged to branch-0.8,
so no released version contains the fix yet.



1.   https://github.com/apache/zeppelin/pull/2848 - [Zeppelin-3307] - 
Improved shared browsing/editing for the note


Regards,

Maksim Belousov


From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Thursday, July 05, 2018 4:17 PM
To: users@zeppelin.apache.org
Subject: Re: Partial code lost when multiple people work in same note


Which version of Zeppelin do you use? And how do you collaborate? Do multiple
people work on the same paragraphs?

Ben Teeuwen <ben.teeu...@booking.com> wrote on Thursday, July 5, 2018 at 7:24 PM:
Hi,

We're trying out Zeppelin with a bunch of people. As soon as 2 people work in 
the same note on the same machine, code is lost from the chunk someone is 
working in. Quite a few colleagues concluded that the cooperation feature, 
initially expected to be one of the killer features, doesn't live up to the 
promise and moved back to Jupyter.

Is this a known issue, and/or have others experienced this? Curious if we've 
set it up erroneously, or whether this is ticket-worthy and needs more 
debugging information.

Ben


Re: Partial code lost when multiple people work in same note

2018-07-05 Thread Jeff Zhang
Which version of Zeppelin do you use? And how do you collaborate? Do multiple
people work on the same paragraphs?

Ben Teeuwen wrote on Thursday, July 5, 2018 at 7:24 PM:

> Hi,
>
> We're trying out Zeppelin with a bunch of people. As soon as 2 people work
> in the same note on the same machine, code is lost from the chunk someone
> is working in. Quite a few colleagues concluded that the cooperation
> feature, initially expected to be one of the killer features, doesn't live
> up to the promise and moved back to Jupyter.
>
> Is this a known issue, and/or have others experienced this? Curious if
> we've set it up erroneously, or whether this is ticket-worthy and needs
> more debugging information.
>
> Ben
>


Re: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()

2018-07-05 Thread Adamantios Corais
Hi Andrea,

The following workaround works for me (but maybe there are other
alternatives too):

- downloaded Spark (spark-2.3.1-bin-hadoop2.7)
- renamed zeppelin-env.sh.template to zeppelin-env.sh
- appended the following line to that file: export
SPARK_HOME=../../spark-2.3.1-bin-hadoop2.7/
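
A minimal script version of those steps (a sketch; the download URL and the
relative SPARK_HOME path are illustrative, and an absolute path is safer):

#!/bin/bash
# Fetch a standalone Spark build and unpack it next to Zeppelin
# (mirror URL is illustrative; any Apache mirror carrying 2.3.1 works).
wget https://archive.apache.org/dist/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz
tar zxvf spark-2.3.1-bin-hadoop2.7.tgz

# Point Zeppelin at that build instead of the embedded spark.
cd zeppelin-0.8.0-bin-all/conf
cp zeppelin-env.sh.template zeppelin-env.sh
echo 'export SPARK_HOME=../../spark-2.3.1-bin-hadoop2.7/' >> zeppelin-env.sh

# Restart Zeppelin so the interpreter picks up the new SPARK_HOME.
cd .. && bash ./bin/zeppelin.sh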

Hope this helps,




*// **Adamantios Corais*

On Thu, Jul 5, 2018 at 1:51 PM, Andrea Santurbano  wrote:

> Thanks Jeff,
> is there a workaround in order to make it work now?
>
> On Thursday, July 5, 2018 at 12:42, Jeff Zhang wrote:
>
>>
>> This is because the hadoop version used in the embedded spark is 2.3, which
>> is too old. I created https://issues.apache.org/jira/browse/ZEPPELIN-3586 for
>> this issue. It should be fixed in 0.8.1.
>>
>>
>>
>> Andrea Santurbano wrote on Thursday, July 5, 2018 at 3:35 PM:
>>
>>> I agree that it is not for production, but if you want to do a simple blog
>>> post (and that's what I'm doing), I think it's a well-suited solution.
>>> Is it possible to fix this?
>>> Thanks
>>> Andrea
>>>
>>> On Thursday, July 5, 2018 at 02:29, Jeff Zhang wrote:
>>>

 This might be due to the embedded spark version. I would recommend that you
 specify SPARK_HOME instead of using the embedded spark; the embedded spark
 is not for production.


 Andrea Santurbano wrote on Thursday, July 5, 2018 at 12:07 AM:

> I have the same issue...
> On Tuesday, July 3, 2018 at 23:18, Adamantios Corais <
> adamantios.cor...@gmail.com> wrote:
>
>> Hi Jeff, I am using the embedded Spark.
>>
>> FYI, this is how I start the dockerized (yet old) version of Zeppelin
>> that works as expected.
>>
>> #!/bin/bash
>>> docker run --rm \
>>> --name zepelin \
>>> -p 127.0.0.1:9090:8080 \
>>> -p 127.0.0.1:5050:4040 \
>>> -v $(pwd):/zeppelin/notebook \
>>> apache/zeppelin:0.7.3
>>
>>
>> And this is how I start the binarized (yet stable) version of
>> Zeppelin that is supposed to work (but it doesn't).
>>
>> #!/bin/bash
>>> wget http://www-eu.apache.org/dist/zeppelin/zeppelin-0.8.0/zeppelin-0.8.0-bin-all.tgz
>>> tar  zxvf zeppelin-0.8.0-bin-all.tgz
>>> cd   ./zeppelin-0.8.0-bin-all/
>>> bash ./bin/zeppelin.sh
>>
>>
>> Thanks.
>>
>>
>>
>>
>> *// **Adamantios Corais*
>>
>> On Tue, Jul 3, 2018 at 2:24 AM, Jeff Zhang  wrote:
>>
>>>
>>> Do you use the embedded spark or specify SPARK_HOME? If you set
>>> SPARK_HOME, which spark version and hadoop version do you use?
>>>
>>>
>>>
>>> Adamantios Corais wrote on Tuesday, July 3, 2018 at 12:32 AM:
>>>
 Hi,

 I have downloaded the latest binary package of Zeppelin (ver.
 0.8.0), extracted it, and started it as follows: `./bin/zeppelin.sh`

 Next, I tried a very simple example:

 `spark.read.parquet("./bin/userdata1.parquet").show()`

 This unfortunately returns the following error. Note that the same
 example works fine with the official docker version of Zeppelin (ver.
 0.7.3). Any ideas?

 org.apache.spark.SparkException: Job aborted due to stage failure:
> Task 0 in stage 7.0 failed 1 times, most recent failure: Lost task 0.0 in
> stage 7.0 (TID 7, localhost, executor driver): java.lang.NoSuchMethodError:
> org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()Lorg/apache/hadoop/fs/FileSystem$Statistics$StatisticsData;
> at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
> at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1.apply$mcJ$sp(SparkHadoopUtil.scala:149)
> at org.apache.spark.deploy.SparkHadoopUtil.getFSBytesReadOnThreadCallback(SparkHadoopUtil.scala:150)
> at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.<init>(FileScanRDD.scala:78)
> at org.apache.spark.sql.execution.datasources.FileScanRDD.compute(FileScanRDD.scala:71)
> at 

Re: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()

2018-07-05 Thread Jeff Zhang
This is because the hadoop version used in the embedded spark is 2.3, which is
too old. I created https://issues.apache.org/jira/browse/ZEPPELIN-3586 for
this issue. It should be fixed in 0.8.1.
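
For anyone who wants to confirm the mismatch, a quick sanity check (my sketch
using the standard Spark/Hadoop APIs, not something from this thread) is to
print both client versions from a spark paragraph:

// Print the Spark version and the Hadoop client version the interpreter
// is actually running against; with the embedded spark, the Hadoop
// version reported here is the too-old 2.3.x client.
println(s"Spark:  ${sc.version}")
println(s"Hadoop: ${org.apache.hadoop.util.VersionInfo.getVersion}")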



Andrea Santurbano wrote on Thursday, July 5, 2018 at 3:35 PM:

> I agree that it is not for production, but if you want to do a simple blog
> post (and that's what I'm doing), I think it's a well-suited solution.
> Is it possible to fix this?
> Thanks
> Andrea
>
> On Thursday, July 5, 2018 at 02:29, Jeff Zhang wrote:
>
>>
>> This might be due to the embedded spark version. I would recommend that you
>> specify SPARK_HOME instead of using the embedded spark; the embedded spark
>> is not for production.
>>
>>
>> Andrea Santurbano wrote on Thursday, July 5, 2018 at 12:07 AM:
>>
>>> I have the same issue...
>>> On Tuesday, July 3, 2018 at 23:18, Adamantios Corais <
>>> adamantios.cor...@gmail.com> wrote:
>>>
 Hi Jeff, I am using the embedded Spark.

 FYI, this is how I start the dockerized (yet old) version of Zeppelin
 that works as expected.

 #!/bin/bash
> docker run --rm \
> --name zepelin \
> -p 127.0.0.1:9090:8080 \
> -p 127.0.0.1:5050:4040 \
> -v $(pwd):/zeppelin/notebook \
> apache/zeppelin:0.7.3


 And this is how I start the binarized (yet stable) version of Zeppelin that
 is supposed to work (but it doesn't).

 #!/bin/bash
> wget http://www-eu.apache.org/dist/zeppelin/zeppelin-0.8.0/zeppelin-0.8.0-bin-all.tgz
> tar  zxvf zeppelin-0.8.0-bin-all.tgz
> cd   ./zeppelin-0.8.0-bin-all/
> bash ./bin/zeppelin.sh


 Thanks.




 *// **Adamantios Corais*

 On Tue, Jul 3, 2018 at 2:24 AM, Jeff Zhang  wrote:

>
> Do you use the embedded spark or specify SPARK_HOME? If you set
> SPARK_HOME, which spark version and hadoop version do you use?
>
>
>
> Adamantios Corais wrote on Tuesday, July 3, 2018 at 12:32 AM:
>
>> Hi,
>>
>> I have downloaded the latest binary package of Zeppelin (ver. 0.8.0),
>> extracted it, and started it as follows: `./bin/zeppelin.sh`
>>
>> Next, I tried a very simple example:
>>
>> `spark.read.parquet("./bin/userdata1.parquet").show()`
>>
>> This unfortunately returns the following error. Note that the same
>> example works fine with the official docker version of Zeppelin (ver.
>> 0.7.3). Any ideas?
>>
>> org.apache.spark.SparkException: Job aborted due to stage failure:
>>> Task 0 in stage 7.0 failed 1 times, most recent failure: Lost task 0.0 
>>> in
>>> stage 7.0 (TID 7, localhost, executor driver): 
>>> java.lang.NoSuchMethodError:
>>> org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()Lorg/apache/hadoop/fs/FileSystem$Statistics$StatisticsData;
>>> at
>>> org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
>>> at
>>> org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
>>> at
>>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>>> at
>>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>>> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>>> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>>> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>>> at
>>> scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>>> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>>> at
>>> org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1.apply$mcJ$sp(SparkHadoopUtil.scala:149)
>>> at
>>> org.apache.spark.deploy.SparkHadoopUtil.getFSBytesReadOnThreadCallback(SparkHadoopUtil.scala:150)
>>> at
>>> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.<init>(FileScanRDD.scala:78)
>>> at
>>> org.apache.spark.sql.execution.datasources.FileScanRDD.compute(FileScanRDD.scala:71)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>>> at
>>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>>> at
>>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>>> at org.apache.spark.scheduler.Task.run(Task.scala:108)
>>> at
>>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
>>> at
>>> 

Re: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()

2018-07-05 Thread Andrea Santurbano
I agree that it is not for production, but if you want to do a simple blog post
(and that's what I'm doing), I think it's a well-suited solution.
Is it possible to fix this?
Thanks
Andrea

On Thursday, July 5, 2018 at 02:29, Jeff Zhang wrote:

>
> This might be due to the embedded spark version. I would recommend that you
> specify SPARK_HOME instead of using the embedded spark; the embedded spark
> is not for production.
>
>
> Andrea Santurbano wrote on Thursday, July 5, 2018 at 12:07 AM:
>
>> I have the same issue...
>> On Tuesday, July 3, 2018 at 23:18, Adamantios Corais <
>> adamantios.cor...@gmail.com> wrote:
>>
>>> Hi Jeff, I am using the embedded Spark.
>>>
>>> FYI, this is how I start the dockerized (yet old) version of Zeppelin
>>> that works as expected.
>>>
>>> #!/bin/bash
 docker run --rm \
 --name zepelin \
 -p 127.0.0.1:9090:8080 \
 -p 127.0.0.1:5050:4040 \
 -v $(pwd):/zeppelin/notebook \
 apache/zeppelin:0.7.3
>>>
>>>
>>> And this is how I start the binarized (yet stable) version of Zeppelin that
>>> is supposed to work (but it doesn't).
>>>
>>> #!/bin/bash
 wget http://www-eu.apache.org/dist/zeppelin/zeppelin-0.8.0/zeppelin-0.8.0-bin-all.tgz
 tar  zxvf zeppelin-0.8.0-bin-all.tgz
 cd   ./zeppelin-0.8.0-bin-all/
 bash ./bin/zeppelin.sh
>>>
>>>
>>> Thanks.
>>>
>>>
>>>
>>>
>>> *// **Adamantios Corais*
>>>
>>> On Tue, Jul 3, 2018 at 2:24 AM, Jeff Zhang  wrote:
>>>

 Do you use the embedded spark or specify SPARK_HOME? If you set
 SPARK_HOME, which spark version and hadoop version do you use?



 Adamantios Corais wrote on Tuesday, July 3, 2018 at 12:32 AM:

> Hi,
>
> I have downloaded the latest binary package of Zeppelin (ver. 0.8.0),
> extracted it, and started it as follows: `./bin/zeppelin.sh`
>
> Next, I tried a very simple example:
>
> `spark.read.parquet("./bin/userdata1.parquet").show()`
>
> This unfortunately returns the following error. Note that the same
> example works fine with the official docker version of Zeppelin (ver.
> 0.7.3). Any ideas?
>
> org.apache.spark.SparkException: Job aborted due to stage failure:
>> Task 0 in stage 7.0 failed 1 times, most recent failure: Lost task 0.0 in
>> stage 7.0 (TID 7, localhost, executor driver): 
>> java.lang.NoSuchMethodError:
>> org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()Lorg/apache/hadoop/fs/FileSystem$Statistics$StatisticsData;
>> at
>> org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
>> at
>> org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
>> at
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>> at
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>> at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>> at
>> scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>> at
>> org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1.apply$mcJ$sp(SparkHadoopUtil.scala:149)
>> at
>> org.apache.spark.deploy.SparkHadoopUtil.getFSBytesReadOnThreadCallback(SparkHadoopUtil.scala:150)
>> at
>> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.<init>(FileScanRDD.scala:78)
>> at
>> org.apache.spark.sql.execution.datasources.FileScanRDD.compute(FileScanRDD.scala:71)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>> at
>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>> at
>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>> at org.apache.spark.scheduler.Task.run(Task.scala:108)
>> at
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> at java.lang.Thread.run(Thread.java:748)
>> Driver stacktrace:
>>   at org.apache.spark.scheduler.DAGScheduler.org
>>