The embedded mode is not meant for production. It ships with an embedded spark
and an embedded hadoop, so it may not include other necessary dependencies
such as the azure file system. I would recommend using the non-embedded mode
instead (point SPARK_HOME to a spark installation).
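
A minimal sketch, assuming conf/zeppelin-env.sh has been created from its
template and that /opt/spark is only a placeholder for your actual Spark
installation path:

zeppelin-0.8.0-bin-all$ echo "export SPARK_HOME=/opt/spark" >> conf/zeppelin-env.sh

Then restart the Zeppelin daemon so the spark interpreter picks up the new
environment.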

Metin OSMAN <mos...@mixdata.com> wrote on Thursday, October 4, 2018 at 4:29 PM:

> Hi Jeff,
>
> I am using the embedded mode, and there is no SPARK_HOME set for the user
> running the daemon on the server.
> On my local computer, I am also running the embedded spark, but I also have
> a local installation of spark, and the SPARK_HOME env var is set.
> My local installation is correctly set up, so I can read and write files
> with the wasbs protocol.
>
> I just compared the environments through the spark UIs, and I noticed that
> on my local computer some parameters are mixed in from my local spark
> installation. For example, all of my local spark installation's jars are
> loaded in the classpath (one way to check this is sketched below).
>
> So the embedded spark can be messed up if the SPARK_HOME env var is set.
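>
> A minimal way to check this from a %spark paragraph (just a sketch; the
> output will depend entirely on your setup):
>
> %spark
> // Which Spark installation, if any, the interpreter process sees
> println(System.getenv("SPARK_HOME"))
> // JVM classpath entries, one per line; look for jars from outside the zeppelin folder
> System.getProperty("java.class.path").split(java.io.File.pathSeparator).foreach(println)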
>
> And then it seems that using azure storage with the wasbs protocol does not
> work out of the box.
> I was confused by the fact that all the needed jar files are present in the
> lib directory of the zeppelin installation folder.
> Actually, to make azure storage work, one must copy the needed jars from the
> lib directory to the interpreter dep directory:
>
> zeppelin-0.8.0-bin-all$ cp lib/*azure* interpreter/spark/dep/
>
> And set up the interpreter with the following parameters:
>
> spark.hadoop.fs.azure org.apache.hadoop.fs.azure.NativeAzureFileSystem
> spark.hadoop.fs.azure.account.key.<mystorageaccount>.blob.core.windows.net
> <mykey>
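>
> With the azure jars copied and the interpreter restarted, a quick sanity
> check from a %spark paragraph could look like this (just a sketch; the
> container, storage account, and path are placeholders):
>
> %spark
> // With the azure jars on the interpreter classpath and the two properties
> // above set, the wasbs:// scheme should resolve instead of throwing
> // "No FileSystem for scheme: wasbs".
> val df = sqlContext.read.parquet("wasbs://<mycontainer>@<mystorageaccount>.blob.core.windows.net/<mypath>")
> df.printSchema()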
>
> Metin
> At 1:29 am, Jeff Zhang wrote:
>
>
>
> Do you specify SPARK_HOME, or are you just using the local embedded mode of spark?
>
> Metin OSMAN <mos...@mixdata.com> wrote on Thursday, October 4, 2018 at 1:39 AM:
>
> Hi,
>
> I have downloaded and set up zeppelin on my local Ubuntu 18.04 computer, and
> I successfully managed to open files on Azure Storage with the spark
> interpreter out of the box.
>
> Then I installed the same package on an Ubuntu 14.04 server.
> When I try running a simple spark parquet read from an azure storage
> account, I get a java.io.IOException: No FileSystem for scheme: wasbs
>
> sqlContext.read.parquet("wasbs://mycontai...@myacountsa.blob.core.windows.net/mypath")
>
> java.io.IOException: No FileSystem for scheme: wasbs
>   at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2304)
>   at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
>   at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:350)
>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:348)
>   at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
>   at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
>   at scala.collection.immutable.List.flatMap(List.scala:344)
>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:348)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
>   at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:559)
>   at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:543)
>   ... 52 elided
>
> I copied the interpreter.json file from my local computer to the server, but
> that has not changed anything.
>
> Should it work out of the box, or could the fact that it worked on my local
> computer be due to some local spark configuration or environment variables?
>
> Thank you,
> Metin
>
>
