Hi,

thanks a lot for your help, it worked like you described. But I have another
question: would it be possible to define the dependency centrally somehow, so
that it is not necessary to insert %dep in every notebook, and especially not
necessary to restart the interpreter every time I start a notebook?
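
I was wondering whether something like this in conf/zeppelin-env.sh would do it
(just a guess on my part, untested):

export SPARK_SUBMIT_OPTIONS="--packages org.elasticsearch:elasticsearch-spark_2.10:2.1.2"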

BR


> Am 10.11.2015 um 23:06 schrieb Jeff Steinmetz <jeffrey.steinm...@gmail.com>:
> 
> You can load the elasticsearch-hadoop dependency directly from the Maven 
> snapshot repository if you like.
> I’m using the snapshot builds since they fix a few issues that I’ve been 
> testing with Costin @ Elastic recently.
> 
> In your interpreter settings, you will want to set a new property, es.nodes, 
> listing your comma-separated Elasticsearch IP addresses or host names.
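> 
> For example (the addresses below are just placeholders, and es.port is only 
> needed if you are not on the default):
> 
> es.nodes   10.0.0.1,10.0.0.2
> es.port    9200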
> 
> I.e., you can do this (assuming you want to use the native elasticsearch-spark 
> approach, which is preferred over the Hadoop readers/writers):
> 
> 
> %dep
> 
> z.addRepo("Sonatype 
> snapshot").url("https://oss.sonatype.org/content/repositories/snapshots";).snapshot
> z.load("org.elasticsearch::elasticsearch-spark:2.2.0.BUILD-SNAPSHOT")
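> 
> (Note that %dep has to run before the Spark interpreter starts, so if the 
> interpreter is already running you will need to restart it first.)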
> 
> %spark
> 
> import org.elasticsearch.spark._  // brings esRDD / esJsonRDD onto the SparkContext
> 
> val query = "{ some ES style query string here }"
> 
> // returns the original JSON; if you omit query, it will assume match_all
> val RDD = sc.esJsonRDD("evo-session-reader/session", query)
> // returns a Map[String, AnyRef] per document
> val RDD2 = sc.esRDD("evo-session-reader/session", query)
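> 
> Since your original mail mentioned DataFrames: the same connector also has 
> Spark SQL support, so something like the following should work too (a sketch, 
> untested against your index):
> 
> import org.elasticsearch.spark.sql._  // adds esDF to the SQLContext
> 
> val df = sqlContext.esDF("evo-session-reader/session")  // index/type as a DataFrame
> df.printSchema()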
> 
> Best
> Jeff Steinmetz
> 
> 
> On 11/10/15, 1:45 PM, "SiS" <n...@cht3.com> wrote:
> 
>> Hi Everybody, 
>> 
>> As I’m new to Spark and Zeppelin, I hope my question is in the right place 
>> here. 
>> I played around with Zeppelin and Spark and tried to load data by connecting 
>> to an Elasticsearch cluster. 
>> But to be honest I have no clue how to set up Zeppelin or the notebook to use 
>> the elasticsearch-hadoop/spark library (jar) so that I’m able to connect 
>> using pyspark. 
>> Do I have to copy the jar somewhere in the zeppelin folders?
>> 
>> My plan is to transfer an index/type from Elasticsearch to DataFrames in 
>> Spark.
>> 
>> Could somebody give me a short explanation of how to set this up, or point 
>> me to the right documentation?
>> 
>> Any help would be appreciated.
>> 
>> Thanks a lot
>> Sven
> 
