For some packages, like graphframes, which are Spark packages, you can also use 
--packages on the command line of spark-submit or pyspark. 
See http://spark.apache.org/docs/latest/submitting-applications.html
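
For example, a command line might look like the following (the coordinate and
version are only an illustration; check the graphframes package listing for the
release that matches your Spark version):

    pyspark --packages graphframes:graphframes:0.1.0-spark1.6

spark-submit accepts the same --packages flag when submitting an application.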

    _____________________________
From: Jakob Odersky <ja...@odersky.com>
Sent: Thursday, March 17, 2016 6:40 PM
Subject: Re: installing packages with pyspark
To: Ajinkya Kale <kaleajin...@gmail.com>
Cc:  <user@spark.apache.org>


 Hi,
 regarding 1, packages are resolved locally. That means that when you
 specify a package, spark-submit will resolve the dependencies and
 download any jars on the local machine, before shipping* them to the
 cluster. So, without a priori knowledge of Dataproc clusters,
 specifying packages should work no differently there.
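
 For instance, the same local resolution happens if the coordinates are
 set programmatically before the context starts. A minimal sketch in
 PySpark, assuming the standard spark.jars.packages property and an
 illustrative graphframes coordinate:

     from pyspark import SparkConf, SparkContext

     # Example coordinate only; check the graphframes listing for the
     # release that matches your Spark version.
     conf = (SparkConf()
             .setAppName("graphframes-example")
             .set("spark.jars.packages",
                  "graphframes:graphframes:0.1.0-spark1.6"))

     # The jars are resolved and downloaded on the local machine when
     # the context starts, then made available to the executors over
     # the network.
     sc = SparkContext(conf=conf)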
    
 Unfortunately I can't help with 2.
    
 --Jakob   
    
 *shipping in this case means making them available via the network   
    
 On Thu, Mar 17, 2016 at 5:36 PM, Ajinkya Kale <kaleajin...@gmail.com> wrote:
 > Hi all,
 >
 > I had a couple of questions.
 >
 > 1. Is there documentation on how to add graphframes, or any other package
 > for that matter, to the Google Dataproc managed Spark clusters?
 >
 > 2. Is there a way to add a package to an existing pyspark context through a
 > jupyter notebook?
 >
 > --aj
    
 ---------------------------------------------------------------------   
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org   
 For additional commands, e-mail: user-h...@spark.apache.org