For packages like graphframes that are published as Spark packages, you could
also use --packages on the command line of spark-submit or pyspark.
See http://spark.apache.org/docs/latest/submitting-applications.html
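For example, assuming the graphframes build for Spark 1.6 (the exact
coordinates and the my_job.py file name below are only placeholders; check
spark-packages.org for the version matching your Spark release):

    spark-submit --packages graphframes:graphframes:0.1.0-spark1.6 my_job.py
    pyspark --packages graphframes:graphframes:0.1.0-spark1.6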
_____________________________
From: Jakob Odersky <[email protected]>
Sent: Thursday, March 17, 2016 6:40 PM
Subject: Re: installing packages with pyspark
To: Ajinkya Kale <[email protected]>
Cc: <[email protected]>
Hi,
regarding 1, packages are resolved locally. That means that when you
specify a package, spark-submit will resolve the dependencies and
download any jars to the local machine, before shipping* them to the
cluster. So, without a priori knowledge of Dataproc clusters,
specifying packages there should be no different.
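As a rough illustration of that local resolution (assuming the default Ivy
cache location and the same placeholder graphframes coordinates as above):

    # dependencies are resolved and downloaded on the submitting machine first
    spark-submit --packages graphframes:graphframes:0.1.0-spark1.6 my_job.py
    # the resolved jars land in the local Ivy cache before being shipped,
    # typically under:
    ls ~/.ivy2/jars/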
Unfortunately I can't help with 2.
--Jakob
*shipping in this case means making them available via the network
On Thu, Mar 17, 2016 at 5:36 PM, Ajinkya Kale <[email protected]> wrote:
> Hi all,
>
> I had a couple of questions.
> 1. Is there documentation on how to add graphframes, or any other package for
> that matter, to the Google Dataproc managed Spark clusters?
>
> 2. Is there a way to add a package to an existing pyspark context through a
> Jupyter notebook?
>
> --aj
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]