Hi Jorge / All,

Please go through this link first:
http://spark.apache.org/docs/latest/spark-standalone.html
It explains how to start a Spark cluster in standalone mode. If you have not
started or worked with a Spark cluster in standalone mode, kindly do not
attempt to answer this question.

My question is how to use packages such as
https://github.com/databricks/spark-csv when I am running a Spark cluster in
standalone mode.
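
To make the question concrete, what I ultimately want to be able to run, once
the package is somehow made available to the cluster, is something along these
lines (sqlContext here is the SQLContext from the shell, and the file path is
only a placeholder):

# read a CSV file through the spark-csv data source
df = sqlContext.read.format('com.databricks.spark.csv') \
                    .options(header='true', inferschema='true') \
                    .load('/path/to/file.csv')
df.printSchema()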

Regards,
Gourav Sengupta

On Mon, Feb 15, 2016 at 1:55 PM, Jorge Machado <jom...@me.com> wrote:

> Hi Gourav,
>
> I did not understand your problem… the --packages option should not make
> any difference whether you are running standalone or on YARN, for example.
> Give us an example of which packages you are trying to load and what error
> you are getting… If you want to use the libraries from spark-packages.org
> without --packages, why do you not use Maven?
> Regards
>
>
> On 12/02/2016, at 13:22, Gourav Sengupta <gourav.sengu...@gmail.com>
> wrote:
>
> Hi,
>
> I am creating a SparkContext against a Spark standalone cluster, as described
> here: http://spark.apache.org/docs/latest/spark-standalone.html, using the
> following code:
>
>
> --------------------------------------------------------------------------------------------------------------------------
> import multiprocessing
> from pyspark import SparkConf, SparkContext
>
> sc.stop()  # stop the context the shell created before building a new one
> conf = SparkConf().set('spark.driver.allowMultipleContexts', 'false') \
>                   .setMaster("spark://hostname:7077") \
>                   .set('spark.shuffle.service.enabled', 'true') \
>                   .set('spark.dynamicAllocation.enabled', 'true') \
>                   .set('spark.executor.memory', '20g') \
>                   .set('spark.driver.memory', '4g') \
>                   .set('spark.default.parallelism', multiprocessing.cpu_count() - 1)
> conf.getAll()  # inspect the effective configuration
> sc = SparkContext(conf=conf)
>
> ----- (we should definitely be able to optimise the configuration, but that
> is not the point here) -----
>
> When I create the context this way, I am not able to use packages (a list of
> which is available at http://spark-packages.org).
>
> Whereas if I use the standard "pyspark --packages" option, the packages load
> just fine.
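>
> For reference, the invocation that does work for me looks roughly like this
> (the spark-csv coordinates and version are only an example):
>
> pyspark --master spark://hostname:7077 --packages com.databricks:spark-csv_2.10:1.3.0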
>
> I would be grateful if someone could let me know how to load packages when
> starting a cluster as described above.
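>
> One workaround I have come across, but have not yet verified against a
> standalone master, is to set PYSPARK_SUBMIT_ARGS before the SparkContext
> (and hence the JVM gateway) is created, so that the --packages option is
> picked up; the spark-csv coordinates and version below are only an example:
>
> import os
>
> # must be set before the SparkContext / JVM gateway is created;
> # the trailing "pyspark-shell" token is required
> os.environ['PYSPARK_SUBMIT_ARGS'] = \
>     '--packages com.databricks:spark-csv_2.10:1.3.0 pyspark-shell'
>
> from pyspark import SparkConf, SparkContext
> sc = SparkContext(conf=SparkConf().setMaster("spark://hostname:7077"))
>
> Even if that works, I would still like to know whether there is an
> equivalent that can be expressed purely through SparkConf.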
>
>
> Regards,
> Gourav Sengupta
>
>
>
