Hi,
please paste the exception.
For Spark vs Jupyter, you might want to sign up for this.
It'll give you Jupyter and Spark, and presumably spark-csv is already
part of it?

https://community.cloud.databricks.com/login.html

hth
marco



On Sat, Sep 3, 2016 at 8:10 PM, Arif,Mubaraka <arif.mubar...@heb.com> wrote:

> On the on-premise *Cloudera Hadoop 5.7.2* cluster I have installed the
> Anaconda package and am trying to *set up a Jupyter notebook* to work with
> Spark 1.6.
>
>
>
> I have run into problems when trying to use the package
> *com.databricks:spark-csv_2.10:1.4.0* for *reading and inferring the
> schema of a csv file using python spark*.
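[Editor's note: for readers unfamiliar with what the *inferSchema* option does, here is a rough stdlib-only illustration of the idea — scan every value in a column and keep the narrowest type ("int" → "double" → "string") that still fits all rows. This is NOT the spark-csv implementation, just a sketch of the technique; names and the sample data are made up.]

```python
import csv
import io

def infer_type(value, current="int"):
    # Return the narrowest type, no narrower than `current`, that fits `value`.
    order = ["int", "double", "string"]
    for t in order[order.index(current):]:
        try:
            if t == "int":
                int(value)
            elif t == "double":
                float(value)
            return t
        except ValueError:
            continue
    return "string"

def infer_schema(csv_text, header=True):
    # One pass over all rows, widening each column's type as needed,
    # roughly what a CSV reader with inferSchema=true has to do.
    rows = list(csv.reader(io.StringIO(csv_text)))
    names = rows[0] if header else ["_c%d" % i for i in range(len(rows[0]))]
    data = rows[1:] if header else rows
    types = ["int"] * len(names)
    for row in data:
        for i, value in enumerate(row):
            types[i] = infer_type(value, types[i])
    return list(zip(names, types))

sample = "id,price,city\n1,9.99,Austin\n2,12.50,Dallas\n"
print(infer_schema(sample))
# → [('id', 'int'), ('price', 'double'), ('city', 'string')]
```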
>
>
>
> I have installed the *jar file spark-csv_2.10-1.4.0.jar* in
> */var/opt/teradata/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/jar* and the
> *configurations* are set as:
>
>
>
> export PYSPARK_DRIVER_PYTHON=/var/opt/teradata/cloudera/parcels/Anaconda-4.0.0/bin/jupyter
> export PYSPARK_DRIVER_PYTHON_OPTS="notebook --NotebookApp.open_browser=False --NotebookApp.ip='*' --NotebookApp.port=8083"
> export PYSPARK_PYTHON=/var/opt/teradata/cloudera/parcels/Anaconda-4.0.0/bin/python
>
>
>
> When I run pyspark from the command line with the --packages option, like:
>
>
>
> $ pyspark --packages com.databricks:spark-csv_2.10:1.4.0
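[Editor's note: if the failure is in dependency resolution itself (e.g. no internet access from the cluster, so ivy cannot download the package), a common workaround is to skip --packages and hand Spark the local jars directly via --jars and --driver-class-path. The sketch below reuses the parcel directory mentioned above; the commons-csv jar name and version are assumptions — with --packages Spark would resolve spark-csv's transitive dependencies for you, but with --jars you must supply them yourself.]

```shell
# Workaround sketch: bypass ivy resolution by pointing at local jars.
# JAR_DIR and the commons-csv version are illustrative, not verified.
JAR_DIR=/var/opt/teradata/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/jar

pyspark \
  --jars $JAR_DIR/spark-csv_2.10-1.4.0.jar,$JAR_DIR/commons-csv-1.1.jar \
  --driver-class-path $JAR_DIR/spark-csv_2.10-1.4.0.jar:$JAR_DIR/commons-csv-1.1.jar
```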
>
>
>
> It throws an error and fails to recognize the added dependency.
>
>
>
> Any ideas on how to resolve this error are much appreciated.
>
>
>
> Also, if you have experience installing and running Jupyter notebook with
> Anaconda and Spark, please share it.
>
>
>
> thanks,
>
> Muby
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
