Here's what I set in a shell script to start the notebook:

export PYSPARK_PYTHON=~/anaconda/bin/python
export PYSPARK_DRIVER_PYTHON=~/anaconda/bin/ipython
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

If you want to use HiveContext w/CDH:

export HADOOP_CONF_DIR=/etc/hive/conf

Then just run pyspark:
pyspark --master yarn-client --driver-memory 4G --executor-memory 2G \
    --num-executors 10
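
Putting the pieces above together, a complete start-notebook script might look
like this (a sketch only -- the Anaconda path and the memory/executor sizes are
examples from this thread, so adjust them for your own cluster):

```shell
#!/bin/bash
# Launch a Jupyter/IPython notebook backed by PySpark on YARN.
# Assumes Anaconda is installed at $HOME/anaconda -- change the paths if not.

export PYSPARK_PYTHON=~/anaconda/bin/python
export PYSPARK_DRIVER_PYTHON=~/anaconda/bin/ipython
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

# Only needed if you want HiveContext on CDH:
export HADOOP_CONF_DIR=/etc/hive/conf

# Resource settings are illustrative; tune for your cluster.
pyspark --master yarn-client \
    --driver-memory 4G \
    --executor-memory 2G \
    --num-executors 10
```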


-Don


On Wed, Dec 2, 2015 at 6:11 AM, Roberto Pagliari <roberto.pagli...@asos.com>
wrote:

> Does anyone have a pointer to Jupyter configuration with pyspark? The
> current material on python inotebook is out of date, and jupyter ignores
> ipython profiles.
>
> Thank you,
>
>


-- 
Donald Drake
Drake Consulting
http://www.drakeconsulting.com/
https://twitter.com/dondrake
800-733-2143