You can check a script I created for the Amazon cloud: https://snippetessay.wordpress.com/2015/04/18/big-data-lab-in-the-cloud-with-hadoopsparkrpython/
If I remember correctly, you need to add something to the IPython startup .py file.

> On 03 Nov 2015, at 01:04, Andy Davidson <a...@santacruzintegration.com> wrote:
>
> Hi
>
> I recently installed a new cluster using
> spark-1.5.1-bin-hadoop2.6/ec2/spark-ec2. The SparkPi sample app works correctly.
>
> I am trying to run an IPython notebook on my cluster master and use an SSH
> tunnel so that I can work with the notebook in a browser running on my Mac.
> Below is how I set up the SSH tunnel:
>
> $ ssh -i $KEY_FILE -N -f -L localhost:8888:localhost:7000 ec2-user@$SPARK_MASTER
>
> $ ssh -i $KEY_FILE ec2-user@$SPARK_MASTER
> $ cd top level notebook dir
> $ IPYTHON_OPTS="notebook --no-browser --port=7000" /root/spark/bin/pyspark
>
> I am able to access my notebooks in the browser by opening
> http://localhost:8888
>
> When I run the following Python code I get the error "NameError: name 'sc' is
> not defined". Any idea what the problem might be?
>
> I looked through pyspark and tried various combinations of the following, but
> still get the same error:
>
> $ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser --port=7000" /root/spark/bin/pyspark --master=local[2]
>
> Kind regards
>
> Andy
>
>
> In [1]:
>
> import sys
> print(sys.version)
>
> import os
> print(os.getcwd() + "\n")
>
> 2.6.9 (unknown, Apr 1 2015, 18:16:00)
> [GCC 4.8.2 20140120 (Red Hat 4.8.2-16)]
> /home/ec2-user/dataScience
>
> In [2]:
>
> from pyspark import SparkContext
> textFile = sc.textFile("readme.txt")
> textFile.take(1)
> ---------------------------------------------------------------------------
> NameError                                 Traceback (most recent call last)
> <ipython-input-2-b67a9be29bd9> in <module>()
>       1 from pyspark import SparkContext
> ----> 2 textFile = sc.textFile("readme.txt")
>       3 textFile.take(1)
>
> NameError: name 'sc' is not defined
>
> In [ ]:
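For reference, here is roughly what I mean by adding something to the IPython startup: a small profile startup file that puts pyspark on sys.path and creates `sc` when the kernel starts. This is a config-fragment sketch, not a tested recipe for your cluster; the file name, the `/root/spark` default, and the `setMaster` value are assumptions you will need to adjust, and the py4j zip name is discovered with a glob because it varies by Spark version.

```python
# ~/.ipython/profile_default/startup/00-pyspark-setup.py
# Hypothetical startup file: runs automatically when the IPython kernel starts,
# so every notebook gets an `sc` without launching via bin/pyspark.
import glob
import os
import sys

# Locate the Spark install; /root/spark is where spark-ec2 puts it (an assumption).
spark_home = os.environ.get("SPARK_HOME", "/root/spark")

# Put the pyspark Python package and the bundled py4j zip on the path.
sys.path.insert(0, os.path.join(spark_home, "python"))
py4j_zips = glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*-src.zip"))
if py4j_zips:
    sys.path.insert(0, py4j_zips[0])

# Create the SparkContext the notebook code expects as `sc`.
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("ipython-notebook").setMaster("local[2]")  # adjust master
sc = SparkContext(conf=conf)
```

With this in place you start the notebook server directly (`ipython notebook --no-browser --port=7000`) instead of going through `bin/pyspark`.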