Hi Felix and Tomas, Thanks a lot for your information. I figured out that the environment variable PYSPARK_PYTHON is the secret key.
My current approach is to start the iPython notebook on the namenode:

export PYSPARK_PYTHON=/opt/local/anaconda/bin/ipython
/opt/local/anaconda/bin/ipython notebook --profile=mypysparkprofile

In my iPython notebook, I have the flexibility to manually start my SparkContext in a way like this:

import os
import sys

os.environ["YARN_CONF_DIR"] = "/etc/hadoop/conf"
os.environ["JAVA_HOME"] = "/usr/lib/jvm/jre-1.7.0-openjdk.x86_64/"
sys.path.append("/opt/cloudera/parcels/CDH/lib/spark/python")
sys.path.append("/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.8.2.1-src.zip")

from pyspark import SparkContext, SparkConf

conf = (SparkConf()
        .setMaster("spark://datafireball1:7077")
        .setAppName("SparkApplication")
        .set("spark.executor.memory", "16g")
        .set("spark.ui.killEnabled", "true"))
sc = SparkContext(conf=conf)

This really works out for me, and I am using the latest iPython notebook to interactively write Spark applications. If you have a better Python solution that can offer a better workflow for interactive Spark development, please share.

Bin

On Tue, May 12, 2015 at 1:20 AM, Tomas Olsson <tomas.ols...@mdh.se> wrote:
> Hi,
> You can try
>
> PYSPARK_DRIVER_PYTHON=/path/to/ipython
> PYSPARK_DRIVER_PYTHON_OPTS="notebook" /path/to/pyspark
>
>
> /Tomas
>
> > On 11 May 2015, at 22:17, Bin Wang <binwang...@gmail.com> wrote:
> >
> > Hey there,
> >
> > I have installed a Python interpreter in a certain location, say "/opt/local/anaconda".
> >
> > Is there any way that I can specify the Python interpreter while developing in iPython notebook? Maybe a property to set while creating the SparkContext?
> >
> > I know that I can put "#!/opt/local/anaconda" at the top of my Python code and use spark-submit to distribute it to the cluster. However, since I am using iPython notebook, this is not available as an option.
> >
> > Best,
> >
> > Bin
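
For completeness, here is a minimal sketch that combines the two suggestions in this thread: the SparkContext is built by hand inside the notebook, and PYSPARK_PYTHON is exported from the notebook kernel itself rather than the shell. The paths, master URL, and app name are taken from the messages above and should be treated as placeholders for your own cluster; it also assumes the Spark 1.x behavior where the driver's PYSPARK_PYTHON value is the interpreter handed to the Python workers on the executors, so that path has to exist on every worker node, not just the namenode.

import os
import sys

# Interpreter the executors should use for Python workers (assumed Anaconda path,
# must be installed at the same location on all worker nodes).
os.environ["PYSPARK_PYTHON"] = "/opt/local/anaconda/bin/python"
os.environ["YARN_CONF_DIR"] = "/etc/hadoop/conf"
os.environ["JAVA_HOME"] = "/usr/lib/jvm/jre-1.7.0-openjdk.x86_64/"

# Make the CDH-bundled PySpark and py4j importable inside the notebook kernel.
sys.path.append("/opt/cloudera/parcels/CDH/lib/spark/python")
sys.path.append("/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.8.2.1-src.zip")

from pyspark import SparkContext, SparkConf

conf = (SparkConf()
        .setMaster("spark://datafireball1:7077")   # master URL from the message above
        .setAppName("SparkApplication")
        .set("spark.executor.memory", "16g"))
sc = SparkContext(conf=conf)

# Quick sanity check that the executors really run the intended interpreter.
print(sc.parallelize(range(2)).map(lambda _: __import__("sys").executable).collect())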