For a bit more context, this is the general way of starting Jupyter with PySpark support. In contrast, the usual `jupyter notebook` command will only launch Jupyter with a standard Python kernel.
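For example, once the notebook server is up this way, a fresh notebook already has a SparkContext bound to the name `sc` (pyspark creates it for you). A quick sanity check along the following lines should work; it assumes the `systemml` Python package is installed, and the tiny DML "hello" script is just an illustrative smoke test:

# run in a notebook cell; `sc` is created automatically by pyspark
print(sc.version)                                 # confirm Spark is up

import systemml as sml                            # assumes `pip install systemml`
from systemml import dml

ml = sml.MLContext(sc)                            # same MLContext setup used later in this thread
ml.execute(dml('print("hello from SystemML")'))   # run a one-line DML script as a smoke test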
Additionally, all of the extra "conf" settings in that command refer to settings that could instead be placed in the standard `conf/spark-defaults.conf` file of your Spark installation, using spaces instead of the equals signs, in case you're already familiar with that approach (a sketch of such a file appears at the end of this message).

- Mike

--

Mike Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

Sent from my iPhone.

> On Jul 5, 2017, at 2:14 PM, Niketan Pansare <npan...@us.ibm.com> wrote:
>
> Hi Gustavo,
>
> You can paste that code into the command line:
>
> $ PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark --master local[*] --conf "spark.driver.memory=12g" --conf spark.driver.maxResultSize=0 --conf spark.akka.frameSize=128 --conf spark.default.parallelism=100
>
> The above command tells "pyspark" that the Python driver is Jupyter. For more details, please see
> https://github.com/apache/spark/blob/master/bin/pyspark#L27
>
> Alternatively, you can follow Arijit's suggestion.
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> From: arijit chakraborty <ak...@hotmail.com>
> To: "dev@systemml.apache.org" <dev@systemml.apache.org>
> Date: 07/02/2017 04:22 AM
> Subject: Re: Install - Configure Jupyter Notebook
>
> Hi Gustavo,
>
> You can put the pyspark details in the Jupyter console itself:
>
> import os
> import sys
> import pandas as pd
> import numpy as np
>
> # point these at your local Spark installation
> spark_path = r"C:\spark"
> os.environ['SPARK_HOME'] = spark_path
> os.environ['HADOOP_HOME'] = spark_path
>
> # make the PySpark and Py4J libraries importable
> sys.path.append(spark_path + "/bin")
> sys.path.append(spark_path + "/python")
> sys.path.append(spark_path + "/python/pyspark/")
> sys.path.append(spark_path + "/python/lib")
> sys.path.append(spark_path + "/python/lib/pyspark.zip")
> sys.path.append(spark_path + "/python/lib/py4j-0.10.4-src.zip")
>
> from pyspark import SparkContext
> from pyspark import SparkConf
>
> sc = SparkContext("local[*]", "test")
>
> # SystemML setup:
>
> from pyspark.sql import SQLContext
> import systemml as sml
> sqlCtx = SQLContext(sc)
> ml = sml.MLContext(sc)
>
> But this is not a very good way of doing it. I did it this way because I'm using Windows and it's easier to do it like that.
>
> Regards,
>
> Arijit
>
> ________________________________
> From: Gustavo Frederico <gustavo.freder...@thinkwrap.com>
> Sent: Sunday, July 2, 2017 10:16:03 AM
> To: dev@systemml.apache.org
> Subject: Install - Configure Jupyter Notebook
>
> A basic question: step 3 in https://systemml.apache.org/install-systemml.html for “Configure Jupyter Notebook” has
>
> # Start Jupyter Notebook Server
> PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark --master local[*] --conf "spark.driver.memory=12g" --conf spark.driver.maxResultSize=0 --conf spark.akka.frameSize=128 --conf spark.default.parallelism=100
>
> Where does that go? There are no details in this step…
>
> Thanks
>
> Gustavo
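For reference, here is a sketch of what those same settings would look like in `conf/spark-defaults.conf`. The values are simply the ones from the command quoted above, so treat them as starting points to tune for your own machine:

# conf/spark-defaults.conf -- one "key value" pair per line,
# separated by whitespace rather than '='
spark.driver.memory        12g
spark.driver.maxResultSize 0
spark.akka.frameSize       128
spark.default.parallelism  100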