Hi moon, thanks for the tip. To summarize, my current settings are the following:
conf/zeppelin-env.sh has only the SPARK_HOME setting:

  export SPARK_HOME=/opt/spark-2.1.0-bin-hadoop2.7/

Then in the interpreter configuration through the web interface I have:

  PYSPARK_PYTHON=/opt/miniconda2/envs/myenv/bin/python
  zeppelin.pyspark.python=python

But when I submit from the notebook I'm receiving:

  pyspark is not responding

And the log file outputs:

  Traceback (most recent call last):
    File "/tmp/zeppelin_pyspark-6480867511995958556.py", line 22, in <module>
      from pyspark.conf import SparkConf
  ImportError: No module named pyspark.conf

Any thoughts? Thanks a lot!

On Mon, Mar 20, 2017 at 2:27 PM, moon soo Lee <[email protected]> wrote:

> When a property key in the interpreter configuration screen matches a
> certain condition [1], it'll be treated as an environment variable.
>
> You can remove PYSPARK_PYTHON from conf/zeppelin-env.sh and place it in
> the interpreter configuration.
>
> Thanks,
> moon
>
> [1] https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreter.java#L152
>
> On Mon, Mar 20, 2017 at 12:21 PM William Markito Oliveira <
> [email protected]> wrote:
>
>> Thanks for the quick response, Ruslan.
>>
>> But given that it's an environment variable, I can't quickly change
>> that value and point to a different python environment without
>> restarting the Zeppelin process, can I? I mean, is there a way to set
>> the value of PYSPARK_PYTHON from the interpreter configuration screen?
>>
>> Thanks,
>>
>> On Mon, Mar 20, 2017 at 2:15 PM, Ruslan Dautkhanov <[email protected]>
>> wrote:
>>
>> You can set the PYSPARK_PYTHON environment variable for that.
>>
>> Not sure about zeppelin.pyspark.python; I think it does not work.
>> See the comments in https://issues.apache.org/jira/browse/ZEPPELIN-1265
>>
>> Eventually, I think we can remove zeppelin.pyspark.python and use only
>> PYSPARK_PYTHON instead, to avoid confusion.
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Mon, Mar 20, 2017 at 12:59 PM, William Markito Oliveira <
>> [email protected]> wrote:
>>
>> I'm trying to use zeppelin.pyspark.python as the variable to set the
>> python that Spark worker nodes should use for my job, but it doesn't
>> seem to be working.
>>
>> Am I missing something, or does this variable not do that?
>>
>> My goal is to change that variable to point to different conda
>> environments. These environments are available on all worker nodes,
>> since they're in a shared location, so ideally all nodes would have
>> access to the same libraries and dependencies.
>>
>> Thanks,
>>
>> ~/William
>>
>> --
>> ~/William

--
~/William
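
P.S. For the archive: the "ImportError: No module named pyspark.conf" above
means the interpreter process can't find pyspark on its PYTHONPATH at all,
which is a separate problem from which python PYSPARK_PYTHON points at. A
minimal sketch of a conf/zeppelin-env.sh that exports PYTHONPATH as well
(assuming the stock Spark 2.1.0 tarball layout; the py4j zip name has to
match whatever actually sits in $SPARK_HOME/python/lib):

  # conf/zeppelin-env.sh -- a sketch, not verified against this cluster
  export SPARK_HOME=/opt/spark-2.1.0-bin-hadoop2.7
  # Let the interpreter process import pyspark and py4j from the Spark
  # distribution itself.
  export PYTHONPATH="${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.10.4-src.zip:${PYTHONPATH}"

Note that edits to conf/zeppelin-env.sh need a restart of the Zeppelin
daemon, while properties set on the interpreter screen only need the
interpreter itself restarted from the web UI.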

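And a quick way to check which python the driver and the executors actually
picked up, as a notebook paragraph (a sketch; it assumes the default sc
SparkContext that the pyspark interpreter injects):

  %pyspark
  import sys
  # Driver-side python; should print the conda env set via PYSPARK_PYTHON.
  print(sys.executable)
  # Worker-side python, reported by a trivial job run on the executors.
  print(sc.range(1).map(lambda _: __import__("sys").executable).collect())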