> btw, does this apply to YARN client mode?

Yes.

On Sun, Jan 17, 2016 at 11:37 PM, Felix Cheung wrote:
> Do you still need help on the PR?
> btw, does this apply to YARN client mode?
>
> ------
> From: andrewweiner2...@u.northwestern.edu
> Date: Sun, 17 Jan 2016 17:00:39 -0600
> Subject: Re: SparkContext SyntaxError: invalid syntax
> To: cutl...@gmail.com
> CC: user@spark.apache.org
Yeah, I do think it would be worth explicitly stating this in the docs. I
was going to try to edit the docs myself and submit a pull request, but I'm
having trouble building the docs from GitHub. If anyone else wants to do
this, here is approximately what I would say:
(To be added to
http://spar
Glad you got it going! It wasn't very obvious what needed to be set;
maybe it is worth explicitly stating this in the docs since it seems to
have come up a couple of times before too.
Bryan
On Fri, Jan 15, 2016 at 12:33 PM, Andrew Weiner
<andrewweiner2...@u.northwestern.edu> wrote:
Actually, I just found this
[https://issues.apache.org/jira/browse/SPARK-1680], which after a bit of
googling and reading leads me to believe that the preferred way to change
the YARN environment is to edit the spark-defaults.conf file by adding this
line:

spark.yarn.appMasterEnv.PYSPARK_PYTHON
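
For example, a complete entry might look like this (a sketch; the
interpreter path is the same one used with the SPARK_YARN_USER_ENV
workaround below, so substitute whatever Python is installed on your
cluster nodes):

# in conf/spark-defaults.conf
spark.yarn.appMasterEnv.PYSPARK_PYTHON  /home/aqualab/local/bin/python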
I finally got the pi.py example to run in yarn cluster mode. This was the
key insight:
https://issues.apache.org/jira/browse/SPARK-9229
I had to set SPARK_YARN_USER_ENV in spark-env.sh:
export SPARK_YARN_USER_ENV="PYSPARK_PYTHON=/home/aqualab/local/bin/python"
This caused the PYSPARK_PYTHON environment variable to be set in the YARN
containers, and the pi.py example ran with the intended interpreter.
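
A quick way to confirm which interpreter the executors actually pick up (a
minimal sketch, not from the original thread; submit it the same way as
pi.py) is a tiny job that reports sys.version from the workers:

# check_python.py -- sketch: verify the Python version on driver and executors
import sys
from pyspark import SparkContext

sc = SparkContext(appName="python-version-check")
# Run a couple of tasks and collect the interpreter version each worker sees.
versions = sc.parallelize(range(2), 2).map(lambda _: sys.version).collect()
print("driver:    %s" % sys.version)
print("executors: %s" % versions)
sc.stop()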
I tried playing around with my environment variables, and here is an update.
When I run in cluster mode, my environment variables do not persist
throughout the entire job.
For example, I tried creating a local copy of HADOOP_CONF_DIR in
/home//local/etc/hadoop/conf, and then, in spark-env.sh, I pointed
HADOOP_CONF_DIR at that copy.
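
Something along these lines (a sketch; the username in the path is elided
here just as in the original):

# in conf/spark-env.sh
export HADOOP_CONF_DIR=/home/<user>/local/etc/hadoop/conf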
Indeed! Here is the output when I run in cluster mode:
Traceback (most recent call last):
  File "pi.py", line 22, in ?
    raise RuntimeError("\n"+str(sys.version_info) +"\n"+
RuntimeError:
(2, 4, 3, 'final', 0)
[('PYSPARK_GATEWAY_PORT', '48079'), ('PYTHONPATH',
'/scratch2/hadoop/yarn/local/user
It seems like it could be the case that some other Python version is being
invoked. To make sure, can you add something like this to the top of the
.py file you are submitting to get some more info about how the application
master is configured?
import sys, os
raise RuntimeError("\n" + str(sys.version_info) + "\n" +
    str([(k, os.environ[k]) for k in os.environ if "PY" in k]))
Hi Bryan,
I ran "$> python --version" on every node on the cluster, and it is Python
2.7.8 for every single one.
When I try to submit the Python example in client mode:

./bin/spark-submit --master yarn --deploy-mode client \
    --driver-memory 4g --executor-memory 2g --executor-cores
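
For reference, a complete invocation in this style would look something
like the following (the core count and example script are assumptions
based on the stock Spark examples, not necessarily the exact values used
here):

./bin/spark-submit --master yarn --deploy-mode client \
    --driver-memory 4g --executor-memory 2g --executor-cores 1 \
    examples/src/main/python/pi.py 10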
Hi Andrew,
There are a couple of things to check. First, is Python 2.7 the default
version on all nodes in the cluster, or is it an alternate install?
Meaning, what is the output of the command "$> python --version"? If it is
an alternate install, you could set the environment variable
"PYSPARK_PYTHON" to point to that interpreter.
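
In client mode that can be as simple as exporting the variable before
calling spark-submit (a sketch; the path is a placeholder for wherever the
alternate Python lives):

export PYSPARK_PYTHON=/usr/local/bin/python2.7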
Thanks for your continuing help. Here is some additional info.
*OS/architecture*
output of *cat /proc/version*:
Linux version 2.6.18-400.1.1.el5 (mockbu...@x86-012.build.bos.redhat.com)
output of *lsb_release -a*:
LSB Version:
:core-4.0-amd64:core-4.0-ia32:core-4.0-noarch:graphics-4.0-amd64:gr
Now for simplicity I'm testing with wordcount.py from the provided
examples, and using Spark 1.6.0
The first error I get is:
16/01/08 19:14:46 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
        at java.la
Hi Andrew,
I know that older versions of Spark could not run PySpark on YARN in
cluster mode. I'm not sure if that is fixed in 1.6.0, though. Can you try
setting the deploy-mode option to "client" when calling spark-submit?
Bryan
On Thu, Jan 7, 2016 at 2:39 PM, weineran
<andrewweiner2...@u.northwestern.edu> wrote: