On 12/28/15, 5:16 PM, "Daniel Valdivia" wrote:
>Hi,
>
>I'm trying to submit a job to a small Spark cluster running in standalone
>mode; however, it seems like the jar file I'm submitting to the
>cluster is "not found" by the worker nodes.
>
>I might have understood
I'm using Spark 1.5.0 with the standalone scheduler, and for the life of me I
can't figure out why this isn't working. I have an application that works fine
with --deploy-mode client that I'm trying to get to run in cluster mode so I
can use --supervise. I ran into a few issues with my
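For reference, a minimal cluster-mode submission against a standalone master looks something like this; the master URL, class name, and jar path are placeholders, not values from the thread:

```shell
# Submit in cluster mode so the driver runs on a worker and can be
# restarted by the master on failure via --supervise.
# spark://master:7077, com.example.MyApp, and /path/to/app.jar are examples.
spark-submit \
  --master spark://master:7077 \
  --deploy-mode cluster \
  --supervise \
  --class com.example.MyApp \
  /path/to/app.jar
```

Note that in cluster mode the jar path is resolved on whichever worker spawns the driver, not on the submitting machine, so it must be reachable there (shared storage or an hdfs:// URL) — one common cause of the "jar not found" symptom described earlier in the thread.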
I guess I was a little light on the details in my haste. I'm using Spark on
YARN, and this is in the driver process in yarn-client mode (most notably
spark-shell). I've had to manually add a bunch of JARs that I had thought it
would just pick up like everything else does:
export
It seems to me that SPARK_SUBMIT_CLASSPATH, unlike other classpath mechanisms,
does not support wildcards in the paths you add. It also doesn't seem to pick
up the classpath information from yarn-site.xml when running on YARN. I'm
having to manually add every single
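One workaround, since SPARK_SUBMIT_CLASSPATH apparently doesn't expand wildcards itself, is to expand them in the shell before exporting; the /usr/lib/hadoop/lib path below is just an example:

```shell
# Build an explicit colon-separated classpath from every jar in a
# directory, since SPARK_SUBMIT_CLASSPATH won't expand wildcards itself.
# /usr/lib/hadoop/lib is an example path; adjust for your layout.
CP=""
for jar in /usr/lib/hadoop/lib/*.jar; do
  CP="$CP:$jar"
done
export SPARK_SUBMIT_CLASSPATH="${CP#:}"  # strip the leading colon
```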
memory allocation bug?
Hi Greg,
It does seem like a bug. What is the particular exception message that you see?
Andrew
2014-10-08 12:12 GMT-07:00 Greg Hill
greg.h...@rackspace.com:
So, I think this is a bug, but I wanted to get some feedback before I reported
it as such. On Spark on YARN, 1.1.0, if you specify the --driver-memory value
to be higher than the memory available on the client machine, Spark errors out
due to failing to allocate enough memory. This happens
Do you have YARN_CONF_DIR set in your environment to point Spark to where your
yarn configs are?
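For example — /etc/hadoop/conf is the usual location on HDP, but yours may differ:

```shell
# Point Spark at the directory containing yarn-site.xml and core-site.xml.
# /etc/hadoop/conf is typical on HDP; adjust for your layout.
export YARN_CONF_DIR=/etc/hadoop/conf
# HADOOP_CONF_DIR works as well; spark-submit checks both.
export HADOOP_CONF_DIR=/etc/hadoop/conf
```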
Greg
From: Raghuveer Chanda
raghuveer.cha...@gmail.com
Date: Wednesday, September 24, 2014 12:25 PM
To:
Nishkam Ravi
nr...@cloudera.com:
Maybe try --driver-memory if you are using spark-submit?
Thanks,
Nishkam
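A sketch of that suggestion; the memory value, class name, and jar path are placeholders:

```shell
# Raise the driver heap at submit time rather than relying on the default.
# 4g, com.example.MyApp, and /path/to/app.jar are example values.
spark-submit \
  --master yarn-client \
  --driver-memory 4g \
  --class com.example.MyApp \
  /path/to/app.jar
```

In yarn-client mode the driver runs on the submitting machine itself, so --driver-memory is bounded by what that machine has free — which is presumably why requesting more than the client has available fails up front.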
On Mon, Sep 22, 2014 at 1:41 PM, Greg Hill
greg.h...@rackspace.com wrote:
Ah, I see. It turns out that my problem
I know the recommendation is it depends, but can people share what sort of
memory allocations they're using for their driver processes? I'd like to get
an idea of what the range looks like so we can provide sensible defaults
without necessarily knowing what the jobs will look like. The
an environment variable you could set (SPARK_CLASSPATH), though
this is now deprecated.
Let me know if you have more questions about these options,
-Andrew
2014-09-08 6:59 GMT-07:00 Greg Hill
greg.h...@rackspace.com:
Is SPARK_EXECUTOR_INSTANCES the total number of workers in the cluster or the
workers per
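For what it's worth, the same setting is exposed as a spark-submit flag on YARN, which is less ambiguous than the environment variable; the numbers below are examples only:

```shell
# On YARN, --num-executors is the total number of executors for the
# application across the whole cluster, not a per-node count.
# All values and names below are example placeholders.
spark-submit \
  --master yarn-cluster \
  --num-executors 10 \
  --executor-cores 2 \
  --executor-memory 2g \
  --class com.example.MyApp \
  /path/to/app.jar
```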
To answer my own question, in case someone else runs into this: the spark user
needs to be in the same group on the namenode, and HDFS caches that information
for what seems like at least an hour. It magically started working on its own.
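A quick way to see what HDFS currently thinks, and to refresh the mapping on the namenode instead of waiting out the cache; the user name "spark" is from the thread, the commands are standard Hadoop CLI:

```shell
# Show the groups HDFS resolves for the spark user. Resolution happens on
# the namenode, not the client, so local group changes may not show up.
hdfs groups spark

# After changing group membership on the namenode, force a refresh rather
# than waiting for the cached user-to-groups mapping to expire.
hdfs dfsadmin -refreshUserToGroupsMappings
```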
Greg
From: Greg
I am running Spark on Yarn with the HDP 2.1 technical preview. I'm having
issues getting the spark history server permissions to read the spark event
logs from hdfs. Both sides are configured to write/read logs from:
hdfs:///apps/spark/events
The history server is running as user spark, the
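For reference, the matching settings on the two sides look roughly like this in Spark 1.x; the hdfs path is from the thread, and appending to spark-defaults.conf this way is just one option:

```shell
# Writer side: applications log events to the shared directory.
# Reader side: the history server reads from the same place.
cat >> conf/spark-defaults.conf <<'EOF'
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs:///apps/spark/events
spark.history.fs.logDirectory    hdfs:///apps/spark/events
EOF
```

The permissions question is separate from the configuration: the directory and its contents must be readable by whatever user the history server runs as (here, spark).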
I'm running into a problem getting this working as well. I have spark-submit
and spark-shell working fine, but pyspark in interactive mode can't seem to
find the lzo jar:
java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not
found
This is in
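One workaround others have used is to hand the codec jar to the shell explicitly; the jar path below is a guess at the usual HDP location, not something confirmed in the thread:

```shell
# Put the LZO codec jar on the driver classpath for the interactive shell
# and ship it to executors. The path is an assumed HDP location; adjust it.
pyspark \
  --driver-class-path /usr/lib/hadoop/lib/hadoop-lzo.jar \
  --jars /usr/lib/hadoop/lib/hadoop-lzo.jar
```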
My Spark history server won't start because it's trying to hit the namenode on
8021, but the namenode is on 8020 (the default). How can I configure the
history server to use the right port? I can't find any relevant setting on the
docs:
Nevermind, PEBKAC. I had put the wrong port in the $LOG_DIR environment
variable.
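For anyone hitting the same thing: the history server gets the namenode host and port from the log-directory URI itself, so the port embedded there must match the namenode's actual RPC port. A sketch, with example host and path; one standard way to pass the setting in Spark 1.x is SPARK_HISTORY_OPTS:

```shell
# The namenode URI is embedded in the log-directory setting; an explicit
# port here must match the namenode's real RPC port (8020 by default).
# "namenode" and the path are placeholders.
export SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=hdfs://namenode:8020/apps/spark/events"
./sbin/start-history-server.sh
```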
Greg
From: Greg greg.h...@rackspace.com
Date: Wednesday, September 3, 2014 1:56 PM
To: user@spark.apache.org
I'm working on setting up Spark on YARN using the HDP technical preview -
http://hortonworks.com/kb/spark-1-0-1-technical-preview-hdp-2-1-3/
I have installed the Spark JARs on all the slave nodes and configured YARN to
find the JARs. It seems like everything is working.
Unless I'm
Thanks. That sounds like how I was thinking it worked. I did have to install
the JARs on the slave nodes for yarn-cluster mode to work, FWIW. It's probably
just whichever node ends up spawning the application master that needs it, but
it wasn't passed along from spark-submit.
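A way to avoid installing the assembly on every node at all, for what it's worth, is to put it on HDFS once and point spark.yarn.jar (the Spark 1.x property) at it; the paths below are examples:

```shell
# Upload the Spark assembly to HDFS once... (paths are examples)
hdfs dfs -put /usr/lib/spark/lib/spark-assembly.jar \
  hdfs:///apps/spark/spark-assembly.jar

# ...then have YARN distribute it from there, instead of expecting the
# jars to be pre-installed on whichever node hosts the application master.
spark-submit \
  --master yarn-cluster \
  --conf spark.yarn.jar=hdfs:///apps/spark/spark-assembly.jar \
  --class com.example.MyApp \
  /path/to/app.jar
```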
Greg
From: