I got this working by having our sysadmin update our security group to
allow incoming traffic from the local subnet on ports 1-65535. I'm not
sure if there's a more specific range I could have used, but so far,
everything is running!
Thanks for all the responses, Marcelo and Andrew!!
Matt
On Wed, Jul 16, 2014 at 12:36 PM, Matt Work Coarr
mattcoarr.w...@gmail.com wrote:
Thanks Marcelo, I'm not seeing anything in the logs that clearly explains
what's causing this to break.
One interesting point that we just discovered is that if we run the driver
and the slave (worker) on the
Thanks Marcelo! This is a huge help!!
Looking at the executor logs (in a vanilla spark install, I'm finding them
in $SPARK_HOME/work/*)...
It launches the executor, but it looks like the
CoarseGrainedExecutorBackend is having trouble talking to the driver
(exactly what you said!!!).
Do you
Hi Matt,
I'm not very familiar with setup on EC2; the closest I can point you
at is launch_cluster in ec2/spark_ec2.py, where the ports seem to be
configured.
On Thu, Jul 17, 2014 at 1:29 PM, Matt Work Coarr
mattcoarr.w...@gmail.com wrote:
Thanks Marcelo! This is a huge help!!
Hi Matt,
The security group shouldn't be an issue; the ports listed in
`spark_ec2.py` are only for communication with the outside world.
How did you launch your application? I notice you did not launch your
driver from your Master node. What happens if you do? Another thing is
that there seems
Thanks Marcelo, I'm not seeing anything in the logs that clearly explains
what's causing this to break.
One interesting point that we just discovered is that if we run the driver
and the slave (worker) on the same host, it runs, but if we run the driver
on a separate host, it does not run.
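
For anyone debugging the same symptom (the job runs when the driver and worker share a host but not when they are separate), a quick way to tell a blocked port from a Spark problem is to try a plain TCP connection from the worker machine back to the driver. The snippet below is only a rough sketch: `can_connect` is a helper name made up for this example, and the host and port in the usage comment are placeholders, not values from this thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example usage, run on the worker (placeholder host and port; substitute the
# address and port the driver is actually listening on):
#   print(can_connect("driver-host.example", 51234))
```

If this returns False from the worker while the same check succeeds on the driver host itself, a firewall or security group rule between the two machines is a likely suspect.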
Hello spark folks,
I have a simple Spark cluster set up, but I can't get jobs to run on it. I
am using standalone mode.
One master, one slave. Both machines have 32GB ram and 8 cores.
The slave is set up with one worker that has 8 cores and 24GB memory
allocated.
My application requires 2
Have you looked at the slave machine to see if the process has
actually launched? If it has, have you tried peeking into its log
file?
(That error is printed whenever the executors fail to report back to
the driver. Insufficient resources to launch the executor is the most
common cause of that,