Hi,

We have a situation where a PySpark script works fine as a local process
(a "local" master URL) on both the master and the worker nodes, which
suggests that all Python dependencies are set up properly on each machine.

But when we try to run the script at the cluster level (using the master's
URL), it fails partway through the flow on a groupBy with a socket-connect
error and Python crashes.

This is on EC2 using the AMI. This doesn't seem to be an issue of the
master not seeing the workers, since they show up in the web UI.

Also, we can see the job running on the cluster until it reaches the
groupBy transform step, which is when we get the socket-connect error.

Any ideas?

-Suren


SUREN HIRAMAN, VP TECHNOLOGY
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR
NEW YORK, NY 10001
O: (917) 525-2466 ext. 105
F: 646.349.4063
E: suren.hiraman@velos.io
W: www.velos.io