Hi,

We have a situation where a PySpark script works fine as a local process ("local" master URL) on both the master and the worker nodes, which indicates that all Python dependencies are set up properly on each machine.
But when we try to run the script at the cluster level (using the master's URL), it fails partway through the flow, on a groupBy, with a SocketConnect error, and Python crashes. This is on EC2 using the AMI. This doesn't seem to be a case of the master not seeing the workers, since they show up in the web UI. Also, we can see the job running on the cluster until it reaches the groupBy transform step, which is when we get the SocketConnect error.

Any ideas?

-Suren

SUREN HIRAMAN, VP TECHNOLOGY
Velos
Accelerating Machine Learning
440 NINTH AVENUE, 11TH FLOOR
NEW YORK, NY 10001
O: (917) 525-2466 ext. 105
F: 646.349.4063
E: suren.hiraman@velos.io
W: www.velos.io