Re: Trouble with PySpark UDFs and SPARK_HOME only on EMR

2017-06-22 Thread Nicholas Chammas
> ateway.py", line 77, in launch_gateway
>     proc = Popen(command, stdin=PIPE, preexec_fn=preexec_func, env=env)
>   File "/usr/lib64/python3.5/subprocess.py", line 950, in __init__
>     restore_signals, start_new_session)
>   File "/usr/lib64/python3.5/subproc

Trouble with PySpark UDFs and SPARK_HOME only on EMR

2017-06-22 Thread Nick Chammas
I’m seeing a strange issue on EMR which I posted about here. In brief, when I try to import a UDF I’ve defined, Python somehow fails to find Spark. This exact code works for me locally and works on our on