Loopback requires the Spark workers to connect back to the Python process you started the Beam pipeline from, so connection errors are expected in a distributed environment, where workers run on different machines. Try using environment type PROCESS instead.
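For reference, a minimal sketch of what that looks like when submitting against the Spark job server. The script name, job endpoint address, and boot binary path are placeholders for your setup; PROCESS mode needs an environment_config telling each worker how to launch the SDK harness locally:

```shell
# Hypothetical invocation; adjust script, endpoint, and boot path to your cluster.
python my_pipeline.py \
  --runner=PortableRunner \
  --job_endpoint=localhost:8099 \
  --environment_type=PROCESS \
  --environment_config='{"command": "/opt/apache/beam/boot"}'
```

The boot executable must exist at that path on every worker node (it can be built from the Beam source tree or copied out of the SDK container image).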
I plan to update the website soon with explanations of these options, stay tuned.

Kyle Weaver | Software Engineer | github.com/ibzib | [email protected]

On Tue, Sep 17, 2019 at 4:54 PM Benjamin Tan <[email protected]> wrote:
> I'm having connection refused errors, though code on PySpark works on the
> clusters so I'm pretty sure it's not a firewall issue.
>
> So, does loopback mode work on a Spark cluster?
