Hi everyone,

I've also been interested in better understanding which ports are used where and in which direction the network connections go. I've observed a running cluster, read through the code, and put together the documentation addition below.
https://github.com/apache/spark/pull/856

Scott and Jacob -- it sounds like you two have pulled together some of this yourselves for writing firewall rules. Would you mind taking a look at this pull request and confirming that it matches your observations? Wrong documentation is worse than no documentation, so I'd like to make sure this is right.

Cheers,
Andrew

On Wed, May 7, 2014 at 10:19 AM, Mark Baker <dist...@acm.org> wrote:

> On Tue, May 6, 2014 at 9:09 AM, Jacob Eisinger <jeis...@us.ibm.com> wrote:
> > In a nutshell, Spark opens up a couple of well-known ports. And then
> > the workers and the shell open up dynamic ports for each job. These
> > dynamic ports make securing the Spark network difficult.
>
> Indeed.
>
> Judging by the frequency with which this topic arises, this is a
> concern for many (myself included).
>
> I couldn't find anything in JIRA about it, but I'm curious to know
> whether the Spark team considers this a problem in need of a fix?
>
> Mark.