Hmm, those do look like 4 listening ports to me. PID 3404 is an executor and PID 4762 is a worker? This is a standalone cluster?
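
One way to double-check the count is to group the LISTEN rows by PID. Below is a minimal Python sketch, assuming the netstat rows are pasted in as text. The function name listening_ports_by_pid is hypothetical, and the sample rows are copied from the netstat output quoted below with the mail client's link markup removed.

```python
from collections import defaultdict

# Rows copied from the "application context is up" netstat output in this
# thread (link markup stripped). Columns: Proto, Recv-Q, Send-Q,
# Local Address, Foreign Address, State, PID/Program name.
NETSTAT_ROWS = """\
tcp6 0 0 10.90.17.100:8888 :::* LISTEN 4762/java
tcp6 0 0 :::57286 :::* LISTEN 3404/java
tcp6 0 0 10.90.17.100:38118 :::* LISTEN 3404/java
tcp6 0 0 10.90.17.100:35530 :::* LISTEN 3404/java
tcp6 0 0 :::60235 :::* LISTEN 3404/java
tcp6 0 0 :::8081 :::* LISTEN 4762/java
"""


def listening_ports_by_pid(text):
    """Return {pid: sorted list of ports} for rows in the LISTEN state."""
    ports = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and sockets that are not listening.
        if len(fields) < 7 or fields[5] != "LISTEN":
            continue
        local_address, pid_program = fields[3], fields[6]
        # The port is whatever follows the last colon (handles ":::8081").
        port = int(local_address.rsplit(":", 1)[1])
        pid = int(pid_program.split("/", 1)[0])
        ports[pid].append(port)
    return {pid: sorted(found) for pid, found in ports.items()}


if __name__ == "__main__":
    for pid, found in listening_ports_by_pid(NETSTAT_ROWS).items():
        print(pid, found)
```

On these rows, PID 3404 comes out with four listening ports and PID 4762 with two, which matches the reading above.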

On Wed, May 28, 2014 at 8:22 AM, Jacob Eisinger <jeis...@us.ibm.com> wrote:

> Howdy Andrew,
>
> Here is what I ran before an application context was created (other
> services have been deleted):
>
> # netstat -l -t tcp -p --numeric-ports
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address        Foreign Address  State   PID/Program name
> tcp6       0      0 10.90.17.100:8888    :::*             LISTEN  4762/java
> tcp6       0      0 :::8081              :::*             LISTEN  4762/java
>
> And then, while the application context is up:
>
> # netstat -l -t tcp -p --numeric-ports
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address        Foreign Address  State   PID/Program name
> tcp6       0      0 10.90.17.100:8888    :::*             LISTEN  4762/java
> tcp6       0      0 :::57286             :::*             LISTEN  3404/java
> tcp6       0      0 10.90.17.100:38118   :::*             LISTEN  3404/java
> tcp6       0      0 10.90.17.100:35530   :::*             LISTEN  3404/java
> tcp6       0      0 :::60235             :::*             LISTEN  3404/java
> tcp6       0      0 :::8081              :::*             LISTEN  4762/java
>
> My understanding is that this says four ports are open. Are 57286 and
> 60235 not being used?
>
> Jacob
>
> Jacob D. Eisinger
> IBM Emerging Technologies
> jeis...@us.ibm.com - (512) 286-6075
>
> From: Andrew Ash <and...@andrewash.com>
> To: user@spark.apache.org
> Date: 05/25/2014 06:25 PM
> Subject: Re: Comprehensive Port Configuration reference?
> ------------------------------
>
> Hi Jacob,
>
> The config option spark.history.ui.port is new for 1.0. The problem that
> History server solves is that in non-Standalone cluster deployment modes
> (Mesos and YARN) there is no long-lived Spark Master that can store logs
> and statistics about an application after it finishes. History server is
> the UI that renders logged data from applications after they complete.
>
> Read more here:
> https://issues.apache.org/jira/browse/SPARK-1276
> and
> https://github.com/apache/spark/pull/204
>
> As far as the two vs four dynamic ports, are those all listening ports? I
> did observe 4 ports in use, but only two of them were listening. The other
> two were the random ports used for responses on outbound connections, the
> source port of the (srcIP, srcPort, dstIP, dstPort) tuple that uniquely
> identifies a TCP socket.
>
> http://unix.stackexchange.com/questions/75011/how-does-the-server-find-out-what-client-port-to-send-to
>
> Thanks for taking a look through!
>
> I also realized that I had a couple of mistakes with the 0.9 to 1.0
> transition, so I have documented those as well in the updated PR.
>
> Cheers!
> Andrew
>
>
> On Fri, May 23, 2014 at 2:43 PM, Jacob Eisinger <jeis...@us.ibm.com>
> wrote:
>
> Howdy Andrew,
>
> I noticed you have a configuration item that we were not aware of:
> spark.history.ui.port. Is that new for 1.0?
>
> Also, we noticed that the Workers and the Drivers were opening up four
> dynamic ports per application context. It looks like you were seeing two.
>
> Everything else looks like it aligns!
> Jacob
>
> Jacob D. Eisinger
> IBM Emerging Technologies
> jeis...@us.ibm.com - (512) 286-6075
>
> From: Andrew Ash <and...@andrewash.com>
> To: user@spark.apache.org
> Date: 05/23/2014 10:30 AM
> Subject: Re: Comprehensive Port Configuration reference?
> ------------------------------
>
> Hi everyone,
>
> I've also been interested in better understanding what ports are used
> where and the direction the network connections go. I've observed a
> running cluster and read through code, and came up with the below
> documentation addition.
>
> https://github.com/apache/spark/pull/856
>
> Scott and Jacob -- it sounds like you two have pulled together some of
> this yourselves for writing firewall rules. Would you mind taking a look
> at this pull request and confirming that it matches your observations?
> Wrong documentation is worse than no documentation, so I'd like to make
> sure this is right.
>
> Cheers,
> Andrew
>
>
> On Wed, May 7, 2014 at 10:19 AM, Mark Baker <dist...@acm.org> wrote:
>
> On Tue, May 6, 2014 at 9:09 AM, Jacob Eisinger <jeis...@us.ibm.com>
> wrote:
> > In a nutshell, Spark opens up a couple of well-known ports. And then
> > the workers and the shell open up dynamic ports for each job. These
> > dynamic ports make securing the Spark network difficult.
>
> Indeed.
>
> Judging by the frequency with which this topic arises, this is a
> concern for many (myself included).
>
> I couldn't find anything in JIRA about it, but I'm curious to know
> whether the Spark team considers this a problem in need of a fix?
>
> Mark.
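
The listening versus ephemeral port distinction discussed in the quoted thread, with the (srcIP, srcPort, dstIP, dstPort) tuple, can be seen with two loopback sockets. This is only an illustrative Python sketch and is not taken from Spark. The function name show_tcp_tuple is hypothetical, and the port numbers are whatever the OS hands out on the machine where it runs.

```python
import socket


def show_tcp_tuple():
    """Open one loopback connection and report both ends of its 4-tuple."""
    # The server side: binding to port 0 asks the OS for any free port.
    # This is the socket that would show up as LISTEN in netstat -l.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    listen_addr = server.getsockname()

    # The client side: connect() makes the OS pick an ephemeral source
    # port. That port is in use but is not a listening port.
    client = socket.create_connection(listen_addr)
    conn, peer_addr = server.accept()

    client_local = client.getsockname()   # (srcIP, srcPort)
    client_remote = client.getpeername()  # (dstIP, dstPort)

    conn.close()
    client.close()
    server.close()
    return listen_addr, client_local, client_remote, peer_addr


if __name__ == "__main__":
    listen_addr, client_local, client_remote, peer_addr = show_tcp_tuple()
    print("listening on:      ", listen_addr)
    print("client source:     ", client_local)
    print("client destination:", client_remote)
    print("server sees peer:  ", peer_addr)
```

The client's destination is the listening address, and the server sees the client's ephemeral source port as its peer. Only the first of those ports would appear in a netstat -l listing.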