Supervisors don't open the ports.  The workers do.  The supervisors
*launch* the workers.

- Erik
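
Since it is the worker process that binds the port, one way to reproduce the failure outside Storm is to attempt the same kind of bind from a small standalone program on the supervisor host. The following is only a sketch in Python, not Storm's own code; `can_bind` is a hypothetical helper name, and binding on all interfaces is assumed because the stack trace in this thread shows 0.0.0.0.

```python
import errno
import socket


def can_bind(port, host="0.0.0.0"):
    """Try to bind and listen on a TCP port; return (ok, reason).

    Hypothetical helper: it mimics what any listening process does,
    it is not how Storm itself creates its Netty server.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return True, "bind succeeded"
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False, "address already in use"
        return False, "bind failed: %s" % exc
    finally:
        # Always release the socket so the check itself does not hold the port.
        sock.close()
```

Calling `can_bind(59027)` while no worker is running gives a rough indication of whether the OS would let a process take that port at that moment. A success here does not rule out an intermittent conflict.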

On Fri, Jul 29, 2016 at 8:46 AM, Arjun Rao <[email protected]> wrote:

> The behavior is similar across any set of port ranges assigned as the
> supervisor slots ports. I tried with 6700-6703 and it's the same issue of
> address in use. I might be wrong but is it possible that the supervisor is
> not opening the ports as opposed to some other process using that port?
>
> Sent from my iPhone
>
> > On Jul 28, 2016, at 8:56 PM, Erik Weathers <[email protected]>
> wrote:
> >
> > I'm a bit confused as to what all of those cmds are showing / proving.
> >
> > But one thing I will point out is that you probably shouldn't be using
> > ports between 32768-61000 for your workers, because those ports are for
> > ephemeral usage, so could be used by another process randomly.  (That's
> > the default on Linux at least.)
> >
> > - Erik
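
To check the point about ephemeral ports on a given host, the configured worker ports can be compared against the kernel's local port range. This is a minimal sketch assuming a Linux host, where the range is exposed in /proc/sys/net/ipv4/ip_local_port_range; the function names are hypothetical, and the port lists are the ones mentioned in this thread.

```python
def parse_port_range(text):
    """Parse the two whitespace-separated integers in ip_local_port_range."""
    low, high = text.split()
    return int(low), int(high)


def read_ephemeral_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the ephemeral port range from the Linux proc filesystem."""
    with open(path) as handle:
        return parse_port_range(handle.read())


def ports_in_range(ports, low, high):
    """Return the ports that fall inside the inclusive range [low, high]."""
    return [port for port in ports if low <= port <= high]


# Worker ports from this thread, checked against the range quoted above.
# On a real host, use read_ephemeral_range() instead of hard-coded bounds.
conflicting = ports_in_range([59027, 59028, 59029, 59030], 32768, 61000)
safe = ports_in_range([6700, 6701, 6702, 6703], 32768, 61000)
```

Any port that shows up in `conflicting` can be handed out by the kernel as the source port of an unrelated outbound connection, which would make a later bind on it fail.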
> >
> >> On Thu, Jul 28, 2016 at 5:47 PM, Arjun Rao <[email protected]>
> wrote:
> >>
> >> Thanks for the reply, Erik. I ran nc -l 59027 on the supervisor host, but
> >> I think it is able to bind successfully. I ran the strace in any case,
> >> and the output is attached in the file. I ran a couple of other commands
> >> as well, and this is what I found.
> >>
> >> *With the supervisor running*
> >>
> >>
> >>
> >> nc -v devctsl001 59027
> >>
> >> nc: connect to devctsl001 port 59027 (tcp) failed: Connection refused
> >>
> >>
> >>
> >> telnet devctsl001 59027
> >>
> >> Trying 45.32.96.34...
> >>
> >> telnet: connect to address xx.xx.xx.xx: Connection refused
> >>
> >>
> >>
> >> nc -l 59027
> >>
> >> {No address already in use error. Connection seems to be open}
> >>
> >>
> >> *With the UI running ( the storm ui connects on 59031. The UI comes up
> >> successfully without any issues)*
> >>
> >>
> >>
> >> nc -v devctsl001 59031
> >>
> >> Connection to devctsl001 59031 port [tcp/*] succeeded!
> >>
> >>
> >> telnet devctsl001 59031
> >>
> >> Trying xx.xx.xx.xx...
> >>
> >> Connected to devctsl001.
> >>
> >> Escape character is '^]'.
> >>
> >>
> >> nc -l 59031
> >>
> >> nc: Address already in use
> >>
> >>
> >>
> >>
> >> Might be a red herring, but thought i'd share what i have done so far.
> >>
> >>
> >> Best,
> >>
> >> Arjun
> >>
> >> On Thu, Jul 28, 2016 at 7:35 PM, Erik Weathers <
> >> [email protected]> wrote:
> >>
> >>> Somehow the OS is denying your application's request to create a socket.
> >>> Either the port really is bound to another process despite your netstat
> >>> cmd not revealing that, or you are hitting some other limit.  The thread
> >>> you linked doesn't seem useful towards determining what your problem's
> >>> root cause is.
> >>>
> >>> I would run:  `nc -l 59027` in order to see if anything can bind to that
> >>> port.
> >>> Assuming it fails, then follow that up with an `strace nc -l 59027` to see
> >>> if there's any other evidence of why it's failing to bind.
> >>>
> >>> - Erik
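
As a complement to netstat, the kernel's own socket table can be inspected for any socket whose local port matches, in any state (including an outbound connection that happens to use the port as its source). The sketch below assumes Linux and the /proc/net/tcp text format; `sockets_on_port` is a hypothetical helper, the state table is deliberately partial, and IPv6 sockets live in /proc/net/tcp6, which is not covered here.

```python
# Partial map of the hex state codes used in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


def sockets_on_port(proc_net_tcp_text, port):
    """Return (local, remote, state) for entries whose local port matches."""
    matches = []
    # The first line of /proc/net/tcp is a column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] looks like "0100007F:E693": hex address, colon, hex port.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port:
            state = TCP_STATES.get(fields[3], fields[3])
            matches.append((fields[1], fields[2], state))
    return matches


def sockets_on_port_from_proc(port, path="/proc/net/tcp"):
    """Convenience wrapper that reads the live table on a Linux host."""
    with open(path) as handle:
        return sockets_on_port(handle.read(), port)
```

On the affected host, `sockets_on_port_from_proc(59027)` would list any IPv4 TCP socket currently using that local port. An empty list only says that nothing held the port at the moment of the check.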
> >>>
> >>> On Thu, Jul 28, 2016 at 3:46 PM, Arjun Rao <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> We are active users of Storm in production. One of our pre-prod clusters,
> >>>> however, is not functional at the moment. The Storm daemons (nimbus, ui,
> >>>> logviewer, supervisor) start up fine, but the Storm workers are not
> >>>> getting instantiated when we submit topologies. We see the following
> >>>> error in the worker logs:
> >>>>
> >>>> 2016-07-28 18:33:59 [main] b.s.d.worker [INFO] Reading Assignments.
> >>>> 2016-07-28 18:34:00 [main] b.s.m.TransportFactory [INFO] Storm peer transport plugin:backtype.storm.messaging.netty.Context
> >>>> 2016-07-28 18:34:00 [main] b.s.d.worker [INFO] Launching receive-thread for b4560ed4-d257-4151-9764-633707282a1f:59027
> >>>> 2016-07-28 18:34:00 [main] b.s.m.n.Server [INFO] Create Netty Server Netty-server-localhost-59027, buffer_size: 5242880, maxWorkers: 1
> >>>> 2016-07-28 18:34:00 [main] b.s.d.worker [ERROR] Error on initialization of server mk-worker
> >>>> org.apache.storm.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:59027
> >>>>        at org.apache.storm.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at backtype.storm.messaging.netty.Server.<init>(Server.java:130) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at backtype.storm.messaging.netty.Context.bind(Context.java:73) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at backtype.storm.messaging.loader$launch_receive_thread_BANG_.doInvoke(loader.clj:68) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at clojure.lang.RestFn.invoke(RestFn.java:668) [clojure-1.5.1.jar:na]
> >>>>        at backtype.storm.daemon.worker$launch_receive_thread.invoke(worker.clj:380) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at backtype.storm.daemon.worker$fn__4629$exec_fn__1104__auto____4630.invoke(worker.clj:415) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at clojure.lang.AFn.applyToHelper(AFn.java:185) [clojure-1.5.1.jar:na]
> >>>>        at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
> >>>>        at clojure.core$apply.invoke(core.clj:617) ~[clojure-1.5.1.jar:na]
> >>>>        at backtype.storm.daemon.worker$fn__4629$mk_worker__4685.doInvoke(worker.clj:393) [storm-core-0.9.6.jar:0.9.6]
> >>>>        at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.5.1.jar:na]
> >>>>        at backtype.storm.daemon.worker$_main.invoke(worker.clj:504) [storm-core-0.9.6.jar:0.9.6]
> >>>>        at clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.5.1.jar:na]
> >>>>        at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
> >>>>        at backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.6.jar:0.9.6]
> >>>> java.net.BindException: Address already in use
> >>>>        at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_45]
> >>>>        at sun.nio.ch.Net.bind(Net.java:437) ~[na:1.8.0_45]
> >>>>        at sun.nio.ch.Net.bind(Net.java:429) ~[na:1.8.0_45]
> >>>>        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) ~[na:1.8.0_45]
> >>>>        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.8.0_45]
> >>>>        at org.apache.storm.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at org.apache.storm.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:372) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at org.apache.storm.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:296) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at org.apache.storm.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at org.apache.storm.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at org.apache.storm.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[storm-core-0.9.6.jar:0.9.6]
> >>>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
> >>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_45]
> >>>>        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_45]
> >>>> 2016-07-28 18:34:00 [main] b.s.util [ERROR] Halting process: ("Error on initialization")
> >>>> java.lang.RuntimeException: ("Error on initialization")
> >>>>        at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:325) [storm-core-0.9.6.jar:0.9.6]
> >>>>        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]
> >>>>        at backtype.storm.daemon.worker$fn__4629$mk_worker__4685.doInvoke(worker.clj:393) [storm-core-0.9.6.jar:0.9.6]
> >>>>        at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.5.1.jar:na]
> >>>>        at backtype.storm.daemon.worker$_main.invoke(worker.clj:504) [storm-core-0.9.6.jar:0.9.6]
> >>>>        at clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.5.1.jar:na]
> >>>>        at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
> >>>>        at backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.6.jar:0.9.6]
> >>>>
> >>>>
> >>>> We are running Storm 0.9.6. The ports that we have assigned for the
> >>>> supervisor are 59027, 59028, 59029, 59030.  When I run commands to check
> >>>> whether anything is running on those ports (e.g. netstat -an | grep 59027),
> >>>> I do not get back any results, so it looks like nothing is running on
> >>>> those ports. (Based on this:
> >>>> http://grokbase.com/t/gg/storm-user/137h7hr7f0/hi-when-i-run-storm-ui-i-get-address-is-already-in-use-error
> >>>> )
> >>>> It almost seems the Storm supervisor on that box is not able to open up
> >>>> those ports for the workers to be started on. Does anyone know how this
> >>>> problem can be solved or debugged? This cluster was working without any
> >>>> issues, and then we started hitting the “Address already in use” errors
> >>>> and have been unable to get around them. If you need any more information
> >>>> about the nature of our setup, please let me know.
> >>>>
> >>>> Thanks!
> >>>>
> >>>> Best,
> >>>> Arjun
> >>
> >>
>
