The behavior is the same for any port range assigned as the supervisor
slot ports. I tried 6700-6703 and hit the same "Address already in use"
error. I might be wrong, but is it possible that the supervisor is failing
to open the ports, rather than some other process using them?
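
In case it helps anyone reproduce this, here is a minimal sketch of the same
bind check done in Python instead of nc. It is an illustration only, not
something from the Storm codebase: the helper names (try_bind,
read_ephemeral_range, in_ephemeral_range) are made up for this example, the
fallback range 32768-61000 is the Linux default mentioned in the quoted reply
below, and the /proc path is assumed to exist only on Linux.

```python
import errno
import socket

# Fallback range taken from the quoted reply (Linux default); an assumption.
DEFAULT_EPHEMERAL = (32768, 61000)


def try_bind(port, host="0.0.0.0"):
    """Try to bind and listen on a TCP port.

    Returns None on success, or the errno name (e.g. "EADDRINUSE") on failure.
    SO_REUSEADDR is deliberately not set, so on Linux sockets lingering in
    TIME_WAIT on that port may also produce EADDRINUSE.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()


def read_ephemeral_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Return (low, high) from the Linux sysctl file, or None if unreadable."""
    try:
        with open(path) as fh:
            low, high = fh.read().split()
        return int(low), int(high)
    except (OSError, ValueError):
        return None


def in_ephemeral_range(port, bounds=None):
    """True if the port falls inside the ephemeral range (inclusive)."""
    low, high = bounds or read_ephemeral_range() or DEFAULT_EPHEMERAL
    return low <= port <= high


if __name__ == "__main__":
    # Ports from this thread; adjust to your own supervisor.slots.ports.
    for port in (59027, 59028, 59029, 59030, 6700, 6701, 6702, 6703):
        err = try_bind(port)
        status = "bind ok" if err is None else "bind failed: " + err
        note = " (inside ephemeral range)" if in_ephemeral_range(port) else ""
        print("%d: %s%s" % (port, status, note))
```

Run it on the supervisor host as the same user that runs the supervisor, with
no workers running. If every port reports "bind ok" here while the worker still
fails, the conflict is more likely inside the worker launch itself than with
an unrelated process.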

> On Jul 28, 2016, at 8:56 PM, Erik Weathers <[email protected]> 
> wrote:
> 
> I'm a bit confused as to what all of those cmds are showing / proving.
> 
> But one thing I will point out is that you probably shouldn't be using
> ports between 32768-61000 for your workers, because those ports are for
> ephemeral usage, so could be used by another process randomly.  (That's the
> default on linux at least.)
> 
> - Erik
> 
>> On Thu, Jul 28, 2016 at 5:47 PM, Arjun Rao <[email protected]> wrote:
>> 
>> Thanks for the reply, Erik. I ran nc -l 59027 on the supervisor host, but
>> I think it is able to open the port successfully. I ran the strace in any
>> case, and the output is in the attached file. I ran a couple of other
>> commands as well, and this is what I found.
>> 
>> *With the supervisor running*
>> 
>> 
>> 
>> nc -v devctsl001 59027
>> 
>> nc: connect to devctsl001 port 59027 (tcp) failed: Connection refused
>> 
>> 
>> 
>> telnet devctsl001 59027
>> 
>> Trying xx.xx.xx.xx...
>> 
>> telnet: connect to address xx.xx.xx.xx: Connection refused
>> 
>> 
>> 
>> nc -l 59027
>> 
>> {No address already in use error. Connection seems to be open}
>> 
>> 
>> *With the UI running (the Storm UI listens on 59031 and comes up
>> successfully without any issues)*
>> 
>> 
>> 
>> nc -v devctsl001 59031
>> 
>> Connection to devctsl001 59031 port [tcp/*] succeeded!
>> 
>> 
>> telnet devctsl001 59031
>> 
>> Trying xx.xx.xx.xx...
>> 
>> Connected to devctsl001.
>> 
>> Escape character is '^]'.
>> 
>> 
>> nc -l 59031
>> 
>> nc: Address already in use
>> 
>> 
>> 
>> 
>> Might be a red herring, but I thought I'd share what I have done so far.
>> 
>> 
>> Best,
>> 
>> Arjun
>> 
>> On Thu, Jul 28, 2016 at 7:35 PM, Erik Weathers <
>> [email protected]> wrote:
>> 
>>> Somehow the OS is denying your application's request to create a socket.
>>> Either the port really is bound to another process despite your netstat
>>> cmd not revealing that, or you are hitting some other limit.  The thread
>>> you linked doesn't seem useful towards determining what your problem's
>>> root cause is.
>>> 
>>> I would run:  `nc -l 59027` in order to see if anything can bind to that
>>> port. Assuming it fails, then follow that up with an `strace nc -l 59027`
>>> to see if there's any other evidence of why it's failing to bind.
>>> 
>>> - Erik
>>> 
>>> On Thu, Jul 28, 2016 at 3:46 PM, Arjun Rao <[email protected]>
>>> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> We are active users of Storm in production. One of our pre-prod clusters,
>>>> however, is not functional at the moment. The Storm daemons (nimbus, ui,
>>>> logviewer, supervisor) start up fine, but the Storm workers are not
>>>> getting instantiated when we submit topologies. We see the following
>>>> error in the worker logs:
>>>> 
>>>> 2016-07-28 18:33:59 [main] b.s.d.worker [INFO] Reading Assignments.
>>>> 2016-07-28 18:34:00 [main] b.s.m.TransportFactory [INFO] Storm peer transport plugin:backtype.storm.messaging.netty.Context
>>>> 2016-07-28 18:34:00 [main] b.s.d.worker [INFO] Launching receive-thread for b4560ed4-d257-4151-9764-633707282a1f:59027
>>>> 2016-07-28 18:34:00 [main] b.s.m.n.Server [INFO] Create Netty Server Netty-server-localhost-59027, buffer_size: 5242880, maxWorkers: 1
>>>> 2016-07-28 18:34:00 [main] b.s.d.worker [ERROR] Error on initialization of server mk-worker
>>>> org.apache.storm.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:59027
>>>>        at org.apache.storm.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at backtype.storm.messaging.netty.Server.<init>(Server.java:130) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at backtype.storm.messaging.netty.Context.bind(Context.java:73) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at backtype.storm.messaging.loader$launch_receive_thread_BANG_.doInvoke(loader.clj:68) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at clojure.lang.RestFn.invoke(RestFn.java:668) [clojure-1.5.1.jar:na]
>>>>        at backtype.storm.daemon.worker$launch_receive_thread.invoke(worker.clj:380) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at backtype.storm.daemon.worker$fn__4629$exec_fn__1104__auto____4630.invoke(worker.clj:415) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at clojure.lang.AFn.applyToHelper(AFn.java:185) [clojure-1.5.1.jar:na]
>>>>        at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
>>>>        at clojure.core$apply.invoke(core.clj:617) ~[clojure-1.5.1.jar:na]
>>>>        at backtype.storm.daemon.worker$fn__4629$mk_worker__4685.doInvoke(worker.clj:393) [storm-core-0.9.6.jar:0.9.6]
>>>>        at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.5.1.jar:na]
>>>>        at backtype.storm.daemon.worker$_main.invoke(worker.clj:504) [storm-core-0.9.6.jar:0.9.6]
>>>>        at clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.5.1.jar:na]
>>>>        at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
>>>>        at backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.6.jar:0.9.6]
>>>> java.net.BindException: Address already in use
>>>>        at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_45]
>>>>        at sun.nio.ch.Net.bind(Net.java:437) ~[na:1.8.0_45]
>>>>        at sun.nio.ch.Net.bind(Net.java:429) ~[na:1.8.0_45]
>>>>        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) ~[na:1.8.0_45]
>>>>        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.8.0_45]
>>>>        at org.apache.storm.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at org.apache.storm.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:372) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at org.apache.storm.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:296) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at org.apache.storm.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at org.apache.storm.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at org.apache.storm.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[storm-core-0.9.6.jar:0.9.6]
>>>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_45]
>>>>        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_45]
>>>> 2016-07-28 18:34:00 [main] b.s.util [ERROR] Halting process: ("Error on initialization")
>>>> java.lang.RuntimeException: ("Error on initialization")
>>>>        at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:325) [storm-core-0.9.6.jar:0.9.6]
>>>>        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]
>>>>        at backtype.storm.daemon.worker$fn__4629$mk_worker__4685.doInvoke(worker.clj:393) [storm-core-0.9.6.jar:0.9.6]
>>>>        at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.5.1.jar:na]
>>>>        at backtype.storm.daemon.worker$_main.invoke(worker.clj:504) [storm-core-0.9.6.jar:0.9.6]
>>>>        at clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.5.1.jar:na]
>>>>        at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
>>>>        at backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.6.jar:0.9.6]
>>>> 
>>>> 
>>>> We are running Storm 0.9.6. The ports that we have assigned for the
>>>> supervisor are 59027, 59028, 59029, 59030. When I run commands to check
>>>> if anything is running on those ports (e.g. netstat -an | grep 59027), I
>>>> do not get back any results. So it looks like there is nothing running on
>>>> those ports. (Based on this:
>>>> http://grokbase.com/t/gg/storm-user/137h7hr7f0/hi-when-i-run-storm-ui-i-get-address-is-already-in-use-error
>>>> )
>>>> It almost seems the Storm supervisor on that box is not able to open up
>>>> those ports for the workers to be started on. Does anyone know how this
>>>> problem can be solved/debugged? This cluster was working without any
>>>> issues, and then we started hitting the “Address already in use” errors
>>>> and have been unable to get around them. If you need any more information
>>>> about the nature of our setup, please let me know.
>>>> 
>>>> Thanks!
>>>> 
>>>> Best,
>>>> Arjun
>> 
>> 
