[ 
https://issues.apache.org/jira/browse/STORM-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134067#comment-15134067
 ] 

Satish Duggana commented on STORM-1516:
---------------------------------------

Whenever a topology is submitted, workers are created for it on one or more 
supervisors. Each worker's pid is stored as a file under 
${storm-localdir}/workers/{worker-id}/pids/ on the supervisor. Because of a 
bug, these pid files are not written, so the supervisor cannot find the 
worker pids when the topology is killed. Workers of a subsequently deployed 
topology then fail to start because the earlier workers are still alive and 
bound to the same ports. This issue occurs only on the 2.0 branch. 

The bug is in the code below, from worker.clj # mk-server. 
(ConfigUtils/clusterMode conf) returns a string, but it is compared with the 
keyword :distributed. In Clojure a string never equals a keyword, so the 
condition is always false and the pids are never stored. 

  (when (= :distributed (ConfigUtils/clusterMode conf)) 
    (let [pid (process-pid)]
      (touch (ConfigUtils/workerPidPath conf worker-id pid))
      (spit (ConfigUtils/workerArtifactsPidPath conf storm-id port) pid)))
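
The failure mode can be illustrated with a small Java sketch. This is not 
Storm code: the class and method names are hypothetical, a Java enum constant 
stands in for the Clojure keyword, and only the workers/{worker-id}/pids/ 
layout is taken from the description above. It shows that a guard comparing 
values of different types is never true, so no pid file is written and a 
later lookup finds nothing to kill.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class WorkerPidSketch {
    // Hypothetical stand-in for the Clojure keyword :distributed.
    enum Mode { DISTRIBUTED }

    // Mirrors the buggy guard: the mode is a String, but it is compared
    // with a non-String value, so the result is always false.
    static boolean buggyIsDistributed(String clusterMode) {
        Object keyword = Mode.DISTRIBUTED;
        return keyword.equals(clusterMode);
    }

    // Compares a String with a String, which is what the guard intends.
    static boolean isDistributed(String clusterMode) {
        return "distributed".equals(clusterMode);
    }

    static Path pidsDir(Path localDir, String workerId) {
        return localDir.resolve("workers").resolve(workerId).resolve("pids");
    }

    // Worker side: record the pid as an empty file named after it,
    // but only when the guard says we run in distributed mode.
    static void recordPid(Path localDir, String workerId, long pid,
                          boolean distributed) throws IOException {
        if (!distributed) {
            return;
        }
        Path dir = pidsDir(localDir, workerId);
        Files.createDirectories(dir);
        Files.createFile(dir.resolve(Long.toString(pid)));
    }

    // Supervisor side: the pids to kill are the file names in the pids dir.
    static List<String> findPids(Path localDir, String workerId)
            throws IOException {
        List<String> pids = new ArrayList<>();
        Path dir = pidsDir(localDir, workerId);
        if (!Files.isDirectory(dir)) {
            return pids;
        }
        try (Stream<Path> files = Files.list(dir)) {
            files.forEach(p -> pids.add(p.getFileName().toString()));
        }
        return pids;
    }

    public static void main(String[] args) throws IOException {
        Path localDir = Files.createTempDirectory("storm-local");
        recordPid(localDir, "worker-a", 4242L, buggyIsDistributed("distributed"));
        recordPid(localDir, "worker-b", 4243L, isDistributed("distributed"));
        System.out.println("worker-a pids: " + findPids(localDir, "worker-a"));
        System.out.println("worker-b pids: " + findPids(localDir, "worker-b"));
    }
}
```

With the mismatched guard, worker-a ends up with no recorded pid, which 
corresponds to the supervisor being unable to find and kill the worker. With 
the string comparison, worker-b's pid is found.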


> Topology with time based windowing bolts are not getting killed properly.
> -------------------------------------------------------------------------
>
>                 Key: STORM-1516
>                 URL: https://issues.apache.org/jira/browse/STORM-1516
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 2.0.0
>            Reporter: Satish Duggana
>            Assignee: Satish Duggana
>            Priority: Blocker
>             Fix For: 2.0.0
>
>         Attachments: SlidingWindowTopology.java
>
>
> When a topology with time-based windowing bolts is killed, its workers are 
> not shut down properly and remain running. Deploying a new topology then 
> fails with the exception below because the earlier worker was not shut 
> down. This issue is not specific to this topology, though.
> 2016-02-02 10:07:42.845 o.a.s.d.worker [ERROR] Error on initialization of 
> server mk-worker
> org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to 
> bind to: 0.0.0.0/0.0.0.0:6700
>       at 
> org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
>  ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at org.apache.storm.messaging.netty.Server.<init>(Server.java:101) 
> ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at org.apache.storm.messaging.netty.Context.bind(Context.java:67) 
> ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at 
> org.apache.storm.daemon.worker$worker_data$fn__6329.invoke(worker.clj:265) 
> ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at org.apache.storm.util$assoc_apply_self.invoke(util.clj:934) 
> ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at org.apache.storm.daemon.worker$worker_data.invoke(worker.clj:262) 
> ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at 
> org.apache.storm.daemon.worker$fn__6627$exec_fn__2511__auto__$reify__6629.run(worker.clj:605)
>  ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_60]
>       at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_60]
>       at 
> org.apache.storm.daemon.worker$fn__6627$exec_fn__2511__auto____6628.invoke(worker.clj:603)
>  ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at clojure.lang.AFn.applyToHelper(AFn.java:178) ~[clojure-1.7.0.jar:?]
>       at clojure.lang.AFn.applyTo(AFn.java:144) ~[clojure-1.7.0.jar:?]
>       at clojure.core$apply.invoke(core.clj:630) ~[clojure-1.7.0.jar:?]
>       at 
> org.apache.storm.daemon.worker$fn__6627$mk_worker__6722.doInvoke(worker.clj:577)
>  [storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.7.0.jar:?]
>       at org.apache.storm.daemon.worker$_main.invoke(worker.clj:764) 
> [storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at clojure.lang.AFn.applyToHelper(AFn.java:165) [clojure-1.7.0.jar:?]
>       at clojure.lang.AFn.applyTo(AFn.java:144) [clojure-1.7.0.jar:?]
>       at org.apache.storm.daemon.worker.main(Unknown Source) 
> [storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> Caused by: java.net.BindException: Address already in use
>       at sun.nio.ch.Net.bind0(Native Method) ~[?:1.8.0_60]
>       at sun.nio.ch.Net.bind(Net.java:433) ~[?:1.8.0_60]
>       at sun.nio.ch.Net.bind(Net.java:425) ~[?:1.8.0_60]
>       at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) 
> ~[?:1.8.0_60]
>       at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) 
> ~[?:1.8.0_60]
>       at 
> org.apache.storm.shade.org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
>  ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at 
> org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:372)
>  ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at 
> org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:296)
>  ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at 
> org.apache.storm.shade.org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
>  ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at 
> org.apache.storm.shade.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>  ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at 
> org.apache.storm.shade.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>  ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_60]
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_60]
>       at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_60]
> 2016-02-02 10:07:42.857 o.a.s.util [ERROR] Halting process: ("Error on 
> initialization")



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
