[ https://issues.apache.org/jira/browse/STORM-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun Mahadevan updated STORM-3110:
----------------------------------
    Description:
While running in secure mode, supervisor sets the worker user (in the worker's local state) as the user that launched the topology.

{code:java}
SET worker-user 4d67a6be-4c80-4622-96af-f94706d58553 foo
{code}

However, the OS process does not actually run as that user (e.g. foo) unless "supervisor.run.worker.as.user" is also set.

If the supervisor's assignment changes, the supervisor in some cases checks if all processes are dead by matching the "pid+user" name. Here, if the worker is running as a different user (say storm), the supervisor wrongly assumes that the worker process is dead.

Later, when the supervisor tries to launch a worker at that same port, it throws a bind exception:

o.a.s.m.n.Server main [INFO] Create Netty Server Netty-server-localhost-6700, buffer_size: 5242880, maxWorkers: 1
o.a.s.d.worker main [ERROR] Error on initialization of server mk-worker
org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:6700
 at org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) ~[storm-core-1.2.0.3.1.0.0-501.jar:1.2.0.3.1.0.0-501]

  was:
While running in secure mode, supervisor sets the worker user (in the worker's local state) as the user that launched the topology.

{code:java}
SET worker-user 4d67a6be-4c80-4622-96af-f94706d58553 foo
{code}

However, the OS process does not actually run as that user (e.g. hrt_qa) unless "supervisor.run.worker.as.user" is also set.

If the supervisor's assignment changes, the supervisor in some cases checks if all processes are dead by matching the "pid+user" name. Here, if the worker is running as a different user (say storm), the supervisor wrongly assumes that the worker process is dead.

Later when supervisor tries to launch a worker at that same port, it throws a bind exception:

o.a.s.m.n.Server main [INFO] Create Netty Server Netty-server-localhost-6700, buffer_size: 5242880, maxWorkers: 1
o.a.s.d.worker main [ERROR] Error on initialization of server mk-worker
org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:6700
 at org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) ~[storm-core-1.2.0.3.1.0.0-501.jar:1.2.0.3.1.0.0-501]


> Supervisor does not kill all worker processes in secure mode in case of user mismatch
> -------------------------------------------------------------------------------------
>
>                 Key: STORM-3110
>                 URL: https://issues.apache.org/jira/browse/STORM-3110
>             Project: Apache Storm
>          Issue Type: Improvement
>            Reporter: Arun Mahadevan
>            Assignee: Arun Mahadevan
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> While running in secure mode, supervisor sets the worker user (in the worker's
> local state) as the user that launched the topology.
>
> {code:java}
> SET worker-user 4d67a6be-4c80-4622-96af-f94706d58553 foo
> {code}
>
> However, the OS process does not actually run as that user (e.g. foo) unless
> "supervisor.run.worker.as.user" is also set.
>
> If the supervisor's assignment changes, the supervisor in some cases checks
> if all processes are dead by matching the "pid+user" name. Here, if the worker
> is running as a different user (say storm), the supervisor wrongly assumes
> that the worker process is dead.
>
> Later when supervisor tries to launch a worker at that same port, it throws a
> bind exception
>
> o.a.s.m.n.Server main [INFO] Create Netty Server Netty-server-localhost-6700, buffer_size: 5242880, maxWorkers: 1
> o.a.s.d.worker main [ERROR] Error on initialization of server mk-worker
> org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:6700
>  at org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) ~[storm-core-1.2.0.3.1.0.0-501.jar:1.2.0.3.1.0.0-501]
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
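
The failure mode described in the issue can be illustrated with a small, self-contained Java sketch. This is not Storm's supervisor code: the class name, the method names, the pid 4242, and the map standing in for the OS process table are all hypothetical. It only models a liveness check keyed on "pid+user" and shows why such a check reports a live worker as dead when the recorded user (foo) differs from the actual process owner (storm). The expectedOwner helper is one conceivable way to pick the user to match against, assuming the owner depends on whether "supervisor.run.worker.as.user" is set; it is not necessarily the change made for this issue.

```java
import java.util.Map;
import java.util.Set;

public class WorkerLivenessSketch {

    /**
     * Models the check described in the issue: a pid counts as alive only if
     * a process with that pid exists AND is owned by the recorded user.
     * processOwners is a hypothetical stand-in for the OS process table
     * (pid -> owning user).
     */
    public static boolean allDeadByPidAndUser(Set<Long> pids,
                                              String recordedUser,
                                              Map<Long, String> processOwners) {
        for (long pid : pids) {
            String owner = processOwners.get(pid);
            if (owner != null && owner.equals(recordedUser)) {
                return false; // found a live process owned by the recorded user
            }
        }
        return true; // no pid matched on both pid and user
    }

    /**
     * Assumption for illustration: workers run as the topology submitter only
     * when run-as-user is enabled, otherwise as the supervisor's own user.
     */
    public static String expectedOwner(boolean runWorkerAsUser,
                                       String topologyUser,
                                       String supervisorUser) {
        return runWorkerAsUser ? topologyUser : supervisorUser;
    }

    public static void main(String[] args) {
        // The worker is alive, but owned by "storm" rather than the submitter "foo".
        Map<Long, String> processOwners = Map.of(4242L, "storm");
        Set<Long> workerPids = Set.of(4242L);

        // Matching against the recorded topology user misses the live process.
        boolean wronglyDead = allDeadByPidAndUser(workerPids, "foo", processOwners);
        System.out.println("matched against foo, all dead? " + wronglyDead);

        // Matching against the user the process actually runs as finds it.
        String owner = expectedOwner(false, "foo", "storm");
        boolean dead = allDeadByPidAndUser(workerPids, owner, processOwners);
        System.out.println("matched against " + owner + ", all dead? " + dead);
    }
}
```

In this model the first check returns true even though pid 4242 is still running, which corresponds to the supervisor skipping the kill and the next worker then failing to bind port 6700.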