Github user kishorvpatil commented on the pull request: https://github.com/apache/storm/pull/647#issuecomment-157558858

I think the spout and bolt should take care of handling hangs (or use timeouts instead of making blocking calls). Also, the spout/bolt code should guard against creating threads that can cause unhandled exceptions or hang-ups. Forcing the worker to stop sending heartbeats would kill the other components running on that worker, which is not desired.

Secondly, a worker should not be killed unless it is certain that the problem lies in the process and not in an external service. For example, if a Kafka spout hangs, killing the worker forces it to be relaunched or rescheduled, which may not solve the problem: the new worker process will still make the same blocking call and hang.

Thirdly, killing a worker forces a relaunch/reschedule, which makes the topology unstable because all the other workers in the loop have to reconnect to the new worker. In large topologies that might become a bigger problem, lead to domino effects, and make the topology take longer to settle.

-1
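
To illustrate the "use timeouts instead of making blocking calls" suggestion, here is a minimal sketch in plain Java. It is not taken from the pull request and uses no Storm API. The class `BoundedCall` and the method `callWithTimeout` are hypothetical names. The idea is that a spout or bolt would route a potentially blocking client call through such a helper and simply emit nothing when the call times out, so the executor thread keeps running.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not part of Storm): bounds the time spent in a call
// that might otherwise block forever, e.g. a fetch from an external service.
final class BoundedCall {
    // Daemon threads so a stuck call cannot keep the JVM alive on shutdown.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-call");
        t.setDaemon(true);
        return t;
    });

    private BoundedCall() {}

    // Runs `call` on a helper thread and waits at most `timeoutMs` for it.
    // Returns `fallback` on timeout, interruption, or failure of the call.
    static <T> T callWithTimeout(Callable<T> call, long timeoutMs, T fallback) {
        Future<T> future = POOL.submit(call);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the helper thread if it is still blocked
            return fallback;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the call itself threw; the caller decides how to report it
        }
    }
}
```

A caller would treat the fallback value (for example `null`) as "nothing to emit this round" and return, rather than hanging. Note that `cancel(true)` only helps if the underlying call responds to interruption, which is an assumption about the client library in use.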