[ https://issues.apache.org/jira/browse/STORM-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated STORM-3024:
----------------------------------
    Labels: pull-request-available  (was: )

> Allow scheduling for RAS to happen in the background
> ----------------------------------------------------
>
>                 Key: STORM-3024
>                 URL: https://issues.apache.org/jira/browse/STORM-3024
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-server
>    Affects Versions: 2.0.0
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>            Priority: Major
>              Labels: pull-request-available
>
> We have recently run into issues where a strategy on a very large cluster
> occasionally takes an unusually long time to finish scheduling. This
> slowness cascades into other problems, such as topologies that cannot be
> killed because the timer thread is still busy running the scheduler.
>
> The plan is to make scheduling happen in a thread pool. The main thread will
> wait for up to a configurable amount of time for the topology to be
> scheduled. If scheduling does not complete in that time, it will be left
> running in the background thread in the hope that it completes later.
>
> If the state of the cluster changes while scheduling is happening in the
> background, we will cancel the scheduling, because any assignment it
> produces may no longer fit on the cluster. The next time the scheduler runs
> it will restart the scheduling. This should allow the cluster to reach a
> steady state, even if it takes a while, without blocking kills and other
> critical operations.
>
> Note that we are also working on optimizing scheduling so that these issues
> don't happen in the first place.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
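
For readers following the issue, the plan in the description (run the strategy in a thread pool, wait a bounded time, leave it running on timeout, cancel it if the cluster changes) can be sketched roughly as below. This is only an illustration built on java.util.concurrent and is not Storm's implementation. The class name BackgroundScheduler, the method scheduleOnce, and the clusterVersion and maxWaitMs parameters are all hypothetical, and the sketch assumes cluster-state changes can be detected by comparing a version number.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names come from Storm.
class BackgroundScheduler<R> {
    private final ExecutorService pool;   // runs the (possibly slow) strategy
    private final long maxWaitMs;         // how long the caller is willing to block
    private final Callable<R> strategy;   // computes an assignment
    private Future<R> pending;            // attempt still running in the background, or null
    private long pendingVersion;          // cluster-state version that attempt started from

    BackgroundScheduler(ExecutorService pool, long maxWaitMs, Callable<R> strategy) {
        this.pool = pool;
        this.maxWaitMs = maxWaitMs;
        this.strategy = strategy;
    }

    // Called on each scheduling pass (for example from a timer thread).
    // Returns the result if it is ready within maxWaitMs, otherwise empty.
    synchronized Optional<R> scheduleOnce(long clusterVersion)
            throws ExecutionException, InterruptedException {
        // The cluster changed under a running attempt: its result may no
        // longer fit, so cancel it and start over.
        if (pending != null && pendingVersion != clusterVersion) {
            pending.cancel(true);
            pending = null;
        }
        // Start a new attempt only if none is in flight.
        if (pending == null) {
            pending = pool.submit(strategy);
            pendingVersion = clusterVersion;
        }
        try {
            R result = pending.get(maxWaitMs, TimeUnit.MILLISECONDS);
            pending = null;
            return Optional.of(result);
        } catch (TimeoutException e) {
            // Not done yet: leave it running and let the caller move on.
            return Optional.empty();
        } catch (ExecutionException e) {
            pending = null; // the attempt failed; the next pass starts fresh
            throw e;
        }
    }
}
```

In this sketch the caller never blocks for longer than maxWaitMs per pass, so other work on the same thread (such as handling a kill) is delayed by at most that amount. An empty result means "try again on the next pass", at which point the in-flight attempt is either picked up or restarted.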