-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review138587
-----------------------------------------------------------

Ship it!

- Laszlo Puskas


On June 16, 2016, 4:43 p.m., Sebastian Toader wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48722/
> -----------------------------------------------------------
>
> (Updated June 16, 2016, 4:43 p.m.)
>
>
> Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas,
> Sandor Magyari, and Sumit Mohanty.
>
>
> Bugs: AMBARI-17248
>     https://issues.apache.org/jira/browse/AMBARI-17248
>
>
> Repository: ambari
>
>
> Description
> -------
>
> Commands to be executed by ambari-agents are sent down by the server in the
> response to agent heartbeat messages.
> When the server processes a received heartbeat, it checks whether further
> commands are scheduled for the ambari-agent and adds them to the heartbeat
> response.
> The server organises the commands that can be executed in parallel into
> stages. Ambari server ensures that only the commands of a single stage are
> scheduled to be executed by the agent, and starts scheduling the commands of
> the next stage only after all commands of the current stage have finished
> successfully.
> The command statuses received with a heartbeat are processed in
> HeartBeatProcessor and ActionScheduler asynchronously to the creation of the
> heartbeat response, so when the heartbeat response is created the commands
> for the next stage are not scheduled yet. This means that the next commands
> are sent to the agent only with the following heartbeat.
> Agents currently send a heartbeat to the server on completion of a command
> or, if there are no commands to be executed, after a timeout of
> self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
> self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
> This means that when the server receives a heartbeat triggered by the
> completion of the last command of the current stage, it sends the commands
> for the next stage only ~10 seconds later, when the next heartbeat is
> received. As a result, agents spend a considerable amount of time idle when
> there are multiple stages to be executed.
> Agents should heartbeat at a higher rate while there are still pending stages
> to be executed.
>
>
> Diffs
> -----
>
>   ambari-agent/conf/unix/ambari-agent.ini 8f2ab1b
>   ambari-agent/conf/windows/ambari-agent.ini df88be6
>   ambari-agent/src/main/python/ambari_agent/AmbariConfig.py 89a881a
>   ambari-agent/src/main/python/ambari_agent/Controller.py e981a76
>   ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae
>   ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319
>   ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2
>   ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9
>   ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9
>
> Diff: https://reviews.apache.org/r/48722/diff/
>
>
> Testing
> -------
>
> Manual testing.
>
> Unit tests in succeeded.
>
>
> Thanks,
>
> Sebastian Toader
>
>
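
For readers skimming the description above, the proposed behaviour can be sketched roughly as follows. This is only an illustration of the idea, not the code in the diff: the function names, the has_pending_stages flag and the two interval values are assumptions made for the example (the description only states that the idle wait is ~10 seconds).

```python
# Illustrative sketch only. All names and values here are assumptions
# for the example; they are not taken from the patch under review.

IDLE_WAIT_SEC = 10.0   # assumed idle wait ("~10 seconds" in the description)
FAST_WAIT_SEC = 1.0    # assumed shorter wait while stages are still pending


def next_heartbeat_wait(has_pending_stages,
                        idle_wait=IDLE_WAIT_SEC,
                        fast_wait=FAST_WAIT_SEC):
    """Return how long the agent waits before its next heartbeat.

    has_pending_stages is a hypothetical flag meaning "the server still
    has stages to schedule for this agent".
    """
    return fast_wait if has_pending_stages else idle_wait


def idle_time_between_stages(stage_count, wait_sec):
    """Total time an agent sits idle waiting for the next stage.

    The commands of stage N+1 only arrive with the heartbeat after the
    one that reported stage N as finished, so a request with
    stage_count stages has (stage_count - 1) such gaps.
    """
    return max(stage_count - 1, 0) * wait_sec


if __name__ == "__main__":
    stages = 5
    slow = idle_time_between_stages(stages, next_heartbeat_wait(False))
    fast = idle_time_between_stages(stages, next_heartbeat_wait(True))
    print("idle at idle rate: %.1fs, at faster rate: %.1fs" % (slow, fast))
```

With these assumed values, a five-stage request has four gaps between stages, so the agent idles for about 40 seconds at the idle rate versus about 4 seconds at the faster rate.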