Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-22 Thread Andrew Onischuk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review139054
---


Ship it!





- Andrew Onischuk


On June 22, 2016, 11:21 a.m., Sebastian Toader wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48722/
> ---
> 
> (Updated June 22, 2016, 11:21 a.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, 
> Sandor Magyari, and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-17248
> https://issues.apache.org/jira/browse/AMBARI-17248
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> Commands to be executed by ambari-agents are sent down by the server in the
> response to agent heartbeat messages.
> When the server processes a received heartbeat, it checks whether there are
> commands scheduled to be executed by that ambari-agent and adds them to the
> heartbeat response.
> The server organises commands that can be executed in parallel into stages.
> Ambari server ensures that only the commands of a single stage are scheduled
> to be executed by the agent, and starts scheduling the commands of the next
> stage only after all commands of the current stage have finished
> successfully.
> The command statuses received with a heartbeat message are processed in
> HeartBeatProcessor and ActionScheduler, asynchronously to the creation of
> the heartbeat response. Thus, when the heartbeat response is created, the
> commands for the next stage are not scheduled yet, and they are sent to the
> agent only with the next heartbeat.
> Agents currently send a heartbeat to the server on command completion, or,
> if there are no commands to be executed, after a timeout of
> self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
> self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
> This means that when the server receives a heartbeat triggered by the
> completion of the last command of the current stage, it sends the commands
> for the next stage only ~10 seconds later, when the next heartbeat is
> received. As a result, agents spend a considerable amount of time idle when
> there are multiple stages to be executed.
> Agents should heartbeat at a higher rate while there are still pending stages
> to be executed.
> 
> 
> Diffs
> -
> 
>   ambari-agent/conf/unix/ambari-agent.ini 8f2ab1b 
>   ambari-agent/conf/unix/upgrade_agent_configs.py 583b5aa 
>   ambari-agent/conf/windows/ambari-agent.ini df88be6 
>   ambari-agent/src/main/python/ambari_agent/AmbariConfig.py 89a881a 
>   ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py 91098e0 
>   ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
>   ambari-agent/src/test/python/ambari_agent/TestHeartbeat.py f113083 
>   ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
>   ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2
>   ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9
>   ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9
> 
> Diff: https://reviews.apache.org/r/48722/diff/
> 
> 
> Testing
> ---
> 
> Manual testing.
> 
> Unit tests succeeded.
> 
> 
> Thanks,
> 
> Sebastian Toader
> 
>



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-22 Thread Victor Galgo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review139053
---


Ship it!





- Victor Galgo





Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-22 Thread Sebastian Toader

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/
---

(Updated June 22, 2016, 1:21 p.m.)


Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, Sandor 
Magyari, and Sumit Mohanty.


Bugs: AMBARI-17248
https://issues.apache.org/jira/browse/AMBARI-17248


Repository: ambari


Description
---

Commands to be executed by ambari-agents are sent down by the server in the
response to agent heartbeat messages.
When the server processes a received heartbeat, it checks whether there are
commands scheduled to be executed by that ambari-agent and adds them to the
heartbeat response.
The server organises commands that can be executed in parallel into stages.
Ambari server ensures that only the commands of a single stage are scheduled
to be executed by the agent, and starts scheduling the commands of the next
stage only after all commands of the current stage have finished successfully.
The command statuses received with a heartbeat message are processed in
HeartBeatProcessor and ActionScheduler, asynchronously to the creation of the
heartbeat response. Thus, when the heartbeat response is created, the commands
for the next stage are not scheduled yet, and they are sent to the agent only
with the next heartbeat.
Agents currently send a heartbeat to the server on command completion, or, if
there are no commands to be executed, after a timeout of
self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
This means that when the server receives a heartbeat triggered by the
completion of the last command of the current stage, it sends the commands for
the next stage only ~10 seconds later, when the next heartbeat is received.
As a result, agents spend a considerable amount of time idle when there are
multiple stages to be executed.
Agents should heartbeat at a higher rate while there are still pending stages
to be executed.
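
The timing described above can be sketched roughly as follows. This is an
illustrative model only, not the code from the patch: the function names, the
has_pending_stages flag, the 1-second fast interval, and the constant values
(10 s and 0.1 s) are all assumptions chosen to match the "~10 seconds" figure.

```python
# Assumed values for illustration; the real constants live in NetUtil.py.
HEARTBEAT_IDLE_INTERVAL_SEC = 10.0
MINIMUM_INTERVAL_BETWEEN_HEARTBEATS = 0.1


def heartbeat_timeout(has_pending_stages, fast_interval_sec=1.0):
    """Return how long the agent waits before sending the next heartbeat.

    has_pending_stages is a hypothetical flag that the server would set in
    the heartbeat response while further stages remain to be scheduled.
    """
    if has_pending_stages:
        # Heartbeat at a higher rate so the next stage is picked up quickly.
        return fast_interval_sec
    # Idle case from the description: roughly 10 seconds.
    return HEARTBEAT_IDLE_INTERVAL_SEC - MINIMUM_INTERVAL_BETWEEN_HEARTBEATS


def idle_time_between_stages(num_stages, timeout_sec):
    """Rough upper bound on the time spent waiting for next-stage commands."""
    # One wait per stage boundary, so num_stages - 1 waits in total.
    return max(num_stages - 1, 0) * timeout_sec
```

Under these assumptions, a request with 5 stages can spend close to 40 seconds
idle with the idle timeout, versus about 4 seconds with a 1-second interval.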


Diffs (updated)
-

  ambari-agent/conf/unix/ambari-agent.ini 8f2ab1b 
  ambari-agent/conf/unix/upgrade_agent_configs.py 583b5aa 
  ambari-agent/conf/windows/ambari-agent.ini df88be6 
  ambari-agent/src/main/python/ambari_agent/AmbariConfig.py 89a881a 
  ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
  ambari-agent/src/main/python/ambari_agent/Heartbeat.py 91098e0 
  ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
  ambari-agent/src/test/python/ambari_agent/TestHeartbeat.py f113083 
  ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
  ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9
  ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2
  ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9

Diff: https://reviews.apache.org/r/48722/diff/


Testing
---

Manual testing.

Unit tests succeeded.


Thanks,

Sebastian Toader



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-22 Thread Sebastian Toader

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/
---

(Updated June 22, 2016, 1:10 p.m.)


Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, Sandor 
Magyari, and Sumit Mohanty.


Changes
---

Reduced the amount of logging produced during heartbeat handling in the agent.
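
The review does not say how the log volume was reduced. One common approach,
shown here purely as a hedged sketch, is to log each heartbeat at DEBUG and
only every Nth heartbeat at INFO. The logger name, the log_heartbeat helper,
and the every_n parameter are hypothetical and not taken from the patch.

```python
import logging

logger = logging.getLogger("heartbeat_sketch")  # hypothetical logger name


def log_heartbeat(heartbeat_id, every_n=10):
    """Log every heartbeat at DEBUG, but only every Nth one at INFO.

    Returns the level used, which makes the behaviour easy to check.
    """
    level = logging.INFO if heartbeat_id % every_n == 0 else logging.DEBUG
    logger.log(level, "Sending heartbeat %d", heartbeat_id)
    return level
```

With a 1-second heartbeat and every_n=10, INFO output stays at roughly the
same volume as with the previous ~10-second heartbeat.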


Bugs: AMBARI-17248
https://issues.apache.org/jira/browse/AMBARI-17248


Repository: ambari


Description
---

Commands to be executed by ambari-agents are sent down by the server in the
response to agent heartbeat messages.
When the server processes a received heartbeat, it checks whether there are
commands scheduled to be executed by that ambari-agent and adds them to the
heartbeat response.
The server organises commands that can be executed in parallel into stages.
Ambari server ensures that only the commands of a single stage are scheduled
to be executed by the agent, and starts scheduling the commands of the next
stage only after all commands of the current stage have finished successfully.
The command statuses received with a heartbeat message are processed in
HeartBeatProcessor and ActionScheduler, asynchronously to the creation of the
heartbeat response. Thus, when the heartbeat response is created, the commands
for the next stage are not scheduled yet, and they are sent to the agent only
with the next heartbeat.
Agents currently send a heartbeat to the server on command completion, or, if
there are no commands to be executed, after a timeout of
self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
This means that when the server receives a heartbeat triggered by the
completion of the last command of the current stage, it sends the commands for
the next stage only ~10 seconds later, when the next heartbeat is received.
As a result, agents spend a considerable amount of time idle when there are
multiple stages to be executed.
Agents should heartbeat at a higher rate while there are still pending stages
to be executed.


Diffs (updated)
-

  ambari-agent/conf/unix/ambari-agent.ini 8f2ab1b 
  ambari-agent/conf/unix/upgrade_agent_configs.py 583b5aa 
  ambari-agent/conf/windows/ambari-agent.ini df88be6 
  ambari-agent/src/main/python/ambari_agent/AmbariConfig.py 89a881a 
  ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
  ambari-agent/src/main/python/ambari_agent/Heartbeat.py 91098e0 
  ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
  ambari-agent/src/test/python/ambari_agent/TestHeartbeat.py f113083 
  ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
  ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9
  ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2
  ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9

Diff: https://reviews.apache.org/r/48722/diff/


Testing
---

Manual testing.

Unit tests succeeded.


Thanks,

Sebastian Toader



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-21 Thread Andrew Onischuk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review138851
---



We have some log messages that are emitted on every heartbeat. These would
flood the logs badly if emitted every second. Could we fix this (possibly in a
separate JIRA)?

- Andrew Onischuk


On June 21, 2016, 3:19 p.m., Sebastian Toader wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48722/
> ---
> 
> (Updated June 21, 2016, 3:19 p.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, 
> Sandor Magyari, and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-17248
> https://issues.apache.org/jira/browse/AMBARI-17248
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> Commands to be executed by ambari-agents are sent down by the server in the
> response to agent heartbeat messages.
> When the server processes a received heartbeat, it checks whether there are
> commands scheduled to be executed by that ambari-agent and adds them to the
> heartbeat response.
> The server organises commands that can be executed in parallel into stages.
> Ambari server ensures that only the commands of a single stage are scheduled
> to be executed by the agent, and starts scheduling the commands of the next
> stage only after all commands of the current stage have finished
> successfully.
> The command statuses received with a heartbeat message are processed in
> HeartBeatProcessor and ActionScheduler, asynchronously to the creation of
> the heartbeat response. Thus, when the heartbeat response is created, the
> commands for the next stage are not scheduled yet, and they are sent to the
> agent only with the next heartbeat.
> Agents currently send a heartbeat to the server on command completion, or,
> if there are no commands to be executed, after a timeout of
> self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
> self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
> This means that when the server receives a heartbeat triggered by the
> completion of the last command of the current stage, it sends the commands
> for the next stage only ~10 seconds later, when the next heartbeat is
> received. As a result, agents spend a considerable amount of time idle when
> there are multiple stages to be executed.
> Agents should heartbeat at a higher rate while there are still pending stages
> to be executed.
> 
> 
> Diffs
> -
> 
>   ambari-agent/conf/unix/ambari-agent.ini 8f2ab1b 
>   ambari-agent/conf/unix/upgrade_agent_configs.py 583b5aa 
>   ambari-agent/conf/windows/ambari-agent.ini df88be6 
>   ambari-agent/src/main/python/ambari_agent/AmbariConfig.py 89a881a 
>   ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py 91098e0 
>   ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
>   ambari-agent/src/test/python/ambari_agent/TestHeartbeat.py f113083 
>   ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
>   ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2
>   ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9
>   ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9
> 
> Diff: https://reviews.apache.org/r/48722/diff/
> 
> 
> Testing
> ---
> 
> Manual testing.
> 
> Unit tests succeeded.
> 
> 
> Thanks,
> 
> Sebastian Toader
> 
>



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-21 Thread Sebastian Toader


> On June 20, 2016, 4:46 p.m., Andrew Onischuk wrote:
> > Also, we could be missing other things like this (state/status) that
> > cause a slowdown. It would be nice to deploy the full stack with and
> > without your patch and see whether it brings a speedup rather than a
> > slowdown.

The component status reports are not an issue: the status commands are issued
by the server every minute, so they do not depend on the heartbeat rate.


- Sebastian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review138605
---





Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-21 Thread Sebastian Toader

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/
---

(Updated June 21, 2016, 5:19 p.m.)


Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, Sandor 
Magyari, and Sumit Mohanty.


Changes
---

Host status reports are now generated periodically, every 60 seconds
(configurable), independently of the heartbeat rate.
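
As a rough illustration of decoupling the status report from the heartbeat
rate, the sketch below uses a small timer object that fires at most once per
interval no matter how often it is polled. The PeriodicTrigger class and its
clock parameter are hypothetical names, not part of the patch, and the code
assumes Python 3 (time.monotonic).

```python
import time


class PeriodicTrigger:
    """Fires at most once per interval, however often due() is polled."""

    def __init__(self, interval_sec, clock=time.monotonic):
        self.interval_sec = interval_sec
        self.clock = clock          # injectable so the timer can be tested
        self.last_fired = None      # None means "has never fired"

    def due(self):
        """Return True if the interval has elapsed, and restart the timer."""
        now = self.clock()
        if self.last_fired is None or now - self.last_fired >= self.interval_sec:
            self.last_fired = now
            return True
        return False
```

A heartbeat loop could then poll a PeriodicTrigger(60) on every iteration and
attach the host status report only when due() returns True, so heartbeating
every second would not make the report any more frequent.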


Bugs: AMBARI-17248
https://issues.apache.org/jira/browse/AMBARI-17248


Repository: ambari


Description
---

Commands to be executed by ambari-agents are sent down by the server in the
response to agent heartbeat messages.
When the server processes a received heartbeat, it checks whether there are
commands scheduled to be executed by that ambari-agent and adds them to the
heartbeat response.
The server organises commands that can be executed in parallel into stages.
Ambari server ensures that only the commands of a single stage are scheduled
to be executed by the agent, and starts scheduling the commands of the next
stage only after all commands of the current stage have finished successfully.
The command statuses received with a heartbeat message are processed in
HeartBeatProcessor and ActionScheduler, asynchronously to the creation of the
heartbeat response. Thus, when the heartbeat response is created, the commands
for the next stage are not scheduled yet, and they are sent to the agent only
with the next heartbeat.
Agents currently send a heartbeat to the server on command completion, or, if
there are no commands to be executed, after a timeout of
self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
This means that when the server receives a heartbeat triggered by the
completion of the last command of the current stage, it sends the commands for
the next stage only ~10 seconds later, when the next heartbeat is received.
As a result, agents spend a considerable amount of time idle when there are
multiple stages to be executed.
Agents should heartbeat at a higher rate while there are still pending stages
to be executed.


Diffs (updated)
-

  ambari-agent/conf/unix/ambari-agent.ini 8f2ab1b 
  ambari-agent/conf/unix/upgrade_agent_configs.py 583b5aa 
  ambari-agent/conf/windows/ambari-agent.ini df88be6 
  ambari-agent/src/main/python/ambari_agent/AmbariConfig.py 89a881a 
  ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
  ambari-agent/src/main/python/ambari_agent/Heartbeat.py 91098e0 
  ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
  ambari-agent/src/test/python/ambari_agent/TestHeartbeat.py f113083 
  ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
  ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9
  ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2
  ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9

Diff: https://reviews.apache.org/r/48722/diff/


Testing
---

Manual testing.

Unit tests succeeded.


Thanks,

Sebastian Toader



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-20 Thread Andrew Onischuk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review138605
---



Also, we could be missing other things like this (state/status) that cause a
slowdown. It would be nice to deploy the full stack with and without your
patch and see whether it brings a speedup rather than a slowdown.

- Andrew Onischuk


On June 20, 2016, 1:08 p.m., Sebastian Toader wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48722/
> ---
> 
> (Updated June 20, 2016, 1:08 p.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, 
> Sandor Magyari, and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-17248
> https://issues.apache.org/jira/browse/AMBARI-17248
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> Commands to be executed by ambari-agents are sent down by the server in the
> response to agent heartbeat messages.
> When the server processes a received heartbeat, it checks whether there are
> commands scheduled to be executed by that ambari-agent and adds them to the
> heartbeat response.
> The server organises commands that can be executed in parallel into stages.
> Ambari server ensures that only the commands of a single stage are scheduled
> to be executed by the agent, and starts scheduling the commands of the next
> stage only after all commands of the current stage have finished
> successfully.
> The command statuses received with a heartbeat message are processed in
> HeartBeatProcessor and ActionScheduler, asynchronously to the creation of
> the heartbeat response. Thus, when the heartbeat response is created, the
> commands for the next stage are not scheduled yet, and they are sent to the
> agent only with the next heartbeat.
> Agents currently send a heartbeat to the server on command completion, or,
> if there are no commands to be executed, after a timeout of
> self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
> self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
> This means that when the server receives a heartbeat triggered by the
> completion of the last command of the current stage, it sends the commands
> for the next stage only ~10 seconds later, when the next heartbeat is
> received. As a result, agents spend a considerable amount of time idle when
> there are multiple stages to be executed.
> Agents should heartbeat at a higher rate while there are still pending stages
> to be executed.
> 
> 
> Diffs
> -
> 
>   ambari-agent/conf/unix/ambari-agent.ini 8f2ab1b 
>   ambari-agent/conf/windows/ambari-agent.ini df88be6 
>   ambari-agent/src/main/python/ambari_agent/AmbariConfig.py 89a881a 
>   ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
>   ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
>   ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
>   ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2
>   ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9
>   ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9
> 
> Diff: https://reviews.apache.org/r/48722/diff/
> 
> 
> Testing
> ---
> 
> Manual testing.
> 
> Unit tests succeeded.
> 
> 
> Thanks,
> 
> Sebastian Toader
> 
>



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-20 Thread Andrew Onischuk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review138597
---



Sebastian, can you also make sure that status commands don't run more often
with a more frequent heartbeat? We don't want those redundant computations
running too often.

- Andrew Onischuk
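
One possible way to keep status commands from running more often when the
heartbeat rate goes up is a time-based guard that is independent of the
heartbeat loop. The sketch below is illustrative only: the class name, the
10-second interval and the injected clock are assumptions, and nothing in this
thread says the patch is implemented this way.

```python
import time


class StatusCommandThrottle:
    """Hypothetical guard: allow status commands at most once per interval,
    no matter how often the heartbeat loop ticks."""

    def __init__(self, min_interval_sec, clock=time.monotonic):
        self.min_interval_sec = min_interval_sec
        self._clock = clock      # injectable so the behaviour can be tested
        self._last_run = None    # time of the last permitted run

    def should_run(self):
        """Return True (and record the run) only if the interval has elapsed."""
        now = self._clock()
        if self._last_run is None or now - self._last_run >= self.min_interval_sec:
            self._last_run = now
            return True
        return False


# Fake clock: heartbeats arrive at t = 0, 1, 2, 10, 11 and 20 seconds.
ticks = iter([0, 1, 2, 10, 11, 20])
throttle = StatusCommandThrottle(min_interval_sec=10, clock=lambda: next(ticks))
print([throttle.should_run() for _ in range(6)])
# [True, False, False, True, False, True]
```

With such a guard the heartbeat interval can shrink while the status command
rate stays bounded by min_interval_sec.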





Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-20 Thread Sebastian Toader

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/
---

(Updated June 20, 2016, 3:08 p.m.)


Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, Sandor 
Magyari, and Sumit Mohanty.


Changes
---

Removed c1.getHosts()


Bugs: AMBARI-17248
https://issues.apache.org/jira/browse/AMBARI-17248


Repository: ambari


Description
---

Commands to be executed by ambari-agents are sent down by the server in the
response to agent heartbeat messages.
When the server processes a received heartbeat, it checks whether there are
commands scheduled for execution by the ambari-agent and adds those to the
heartbeat response.
The server organises the commands that can be executed in parallel into stages.
Ambari server ensures that only the commands of a single stage are scheduled to
be executed by the agent, and starts scheduling the commands of the next stage
only after all commands of the current stage have finished successfully.
The processing of command status received with the heartbeat message happens
asynchronously to heartbeat response creation, in HeartBeatProcessor and
ActionScheduler, so when the heartbeat response is created the commands for the
next stage are not scheduled yet. This means that the next commands will be
sent to the agent only with the next heartbeat.
Agents currently send a heartbeat to the server on command completion, or
after a timeout of self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds, if
there are no commands to be executed.
This means that when the server receives a heartbeat triggered by the
completion of the last command of the current stage, it sends the commands for
the next stage only ~10 seconds later, when the next heartbeat is received.
This leads to agents spending a considerable amount of time idle when there are
multiple stages to be executed.
Agents should heartbeat at a higher rate while there are still pending stages
to be executed.
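
To put a rough number on the cost described above, the sketch below estimates
how long an agent can sit idle over a multi-stage request. Only the ~10 second
wait comes from the description; the function name, the 20-stage request and
the 1-second alternative are illustrative assumptions.

```python
def idle_time_between_stages(stage_count, heartbeat_wait_sec):
    """Approximate upper bound on agent idle time across a multi-stage request.

    After every stage except the last, the agent may wait up to one heartbeat
    timeout before it receives the commands of the next stage.
    """
    if stage_count < 1:
        raise ValueError("stage_count must be at least 1")
    return (stage_count - 1) * heartbeat_wait_sec


# ~10 s is the idle timeout quoted in the description; 1 s is hypothetical.
print(idle_time_between_stages(20, 10))  # 190
print(idle_time_between_stages(20, 1))   # 19
```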


Diffs (updated)
-

  ambari-agent/conf/unix/ambari-agent.ini 8f2ab1b
  ambari-agent/conf/windows/ambari-agent.ini df88be6
  ambari-agent/src/main/python/ambari_agent/AmbariConfig.py 89a881a
  ambari-agent/src/main/python/ambari_agent/Controller.py e981a76
  ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae
  ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319
  ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9
  ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2
  ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9

Diff: https://reviews.apache.org/r/48722/diff/


Testing
---

Manual testing.

Unit tests succeeded.


Thanks,

Sebastian Toader



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-20 Thread Laszlo Puskas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review138587
---


Ship it!




Ship It!

- Laszlo Puskas





Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-20 Thread Laszlo Puskas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review138586
---




ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java
 (line 532)


This call here is superfluous.


- Laszlo Puskas





Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-20 Thread Sandor Magyari

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review138584
---


Ship it!




Ship It!

- Sandor Magyari





Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-20 Thread Andrew Onischuk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review138582
---


Ship it!




Ship It!

- Andrew Onischuk





Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-20 Thread Victor Galgo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review138580
---


Ship it!




Nice!

- Victor Galgo





Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-16 Thread Sebastian Toader

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/
---

(Updated June 16, 2016, 6:43 p.m.)


Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, Sandor 
Magyari, and Sumit Mohanty.


Changes
---

1. Introduced ambari agent config properties (ambari-agent.ini) for the minimum
and maximum heartbeat intervals
2. The computed heartbeat interval is used all the time, not just when there
are still pending tasks (stages)
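
As a rough illustration of the two changes above, the sketch below reads a
minimum and a maximum heartbeat interval from ini-style text and keeps a
computed interval inside those bounds. The section name, option names and
default values are hypothetical (the real ambari-agent.ini keys are not named
in this thread), and how the patch computes the raw interval is not shown here
either, so it is simply passed in.

```python
import configparser

# Section and option names are hypothetical, not the real ambari-agent.ini keys.
SAMPLE_INI = """
[heartbeat]
idle_interval_min = 1
idle_interval_max = 10
"""


def read_heartbeat_bounds(ini_text, default_min=1, default_max=10):
    """Return (min, max) heartbeat bounds in seconds, using defaults if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    low = parser.getint("heartbeat", "idle_interval_min", fallback=default_min)
    high = parser.getint("heartbeat", "idle_interval_max", fallback=default_max)
    if low > high:
        raise ValueError("minimum heartbeat interval exceeds maximum")
    return low, high


def clamp_interval(computed_sec, low, high):
    """Keep a computed heartbeat interval inside the configured bounds."""
    return max(low, min(high, computed_sec))


low, high = read_heartbeat_bounds(SAMPLE_INI)
print(clamp_interval(0, low, high))   # 1
print(clamp_interval(4, low, high))   # 4
print(clamp_interval(25, low, high))  # 10
```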


Bugs: AMBARI-17248
https://issues.apache.org/jira/browse/AMBARI-17248


Repository: ambari


Description
---

Commands to be executed by ambari-agents are sent down by the server in the
response to agent heartbeat messages.
When the server processes a received heartbeat, it checks whether there are
commands scheduled for execution by the ambari-agent and adds those to the
heartbeat response.
The server organises the commands that can be executed in parallel into stages.
Ambari server ensures that only the commands of a single stage are scheduled to
be executed by the agent, and starts scheduling the commands of the next stage
only after all commands of the current stage have finished successfully.
The processing of command status received with the heartbeat message happens
asynchronously to heartbeat response creation, in HeartBeatProcessor and
ActionScheduler, so when the heartbeat response is created the commands for the
next stage are not scheduled yet. This means that the next commands will be
sent to the agent only with the next heartbeat.
Agents currently send a heartbeat to the server on command completion, or
after a timeout of self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds, if
there are no commands to be executed.
This means that when the server receives a heartbeat triggered by the
completion of the last command of the current stage, it sends the commands for
the next stage only ~10 seconds later, when the next heartbeat is received.
This leads to agents spending a considerable amount of time idle when there are
multiple stages to be executed.
Agents should heartbeat at a higher rate while there are still pending stages
to be executed.
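
The "heartbeat on command completion or after a timeout" behaviour described
above can be sketched with a threading.Event. This is a minimal illustration,
not the agent's actual Controller code: the class and method names are
hypothetical, and the interval values in the usage lines are shortened so the
example finishes quickly.

```python
import threading


class HeartbeatTrigger:
    """Hypothetical helper: wake the heartbeat loop early or after a timeout."""

    def __init__(self, idle_interval_sec, min_interval_sec):
        # Mirrors the "idle interval minus minimum interval" timeout from the
        # description; the real constant values are not assumed here.
        self.timeout_sec = idle_interval_sec - min_interval_sec
        self._event = threading.Event()

    def command_completed(self):
        """Called when a command finishes; requests an immediate heartbeat."""
        self._event.set()

    def wait_for_next_heartbeat(self):
        """Block until a command completes or the timeout expires.

        Returns True if woken by a command completion, False on timeout.
        """
        woken = self._event.wait(self.timeout_sec)
        self._event.clear()
        return woken


trigger = HeartbeatTrigger(idle_interval_sec=0.2, min_interval_sec=0.1)
print(trigger.wait_for_next_heartbeat())  # False: nothing completed, timed out
trigger.command_completed()
print(trigger.wait_for_next_heartbeat())  # True: woken by the completion
```

Shortening timeout_sec while stages are pending is what reduces the idle gap
between stages.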


Diffs (updated)
-

  ambari-agent/conf/unix/ambari-agent.ini 8f2ab1b
  ambari-agent/conf/windows/ambari-agent.ini df88be6
  ambari-agent/src/main/python/ambari_agent/AmbariConfig.py 89a881a
  ambari-agent/src/main/python/ambari_agent/Controller.py e981a76
  ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae
  ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319
  ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9
  ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2
  ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9

Diff: https://reviews.apache.org/r/48722/diff/


Testing
---

Manual testing.

Unit tests succeeded.


Thanks,

Sebastian Toader



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-15 Thread Alejandro Fernandez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review137827
---


Ship it!




Ship It!

- Alejandro Fernandez


On June 15, 2016, 2:55 p.m., Sebastian Toader wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48722/
> ---
> 
> (Updated June 15, 2016, 2:55 p.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, 
> Sandor Magyari, and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-17248
> https://issues.apache.org/jira/browse/AMBARI-17248
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> Commands to be executed by ambari-agents are sent down by the server in the
> response to agent heartbeat messages.
> When the server processes a received heartbeat, it checks whether there are
> commands scheduled for execution by the ambari-agent and adds those to the
> heartbeat response.
> The server organises the commands that can be executed in parallel into
> stages. Ambari server ensures that only the commands of a single stage are
> scheduled to be executed by the agent, and starts scheduling the commands of
> the next stage only after all commands of the current stage have finished
> successfully.
> The processing of command status received with the heartbeat message happens
> asynchronously to heartbeat response creation, in HeartBeatProcessor and
> ActionScheduler, so when the heartbeat response is created the commands for
> the next stage are not scheduled yet. This means that the next commands will
> be sent to the agent only with the next heartbeat.
> Agents currently send a heartbeat to the server on command completion, or
> after a timeout of self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
> self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds, if
> there are no commands to be executed.
> This means that when the server receives a heartbeat triggered by the
> completion of the last command of the current stage, it sends the commands
> for the next stage only ~10 seconds later, when the next heartbeat is
> received. This leads to agents spending a considerable amount of time idle
> when there are multiple stages to be executed.
> Agents should heartbeat at a higher rate while there are still pending stages
> to be executed.
> 
> 
> Diffs
> -
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
>   ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
>   ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
>   ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 
> 8103872 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java
>  35a37e3 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java
>  1ab7ae9 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java 
> ac0ddd2 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java 
> bd9de13 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java
>  3d2388e 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java
>  c26e1e9 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java
>  627ade9 
> 
> Diff: https://reviews.apache.org/r/48722/diff/
> 
> 
> Testing
> ---
> 
> Manual testing.
> 
> Unit tests succeeded.
> 
> 
> Thanks,
> 
> Sebastian Toader
> 
>



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-15 Thread Sebastian Toader


> On June 15, 2016, 5:54 p.m., Andrew Onischuk wrote:
> > ambari-agent/src/main/python/ambari_agent/Controller.py, line 281
> > 
> >
> > Do we really want to do this only if there are tasks pending?
> > 
> > Could we do this always? This would make one-time operations such as 
> > starting or stopping a component, decommissions, etc. much more responsive.

We could do that. Sumit/Rob/Sandor/Laszlo, do you see any issues with that?


- Sebastian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review137750
---


On June 15, 2016, 4:55 p.m., Sebastian Toader wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48722/
> ---
> 
> (Updated June 15, 2016, 4:55 p.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, 
> Sandor Magyari, and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-17248
> https://issues.apache.org/jira/browse/AMBARI-17248
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> The server sends the commands to be executed by ambari-agents in its
> responses to agent heartbeat messages.
> When the server processes a received heartbeat, it checks whether any
> commands are scheduled for that ambari-agent and adds them to the
> heartbeat response.
> The server organises the commands that can be executed in parallel into
> stages. Ambari server ensures that only the commands of a single stage are
> scheduled to be executed by the agent, and starts scheduling the commands
> of the next stage only after all commands of the current stage have
> finished successfully.
> The command statuses received with a heartbeat message are processed in
> HeartBeatProcessor and ActionScheduler asynchronously to the creation of
> the heartbeat response, so when the heartbeat response is created the
> commands for the next stage are not scheduled yet. This means that the
> next commands are sent to the agent only with the next heartbeat.
> Agents currently send a heartbeat to the server on command completion or,
> if there are no commands to be executed, after a timeout of
> self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
> self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
> This means that when the server receives a heartbeat triggered by the
> completion of the last command of the current stage, it sends the commands
> for the next stage only ~10 seconds later, when the next heartbeat is
> received. As a result, agents spend a considerable amount of time idle
> when there are multiple stages to be executed.
> Agents should heartbeat at a higher rate while there are still pending
> stages to be executed.
> 
> 
> Diffs
> -
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
>   ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
>   ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
>   ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 
> 8103872 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java
>  35a37e3 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java
>  1ab7ae9 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java 
> ac0ddd2 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java 
> bd9de13 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java
>  3d2388e 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java
>  c26e1e9 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java
>  627ade9 
> 
> Diff: https://reviews.apache.org/r/48722/diff/
> 
> 
> Testing
> ---
> 
> Manual testing.
> 
> Unit tests succeeded.
> 
> 
> Thanks,
> 
> Sebastian Toader
> 
>



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-15 Thread Andrew Onischuk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review137750
---




ambari-agent/src/main/python/ambari_agent/Controller.py (line 281)


Do we really want to do this only if there are tasks pending?

Could we do this always? This would make one-time operations such as 
starting or stopping a component, decommissions, etc. much more responsive.


- Andrew Onischuk


On June 15, 2016, 2:55 p.m., Sebastian Toader wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48722/
> ---
> 
> (Updated June 15, 2016, 2:55 p.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, 
> Sandor Magyari, and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-17248
> https://issues.apache.org/jira/browse/AMBARI-17248
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> The server sends the commands to be executed by ambari-agents in its
> responses to agent heartbeat messages.
> When the server processes a received heartbeat, it checks whether any
> commands are scheduled for that ambari-agent and adds them to the
> heartbeat response.
> The server organises the commands that can be executed in parallel into
> stages. Ambari server ensures that only the commands of a single stage are
> scheduled to be executed by the agent, and starts scheduling the commands
> of the next stage only after all commands of the current stage have
> finished successfully.
> The command statuses received with a heartbeat message are processed in
> HeartBeatProcessor and ActionScheduler asynchronously to the creation of
> the heartbeat response, so when the heartbeat response is created the
> commands for the next stage are not scheduled yet. This means that the
> next commands are sent to the agent only with the next heartbeat.
> Agents currently send a heartbeat to the server on command completion or,
> if there are no commands to be executed, after a timeout of
> self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
> self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
> This means that when the server receives a heartbeat triggered by the
> completion of the last command of the current stage, it sends the commands
> for the next stage only ~10 seconds later, when the next heartbeat is
> received. As a result, agents spend a considerable amount of time idle
> when there are multiple stages to be executed.
> Agents should heartbeat at a higher rate while there are still pending
> stages to be executed.
> 
> 
> Diffs
> -
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
>   ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
>   ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
>   ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 
> 8103872 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java
>  35a37e3 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java
>  1ab7ae9 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java 
> ac0ddd2 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java 
> bd9de13 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java
>  3d2388e 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java
>  c26e1e9 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java
>  627ade9 
> 
> Diff: https://reviews.apache.org/r/48722/diff/
> 
> 
> Testing
> ---
> 
> Manual testing.
> 
> Unit tests succeeded.
> 
> 
> Thanks,
> 
> Sebastian Toader
> 
>



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-15 Thread Sandor Magyari

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review137751
---


Ship it!




Ship It!

- Sandor Magyari


On June 15, 2016, 2:55 p.m., Sebastian Toader wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48722/
> ---
> 
> (Updated June 15, 2016, 2:55 p.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, 
> Sandor Magyari, and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-17248
> https://issues.apache.org/jira/browse/AMBARI-17248
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> The server sends the commands to be executed by ambari-agents in its
> responses to agent heartbeat messages.
> When the server processes a received heartbeat, it checks whether any
> commands are scheduled for that ambari-agent and adds them to the
> heartbeat response.
> The server organises the commands that can be executed in parallel into
> stages. Ambari server ensures that only the commands of a single stage are
> scheduled to be executed by the agent, and starts scheduling the commands
> of the next stage only after all commands of the current stage have
> finished successfully.
> The command statuses received with a heartbeat message are processed in
> HeartBeatProcessor and ActionScheduler asynchronously to the creation of
> the heartbeat response, so when the heartbeat response is created the
> commands for the next stage are not scheduled yet. This means that the
> next commands are sent to the agent only with the next heartbeat.
> Agents currently send a heartbeat to the server on command completion or,
> if there are no commands to be executed, after a timeout of
> self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
> self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
> This means that when the server receives a heartbeat triggered by the
> completion of the last command of the current stage, it sends the commands
> for the next stage only ~10 seconds later, when the next heartbeat is
> received. As a result, agents spend a considerable amount of time idle
> when there are multiple stages to be executed.
> Agents should heartbeat at a higher rate while there are still pending
> stages to be executed.
> 
> 
> Diffs
> -
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
>   ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
>   ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
>   ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 
> 8103872 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java
>  35a37e3 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java
>  1ab7ae9 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java 
> ac0ddd2 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java 
> bd9de13 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java
>  3d2388e 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java
>  c26e1e9 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java
>  627ade9 
> 
> Diff: https://reviews.apache.org/r/48722/diff/
> 
> 
> Testing
> ---
> 
> Manual testing.
> 
> Unit tests succeeded.
> 
> 
> Thanks,
> 
> Sebastian Toader
> 
>



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-15 Thread Sebastian Toader

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/
---

(Updated June 15, 2016, 4:55 p.m.)


Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, Sandor 
Magyari, and Sumit Mohanty.


Changes
---

Fix failing unit tests


Bugs: AMBARI-17248
https://issues.apache.org/jira/browse/AMBARI-17248


Repository: ambari


Description
---

The server sends the commands to be executed by ambari-agents in its
responses to agent heartbeat messages.
When the server processes a received heartbeat, it checks whether any
commands are scheduled for that ambari-agent and adds them to the
heartbeat response.
The server organises the commands that can be executed in parallel into
stages. Ambari server ensures that only the commands of a single stage are
scheduled to be executed by the agent, and starts scheduling the commands
of the next stage only after all commands of the current stage have
finished successfully.
The command statuses received with a heartbeat message are processed in
HeartBeatProcessor and ActionScheduler asynchronously to the creation of
the heartbeat response, so when the heartbeat response is created the
commands for the next stage are not scheduled yet. This means that the
next commands are sent to the agent only with the next heartbeat.
Agents currently send a heartbeat to the server on command completion or,
if there are no commands to be executed, after a timeout of
self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
This means that when the server receives a heartbeat triggered by the
completion of the last command of the current stage, it sends the commands
for the next stage only ~10 seconds later, when the next heartbeat is
received. As a result, agents spend a considerable amount of time idle
when there are multiple stages to be executed.
Agents should heartbeat at a higher rate while there are still pending
stages to be executed.
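
To put a rough number on the idle time described above, here is a small
back-of-the-envelope sketch. It is not taken from the patch: the function name
is hypothetical, and it assumes that every stage boundary costs one full
heartbeat wait on the host.

```python
def idle_time_between_stages(stage_count, heartbeat_wait_sec):
    """Rough estimate of how long one host idles while waiting for next stages.

    Assumption: each of the (stage_count - 1) stage boundaries costs one
    full heartbeat wait, because the next stage's commands only arrive
    with the heartbeat that follows the one reporting completion.
    """
    if stage_count < 1:
        raise ValueError("stage_count must be at least 1")
    return (stage_count - 1) * heartbeat_wait_sec
```

For a hypothetical request with 20 stages, idle_time_between_stages(20, 10)
gives 190 seconds of waiting at the ~10 second interval, while a hypothetical
1 second interval would bring that down to 19 seconds.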


Diffs (updated)
-

  ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
  ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
  ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
  ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9 
  ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2 
  ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13 
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e 
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9 
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9 

Diff: https://reviews.apache.org/r/48722/diff/


Testing (updated)
---

Manual testing.

Unit tests succeeded.


Thanks,

Sebastian Toader



Re: Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-15 Thread Robert Levas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/#review137717
---


Ship it!




Ship It!

- Robert Levas


On June 15, 2016, 5:46 a.m., Sebastian Toader wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48722/
> ---
> 
> (Updated June 15, 2016, 5:46 a.m.)
> 
> 
> Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, 
> Sandor Magyari, and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-17248
> https://issues.apache.org/jira/browse/AMBARI-17248
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> The server sends the commands to be executed by ambari-agents in its
> responses to agent heartbeat messages.
> When the server processes a received heartbeat, it checks whether any
> commands are scheduled for that ambari-agent and adds them to the
> heartbeat response.
> The server organises the commands that can be executed in parallel into
> stages. Ambari server ensures that only the commands of a single stage are
> scheduled to be executed by the agent, and starts scheduling the commands
> of the next stage only after all commands of the current stage have
> finished successfully.
> The command statuses received with a heartbeat message are processed in
> HeartBeatProcessor and ActionScheduler asynchronously to the creation of
> the heartbeat response, so when the heartbeat response is created the
> commands for the next stage are not scheduled yet. This means that the
> next commands are sent to the agent only with the next heartbeat.
> Agents currently send a heartbeat to the server on command completion or,
> if there are no commands to be executed, after a timeout of
> self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
> self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
> This means that when the server receives a heartbeat triggered by the
> completion of the last command of the current stage, it sends the commands
> for the next stage only ~10 seconds later, when the next heartbeat is
> received. As a result, agents spend a considerable amount of time idle
> when there are multiple stages to be executed.
> Agents should heartbeat at a higher rate while there are still pending
> stages to be executed.
> 
> 
> Diffs
> -
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
>   ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
>   ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
>   ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 
> 8103872 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java
>  35a37e3 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java
>  1ab7ae9 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java 
> ac0ddd2 
>   ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java 
> bd9de13 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java
>  3d2388e 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java
>  c26e1e9 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java
>  627ade9 
> 
> Diff: https://reviews.apache.org/r/48722/diff/
> 
> 
> Testing
> ---
> 
> Manual testing.
> 
> Unit tests in progress.
> 
> 
> Thanks,
> 
> Sebastian Toader
> 
>



Review Request 48722: Reduce the idle time before first command from next stage is executed on a host

2016-06-15 Thread Sebastian Toader

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48722/
---

Review request for Ambari, Andrew Onischuk, Laszlo Puskas, Robert Levas, Sandor 
Magyari, and Sumit Mohanty.


Bugs: AMBARI-17248
https://issues.apache.org/jira/browse/AMBARI-17248


Repository: ambari


Description
---

The server sends the commands to be executed by ambari-agents in its
responses to agent heartbeat messages.
When the server processes a received heartbeat, it checks whether any
commands are scheduled for that ambari-agent and adds them to the
heartbeat response.
The server organises the commands that can be executed in parallel into
stages. Ambari server ensures that only the commands of a single stage are
scheduled to be executed by the agent, and starts scheduling the commands
of the next stage only after all commands of the current stage have
finished successfully.
The command statuses received with a heartbeat message are processed in
HeartBeatProcessor and ActionScheduler asynchronously to the creation of
the heartbeat response, so when the heartbeat response is created the
commands for the next stage are not scheduled yet. This means that the
next commands are sent to the agent only with the next heartbeat.
Agents currently send a heartbeat to the server on command completion or,
if there are no commands to be executed, after a timeout of
self.netutil.HEARTBEAT_IDDLE_INTERVAL_SEC -
self.netutil.MINIMUM_INTERVAL_BETWEEN_HEARTBEATS, which is ~10 seconds.
This means that when the server receives a heartbeat triggered by the
completion of the last command of the current stage, it sends the commands
for the next stage only ~10 seconds later, when the next heartbeat is
received. As a result, agents spend a considerable amount of time idle
when there are multiple stages to be executed.
Agents should heartbeat at a higher rate while there are still pending
stages to be executed.
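
The proposal in the last sentence could look roughly like the sketch below.
This is only an illustration, not the code in the diff: the two existing
constant names come from the description above, but their values, the
PENDING_STAGE_INTERVAL_SEC constant, and the has_pending_tasks flag are
assumptions made for the example.

```python
# Names taken from the description; the values are assumed for illustration.
HEARTBEAT_IDDLE_INTERVAL_SEC = 10
MINIMUM_INTERVAL_BETWEEN_HEARTBEATS = 0.1

# Hypothetical shorter interval used while stages are still pending.
PENDING_STAGE_INTERVAL_SEC = 1


def heartbeat_wait_timeout(has_pending_tasks):
    """Return how long the agent waits before sending its next heartbeat.

    has_pending_tasks is a hypothetical flag that the server would set in
    the heartbeat response while further stages remain for this host.
    """
    if has_pending_tasks:
        interval = PENDING_STAGE_INTERVAL_SEC
    else:
        interval = HEARTBEAT_IDDLE_INTERVAL_SEC
    return interval - MINIMUM_INTERVAL_BETWEEN_HEARTBEATS
```

With these assumed values an idle agent still waits just under 10 seconds,
while an agent with pending stages asks again after just under 1 second.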


Diffs
-

  ambari-agent/src/main/python/ambari_agent/Controller.py e981a76 
  ambari-agent/src/main/python/ambari_agent/NetUtil.py 80bf3ae 
  ambari-agent/src/test/python/ambari_agent/TestNetUtil.py d72e319 
  ambari-agent/src/test/python/ambari_agent/examples/ControllerTester.py 8103872 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 35a37e3 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 1ab7ae9 
  ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java ac0ddd2 
  ambari-server/src/main/java/org/apache/ambari/server/state/Clusters.java bd9de13 
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 3d2388e 
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java c26e1e9 
  ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterImplTest.java 627ade9 

Diff: https://reviews.apache.org/r/48722/diff/


Testing
---

Manual testing.

Unit tests in progress.


Thanks,

Sebastian Toader