-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27754/
-----------------------------------------------------------
(Updated Nov. 7, 2014, 11:42 p.m.)
Review request for Ambari, Mahadev Konar, Sumit Mohanty, Sid Wagle, and Yusaku
Sako.
Bugs: AMBARI-8220
https://issues.apache.org/jira/browse/AMBARI-8220
Repository: ambari
Description
-------
Installs very often fail with a timeout while installing the hadoop_2_2*
packages, which can take 8-12 mins.
Each service has a metainfo.xml file that defines the timeout for each
component, and that timeout applies to all types of actions (e.g., INSTALL,
START, CONFIGURE, STOP).
Ambari doesn't currently have a mechanism to set a different timeout just for
the INSTALL operation, so instead, the server-side Java code can do the
following:
1. Get the default agent timeout from the ambari.properties file (which will be
   increased from 10 mins to 15 mins).
2. Get the service component's timeout if it exists. If the operation is an
   INSTALL and the service component timeout is less than the default timeout,
   then use the default timeout.
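The selection rule described above can be sketched as follows. This is only an
illustration in Python, not the Java code in the patch: the function name, its
parameters, and the fallback to the agent default when a component defines no
timeout are assumptions made for the example.

```python
# Illustrative sketch only; the real logic lives in the server-side Java code.
# DEFAULT_AGENT_TIMEOUT mirrors agent.task.timeout after this change (seconds).
DEFAULT_AGENT_TIMEOUT = 900


def resolve_command_timeout(role_command, component_timeout,
                            agent_default=DEFAULT_AGENT_TIMEOUT):
    """Return the timeout, in seconds, to use for one command.

    role_command      -- action type, e.g. "INSTALL" or "START"
    component_timeout -- timeout from metainfo.xml, or None if not defined
    agent_default     -- default agent timeout from ambari.properties
    """
    if component_timeout is None:
        # Assumption: with no component timeout, fall back to the default.
        return agent_default
    if role_command == "INSTALL" and component_timeout < agent_default:
        # INSTALL never gets less than the agent default.
        return agent_default
    return component_timeout
```

With a default of 900, an INSTALL with a component timeout of 642 resolves to
900, while a component timeout of 1042 is kept as is.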
Diffs
-----
ambari-agent/src/main/python/ambari_agent/PythonExecutor.py 874b70b
ambari-server/conf/unix/ambari.properties 8563cf2
ambari-server/src/main/java/org/apache/ambari/server/configuration/Configuration.java
a0d5b39
ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java
4f69dbb
Diff: https://reviews.apache.org/r/27754/diff/
Testing
-------
----------------------------------------------------------------------
Total run:693
Total errors:0
Total failures:0
OK
Created an HDP 2.2 cluster with just HDFS and ZK, and then changed the timeouts
as follows:
yes | cp /vagrant/ambari/ambari-agent/src/main/python/ambari_agent/PythonExecutor.py \
    /usr/lib/python2.6/site-packages/ambari_server/PythonExecutor.py
yes | cp /vagrant/ambari/ambari-agent/src/main/python/ambari_agent/PythonExecutor.py \
    /usr/lib/python2.6/site-packages/ambari_agent/PythonExecutor.py
yes | cp /vagrant/ambari/ambari-server/target/ambari-server-*.jar \
    /usr/lib/ambari-server/ambari-server-*.jar
Edited /etc/ambari-server/conf/ambari.properties and changed the
agent.task.timeout value from 600 to 900.
Then modified the ResourceManager and NodeManager timeouts in
/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/metainfo.xml as
follows:
ResourceManager <timeout>642</timeout>
NodeManager <timeout>1042</timeout>
Then ran ambari-server restart
After adding the YARN services, the command-*.json files contained:
"commandParams": {
    "command_timeout": "1042",
    "script": "scripts/nodemanager.py",
    "script_type": "PYTHON",
    "service_package_folder": "HDP/2.0.6/services/YARN/package",
    "hooks_folder": "HDP/2.0.6/hooks"
},
"commandParams": {
    "command_timeout": "900",
    "script": "scripts/resourcemanager.py",
    "script_type": "PYTHON",
    "service_package_folder": "HDP/2.0.6/services/YARN/package",
    "hooks_folder": "HDP/2.0.6/hooks"
},
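Notice that the ResourceManager's configured timeout (642) was less than the
agent default (900), so it was increased to 900, while the NodeManager's
timeout (1042) was already above the default and was left unchanged.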
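For repeating this check, a small helper along these lines could collect the
command_timeout values from the command-*.json files. It is a sketch, not part
of the patch: the function name is hypothetical, and the directory holding the
files must be supplied by the caller since it is not stated here. It relies
only on the "commandParams" keys shown above.

```python
import glob
import json
import os


def collect_command_timeouts(data_dir):
    """Map each command-*.json file name to (script, command_timeout)."""
    results = {}
    pattern = os.path.join(data_dir, "command-*.json")
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            command = json.load(handle)
        # Files without a commandParams section yield (None, None).
        params = command.get("commandParams", {})
        results[os.path.basename(path)] = (
            params.get("script"),
            params.get("command_timeout"),
        )
    return results
```

Calling collect_command_timeouts() with the directory that contains the
command-*.json files returns one entry per file, so the ResourceManager and
NodeManager values can be compared side by side.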
Thanks,
Alejandro Fernandez