-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43967/
-----------------------------------------------------------
(Updated Feb. 29, 2016, 2:26 p.m.)
Review request for Ambari, Alejandro Fernandez, Nate Cole, Sumit Mohanty,
Sebastian Toader, and Sid Wagle.
Changes
-------
Addressed comment to overload invalidation method.
Bugs: AMBARI-15173
https://issues.apache.org/jira/browse/AMBARI-15173
Repository: ambari
Description
-------
While performing an upgrade, it's possible for the status of a request/stage
to not match that of its tasks. For example, a task could be {{HOLDING}} while
the request is still {{IN_PROGRESS}}.
I believe that AMBARI-15011 is responsible for this issue. AMBARI-15011
introduced, among other things, a cache for
{{HostRoleCommandStatusSummaryDTO}}, which is an aggregation of the number of
tasks a stage has in each state (PENDING, HOLDING, etc.).
This {{HostRoleCommandStatusSummaryDTO}} is used by {{CalculatedState}} to
calculate a stage's and request's status based on the tasks.
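To make the aggregation concrete, here is a minimal sketch of a per-state task count and a status derived from it. The class, enum, and method names are hypothetical, and the precedence rule is a deliberate simplification, not the actual {{CalculatedState}} logic.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class StatusSummarySketch {
    // Hypothetical subset of task states, for illustration only.
    public enum Status { PENDING, IN_PROGRESS, HOLDING, COMPLETED }

    /** Counts how many tasks are in each state (stand-in for the summary DTO). */
    public static Map<Status, Integer> summarize(List<Status> taskStatuses) {
        Map<Status, Integer> counts = new EnumMap<>(Status.class);
        for (Status s : taskStatuses) {
            counts.merge(s, 1, Integer::sum);
        }
        return counts;
    }

    /**
     * Simplified assumption: any HOLDING task makes the stage HOLDING, then
     * IN_PROGRESS, then PENDING; otherwise the stage is treated as COMPLETED.
     */
    public static Status stageStatus(Map<Status, Integer> counts) {
        if (counts.getOrDefault(Status.HOLDING, 0) > 0) return Status.HOLDING;
        if (counts.getOrDefault(Status.IN_PROGRESS, 0) > 0) return Status.IN_PROGRESS;
        if (counts.getOrDefault(Status.PENDING, 0) > 0) return Status.PENDING;
        return Status.COMPLETED;
    }

    public static void main(String[] args) {
        // Summary built from current task data.
        Map<Status, Integer> fresh = summarize(Arrays.asList(Status.HOLDING));
        // Summary built before the task moved to HOLDING (what a stale cache holds).
        Map<Status, Integer> stale = summarize(Arrays.asList(Status.IN_PROGRESS));
        System.out.println(stageStatus(fresh)); // prints HOLDING
        System.out.println(stageStatus(stale)); // prints IN_PROGRESS
    }
}
```

Because the stage status is derived only from the summary, a stale summary yields a stale stage status even when the task rows themselves are correct.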
The problem is that {{ServerActionExecutor}} moves a task's state to
{{HOLDING}} (reflected in the database correctly), but the cache invalidation
happens inside the uncommitted transaction. This causes stale data to be
re-cached. So, when we go to calculate the request and stage status, we get
{{IN_PROGRESS}} instead of {{HOLDING}}.
{code}
{
  "href": "http://172.22.72.13:8080/api/v1/clusters/cl1/requests/61/stages/1?fields=*,tasks/*",
  "Stage": {
    "cluster_name": "cl1",
    "context": "Stop YARN Queues",
    "display_status": "IN_PROGRESS",
    "end_time": -1,
    "progress_percent": 35,
    "request_id": 61,
    "skippable": true,
    "stage_id": 1,
    "start_time": 1456227329191,
    "status": "IN_PROGRESS"
  },
  "tasks": [
    {
      "href": "http://172.22.72.13:8080/api/v1/clusters/cl1/requests/61/stages/1/tasks/754",
      "Tasks": {
        "attempt_cnt": 1,
        "cluster_name": "cl1",
        "command": "EXECUTE",
        "command_detail": "Before continuing, please stop all YARN queues. If yarn-site's yarn.resourcemanager.work-preserving-recovery.enabled is set to true, then you can skip this step since the clients will retry on their own.",
        "custom_command_name": "org.apache.ambari.server.serveraction.upgrades.ManualStageAction",
        "end_time": -1,
        "error_log": "errors-754.txt",
        "exit_code": 0,
        "host_name": "os-r6-mkqzcs-c10tom21unsecha-6.novalocal",
        "id": 754,
        "output_log": "output-754.txt",
        "request_id": 61,
        "role": "AMBARI_SERVER_ACTION",
        "stage_id": 1,
        "start_time": 1456227329191,
        "status": "HOLDING",
        "stderr": "",
        "stdout": "",
        "structured_out": {}
      }
    }
  ]
}
{code}
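The stale re-caching described above can be reproduced with a small single-threaded simulation. This is only a sketch: the maps standing in for the database, the open transaction, and the cache are hypothetical, and the "concurrent reader" is simulated by an ordinary call placed before the commit. Deferring invalidation until after the commit is shown as one possible remedy, not necessarily the approach taken in this patch.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleCacheSketch {
    // Committed rows, visible to every reader (stand-in for the database).
    static final Map<Long, String> committed = new HashMap<>();
    // Writes made inside the open transaction, invisible until commit.
    static final Map<Long, String> pending = new HashMap<>();
    // Read-through cache keyed by stage id (stand-in for the summary cache).
    static final Map<Long, String> cache = new HashMap<>();

    /** Returns the cached status, loading it from committed data on a miss. */
    static String read(long stageId) {
        return cache.computeIfAbsent(stageId, committed::get);
    }

    /** Runs the scenario and returns the status a reader sees after commit. */
    public static String run(boolean invalidateAfterCommit) {
        committed.clear();
        pending.clear();
        cache.clear();

        committed.put(1L, "IN_PROGRESS");
        read(1L); // warm the cache with the current status

        // The transaction moves the task to HOLDING but has not committed yet.
        pending.put(1L, "HOLDING");
        if (!invalidateAfterCommit) {
            cache.remove(1L); // invalidation inside the uncommitted transaction
        }

        // Another reader arrives before the commit; on a cache miss it
        // reloads the old committed value and re-caches it.
        read(1L);

        // Commit: the pending write becomes visible.
        committed.putAll(pending);
        pending.clear();
        if (invalidateAfterCommit) {
            cache.remove(1L); // invalidation once the new value is visible
        }

        return read(1L);
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints IN_PROGRESS (stale)
        System.out.println(run(true));  // prints HOLDING
    }
}
```

In the first run, the entry is evicted while the database still shows the old value to other readers, so the next read repopulates the cache with {{IN_PROGRESS}} and nothing evicts it again after the commit.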
Diffs (updated)
-----
ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessorImpl.java
003e2e6
ambari-server/src/main/java/org/apache/ambari/server/orm/AmbariJpaLocalTxnInterceptor.java
b5442c2
ambari-server/src/main/java/org/apache/ambari/server/orm/TransactionalLocks.java
1768dd8
ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java
c2ded2f
ambari-server/src/test/java/org/apache/ambari/annotations/LockAreaTest.java
PRE-CREATION
ambari-server/src/test/java/org/apache/ambari/annotations/TransactionalLockInterceptorTest.java
6ebdc0b
ambari-server/src/test/java/org/apache/ambari/annotations/TransactionalLockTest.java
1862088
Diff: https://reviews.apache.org/r/43967/diff/
Testing
-------
Pending unit tests...
Thanks,
Jonathan Hurley