-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43967/
-----------------------------------------------------------
(Updated Feb. 26, 2016, 12:02 p.m.)
Review request for Ambari, Alejandro Fernandez, Nate Cole, Sumit Mohanty,
Sebastian Toader, and Sid Wagle.
Changes
-------
So, my fear turned out to be valid; there is potentially an "outer transaction"
which eclipses the transaction we're trying to lock around.
```
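// Caller: the transaction starts here, before bar() is ever reached.
@Transactional
public void foo() {
  hostRoleCommandDAO.bar();
}

// HostRoleCommandDAO: joins foo()'s open transaction instead of starting its own.
@Transactional
@TransactionalLock
public void bar() {}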
```
Because the foo() method is transactional, a transaction is started before the
method we decorated is called. Yes, we could walk backward and try to find all
invocations and decorate them too with the TransactionalLock, but this approach
is brittle. A future change could easily add a new invocation of bar() from a
transaction and the cache would begin failing again with no obvious reason why.
This new solution builds the locking into the existing transactional
interceptor. If the thread encounters a TransactionalLock anywhere on its way
down the call stack, it acquires the lock but does not release it until the
outer transaction is committed. Here's the workflow:
```
fooInterceptor
  fooTransaction.begin
  fooTransaction.proceed
    mergeInterceptor
      lock
      proceed (no new transaction)
  fooTransaction.commit
  unlock
```
Essentially, any TransactionalLocks are acquired during Joinpoint.proceed()
and only released once the transaction has committed. Because it's the same
thread doing all of the work, re-entrancy is not an issue.
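To make the ordering concrete, here is a minimal single-threaded sketch of the
deferred-unlock idea. It is not the actual interceptor: DeferredUnlockSketch,
invoke(), trace and cacheLock are hypothetical names, and the "transaction" is
only a nesting counter, assuming that just the outermost @Transactional call
begins and commits.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the transactional interceptor; not Ambari's real class. */
final class DeferredUnlockSketch {
  /** Events in the order they happen, so the sequence can be inspected (demo only). */
  static final List<String> trace = new ArrayList<>();

  /** Stand-in for the lock that a @TransactionalLock annotation would name. */
  static final ReentrantLock cacheLock = new ReentrantLock();

  // Per-thread state: transaction nesting depth and locks waiting for the commit.
  private static final ThreadLocal<Integer> depth = ThreadLocal.withInitial(() -> 0);
  private static final ThreadLocal<Deque<ReentrantLock>> heldLocks =
      ThreadLocal.withInitial(ArrayDeque::new);

  /**
   * Simulates intercepting a @Transactional method. A non-null lock means the
   * method also carries @TransactionalLock.
   */
  static void invoke(String name, ReentrantLock lock, Runnable body) {
    boolean outermost = depth.get() == 0;
    depth.set(depth.get() + 1);
    if (outermost) {
      trace.add(name + ".begin");
    }
    try {
      if (lock != null) {
        lock.lock();                 // acquired while proceeding...
        heldLocks.get().push(lock);  // ...but remembered instead of released
        trace.add(name + ".lock");
      }
      body.run();                    // proceed; inner calls join this transaction
      if (outermost) {
        trace.add(name + ".commit");
      }
    } catch (RuntimeException e) {
      if (outermost) {
        trace.add(name + ".rollback");
      }
      throw e;
    } finally {
      depth.set(depth.get() - 1);
      if (outermost) {
        // Only now, after commit or rollback, release everything collected.
        Deque<ReentrantLock> locks = heldLocks.get();
        while (!locks.isEmpty()) {
          locks.pop().unlock();
          trace.add("unlock");
        }
      }
    }
  }

  /** Outer transactional method with no lock of its own. */
  static void foo() {
    invoke("foo", null, DeferredUnlockSketch::bar);
  }

  /** Inner method carrying both annotations. */
  static void bar() {
    invoke("bar", cacheLock, () -> trace.add("bar.body"));
  }

  public static void main(String[] args) {
    foo();
    System.out.println(trace);
  }
}
```

Running main() prints [foo.begin, bar.lock, bar.body, foo.commit, unlock]: the
lock taken inside bar() outlives bar() and is released only after foo's commit.
Because ReentrantLock is re-entrant, the same thread could hit bar() again
inside the same transaction without blocking.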
Bugs: AMBARI-15173
https://issues.apache.org/jira/browse/AMBARI-15173
Repository: ambari
Description
-------
While performing an upgrade, the status of a request/stage may not match that
of its tasks. Essentially, the task could be {{HOLDING}} while the request is
still {{IN_PROGRESS}}.
I believe that AMBARI-15011 is responsible for this issue. AMBARI-15011
introduced, among other things, a cache for the
{{HostRoleCommandStatusSummaryDTO}}, which is an aggregation of the number of
tasks a stage has in each state (PENDING, HOLDING, etc).
This {{HostRoleCommandStatusSummaryDTO}} is used by {{CalculatedState}} to
calculate a stage's and request's status based on the tasks.
The problem is that {{ServerActionExecutor}} is moving a task's state to
{{HOLDING}} (reflected in the database correctly) but the cache invalidation
happens inside the uncommitted transaction. This causes stale data to be
re-cached. So, when we go to calculate the request and stage status, we get
{{IN_PROGRESS}} instead of {{HOLDING}}.
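The interleaving can be replayed deterministically in a few lines. This is only
an illustrative sketch: StaleCacheSketch and its maps are hypothetical, and the
cached summary DTO is simplified to a single status string per task id.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single-threaded replay of the race; none of these names are Ambari's. */
final class StaleCacheSketch {
  /** Committed rows, visible to every reader: task id -> status. */
  static final Map<Long, String> committed = new HashMap<>();
  /** Writes made inside the open transaction, invisible to readers until commit. */
  static final Map<Long, String> uncommitted = new HashMap<>();
  /** Simplified stand-in for the summary cache. */
  static final Map<Long, String> cache = new HashMap<>();

  /** Reader path: serve from the cache, otherwise load the committed value and cache it. */
  static String readStatus(long taskId) {
    return cache.computeIfAbsent(taskId, committed::get);
  }

  /** Writer path: update the row and invalidate the cache before the commit. */
  static void writeInTransaction(long taskId, String status) {
    uncommitted.put(taskId, status);
    cache.remove(taskId);
  }

  /** Makes the writer's changes visible to readers. */
  static void commit() {
    committed.putAll(uncommitted);
    uncommitted.clear();
  }

  /** Replays the problematic order and returns what the cache serves afterwards. */
  static String replay() {
    committed.clear();
    uncommitted.clear();
    cache.clear();
    committed.put(754L, "IN_PROGRESS");

    writeInTransaction(754L, "HOLDING"); // writer: update + invalidate, not committed yet
    readStatus(754L);                    // reader: cache miss, re-caches IN_PROGRESS
    commit();                            // writer: HOLDING reaches the database
    return readStatus(754L);             // reader: still served the stale entry
  }

  public static void main(String[] args) {
    System.out.println("cache=" + replay() + " db=" + committed.get(754L));
  }
}
```

main() prints cache=IN_PROGRESS db=HOLDING, the same mismatch as in the
response below: the database row is correct, but nothing evicts the entry that
was re-cached between the invalidation and the commit.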
{code}
{
  "href": "http://172.22.72.13:8080/api/v1/clusters/cl1/requests/61/stages/1?fields=*,tasks/*",
  "Stage": {
    "cluster_name": "cl1",
    "context": "Stop YARN Queues",
    "display_status": "IN_PROGRESS",
    "end_time": -1,
    "progress_percent": 35,
    "request_id": 61,
    "skippable": true,
    "stage_id": 1,
    "start_time": 1456227329191,
    "status": "IN_PROGRESS"
  },
  "tasks": [
    {
      "href": "http://172.22.72.13:8080/api/v1/clusters/cl1/requests/61/stages/1/tasks/754",
      "Tasks": {
        "attempt_cnt": 1,
        "cluster_name": "cl1",
        "command": "EXECUTE",
        "command_detail": "Before continuing, please stop all YARN queues. If yarn-site's yarn.resourcemanager.work-preserving-recovery.enabled is set to true, then you can skip this step since the clients will retry on their own.",
        "custom_command_name": "org.apache.ambari.server.serveraction.upgrades.ManualStageAction",
        "end_time": -1,
        "error_log": "errors-754.txt",
        "exit_code": 0,
        "host_name": "os-r6-mkqzcs-c10tom21unsecha-6.novalocal",
        "id": 754,
        "output_log": "output-754.txt",
        "request_id": 61,
        "role": "AMBARI_SERVER_ACTION",
        "stage_id": 1,
        "start_time": 1456227329191,
        "status": "HOLDING",
        "stderr": "",
        "stdout": "",
        "structured_out": {}
      }
    }
  ]
}
{code}
Diffs (updated)
-----
ambari-web/app/styles/application.less 3a49d5c
Diff: https://reviews.apache.org/r/43967/diff/
Testing
-------
Pending unit tests...
Thanks,
Jonathan Hurley