[
https://issues.apache.org/jira/browse/AMBARI-24201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525121#comment-16525121
]
Hudson commented on AMBARI-24201:
---------------------------------
SUCCESS: Integrated in Jenkins build Ambari-trunk-Commit #9538 (See
[https://builds.apache.org/job/Ambari-trunk-Commit/9538/])
AMBARI-24201. Command reschedule does not work causing blueprint (aonishuk:
[https://gitbox.apache.org/repos/asf?p=ambari.git&a=commit&h=781b4bfe9879ce56837b913a7ad6db46908bb684])
* (edit) ambari-agent/src/main/python/ambari_agent/ActionQueue.py
> Command reschedule does not work causing blueprint deployments to timeout
> ---------------------------------------------------------------------------
>
> Key: AMBARI-24201
> URL: https://issues.apache.org/jira/browse/AMBARI-24201
> Project: Ambari
> Issue Type: Bug
> Reporter: Andrew Onischuk
> Assignee: Andrew Onischuk
> Priority: Major
> Fix For: 2.7.0
>
> Attachments: AMBARI-24201.patch, AMBARI-24201.patch,
> AMBARI-24201.patch
>
>
> When a stage times out or command delivery fails during a blueprint install,
> the server usually reschedules the running command by sending a cancel
> command along with a repeated execution command.
> The bug is that the agent cancels the very command that needs to be newly
> scheduled.
>
>
> 2018-06-27 01:34:58,105 WARN [agent-message-retry-0] MessageEmitter:255
> - Reschedule execution command emitting, retry: 1, messageId: 19
>
>
>
> ..., u'cancelCommands': [{u'commandType': u'CANCEL_COMMAND',
> u'target_task_id': 145, u'reason': u'Stage timeout'}]}},
> u'requiredConfigTimestamp': 1530060845474}
> INFO 2018-06-27 01:34:58,121 ActionQueue.py:115 - Canceling command with
> taskId = 145
> INFO 2018-06-27 01:34:58,121 ActionQueue.py:134 - Canceling
> EXECUTION_COMMAND for service ZOOKEEPER and role ZOOKEEPER_CLIENT with taskId
> 145
> WARNING 2018-06-27 01:34:58,121 CustomServiceOrchestrator.py:129 - Unable
> to find process associated with taskId = 145
> INFO 2018-06-27 01:34:58,122 ActionQueue.py:103 - Adding
> EXECUTION_COMMAND for role ZOOKEEPER_CLIENT for service ZOOKEEPER of
> cluster_id 2 to the queue.
> INFO 2018-06-27 01:34:58,122 security.py:135 - Event to server at
> /reports/responses (correlation_id=870): {'status': 'OK', 'messageId': '19'}
> INFO 2018-06-27 01:34:58,142 __init__.py:57 - Event from server at /user/
> (correlation_id=870): {u'status': u'OK'}
> INFO 2018-06-27 01:34:59,293 ActionQueue.py:238 - Executing command with
> id = 10-0, taskId = 145 for role = ZOOKEEPER_CLIENT of cluster_id 2.
> INFO 2018-06-27 01:34:59,294 security.py:135 - Event to server at
> /reports/commands_status (correlation_id=871): {'clusters': {u'2':
> [{'status': 'IN_PROGRESS', 'taskId': 145, 'tmpout':
> '/var/lib/ambari-agent/data/output-145.txt', 'roleCommand': u'INSTALL',
> 'structuredOut': '/var/lib/ambari-agent/data/structured-out-145.json',
> 'clusterId': u'2', 'serviceName': u'ZOOKEEPER', 'role': u'ZOOKEEPER_CLIENT',
> 'actionId': u'10-0', 'tmperr': '/var/lib/ambari-agent/data/errors-145.txt'}]}}
> INFO 2018-06-27 01:34:59,295 ActionQueue.py:279 - Command execution
> metadata - taskId = 145, retry enabled = True, max retry duration (sec) =
> 1200, log_output = True
> INFO 2018-06-27 01:34:59,296 ActionQueue.py:285 - Command with taskId =
> 145 canceled
> ERROR 2018-06-27 01:34:59,296 ActionQueue.py:221 - Exception while
> processing EXECUTION_COMMAND command
> Traceback (most recent call last):
> File "/usr/lib/ambari-agent/lib/ambari_agent/ActionQueue.py", line 214,
> in process_command
> self.execute_command(command)
> File "/usr/lib/ambari-agent/lib/ambari_agent/ActionQueue.py", line 354,
> in execute_command
> commandresult['stdout'] += '\n\nCommand completed successfully!\n' if
> status == self.COMPLETED_STATUS else '\n\nCommand failed after ' +
> str(numAttempts) + ' tries\n'
> UnboundLocalError: local variable 'commandresult' referenced before
> assignment
>
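
The sequence in the log above (cancel for taskId 145, then a re-sent
EXECUTION_COMMAND with the same taskId, which is then reported as canceled)
can be reproduced with a small sketch. This is not the committed patch and
not Ambari's ActionQueue: TinyActionQueue, handle_server_message, the
clear_cancel_on_reschedule flag and the "commands" message key are all
hypothetical names chosen for illustration. It assumes the agent remembers
canceled task ids and checks them just before execution, and it shows one
possible remedy, letting a re-sent command supersede an earlier cancel for
the same id. The result dict is initialised before any early exit, which is
one way to avoid an unbound-variable error like the one in the traceback.

```python
from collections import deque


class TinyActionQueue:
    """Hypothetical stand-in for the agent's queue; not Ambari's ActionQueue."""

    def __init__(self, clear_cancel_on_reschedule):
        self.clear_cancel_on_reschedule = clear_cancel_on_reschedule
        self.pending = deque()
        self.canceled_task_ids = set()

    def cancel(self, task_id):
        # Drop any queued copy and remember the id so a later check can stop it.
        self.pending = deque(c for c in self.pending if c["taskId"] != task_id)
        self.canceled_task_ids.add(task_id)

    def put(self, command):
        if self.clear_cancel_on_reschedule:
            # Assumed remedy: a re-sent command supersedes an earlier cancel.
            self.canceled_task_ids.discard(command["taskId"])
        self.pending.append(command)

    def run_next(self):
        command = self.pending.popleft()
        # Initialise the result first so every exit path has something to report.
        result = {"taskId": command["taskId"], "stdout": "", "status": "FAILED"}
        if command["taskId"] in self.canceled_task_ids:
            result["stdout"] += "Command canceled before execution\n"
            return result
        result["status"] = "COMPLETED"
        result["stdout"] += "Command completed successfully!\n"
        return result


def handle_server_message(queue, message):
    # Same order as in the log: cancels are processed before the new commands.
    for cancel in message.get("cancelCommands", []):
        queue.cancel(cancel["target_task_id"])
    for command in message.get("commands", []):  # "commands" key is assumed
        queue.put(command)


message = {
    "cancelCommands": [{"target_task_id": 145, "reason": "Stage timeout"}],
    "commands": [{"taskId": 145, "role": "ZOOKEEPER_CLIENT"}],
}

buggy = TinyActionQueue(clear_cancel_on_reschedule=False)
handle_server_message(buggy, message)
buggy_result = buggy.run_next()

fixed = TinyActionQueue(clear_cancel_on_reschedule=True)
handle_server_message(fixed, message)
fixed_result = fixed.run_next()

print(buggy_result["status"], fixed_result["status"])
```

With the flag off, the rescheduled command is skipped because its id is
still marked as canceled; with the flag on, it runs. Whether the real fix
clears the mark on enqueue or handles the ordering some other way is not
stated in this message.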