[
https://issues.apache.org/jira/browse/AMBARI-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Myroslav Papirkovskyi reassigned AMBARI-19930:
----------------------------------------------
Assignee: Myroslav Papirkovskyi
> The service check status was set to TIMEOUT even if service check was failed
> ----------------------------------------------------------------------------
>
> Key: AMBARI-19930
> URL: https://issues.apache.org/jira/browse/AMBARI-19930
> Project: Ambari
> Issue Type: Bug
> Reporter: Yesha Vora
> Assignee: Myroslav Papirkovskyi
>
> Steps to reproduce:
> * Install a cluster with Hadoop, Tez, HBase, Hive, Spark
> * Enable wire encryption
> * Run Tez service check
> Here, agent.service.check.task.timeout is set to 600 seconds. The Tez
> application was started in the background. The service check then looks for
> the SUCCESS file for only a couple of minutes. In this particular instance,
> the application took 5 minutes to run, so the check for the SUCCESS file on
> HDFS failed.
> In this scenario, the service check status should be FAILED rather than
> TIMEDOUT.
> {code}
> stderr: /var/lib/ambari-agent/data/errors-370.txt
> stdout: /var/lib/ambari-agent/data/output-370.txt
> 2017-02-08 03:55:55,017 -
> HdfsResource['/hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz'] {'security_enabled':
> True, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab':
> '/etc/security/keytabs/hdfs.headless.keytab', 'source':
> '/usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz', 'dfs_type': '', 'default_fs':
> 'hdfs://host:8020', 'replace_existing_files': False,
> 'hdfs_resource_ignore_file':
> '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ...,
> 'kinit_path_local': '/usr/bin/kinit', 'principal_name': '[email protected]',
> 'user': 'hdfs', 'owner': 'hdfs', 'group': 'hadoop', 'hadoop_conf_dir':
> '/usr/hdp/current/hadoop-client/conf', 'type': 'file', 'action':
> ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse',
> u'/mr-history/done', u'/app-logs', u'/tmp'], 'mode': 0444}
> 2017-02-08 03:55:55,017 - Execute['/usr/bin/kinit -kt
> /etc/security/keytabs/hdfs.headless.keytab [email protected]'] {'user': 'hdfs'}
> 2017-02-08 03:55:55,096 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c
> 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET --negotiate -u : -k
> '"'"'https://host:50470/webhdfs/v1/hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz?op=GETFILESTATUS&user.name=hdfs'"'"'
> 1>/tmp/tmpoIadeN 2>/tmp/tmp6nFiLj''] {'logoutput': None, 'quiet': False}
> 2017-02-08 03:55:55,292 - call returned (0, '')
> 2017-02-08 03:55:55,293 - DFS file /hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz is
> identical to /usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz, skipping the copying
> 2017-02-08 03:55:55,293 - Will attempt to copy tez tarball from
> /usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz to DFS at
> /hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz.
> 2017-02-08 03:55:55,293 - HdfsResource[None] {'security_enabled': True,
> 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab':
> '/etc/security/keytabs/hdfs.headless.keytab', 'dfs_type': '', 'default_fs':
> 'hdfs://host:8020', 'hdfs_resource_ignore_file':
> '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ...,
> 'kinit_path_local': '/usr/bin/kinit', 'principal_name': '[email protected]',
> 'user': 'hdfs', 'action': ['execute'], 'hadoop_conf_dir':
> '/usr/hdp/current/hadoop-client/conf', 'immutable_paths':
> [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp']}
> 2017-02-08 03:55:55,294 - Execute['/usr/bin/kinit -kt
> /etc/security/keytabs/smokeuser.headless.keytab [email protected];']
> {'user': 'ambari-qa'}
> 2017-02-08 03:55:55,389 - ExecuteHadoop['jar
> /usr/hdp/current/tez-client/tez-examples*.jar orderedwordcount
> /tmp/tezsmokeinput/sample-tez-test /tmp/tezsmokeoutput/'] {'try_sleep': 5,
> 'tries': 3, 'bin_dir': '/usr/hdp/current/hadoop-client/bin', 'user':
> 'ambari-qa', 'conf_dir': '/usr/hdp/current/hadoop-client/conf'}
> 2017-02-08 03:55:55,390 - Execute['hadoop --config
> /usr/hdp/current/hadoop-client/conf jar
> /usr/hdp/current/tez-client/tez-examples*.jar orderedwordcount
> /tmp/tezsmokeinput/sample-tez-test /tmp/tezsmokeoutput/'] {'logoutput': None,
> 'try_sleep': 5, 'environment': {}, 'tries': 3, 'user': 'ambari-qa', 'path':
> ['/usr/hdp/current/hadoop-client/bin']}{code}
> {code}
> Requests: {
> aborted_task_count: 0,
> cluster_name: "cl1",
> completed_task_count: 1,
> create_time: 1486526151743,
> end_time: 1486526463038,
> exclusive: false,
> failed_task_count: 0,
> id: 29,
> inputs: "{}",
> operation_level: null,
> progress_percent: 100,
> queued_task_count: 0,
> request_context: "WE API TEZ Service Check",
> request_schedule: null,
> request_status: "TIMEDOUT",
> resource_filters: [
> {
> service_name: "TEZ"
> }
> ],
> start_time: 1486526151751,
> task_count: 1,
> timed_out_task_count: 1,
> type: "COMMAND"
> },{code}
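
For reference, the timestamps in the request above can be checked against the 600 second timeout quoted in the description. The following is only a rough sketch: the start_time, end_time and timeout values are copied from the report, while the expected_status helper and the status strings other than TIMEDOUT are assumptions made for illustration and are not Ambari's actual logic.

```python
# Values copied from the request JSON in the report (epoch milliseconds).
START_TIME_MS = 1486526151751
END_TIME_MS = 1486526463038
# agent.service.check.task.timeout as stated in the report, in seconds.
TASK_TIMEOUT_SEC = 600


def elapsed_seconds(start_ms, end_ms):
    """Convert a pair of epoch-millisecond timestamps to elapsed seconds."""
    return (end_ms - start_ms) / 1000.0


def expected_status(elapsed_sec, timeout_sec, check_succeeded):
    """Hypothetical classification rule, NOT Ambari's implementation.

    A task is treated as timed out only if it ran for at least the
    configured timeout; otherwise an unsuccessful check counts as failed.
    """
    if elapsed_sec >= timeout_sec:
        return "TIMEDOUT"
    return "COMPLETED" if check_succeeded else "FAILED"


elapsed = elapsed_seconds(START_TIME_MS, END_TIME_MS)
status = expected_status(elapsed, TASK_TIMEOUT_SEC, check_succeeded=False)
print("elapsed: %.3f s, expected status: %s" % (elapsed, status))
```

Under these assumptions the request ran for roughly 311 seconds, well short of the 600 second timeout, which is consistent with the reporter's expectation that the status should be FAILED.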
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)