[
https://issues.apache.org/jira/browse/AMBARI-25285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated AMBARI-25285:
------------------------------------
Labels: pull-request-available (was: )
> Ambari always copies and overwrites mapreduce.tar.gz to HDFS when WebHDFS is
> not enabled while restarting HiveServer
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: AMBARI-25285
> URL: https://issues.apache.org/jira/browse/AMBARI-25285
> Project: Ambari
> Issue Type: Bug
> Components: ambari-agent
> Affects Versions: 2.7.3
> Reporter: Akhil S Naik
> Assignee: Akhil S Naik
> Priority: Major
> Labels: pull-request-available
>
> Problem statement:
> When HiveServer2 is restarted, the startup Python script tries to copy
> /usr/hdp/<version>/hadoop/mapreduce.tar.gz to
> /hdp/apps/<version>/mapreduce/mapreduce.tar.gz
> MapReduce jobs fail with an error if the HiveServer2 restart coincides with
> YARN applications moving from the ACCEPTED state to RUNNING while
> mapreduce.tar.gz is being copied.
> When WebHDFS is enabled, this problem does not occur: Ambari skips the copy
> and logs the line below.
> {code:java}
> 2019-05-23 10:11:18,371 - DFS file
> /hdp/apps/2.6.5.0-292/mapreduce/mapreduce.tar.gz is identical to
> /usr/hdp/2.6.5.0-292/hadoop/mapreduce.tar.gz, skipping the copying
> {code}
> When WebHDFS is disabled in the cluster, the above line is not printed when
> HiveServer2 starts. Instead, Ambari overwrites mapreduce.tar.gz
> unconditionally, without checking whether the file has changed.
> Analysis:
> The issue appears to be in this part of the code:
> https://github.com/apache/ambari/blob/4eee0f56d2fbfdfb0caace955339bc0c46a85a3c/contrib/fast-hdfs-resource/src/main/java/org/apache/ambari/fast_hdfs_resource/Runner.java#L131
> https://github.com/apache/ambari/blob/4eee0f56d2fbfdfb0caace955339bc0c46a85a3c/contrib/fast-hdfs-resource/src/main/java/org/apache/ambari/fast_hdfs_resource/Resource.java#L236
> We just create the file, overwriting it if it exists.
> We should check whether the file already exists before this copy operation
> and skip the copy if the file is identical.
> This would shorten HiveServer2 startup and avoid the abnormal failures of
> MapReduce jobs.
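
The check proposed above could look roughly like the sketch below. It is not the actual Ambari patch: it uses the local filesystem through java.nio.file as a stand-in for HDFS, and the class name CopyIfChanged and the method copyIfDifferent are hypothetical names chosen for illustration. A real fix in fast-hdfs-resource would presumably make the same comparison through the Hadoop FileSystem API instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper; local files stand in for HDFS paths in this sketch.
public class CopyIfChanged {

    // Hash the file by streaming it, so a large tarball is never held in memory.
    static byte[] sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                digest.update(buf, 0, n);
            }
        }
        return digest.digest();
    }

    // Copies source to target only when target is missing or its content differs.
    // Returns true if a copy was performed, false if it was skipped.
    static boolean copyIfDifferent(Path source, Path target)
            throws IOException, NoSuchAlgorithmException {
        if (Files.exists(target)
                && Files.size(source) == Files.size(target)
                && Arrays.equals(sha256(source), sha256(target))) {
            return false; // identical content: leave the existing file untouched
        }
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        return true;
    }
}
```

The size comparison runs first because it is cheap; the two files are hashed only when their sizes match. Skipping the copy for an identical file means jobs that are localizing the tarball at that moment never see it being rewritten.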