[jira] [Updated] (AMBARI-25285) Ambari always copies and overwrites mapreduce.tar.gz to hdfs when WebHDFS is not enabled while restarting HiveServer

ASF GitHub Bot (JIRA) Fri, 24 May 2019 04:19:26 -0700


     [ 
https://issues.apache.org/jira/browse/AMBARI-25285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


ASF GitHub Bot updated AMBARI-25285:
------------------------------------
    Labels: pull-request-available  (was: )

> Ambari always copies and overwrites mapreduce.tar.gz to hdfs when WebHDFS is 
> not enabled while restarting HiveServer
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-25285
>                 URL: https://issues.apache.org/jira/browse/AMBARI-25285
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-agent
>    Affects Versions: 2.7.3
>            Reporter: Akhil S Naik
>            Assignee: Akhil S Naik
>            Priority: Major
>              Labels: pull-request-available
>
> Problem statement:
> When HiveServer2 is restarted, the startup Python script tries to copy 
> /usr/hdp/<version>/hadoop/mapreduce.tar.gz to 
> /hdp/apps/<version>/mapreduce/mapreduce.tar.gz.
> MapReduce jobs fail with an error if the HiveServer2 restart coincides with 
> YARN applications moving from ACCEPTED to RUNNING, that is, at the exact 
> moment the mapreduce.tar.gz file is being copied.
> When WebHDFS is enabled, this problem never occurs, because Ambari skips the 
> copy and logs the line below.
> {code:java}
> 2019-05-23 10:11:18,371 - DFS file 
> /hdp/apps/2.6.5.0-292/mapreduce/mapreduce.tar.gz is identical to 
> /usr/hdp/2.6.5.0-292/hadoop/mapreduce.tar.gz, skipping the copying
> {code}
> When WebHDFS is disabled in the cluster, the above line is not printed 
> when starting HiveServer2.
> Instead, Ambari overwrites mapreduce.tar.gz unconditionally, without 
> checking whether the file has changed.
> Analysis:
> The issue appears to be in this part of the code:
> https://github.com/apache/ambari/blob/4eee0f56d2fbfdfb0caace955339bc0c46a85a3c/contrib/fast-hdfs-resource/src/main/java/org/apache/ambari/fast_hdfs_resource/Runner.java#L131
> https://github.com/apache/ambari/blob/4eee0f56d2fbfdfb0caace955339bc0c46a85a3c/contrib/fast-hdfs-resource/src/main/java/org/apache/ambari/fast_hdfs_resource/Resource.java#L236
> The code simply creates the file, overwriting it if it already exists.
> Before this copy operation, we should check whether the file already exists 
> and skip the copy if the file is identical.
> This will shorten HiveServer2 startup time and also avoid the abnormal 
> failure of MapReduce jobs.
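
The proposed check (skip the copy when the target already exists and is identical) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Ambari or fast-hdfs-resource code: it uses the local filesystem as a stand-in for HDFS, and the function names copy_if_changed and file_digest are hypothetical. A real fix would need the equivalent existence and content comparison through the HDFS client.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_if_changed(source: Path, target: Path) -> bool:
    """Copy source to target unless target already has identical content.

    Hypothetical helper: local files stand in for HDFS paths here.
    Returns True if a copy was performed, False if it was skipped.
    """
    if target.is_file():
        # Cheap check first: different sizes mean different content.
        same_size = source.stat().st_size == target.stat().st_size
        if same_size and file_digest(source) == file_digest(target):
            return False  # identical, so skip the copy
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return True
```

With a check like this, a restart that finds an unchanged archive never rewrites the target, so a job that is reading the file at that moment is not affected.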



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
