[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Fang updated MAPREDUCE-6258:
---------------------------------
    Status: Patch Available  (was: Open)

> add support to back up JHS files from application master
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-6258
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6258
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: applicationmaster
>    Affects Versions: 2.4.1
>            Reporter: Jian Fang
>         Attachments: MAPREDUCE-6258.patch
>
>
> In Hadoop 2, job history files are stored on HDFS with a default retention 
> period of one week. In a cloud environment, these HDFS files actually reside 
> on the disks of ephemeral instances and are lost once the instances are 
> terminated. Users may want to back up the job history files so that they can 
> investigate issues and analyze performance both while the cluster is running 
> and after it is terminated. 
> A centralized backup mechanism could have scalability issues on large, busy 
> Hadoop clusters that may run tens of thousands of jobs every day. A 
> distributed way to back up the job history files is therefore preferable in 
> this case. To achieve this, we could add a new feature that backs up the job 
> history files in the application master. More specifically, we could copy 
> the job history files to a backup path when they are moved from the 
> temporary staging directory to the intermediate_done path in the application 
> master. Since application masters can run on any slave node of a Hadoop 
> cluster, backing up the job history files in this distributed fashion should 
> scale better.
> Please be aware that the backup path should be managed by Hadoop users based 
> on their needs. For example, some users may copy the job history files 
> directly to cloud storage and keep them there forever, while others may 
> prefer to store them on local disks and clean them up from time to time.
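
The copy-on-move step described above could look roughly like the sketch
below. It is not taken from MAPREDUCE-6258.patch and does not use the Hadoop
FileSystem API. It uses java.nio.file on a local filesystem as a stand-in, and
the class name, method name, and parameters are all hypothetical. Treating a
backup failure as non-fatal is also an assumption, not something the issue
specifies.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/**
 * Hypothetical sketch only: a local-filesystem stand-in for the step in which
 * the application master moves a job history file out of the staging directory.
 */
final class HistoryBackupSketch {

    private HistoryBackupSketch() {
    }

    /**
     * Copies stagingFile into backupDir (if one is configured) and then moves
     * it into intermediateDoneDir. Returns the path of the moved file.
     */
    static Path moveToDoneWithBackup(Path stagingFile,
                                     Path intermediateDoneDir,
                                     Path backupDir) throws IOException {
        Path fileName = stagingFile.getFileName();

        // A null backupDir means the backup feature is switched off.
        if (backupDir != null) {
            try {
                Files.createDirectories(backupDir);
                Files.copy(stagingFile, backupDir.resolve(fileName),
                        StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException e) {
                // Assumption: the backup is best effort, so a failed copy is
                // logged and the normal move still proceeds.
                System.err.println("Backup of " + fileName + " failed: "
                        + e.getMessage());
            }
        }

        // The regular step: move the file to the intermediate_done directory.
        Files.createDirectories(intermediateDoneDir);
        return Files.move(stagingFile, intermediateDoneDir.resolve(fileName),
                StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Because each application master would run this step for its own job only, the
copy work is spread across the nodes that host application masters instead of
being funneled through one central process.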



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
