[ https://issues.apache.org/jira/browse/HADOOP-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615283#action_12615283 ]

Hemanth Yamijala commented on HADOOP-3245:
------------------------------------------

Using job history seems like a reasonable approach, though there are some concerns:

- We need to determine a good buffer size for writing to the history file. A 
small value could hurt performance because of more frequent flushes. A large 
value could leave many task events unflushed and hence unavailable to the 
JobTracker on restart. We are exploring what an ideal value would be.
- For a large job with typical job history output, we need to make sure the 
time to parse the history and reconstruct state is acceptable.
- We still need something like the SYNC operation described above, because 
events that are written to job history but not yet flushed would be lost to 
the JT on restart. So there needs to be a way to tell the TTs to reset these 
events. However, the number of such events will be much smaller than with the 
currently implemented approach.
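
To make the buffer-size trade-off concrete, here is a minimal sketch in plain 
Java. It is not taken from the patch: the class name, the record format, and 
the helper method are all hypothetical, and it uses java.io rather than the 
actual job history writer. It writes a number of fixed-size event records 
through a buffer of a given size, simulates a crash by closing the underlying 
file without flushing, and counts how many events actually reached the file.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical illustration only; not the JobTracker's history writer.
final class HistoryBufferSketch {

    // Assumed record layout: one fixed-width line per completed task.
    static byte[] record(int taskId) {
        return String.format("TASK_DONE id=%04d\n", taskId)
                .getBytes(StandardCharsets.US_ASCII);
    }

    // Writes `events` records through a buffer of `bufferBytes` bytes,
    // then simulates a crash and returns the number of complete records
    // that are readable from the file afterwards.
    static int eventsOnDiskAfterCrash(int bufferBytes, int events) {
        try {
            File file = File.createTempFile("job-history", ".log");
            try {
                FileOutputStream raw = new FileOutputStream(file);
                BufferedOutputStream out =
                        new BufferedOutputStream(raw, bufferBytes);
                for (int i = 0; i < events; i++) {
                    out.write(record(i));
                }
                // Simulated crash: close the raw file without flushing
                // the buffer, so buffered records never reach the file.
                raw.close();

                int onDisk = 0;
                for (byte b : Files.readAllBytes(file.toPath())) {
                    if (b == '\n') {
                        onDisk++;
                    }
                }
                return onDisk;
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int events = 10;
        for (int bufferBytes : new int[] {18, 64, 1024}) {
            int onDisk = eventsOnDiskAfterCrash(bufferBytes, events);
            System.out.println("buffer=" + bufferBytes
                    + " recovered=" + onDisk
                    + " lost=" + (events - onDisk));
        }
    }
}
```

The events counted as lost here are the ones a restarted JT would not see in 
the history file, that is, the ones something like SYNC would have to ask the 
TTs to report again. A smaller buffer shrinks that set at the cost of more 
writes to the underlying file.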

We're running some tests related to the first two points and will then discuss 
the results.

Keeping completed task state in RAM is not introduced by this patch. If it is 
an issue, I would recommend addressing it in a separate JIRA.

> Provide ability to persist running jobs (extend HADOOP-1876)
> ------------------------------------------------------------
>
>                 Key: HADOOP-3245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3245
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Amar Kamat
>         Attachments: HADOOP-3245-v2.5.patch, HADOOP-3245-v2.6.5.patch, 
> HADOOP-3245-v2.6.9.patch, HADOOP-3245-v4.1.patch
>
>
> This could probably extend the work done in HADOOP-1876. This feature can be 
> applied for things like jobs being able to survive jobtracker restarts.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
