[
https://issues.apache.org/jira/browse/HBASE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838423#comment-13838423
]
Nick Dimiduk commented on HBASE-9485:
-------------------------------------
Thanks for the details, [~vinodkv]. The patch is shockingly small ;)
Indeed, HBase advocates the use of idempotent operations when writing data from
MR jobs (and disabling speculative execution). If a task fails part-way through,
the user is left to deal with the partially written results (which is why we
advocate bulk load for some scenarios).
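
To illustrate why idempotent writes matter when a task fails part-way and is
retried, here is a minimal sketch. It is a toy model: a HashMap stands in for a
table, and the class and method names are made up for illustration, not taken
from the HBase client API. The first attempt writes one record and dies; the
retry replays the whole input.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: a HashMap stands in for a table. Not the HBase client API.
class IdempotentWriteSketch {

    // Idempotent write: store an absolute value under each row key.
    // Replaying the same records leaves the table in the same state.
    static void putRecords(Map<String, Long> table, String[] rows, long[] values, int limit) {
        for (int i = 0; i < limit; i++) {
            table.put(rows[i], values[i]);
        }
    }

    // Non-idempotent write: add a delta to whatever is already stored.
    // Replaying the same records counts them twice.
    static void incrementRecords(Map<String, Long> table, String[] rows, long[] deltas, int limit) {
        for (int i = 0; i < limit; i++) {
            table.merge(rows[i], deltas[i], Long::sum);
        }
    }

    public static void main(String[] args) {
        String[] rows = {"row1", "row2"};
        long[] values = {5L, 7L};

        Map<String, Long> puts = new HashMap<>();
        putRecords(puts, rows, values, 1); // first attempt fails after one record
        putRecords(puts, rows, values, 2); // retry replays the whole input
        System.out.println("put:       row1=" + puts.get("row1") + " row2=" + puts.get("row2"));

        Map<String, Long> incs = new HashMap<>();
        incrementRecords(incs, rows, values, 1); // same failure pattern
        incrementRecords(incs, rows, values, 2);
        System.out.println("increment: row1=" + incs.get("row1") + " row2=" + incs.get("row2"));
    }
}
```

With absolute writes the retry converges on the intended state; with
increments the record written before the failure is counted twice.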
> TableOutputCommitter should implement recovery if we don't want jobs to start
> from 0 on RM restart
> --------------------------------------------------------------------------------------------------
>
> Key: HBASE-9485
> URL: https://issues.apache.org/jira/browse/HBASE-9485
> Project: HBase
> Issue Type: Bug
> Components: mapreduce
> Reporter: Ted Yu
> Assignee: Ted Yu
> Attachments: 9485-v2.txt
>
>
> HBase extends OutputCommitter, which turns recovery off, meaning all completed
> maps are lost on RM restart and the job starts from scratch. FileOutputCommitter
> implements recovery, so we should look at that to see what is potentially
> needed for recovery.
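
The effect described in the issue can be sketched with a small model. The
classes below are stand-ins written for this example: they mirror the shape of
the committer contract (an isRecoverySupported() check that is off by default,
plus a per-task recoverTask hook) but are not the Hadoop or HBase classes, and
the restart logic is an assumption about how completed tasks are treated, not
the actual application master code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for the committer contract; not the Hadoop OutputCommitter class.
abstract class CommitterModel {
    // Recovery is off unless a subclass opts in.
    boolean isRecoverySupported() {
        return false;
    }

    // Hook called for each task that had completed before the restart.
    void recoverTask(String taskId) {
        // default: nothing to do
    }
}

// Models a committer that inherits the default (no recovery).
class NonRecoverableCommitterModel extends CommitterModel {
}

// Hypothetical committer for direct table writes: nothing is staged on disk,
// so opting in to recovery is assumed to need no extra work per task.
class RecoverableTableCommitterModel extends CommitterModel {
    @Override
    boolean isRecoverySupported() {
        return true;
    }
}

class RestartSketch {
    // Returns the tasks that must run again after a simulated restart.
    static List<String> tasksToRerun(CommitterModel committer,
                                     List<String> allTasks,
                                     List<String> completed) {
        List<String> rerun = new ArrayList<>();
        for (String task : allTasks) {
            if (committer.isRecoverySupported() && completed.contains(task)) {
                committer.recoverTask(task); // completed work is kept
            } else {
                rerun.add(task); // lost or never finished: run it again
            }
        }
        return rerun;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("m0", "m1", "m2");
        List<String> done = Arrays.asList("m0", "m1");
        System.out.println(tasksToRerun(new NonRecoverableCommitterModel(), all, done));
        System.out.println(tasksToRerun(new RecoverableTableCommitterModel(), all, done));
    }
}
```

In this model the non-recoverable committer reruns every map after a restart,
while the recoverable one reruns only the map that had not finished.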