[
https://issues.apache.org/jira/browse/MAPREDUCE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated MAPREDUCE-5652:
----------------------------------
Attachment: MAPREDUCE-5652-v7.patch
bq. 1. Does leveldb's delete method throw an exception? JNI has some exception
handling and the caller needs to retrieve the exceptions, etc.
Nice catch! I didn't notice there were _two_ DBExceptions flying around in
leveldb code. org.fusesource.leveldbjni.internal.NativeDB.DBException comes
from the JNI layer and derives from IOException, and it was the one I was
familiar with. However the wrapper code around the JNI layer catches that
exception and rethrows it as org.iq80.leveldb.DBException which is a
RuntimeException. That means we need to wrap every call that can throw the
runtime DBException and either handle it directly or rethrow it as an
IOException when it's not appropriate to let the RuntimeException leak out of
the method.
Updated the patch to deal with the runtime DBException when necessary. I'll
also have to make similar changes in the NMLevelDBStateStore for the other NM
restart patches.
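To illustrate the wrapping pattern described above, here is a minimal, self-contained sketch. It is not taken from the patch: the nested DBException class is only a stand-in for org.iq80.leveldb.DBException (so the example compiles without leveldb on the classpath), and KeyValueDb and removeKey are hypothetical names invented for the example.

```java
import java.io.IOException;

public class DbExceptionWrapSketch {
    // Stand-in for org.iq80.leveldb.DBException, which is unchecked.
    // Assumption: only the "extends RuntimeException" property matters here.
    static class DBException extends RuntimeException {
        DBException(String msg) {
            super(msg);
        }
    }

    // Hypothetical minimal view of the database; not the real leveldb API.
    interface KeyValueDb {
        void delete(byte[] key) throws DBException;
    }

    // Converts the unchecked DBException into a checked IOException so that
    // callers which already handle IOException also see database failures.
    static void removeKey(KeyValueDb db, byte[] key) throws IOException {
        try {
            db.delete(key);
        } catch (DBException e) {
            throw new IOException("Unable to delete key from state store", e);
        }
    }

    public static void main(String[] args) {
        // A database stub whose delete always fails with the runtime exception.
        KeyValueDb failing = key -> {
            throw new DBException("simulated failure");
        };
        try {
            removeKey(failing, new byte[] {1});
            System.out.println("no exception");
        } catch (IOException e) {
            System.out.println("caught IOException, cause: "
                + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Keeping the original exception as the cause preserves the leveldb stack trace for debugging while keeping the method's declared contract to IOException.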
bq. 2. It seems like recover/restore are common in NM/RM restart. Any abstract
interface defined for that?
They both support recovery but the forms in which they do it are very different
(e.g.: types of state persisted are significantly different, backing store
types have no overlap, etc.) There could be a generic Recoverable interface
that supports a recover() method, but I'm not sure what value that adds. Did
you have a particular interface in mind or ideas on how it would be used?
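For concreteness, the generic interface mentioned above might look something like the sketch below. Everything in it is hypothetical: Recoverable, the two stub classes and recoverAll are invented for illustration and are not existing Hadoop APIs. The only thing the two implementations share is the method name, which is the source of my doubt about its value.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RecoverableSketch {
    // Hypothetical common interface; nothing beyond recover() is shared.
    interface Recoverable {
        void recover() throws IOException;
    }

    // Stub standing in for NM-side state (e.g. shuffle state in leveldb).
    static class NmStateStub implements Recoverable {
        boolean recovered;

        @Override
        public void recover() {
            recovered = true;
        }
    }

    // Stub standing in for RM-side state with a different backing store.
    static class RmStateStub implements Recoverable {
        boolean recovered;

        @Override
        public void recover() {
            recovered = true;
        }
    }

    // The one thing a shared interface would enable: uniform invocation.
    static int recoverAll(List<Recoverable> components) throws IOException {
        int count = 0;
        for (Recoverable r : components) {
            r.recover();
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        List<Recoverable> components = new ArrayList<>();
        components.add(new NmStateStub());
        components.add(new RmStateStub());
        System.out.println("recovered components: " + recoverAll(components));
    }
}
```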
> NM Recovery. ShuffleHandler should handle NM restarts
> -----------------------------------------------------
>
> Key: MAPREDUCE-5652
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5652
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Karthik Kambatla
> Assignee: Jason Lowe
> Labels: shuffle
> Attachments: MAPREDUCE-5652-v2.patch, MAPREDUCE-5652-v3.patch,
> MAPREDUCE-5652-v4.patch, MAPREDUCE-5652-v5.patch, MAPREDUCE-5652-v6.patch,
> MAPREDUCE-5652-v7.patch, MAPREDUCE-5652.patch
>
>
> ShuffleHandler should work across NM restarts without requiring map tasks
> to be re-run. Currently, an NM restart cleans up the map outputs, which
> forces re-execution of the map tasks; this should be avoided.