[ 
https://issues.apache.org/jira/browse/HADOOP-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12690162#action_12690162
 ] 

Konstantin Shvachko commented on HADOOP-4045:
---------------------------------------------

# In {{FSImage.setCheckpointTime()}}, the variable {{al}} is not used.
# {{processIOError(ArrayList<StorageDirectory> sds)}} may be eliminated.
# I would also get rid of
{{processIOError(ArrayList<EditLogOutputStream> errorStreams)}}.
The point is that it is better to have only one processIOError in each class;
otherwise it can get as bad as it is now, with all the different variants of it.
If you think that is a lot of changes, then let's at least make both of them
private.
# Do we want to make {{removedStorageDirs}} a map in order to avoid adding the
same directory twice into it, or does that never happen?
# Same with {{Storage.storageDirs}}. If we search in a collection, then we
might want to use a searchable collection. This may be done in a separate issue.
# It's somewhat confusing: {{FSImage.processIOError()}} calls
{{editLog.processIOError()}}, and then {{FSEditLog.processIOError()}} calls
{{fsimage.processIOError()}}. Is it going to converge at some point?
# {{setCheckpointTime()}} ignores IO errors. Just mentioning this; I don't see
how to avoid it. Failed streams/directories will be removed the next time
{{flushAndSync()}} is called.
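
To make the questions about duplicate entries and convergence concrete, here is
a minimal sketch, not the actual Hadoop code. The classes {{ImageSketch}} and
{{EditLogSketch}}, the use of plain path strings for directories, and the
{{propagate}} flag are all assumptions made for illustration. It shows one way
a single processIOError per class could notify its peer exactly once, with
removed directories kept in a set so the same directory cannot be recorded twice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for FSImage; directories are plain path strings here.
class ImageSketch {
    final List<String> storageDirs = new ArrayList<>();
    // A set ignores a second add of the same directory, so no duplicates.
    final Set<String> removedStorageDirs = new LinkedHashSet<>();
    EditLogSketch editLog;

    // 'propagate' is an assumed flag: only the first caller notifies its peer.
    void processIOError(String dir, boolean propagate) {
        storageDirs.remove(dir);
        removedStorageDirs.add(dir);
        if (propagate) {
            editLog.processIOError(dir, false);
        }
    }
}

// Hypothetical stand-in for FSEditLog; one edit stream per directory.
class EditLogSketch {
    final List<String> streams = new ArrayList<>();
    ImageSketch image;

    void processIOError(String dir, boolean propagate) {
        streams.remove(dir);
        if (propagate) {
            image.processIOError(dir, false);
        }
    }
}

class ProcessIOErrorSketch {
    // Wires the two objects together over the same set of directories.
    static ImageSketch build(String... dirs) {
        ImageSketch image = new ImageSketch();
        EditLogSketch log = new EditLogSketch();
        image.editLog = log;
        log.image = image;
        image.storageDirs.addAll(Arrays.asList(dirs));
        log.streams.addAll(Arrays.asList(dirs));
        return image;
    }

    public static void main(String[] args) {
        ImageSketch image = build("/data/1", "/data/2");
        image.processIOError("/data/1", true);
        image.processIOError("/data/1", true); // reported twice, recorded once
        System.out.println("removed: " + image.removedStorageDirs);
        System.out.println("active:  " + image.storageDirs);
    }
}
```

With a flag like this, each call makes at most one hop to the other class, so
the mutual calls terminate regardless of which side detects the error first.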

> Increment checkpoint if we see failures in rollEdits
> ----------------------------------------------------
>
>                 Key: HADOOP-4045
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4045
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Lohit Vijayarenu
>            Assignee: Boris Shkolnik
>            Priority: Critical
>             Fix For: 0.19.2
>
>         Attachments: HADOOP-4045-1.patch, HADOOP-4045.patch
>
>
> In _FSEditLog::rollEdits_, if we encounter an error while opening edits.new,
> we remove the storage directory associated with it. At this point we should
> also increment the checkpoint on all other directories.
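
The behaviour requested in the description can be sketched roughly as follows.
This is an illustration under stated assumptions, not the patch itself: the
{{Dir}} class, its {{failOnOpen}} switch, and the {{rollEdits}} signature are
hypothetical, and "increment checkpoint" is taken to mean advancing a
checkpoint time on every surviving directory so that the removed one can later
be recognised as stale.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class RollEditsSketch {
    // Hypothetical stand-in for a storage directory.
    static class Dir {
        final String path;
        final boolean failOnOpen; // simulates a failure opening edits.new
        long checkpointTime;

        Dir(String path, boolean failOnOpen, long checkpointTime) {
            this.path = path;
            this.failOnOpen = failOnOpen;
            this.checkpointTime = checkpointTime;
        }

        void openEditsNew() throws IOException {
            if (failOnOpen) {
                throw new IOException("cannot open edits.new in " + path);
            }
        }
    }

    // Tries to open edits.new in every directory. Directories that fail are
    // removed from 'dirs' (which must be mutable) and returned; if anything
    // was removed, the checkpoint time of all remaining directories is
    // advanced to 'newCheckpointTime'.
    static List<Dir> rollEdits(List<Dir> dirs, long newCheckpointTime) {
        List<Dir> removed = new ArrayList<>();
        for (Iterator<Dir> it = dirs.iterator(); it.hasNext();) {
            Dir d = it.next();
            try {
                d.openEditsNew();
            } catch (IOException e) {
                it.remove();
                removed.add(d);
            }
        }
        if (!removed.isEmpty()) {
            for (Dir d : dirs) {
                d.checkpointTime = newCheckpointTime;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<Dir> dirs = new ArrayList<>();
        dirs.add(new Dir("/data/1", false, 100L));
        dirs.add(new Dir("/data/2", true, 100L));
        List<Dir> removed = rollEdits(dirs, 101L);
        System.out.println("removed " + removed.size() + ", kept " + dirs.size());
    }
}
```

The removed directory keeps its old checkpoint time, which is what makes it
distinguishable from the healthy directories afterwards.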

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.