[ https://issues.apache.org/jira/browse/HDFS-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006045#comment-13006045 ]
Ivan Kelly commented on HDFS-1725:
----------------------------------
{quote}
1. setStorageDirectories also calls removedStorageDirs.clear() and
re-initializes storage directories. This patch removes the calls to
setStorageDirectories from a few places, therefore those StorageDirectories
that were removed due to some error might continue to hang in
removedStorageDirs and won't be reinstated. Will attemptRestoreRemovedStorage
take care of clearing up of removedStorageDirs in all cases?
{quote}
attemptRestoreRemovedStorage is called before attempting to save the namespace,
so it will be called in all cases on the primary node. setStorageDirectories
was only ever called in initialisation for the primary node anyhow, so it
wouldn't have restored anything.
The Secondary and Backup nodes are a different story. They have recoverCreate
etc., which run periodically and used to call setStorageDirectories. The effect
of this was twofold: a) to unlock the directories for analysis, and b) to
restore failed storage directories. The purpose was never to change the storage
directories, as the call never actually does this. Now the code explicitly
unlocks and attempts the restore (added in the latest patch).
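To illustrate what "attempts the restore" amounts to, here is a minimal sketch. It is not the code from the patch: RestoreSketch, its two lists and the isUsable check are hypothetical stand-ins, and the real attemptRestoreRemovedStorage may work differently.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for the storage object; not the real NNStorage.
final class RestoreSketch {
    final List<String> active = new ArrayList<>();
    final List<String> removed = new ArrayList<>();

    // Move every removed directory that passes the caller-supplied check
    // back into the active list, and report how many were restored.
    int attemptRestore(Predicate<String> isUsable) {
        int restored = 0;
        for (Iterator<String> it = removed.iterator(); it.hasNext();) {
            String dir = it.next();
            if (isUsable.test(dir)) {
                it.remove();      // no longer counts as removed
                active.add(dir);  // back in use
                restored++;
            }
        }
        return restored;
    }
}
```

Directories that still fail the check stay in the removed list, so a later periodic run can try them again.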
{quote}
3.
>+ for (URI uri : editDirsToFormat) {
>+   if (!dirsToFormat.contains(uri)) {
>+     dirsToFormat.add(uri);
>+   }
>+ }
This means currently we don't ask for confirmation before formatting edit
directories that are not namespace directories. I am wondering do we need to
change that, although it seems to be ok.
{quote}
I think it must have been an oversight at some stage, when separate
directories for images and edits were introduced. EditLogs are just as
important as images, so it's best to confirm if we're going to delete them.
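As a rough sketch of the idea (merge the edits directories into the list to format, then ask for confirmation on every entry), something like the following would do. FormatDirsSketch, dirsToConfirm and confirmAll are hypothetical names for illustration only; they are not in the patch.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helper, not part of the HDFS-1725 patch.
final class FormatDirsSketch {
    // Merge image and edits directories, keeping first-seen order
    // and dropping duplicates.
    static List<URI> dirsToConfirm(Collection<URI> imageDirs, Collection<URI> editDirs) {
        Set<URI> merged = new LinkedHashSet<>(imageDirs);
        merged.addAll(editDirs);
        return new ArrayList<>(merged);
    }

    // Return true only if the confirmation callback accepts every directory.
    static boolean confirmAll(List<URI> dirs, Predicate<URI> confirm) {
        for (URI dir : dirs) {
            if (!confirm.test(dir)) {
                return false; // stop as soon as one directory is refused
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<URI> image = Arrays.asList(URI.create("file:///data/name"));
        List<URI> edits = Arrays.asList(URI.create("file:///data/name"),
                                        URI.create("file:///data/edits"));
        // The shared directory appears once; the edits-only one is included too.
        System.out.println(dirsToConfirm(image, edits));
    }
}
```

The point is that an edits-only directory ends up in the same list as the image directories, so it goes through the same confirmation step.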
{quote}
4. In SecondaryNameNode.java#startCheckpoint.
What is the reason behind removing the call to unlockAll?
In recoverCreate a call to unlockAll is added and storage.close is removed.
storage.close was also calling listeners.clear, which will not be called now.
Is that ok?
{quote}
See the response to 1. Regarding listeners, I don't think it will make any
difference. The listener is only there to allow NNStorage to inform the objects
using it that an error has occurred. If a directory does cause an error,
removing it from use is the correct thing to do.
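A small sketch of the listener relationship described above, under the assumption that the storage object itself takes a failing directory out of use and only then notifies listeners. StorageErrorListener and StorageSketch are hypothetical names; the real NNStorage interface may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback; the real listener interface may differ.
interface StorageErrorListener {
    void errorOccurred(String dir);
}

// Hypothetical stand-in for the storage object.
final class StorageSketch {
    private final List<String> activeDirs = new ArrayList<>();
    private final List<String> removedDirs = new ArrayList<>();
    private final List<StorageErrorListener> listeners = new ArrayList<>();

    StorageSketch(List<String> dirs) {
        activeDirs.addAll(dirs);
    }

    void registerListener(StorageErrorListener listener) {
        listeners.add(listener);
    }

    // Take the failing directory out of use, then tell every listener.
    void reportError(String dir) {
        if (activeDirs.remove(dir)) {
            removedDirs.add(dir);
        }
        for (StorageErrorListener listener : listeners) {
            listener.errorOccurred(dir);
        }
    }

    List<String> activeDirs() {
        return new ArrayList<>(activeDirs);
    }

    List<String> removedDirs() {
        return new ArrayList<>(removedDirs);
    }
}
```

In this sketch the directory is removed whether or not any listener is registered; the listeners only learn about the error.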
> Cleanup FSImage construction
> ----------------------------
>
> Key: HDFS-1725
> URL: https://issues.apache.org/jira/browse/HDFS-1725
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ivan Kelly
> Assignee: Ivan Kelly
> Fix For: 0.23.0
>
> Attachments: HDFS-1725.diff, HDFS-1725.diff, HDFS-1725.diff
>
>
> FSImage construction is messy. Sometimes the storagedirectories in use are
> set straight away, sometimes they are not. This makes it hard for anything
> under FSImage (i.e. FSEditLog) to make assumptions about what it can use.
> Therefore, this patch makes FSImage set the storage directories in use during
> construction, and never allows them to change. If you want to change
> storagedirectories you create a new image.
> Also, all the construction code should be the same with the only difference
> being the parameters passed. When not passed, these should get sensible
> defaults.