[
https://issues.apache.org/jira/browse/HDFS-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010748#comment-13010748
]
Ivan Kelly commented on HDFS-1725:
----------------------------------
I've address some of your comments in the newest patch. The others I have
responded to below.
{quote}1. FSImage.java: Make members storage, conf final{quote}
storage cannot be final due to TestSaveNamespace needing to spy.
{quote}1. The log {{NameNode.LOG.info("set FSImage.restoreFailedStorage");}} is
removed. Is this not necessary?{quote}
NNStorage.restoreFailedStorage already logs this, so the log statement in
FSImage is redundant.
{quote}1. The old set of constructors and new are not equivalent. For example:
FSImage(Configuration) constructor in the old code never set editLog. Similarly
only one conustros called setCheckpointDirectories(), now all constructors do.
Is this fine?
Where is the functionality of {{FSImage(URI imageDir)}} imeplemented?
Corresponding change in CreateEditsLog.java could have been avoided. I think
this is a valid use case and a constructor and it should nto be removed.
{quote}
FSImage(Configuration) calls FSImage() which calls FSImage(FSNameSystem) which
creates FSEditLog. It's not very clear, but all constructors eventually call
FSImage(FSNameSystem).
Setting the checkpoint directories in all constructors is harmless. It is only
used for import and SecondaryNameNode (which is there FSImage(Configuration) is
used). Setting it doesn't cause problems and removes unnecessary branching from
the code.
FSImage(Configuration) is also called from FSDirectory(FSNamesystem ns,
Configuration conf) which is what is called in the default case for primary
NameNode.
FSImage(URI) and FSImage(Collection<URI>, Collection<URI>) are only ever used
in test code. I don't think it's good to keep divergent code in production
solely for the purpose of testing, if it can be removed with minimal hassle.
{quote}2. Why are you calling attemptRestoreRemovedStorage() in #reset()
method? {quote}
reset() should bring the image back to it's initial state. If a storage has
been removed, then it is not in it's initial state. The actual affect of this
call is the same as it was before. Before, recoverCreateRead called
setStorageDirectory, which unlocked which did the same thing. I moved it into
the call to reset (instead of putting it in setStorageDirectory) because it is
only needed on reset. The other call to recoverCreateRead is during
initialisation, so no storage should have been removed by then.
{quote} 4. NameNode.java - why did you remove in #format method, adding
editDirsToFormat to dirsToFormat? The list of directories passed into FSImage
in this method are now changed to the list of directories in conf? {quote}
editDirsToFormat is added to dirsToFormat so that it confirms with the user if
they want to format the directory. Before, you could format an edits dir
without confirming with the user, which is risky.
The list of directories are the same in both cases. getNamespaceDirs(conf) is
used in FSImage if no directories are specified. That said, it would be
clearer, and safer, that the names are explicitly passed to FSImage. Changed.
{quote}6. SecondaryNameNode.java - in #startCheckpoint() - why did you remove
unlockAll() call? Also recoverCreate() has attemptRestoreRemovedStorage() and
unlockAll() call? {quote}
This is similar to the case in BackupNode. setStorageDirectories was only being
called to reset the state of the FSImage. attemptRestoreRemovedStorage is used
to try to bring back any removed directories. unlockAll() is only there to
allow analyseStorage to run (it tries to lock).
{quote}8. TestSaveNamespace.java - why is close() method call to unlock storage
not required any more? {quote}
I assume this is referring to testSaveWhileEditsRolled(). The whole spy() is
unnecessary in this test as nothing is spied on. I've not removed it.
The calls to close() and then setStorageDirectories() in these tests are only
there because mockito doesn't like inner classes, so when you spy FSImage, some
things go missing. It's because StorageDirectory is an innerclass of Storage,
so when you initially create the StorageDirectories, they refer to this
original instance of Storage (NNStorage in this case). Then you set a spy on
the storage, creating a new instance and replacing the original. The
storageDirectories still refer to the original, so when you call write and
things like that, they'll write invalid values for checksums and such. It's a
really annoying issue, but I'm not sure how to resolve it, so for now we just
call close() and setStorageDirectories() on NNStorage if we want to spy.
> Cleanup FSImage construction
> ----------------------------
>
> Key: HDFS-1725
> URL: https://issues.apache.org/jira/browse/HDFS-1725
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ivan Kelly
> Assignee: Ivan Kelly
> Fix For: 0.23.0
>
> Attachments: HDFS-1725.diff, HDFS-1725.diff, HDFS-1725.diff,
> HDFS-1725.diff
>
>
> FSImage construction is messy. Sometimes the storagedirectories in use are
> set straight away, sometimes they are not. This makes it hard for anything
> under FSImage (i.e. FSEditLog) to make assumptions about what it can use.
> Therefore, this patch makes FSImage set the storage directories in use during
> construction, and never allows them to change. If you want to change
> storagedirectories you create a new image.
> Also, all the construction code should be the same with the only difference
> being the parameters passed. When not passed, these should get sensible
> defaults.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira