[ 
https://issues.apache.org/jira/browse/HDFS-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010748#comment-13010748
 ] 

Ivan Kelly commented on HDFS-1725:
----------------------------------

I've address some of your comments in the newest patch. The others I have 
responded to below.

{quote}1. FSImage.java: Make members storage, conf final{quote}

storage cannot be final due to TestSaveNamespace needing to spy.

{quote}1. The log {{NameNode.LOG.info("set FSImage.restoreFailedStorage");}} is 
removed. Is this not necessary?{quote}
NNStorage.restoreFailedStorage already logs this, so the log statement in 
FSImage is redundant. 

{quote}1. The old set of constructors and new are not equivalent. For example: 
FSImage(Configuration) constructor in the old code never set editLog. Similarly 
only one conustros called setCheckpointDirectories(), now all constructors do. 
Is this fine?

Where is the functionality of {{FSImage(URI imageDir)}} imeplemented? 
Corresponding change in CreateEditsLog.java could have been avoided. I think 
this is a valid use case and a constructor and it should nto be removed.
{quote}

FSImage(Configuration) calls FSImage() which calls FSImage(FSNameSystem) which 
creates FSEditLog. It's not very clear, but all constructors eventually call 
FSImage(FSNameSystem). 

Setting the checkpoint directories in all constructors is harmless. It is only 
used for import and SecondaryNameNode (which is there FSImage(Configuration) is 
used). Setting it doesn't cause problems and removes unnecessary branching from 
the code. 

FSImage(Configuration) is also called from FSDirectory(FSNamesystem ns, 
Configuration conf) which is what is called in the default case for primary 
NameNode. 

FSImage(URI) and FSImage(Collection<URI>, Collection<URI>) are only ever used 
in test code. I don't think it's good to keep divergent code in production 
solely for the purpose of testing, if it can be removed with minimal hassle.

{quote}2. Why are you calling attemptRestoreRemovedStorage() in #reset() 
method? {quote}

reset() should bring the image back to it's initial state. If a storage has 
been removed, then it is not in it's initial state. The actual affect of this 
call is the same as it was before. Before, recoverCreateRead called 
setStorageDirectory, which unlocked which did the same thing. I moved it into 
the call to reset (instead of putting it in setStorageDirectory) because it is 
only needed on reset. The other call to recoverCreateRead is during 
initialisation, so no storage should have been removed by then.

{quote} 4. NameNode.java - why did you remove in #format method, adding 
editDirsToFormat to dirsToFormat? The list of directories passed into FSImage 
in this method are now changed to the list of directories in conf? {quote}

editDirsToFormat is added to dirsToFormat so that it confirms with the user if 
they want to format the directory. Before, you could format an edits dir 
without confirming with the user, which is risky.

The list of directories are the same in both cases. getNamespaceDirs(conf) is 
used in FSImage if no directories are specified. That said, it would be 
clearer, and safer, that the names are explicitly passed to FSImage. Changed. 

{quote}6. SecondaryNameNode.java - in #startCheckpoint() - why did you remove 
unlockAll() call? Also recoverCreate() has attemptRestoreRemovedStorage() and 
unlockAll() call? {quote}

This is similar to the case in BackupNode. setStorageDirectories was only being 
called to reset the state of the FSImage. attemptRestoreRemovedStorage is used 
to try to bring back any removed directories. unlockAll() is only there to 
allow analyseStorage to run (it tries to lock).

{quote}8. TestSaveNamespace.java - why is close() method call to unlock storage 
not required any more? {quote}

I assume this is referring to testSaveWhileEditsRolled(). The whole spy() is 
unnecessary in this test as nothing is spied on. I've not removed it.

The calls to close() and then setStorageDirectories() in these tests are only 
there because mockito doesn't like inner classes, so when you spy FSImage, some 
things go missing. It's because StorageDirectory is an innerclass of Storage, 
so when you initially create the StorageDirectories, they refer to this 
original instance of Storage (NNStorage in this case). Then you set a spy on 
the storage, creating a new instance and replacing the original. The 
storageDirectories still refer to the original, so when you call write and 
things like that, they'll write invalid values for checksums and such. It's a 
really annoying issue, but I'm not sure how to resolve it, so for now we just 
call close() and setStorageDirectories() on NNStorage if we want to spy.




> Cleanup FSImage construction
> ----------------------------
>
>                 Key: HDFS-1725
>                 URL: https://issues.apache.org/jira/browse/HDFS-1725
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1725.diff, HDFS-1725.diff, HDFS-1725.diff, 
> HDFS-1725.diff
>
>
> FSImage construction is messy. Sometimes the storagedirectories in use are 
> set straight away, sometimes they are not. This makes it hard for anything 
> under FSImage (i.e. FSEditLog) to make assumptions about what it can use. 
> Therefore, this patch makes FSImage set the storage directories in use during 
> construction, and never allows them to change. If you want to change 
> storagedirectories you create a new image.
> Also, all the construction code should be the same with the only difference 
> being the parameters passed. When not passed, these should get sensible 
> defaults.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to