[ https://issues.apache.org/jira/browse/HDFS-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200657#comment-13200657 ]
Todd Lipcon commented on HDFS-2579:
-----------------------------------

Spent the afternoon working on this. Here's the diagnosis:

- This doesn't currently affect trunk, but for a somewhat unintuitive reason: from what I can tell, the DT manager is started so early during startup that it actually starts _before_ the NN knows whether or not it's in safemode. So, it's able to write to its edit log. This seems wrong, in that even if you start the NN in safe mode, it will write to its storage directory. But it does allow the NN to start up, which is why we didn't see this on trunk.
- In the HA branch, we start the DT Secret Manager only upon becoming active. So, if we exit safemode before becoming active, there is no problem. However, if we are in safe mode when we become active, the transition fails, since {{logUpdateMasterKey}} fails.

The solution I'm implementing is to stop the DT Secret Manager upon entering safemode and start it upon exiting safemode. This should be correct, since we don't allow users to fetch delegation tokens while the NN is in safe mode anyway. Checking tokens while in safe mode should still be alright, since the secret manager object's lifecycle is tied to the namesystem and the object is not nulled out when the DTSM is stopped.

> Starting delegation token manager during safemode fails
> -------------------------------------------------------
>
>                 Key: HDFS-2579
>                 URL: https://issues.apache.org/jira/browse/HDFS-2579
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node, security
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>
> I noticed this on the HA branch, but it seems to actually affect non-HA
> branch 0.23 if security is enabled. When the NN starts up, if security is
> enabled, we start the delegation token secret manager, which then tries to
> call {{logUpdateMasterKey}}. This fails because the edit logs may not be
> written while in safe-mode.
> It seems to me that there's not any necessary reason that you have to make a
> new master key at startup, since you've loaded the old key when you load the
> FSImage. You'd only be lacking a DT master key on a fresh cluster, in which
> case we could have it generate one at format time.
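
For illustration only, the lifecycle change described in the comment (keep the DT Secret Manager stopped while in safemode, start it once the NN is both active and out of safemode) might be sketched roughly as follows. This is not the actual patch: every class and method name below other than logUpdateMasterKey is a hypothetical stand-in, and the real FSNamesystem and secret manager code is considerably more involved.

```java
/** Hypothetical stand-in for the edit log: rejects writes while in safe mode. */
class SketchEditLog {
    private boolean safeMode = true;
    private int masterKeyUpdates = 0;

    void setSafeMode(boolean on) { safeMode = on; }

    /** Mirrors the failure from the issue: no edit log writes in safe mode. */
    void logUpdateMasterKey() {
        if (safeMode) {
            throw new IllegalStateException("cannot write edit log in safe mode");
        }
        masterKeyUpdates++;
    }

    int masterKeyUpdates() { return masterKeyUpdates; }
}

/** Hypothetical stand-in for the delegation token secret manager. */
class SketchSecretManager {
    private final SketchEditLog editLog;
    private boolean running = false;

    SketchSecretManager(SketchEditLog editLog) { this.editLog = editLog; }

    /** Starting rolls a new master key, which must be logged. */
    void start() {
        if (running) return;
        editLog.logUpdateMasterKey();
        running = true;
    }

    void stop() { running = false; }

    boolean isRunning() { return running; }
}

/** Hypothetical namesystem that ties the secret manager to safe mode state. */
class SketchNamesystem {
    private final SketchEditLog editLog = new SketchEditLog();
    // Created once and never nulled out, so token checks can still reach it
    // while it is stopped.
    private final SketchSecretManager dtsm = new SketchSecretManager(editLog);
    private boolean active = false;
    private boolean safeMode = true; // the NN starts up in safe mode

    void becomeActive() {
        active = true;
        startSecretManagerIfPossible();
    }

    void enterSafeMode() {
        safeMode = true;
        editLog.setSafeMode(true);
        dtsm.stop();
    }

    void leaveSafeMode() {
        safeMode = false;
        editLog.setSafeMode(false);
        startSecretManagerIfPossible();
    }

    /** Only start when the edit log is writable: active and out of safe mode. */
    private void startSecretManagerIfPossible() {
        if (active && !safeMode) {
            dtsm.start();
        }
    }

    SketchSecretManager secretManager() { return dtsm; }
    SketchEditLog editLog() { return editLog; }
}
```

With this arrangement, becoming active while still in safemode no longer attempts the edit log write; the start is deferred until safemode is left.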