[jira] [Commented] (HDFS-4477) Secondary namenode may retain old tokens

Daryn Sharp (JIRA) Thu, 28 Feb 2013 10:45:19 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589781#comment-13589781
 ]


Daryn Sharp commented on HDFS-4477:
-----------------------------------

The quick-fix is marred by a race condition I was concerned about.  Kihwal and 
I have studied the problem and found it's much worse than originally thought.

The NN rolls the edits, and the 2NN then downloads the image and the rolled 
edits.  Tokens that were due to expire during the download, but were actually 
renewed during the download, will erroneously be removed from the image 
because the 2NN doesn't know about those edits.  The 2NN will then fail all 
future checkpoints because it can't apply edits for the non-existent token.  
It will start trying to checkpoint every minute, and always fail.
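
To make the sequence concrete, here is a toy Java model of the race.  It is 
not Hadoop code: the class, the method names, and the minute-based clock are 
all made up for illustration, and it assumes the quick-fix amounts to purging 
tokens whose recorded expiry has passed by the time the download finishes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the race; none of these names are Hadoop classes.
class TokenRaceSketch {
    // Assumed renewal interval: the 24h default, in minutes.
    static final long RENEW_INTERVAL_MIN = 24 * 60;

    /** Stand-in for the quick-fix: drop every token whose expiry is at or before 'now'. */
    static void purgeExpired(Map<String, Long> tokens, long now) {
        tokens.values().removeIf(expiry -> expiry <= now);
    }

    /** Replays a renewal edit; fails if the token is unknown. */
    static void applyRenew(Map<String, Long> tokens, String id, long newExpiry) {
        if (!tokens.containsKey(id)) {
            throw new IllegalStateException("cannot renew non-existent token " + id);
        }
        tokens.put(id, newExpiry);
    }

    /**
     * All times are minutes after the edit roll (t = 0).
     * Returns true if the checkpoint after this one would fail.
     */
    static boolean nextCheckpointFails(long downloadEnd, long expiryInImage, long renewedAt) {
        // The image the 2NN downloads reflects the state at the roll.
        Map<String, Long> image = new HashMap<>();
        image.put("token-1", expiryInImage);

        // The NN renews the token at 'renewedAt'. That edit lands after the
        // roll, so the 2NN has not seen it when it purges at 'downloadEnd'.
        purgeExpired(image, downloadEnd);

        // Next checkpoint: the 2NN replays the renewal edit it missed.
        try {
            applyRenew(image, "token-1", renewedAt + RENEW_INTERVAL_MIN);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Expiry at t=60, renewed at t=30, download ends at t=90: token is purged.
        System.out.println(nextCheckpointFails(90, 60, 30));
        // Same token, but the download ends at t=30: token survives.
        System.out.println(nextCheckpointFails(30, 60, 20));
    }
}
```

In the first case the token is live on the NN but gone from the 2NN's image, 
so the replayed renewal has nothing to apply to.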

Tokens are renewed at 90% of the expiration interval.  With the default 24h, 
that leaves a 2.4h window (the final 10%) in which the checkpoint downloads 
must occur.  If the window is blown, you can try to delete the current fsimage 
on the NN, bounce the 2NN to clear its internal state, and let the 2NN use the 
prior image and reapply all the older and newer edits.  However, if the 
checkpoint blew the 2.4h window because of anything but transient load or 
network congestion, it's going to blow the window again.  At that point it'll 
require NN downtime to force a save of the namespace.

Under normal load, some of our grids routinely take 1.5h+ to checkpoint due to 
the size of our images/edits and throttled download to avoid saturating the 
NIC.  Under heavy load, we are almost certain to lose the race.  We will also 
hit this issue if the 2NN is out of commission for long.  Incurring at least 
15m of cluster downtime is not an option.
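
The numbers above can be checked with a few lines of Java.  This is only a 
back-of-the-envelope sketch: the helper names are invented, and the 180-minute 
"heavy load" figure is an assumed value for illustration, not a measurement.

```java
import java.time.Duration;

// Back-of-the-envelope helper; names are illustrative, not Hadoop APIs.
class RenewalWindowSketch {
    /** Time left between the renewal point and expiry. */
    static Duration renewalWindow(Duration renewInterval, int renewAtPercent) {
        return renewInterval.multipliedBy(100 - renewAtPercent).dividedBy(100);
    }

    /** True if a checkpoint of the given length finishes inside the window. */
    static boolean fitsInWindow(Duration checkpoint, Duration window) {
        return checkpoint.compareTo(window) < 0;
    }

    public static void main(String[] args) {
        // 24h default, renewal at 90%: 144 minutes, i.e. 2.4h.
        Duration window = renewalWindow(Duration.ofHours(24), 90);
        System.out.println(window.toMinutes());

        // A 1.5h checkpoint under normal load still fits.
        System.out.println(fitsInWindow(Duration.ofMinutes(90), window));

        // An assumed 3h checkpoint under heavy load does not.
        System.out.println(fitsInWindow(Duration.ofMinutes(180), window));
    }
}
```

A 1.5h checkpoint leaves under an hour of slack, which is why any sustained 
slowdown is enough to lose the race.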

We need another solution...

                
> Secondary namenode may retain old tokens
> ----------------------------------------
>
>                 Key: HDFS-4477
>                 URL: https://issues.apache.org/jira/browse/HDFS-4477
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>            Reporter: Kihwal Lee
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-4477.patch, HDFS-4477.patch
>
>
> Upon inspection of an fsimage created by a secondary namenode, we've 
> discovered it contains very old tokens. These are probably the ones that were 
> not explicitly canceled.  It may be related to the optimization done to avoid 
> loading the fsimage from scratch on every checkpoint.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
