[
https://issues.apache.org/jira/browse/HDFS-14500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847867#comment-16847867
]
Erik Krogen edited comment on HDFS-14500 at 5/24/19 8:57 PM:
-------------------------------------------------------------
Just committed this to {{trunk}} and backported it to branch-3.2, branch-3.1,
branch-3.0, and branch-2. I had to make a minor modification for branch-2
because of the use of method references (Java 8 feature) in the patch, so I've
attached the updated patch. Thanks for the review [~vagarychen]!
was (Author: xkrogen):
Just committed this to {{trunk}}. Thanks for the review [~vagarychen]!
> NameNode StartupProgress continues to report edit log segments after the
> LOADING_EDITS phase is finished
> --------------------------------------------------------------------------------------------------------
>
> Key: HDFS-14500
> URL: https://issues.apache.org/jira/browse/HDFS-14500
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 3.1.2
> Reporter: Erik Krogen
> Assignee: Erik Krogen
> Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14500-branch-2.001.patch, HDFS-14500.000.patch,
> HDFS-14500.001.patch
>
>
> When testing out a cluster with the edit log tailing fast path feature
> enabled (HDFS-13150), an unrelated issue caused the NameNode to remain in
> safe mode for an extended period of time, preventing the NameNode from fully
> completing its startup sequence. We noticed that the Startup Progress web UI
> displayed many edit log segments (millions of them).
> I traced this problem back to {{StartupProgress}}. Within
> {{FSEditLogLoader}}, the loader continually tries to update the startup
> progress with a new {{Step}} any time that it loads edits. Per the Javadoc
> for {{StartupProgress}}, this should be a no-op once startup is completed:
> {code:title=StartupProgress.java}
> * After startup completes, the tracked data is frozen. Any subsequent updates
> * or counter increments are no-ops.
> {code}
> However, {{StartupProgress}} only implements that logic once the _entire_
> startup sequence has been completed. When {{FSEditLogLoader}} calls
> {{beginStep()}}, it adds the step to the {{LOADING_EDITS}} phase:
> {code:title=FSEditLogLoader.java}
> StartupProgress prog = NameNode.getStartupProgress();
> Step step = createStartupProgressStep(edits);
> prog.beginStep(Phase.LOADING_EDITS, step);
> {code}
> This phase, in our case, ended long before, so it is nonsensical to continue
> to add steps to it. I believe it is a bug that {{StartupProgress}} accepts
> such steps instead of ignoring them; once a phase is complete, it should no
> longer change.
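
To illustrate the behavior the description argues for, here is a minimal, self-contained sketch of per-phase freezing. It is not the committed patch and does not use the real Hadoop classes: {{PhaseFreezeSketch}}, {{Progress}}, {{PhaseTracking}}, and {{stepCount()}} are hypothetical names invented for this example, and steps are modeled as plain strings. The only idea borrowed from the issue is that a step reported against an already-completed phase should be ignored.

```java
import java.util.EnumMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for StartupProgress; not the Hadoop implementation.
public class PhaseFreezeSketch {
    enum Phase { LOADING_FSIMAGE, LOADING_EDITS, SAVING_CHECKPOINT, SAFEMODE }
    enum Status { PENDING, RUNNING, COMPLETE }

    // Per-phase state: a status plus the steps recorded so far.
    static final class PhaseTracking {
        Status status = Status.PENDING;
        final Set<String> steps = new LinkedHashSet<>();
    }

    static final class Progress {
        private final Map<Phase, PhaseTracking> phases = new EnumMap<>(Phase.class);

        Progress() {
            for (Phase p : Phase.values()) {
                phases.put(p, new PhaseTracking());
            }
        }

        synchronized void beginPhase(Phase phase) {
            PhaseTracking t = phases.get(phase);
            if (t.status == Status.PENDING) {
                t.status = Status.RUNNING;
            }
        }

        synchronized void endPhase(Phase phase) {
            PhaseTracking t = phases.get(phase);
            if (t.status == Status.RUNNING) {
                t.status = Status.COMPLETE;
            }
        }

        // The check is per phase, not per startup sequence: once the phase
        // is COMPLETE, new steps are silently dropped.
        synchronized void beginStep(Phase phase, String step) {
            PhaseTracking t = phases.get(phase);
            if (t.status != Status.COMPLETE) {
                t.steps.add(step);
            }
        }

        synchronized int stepCount(Phase phase) {
            return phases.get(phase).steps.size();
        }
    }

    public static void main(String[] args) {
        Progress prog = new Progress();
        prog.beginPhase(Phase.LOADING_EDITS);
        prog.beginStep(Phase.LOADING_EDITS, "edits-segment-1");
        prog.endPhase(Phase.LOADING_EDITS);
        // Later edit log tailing keeps reporting segments; these are ignored.
        prog.beginStep(Phase.LOADING_EDITS, "edits-segment-2");
        System.out.println(prog.stepCount(Phase.LOADING_EDITS)); // prints 1
    }
}
```

With a check like this, a NameNode stuck in safe mode would keep the {{LOADING_EDITS}} step list at whatever it held when that phase ended, even though the overall startup sequence has not finished.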