Merge commit '9af96d4ed4b6f80d3ca53a2b003d2ef768650dd4' into HDFS-12943

# Conflicts:
# hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md

Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/4cdd0b9c
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/4cdd0b9c
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/4cdd0b9c

Branch: refs/heads/HDFS-12943
Commit: 4cdd0b9cdc9c39384333c1757766f02b1b9d0daf
Parents: 94d7f90 9af96d4
Author: Konstantin V Shvachko <[email protected]>
Authored: Mon Sep 17 17:39:11 2018 -0700
Committer: Konstantin V Shvachko <[email protected]>
Committed: Fri Sep 21 18:17:41 2018 -0700

----------------------------------------------------------------------
 .../org/apache/hadoop/http/IsActiveServlet.java | 71 +++++++++++++++
 .../apache/hadoop/http/TestIsActiveServlet.java | 95 ++++++++++++++++++++
 .../router/IsRouterActiveServlet.java           | 37 ++++++++
 .../federation/router/RouterHttpServer.java     |  9 ++
 .../src/site/markdown/HDFSRouterFederation.md   |  2 +-
 .../namenode/IsNameNodeActiveServlet.java       | 33 +++++++
 .../server/namenode/NameNodeHttpServer.java     |  3 +
 .../markdown/HDFSHighAvailabilityWithQJM.md     |  8 ++
 .../IsResourceManagerActiveServlet.java         | 38 ++++++++
 .../server/resourcemanager/ResourceManager.java |  5 ++
 .../resourcemanager/webapp/RMWebAppFilter.java  |  3 +-
 .../src/site/markdown/ResourceManagerHA.md      |  5 ++
 12 files changed, 307 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/hadoop/blob/4cdd0b9c/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
----------------------------------------------------------------------
diff --cc hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
index 0d20091,e4363fb..76a9837
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
@@@ -423,34 -423,14 +423,42 @@@ This guide describes high-level uses o
  **Note:** This is not yet implemented, and at present will always return success, unless the given NameNode is completely down.
+ 
+ ### Load Balancer Setup
+ 
+ If you are running a set of NameNodes behind a Load Balancer (e.g. [Azure](https://docs.microsoft.com/en-us/azure/load-balancer/load-balancer-custom-probe-overview) or [AWS](https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-healthchecks.html) ) and would like the Load Balancer to point to the active NN, you can use the /isActive HTTP endpoint as a health probe.
+ http://NN_HOSTNAME/isActive will return a 200 status code response if the NN is in Active HA State, 405 otherwise.
+ 
+ 
 +
 +### In-Progress Edit Log Tailing
 +
 +Under the default settings, the Standby NameNode will only apply edits that are present in an edit
 +log segments which has been finalized. If it is desirable to have a Standby NameNode which has more
 +up-to-date namespace information, it is possible to enable tailing of in-progress edit segments.
 +This setting will attempt to fetch edits from an in-memory cache on the JournalNodes and can reduce
 +the lag time before a transaction is applied on the Standby NameNode to the order of milliseconds.
 +If an edit cannot be served from the cache, the Standby will still be able to retrieve it, but the
 +lag time will be much longer. The relevant configurations are:
 +
 +* **dfs.ha.tail-edits.in-progress** - Whether or not to enable tailing on in-progress edits logs.
 +  This will also enable the in-memory edit cache on the JournalNodes. Disabled by default.
 +
 +* **dfs.journalnode.edit-cache-size.bytes** - The size of the in-memory cache of edits on the
 +  JournalNode. Edits take around 200 bytes each in a typical environment, so, for example, the
 +  default of 1048576 (1MB) can hold around 5000 transactions. It is recommended to monitor the
 +  JournalNode metrics RpcRequestCacheMissAmountNumMisses and RpcRequestCacheMissAmountAvgTxns,
 +  which respectively count the number of requests unable to be served by the cache, and the extra
 +  number of transactions which would have needed to have been in the cache for the request to
 +  succeed. For example, if a request attempted to fetch edits starting at transaction ID 10, but
 +  the oldest data in the cache was at transaction ID 20, a value of 10 would be added to the
 +  average.
 +
 +This feature is primarily useful in conjunction with the Standby/Observer Read feature. Using this
 +feature, read requests can be serviced from non-active NameNodes; thus tailing in-progress edits
 +provides these nodes with the ability to serve requests with data which is much more fresh. See the
 +Apache JIRA ticket HDFS-12943 for more information on this feature.
 +
  Automatic Failover
  ------------------

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
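
The /isActive probe described in the Load Balancer Setup hunk can be exercised by hand before pointing a load balancer at it. What follows is a minimal sketch, assuming Python 3 and only its standard library. The helper names `classify_status` and `is_active` are hypothetical and not part of this commit, and the caller must supply the real NameNode HTTP address. It treats 200 as Active and every other outcome, including the documented 405, as not active.

```python
import urllib.error
import urllib.request


def classify_status(status_code: int) -> bool:
    """Return True only for HTTP 200, which the docs define as Active."""
    return status_code == 200


def is_active(base_url: str, timeout: float = 5.0) -> bool:
    """Probe <base_url>/isActive; unreachable hosts count as not active.

    base_url is an assumption of this sketch, e.g. "http://NN_HOSTNAME".
    """
    url = base_url.rstrip("/") + "/isActive"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses, so the documented 405
        # (NameNode not in Active HA state) is handled here.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False
```

A load balancer performs the same check natively; the sketch is only useful for confirming the endpoint's behaviour from a shell or a monitoring script.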
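
The sizing arithmetic and the cache-miss example in the dfs.journalnode.edit-cache-size.bytes description can be checked with a few lines. This is an illustrative sketch only: the function names are hypothetical, the 200-byte figure is the documentation's rough estimate rather than a measured value, and the miss calculation restates the worked example from the text, not the JournalNode's actual implementation.

```python
APPROX_BYTES_PER_EDIT = 200  # rough per-edit size quoted in the documentation


def approx_cache_capacity(cache_size_bytes: int,
                          bytes_per_edit: int = APPROX_BYTES_PER_EDIT) -> int:
    """Estimate how many edits fit in a cache of the given size."""
    return cache_size_bytes // bytes_per_edit


def missed_txns(requested_start_txid: int, oldest_cached_txid: int) -> int:
    """Transactions that would have had to be cached for the request to hit."""
    return max(0, oldest_cached_txid - requested_start_txid)


print(approx_cache_capacity(1048576))  # 5242, i.e. "around 5000" for the 1MB default
print(missed_txns(10, 20))             # 10, matching the example in the text
```

If the observed miss amounts are consistently large, the same arithmetic gives a rough idea of how much larger the cache would need to be.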
