http://git-wip-us.apache.org/repos/asf/hadoop/blob/343cffb0/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm ---------------------------------------------------------------------- diff --git a/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm b/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm deleted file mode 100644 index 85e66bd..0000000 --- a/hadoop-common-project/hadoop-common/src/site/apt/InterfaceClassification.apt.vm +++ /dev/null @@ -1,239 +0,0 @@ -~~ Licensed under the Apache License, Version 2.0 (the "License"); -~~ you may not use this file except in compliance with the License. -~~ You may obtain a copy of the License at -~~ -~~ http://www.apache.org/licenses/LICENSE-2.0 -~~ -~~ Unless required by applicable law or agreed to in writing, software -~~ distributed under the License is distributed on an "AS IS" BASIS, -~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -~~ See the License for the specific language governing permissions and -~~ limitations under the License. See accompanying LICENSE file. - - --- - Hadoop Interface Taxonomy: Audience and Stability Classification - --- - --- - ${maven.build.timestamp} - -Hadoop Interface Taxonomy: Audience and Stability Classification - -%{toc|section=1|fromDepth=0} - -* Motivation - - The interface taxonomy classification provided here is for guidance to - developers and users of interfaces. The classification guides a developer - to declare the targeted audience or users of an interface and also its - stability. - - * Benefits to the user of an interface: Knows which interfaces to use or not - use and their stability. - - * Benefits to the developer: to prevent accidental changes of interfaces and - hence accidental impact on users or other components or system. This is - particularly useful in large systems with many developers who may not all - have a shared state/history of the project. 
- -* Interface Classification - - Hadoop adopts the following interface classification, - this classification was derived from the - {{{http://www.opensolaris.org/os/community/arc/policies/interface-taxonomy/#Advice}OpenSolaris taxonomy}} - and, to some extent, from taxonomy used inside Yahoo. Interfaces have two main - attributes: Audience and Stability - -** Audience - - Audience denotes the potential consumers of the interface. While many - interfaces are internal/private to the implementation, - other are public/external interfaces are meant for wider consumption by - applications and/or clients. For example, in posix, libc is an external or - public interface, while large parts of the kernel are internal or private - interfaces. Also, some interfaces are targeted towards other specific - subsystems. - - Identifying the audience of an interface helps define the impact of - breaking it. For instance, it might be okay to break the compatibility of - an interface whose audience is a small number of specific subsystems. On - the other hand, it is probably not okay to break a protocol interfaces - that millions of Internet users depend on. - - Hadoop uses the following kinds of audience in order of - increasing/wider visibility: - - * Private: - - * The interface is for internal use within the project (such as HDFS or - MapReduce) and should not be used by applications or by other projects. It - is subject to change at anytime without notice. Most interfaces of a - project are Private (also referred to as project-private). - - * Limited-Private: - - * The interface is used by a specified set of projects or systems - (typically closely related projects). Other projects or systems should not - use the interface. Changes to the interface will be communicated/ - negotiated with the specified projects. For example, in the Hadoop project, - some interfaces are LimitedPrivate\{HDFS, MapReduce\} in that they - are private to the HDFS and MapReduce projects. 
- - * Public - - * The interface is for general use by any application. - - Hadoop doesn't have a Company-Private classification, - which is meant for APIs which are intended to be used by other projects - within the company, since it doesn't apply to opensource projects. Also, - certain APIs are annotated as @VisibleForTesting (from com.google.common - .annotations.VisibleForTesting) - these are meant to be used strictly for - unit tests and should be treated as "Private" APIs. - -** Stability - - Stability denotes how stable an interface is, as in when incompatible - changes to the interface are allowed. Hadoop APIs have the following - levels of stability. - - * Stable - - * Can evolve while retaining compatibility for minor release boundaries; - in other words, incompatible changes to APIs marked Stable are allowed - only at major releases (i.e. at m.0). - - * Evolving - - * Evolving, but incompatible changes are allowed at minor release (i.e. m - .x) - - * Unstable - - * Incompatible changes to Unstable APIs are allowed any time. This - usually makes sense for only private interfaces. - - * However one may call this out for a supposedly public interface to - highlight that it should not be used as an interface; for public - interfaces, labeling it as Not-an-interface is probably more appropriate - than "Unstable". - - * Examples of publicly visible interfaces that are unstable (i.e. - not-an-interface): GUI, CLIs whose output format will change - - * Deprecated - - * APIs that could potentially removed in the future and should not be - used. - -* How are the Classifications Recorded? - - How will the classification be recorded for Hadoop APIs? - - * Each interface or class will have the audience and stability recorded - using annotations in org.apache.hadoop.classification package. - - * The javadoc generated by the maven target javadoc:javadoc lists only the - public API. 
- - * One can derive the audience of java classes and java interfaces by the - audience of the package in which they are contained. Hence it is useful to - declare the audience of each java package as public or private (along with - the private audience variations). - -* FAQ - - * Why aren't the java scopes (private, package private and public) good - enough? - - * Java's scoping is not very complete. One is often forced to make a class - public in order for other internal components to use it. It does not have - friends or sub-package-private like C++. - - * But I can easily access a private implementation interface if it is Java - public. Where is the protection and control? - - * The purpose of this is not providing absolute access control. Its purpose - is to communicate to users and developers. One can access private - implementation functions in libc; however if they change the internal - implementation details, your application will break and you will have little - sympathy from the folks who are supplying libc. If you use a non-public - interface you understand the risks. - - * Why bother declaring the stability of a private interface? Aren't private - interfaces always unstable? - - * Private interfaces are not always unstable. In the cases where they are - stable they capture internal properties of the system and can communicate - these properties to its internal users and to developers of the interface. - - * e.g. In HDFS, NN-DN protocol is private but stable and can help - implement rolling upgrades. It communicates that this interface should not - be changed in incompatible ways even though it is private. - - * e.g. In HDFS, FSImage stability can help provide more flexible roll - backs. - - * What is the harm in applications using a private interface that is - stable? How is it different than a public stable interface? 
- - * While a private interface marked as stable is targeted to change only at - major releases, it may break at other times if the providers of that - interface are willing to changes the internal users of that interface. - Further, a public stable interface is less likely to break even at major - releases (even though it is allowed to break compatibility) because the - impact of the change is larger. If you use a private interface (regardless - of its stability) you run the risk of incompatibility. - - * Why bother with Limited-private? Isn't it giving special treatment to some - projects? That is not fair. - - * First, most interfaces should be public or private; actually let us state - it even stronger: make it private unless you really want to expose it to - public for general use. - - * Limited-private is for interfaces that are not intended for general use. - They are exposed to related projects that need special hooks. Such a - classification has a cost to both the supplier and consumer of the limited - interface. Both will have to work together if ever there is a need to break - the interface in the future; for example the supplier and the consumers will - have to work together to get coordinated releases of their respective - projects. This should not be taken lightly - if you can get away with - private then do so; if the interface is really for general use for all - applications then do so. But remember that making an interface public has - huge responsibility. Sometimes Limited-private is just right. - - * A good example of a limited-private interface is BlockLocations, This is - fairly low-level interface that we are willing to expose to MR and perhaps - HBase. We are likely to change it down the road and at that time we will - have get a coordinated effort with the MR team to release matching releases. - While MR and HDFS are always released in sync today, they may change down - the road. 
- - * If you have a limited-private interface with many projects listed then - you are fooling yourself. It is practically public. - - * It might be worth declaring a special audience classification called - Hadoop-Private for the Hadoop family. - - * Lets treat all private interfaces as Hadoop-private. What is the harm in - projects in the Hadoop family have access to private classes? - - * Do we want MR accessing class files that are implementation details - inside HDFS. There used to be many such layer violations in the code that - we have been cleaning up over the last few years. We don't want such - layer violations to creep back in by no separating between the major - components like HDFS and MR. - - * Aren't all public interfaces stable? - - * One may mark a public interface as evolving in its early days. - Here one is promising to make an effort to make compatible changes but may - need to break it at minor releases. - - * One example of a public interface that is unstable is where one is providing - an implementation of a standards-body based interface that is still under development. - For example, many companies, in an attampt to be first to market, - have provided implementations of a new NFS protocol even when the protocol was not - fully completed by IETF. - The implementor cannot evolve the interface in a fashion that causes least distruption - because the stability is controlled by the standards body. Hence it is appropriate to - label the interface as unstable.
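Note on the deleted InterfaceClassification text above: it says that audience and stability are recorded with annotations in the org.apache.hadoop.classification package. The following is a minimal, self-contained sketch of that idea and not Hadoop's own code. The nested annotation types are hypothetical stand-ins so that the example compiles without Hadoop on the classpath, and the class names ClientApi, BlockLocationHook and InternalHelper are invented for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class ClassificationSketch {

    // Hypothetical stand-ins for the audience annotations (assumption:
    // they mirror the Private / Limited-Private / Public levels in the text).
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Public {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface LimitedPrivate { String[] value(); }

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Private {}

    // Hypothetical stand-ins for the stability annotations.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Stable {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Evolving {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Unstable {}

    // A general-use interface that may only break at major releases.
    @Public @Stable
    static class ClientApi {}

    // Exposed to a named set of projects; changes are negotiated with them.
    @LimitedPrivate({"HDFS", "MapReduce"}) @Evolving
    static class BlockLocationHook {}

    // Project-internal; may change at any time without notice.
    @Private @Unstable
    static class InternalHelper {}

    // Reads the audience annotation of a class via reflection.
    static String audienceOf(Class<?> c) {
        if (c.isAnnotationPresent(Public.class)) {
            return "Public";
        }
        LimitedPrivate lp = c.getAnnotation(LimitedPrivate.class);
        if (lp != null) {
            return "LimitedPrivate{" + String.join(", ", lp.value()) + "}";
        }
        if (c.isAnnotationPresent(Private.class)) {
            return "Private";
        }
        return "Unclassified";
    }

    // Reads the stability annotation of a class via reflection.
    static String stabilityOf(Class<?> c) {
        if (c.isAnnotationPresent(Stable.class)) {
            return "Stable";
        }
        if (c.isAnnotationPresent(Evolving.class)) {
            return "Evolving";
        }
        if (c.isAnnotationPresent(Unstable.class)) {
            return "Unstable";
        }
        return "Unclassified";
    }

    static String describe(Class<?> c) {
        return audienceOf(c) + " / " + stabilityOf(c);
    }

    public static void main(String[] args) {
        Class<?>[] classes = {ClientApi.class, BlockLocationHook.class, InternalHelper.class};
        for (Class<?> c : classes) {
            System.out.println(c.getSimpleName() + ": " + describe(c));
        }
    }
}
```

As the FAQ above stresses, such annotations communicate intent and do not enforce access control: nothing stops a caller from using InternalHelper, but the label tells the caller what risk they are taking.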
http://git-wip-us.apache.org/repos/asf/hadoop/blob/343cffb0/hadoop-common-project/hadoop-common/src/site/apt/Metrics.apt.vm ---------------------------------------------------------------------- diff --git a/hadoop-common-project/hadoop-common/src/site/apt/Metrics.apt.vm b/hadoop-common-project/hadoop-common/src/site/apt/Metrics.apt.vm deleted file mode 100644 index ecba757..0000000 --- a/hadoop-common-project/hadoop-common/src/site/apt/Metrics.apt.vm +++ /dev/null @@ -1,889 +0,0 @@ -~~ Licensed to the Apache Software Foundation (ASF) under one or more -~~ contributor license agreements. See the NOTICE file distributed with -~~ this work for additional information regarding copyright ownership. -~~ The ASF licenses this file to You under the Apache License, Version 2.0 -~~ (the "License"); you may not use this file except in compliance with -~~ the License. You may obtain a copy of the License at -~~ -~~ http://www.apache.org/licenses/LICENSE-2.0 -~~ -~~ Unless required by applicable law or agreed to in writing, software -~~ distributed under the License is distributed on an "AS IS" BASIS, -~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -~~ See the License for the specific language governing permissions and -~~ limitations under the License. - - --- - Metrics Guide - --- - --- - ${maven.build.timestamp} - -%{toc} - -Overview - - Metrics are statistical information exposed by Hadoop daemons, - used for monitoring, performance tuning and debug. - There are many metrics available by default - and they are very useful for troubleshooting. - This page shows the details of the available metrics. - - Each section describes each context into which metrics are grouped. - - The documentation of Metrics 2.0 framework is - {{{../../api/org/apache/hadoop/metrics2/package-summary.html}here}}. - -jvm context - -* JvmMetrics - - Each metrics record contains tags such as ProcessName, SessionID - and Hostname as additional information along with metrics. 
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<MemNonHeapUsedM>>> | Current non-heap memory used in MB -*-------------------------------------+--------------------------------------+ -|<<<MemNonHeapCommittedM>>> | Current non-heap memory committed in MB -*-------------------------------------+--------------------------------------+ -|<<<MemNonHeapMaxM>>> | Max non-heap memory size in MB -*-------------------------------------+--------------------------------------+ -|<<<MemHeapUsedM>>> | Current heap memory used in MB -*-------------------------------------+--------------------------------------+ -|<<<MemHeapCommittedM>>> | Current heap memory committed in MB -*-------------------------------------+--------------------------------------+ -|<<<MemHeapMaxM>>> | Max heap memory size in MB -*-------------------------------------+--------------------------------------+ -|<<<MemMaxM>>> | Max memory size in MB -*-------------------------------------+--------------------------------------+ -|<<<ThreadsNew>>> | Current number of NEW threads -*-------------------------------------+--------------------------------------+ -|<<<ThreadsRunnable>>> | Current number of RUNNABLE threads -*-------------------------------------+--------------------------------------+ -|<<<ThreadsBlocked>>> | Current number of BLOCKED threads -*-------------------------------------+--------------------------------------+ -|<<<ThreadsWaiting>>> | Current number of WAITING threads -*-------------------------------------+--------------------------------------+ -|<<<ThreadsTimedWaiting>>> | Current number of TIMED_WAITING threads -*-------------------------------------+--------------------------------------+ -|<<<ThreadsTerminated>>> | Current number of TERMINATED threads -*-------------------------------------+--------------------------------------+ 
-|<<<GcInfo>>> | Total GC count and GC time in msec, grouped by the kind of GC. \ - | ex.) GcCountPS Scavenge=6, GCTimeMillisPS Scavenge=40, - | GCCountPS MarkSweep=0, GCTimeMillisPS MarkSweep=0 -*-------------------------------------+--------------------------------------+ -|<<<GcCount>>> | Total GC count -*-------------------------------------+--------------------------------------+ -|<<<GcTimeMillis>>> | Total GC time in msec -*-------------------------------------+--------------------------------------+ -|<<<LogFatal>>> | Total number of FATAL logs -*-------------------------------------+--------------------------------------+ -|<<<LogError>>> | Total number of ERROR logs -*-------------------------------------+--------------------------------------+ -|<<<LogWarn>>> | Total number of WARN logs -*-------------------------------------+--------------------------------------+ -|<<<LogInfo>>> | Total number of INFO logs -*-------------------------------------+--------------------------------------+ -|<<<GcNumWarnThresholdExceeded>>> | Number of times that the GC warn - | threshold is exceeded -*-------------------------------------+--------------------------------------+ -|<<<GcNumInfoThresholdExceeded>>> | Number of times that the GC info - | threshold is exceeded -*-------------------------------------+--------------------------------------+ -|<<<GcTotalExtraSleepTime>>> | Total GC extra sleep time in msec -*-------------------------------------+--------------------------------------+ - -rpc context - -* rpc - - Each metrics record contains tags such as Hostname - and port (number to which server is bound) - as additional information along with metrics. 
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<ReceivedBytes>>> | Total number of received bytes -*-------------------------------------+--------------------------------------+ -|<<<SentBytes>>> | Total number of sent bytes -*-------------------------------------+--------------------------------------+ -|<<<RpcQueueTimeNumOps>>> | Total number of RPC calls -*-------------------------------------+--------------------------------------+ -|<<<RpcQueueTimeAvgTime>>> | Average queue time in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<RpcProcessingTimeNumOps>>> | Total number of RPC calls (same to - | RpcQueueTimeNumOps) -*-------------------------------------+--------------------------------------+ -|<<<RpcProcessingAvgTime>>> | Average Processing time in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<RpcAuthenticationFailures>>> | Total number of authentication failures -*-------------------------------------+--------------------------------------+ -|<<<RpcAuthenticationSuccesses>>> | Total number of authentication successes -*-------------------------------------+--------------------------------------+ -|<<<RpcAuthorizationFailures>>> | Total number of authorization failures -*-------------------------------------+--------------------------------------+ -|<<<RpcAuthorizationSuccesses>>> | Total number of authorization successes -*-------------------------------------+--------------------------------------+ -|<<<NumOpenConnections>>> | Current number of open connections -*-------------------------------------+--------------------------------------+ -|<<<CallQueueLength>>> | Current length of the call queue -*-------------------------------------+--------------------------------------+ -|<<<rpcQueueTime>>><num><<<sNumOps>>> | 
Shows total number of RPC calls -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<rpcQueueTime>>><num><<<s50thPercentileLatency>>> | -| | Shows the 50th percentile of RPC queue time in milliseconds -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<rpcQueueTime>>><num><<<s75thPercentileLatency>>> | -| | Shows the 75th percentile of RPC queue time in milliseconds -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<rpcQueueTime>>><num><<<s90thPercentileLatency>>> | -| | Shows the 90th percentile of RPC queue time in milliseconds -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<rpcQueueTime>>><num><<<s95thPercentileLatency>>> | -| | Shows the 95th percentile of RPC queue time in milliseconds -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<rpcQueueTime>>><num><<<s99thPercentileLatency>>> | -| | Shows the 99th percentile of RPC queue time in milliseconds -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. 
-*-------------------------------------+--------------------------------------+ -|<<<rpcProcessingTime>>><num><<<sNumOps>>> | Shows total number of RPC calls -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<rpcProcessingTime>>><num><<<s50thPercentileLatency>>> | -| | Shows the 50th percentile of RPC processing time in milliseconds -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<rpcProcessingTime>>><num><<<s75thPercentileLatency>>> | -| | Shows the 75th percentile of RPC processing time in milliseconds -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<rpcProcessingTime>>><num><<<s90thPercentileLatency>>> | -| | Shows the 90th percentile of RPC processing time in milliseconds -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<rpcProcessingTime>>><num><<<s95thPercentileLatency>>> | -| | Shows the 95th percentile of RPC processing time in milliseconds -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. 
-*-------------------------------------+--------------------------------------+ -|<<<rpcProcessingTime>>><num><<<s99thPercentileLatency>>> | -| | Shows the 99th percentile of RPC processing time in milliseconds -| | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to -| | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ - -* RetryCache/NameNodeRetryCache - - RetryCache metrics is useful to monitor NameNode fail-over. - Each metrics record contains Hostname tag. - -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<CacheHit>>> | Total number of RetryCache hit -*-------------------------------------+--------------------------------------+ -|<<<CacheCleared>>> | Total number of RetryCache cleared -*-------------------------------------+--------------------------------------+ -|<<<CacheUpdated>>> | Total number of RetryCache updated -*-------------------------------------+--------------------------------------+ - -rpcdetailed context - - Metrics of rpcdetailed context are exposed in unified manner by RPC - layer. Two metrics are exposed for each RPC based on its name. - Metrics named "(RPC method name)NumOps" indicates total number of - method calls, and metrics named "(RPC method name)AvgTime" shows - average turn around time for method calls in milliseconds. - -* rpcdetailed - - Each metrics record contains tags such as Hostname - and port (number to which server is bound) - as additional information along with metrics. - - The Metrics about RPCs which is not called are not included - in metrics record. 
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<methodname><<<NumOps>>> | Total number of the times the method is called -*-------------------------------------+--------------------------------------+ -|<methodname><<<AvgTime>>> | Average turn around time of the method in - | milliseconds -*-------------------------------------+--------------------------------------+ - -dfs context - -* namenode - - Each metrics record contains tags such as ProcessName, SessionId, - and Hostname as additional information along with metrics. - -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<CreateFileOps>>> | Total number of files created -*-------------------------------------+--------------------------------------+ -|<<<FilesCreated>>> | Total number of files and directories created by create - | or mkdir operations -*-------------------------------------+--------------------------------------+ -|<<<FilesAppended>>> | Total number of files appended -*-------------------------------------+--------------------------------------+ -|<<<GetBlockLocations>>> | Total number of getBlockLocations operations -*-------------------------------------+--------------------------------------+ -|<<<FilesRenamed>>> | Total number of rename <<operations>> (NOT number of - | files/dirs renamed) -*-------------------------------------+--------------------------------------+ -|<<<GetListingOps>>> | Total number of directory listing operations -*-------------------------------------+--------------------------------------+ -|<<<DeleteFileOps>>> | Total number of delete operations -*-------------------------------------+--------------------------------------+ -|<<<FilesDeleted>>> | Total number of files and directories deleted 
by delete - | or rename operations -*-------------------------------------+--------------------------------------+ -|<<<FileInfoOps>>> | Total number of getFileInfo and getLinkFileInfo - | operations -*-------------------------------------+--------------------------------------+ -|<<<AddBlockOps>>> | Total number of addBlock operations succeeded -*-------------------------------------+--------------------------------------+ -|<<<GetAdditionalDatanodeOps>>> | Total number of getAdditionalDatanode - | operations -*-------------------------------------+--------------------------------------+ -|<<<CreateSymlinkOps>>> | Total number of createSymlink operations -*-------------------------------------+--------------------------------------+ -|<<<GetLinkTargetOps>>> | Total number of getLinkTarget operations -*-------------------------------------+--------------------------------------+ -|<<<FilesInGetListingOps>>> | Total number of files and directories listed by - | directory listing operations -*-------------------------------------+--------------------------------------+ -|<<<AllowSnapshotOps>>> | Total number of allowSnapshot operations -*-------------------------------------+--------------------------------------+ -|<<<DisallowSnapshotOps>>> | Total number of disallowSnapshot operations -*-------------------------------------+--------------------------------------+ -|<<<CreateSnapshotOps>>> | Total number of createSnapshot operations -*-------------------------------------+--------------------------------------+ -|<<<DeleteSnapshotOps>>> | Total number of deleteSnapshot operations -*-------------------------------------+--------------------------------------+ -|<<<RenameSnapshotOps>>> | Total number of renameSnapshot operations -*-------------------------------------+--------------------------------------+ -|<<<ListSnapshottableDirOps>>> | Total number of snapshottableDirectoryStatus - | operations 
-*-------------------------------------+--------------------------------------+ -|<<<SnapshotDiffReportOps>>> | Total number of getSnapshotDiffReport - | operations -*-------------------------------------+--------------------------------------+ -|<<<TransactionsNumOps>>> | Total number of Journal transactions -*-------------------------------------+--------------------------------------+ -|<<<TransactionsAvgTime>>> | Average time of Journal transactions in - | milliseconds -*-------------------------------------+--------------------------------------+ -|<<<SyncsNumOps>>> | Total number of Journal syncs -*-------------------------------------+--------------------------------------+ -|<<<SyncsAvgTime>>> | Average time of Journal syncs in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<TransactionsBatchedInSync>>> | Total number of Journal transactions batched - | in sync -*-------------------------------------+--------------------------------------+ -|<<<BlockReportNumOps>>> | Total number of processing block reports from - | DataNode -*-------------------------------------+--------------------------------------+ -|<<<BlockReportAvgTime>>> | Average time of processing block reports in - | milliseconds -*-------------------------------------+--------------------------------------+ -|<<<CacheReportNumOps>>> | Total number of processing cache reports from - | DataNode -*-------------------------------------+--------------------------------------+ -|<<<CacheReportAvgTime>>> | Average time of processing cache reports in - | milliseconds -*-------------------------------------+--------------------------------------+ -|<<<SafeModeTime>>> | The interval between FSNameSystem starts and the last - | time safemode leaves in milliseconds. 
\ - | (sometimes not equal to the time in SafeMode, - | see {{{https://issues.apache.org/jira/browse/HDFS-5156}HDFS-5156}}) -*-------------------------------------+--------------------------------------+ -|<<<FsImageLoadTime>>> | Time loading FS Image at startup in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<FsImageLoadTime>>> | Time loading FS Image at startup in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<GetEditNumOps>>> | Total number of edits downloads from SecondaryNameNode -*-------------------------------------+--------------------------------------+ -|<<<GetEditAvgTime>>> | Average edits download time in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<GetImageNumOps>>> |Total number of fsimage downloads from SecondaryNameNode -*-------------------------------------+--------------------------------------+ -|<<<GetImageAvgTime>>> | Average fsimage download time in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<PutImageNumOps>>> | Total number of fsimage uploads to SecondaryNameNode -*-------------------------------------+--------------------------------------+ -|<<<PutImageAvgTime>>> | Average fsimage upload time in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<TotalFileOps>>> | Total number of file operations performed -*-------------------------------------+--------------------------------------+ - -* FSNamesystem - - Each metrics record contains tags such as HAState and Hostname - as additional information along with metrics. 
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<MissingBlocks>>> | Current number of missing blocks -*-------------------------------------+--------------------------------------+ -|<<<ExpiredHeartbeats>>> | Total number of expired heartbeats -*-------------------------------------+--------------------------------------+ -|<<<TransactionsSinceLastCheckpoint>>> | Total number of transactions since - | last checkpoint -*-------------------------------------+--------------------------------------+ -|<<<TransactionsSinceLastLogRoll>>> | Total number of transactions since last - | edit log roll -*-------------------------------------+--------------------------------------+ -|<<<LastWrittenTransactionId>>> | Last transaction ID written to the edit log -*-------------------------------------+--------------------------------------+ -|<<<LastCheckpointTime>>> | Time in milliseconds since epoch of last checkpoint -*-------------------------------------+--------------------------------------+ -|<<<CapacityTotal>>> | Current raw capacity of DataNodes in bytes -*-------------------------------------+--------------------------------------+ -|<<<CapacityTotalGB>>> | Current raw capacity of DataNodes in GB -*-------------------------------------+--------------------------------------+ -|<<<CapacityUsed>>> | Current used capacity across all DataNodes in bytes -*-------------------------------------+--------------------------------------+ -|<<<CapacityUsedGB>>> | Current used capacity across all DataNodes in GB -*-------------------------------------+--------------------------------------+ -|<<<CapacityRemaining>>> | Current remaining capacity in bytes -*-------------------------------------+--------------------------------------+ -|<<<CapacityRemainingGB>>> | Current remaining capacity in GB 
-*-------------------------------------+--------------------------------------+ -|<<<CapacityUsedNonDFS>>> | Current space used by DataNodes for non DFS - | purposes in bytes -*-------------------------------------+--------------------------------------+ -|<<<TotalLoad>>> | Current number of connections -*-------------------------------------+--------------------------------------+ -|<<<SnapshottableDirectories>>> | Current number of snapshottable directories -*-------------------------------------+--------------------------------------+ -|<<<Snapshots>>> | Current number of snapshots -*-------------------------------------+--------------------------------------+ -|<<<BlocksTotal>>> | Current number of allocated blocks in the system -*-------------------------------------+--------------------------------------+ -|<<<FilesTotal>>> | Current number of files and directories -*-------------------------------------+--------------------------------------+ -|<<<PendingReplicationBlocks>>> | Current number of blocks pending to be - | replicated -*-------------------------------------+--------------------------------------+ -|<<<UnderReplicatedBlocks>>> | Current number of blocks under replicated -*-------------------------------------+--------------------------------------+ -|<<<CorruptBlocks>>> | Current number of blocks with corrupt replicas. 
-*-------------------------------------+--------------------------------------+ -|<<<ScheduledReplicationBlocks>>> | Current number of blocks scheduled for - | replication -*-------------------------------------+--------------------------------------+ -|<<<PendingDeletionBlocks>>> | Current number of blocks pending deletion -*-------------------------------------+--------------------------------------+ -|<<<ExcessBlocks>>> | Current number of excess blocks -*-------------------------------------+--------------------------------------+ -|<<<PostponedMisreplicatedBlocks>>> | (HA-only) Current number of blocks - | postponed to replicate -*-------------------------------------+--------------------------------------+ -|<<<PendingDataNodeMessageCount>>> | (HA-only) Current number of pending - | block-related messages for later - | processing in the standby NameNode -*-------------------------------------+--------------------------------------+ -|<<<MillisSinceLastLoadedEdits>>> | (HA-only) Time in milliseconds since the - | standby NameNode last loaded the edit log. - | Set to 0 on the active NameNode -*-------------------------------------+--------------------------------------+ -|<<<BlockCapacity>>> | Current block capacity -*-------------------------------------+--------------------------------------+ -|<<<StaleDataNodes>>> | Current number of DataNodes marked stale due to delayed - | heartbeat -*-------------------------------------+--------------------------------------+ -|<<<TotalFiles>>> |Current number of files and directories (same as FilesTotal) -*-------------------------------------+--------------------------------------+ - -* JournalNode - - The server-side metrics for a journal from the JournalNode's perspective. - Each metrics record contains Hostname tag as additional information - along with metrics.
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<Syncs60sNumOps>>> | Number of sync operations (1 minute granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs60s50thPercentileLatencyMicros>>> | The 50th percentile of sync -| | latency in microseconds (1 minute granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs60s75thPercentileLatencyMicros>>> | The 75th percentile of sync -| | latency in microseconds (1 minute granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs60s90thPercentileLatencyMicros>>> | The 90th percentile of sync -| | latency in microseconds (1 minute granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs60s95thPercentileLatencyMicros>>> | The 95th percentile of sync -| | latency in microseconds (1 minute granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs60s99thPercentileLatencyMicros>>> | The 99th percentile of sync -| | latency in microseconds (1 minute granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs300sNumOps>>> | Number of sync operations (5 minutes granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs300s50thPercentileLatencyMicros>>> | The 50th percentile of sync -| | latency in microseconds (5 minutes granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs300s75thPercentileLatencyMicros>>> | The 75th percentile of sync -| | latency in microseconds (5 minutes granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs300s90thPercentileLatencyMicros>>> | The 90th 
percentile of sync -| | latency in microseconds (5 minutes granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs300s95thPercentileLatencyMicros>>> | The 95th percentile of sync -| | latency in microseconds (5 minutes granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs300s99thPercentileLatencyMicros>>> | The 99th percentile of sync -| | latency in microseconds (5 minutes granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs3600sNumOps>>> | Number of sync operations (1 hour granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs3600s50thPercentileLatencyMicros>>> | The 50th percentile of sync -| | latency in microseconds (1 hour granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs3600s75thPercentileLatencyMicros>>> | The 75th percentile of sync -| | latency in microseconds (1 hour granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs3600s90thPercentileLatencyMicros>>> | The 90th percentile of sync -| | latency in microseconds (1 hour granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs3600s95thPercentileLatencyMicros>>> | The 95th percentile of sync -| | latency in microseconds (1 hour granularity) -*-------------------------------------+--------------------------------------+ -|<<<Syncs3600s99thPercentileLatencyMicros>>> | The 99th percentile of sync -| | latency in microseconds (1 hour granularity) -*-------------------------------------+--------------------------------------+ -|<<<BatchesWritten>>> | Total number of batches written since startup -*-------------------------------------+--------------------------------------+ -|<<<TxnsWritten>>> | Total number of transactions written since startup 
-*-------------------------------------+--------------------------------------+ -|<<<BytesWritten>>> | Total number of bytes written since startup -*-------------------------------------+--------------------------------------+ -|<<<BatchesWrittenWhileLagging>>> | Total number of batches written where this -| | node was lagging -*-------------------------------------+--------------------------------------+ -|<<<LastWriterEpoch>>> | Current writer's epoch number -*-------------------------------------+--------------------------------------+ -|<<<CurrentLagTxns>>> | The number of transactions that this JournalNode is -| | lagging -*-------------------------------------+--------------------------------------+ -|<<<LastWrittenTxId>>> | The highest transaction id stored on this JournalNode -*-------------------------------------+--------------------------------------+ -|<<<LastPromisedEpoch>>> | The last epoch number which this node has promised -| | not to accept any lower epoch, or 0 if no promises have been made -*-------------------------------------+--------------------------------------+ - -* datanode - - Each metrics record contains tags such as SessionId and Hostname - as additional information along with metrics. 
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<BytesWritten>>> | Total number of bytes written to DataNode -*-------------------------------------+--------------------------------------+ -|<<<BytesRead>>> | Total number of bytes read from DataNode -*-------------------------------------+--------------------------------------+ -|<<<BlocksWritten>>> | Total number of blocks written to DataNode -*-------------------------------------+--------------------------------------+ -|<<<BlocksRead>>> | Total number of blocks read from DataNode -*-------------------------------------+--------------------------------------+ -|<<<BlocksReplicated>>> | Total number of blocks replicated -*-------------------------------------+--------------------------------------+ -|<<<BlocksRemoved>>> | Total number of blocks removed -*-------------------------------------+--------------------------------------+ -|<<<BlocksVerified>>> | Total number of blocks verified -*-------------------------------------+--------------------------------------+ -|<<<BlockVerificationFailures>>> | Total number of verifications failures -*-------------------------------------+--------------------------------------+ -|<<<BlocksCached>>> | Total number of blocks cached -*-------------------------------------+--------------------------------------+ -|<<<BlocksUncached>>> | Total number of blocks uncached -*-------------------------------------+--------------------------------------+ -|<<<ReadsFromLocalClient>>> | Total number of read operations from local client -*-------------------------------------+--------------------------------------+ -|<<<ReadsFromRemoteClient>>> | Total number of read operations from remote - | client -*-------------------------------------+--------------------------------------+ -|<<<WritesFromLocalClient>>> | Total number of write operations from 
local - | client -*-------------------------------------+--------------------------------------+ -|<<<WritesFromRemoteClient>>> | Total number of write operations from remote - | client -*-------------------------------------+--------------------------------------+ -|<<<BlocksGetLocalPathInfo>>> | Total number of operations to get local path - | names of blocks -*-------------------------------------+--------------------------------------+ -|<<<FsyncCount>>> | Total number of fsync -*-------------------------------------+--------------------------------------+ -|<<<VolumeFailures>>> | Total number of volume failures occurred -*-------------------------------------+--------------------------------------+ -|<<<ReadBlockOpNumOps>>> | Total number of read operations -*-------------------------------------+--------------------------------------+ -|<<<ReadBlockOpAvgTime>>> | Average time of read operations in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<WriteBlockOpNumOps>>> | Total number of write operations -*-------------------------------------+--------------------------------------+ -|<<<WriteBlockOpAvgTime>>> | Average time of write operations in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<BlockChecksumOpNumOps>>> | Total number of blockChecksum operations -*-------------------------------------+--------------------------------------+ -|<<<BlockChecksumOpAvgTime>>> | Average time of blockChecksum operations in - | milliseconds -*-------------------------------------+--------------------------------------+ -|<<<CopyBlockOpNumOps>>> | Total number of block copy operations -*-------------------------------------+--------------------------------------+ -|<<<CopyBlockOpAvgTime>>> | Average time of block copy operations in - | milliseconds -*-------------------------------------+--------------------------------------+ -|<<<ReplaceBlockOpNumOps>>> | Total number of 
block replace operations -*-------------------------------------+--------------------------------------+ -|<<<ReplaceBlockOpAvgTime>>> | Average time of block replace operations in - | milliseconds -*-------------------------------------+--------------------------------------+ -|<<<HeartbeatsNumOps>>> | Total number of heartbeats -*-------------------------------------+--------------------------------------+ -|<<<HeartbeatsAvgTime>>> | Average heartbeat time in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<BlockReportsNumOps>>> | Total number of block report operations -*-------------------------------------+--------------------------------------+ -|<<<BlockReportsAvgTime>>> | Average time of block report operations in - | milliseconds -*-------------------------------------+--------------------------------------+ -|<<<CacheReportsNumOps>>> | Total number of cache report operations -*-------------------------------------+--------------------------------------+ -|<<<CacheReportsAvgTime>>> | Average time of cache report operations in - | milliseconds -*-------------------------------------+--------------------------------------+ -|<<<PacketAckRoundTripTimeNanosNumOps>>> | Total number of ack round trip -*-------------------------------------+--------------------------------------+ -|<<<PacketAckRoundTripTimeNanosAvgTime>>> | Average time from ack send to -| | receive minus the downstream ack time in nanoseconds -*-------------------------------------+--------------------------------------+ -|<<<FlushNanosNumOps>>> | Total number of flushes -*-------------------------------------+--------------------------------------+ -|<<<FlushNanosAvgTime>>> | Average flush time in nanoseconds -*-------------------------------------+--------------------------------------+ -|<<<FsyncNanosNumOps>>> | Total number of fsync -*-------------------------------------+--------------------------------------+ -|<<<FsyncNanosAvgTime>>> | 
Average fsync time in nanoseconds -*-------------------------------------+--------------------------------------+ -|<<<SendDataPacketBlockedOnNetworkNanosNumOps>>> | Total number of sending - | packets -*-------------------------------------+--------------------------------------+ -|<<<SendDataPacketBlockedOnNetworkNanosAvgTime>>> | Average waiting time of -| | sending packets in nanoseconds -*-------------------------------------+--------------------------------------+ -|<<<SendDataPacketTransferNanosNumOps>>> | Total number of sending packets -*-------------------------------------+--------------------------------------+ -|<<<SendDataPacketTransferNanosAvgTime>>> | Average transfer time of sending - | packets in nanoseconds -*-------------------------------------+--------------------------------------+ -|<<<TotalWriteTime>>> | Total number of milliseconds spent on write - | operation -*-------------------------------------+--------------------------------------+ -|<<<TotalReadTime>>> | Total number of milliseconds spent on read - | operation -*-------------------------------------+--------------------------------------+ -|<<<RemoteBytesRead>>> | Number of bytes read by remote clients -*-------------------------------------+--------------------------------------+ -|<<<RemoteBytesWritten>>> | Number of bytes written by remote clients -*-------------------------------------+--------------------------------------+ - - -yarn context - -* ClusterMetrics - - ClusterMetrics shows the metrics of the YARN cluster from the - ResourceManager's perspective. Each metrics record contains - Hostname tag as additional information along with metrics. 
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<NumActiveNMs>>> | Current number of active NodeManagers -*-------------------------------------+--------------------------------------+ -|<<<NumDecommissionedNMs>>> | Current number of decommissioned NodeManagers -*-------------------------------------+--------------------------------------+ -|<<<NumLostNMs>>> | Current number of lost NodeManagers for not sending - | heartbeats -*-------------------------------------+--------------------------------------+ -|<<<NumUnhealthyNMs>>> | Current number of unhealthy NodeManagers -*-------------------------------------+--------------------------------------+ -|<<<NumRebootedNMs>>> | Current number of rebooted NodeManagers -*-------------------------------------+--------------------------------------+ - -* QueueMetrics - - QueueMetrics shows an application queue from the - ResourceManager's perspective. Each metrics record shows - the statistics of each queue, and contains tags such as - queue name and Hostname as additional information along with metrics. - - For the <<<running_>>><num> metrics, such as <<<running_0>>>, you can set the - property <<<yarn.resourcemanager.metrics.runtime.buckets>>> in yarn-site.xml - to change the buckets. The default value is <<<60,300,1440>>>.
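As a rough illustration of how the bucket boundaries relate to the <<<running_>>><num> metric names, the sketch below derives the names from the default value <<<60,300,1440>>> and picks the bucket for an elapsed time given in minutes. The helper names <<<bucket_names>>> and <<<bucket_for>>> are hypothetical, and treating each lower bound as inclusive is an assumption; this is not the ResourceManager's implementation.

```python
from bisect import bisect_right


def bucket_names(buckets="60,300,1440"):
    """Return the running_<num> metric names implied by a buckets string."""
    bounds = [int(b) for b in buckets.split(",")]
    # The first bucket always starts at 0 minutes.
    return ["running_%d" % b for b in [0] + bounds]


def bucket_for(elapsed_minutes, buckets="60,300,1440"):
    """Pick the metric name whose range contains elapsed_minutes.

    Assumption: a value equal to a boundary falls into the higher bucket.
    """
    bounds = [int(b) for b in buckets.split(",")]
    names = bucket_names(buckets)
    return names[bisect_right(bounds, elapsed_minutes)]


if __name__ == "__main__":
    print(bucket_names())
    print(bucket_for(45), bucket_for(120), bucket_for(2000))
```

Changing the buckets string changes both the number of metrics and their names, so any dashboard keyed on these names would need to follow the configured value.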
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<running_0>>> | Current number of running applications whose elapsed time is - | less than 60 minutes -*-------------------------------------+--------------------------------------+ -|<<<running_60>>> | Current number of running applications whose elapsed time is - | between 60 and 300 minutes -*-------------------------------------+--------------------------------------+ -|<<<running_300>>> | Current number of running applications whose elapsed time is - | between 300 and 1440 minutes -*-------------------------------------+--------------------------------------+ -|<<<running_1440>>> | Current number of running applications whose elapsed time is - | more than 1440 minutes -*-------------------------------------+--------------------------------------+ -|<<<AppsSubmitted>>> | Total number of submitted applications -*-------------------------------------+--------------------------------------+ -|<<<AppsRunning>>> | Current number of running applications -*-------------------------------------+--------------------------------------+ -|<<<AppsPending>>> | Current number of applications that have not yet been - | assigned any containers -*-------------------------------------+--------------------------------------+ -|<<<AppsCompleted>>> | Total number of completed applications -*-------------------------------------+--------------------------------------+ -|<<<AppsKilled>>> | Total number of killed applications -*-------------------------------------+--------------------------------------+ -|<<<AppsFailed>>> | Total number of failed applications -*-------------------------------------+--------------------------------------+ -|<<<AllocatedMB>>> | Current allocated memory in MB -*-------------------------------------+--------------------------------------+ -|<<<AllocatedVCores>>> |
Current allocated CPU in virtual cores -*-------------------------------------+--------------------------------------+ -|<<<AllocatedContainers>>> | Current number of allocated containers -*-------------------------------------+--------------------------------------+ -|<<<AggregateContainersAllocated>>> | Total number of allocated containers -*-------------------------------------+--------------------------------------+ -|<<<AggregateContainersReleased>>> | Total number of released containers -*-------------------------------------+--------------------------------------+ -|<<<AvailableMB>>> | Current available memory in MB -*-------------------------------------+--------------------------------------+ -|<<<AvailableVCores>>> | Current available CPU in virtual cores -*-------------------------------------+--------------------------------------+ -|<<<PendingMB>>> | Current pending memory resource requests in MB that are - | not yet fulfilled by the scheduler -*-------------------------------------+--------------------------------------+ -|<<<PendingVCores>>> | Current pending CPU allocation requests in virtual - | cores that are not yet fulfilled by the scheduler -*-------------------------------------+--------------------------------------+ -|<<<PendingContainers>>> | Current pending resource requests that are not - | yet fulfilled by the scheduler -*-------------------------------------+--------------------------------------+ -|<<<ReservedMB>>> | Current reserved memory in MB -*-------------------------------------+--------------------------------------+ -|<<<ReservedVCores>>> | Current reserved CPU in virtual cores -*-------------------------------------+--------------------------------------+ -|<<<ReservedContainers>>> | Current number of reserved containers -*-------------------------------------+--------------------------------------+ -|<<<ActiveUsers>>> | Current number of active users 
-*-------------------------------------+--------------------------------------+ -|<<<ActiveApplications>>> | Current number of active applications -*-------------------------------------+--------------------------------------+ -|<<<FairShareMB>>> | (FairScheduler only) Current fair share of memory in MB -*-------------------------------------+--------------------------------------+ -|<<<FairShareVCores>>> | (FairScheduler only) Current fair share of CPU in - | virtual cores -*-------------------------------------+--------------------------------------+ -|<<<MinShareMB>>> | (FairScheduler only) Minimum share of memory in MB -*-------------------------------------+--------------------------------------+ -|<<<MinShareVCores>>> | (FairScheduler only) Minimum share of CPU in virtual - | cores -*-------------------------------------+--------------------------------------+ -|<<<MaxShareMB>>> | (FairScheduler only) Maximum share of memory in MB -*-------------------------------------+--------------------------------------+ -|<<<MaxShareVCores>>> | (FairScheduler only) Maximum share of CPU in virtual - | cores -*-------------------------------------+--------------------------------------+ - -* NodeManagerMetrics - - NodeManagerMetrics shows the statistics of the containers in the node. - Each metrics record contains Hostname tag as additional information - along with metrics. 
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<containersLaunched>>> | Total number of launched containers -*-------------------------------------+--------------------------------------+ -|<<<containersCompleted>>> | Total number of successfully completed containers -*-------------------------------------+--------------------------------------+ -|<<<containersFailed>>> | Total number of failed containers -*-------------------------------------+--------------------------------------+ -|<<<containersKilled>>> | Total number of killed containers -*-------------------------------------+--------------------------------------+ -|<<<containersIniting>>> | Current number of initializing containers -*-------------------------------------+--------------------------------------+ -|<<<containersRunning>>> | Current number of running containers -*-------------------------------------+--------------------------------------+ -|<<<allocatedContainers>>> | Current number of allocated containers -*-------------------------------------+--------------------------------------+ -|<<<allocatedGB>>> | Current allocated memory in GB -*-------------------------------------+--------------------------------------+ -|<<<availableGB>>> | Current available memory in GB -*-------------------------------------+--------------------------------------+ - -ugi context - -* UgiMetrics - - UgiMetrics is related to user and group information. - Each metrics record contains Hostname tag as additional information - along with metrics. 
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<LoginSuccessNumOps>>> | Total number of successful kerberos logins -*-------------------------------------+--------------------------------------+ -|<<<LoginSuccessAvgTime>>> | Average time for successful kerberos logins in - | milliseconds -*-------------------------------------+--------------------------------------+ -|<<<LoginFailureNumOps>>> | Total number of failed kerberos logins -*-------------------------------------+--------------------------------------+ -|<<<LoginFailureAvgTime>>> | Average time for failed kerberos logins in - | milliseconds -*-------------------------------------+--------------------------------------+ -|<<<getGroupsNumOps>>> | Total number of group resolutions -*-------------------------------------+--------------------------------------+ -|<<<getGroupsAvgTime>>> | Average time for group resolution in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<getGroups>>><num><<<sNumOps>>> | -| | Total number of group resolutions (<num> seconds granularity). <num> is -| | specified by <<<hadoop.user.group.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<getGroups>>><num><<<s50thPercentileLatency>>> | -| | Shows the 50th percentile of group resolution time in milliseconds -| | (<num> seconds granularity). <num> is specified by -| | <<<hadoop.user.group.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<getGroups>>><num><<<s75thPercentileLatency>>> | -| | Shows the 75th percentile of group resolution time in milliseconds -| | (<num> seconds granularity). <num> is specified by -| | <<<hadoop.user.group.metrics.percentiles.intervals>>>. 
-*-------------------------------------+--------------------------------------+ -|<<<getGroups>>><num><<<s90thPercentileLatency>>> | -| | Shows the 90th percentile of group resolution time in milliseconds -| | (<num> seconds granularity). <num> is specified by -| | <<<hadoop.user.group.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<getGroups>>><num><<<s95thPercentileLatency>>> | -| | Shows the 95th percentile of group resolution time in milliseconds -| | (<num> seconds granularity). <num> is specified by -| | <<<hadoop.user.group.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ -|<<<getGroups>>><num><<<s99thPercentileLatency>>> | -| | Shows the 99th percentile of group resolution time in milliseconds -| | (<num> seconds granularity). <num> is specified by -| | <<<hadoop.user.group.metrics.percentiles.intervals>>>. -*-------------------------------------+--------------------------------------+ - -metricssystem context - -* MetricsSystem - - MetricsSystem shows the statistics for metrics snapshots and publishes. - Each metrics record contains Hostname tag as additional information - along with metrics. 
- -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<NumActiveSources>>> | Current number of active metrics sources -*-------------------------------------+--------------------------------------+ -|<<<NumAllSources>>> | Total number of metrics sources -*-------------------------------------+--------------------------------------+ -|<<<NumActiveSinks>>> | Current number of active sinks -*-------------------------------------+--------------------------------------+ -|<<<NumAllSinks>>> | Total number of sinks \ - | (BUT usually less than <<<NumActiveSinks>>>, - | see {{{https://issues.apache.org/jira/browse/HADOOP-9946}HADOOP-9946}}) -*-------------------------------------+--------------------------------------+ -|<<<SnapshotNumOps>>> | Total number of operations to snapshot statistics from - | a metrics source -*-------------------------------------+--------------------------------------+ -|<<<SnapshotAvgTime>>> | Average time in milliseconds to snapshot statistics - | from a metrics source -*-------------------------------------+--------------------------------------+ -|<<<PublishNumOps>>> | Total number of operations to publish statistics to a - | sink -*-------------------------------------+--------------------------------------+ -|<<<PublishAvgTime>>> | Average time in milliseconds to publish statistics to - | a sink -*-------------------------------------+--------------------------------------+ -|<<<DroppedPubAll>>> | Total number of dropped publishes -*-------------------------------------+--------------------------------------+ -|<<<Sink_>>><instance><<<NumOps>>> | Total number of sink operations for the - | <instance> -*-------------------------------------+--------------------------------------+ -|<<<Sink_>>><instance><<<AvgTime>>> | Average time in milliseconds of sink - | operations for the <instance> 
-*-------------------------------------+--------------------------------------+ -|<<<Sink_>>><instance><<<Dropped>>> | Total number of dropped sink operations - | for the <instance> -*-------------------------------------+--------------------------------------+ -|<<<Sink_>>><instance><<<Qsize>>> | Current queue length of the sink -*-------------------------------------+--------------------------------------+ - -default context - -* StartupProgress - - StartupProgress metrics shows the statistics of NameNode startup. - Four metrics are exposed for each startup phase based on its name. - The startup <phase>s are <<<LoadingFsImage>>>, <<<LoadingEdits>>>, - <<<SavingCheckpoint>>>, and <<<SafeMode>>>. - Each metrics record contains Hostname tag as additional information - along with metrics. - -*-------------------------------------+--------------------------------------+ -|| Name || Description -*-------------------------------------+--------------------------------------+ -|<<<ElapsedTime>>> | Total elapsed time in milliseconds -*-------------------------------------+--------------------------------------+ -|<<<PercentComplete>>> | Current rate completed in NameNode startup progress \ - | (The max value is not 100 but 1.0) -*-------------------------------------+--------------------------------------+ -|<phase><<<Count>>> | Total number of steps completed in the phase -*-------------------------------------+--------------------------------------+ -|<phase><<<ElapsedTime>>> | Total elapsed time in the phase in milliseconds -*-------------------------------------+--------------------------------------+ -|<phase><<<Total>>> | Total number of steps in the phase -*-------------------------------------+--------------------------------------+ -|<phase><<<PercentComplete>>> | Current rate completed in the phase \ - | (The max value is not 100 but 1.0) -*-------------------------------------+--------------------------------------+ 
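The naming scheme in the StartupProgress table can be sketched as follows: each of the four phases contributes four metrics, and <<<PercentComplete>>> is a fraction whose maximum is 1.0 rather than 100. The helper names <<<phase_metric_names>>> and <<<as_percent>>> are hypothetical and exist only for illustration; the phase names and suffixes are taken from the table above.

```python
# Startup phases and per-phase metric suffixes as listed in the table.
PHASES = ["LoadingFsImage", "LoadingEdits", "SavingCheckpoint", "SafeMode"]
SUFFIXES = ["Count", "ElapsedTime", "Total", "PercentComplete"]


def phase_metric_names():
    """Return every <phase><suffix> metric name, e.g. LoadingEditsCount."""
    return [phase + suffix for phase in PHASES for suffix in SUFFIXES]


def as_percent(percent_complete):
    """Convert a PercentComplete value (0.0 to 1.0) to a 0 to 100 scale."""
    return percent_complete * 100.0


if __name__ == "__main__":
    for name in phase_metric_names():
        print(name)
    print(as_percent(0.25))
```

A monitoring rule that expects a 0 to 100 scale would need a conversion like <<<as_percent>>> before comparing against a threshold.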
http://git-wip-us.apache.org/repos/asf/hadoop/blob/343cffb0/hadoop-common-project/hadoop-common/src/site/apt/NativeLibraries.apt.vm ---------------------------------------------------------------------- diff --git a/hadoop-common-project/hadoop-common/src/site/apt/NativeLibraries.apt.vm b/hadoop-common-project/hadoop-common/src/site/apt/NativeLibraries.apt.vm deleted file mode 100644 index 866b428..0000000 --- a/hadoop-common-project/hadoop-common/src/site/apt/NativeLibraries.apt.vm +++ /dev/null @@ -1,205 +0,0 @@ -~~ Licensed under the Apache License, Version 2.0 (the "License"); -~~ you may not use this file except in compliance with the License. -~~ You may obtain a copy of the License at -~~ -~~ http://www.apache.org/licenses/LICENSE-2.0 -~~ -~~ Unless required by applicable law or agreed to in writing, software -~~ distributed under the License is distributed on an "AS IS" BASIS, -~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -~~ See the License for the specific language governing permissions and -~~ limitations under the License. See accompanying LICENSE file. - - --- - Native Libraries Guide - --- - --- - ${maven.build.timestamp} - -Native Libraries Guide - -%{toc|section=1|fromDepth=0} - -* Overview - - This guide describes the native hadoop library and includes a small - discussion about native shared libraries. - - Note: Depending on your environment, the term "native libraries" could - refer to all *.so's you need to compile; and, the term "native - compression" could refer to all *.so's you need to compile that are - specifically related to compression. Currently, however, this document - only addresses the native hadoop library (<<<libhadoop.so>>>). - The document for libhdfs library (<<<libhdfs.so>>>) is - {{{../hadoop-hdfs/LibHdfs.html}here}}. - -* Native Hadoop Library - - Hadoop has native implementations of certain components for performance - reasons and for non-availability of Java implementations. 
These - components are available in a single, dynamically-linked native library - called the native hadoop library. On the *nix platforms the library is - named <<<libhadoop.so>>>. - -* Usage - - It is fairly easy to use the native hadoop library: - - [[1]] Review the components. - - [[2]] Review the supported platforms. - - [[3]] Either download a hadoop release, which will include a pre-built - version of the native hadoop library, or build your own version of - the native hadoop library. Whether you download or build, the name - for the library is the same: libhadoop.so - - [[4]] Install the compression codec development packages (>zlib-1.2, - >gzip-1.2): - - * If you download the library, install one or more development - packages - whichever compression codecs you want to use with - your deployment. - - * If you build the library, it is mandatory to install both - development packages. - - [[5]] Check the runtime log files. - -* Components - - The native hadoop library includes various components: - - * Compression Codecs (bzip2, lz4, snappy, zlib) - - * Native IO utilities for {{{../hadoop-hdfs/ShortCircuitLocalReads.html} - HDFS Short-Circuit Local Reads}} and - {{{../hadoop-hdfs/CentralizedCacheManagement.html}Centralized Cache - Management in HDFS}} - - * CRC32 checksum implementation - -* Supported Platforms - - The native hadoop library is supported on *nix platforms only. The - library does not work with Cygwin or the Mac OS X platform. - - The native hadoop library is mainly used on the GNU/Linux platform and - has been tested on these distributions: - - * RHEL4/Fedora - - * Ubuntu - - * Gentoo - - On all the above distributions a 32/64 bit native hadoop library will - work with a respective 32/64 bit jvm. - -* Download - - The pre-built 32-bit i386-Linux native hadoop library is available as - part of the hadoop distribution and is located in the <<<lib/native>>> - directory. You can download the hadoop distribution from Hadoop Common - Releases.
- - Be sure to install the zlib and/or gzip development packages - - whichever compression codecs you want to use with your deployment. - -* Build - - The native hadoop library is written in ANSI C and is built using the - GNU autotools-chain (autoconf, autoheader, automake, autoscan, - libtool). This means it should be straight-forward to build the library - on any platform with a standards-compliant C compiler and the GNU - autotools-chain (see the supported platforms). - - The packages you need to install on the target platform are: - - * C compiler (e.g. GNU C Compiler) - - * GNU Autotools Chain: autoconf, automake, libtool - - * zlib-development package (stable version >= 1.2.0) - - * openssl-development package (e.g. libssl-dev) - - Once you have installed the prerequisite packages, use the standard hadoop - pom.xml file and pass along the native flag to build the native hadoop - library: - ----- - $ mvn package -Pdist,native -DskipTests -Dtar ----- - - You should see the newly-built library in: - ----- - $ hadoop-dist/target/hadoop-${project.version}/lib/native ----- - - Please note the following: - - * It is mandatory to install both the zlib and gzip development - packages on the target platform in order to build the native hadoop - library; however, for deployment it is sufficient to install just - one package if you wish to use only one codec. - - * It is necessary to have the correct 32/64 libraries for zlib, - depending on the 32/64 bit jvm for the target platform, in order to - build and deploy the native hadoop library. - -* Runtime - - The bin/hadoop script ensures that the native hadoop library is on the - library path via the system property: - <<<-Djava.library.path=<path> >>> - - During runtime, check the hadoop log files for your MapReduce tasks.
- - * If everything is all right, then: - <<<DEBUG util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...>>> - <<<INFO util.NativeCodeLoader - Loaded the native-hadoop library>>> - - * If something goes wrong, then: - <<<INFO util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable>>> - -* Check - - NativeLibraryChecker is a tool to check whether native libraries are loaded correctly. - You can launch NativeLibraryChecker as follows: - ----- - $ hadoop checknative -a - 14/12/06 01:30:45 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version - 14/12/06 01:30:45 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library - Native library checking: - hadoop: true /home/ozawa/hadoop/lib/native/libhadoop.so.1.0.0 - zlib: true /lib/x86_64-linux-gnu/libz.so.1 - snappy: true /usr/lib/libsnappy.so.1 - lz4: true revision:99 - bzip2: false ----- - - -* Native Shared Libraries - - You can load any native shared library using DistributedCache for - distributing and symlinking the library files. - - This example shows you how to distribute a shared library, mylib.so, - and load it from a MapReduce task. - - [[1]] First copy the library to the HDFS: - <<<bin/hadoop fs -copyFromLocal mylib.so.1 /libraries/mylib.so.1>>> - - [[2]] The job launching program should contain the following: - <<<DistributedCache.createSymlink(conf);>>> - <<<DistributedCache.addCacheFile("hdfs://host:port/libraries/mylib.so. 1#mylib.so", conf);>>> - - [[3]] The MapReduce task can contain: - <<<System.loadLibrary("mylib.so");>>> - - Note: If you downloaded or built the native hadoop library, you don't - need to use DistributedCache to make the library available to your - MapReduce tasks.
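As a rough illustration of the Runtime behavior described above, the following self-contained Java sketch tries to load a library by name from <<<java.library.path>>> and reports a fallback when the load fails. It is not Hadoop's own NativeCodeLoader code: the class name NativeLoadCheck and the method tryLoad are hypothetical, and only System.loadLibrary, System.mapLibraryName, and the java.library.path system property come from the standard JDK.

```java
// Hypothetical helper for illustration only; not part of Hadoop.
class NativeLoadCheck {

    /**
     * Tries to load a native library by its bare name (for example "hadoop").
     * Returns true when the JVM finds and links it, false otherwise.
     */
    static boolean tryLoad(String name) {
        try {
            System.loadLibrary(name);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // The library is missing from java.library.path, was built for
            // another architecture, or loading is not permitted.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to "hadoop"; any other bare library name can be passed in.
        String name = args.length > 0 ? args[0] : "hadoop";

        // Directories the JVM searches, as set by -Djava.library.path=<path>.
        System.out.println("java.library.path = "
                + System.getProperty("java.library.path"));

        // Platform-specific file name the JVM looks for.
        System.out.println("file name searched = " + System.mapLibraryName(name));

        if (tryLoad(name)) {
            System.out.println("Loaded native library: " + name);
        } else {
            System.out.println("Unable to load " + name
                    + "; a caller would fall back to a pure-Java implementation");
        }
    }
}
```

Running it with <<<-Djava.library.path>>> pointing at a <<<lib/native>>> directory shows whether the JVM can resolve the library there. System.loadLibrary expects the bare library name and derives the platform file name itself, which is why the sketch prints the result of System.mapLibraryName.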
