This is an automated email from the ASF dual-hosted git repository.
srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 6a6075a [SPARK-27157][DOCS] Add Executor level metrics to monitoring
docs
6a6075a is described below
commit 6a6075ac96279e2d3c3bb5e11e292bf21572c5ce
Author: Lantao Jin <[email protected]>
AuthorDate: Sat Mar 16 14:52:19 2019 -0500
[SPARK-27157][DOCS] Add Executor level metrics to monitoring docs
## What changes were proposed in this pull request?
A sub-task of
[SPARK-23206](https://issues.apache.org/jira/browse/SPARK-23206)
Add Executor level metrics to monitoring docs
## How was this patch tested?
jekyll
Closes #24090 from LantaoJin/SPARK-27157.
Authored-by: Lantao Jin <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
---
docs/monitoring.md | 143 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 143 insertions(+)
diff --git a/docs/monitoring.md b/docs/monitoring.md
index 72e4f47..036a575 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -609,7 +609,150 @@ A list of the available metrics, with a short description:
</tr>
</table>
+### Executor Metrics
+Executor-level metrics are sent from each executor to the driver as part of
the Heartbeat. They describe the performance of the executor itself, such as
JVM heap memory and GC information. Metrics `peakExecutorMetrics.*` are only
enabled if `spark.eventLog.logStageExecutorMetrics.enabled` is true.
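(Editorial sketch, not part of the commit.) The flag named above, and the `spark.eventLog.logStageExecutorProcessTreeMetrics.enabled` flag that the `ProcessTree*` rows of the table mention, are ordinary Spark configuration properties, so one way to set them is through `--conf` arguments to `spark-submit`. The helper name `executor_metrics_conf_args` is hypothetical and only assembles the argument list; it does not launch anything.

```python
from typing import List


def executor_metrics_conf_args(process_tree: bool = False) -> List[str]:
    """Build spark-submit --conf arguments for the executor metrics flags.

    The property names come from the documentation text; the helper itself
    is a hypothetical convenience, not a Spark API.
    """
    flags = {"spark.eventLog.logStageExecutorMetrics.enabled": "true"}
    if process_tree:
        # Only needed for the ProcessTree* metrics in the table.
        flags["spark.eventLog.logStageExecutorProcessTreeMetrics.enabled"] = "true"

    args: List[str] = []
    for key, value in flags.items():
        args += ["--conf", f"{key}={value}"]
    return args
```

The returned list can be spliced into a `spark-submit` command line, for example `["spark-submit", *executor_metrics_conf_args(True), "app.py"]`, where `app.py` is a placeholder.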
+A list of the available metrics, with a short description:
+
+<table class="table">
+ <tr><th>Executor Level Metric name</th>
+ <th>Short description</th>
+ </tr>
+ <tr>
+ <td>totalGCTime</td>
+ <td>Elapsed time the JVM spent in garbage collection summed in this
Executor.
+ The value is expressed in milliseconds.</td>
+ </tr>
+ <tr>
+ <td>totalInputBytes</td>
+ <td>Total input bytes summed in this Executor.</td>
+ </tr>
+ <tr>
+ <td>totalShuffleRead</td>
+ <td>Total shuffle read bytes summed in this Executor.</td>
+ </tr>
+ <tr>
+ <td>totalShuffleWrite</td>
+ <td>Total shuffle write bytes summed in this Executor.</td>
+ </tr>
+ <tr>
+ <td>maxMemory</td>
+ <td>Total amount of memory available for storage, in bytes.</td>
+ </tr>
+ <tr>
+ <td>memoryMetrics.*</td>
+ <td>Current value of memory metrics:</td>
+ </tr>
+ <tr>
+ <td> .usedOnHeapStorageMemory</td>
+ <td>On-heap memory currently used for storage, in bytes.</td>
+ </tr>
+ <tr>
+ <td> .usedOffHeapStorageMemory</td>
+ <td>Off-heap memory currently used for storage, in bytes.</td>
+ </tr>
+ <tr>
+ <td> .totalOnHeapStorageMemory</td>
+ <td>Total available on heap memory for storage, in bytes. This amount can
vary over time, depending on the MemoryManager implementation.</td>
+ </tr>
+ <tr>
+ <td> .totalOffHeapStorageMemory</td>
+ <td>Total available off heap memory for storage, in bytes. This amount can
vary over time, depending on the MemoryManager implementation.</td>
+ </tr>
+ <tr>
+ <td>peakMemoryMetrics.*</td>
+ <td>Peak value of memory (and GC) metrics:</td>
+ </tr>
+ <tr>
+ <td> .JVMHeapMemory</td>
+ <td>Peak memory usage of the heap that is used for object allocation.
+ The heap consists of one or more memory pools. The used and committed size
of the returned memory usage is the sum of those values of all heap memory
pools whereas the init and max size of the returned memory usage represents the
setting of the heap memory which may not be the sum of those of all heap memory
pools.
+ The amount of used memory in the returned memory usage is the amount of
memory occupied by both live objects and garbage objects that have not been
collected, if any.</td>
+ </tr>
+ <tr>
+ <td> .JVMOffHeapMemory</td>
+ <td>Peak memory usage of non-heap memory that is used by the Java virtual
machine. The non-heap memory consists of one or more memory pools. The used and
committed size of the returned memory usage is the sum of those values of all
non-heap memory pools whereas the init and max size of the returned memory
usage represents the setting of the non-heap memory which may not be the sum of
those of all non-heap memory pools.</td>
+ </tr>
+ <tr>
+ <td> .OnHeapExecutionMemory</td>
+ <td>Peak on heap execution memory in use, in bytes.</td>
+ </tr>
+ <tr>
+ <td> .OffHeapExecutionMemory</td>
+ <td>Peak off heap execution memory in use, in bytes.</td>
+ </tr>
+ <tr>
+ <td> .OnHeapStorageMemory</td>
+ <td>Peak on heap storage memory in use, in bytes.</td>
+ </tr>
+ <tr>
+ <td> .OffHeapStorageMemory</td>
+ <td>Peak off heap storage memory in use, in bytes.</td>
+ </tr>
+ <tr>
+ <td> .OnHeapUnifiedMemory</td>
+ <td>Peak on heap memory (execution and storage).</td>
+ </tr>
+ <tr>
+ <td> .OffHeapUnifiedMemory</td>
+ <td>Peak off heap memory (execution and storage).</td>
+ </tr>
+ <tr>
+ <td> .DirectPoolMemory</td>
+ <td>Peak memory that the JVM is using for the direct buffer pool
(<code>java.lang.management.BufferPoolMXBean</code>).</td>
+ </tr>
+ <tr>
+ <td> .MappedPoolMemory</td>
+ <td>Peak memory that the JVM is using for the mapped buffer pool
(<code>java.lang.management.BufferPoolMXBean</code>).</td>
+ </tr>
+ <tr>
+ <td> .ProcessTreeJVMVMemory</td>
+ <td>Virtual memory size in bytes. Enabled if
spark.eventLog.logStageExecutorProcessTreeMetrics.enabled is true.</td>
+ </tr>
+ <tr>
+ <td> .ProcessTreeJVMRSSMemory</td>
+ <td>Resident Set Size: number of pages the process has
+ in real memory. This is just the pages which count
+ toward text, data, or stack space. This does not
+ include pages which have not been demand-loaded in,
+ or which are swapped out. Enabled if
spark.eventLog.logStageExecutorProcessTreeMetrics.enabled is true.</td>
+ </tr>
+ <tr>
+ <td> .ProcessTreePythonVMemory</td>
+ <td>Virtual memory size for Python in bytes. Enabled if
spark.eventLog.logStageExecutorProcessTreeMetrics.enabled is true.</td>
+ </tr>
+ <tr>
+ <td> .ProcessTreePythonRSSMemory</td>
+ <td>Resident Set Size for Python. Enabled if
spark.eventLog.logStageExecutorProcessTreeMetrics.enabled is true.</td>
+ </tr>
+ <tr>
+ <td> .ProcessTreeOtherVMemory</td>
+ <td>Virtual memory size for other kinds of processes, in bytes. Enabled if
spark.eventLog.logStageExecutorProcessTreeMetrics.enabled is true.</td>
+ </tr>
+ <tr>
+ <td> .ProcessTreeOtherRSSMemory</td>
+ <td>Resident Set Size for other kinds of processes. Enabled if
spark.eventLog.logStageExecutorProcessTreeMetrics.enabled is true.</td>
+ </tr>
+ <tr>
+ <td> .MinorGCCount</td>
+ <td>Total minor GC count. For example, the garbage collector is one of
Copy, PS Scavenge, ParNew, G1 Young Generation and so on.</td>
+ </tr>
+ <tr>
+ <td> .MinorGCTime</td>
+ <td>Elapsed total minor GC time.
+ The value is expressed in milliseconds.</td>
+ </tr>
+ <tr>
+ <td> .MajorGCCount</td>
+ <td>Total major GC count. For example, the garbage collector is one of
MarkSweepCompact, PS MarkSweep, ConcurrentMarkSweep, G1 Old Generation and so
on.</td>
+ </tr>
+ <tr>
+ <td> .MajorGCTime</td>
+ <td>Elapsed total major GC time.
+ The value is expressed in milliseconds.</td>
+ </tr>
+</table>
+The computation of RSS and Vmem is based on
[proc(5)](http://man7.org/linux/man-pages/man5/proc.5.html).
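(Editorial sketch, not part of the commit.) To make the proc(5) reference concrete: `/proc/[pid]/stat` reports the virtual memory size in bytes as field 23 (`vsize`) and the resident set size in pages as field 24 (`rss`). The function below is a minimal illustration of reading those two fields. It is not Spark's implementation, the names `parse_proc_stat` and `current_process_memory` are hypothetical, and the 4096-byte default page size is an assumption that should be replaced by the system value.

```python
import os
from typing import Tuple


def parse_proc_stat(stat_line: str, page_size: int = 4096) -> Tuple[int, int]:
    """Return (virtual_memory_bytes, rss_bytes) from one /proc/[pid]/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')', so split on the LAST ')' and index the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field N lives at index N - 3.
    vsize_bytes = int(rest[20])  # field 23: vsize, already in bytes
    rss_pages = int(rest[21])    # field 24: rss, counted in pages
    return vsize_bytes, rss_pages * page_size


def current_process_memory() -> Tuple[int, int]:
    """Linux-only helper: read the calling process's own stat file."""
    with open("/proc/self/stat") as f:
        return parse_proc_stat(f.read(), os.sysconf("SC_PAGE_SIZE"))
```

Summing the results over a process and its children would give a process-tree total. How processes are grouped into the JVM, Python, and other categories is not specified in this text and is left out of the sketch.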
### API Versioning Policy