GitHub user rezasafi opened a pull request:

    https://github.com/apache/spark/pull/21916

    [SPARK-24958][WIP] Report executors' process tree total memory information 
to heartbeat signals

    This is work in progress for SPARK-24958 and this PR is opened on top of 
the PR for SPARK-23429:
    https://github.com/apache/spark/pull/21221/
    To view the changes that are only related to SPARK-24958 you can check the 
following view:
    https://github.com/rezasafi/spark/pull/1
    Spark executors' process tree total memory information can be really 
useful. Currently such information is not available. The goal of this PR is to 
compute this information for each executor, add it to the 
heartbeat signals, and compute the peaks at the driver.
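
As a rough illustration of what a "process tree total" means, the sketch below sums memory over a root process and all of its descendants. This message does not show how the PR itself gathers the numbers, so the `processes` table, the function name, and the sample pids are all hypothetical; the Python here is only an analogue, not Spark code.

```python
from collections import defaultdict


def process_tree_total_memory(root_pid, processes):
    """Sum memory over root_pid and all of its descendants.

    `processes` maps pid -> (parent_pid, rss_bytes). It is a hypothetical
    input, for example a table parsed from the OS process list.
    """
    # Index children by parent so the tree can be walked from the root.
    children = defaultdict(list)
    for pid, (ppid, _rss) in processes.items():
        children[ppid].append(pid)

    total = 0
    stack = [root_pid]
    seen = set()
    while stack:
        pid = stack.pop()
        # Skip unknown pids and guard against cycles in bad input.
        if pid in seen or pid not in processes:
            continue
        seen.add(pid)
        total += processes[pid][1]
        stack.extend(children[pid])
    return total


# Hypothetical snapshot: an executor process (pid 100) with a child (101)
# that started a helper (102); pid 200 is an unrelated process.
snapshot = {
    100: (1, 4_000),
    101: (100, 1_500),
    102: (101, 500),
    200: (1, 9_999),
}
total = process_tree_total_memory(100, snapshot)
```

Only the three processes rooted at pid 100 contribute to `total`; the unrelated pid 200 is left out because it is not reachable from the root.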
    
    This PR is tested by running the current unit tests and the ones that are 
added by the PR for SPARK-23429. I have also tested this on our internal 
cluster and have verified that it is working.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rezasafi/spark ptreememory

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21916.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21916
    
----
commit c8e8abedbdfec6e92b0c63e90f3c2c5755fd8978
Author: Edwina Lu <edlu@...>
Date:   2018-03-09T23:39:36Z

    SPARK-23429: Add executor memory metrics to heartbeat and expose in 
executors REST API
    
    Add new executor level memory metrics (JVM used memory, on/off heap 
execution memory, on/off heap storage
    memory), and expose via the executors REST API. This information will help 
provide insight into how executor
    and driver JVM memory is used, and for the different memory regions. It can 
be used to help determine good
    values for spark.executor.memory, spark.driver.memory, 
spark.memory.fraction, and spark.memory.storageFraction.
    
    Add an ExecutorMetrics class, with jvmUsedMemory, onHeapExecutionMemory, 
offHeapExecutionMemory,
    onHeapStorageMemory, and offHeapStorageMemory. The new ExecutorMetrics will 
be sent by executors to the
    driver as part of Heartbeat. A heartbeat will be added for the driver as 
well, to collect these metrics
    for the driver.
    
    Modify the EventLoggingListener to log ExecutorMetricsUpdate events if 
there is a new peak value for any
    of the memory metrics for an executor and stage. Only the ExecutorMetrics 
will be logged, and not the
    TaskMetrics, to minimize additional logging.
    
    Modify the AppStatusListener to record the peak values for each memory 
metric.
    
    Add the new memory metrics to the executors REST API.
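
The peak tracking described in this commit message can be sketched roughly as follows. This is a Python analogue under stated assumptions, not the Scala code in the PR: the `PeakTracker` class, its `update` method, and the sample values are hypothetical, and only the metric names come from the message above.

```python
# Metric names as listed in the commit message.
METRIC_NAMES = (
    "jvmUsedMemory",
    "onHeapExecutionMemory",
    "offHeapExecutionMemory",
    "onHeapStorageMemory",
    "offHeapStorageMemory",
)


class PeakTracker:
    """Hypothetical helper that keeps per-(executor, stage) metric peaks."""

    def __init__(self):
        # (executor_id, stage_id) -> {metric name -> peak value seen so far}
        self.peaks = {}

    def update(self, executor_id, stage_id, metrics):
        """Merge one heartbeat; return True if any metric hit a new peak."""
        peaks = self.peaks.setdefault((executor_id, stage_id), {})
        new_peak = False
        for name in METRIC_NAMES:
            value = metrics.get(name, 0)  # assumption: missing means 0
            if name not in peaks or value > peaks[name]:
                peaks[name] = value
                new_peak = True
        return new_peak


tracker = PeakTracker()
first = tracker.update("1", 0, {"jvmUsedMemory": 100, "onHeapStorageMemory": 40})
second = tracker.update("1", 0, {"jvmUsedMemory": 90, "onHeapStorageMemory": 40})
third = tracker.update("1", 0, {"jvmUsedMemory": 90, "onHeapStorageMemory": 55})
```

In this sketch the boolean returned by `update` stands in for the "new peak value" condition under which an update would be logged; a heartbeat that raises no peak returns False and would produce no log entry.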

commit 5d6ae1c34bf6618754e4b8b2e756a9a7b4bad987
Author: Edwina Lu <edlu@...>
Date:   2018-04-02T02:13:41Z

    modify MimaExcludes.scala to filter changes to 
SparkListenerExecutorMetricsUpdate

commit ad10d2814bbfbaf8c21fcbb1abe83ef7a8e9ffe7
Author: Edwina Lu <edlu@...>
Date:   2018-04-22T00:02:57Z

    Address code review comments, change event logging to stage end.

commit 10ed328bfcf160711e7619aac23472f97bf1c976
Author: Edwina Lu <edlu@...>
Date:   2018-05-15T00:24:22Z

    Add configuration parameter 
spark.eventLog.logExecutorMetricsUpdates.enabled to enable/disable executor 
metrics update logging.
    Code review comments.

commit 2d2036760a298c7434eb4816c1bf045c43713e6f
Author: Imran Rashid <irashid@...>
Date:   2018-05-23T19:37:26Z

    wip on enum based metrics

commit f904f1e0bc3fab90db7f7aa7cfcf71b9fb26e890
Author: Imran Rashid <irashid@...>
Date:   2018-05-23T20:50:26Z

    wip ... has both enum and non-enum version

commit c502ec4c7f55083356187c2906d24440d0168d2f
Author: Imran Rashid <irashid@...>
Date:   2018-05-23T21:23:44Z

    case objects, mostly complete

commit 7879e66eed22cfd4dff2367c0ee3138369243711
Author: edwinalu <edwina.lu@...>
Date:   2018-06-03T02:31:14Z

    Merge pull request #1 from squito/metric_enums
    
    Metric enums

commit 2662f6f9c6a7c34cea34b748f6735eb1625b73cb
Author: Edwina Lu <edlu@...>
Date:   2018-06-10T21:34:19Z

    Address comments (move heartbeater from DAGScheduler to SparkContext, move 
logic for getting
    metrics to Heartbeater), and modify tests for the new ExecutorMetrics 
format.

commit 287133597f819417f96ae5965895c1b640703d86
Author: Edwina Lu <edlu@...>
Date:   2018-03-09T23:39:36Z

    SPARK-23429: Add executor memory metrics to heartbeat and expose in 
executors REST API
    
    Add new executor level memory metrics (JVM used memory, on/off heap 
execution memory, on/off heap storage
    memory), and expose via the executors REST API. This information will help 
provide insight into how executor
    and driver JVM memory is used, and for the different memory regions. It can 
be used to help determine good
    values for spark.executor.memory, spark.driver.memory, 
spark.memory.fraction, and spark.memory.storageFraction.
    
    Add an ExecutorMetrics class, with jvmUsedMemory, onHeapExecutionMemory, 
offHeapExecutionMemory,
    onHeapStorageMemory, and offHeapStorageMemory. The new ExecutorMetrics will 
be sent by executors to the
    driver as part of Heartbeat. A heartbeat will be added for the driver as 
well, to collect these metrics
    for the driver.
    
    Modify the EventLoggingListener to log ExecutorMetricsUpdate events if 
there is a new peak value for any
    of the memory metrics for an executor and stage. Only the ExecutorMetrics 
will be logged, and not the
    TaskMetrics, to minimize additional logging.
    
    Modify the AppStatusListener to record the peak values for each memory 
metric.
    
    Add the new memory metrics to the executors REST API.

commit da83f2e58ff7d495111a0c1f36bf54ebcf35d444
Author: Edwina Lu <edlu@...>
Date:   2018-04-02T02:13:41Z

    modify MimaExcludes.scala to filter changes to 
SparkListenerExecutorMetricsUpdate

commit f25a44b95e4e6a8532c6541ee985789dff5bc7de
Author: Edwina Lu <edlu@...>
Date:   2018-04-22T00:02:57Z

    Address code review comments, change event logging to stage end.

commit ca85c8219f46e3265b8191e82a4017c2cb97fc49
Author: Edwina Lu <edlu@...>
Date:   2018-05-15T00:24:22Z

    Add configuration parameter 
spark.eventLog.logExecutorMetricsUpdates.enabled to enable/disable executor 
metrics update logging.
    Code review comments.

commit 8b74ba8fff21b499e7cc9d93f9864831aa29773e
Author: Imran Rashid <irashid@...>
Date:   2018-05-23T19:37:26Z

    wip on enum based metrics

commit 036148cdbe60b7ad7ff318260580896ad0da6cd0
Author: Imran Rashid <irashid@...>
Date:   2018-05-23T20:50:26Z

    wip ... has both enum and non-enum version

commit 91fb1db09504fc4386477ab51221d28240c3c901
Author: Imran Rashid <irashid@...>
Date:   2018-05-23T21:23:44Z

    case objects, mostly complete

commit 2d8894a91f4a0dacd49114dc74cc97b7c9426879
Author: Edwina Lu <edlu@...>
Date:   2018-06-10T21:34:19Z

    Address comments (move heartbeater from DAGScheduler to SparkContext, move 
logic for getting
    metrics to Heartbeater), and modify tests for the new ExecutorMetrics 
format.

commit 99044e6ec0cdc1b760c57dd5b7e74349384c6a98
Author: Edwina Lu <edlu@...>
Date:   2018-06-14T00:15:00Z

    Merge branch 'SPARK-23429.2' of https://github.com/edwinalu/spark into 
SPARK-23429.2

commit 263c8c846265b6bdfdce471e44c163ab85b930a3
Author: Edwina Lu <edlu@...>
Date:   2018-06-14T23:52:11Z

    code review comments

commit 812fdcf3961bae2a4fa20b4f60e739b45233fcd0
Author: Edwina Lu <edlu@...>
Date:   2018-06-22T23:53:23Z

    code review comments:
    - remove timestamp
    - change ExecutorMetrics to Array[Long]
    - create new SparkListenerStageExecutorMetrics for recording stage executor 
metric peaks in
      the history log
    
    Fix issue where metrics for a removed executor were ignored (save dead 
executors while
    there are currently active stages that the executor was alive for).

commit 7ed42a5d0eb0b93bb9ddecf14d9461c80dfe1ea0
Author: Edwina Lu <edlu@...>
Date:   2018-06-28T18:41:58Z

    Address code review comments. Also make executorUpdates in 
SparkListenerExecutorMetricsUpdate
    not optional. These are no longer logged, and backward compatibility should 
not be an issue.
    These events should only be used to send task and executor updates for 
heartbeats, and
    executors and driver should be the same Spark version.

commit 8d9acdf32984c0c9c621a058b45805872bb9e4c5
Author: Edwina Lu <edlu@...>
Date:   2018-06-29T23:27:51Z

    Revert and make executorUpdates in SparkListenerExecutorMetricsUpdate 
optional again, in
    case of existing users of SparkListenerExecutorMetricsUpdate.

commit 20799d2af7b70334534be913f7defea6d6b79ffb
Author: Edwina Lu <edlu@...>
Date:   2018-07-25T18:02:45Z

    code review comments: hide array implementation of executor metrics, and add 
ExecutorMetrics, with getMetricValue()
    method for accessing executor metric values. Rename MetricGetter to 
ExecutorMetricType.
    
    Should ExecutorMetricType be moved to executor package, or ExecutorMetrics 
be moved to metrics package?
    Should Json (de)serialization functions be moved from api.scala to 
ExecutorMetrics?
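
The idea of hiding the array behind an accessor can be sketched as below. This is a Python analogue, not the Scala class from the PR: the constructor signature, the `get_metric_value` spelling, and the ordering of `EXECUTOR_METRIC_TYPES` are assumptions made for illustration.

```python
# Assumed ordering of metric types; the array index of a value is the
# position of its metric type in this tuple.
EXECUTOR_METRIC_TYPES = (
    "jvmUsedMemory",
    "onHeapExecutionMemory",
    "offHeapExecutionMemory",
    "onHeapStorageMemory",
    "offHeapStorageMemory",
)


class ExecutorMetrics:
    """Hypothetical wrapper that keeps the backing array private."""

    def __init__(self, values):
        if len(values) != len(EXECUTOR_METRIC_TYPES):
            raise ValueError("expected one value per metric type")
        self._values = list(values)

    def get_metric_value(self, metric_type):
        """Look up a value by metric type instead of by raw array index."""
        return self._values[EXECUTOR_METRIC_TYPES.index(metric_type)]


metrics = ExecutorMetrics([100, 10, 0, 40, 5])
storage = metrics.get_metric_value("onHeapStorageMemory")
```

Callers depend only on metric type names, so the storage layout could change without touching them.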

commit 8905d231c3a959f70266223d3546b17a655cee39
Author: Edwina Lu <edlu@...>
Date:   2018-07-25T20:49:09Z

    merge with master

commit 81dd2e519fb269a90515f5167f3d8f425515b661
Author: Reza Safi <rezasafi@...>
Date:   2018-07-26T21:33:52Z

    Integration of ProcessTreeMetrics with PR 21221

commit 26dc46bde1506a3718d74fc5edac20856c609a88
Author: Reza Safi <rezasafi@...>
Date:   2018-07-27T15:03:59Z

    Some improvements in integration

commit d60e255de5e90d8529c7d6496b95ceae2ae20be3
Author: Reza Safi <rezasafi@...>
Date:   2018-07-27T15:05:34Z

    Integration with the unit tests of the upstream open PR

commit d8c3293e9cd7238fef5b4c517b23ac05f1d83508
Author: Reza Safi <rezasafi@...>
Date:   2018-07-28T05:44:19Z

    Fix an issue with memory info computation.

commit ee9ba5985741da26cb148009760b546353d0cb34
Author: Reza Safi <rezasafi@...>
Date:   2018-07-28T06:09:20Z

    Fix scalastyle errors

----

