[ https://issues.apache.org/jira/browse/SPARK-20391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976298#comment-15976298 ]

Saisai Shao commented on SPARK-20391:
-------------------------------------

bq. I assume managed memory here is spark.memory.fraction on heap + 
spark.memory.offHeap.size?

{{totalManagedMemory}} should be equal to the on-heap memory governed by 
spark.memory.fraction plus spark.memory.offHeap.size, while 
{{totalStorageMemory}} is never larger than {{totalManagedMemory}}. At the 
beginning, when no job is running, {{totalStorageMemory}} == 
{{totalManagedMemory}}; once execution memory is consumed, 
{{totalStorageMemory}} < {{totalManagedMemory}}.
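
To make the relationship concrete, here is a minimal Scala sketch of the 
semantics described above. It assumes only the unified memory management 
model (storage and execution share {{totalManagedMemory}}); the object and 
variable names below are illustrative, not Spark's actual internals.

{code}
// Simplified model: under unified memory management, the storage ceiling
// is whatever part of the managed pool execution has not claimed.
object ManagedMemoryModel {
  def main(args: Array[String]): Unit = {
    // heap share per spark.memory.fraction + spark.memory.offHeap.size
    val totalManagedMemory = 1000L
    var executionMemoryUsed = 0L

    def totalStorageMemory: Long = totalManagedMemory - executionMemoryUsed

    println(totalStorageMemory == totalManagedMemory) // true: no job running yet

    executionMemoryUsed = 300L // a task acquires execution memory
    println(totalStorageMemory < totalManagedMemory)  // true: storage ceiling shrank
  }
}
{code}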

Here we have two problems in the block manager:

1. All the memory tracked in the block manager is storage memory, so we 
should clarify the naming; that is the purpose of this JIRA.
2. The block manager only gets the initial snapshot of storage memory (taken 
when {{totalStorageMemory}} == {{totalManagedMemory}}). Since 
{{totalStorageMemory}} varies at runtime, the {{memRemaining}} tracked in 
{{StorageStatus}} is not accurate; see the sketch after this list. This could 
be addressed in another JIRA.
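
A hypothetical sketch of problem 2, assuming only that the storage ceiling 
is captured once at executor registration; the class below is a stand-in for 
illustration, not the real {{StorageStatus}} code.

{code}
// The snapshot is taken when totalStorageMemory == totalManagedMemory,
// so memRemaining keeps reporting against a ceiling that may have shrunk.
class StorageStatusSketch(val maxMem: Long) { // one-time snapshot
  private var memUsed = 0L

  def addBlock(size: Long): Unit = { memUsed += size }

  // Ignores any execution memory consumed after registration, so this
  // overstates the storage memory actually available.
  def memRemaining: Long = maxMem - memUsed
}
{code}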


> Properly rename the memory related fields in ExecutorSummary REST API
> ---------------------------------------------------------------------
>
>                 Key: SPARK-20391
>                 URL: https://issues.apache.org/jira/browse/SPARK-20391
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Saisai Shao
>            Priority: Blocker
>
> Currently in Spark we can get the executor summary through the REST API 
> {{/api/v1/applications/<app-id>/executors}}. The format of the executor 
> summary is:
> {code}
> class ExecutorSummary private[spark](
>     val id: String,
>     val hostPort: String,
>     val isActive: Boolean,
>     val rddBlocks: Int,
>     val memoryUsed: Long,
>     val diskUsed: Long,
>     val totalCores: Int,
>     val maxTasks: Int,
>     val activeTasks: Int,
>     val failedTasks: Int,
>     val completedTasks: Int,
>     val totalTasks: Int,
>     val totalDuration: Long,
>     val totalGCTime: Long,
>     val totalInputBytes: Long,
>     val totalShuffleRead: Long,
>     val totalShuffleWrite: Long,
>     val isBlacklisted: Boolean,
>     val maxMemory: Long,
>     val executorLogs: Map[String, String],
>     val onHeapMemoryUsed: Option[Long],
>     val offHeapMemoryUsed: Option[Long],
>     val maxOnHeapMemory: Option[Long],
>     val maxOffHeapMemory: Option[Long])
> {code}
> There are 6 memory-related fields here: {{memoryUsed}}, {{maxMemory}}, 
> {{onHeapMemoryUsed}}, {{offHeapMemoryUsed}}, {{maxOnHeapMemory}}, 
> {{maxOffHeapMemory}}.
> All 6 fields reflect *storage* memory usage in Spark, but from their names 
> a user cannot tell whether they refer to *storage* memory or to the total 
> memory (storage memory + execution memory). This is misleading.
> So I think we should properly rename these fields to reflect their real 
> meanings, or at least document them.
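> As a purely illustrative sketch (these names are not taken from any 
> committed patch), the rename could look like:
> {code}
> // Illustrative only: storage-prefixed names for the 6 memory fields.
> class ExecutorSummarySketch(
>     val storageMemoryUsed: Long,                // was memoryUsed
>     val maxStorageMemory: Long,                 // was maxMemory
>     val onHeapStorageMemoryUsed: Option[Long],  // was onHeapMemoryUsed
>     val offHeapStorageMemoryUsed: Option[Long], // was offHeapMemoryUsed
>     val maxOnHeapStorageMemory: Option[Long],   // was maxOnHeapMemory
>     val maxOffHeapStorageMemory: Option[Long])  // was maxOffHeapMemory
> {code}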


