JoshRosen commented on code in PR #47192:
URL: https://github.com/apache/spark/pull/47192#discussion_r1667098111
##########
core/src/main/java/org/apache/spark/memory/TaskMemoryManager.java:
##########
@@ -507,4 +525,19 @@ public long getMemoryConsumptionForThisTask() {
public MemoryMode getTungstenMemoryMode() {
return tungstenMemoryMode;
}
+
+  /**
+   * Returns the peak task-level on-heap execution memory usage in bytes.
+   */
+ public long getPeakOnHeapExecutionMemory() {
+ return peakOnHeapMemory;
+ }
Review Comment:
Brainstorming:
Do we need to worry about synchronization for the field accesses here (and
in the other getter)? Right now the writes to these fields are guarded by
`synchronized (this)`, but the reads are unsynchronized.
If we did synchronize these getters, would we have to worry about deadlocks?
I don't think so: `TaskMemoryManager` is internal to Spark, and right now the
only call to these getters occurs once at the end of the task, during final
metrics emission, while no other `TaskMemoryManager` / `MemoryManager` locks
are held. Also, `TaskMemoryManager` is not a `DeveloperApi`, so we wouldn't
expect third-party or user code to call these getters during task execution.
If we want to sidestep any potential deadlock concerns, though, we could make
both longs `volatile`, similar to the existing `private volatile
long acquiredButNotUsed` (see https://github.com/apache/spark/pull/12681).
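
For concreteness, here is a minimal, self-contained sketch of the `volatile` option. The `PeakTracker` class and its `updatePeak` method are hypothetical illustrations of the pattern, not the actual `TaskMemoryManager` code:

```java
// Hypothetical, simplified illustration of the volatile-publication pattern
// discussed above; field and getter names mirror the diff, the rest is assumed.
class PeakTracker {
  // `volatile` guarantees that an unsynchronized reader always sees the most
  // recent write, without the reader having to acquire any lock.
  private volatile long peakOnHeapMemory = 0L;

  // Writers still synchronize so that the read-modify-write max() update
  // stays atomic when multiple threads race to record a new peak.
  synchronized void updatePeak(long currentUsageBytes) {
    peakOnHeapMemory = Math.max(peakOnHeapMemory, currentUsageBytes);
  }

  // The getter takes no lock, so it can never participate in a lock-ordering
  // cycle: the deadlock concern is sidestepped entirely.
  long getPeakOnHeapExecutionMemory() {
    return peakOnHeapMemory;
  }
}
```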