[I] Add memory profiling / logging [datafusion-comet]

via GitHub Wed, 30 Apr 2025 16:19:45 -0700


andygrove opened a new issue, #1701:
URL: https://github.com/apache/datafusion-comet/issues/1701


   ### What is the problem the feature request solves?
   
   I would like to add a config that enables memory profiling so that we can 
monitor JVM and native memory usage throughout the lifetime of a Spark session 
or job. Ideally, this data would be written out in a structured file format 
that we can generate charts from.
   
   On the JVM side, we can use:
   
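   ```scala
   import java.lang.management.ManagementFactory

   // Heap and non-heap usage as reported by the JVM itself
   val memoryMXBean = ManagementFactory.getMemoryMXBean
   val heap = memoryMXBean.getHeapMemoryUsage
   val nonHeap = memoryMXBean.getNonHeapMemoryUsage
   ```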
   
   On the native side, we can use the `procfs` crate:
   
   ```rust
   use procfs::process::Process;

   // Read /proc/<pid>/statm for the current process (Linux only)
   let pid = std::process::id();
   let process = Process::new(pid as i32).unwrap();
   let statm = process.statm().unwrap();
   ```
   
   By logging JVM usage and overall process memory information, we can infer 
how much native memory is used. We can also log how much memory is reserved in 
the native memory pools and start to see how that aligns with actual usage.
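   
   As a rough illustration of that inference, here is a minimal sketch that 
uses only the Rust standard library. It parses a `statm`-style line, subtracts 
the JVM-reported figures from the process resident set size, and formats one 
CSV row per sample. Every name in it (`Statm`, `MemorySample`, `parse_statm`, 
`estimate_native_bytes`, `csv_row`) and the CSV column order are hypothetical 
and not part of Comet. The subtraction is only a coarse estimate, because the 
JVM numbers do not map exactly onto resident pages.
   
   ```rust
   /// First two fields of /proc/<pid>/statm, both measured in pages.
   #[derive(Debug, PartialEq)]
   struct Statm {
       size_pages: u64,     // total program size
       resident_pages: u64, // resident set size
   }

   /// One profiling sample; all fields are in bytes except the timestamp.
   /// Hypothetical layout, chosen only for this sketch.
   struct MemorySample {
       timestamp_ms: u64,
       rss_bytes: u64,
       jvm_heap_bytes: u64,
       jvm_non_heap_bytes: u64,
       pool_reserved_bytes: u64,
   }

   /// Parse a whitespace-separated statm line; returns None if the first
   /// two fields are missing or not valid unsigned integers.
   fn parse_statm(line: &str) -> Option<Statm> {
       let mut fields = line.split_whitespace();
       let size_pages = fields.next()?.parse().ok()?;
       let resident_pages = fields.next()?.parse().ok()?;
       Some(Statm { size_pages, resident_pages })
   }

   /// Coarse estimate of native memory: process RSS minus what the JVM
   /// reports for heap and non-heap. Clamps at zero instead of underflowing.
   fn estimate_native_bytes(rss_bytes: u64, jvm_heap_bytes: u64, jvm_non_heap_bytes: u64) -> u64 {
       rss_bytes.saturating_sub(jvm_heap_bytes.saturating_add(jvm_non_heap_bytes))
   }

   /// Format a sample as one CSV row with the (assumed) column order:
   /// timestamp_ms,rss,jvm_heap,jvm_non_heap,native_estimate,pool_reserved
   fn csv_row(s: &MemorySample) -> String {
       let native = estimate_native_bytes(s.rss_bytes, s.jvm_heap_bytes, s.jvm_non_heap_bytes);
       format!(
           "{},{},{},{},{},{}",
           s.timestamp_ms, s.rss_bytes, s.jvm_heap_bytes, s.jvm_non_heap_bytes,
           native, s.pool_reserved_bytes
       )
   }
   ```
   
   In practice, `rss_bytes` would be `resident_pages` multiplied by the system 
page size, and the JVM figures would have to be passed across from the JVM 
side. Comparing the `native_estimate` column with `pool_reserved` over time is 
one way to see how the pool reservations align with actual usage.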
   
   ### Describe the potential solution
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org
