wiedld commented on issue #11042:
URL: https://github.com/apache/datafusion/issues/11042#issuecomment-2218836013

   > But I think @wiedld said she didn't have good luck with it so your mileage may vary
   
   With the Xcode Allocations tool, I was only seeing <10% of the allocations,
   compared with the peak measured with the time builtin. (Note: our process was
   an application-triggered DataFusion query, not one run via datafusion-cli.)
   As a result, I ended up using heaptrack.
   
   > In general I have been having a hard time trying to debug this since there is no heaptrack for Mac and the build process for heaptrack_gui is also broken
   
   
   I ran into the same problems. Here is the workaround I used (I'm sure there
   are others):
   
   1. Run heaptrack on a Linux VM. Instead of using the recommended
      [cargo-heaptrack](https://github.com/KDE/heaptrack#running-heaptrack-on-a-rust-binary),
      I built and ran the Rust project using
      [cargo-with](https://crates.io/crates/cargo-with). Below is a rough idea
      (dependencies may differ for you).
      ```
      sudo apt install git-all
      curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
      source "$HOME/.cargo/env"
      sudo apt-get install gcc build-essential time heaptrack
      cargo install cargo-with
      cargo with 'heaptrack' -- run --profile=quick-release --no-default-features -- <process cmd>
   
      # results in <heaptrack_output_file>.zst
      ``` 
   
   2. Use `heaptrack_print` to convert the output file into a stack file for
      flamegraph
      ```
      heaptrack_print heaptrack_output_file.zst --flamegraph-cost-type peak -F my_stack_file.txt
   
      # results in the output file, plus summary stats such as:
      total runtime: 27.98s.
      calls to allocation functions: 12432130 (444322/s)
      temporary memory allocations: 1214337 (43400/s)
      peak heap memory consumption: 1.26G
      peak RSS (including heaptrack overhead): 3.22G
      total memory leaked: 1.88M
      ``` 
   
   3. Confirm that the heaptrack peak memory (summary stats from above) is ~=
      the peak reported by the time builtin
      ```
      # -l is the BSD/macOS flag; GNU time on Linux uses -v instead
      /usr/bin/time -l -h -p <process>
      ``` 
   
   4. Move the files to your Mac, then build & analyze the flamegraphs.
      ```
      git clone git@github.com:brendangregg/FlameGraph.git
      cd FlameGraph
   
      # generate flamegraph files
      ./flamegraph.pl --title "heaptrack: peak memory" --colors mem --countname peak_bytes < my_stack_file.txt > my_flamegraph.svg
      open my_flamegraph.svg
      ``` 
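
   For the check in step 3, it can help to pull the peak figure out of the
   `heaptrack_print` summary instead of reading it by eye. Below is a minimal
   sketch, assuming the summary text was saved to a file; `heaptrack_peak_heap`
   and `summary.txt` are hypothetical names for illustration and are not part
   of heaptrack.

   ```shell
   # Hypothetical helper (not a heaptrack command): read a heaptrack_print
   # summary on stdin and print the value of the first
   # "peak heap memory consumption" line, e.g. "1.26G".
   # Prints nothing if no such line is present.
   heaptrack_peak_heap() {
     sed -n 's/^[[:space:]]*peak heap memory consumption:[[:space:]]*//p' | head -n 1
   }

   # Usage, assuming summary.txt holds the summary stats from step 2:
   #   heaptrack_peak_heap < summary.txt
   ```

   The printed value can then be compared against the peak reported by the
   time builtin.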
   
   If you run into any issues, or find any better alternatives, please let me
   know @hveiga.

