Currently, OpenOCD's gmon profiling output is limited to a histogram of 128K buckets. This works well when all samples lie within roughly 128K instructions of each other, but when a profile contains program counter samples spread over a wider range, each bucket covers multiple instructions and we lose instruction-level granularity. In many cases, multiple instruction memories at widely separated addresses make 128K-bucket profiles unusably coarse, because a single bucket can span multiple functions.
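To put rough numbers on the granularity loss, here is a minimal sketch of the arithmetic. It assumes the profiled address range is divided evenly among the buckets and that "128K" means 128 * 1024; bytes_per_bucket is a hypothetical helper for illustration, not a function in OpenOCD.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of OpenOCD): bytes of address space
 * covered by one histogram bucket when the range [low_pc, high_pc)
 * is split evenly among num_buckets, rounded up.
 */
static uint64_t bytes_per_bucket(uint64_t low_pc, uint64_t high_pc,
				 uint64_t num_buckets)
{
	assert(num_buckets > 0 && high_pc > low_pc);
	uint64_t span = high_pc - low_pc;
	return (span + num_buckets - 1) / num_buckets;
}
```

Under those assumptions, samples spread over a 512 MiB range land in 4096-byte buckets with 128K buckets (easily several small functions per bucket), and in 2-byte buckets with 256M buckets.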
One workaround is to specify the minimum and maximum address profiled, but this is harder to work with, since each memory must be profiled individually, and it gives confusing results because the filtered-out samples are not removed from the histogram duration. For example, the ESP32-S3 has ~512 MB of instruction-bus address space between its RTC instruction area, XIP range, and internal ROM0.

The following patch set first sorts the samples, then encodes one bucket at a time. I've left the limiter in place but raised the limit to 256M buckets, to prevent giant gmon files on systems with very large instruction address ranges. On my ESP32-S3, I often get profiles of 30-40 MB.

Patches:
https://review.openocd.org/c/openocd/+/8603/
https://review.openocd.org/c/openocd/+/8604/
https://review.openocd.org/c/openocd/+/8605/
https://review.openocd.org/c/openocd/+/8277/

I also wanted to ask for advice about the gmon.out format: perhaps there is a better way to encode these profiles in gmon files? Would one histogram per memory bus, or some other layout, work better?

Thanks,
-Richard