Issue 137669
Summary [BOLT] Malformed / corrupted profile counter on ARM
Labels BOLT
Assignees
Reporter hilldani
I collected Linux perf samples following the [ARM documentation for BOLT](https://learn.arm.com/learning-paths/servers-and-cloud-computing/bolt/bolt-samples/) with this perf command:

```bash
perf record -e cycles:u -o perf.data -- ./binary
```

The perf data looks like this:

```
            perf    6061 [000]   475.121158:          1 cycles:u:            5e4350 [unknown] (/usr/bin/perf)
            perf    6061 [000]   475.121179:          1 cycles:u:            5e437c [unknown] (/usr/bin/perf)
            perf    6061 [000]   475.121197:          1 cycles:u:            4cab00 [unknown] (/usr/bin/perf)
            perf    6061 [000]   475.121215:          3 cycles:u:            4caadc [unknown] (/usr/bin/perf)
            perf    6061 [000]   475.121232:          8 cycles:u:            4caae8 [unknown] (/usr/bin/perf)
            perf    6061 [000]   475.121249:         22 cycles:u:            5e2738 [unknown] (/usr/bin/perf)
            perf    6061 [000]   475.121267:         59 cycles:u:            4cab60 [unknown] (/usr/bin/perf)
            perf    6061 [000]   475.121285:        159 cycles:u:            4cb704 [unknown] (/usr/bin/perf)
            perf    6061 [000]   475.121303:        413 cycles:u:            5e25ec [unknown] (/usr/bin/perf)
etc...
```

I then generate the profile data with perf2bolt:

```bash
$ ./perf2bolt -p perf.data  -o a.fdata -nl binary
BOLT-INFO: shared object or position-independent executable detected
PERF2BOLT: Starting data aggregation job for perf.data
PERF2BOLT: spawning perf job to read events without LBR
PERF2BOLT: spawning perf job to read mem events
PERF2BOLT: spawning perf job to read process events
PERF2BOLT: spawning perf job to read task events
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: d7d4b5ec2904208b4ff88a280d2cc3fcb3edc3a9
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x9c00000, offset 0x9c00000
BOLT-INFO: enabling relocation mode
BOLT-INFO: disabling -align-macro-fusion on non-x86 platform
BOLT-INFO: enabling strict relocation mode for aggregation purposes
BOLT-INFO: pre-processing profile using perf data aggregator
BOLT-INFO: binary build-id is: 1faaef2ee4de701d
PERF2BOLT: spawning perf job to read buildid list
PERF2BOLT: matched build-id and file name
PERF2BOLT: waiting for perf mmap events collection to finish...
PERF2BOLT: parsing perf-script mmap events output
PERF2BOLT: waiting for perf task events collection to finish...
PERF2BOLT: parsing perf-script task events output
PERF2BOLT: input binary is associated with 1 PID(s)
PERF2BOLT: waiting for perf events collection to finish...
PERF2BOLT: parsing basic events (without LBR)...
PERF2BOLT: waiting for perf mem events collection to finish...
PERF2BOLT: processing basic events (without LBR)...
PERF2BOLT: read 19 samples
PERF2BOLT: out of range samples recorded in unknown regions: 7 (36.8%)
PERF2BOLT: wrote 12 objects and 0 memory objects to a.fdata
BOLT-INFO: 11 out of 183192 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: 3 functions have instructions with unknown control flow. Use -print-unknown to see the list.
```

The resulting profile (`a.fdata`) looks like this:

```
no_lbr cycles:u:
1 pthread_mutex_lock@PLT 8 1
etc...
```

If I try merging this profile with anything, or even passing it to merge-fdata on its own, it fails:

```bash
$ ./merge-fdata a.fdata > combined.fdata
Using legacy profile format.
merge-fdata: 'a.fdata': Malformed / corrupted profile counter.
```

However, llvm-bolt accepts the same profile when generating a BOLT'ed binary:

```bash
$ ./llvm-bolt binary -o ./bolt_binary -data a.fdata -reorder-functions=hfsort
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: d7d4b5ec2904208b4ff88a280d2cc3fcb3edc3a9
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x9c00000, offset 0x9c00000
BOLT-INFO: enabling relocation mode
BOLT-INFO: disabling -align-macro-fusion on non-x86 platform
BOLT-INFO: pre-processing profile using branch profile reader
BOLT-INFO: operating with basic samples profiling data (no LBR).
BOLT-INFO: normalizing samples by instruction count.
BOLT-INFO: number of removed linker-inserted veneers: 0
BOLT-INFO: 9 out of 183192 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: 2 functions with profile could not be optimized
BOLT-INFO: removed 21129 empty blocks
BOLT-INFO: merged 17 duplicate CFG edges
BOLT-INFO: Starting stub-insertion pass
BOLT-INFO: Inserted 0 stubs in the hot area and 0 stubs in the cold area. Shared 0 times, iterated 1 times.
BOLT-INFO: patched build-id (flipped last bit)
BOLT-INFO: setting __hot_start to 0x9e00000
BOLT-INFO: setting __hot_end to 0x9e02824
BOLT-ERROR: unable to get new address corresponding to input address 0x7266798 in function _ZN2func. Consider adding this function to --skip-funcs=...
```