On 9/25/19 6:36 PM, Evgeny Kudryashov wrote:
> On 2019-09-19 11:33, Martin Liška wrote:
>> Hi.
>>
>> Function reordering has been around for quite some time and a naive
>> implementation was also part of my diploma thesis some time ago.
>> Currently, the GCC can reorder function based on first execution, which
>> happens with PGO and LTO of course. Known limitation is that the order
>> is preserved only partially as various symbols go into different
>> LTRANS partitions.
>>
>> There has been some research in the area and I would point out the
>> Facebook paper ([1]) and Sony presentation ([2]). Based on that, I
>> decided to make a new implementation in the GCC that does the same
>> (in a proper way). First part of the enablement are patches to ld.bfd
>> and ld.gold that come up with a new section .text.sorted, that is
>> always sorted.
>>
>> Thoughts? I would definitely welcome any interesting measurement on a
>> bigger load.
>>
>> Martin
>>
>
> Hi, Martin!
>
> Some time ago I tried to do the same but didn't go that far.

Hello.

>
> I also used the C3 algorithm, except for the fact that ipa_fn_summary
> contains information about size and time (somehow missed it).

That is a key part of the algorithm, as one ideally does not want to cross
a page size boundary.

> The linker option --sort-section=name was used for prototyping. It,
> obviously, sorts sections and allows to place functions in the desired order
> (by placing them into named sections .text.sorted.NNNN, without patching
> linkers or adjusting linker scripts).

Yes, that can work, but having the new section explicitly sorted means that
no option has to be passed to the linker.

> For testing my implementation I used several benchmarks from SPEC2006:
> perlbench, sjeng and gobmk. Unfortunately, no significant positive changes
> were obtained.
>
> I've tested the proposed pass on perlbench with train and reference input
> (PGO+LTO as a base) but couldn't obtain stable results (still unclear
> environment or perlbench specific reasons).

Yes, it's quite difficult to measure anything. One needs a binary with a huge
.text section and a test-case which has a flat profile. I was able to measure
results on GCC itself.

Martin

>
> Evgeny.
>
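
The prototyping idea discussed above (cluster functions along hot call
arcs, keep each cluster within a page, then emit numbered sections so that
--sort-section=name reproduces the order) can be sketched roughly as
follows. This is a simplified, C3-style illustration only, not the GCC pass
and not Evgeny's prototype. The names c3_order and section_names, the 4096-byte
page size, and the (size, time) inputs are all assumptions made for the
example.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumption: 4 KiB pages


@dataclass
class Cluster:
    funcs: list   # function names in layout order
    size: int     # total code size in bytes
    time: float   # total profiled time (any consistent unit)


def c3_order(funcs, calls, page_size=PAGE_SIZE):
    """Simplified C3-style ordering (hypothetical helper).

    funcs: {name: (size_bytes, time)}, sizes must be positive
    calls: {(caller, callee): call_count}
    """
    cluster_of = {n: Cluster([n], s, t) for n, (s, t) in funcs.items()}

    # Visit functions from hottest to coldest.
    for callee in sorted(funcs, key=lambda n: funcs[n][1], reverse=True):
        # Find the caller with the heaviest arc into this function.
        arcs = [(cnt, caller) for (caller, c), cnt in calls.items()
                if c == callee and caller != callee]
        if not arcs:
            continue
        _, caller = max(arcs)
        src, dst = cluster_of[callee], cluster_of[caller]
        # Skip merges that would grow a cluster past one page.
        if src is dst or src.size + dst.size > page_size:
            continue
        # Append the callee's cluster to the end of the caller's cluster.
        dst.funcs += src.funcs
        dst.size += src.size
        dst.time += src.time
        for n in src.funcs:
            cluster_of[n] = dst

    # Lay out the remaining clusters by decreasing density (time per byte).
    unique = {id(c): c for c in cluster_of.values()}.values()
    ordered = sorted(unique, key=lambda c: c.time / c.size, reverse=True)
    return [n for c in ordered for n in c.funcs]


def section_names(order):
    """Map each function to .text.sorted.NNNN (hypothetical helper).

    Zero padding keeps lexical order equal to numeric order, which is
    what sorting sections by name relies on.
    """
    return {name: f".text.sorted.{i:04d}" for i, name in enumerate(order)}
```

With made-up inputs such as funcs = {"main": (100, 10.0), "hot": (200, 90.0)}
and calls = {("main", "hot"): 100}, the two functions end up in one cluster
and receive consecutive section numbers. Sorting the resulting section names
lexically then gives back the computed order.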