> On Tue, Sep 2, 2025 at 11:22 PM Michal Jireš <mji...@suse.cz> wrote:
> >
> > On 8/28/25 7:40 PM, Andi Kleen wrote:
> > >
> > > Just 1:1 is slow, isn't it?
> > >
> >
> >
> > Surprisingly not by much:
> >
> > rm vmlinux.o; time make -j16 vmlinux
> >
> > # -flto-partition=1to1
> > ________________________________________________________
> > Executed in  235.38 secs    fish           external
> >     usr time   37.97 mins    0.00 millis   37.97 mins
> >     sys time    2.12 mins    1.69 millis    2.12 mins
> >
> > # -flto-partition=cache
> > ________________________________________________________
> > Executed in  231.91 secs    fish           external
> >     usr time   36.71 mins    0.40 millis   36.71 mins
> >     sys time    1.12 mins    1.12 millis    1.12 mins
> 
> What about -flto-partition=balanced though?  I'd be also interested in
> the temporary space requirements for the LTRANS IR files and object files.

1:1 used to be terrible, but it has improved a lot over time, since a
lot of work went into making WPA->ltrans streaming more scalable by
streaming less context.

Linking cc1plus with balanced (and release checking) on a CPU with 256
hyperthreads gets me:
real    0m54.644s
user    17m46.968s
sys     0m13.082s

While 1to1:
real    1m4.690s
user    23m13.399s
sys     0m32.854s

and max:
real    1m33.235s
user    43m26.934s
sys     156m1.788s
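
For a rough side-by-side comparison, the three measurements above can be converted to seconds and expressed relative to balanced. This is only a sketch: the helper names (`to_seconds`, `relative_to`) are made up for illustration, and the input strings are copied verbatim from the figures above.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '17m46.968s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Figures copied from the cc1plus link measurements above.
timings = {
    "balanced": {"real": "0m54.644s", "user": "17m46.968s", "sys": "0m13.082s"},
    "1to1": {"real": "1m4.690s", "user": "23m13.399s", "sys": "0m32.854s"},
    "max": {"real": "1m33.235s", "user": "43m26.934s", "sys": "156m1.788s"},
}

# Same table with every value converted to seconds.
seconds = {
    mode: {field: to_seconds(value) for field, value in fields.items()}
    for mode, fields in timings.items()
}


def relative_to(baseline: str, field: str) -> dict:
    """Return each mode's time for `field` as a multiple of the baseline mode."""
    base = seconds[baseline][field]
    return {mode: fields[field] / base for mode, fields in seconds.items()}


for field in ("real", "user", "sys"):
    for mode, ratio in relative_to("balanced", field).items():
        print(f"{field:4s} {mode:8s} {ratio:8.2f}x")
```

Wall-clock and CPU time are kept as separate ratios on purpose, since on a 256-thread machine they can diverge a lot.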


Temporary file usage is comparable for balanced and 1to1, while max
needs more than twice as much:

2750420 balanced
2856968 1to1
6253232 max

I am not quite sure why user time grows so considerably. I think it is
simply because more lto1 invocations are needed (786 for 1to1, 127 for
balanced and 67846 for max), so it is collateral damage from starting
the compiler. Which also reminds me about increasing the default number
of partitions for modern CPUs.
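
As a back-of-envelope check on that guess, the extra user time can be divided by the number of extra lto1 invocations. This is a sketch under stated assumptions: it reads the count of 786 as the 1to1 figure, it attributes all additional user time to the additional processes, and the function name is hypothetical.

```python
# User CPU seconds from the cc1plus link measurements in this message.
user_seconds = {
    "balanced": 17 * 60 + 46.968,
    "1to1": 23 * 60 + 13.399,
    "max": 43 * 60 + 26.934,
}

# lto1 invocation counts quoted in this message.
# Assumption: the count of 786 belongs to -flto-partition=1to1.
invocations = {"balanced": 127, "1to1": 786, "max": 67846}


def extra_user_per_invocation(mode: str, baseline: str = "balanced") -> float:
    """Extra user seconds per additional lto1 process, relative to baseline."""
    extra_time = user_seconds[mode] - user_seconds[baseline]
    extra_runs = invocations[mode] - invocations[baseline]
    return extra_time / extra_runs


for mode in ("1to1", "max"):
    cost = extra_user_per_invocation(mode)
    print(f"{mode}: {cost:.3f} s of extra user time per extra lto1 invocation")
```

The result ignores sys time and any change in per-partition optimization work, so it is only a crude estimate and not a measurement of compiler startup cost.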

Overall, getting an initial version of kernel LTO with
-flto-partition=1to1 as a first step is IMO acceptable.

Honza
